qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements
@ 2011-10-18 18:50 Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 01/21] target-sparc: Add accessors for single-precision fpr access Richard Henderson
                   ` (21 more replies)
  0 siblings, 22 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

This started out as simply fleshing out the VIS2 instruction set.
But when I got a look at the DT0/1 "calling convention" I choked, and
thought we could really do better than that.

The end result (op_opt,out_asm) looks significantly cleaner for a
64-bit host.  It looks about the same for a 32-bit host.
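
For readers unfamiliar with the DT0/1 convention, here is a rough,
self-contained model of the two styles.  It is only a sketch: ToyEnv,
toy_op and every other toy_* name are hypothetical stand-ins, the
"operation" is a plain integer add rather than a real float64 helper,
and nothing here is the actual QEMU code.  The first variant copies
both operands into env scratch slots before calling a helper that sees
only env; the second passes the 64-bit values as ordinary parameters.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NB_FPREGS 32        /* hypothetical register count */

/* Toy CPU state: 32-bit fpr halves plus the dt0/dt1 scratch slots. */
typedef struct {
    uint32_t fpr[TOY_NB_FPREGS];
    uint64_t dt0, dt1;
} ToyEnv;

/* Placeholder for a real double-precision operation. */
static uint64_t toy_op(uint64_t a, uint64_t b)
{
    return a + b;
}

/* Read register pair r/r+1 as one 64-bit value, upper half first. */
static uint64_t toy_pair(const ToyEnv *env, unsigned int r)
{
    assert(r + 1 < TOY_NB_FPREGS);
    return ((uint64_t)env->fpr[r] << 32) | env->fpr[r + 1];
}

/* Write a 64-bit value back into register pair r/r+1. */
static void toy_set_pair(ToyEnv *env, unsigned int r, uint64_t v)
{
    assert(r + 1 < TOY_NB_FPREGS);
    env->fpr[r] = (uint32_t)(v >> 32);
    env->fpr[r + 1] = (uint32_t)v;
}

/* Temporaries style: the helper only sees env and works on dt0/dt1. */
static void toy_helper_via_env(ToyEnv *env)
{
    env->dt0 = toy_op(env->dt0, env->dt1);
}

static void toy_exec_via_env(ToyEnv *env, unsigned int rd,
                             unsigned int rs1, unsigned int rs2)
{
    env->dt0 = toy_pair(env, rs1);      /* copy operand 1 into env */
    env->dt1 = toy_pair(env, rs2);      /* copy operand 2 into env */
    toy_helper_via_env(env);
    toy_set_pair(env, rd, env->dt0);    /* copy the result back out */
}

/* Parameter style: operands and result travel as plain values. */
static uint64_t toy_helper_direct(uint64_t a, uint64_t b)
{
    return toy_op(a, b);
}

static void toy_exec_direct(ToyEnv *env, unsigned int rd,
                            unsigned int rs1, unsigned int rs2)
{
    toy_set_pair(env, rd,
                 toy_helper_direct(toy_pair(env, rs1), toy_pair(env, rs2)));
}
```

Both variants leave the same value in the destination pair; the
difference is only in how many copies through env are needed on the
way in and out of the helper.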

I've been testing this vs the gcc testsuite, both for its generic
ieee test cases, and the vectorization tests w/ -mvis2.

Watch out for the last patch.  It was an attempt to get rid of the
hundreds of tls failures in the gcc testsuite by supporting NPTL.
Except the clone syscall crashes, and seems to be crashing at a point
where it's difficult to see what's going wrong.  That patch is 
present here for discussion only.

All of this is relative to blueswirl's sparc tree.  Which I think
should go in as a most excellent cleanup of target-sparc.  I've
pushed the tree to

  git://repo.or.cz/qemu/rth.git rth/vis


r~


Richard Henderson (21):
  target-sparc: Add accessors for single-precision fpr access.
  target-sparc: Mark fprs dirty in store accessor.
  target-sparc: Add accessors for double-precision fpr access.
  target-sparc: Pass float64 parameters instead of dt0/1 temporaries.
  target-sparc: Make VIS helpers const when possible.
  target-sparc: Extract common code for floating-point operations.
  target-sparc: Extract float128 move to a function.
  target-sparc: Undo cpu_fpr rename.
  target-sparc: Change fpr representation to doubles.
  tcg: Optimize some forms of deposit.
  target-sparc: Do exceptions management fully inside the helpers.
  sparc-linux-user: Handle SIGILL.
  target-sparc: Implement PDIST.
  target-sparc: Implement fpack{16,32,fix}.
  target-sparc: Implement EDGE* instructions.
  target-sparc: Implement ALIGNADDR* inline.
  target-sparc: Implement BMASK/BSHUFFLE.
  target-sparc: Tidy fpack32.
  target-sparc: Implement FALIGNDATA inline.
  sparc-linux-user: Add some missing syscall numbers
  sparc-linux-user: Enable NPTL

 configure                     |    3 +
 gdbstub.c                     |   35 +-
 linux-user/main.c             |    9 +
 linux-user/signal.c           |   28 +-
 linux-user/sparc/syscall_nr.h |    3 +
 linux-user/syscall.c          |   12 +-
 monitor.c                     |   96 ++--
 target-sparc/cpu.h            |   38 +-
 target-sparc/cpu_init.c       |    6 +-
 target-sparc/fop_helper.c     |  294 ++++++---
 target-sparc/helper.h         |  120 ++--
 target-sparc/ldst_helper.c    |  123 +---
 target-sparc/machine.c        |   20 +-
 target-sparc/translate.c      | 1461 ++++++++++++++++++++++++-----------------
 target-sparc/vis_helper.c     |  251 +++++---
 tcg/tcg-op.h                  |   65 ++-
 16 files changed, 1503 insertions(+), 1061 deletions(-)

-- 
1.7.6.4


* [Qemu-devel] [PATCH 01/21] target-sparc: Add accessors for single-precision fpr access.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 02/21] target-sparc: Mark fprs dirty in store accessor Richard Henderson
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Load, store, and "create destination".  This version attempts to
change the behaviour of the translator as little as possible.  We
previously used cpu_tmp32 as the temporary destination, and we
continue to use that.  This will eventually allow a change in
representation of the fprs.

Change the name of the cpu_fpr array to make certain that all
instances are converted.
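
As an illustration of the load / store / "create destination" split
described above, here is a small runtime model.  It is a sketch under
loud assumptions: ToyFpu and the toy_* functions are hypothetical, they
operate on plain integers at run time, whereas the real accessors in
the patch return or consume TCGv_i32 handles at translation time.  The
point is only that an operation written purely in terms of the three
accessors never touches the register array directly, so the array's
representation can change later without touching each call site.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NB_FPREGS 32        /* hypothetical register count */

typedef struct {
    uint32_t fpr[TOY_NB_FPREGS];    /* stand-in for the cpu__fpr globals */
    uint32_t tmp32;                 /* stand-in for cpu_tmp32 */
} ToyFpu;

/* Load: fetch the value of single-precision register src. */
static uint32_t toy_load_fpr_F(const ToyFpu *s, unsigned int src)
{
    assert(src < TOY_NB_FPREGS);
    return s->fpr[src];
}

/* Store: write v into single-precision register dst. */
static void toy_store_fpr_F(ToyFpu *s, unsigned int dst, uint32_t v)
{
    assert(dst < TOY_NB_FPREGS);
    s->fpr[dst] = v;
}

/* Create destination: hand out the shared scratch slot for a result. */
static uint32_t *toy_dest_fpr_F(ToyFpu *s)
{
    return &s->tmp32;
}

/* An fnegs-like operation expressed only through the accessors. */
static void toy_fnegs(ToyFpu *s, unsigned int rd, unsigned int rs2)
{
    uint32_t src = toy_load_fpr_F(s, rs2);
    uint32_t *dst = toy_dest_fpr_F(s);

    *dst = src ^ 0x80000000u;       /* flip the IEEE sign bit */
    toy_store_fpr_F(s, rd, *dst);
}
```

The shape of toy_fnegs mirrors the converted "fnegs" case in the diff
below: load the source, obtain a destination, compute, then store.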

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/translate.c |  532 +++++++++++++++++++++++++++++-----------------
 1 files changed, 337 insertions(+), 195 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index b7a6bf1..19f41b7 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -63,7 +63,7 @@ static TCGv cpu_tmp0;
 static TCGv_i32 cpu_tmp32;
 static TCGv_i64 cpu_tmp64;
 /* Floating point registers */
-static TCGv_i32 cpu_fpr[TARGET_FPREGS];
+static TCGv_i32 cpu__fpr[TARGET_FPREGS];
 
 static target_ulong gen_opc_npc[OPC_BUF_SIZE];
 static target_ulong gen_opc_jump_pc[2];
@@ -115,63 +115,78 @@ static int sign_extend(int x, int len)
 #define IS_IMM (insn & (1<<13))
 
 /* floating point registers moves */
+static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
+{
+    return cpu__fpr[src];
+}
+
+static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
+{
+    tcg_gen_mov_i32 (cpu__fpr[dst], v);
+}
+
+static TCGv_i32 gen_dest_fpr_F(void)
+{
+    return cpu_tmp32;
+}
+
 static void gen_op_load_fpr_DT0(unsigned int src)
 {
-    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, dt0) +
+    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, dt0) +
                    offsetof(CPU_DoubleU, l.upper));
-    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
+    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
                    offsetof(CPU_DoubleU, l.lower));
 }
 
 static void gen_op_load_fpr_DT1(unsigned int src)
 {
-    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, dt1) +
+    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, dt1) +
                    offsetof(CPU_DoubleU, l.upper));
-    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt1) +
+    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt1) +
                    offsetof(CPU_DoubleU, l.lower));
 }
 
 static void gen_op_store_DT0_fpr(unsigned int dst)
 {
-    tcg_gen_ld_i32(cpu_fpr[dst], cpu_env, offsetof(CPUSPARCState, dt0) +
+    tcg_gen_ld_i32(cpu__fpr[dst], cpu_env, offsetof(CPUSPARCState, dt0) +
                    offsetof(CPU_DoubleU, l.upper));
-    tcg_gen_ld_i32(cpu_fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
+    tcg_gen_ld_i32(cpu__fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
                    offsetof(CPU_DoubleU, l.lower));
 }
 
 static void gen_op_load_fpr_QT0(unsigned int src)
 {
-    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upper));
-    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu__fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lower));
-    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu__fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lowest));
 }
 
 static void gen_op_load_fpr_QT1(unsigned int src)
 {
-    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.upper));
-    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu__fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.lower));
-    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu__fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.lowest));
 }
 
 static void gen_op_store_QT0_fpr(unsigned int dst)
 {
-    tcg_gen_ld_i32(cpu_fpr[dst], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu__fpr[dst], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_ld_i32(cpu_fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu__fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upper));
-    tcg_gen_ld_i32(cpu_fpr[dst + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu__fpr[dst + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lower));
-    tcg_gen_ld_i32(cpu_fpr[dst + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu__fpr[dst + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lowest));
 }
 
@@ -1892,6 +1907,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
 {
     unsigned int opc, rs1, rs2, rd;
     TCGv cpu_src1, cpu_src2, cpu_tmp1, cpu_tmp2;
+    TCGv_i32 cpu_src1_32, cpu_src2_32, cpu_dst_32;
     target_long simm;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP)))
@@ -2369,23 +2385,32 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                 save_state(dc, cpu_cond);
                 switch (xop) {
                 case 0x1: /* fmovs */
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_store_fpr_F(dc, rd, cpu_src1_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x5: /* fnegs */
-                    gen_helper_fnegs(cpu_fpr[rd], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fnegs(cpu_dst_32, cpu_src1_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x9: /* fabss */
-                    gen_helper_fabss(cpu_fpr[rd], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fabss(cpu_dst_32, cpu_src1_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x29: /* fsqrts */
                     CHECK_FPU_FEATURE(dc, FSQRT);
                     gen_clear_float_exceptions();
-                    gen_helper_fsqrts(cpu_tmp32, cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fsqrts(cpu_dst_32, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x2a: /* fsqrtd */
@@ -2408,10 +2433,13 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x41: /* fadds */
                     gen_clear_float_exceptions();
-                    gen_helper_fadds(cpu_tmp32, cpu_env, cpu_fpr[rs1],
-                                     cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fadds(cpu_dst_32, cpu_env,
+                                     cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x42: /* faddd */
@@ -2435,10 +2463,13 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x45: /* fsubs */
                     gen_clear_float_exceptions();
-                    gen_helper_fsubs(cpu_tmp32, cpu_env, cpu_fpr[rs1],
-                                     cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fsubs(cpu_dst_32, cpu_env,
+                                     cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x46: /* fsubd */
@@ -2463,10 +2494,13 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                 case 0x49: /* fmuls */
                     CHECK_FPU_FEATURE(dc, FMUL);
                     gen_clear_float_exceptions();
-                    gen_helper_fmuls(cpu_tmp32, cpu_env, cpu_fpr[rs1],
-                                     cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fmuls(cpu_dst_32, cpu_env,
+                                     cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x4a: /* fmuld */
@@ -2492,10 +2526,13 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x4d: /* fdivs */
                     gen_clear_float_exceptions();
-                    gen_helper_fdivs(cpu_tmp32, cpu_env, cpu_fpr[rs1],
-                                     cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fdivs(cpu_dst_32, cpu_env,
+                                     cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x4e: /* fdivd */
@@ -2520,7 +2557,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                 case 0x69: /* fsmuld */
                     CHECK_FPU_FEATURE(dc, FSMULD);
                     gen_clear_float_exceptions();
-                    gen_helper_fsmuld(cpu_env, cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    gen_helper_fsmuld(cpu_env, cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_op_store_DT0_fpr(DFPREG(rd));
                     gen_update_fprs_dirty(DFPREG(rd));
@@ -2537,35 +2576,41 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0xc4: /* fitos */
                     gen_clear_float_exceptions();
-                    gen_helper_fitos(cpu_tmp32, cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fitos(cpu_dst_32, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0xc6: /* fdtos */
                     gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fdtos(cpu_tmp32, cpu_env);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fdtos(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0xc7: /* fqtos */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
                     gen_op_load_fpr_QT1(QFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fqtos(cpu_tmp32, cpu_env);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fqtos(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0xc8: /* fitod */
-                    gen_helper_fitod(cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_helper_fitod(cpu_env, cpu_src1_32);
                     gen_op_store_DT0_fpr(DFPREG(rd));
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0xc9: /* fstod */
-                    gen_helper_fstod(cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_helper_fstod(cpu_env, cpu_src1_32);
                     gen_op_store_DT0_fpr(DFPREG(rd));
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
@@ -2580,13 +2625,15 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0xcc: /* fitoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_helper_fitoq(cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_helper_fitoq(cpu_env, cpu_src1_32);
                     gen_op_store_QT0_fpr(QFPREG(rd));
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
                 case 0xcd: /* fstoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_helper_fstoq(cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_helper_fstoq(cpu_env, cpu_src1_32);
                     gen_op_store_QT0_fpr(QFPREG(rd));
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
@@ -2599,44 +2646,50 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0xd1: /* fstoi */
                     gen_clear_float_exceptions();
-                    gen_helper_fstoi(cpu_tmp32, cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fstoi(cpu_dst_32, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0xd2: /* fdtoi */
                     gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fdtoi(cpu_tmp32, cpu_env);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fdtoi(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0xd3: /* fqtoi */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
                     gen_op_load_fpr_QT1(QFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fqtoi(cpu_tmp32, cpu_env);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fqtoi(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
 #ifdef TARGET_SPARC64
                 case 0x2: /* V9 fmovd */
-                    tcg_gen_mov_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x3: /* V9 fmovq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd)], cpu_fpr[QFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 1],
-                                    cpu_fpr[QFPREG(rs2) + 1]);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 2],
-                                    cpu_fpr[QFPREG(rs2) + 2]);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 3],
-                                    cpu_fpr[QFPREG(rs2) + 3]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)],
+                                    cpu__fpr[QFPREG(rs2)]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1],
+                                    cpu__fpr[QFPREG(rs2) + 1]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2],
+                                    cpu__fpr[QFPREG(rs2) + 2]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3],
+                                    cpu__fpr[QFPREG(rs2) + 3]);
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
                 case 0x6: /* V9 fnegd */
@@ -2667,7 +2720,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x81: /* V9 fstox */
                     gen_clear_float_exceptions();
-                    gen_helper_fstox(cpu_env, cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_helper_fstox(cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_op_store_DT0_fpr(DFPREG(rd));
                     gen_update_fprs_dirty(DFPREG(rd));
@@ -2692,9 +2746,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                 case 0x84: /* V9 fxtos */
                     gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fxtos(cpu_tmp32, cpu_env);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fxtos(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_tmp32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x88: /* V9 fxtod */
@@ -2738,7 +2793,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1 = get_src1(insn, cpu_src1);
                     tcg_gen_brcondi_tl(gen_tcg_cond_reg[cond], cpu_src1,
                                        0, l1);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_store_fpr_F(dc, rd, cpu_src1_32);
                     gen_update_fprs_dirty(rd);
                     gen_set_label(l1);
                     break;
@@ -2750,8 +2806,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1 = get_src1(insn, cpu_src1);
                     tcg_gen_brcondi_tl(gen_tcg_cond_reg[cond], cpu_src1,
                                        0, l1);
-                    tcg_gen_mov_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu_fpr[DFPREG(rd) + 1], cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)], cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1], cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     gen_set_label(l1);
                     break;
@@ -2764,10 +2820,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1 = get_src1(insn, cpu_src1);
                     tcg_gen_brcondi_tl(gen_tcg_cond_reg[cond], cpu_src1,
                                        0, l1);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd)], cpu_fpr[QFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 1], cpu_fpr[QFPREG(rs2) + 1]);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 2], cpu_fpr[QFPREG(rs2) + 2]);
-                    tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 3], cpu_fpr[QFPREG(rs2) + 3]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)], cpu__fpr[QFPREG(rs2)]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1], cpu__fpr[QFPREG(rs2) + 1]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2], cpu__fpr[QFPREG(rs2) + 2]);
+                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3], cpu__fpr[QFPREG(rs2) + 3]);
                     gen_update_fprs_dirty(QFPREG(rd));
                     gen_set_label(l1);
                     break;
@@ -2786,7 +2842,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                         gen_fcond(r_cond, fcc, cond);                   \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs2]);     \
+                        cpu_src1_32 = gen_load_fpr_F(dc, rs2);          \
+                        gen_store_fpr_F(dc, rd, cpu_src1_32);           \
                         gen_update_fprs_dirty(rd);                      \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -2802,10 +2859,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                         gen_fcond(r_cond, fcc, cond);                   \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu_fpr[DFPREG(rd)],            \
-                                        cpu_fpr[DFPREG(rs2)]);          \
-                        tcg_gen_mov_i32(cpu_fpr[DFPREG(rd) + 1],        \
-                                        cpu_fpr[DFPREG(rs2) + 1]);      \
+                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],           \
+                                        cpu__fpr[DFPREG(rs2)]);         \
+                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],       \
+                                        cpu__fpr[DFPREG(rs2) + 1]);     \
                         gen_update_fprs_dirty(DFPREG(rd));              \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -2821,14 +2878,14 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                         gen_fcond(r_cond, fcc, cond);                   \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd)],            \
-                                        cpu_fpr[QFPREG(rs2)]);          \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 1],        \
-                                        cpu_fpr[QFPREG(rs2) + 1]);      \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 2],        \
-                                        cpu_fpr[QFPREG(rs2) + 2]);      \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 3],        \
-                                        cpu_fpr[QFPREG(rs2) + 3]);      \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)],           \
+                                        cpu__fpr[QFPREG(rs2)]);         \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1],       \
+                                        cpu__fpr[QFPREG(rs2) + 1]);     \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2],       \
+                                        cpu__fpr[QFPREG(rs2) + 2]);     \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3],       \
+                                        cpu__fpr[QFPREG(rs2) + 3]);     \
                         gen_update_fprs_dirty(QFPREG(rd));              \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -2887,7 +2944,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                         gen_cond(r_cond, icc, cond, dc);                \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs2]);     \
+                        cpu_src1_32 = gen_load_fpr_F(dc, rs2);          \
+                        gen_store_fpr_F(dc, rd, cpu_src1_32);           \
                         gen_update_fprs_dirty(rd);                      \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -2903,10 +2961,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                         gen_cond(r_cond, icc, cond, dc);                \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu_fpr[DFPREG(rd)],            \
-                                        cpu_fpr[DFPREG(rs2)]);          \
-                        tcg_gen_mov_i32(cpu_fpr[DFPREG(rd) + 1],        \
-                                        cpu_fpr[DFPREG(rs2) + 1]);      \
+                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],           \
+                                        cpu__fpr[DFPREG(rs2)]);         \
+                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],       \
+                                        cpu__fpr[DFPREG(rs2) + 1]);     \
                         gen_update_fprs_dirty(DFPREG(rd));              \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -2922,14 +2980,14 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                         gen_cond(r_cond, icc, cond, dc);                \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd)],            \
-                                        cpu_fpr[QFPREG(rs2)]);          \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 1],        \
-                                        cpu_fpr[QFPREG(rs2) + 1]);      \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 2],        \
-                                        cpu_fpr[QFPREG(rs2) + 2]);      \
-                        tcg_gen_mov_i32(cpu_fpr[QFPREG(rd) + 3],        \
-                                        cpu_fpr[QFPREG(rs2) + 3]);      \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)],           \
+                                        cpu__fpr[QFPREG(rs2)]);         \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1],       \
+                                        cpu__fpr[QFPREG(rs2) + 1]);     \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2],       \
+                                        cpu__fpr[QFPREG(rs2) + 2]);     \
+                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3],       \
+                                        cpu__fpr[QFPREG(rs2) + 3]);     \
                         gen_update_fprs_dirty(QFPREG(rd));              \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -2960,7 +3018,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
 #undef FMOVQCC
 #endif
                     case 0x51: /* fcmps, V9 %fcc */
-                        gen_op_fcmps(rd & 3, cpu_fpr[rs1], cpu_fpr[rs2]);
+                        cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                        cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                        gen_op_fcmps(rd & 3, cpu_src1_32, cpu_src2_32);
                         break;
                     case 0x52: /* fcmpd, V9 %fcc */
                         gen_op_load_fpr_DT0(DFPREG(rs1));
@@ -2974,7 +3034,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                         gen_op_fcmpq(rd & 3);
                         break;
                     case 0x55: /* fcmpes, V9 %fcc */
-                        gen_op_fcmpes(rd & 3, cpu_fpr[rs1], cpu_fpr[rs2]);
+                        cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                        cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                        gen_op_fcmpes(rd & 3, cpu_src1_32, cpu_src2_32);
                         break;
                     case 0x56: /* fcmped, V9 %fcc */
                         gen_op_load_fpr_DT0(DFPREG(rs1));
@@ -4021,8 +4083,12 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x051: /* VIS I fpadd16s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_helper_fpadd16s(cpu_env, cpu_fpr[rd],
-                                        cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fpadd16s(cpu_dst_32, cpu_env,
+                                        cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x052: /* VIS I fpadd32 */
@@ -4035,8 +4101,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x053: /* VIS I fpadd32s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_helper_fpadd32s(cpu_env, cpu_fpr[rd],
-                                        cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_add_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x054: /* VIS I fpsub16 */
@@ -4049,8 +4118,12 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x055: /* VIS I fpsub16s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_helper_fpsub16s(cpu_env, cpu_fpr[rd],
-                                        cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fpsub16s(cpu_dst_32, cpu_env,
+                                        cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x056: /* VIS I fpsub32 */
@@ -4063,169 +4136,222 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x057: /* VIS I fpsub32s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_helper_fpsub32s(cpu_env, cpu_fpr[rd],
-                                        cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_sub_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x060: /* VIS I fzero */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_movi_i32(cpu_fpr[DFPREG(rd)], 0);
-                    tcg_gen_movi_i32(cpu_fpr[DFPREG(rd) + 1], 0);
+                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd)], 0);
+                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd) + 1], 0);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x061: /* VIS I fzeros */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_movi_i32(cpu_fpr[rd], 0);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_movi_i32(cpu_dst_32, 0);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x062: /* VIS I fnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_nor_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                    cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_nor_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_nor_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_nor_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x063: /* VIS I fnors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_nor_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_nor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x064: /* VIS I fandnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_andc_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                     cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_andc_i32(cpu_fpr[DFPREG(rd) + 1],
-                                     cpu_fpr[DFPREG(rs1) + 1],
-                                     cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd)],
+                                     cpu__fpr[DFPREG(rs1)],
+                                     cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd) + 1],
+                                     cpu__fpr[DFPREG(rs1) + 1],
+                                     cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x065: /* VIS I fandnot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_andc_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_andc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x066: /* VIS I fnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_not_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_not_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x067: /* VIS I fnot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_not_i32(cpu_fpr[rd], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x068: /* VIS I fandnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_andc_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs2)],
-                                     cpu_fpr[DFPREG(rs1)]);
-                    tcg_gen_andc_i32(cpu_fpr[DFPREG(rd) + 1],
-                                     cpu_fpr[DFPREG(rs2) + 1],
-                                     cpu_fpr[DFPREG(rs1) + 1]);
+                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd)],
+                                     cpu__fpr[DFPREG(rs2)],
+                                     cpu__fpr[DFPREG(rs1)]);
+                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd) + 1],
+                                     cpu__fpr[DFPREG(rs2) + 1],
+                                     cpu__fpr[DFPREG(rs1) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x069: /* VIS I fandnot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_andc_i32(cpu_fpr[rd], cpu_fpr[rs2], cpu_fpr[rs1]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_andc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x06a: /* VIS I fnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_not_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)]);
-                    tcg_gen_not_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1]);
+                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)]);
+                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x06b: /* VIS I fnot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_not_i32(cpu_fpr[rd], cpu_fpr[rs1]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x06c: /* VIS I fxor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_xor_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                    cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_xor_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_xor_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_xor_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x06d: /* VIS I fxors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_xor_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_xor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x06e: /* VIS I fnand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_nand_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                     cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_nand_i32(cpu_fpr[DFPREG(rd) + 1],
-                                     cpu_fpr[DFPREG(rs1) + 1],
-                                     cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_nand_i32(cpu__fpr[DFPREG(rd)],
+                                     cpu__fpr[DFPREG(rs1)],
+                                     cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_nand_i32(cpu__fpr[DFPREG(rd) + 1],
+                                     cpu__fpr[DFPREG(rs1) + 1],
+                                     cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x06f: /* VIS I fnands */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_nand_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_nand_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x070: /* VIS I fand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_and_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                    cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_and_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_and_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_and_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x071: /* VIS I fands */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_and_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_and_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x072: /* VIS I fxnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_xori_i32(cpu_tmp32, cpu_fpr[DFPREG(rs2)], -1);
-                    tcg_gen_xor_i32(cpu_fpr[DFPREG(rd)], cpu_tmp32,
-                                    cpu_fpr[DFPREG(rs1)]);
-                    tcg_gen_xori_i32(cpu_tmp32, cpu_fpr[DFPREG(rs2) + 1], -1);
-                    tcg_gen_xor_i32(cpu_fpr[DFPREG(rd) + 1], cpu_tmp32,
-                                    cpu_fpr[DFPREG(rs1) + 1]);
+                    tcg_gen_eqv_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_eqv_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x073: /* VIS I fxnors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_xori_i32(cpu_tmp32, cpu_fpr[rs2], -1);
-                    tcg_gen_xor_i32(cpu_fpr[rd], cpu_tmp32, cpu_fpr[rs1]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_eqv_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x074: /* VIS I fsrc1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_mov_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)]);
-                    tcg_gen_mov_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1]);
+                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)]);
+                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x075: /* VIS I fsrc1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs1]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    gen_store_fpr_F(dc, rd, cpu_src1_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x076: /* VIS I fornot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_orc_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                    cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_orc_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs1)],
+                                    cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x077: /* VIS I fornot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_orc_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_orc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x078: /* VIS I fsrc2 */
@@ -4236,46 +4362,59 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x079: /* VIS I fsrc2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
+                    gen_store_fpr_F(dc, rd, cpu_src1_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x07a: /* VIS I fornot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_orc_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs2)],
-                                    cpu_fpr[DFPREG(rs1)]);
-                    tcg_gen_orc_i32(cpu_fpr[DFPREG(rd) + 1],
-                                    cpu_fpr[DFPREG(rs2) + 1],
-                                    cpu_fpr[DFPREG(rs1) + 1]);
+                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd)],
+                                    cpu__fpr[DFPREG(rs2)],
+                                    cpu__fpr[DFPREG(rs1)]);
+                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd) + 1],
+                                    cpu__fpr[DFPREG(rs2) + 1],
+                                    cpu__fpr[DFPREG(rs1) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x07b: /* VIS I fornot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_orc_i32(cpu_fpr[rd], cpu_fpr[rs2], cpu_fpr[rs1]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_orc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x07c: /* VIS I for */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_or_i32(cpu_fpr[DFPREG(rd)], cpu_fpr[DFPREG(rs1)],
-                                   cpu_fpr[DFPREG(rs2)]);
-                    tcg_gen_or_i32(cpu_fpr[DFPREG(rd) + 1],
-                                   cpu_fpr[DFPREG(rs1) + 1],
-                                   cpu_fpr[DFPREG(rs2) + 1]);
+                    tcg_gen_or_i32(cpu__fpr[DFPREG(rd)],
+                                   cpu__fpr[DFPREG(rs1)],
+                                   cpu__fpr[DFPREG(rs2)]);
+                    tcg_gen_or_i32(cpu__fpr[DFPREG(rd) + 1],
+                                   cpu__fpr[DFPREG(rs1) + 1],
+                                   cpu__fpr[DFPREG(rs2) + 1]);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x07d: /* VIS I fors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_or_i32(cpu_fpr[rd], cpu_fpr[rs1], cpu_fpr[rs2]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_or_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x07e: /* VIS I fone */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_movi_i32(cpu_fpr[DFPREG(rd)], -1);
-                    tcg_gen_movi_i32(cpu_fpr[DFPREG(rd) + 1], -1);
+                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd)], -1);
+                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd) + 1], -1);
                     gen_update_fprs_dirty(DFPREG(rd));
                     break;
                 case 0x07f: /* VIS I fones */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_movi_i32(cpu_fpr[rd], -1);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_movi_i32(cpu_dst_32, -1);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x080: /* VIS I shutdown */
@@ -4660,7 +4799,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                 case 0x20:      /* ldf, load fpreg */
                     gen_address_mask(dc, cpu_addr);
                     tcg_gen_qemu_ld32u(cpu_tmp0, cpu_addr, dc->mem_idx);
-                    tcg_gen_trunc_tl_i32(cpu_fpr[rd], cpu_tmp0);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    tcg_gen_trunc_tl_i32(cpu_dst_32, cpu_tmp0);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
                     gen_update_fprs_dirty(rd);
                     break;
                 case 0x21:      /* ldfsr, V9 ldxfsr */
@@ -4812,7 +4953,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                 switch (xop) {
                 case 0x24: /* stf, store fpreg */
                     gen_address_mask(dc, cpu_addr);
-                    tcg_gen_ext_i32_tl(cpu_tmp0, cpu_fpr[rd]);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rd);
+                    tcg_gen_ext_i32_tl(cpu_tmp0, cpu_src1_32);
                     tcg_gen_qemu_st32(cpu_tmp0, cpu_addr, dc->mem_idx);
                     break;
                 case 0x25: /* stfsr, V9 stxfsr */
@@ -5246,9 +5388,9 @@ void gen_intermediate_code_init(CPUSPARCState *env)
                                               offsetof(CPUState, gregs[i]),
                                               gregnames[i]);
         for (i = 0; i < TARGET_FPREGS; i++)
-            cpu_fpr[i] = tcg_global_mem_new_i32(TCG_AREG0,
-                                                offsetof(CPUState, fpr[i]),
-                                                fregnames[i]);
+            cpu__fpr[i] = tcg_global_mem_new_i32(TCG_AREG0,
+                                                 offsetof(CPUState, fpr[i]),
+                                                 fregnames[i]);
 
         /* register helpers */
 
-- 
1.7.6.4
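
Reader's note, not part of the patch as posted: the VIS hunks above
swap the two-step fxnor sequence (xori with -1, then xor) for a single
eqv, and they keep the swapped operand order for the "not1" variants.
The sketch below models those TCG ops as ordinary C functions on
uint32_t so the identities can be checked by hand.  The function names
are hypothetical.  The real code emits TCG ops at translation time
rather than computing values directly.

```c
#include <assert.h>
#include <stdint.h>

/* Plain-C models of the TCG logical ops (illustrative names only). */
static inline uint32_t op_andc(uint32_t a, uint32_t b) { return a & ~b; }
static inline uint32_t op_orc(uint32_t a, uint32_t b)  { return a | ~b; }
static inline uint32_t op_eqv(uint32_t a, uint32_t b)  { return ~(a ^ b); }

/* Removed sequence for fxnors: tmp = rs2 ^ -1; rd = tmp ^ rs1. */
static inline uint32_t old_fxnors(uint32_t rs1, uint32_t rs2)
{
    uint32_t tmp = rs2 ^ UINT32_C(0xffffffff);
    return tmp ^ rs1;
}

/* Replacement for fxnors: a single eqv of the two sources. */
static inline uint32_t new_fxnors(uint32_t rs1, uint32_t rs2)
{
    return op_eqv(rs1, rs2);
}

/* fandnot2s complements rs2; fandnot1s passes rs2 first, so rs1 is
 * the complemented operand.  fornot1s follows the same swap with orc. */
static inline uint32_t fandnot2s(uint32_t rs1, uint32_t rs2)
{
    return op_andc(rs1, rs2);
}

static inline uint32_t fandnot1s(uint32_t rs1, uint32_t rs2)
{
    return op_andc(rs2, rs1);
}

static inline uint32_t fornot1s(uint32_t rs1, uint32_t rs2)
{
    return op_orc(rs2, rs1);
}

/* Compare the old and new fxnors forms over a few sample patterns. */
static inline void check_fxnors_equivalence(void)
{
    static const uint32_t samples[] = {
        0x00000000u, 0xffffffffu, 0x0f0f0f0fu, 0x12345678u, 0x80000001u
    };
    const unsigned int n = sizeof(samples) / sizeof(samples[0]);

    for (unsigned int i = 0; i < n; i++) {
        for (unsigned int j = 0; j < n; j++) {
            assert(old_fxnors(samples[i], samples[j]) ==
                   new_fxnors(samples[i], samples[j]));
        }
    }
}
```

Calling check_fxnors_equivalence() from a test harness exercises the
identity (~b) ^ a == ~(a ^ b), which is why the cpu_tmp32 temporary is
no longer needed in those two cases.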

* [Qemu-devel] [PATCH 02/21] target-sparc: Mark fprs dirty in store accessor.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 01/21] target-sparc: Add accessors for single-precision fpr access Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 03/21] target-sparc: Add accessors for double-precision fpr access Richard Henderson
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/translate.c |   54 ++++++---------------------------------------
 1 files changed, 8 insertions(+), 46 deletions(-)
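
Reader's note, not part of the patch as posted: the diff below moves
gen_update_fprs_dirty() ahead of the accessors and calls it from
gen_store_fpr_F(), so the per-instruction calls after each
single-precision store can be dropped.  The sketch below is a minimal
plain-C model of that idea under stated assumptions: FpuModel,
store_fpr_single, and NUM_FPREGS are hypothetical names, the model
writes values directly instead of emitting TCG ops, and it always
tracks the dirty bits, whereas the real helper does so only under
TARGET_SPARC64.  The bit values 1 and 2 are taken from the patch and
correspond to the FPRS DL and DU bits.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_FPREGS 64u  /* assumption: 32-bit register slots modelled */
#define FPRS_DL    1u   /* dirty lower: slots 0..31 written */
#define FPRS_DU    2u   /* dirty upper: slots 32..63 written */

/* Hypothetical stand-in for the FP part of the CPU state. */
typedef struct {
    uint32_t fpr[NUM_FPREGS];
    uint32_t fprs;
} FpuModel;

/* Mirrors the expression in gen_update_fprs_dirty(): (rd < 32) ? 1 : 2. */
static inline void update_fprs_dirty(FpuModel *m, unsigned int rd)
{
    m->fprs |= (rd < 32) ? FPRS_DL : FPRS_DU;
}

/* Store accessor that marks the register file dirty itself, so callers
 * cannot forget the bookkeeping step. */
static inline void store_fpr_single(FpuModel *m, unsigned int dst, uint32_t v)
{
    assert(dst < NUM_FPREGS);
    m->fpr[dst] = v;
    update_fprs_dirty(m, dst);
}
```

With the bookkeeping inside the accessor, a caller such as the fmovs
case reduces to a load followed by a store, which matches the shape of
the hunks that only delete gen_update_fprs_dirty(rd) lines.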

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 19f41b7..3dd72ab 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -114,6 +114,13 @@ static int sign_extend(int x, int len)
 
 #define IS_IMM (insn & (1<<13))
 
+static inline void gen_update_fprs_dirty(int rd)
+{
+#if defined(TARGET_SPARC64)
+    tcg_gen_ori_i32(cpu_fprs, cpu_fprs, (rd < 32) ? 1 : 2);
+#endif
+}
+
 /* floating point registers moves */
 static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
 {
@@ -123,6 +130,7 @@ static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
 static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
 {
     tcg_gen_mov_i32 (cpu__fpr[dst], v);
+    gen_update_fprs_dirty(dst);
 }
 
 static TCGv_i32 gen_dest_fpr_F(void)
@@ -1585,13 +1593,6 @@ static int gen_trap_ifnofpu(DisasContext *dc, TCGv r_cond)
     return 0;
 }
 
-static inline void gen_update_fprs_dirty(int rd)
-{
-#if defined(TARGET_SPARC64)
-    tcg_gen_ori_i32(cpu_fprs, cpu_fprs, (rd < 32) ? 1 : 2);
-#endif
-}
-
 static inline void gen_op_clear_ieee_excp_and_FTT(void)
 {
     tcg_gen_andi_tl(cpu_fsr, cpu_fsr, FSR_FTT_CEXC_NMASK);
@@ -2387,21 +2388,18 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                 case 0x1: /* fmovs */
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
                     gen_store_fpr_F(dc, rd, cpu_src1_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x5: /* fnegs */
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
                     gen_helper_fnegs(cpu_dst_32, cpu_src1_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x9: /* fabss */
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
                     gen_helper_fabss(cpu_dst_32, cpu_src1_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x29: /* fsqrts */
                     CHECK_FPU_FEATURE(dc, FSQRT);
@@ -2411,7 +2409,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_helper_fsqrts(cpu_dst_32, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x2a: /* fsqrtd */
                     CHECK_FPU_FEATURE(dc, FSQRT);
@@ -2440,7 +2437,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                                      cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x42: /* faddd */
                     gen_op_load_fpr_DT0(DFPREG(rs1));
@@ -2470,7 +2466,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                                      cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x46: /* fsubd */
                     gen_op_load_fpr_DT0(DFPREG(rs1));
@@ -2501,7 +2496,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                                      cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x4a: /* fmuld */
                     CHECK_FPU_FEATURE(dc, FMUL);
@@ -2533,7 +2527,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                                      cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x4e: /* fdivd */
                     gen_op_load_fpr_DT0(DFPREG(rs1));
@@ -2581,7 +2574,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_helper_fitos(cpu_dst_32, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0xc6: /* fdtos */
                     gen_op_load_fpr_DT1(DFPREG(rs2));
@@ -2590,7 +2582,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_helper_fdtos(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0xc7: /* fqtos */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2600,7 +2591,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_helper_fqtos(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0xc8: /* fitod */
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
@@ -2651,7 +2641,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_helper_fstoi(cpu_dst_32, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0xd2: /* fdtoi */
                     gen_op_load_fpr_DT1(DFPREG(rs2));
@@ -2660,7 +2649,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_helper_fdtoi(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0xd3: /* fqtoi */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2670,7 +2658,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_helper_fqtoi(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
 #ifdef TARGET_SPARC64
                 case 0x2: /* V9 fmovd */
@@ -2750,7 +2737,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_helper_fxtos(cpu_dst_32, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x88: /* V9 fxtod */
                     gen_op_load_fpr_DT1(DFPREG(rs2));
@@ -2795,7 +2781,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                                        0, l1);
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
                     gen_store_fpr_F(dc, rd, cpu_src1_32);
-                    gen_update_fprs_dirty(rd);
                     gen_set_label(l1);
                     break;
                 } else if ((xop & 0x11f) == 0x006) { // V9 fmovdr
@@ -2844,7 +2829,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                                            0, l1);                      \
                         cpu_src1_32 = gen_load_fpr_F(dc, rs2);          \
                         gen_store_fpr_F(dc, rd, cpu_src1_32);           \
-                        gen_update_fprs_dirty(rd);                      \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
                     }
@@ -2946,7 +2930,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                                            0, l1);                      \
                         cpu_src1_32 = gen_load_fpr_F(dc, rs2);          \
                         gen_store_fpr_F(dc, rd, cpu_src1_32);           \
-                        gen_update_fprs_dirty(rd);                      \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
                     }
@@ -4089,7 +4072,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_helper_fpadd16s(cpu_dst_32, cpu_env,
                                         cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x052: /* VIS I fpadd32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4106,7 +4088,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_add_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x054: /* VIS I fpsub16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4124,7 +4105,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_helper_fpsub16s(cpu_dst_32, cpu_env,
                                         cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x056: /* VIS I fpsub32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4141,7 +4121,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_sub_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x060: /* VIS I fzero */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4154,7 +4133,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_movi_i32(cpu_dst_32, 0);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x062: /* VIS I fnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4173,7 +4151,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_nor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x064: /* VIS I fandnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4192,7 +4169,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_andc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x066: /* VIS I fnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4208,7 +4184,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x068: /* VIS I fandnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4227,7 +4202,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_andc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x06a: /* VIS I fnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4243,7 +4217,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x06c: /* VIS I fxor */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4262,7 +4235,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_xor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x06e: /* VIS I fnand */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4281,7 +4253,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_nand_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x070: /* VIS I fand */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4300,7 +4271,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_and_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x072: /* VIS I fxnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4319,7 +4289,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_eqv_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x074: /* VIS I fsrc1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4333,7 +4302,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_32 = gen_load_fpr_F(dc, rs1);
                     gen_store_fpr_F(dc, rd, cpu_src1_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x076: /* VIS I fornot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4352,7 +4320,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_orc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x078: /* VIS I fsrc2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4364,7 +4331,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
                     gen_store_fpr_F(dc, rd, cpu_src1_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x07a: /* VIS I fornot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4383,7 +4349,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_orc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x07c: /* VIS I for */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4402,7 +4367,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_or_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x07e: /* VIS I fone */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4415,7 +4379,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_movi_i32(cpu_dst_32, -1);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x080: /* VIS I shutdown */
                 case 0x081: /* VIS II siam */
@@ -4802,7 +4765,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_dst_32 = gen_dest_fpr_F();
                     tcg_gen_trunc_tl_i32(cpu_dst_32, cpu_tmp0);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
-                    gen_update_fprs_dirty(rd);
                     break;
                 case 0x21:      /* ldfsr, V9 ldxfsr */
 #ifdef TARGET_SPARC64
-- 
1.7.6.4

* [Qemu-devel] [PATCH 03/21] target-sparc: Add accessors for double-precision fpr access.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 01/21] target-sparc: Add accessors for single-precision fpr access Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 02/21] target-sparc: Mark fprs dirty in store accessor Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 04/21] target-sparc: Pass float64 parameters instead of dt0/1 temporaries Richard Henderson
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Begin using i64 quantities to manipulate double-precision values.
On a 64-bit host this will, for the moment, generate less efficient
code; on a 32-bit host code quality should be largely unchanged.
Code quality for 64-bit will be adjusted with a subsequent patch.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
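For readers following along, here is a minimal host-side sketch of the
register-pair layout the new accessors assume: the even register holds
the high word of the double and the odd register holds the low word.
This is plain C, not TCG; the names fpr, load_fpr_d and store_fpr_d are
hypothetical stand-ins for cpu__fpr, gen_load_fpr_D and gen_store_fpr_D,
and the array size of 64 is an assumption made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the guest's 32-bit FP register file. */
#define NB_FPR 64
static uint32_t fpr[NB_FPR];

/* Combine fpr[src] (high word) and fpr[src + 1] (low word) into one
 * 64-bit value: zero-extend both halves, shift the high one, OR them. */
static uint64_t load_fpr_d(unsigned int src)
{
    assert(src + 1 < NB_FPR);
    return ((uint64_t)fpr[src] << 32) | (uint64_t)fpr[src + 1];
}

/* Split a 64-bit value back into the even/odd register pair:
 * truncate for the low word, shift right by 32 for the high word. */
static void store_fpr_d(unsigned int dst, uint64_t v)
{
    assert(dst + 1 < NB_FPR);
    fpr[dst + 1] = (uint32_t)v;
    fpr[dst] = (uint32_t)(v >> 32);
}
```

Under these assumptions, store_fpr_d(4, load_fpr_d(2)) copies the pair
fpr[2]/fpr[3] to fpr[4]/fpr[5], which is the shape of the fmovd case in
the diff below.
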
 target-sparc/translate.c |  242 +++++++++++++++++++++++++---------------------
 1 files changed, 130 insertions(+), 112 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 3dd72ab..bea93af 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -82,6 +82,8 @@ typedef struct DisasContext {
     uint32_t cc_op;  /* current CC operation */
     struct TranslationBlock *tb;
     sparc_def_t *def;
+    TCGv_i64 t64[3];
+    int n_t64;
 } DisasContext;
 
 // This function uses non-native bit order
@@ -129,7 +131,7 @@ static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
 
 static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
 {
-    tcg_gen_mov_i32 (cpu__fpr[dst], v);
+    tcg_gen_mov_i32(cpu__fpr[dst], v);
     gen_update_fprs_dirty(dst);
 }
 
@@ -138,6 +140,52 @@ static TCGv_i32 gen_dest_fpr_F(void)
     return cpu_tmp32;
 }
 
+static TCGv_i64 gen_load_fpr_D(DisasContext *dc, unsigned int src)
+{
+    TCGv_i64 ret = tcg_temp_new_i64();
+    src = DFPREG(src);
+
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_mov_i32(TCGV_HIGH(ret), cpu__fpr[src]);
+    tcg_gen_mov_i32(TCGV_LOW(ret), cpu__fpr[src + 1]);
+#else
+    {
+        TCGv_i64 t = tcg_temp_new_i64();
+        tcg_gen_extu_i32_i64(ret, cpu__fpr[src]);
+        tcg_gen_extu_i32_i64(t, cpu__fpr[src + 1]);
+        tcg_gen_shli_i64(ret, ret, 32);
+        tcg_gen_or_i64(ret, ret, t);
+        tcg_temp_free_i64(t);
+    }
+#endif
+
+    dc->t64[dc->n_t64++] = ret;
+    assert(dc->n_t64 <= ARRAY_SIZE(dc->t64));
+
+    return ret;
+}
+
+static void gen_store_fpr_D(DisasContext *dc, unsigned int dst, TCGv_i64 v)
+{
+    dst = DFPREG(dst);
+
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_mov_i32(cpu__fpr[dst], TCGV_HIGH(v));
+    tcg_gen_mov_i32(cpu__fpr[dst + 1], TCGV_LOW(v));
+#else
+    tcg_gen_trunc_i64_i32(cpu__fpr[dst + 1], v);
+    tcg_gen_shri_i64(v, v, 32);
+    tcg_gen_trunc_i64_i32(cpu__fpr[dst], v);
+#endif
+
+    gen_update_fprs_dirty(dst);
+}
+
+static TCGv_i64 gen_dest_fpr_D(void)
+{
+    return cpu_tmp64;
+}
+
 static void gen_op_load_fpr_DT0(unsigned int src)
 {
     tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, dt0) +
@@ -1909,6 +1957,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
     unsigned int opc, rs1, rs2, rd;
     TCGv cpu_src1, cpu_src2, cpu_tmp1, cpu_tmp2;
     TCGv_i32 cpu_src1_32, cpu_src2_32, cpu_dst_32;
+    TCGv_i64 cpu_src1_64, cpu_src2_64, cpu_dst_64;
     target_long simm;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP)))
@@ -2661,11 +2710,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
 #ifdef TARGET_SPARC64
                 case 0x2: /* V9 fmovd */
-                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    gen_store_fpr_D(dc, rd, cpu_src1_64);
                     break;
                 case 0x3: /* V9 fmovq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2791,9 +2837,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1 = get_src1(insn, cpu_src1);
                     tcg_gen_brcondi_tl(gen_tcg_cond_reg[cond], cpu_src1,
                                        0, l1);
-                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)], cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1], cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    gen_store_fpr_D(dc, rd, cpu_src1_64);
                     gen_set_label(l1);
                     break;
                 } else if ((xop & 0x11f) == 0x007) { // V9 fmovqr
@@ -2843,11 +2888,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                         gen_fcond(r_cond, fcc, cond);                   \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],           \
-                                        cpu__fpr[DFPREG(rs2)]);         \
-                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],       \
-                                        cpu__fpr[DFPREG(rs2) + 1]);     \
-                        gen_update_fprs_dirty(DFPREG(rd));              \
+                        cpu_src1_64 = gen_load_fpr_D(dc, rs2);          \
+                        gen_store_fpr_D(dc, rd, cpu_src1_64);           \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
                     }
@@ -2944,10 +2986,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                         gen_cond(r_cond, icc, cond, dc);                \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],           \
-                                        cpu__fpr[DFPREG(rs2)]);         \
-                        tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],       \
-                                        cpu__fpr[DFPREG(rs2) + 1]);     \
+                        cpu_src1_64 = gen_load_fpr_D(dc, rs2);          \
+                        gen_store_fpr_D(dc, rd, cpu_src1_64);           \
                         gen_update_fprs_dirty(DFPREG(rd));              \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
@@ -4124,9 +4164,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x060: /* VIS I fzero */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd)], 0);
-                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd) + 1], 0);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_movi_i64(cpu_dst_64, 0);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x061: /* VIS I fzeros */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4136,13 +4176,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x062: /* VIS I fnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_nor_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_nor_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_nor_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x063: /* VIS I fnors */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4154,13 +4192,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x064: /* VIS I fandnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd)],
-                                     cpu__fpr[DFPREG(rs1)],
-                                     cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd) + 1],
-                                     cpu__fpr[DFPREG(rs1) + 1],
-                                     cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_andc_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x065: /* VIS I fandnot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4172,11 +4208,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x066: /* VIS I fnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_not_i64(cpu_dst_64, cpu_src1_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x067: /* VIS I fnot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4187,13 +4222,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x068: /* VIS I fandnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd)],
-                                     cpu__fpr[DFPREG(rs2)],
-                                     cpu__fpr[DFPREG(rs1)]);
-                    tcg_gen_andc_i32(cpu__fpr[DFPREG(rd) + 1],
-                                     cpu__fpr[DFPREG(rs2) + 1],
-                                     cpu__fpr[DFPREG(rs1) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_andc_i64(cpu_dst_64, cpu_src2_64, cpu_src1_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x069: /* VIS I fandnot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4205,11 +4238,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x06a: /* VIS I fnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)]);
-                    tcg_gen_not_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_not_i64(cpu_dst_64, cpu_src1_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x06b: /* VIS I fnot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4220,13 +4252,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x06c: /* VIS I fxor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_xor_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_xor_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_xor_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x06d: /* VIS I fxors */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4238,13 +4268,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x06e: /* VIS I fnand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_nand_i32(cpu__fpr[DFPREG(rd)],
-                                     cpu__fpr[DFPREG(rs1)],
-                                     cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_nand_i32(cpu__fpr[DFPREG(rd) + 1],
-                                     cpu__fpr[DFPREG(rs1) + 1],
-                                     cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_nand_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x06f: /* VIS I fnands */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4256,13 +4284,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x070: /* VIS I fand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_and_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_and_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_and_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x071: /* VIS I fands */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4274,13 +4300,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x072: /* VIS I fxnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_eqv_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_eqv_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_eqv_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x073: /* VIS I fxnors */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4292,11 +4316,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x074: /* VIS I fsrc1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)]);
-                    tcg_gen_mov_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    gen_store_fpr_D(dc, rd, cpu_src1_64);
                     break;
                 case 0x075: /* VIS I fsrc1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4305,13 +4326,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x076: /* VIS I fornot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs1)],
-                                    cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_orc_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x077: /* VIS I fornot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4323,9 +4342,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x078: /* VIS I fsrc2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs2));
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    gen_store_fpr_D(dc, rd, cpu_src1_64);
                     break;
                 case 0x079: /* VIS I fsrc2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4334,13 +4352,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x07a: /* VIS I fornot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd)],
-                                    cpu__fpr[DFPREG(rs2)],
-                                    cpu__fpr[DFPREG(rs1)]);
-                    tcg_gen_orc_i32(cpu__fpr[DFPREG(rd) + 1],
-                                    cpu__fpr[DFPREG(rs2) + 1],
-                                    cpu__fpr[DFPREG(rs1) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_orc_i64(cpu_dst_64, cpu_src2_64, cpu_src1_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x07b: /* VIS I fornot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4352,13 +4368,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x07c: /* VIS I for */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_or_i32(cpu__fpr[DFPREG(rd)],
-                                   cpu__fpr[DFPREG(rs1)],
-                                   cpu__fpr[DFPREG(rs2)]);
-                    tcg_gen_or_i32(cpu__fpr[DFPREG(rd) + 1],
-                                   cpu__fpr[DFPREG(rs1) + 1],
-                                   cpu__fpr[DFPREG(rs2) + 1]);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_or_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x07d: /* VIS I fors */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4370,9 +4384,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x07e: /* VIS I fone */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd)], -1);
-                    tcg_gen_movi_i32(cpu__fpr[DFPREG(rd) + 1], -1);
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_movi_i64(cpu_dst_64, -1);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x07f: /* VIS I fones */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -5203,6 +5217,10 @@ static inline void gen_intermediate_code_internal(TranslationBlock * tb,
     tcg_temp_free_i64(cpu_tmp64);
     tcg_temp_free_i32(cpu_tmp32);
     tcg_temp_free(cpu_tmp0);
+    for (j = dc->n_t64 - 1; j >= 0; --j) {
+        tcg_temp_free_i64(dc->t64[j]);
+    }
+
     if (tb->cflags & CF_LAST_IO)
         gen_io_end();
     if (!dc->is_br) {
-- 
1.7.6.4


* [Qemu-devel] [PATCH 04/21] target-sparc: Pass float64 parameters instead of dt0/1 temporaries.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (2 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 03/21] target-sparc: Add accessors for double-precision fpr access Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 20:04   ` Blue Swirl
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 05/21] target-sparc: Make VIS helpers const when possible Richard Henderson
                   ` (17 subsequent siblings)
  21 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
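For readers skimming the diff, the change in helper calling convention can be
sketched roughly as follows. This is an illustration only, not code from the
patch: DemoEnv, demo_faddd_old, demo_faddd_new and demo_conventions_agree are
hypothetical names, and plain double stands in for the softfloat float64 type
and for the fp_status handling that the real helpers use.

```c
#include <assert.h>

/* Hypothetical stand-in for CPUSPARCState; only the fields needed here. */
typedef struct DemoEnv {
    double dt0, dt1;   /* old-style scratch slots for double operands */
    int status;        /* placeholder for the real fp_status field */
} DemoEnv;

/* Old convention: operands and result travel through env->dt0/dt1,
 * so the translator must store both inputs to memory before the call
 * and load the result back afterwards. */
void demo_faddd_old(DemoEnv *env)
{
    env->dt0 = env->dt0 + env->dt1;
}

/* New convention: operands are ordinary parameters and the result is
 * the return value, so no round trip through env is required. */
double demo_faddd_new(DemoEnv *env, double src1, double src2)
{
    (void)env;   /* a real helper would still use &env->fp_status */
    return src1 + src2;
}

/* Runs both conventions on the same inputs; returns 1 if they agree. */
int demo_conventions_agree(double a, double b)
{
    DemoEnv env = { 0 };

    env.dt0 = a;
    env.dt1 = b;
    demo_faddd_old(&env);
    return env.dt0 == demo_faddd_new(&env, a, b);
}
```

In the patch itself the same idea shows up on the translator side as
gen_load_fpr_D() / gen_dest_fpr_D() / gen_store_fpr_D() around each helper
call, in place of the gen_op_load_fpr_DT0/DT1 and gen_op_store_DT0_fpr
sequences.
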
 target-sparc/cpu.h         |    1 -
 target-sparc/fop_helper.c  |  120 ++++++------
 target-sparc/helper.h      |   95 +++++-----
 target-sparc/ldst_helper.c |   52 -----
 target-sparc/translate.c   |  449 ++++++++++++++++++++++----------------------
 target-sparc/vis_helper.c  |  113 ++++++------
 6 files changed, 381 insertions(+), 449 deletions(-)

diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h
index 99370d5..a4419a5 100644
--- a/target-sparc/cpu.h
+++ b/target-sparc/cpu.h
@@ -463,7 +463,6 @@ typedef struct CPUSPARCState {
     uint64_t prom_addr;
 #endif
     /* temporary float registers */
-    float64 dt0, dt1;
     float128 qt0, qt1;
     float_status fp_status;
 #if defined(TARGET_SPARC64)
diff --git a/target-sparc/fop_helper.c b/target-sparc/fop_helper.c
index 23502f3..f6348c2 100644
--- a/target-sparc/fop_helper.c
+++ b/target-sparc/fop_helper.c
@@ -20,8 +20,6 @@
 #include "cpu.h"
 #include "helper.h"
 
-#define DT0 (env->dt0)
-#define DT1 (env->dt1)
 #define QT0 (env->qt0)
 #define QT1 (env->qt1)
 
@@ -33,9 +31,10 @@
     {                                                           \
         return float32_ ## name (src1, src2, &env->fp_status);  \
     }                                                           \
-    F_HELPER(name, d)                                           \
+    float64 helper_f ## name ## d (CPUState * env, float64 src1,\
+                                   float64 src2)                \
     {                                                           \
-        DT0 = float64_ ## name (DT0, DT1, &env->fp_status);     \
+        return float64_ ## name (src1, src2, &env->fp_status);  \
     }                                                           \
     F_HELPER(name, q)                                           \
     {                                                           \
@@ -48,17 +47,17 @@ F_BINOP(mul);
 F_BINOP(div);
 #undef F_BINOP
 
-void helper_fsmuld(CPUState *env, float32 src1, float32 src2)
+float64 helper_fsmuld(CPUState *env, float32 src1, float32 src2)
 {
-    DT0 = float64_mul(float32_to_float64(src1, &env->fp_status),
-                      float32_to_float64(src2, &env->fp_status),
-                      &env->fp_status);
+    return float64_mul(float32_to_float64(src1, &env->fp_status),
+                       float32_to_float64(src2, &env->fp_status),
+                       &env->fp_status);
 }
 
-void helper_fdmulq(CPUState *env)
+void helper_fdmulq(CPUState *env, float64 src1, float64 src2)
 {
-    QT0 = float128_mul(float64_to_float128(DT0, &env->fp_status),
-                       float64_to_float128(DT1, &env->fp_status),
+    QT0 = float128_mul(float64_to_float128(src1, &env->fp_status),
+                       float64_to_float128(src2, &env->fp_status),
                        &env->fp_status);
 }
 
@@ -68,9 +67,9 @@ float32 helper_fnegs(float32 src)
 }
 
 #ifdef TARGET_SPARC64
-F_HELPER(neg, d)
+float64 helper_fnegd(float64 src)
 {
-    DT0 = float64_chs(DT1);
+    return float64_chs(src);
 }
 
 F_HELPER(neg, q)
@@ -85,9 +84,9 @@ float32 helper_fitos(CPUState *env, int32_t src)
     return int32_to_float32(src, &env->fp_status);
 }
 
-void helper_fitod(CPUState *env, int32_t src)
+float64 helper_fitod(CPUState *env, int32_t src)
 {
-    DT0 = int32_to_float64(src, &env->fp_status);
+    return int32_to_float64(src, &env->fp_status);
 }
 
 void helper_fitoq(CPUState *env, int32_t src)
@@ -96,32 +95,32 @@ void helper_fitoq(CPUState *env, int32_t src)
 }
 
 #ifdef TARGET_SPARC64
-float32 helper_fxtos(CPUState *env)
+float32 helper_fxtos(CPUState *env, int64_t src)
 {
-    return int64_to_float32(*((int64_t *)&DT1), &env->fp_status);
+    return int64_to_float32(src, &env->fp_status);
 }
 
-F_HELPER(xto, d)
+float64 helper_fxtod(CPUState *env, int64_t src)
 {
-    DT0 = int64_to_float64(*((int64_t *)&DT1), &env->fp_status);
+    return int64_to_float64(src, &env->fp_status);
 }
 
-F_HELPER(xto, q)
+void helper_fxtoq(CPUState *env, int64_t src)
 {
-    QT0 = int64_to_float128(*((int64_t *)&DT1), &env->fp_status);
+    QT0 = int64_to_float128(src, &env->fp_status);
 }
 #endif
 #undef F_HELPER
 
 /* floating point conversion */
-float32 helper_fdtos(CPUState *env)
+float32 helper_fdtos(CPUState *env, float64 src)
 {
-    return float64_to_float32(DT1, &env->fp_status);
+    return float64_to_float32(src, &env->fp_status);
 }
 
-void helper_fstod(CPUState *env, float32 src)
+float64 helper_fstod(CPUState *env, float32 src)
 {
-    DT0 = float32_to_float64(src, &env->fp_status);
+    return float32_to_float64(src, &env->fp_status);
 }
 
 float32 helper_fqtos(CPUState *env)
@@ -134,14 +133,14 @@ void helper_fstoq(CPUState *env, float32 src)
     QT0 = float32_to_float128(src, &env->fp_status);
 }
 
-void helper_fqtod(CPUState *env)
+float64 helper_fqtod(CPUState *env)
 {
-    DT0 = float128_to_float64(QT1, &env->fp_status);
+    return float128_to_float64(QT1, &env->fp_status);
 }
 
-void helper_fdtoq(CPUState *env)
+void helper_fdtoq(CPUState *env, float64 src)
 {
-    QT0 = float64_to_float128(DT1, &env->fp_status);
+    QT0 = float64_to_float128(src, &env->fp_status);
 }
 
 /* Float to integer conversion.  */
@@ -150,9 +149,9 @@ int32_t helper_fstoi(CPUState *env, float32 src)
     return float32_to_int32_round_to_zero(src, &env->fp_status);
 }
 
-int32_t helper_fdtoi(CPUState *env)
+int32_t helper_fdtoi(CPUState *env, float64 src)
 {
-    return float64_to_int32_round_to_zero(DT1, &env->fp_status);
+    return float64_to_int32_round_to_zero(src, &env->fp_status);
 }
 
 int32_t helper_fqtoi(CPUState *env)
@@ -161,19 +160,19 @@ int32_t helper_fqtoi(CPUState *env)
 }
 
 #ifdef TARGET_SPARC64
-void helper_fstox(CPUState *env, float32 src)
+int64_t helper_fstox(CPUState *env, float32 src)
 {
-    *((int64_t *)&DT0) = float32_to_int64_round_to_zero(src, &env->fp_status);
+    return float32_to_int64_round_to_zero(src, &env->fp_status);
 }
 
-void helper_fdtox(CPUState *env)
+int64_t helper_fdtox(CPUState *env, float64 src)
 {
-    *((int64_t *)&DT0) = float64_to_int64_round_to_zero(DT1, &env->fp_status);
+    return float64_to_int64_round_to_zero(src, &env->fp_status);
 }
 
-void helper_fqtox(CPUState *env)
+int64_t helper_fqtox(CPUState *env)
 {
-    *((int64_t *)&DT0) = float128_to_int64_round_to_zero(QT1, &env->fp_status);
+    return float128_to_int64_round_to_zero(QT1, &env->fp_status);
 }
 #endif
 
@@ -183,9 +182,9 @@ float32 helper_fabss(float32 src)
 }
 
 #ifdef TARGET_SPARC64
-void helper_fabsd(CPUState *env)
+float64 helper_fabsd(CPUState *env, float64 src)
 {
-    DT0 = float64_abs(DT1);
+    return float64_abs(src);
 }
 
 void helper_fabsq(CPUState *env)
@@ -199,9 +198,9 @@ float32 helper_fsqrts(CPUState *env, float32 src)
     return float32_sqrt(src, &env->fp_status);
 }
 
-void helper_fsqrtd(CPUState *env)
+float64 helper_fsqrtd(CPUState *env, float64 src)
 {
-    DT0 = float64_sqrt(DT1, &env->fp_status);
+    return float64_sqrt(src, &env->fp_status);
 }
 
 void helper_fsqrtq(CPUState *env)
@@ -245,8 +244,8 @@ void helper_fsqrtq(CPUState *env)
             break;                                                      \
         }                                                               \
     }
-#define GEN_FCMPS(name, size, FS, E)                                    \
-    void glue(helper_, name)(CPUState *env, float32 src1, float32 src2) \
+#define GEN_FCMP_T(name, size, FS, E)                                   \
+    void glue(helper_, name)(CPUState *env, size src1, size src2)       \
     {                                                                   \
         env->fsr &= FSR_FTT_NMASK;                                      \
         if (E && (glue(size, _is_any_nan)(src1) ||                      \
@@ -282,41 +281,42 @@ void helper_fsqrtq(CPUState *env)
         }                                                               \
     }
 
-GEN_FCMPS(fcmps, float32, 0, 0);
-GEN_FCMP(fcmpd, float64, DT0, DT1, 0, 0);
+GEN_FCMP_T(fcmps, float32, 0, 0);
+GEN_FCMP_T(fcmpd, float64, 0, 0);
 
-GEN_FCMPS(fcmpes, float32, 0, 1);
-GEN_FCMP(fcmped, float64, DT0, DT1, 0, 1);
+GEN_FCMP_T(fcmpes, float32, 0, 1);
+GEN_FCMP_T(fcmped, float64, 0, 1);
 
 GEN_FCMP(fcmpq, float128, QT0, QT1, 0, 0);
 GEN_FCMP(fcmpeq, float128, QT0, QT1, 0, 1);
 
 #ifdef TARGET_SPARC64
-GEN_FCMPS(fcmps_fcc1, float32, 22, 0);
-GEN_FCMP(fcmpd_fcc1, float64, DT0, DT1, 22, 0);
+GEN_FCMP_T(fcmps_fcc1, float32, 22, 0);
+GEN_FCMP_T(fcmpd_fcc1, float64, 22, 0);
 GEN_FCMP(fcmpq_fcc1, float128, QT0, QT1, 22, 0);
 
-GEN_FCMPS(fcmps_fcc2, float32, 24, 0);
-GEN_FCMP(fcmpd_fcc2, float64, DT0, DT1, 24, 0);
+GEN_FCMP_T(fcmps_fcc2, float32, 24, 0);
+GEN_FCMP_T(fcmpd_fcc2, float64, 24, 0);
 GEN_FCMP(fcmpq_fcc2, float128, QT0, QT1, 24, 0);
 
-GEN_FCMPS(fcmps_fcc3, float32, 26, 0);
-GEN_FCMP(fcmpd_fcc3, float64, DT0, DT1, 26, 0);
+GEN_FCMP_T(fcmps_fcc3, float32, 26, 0);
+GEN_FCMP_T(fcmpd_fcc3, float64, 26, 0);
 GEN_FCMP(fcmpq_fcc3, float128, QT0, QT1, 26, 0);
 
-GEN_FCMPS(fcmpes_fcc1, float32, 22, 1);
-GEN_FCMP(fcmped_fcc1, float64, DT0, DT1, 22, 1);
+GEN_FCMP_T(fcmpes_fcc1, float32, 22, 1);
+GEN_FCMP_T(fcmped_fcc1, float64, 22, 1);
 GEN_FCMP(fcmpeq_fcc1, float128, QT0, QT1, 22, 1);
 
-GEN_FCMPS(fcmpes_fcc2, float32, 24, 1);
-GEN_FCMP(fcmped_fcc2, float64, DT0, DT1, 24, 1);
+GEN_FCMP_T(fcmpes_fcc2, float32, 24, 1);
+GEN_FCMP_T(fcmped_fcc2, float64, 24, 1);
 GEN_FCMP(fcmpeq_fcc2, float128, QT0, QT1, 24, 1);
 
-GEN_FCMPS(fcmpes_fcc3, float32, 26, 1);
-GEN_FCMP(fcmped_fcc3, float64, DT0, DT1, 26, 1);
+GEN_FCMP_T(fcmpes_fcc3, float32, 26, 1);
+GEN_FCMP_T(fcmped_fcc3, float64, 26, 1);
 GEN_FCMP(fcmpeq_fcc3, float128, QT0, QT1, 26, 1);
 #endif
-#undef GEN_FCMPS
+#undef GEN_FCMP_T
+#undef GEN_FCMP
 
 void helper_check_ieee_exceptions(CPUState *env)
 {
diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index c1b4e65..089233f 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -39,8 +39,6 @@ DEF_HELPER_3(udiv, tl, env, tl, tl)
 DEF_HELPER_3(udiv_cc, tl, env, tl, tl)
 DEF_HELPER_3(sdiv, tl, env, tl, tl)
 DEF_HELPER_3(sdiv_cc, tl, env, tl, tl)
-DEF_HELPER_3(stdf, void, env, tl, int)
-DEF_HELPER_3(lddf, void, env, tl, int)
 DEF_HELPER_3(ldqf, void, env, tl, int)
 DEF_HELPER_3(stqf, void, env, tl, int)
 #if !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)
@@ -52,29 +50,29 @@ DEF_HELPER_1(check_ieee_exceptions, void, env)
 DEF_HELPER_1(clear_float_exceptions, void, env)
 DEF_HELPER_1(fabss, f32, f32)
 DEF_HELPER_2(fsqrts, f32, env, f32)
-DEF_HELPER_1(fsqrtd, void, env)
+DEF_HELPER_2(fsqrtd, f64, env, f64)
 DEF_HELPER_3(fcmps, void, env, f32, f32)
-DEF_HELPER_1(fcmpd, void, env)
+DEF_HELPER_3(fcmpd, void, env, f64, f64)
 DEF_HELPER_3(fcmpes, void, env, f32, f32)
-DEF_HELPER_1(fcmped, void, env)
+DEF_HELPER_3(fcmped, void, env, f64, f64)
 DEF_HELPER_1(fsqrtq, void, env)
 DEF_HELPER_1(fcmpq, void, env)
 DEF_HELPER_1(fcmpeq, void, env)
 #ifdef TARGET_SPARC64
 DEF_HELPER_2(ldxfsr, void, env, i64)
-DEF_HELPER_1(fabsd, void, env)
+DEF_HELPER_2(fabsd, f64, env, f64)
 DEF_HELPER_3(fcmps_fcc1, void, env, f32, f32)
 DEF_HELPER_3(fcmps_fcc2, void, env, f32, f32)
 DEF_HELPER_3(fcmps_fcc3, void, env, f32, f32)
-DEF_HELPER_1(fcmpd_fcc1, void, env)
-DEF_HELPER_1(fcmpd_fcc2, void, env)
-DEF_HELPER_1(fcmpd_fcc3, void, env)
+DEF_HELPER_3(fcmpd_fcc1, void, env, f64, f64)
+DEF_HELPER_3(fcmpd_fcc2, void, env, f64, f64)
+DEF_HELPER_3(fcmpd_fcc3, void, env, f64, f64)
 DEF_HELPER_3(fcmpes_fcc1, void, env, f32, f32)
 DEF_HELPER_3(fcmpes_fcc2, void, env, f32, f32)
 DEF_HELPER_3(fcmpes_fcc3, void, env, f32, f32)
-DEF_HELPER_1(fcmped_fcc1, void, env)
-DEF_HELPER_1(fcmped_fcc2, void, env)
-DEF_HELPER_1(fcmped_fcc3, void, env)
+DEF_HELPER_3(fcmped_fcc1, void, env, f64, f64)
+DEF_HELPER_3(fcmped_fcc2, void, env, f64, f64)
+DEF_HELPER_3(fcmped_fcc3, void, env, f64, f64)
 DEF_HELPER_1(fabsq, void, env)
 DEF_HELPER_1(fcmpq_fcc1, void, env)
 DEF_HELPER_1(fcmpq_fcc2, void, env)
@@ -86,77 +84,78 @@ DEF_HELPER_1(fcmpeq_fcc3, void, env)
 DEF_HELPER_2(raise_exception, void, env, int)
 DEF_HELPER_0(shutdown, void)
 #define F_HELPER_0_1(name) DEF_HELPER_1(f ## name, void, env)
-#define F_HELPER_DQ_0_1(name)                   \
-    F_HELPER_0_1(name ## d);                    \
-    F_HELPER_0_1(name ## q)
 
-F_HELPER_DQ_0_1(add);
-F_HELPER_DQ_0_1(sub);
-F_HELPER_DQ_0_1(mul);
-F_HELPER_DQ_0_1(div);
+DEF_HELPER_3(faddd, f64, env, f64, f64)
+DEF_HELPER_3(fsubd, f64, env, f64, f64)
+DEF_HELPER_3(fmuld, f64, env, f64, f64)
+DEF_HELPER_3(fdivd, f64, env, f64, f64)
+F_HELPER_0_1(addq)
+F_HELPER_0_1(subq)
+F_HELPER_0_1(mulq)
+F_HELPER_0_1(divq)
 
 DEF_HELPER_3(fadds, f32, env, f32, f32)
 DEF_HELPER_3(fsubs, f32, env, f32, f32)
 DEF_HELPER_3(fmuls, f32, env, f32, f32)
 DEF_HELPER_3(fdivs, f32, env, f32, f32)
 
-DEF_HELPER_3(fsmuld, void, env, f32, f32)
-F_HELPER_0_1(dmulq);
+DEF_HELPER_3(fsmuld, f64, env, f32, f32)
+DEF_HELPER_3(fdmulq, void, env, f64, f64);
 
 DEF_HELPER_1(fnegs, f32, f32)
-DEF_HELPER_2(fitod, void, env, s32)
+DEF_HELPER_2(fitod, f64, env, s32)
 DEF_HELPER_2(fitoq, void, env, s32)
 
 DEF_HELPER_2(fitos, f32, env, s32)
 
 #ifdef TARGET_SPARC64
-DEF_HELPER_1(fnegd, void, env)
+DEF_HELPER_1(fnegd, f64, f64)
 DEF_HELPER_1(fnegq, void, env)
-DEF_HELPER_1(fxtos, i32, env)
-F_HELPER_DQ_0_1(xto);
+DEF_HELPER_2(fxtos, f32, env, s64)
+DEF_HELPER_2(fxtod, f64, env, s64)
+DEF_HELPER_2(fxtoq, void, env, s64)
 #endif
-DEF_HELPER_1(fdtos, f32, env)
-DEF_HELPER_2(fstod, void, env, f32)
+DEF_HELPER_2(fdtos, f32, env, f64)
+DEF_HELPER_2(fstod, f64, env, f32)
 DEF_HELPER_1(fqtos, f32, env)
 DEF_HELPER_2(fstoq, void, env, f32)
-F_HELPER_0_1(qtod);
-F_HELPER_0_1(dtoq);
+DEF_HELPER_1(fqtod, f64, env)
+DEF_HELPER_2(fdtoq, void, env, f64)
 DEF_HELPER_2(fstoi, s32, env, f32)
-DEF_HELPER_1(fdtoi, s32, env)
+DEF_HELPER_2(fdtoi, s32, env, f64)
 DEF_HELPER_1(fqtoi, s32, env)
 #ifdef TARGET_SPARC64
-DEF_HELPER_2(fstox, void, env, i32)
-F_HELPER_0_1(dtox);
-F_HELPER_0_1(qtox);
-F_HELPER_0_1(aligndata);
+DEF_HELPER_2(fstox, s64, env, f32)
+DEF_HELPER_2(fdtox, s64, env, f64)
+DEF_HELPER_1(fqtox, s64, env)
+DEF_HELPER_3(faligndata, i64, env, i64, i64)
 
-F_HELPER_0_1(pmerge);
-F_HELPER_0_1(mul8x16);
-F_HELPER_0_1(mul8x16al);
-F_HELPER_0_1(mul8x16au);
-F_HELPER_0_1(mul8sux16);
-F_HELPER_0_1(mul8ulx16);
-F_HELPER_0_1(muld8sux16);
-F_HELPER_0_1(muld8ulx16);
-F_HELPER_0_1(expand);
+DEF_HELPER_3(fpmerge, i64, env, i64, i64)
+DEF_HELPER_3(fmul8x16, i64, env, i64, i64)
+DEF_HELPER_3(fmul8x16al, i64, env, i64, i64)
+DEF_HELPER_3(fmul8x16au, i64, env, i64, i64)
+DEF_HELPER_3(fmul8sux16, i64, env, i64, i64)
+DEF_HELPER_3(fmul8ulx16, i64, env, i64, i64)
+DEF_HELPER_3(fmuld8sux16, i64, env, i64, i64)
+DEF_HELPER_3(fmuld8ulx16, i64, env, i64, i64)
+DEF_HELPER_3(fexpand, i64, env, i64, i64)
 #define VIS_HELPER(name)                                 \
-    F_HELPER_0_1(name##16);                              \
+    DEF_HELPER_3(f ## name ## 16, i64, env, i64, i64)    \
     DEF_HELPER_3(f ## name ## 16s, i32, env, i32, i32)   \
-    F_HELPER_0_1(name##32);                              \
+    DEF_HELPER_3(f ## name ## 32, i64, env, i64, i64)    \
     DEF_HELPER_3(f ## name ## 32s, i32, env, i32, i32)
 
 VIS_HELPER(padd);
 VIS_HELPER(psub);
 #define VIS_CMPHELPER(name)                              \
-    DEF_HELPER_1(f##name##16, i64, env);                 \
-    DEF_HELPER_1(f##name##32, i64, env)
+    DEF_HELPER_3(f##name##16, i64, env, i64, i64)        \
+    DEF_HELPER_3(f##name##32, i64, env, i64, i64)
 VIS_CMPHELPER(cmpgt);
 VIS_CMPHELPER(cmpeq);
 VIS_CMPHELPER(cmple);
 VIS_CMPHELPER(cmpne);
 #endif
 #undef F_HELPER_0_1
-#undef F_HELPER_DQ_0_1
 #undef VIS_HELPER
 #undef VIS_CMPHELPER
 DEF_HELPER_1(compute_psr, void, env);
diff --git a/target-sparc/ldst_helper.c b/target-sparc/ldst_helper.c
index 1e4337d..ec9b5f2 100644
--- a/target-sparc/ldst_helper.c
+++ b/target-sparc/ldst_helper.c
@@ -61,8 +61,6 @@
 #endif
 #endif
 
-#define DT0 (env->dt0)
-#define DT1 (env->dt1)
 #define QT0 (env->qt0)
 #define QT1 (env->qt1)
 
@@ -2228,56 +2226,6 @@ target_ulong helper_casx_asi(CPUState *env, target_ulong addr,
 }
 #endif /* TARGET_SPARC64 */
 
-void helper_stdf(CPUState *env, target_ulong addr, int mem_idx)
-{
-    helper_check_align(env, addr, 7);
-#if !defined(CONFIG_USER_ONLY)
-    switch (mem_idx) {
-    case MMU_USER_IDX:
-        cpu_stfq_user(env, addr, DT0);
-        break;
-    case MMU_KERNEL_IDX:
-        cpu_stfq_kernel(env, addr, DT0);
-        break;
-#ifdef TARGET_SPARC64
-    case MMU_HYPV_IDX:
-        cpu_stfq_hypv(env, addr, DT0);
-        break;
-#endif
-    default:
-        DPRINTF_MMU("helper_stdf: need to check MMU idx %d\n", mem_idx);
-        break;
-    }
-#else
-    stfq_raw(address_mask(env, addr), DT0);
-#endif
-}
-
-void helper_lddf(CPUState *env, target_ulong addr, int mem_idx)
-{
-    helper_check_align(env, addr, 7);
-#if !defined(CONFIG_USER_ONLY)
-    switch (mem_idx) {
-    case MMU_USER_IDX:
-        DT0 = cpu_ldfq_user(env, addr);
-        break;
-    case MMU_KERNEL_IDX:
-        DT0 = cpu_ldfq_kernel(env, addr);
-        break;
-#ifdef TARGET_SPARC64
-    case MMU_HYPV_IDX:
-        DT0 = cpu_ldfq_hypv(env, addr);
-        break;
-#endif
-    default:
-        DPRINTF_MMU("helper_lddf: need to check MMU idx %d\n", mem_idx);
-        break;
-    }
-#else
-    DT0 = ldfq_raw(address_mask(env, addr));
-#endif
-}
-
 void helper_ldqf(CPUState *env, target_ulong addr, int mem_idx)
 {
     /* XXX add 128 bit load */
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index bea93af..f0614b5 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -186,30 +186,6 @@ static TCGv_i64 gen_dest_fpr_D(void)
     return cpu_tmp64;
 }
 
-static void gen_op_load_fpr_DT0(unsigned int src)
-{
-    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, dt0) +
-                   offsetof(CPU_DoubleU, l.upper));
-    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
-                   offsetof(CPU_DoubleU, l.lower));
-}
-
-static void gen_op_load_fpr_DT1(unsigned int src)
-{
-    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, dt1) +
-                   offsetof(CPU_DoubleU, l.upper));
-    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt1) +
-                   offsetof(CPU_DoubleU, l.lower));
-}
-
-static void gen_op_store_DT0_fpr(unsigned int dst)
-{
-    tcg_gen_ld_i32(cpu__fpr[dst], cpu_env, offsetof(CPUSPARCState, dt0) +
-                   offsetof(CPU_DoubleU, l.upper));
-    tcg_gen_ld_i32(cpu__fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
-                   offsetof(CPU_DoubleU, l.lower));
-}
-
 static void gen_op_load_fpr_QT0(unsigned int src)
 {
     tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
@@ -1490,20 +1466,20 @@ static inline void gen_op_fcmps(int fccno, TCGv_i32 r_rs1, TCGv_i32 r_rs2)
     }
 }
 
-static inline void gen_op_fcmpd(int fccno)
+static inline void gen_op_fcmpd(int fccno, TCGv_i64 r_rs1, TCGv_i64 r_rs2)
 {
     switch (fccno) {
     case 0:
-        gen_helper_fcmpd(cpu_env);
+        gen_helper_fcmpd(cpu_env, r_rs1, r_rs2);
         break;
     case 1:
-        gen_helper_fcmpd_fcc1(cpu_env);
+        gen_helper_fcmpd_fcc1(cpu_env, r_rs1, r_rs2);
         break;
     case 2:
-        gen_helper_fcmpd_fcc2(cpu_env);
+        gen_helper_fcmpd_fcc2(cpu_env, r_rs1, r_rs2);
         break;
     case 3:
-        gen_helper_fcmpd_fcc3(cpu_env);
+        gen_helper_fcmpd_fcc3(cpu_env, r_rs1, r_rs2);
         break;
     }
 }
@@ -1544,20 +1520,20 @@ static inline void gen_op_fcmpes(int fccno, TCGv_i32 r_rs1, TCGv_i32 r_rs2)
     }
 }
 
-static inline void gen_op_fcmped(int fccno)
+static inline void gen_op_fcmped(int fccno, TCGv_i64 r_rs1, TCGv_i64 r_rs2)
 {
     switch (fccno) {
     case 0:
-        gen_helper_fcmped(cpu_env);
+        gen_helper_fcmped(cpu_env, r_rs1, r_rs2);
         break;
     case 1:
-        gen_helper_fcmped_fcc1(cpu_env);
+        gen_helper_fcmped_fcc1(cpu_env, r_rs1, r_rs2);
         break;
     case 2:
-        gen_helper_fcmped_fcc2(cpu_env);
+        gen_helper_fcmped_fcc2(cpu_env, r_rs1, r_rs2);
         break;
     case 3:
-        gen_helper_fcmped_fcc3(cpu_env);
+        gen_helper_fcmped_fcc3(cpu_env, r_rs1, r_rs2);
         break;
     }
 }
@@ -1587,9 +1563,9 @@ static inline void gen_op_fcmps(int fccno, TCGv r_rs1, TCGv r_rs2)
     gen_helper_fcmps(cpu_env, r_rs1, r_rs2);
 }
 
-static inline void gen_op_fcmpd(int fccno)
+static inline void gen_op_fcmpd(int fccno, TCGv_i64 r_rs1, TCGv_i64 r_rs2)
 {
-    gen_helper_fcmpd(cpu_env);
+    gen_helper_fcmpd(cpu_env, r_rs1, r_rs2);
 }
 
 static inline void gen_op_fcmpq(int fccno)
@@ -1602,9 +1578,9 @@ static inline void gen_op_fcmpes(int fccno, TCGv r_rs1, TCGv r_rs2)
     gen_helper_fcmpes(cpu_env, r_rs1, r_rs2);
 }
 
-static inline void gen_op_fcmped(int fccno)
+static inline void gen_op_fcmped(int fccno, TCGv_i64 r_rs1, TCGv_i64 r_rs2)
 {
-    gen_helper_fcmped(cpu_env);
+    gen_helper_fcmped(cpu_env, r_rs1, r_rs2);
 }
 
 static inline void gen_op_fcmpeq(int fccno)
@@ -2461,12 +2437,12 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x2a: /* fsqrtd */
                     CHECK_FPU_FEATURE(dc, FSQRT);
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fsqrtd(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fsqrtd(cpu_dst_64, cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x2b: /* fsqrtq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2488,13 +2464,14 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0x42: /* faddd */
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_faddd(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_faddd(cpu_dst_64, cpu_env,
+                                     cpu_src1_64, cpu_src2_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x43: /* faddq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2517,13 +2494,14 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0x46: /* fsubd */
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fsubd(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fsubd(cpu_dst_64, cpu_env,
+                                     cpu_src1_64, cpu_src2_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x47: /* fsubq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2548,13 +2526,14 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x4a: /* fmuld */
                     CHECK_FPU_FEATURE(dc, FMUL);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fmuld(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmuld(cpu_dst_64, cpu_env,
+                                     cpu_src1_64, cpu_src2_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x4b: /* fmulq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2578,13 +2557,14 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0x4e: /* fdivd */
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fdivd(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fdivd(cpu_dst_64, cpu_env,
+                                     cpu_src1_64, cpu_src2_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x4f: /* fdivq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2601,17 +2581,18 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_clear_float_exceptions();
                     cpu_src1_32 = gen_load_fpr_F(dc, rs1);
                     cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    gen_helper_fsmuld(cpu_env, cpu_src1_32, cpu_src2_32);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fsmuld(cpu_dst_64, cpu_env,
+                                      cpu_src1_32, cpu_src2_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x6e: /* fdmulq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fdmulq(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fdmulq(cpu_env, cpu_src1_64, cpu_src2_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_op_store_QT0_fpr(QFPREG(rd));
                     gen_update_fprs_dirty(QFPREG(rd));
@@ -2625,10 +2606,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0xc6: /* fdtos */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fdtos(cpu_dst_32, cpu_env);
+                    gen_helper_fdtos(cpu_dst_32, cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
@@ -2643,24 +2624,24 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0xc8: /* fitod */
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    gen_helper_fitod(cpu_env, cpu_src1_32);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fitod(cpu_dst_64, cpu_env, cpu_src1_32);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0xc9: /* fstod */
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    gen_helper_fstod(cpu_env, cpu_src1_32);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fstod(cpu_dst_64, cpu_env, cpu_src1_32);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0xcb: /* fqtod */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fqtod(cpu_env);
+                    gen_op_load_fpr_QT1(QFPREG(rs2));
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fqtod(cpu_dst_64, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0xcc: /* fitoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2678,8 +2659,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0xce: /* fdtoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fdtoq(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fdtoq(cpu_env, cpu_src1_64);
                     gen_op_store_QT0_fpr(QFPREG(rd));
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
@@ -2692,10 +2673,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0xd2: /* fdtoi */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fdtoi(cpu_dst_32, cpu_env);
+                    gen_helper_fdtoi(cpu_dst_32, cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
@@ -2726,10 +2707,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
                 case 0x6: /* V9 fnegd */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fnegd(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fnegd(cpu_dst_64, cpu_src1_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x7: /* V9 fnegq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2739,10 +2720,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
                 case 0xa: /* V9 fabsd */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fabsd(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fabsd(cpu_dst_64, cpu_env, cpu_src1_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0xb: /* V9 fabsq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -2754,49 +2735,49 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                 case 0x81: /* V9 fstox */
                     gen_clear_float_exceptions();
                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    gen_helper_fstox(cpu_env, cpu_src1_32);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fstox(cpu_dst_64, cpu_env, cpu_src1_32);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x82: /* V9 fdtox */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fdtox(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fdtox(cpu_dst_64, cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x83: /* V9 fqtox */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
                     gen_op_load_fpr_QT1(QFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fqtox(cpu_env);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fqtox(cpu_dst_64, cpu_env);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x84: /* V9 fxtos */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fxtos(cpu_dst_32, cpu_env);
+                    gen_helper_fxtos(cpu_dst_32, cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0x88: /* V9 fxtod */
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fxtod(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fxtod(cpu_dst_64, cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x8c: /* V9 fxtoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
                     gen_clear_float_exceptions();
-                    gen_helper_fxtoq(cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fxtoq(cpu_env, cpu_src1_64);
                     gen_helper_check_ieee_exceptions(cpu_env);
                     gen_op_store_QT0_fpr(QFPREG(rd));
                     gen_update_fprs_dirty(QFPREG(rd));
@@ -3046,9 +3027,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                         gen_op_fcmps(rd & 3, cpu_src1_32, cpu_src2_32);
                         break;
                     case 0x52: /* fcmpd, V9 %fcc */
-                        gen_op_load_fpr_DT0(DFPREG(rs1));
-                        gen_op_load_fpr_DT1(DFPREG(rs2));
-                        gen_op_fcmpd(rd & 3);
+                        cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                        cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                        gen_op_fcmpd(rd & 3, cpu_src1_64, cpu_src2_64);
                         break;
                     case 0x53: /* fcmpq, V9 %fcc */
                         CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -3062,9 +3043,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                         gen_op_fcmpes(rd & 3, cpu_src1_32, cpu_src2_32);
                         break;
                     case 0x56: /* fcmped, V9 %fcc */
-                        gen_op_load_fpr_DT0(DFPREG(rs1));
-                        gen_op_load_fpr_DT1(DFPREG(rs2));
-                        gen_op_fcmped(rd & 3);
+                        cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                        cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                        gen_op_fcmped(rd & 3, cpu_src1_64, cpu_src2_64);
                         break;
                     case 0x57: /* fcmpeq, V9 %fcc */
                         CHECK_FPU_FEATURE(dc, FLOAT128);
@@ -3953,115 +3934,130 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     goto illegal_insn;
                 case 0x020: /* VIS I fcmple16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmple16(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmple16(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x022: /* VIS I fcmpne16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmpne16(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmpne16(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x024: /* VIS I fcmple32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmple32(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmple32(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x026: /* VIS I fcmpne32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmpne32(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmpne32(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x028: /* VIS I fcmpgt16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmpgt16(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmpgt16(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x02a: /* VIS I fcmpeq16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmpeq16(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmpeq16(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x02c: /* VIS I fcmpgt32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmpgt32(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmpgt32(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x02e: /* VIS I fcmpeq32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fcmpeq32(cpu_dst, cpu_env);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    gen_helper_fcmpeq32(cpu_dst, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x031: /* VIS I fmul8x16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmul8x16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmul8x16(cpu_dst_64, cpu_env,
+                                        cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x033: /* VIS I fmul8x16au */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmul8x16au(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmul8x16au(cpu_dst_64, cpu_env,
+                                          cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x035: /* VIS I fmul8x16al */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmul8x16al(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmul8x16al(cpu_dst_64, cpu_env,
+                                          cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x036: /* VIS I fmul8sux16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmul8sux16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmul8sux16(cpu_dst_64, cpu_env,
+                                          cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x037: /* VIS I fmul8ulx16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmul8ulx16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmul8ulx16(cpu_dst_64, cpu_env,
+                                          cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x038: /* VIS I fmuld8sux16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmuld8sux16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmuld8sux16(cpu_dst_64, cpu_env,
+                                           cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x039: /* VIS I fmuld8ulx16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fmuld8ulx16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fmuld8ulx16(cpu_dst_64, cpu_env,
+                                           cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x03a: /* VIS I fpack32 */
                 case 0x03b: /* VIS I fpack16 */
@@ -4071,38 +4067,42 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     goto illegal_insn;
                 case 0x048: /* VIS I faligndata */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_faligndata(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_faligndata(cpu_dst_64, cpu_env,
+                                          cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x04b: /* VIS I fpmerge */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fpmerge(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fpmerge(cpu_dst_64, cpu_env,
+                                       cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x04c: /* VIS II bshuffle */
                     // XXX
                     goto illegal_insn;
                 case 0x04d: /* VIS I fexpand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fexpand(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fexpand(cpu_dst_64, cpu_env,
+                                       cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x050: /* VIS I fpadd16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fpadd16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fpadd16(cpu_dst_64, cpu_env,
+                                       cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x051: /* VIS I fpadd16s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4115,11 +4115,12 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x052: /* VIS I fpadd32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fpadd32(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fpadd32(cpu_dst_64, cpu_env,
+                                       cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x053: /* VIS I fpadd32s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4131,11 +4132,12 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x054: /* VIS I fpsub16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fpsub16(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fpsub16(cpu_dst_64, cpu_env,
+                                       cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x055: /* VIS I fpsub16s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4148,11 +4150,12 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x056: /* VIS I fpsub32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_op_load_fpr_DT0(DFPREG(rs1));
-                    gen_op_load_fpr_DT1(DFPREG(rs2));
-                    gen_helper_fpsub32(cpu_env);
-                    gen_op_store_DT0_fpr(DFPREG(rd));
-                    gen_update_fprs_dirty(DFPREG(rd));
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fpsub32(cpu_dst_64, cpu_env,
+                                       cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x057: /* VIS I fpsub32s */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4812,16 +4815,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     }
                     break;
                 case 0x23:      /* lddf, load double fpreg */
-                    {
-                        TCGv_i32 r_const;
-
-                        r_const = tcg_const_i32(dc->mem_idx);
-                        gen_address_mask(dc, cpu_addr);
-                        gen_helper_lddf(cpu_env, cpu_addr, r_const);
-                        tcg_temp_free_i32(r_const);
-                        gen_op_store_DT0_fpr(DFPREG(rd));
-                        gen_update_fprs_dirty(DFPREG(rd));
-                    }
+                    gen_address_mask(dc, cpu_addr);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    tcg_gen_qemu_ld64(cpu_dst_64, cpu_addr, dc->mem_idx);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 default:
                     goto illegal_insn;
@@ -4973,15 +4970,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
 #endif
 #endif
                 case 0x27: /* stdf, store double fpreg */
-                    {
-                        TCGv_i32 r_const;
-
-                        gen_op_load_fpr_DT0(DFPREG(rd));
-                        r_const = tcg_const_i32(dc->mem_idx);
-                        gen_address_mask(dc, cpu_addr);
-                        gen_helper_stdf(cpu_env, cpu_addr, r_const);
-                        tcg_temp_free_i32(r_const);
-                    }
+                    gen_address_mask(dc, cpu_addr);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rd);
+                    tcg_gen_qemu_st64(cpu_src1_64, cpu_addr, dc->mem_idx);
                     break;
                 default:
                     goto illegal_insn;
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index a22c10b..a007b0f 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -20,11 +20,6 @@
 #include "cpu.h"
 #include "helper.h"
 
-#define DT0 (env->dt0)
-#define DT1 (env->dt1)
-#define QT0 (env->qt0)
-#define QT1 (env->qt1)
-
 /* This function uses non-native bit order */
 #define GET_FIELD(X, FROM, TO)                                  \
     ((X) >> (63 - (TO)) & ((1ULL << ((TO) - (FROM) + 1)) - 1))
@@ -58,16 +53,16 @@ target_ulong helper_alignaddr(CPUState *env, target_ulong addr,
     return tmp & ~7ULL;
 }
 
-void helper_faligndata(CPUState *env)
+uint64_t helper_faligndata(CPUState *env, uint64_t src1, uint64_t src2)
 {
     uint64_t tmp;
 
-    tmp = (*((uint64_t *)&DT0)) << ((env->gsr & 7) * 8);
+    tmp = src1 << ((env->gsr & 7) * 8);
     /* on many architectures a shift of 64 does nothing */
     if ((env->gsr & 7) != 0) {
-        tmp |= (*((uint64_t *)&DT1)) >> (64 - (env->gsr & 7) * 8);
+        tmp |= src2 >> (64 - (env->gsr & 7) * 8);
     }
-    *((uint64_t *)&DT0) = tmp;
+    return tmp;
 }
 
 #ifdef HOST_WORDS_BIGENDIAN
@@ -102,12 +97,12 @@ typedef union {
     float32 f;
 } VIS32;
 
-void helper_fpmerge(CPUState *env)
+uint64_t helper_fpmerge(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
     /* Reverse calculation order to handle overlap */
     d.VIS_B64(7) = s.VIS_B64(3);
@@ -119,16 +114,16 @@ void helper_fpmerge(CPUState *env)
     d.VIS_B64(1) = s.VIS_B64(0);
     /* d.VIS_B64(0) = d.VIS_B64(0); */
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmul8x16(CPUState *env)
+uint64_t helper_fmul8x16(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                 \
     tmp = (int32_t)d.VIS_SW64(r) * (int32_t)s.VIS_B64(r);       \
@@ -143,16 +138,16 @@ void helper_fmul8x16(CPUState *env)
     PMUL(3);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmul8x16al(CPUState *env)
+uint64_t helper_fmul8x16al(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                 \
     tmp = (int32_t)d.VIS_SW64(1) * (int32_t)s.VIS_B64(r);       \
@@ -167,16 +162,16 @@ void helper_fmul8x16al(CPUState *env)
     PMUL(3);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmul8x16au(CPUState *env)
+uint64_t helper_fmul8x16au(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                 \
     tmp = (int32_t)d.VIS_SW64(0) * (int32_t)s.VIS_B64(r);       \
@@ -191,16 +186,16 @@ void helper_fmul8x16au(CPUState *env)
     PMUL(3);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmul8sux16(CPUState *env)
+uint64_t helper_fmul8sux16(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                         \
     tmp = (int32_t)d.VIS_SW64(r) * ((int32_t)s.VIS_SW64(r) >> 8);       \
@@ -215,16 +210,16 @@ void helper_fmul8sux16(CPUState *env)
     PMUL(3);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmul8ulx16(CPUState *env)
+uint64_t helper_fmul8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                         \
     tmp = (int32_t)d.VIS_SW64(r) * ((uint32_t)s.VIS_B64(r * 2));        \
@@ -239,16 +234,16 @@ void helper_fmul8ulx16(CPUState *env)
     PMUL(3);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmuld8sux16(CPUState *env)
+uint64_t helper_fmuld8sux16(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                         \
     tmp = (int32_t)d.VIS_SW64(r) * ((int32_t)s.VIS_SW64(r) >> 8);       \
@@ -262,16 +257,16 @@ void helper_fmuld8sux16(CPUState *env)
     PMUL(0);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fmuld8ulx16(CPUState *env)
+uint64_t helper_fmuld8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
 
-    s.d = DT0;
-    d.d = DT1;
+    s.ll = src1;
+    d.ll = src2;
 
 #define PMUL(r)                                                         \
     tmp = (int32_t)d.VIS_SW64(r) * ((uint32_t)s.VIS_B64(r * 2));        \
@@ -285,38 +280,38 @@ void helper_fmuld8ulx16(CPUState *env)
     PMUL(0);
 #undef PMUL
 
-    DT0 = d.d;
+    return d.ll;
 }
 
-void helper_fexpand(CPUState *env)
+uint64_t helper_fexpand(CPUState *env, uint64_t src1, uint64_t src2)
 {
     VIS32 s;
     VIS64 d;
 
-    s.l = (uint32_t)(*(uint64_t *)&DT0 & 0xffffffff);
-    d.d = DT1;
+    s.l = (uint32_t)src1;
+    d.ll = src2;
     d.VIS_W64(0) = s.VIS_B32(0) << 4;
     d.VIS_W64(1) = s.VIS_B32(1) << 4;
     d.VIS_W64(2) = s.VIS_B32(2) << 4;
     d.VIS_W64(3) = s.VIS_B32(3) << 4;
 
-    DT0 = d.d;
+    return d.ll;
 }
 
 #define VIS_HELPER(name, F)                             \
-    void name##16(CPUState *env)                        \
+    uint64_t name##16(CPUState *env, uint64_t src1, uint64_t src2) \
     {                                                   \
         VIS64 s, d;                                     \
                                                         \
-        s.d = DT0;                                      \
-        d.d = DT1;                                      \
+        s.ll = src1;                                    \
+        d.ll = src2;                                    \
                                                         \
         d.VIS_W64(0) = F(d.VIS_W64(0), s.VIS_W64(0));   \
         d.VIS_W64(1) = F(d.VIS_W64(1), s.VIS_W64(1));   \
         d.VIS_W64(2) = F(d.VIS_W64(2), s.VIS_W64(2));   \
         d.VIS_W64(3) = F(d.VIS_W64(3), s.VIS_W64(3));   \
                                                         \
-        DT0 = d.d;                                      \
+        return d.ll;                                    \
     }                                                   \
                                                         \
     uint32_t name##16s(CPUState *env, uint32_t src1,    \
@@ -333,17 +328,17 @@ void helper_fexpand(CPUState *env)
         return d.l;                                     \
     }                                                   \
                                                         \
-    void name##32(CPUState *env)                        \
+    uint64_t name##32(CPUState *env, uint64_t src1, uint64_t src2) \
     {                                                   \
         VIS64 s, d;                                     \
                                                         \
-        s.d = DT0;                                      \
-        d.d = DT1;                                      \
+        s.ll = src1;                                    \
+        d.ll = src2;                                    \
                                                         \
         d.VIS_L64(0) = F(d.VIS_L64(0), s.VIS_L64(0));   \
         d.VIS_L64(1) = F(d.VIS_L64(1), s.VIS_L64(1));   \
                                                         \
-        DT0 = d.d;                                      \
+        return d.ll;                                    \
     }                                                   \
                                                         \
     uint32_t name##32s(CPUState *env, uint32_t src1,    \
@@ -365,12 +360,12 @@ VIS_HELPER(helper_fpadd, FADD)
 VIS_HELPER(helper_fpsub, FSUB)
 
 #define VIS_CMPHELPER(name, F)                                    \
-    uint64_t name##16(CPUState *env)                              \
+    uint64_t name##16(CPUState *env, uint64_t src1, uint64_t src2) \
     {                                                             \
         VIS64 s, d;                                               \
                                                                   \
-        s.d = DT0;                                                \
-        d.d = DT1;                                                \
+        s.ll = src1;                                              \
+        d.ll = src2;                                              \
                                                                   \
         d.VIS_W64(0) = F(s.VIS_W64(0), d.VIS_W64(0)) ? 1 : 0;     \
         d.VIS_W64(0) |= F(s.VIS_W64(1), d.VIS_W64(1)) ? 2 : 0;    \
@@ -381,12 +376,12 @@ VIS_HELPER(helper_fpsub, FSUB)
         return d.ll;                                              \
     }                                                             \
                                                                   \
-    uint64_t name##32(CPUState *env)                              \
+    uint64_t name##32(CPUState *env, uint64_t src1, uint64_t src2) \
     {                                                             \
         VIS64 s, d;                                               \
                                                                   \
-        s.d = DT0;                                                \
-        d.d = DT1;                                                \
+        s.ll = src1;                                              \
+        d.ll = src2;                                              \
                                                                   \
         d.VIS_L64(0) = F(s.VIS_L64(0), d.VIS_L64(0)) ? 1 : 0;     \
         d.VIS_L64(0) |= F(s.VIS_L64(1), d.VIS_L64(1)) ? 2 : 0;    \
-- 
1.7.6.4
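
For readers following the conversion above, here is a minimal standalone
sketch of the new calling shape: operands arrive as plain 64-bit values and
the result is returned, instead of going through the env->dt0/dt1
temporaries. The name faligndata_sketch and the explicit align parameter
are assumptions made for illustration; the helper in the patch still takes
env and reads the offset from env->gsr & 7.

```c
#include <stdint.h>

/*
 * Hypothetical standalone version of the faligndata computation.
 * `align` stands in for (env->gsr & 7); it is passed explicitly only so
 * that this sketch has no dependency on CPUState.
 */
static uint64_t faligndata_sketch(unsigned align, uint64_t src1, uint64_t src2)
{
    uint64_t tmp;

    align &= 7;
    tmp = src1 << (align * 8);
    /* A 64-bit shift by 64 is undefined in C, so skip the align == 0 case. */
    if (align != 0) {
        tmp |= src2 >> (64 - align * 8);
    }
    return tmp;
}
```

With align == 3 the result is bytes 3..10 of the 16-byte concatenation
src1:src2, and with align == 0 it is src1 unchanged.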


* [Qemu-devel] [PATCH 05/21] target-sparc: Make VIS helpers const when possible.
@ 2011-10-18 18:50 ` Richard Henderson
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

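Mark the VIS helpers that compute their result purely from their
arguments as TCG_CALL_CONST | TCG_CALL_PURE.  This also removes the
unused ENV parameter from these helpers.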

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
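
As an illustration of why these helpers qualify for the const/pure flags,
here is a minimal standalone sketch of a 4x16-bit partitioned add. The
result depends only on the two operands, so no CPU state needs to be passed
in or synchronized around the call. The name fpadd16_sketch and the
shift-based lane extraction are assumptions made for this note; the helpers
in the patch use the VIS64 union and its host-endian lane macros instead.

```c
#include <stdint.h>

/*
 * Hypothetical equivalent of a partitioned 16-bit add: four independent
 * lanes, each wrapping modulo 2^16.  No global or CPU state is read or
 * written, which is the property the const/pure flags describe.
 */
static uint64_t fpadd16_sketch(uint64_t src1, uint64_t src2)
{
    uint64_t result = 0;
    int lane;

    for (lane = 0; lane < 4; lane++) {
        unsigned shift = lane * 16;
        uint16_t a = (uint16_t)(src1 >> shift);
        uint16_t b = (uint16_t)(src2 >> shift);
        /* Truncating to 16 bits keeps a carry from leaking into the next lane. */
        uint16_t sum = (uint16_t)(a + b);

        result |= (uint64_t)sum << shift;
    }
    return result;
}
```

Because the function has no hidden inputs, a call whose result is unused
can be dropped, and identical calls yield identical results.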
 target-sparc/helper.h     |   42 +++++++++++++----------
 target-sparc/translate.c  |   81 ++++++++++++++++++---------------------------
 target-sparc/vis_helper.c |   35 +++++++++----------
 3 files changed, 72 insertions(+), 86 deletions(-)

diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 089233f..9c15b8a 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -16,7 +16,7 @@ DEF_HELPER_1(rdccr, tl, env)
 DEF_HELPER_2(wrccr, void, env, tl)
 DEF_HELPER_1(rdcwp, tl, env)
 DEF_HELPER_2(wrcwp, void, env, tl)
-DEF_HELPER_3(array8, tl, env, tl, tl)
+DEF_HELPER_FLAGS_2(array8, TCG_CALL_CONST | TCG_CALL_PURE, tl, tl, tl)
 DEF_HELPER_3(alignaddr, tl, env, tl, tl)
 DEF_HELPER_1(popc, tl, tl)
 DEF_HELPER_4(ldda_asi, void, env, tl, int, int)
@@ -130,26 +130,32 @@ DEF_HELPER_2(fdtox, s64, env, f64)
 DEF_HELPER_1(fqtox, s64, env)
 DEF_HELPER_3(faligndata, i64, env, i64, i64)
 
-DEF_HELPER_3(fpmerge, i64, env, i64, i64)
-DEF_HELPER_3(fmul8x16, i64, env, i64, i64)
-DEF_HELPER_3(fmul8x16al, i64, env, i64, i64)
-DEF_HELPER_3(fmul8x16au, i64, env, i64, i64)
-DEF_HELPER_3(fmul8sux16, i64, env, i64, i64)
-DEF_HELPER_3(fmul8ulx16, i64, env, i64, i64)
-DEF_HELPER_3(fmuld8sux16, i64, env, i64, i64)
-DEF_HELPER_3(fmuld8ulx16, i64, env, i64, i64)
-DEF_HELPER_3(fexpand, i64, env, i64, i64)
-#define VIS_HELPER(name)                                 \
-    DEF_HELPER_3(f ## name ## 16, i64, env, i64, i64)    \
-    DEF_HELPER_3(f ## name ## 16s, i32, env, i32, i32)   \
-    DEF_HELPER_3(f ## name ## 32, i64, env, i64, i64)    \
-    DEF_HELPER_3(f ## name ## 32s, i32, env, i32, i32)
+DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8x16al, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8x16au, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8ulx16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmuld8ulx16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fexpand, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+#define VIS_HELPER(name)                                                 \
+    DEF_HELPER_FLAGS_2(f ## name ## 16, TCG_CALL_CONST | TCG_CALL_PURE,  \
+                       i64, i64, i64)                                    \
+    DEF_HELPER_FLAGS_2(f ## name ## 16s, TCG_CALL_CONST | TCG_CALL_PURE, \
+                       i32, i32, i32)                                    \
+    DEF_HELPER_FLAGS_2(f ## name ## 32, TCG_CALL_CONST | TCG_CALL_PURE,  \
+                       i64, i64, i64)                                    \
+    DEF_HELPER_FLAGS_2(f ## name ## 32s, TCG_CALL_CONST | TCG_CALL_PURE, \
+                       i32, i32, i32)
 
 VIS_HELPER(padd);
 VIS_HELPER(psub);
-#define VIS_CMPHELPER(name)                              \
-    DEF_HELPER_3(f##name##16, i64, env, i64, i64)        \
-    DEF_HELPER_3(f##name##32, i64, env, i64, i64)
+#define VIS_CMPHELPER(name)                                              \
+    DEF_HELPER_FLAGS_2(f##name##16, TCG_CALL_CONST | TCG_CALL_PURE,      \
+                       i64, i64, i64)                                    \
+    DEF_HELPER_FLAGS_2(f##name##32, TCG_CALL_CONST | TCG_CALL_PURE,      \
+                       i64, i64, i64)
 VIS_CMPHELPER(cmpgt);
 VIS_CMPHELPER(cmpeq);
 VIS_CMPHELPER(cmple);
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index f0614b5..5c70870 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -3902,14 +3902,14 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1 = get_src1(insn, cpu_src1);
                     gen_movl_reg_TN(rs2, cpu_src2);
-                    gen_helper_array8(cpu_dst, cpu_env, cpu_src1, cpu_src2);
+                    gen_helper_array8(cpu_dst, cpu_src1, cpu_src2);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x012: /* VIS I array16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1 = get_src1(insn, cpu_src1);
                     gen_movl_reg_TN(rs2, cpu_src2);
-                    gen_helper_array8(cpu_dst, cpu_env, cpu_src1, cpu_src2);
+                    gen_helper_array8(cpu_dst, cpu_src1, cpu_src2);
                     tcg_gen_shli_i64(cpu_dst, cpu_dst, 1);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
@@ -3917,7 +3917,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1 = get_src1(insn, cpu_src1);
                     gen_movl_reg_TN(rs2, cpu_src2);
-                    gen_helper_array8(cpu_dst, cpu_env, cpu_src1, cpu_src2);
+                    gen_helper_array8(cpu_dst, cpu_src1, cpu_src2);
                     tcg_gen_shli_i64(cpu_dst, cpu_dst, 2);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
@@ -3936,64 +3936,56 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmple16(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmple16(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x022: /* VIS I fcmpne16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmpne16(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmpne16(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x024: /* VIS I fcmple32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmple32(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmple32(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x026: /* VIS I fcmpne32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmpne32(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmpne32(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x028: /* VIS I fcmpgt16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmpgt16(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmpgt16(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x02a: /* VIS I fcmpeq16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmpeq16(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmpeq16(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x02c: /* VIS I fcmpgt32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmpgt32(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmpgt32(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x02e: /* VIS I fcmpeq32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fcmpeq32(cpu_dst, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fcmpeq32(cpu_dst, cpu_src1_64, cpu_src2_64);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x031: /* VIS I fmul8x16 */
@@ -4001,8 +3993,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8x16(cpu_dst_64, cpu_env,
-                                        cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmul8x16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x033: /* VIS I fmul8x16au */
@@ -4010,8 +4001,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8x16au(cpu_dst_64, cpu_env,
-                                          cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmul8x16au(cpu_dst_64, cpu_src1_64,
+                                          cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x035: /* VIS I fmul8x16al */
@@ -4019,8 +4010,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8x16al(cpu_dst_64, cpu_env,
-                                          cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmul8x16al(cpu_dst_64, cpu_src1_64,
+                                          cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x036: /* VIS I fmul8sux16 */
@@ -4028,8 +4019,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8sux16(cpu_dst_64, cpu_env,
-                                          cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmul8sux16(cpu_dst_64, cpu_src1_64,
+                                          cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x037: /* VIS I fmul8ulx16 */
@@ -4037,8 +4028,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8ulx16(cpu_dst_64, cpu_env,
-                                          cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmul8ulx16(cpu_dst_64, cpu_src1_64,
+                                          cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x038: /* VIS I fmuld8sux16 */
@@ -4046,8 +4037,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmuld8sux16(cpu_dst_64, cpu_env,
-                                           cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmuld8sux16(cpu_dst_64, cpu_src1_64,
+                                           cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x039: /* VIS I fmuld8ulx16 */
@@ -4055,8 +4046,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmuld8ulx16(cpu_dst_64, cpu_env,
-                                           cpu_src1_64, cpu_src2_64);
+                    gen_helper_fmuld8ulx16(cpu_dst_64, cpu_src1_64,
+                                           cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x03a: /* VIS I fpack32 */
@@ -4079,8 +4070,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpmerge(cpu_dst_64, cpu_env,
-                                       cpu_src1_64, cpu_src2_64);
+                    gen_helper_fpmerge(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x04c: /* VIS II bshuffle */
@@ -4091,8 +4081,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fexpand(cpu_dst_64, cpu_env,
-                                       cpu_src1_64, cpu_src2_64);
+                    gen_helper_fexpand(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x050: /* VIS I fpadd16 */
@@ -4100,8 +4089,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpadd16(cpu_dst_64, cpu_env,
-                                       cpu_src1_64, cpu_src2_64);
+                    gen_helper_fpadd16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x051: /* VIS I fpadd16s */
@@ -4109,8 +4097,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_32 = gen_load_fpr_F(dc, rs1);
                     cpu_src2_32 = gen_load_fpr_F(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fpadd16s(cpu_dst_32, cpu_env,
-                                        cpu_src1_32, cpu_src2_32);
+                    gen_helper_fpadd16s(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0x052: /* VIS I fpadd32 */
@@ -4118,8 +4105,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpadd32(cpu_dst_64, cpu_env,
-                                       cpu_src1_64, cpu_src2_64);
+                    gen_helper_fpadd32(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x053: /* VIS I fpadd32s */
@@ -4135,8 +4121,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpsub16(cpu_dst_64, cpu_env,
-                                       cpu_src1_64, cpu_src2_64);
+                    gen_helper_fpsub16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x055: /* VIS I fpsub16s */
@@ -4144,8 +4129,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_32 = gen_load_fpr_F(dc, rs1);
                     cpu_src2_32 = gen_load_fpr_F(dc, rs2);
                     cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fpsub16s(cpu_dst_32, cpu_env,
-                                        cpu_src1_32, cpu_src2_32);
+                    gen_helper_fpsub16s(cpu_dst_32, cpu_src1_32, cpu_src2_32);
                     gen_store_fpr_F(dc, rd, cpu_dst_32);
                     break;
                 case 0x056: /* VIS I fpsub32 */
@@ -4153,8 +4137,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
                     cpu_src2_64 = gen_load_fpr_D(dc, rs2);
                     cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpsub32(cpu_dst_64, cpu_env,
-                                       cpu_src1_64, cpu_src2_64);
+                    gen_helper_fpsub32(cpu_dst_64, cpu_src1_64, cpu_src2_64);
                     gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x057: /* VIS I fpsub32s */
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index a007b0f..39c8d9a 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -28,8 +28,7 @@
 #define GET_FIELD_SP(X, FROM, TO)               \
     GET_FIELD(X, 63 - (TO), 63 - (FROM))
 
-target_ulong helper_array8(CPUState *env, target_ulong pixel_addr,
-                           target_ulong cubesize)
+target_ulong helper_array8(target_ulong pixel_addr, target_ulong cubesize)
 {
     return (GET_FIELD_SP(pixel_addr, 60, 63) << (17 + 2 * cubesize)) |
         (GET_FIELD_SP(pixel_addr, 39, 39 + cubesize - 1) << (17 + cubesize)) |
@@ -97,7 +96,7 @@ typedef union {
     float32 f;
 } VIS32;
 
-uint64_t helper_fpmerge(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fpmerge(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
 
@@ -117,7 +116,7 @@ uint64_t helper_fpmerge(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmul8x16(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8x16(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -141,7 +140,7 @@ uint64_t helper_fmul8x16(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmul8x16al(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8x16al(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -165,7 +164,7 @@ uint64_t helper_fmul8x16al(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmul8x16au(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8x16au(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -189,7 +188,7 @@ uint64_t helper_fmul8x16au(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmul8sux16(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8sux16(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -213,7 +212,7 @@ uint64_t helper_fmul8sux16(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmul8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8ulx16(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -237,7 +236,7 @@ uint64_t helper_fmul8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmuld8sux16(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmuld8sux16(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -260,7 +259,7 @@ uint64_t helper_fmuld8sux16(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmuld8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fmuld8ulx16(uint64_t src1, uint64_t src2)
 {
     VIS64 s, d;
     uint32_t tmp;
@@ -283,7 +282,7 @@ uint64_t helper_fmuld8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fexpand(CPUState *env, uint64_t src1, uint64_t src2)
+uint64_t helper_fexpand(uint64_t src1, uint64_t src2)
 {
     VIS32 s;
     VIS64 d;
@@ -299,7 +298,7 @@ uint64_t helper_fexpand(CPUState *env, uint64_t src1, uint64_t src2)
 }
 
 #define VIS_HELPER(name, F)                             \
-    uint64_t name##16(CPUState *env, uint64_t src1, uint64_t src2) \
+    uint64_t name##16(uint64_t src1, uint64_t src2)     \
     {                                                   \
         VIS64 s, d;                                     \
                                                         \
@@ -314,8 +313,7 @@ uint64_t helper_fexpand(CPUState *env, uint64_t src1, uint64_t src2)
         return d.ll;                                    \
     }                                                   \
                                                         \
-    uint32_t name##16s(CPUState *env, uint32_t src1,    \
-                       uint32_t src2)                   \
+    uint32_t name##16s(uint32_t src1, uint32_t src2)    \
     {                                                   \
         VIS32 s, d;                                     \
                                                         \
@@ -328,7 +326,7 @@ uint64_t helper_fexpand(CPUState *env, uint64_t src1, uint64_t src2)
         return d.l;                                     \
     }                                                   \
                                                         \
-    uint64_t name##32(CPUState *env, uint64_t src1, uint64_t src2) \
+    uint64_t name##32(uint64_t src1, uint64_t src2)     \
     {                                                   \
         VIS64 s, d;                                     \
                                                         \
@@ -341,8 +339,7 @@ uint64_t helper_fexpand(CPUState *env, uint64_t src1, uint64_t src2)
         return d.ll;                                    \
     }                                                   \
                                                         \
-    uint32_t name##32s(CPUState *env, uint32_t src1,    \
-                       uint32_t src2)                   \
+    uint32_t name##32s(uint32_t src1, uint32_t src2)    \
     {                                                   \
         VIS32 s, d;                                     \
                                                         \
@@ -360,7 +357,7 @@ VIS_HELPER(helper_fpadd, FADD)
 VIS_HELPER(helper_fpsub, FSUB)
 
 #define VIS_CMPHELPER(name, F)                                    \
-    uint64_t name##16(CPUState *env, uint64_t src1, uint64_t src2) \
+    uint64_t name##16(uint64_t src1, uint64_t src2)               \
     {                                                             \
         VIS64 s, d;                                               \
                                                                   \
@@ -376,7 +373,7 @@ VIS_HELPER(helper_fpsub, FSUB)
         return d.ll;                                              \
     }                                                             \
                                                                   \
-    uint64_t name##32(CPUState *env, uint64_t src1, uint64_t src2) \
+    uint64_t name##32(uint64_t src1, uint64_t src2)               \
     {                                                             \
         VIS64 s, d;                                               \
                                                                   \
-- 
1.7.6.4
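
A note for readers skimming the hunks above: dropping the CPUState
argument is possible because these VIS helpers compute their result
from the two operands alone. Below is a minimal sketch of that idea in
plain C. It is not the code from vis_helper.c: sketch_fpadd16 is a
hypothetical name, and selecting lanes with shifts is an assumption made
for illustration (the helpers in the patch go through the VIS64 union).

```c
#include <stdint.h>

/* Hypothetical stand-in for a partitioned 16-bit add; not QEMU code.
   The result depends only on src1 and src2, so no CPU state pointer
   is needed and repeated calls with equal inputs give equal outputs. */
static uint64_t sketch_fpadd16(uint64_t src1, uint64_t src2)
{
    uint64_t result = 0;
    int lane;

    for (lane = 0; lane < 4; lane++) {
        unsigned shift = 16u * (unsigned)lane;
        uint16_t a = (uint16_t)(src1 >> shift);
        uint16_t b = (uint16_t)(src2 >> shift);
        /* Each lane wraps on its own; no carry crosses into the next. */
        uint16_t sum = (uint16_t)(a + b);

        result |= (uint64_t)sum << shift;
    }
    return result;
}
```

A function of this shape is the kind of helper that can reasonably be
declared with the const/pure call flags, since it reads and writes
nothing beyond its arguments and return value.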


* [Qemu-devel] [PATCH 06/21] target-sparc: Extract common code for floating-point operations.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (4 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 05/21] target-sparc: Make VIS helpers const when possible Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 20:24   ` Blue Swirl
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 07/21] target-sparc: Extract float128 move to a function Richard Henderson
                   ` (15 subsequent siblings)
  21 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/fop_helper.c |    2 +-
 target-sparc/helper.h     |    4 +-
 target-sparc/translate.c  |  840 +++++++++++++++++++++------------------------
 3 files changed, 389 insertions(+), 457 deletions(-)
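
As a reading aid for the translate.c hunks below: the gen_fop_* wrappers
take the helper generator as a function pointer and supply the shared
load / clear-exceptions / call / check-exceptions / store sequence once,
while the gen_ne_fop_* variants omit the exception bracketing. The
following is a rough sketch of that shape in plain C, under loud
assumptions: SketchEnv, sketch_fop_FF, sketch_ne_fop_FF and the two
operations are hypothetical names, and the callbacks here compute values
directly, whereas in the patch the callbacks emit TCG ops at translation
time.

```c
#include <stdint.h>

/* Hypothetical miniature of CPU state; not QEMU's CPUState. */
typedef struct SketchEnv {
    uint32_t fpr[32];   /* single-precision registers, raw bits */
    int exc_flags;      /* accumulated exception flags */
    int checks;         /* how often the exception check ran */
} SketchEnv;

/* One wrapper for all unary ops that may raise exceptions. */
static void sketch_fop_FF(SketchEnv *env, int rd, int rs,
                          uint32_t (*op)(SketchEnv *, uint32_t))
{
    uint32_t dst;

    env->exc_flags = 0;              /* clear before the operation */
    dst = op(env, env->fpr[rs]);
    env->checks++;                   /* check after the operation */
    env->fpr[rd] = dst;
}

/* Variant for ops that cannot raise exceptions: no env, no bracketing. */
static void sketch_ne_fop_FF(SketchEnv *env, int rd, int rs,
                             uint32_t (*op)(uint32_t))
{
    env->fpr[rd] = op(env->fpr[rs]);
}

/* Clears the sign bit of an IEEE single held as raw bits. */
static uint32_t sketch_fabss(uint32_t src)
{
    return src & 0x7fffffffu;
}

/* Illustrative op that records a flag and passes the value through. */
static uint32_t sketch_flagging_copy(SketchEnv *env, uint32_t src)
{
    env->exc_flags |= 1;
    return src;
}
```

With wrappers of this kind, each case in the decoder shrinks to a single
call that names the registers and the operation, which is where most of
the deleted lines in the diffstat come from.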

diff --git a/target-sparc/fop_helper.c b/target-sparc/fop_helper.c
index f6348c2..e652021 100644
--- a/target-sparc/fop_helper.c
+++ b/target-sparc/fop_helper.c
@@ -182,7 +182,7 @@ float32 helper_fabss(float32 src)
 }
 
 #ifdef TARGET_SPARC64
-float64 helper_fabsd(CPUState *env, float64 src)
+float64 helper_fabsd(float64 src)
 {
     return float64_abs(src);
 }
diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 9c15b8a..df367a4 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -48,7 +48,7 @@ DEF_HELPER_5(st_asi, void, env, tl, i64, int, int)
 DEF_HELPER_2(ldfsr, void, env, i32)
 DEF_HELPER_1(check_ieee_exceptions, void, env)
 DEF_HELPER_1(clear_float_exceptions, void, env)
-DEF_HELPER_1(fabss, f32, f32)
+DEF_HELPER_FLAGS_1(fabss, TCG_CALL_CONST | TCG_CALL_PURE, f32, f32)
 DEF_HELPER_2(fsqrts, f32, env, f32)
 DEF_HELPER_2(fsqrtd, f64, env, f64)
 DEF_HELPER_3(fcmps, void, env, f32, f32)
@@ -60,7 +60,7 @@ DEF_HELPER_1(fcmpq, void, env)
 DEF_HELPER_1(fcmpeq, void, env)
 #ifdef TARGET_SPARC64
 DEF_HELPER_2(ldxfsr, void, env, i64)
-DEF_HELPER_2(fabsd, f64, env, f64)
+DEF_HELPER_FLAGS_1(fabsd, TCG_CALL_CONST | TCG_CALL_PURE, f64, f64)
 DEF_HELPER_3(fcmps_fcc1, void, env, f32, f32)
 DEF_HELPER_3(fcmps_fcc2, void, env, f32, f32)
 DEF_HELPER_3(fcmps_fcc3, void, env, f32, f32)
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 5c70870..c47a035 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -24,6 +24,11 @@
 #include <string.h>
 #include <inttypes.h>
 
+/* Turn off the stupid always-inline hack in osdep.h.  This gets in the
+   way of the callback mechanisms we use in this file, generating warnings
+   for always-inline functions called indirectly.  */
+#define always_inline inline
+
 #include "cpu.h"
 #include "disas.h"
 #include "helper.h"
@@ -1627,6 +1632,305 @@ static inline void gen_clear_float_exceptions(void)
     gen_helper_clear_float_exceptions(cpu_env);
 }
 
+static void gen_fop_FF(DisasContext *dc, int rd, int rs,
+                       void (*gen)(TCGv_i32, TCGv_ptr, TCGv_i32))
+{
+    TCGv_i32 dst, src;
+
+    gen_clear_float_exceptions();
+    src = gen_load_fpr_F(dc, rs);
+    dst = gen_dest_fpr_F();
+
+    gen(dst, cpu_env, src);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_F(dc, rd, dst);
+}
+
+static void gen_ne_fop_FF(DisasContext *dc, int rd, int rs,
+                          void (*gen)(TCGv_i32, TCGv_i32))
+{
+    TCGv_i32 dst, src;
+
+    src = gen_load_fpr_F(dc, rs);
+    dst = gen_dest_fpr_F();
+
+    gen(dst, src);
+
+    gen_store_fpr_F(dc, rd, dst);
+}
+
+static void gen_fop_FFF(DisasContext *dc, int rd, int rs1, int rs2,
+                        void (*gen)(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32))
+{
+    TCGv_i32 dst, src1, src2;
+
+    gen_clear_float_exceptions();
+    src1 = gen_load_fpr_F(dc, rs1);
+    src2 = gen_load_fpr_F(dc, rs2);
+    dst = gen_dest_fpr_F();
+
+    gen(dst, cpu_env, src1, src2);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_F(dc, rd, dst);
+}
+
+#ifdef TARGET_SPARC64
+static void gen_ne_fop_FFF(DisasContext *dc, int rd, int rs1, int rs2,
+                           void (*gen)(TCGv_i32, TCGv_i32, TCGv_i32))
+{
+    TCGv_i32 dst, src1, src2;
+
+    src1 = gen_load_fpr_F(dc, rs1);
+    src2 = gen_load_fpr_F(dc, rs2);
+    dst = gen_dest_fpr_F();
+
+    gen(dst, src1, src2);
+
+    gen_store_fpr_F(dc, rd, dst);
+}
+#endif
+
+static void gen_fop_DD(DisasContext *dc, int rd, int rs,
+                       void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i64))
+{
+    TCGv_i64 dst, src;
+
+    gen_clear_float_exceptions();
+    src = gen_load_fpr_D(dc, rs);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_env, src);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_D(dc, rd, dst);
+}
+
+#ifdef TARGET_SPARC64
+static void gen_ne_fop_DD(DisasContext *dc, int rd, int rs,
+                          void (*gen)(TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 dst, src;
+
+    src = gen_load_fpr_D(dc, rs);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, src);
+
+    gen_store_fpr_D(dc, rd, dst);
+}
+#endif
+
+static void gen_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
+                        void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 dst, src1, src2;
+
+    gen_clear_float_exceptions();
+    src1 = gen_load_fpr_D(dc, rs1);
+    src2 = gen_load_fpr_D(dc, rs2);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_env, src1, src2);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_D(dc, rd, dst);
+}
+
+#ifdef TARGET_SPARC64
+static void gen_ne_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
+                           void (*gen)(TCGv_i64, TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 dst, src1, src2;
+
+    src1 = gen_load_fpr_D(dc, rs1);
+    src2 = gen_load_fpr_D(dc, rs2);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, src1, src2);
+
+    gen_store_fpr_D(dc, rd, dst);
+}
+#endif
+
+static void gen_fop_QQ(DisasContext *dc, int rd, int rs,
+                       void (*gen)(TCGv_ptr))
+{
+    gen_clear_float_exceptions();
+    gen_op_load_fpr_QT1(QFPREG(rs));
+
+    gen(cpu_env);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_op_store_QT0_fpr(QFPREG(rd));
+    gen_update_fprs_dirty(QFPREG(rd));
+}
+
+#ifdef TARGET_SPARC64
+static void gen_ne_fop_QQ(DisasContext *dc, int rd, int rs,
+                          void (*gen)(TCGv_ptr))
+{
+    gen_op_load_fpr_QT1(QFPREG(rs));
+
+    gen(cpu_env);
+
+    gen_op_store_QT0_fpr(QFPREG(rd));
+    gen_update_fprs_dirty(QFPREG(rd));
+}
+#endif
+
+static void gen_fop_QQQ(DisasContext *dc, int rd, int rs1, int rs2,
+                        void (*gen)(TCGv_ptr))
+{
+    gen_clear_float_exceptions();
+    gen_op_load_fpr_QT0(QFPREG(rs1));
+    gen_op_load_fpr_QT1(QFPREG(rs2));
+
+    gen(cpu_env);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_op_store_QT0_fpr(QFPREG(rd));
+    gen_update_fprs_dirty(QFPREG(rd));
+}
+
+static void gen_fop_DFF(DisasContext *dc, int rd, int rs1, int rs2,
+                        void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i32, TCGv_i32))
+{
+    TCGv_i64 dst;
+    TCGv_i32 src1, src2;
+
+    gen_clear_float_exceptions();
+    src1 = gen_load_fpr_F(dc, rs1);
+    src2 = gen_load_fpr_F(dc, rs2);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_env, src1, src2);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_D(dc, rd, dst);
+}
+
+static void gen_fop_QDD(DisasContext *dc, int rd, int rs1, int rs2,
+                        void (*gen)(TCGv_ptr, TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 src1, src2;
+
+    gen_clear_float_exceptions();
+    src1 = gen_load_fpr_D(dc, rs1);
+    src2 = gen_load_fpr_D(dc, rs2);
+
+    gen(cpu_env, src1, src2);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_op_store_QT0_fpr(QFPREG(rd));
+    gen_update_fprs_dirty(QFPREG(rd));
+}
+
+#ifdef TARGET_SPARC64
+static void gen_fop_DF(DisasContext *dc, int rd, int rs,
+                       void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i32))
+{
+    TCGv_i64 dst;
+    TCGv_i32 src;
+
+    gen_clear_float_exceptions();
+    src = gen_load_fpr_F(dc, rs);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_env, src);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_D(dc, rd, dst);
+}
+#endif
+
+static void gen_ne_fop_DF(DisasContext *dc, int rd, int rs,
+                          void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i32))
+{
+    TCGv_i64 dst;
+    TCGv_i32 src;
+
+    src = gen_load_fpr_F(dc, rs);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_env, src);
+
+    gen_store_fpr_D(dc, rd, dst);
+}
+
+static void gen_fop_FD(DisasContext *dc, int rd, int rs,
+                       void (*gen)(TCGv_i32, TCGv_ptr, TCGv_i64))
+{
+    TCGv_i32 dst;
+    TCGv_i64 src;
+
+    gen_clear_float_exceptions();
+    src = gen_load_fpr_D(dc, rs);
+    dst = gen_dest_fpr_F();
+
+    gen(dst, cpu_env, src);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_F(dc, rd, dst);
+}
+
+static void gen_fop_FQ(DisasContext *dc, int rd, int rs,
+                       void (*gen)(TCGv_i32, TCGv_ptr))
+{
+    TCGv_i32 dst;
+
+    gen_clear_float_exceptions();
+    gen_op_load_fpr_QT1(QFPREG(rs));
+    dst = gen_dest_fpr_F();
+
+    gen(dst, cpu_env);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_F(dc, rd, dst);
+}
+
+static void gen_fop_DQ(DisasContext *dc, int rd, int rs,
+                       void (*gen)(TCGv_i64, TCGv_ptr))
+{
+    TCGv_i64 dst;
+
+    gen_clear_float_exceptions();
+    gen_op_load_fpr_QT1(QFPREG(rs));
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_env);
+
+    gen_helper_check_ieee_exceptions(cpu_env);
+    gen_store_fpr_D(dc, rd, dst);
+}
+
+static void gen_ne_fop_QF(DisasContext *dc, int rd, int rs,
+                          void (*gen)(TCGv_ptr, TCGv_i32))
+{
+    TCGv_i32 src;
+
+    src = gen_load_fpr_F(dc, rs);
+
+    gen(cpu_env, src);
+
+    gen_op_store_QT0_fpr(QFPREG(rd));
+    gen_update_fprs_dirty(QFPREG(rd));
+}
+
+static void gen_ne_fop_QD(DisasContext *dc, int rd, int rs,
+                          void (*gen)(TCGv_ptr, TCGv_i64))
+{
+    TCGv_i64 src;
+
+    src = gen_load_fpr_D(dc, rs);
+
+    gen(cpu_env, src);
+
+    gen_op_store_QT0_fpr(QFPREG(rd));
+    gen_update_fprs_dirty(QFPREG(rd));
+}
+
 /* asi moves */
 #ifdef TARGET_SPARC64
 static inline TCGv_i32 gen_get_asi(int insn, TCGv r_addr)
@@ -2415,279 +2719,115 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_store_fpr_F(dc, rd, cpu_src1_32);
                     break;
                 case 0x5: /* fnegs */
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fnegs(cpu_dst_32, cpu_src1_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FF(dc, rd, rs2, gen_helper_fnegs);
                     break;
                 case 0x9: /* fabss */
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fabss(cpu_dst_32, cpu_src1_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FF(dc, rd, rs2, gen_helper_fabss);
                     break;
                 case 0x29: /* fsqrts */
                     CHECK_FPU_FEATURE(dc, FSQRT);
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fsqrts(cpu_dst_32, cpu_env, cpu_src1_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FF(dc, rd, rs2, gen_helper_fsqrts);
                     break;
                 case 0x2a: /* fsqrtd */
                     CHECK_FPU_FEATURE(dc, FSQRT);
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fsqrtd(cpu_dst_64, cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DD(dc, rd, rs2, gen_helper_fsqrtd);
                     break;
                 case 0x2b: /* fsqrtq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    gen_helper_fsqrtq(cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_fop_QQ(dc, rd, rs2, gen_helper_fsqrtq);
                     break;
                 case 0x41: /* fadds */
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fadds(cpu_dst_32, cpu_env,
-                                     cpu_src1_32, cpu_src2_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FFF(dc, rd, rs1, rs2, gen_helper_fadds);
                     break;
                 case 0x42: /* faddd */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_faddd(cpu_dst_64, cpu_env,
-                                     cpu_src1_64, cpu_src2_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DDD(dc, rd, rs1, rs2, gen_helper_faddd);
                     break;
                 case 0x43: /* faddq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT0(QFPREG(rs1));
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    gen_helper_faddq(cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_fop_QQQ(dc, rd, rs1, rs2, gen_helper_faddq);
                     break;
                 case 0x45: /* fsubs */
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fsubs(cpu_dst_32, cpu_env,
-                                     cpu_src1_32, cpu_src2_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FFF(dc, rd, rs1, rs2, gen_helper_fsubs);
                     break;
                 case 0x46: /* fsubd */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fsubd(cpu_dst_64, cpu_env,
-                                     cpu_src1_64, cpu_src2_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DDD(dc, rd, rs1, rs2, gen_helper_fsubd);
                     break;
                 case 0x47: /* fsubq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT0(QFPREG(rs1));
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    gen_helper_fsubq(cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_fop_QQQ(dc, rd, rs1, rs2, gen_helper_fsubq);
                     break;
                 case 0x49: /* fmuls */
                     CHECK_FPU_FEATURE(dc, FMUL);
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fmuls(cpu_dst_32, cpu_env,
-                                     cpu_src1_32, cpu_src2_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FFF(dc, rd, rs1, rs2, gen_helper_fmuls);
                     break;
                 case 0x4a: /* fmuld */
                     CHECK_FPU_FEATURE(dc, FMUL);
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmuld(cpu_dst_64, cpu_env,
-                                     cpu_src1_64, cpu_src2_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmuld);
                     break;
                 case 0x4b: /* fmulq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
                     CHECK_FPU_FEATURE(dc, FMUL);
-                    gen_op_load_fpr_QT0(QFPREG(rs1));
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    gen_helper_fmulq(cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_fop_QQQ(dc, rd, rs1, rs2, gen_helper_fmulq);
                     break;
                 case 0x4d: /* fdivs */
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fdivs(cpu_dst_32, cpu_env,
-                                     cpu_src1_32, cpu_src2_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FFF(dc, rd, rs1, rs2, gen_helper_fdivs);
                     break;
                 case 0x4e: /* fdivd */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fdivd(cpu_dst_64, cpu_env,
-                                     cpu_src1_64, cpu_src2_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DDD(dc, rd, rs1, rs2, gen_helper_fdivd);
                     break;
                 case 0x4f: /* fdivq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT0(QFPREG(rs1));
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    gen_helper_fdivq(cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_fop_QQQ(dc, rd, rs1, rs2, gen_helper_fdivq);
                     break;
                 case 0x69: /* fsmuld */
                     CHECK_FPU_FEATURE(dc, FSMULD);
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fsmuld(cpu_dst_64, cpu_env,
-                                      cpu_src1_32, cpu_src2_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DFF(dc, rd, rs1, rs2, gen_helper_fsmuld);
                     break;
                 case 0x6e: /* fdmulq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fdmulq(cpu_env, cpu_src1_64, cpu_src2_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_fop_QDD(dc, rd, rs1, rs2, gen_helper_fdmulq);
                     break;
                 case 0xc4: /* fitos */
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fitos(cpu_dst_32, cpu_env, cpu_src1_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FF(dc, rd, rs2, gen_helper_fitos);
                     break;
                 case 0xc6: /* fdtos */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fdtos(cpu_dst_32, cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FD(dc, rd, rs2, gen_helper_fdtos);
                     break;
                 case 0xc7: /* fqtos */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fqtos(cpu_dst_32, cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FQ(dc, rd, rs2, gen_helper_fqtos);
                     break;
                 case 0xc8: /* fitod */
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fitod(cpu_dst_64, cpu_env, cpu_src1_32);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DF(dc, rd, rs2, gen_helper_fitod);
                     break;
                 case 0xc9: /* fstod */
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fstod(cpu_dst_64, cpu_env, cpu_src1_32);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DF(dc, rd, rs2, gen_helper_fstod);
                     break;
                 case 0xcb: /* fqtod */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_clear_float_exceptions();
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fqtod(cpu_dst_64, cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DQ(dc, rd, rs2, gen_helper_fqtod);
                     break;
                 case 0xcc: /* fitoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    gen_helper_fitoq(cpu_env, cpu_src1_32);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_ne_fop_QF(dc, rd, rs2, gen_helper_fitoq);
                     break;
                 case 0xcd: /* fstoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    gen_helper_fstoq(cpu_env, cpu_src1_32);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_ne_fop_QF(dc, rd, rs2, gen_helper_fstoq);
                     break;
                 case 0xce: /* fdtoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fdtoq(cpu_env, cpu_src1_64);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_ne_fop_QD(dc, rd, rs2, gen_helper_fdtoq);
                     break;
                 case 0xd1: /* fstoi */
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fstoi(cpu_dst_32, cpu_env, cpu_src1_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FF(dc, rd, rs2, gen_helper_fstoi);
                     break;
                 case 0xd2: /* fdtoi */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fdtoi(cpu_dst_32, cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FD(dc, rd, rs2, gen_helper_fdtoi);
                     break;
                 case 0xd3: /* fqtoi */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fqtoi(cpu_dst_32, cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FQ(dc, rd, rs2, gen_helper_fqtoi);
                     break;
 #ifdef TARGET_SPARC64
                 case 0x2: /* V9 fmovd */
@@ -2707,80 +2847,38 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_update_fprs_dirty(QFPREG(rd));
                     break;
                 case 0x6: /* V9 fnegd */
-                    cpu_src1_64 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fnegd(cpu_dst_64, cpu_src1_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DD(dc, rd, rs2, gen_helper_fnegd);
                     break;
                 case 0x7: /* V9 fnegq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_helper_fnegq(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_ne_fop_QQ(dc, rd, rs2, gen_helper_fnegq);
                     break;
                 case 0xa: /* V9 fabsd */
-                    cpu_src1_64 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fabsd(cpu_dst_64, cpu_env, cpu_src1_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DD(dc, rd, rs2, gen_helper_fabsd);
                     break;
                 case 0xb: /* V9 fabsq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_helper_fabsq(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_ne_fop_QQ(dc, rd, rs2, gen_helper_fabsq);
                     break;
                 case 0x81: /* V9 fstox */
-                    gen_clear_float_exceptions();
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fstox(cpu_dst_64, cpu_env, cpu_src1_32);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DF(dc, rd, rs2, gen_helper_fstox);
                     break;
                 case 0x82: /* V9 fdtox */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fdtox(cpu_dst_64, cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DD(dc, rd, rs2, gen_helper_fdtox);
                     break;
                 case 0x83: /* V9 fqtox */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_op_load_fpr_QT1(QFPREG(rs2));
-                    gen_clear_float_exceptions();
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fqtox(cpu_dst_64, cpu_env);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DQ(dc, rd, rs2, gen_helper_fqtox);
                     break;
                 case 0x84: /* V9 fxtos */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fxtos(cpu_dst_32, cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_fop_FD(dc, rd, rs2, gen_helper_fxtos);
                     break;
                 case 0x88: /* V9 fxtod */
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fxtod(cpu_dst_64, cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_fop_DD(dc, rd, rs2, gen_helper_fxtod);
                     break;
                 case 0x8c: /* V9 fxtoq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    gen_clear_float_exceptions();
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    gen_helper_fxtoq(cpu_env, cpu_src1_64);
-                    gen_helper_check_ieee_exceptions(cpu_env);
-                    gen_op_store_QT0_fpr(QFPREG(rd));
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_ne_fop_QD(dc, rd, rs2, gen_helper_fxtoq);
                     break;
 #endif
                 default:
@@ -3990,65 +4088,31 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x031: /* VIS I fmul8x16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8x16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8x16);
                     break;
                 case 0x033: /* VIS I fmul8x16au */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8x16au(cpu_dst_64, cpu_src1_64,
-                                          cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8x16au);
                     break;
                 case 0x035: /* VIS I fmul8x16al */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8x16al(cpu_dst_64, cpu_src1_64,
-                                          cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8x16al);
                     break;
                 case 0x036: /* VIS I fmul8sux16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8sux16(cpu_dst_64, cpu_src1_64,
-                                          cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8sux16);
                     break;
                 case 0x037: /* VIS I fmul8ulx16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmul8ulx16(cpu_dst_64, cpu_src1_64,
-                                          cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8ulx16);
                     break;
                 case 0x038: /* VIS I fmuld8sux16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmuld8sux16(cpu_dst_64, cpu_src1_64,
-                                           cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmuld8sux16);
                     break;
                 case 0x039: /* VIS I fmuld8ulx16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fmuld8ulx16(cpu_dst_64, cpu_src1_64,
-                                           cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmuld8ulx16);
                     break;
                 case 0x03a: /* VIS I fpack32 */
                 case 0x03b: /* VIS I fpack16 */
@@ -4067,86 +4131,46 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x04b: /* VIS I fpmerge */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpmerge(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpmerge);
                     break;
                 case 0x04c: /* VIS II bshuffle */
                     // XXX
                     goto illegal_insn;
                 case 0x04d: /* VIS I fexpand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fexpand(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fexpand);
                     break;
                 case 0x050: /* VIS I fpadd16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpadd16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpadd16);
                     break;
                 case 0x051: /* VIS I fpadd16s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fpadd16s(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, gen_helper_fpadd16s);
                     break;
                 case 0x052: /* VIS I fpadd32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpadd32(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpadd32);
                     break;
                 case 0x053: /* VIS I fpadd32s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_add_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_add_i32);
                     break;
                 case 0x054: /* VIS I fpsub16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpsub16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpsub16);
                     break;
                 case 0x055: /* VIS I fpsub16s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    gen_helper_fpsub16s(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, gen_helper_fpsub16s);
                     break;
                 case 0x056: /* VIS I fpsub32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpsub32(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpsub32);
                     break;
                 case 0x057: /* VIS I fpsub32s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_sub_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_sub_i32);
                     break;
                 case 0x060: /* VIS I fzero */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4162,143 +4186,75 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x062: /* VIS I fnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_nor_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_nor_i64);
                     break;
                 case 0x063: /* VIS I fnors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_nor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_nor_i32);
                     break;
                 case 0x064: /* VIS I fandnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_andc_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_andc_i64);
                     break;
                 case 0x065: /* VIS I fandnot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_andc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_andc_i32);
                     break;
                 case 0x066: /* VIS I fnot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_not_i64(cpu_dst_64, cpu_src1_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DD(dc, rd, rs2, tcg_gen_not_i64);
                     break;
                 case 0x067: /* VIS I fnot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FF(dc, rd, rs2, tcg_gen_not_i32);
                     break;
                 case 0x068: /* VIS I fandnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_andc_i64(cpu_dst_64, cpu_src2_64, cpu_src1_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs2, rs1, tcg_gen_andc_i64);
                     break;
                 case 0x069: /* VIS I fandnot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_andc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs2, rs1, tcg_gen_andc_i32);
                     break;
                 case 0x06a: /* VIS I fnot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_not_i64(cpu_dst_64, cpu_src1_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DD(dc, rd, rs1, tcg_gen_not_i64);
                     break;
                 case 0x06b: /* VIS I fnot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FF(dc, rd, rs1, tcg_gen_not_i32);
                     break;
                 case 0x06c: /* VIS I fxor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_xor_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_xor_i64);
                     break;
                 case 0x06d: /* VIS I fxors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_xor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_xor_i32);
                     break;
                 case 0x06e: /* VIS I fnand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_nand_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_nand_i64);
                     break;
                 case 0x06f: /* VIS I fnands */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_nand_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_nand_i32);
                     break;
                 case 0x070: /* VIS I fand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_and_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_and_i64);
                     break;
                 case 0x071: /* VIS I fands */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_and_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_and_i32);
                     break;
                 case 0x072: /* VIS I fxnor */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_eqv_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_eqv_i64);
                     break;
                 case 0x073: /* VIS I fxnors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_eqv_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_eqv_i32);
                     break;
                 case 0x074: /* VIS I fsrc1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4312,19 +4268,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x076: /* VIS I fornot2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_orc_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_orc_i64);
                     break;
                 case 0x077: /* VIS I fornot2s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_orc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_orc_i32);
                     break;
                 case 0x078: /* VIS I fsrc2 */
                     CHECK_FPU_FEATURE(dc, VIS1);
@@ -4338,35 +4286,19 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x07a: /* VIS I fornot1 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_orc_i64(cpu_dst_64, cpu_src2_64, cpu_src1_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs2, rs1, tcg_gen_orc_i64);
                     break;
                 case 0x07b: /* VIS I fornot1s */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_orc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs2, rs1, tcg_gen_orc_i32);
                     break;
                 case 0x07c: /* VIS I for */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    tcg_gen_or_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_or_i64);
                     break;
                 case 0x07d: /* VIS I fors */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
-                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
-                    cpu_dst_32 = gen_dest_fpr_F();
-                    tcg_gen_or_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
-                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_or_i32);
                     break;
                 case 0x07e: /* VIS I fone */
                     CHECK_FPU_FEATURE(dc, VIS1);
-- 
1.7.6.4
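
Reader's note on the operand order above: most call sites pass (rd, rs1, rs2), but the fandnot1/fornot1 cases pass (rd, rs2, rs1). The removed code already swapped the sources when calling tcg_gen_andc/tcg_gen_orc, and the new helper call has to preserve that swap. The block below is a standalone sketch on plain integers, not QEMU code. The names andc64, orc64, apply_ddd and fpr_d are hypothetical stand-ins chosen for illustration; only the call shape is taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a double-precision register file (not QEMU state). */
static uint64_t fpr_d[32];

/* Plain-integer stand-ins for the TCG ops: andc is a & ~b, orc is a | ~b. */
static void andc64(uint64_t *dst, uint64_t a, uint64_t b) { *dst = a & ~b; }
static void orc64(uint64_t *dst, uint64_t a, uint64_t b)  { *dst = a | ~b; }

typedef void (*gen_ddd_fn)(uint64_t *dst, uint64_t a, uint64_t b);

/* Same call shape as the gen_ne_fop_DDD sites in the diff:
   load two sources, apply the operation, store the result to rd. */
static void apply_ddd(int rd, int rs1, int rs2, gen_ddd_fn gen)
{
    uint64_t dst;

    assert(rd >= 0 && rd < 32);
    assert(rs1 >= 0 && rs1 < 32);
    assert(rs2 >= 0 && rs2 < 32);

    gen(&dst, fpr_d[rs1], fpr_d[rs2]);
    fpr_d[rd] = dst;
}

/* fandnot2: rd = rs1 & ~rs2, so the sources go through in order. */
static void fandnot2(int rd, int rs1, int rs2)
{
    apply_ddd(rd, rs1, rs2, andc64);
}

/* fandnot1: rd = ~rs1 & rs2, so the sources are swapped at the call site. */
static void fandnot1(int rd, int rs1, int rs2)
{
    apply_ddd(rd, rs2, rs1, andc64);
}

/* fornot1: rd = ~rs1 | rs2, swapped for the same reason. */
static void fornot1(int rd, int rs1, int rs2)
{
    apply_ddd(rd, rs2, rs1, orc64);
}
```

Keeping the swap at the call site lets one generic helper serve both the "not1" and "not2" variants without a second set of TCG ops.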

* [Qemu-devel] [PATCH 07/21] target-sparc: Extract float128 move to a function.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (5 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 06/21] target-sparc: Extract common code for floating-point operations Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 08/21] target-sparc: Undo cpu_fpr rename Richard Henderson
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/translate.c |   50 ++++++++++++++++-----------------------------
 1 files changed, 18 insertions(+), 32 deletions(-)
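
Reader's note: this patch has no description beyond its subject, so here is a rough model of what the new gen_move_Q does, written on a plain array instead of TCG globals. It is a sketch under stated assumptions, not the QEMU implementation. The QFPREG definition is what I believe the SPARC64 build uses and should be checked against the tree. The names move_q, fpr and fprs_dirty are hypothetical, and the dirty-bit mapping (bit 0 for the lower 32 registers, bit 1 for the upper 32) is assumed from the FPRS.DL/DU layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SPARC64 mapping from the 5-bit register field to the index of the
   first 32-bit slot of a quad register; verify against translate.c. */
#define QFPREG(r) ((((r) & 1) << 5) | ((r) & 0x1c))

/* Hypothetical flat model: 64 slots of 32 bits each. */
static uint32_t fpr[64];

/* Hypothetical dirty mask: bit 0 = lower half touched, bit 1 = upper half. */
static unsigned fprs_dirty;

/* Model of gen_move_Q: decode both register numbers once, copy the four
   32-bit slots that make up a float128, then mark the destination dirty. */
static void move_q(int rd, int rs)
{
    int i;

    rd = QFPREG(rd);
    rs = QFPREG(rs);
    assert(rd + 3 < 64 && rs + 3 < 64);

    for (i = 0; i < 4; i++) {
        fpr[rd + i] = fpr[rs + i];
    }
    fprs_dirty |= (rd < 32) ? 1u : 2u;
}
```

Decoding rd and rs once at the top is what lets the four call sites in the diff shrink to a single line each.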

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index c47a035..f37dbb1 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -227,6 +227,20 @@ static void gen_op_store_QT0_fpr(unsigned int dst)
                    offsetof(CPU_QuadU, l.lowest));
 }
 
+#ifdef TARGET_SPARC64
+static void gen_move_Q(int rd, int rs)
+{
+    rd = QFPREG(rd);
+    rs = QFPREG(rs);
+
+    tcg_gen_mov_i32(cpu__fpr[rd], cpu__fpr[rs]);
+    tcg_gen_mov_i32(cpu__fpr[rd + 1], cpu__fpr[rs + 1]);
+    tcg_gen_mov_i32(cpu__fpr[rd + 2], cpu__fpr[rs + 2]);
+    tcg_gen_mov_i32(cpu__fpr[rd + 3], cpu__fpr[rs + 3]);
+    gen_update_fprs_dirty(rd);
+}
+#endif
+
 /* moves */
 #ifdef CONFIG_USER_ONLY
 #define supervisor(dc) 0
@@ -2836,15 +2850,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x3: /* V9 fmovq */
                     CHECK_FPU_FEATURE(dc, FLOAT128);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)],
-                                    cpu__fpr[QFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1],
-                                    cpu__fpr[QFPREG(rs2) + 1]);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2],
-                                    cpu__fpr[QFPREG(rs2) + 2]);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3],
-                                    cpu__fpr[QFPREG(rs2) + 3]);
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_move_Q(rd, rs2);
                     break;
                 case 0x6: /* V9 fnegd */
                     gen_ne_fop_DD(dc, rd, rs2, gen_helper_fnegd);
@@ -2929,11 +2935,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     cpu_src1 = get_src1(insn, cpu_src1);
                     tcg_gen_brcondi_tl(gen_tcg_cond_reg[cond], cpu_src1,
                                        0, l1);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)], cpu__fpr[QFPREG(rs2)]);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1], cpu__fpr[QFPREG(rs2) + 1]);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2], cpu__fpr[QFPREG(rs2) + 2]);
-                    tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3], cpu__fpr[QFPREG(rs2) + 3]);
-                    gen_update_fprs_dirty(QFPREG(rd));
+                    gen_move_Q(rd, rs2);
                     gen_set_label(l1);
                     break;
                 }
@@ -2983,15 +2985,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                         gen_fcond(r_cond, fcc, cond);                   \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)],           \
-                                        cpu__fpr[QFPREG(rs2)]);         \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1],       \
-                                        cpu__fpr[QFPREG(rs2) + 1]);     \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2],       \
-                                        cpu__fpr[QFPREG(rs2) + 2]);     \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3],       \
-                                        cpu__fpr[QFPREG(rs2) + 3]);     \
-                        gen_update_fprs_dirty(QFPREG(rd));              \
+                        gen_move_Q(rd, rs2);                            \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
                     }
@@ -3082,15 +3076,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                         gen_cond(r_cond, icc, cond, dc);                \
                         tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond,         \
                                            0, l1);                      \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd)],           \
-                                        cpu__fpr[QFPREG(rs2)]);         \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 1],       \
-                                        cpu__fpr[QFPREG(rs2) + 1]);     \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 2],       \
-                                        cpu__fpr[QFPREG(rs2) + 2]);     \
-                        tcg_gen_mov_i32(cpu__fpr[QFPREG(rd) + 3],       \
-                                        cpu__fpr[QFPREG(rs2) + 3]);     \
-                        gen_update_fprs_dirty(QFPREG(rd));              \
+                        gen_move_Q(rd, rs2);                            \
                         gen_set_label(l1);                              \
                         tcg_temp_free(r_cond);                          \
                     }
-- 
1.7.6.4


* [Qemu-devel] [PATCH 08/21] target-sparc: Undo cpu_fpr rename.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (6 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 07/21] target-sparc: Extract float128 move to a function Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 09/21] target-sparc: Change fpr representation to doubles Richard Henderson
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/translate.c |   56 +++++++++++++++++++++++-----------------------
 1 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index f37dbb1..f8d3bf2 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -68,7 +68,7 @@ static TCGv cpu_tmp0;
 static TCGv_i32 cpu_tmp32;
 static TCGv_i64 cpu_tmp64;
 /* Floating point registers */
-static TCGv_i32 cpu__fpr[TARGET_FPREGS];
+static TCGv_i32 cpu_fpr[TARGET_FPREGS];
 
 static target_ulong gen_opc_npc[OPC_BUF_SIZE];
 static target_ulong gen_opc_jump_pc[2];
@@ -131,12 +131,12 @@ static inline void gen_update_fprs_dirty(int rd)
 /* floating point registers moves */
 static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
 {
-    return cpu__fpr[src];
+    return cpu_fpr[src];
 }
 
 static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
 {
-    tcg_gen_mov_i32(cpu__fpr[dst], v);
+    tcg_gen_mov_i32(cpu_fpr[dst], v);
     gen_update_fprs_dirty(dst);
 }
 
@@ -151,13 +151,13 @@ static TCGv_i64 gen_load_fpr_D(DisasContext *dc, unsigned int src)
     src = DFPREG(src);
 
 #if TCG_TARGET_REG_BITS == 32
-    tcg_gen_mov_i32(TCGV_HIGH(ret), cpu__fpr[src]);
-    tcg_gen_mov_i32(TCGV_LOW(ret), cpu__fpr[src + 1]);
+    tcg_gen_mov_i32(TCGV_HIGH(ret), cpu_fpr[src]);
+    tcg_gen_mov_i32(TCGV_LOW(ret), cpu_fpr[src + 1]);
 #else
     {
         TCGv_i64 t = tcg_temp_new_i64();
-        tcg_gen_extu_i32_i64(ret, cpu__fpr[src]);
-        tcg_gen_extu_i32_i64(t, cpu__fpr[src + 1]);
+        tcg_gen_extu_i32_i64(ret, cpu_fpr[src]);
+        tcg_gen_extu_i32_i64(t, cpu_fpr[src + 1]);
         tcg_gen_shli_i64(ret, ret, 32);
         tcg_gen_or_i64(ret, ret, t);
         tcg_temp_free_i64(t);
@@ -178,9 +178,9 @@ static void gen_store_fpr_D(DisasContext *dc, unsigned int dst, TCGv_i64 v)
     tcg_gen_mov_i32(cpu__fpu[dst], TCGV_HIGH(v));
     tcg_gen_mov_i32(cpu__fpu[dst + 1], TCGV_LOW(v));
 #else
-    tcg_gen_trunc_i64_i32(cpu__fpr[dst + 1], v);
+    tcg_gen_trunc_i64_i32(cpu_fpr[dst + 1], v);
     tcg_gen_shri_i64(v, v, 32);
-    tcg_gen_trunc_i64_i32(cpu__fpr[dst], v);
+    tcg_gen_trunc_i64_i32(cpu_fpr[dst], v);
 #endif
 
     gen_update_fprs_dirty(dst);
@@ -193,37 +193,37 @@ static TCGv_i64 gen_dest_fpr_D(void)
 
 static void gen_op_load_fpr_QT0(unsigned int src)
 {
-    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upper));
-    tcg_gen_st_i32(cpu__fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lower));
-    tcg_gen_st_i32(cpu__fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lowest));
 }
 
 static void gen_op_load_fpr_QT1(unsigned int src)
 {
-    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.upper));
-    tcg_gen_st_i32(cpu__fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.lower));
-    tcg_gen_st_i32(cpu__fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt1) +
+    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt1) +
                    offsetof(CPU_QuadU, l.lowest));
 }
 
 static void gen_op_store_QT0_fpr(unsigned int dst)
 {
-    tcg_gen_ld_i32(cpu__fpr[dst], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu_fpr[dst], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_ld_i32(cpu__fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu_fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.upper));
-    tcg_gen_ld_i32(cpu__fpr[dst + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu_fpr[dst + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lower));
-    tcg_gen_ld_i32(cpu__fpr[dst + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
+    tcg_gen_ld_i32(cpu_fpr[dst + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
                    offsetof(CPU_QuadU, l.lowest));
 }
 
@@ -233,10 +233,10 @@ static void gen_move_Q(int rd, int rs)
     rd = QFPREG(rd);
     rs = QFPREG(rs);
 
-    tcg_gen_mov_i32(cpu__fpr[rd], cpu__fpr[rs]);
-    tcg_gen_mov_i32(cpu__fpr[rd + 1], cpu__fpr[rs + 1]);
-    tcg_gen_mov_i32(cpu__fpr[rd + 2], cpu__fpr[rs + 2]);
-    tcg_gen_mov_i32(cpu__fpr[rd + 3], cpu__fpr[rs + 3]);
+    tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs]);
+    tcg_gen_mov_i32(cpu_fpr[rd + 1], cpu_fpr[rs + 1]);
+    tcg_gen_mov_i32(cpu_fpr[rd + 2], cpu_fpr[rs + 2]);
+    tcg_gen_mov_i32(cpu_fpr[rd + 3], cpu_fpr[rs + 3]);
     gen_update_fprs_dirty(rd);
 }
 #endif
@@ -5260,9 +5260,9 @@ void gen_intermediate_code_init(CPUSPARCState *env)
                                               offsetof(CPUState, gregs[i]),
                                               gregnames[i]);
         for (i = 0; i < TARGET_FPREGS; i++)
-            cpu__fpr[i] = tcg_global_mem_new_i32(TCG_AREG0,
-                                                 offsetof(CPUState, fpr[i]),
-                                                 fregnames[i]);
+            cpu_fpr[i] = tcg_global_mem_new_i32(TCG_AREG0,
+                                                offsetof(CPUState, fpr[i]),
+                                                fregnames[i]);
 
         /* register helpers */
 
-- 
1.7.6.4


* [Qemu-devel] [PATCH 09/21] target-sparc: Change fpr representation to doubles.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (7 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 08/21] target-sparc: Undo cpu_fpr rename Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 20:28   ` Blue Swirl
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 10/21] tcg: Optimize some forms of deposit Richard Henderson
                   ` (12 subsequent siblings)
  21 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

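This allows a more efficient representation on 64-bit hosts, where
each even/odd pair of single-precision registers is now held as one
64-bit value.  It should be about the same for 32-bit hosts, as we
can still access the two 32-bit halves of each double individually.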
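
For readers following the index arithmetic in the hunks below, here is a
minimal sketch of the layout, under stated assumptions: the register file
is modelled as a plain uint64_t array rather than QEMU's CPU_DoubleU, and
the names DEMO_DPREGS, demo_read_single and demo_write_single are
hypothetical, not part of the patch.  Single-precision register %f<n>
lives in element n / 2, with even n in the upper 32 bits and odd n in the
lower 32 bits, which matches the (dst & 1 ? 0 : 32) deposit offset used
in gen_store_fpr_F.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for env->fpr[]: 32 doubles cover %f0..%f62. */
#define DEMO_DPREGS 32

/* Bit offset of single-precision register n inside its 64-bit slot:
 * even registers occupy the upper half, odd registers the lower half. */
static unsigned int demo_shift(unsigned int n)
{
    return (n & 1) ? 0 : 32;
}

/* Read single-precision register %f<n> out of the packed array. */
static uint32_t demo_read_single(const uint64_t *fpr, unsigned int n)
{
    assert(n < 2 * DEMO_DPREGS);
    return (uint32_t)(fpr[n / 2] >> demo_shift(n));
}

/* Write single-precision register %f<n>, leaving its neighbour intact.
 * This is the same effect as a 32-bit deposit into the 64-bit value. */
static void demo_write_single(uint64_t *fpr, unsigned int n, uint32_t v)
{
    uint64_t mask;

    assert(n < 2 * DEMO_DPREGS);
    mask = (uint64_t)0xffffffffu << demo_shift(n);
    fpr[n / 2] = (fpr[n / 2] & ~mask) | ((uint64_t)v << demo_shift(n));
}
```

Writing 0x11111111 to %f0 and 0x22222222 to %f1 in this model leaves
element 0 holding 0x1111111122222222, which is the value a
double-precision access to %f0 would then see.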

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 gdbstub.c                  |   35 +++++++---
 linux-user/signal.c        |   28 +++++----
 monitor.c                  |   96 ++++++++++++++--------------
 target-sparc/cpu.h         |    7 +-
 target-sparc/cpu_init.c    |    6 +-
 target-sparc/ldst_helper.c |   71 +++++++++------------
 target-sparc/machine.c     |   20 ++----
 target-sparc/translate.c   |  146 ++++++++++++++++++++-----------------------
 8 files changed, 199 insertions(+), 210 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index 1d99e19..6c18634 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -814,7 +814,11 @@ static int cpu_gdb_read_register(CPUState *env, uint8_t *mem_buf, int n)
 #if defined(TARGET_ABI32) || !defined(TARGET_SPARC64)
     if (n < 64) {
         /* fprs */
-        GET_REG32(*((uint32_t *)&env->fpr[n - 32]));
+        if (n & 1) {
+            GET_REG32(env->fpr[(n - 32) / 2].l.lower);
+        } else {
+            GET_REG32(env->fpr[(n - 32) / 2].l.upper);
+        }
     }
     /* Y, PSR, WIM, TBR, PC, NPC, FPSR, CPSR */
     switch (n) {
@@ -831,15 +835,15 @@ static int cpu_gdb_read_register(CPUState *env, uint8_t *mem_buf, int n)
 #else
     if (n < 64) {
         /* f0-f31 */
-        GET_REG32(*((uint32_t *)&env->fpr[n - 32]));
+        if (n & 1) {
+            GET_REG32(env->fpr[(n - 32) / 2].l.lower);
+        } else {
+            GET_REG32(env->fpr[(n - 32) / 2].l.upper);
+        }
     }
     if (n < 80) {
         /* f32-f62 (double width, even numbers only) */
-        uint64_t val;
-
-        val = (uint64_t)*((uint32_t *)&env->fpr[(n - 64) * 2 + 32]) << 32;
-        val |= *((uint32_t *)&env->fpr[(n - 64) * 2 + 33]);
-        GET_REG64(val);
+        GET_REG64(env->fpr[16 + (n - 64)].ll);
     }
     switch (n) {
     case 80: GET_REGL(env->pc);
@@ -878,7 +882,12 @@ static int cpu_gdb_write_register(CPUState *env, uint8_t *mem_buf, int n)
 #if defined(TARGET_ABI32) || !defined(TARGET_SPARC64)
     else if (n < 64) {
         /* fprs */
-        *((uint32_t *)&env->fpr[n - 32]) = tmp;
+        /* f0-f31 */
+        if (n & 1) {
+            env->fpr[(n - 32) / 2].l.lower = tmp;
+        } else {
+            env->fpr[(n - 32) / 2].l.upper = tmp;
+        }
     } else {
         /* Y, PSR, WIM, TBR, PC, NPC, FPSR, CPSR */
         switch (n) {
@@ -896,12 +905,16 @@ static int cpu_gdb_write_register(CPUState *env, uint8_t *mem_buf, int n)
 #else
     else if (n < 64) {
         /* f0-f31 */
-        env->fpr[n] = ldfl_p(mem_buf);
+        tmp = ldl_p(mem_buf);
+        if (n & 1) {
+            env->fpr[(n - 32) / 2].l.lower = tmp;
+        } else {
+            env->fpr[(n - 32) / 2].l.upper = tmp;
+        }
         return 4;
     } else if (n < 80) {
         /* f32-f62 (double width, even numbers only) */
-        *((uint32_t *)&env->fpr[(n - 64) * 2 + 32]) = tmp >> 32;
-        *((uint32_t *)&env->fpr[(n - 64) * 2 + 33]) = tmp;
+        env->fpr[16 + (n - 64)].ll = tmp;
     } else {
         switch (n) {
         case 80: env->pc = tmp; break;
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 89276eb..d68dc94 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -2299,12 +2299,14 @@ void sparc64_set_context(CPUSPARCState *env)
      */
     err |= __get_user(env->fprs, &(ucp->tuc_mcontext.mc_fpregs.mcfpu_fprs));
     {
-        uint32_t *src, *dst;
-        src = ucp->tuc_mcontext.mc_fpregs.mcfpu_fregs.sregs;
-        dst = env->fpr;
-        /* XXX: check that the CPU storage is the same as user context */
-        for (i = 0; i < 64; i++, dst++, src++)
-            err |= __get_user(*dst, src);
+        uint32_t *src = ucp->tuc_mcontext.mc_fpregs.mcfpu_fregs.sregs;
+        for (i = 0; i < 64; i++, src++) {
+            if (i & 1) {
+                err |= __get_user(env->fpr[i/2].l.lower, src);
+            } else {
+                err |= __get_user(env->fpr[i/2].l.upper, src);
+            }
+        }
     }
     err |= __get_user(env->fsr,
                       &(ucp->tuc_mcontext.mc_fpregs.mcfpu_fsr));
@@ -2393,12 +2395,14 @@ void sparc64_get_context(CPUSPARCState *env)
     err |= __put_user(i7, &(mcp->mc_i7));
 
     {
-        uint32_t *src, *dst;
-        src = env->fpr;
-        dst = ucp->tuc_mcontext.mc_fpregs.mcfpu_fregs.sregs;
-        /* XXX: check that the CPU storage is the same as user context */
-        for (i = 0; i < 64; i++, dst++, src++)
-            err |= __put_user(*src, dst);
+        uint32_t *dst = ucp->tuc_mcontext.mc_fpregs.mcfpu_fregs.sregs;
+        for (i = 0; i < 64; i++, dst++) {
+            if (i & 1) {
+                err |= __put_user(env->fpr[i/2].l.lower, dst);
+            } else {
+                err |= __put_user(env->fpr[i/2].l.upper, dst);
+            }
+        }
     }
     err |= __put_user(env->fsr, &(mcp->mc_fpregs.mcfpu_fsr));
     err |= __put_user(env->gsr, &(mcp->mc_fpregs.mcfpu_gsr));
diff --git a/monitor.c b/monitor.c
index da13471..02d7e2e 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3657,55 +3657,55 @@ static const MonitorDef monitor_defs[] = {
 #endif
     { "tbr", offsetof(CPUState, tbr) },
     { "fsr", offsetof(CPUState, fsr) },
-    { "f0", offsetof(CPUState, fpr[0]) },
-    { "f1", offsetof(CPUState, fpr[1]) },
-    { "f2", offsetof(CPUState, fpr[2]) },
-    { "f3", offsetof(CPUState, fpr[3]) },
-    { "f4", offsetof(CPUState, fpr[4]) },
-    { "f5", offsetof(CPUState, fpr[5]) },
-    { "f6", offsetof(CPUState, fpr[6]) },
-    { "f7", offsetof(CPUState, fpr[7]) },
-    { "f8", offsetof(CPUState, fpr[8]) },
-    { "f9", offsetof(CPUState, fpr[9]) },
-    { "f10", offsetof(CPUState, fpr[10]) },
-    { "f11", offsetof(CPUState, fpr[11]) },
-    { "f12", offsetof(CPUState, fpr[12]) },
-    { "f13", offsetof(CPUState, fpr[13]) },
-    { "f14", offsetof(CPUState, fpr[14]) },
-    { "f15", offsetof(CPUState, fpr[15]) },
-    { "f16", offsetof(CPUState, fpr[16]) },
-    { "f17", offsetof(CPUState, fpr[17]) },
-    { "f18", offsetof(CPUState, fpr[18]) },
-    { "f19", offsetof(CPUState, fpr[19]) },
-    { "f20", offsetof(CPUState, fpr[20]) },
-    { "f21", offsetof(CPUState, fpr[21]) },
-    { "f22", offsetof(CPUState, fpr[22]) },
-    { "f23", offsetof(CPUState, fpr[23]) },
-    { "f24", offsetof(CPUState, fpr[24]) },
-    { "f25", offsetof(CPUState, fpr[25]) },
-    { "f26", offsetof(CPUState, fpr[26]) },
-    { "f27", offsetof(CPUState, fpr[27]) },
-    { "f28", offsetof(CPUState, fpr[28]) },
-    { "f29", offsetof(CPUState, fpr[29]) },
-    { "f30", offsetof(CPUState, fpr[30]) },
-    { "f31", offsetof(CPUState, fpr[31]) },
+    { "f0", offsetof(CPUState, fpr[0].l.upper) },
+    { "f1", offsetof(CPUState, fpr[0].l.lower) },
+    { "f2", offsetof(CPUState, fpr[1].l.upper) },
+    { "f3", offsetof(CPUState, fpr[1].l.lower) },
+    { "f4", offsetof(CPUState, fpr[2].l.upper) },
+    { "f5", offsetof(CPUState, fpr[2].l.lower) },
+    { "f6", offsetof(CPUState, fpr[3].l.upper) },
+    { "f7", offsetof(CPUState, fpr[3].l.lower) },
+    { "f8", offsetof(CPUState, fpr[4].l.upper) },
+    { "f9", offsetof(CPUState, fpr[4].l.lower) },
+    { "f10", offsetof(CPUState, fpr[5].l.upper) },
+    { "f11", offsetof(CPUState, fpr[5].l.lower) },
+    { "f12", offsetof(CPUState, fpr[6].l.upper) },
+    { "f13", offsetof(CPUState, fpr[6].l.lower) },
+    { "f14", offsetof(CPUState, fpr[7].l.upper) },
+    { "f15", offsetof(CPUState, fpr[7].l.lower) },
+    { "f16", offsetof(CPUState, fpr[8].l.upper) },
+    { "f17", offsetof(CPUState, fpr[8].l.lower) },
+    { "f18", offsetof(CPUState, fpr[9].l.upper) },
+    { "f19", offsetof(CPUState, fpr[9].l.lower) },
+    { "f20", offsetof(CPUState, fpr[10].l.upper) },
+    { "f21", offsetof(CPUState, fpr[10].l.lower) },
+    { "f22", offsetof(CPUState, fpr[11].l.upper) },
+    { "f23", offsetof(CPUState, fpr[11].l.lower) },
+    { "f24", offsetof(CPUState, fpr[12].l.upper) },
+    { "f25", offsetof(CPUState, fpr[12].l.lower) },
+    { "f26", offsetof(CPUState, fpr[13].l.upper) },
+    { "f27", offsetof(CPUState, fpr[13].l.lower) },
+    { "f28", offsetof(CPUState, fpr[14].l.upper) },
+    { "f29", offsetof(CPUState, fpr[14].l.lower) },
+    { "f30", offsetof(CPUState, fpr[15].l.upper) },
+    { "f31", offsetof(CPUState, fpr[15].l.lower) },
 #ifdef TARGET_SPARC64
-    { "f32", offsetof(CPUState, fpr[32]) },
-    { "f34", offsetof(CPUState, fpr[34]) },
-    { "f36", offsetof(CPUState, fpr[36]) },
-    { "f38", offsetof(CPUState, fpr[38]) },
-    { "f40", offsetof(CPUState, fpr[40]) },
-    { "f42", offsetof(CPUState, fpr[42]) },
-    { "f44", offsetof(CPUState, fpr[44]) },
-    { "f46", offsetof(CPUState, fpr[46]) },
-    { "f48", offsetof(CPUState, fpr[48]) },
-    { "f50", offsetof(CPUState, fpr[50]) },
-    { "f52", offsetof(CPUState, fpr[52]) },
-    { "f54", offsetof(CPUState, fpr[54]) },
-    { "f56", offsetof(CPUState, fpr[56]) },
-    { "f58", offsetof(CPUState, fpr[58]) },
-    { "f60", offsetof(CPUState, fpr[60]) },
-    { "f62", offsetof(CPUState, fpr[62]) },
+    { "f32", offsetof(CPUState, fpr[16]) },
+    { "f34", offsetof(CPUState, fpr[17]) },
+    { "f36", offsetof(CPUState, fpr[18]) },
+    { "f38", offsetof(CPUState, fpr[19]) },
+    { "f40", offsetof(CPUState, fpr[20]) },
+    { "f42", offsetof(CPUState, fpr[21]) },
+    { "f44", offsetof(CPUState, fpr[22]) },
+    { "f46", offsetof(CPUState, fpr[23]) },
+    { "f48", offsetof(CPUState, fpr[24]) },
+    { "f50", offsetof(CPUState, fpr[25]) },
+    { "f52", offsetof(CPUState, fpr[26]) },
+    { "f54", offsetof(CPUState, fpr[27]) },
+    { "f56", offsetof(CPUState, fpr[28]) },
+    { "f58", offsetof(CPUState, fpr[29]) },
+    { "f60", offsetof(CPUState, fpr[30]) },
+    { "f62", offsetof(CPUState, fpr[31]) },
     { "asi", offsetof(CPUState, asi) },
     { "pstate", offsetof(CPUState, pstate) },
     { "cansave", offsetof(CPUState, cansave) },
diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h
index a4419a5..71a890c 100644
--- a/target-sparc/cpu.h
+++ b/target-sparc/cpu.h
@@ -3,16 +3,17 @@
 
 #include "config.h"
 #include "qemu-common.h"
+#include "bswap.h"
 
 #if !defined(TARGET_SPARC64)
 #define TARGET_LONG_BITS 32
-#define TARGET_FPREGS 32
+#define TARGET_DPREGS 16
 #define TARGET_PAGE_BITS 12 /* 4k */
 #define TARGET_PHYS_ADDR_SPACE_BITS 36
 #define TARGET_VIRT_ADDR_SPACE_BITS 32
 #else
 #define TARGET_LONG_BITS 64
-#define TARGET_FPREGS 64
+#define TARGET_DPREGS 32
 #define TARGET_PAGE_BITS 13 /* 8k */
 #define TARGET_PHYS_ADDR_SPACE_BITS 41
 # ifdef TARGET_ABI32
@@ -395,7 +396,7 @@ typedef struct CPUSPARCState {
 
     uint32_t psr;      /* processor state register */
     target_ulong fsr;      /* FPU state register */
-    float32 fpr[TARGET_FPREGS];  /* floating point registers */
+    CPU_DoubleU fpr[TARGET_DPREGS];  /* floating point registers */
     uint32_t cwp;      /* index of current register window (extracted
                           from PSR) */
 #if !defined(TARGET_SPARC64) || defined(TARGET_ABI32)
diff --git a/target-sparc/cpu_init.c b/target-sparc/cpu_init.c
index 08b72a9..1118f31 100644
--- a/target-sparc/cpu_init.c
+++ b/target-sparc/cpu_init.c
@@ -813,11 +813,11 @@ void cpu_dump_state(CPUState *env, FILE *f, fprintf_function cpu_fprintf,
         }
     }
     cpu_fprintf(f, "\nFloating Point Registers:\n");
-    for (i = 0; i < TARGET_FPREGS; i++) {
+    for (i = 0; i < TARGET_DPREGS; i++) {
         if ((i & 3) == 0) {
-            cpu_fprintf(f, "%%f%02d:", i);
+            cpu_fprintf(f, "%%f%02d:", i * 2);
         }
-        cpu_fprintf(f, " %016f", *(float *)&env->fpr[i]);
+        cpu_fprintf(f, " %016" PRIx64, env->fpr[i].ll);
         if ((i & 3) == 3) {
             cpu_fprintf(f, "\n");
         }
diff --git a/target-sparc/ldst_helper.c b/target-sparc/ldst_helper.c
index ec9b5f2..a4254e7 100644
--- a/target-sparc/ldst_helper.c
+++ b/target-sparc/ldst_helper.c
@@ -2057,7 +2057,7 @@ void helper_ldf_asi(CPUState *env, target_ulong addr, int asi, int size,
                     int rd)
 {
     unsigned int i;
-    CPU_DoubleU u;
+    target_ulong val;
 
     helper_check_align(env, addr, 3);
     addr = asi_address_mask(env, asi, addr);
@@ -2072,13 +2072,11 @@ void helper_ldf_asi(CPUState *env, target_ulong addr, int asi, int size,
             return;
         }
         helper_check_align(env, addr, 0x3f);
-        for (i = 0; i < 16; i++) {
-            *(uint32_t *)&env->fpr[rd++] = helper_ld_asi(env, addr, asi & 0x8f,
-                                                         4, 0);
-            addr += 4;
+        for (i = 0; i < 8; i++, rd += 2, addr += 8) {
+            env->fpr[rd/2].ll = helper_ld_asi(env, addr, asi & 0x8f, 8, 0);
         }
-
         return;
+
     case 0x16: /* UA2007 Block load primary, user privilege */
     case 0x17: /* UA2007 Block load secondary, user privilege */
     case 0x1e: /* UA2007 Block load primary LE, user privilege */
@@ -2092,13 +2090,11 @@ void helper_ldf_asi(CPUState *env, target_ulong addr, int asi, int size,
             return;
         }
         helper_check_align(env, addr, 0x3f);
-        for (i = 0; i < 16; i++) {
-            *(uint32_t *)&env->fpr[rd++] = helper_ld_asi(env, addr, asi & 0x19,
-                                                         4, 0);
-            addr += 4;
+        for (i = 0; i < 8; i++, rd += 2, addr += 8) {
+            env->fpr[rd/2].ll = helper_ld_asi(env, addr, asi & 0x19, 8, 0);
         }
-
         return;
+
     default:
         break;
     }
@@ -2106,20 +2102,19 @@ void helper_ldf_asi(CPUState *env, target_ulong addr, int asi, int size,
     switch (size) {
     default:
     case 4:
-        *((uint32_t *)&env->fpr[rd]) = helper_ld_asi(env, addr, asi, size, 0);
+        val = helper_ld_asi(env, addr, asi, size, 0);
+        if (rd & 1) {
+            env->fpr[rd/2].l.lower = val;
+        } else {
+            env->fpr[rd/2].l.upper = val;
+        }
         break;
     case 8:
-        u.ll = helper_ld_asi(env, addr, asi, size, 0);
-        *((uint32_t *)&env->fpr[rd++]) = u.l.upper;
-        *((uint32_t *)&env->fpr[rd++]) = u.l.lower;
+        env->fpr[rd/2].ll = helper_ld_asi(env, addr, asi, size, 0);
         break;
     case 16:
-        u.ll = helper_ld_asi(env, addr, asi, 8, 0);
-        *((uint32_t *)&env->fpr[rd++]) = u.l.upper;
-        *((uint32_t *)&env->fpr[rd++]) = u.l.lower;
-        u.ll = helper_ld_asi(env, addr + 8, asi, 8, 0);
-        *((uint32_t *)&env->fpr[rd++]) = u.l.upper;
-        *((uint32_t *)&env->fpr[rd++]) = u.l.lower;
+        env->fpr[rd/2].ll = helper_ld_asi(env, addr, asi, 8, 0);
+        env->fpr[rd/2 + 1].ll = helper_ld_asi(env, addr + 8, asi, 8, 0);
         break;
     }
 }
@@ -2128,8 +2123,7 @@ void helper_stf_asi(CPUState *env, target_ulong addr, int asi, int size,
                     int rd)
 {
     unsigned int i;
-    target_ulong val = 0;
-    CPU_DoubleU u;
+    target_ulong val;
 
     helper_check_align(env, addr, 3);
     addr = asi_address_mask(env, asi, addr);
@@ -2146,10 +2140,8 @@ void helper_stf_asi(CPUState *env, target_ulong addr, int asi, int size,
             return;
         }
         helper_check_align(env, addr, 0x3f);
-        for (i = 0; i < 16; i++) {
-            val = *(uint32_t *)&env->fpr[rd++];
-            helper_st_asi(env, addr, val, asi & 0x8f, 4);
-            addr += 4;
+        for (i = 0; i < 8; i++, rd += 2, addr += 8) {
+            helper_st_asi(env, addr, env->fpr[rd/2].ll, asi & 0x8f, 8);
         }
 
         return;
@@ -2166,10 +2158,8 @@ void helper_stf_asi(CPUState *env, target_ulong addr, int asi, int size,
             return;
         }
         helper_check_align(env, addr, 0x3f);
-        for (i = 0; i < 16; i++) {
-            val = *(uint32_t *)&env->fpr[rd++];
-            helper_st_asi(env, addr, val, asi & 0x19, 4);
-            addr += 4;
+        for (i = 0; i < 8; i++, rd += 2, addr += 8) {
+            helper_st_asi(env, addr, env->fpr[rd/2].ll, asi & 0x19, 8);
         }
 
         return;
@@ -2180,20 +2170,19 @@ void helper_stf_asi(CPUState *env, target_ulong addr, int asi, int size,
     switch (size) {
     default:
     case 4:
-        helper_st_asi(env, addr, *(uint32_t *)&env->fpr[rd], asi, size);
+        if (rd & 1) {
+            val = env->fpr[rd/2].l.lower;
+        } else {
+            val = env->fpr[rd/2].l.upper;
+        }
+        helper_st_asi(env, addr, val, asi, size);
         break;
     case 8:
-        u.l.upper = *(uint32_t *)&env->fpr[rd++];
-        u.l.lower = *(uint32_t *)&env->fpr[rd++];
-        helper_st_asi(env, addr, u.ll, asi, size);
+        helper_st_asi(env, addr, env->fpr[rd/2].ll, asi, size);
         break;
     case 16:
-        u.l.upper = *(uint32_t *)&env->fpr[rd++];
-        u.l.lower = *(uint32_t *)&env->fpr[rd++];
-        helper_st_asi(env, addr, u.ll, asi, 8);
-        u.l.upper = *(uint32_t *)&env->fpr[rd++];
-        u.l.lower = *(uint32_t *)&env->fpr[rd++];
-        helper_st_asi(env, addr + 8, u.ll, asi, 8);
+        helper_st_asi(env, addr, env->fpr[rd/2].ll, asi, 8);
+        helper_st_asi(env, addr + 8, env->fpr[rd/2 + 1].ll, asi, 8);
         break;
     }
 }
diff --git a/target-sparc/machine.c b/target-sparc/machine.c
index 56ae041..235b088 100644
--- a/target-sparc/machine.c
+++ b/target-sparc/machine.c
@@ -21,13 +21,9 @@ void cpu_save(QEMUFile *f, void *opaque)
         qemu_put_betls(f, &env->regbase[i]);
 
     /* FPU */
-    for(i = 0; i < TARGET_FPREGS; i++) {
-        union {
-            float32 f;
-            uint32_t i;
-        } u;
-        u.f = env->fpr[i];
-        qemu_put_be32(f, u.i);
+    for (i = 0; i < TARGET_DPREGS; i++) {
+        qemu_put_be32(f, env->fpr[i].l.upper);
+        qemu_put_be32(f, env->fpr[i].l.lower);
     }
 
     qemu_put_betls(f, &env->pc);
@@ -128,13 +124,9 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
         qemu_get_betls(f, &env->regbase[i]);
 
     /* FPU */
-    for(i = 0; i < TARGET_FPREGS; i++) {
-        union {
-            float32 f;
-            uint32_t i;
-        } u;
-        u.i = qemu_get_be32(f);
-        env->fpr[i] = u.f;
+    for (i = 0; i < TARGET_DPREGS; i++) {
+        env->fpr[i].l.upper = qemu_get_be32(f);
+        env->fpr[i].l.lower = qemu_get_be32(f);
     }
 
     qemu_get_betls(f, &env->pc);
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index f8d3bf2..97a462b 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -68,7 +68,7 @@ static TCGv cpu_tmp0;
 static TCGv_i32 cpu_tmp32;
 static TCGv_i64 cpu_tmp64;
 /* Floating point registers */
-static TCGv_i32 cpu_fpr[TARGET_FPREGS];
+static TCGv_i64 cpu_fpr[TARGET_DPREGS];
 
 static target_ulong gen_opc_npc[OPC_BUF_SIZE];
 static target_ulong gen_opc_jump_pc[2];
@@ -87,8 +87,8 @@ typedef struct DisasContext {
     uint32_t cc_op;  /* current CC operation */
     struct TranslationBlock *tb;
     sparc_def_t *def;
-    TCGv_i64 t64[3];
-    int n_t64;
+    TCGv_i32 t32[3];
+    int n_t32;
 } DisasContext;
 
 // This function uses non-native bit order
@@ -131,12 +131,44 @@ static inline void gen_update_fprs_dirty(int rd)
 /* floating point registers moves */
 static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
 {
-    return cpu_fpr[src];
+#if TCG_TARGET_REG_BITS == 32
+    if (src & 1) {
+        return TCGV_LOW(cpu_fpr[src / 2]);
+    } else {
+        return TCGV_HIGH(cpu_fpr[src / 2]);
+    }
+#else
+    if (src & 1) {
+        return MAKE_TCGV_I32(GET_TCGV_I64(cpu_fpr[src / 2]));
+    } else {
+        TCGv_i32 ret = tcg_temp_local_new_i32();
+        TCGv_i64 t = tcg_temp_new_i64();
+
+        tcg_gen_shri_i64(t, cpu_fpr[src / 2], 32);
+        tcg_gen_trunc_i64_i32(ret, t);
+        tcg_temp_free_i64(t);
+
+        dc->t32[dc->n_t32++] = ret;
+        assert(dc->n_t32 <= ARRAY_SIZE(dc->t32));
+
+        return ret;
+    }
+#endif
 }
 
 static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
 {
-    tcg_gen_mov_i32(cpu_fpr[dst], v);
+#if TCG_TARGET_REG_BITS == 32
+    if (dst & 1) {
+        tcg_gen_mov_i32(TCGV_LOW(cpu_fpr[dst / 2]), v);
+    } else {
+        tcg_gen_mov_i32(TCGV_HIGH(cpu_fpr[dst / 2]), v);
+    }
+#else
+    TCGv_i64 t = MAKE_TCGV_I64(GET_TCGV_I32(v));
+    tcg_gen_deposit_i64(cpu_fpr[dst / 2], cpu_fpr[dst / 2], t,
+                        (dst & 1 ? 0 : 32), 32);
+#endif
     gen_update_fprs_dirty(dst);
 }
 
@@ -147,42 +179,14 @@ static TCGv_i32 gen_dest_fpr_F(void)
 
 static TCGv_i64 gen_load_fpr_D(DisasContext *dc, unsigned int src)
 {
-    TCGv_i64 ret = tcg_temp_new_i64();
     src = DFPREG(src);
-
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_mov_i32(TCGV_HIGH(ret), cpu_fpr[src]);
-    tcg_gen_mov_i32(TCGV_LOW(ret), cpu_fpr[src + 1]);
-#else
-    {
-        TCGv_i64 t = tcg_temp_new_i64();
-        tcg_gen_extu_i32_i64(ret, cpu_fpr[src]);
-        tcg_gen_extu_i32_i64(t, cpu_fpr[src + 1]);
-        tcg_gen_shli_i64(ret, ret, 32);
-        tcg_gen_or_i64(ret, ret, t);
-        tcg_temp_free_i64(t);
-    }
-#endif
-
-    dc->t64[dc->n_t64++] = ret;
-    assert(dc->n_t64 <= ARRAY_SIZE(dc->t64));
-
-    return ret;
+    return cpu_fpr[src / 2];
 }
 
 static void gen_store_fpr_D(DisasContext *dc, unsigned int dst, TCGv_i64 v)
 {
     dst = DFPREG(dst);
-
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_mov_i32(cpu__fpu[dst], TCGV_HIGH(v));
-    tcg_gen_mov_i32(cpu__fpu[dst + 1], TCGV_LOW(v));
-#else
-    tcg_gen_trunc_i64_i32(cpu_fpr[dst + 1], v);
-    tcg_gen_shri_i64(v, v, 32);
-    tcg_gen_trunc_i64_i32(cpu_fpr[dst], v);
-#endif
-
+    tcg_gen_mov_i64(cpu_fpr[dst / 2], v);
     gen_update_fprs_dirty(dst);
 }
 
@@ -193,50 +197,36 @@ static TCGv_i64 gen_dest_fpr_D(void)
 
 static void gen_op_load_fpr_QT0(unsigned int src)
 {
-    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.upper));
-    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.lower));
-    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.lowest));
+    tcg_gen_st_i64(cpu_fpr[src / 2], cpu_env, offsetof(CPUSPARCState, qt0) +
+                   offsetof(CPU_QuadU, ll.upper));
+    tcg_gen_st_i64(cpu_fpr[src/2 + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
+                   offsetof(CPU_QuadU, ll.lower));
 }
 
 static void gen_op_load_fpr_QT1(unsigned int src)
 {
-    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt1) +
-                   offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
-                   offsetof(CPU_QuadU, l.upper));
-    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt1) +
-                   offsetof(CPU_QuadU, l.lower));
-    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt1) +
-                   offsetof(CPU_QuadU, l.lowest));
+    tcg_gen_st_i64(cpu_fpr[src / 2], cpu_env, offsetof(CPUSPARCState, qt1) +
+                   offsetof(CPU_QuadU, ll.upper));
+    tcg_gen_st_i64(cpu_fpr[src/2 + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
+                   offsetof(CPU_QuadU, ll.lower));
 }
 
 static void gen_op_store_QT0_fpr(unsigned int dst)
 {
-    tcg_gen_ld_i32(cpu_fpr[dst], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.upmost));
-    tcg_gen_ld_i32(cpu_fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.upper));
-    tcg_gen_ld_i32(cpu_fpr[dst + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.lower));
-    tcg_gen_ld_i32(cpu_fpr[dst + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
-                   offsetof(CPU_QuadU, l.lowest));
+    tcg_gen_ld_i64(cpu_fpr[dst / 2], cpu_env, offsetof(CPUSPARCState, qt0) +
+                   offsetof(CPU_QuadU, ll.upper));
+    tcg_gen_ld_i64(cpu_fpr[dst/2 + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
+                   offsetof(CPU_QuadU, ll.lower));
 }
 
 #ifdef TARGET_SPARC64
-static void gen_move_Q(int rd, int rs)
+static void gen_move_Q(unsigned int rd, unsigned int rs)
 {
     rd = QFPREG(rd);
     rs = QFPREG(rs);
 
-    tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs]);
-    tcg_gen_mov_i32(cpu_fpr[rd + 1], cpu_fpr[rs + 1]);
-    tcg_gen_mov_i32(cpu_fpr[rd + 2], cpu_fpr[rs + 2]);
-    tcg_gen_mov_i32(cpu_fpr[rd + 3], cpu_fpr[rs + 3]);
+    tcg_gen_mov_i64(cpu_fpr[rd / 2], cpu_fpr[rs / 2]);
+    tcg_gen_mov_i64(cpu_fpr[rd / 2 + 1], cpu_fpr[rs / 2 + 1]);
     gen_update_fprs_dirty(rd);
 }
 #endif
@@ -5008,6 +4998,13 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
  egress:
     tcg_temp_free(cpu_tmp1);
     tcg_temp_free(cpu_tmp2);
+    if (dc->n_t32 != 0) {
+        int i;
+        for (i = dc->n_t32 - 1; i >= 0; --i) {
+            tcg_temp_free_i32(dc->t32[i]);
+        }
+        dc->n_t32 = 0;
+    }
 }
 
 static inline void gen_intermediate_code_internal(TranslationBlock * tb,
@@ -5109,9 +5106,6 @@ static inline void gen_intermediate_code_internal(TranslationBlock * tb,
     tcg_temp_free_i64(cpu_tmp64);
     tcg_temp_free_i32(cpu_tmp32);
     tcg_temp_free(cpu_tmp0);
-    for (j = dc->n_t64 - 1; j >= 0; --j) {
-        tcg_temp_free_i64(dc->t64[j]);
-    }
 
     if (tb->cflags & CF_LAST_IO)
         gen_io_end();
@@ -5177,15 +5171,11 @@ void gen_intermediate_code_init(CPUSPARCState *env)
         "g6",
         "g7",
     };
-    static const char * const fregnames[64] = {
-        "f0", "f1", "f2", "f3", "f4", "f5", "f6", "f7",
-        "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",
-        "f16", "f17", "f18", "f19", "f20", "f21", "f22", "f23",
-        "f24", "f25", "f26", "f27", "f28", "f29", "f30", "f31",
-        "f32", "f33", "f34", "f35", "f36", "f37", "f38", "f39",
-        "f40", "f41", "f42", "f43", "f44", "f45", "f46", "f47",
-        "f48", "f49", "f50", "f51", "f52", "f53", "f54", "f55",
-        "f56", "f57", "f58", "f59", "f60", "f61", "f62", "f63",
+    static const char * const fregnames[32] = {
+        "f0", "f2", "f4", "f6", "f8", "f10", "f12", "f14",
+        "f16", "f18", "f20", "f22", "f24", "f26", "f28", "f30",
+        "f32", "f34", "f36", "f38", "f40", "f42", "f44", "f46",
+        "f48", "f50", "f52", "f54", "f56", "f58", "f60", "f62",
     };
 
     /* init various static tables */
@@ -5259,8 +5249,8 @@ void gen_intermediate_code_init(CPUSPARCState *env)
             cpu_gregs[i] = tcg_global_mem_new(TCG_AREG0,
                                               offsetof(CPUState, gregs[i]),
                                               gregnames[i]);
-        for (i = 0; i < TARGET_FPREGS; i++)
-            cpu_fpr[i] = tcg_global_mem_new_i32(TCG_AREG0,
+        for (i = 0; i < TARGET_DPREGS; i++)
+            cpu_fpr[i] = tcg_global_mem_new_i64(TCG_AREG0,
                                                 offsetof(CPUState, fpr[i]),
                                                 fregnames[i]);
 
-- 
1.7.6.4
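
Editorial note on the register layout used by the patch above: each
cpu_fpr[] entry now holds one 64-bit double register, so a pair of
single-precision registers shares one slot, indexed by reg / 2.  Going
by the removed 64-bit-host store path (low word to fpr[dst + 1], high
word to fpr[dst]), the even-numbered single sits in the high half.  A
minimal sketch of that mapping on plain integers follows; FprModel,
load_fpr_single and store_fpr_single are hypothetical names made up for
this illustration, not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU state: 32 double-precision slots. */
typedef struct {
    uint64_t fpr[32];
} FprModel;

/* Read single-precision register n; an even n is the high half of its slot. */
static uint32_t load_fpr_single(const FprModel *s, unsigned int n)
{
    uint64_t pair;

    assert(n < 64);
    pair = s->fpr[n / 2];
    return (n & 1) ? (uint32_t)pair : (uint32_t)(pair >> 32);
}

/* Write single-precision register n, leaving its partner untouched. */
static void store_fpr_single(FprModel *s, unsigned int n, uint32_t v)
{
    uint64_t *pair;

    assert(n < 64);
    pair = &s->fpr[n / 2];
    if (n & 1) {
        *pair = (*pair & 0xffffffff00000000ull) | v;
    } else {
        *pair = (*pair & 0x00000000ffffffffull) | ((uint64_t)v << 32);
    }
}
```

With this layout a double-precision access is a single 64-bit move,
which is why the D accessors in the diff collapse to one TCG op.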


* [Qemu-devel] [PATCH 10/21] tcg: Optimize some forms of deposit.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (8 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 09/21] target-sparc: Change fpr representation to doubles Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 20:30   ` Blue Swirl
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 11/21] target-sparc: Do exceptions management fully inside the helpers Richard Henderson
                   ` (11 subsequent siblings)
  21 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

If the deposit replaces the entire word, optimize to a move.

If we're inserting to the top of the word, avoid the mask of arg2
as we'll be shifting out all of the garbage and shifting in zeros.

If the host is 32-bit, reduce a 64-bit deposit to a 32-bit deposit
when possible.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/tcg-op.h |   65 +++++++++++++++++++++++++++++++++++++++++++++------------
 1 files changed, 51 insertions(+), 14 deletions(-)

diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index fea5983..2276c72 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -2045,38 +2045,75 @@ static inline void tcg_gen_deposit_i32(TCGv_i32 ret, TCGv_i32 arg1,
 				       TCGv_i32 arg2, unsigned int ofs,
 				       unsigned int len)
 {
+    uint32_t mask;
+    TCGv_i32 t1;
+
+    if (ofs == 0 && len == 32) {
+        tcg_gen_mov_i32(ret, arg2);
+        return;
+    }
     if (TCG_TARGET_HAS_deposit_i32 && TCG_TARGET_deposit_i32_valid(ofs, len)) {
         tcg_gen_op5ii_i32(INDEX_op_deposit_i32, ret, arg1, arg2, ofs, len);
-    } else {
-        uint32_t mask = (1u << len) - 1;
-        TCGv_i32 t1 = tcg_temp_new_i32 ();
+        return;
+    }
+
+    mask = (1u << len) - 1;
+    t1 = tcg_temp_new_i32 ();
 
+    if (ofs + len < 32) {
         tcg_gen_andi_i32(t1, arg2, mask);
         tcg_gen_shli_i32(t1, t1, ofs);
-        tcg_gen_andi_i32(ret, arg1, ~(mask << ofs));
-        tcg_gen_or_i32(ret, ret, t1);
-
-        tcg_temp_free_i32(t1);
+    } else {
+        tcg_gen_shli_i32(t1, arg2, ofs);
     }
+    tcg_gen_andi_i32(ret, arg1, ~(mask << ofs));
+    tcg_gen_or_i32(ret, ret, t1);
+
+    tcg_temp_free_i32(t1);
 }
 
 static inline void tcg_gen_deposit_i64(TCGv_i64 ret, TCGv_i64 arg1,
 				       TCGv_i64 arg2, unsigned int ofs,
 				       unsigned int len)
 {
+    uint64_t mask;
+    TCGv_i64 t1;
+
+    if (ofs == 0 && len == 64) {
+        tcg_gen_mov_i64(ret, arg2);
+        return;
+    }
     if (TCG_TARGET_HAS_deposit_i64 && TCG_TARGET_deposit_i64_valid(ofs, len)) {
         tcg_gen_op5ii_i64(INDEX_op_deposit_i64, ret, arg1, arg2, ofs, len);
-    } else {
-        uint64_t mask = (1ull << len) - 1;
-        TCGv_i64 t1 = tcg_temp_new_i64 ();
+        return;
+    }
+
+#if TCG_TARGET_REG_BITS == 32
+    if (ofs >= 32) {
+        tcg_gen_deposit_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1),
+                            TCGV_LOW(arg2), ofs - 32, len);
+        return;
+    }
+    if (ofs + len <= 32) {
+        tcg_gen_deposit_i32(TCGV_LOW(ret), TCGV_LOW(arg1),
+                            TCGV_LOW(arg2), ofs, len);
+        return;
+    }
+#endif
+
+    mask = (1ull << len) - 1;
+    t1 = tcg_temp_new_i64 ();
 
+    if (ofs + len < 64) {
         tcg_gen_andi_i64(t1, arg2, mask);
         tcg_gen_shli_i64(t1, t1, ofs);
-        tcg_gen_andi_i64(ret, arg1, ~(mask << ofs));
-        tcg_gen_or_i64(ret, ret, t1);
-
-        tcg_temp_free_i64(t1);
+    } else {
+        tcg_gen_shli_i64(t1, arg2, ofs);
     }
+    tcg_gen_andi_i64(ret, arg1, ~(mask << ofs));
+    tcg_gen_or_i64(ret, ret, t1);
+
+    tcg_temp_free_i64(t1);
 }
 
 /***************************************/
-- 
1.7.6.4
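
Editorial note on the three cases named in the commit message above: a
minimal sketch on plain host integers rather than TCG ops.  deposit32
and deposit64_via32 are hypothetical names for this illustration.  The
second function models the 32-bit-host reduction; it rebuilds the whole
64-bit result, so the half that the deposit does not touch is carried
over from arg1 explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Replace bits [ofs, ofs + len) of arg1 with the low len bits of arg2. */
static uint32_t deposit32(uint32_t arg1, uint32_t arg2,
                          unsigned int ofs, unsigned int len)
{
    uint32_t mask, t1;

    assert(len > 0 && ofs < 32 && len <= 32 - ofs);

    if (ofs == 0 && len == 32) {
        return arg2;                    /* whole word replaced: plain move */
    }
    mask = (1u << len) - 1;             /* len < 32 here, shift is defined */
    if (ofs + len < 32) {
        t1 = (arg2 & mask) << ofs;      /* must mask off the high garbage */
    } else {
        t1 = arg2 << ofs;               /* field ends at bit 31: the shift
                                           discards the garbage itself */
    }
    return (arg1 & ~(mask << ofs)) | t1;
}

/* 64-bit deposit expressed through 32-bit halves where the field allows. */
static uint64_t deposit64_via32(uint64_t arg1, uint64_t arg2,
                                unsigned int ofs, unsigned int len)
{
    uint32_t lo = (uint32_t)arg1;
    uint32_t hi = (uint32_t)(arg1 >> 32);

    assert(len > 0 && ofs < 64 && len <= 64 - ofs);

    if (ofs >= 32) {
        hi = deposit32(hi, (uint32_t)arg2, ofs - 32, len);
    } else if (ofs + len <= 32) {
        lo = deposit32(lo, (uint32_t)arg2, ofs, len);
    } else {
        /* Field straddles both halves: generic 64-bit fallback. */
        uint64_t mask = (len == 64) ? ~0ull : ((1ull << len) - 1);
        return (arg1 & ~(mask << ofs)) | ((arg2 & mask) << ofs);
    }
    return ((uint64_t)hi << 32) | lo;
}
```

For example, deposit32(0x00ffffff, 0x123456ab, 24, 8) takes the
top-of-word path: the left shift by 24 drops everything above the low
byte of arg2, so no separate AND is needed.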


* [Qemu-devel] [PATCH 11/21] target-sparc: Do exceptions management fully inside the helpers.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (9 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 10/21] tcg: Optimize some forms of deposit Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 12/21] sparc-linux-user: Handle SIGILL Richard Henderson
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

This reduces the size of the individual translation blocks, since
we only emit a single call for each FOP rather than three.  In
addition, clear_float_exceptions expands inline to a single byte store.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/fop_helper.c |  206 ++++++++++++++++++++++++++++++++-------------
 target-sparc/helper.h     |    2 -
 target-sparc/translate.c  |   29 -------
 3 files changed, 146 insertions(+), 91 deletions(-)

diff --git a/target-sparc/fop_helper.c b/target-sparc/fop_helper.c
index e652021..c7a2512 100644
--- a/target-sparc/fop_helper.c
+++ b/target-sparc/fop_helper.c
@@ -23,22 +23,71 @@
 #define QT0 (env->qt0)
 #define QT1 (env->qt1)
 
+static void check_ieee_exceptions(CPUState *env)
+{
+    target_ulong status;
+
+    status = get_float_exception_flags(&env->fp_status);
+    if (status) {
+        /* Copy IEEE 754 flags into FSR */
+        if (status & float_flag_invalid) {
+            env->fsr |= FSR_NVC;
+        }
+        if (status & float_flag_overflow) {
+            env->fsr |= FSR_OFC;
+        }
+        if (status & float_flag_underflow) {
+            env->fsr |= FSR_UFC;
+        }
+        if (status & float_flag_divbyzero) {
+            env->fsr |= FSR_DZC;
+        }
+        if (status & float_flag_inexact) {
+            env->fsr |= FSR_NXC;
+        }
+
+        if ((env->fsr & FSR_CEXC_MASK) & ((env->fsr & FSR_TEM_MASK) >> 23)) {
+            /* Unmasked exception, generate a trap */
+            env->fsr |= FSR_FTT_IEEE_EXCP;
+            helper_raise_exception(env, TT_FP_EXCP);
+        } else {
+            /* Accumulate exceptions */
+            env->fsr |= (env->fsr & FSR_CEXC_MASK) << 5;
+        }
+    }
+}
+
+static inline void clear_float_exceptions(CPUState *env)
+{
+    set_float_exception_flags(0, &env->fp_status);
+}
+
 #define F_HELPER(name, p) void helper_f##name##p(CPUState *env)
 
 #define F_BINOP(name)                                           \
-    float32 helper_f ## name ## s (CPUState * env, float32 src1,\
+    float32 helper_f ## name ## s (CPUState *env, float32 src1, \
                                    float32 src2)                \
     {                                                           \
-        return float32_ ## name (src1, src2, &env->fp_status);  \
+        float32 ret;                                            \
+        clear_float_exceptions(env);                            \
+        ret = float32_ ## name (src1, src2, &env->fp_status);   \
+        check_ieee_exceptions(env);                             \
+        return ret;                                             \
     }                                                           \
     float64 helper_f ## name ## d (CPUState * env, float64 src1,\
                                    float64 src2)                \
     {                                                           \
-        return float64_ ## name (src1, src2, &env->fp_status);  \
+        float64 ret;                                            \
+        clear_float_exceptions(env);                            \
+        ret = float64_ ## name (src1, src2, &env->fp_status);   \
+        check_ieee_exceptions(env);                             \
+        return ret;                                             \
     }                                                           \
     F_HELPER(name, q)                                           \
     {                                                           \
+        clear_float_exceptions(env);                            \
         QT0 = float128_ ## name (QT0, QT1, &env->fp_status);    \
+        check_ieee_exceptions(env);                             \
     }
 
 F_BINOP(add);
@@ -49,16 +98,22 @@ F_BINOP(div);
 
 float64 helper_fsmuld(CPUState *env, float32 src1, float32 src2)
 {
-    return float64_mul(float32_to_float64(src1, &env->fp_status),
-                       float32_to_float64(src2, &env->fp_status),
-                       &env->fp_status);
+    float64 ret;
+    clear_float_exceptions(env);
+    ret = float64_mul(float32_to_float64(src1, &env->fp_status),
+                      float32_to_float64(src2, &env->fp_status),
+                      &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 void helper_fdmulq(CPUState *env, float64 src1, float64 src2)
 {
+    clear_float_exceptions(env);
     QT0 = float128_mul(float64_to_float128(src1, &env->fp_status),
                        float64_to_float128(src2, &env->fp_status),
                        &env->fp_status);
+    check_ieee_exceptions(env);
 }
 
 float32 helper_fnegs(float32 src)
@@ -81,32 +136,48 @@ F_HELPER(neg, q)
 /* Integer to float conversion.  */
 float32 helper_fitos(CPUState *env, int32_t src)
 {
-    return int32_to_float32(src, &env->fp_status);
+    /* Inexact error possible converting int to float.  */
+    float32 ret;
+    clear_float_exceptions(env);
+    ret = int32_to_float32(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 float64 helper_fitod(CPUState *env, int32_t src)
 {
+    /* No possible exceptions converting int to double.  */
     return int32_to_float64(src, &env->fp_status);
 }
 
 void helper_fitoq(CPUState *env, int32_t src)
 {
+    /* No possible exceptions converting int to long double.  */
     QT0 = int32_to_float128(src, &env->fp_status);
 }
 
 #ifdef TARGET_SPARC64
 float32 helper_fxtos(CPUState *env, int64_t src)
 {
-    return int64_to_float32(src, &env->fp_status);
+    float32 ret;
+    clear_float_exceptions(env);
+    ret = int64_to_float32(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 float64 helper_fxtod(CPUState *env, int64_t src)
 {
-    return int64_to_float64(src, &env->fp_status);
+    float64 ret;
+    clear_float_exceptions(env);
+    ret = int64_to_float64(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 void helper_fxtoq(CPUState *env, int64_t src)
 {
+    /* No possible exceptions converting long long to long double.  */
     QT0 = int64_to_float128(src, &env->fp_status);
 }
 #endif
@@ -115,64 +186,108 @@ void helper_fxtoq(CPUState *env, int64_t src)
 /* floating point conversion */
 float32 helper_fdtos(CPUState *env, float64 src)
 {
-    return float64_to_float32(src, &env->fp_status);
+    float32 ret;
+    clear_float_exceptions(env);
+    ret = float64_to_float32(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 float64 helper_fstod(CPUState *env, float32 src)
 {
-    return float32_to_float64(src, &env->fp_status);
+    float64 ret;
+    clear_float_exceptions(env);
+    ret = float32_to_float64(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 float32 helper_fqtos(CPUState *env)
 {
-    return float128_to_float32(QT1, &env->fp_status);
+    float32 ret;
+    clear_float_exceptions(env);
+    ret = float128_to_float32(QT1, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 void helper_fstoq(CPUState *env, float32 src)
 {
+    clear_float_exceptions(env);
     QT0 = float32_to_float128(src, &env->fp_status);
+    check_ieee_exceptions(env);
 }
 
 float64 helper_fqtod(CPUState *env)
 {
-    return float128_to_float64(QT1, &env->fp_status);
+    float64 ret;
+    clear_float_exceptions(env);
+    ret = float128_to_float64(QT1, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 void helper_fdtoq(CPUState *env, float64 src)
 {
+    clear_float_exceptions(env);
     QT0 = float64_to_float128(src, &env->fp_status);
+    check_ieee_exceptions(env);
 }
 
 /* Float to integer conversion.  */
 int32_t helper_fstoi(CPUState *env, float32 src)
 {
-    return float32_to_int32_round_to_zero(src, &env->fp_status);
+    int32_t ret;
+    clear_float_exceptions(env);
+    ret = float32_to_int32_round_to_zero(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 int32_t helper_fdtoi(CPUState *env, float64 src)
 {
-    return float64_to_int32_round_to_zero(src, &env->fp_status);
+    int32_t ret;
+    clear_float_exceptions(env);
+    ret = float64_to_int32_round_to_zero(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 int32_t helper_fqtoi(CPUState *env)
 {
-    return float128_to_int32_round_to_zero(QT1, &env->fp_status);
+    int32_t ret;
+    clear_float_exceptions(env);
+    ret = float128_to_int32_round_to_zero(QT1, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 #ifdef TARGET_SPARC64
 int64_t helper_fstox(CPUState *env, float32 src)
 {
-    return float32_to_int64_round_to_zero(src, &env->fp_status);
+    int64_t ret;
+    clear_float_exceptions(env);
+    ret = float32_to_int64_round_to_zero(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 int64_t helper_fdtox(CPUState *env, float64 src)
 {
-    return float64_to_int64_round_to_zero(src, &env->fp_status);
+    int64_t ret;
+    clear_float_exceptions(env);
+    ret = float64_to_int64_round_to_zero(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 int64_t helper_fqtox(CPUState *env)
 {
-    return float128_to_int64_round_to_zero(QT1, &env->fp_status);
+    int64_t ret;
+    clear_float_exceptions(env);
+    ret = float128_to_int64_round_to_zero(QT1, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 #endif
 
@@ -195,17 +310,27 @@ void helper_fabsq(CPUState *env)
 
 float32 helper_fsqrts(CPUState *env, float32 src)
 {
-    return float32_sqrt(src, &env->fp_status);
+    float32 ret;
+    clear_float_exceptions(env);
+    ret = float32_sqrt(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 float64 helper_fsqrtd(CPUState *env, float64 src)
 {
-    return float64_sqrt(src, &env->fp_status);
+    float64 ret;
+    clear_float_exceptions(env);
+    ret = float64_sqrt(src, &env->fp_status);
+    check_ieee_exceptions(env);
+    return ret;
 }
 
 void helper_fsqrtq(CPUState *env)
 {
+    clear_float_exceptions(env);
     QT0 = float128_sqrt(QT1, &env->fp_status);
+    check_ieee_exceptions(env);
 }
 
 #define GEN_FCMP(name, size, reg1, reg2, FS, E)                         \
@@ -318,45 +443,6 @@ GEN_FCMP(fcmpeq_fcc3, float128, QT0, QT1, 26, 1);
 #undef GEN_FCMP_T
 #undef GEN_FCMP
 
-void helper_check_ieee_exceptions(CPUState *env)
-{
-    target_ulong status;
-
-    status = get_float_exception_flags(&env->fp_status);
-    if (status) {
-        /* Copy IEEE 754 flags into FSR */
-        if (status & float_flag_invalid) {
-            env->fsr |= FSR_NVC;
-        }
-        if (status & float_flag_overflow) {
-            env->fsr |= FSR_OFC;
-        }
-        if (status & float_flag_underflow) {
-            env->fsr |= FSR_UFC;
-        }
-        if (status & float_flag_divbyzero) {
-            env->fsr |= FSR_DZC;
-        }
-        if (status & float_flag_inexact) {
-            env->fsr |= FSR_NXC;
-        }
-
-        if ((env->fsr & FSR_CEXC_MASK) & ((env->fsr & FSR_TEM_MASK) >> 23)) {
-            /* Unmasked exception, generate a trap */
-            env->fsr |= FSR_FTT_IEEE_EXCP;
-            helper_raise_exception(env, TT_FP_EXCP);
-        } else {
-            /* Accumulate exceptions */
-            env->fsr |= (env->fsr & FSR_CEXC_MASK) << 5;
-        }
-    }
-}
-
-void helper_clear_float_exceptions(CPUState *env)
-{
-    set_float_exception_flags(0, &env->fp_status);
-}
-
 static inline void set_fsr(CPUState *env)
 {
     int rnd_mode;
diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index df367a4..6e66574 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -46,8 +46,6 @@ DEF_HELPER_5(ld_asi, i64, env, tl, int, int, int)
 DEF_HELPER_5(st_asi, void, env, tl, i64, int, int)
 #endif
 DEF_HELPER_2(ldfsr, void, env, i32)
-DEF_HELPER_1(check_ieee_exceptions, void, env)
-DEF_HELPER_1(clear_float_exceptions, void, env)
 DEF_HELPER_FLAGS_1(fabss, TCG_CALL_CONST | TCG_CALL_PURE, f32, f32)
 DEF_HELPER_2(fsqrts, f32, env, f32)
 DEF_HELPER_2(fsqrtd, f64, env, f64)
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 97a462b..d3f7648 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -1631,23 +1631,16 @@ static inline void gen_op_clear_ieee_excp_and_FTT(void)
     tcg_gen_andi_tl(cpu_fsr, cpu_fsr, FSR_FTT_CEXC_NMASK);
 }
 
-static inline void gen_clear_float_exceptions(void)
-{
-    gen_helper_clear_float_exceptions(cpu_env);
-}
-
 static void gen_fop_FF(DisasContext *dc, int rd, int rs,
                        void (*gen)(TCGv_i32, TCGv_ptr, TCGv_i32))
 {
     TCGv_i32 dst, src;
 
-    gen_clear_float_exceptions();
     src = gen_load_fpr_F(dc, rs);
     dst = gen_dest_fpr_F();
 
     gen(dst, cpu_env, src);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_F(dc, rd, dst);
 }
 
@@ -1669,14 +1662,12 @@ static void gen_fop_FFF(DisasContext *dc, int rd, int rs1, int rs2,
 {
     TCGv_i32 dst, src1, src2;
 
-    gen_clear_float_exceptions();
     src1 = gen_load_fpr_F(dc, rs1);
     src2 = gen_load_fpr_F(dc, rs2);
     dst = gen_dest_fpr_F();
 
     gen(dst, cpu_env, src1, src2);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_F(dc, rd, dst);
 }
 
@@ -1701,13 +1692,11 @@ static void gen_fop_DD(DisasContext *dc, int rd, int rs,
 {
     TCGv_i64 dst, src;
 
-    gen_clear_float_exceptions();
     src = gen_load_fpr_D(dc, rs);
     dst = gen_dest_fpr_D();
 
     gen(dst, cpu_env, src);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_D(dc, rd, dst);
 }
 
@@ -1731,14 +1720,12 @@ static void gen_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
 {
     TCGv_i64 dst, src1, src2;
 
-    gen_clear_float_exceptions();
     src1 = gen_load_fpr_D(dc, rs1);
     src2 = gen_load_fpr_D(dc, rs2);
     dst = gen_dest_fpr_D();
 
     gen(dst, cpu_env, src1, src2);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_D(dc, rd, dst);
 }
 
@@ -1761,12 +1748,10 @@ static void gen_ne_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
 static void gen_fop_QQ(DisasContext *dc, int rd, int rs,
                        void (*gen)(TCGv_ptr))
 {
-    gen_clear_float_exceptions();
     gen_op_load_fpr_QT1(QFPREG(rs));
 
     gen(cpu_env);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_op_store_QT0_fpr(QFPREG(rd));
     gen_update_fprs_dirty(QFPREG(rd));
 }
@@ -1787,13 +1772,11 @@ static void gen_ne_fop_QQ(DisasContext *dc, int rd, int rs,
 static void gen_fop_QQQ(DisasContext *dc, int rd, int rs1, int rs2,
                         void (*gen)(TCGv_ptr))
 {
-    gen_clear_float_exceptions();
     gen_op_load_fpr_QT0(QFPREG(rs1));
     gen_op_load_fpr_QT1(QFPREG(rs2));
 
     gen(cpu_env);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_op_store_QT0_fpr(QFPREG(rd));
     gen_update_fprs_dirty(QFPREG(rd));
 }
@@ -1804,14 +1787,12 @@ static void gen_fop_DFF(DisasContext *dc, int rd, int rs1, int rs2,
     TCGv_i64 dst;
     TCGv_i32 src1, src2;
 
-    gen_clear_float_exceptions();
     src1 = gen_load_fpr_F(dc, rs1);
     src2 = gen_load_fpr_F(dc, rs2);
     dst = gen_dest_fpr_D();
 
     gen(dst, cpu_env, src1, src2);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_D(dc, rd, dst);
 }
 
@@ -1820,13 +1801,11 @@ static void gen_fop_QDD(DisasContext *dc, int rd, int rs1, int rs2,
 {
     TCGv_i64 src1, src2;
 
-    gen_clear_float_exceptions();
     src1 = gen_load_fpr_D(dc, rs1);
     src2 = gen_load_fpr_D(dc, rs2);
 
     gen(cpu_env, src1, src2);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_op_store_QT0_fpr(QFPREG(rd));
     gen_update_fprs_dirty(QFPREG(rd));
 }
@@ -1838,13 +1817,11 @@ static void gen_fop_DF(DisasContext *dc, int rd, int rs,
     TCGv_i64 dst;
     TCGv_i32 src;
 
-    gen_clear_float_exceptions();
     src = gen_load_fpr_F(dc, rs);
     dst = gen_dest_fpr_D();
 
     gen(dst, cpu_env, src);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_D(dc, rd, dst);
 }
 #endif
@@ -1869,13 +1846,11 @@ static void gen_fop_FD(DisasContext *dc, int rd, int rs,
     TCGv_i32 dst;
     TCGv_i64 src;
 
-    gen_clear_float_exceptions();
     src = gen_load_fpr_D(dc, rs);
     dst = gen_dest_fpr_F();
 
     gen(dst, cpu_env, src);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_F(dc, rd, dst);
 }
 
@@ -1884,13 +1859,11 @@ static void gen_fop_FQ(DisasContext *dc, int rd, int rs,
 {
     TCGv_i32 dst;
 
-    gen_clear_float_exceptions();
     gen_op_load_fpr_QT1(QFPREG(rs));
     dst = gen_dest_fpr_F();
 
     gen(dst, cpu_env);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_F(dc, rd, dst);
 }
 
@@ -1899,13 +1872,11 @@ static void gen_fop_DQ(DisasContext *dc, int rd, int rs,
 {
     TCGv_i64 dst;
 
-    gen_clear_float_exceptions();
     gen_op_load_fpr_QT1(QFPREG(rs));
     dst = gen_dest_fpr_D();
 
     gen(dst, cpu_env);
 
-    gen_helper_check_ieee_exceptions(cpu_env);
     gen_store_fpr_D(dc, rd, dst);
 }
 
-- 
1.7.6.4
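
Editorial note on the FSR bookkeeping that check_ieee_exceptions
performs in the patch above: a minimal standalone model on a bare
32-bit word.  The SK_* constants and fold_exceptions are hypothetical
names for this sketch; the field positions are assumed to follow the
SPARC V9 FSR layout (cexc in bits 4:0, aexc in bits 9:5, TEM in bits
27:23, ftt in bits 16:14), which matches the ">> 23" and "<< 5" shifts
in the helper.  The softfloat-flag translation and the actual trap
delivery are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_CEXC_MASK     0x1fu                    /* current exceptions */
#define SK_AEXC_SHIFT    5                        /* accrued exceptions */
#define SK_TEM_SHIFT     23                       /* trap enable mask   */
#define SK_TEM_MASK      (0x1fu << SK_TEM_SHIFT)
#define SK_FTT_IEEE_EXCP (1u << 14)               /* ftt = IEEE_754_exception */

/*
 * Fold freshly raised flags (already in cexc bit order) into *fsr.
 * Returns true when an enabled exception requires a trap; otherwise
 * the current exceptions are accumulated into the accrued field.
 */
static bool fold_exceptions(uint32_t *fsr, uint32_t raised)
{
    assert((raised & ~SK_CEXC_MASK) == 0);

    if (raised == 0) {
        return false;
    }
    *fsr |= raised;
    if ((*fsr & SK_CEXC_MASK) & ((*fsr & SK_TEM_MASK) >> SK_TEM_SHIFT)) {
        *fsr |= SK_FTT_IEEE_EXCP;   /* unmasked: caller would raise the trap */
        return true;
    }
    *fsr |= (*fsr & SK_CEXC_MASK) << SK_AEXC_SHIFT;
    return false;
}
```

Each helper in the patch brackets its softfloat call with a clear of
the softfloat flags and this kind of fold, which is what lets the
translator emit one call per FOP instead of three.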


* [Qemu-devel] [PATCH 12/21] sparc-linux-user: Handle SIGILL.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (10 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 11/21] target-sparc: Do exceptions management fully inside the helpers Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 20:32   ` Blue Swirl
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 13/21] target-sparc: Implement PDIST Richard Henderson
                   ` (9 subsequent siblings)
  21 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel, Riku Voipio

Signed-off-by: Richard Henderson <rth@twiddle.net>
Cc: Riku Voipio <riku.voipio@iki.fi>
---
 linux-user/main.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 186358b..686f6f6 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -1191,6 +1191,15 @@ void cpu_loop (CPUSPARCState *env)
         case EXCP_INTERRUPT:
             /* just indicate that signals should be handled asap */
             break;
+        case TT_ILL_INSN:
+            {
+                info.si_signo = SIGILL;
+                info.si_errno = 0;
+                info.si_code = TARGET_ILL_ILLOPC;
+                info._sifields._sigfault._addr = env->pc;
+                queue_signal(env, info.si_signo, &info);
+            }
+            break;
         case EXCP_DEBUG:
             {
                 int sig;
-- 
1.7.6.4


* [Qemu-devel] [PATCH 13/21] target-sparc: Implement PDIST.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (11 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 12/21] sparc-linux-user: Handle SIGILL Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 14/21] target-sparc: Implement fpack{16, 32, fix} Richard Henderson
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/helper.h     |    1 +
 target-sparc/translate.c  |   21 +++++++++++++++++++--
 target-sparc/vis_helper.c |   21 +++++++++++++++++++++
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 6e66574..1a8e586 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -137,6 +137,7 @@ DEF_HELPER_FLAGS_2(fmul8ulx16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmuld8ulx16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fexpand, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
+DEF_HELPER_FLAGS_3(pdist, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
 #define VIS_HELPER(name)                                                 \
     DEF_HELPER_FLAGS_2(f ## name ## 16, TCG_CALL_CONST | TCG_CALL_PURE,  \
                        i64, i64, i64)                                    \
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index d3f7648..0acb477 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -1743,6 +1743,21 @@ static void gen_ne_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
 
     gen_store_fpr_D(dc, rd, dst);
 }
+
+static void gen_ne_fop_DDDD(DisasContext *dc, int rd, int rs1, int rs2,
+                            void (*gen)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 dst, src0, src1, src2;
+
+    src1 = gen_load_fpr_D(dc, rs1);
+    src2 = gen_load_fpr_D(dc, rs2);
+    src0 = gen_load_fpr_D(dc, rd);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, src0, src1, src2);
+
+    gen_store_fpr_D(dc, rd, dst);
+}
 #endif
 
 static void gen_fop_QQ(DisasContext *dc, int rd, int rs,
@@ -4064,9 +4079,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                 case 0x03a: /* VIS I fpack32 */
                 case 0x03b: /* VIS I fpack16 */
                 case 0x03d: /* VIS I fpackfix */
-                case 0x03e: /* VIS I pdist */
-                    // XXX
                     goto illegal_insn;
+                case 0x03e: /* VIS I pdist */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_ne_fop_DDDD(dc, rd, rs1, rs2, gen_helper_pdist);
+                    break;
                 case 0x048: /* VIS I faligndata */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index 39c8d9a..cd5d4a7 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -396,3 +396,24 @@ VIS_CMPHELPER(helper_fcmpgt, FCMPGT)
 VIS_CMPHELPER(helper_fcmpeq, FCMPEQ)
 VIS_CMPHELPER(helper_fcmple, FCMPLE)
 VIS_CMPHELPER(helper_fcmpne, FCMPNE)
+
+uint64_t helper_pdist(uint64_t sum, uint64_t src1, uint64_t src2)
+{
+    int i;
+    for (i = 0; i < 8; i++) {
+        int s1, s2;
+
+        s1 = (src1 >> (56 - (i * 8))) & 0xff;
+        s2 = (src2 >> (56 - (i * 8))) & 0xff;
+
+        /* Absolute value of difference. */
+        s1 -= s2;
+        if (s1 < 0) {
+            s1 = -s1;
+        }
+
+        sum += s1;
+    }
+
+    return sum;
+}
-- 
1.7.6.4
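
Editorial note with a worked example for the PDIST helper above: the
instruction adds the sum of absolute differences of the eight byte
lanes of the two sources to the accumulator (which is why the
translator also loads rd as a source).  The sketch below restates the
same loop as a standalone function; pdist_model and pdist_demo are
hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Accumulate the sum of absolute byte differences of src1 and src2. */
static uint64_t pdist_model(uint64_t sum, uint64_t src1, uint64_t src2)
{
    int i;

    for (i = 0; i < 8; i++) {
        int s1 = (src1 >> (56 - i * 8)) & 0xff;
        int s2 = (src2 >> (56 - i * 8)) & 0xff;
        int d = s1 - s2;

        sum += (uint64_t)(d < 0 ? -d : d);
    }
    return sum;
}

/* Lanes 1..8 against 8..1 differ by 7, 5, 3, 1, 1, 3, 5, 7, i.e. 32. */
static void pdist_demo(void)
{
    uint64_t acc = 100;

    acc = pdist_model(acc, 0x0102030405060708ull, 0x0807060504030201ull);
    assert(acc == 132);
}
```

Since every lane is widened to int before the subtraction, the
per-lane difference never wraps; only the 64-bit accumulator can.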


* [Qemu-devel] [PATCH 14/21] target-sparc: Implement fpack{16, 32, fix}.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (12 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 13/21] target-sparc: Implement PDIST Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 15/21] target-sparc: Implement EDGE* instructions Richard Henderson
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
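As a reading aid for helper_fpack16 below, here is a minimal standalone
sketch of the per-field arithmetic: scale the signed 16-bit fixed-point
value, drop the 7 fraction bits, and saturate to an unsigned byte.  The
names pack16_field and fpack16_ref are hypothetical, and the scale is
passed directly instead of being extracted from GSR.

```c
#include <stdint.h>

/* Pack one signed 16-bit fixed-point field into an unsigned byte. */
uint8_t pack16_field(int16_t src, unsigned scale)
{
    /* Multiply rather than shift: left-shifting a negative value is
       undefined behaviour in C. */
    int32_t scaled = (int32_t)src * (int32_t)(1u << (scale & 0xf));
    int32_t v;

    if (scaled < 0) {
        return 0;               /* negative values clamp to 0 */
    }
    v = scaled >> 7;            /* drop the fraction bits */
    return v > 255 ? 255 : (uint8_t)v;
}

/* Pack the four 16-bit fields of rs2 into four bytes, field 0 lowest. */
uint32_t fpack16_ref(unsigned scale, uint64_t rs2)
{
    uint32_t ret = 0;

    for (int i = 0; i < 4; i++) {
        int16_t src = (int16_t)(uint16_t)(rs2 >> (16 * i));

        ret |= (uint32_t)pack16_field(src, scale) << (8 * i);
    }
    return ret;
}
```

The same scale, truncate and saturate pattern applies to fpack32 and
fpackfix, with different field widths and fraction-bit counts.
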
 target-sparc/helper.h     |    3 ++
 target-sparc/translate.c  |   21 ++++++++++++++-
 target-sparc/vis_helper.c |   64 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 87 insertions(+), 1 deletions(-)

diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 1a8e586..5c8d266 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -138,6 +138,9 @@ DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmuld8ulx16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fexpand, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_3(pdist, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fpack16, TCG_CALL_CONST | TCG_CALL_PURE, i32, i64, i64)
+DEF_HELPER_FLAGS_3(fpack32, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fpackfix, TCG_CALL_CONST | TCG_CALL_PURE, i32, i64, i64)
 #define VIS_HELPER(name)                                                 \
     DEF_HELPER_FLAGS_2(f ## name ## 16, TCG_CALL_CONST | TCG_CALL_PURE,  \
                        i64, i64, i64)                                    \
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 0acb477..1edf255 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -4077,9 +4077,28 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmuld8ulx16);
                     break;
                 case 0x03a: /* VIS I fpack32 */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D();
+                    gen_helper_fpack32(cpu_dst_64, cpu_gsr,
+                                       cpu_src1_64, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    break;
                 case 0x03b: /* VIS I fpack16 */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fpack16(cpu_dst_32, cpu_gsr, cpu_src1_64);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    break;
                 case 0x03d: /* VIS I fpackfix */
-                    goto illegal_insn;
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_32 = gen_dest_fpr_F();
+                    gen_helper_fpackfix(cpu_dst_32, cpu_gsr, cpu_src1_64);
+                    gen_store_fpr_F(dc, rd, cpu_dst_32);
+                    break;
                 case 0x03e: /* VIS I pdist */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     gen_ne_fop_DDDD(dc, rd, rs1, rs2, gen_helper_pdist);
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index cd5d4a7..59ca8d7 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -417,3 +417,67 @@ uint64_t helper_pdist(uint64_t sum, uint64_t src1, uint64_t src2)
 
     return sum;
 }
+
+uint32_t helper_fpack16(uint64_t gsr, uint64_t rs2)
+{
+    int scale = (gsr >> 3) & 0xf;
+    uint32_t ret = 0;
+    int byte;
+
+    for (byte = 0; byte < 4; byte++) {
+        uint32_t val;
+        int16_t src = rs2 >> (byte * 16);
+        int32_t scaled = src << scale;
+        int32_t from_fixed = scaled >> 7;
+
+        val = (from_fixed < 0 ?  0 :
+               from_fixed > 255 ?  255 : from_fixed);
+
+        ret |= val << (8 * byte);
+    }
+
+    return ret;
+}
+
+uint64_t helper_fpack32(uint64_t gsr, uint64_t rs1, uint64_t rs2)
+{
+    int scale = (gsr >> 3) & 0x1f;
+    uint64_t ret = 0;
+    int word;
+
+    ret = (rs1 << 8) & ~(0x000000ff000000ffULL);
+    for (word = 0; word < 2; word++) {
+        uint64_t val;
+        int32_t src = rs2 >> (word * 32);
+        int64_t scaled = (int64_t)src << scale;
+        int64_t from_fixed = scaled >> 23;
+
+        val = (from_fixed < 0 ? 0 :
+               (from_fixed > 255) ? 255 : from_fixed);
+
+        ret |= val << (32 * word);
+    }
+
+    return ret;
+}
+
+uint32_t helper_fpackfix(uint64_t gsr, uint64_t rs2)
+{
+    int scale = (gsr >> 3) & 0x1f;
+    uint32_t ret = 0;
+    int word;
+
+    for (word = 0; word < 2; word++) {
+        uint32_t val;
+        int32_t src = rs2 >> (word * 32);
+        int64_t scaled = (int64_t)src << scale;
+        int64_t from_fixed = scaled >> 16;
+
+        val = (from_fixed < -32768 ? -32768 :
+               from_fixed > 32767 ?  32767 : from_fixed);
+
+        ret |= (val & 0xffff) << (word * 16);
+    }
+
+    return ret;
+}
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [PATCH 15/21] target-sparc: Implement EDGE* instructions.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (13 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 14/21] target-sparc: Implement fpack{16, 32, fix} Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 16/21] target-sparc: Implement ALIGNADDR* inline Richard Henderson
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
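To make the "theory of operation" comment in gen_edge concrete, here is a
minimal host-side sketch of the table lookup alone, written as plain C
rather than TCG ops.  The name edge_lookup is hypothetical and not part of
the patch; it only models index = (input & imask) << shift followed by
val = (table >> index) & omask.

```c
#include <stdint.h>

/* Look up one edge mask in a table packed into a 64-bit constant. */
unsigned edge_lookup(uint64_t table, uint64_t input,
                     unsigned imask, unsigned shift, unsigned omask)
{
    /* Bit position of the table entry selected by the low address bits. */
    unsigned index = (unsigned)(input & imask) << shift;

    return (unsigned)(table >> index) & omask;
}
```

With the big-endian 8-bit left-edge table from the patch,
edge_lookup(0x0103070f1f3f7fffULL, addr, 0x7, 3, 0xff) gives 0xff for an
aligned addr and 0x01 when the low three bits of addr are all set.
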
 target-sparc/translate.c |  177 +++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 175 insertions(+), 2 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 1edf255..df82ecc 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -2212,6 +2212,109 @@ static inline void gen_load_trap_state_at_tl(TCGv_ptr r_tsptr, TCGv_ptr cpu_env)
 
     tcg_temp_free_i32(r_tl);
 }
+
+static void gen_edge(DisasContext *dc, TCGv dst, TCGv s1, TCGv s2,
+                     int width, bool cc, bool left)
+{
+    TCGv lo1, lo2, t1, t2;
+    uint64_t amask, tabl, tabr;
+    int shift, imask, omask;
+
+    if (cc) {
+        tcg_gen_mov_tl(cpu_cc_src, s1);
+        tcg_gen_mov_tl(cpu_cc_src2, s2);
+        tcg_gen_sub_tl(cpu_cc_dst, s1, s2);
+        tcg_gen_movi_i32(cpu_cc_op, CC_OP_SUB);
+        dc->cc_op = CC_OP_SUB;
+    }
+
+    /* Theory of operation: there are two tables, left and right (not to
+       be confused with the left and right versions of the opcode).  These
+       are indexed by the low 3 bits of the inputs.  To make things "easy",
+       these tables are loaded into two constants, TABL and TABR below.
+       The operation index = (input & imask) << shift calculates the index
+       into the constant, while val = (table >> index) & omask calculates
+       the value we're looking for.  */
+    switch (width) {
+    case 8:
+        imask = 0x7;
+        shift = 3;
+        omask = 0xff;
+        if (left) {
+            tabl = 0x80c0e0f0f8fcfeffULL;
+            tabr = 0xff7f3f1f0f070301ULL;
+        } else {
+            tabl = 0x0103070f1f3f7fffULL;
+            tabr = 0xfffefcf8f0e0c080ULL;
+        }
+        break;
+    case 16:
+        imask = 0x6;
+        shift = 1;
+        omask = 0xf;
+        if (left) {
+            tabl = 0x8cef;
+            tabr = 0xf731;
+        } else {
+            tabl = 0x137f;
+            tabr = 0xfec8;
+        }
+        break;
+    case 32:
+        imask = 0x4;
+        shift = 0;
+        omask = 0x3;
+        if (left) {
+            tabl = (2 << 4) | 3;
+            tabr = (3 << 4) | 1;
+        } else {
+            tabl = (1 << 4) | 3;
+            tabr = (3 << 4) | 2;
+        }
+        break;
+    default:
+        abort();
+    }
+
+    lo1 = tcg_temp_new();
+    lo2 = tcg_temp_new();
+    tcg_gen_andi_tl(lo1, s1, imask);
+    tcg_gen_andi_tl(lo2, s2, imask);
+    tcg_gen_shli_tl(lo1, lo1, shift);
+    tcg_gen_shli_tl(lo2, lo2, shift);
+
+    t1 = tcg_const_tl(tabl);
+    t2 = tcg_const_tl(tabr);
+    tcg_gen_shr_tl(lo1, t1, lo1);
+    tcg_gen_shr_tl(lo2, t2, lo2);
+    tcg_gen_andi_tl(dst, lo1, omask);
+    tcg_gen_andi_tl(lo2, lo2, omask);
+
+    amask = -8;
+    if (AM_CHECK(dc)) {
+        amask &= 0xffffffffULL;
+    }
+    tcg_gen_andi_tl(s1, s1, amask);
+    tcg_gen_andi_tl(s2, s2, amask);
+
+    /* We want to compute
+        dst = (s1 == s2 ? lo1 & lo2 : lo1).
+       We've already done dst = lo1, so this reduces to
+        dst &= (s1 == s2 ? lo2 : -1)
+       Which we perform by
+        lo2 |= -(s1 != s2)
+        dst &= lo2
+    */
+    tcg_gen_setcond_tl(TCG_COND_NE, t1, s1, s2);
+    tcg_gen_neg_tl(t1, t1);
+    tcg_gen_or_tl(lo2, lo2, t1);
+    tcg_gen_and_tl(dst, dst, lo2);
+
+    tcg_temp_free(lo1);
+    tcg_temp_free(lo2);
+    tcg_temp_free(t1);
+    tcg_temp_free(t2);
+}
 #endif
 
 #define CHECK_IU_FEATURE(dc, FEATURE)                      \
@@ -3945,19 +4048,89 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
 
                 switch (opf) {
                 case 0x000: /* VIS I edge8cc */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 8, 1, 0);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x001: /* VIS II edge8n */
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 8, 0, 0);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x002: /* VIS I edge8lcc */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 8, 1, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x003: /* VIS II edge8ln */
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 8, 0, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x004: /* VIS I edge16cc */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 16, 1, 0);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x005: /* VIS II edge16n */
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 16, 0, 0);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x006: /* VIS I edge16lcc */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 16, 1, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x007: /* VIS II edge16ln */
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 16, 0, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x008: /* VIS I edge32cc */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 32, 1, 0);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x009: /* VIS II edge32n */
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 32, 0, 0);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x00a: /* VIS I edge32lcc */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 32, 1, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x00b: /* VIS II edge32ln */
-                    // XXX
-                    goto illegal_insn;
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_movl_reg_TN(rs1, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_edge(dc, cpu_dst, cpu_src1, cpu_src2, 32, 0, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x010: /* VIS I array8 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1 = get_src1(insn, cpu_src1);
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [PATCH 16/21] target-sparc: Implement ALIGNADDR* inline.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (14 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 15/21] target-sparc: Implement EDGE* instructions Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 17/21] target-sparc: Implement BMASK/BSHUFFLE Richard Henderson
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

While ALIGNADDR was implemented out-of-line, ALIGNADDRL was not
implemented at all.  However, this is a very simple operation
so we're better off doing this inline.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
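For comparison with the inline TCG sequence below, here is a minimal
plain-C model of what gen_alignaddr computes.  The name alignaddr_ref and
the pointer-to-GSR interface are hypothetical; the "little" flag plays the
role of the "left" argument in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return the 8-byte-aligned sum and store the low three offset bits
   (negated for the ALIGNADDRL form) into GSR<2:0>. */
uint64_t alignaddr_ref(uint64_t *gsr, uint64_t s1, uint64_t s2, bool little)
{
    uint64_t sum = s1 + s2;
    uint64_t off = little ? (0 - sum) : sum;

    /* Equivalent of the 3-bit deposit at bit 0. */
    *gsr = (*gsr & ~7ULL) | (off & 7);
    return sum & ~7ULL;
}
```

Only the low three bits of GSR are touched, which is what the
tcg_gen_deposit_tl(cpu_gsr, cpu_gsr, tmp, 0, 3) call expresses.
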
 target-sparc/helper.h     |    1 -
 target-sparc/translate.c  |   24 ++++++++++++++++++++++--
 target-sparc/vis_helper.c |   11 -----------
 3 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 5c8d266..4a61b77 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -17,7 +17,6 @@ DEF_HELPER_2(wrccr, void, env, tl)
 DEF_HELPER_1(rdcwp, tl, env)
 DEF_HELPER_2(wrcwp, void, env, tl)
 DEF_HELPER_FLAGS_2(array8, TCG_CALL_CONST | TCG_CALL_PURE, tl, tl, tl)
-DEF_HELPER_3(alignaddr, tl, env, tl, tl)
 DEF_HELPER_1(popc, tl, tl)
 DEF_HELPER_4(ldda_asi, void, env, tl, int, int)
 DEF_HELPER_5(ldf_asi, void, env, tl, int, int, int)
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index df82ecc..e955bf3 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -2315,6 +2315,20 @@ static void gen_edge(DisasContext *dc, TCGv dst, TCGv s1, TCGv s2,
     tcg_temp_free(t1);
     tcg_temp_free(t2);
 }
+
+static void gen_alignaddr(TCGv dst, TCGv s1, TCGv s2, bool left)
+{
+    TCGv tmp = tcg_temp_new();
+
+    tcg_gen_add_tl(tmp, s1, s2);
+    tcg_gen_andi_tl(dst, tmp, -8);
+    if (left) {
+        tcg_gen_neg_tl(tmp, tmp);
+    }
+    tcg_gen_deposit_tl(cpu_gsr, cpu_gsr, tmp, 0, 3);
+
+    tcg_temp_free(tmp);
+}
 #endif
 
 #define CHECK_IU_FEATURE(dc, FEATURE)                      \
@@ -4158,11 +4172,17 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1 = get_src1(insn, cpu_src1);
                     gen_movl_reg_TN(rs2, cpu_src2);
-                    gen_helper_alignaddr(cpu_dst, cpu_env, cpu_src1, cpu_src2);
+                    gen_alignaddr(cpu_dst, cpu_src1, cpu_src2, 0);
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
-                case 0x019: /* VIS II bmask */
                 case 0x01a: /* VIS I alignaddrl */
+                    CHECK_FPU_FEATURE(dc, VIS1);
+                    cpu_src1 = get_src1(insn, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    gen_alignaddr(cpu_dst, cpu_src1, cpu_src2, 1);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
+                case 0x019: /* VIS II bmask */
                     // XXX
                     goto illegal_insn;
                 case 0x020: /* VIS I fcmple16 */
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index 59ca8d7..40adb47 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -41,17 +41,6 @@ target_ulong helper_array8(target_ulong pixel_addr, target_ulong cubesize)
         GET_FIELD_SP(pixel_addr, 11, 12);
 }
 
-target_ulong helper_alignaddr(CPUState *env, target_ulong addr,
-                              target_ulong offset)
-{
-    uint64_t tmp;
-
-    tmp = addr + offset;
-    env->gsr &= ~7ULL;
-    env->gsr |= tmp & 7ULL;
-    return tmp & ~7ULL;
-}
-
 uint64_t helper_faligndata(CPUState *env, uint64_t src1, uint64_t src2)
 {
     uint64_t tmp;
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [PATCH 17/21] target-sparc: Implement BMASK/BSHUFFLE.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (15 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 16/21] target-sparc: Implement ALIGNADDR* inline Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 20:36   ` Blue Swirl
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 18/21] target-sparc: Tidy fpack32 Richard Henderson
                   ` (4 subsequent siblings)
  21 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
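As a cross-check for helper_bshuffle that does not depend on host
endianness, here is a minimal sketch of the byte selection.  It assumes the
VIS 2 convention as I read it: the 16 source bytes are numbered from the
most significant byte of rs1 (byte 0) to the least significant byte of rs2
(byte 15), and GSR.mask<31:28> selects the most significant result byte.
The name bshuffle_ref is hypothetical, and the mask is passed directly.

```c
#include <stdint.h>

/* Select eight of the sixteen bytes of src1:src2 according to mask. */
uint64_t bshuffle_ref(uint32_t mask, uint64_t src1, uint64_t src2)
{
    uint64_t r = 0;

    for (int i = 0; i < 8; i++) {
        /* Selector for result byte i, counting from the most
           significant byte. */
        unsigned e = (mask >> (28 - 4 * i)) & 0xf;
        uint64_t src = e < 8 ? src1 : src2;
        uint64_t b = (src >> (56 - 8 * (e & 7))) & 0xff;

        r |= b << (56 - 8 * i);
    }
    return r;
}
```

Under that assumption a mask of 0x01234567 returns src1 unchanged and
0x89abcdef returns src2, which makes for an easy identity test.
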
 target-sparc/helper.h     |    1 +
 target-sparc/translate.c  |   28 ++++++++++++++++++++++++----
 target-sparc/vis_helper.c |   29 +++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index 4a61b77..ec00436 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -140,6 +140,7 @@ DEF_HELPER_FLAGS_3(pdist, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fpack16, TCG_CALL_CONST | TCG_CALL_PURE, i32, i64, i64)
 DEF_HELPER_FLAGS_3(fpack32, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fpackfix, TCG_CALL_CONST | TCG_CALL_PURE, i32, i64, i64)
+DEF_HELPER_FLAGS_3(bshuffle, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
 #define VIS_HELPER(name)                                                 \
     DEF_HELPER_FLAGS_2(f ## name ## 16, TCG_CALL_CONST | TCG_CALL_PURE,  \
                        i64, i64, i64)                                    \
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index e955bf3..66107ee 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -1744,6 +1744,20 @@ static void gen_ne_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
     gen_store_fpr_D(dc, rd, dst);
 }
 
+static void gen_gsr_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
+                            void (*gen)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 dst, src1, src2;
+
+    src1 = gen_load_fpr_D(dc, rs1);
+    src2 = gen_load_fpr_D(dc, rs2);
+    dst = gen_dest_fpr_D();
+
+    gen(dst, cpu_gsr, src1, src2);
+
+    gen_store_fpr_D(dc, rd, dst);
+}
+
 static void gen_ne_fop_DDDD(DisasContext *dc, int rd, int rs1, int rs2,
                             void (*gen)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64))
 {
@@ -4183,8 +4197,13 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_movl_TN_reg(rd, cpu_dst);
                     break;
                 case 0x019: /* VIS II bmask */
-                    // XXX
-                    goto illegal_insn;
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    cpu_src1 = get_src1(insn, cpu_src1);
+                    gen_movl_reg_TN(rs2, cpu_src2);
+                    tcg_gen_add_tl(cpu_dst, cpu_src1, cpu_src2);
+                    tcg_gen_deposit_tl(cpu_gsr, cpu_gsr, cpu_dst, 32, 32);
+                    gen_movl_TN_reg(rd, cpu_dst);
+                    break;
                 case 0x020: /* VIS I fcmple16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
@@ -4310,8 +4329,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpmerge);
                     break;
                 case 0x04c: /* VIS II bshuffle */
-                    // XXX
-                    goto illegal_insn;
+                    CHECK_FPU_FEATURE(dc, VIS2);
+                    gen_gsr_fop_DDD(dc, rd, rs1, rs2, gen_helper_bshuffle);
+                    break;
                 case 0x04d: /* VIS I fexpand */
                     CHECK_FPU_FEATURE(dc, VIS1);
                     gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fexpand);
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index 40adb47..7830120 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -470,3 +470,32 @@ uint32_t helper_fpackfix(uint64_t gsr, uint64_t rs2)
 
     return ret;
 }
+
+uint64_t helper_bshuffle(uint64_t gsr, uint64_t src1, uint64_t src2)
+{
+    union {
+        uint64_t ll[2];
+        uint8_t b[16];
+    } s;
+    VIS64 r;
+    uint32_t i, mask, host;
+
+    /* Set up S such that we can index across all of the bytes.  */
+#ifdef HOST_WORDS_BIGENDIAN
+    s.ll[0] = src1;
+    s.ll[1] = src2;
+    host = 0;
+#else
+    s.ll[1] = src1;
+    s.ll[0] = src2;
+    host = 15;
+#endif
+    mask = gsr >> 32;
+
+    for (i = 0; i < 8; ++i) {
+        unsigned e = (mask >> (i * 4)) & 0xf;
+        r.VIS_B64(i) = s.b[e ^ host];
+    }
+
+    return r.ll;
+}
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [PATCH 18/21] target-sparc: Tidy fpack32.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (16 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 17/21] target-sparc: Implement BMASK/BSHUFFLE Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 19/21] target-sparc: Implement FALIGNDATA inline Richard Henderson
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

Use the new gen_gsr_fop_DDD helper.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/translate.c |    7 +------
 1 files changed, 1 insertions(+), 6 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 66107ee..267ac71 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -4290,12 +4290,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x03a: /* VIS I fpack32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_fpack32(cpu_dst_64, cpu_gsr,
-                                       cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_gsr_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpack32);
                     break;
                 case 0x03b: /* VIS I fpack16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [PATCH 19/21] target-sparc: Implement FALIGNDATA inline.
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (17 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 18/21] target-sparc: Tidy fpack32 Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 20/21] sparc-linux-user: Add some missing syscall numbers Richard Henderson
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel

This is a relatively simple sequence of shifts.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
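The shift sequence in gen_faligndata can be checked on the host with a
minimal plain-C model.  The name faligndata_ref is hypothetical.  The point
of interest is the second shift: for shift values that are multiples of 8
up to 56, shift ^ 63 equals 63 - shift, so the two right shifts together
amount to 64 - shift without ever shifting by 64.

```c
#include <stdint.h>

/* Extract 8 bytes from the 16-byte value s1:s2, starting at the byte
   offset held in GSR<2:0>. */
uint64_t faligndata_ref(uint64_t gsr, uint64_t s1, uint64_t s2)
{
    unsigned shift = (unsigned)(gsr & 7) * 8;   /* 0, 8, ..., 56 */
    uint64_t hi = s1 << shift;
    /* A shift by (63 - shift) followed by a shift by 1; with shift == 0
       this correctly yields 0. */
    uint64_t lo = (s2 >> (shift ^ 63)) >> 1;

    return hi | lo;
}
```

With an offset of 0 the result is s1 alone, which is the case the removed
helper had to special-case.
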
 target-sparc/helper.h     |    1 -
 target-sparc/translate.c  |   32 ++++++++++++++++++++++++++------
 target-sparc/vis_helper.c |   12 ------------
 3 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/target-sparc/helper.h b/target-sparc/helper.h
index ec00436..7626504 100644
--- a/target-sparc/helper.h
+++ b/target-sparc/helper.h
@@ -125,7 +125,6 @@ DEF_HELPER_1(fqtoi, s32, env)
 DEF_HELPER_2(fstox, s64, env, f32)
 DEF_HELPER_2(fdtox, s64, env, f64)
 DEF_HELPER_1(fqtox, s64, env)
-DEF_HELPER_3(faligndata, i64, env, i64, i64)
 
 DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64)
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 267ac71..591b391 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -2343,6 +2343,31 @@ static void gen_alignaddr(TCGv dst, TCGv s1, TCGv s2, bool left)
 
     tcg_temp_free(tmp);
 }
+
+static void gen_faligndata(TCGv dst, TCGv gsr, TCGv s1, TCGv s2)
+{
+    TCGv t1, t2, shift;
+
+    t1 = tcg_temp_new();
+    t2 = tcg_temp_new();
+    shift = tcg_temp_new();
+
+    tcg_gen_andi_tl(shift, gsr, 7);
+    tcg_gen_shli_tl(shift, shift, 3);
+    tcg_gen_shl_tl(t1, s1, shift);
+
+    /* A shift of 64 does not produce 0 in TCG.  Divide this into a
+       shift of (up to 63) followed by a constant shift of 1.  */
+    tcg_gen_xori_tl(shift, shift, 63);
+    tcg_gen_shr_tl(t2, s2, shift);
+    tcg_gen_shri_tl(t2, t2, 1);
+
+    tcg_gen_or_tl(dst, t1, t2);
+
+    tcg_temp_free(t1);
+    tcg_temp_free(t2);
+    tcg_temp_free(shift);
+}
 #endif
 
 #define CHECK_IU_FEATURE(dc, FEATURE)                      \
@@ -4312,12 +4337,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x048: /* VIS I faligndata */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
-                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
-                    cpu_dst_64 = gen_dest_fpr_D();
-                    gen_helper_faligndata(cpu_dst_64, cpu_env,
-                                          cpu_src1_64, cpu_src2_64);
-                    gen_store_fpr_D(dc, rd, cpu_dst_64);
+                    gen_gsr_fop_DDD(dc, rd, rs1, rs2, gen_faligndata);
                     break;
                 case 0x04b: /* VIS I fpmerge */
                     CHECK_FPU_FEATURE(dc, VIS1);
diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
index 7830120..a992c29 100644
--- a/target-sparc/vis_helper.c
+++ b/target-sparc/vis_helper.c
@@ -41,18 +41,6 @@ target_ulong helper_array8(target_ulong pixel_addr, target_ulong cubesize)
         GET_FIELD_SP(pixel_addr, 11, 12);
 }
 
-uint64_t helper_faligndata(CPUState *env, uint64_t src1, uint64_t src2)
-{
-    uint64_t tmp;
-
-    tmp = src1 << ((env->gsr & 7) * 8);
-    /* on many architectures a shift of 64 does nothing */
-    if ((env->gsr & 7) != 0) {
-        tmp |= src2 >> (64 - (env->gsr & 7) * 8);
-    }
-    return tmp;
-}
-
 #ifdef HOST_WORDS_BIGENDIAN
 #define VIS_B64(n) b[7 - (n)]
 #define VIS_W64(n) w[3 - (n)]
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [PATCH 20/21] sparc-linux-user: Add some missing syscall numbers
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (18 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 19/21] target-sparc: Implement FALIGNDATA inline Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 21/21] sparc-linux-user: Enable NPTL Richard Henderson
  2011-10-18 19:50 ` [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Blue Swirl
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel, Riku Voipio

Signed-off-by: Richard Henderson <rth@twiddle.net>
Cc: Riku Voipio <riku.voipio@iki.fi>
---
 linux-user/sparc/syscall_nr.h |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/linux-user/sparc/syscall_nr.h b/linux-user/sparc/syscall_nr.h
index be503f2..f201f9f 100644
--- a/linux-user/sparc/syscall_nr.h
+++ b/linux-user/sparc/syscall_nr.h
@@ -136,6 +136,7 @@
 #define TARGET_NR_utimes             138 /* SunOS Specific                              */
 #define TARGET_NR_stat64		139 /* Linux sparc32 Specific			   */
 #define TARGET_NR_getpeername        141 /* Common                                      */
+#define TARGET_NR_futex              142 /* gethostid under SunOS                       */
 #define TARGET_NR_gettid             143 /* ENOSYS under SunOS                          */
 #define TARGET_NR_getrlimit          144 /* Common                                      */
 #define TARGET_NR_setrlimit          145 /* Common                                      */
@@ -153,6 +154,7 @@
 #define TARGET_NR_getdomainname      162 /* SunOS Specific                              */
 #define TARGET_NR_setdomainname      163 /* Common                                      */
 #define TARGET_NR_quotactl           165 /* Common                                      */
+#define TARGET_NR_set_tid_address    166 /* Linux specific, exportfs under SunOS        */
 #define TARGET_NR_mount              167 /* Common                                      */
 #define TARGET_NR_ustat              168 /* Common                                      */
 #define TARGET_NR_getdents           174 /* Common                                      */
@@ -177,6 +179,7 @@
 #define TARGET_NR_readahead          205 /* Linux Specific                              */
 #define TARGET_NR_socketcall         206 /* Linux Specific                              */
 #define TARGET_NR_syslog             207 /* Linux Specific                              */
+#define TARGET_NR_tgkill             211 /* Linux Specific                              */
 #define TARGET_NR_waitpid            212 /* Linux Specific                              */
 #define TARGET_NR_swapoff            213 /* Linux Specific                              */
 #define TARGET_NR_sysinfo            214 /* Linux Specific                              */
-- 
1.7.6.4

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [Qemu-devel] [PATCH 21/21] sparc-linux-user: Enable NPTL
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (19 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 20/21] sparc-linux-user: Add some missing syscall numbers Richard Henderson
@ 2011-10-18 18:50 ` Richard Henderson
  2011-10-18 19:50 ` [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Blue Swirl
  21 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: blauwirbel, Riku Voipio

??? This doesn't work yet.  The new thread crashes more or less
immediately in the translated code, and then TCG aborts.

Perhaps some of that cpu_reset is really required?  The problem
with it is that it zeroes pc/npc, which also sends us off into
never-never land.  Perhaps cpu_clone_regs should take both the
old and new env, and move the copy/reset/update into cpu-specific
code?  That would certainly avoid the ifdef there...

Anyone see what's going wrong?
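
For anyone following the register juggling, here is a minimal standalone
sketch of the convention this patch is trying to implement.  It is a toy
model, not QEMU code: ToyEnv, toy_mark_parent, toy_setup_child and
TOY_PSR_CARRY are hypothetical names, and the carry bit position is an
assumption made for illustration.  Only the register indices
(regwptr[0], [1], [22]) and the pc/npc step mirror the diff in this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define TOY_PSR_CARRY (1u << 20)

/* Toy stand-in for the few CPU state fields the patch touches. */
typedef struct ToyEnv {
    uint32_t regwptr[24];   /* window registers; [0..7] are %o0..%o7 */
    uint32_t psr;
    uint32_t pc, npc;
} ToyEnv;

/* Parent side of the convention: %o1 == 0. */
static void toy_mark_parent(ToyEnv *parent)
{
    parent->regwptr[1] = 0;
}

/* Child side: copy the parent, then fix up what differs. */
static void toy_setup_child(ToyEnv *child, const ToyEnv *parent,
                            uint32_t newsp, uint32_t tid)
{
    *child = *parent;

    if (newsp) {
        child->regwptr[22] = newsp;     /* new stack, if one was given */
    }

    /* Carry clear signals syscall success to the caller. */
    child->psr &= ~TOY_PSR_CARRY;

    child->regwptr[0] = tid;            /* %o0: the new TID */
    child->regwptr[1] = 1;              /* %o1: 1 marks the child */

    /* Resume at the instruction after the syscall. */
    child->pc = child->npc;
    child->npc = child->npc + 4;
}
```

Note that the child here keeps the parent's pc/npc and only steps them
forward, which is exactly what a full reset would destroy.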

Not-signed-off-by: Richard Henderson <rth@twiddle.net>
Cc: Riku Voipio <riku.voipio@iki.fi>
---
 configure            |    3 +++
 linux-user/syscall.c |   12 +++++++++++-
 target-sparc/cpu.h   |   30 +++++++++++++++++++++++++-----
 3 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/configure b/configure
index 283ba81..8df9a6d 100755
--- a/configure
+++ b/configure
@@ -3313,11 +3313,13 @@ case "$target_arch2" in
   ;;
   sparc)
     target_phys_bits=64
+    target_nptl="yes"
   ;;
   sparc64)
     TARGET_BASE_ARCH=sparc
     target_phys_bits=64
     target_long_alignment=8
+    target_nptl="yes"
   ;;
   sparc32plus)
     TARGET_ARCH=sparc64
@@ -3325,6 +3327,7 @@ case "$target_arch2" in
     TARGET_ABI_DIR=sparc
     echo "TARGET_ABI32=y" >> $config_target_mak
     target_phys_bits=64
+    target_nptl="yes"
   ;;
   s390x)
     target_nptl="yes"
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 7735008..dfd7a89 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -3961,6 +3961,12 @@ static void *clone_func(void *arg)
     /* Wait until the parent has finshed initializing the tls state.  */
     pthread_mutex_lock(&clone_lock);
     pthread_mutex_unlock(&clone_lock);
+
+#ifdef TARGET_SPARC
+    /* Funny calling conventions for Sparc: the new TID is in %o0.  */
+    env->regwptr[0] = info->tid;
+#endif
+
     cpu_loop(env);
     /* never exits */
     return NULL;
@@ -4006,8 +4012,12 @@ static int do_fork(CPUState *env, unsigned int flags, abi_ulong newsp,
         init_task_state(ts);
         /* we create a new CPU instance. */
         new_env = cpu_copy(env);
-#if defined(TARGET_I386) || defined(TARGET_SPARC) || defined(TARGET_PPC)
+#if defined(TARGET_I386) || defined(TARGET_PPC)
         cpu_reset(new_env);
+#elif defined(TARGET_SPARC)
+        /* Funny calling conventions for Sparc: %o1 == 0 for parent,
+           and == 1 for child.  We handle the latter in cpu_clone_regs.  */
+        env->regwptr[1] = 0;
 #endif
         /* Init regs that differ from the parent.  */
         cpu_clone_regs(new_env, newsp);
diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h
index 71a890c..2c7d67b 100644
--- a/target-sparc/cpu.h
+++ b/target-sparc/cpu.h
@@ -675,12 +675,32 @@ static inline int cpu_pil_allowed(CPUState *env1, int pil)
 #if defined(CONFIG_USER_ONLY)
 static inline void cpu_clone_regs(CPUState *env, target_ulong newsp)
 {
-    if (newsp)
+    if (newsp) {
+        if (TARGET_VIRT_ADDR_SPACE_BITS == 32) {
+            newsp &= 0xffffffff;
+        }
         env->regwptr[22] = newsp;
-    env->regwptr[0] = 0;
-    /* FIXME: Do we also need to clear CF?  */
-    /* XXXXX */
-    printf ("HELPME: %s:%d\n", __FILE__, __LINE__);
+    }
+
+    /* Glibc tests for syscall error (carry set) before testing for
+       parent or child.  We must signal success.  */
+#if defined(TARGET_SPARC64) && !defined(TARGET_ABI32)
+    env->xcc &= ~PSR_CARRY;
+#else
+    env->psr &= ~PSR_CARRY;
+#endif
+
+    /* Indicate child.  */
+    env->regwptr[1] = 1;
+
+    /* Next instruction.  */
+    env->pc = env->npc;
+    env->npc = env->npc + 4;
+}
+
+static inline void cpu_set_tls(CPUState *env, target_ulong newtls)
+{
+    env->gregs[7] = newtls;
 }
 #endif
 
-- 
1.7.6.4


* Re: [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements
  2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
                   ` (20 preceding siblings ...)
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 21/21] sparc-linux-user: Enable NPTL Richard Henderson
@ 2011-10-18 19:50 ` Blue Swirl
  2011-10-18 20:03   ` Richard Henderson
  21 siblings, 1 reply; 37+ messages in thread
From: Blue Swirl @ 2011-10-18 19:50 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Tue, Oct 18, 2011 at 6:50 PM, Richard Henderson <rth@twiddle.net> wrote:
> This started out to be simply fleshing out the VIS2 instruction set.
> But when I got a look at the DT0/1 "calling convention" I choked, and
> thought we could really do better than that.
>
> The end result (op_opt,out_asm) looks significantly cleaner for a
> 64-bit host.  It looks about the same for a 32-bit host.
>
> I've been testing this vs the gcc testsuite, both for its generic
> ieee test cases, and the vectorization tests w/ -mvis2.

Excellent patch series.

> Watch out for the last patch.  It was an attempt to get rid of the
> hundreds of tls failures in the gcc testsuite by supporting NPTL.
> Except the clone syscall crashes, and seems to be crashing at a point
> where it's difficult to see what's going wrong.  That patch is
> present here for discussion only.

There is something fishy with Sparc fork() (spork?); I have also tried
to fix that several times in the past. Maybe fork() was actually
implemented as vfork()?

> All of this is relative to blueswirl's sparc tree.  Which I think
> should go in as a most excellent cleanup of target-sparc.  I've
> pushed the tree to
>
>  git://repo.or.cz/qemu/rth.git rth/vis

Thanks. Unfortunately I'm not sure I'll be able to fix and push my
series before the freeze, because none of the TCG targets other than
x86_64 implement AREG0-free mode.

>
> r~
>
>
> Richard Henderson (21):
>  target-sparc: Add accessors for single-precision fpr access.
>  target-sparc: Mark fprs dirty in store accessor.
>  target-sparc: Add accessors for double-precision fpr access.
>  target-sparc: Pass float64 parameters instead of dt0/1 temporaries.
>  target-sparc: Make VIS helpers const when possible.
>  target-sparc: Extract common code for floating-point operations.
>  target-sparc: Extract float128 move to a function.
>  target-sparc: Undo cpu_fpr rename.
>  target-sparc: Change fpr representation to doubles.
>  tcg: Optimize some forms of deposit.
>  target-sparc: Do exceptions management fully inside the helpers.
>  sparc-linux-user: Handle SIGILL.
>  target-sparc: Implement PDIST.
>  target-sparc: Implement fpack{16,32,fix}.
>  target-sparc: Implement EDGE* instructions.
>  target-sparc: Implement ALIGNADDR* inline.
>  target-sparc: Implement BMASK/BSHUFFLE.
>  target-sparc: Tidy fpack32.
>  target-sparc: Implement FALIGNDATA inline.
>  sparc-linux-user: Add some missing syscall numbers
>  sparc-linux-user: Enable NPTL
>
>  configure                     |    3 +
>  gdbstub.c                     |   35 +-
>  linux-user/main.c             |    9 +
>  linux-user/signal.c           |   28 +-
>  linux-user/sparc/syscall_nr.h |    3 +
>  linux-user/syscall.c          |   12 +-
>  monitor.c                     |   96 ++--
>  target-sparc/cpu.h            |   38 +-
>  target-sparc/cpu_init.c       |    6 +-
>  target-sparc/fop_helper.c     |  294 ++++++---
>  target-sparc/helper.h         |  120 ++--
>  target-sparc/ldst_helper.c    |  123 +---
>  target-sparc/machine.c        |   20 +-
>  target-sparc/translate.c      | 1461 ++++++++++++++++++++++++-----------------
>  target-sparc/vis_helper.c     |  251 +++++---
>  tcg/tcg-op.h                  |   65 ++-
>  16 files changed, 1503 insertions(+), 1061 deletions(-)
>
> --
> 1.7.6.4
>
>


* Re: [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements
  2011-10-18 19:50 ` [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Blue Swirl
@ 2011-10-18 20:03   ` Richard Henderson
  2011-10-18 20:19     ` Blue Swirl
  0 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 20:03 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

On 10/18/2011 12:50 PM, Blue Swirl wrote:
> Thanks. Unfortunately I'm not sure I'll be able to fix and push my
> series before the freeze, because other than x86_64, none of the TCG
> targets implement AREG0 free mode.

Oh, I see, you've not actually left the "normal" load/store helpers
in op_helper.c, and so the backends need fixing up for that feature.

Can we leave only __ld/st_mmu in op_helper.c for now, so that we 
don't rely on changes to the other backends for 1.0?  It seems like
all the other cleanups will Just Work, and there shouldn't be many
conflicts otherwise.

IIRC you measured the effect of both changes as being within the
noise of measurement.



r~


* Re: [Qemu-devel] [PATCH 04/21] target-sparc: Pass float64 parameters instead of dt0/1 temporaries.
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 04/21] target-sparc: Pass float64 parameters instead of dt0/1 temporaries Richard Henderson
@ 2011-10-18 20:04   ` Blue Swirl
  2011-10-18 20:07     ` Richard Henderson
  0 siblings, 1 reply; 37+ messages in thread
From: Blue Swirl @ 2011-10-18 20:04 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Tue, Oct 18, 2011 at 6:50 PM, Richard Henderson <rth@twiddle.net> wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-sparc/cpu.h         |    1 -
>  target-sparc/fop_helper.c  |  120 ++++++------
>  target-sparc/helper.h      |   95 +++++-----
>  target-sparc/ldst_helper.c |   52 -----
>  target-sparc/translate.c   |  449 ++++++++++++++++++++++----------------------
>  target-sparc/vis_helper.c  |  113 ++++++------
>  6 files changed, 381 insertions(+), 449 deletions(-)
>
> diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h
> index 99370d5..a4419a5 100644
> --- a/target-sparc/cpu.h
> +++ b/target-sparc/cpu.h
> @@ -463,7 +463,6 @@ typedef struct CPUSPARCState {
>     uint64_t prom_addr;
>  #endif
>     /* temporary float registers */
> -    float64 dt0, dt1;
>     float128 qt0, qt1;
>     float_status fp_status;
>  #if defined(TARGET_SPARC64)
> diff --git a/target-sparc/fop_helper.c b/target-sparc/fop_helper.c
> index 23502f3..f6348c2 100644
> --- a/target-sparc/fop_helper.c
> +++ b/target-sparc/fop_helper.c
> @@ -20,8 +20,6 @@
>  #include "cpu.h"
>  #include "helper.h"
>
> -#define DT0 (env->dt0)
> -#define DT1 (env->dt1)
>  #define QT0 (env->qt0)
>  #define QT1 (env->qt1)
>
> @@ -33,9 +31,10 @@
>     {                                                           \
>         return float32_ ## name (src1, src2, &env->fp_status);  \
>     }                                                           \
> -    F_HELPER(name, d)                                           \
> +    float64 helper_f ## name ## d (CPUState * env, float64 src1,\
> +                                   float64 src2)                \
>     {                                                           \
> -        DT0 = float64_ ## name (DT0, DT1, &env->fp_status);     \
> +        return float64_ ## name (src1, src2, &env->fp_status);  \
>     }                                                           \

Could we call float64_##name() directly from generated code and avoid
the wrapper? The translator could generate &env->fp_status, and in
other cases that pointer could be passed around instead of env.

>     F_HELPER(name, q)                                           \
>     {                                                           \
> @@ -48,17 +47,17 @@ F_BINOP(mul);
>  F_BINOP(div);
>  #undef F_BINOP
>
> -void helper_fsmuld(CPUState *env, float32 src1, float32 src2)
> +float64 helper_fsmuld(CPUState *env, float32 src1, float32 src2)
>  {
> -    DT0 = float64_mul(float32_to_float64(src1, &env->fp_status),
> -                      float32_to_float64(src2, &env->fp_status),
> -                      &env->fp_status);
> +    return float64_mul(float32_to_float64(src1, &env->fp_status),
> +                       float32_to_float64(src2, &env->fp_status),
> +                       &env->fp_status);
>  }
>
> -void helper_fdmulq(CPUState *env)
> +void helper_fdmulq(CPUState *env, float64 src1, float64 src2)
>  {
> -    QT0 = float128_mul(float64_to_float128(DT0, &env->fp_status),
> -                       float64_to_float128(DT1, &env->fp_status),
> +    QT0 = float128_mul(float64_to_float128(src1, &env->fp_status),
> +                       float64_to_float128(src2, &env->fp_status),
>                        &env->fp_status);
>  }
>
> @@ -68,9 +67,9 @@ float32 helper_fnegs(float32 src)
>  }
>
>  #ifdef TARGET_SPARC64
> -F_HELPER(neg, d)
> +float64 helper_fnegd(float64 src)
>  {
> -    DT0 = float64_chs(DT1);
> +    return float64_chs(src);
>  }
>
>  F_HELPER(neg, q)
> @@ -85,9 +84,9 @@ float32 helper_fitos(CPUState *env, int32_t src)
>     return int32_to_float32(src, &env->fp_status);
>  }
>
> -void helper_fitod(CPUState *env, int32_t src)
> +float64 helper_fitod(CPUState *env, int32_t src)
>  {
> -    DT0 = int32_to_float64(src, &env->fp_status);
> +    return int32_to_float64(src, &env->fp_status);
>  }
>
>  void helper_fitoq(CPUState *env, int32_t src)
> @@ -96,32 +95,32 @@ void helper_fitoq(CPUState *env, int32_t src)
>  }
>
>  #ifdef TARGET_SPARC64
> -float32 helper_fxtos(CPUState *env)
> +float32 helper_fxtos(CPUState *env, int64_t src)
>  {
> -    return int64_to_float32(*((int64_t *)&DT1), &env->fp_status);
> +    return int64_to_float32(src, &env->fp_status);
>  }
>
> -F_HELPER(xto, d)
> +float64 helper_fxtod(CPUState *env, int64_t src)
>  {
> -    DT0 = int64_to_float64(*((int64_t *)&DT1), &env->fp_status);
> +    return int64_to_float64(src, &env->fp_status);
>  }
>
> -F_HELPER(xto, q)
> +void helper_fxtoq(CPUState *env, int64_t src)
>  {
> -    QT0 = int64_to_float128(*((int64_t *)&DT1), &env->fp_status);
> +    QT0 = int64_to_float128(src, &env->fp_status);
>  }
>  #endif
>  #undef F_HELPER
>
>  /* floating point conversion */
> -float32 helper_fdtos(CPUState *env)
> +float32 helper_fdtos(CPUState *env, float64 src)
>  {
> -    return float64_to_float32(DT1, &env->fp_status);
> +    return float64_to_float32(src, &env->fp_status);
>  }
>
> -void helper_fstod(CPUState *env, float32 src)
> +float64 helper_fstod(CPUState *env, float32 src)
>  {
> -    DT0 = float32_to_float64(src, &env->fp_status);
> +    return float32_to_float64(src, &env->fp_status);
>  }
>
>  float32 helper_fqtos(CPUState *env)
> @@ -134,14 +133,14 @@ void helper_fstoq(CPUState *env, float32 src)
>     QT0 = float32_to_float128(src, &env->fp_status);
>  }
>
> -void helper_fqtod(CPUState *env)
> +float64 helper_fqtod(CPUState *env)
>  {
> -    DT0 = float128_to_float64(QT1, &env->fp_status);
> +    return float128_to_float64(QT1, &env->fp_status);
>  }
>
> -void helper_fdtoq(CPUState *env)
> +void helper_fdtoq(CPUState *env, float64 src)
>  {
> -    QT0 = float64_to_float128(DT1, &env->fp_status);
> +    QT0 = float64_to_float128(src, &env->fp_status);
>  }
>
>  /* Float to integer conversion.  */
> @@ -150,9 +149,9 @@ int32_t helper_fstoi(CPUState *env, float32 src)
>     return float32_to_int32_round_to_zero(src, &env->fp_status);
>  }
>
> -int32_t helper_fdtoi(CPUState *env)
> +int32_t helper_fdtoi(CPUState *env, float64 src)
>  {
> -    return float64_to_int32_round_to_zero(DT1, &env->fp_status);
> +    return float64_to_int32_round_to_zero(src, &env->fp_status);
>  }
>
>  int32_t helper_fqtoi(CPUState *env)
> @@ -161,19 +160,19 @@ int32_t helper_fqtoi(CPUState *env)
>  }
>
>  #ifdef TARGET_SPARC64
> -void helper_fstox(CPUState *env, float32 src)
> +int64_t helper_fstox(CPUState *env, float32 src)
>  {
> -    *((int64_t *)&DT0) = float32_to_int64_round_to_zero(src, &env->fp_status);
> +    return float32_to_int64_round_to_zero(src, &env->fp_status);
>  }
>
> -void helper_fdtox(CPUState *env)
> +int64_t helper_fdtox(CPUState *env, float64 src)
>  {
> -    *((int64_t *)&DT0) = float64_to_int64_round_to_zero(DT1, &env->fp_status);
> +    return float64_to_int64_round_to_zero(src, &env->fp_status);
>  }
>
> -void helper_fqtox(CPUState *env)
> +int64_t helper_fqtox(CPUState *env)
>  {
> -    *((int64_t *)&DT0) = float128_to_int64_round_to_zero(QT1, &env->fp_status);
> +    return float128_to_int64_round_to_zero(QT1, &env->fp_status);
>  }
>  #endif
>
> @@ -183,9 +182,9 @@ float32 helper_fabss(float32 src)
>  }
>
>  #ifdef TARGET_SPARC64
> -void helper_fabsd(CPUState *env)
> +float64 helper_fabsd(CPUState *env, float64 src)
>  {
> -    DT0 = float64_abs(DT1);
> +    return float64_abs(src);
>  }
>
>  void helper_fabsq(CPUState *env)
> @@ -199,9 +198,9 @@ float32 helper_fsqrts(CPUState *env, float32 src)
>     return float32_sqrt(src, &env->fp_status);
>  }
>
> -void helper_fsqrtd(CPUState *env)
> +float64 helper_fsqrtd(CPUState *env, float64 src)
>  {
> -    DT0 = float64_sqrt(DT1, &env->fp_status);
> +    return float64_sqrt(src, &env->fp_status);
>  }
>
>  void helper_fsqrtq(CPUState *env)
> @@ -245,8 +244,8 @@ void helper_fsqrtq(CPUState *env)
>             break;                                                      \
>         }                                                               \
>     }
> -#define GEN_FCMPS(name, size, FS, E)                                    \
> -    void glue(helper_, name)(CPUState *env, float32 src1, float32 src2) \
> +#define GEN_FCMP_T(name, size, FS, E)                                   \
> +    void glue(helper_, name)(CPUState *env, size src1, size src2)       \
>     {                                                                   \
>         env->fsr &= FSR_FTT_NMASK;                                      \
>         if (E && (glue(size, _is_any_nan)(src1) ||                      \
> @@ -282,41 +281,42 @@ void helper_fsqrtq(CPUState *env)
>         }                                                               \
>     }
>
> -GEN_FCMPS(fcmps, float32, 0, 0);
> -GEN_FCMP(fcmpd, float64, DT0, DT1, 0, 0);
> +GEN_FCMP_T(fcmps, float32, 0, 0);
> +GEN_FCMP_T(fcmpd, float64, 0, 0);
>
> -GEN_FCMPS(fcmpes, float32, 0, 1);
> -GEN_FCMP(fcmped, float64, DT0, DT1, 0, 1);
> +GEN_FCMP_T(fcmpes, float32, 0, 1);
> +GEN_FCMP_T(fcmped, float64, 0, 1);
>
>  GEN_FCMP(fcmpq, float128, QT0, QT1, 0, 0);
>  GEN_FCMP(fcmpeq, float128, QT0, QT1, 0, 1);
>
>  #ifdef TARGET_SPARC64
> -GEN_FCMPS(fcmps_fcc1, float32, 22, 0);
> -GEN_FCMP(fcmpd_fcc1, float64, DT0, DT1, 22, 0);
> +GEN_FCMP_T(fcmps_fcc1, float32, 22, 0);
> +GEN_FCMP_T(fcmpd_fcc1, float64, 22, 0);
>  GEN_FCMP(fcmpq_fcc1, float128, QT0, QT1, 22, 0);
>
> -GEN_FCMPS(fcmps_fcc2, float32, 24, 0);
> -GEN_FCMP(fcmpd_fcc2, float64, DT0, DT1, 24, 0);
> +GEN_FCMP_T(fcmps_fcc2, float32, 24, 0);
> +GEN_FCMP_T(fcmpd_fcc2, float64, 24, 0);
>  GEN_FCMP(fcmpq_fcc2, float128, QT0, QT1, 24, 0);
>
> -GEN_FCMPS(fcmps_fcc3, float32, 26, 0);
> -GEN_FCMP(fcmpd_fcc3, float64, DT0, DT1, 26, 0);
> +GEN_FCMP_T(fcmps_fcc3, float32, 26, 0);
> +GEN_FCMP_T(fcmpd_fcc3, float64, 26, 0);
>  GEN_FCMP(fcmpq_fcc3, float128, QT0, QT1, 26, 0);
>
> -GEN_FCMPS(fcmpes_fcc1, float32, 22, 1);
> -GEN_FCMP(fcmped_fcc1, float64, DT0, DT1, 22, 1);
> +GEN_FCMP_T(fcmpes_fcc1, float32, 22, 1);
> +GEN_FCMP_T(fcmped_fcc1, float64, 22, 1);
>  GEN_FCMP(fcmpeq_fcc1, float128, QT0, QT1, 22, 1);
>
> -GEN_FCMPS(fcmpes_fcc2, float32, 24, 1);
> -GEN_FCMP(fcmped_fcc2, float64, DT0, DT1, 24, 1);
> +GEN_FCMP_T(fcmpes_fcc2, float32, 24, 1);
> +GEN_FCMP_T(fcmped_fcc2, float64, 24, 1);
>  GEN_FCMP(fcmpeq_fcc2, float128, QT0, QT1, 24, 1);
>
> -GEN_FCMPS(fcmpes_fcc3, float32, 26, 1);
> -GEN_FCMP(fcmped_fcc3, float64, DT0, DT1, 26, 1);
> +GEN_FCMP_T(fcmpes_fcc3, float32, 26, 1);
> +GEN_FCMP_T(fcmped_fcc3, float64, 26, 1);
>  GEN_FCMP(fcmpeq_fcc3, float128, QT0, QT1, 26, 1);
>  #endif
> -#undef GEN_FCMPS
> +#undef GEN_FCMP_T
> +#undef GEN_FCMP
>
>  void helper_check_ieee_exceptions(CPUState *env)
>  {
> diff --git a/target-sparc/helper.h b/target-sparc/helper.h
> index c1b4e65..089233f 100644
> --- a/target-sparc/helper.h
> +++ b/target-sparc/helper.h
> @@ -39,8 +39,6 @@ DEF_HELPER_3(udiv, tl, env, tl, tl)
>  DEF_HELPER_3(udiv_cc, tl, env, tl, tl)
>  DEF_HELPER_3(sdiv, tl, env, tl, tl)
>  DEF_HELPER_3(sdiv_cc, tl, env, tl, tl)
> -DEF_HELPER_3(stdf, void, env, tl, int)
> -DEF_HELPER_3(lddf, void, env, tl, int)
>  DEF_HELPER_3(ldqf, void, env, tl, int)
>  DEF_HELPER_3(stqf, void, env, tl, int)
>  #if !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)
> @@ -52,29 +50,29 @@ DEF_HELPER_1(check_ieee_exceptions, void, env)
>  DEF_HELPER_1(clear_float_exceptions, void, env)
>  DEF_HELPER_1(fabss, f32, f32)
>  DEF_HELPER_2(fsqrts, f32, env, f32)
> -DEF_HELPER_1(fsqrtd, void, env)
> +DEF_HELPER_2(fsqrtd, f64, env, f64)
>  DEF_HELPER_3(fcmps, void, env, f32, f32)
> -DEF_HELPER_1(fcmpd, void, env)
> +DEF_HELPER_3(fcmpd, void, env, f64, f64)
>  DEF_HELPER_3(fcmpes, void, env, f32, f32)
> -DEF_HELPER_1(fcmped, void, env)
> +DEF_HELPER_3(fcmped, void, env, f64, f64)
>  DEF_HELPER_1(fsqrtq, void, env)
>  DEF_HELPER_1(fcmpq, void, env)
>  DEF_HELPER_1(fcmpeq, void, env)
>  #ifdef TARGET_SPARC64
>  DEF_HELPER_2(ldxfsr, void, env, i64)
> -DEF_HELPER_1(fabsd, void, env)
> +DEF_HELPER_2(fabsd, f64, env, f64)
>  DEF_HELPER_3(fcmps_fcc1, void, env, f32, f32)
>  DEF_HELPER_3(fcmps_fcc2, void, env, f32, f32)
>  DEF_HELPER_3(fcmps_fcc3, void, env, f32, f32)
> -DEF_HELPER_1(fcmpd_fcc1, void, env)
> -DEF_HELPER_1(fcmpd_fcc2, void, env)
> -DEF_HELPER_1(fcmpd_fcc3, void, env)
> +DEF_HELPER_3(fcmpd_fcc1, void, env, f64, f64)
> +DEF_HELPER_3(fcmpd_fcc2, void, env, f64, f64)
> +DEF_HELPER_3(fcmpd_fcc3, void, env, f64, f64)
>  DEF_HELPER_3(fcmpes_fcc1, void, env, f32, f32)
>  DEF_HELPER_3(fcmpes_fcc2, void, env, f32, f32)
>  DEF_HELPER_3(fcmpes_fcc3, void, env, f32, f32)
> -DEF_HELPER_1(fcmped_fcc1, void, env)
> -DEF_HELPER_1(fcmped_fcc2, void, env)
> -DEF_HELPER_1(fcmped_fcc3, void, env)
> +DEF_HELPER_3(fcmped_fcc1, void, env, f64, f64)
> +DEF_HELPER_3(fcmped_fcc2, void, env, f64, f64)
> +DEF_HELPER_3(fcmped_fcc3, void, env, f64, f64)
>  DEF_HELPER_1(fabsq, void, env)
>  DEF_HELPER_1(fcmpq_fcc1, void, env)
>  DEF_HELPER_1(fcmpq_fcc2, void, env)
> @@ -86,77 +84,78 @@ DEF_HELPER_1(fcmpeq_fcc3, void, env)
>  DEF_HELPER_2(raise_exception, void, env, int)
>  DEF_HELPER_0(shutdown, void)
>  #define F_HELPER_0_1(name) DEF_HELPER_1(f ## name, void, env)
> -#define F_HELPER_DQ_0_1(name)                   \
> -    F_HELPER_0_1(name ## d);                    \
> -    F_HELPER_0_1(name ## q)
>
> -F_HELPER_DQ_0_1(add);
> -F_HELPER_DQ_0_1(sub);
> -F_HELPER_DQ_0_1(mul);
> -F_HELPER_DQ_0_1(div);
> +DEF_HELPER_3(faddd, f64, env, f64, f64)
> +DEF_HELPER_3(fsubd, f64, env, f64, f64)
> +DEF_HELPER_3(fmuld, f64, env, f64, f64)
> +DEF_HELPER_3(fdivd, f64, env, f64, f64)
> +F_HELPER_0_1(addq)
> +F_HELPER_0_1(subq)
> +F_HELPER_0_1(mulq)
> +F_HELPER_0_1(divq)
>
>  DEF_HELPER_3(fadds, f32, env, f32, f32)
>  DEF_HELPER_3(fsubs, f32, env, f32, f32)
>  DEF_HELPER_3(fmuls, f32, env, f32, f32)
>  DEF_HELPER_3(fdivs, f32, env, f32, f32)
>
> -DEF_HELPER_3(fsmuld, void, env, f32, f32)
> -F_HELPER_0_1(dmulq);
> +DEF_HELPER_3(fsmuld, f64, env, f32, f32)
> +DEF_HELPER_3(fdmulq, void, env, f64, f64);
>
>  DEF_HELPER_1(fnegs, f32, f32)
> -DEF_HELPER_2(fitod, void, env, s32)
> +DEF_HELPER_2(fitod, f64, env, s32)
>  DEF_HELPER_2(fitoq, void, env, s32)
>
>  DEF_HELPER_2(fitos, f32, env, s32)
>
>  #ifdef TARGET_SPARC64
> -DEF_HELPER_1(fnegd, void, env)
> +DEF_HELPER_1(fnegd, f64, f64)
>  DEF_HELPER_1(fnegq, void, env)
> -DEF_HELPER_1(fxtos, i32, env)
> -F_HELPER_DQ_0_1(xto);
> +DEF_HELPER_2(fxtos, f32, env, s64)
> +DEF_HELPER_2(fxtod, f64, env, s64)
> +DEF_HELPER_2(fxtoq, void, env, s64)
>  #endif
> -DEF_HELPER_1(fdtos, f32, env)
> -DEF_HELPER_2(fstod, void, env, f32)
> +DEF_HELPER_2(fdtos, f32, env, f64)
> +DEF_HELPER_2(fstod, f64, env, f32)
>  DEF_HELPER_1(fqtos, f32, env)
>  DEF_HELPER_2(fstoq, void, env, f32)
> -F_HELPER_0_1(qtod);
> -F_HELPER_0_1(dtoq);
> +DEF_HELPER_1(fqtod, f64, env)
> +DEF_HELPER_2(fdtoq, void, env, f64)
>  DEF_HELPER_2(fstoi, s32, env, f32)
> -DEF_HELPER_1(fdtoi, s32, env)
> +DEF_HELPER_2(fdtoi, s32, env, f64)
>  DEF_HELPER_1(fqtoi, s32, env)
>  #ifdef TARGET_SPARC64
> -DEF_HELPER_2(fstox, void, env, i32)
> -F_HELPER_0_1(dtox);
> -F_HELPER_0_1(qtox);
> -F_HELPER_0_1(aligndata);
> +DEF_HELPER_2(fstox, s64, env, f32)
> +DEF_HELPER_2(fdtox, s64, env, f64)
> +DEF_HELPER_1(fqtox, s64, env)
> +DEF_HELPER_3(faligndata, i64, env, i64, i64)
>
> -F_HELPER_0_1(pmerge);
> -F_HELPER_0_1(mul8x16);
> -F_HELPER_0_1(mul8x16al);
> -F_HELPER_0_1(mul8x16au);
> -F_HELPER_0_1(mul8sux16);
> -F_HELPER_0_1(mul8ulx16);
> -F_HELPER_0_1(muld8sux16);
> -F_HELPER_0_1(muld8ulx16);
> -F_HELPER_0_1(expand);
> +DEF_HELPER_3(fpmerge, i64, env, i64, i64)
> +DEF_HELPER_3(fmul8x16, i64, env, i64, i64)
> +DEF_HELPER_3(fmul8x16al, i64, env, i64, i64)
> +DEF_HELPER_3(fmul8x16au, i64, env, i64, i64)
> +DEF_HELPER_3(fmul8sux16, i64, env, i64, i64)
> +DEF_HELPER_3(fmul8ulx16, i64, env, i64, i64)
> +DEF_HELPER_3(fmuld8sux16, i64, env, i64, i64)
> +DEF_HELPER_3(fmuld8ulx16, i64, env, i64, i64)
> +DEF_HELPER_3(fexpand, i64, env, i64, i64)
>  #define VIS_HELPER(name)                                 \
> -    F_HELPER_0_1(name##16);                              \
> +    DEF_HELPER_3(f ## name ## 16, i64, env, i64, i64)    \
>     DEF_HELPER_3(f ## name ## 16s, i32, env, i32, i32)   \
> -    F_HELPER_0_1(name##32);                              \
> +    DEF_HELPER_3(f ## name ## 32, i64, env, i64, i64)    \
>     DEF_HELPER_3(f ## name ## 32s, i32, env, i32, i32)
>
>  VIS_HELPER(padd);
>  VIS_HELPER(psub);
>  #define VIS_CMPHELPER(name)                              \
> -    DEF_HELPER_1(f##name##16, i64, env);                 \
> -    DEF_HELPER_1(f##name##32, i64, env)
> +    DEF_HELPER_3(f##name##16, i64, env, i64, i64)        \
> +    DEF_HELPER_3(f##name##32, i64, env, i64, i64)
>  VIS_CMPHELPER(cmpgt);
>  VIS_CMPHELPER(cmpeq);
>  VIS_CMPHELPER(cmple);
>  VIS_CMPHELPER(cmpne);
>  #endif
>  #undef F_HELPER_0_1
> -#undef F_HELPER_DQ_0_1
>  #undef VIS_HELPER
>  #undef VIS_CMPHELPER
>  DEF_HELPER_1(compute_psr, void, env);
> diff --git a/target-sparc/ldst_helper.c b/target-sparc/ldst_helper.c
> index 1e4337d..ec9b5f2 100644
> --- a/target-sparc/ldst_helper.c
> +++ b/target-sparc/ldst_helper.c
> @@ -61,8 +61,6 @@
>  #endif
>  #endif
>
> -#define DT0 (env->dt0)
> -#define DT1 (env->dt1)
>  #define QT0 (env->qt0)
>  #define QT1 (env->qt1)
>
> @@ -2228,56 +2226,6 @@ target_ulong helper_casx_asi(CPUState *env, target_ulong addr,
>  }
>  #endif /* TARGET_SPARC64 */
>
> -void helper_stdf(CPUState *env, target_ulong addr, int mem_idx)
> -{
> -    helper_check_align(env, addr, 7);
> -#if !defined(CONFIG_USER_ONLY)
> -    switch (mem_idx) {
> -    case MMU_USER_IDX:
> -        cpu_stfq_user(env, addr, DT0);
> -        break;
> -    case MMU_KERNEL_IDX:
> -        cpu_stfq_kernel(env, addr, DT0);
> -        break;
> -#ifdef TARGET_SPARC64
> -    case MMU_HYPV_IDX:
> -        cpu_stfq_hypv(env, addr, DT0);
> -        break;
> -#endif
> -    default:
> -        DPRINTF_MMU("helper_stdf: need to check MMU idx %d\n", mem_idx);
> -        break;
> -    }
> -#else
> -    stfq_raw(address_mask(env, addr), DT0);
> -#endif
> -}
> -
> -void helper_lddf(CPUState *env, target_ulong addr, int mem_idx)
> -{
> -    helper_check_align(env, addr, 7);
> -#if !defined(CONFIG_USER_ONLY)
> -    switch (mem_idx) {
> -    case MMU_USER_IDX:
> -        DT0 = cpu_ldfq_user(env, addr);
> -        break;
> -    case MMU_KERNEL_IDX:
> -        DT0 = cpu_ldfq_kernel(env, addr);
> -        break;
> -#ifdef TARGET_SPARC64
> -    case MMU_HYPV_IDX:
> -        DT0 = cpu_ldfq_hypv(env, addr);
> -        break;
> -#endif
> -    default:
> -        DPRINTF_MMU("helper_lddf: need to check MMU idx %d\n", mem_idx);
> -        break;
> -    }
> -#else
> -    DT0 = ldfq_raw(address_mask(env, addr));
> -#endif
> -}
> -
>  void helper_ldqf(CPUState *env, target_ulong addr, int mem_idx)
>  {
>     /* XXX add 128 bit load */
> diff --git a/target-sparc/translate.c b/target-sparc/translate.c
> index bea93af..f0614b5 100644
> --- a/target-sparc/translate.c
> +++ b/target-sparc/translate.c
> @@ -186,30 +186,6 @@ static TCGv_i64 gen_dest_fpr_D(void)
>     return cpu_tmp64;
>  }
>
> -static void gen_op_load_fpr_DT0(unsigned int src)
> -{
> -    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, dt0) +
> -                   offsetof(CPU_DoubleU, l.upper));
> -    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
> -                   offsetof(CPU_DoubleU, l.lower));
> -}
> -
> -static void gen_op_load_fpr_DT1(unsigned int src)
> -{
> -    tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, dt1) +
> -                   offsetof(CPU_DoubleU, l.upper));
> -    tcg_gen_st_i32(cpu__fpr[src + 1], cpu_env, offsetof(CPUSPARCState, dt1) +
> -                   offsetof(CPU_DoubleU, l.lower));
> -}
> -
> -static void gen_op_store_DT0_fpr(unsigned int dst)
> -{
> -    tcg_gen_ld_i32(cpu__fpr[dst], cpu_env, offsetof(CPUSPARCState, dt0) +
> -                   offsetof(CPU_DoubleU, l.upper));
> -    tcg_gen_ld_i32(cpu__fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, dt0) +
> -                   offsetof(CPU_DoubleU, l.lower));
> -}
> -
>  static void gen_op_load_fpr_QT0(unsigned int src)
>  {
>     tcg_gen_st_i32(cpu__fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
> @@ -1490,20 +1466,20 @@ static inline void gen_op_fcmps(int fccno, TCGv_i32 r_rs1, TCGv_i32 r_rs2)
>     }
>  }
>
> -static inline void gen_op_fcmpd(int fccno)
> +static inline void gen_op_fcmpd(int fccno, TCGv_i64 r_rs1, TCGv_i64 r_rs2)
>  {
>     switch (fccno) {
>     case 0:
> -        gen_helper_fcmpd(cpu_env);
> +        gen_helper_fcmpd(cpu_env, r_rs1, r_rs2);
>         break;
>     case 1:
> -        gen_helper_fcmpd_fcc1(cpu_env);
> +        gen_helper_fcmpd_fcc1(cpu_env, r_rs1, r_rs2);
>         break;
>     case 2:
> -        gen_helper_fcmpd_fcc2(cpu_env);
> +        gen_helper_fcmpd_fcc2(cpu_env, r_rs1, r_rs2);
>         break;
>     case 3:
> -        gen_helper_fcmpd_fcc3(cpu_env);
> +        gen_helper_fcmpd_fcc3(cpu_env, r_rs1, r_rs2);
>         break;
>     }
>  }
> @@ -1544,20 +1520,20 @@ static inline void gen_op_fcmpes(int fccno, TCGv_i32 r_rs1, TCGv_i32 r_rs2)
>     }
>  }
>
> -static inline void gen_op_fcmped(int fccno)
> +static inline void gen_op_fcmped(int fccno, TCGv_i64 r_rs1, TCGv_i64 r_rs2)
>  {
>     switch (fccno) {
>     case 0:
> -        gen_helper_fcmped(cpu_env);
> +        gen_helper_fcmped(cpu_env, r_rs1, r_rs2);
>         break;
>     case 1:
> -        gen_helper_fcmped_fcc1(cpu_env);
> +        gen_helper_fcmped_fcc1(cpu_env, r_rs1, r_rs2);
>         break;
>     case 2:
> -        gen_helper_fcmped_fcc2(cpu_env);
> +        gen_helper_fcmped_fcc2(cpu_env, r_rs1, r_rs2);
>         break;
>     case 3:
> -        gen_helper_fcmped_fcc3(cpu_env);
> +        gen_helper_fcmped_fcc3(cpu_env, r_rs1, r_rs2);
>         break;
>     }
>  }
> @@ -1587,9 +1563,9 @@ static inline void gen_op_fcmps(int fccno, TCGv r_rs1, TCGv r_rs2)
>     gen_helper_fcmps(cpu_env, r_rs1, r_rs2);
>  }
>
> -static inline void gen_op_fcmpd(int fccno)
> +static inline void gen_op_fcmpd(int fccno, TCGv_i64 r_rs1, TCGv_i64 r_rs2)
>  {
> -    gen_helper_fcmpd(cpu_env);
> +    gen_helper_fcmpd(cpu_env, r_rs1, r_rs2);
>  }
>
>  static inline void gen_op_fcmpq(int fccno)
> @@ -1602,9 +1578,9 @@ static inline void gen_op_fcmpes(int fccno, TCGv r_rs1, TCGv r_rs2)
>     gen_helper_fcmpes(cpu_env, r_rs1, r_rs2);
>  }
>
> -static inline void gen_op_fcmped(int fccno)
> +static inline void gen_op_fcmped(int fccno, TCGv_i64 r_rs1, TCGv_i64 r_rs2)
>  {
> -    gen_helper_fcmped(cpu_env);
> +    gen_helper_fcmped(cpu_env, r_rs1, r_rs2);
>  }
>
>  static inline void gen_op_fcmpeq(int fccno)
> @@ -2461,12 +2437,12 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     break;
>                 case 0x2a: /* fsqrtd */
>                     CHECK_FPU_FEATURE(dc, FSQRT);
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
>                     gen_clear_float_exceptions();
> -                    gen_helper_fsqrtd(cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fsqrtd(cpu_dst_64, cpu_env, cpu_src1_64);
>                     gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x2b: /* fsqrtq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> @@ -2488,13 +2464,14 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     gen_store_fpr_F(dc, rd, cpu_dst_32);
>                     break;
>                 case 0x42: /* faddd */
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
>                     gen_clear_float_exceptions();
> -                    gen_helper_faddd(cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_faddd(cpu_dst_64, cpu_env,
> +                                     cpu_src1_64, cpu_src2_64);
>                     gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x43: /* faddq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> @@ -2517,13 +2494,14 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     gen_store_fpr_F(dc, rd, cpu_dst_32);
>                     break;
>                 case 0x46: /* fsubd */
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
>                     gen_clear_float_exceptions();
> -                    gen_helper_fsubd(cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fsubd(cpu_dst_64, cpu_env,
> +                                     cpu_src1_64, cpu_src2_64);
>                     gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x47: /* fsubq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> @@ -2548,13 +2526,14 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     break;
>                 case 0x4a: /* fmuld */
>                     CHECK_FPU_FEATURE(dc, FMUL);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
>                     gen_clear_float_exceptions();
> -                    gen_helper_fmuld(cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fmuld(cpu_dst_64, cpu_env,
> +                                     cpu_src1_64, cpu_src2_64);
>                     gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x4b: /* fmulq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> @@ -2578,13 +2557,14 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     gen_store_fpr_F(dc, rd, cpu_dst_32);
>                     break;
>                 case 0x4e: /* fdivd */
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
>                     gen_clear_float_exceptions();
> -                    gen_helper_fdivd(cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fdivd(cpu_dst_64, cpu_env,
> +                                     cpu_src1_64, cpu_src2_64);
>                     gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x4f: /* fdivq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> @@ -2601,17 +2581,18 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     gen_clear_float_exceptions();
>                     cpu_src1_32 = gen_load_fpr_F(dc, rs1);
>                     cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    gen_helper_fsmuld(cpu_env, cpu_src1_32, cpu_src2_32);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fsmuld(cpu_dst_64, cpu_env,
> +                                      cpu_src1_32, cpu_src2_32);
>                     gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x6e: /* fdmulq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
>                     gen_clear_float_exceptions();
> -                    gen_helper_fdmulq(cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    gen_helper_fdmulq(cpu_env, cpu_src1_64, cpu_src2_64);
>                     gen_helper_check_ieee_exceptions(cpu_env);
>                     gen_op_store_QT0_fpr(QFPREG(rd));
>                     gen_update_fprs_dirty(QFPREG(rd));
> @@ -2625,10 +2606,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     gen_store_fpr_F(dc, rd, cpu_dst_32);
>                     break;
>                 case 0xc6: /* fdtos */
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
>                     gen_clear_float_exceptions();
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
>                     cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fdtos(cpu_dst_32, cpu_env);
> +                    gen_helper_fdtos(cpu_dst_32, cpu_env, cpu_src1_64);
>                     gen_helper_check_ieee_exceptions(cpu_env);
>                     gen_store_fpr_F(dc, rd, cpu_dst_32);
>                     break;
> @@ -2643,24 +2624,24 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     break;
>                 case 0xc8: /* fitod */
>                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    gen_helper_fitod(cpu_env, cpu_src1_32);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fitod(cpu_dst_64, cpu_env, cpu_src1_32);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0xc9: /* fstod */
>                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    gen_helper_fstod(cpu_env, cpu_src1_32);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fstod(cpu_dst_64, cpu_env, cpu_src1_32);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0xcb: /* fqtod */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_op_load_fpr_QT1(QFPREG(rs2));
>                     gen_clear_float_exceptions();
> -                    gen_helper_fqtod(cpu_env);
> +                    gen_op_load_fpr_QT1(QFPREG(rs2));
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fqtod(cpu_dst_64, cpu_env);
>                     gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0xcc: /* fitoq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> @@ -2678,8 +2659,8 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     break;
>                 case 0xce: /* fdtoq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fdtoq(cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> +                    gen_helper_fdtoq(cpu_env, cpu_src1_64);
>                     gen_op_store_QT0_fpr(QFPREG(rd));
>                     gen_update_fprs_dirty(QFPREG(rd));
>                     break;
> @@ -2692,10 +2673,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     gen_store_fpr_F(dc, rd, cpu_dst_32);
>                     break;
>                 case 0xd2: /* fdtoi */
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
>                     gen_clear_float_exceptions();
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
>                     cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fdtoi(cpu_dst_32, cpu_env);
> +                    gen_helper_fdtoi(cpu_dst_32, cpu_env, cpu_src1_64);
>                     gen_helper_check_ieee_exceptions(cpu_env);
>                     gen_store_fpr_F(dc, rd, cpu_dst_32);
>                     break;
> @@ -2726,10 +2707,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     gen_update_fprs_dirty(QFPREG(rd));
>                     break;
>                 case 0x6: /* V9 fnegd */
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fnegd(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fnegd(cpu_dst_64, cpu_src1_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x7: /* V9 fnegq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> @@ -2739,10 +2720,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     gen_update_fprs_dirty(QFPREG(rd));
>                     break;
>                 case 0xa: /* V9 fabsd */
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fabsd(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fabsd(cpu_dst_64, cpu_env, cpu_src1_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0xb: /* V9 fabsq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> @@ -2754,49 +2735,49 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                 case 0x81: /* V9 fstox */
>                     gen_clear_float_exceptions();
>                     cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    gen_helper_fstox(cpu_env, cpu_src1_32);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fstox(cpu_dst_64, cpu_env, cpu_src1_32);
>                     gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x82: /* V9 fdtox */
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
>                     gen_clear_float_exceptions();
> -                    gen_helper_fdtox(cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fdtox(cpu_dst_64, cpu_env, cpu_src1_64);
>                     gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x83: /* V9 fqtox */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
>                     gen_op_load_fpr_QT1(QFPREG(rs2));
>                     gen_clear_float_exceptions();
> -                    gen_helper_fqtox(cpu_env);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fqtox(cpu_dst_64, cpu_env);
>                     gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x84: /* V9 fxtos */
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
>                     gen_clear_float_exceptions();
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
>                     cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fxtos(cpu_dst_32, cpu_env);
> +                    gen_helper_fxtos(cpu_dst_32, cpu_env, cpu_src1_64);
>                     gen_helper_check_ieee_exceptions(cpu_env);
>                     gen_store_fpr_F(dc, rd, cpu_dst_32);
>                     break;
>                 case 0x88: /* V9 fxtod */
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
>                     gen_clear_float_exceptions();
> -                    gen_helper_fxtod(cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fxtod(cpu_dst_64, cpu_env, cpu_src1_64);
>                     gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x8c: /* V9 fxtoq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
>                     gen_clear_float_exceptions();
> -                    gen_helper_fxtoq(cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> +                    gen_helper_fxtoq(cpu_env, cpu_src1_64);
>                     gen_helper_check_ieee_exceptions(cpu_env);
>                     gen_op_store_QT0_fpr(QFPREG(rd));
>                     gen_update_fprs_dirty(QFPREG(rd));
> @@ -3046,9 +3027,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                         gen_op_fcmps(rd & 3, cpu_src1_32, cpu_src2_32);
>                         break;
>                     case 0x52: /* fcmpd, V9 %fcc */
> -                        gen_op_load_fpr_DT0(DFPREG(rs1));
> -                        gen_op_load_fpr_DT1(DFPREG(rs2));
> -                        gen_op_fcmpd(rd & 3);
> +                        cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                        cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                        gen_op_fcmpd(rd & 3, cpu_src1_64, cpu_src2_64);
>                         break;
>                     case 0x53: /* fcmpq, V9 %fcc */
>                         CHECK_FPU_FEATURE(dc, FLOAT128);
> @@ -3062,9 +3043,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                         gen_op_fcmpes(rd & 3, cpu_src1_32, cpu_src2_32);
>                         break;
>                     case 0x56: /* fcmped, V9 %fcc */
> -                        gen_op_load_fpr_DT0(DFPREG(rs1));
> -                        gen_op_load_fpr_DT1(DFPREG(rs2));
> -                        gen_op_fcmped(rd & 3);
> +                        cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                        cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                        gen_op_fcmped(rd & 3, cpu_src1_64, cpu_src2_64);
>                         break;
>                     case 0x57: /* fcmpeq, V9 %fcc */
>                         CHECK_FPU_FEATURE(dc, FLOAT128);
> @@ -3953,115 +3934,130 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     goto illegal_insn;
>                 case 0x020: /* VIS I fcmple16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fcmple16(cpu_dst, cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    gen_helper_fcmple16(cpu_dst, cpu_env,
> +                                        cpu_src1_64, cpu_src2_64);
>                     gen_movl_TN_reg(rd, cpu_dst);
>                     break;
>                 case 0x022: /* VIS I fcmpne16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fcmpne16(cpu_dst, cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    gen_helper_fcmpne16(cpu_dst, cpu_env,
> +                                        cpu_src1_64, cpu_src2_64);
>                     gen_movl_TN_reg(rd, cpu_dst);
>                     break;
>                 case 0x024: /* VIS I fcmple32 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fcmple32(cpu_dst, cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    gen_helper_fcmple32(cpu_dst, cpu_env,
> +                                        cpu_src1_64, cpu_src2_64);
>                     gen_movl_TN_reg(rd, cpu_dst);
>                     break;
>                 case 0x026: /* VIS I fcmpne32 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fcmpne32(cpu_dst, cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    gen_helper_fcmpne32(cpu_dst, cpu_env,
> +                                        cpu_src1_64, cpu_src2_64);
>                     gen_movl_TN_reg(rd, cpu_dst);
>                     break;
>                 case 0x028: /* VIS I fcmpgt16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fcmpgt16(cpu_dst, cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    gen_helper_fcmpgt16(cpu_dst, cpu_env,
> +                                        cpu_src1_64, cpu_src2_64);
>                     gen_movl_TN_reg(rd, cpu_dst);
>                     break;
>                 case 0x02a: /* VIS I fcmpeq16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fcmpeq16(cpu_dst, cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    gen_helper_fcmpeq16(cpu_dst, cpu_env,
> +                                        cpu_src1_64, cpu_src2_64);
>                     gen_movl_TN_reg(rd, cpu_dst);
>                     break;
>                 case 0x02c: /* VIS I fcmpgt32 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fcmpgt32(cpu_dst, cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    gen_helper_fcmpgt32(cpu_dst, cpu_env,
> +                                        cpu_src1_64, cpu_src2_64);
>                     gen_movl_TN_reg(rd, cpu_dst);
>                     break;
>                 case 0x02e: /* VIS I fcmpeq32 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fcmpeq32(cpu_dst, cpu_env);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    gen_helper_fcmpeq32(cpu_dst, cpu_env,
> +                                        cpu_src1_64, cpu_src2_64);
>                     gen_movl_TN_reg(rd, cpu_dst);
>                     break;
>                 case 0x031: /* VIS I fmul8x16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fmul8x16(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fmul8x16(cpu_dst_64, cpu_env,
> +                                        cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x033: /* VIS I fmul8x16au */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fmul8x16au(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fmul8x16au(cpu_dst_64, cpu_env,
> +                                          cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x035: /* VIS I fmul8x16al */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fmul8x16al(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fmul8x16al(cpu_dst_64, cpu_env,
> +                                          cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x036: /* VIS I fmul8sux16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fmul8sux16(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fmul8sux16(cpu_dst_64, cpu_env,
> +                                          cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x037: /* VIS I fmul8ulx16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fmul8ulx16(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fmul8ulx16(cpu_dst_64, cpu_env,
> +                                          cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x038: /* VIS I fmuld8sux16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fmuld8sux16(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fmuld8sux16(cpu_dst_64, cpu_env,
> +                                           cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x039: /* VIS I fmuld8ulx16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fmuld8ulx16(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fmuld8ulx16(cpu_dst_64, cpu_env,
> +                                           cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x03a: /* VIS I fpack32 */
>                 case 0x03b: /* VIS I fpack16 */
> @@ -4071,38 +4067,42 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     goto illegal_insn;
>                 case 0x048: /* VIS I faligndata */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_faligndata(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_faligndata(cpu_dst_64, cpu_env,
> +                                          cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x04b: /* VIS I fpmerge */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fpmerge(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fpmerge(cpu_dst_64, cpu_env,
> +                                       cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x04c: /* VIS II bshuffle */
>                     // XXX
>                     goto illegal_insn;
>                 case 0x04d: /* VIS I fexpand */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fexpand(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fexpand(cpu_dst_64, cpu_env,
> +                                       cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x050: /* VIS I fpadd16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fpadd16(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fpadd16(cpu_dst_64, cpu_env,
> +                                       cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x051: /* VIS I fpadd16s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> @@ -4115,11 +4115,12 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     break;
>                 case 0x052: /* VIS I fpadd32 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fpadd32(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fpadd32(cpu_dst_64, cpu_env,
> +                                       cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x053: /* VIS I fpadd32s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> @@ -4131,11 +4132,12 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     break;
>                 case 0x054: /* VIS I fpsub16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fpsub16(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fpsub16(cpu_dst_64, cpu_env,
> +                                       cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x055: /* VIS I fpsub16s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> @@ -4148,11 +4150,12 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     break;
>                 case 0x056: /* VIS I fpsub32 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_op_load_fpr_DT0(DFPREG(rs1));
> -                    gen_op_load_fpr_DT1(DFPREG(rs2));
> -                    gen_helper_fpsub32(cpu_env);
> -                    gen_op_store_DT0_fpr(DFPREG(rd));
> -                    gen_update_fprs_dirty(DFPREG(rd));
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> +                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    gen_helper_fpsub32(cpu_dst_64, cpu_env,
> +                                       cpu_src1_64, cpu_src2_64);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 case 0x057: /* VIS I fpsub32s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> @@ -4812,16 +4815,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     }
>                     break;
>                 case 0x23:      /* lddf, load double fpreg */
> -                    {
> -                        TCGv_i32 r_const;
> -
> -                        r_const = tcg_const_i32(dc->mem_idx);
> -                        gen_address_mask(dc, cpu_addr);
> -                        gen_helper_lddf(cpu_env, cpu_addr, r_const);
> -                        tcg_temp_free_i32(r_const);
> -                        gen_op_store_DT0_fpr(DFPREG(rd));
> -                        gen_update_fprs_dirty(DFPREG(rd));
> -                    }
> +                    gen_address_mask(dc, cpu_addr);
> +                    cpu_dst_64 = gen_dest_fpr_D();
> +                    tcg_gen_qemu_ld64(cpu_dst_64, cpu_addr, dc->mem_idx);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                     break;
>                 default:
>                     goto illegal_insn;
> @@ -4973,15 +4970,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>  #endif
>  #endif
>                 case 0x27: /* stdf, store double fpreg */
> -                    {
> -                        TCGv_i32 r_const;
> -
> -                        gen_op_load_fpr_DT0(DFPREG(rd));
> -                        r_const = tcg_const_i32(dc->mem_idx);
> -                        gen_address_mask(dc, cpu_addr);
> -                        gen_helper_stdf(cpu_env, cpu_addr, r_const);
> -                        tcg_temp_free_i32(r_const);
> -                    }
> +                    gen_address_mask(dc, cpu_addr);
> +                    cpu_src1_64 = gen_load_fpr_D(dc, rd);
> +                    tcg_gen_qemu_st64(cpu_src1_64, cpu_addr, dc->mem_idx);
>                     break;
>                 default:
>                     goto illegal_insn;
> diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
> index a22c10b..a007b0f 100644
> --- a/target-sparc/vis_helper.c
> +++ b/target-sparc/vis_helper.c
> @@ -20,11 +20,6 @@
>  #include "cpu.h"
>  #include "helper.h"
>
> -#define DT0 (env->dt0)
> -#define DT1 (env->dt1)
> -#define QT0 (env->qt0)
> -#define QT1 (env->qt1)
> -
>  /* This function uses non-native bit order */
>  #define GET_FIELD(X, FROM, TO)                                  \
>     ((X) >> (63 - (TO)) & ((1ULL << ((TO) - (FROM) + 1)) - 1))
> @@ -58,16 +53,16 @@ target_ulong helper_alignaddr(CPUState *env, target_ulong addr,
>     return tmp & ~7ULL;
>  }
>
> -void helper_faligndata(CPUState *env)
> +uint64_t helper_faligndata(CPUState *env, uint64_t src1, uint64_t src2)
>  {
>     uint64_t tmp;
>
> -    tmp = (*((uint64_t *)&DT0)) << ((env->gsr & 7) * 8);
> +    tmp = src1 << ((env->gsr & 7) * 8);
>     /* on many architectures a shift of 64 does nothing */
>     if ((env->gsr & 7) != 0) {
> -        tmp |= (*((uint64_t *)&DT1)) >> (64 - (env->gsr & 7) * 8);
> +        tmp |= src2 >> (64 - (env->gsr & 7) * 8);
>     }
> -    *((uint64_t *)&DT0) = tmp;
> +    return tmp;
>  }
>
>  #ifdef HOST_WORDS_BIGENDIAN
> @@ -102,12 +97,12 @@ typedef union {
>     float32 f;
>  } VIS32;
>
> -void helper_fpmerge(CPUState *env)
> +uint64_t helper_fpmerge(CPUState *env, uint64_t src1, uint64_t src2)
>  {
>     VIS64 s, d;
>
> -    s.d = DT0;
> -    d.d = DT1;
> +    s.ll = src1;
> +    d.ll = src2;
>
>     /* Reverse calculation order to handle overlap */
>     d.VIS_B64(7) = s.VIS_B64(3);
> @@ -119,16 +114,16 @@ void helper_fpmerge(CPUState *env)
>     d.VIS_B64(1) = s.VIS_B64(0);
>     /* d.VIS_B64(0) = d.VIS_B64(0); */
>
> -    DT0 = d.d;
> +    return d.ll;
>  }
>
> -void helper_fmul8x16(CPUState *env)
> +uint64_t helper_fmul8x16(CPUState *env, uint64_t src1, uint64_t src2)
>  {
>     VIS64 s, d;
>     uint32_t tmp;
>
> -    s.d = DT0;
> -    d.d = DT1;
> +    s.ll = src1;
> +    d.ll = src2;
>
>  #define PMUL(r)                                                 \
>     tmp = (int32_t)d.VIS_SW64(r) * (int32_t)s.VIS_B64(r);       \
> @@ -143,16 +138,16 @@ void helper_fmul8x16(CPUState *env)
>     PMUL(3);
>  #undef PMUL
>
> -    DT0 = d.d;
> +    return d.ll;
>  }
>
> -void helper_fmul8x16al(CPUState *env)
> +uint64_t helper_fmul8x16al(CPUState *env, uint64_t src1, uint64_t src2)
>  {
>     VIS64 s, d;
>     uint32_t tmp;
>
> -    s.d = DT0;
> -    d.d = DT1;
> +    s.ll = src1;
> +    d.ll = src2;
>
>  #define PMUL(r)                                                 \
>     tmp = (int32_t)d.VIS_SW64(1) * (int32_t)s.VIS_B64(r);       \
> @@ -167,16 +162,16 @@ void helper_fmul8x16al(CPUState *env)
>     PMUL(3);
>  #undef PMUL
>
> -    DT0 = d.d;
> +    return d.ll;
>  }
>
> -void helper_fmul8x16au(CPUState *env)
> +uint64_t helper_fmul8x16au(CPUState *env, uint64_t src1, uint64_t src2)
>  {
>     VIS64 s, d;
>     uint32_t tmp;
>
> -    s.d = DT0;
> -    d.d = DT1;
> +    s.ll = src1;
> +    d.ll = src2;
>
>  #define PMUL(r)                                                 \
>     tmp = (int32_t)d.VIS_SW64(0) * (int32_t)s.VIS_B64(r);       \
> @@ -191,16 +186,16 @@ void helper_fmul8x16au(CPUState *env)
>     PMUL(3);
>  #undef PMUL
>
> -    DT0 = d.d;
> +    return d.ll;
>  }
>
> -void helper_fmul8sux16(CPUState *env)
> +uint64_t helper_fmul8sux16(CPUState *env, uint64_t src1, uint64_t src2)
>  {
>     VIS64 s, d;
>     uint32_t tmp;
>
> -    s.d = DT0;
> -    d.d = DT1;
> +    s.ll = src1;
> +    d.ll = src2;
>
>  #define PMUL(r)                                                         \
>     tmp = (int32_t)d.VIS_SW64(r) * ((int32_t)s.VIS_SW64(r) >> 8);       \
> @@ -215,16 +210,16 @@ void helper_fmul8sux16(CPUState *env)
>     PMUL(3);
>  #undef PMUL
>
> -    DT0 = d.d;
> +    return d.ll;
>  }
>
> -void helper_fmul8ulx16(CPUState *env)
> +uint64_t helper_fmul8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
>  {
>     VIS64 s, d;
>     uint32_t tmp;
>
> -    s.d = DT0;
> -    d.d = DT1;
> +    s.ll = src1;
> +    d.ll = src2;
>
>  #define PMUL(r)                                                         \
>     tmp = (int32_t)d.VIS_SW64(r) * ((uint32_t)s.VIS_B64(r * 2));        \
> @@ -239,16 +234,16 @@ void helper_fmul8ulx16(CPUState *env)
>     PMUL(3);
>  #undef PMUL
>
> -    DT0 = d.d;
> +    return d.ll;
>  }
>
> -void helper_fmuld8sux16(CPUState *env)
> +uint64_t helper_fmuld8sux16(CPUState *env, uint64_t src1, uint64_t src2)
>  {
>     VIS64 s, d;
>     uint32_t tmp;
>
> -    s.d = DT0;
> -    d.d = DT1;
> +    s.ll = src1;
> +    d.ll = src2;
>
>  #define PMUL(r)                                                         \
>     tmp = (int32_t)d.VIS_SW64(r) * ((int32_t)s.VIS_SW64(r) >> 8);       \
> @@ -262,16 +257,16 @@ void helper_fmuld8sux16(CPUState *env)
>     PMUL(0);
>  #undef PMUL
>
> -    DT0 = d.d;
> +    return d.ll;
>  }
>
> -void helper_fmuld8ulx16(CPUState *env)
> +uint64_t helper_fmuld8ulx16(CPUState *env, uint64_t src1, uint64_t src2)
>  {
>     VIS64 s, d;
>     uint32_t tmp;
>
> -    s.d = DT0;
> -    d.d = DT1;
> +    s.ll = src1;
> +    d.ll = src2;
>
>  #define PMUL(r)                                                         \
>     tmp = (int32_t)d.VIS_SW64(r) * ((uint32_t)s.VIS_B64(r * 2));        \
> @@ -285,38 +280,38 @@ void helper_fmuld8ulx16(CPUState *env)
>     PMUL(0);
>  #undef PMUL
>
> -    DT0 = d.d;
> +    return d.ll;
>  }
>
> -void helper_fexpand(CPUState *env)
> +uint64_t helper_fexpand(CPUState *env, uint64_t src1, uint64_t src2)
>  {
>     VIS32 s;
>     VIS64 d;
>
> -    s.l = (uint32_t)(*(uint64_t *)&DT0 & 0xffffffff);
> -    d.d = DT1;
> +    s.l = (uint32_t)src1;
> +    d.ll = src2;
>     d.VIS_W64(0) = s.VIS_B32(0) << 4;
>     d.VIS_W64(1) = s.VIS_B32(1) << 4;
>     d.VIS_W64(2) = s.VIS_B32(2) << 4;
>     d.VIS_W64(3) = s.VIS_B32(3) << 4;
>
> -    DT0 = d.d;
> +    return d.ll;
>  }
>
>  #define VIS_HELPER(name, F)                             \
> -    void name##16(CPUState *env)                        \
> +    uint64_t name##16(CPUState *env, uint64_t src1, uint64_t src2) \
>     {                                                   \
>         VIS64 s, d;                                     \
>                                                         \
> -        s.d = DT0;                                      \
> -        d.d = DT1;                                      \
> +        s.ll = src1;                                    \
> +        d.ll = src2;                                    \
>                                                         \
>         d.VIS_W64(0) = F(d.VIS_W64(0), s.VIS_W64(0));   \
>         d.VIS_W64(1) = F(d.VIS_W64(1), s.VIS_W64(1));   \
>         d.VIS_W64(2) = F(d.VIS_W64(2), s.VIS_W64(2));   \
>         d.VIS_W64(3) = F(d.VIS_W64(3), s.VIS_W64(3));   \
>                                                         \
> -        DT0 = d.d;                                      \
> +        return d.ll;                                    \
>     }                                                   \
>                                                         \
>     uint32_t name##16s(CPUState *env, uint32_t src1,    \
> @@ -333,17 +328,17 @@ void helper_fexpand(CPUState *env)
>         return d.l;                                     \
>     }                                                   \
>                                                         \
> -    void name##32(CPUState *env)                        \
> +    uint64_t name##32(CPUState *env, uint64_t src1, uint64_t src2) \
>     {                                                   \
>         VIS64 s, d;                                     \
>                                                         \
> -        s.d = DT0;                                      \
> -        d.d = DT1;                                      \
> +        s.ll = src1;                                    \
> +        d.ll = src2;                                    \
>                                                         \
>         d.VIS_L64(0) = F(d.VIS_L64(0), s.VIS_L64(0));   \
>         d.VIS_L64(1) = F(d.VIS_L64(1), s.VIS_L64(1));   \
>                                                         \
> -        DT0 = d.d;                                      \
> +        return d.ll;                                    \
>     }                                                   \
>                                                         \
>     uint32_t name##32s(CPUState *env, uint32_t src1,    \
> @@ -365,12 +360,12 @@ VIS_HELPER(helper_fpadd, FADD)
>  VIS_HELPER(helper_fpsub, FSUB)
>
>  #define VIS_CMPHELPER(name, F)                                    \
> -    uint64_t name##16(CPUState *env)                              \
> +    uint64_t name##16(CPUState *env, uint64_t src1, uint64_t src2) \
>     {                                                             \
>         VIS64 s, d;                                               \
>                                                                   \
> -        s.d = DT0;                                                \
> -        d.d = DT1;                                                \
> +        s.ll = src1;                                              \
> +        d.ll = src2;                                              \
>                                                                   \
>         d.VIS_W64(0) = F(s.VIS_W64(0), d.VIS_W64(0)) ? 1 : 0;     \
>         d.VIS_W64(0) |= F(s.VIS_W64(1), d.VIS_W64(1)) ? 2 : 0;    \
> @@ -381,12 +376,12 @@ VIS_HELPER(helper_fpsub, FSUB)
>         return d.ll;                                              \
>     }                                                             \
>                                                                   \
> -    uint64_t name##32(CPUState *env)                              \
> +    uint64_t name##32(CPUState *env, uint64_t src1, uint64_t src2) \
>     {                                                             \
>         VIS64 s, d;                                               \
>                                                                   \
> -        s.d = DT0;                                                \
> -        d.d = DT1;                                                \
> +        s.ll = src1;                                              \
> +        d.ll = src2;                                              \
>                                                                   \
>         d.VIS_L64(0) = F(s.VIS_L64(0), d.VIS_L64(0)) ? 1 : 0;     \
>         d.VIS_L64(0) |= F(s.VIS_L64(1), d.VIS_L64(1)) ? 2 : 0;    \
> --
> 1.7.6.4
>
>
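
For anyone who wants to poke at the reworked helper_faligndata outside
of QEMU, here is a minimal standalone sketch. It assumes the GSR value
is passed in directly rather than read from env->gsr; the name
faligndata and its gsr parameter are made up for illustration and are
not part of the patch. The guard matters because shifting a 64-bit
value by 64 is undefined behaviour in C.

```c
#include <stdint.h>

/* Hypothetical standalone variant of the helper: gsr is an explicit
   argument here instead of a field of CPUState. */
uint64_t faligndata(uint64_t gsr, uint64_t src1, uint64_t src2)
{
    /* The low three bits of GSR give the byte offset (0..7). */
    unsigned shift = (unsigned)(gsr & 7) * 8;
    uint64_t tmp = src1 << shift;

    /* With offset 0 the right shift would be by 64, which is undefined
       for a 64-bit type, so src2 is only merged for non-zero offsets. */
    if (shift != 0) {
        tmp |= src2 >> (64 - shift);
    }
    return tmp;
}
```

With src1 = 0x0011223344556677 and src2 = 0x8899aabbccddeeff, an offset
of 3 yields 0x33445566778899aa: the low five bytes of src1 followed by
the top three bytes of src2.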

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH 04/21] target-sparc: Pass float64 parameters instead of dt0/1 temporaries.
  2011-10-18 20:04   ` Blue Swirl
@ 2011-10-18 20:07     ` Richard Henderson
  0 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 20:07 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

On 10/18/2011 01:04 PM, Blue Swirl wrote:
>> > -    F_HELPER(name, d)                                           \
>> > +    float64 helper_f ## name ## d (CPUState * env, float64 src1,\
>> > +                                   float64 src2)                \
>> >     {                                                           \
>> > -        DT0 = float64_ ## name (DT0, DT1, &env->fp_status);     \
>> > +        return float64_ ## name (src1, src2, &env->fp_status);  \
>> >     }                                                           \
> Could we call float64_##name() directly from generated code and avoid
> the wrapper? Translator could generate &env->fp_status and in other
> cases that could be passed around instead of env.
> 

The helper.h machinery isn't set up for that.  Also, see patch 11.
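
To make the wrapper shape concrete, here is a rough standalone sketch
of the pattern in the quoted hunk. Everything except the macro body is
a stand-in: the float64, float_status and CPUState definitions and the
float64_add/float64_mul mocks below are simplified assumptions (they
use host doubles), not the real QEMU or softfloat code, and F_BINOP is
a made-up macro name.

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the QEMU types (assumptions, not the real
   definitions). */
typedef uint64_t float64;
typedef struct { int exception_flags; } float_status;
typedef struct { float_status fp_status; } CPUState;

/* Bit-cast helpers for the mock arithmetic below. */
static double f64_to_host(float64 a)
{
    double d;
    memcpy(&d, &a, sizeof(d));
    return d;
}

static float64 host_to_f64(double d)
{
    float64 a;
    memcpy(&a, &d, sizeof(a));
    return a;
}

/* Mock primitives using host arithmetic; the status argument is
   ignored here, unlike in a real soft-float implementation. */
static float64 float64_add(float64 a, float64 b, float_status *st)
{
    (void)st;
    return host_to_f64(f64_to_host(a) + f64_to_host(b));
}

static float64 float64_mul(float64 a, float64 b, float_status *st)
{
    (void)st;
    return host_to_f64(f64_to_host(a) * f64_to_host(b));
}

/* Same shape as the quoted hunk: the helper receives env plus the two
   operands by value and forwards &env->fp_status to the primitive. */
#define F_BINOP(name)                                               \
    float64 helper_f ## name ## d(CPUState *env, float64 src1,      \
                                  float64 src2)                     \
    {                                                               \
        return float64_ ## name(src1, src2, &env->fp_status);       \
    }

F_BINOP(add)
F_BINOP(mul)
```

The wrapper is what turns the env pointer that generated code already
has into the fp_status pointer that the arithmetic routine expects.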


r~

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements
  2011-10-18 20:03   ` Richard Henderson
@ 2011-10-18 20:19     ` Blue Swirl
  0 siblings, 0 replies; 37+ messages in thread
From: Blue Swirl @ 2011-10-18 20:19 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Tue, Oct 18, 2011 at 8:03 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 10/18/2011 12:50 PM, Blue Swirl wrote:
>> Thanks. Unfortunately I'm not sure I'll be able to fix and push my
>> series before the freeze, because other than x86_64, none of the TCG
>> targets implement AREG0 free mode.
>
> Oh, I see, you've not actually left the "normal" load/store helpers
> in op_helper.c, and so the backends need fixing up for that feature.
>
> Can we leave only __ld/st_mmu in op_helper.c for now, so that we
> don't rely on changes to the other backends for 1.0?  It seems like
> all the other cleanups will Just Work, and there shouldn't be many
> conflicts otherwise.

You mean pushing all except the last one? That may be a nice
compromise. Could you review the patches?

> IIRC you measured both changes as in the noise of measurement.

Yes, but I'm not very happy with that result. On a laptop the CPU speed
varies and various SMM services can kick in, so the measurements are
noisy.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [Qemu-devel] [PATCH 06/21] target-sparc: Extract common code for floating-point operations.
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 06/21] target-sparc: Extract common code for floating-point operations Richard Henderson
@ 2011-10-18 20:24   ` Blue Swirl
  2011-10-18 22:21     ` Richard Henderson
  0 siblings, 1 reply; 37+ messages in thread
From: Blue Swirl @ 2011-10-18 20:24 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Tue, Oct 18, 2011 at 6:50 PM, Richard Henderson <rth@twiddle.net> wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-sparc/fop_helper.c |    2 +-
>  target-sparc/helper.h     |    4 +-
>  target-sparc/translate.c  |  840 +++++++++++++++++++++------------------------
>  3 files changed, 389 insertions(+), 457 deletions(-)
>
> diff --git a/target-sparc/fop_helper.c b/target-sparc/fop_helper.c
> index f6348c2..e652021 100644
> --- a/target-sparc/fop_helper.c
> +++ b/target-sparc/fop_helper.c
> @@ -182,7 +182,7 @@ float32 helper_fabss(float32 src)
>  }
>
>  #ifdef TARGET_SPARC64
> -float64 helper_fabsd(CPUState *env, float64 src)
> +float64 helper_fabsd(float64 src)

This should probably go into the previous patch.

>  {
>     return float64_abs(src);
>  }
> diff --git a/target-sparc/helper.h b/target-sparc/helper.h
> index 9c15b8a..df367a4 100644
> --- a/target-sparc/helper.h
> +++ b/target-sparc/helper.h
> @@ -48,7 +48,7 @@ DEF_HELPER_5(st_asi, void, env, tl, i64, int, int)
>  DEF_HELPER_2(ldfsr, void, env, i32)
>  DEF_HELPER_1(check_ieee_exceptions, void, env)
>  DEF_HELPER_1(clear_float_exceptions, void, env)
> -DEF_HELPER_1(fabss, f32, f32)
> +DEF_HELPER_FLAGS_1(fabss, TCG_CALL_CONST | TCG_CALL_PURE, f32, f32)
>  DEF_HELPER_2(fsqrts, f32, env, f32)
>  DEF_HELPER_2(fsqrtd, f64, env, f64)
>  DEF_HELPER_3(fcmps, void, env, f32, f32)
> @@ -60,7 +60,7 @@ DEF_HELPER_1(fcmpq, void, env)
>  DEF_HELPER_1(fcmpeq, void, env)
>  #ifdef TARGET_SPARC64
>  DEF_HELPER_2(ldxfsr, void, env, i64)
> -DEF_HELPER_2(fabsd, f64, env, f64)
> +DEF_HELPER_FLAGS_1(fabsd, TCG_CALL_CONST | TCG_CALL_PURE, f64, f64)
>  DEF_HELPER_3(fcmps_fcc1, void, env, f32, f32)
>  DEF_HELPER_3(fcmps_fcc2, void, env, f32, f32)
>  DEF_HELPER_3(fcmps_fcc3, void, env, f32, f32)
> diff --git a/target-sparc/translate.c b/target-sparc/translate.c
> index 5c70870..c47a035 100644
> --- a/target-sparc/translate.c
> +++ b/target-sparc/translate.c
> @@ -24,6 +24,11 @@
>  #include <string.h>
>  #include <inttypes.h>
>
> +/* Turn off the stupid always-inline hack in osdep.h.  This gets in the
> +   way of the callback mechanisms we use in this file, generating warnings
> +   for always-inline functions called indirectly.  */
> +#define always_inline inline

It would be better to just delete the offending (or all) inlines.
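
As a rough illustration of what I mean: the callback pattern only
needs plain functions whose address can be taken, so no inline
attribute is required on them. The sketch below is a toy model, not
translate.c; the op codes, the trace array and all function names are
invented for the example. It also shows the two generator shapes this
patch introduces, one that brackets the callback with exception
handling and one that does not.

```c
#include <stddef.h>

/* Invented op codes standing in for emitted TCG operations. */
enum op {
    OP_CLEAR_EXC,
    OP_CHECK_EXC,
    OP_FSQRTS,
    OP_FNEGS
};

/* Toy trace of what has been "generated" so far. */
enum { MAX_OPS = 8 };
static enum op ops[MAX_OPS];
static size_t n_ops;

void emit(enum op o)
{
    if (n_ops < MAX_OPS) {
        ops[n_ops++] = o;
    }
}

/* Plain functions, no inline attribute: they are only ever reached
   through a function pointer, so inlining hints buy nothing. */
void gen_fsqrts(void) { emit(OP_FSQRTS); }
void gen_fnegs(void)  { emit(OP_FNEGS); }

/* Shape of the exception-checking generators: clear, run the
   callback, then check. */
void gen_fop(void (*gen)(void))
{
    emit(OP_CLEAR_EXC);
    gen();
    emit(OP_CHECK_EXC);
}

/* Shape of the "ne" generators: run the callback only. */
void gen_ne_fop(void (*gen)(void))
{
    gen();
}
```

Calling gen_fop(gen_fsqrts) followed by gen_ne_fop(gen_fnegs) records
four ops: the clear, the sqrt, the check, and then the bare neg.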

> +
>  #include "cpu.h"
>  #include "disas.h"
>  #include "helper.h"
> @@ -1627,6 +1632,305 @@ static inline void gen_clear_float_exceptions(void)
>     gen_helper_clear_float_exceptions(cpu_env);
>  }
>
> +static void gen_fop_FF(DisasContext *dc, int rd, int rs,
> +                       void (*gen)(TCGv_i32, TCGv_ptr, TCGv_i32))
> +{
> +    TCGv_i32 dst, src;
> +
> +    gen_clear_float_exceptions();
> +    src = gen_load_fpr_F(dc, rs);
> +    dst = gen_dest_fpr_F();
> +
> +    gen(dst, cpu_env, src);
> +
> +    gen_helper_check_ieee_exceptions(cpu_env);
> +    gen_store_fpr_F(dc, rd, dst);
> +}
> +
> +static void gen_ne_fop_FF(DisasContext *dc, int rd, int rs,

Does 'ne' stand for 'no exception'? How about 'noexcp' or something similar?

> +                          void (*gen)(TCGv_i32, TCGv_i32))
> +{
> +    TCGv_i32 dst, src;
> +
> +    src = gen_load_fpr_F(dc, rs);
> +    dst = gen_dest_fpr_F();
> +
> +    gen(dst, src);
> +
> +    gen_store_fpr_F(dc, rd, dst);
> +}
> +
> +static void gen_fop_FFF(DisasContext *dc, int rd, int rs1, int rs2,
> +                        void (*gen)(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32))
> +{
> +    TCGv_i32 dst, src1, src2;
> +
> +    gen_clear_float_exceptions();
> +    src1 = gen_load_fpr_F(dc, rs1);
> +    src2 = gen_load_fpr_F(dc, rs2);
> +    dst = gen_dest_fpr_F();
> +
> +    gen(dst, cpu_env, src1, src2);
> +
> +    gen_helper_check_ieee_exceptions(cpu_env);
> +    gen_store_fpr_F(dc, rd, dst);
> +}
> +
> +#ifdef TARGET_SPARC64
> +static void gen_ne_fop_FFF(DisasContext *dc, int rd, int rs1, int rs2,
> +                           void (*gen)(TCGv_i32, TCGv_i32, TCGv_i32))
> +{
> +    TCGv_i32 dst, src1, src2;
> +
> +    src1 = gen_load_fpr_F(dc, rs1);
> +    src2 = gen_load_fpr_F(dc, rs2);
> +    dst = gen_dest_fpr_F();
> +
> +    gen(dst, src1, src2);
> +
> +    gen_store_fpr_F(dc, rd, dst);
> +}
> +#endif
> +
> +static void gen_fop_DD(DisasContext *dc, int rd, int rs,
> +                       void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i64))
> +{
> +    TCGv_i64 dst, src;
> +
> +    gen_clear_float_exceptions();
> +    src = gen_load_fpr_D(dc, rs);
> +    dst = gen_dest_fpr_D();
> +
> +    gen(dst, cpu_env, src);
> +
> +    gen_helper_check_ieee_exceptions(cpu_env);
> +    gen_store_fpr_D(dc, rd, dst);
> +}
> +
> +#ifdef TARGET_SPARC64
> +static void gen_ne_fop_DD(DisasContext *dc, int rd, int rs,
> +                          void (*gen)(TCGv_i64, TCGv_i64))
> +{
> +    TCGv_i64 dst, src;
> +
> +    src = gen_load_fpr_D(dc, rs);
> +    dst = gen_dest_fpr_D();
> +
> +    gen(dst, src);
> +
> +    gen_store_fpr_D(dc, rd, dst);
> +}
> +#endif
> +
> +static void gen_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
> +                        void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i64, TCGv_i64))
> +{
> +    TCGv_i64 dst, src1, src2;
> +
> +    gen_clear_float_exceptions();
> +    src1 = gen_load_fpr_D(dc, rs1);
> +    src2 = gen_load_fpr_D(dc, rs2);
> +    dst = gen_dest_fpr_D();
> +
> +    gen(dst, cpu_env, src1, src2);
> +
> +    gen_helper_check_ieee_exceptions(cpu_env);
> +    gen_store_fpr_D(dc, rd, dst);
> +}
> +
> +#ifdef TARGET_SPARC64
> +static void gen_ne_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
> +                           void (*gen)(TCGv_i64, TCGv_i64, TCGv_i64))
> +{
> +    TCGv_i64 dst, src1, src2;
> +
> +    src1 = gen_load_fpr_D(dc, rs1);
> +    src2 = gen_load_fpr_D(dc, rs2);
> +    dst = gen_dest_fpr_D();
> +
> +    gen(dst, src1, src2);
> +
> +    gen_store_fpr_D(dc, rd, dst);
> +}
> +#endif
> +
> +static void gen_fop_QQ(DisasContext *dc, int rd, int rs,
> +                       void (*gen)(TCGv_ptr))
> +{
> +    gen_clear_float_exceptions();
> +    gen_op_load_fpr_QT1(QFPREG(rs));
> +
> +    gen(cpu_env);
> +
> +    gen_helper_check_ieee_exceptions(cpu_env);
> +    gen_op_store_QT0_fpr(QFPREG(rd));
> +    gen_update_fprs_dirty(QFPREG(rd));
> +}
> +
> +#ifdef TARGET_SPARC64
> +static void gen_ne_fop_QQ(DisasContext *dc, int rd, int rs,
> +                          void (*gen)(TCGv_ptr))
> +{
> +    gen_op_load_fpr_QT1(QFPREG(rs));
> +
> +    gen(cpu_env);
> +
> +    gen_op_store_QT0_fpr(QFPREG(rd));
> +    gen_update_fprs_dirty(QFPREG(rd));
> +}
> +#endif
> +
> +static void gen_fop_QQQ(DisasContext *dc, int rd, int rs1, int rs2,
> +                        void (*gen)(TCGv_ptr))
> +{
> +    gen_clear_float_exceptions();
> +    gen_op_load_fpr_QT0(QFPREG(rs1));
> +    gen_op_load_fpr_QT1(QFPREG(rs2));
> +
> +    gen(cpu_env);
> +
> +    gen_helper_check_ieee_exceptions(cpu_env);
> +    gen_op_store_QT0_fpr(QFPREG(rd));
> +    gen_update_fprs_dirty(QFPREG(rd));
> +}
> +
> +static void gen_fop_DFF(DisasContext *dc, int rd, int rs1, int rs2,
> +                        void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i32, TCGv_i32))
> +{
> +    TCGv_i64 dst;
> +    TCGv_i32 src1, src2;
> +
> +    gen_clear_float_exceptions();
> +    src1 = gen_load_fpr_F(dc, rs1);
> +    src2 = gen_load_fpr_F(dc, rs2);
> +    dst = gen_dest_fpr_D();
> +
> +    gen(dst, cpu_env, src1, src2);
> +
> +    gen_helper_check_ieee_exceptions(cpu_env);
> +    gen_store_fpr_D(dc, rd, dst);
> +}
> +
> +static void gen_fop_QDD(DisasContext *dc, int rd, int rs1, int rs2,
> +                        void (*gen)(TCGv_ptr, TCGv_i64, TCGv_i64))
> +{
> +    TCGv_i64 src1, src2;
> +
> +    gen_clear_float_exceptions();
> +    src1 = gen_load_fpr_D(dc, rs1);
> +    src2 = gen_load_fpr_D(dc, rs2);
> +
> +    gen(cpu_env, src1, src2);
> +
> +    gen_helper_check_ieee_exceptions(cpu_env);
> +    gen_op_store_QT0_fpr(QFPREG(rd));
> +    gen_update_fprs_dirty(QFPREG(rd));
> +}
> +
> +#ifdef TARGET_SPARC64
> +static void gen_fop_DF(DisasContext *dc, int rd, int rs,
> +                        void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i32))
> +{
> +    TCGv_i64 dst;
> +    TCGv_i32 src;
> +
> +    gen_clear_float_exceptions();
> +    src = gen_load_fpr_F(dc, rs);
> +    dst = gen_dest_fpr_D();
> +
> +    gen(dst, cpu_env, src);
> +
> +    gen_helper_check_ieee_exceptions(cpu_env);
> +    gen_store_fpr_D(dc, rd, dst);
> +}
> +#endif
> +
> +static void gen_ne_fop_DF(DisasContext *dc, int rd, int rs,
> +                          void (*gen)(TCGv_i64, TCGv_ptr, TCGv_i32))
> +{
> +    TCGv_i64 dst;
> +    TCGv_i32 src;
> +
> +    src = gen_load_fpr_F(dc, rs);
> +    dst = gen_dest_fpr_D();
> +
> +    gen(dst, cpu_env, src);
> +
> +    gen_store_fpr_D(dc, rd, dst);
> +}
> +
> +static void gen_fop_FD(DisasContext *dc, int rd, int rs,
> +                        void (*gen)(TCGv_i32, TCGv_ptr, TCGv_i64))
> +{
> +    TCGv_i32 dst;
> +    TCGv_i64 src;
> +
> +    gen_clear_float_exceptions();
> +    src = gen_load_fpr_D(dc, rs);
> +    dst = gen_dest_fpr_F();
> +
> +    gen(dst, cpu_env, src);
> +
> +    gen_helper_check_ieee_exceptions(cpu_env);
> +    gen_store_fpr_F(dc, rd, dst);
> +}
> +
> +static void gen_fop_FQ(DisasContext *dc, int rd, int rs,
> +                        void (*gen)(TCGv_i32, TCGv_ptr))
> +{
> +    TCGv_i32 dst;
> +
> +    gen_clear_float_exceptions();
> +    gen_op_load_fpr_QT1(QFPREG(rs));
> +    dst = gen_dest_fpr_F();
> +
> +    gen(dst, cpu_env);
> +
> +    gen_helper_check_ieee_exceptions(cpu_env);
> +    gen_store_fpr_F(dc, rd, dst);
> +}
> +
> +static void gen_fop_DQ(DisasContext *dc, int rd, int rs,
> +                        void (*gen)(TCGv_i64, TCGv_ptr))
> +{
> +    TCGv_i64 dst;
> +
> +    gen_clear_float_exceptions();
> +    gen_op_load_fpr_QT1(QFPREG(rs));
> +    dst = gen_dest_fpr_D();
> +
> +    gen(dst, cpu_env);
> +
> +    gen_helper_check_ieee_exceptions(cpu_env);
> +    gen_store_fpr_D(dc, rd, dst);
> +}
> +
> +static void gen_ne_fop_QF(DisasContext *dc, int rd, int rs,
> +                          void (*gen)(TCGv_ptr, TCGv_i32))
> +{
> +    TCGv_i32 src;
> +
> +    src = gen_load_fpr_F(dc, rs);
> +
> +    gen(cpu_env, src);
> +
> +    gen_op_store_QT0_fpr(QFPREG(rd));
> +    gen_update_fprs_dirty(QFPREG(rd));
> +}
> +
> +static void gen_ne_fop_QD(DisasContext *dc, int rd, int rs,
> +                          void (*gen)(TCGv_ptr, TCGv_i64))
> +{
> +    TCGv_i64 src;
> +
> +    src = gen_load_fpr_D(dc, rs);
> +
> +    gen(cpu_env, src);
> +
> +    gen_op_store_QT0_fpr(QFPREG(rd));
> +    gen_update_fprs_dirty(QFPREG(rd));
> +}
> +
>  /* asi moves */
>  #ifdef TARGET_SPARC64
>  static inline TCGv_i32 gen_get_asi(int insn, TCGv r_addr)
> @@ -2415,279 +2719,115 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     gen_store_fpr_F(dc, rd, cpu_src1_32);
>                     break;
>                 case 0x5: /* fnegs */
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fnegs(cpu_dst_32, cpu_src1_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FF(dc, rd, rs2, gen_helper_fnegs);
>                     break;
>                 case 0x9: /* fabss */
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fabss(cpu_dst_32, cpu_src1_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FF(dc, rd, rs2, gen_helper_fabss);
>                     break;
>                 case 0x29: /* fsqrts */
>                     CHECK_FPU_FEATURE(dc, FSQRT);
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fsqrts(cpu_dst_32, cpu_env, cpu_src1_32);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_fop_FF(dc, rd, rs2, gen_helper_fsqrts);
>                     break;
>                 case 0x2a: /* fsqrtd */
>                     CHECK_FPU_FEATURE(dc, FSQRT);
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fsqrtd(cpu_dst_64, cpu_env, cpu_src1_64);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_fop_DD(dc, rd, rs2, gen_helper_fsqrtd);
>                     break;
>                 case 0x2b: /* fsqrtq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_op_load_fpr_QT1(QFPREG(rs2));
> -                    gen_clear_float_exceptions();
> -                    gen_helper_fsqrtq(cpu_env);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_QT0_fpr(QFPREG(rd));
> -                    gen_update_fprs_dirty(QFPREG(rd));
> +                    gen_fop_QQ(dc, rd, rs2, gen_helper_fsqrtq);
>                     break;
>                 case 0x41: /* fadds */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fadds(cpu_dst_32, cpu_env,
> -                                     cpu_src1_32, cpu_src2_32);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_fop_FFF(dc, rd, rs1, rs2, gen_helper_fadds);
>                     break;
>                 case 0x42: /* faddd */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_faddd(cpu_dst_64, cpu_env,
> -                                     cpu_src1_64, cpu_src2_64);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_fop_DDD(dc, rd, rs1, rs2, gen_helper_faddd);
>                     break;
>                 case 0x43: /* faddq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_op_load_fpr_QT0(QFPREG(rs1));
> -                    gen_op_load_fpr_QT1(QFPREG(rs2));
> -                    gen_clear_float_exceptions();
> -                    gen_helper_faddq(cpu_env);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_QT0_fpr(QFPREG(rd));
> -                    gen_update_fprs_dirty(QFPREG(rd));
> +                    gen_fop_QQQ(dc, rd, rs1, rs2, gen_helper_faddq);
>                     break;
>                 case 0x45: /* fsubs */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fsubs(cpu_dst_32, cpu_env,
> -                                     cpu_src1_32, cpu_src2_32);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_fop_FFF(dc, rd, rs1, rs2, gen_helper_fsubs);
>                     break;
>                 case 0x46: /* fsubd */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fsubd(cpu_dst_64, cpu_env,
> -                                     cpu_src1_64, cpu_src2_64);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_fop_DDD(dc, rd, rs1, rs2, gen_helper_fsubd);
>                     break;
>                 case 0x47: /* fsubq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_op_load_fpr_QT0(QFPREG(rs1));
> -                    gen_op_load_fpr_QT1(QFPREG(rs2));
> -                    gen_clear_float_exceptions();
> -                    gen_helper_fsubq(cpu_env);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_QT0_fpr(QFPREG(rd));
> -                    gen_update_fprs_dirty(QFPREG(rd));
> +                    gen_fop_QQQ(dc, rd, rs1, rs2, gen_helper_fsubq);
>                     break;
>                 case 0x49: /* fmuls */
>                     CHECK_FPU_FEATURE(dc, FMUL);
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fmuls(cpu_dst_32, cpu_env,
> -                                     cpu_src1_32, cpu_src2_32);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_fop_FFF(dc, rd, rs1, rs2, gen_helper_fmuls);
>                     break;
>                 case 0x4a: /* fmuld */
>                     CHECK_FPU_FEATURE(dc, FMUL);
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fmuld(cpu_dst_64, cpu_env,
> -                                     cpu_src1_64, cpu_src2_64);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmuld);
>                     break;
>                 case 0x4b: /* fmulq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
>                     CHECK_FPU_FEATURE(dc, FMUL);
> -                    gen_op_load_fpr_QT0(QFPREG(rs1));
> -                    gen_op_load_fpr_QT1(QFPREG(rs2));
> -                    gen_clear_float_exceptions();
> -                    gen_helper_fmulq(cpu_env);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_QT0_fpr(QFPREG(rd));
> -                    gen_update_fprs_dirty(QFPREG(rd));
> +                    gen_fop_QQQ(dc, rd, rs1, rs2, gen_helper_fmulq);
>                     break;
>                 case 0x4d: /* fdivs */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fdivs(cpu_dst_32, cpu_env,
> -                                     cpu_src1_32, cpu_src2_32);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_fop_FFF(dc, rd, rs1, rs2, gen_helper_fdivs);
>                     break;
>                 case 0x4e: /* fdivd */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fdivd(cpu_dst_64, cpu_env,
> -                                     cpu_src1_64, cpu_src2_64);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_fop_DDD(dc, rd, rs1, rs2, gen_helper_fdivd);
>                     break;
>                 case 0x4f: /* fdivq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_op_load_fpr_QT0(QFPREG(rs1));
> -                    gen_op_load_fpr_QT1(QFPREG(rs2));
> -                    gen_clear_float_exceptions();
> -                    gen_helper_fdivq(cpu_env);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_QT0_fpr(QFPREG(rd));
> -                    gen_update_fprs_dirty(QFPREG(rd));
> +                    gen_fop_QQQ(dc, rd, rs1, rs2, gen_helper_fdivq);
>                     break;
>                 case 0x69: /* fsmuld */
>                     CHECK_FPU_FEATURE(dc, FSMULD);
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fsmuld(cpu_dst_64, cpu_env,
> -                                      cpu_src1_32, cpu_src2_32);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_fop_DFF(dc, rd, rs1, rs2, gen_helper_fsmuld);
>                     break;
>                 case 0x6e: /* fdmulq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    gen_helper_fdmulq(cpu_env, cpu_src1_64, cpu_src2_64);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_QT0_fpr(QFPREG(rd));
> -                    gen_update_fprs_dirty(QFPREG(rd));
> +                    gen_fop_QDD(dc, rd, rs1, rs2, gen_helper_fdmulq);
>                     break;
>                 case 0xc4: /* fitos */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fitos(cpu_dst_32, cpu_env, cpu_src1_32);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_fop_FF(dc, rd, rs2, gen_helper_fitos);
>                     break;
>                 case 0xc6: /* fdtos */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fdtos(cpu_dst_32, cpu_env, cpu_src1_64);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_fop_FD(dc, rd, rs2, gen_helper_fdtos);
>                     break;
>                 case 0xc7: /* fqtos */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_op_load_fpr_QT1(QFPREG(rs2));
> -                    gen_clear_float_exceptions();
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fqtos(cpu_dst_32, cpu_env);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_fop_FQ(dc, rd, rs2, gen_helper_fqtos);
>                     break;
>                 case 0xc8: /* fitod */
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fitod(cpu_dst_64, cpu_env, cpu_src1_32);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DF(dc, rd, rs2, gen_helper_fitod);
>                     break;
>                 case 0xc9: /* fstod */
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fstod(cpu_dst_64, cpu_env, cpu_src1_32);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DF(dc, rd, rs2, gen_helper_fstod);
>                     break;
>                 case 0xcb: /* fqtod */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_clear_float_exceptions();
> -                    gen_op_load_fpr_QT1(QFPREG(rs2));
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fqtod(cpu_dst_64, cpu_env);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_fop_DQ(dc, rd, rs2, gen_helper_fqtod);
>                     break;
>                 case 0xcc: /* fitoq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    gen_helper_fitoq(cpu_env, cpu_src1_32);
> -                    gen_op_store_QT0_fpr(QFPREG(rd));
> -                    gen_update_fprs_dirty(QFPREG(rd));
> +                    gen_ne_fop_QF(dc, rd, rs2, gen_helper_fitoq);
>                     break;
>                 case 0xcd: /* fstoq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    gen_helper_fstoq(cpu_env, cpu_src1_32);
> -                    gen_op_store_QT0_fpr(QFPREG(rd));
> -                    gen_update_fprs_dirty(QFPREG(rd));
> +                    gen_ne_fop_QF(dc, rd, rs2, gen_helper_fstoq);
>                     break;
>                 case 0xce: /* fdtoq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> -                    gen_helper_fdtoq(cpu_env, cpu_src1_64);
> -                    gen_op_store_QT0_fpr(QFPREG(rd));
> -                    gen_update_fprs_dirty(QFPREG(rd));
> +                    gen_ne_fop_QD(dc, rd, rs2, gen_helper_fdtoq);
>                     break;
>                 case 0xd1: /* fstoi */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fstoi(cpu_dst_32, cpu_env, cpu_src1_32);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_fop_FF(dc, rd, rs2, gen_helper_fstoi);
>                     break;
>                 case 0xd2: /* fdtoi */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fdtoi(cpu_dst_32, cpu_env, cpu_src1_64);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_fop_FD(dc, rd, rs2, gen_helper_fdtoi);
>                     break;
>                 case 0xd3: /* fqtoi */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_op_load_fpr_QT1(QFPREG(rs2));
> -                    gen_clear_float_exceptions();
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fqtoi(cpu_dst_32, cpu_env);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_fop_FQ(dc, rd, rs2, gen_helper_fqtoi);
>                     break;
>  #ifdef TARGET_SPARC64
>                 case 0x2: /* V9 fmovd */
> @@ -2707,80 +2847,38 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     gen_update_fprs_dirty(QFPREG(rd));
>                     break;
>                 case 0x6: /* V9 fnegd */
> -                    cpu_src1_64 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fnegd(cpu_dst_64, cpu_src1_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DD(dc, rd, rs2, gen_helper_fnegd);
>                     break;
>                 case 0x7: /* V9 fnegq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_op_load_fpr_QT1(QFPREG(rs2));
> -                    gen_helper_fnegq(cpu_env);
> -                    gen_op_store_QT0_fpr(QFPREG(rd));
> -                    gen_update_fprs_dirty(QFPREG(rd));
> +                    gen_ne_fop_QQ(dc, rd, rs2, gen_helper_fnegq);
>                     break;
>                 case 0xa: /* V9 fabsd */
> -                    cpu_src1_64 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fabsd(cpu_dst_64, cpu_env, cpu_src1_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DD(dc, rd, rs2, gen_helper_fabsd);
>                     break;
>                 case 0xb: /* V9 fabsq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_op_load_fpr_QT1(QFPREG(rs2));
> -                    gen_helper_fabsq(cpu_env);
> -                    gen_op_store_QT0_fpr(QFPREG(rd));
> -                    gen_update_fprs_dirty(QFPREG(rd));
> +                    gen_ne_fop_QQ(dc, rd, rs2, gen_helper_fabsq);
>                     break;
>                 case 0x81: /* V9 fstox */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fstox(cpu_dst_64, cpu_env, cpu_src1_32);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_fop_DF(dc, rd, rs2, gen_helper_fstox);
>                     break;
>                 case 0x82: /* V9 fdtox */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fdtox(cpu_dst_64, cpu_env, cpu_src1_64);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_fop_DD(dc, rd, rs2, gen_helper_fdtox);
>                     break;
>                 case 0x83: /* V9 fqtox */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_op_load_fpr_QT1(QFPREG(rs2));
> -                    gen_clear_float_exceptions();
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fqtox(cpu_dst_64, cpu_env);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_fop_DQ(dc, rd, rs2, gen_helper_fqtox);
>                     break;
>                 case 0x84: /* V9 fxtos */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fxtos(cpu_dst_32, cpu_env, cpu_src1_64);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_fop_FD(dc, rd, rs2, gen_helper_fxtos);
>                     break;
>                 case 0x88: /* V9 fxtod */
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fxtod(cpu_dst_64, cpu_env, cpu_src1_64);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_fop_DD(dc, rd, rs2, gen_helper_fxtod);
>                     break;
>                 case 0x8c: /* V9 fxtoq */
>                     CHECK_FPU_FEATURE(dc, FLOAT128);
> -                    gen_clear_float_exceptions();
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> -                    gen_helper_fxtoq(cpu_env, cpu_src1_64);
> -                    gen_helper_check_ieee_exceptions(cpu_env);
> -                    gen_op_store_QT0_fpr(QFPREG(rd));
> -                    gen_update_fprs_dirty(QFPREG(rd));
> +                    gen_ne_fop_QD(dc, rd, rs2, gen_helper_fxtoq);
>                     break;
>  #endif
>                 default:
> @@ -3990,65 +4088,31 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     break;
>                 case 0x031: /* VIS I fmul8x16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fmul8x16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8x16);
>                     break;
>                 case 0x033: /* VIS I fmul8x16au */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fmul8x16au(cpu_dst_64, cpu_src1_64,
> -                                          cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8x16au);
>                     break;
>                 case 0x035: /* VIS I fmul8x16al */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fmul8x16al(cpu_dst_64, cpu_src1_64,
> -                                          cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8x16al);
>                     break;
>                 case 0x036: /* VIS I fmul8sux16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fmul8sux16(cpu_dst_64, cpu_src1_64,
> -                                          cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8sux16);
>                     break;
>                 case 0x037: /* VIS I fmul8ulx16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fmul8ulx16(cpu_dst_64, cpu_src1_64,
> -                                          cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8ulx16);
>                     break;
>                 case 0x038: /* VIS I fmuld8sux16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fmuld8sux16(cpu_dst_64, cpu_src1_64,
> -                                           cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmuld8sux16);
>                     break;
>                 case 0x039: /* VIS I fmuld8ulx16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fmuld8ulx16(cpu_dst_64, cpu_src1_64,
> -                                           cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmuld8ulx16);
>                     break;
>                 case 0x03a: /* VIS I fpack32 */
>                 case 0x03b: /* VIS I fpack16 */
> @@ -4067,86 +4131,46 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     break;
>                 case 0x04b: /* VIS I fpmerge */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fpmerge(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpmerge);
>                     break;
>                 case 0x04c: /* VIS II bshuffle */
>                     // XXX
>                     goto illegal_insn;
>                 case 0x04d: /* VIS I fexpand */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fexpand(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fexpand);
>                     break;
>                 case 0x050: /* VIS I fpadd16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fpadd16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpadd16);
>                     break;
>                 case 0x051: /* VIS I fpadd16s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fpadd16s(cpu_dst_32, cpu_src1_32, cpu_src2_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs1, rs2, gen_helper_fpadd16s);
>                     break;
>                 case 0x052: /* VIS I fpadd32 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fpadd32(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpadd32);
>                     break;
>                 case 0x053: /* VIS I fpadd32s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_add_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_add_i32);
>                     break;
>                 case 0x054: /* VIS I fpsub16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fpsub16(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpsub16);
>                     break;
>                 case 0x055: /* VIS I fpsub16s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    gen_helper_fpsub16s(cpu_dst_32, cpu_src1_32, cpu_src2_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs1, rs2, gen_helper_fpsub16s);
>                     break;
>                 case 0x056: /* VIS I fpsub32 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    gen_helper_fpsub32(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpsub32);
>                     break;
>                 case 0x057: /* VIS I fpsub32s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_sub_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_sub_i32);
>                     break;
>                 case 0x060: /* VIS I fzero */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> @@ -4162,143 +4186,75 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     break;
>                 case 0x062: /* VIS I fnor */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    tcg_gen_nor_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_nor_i64);
>                     break;
>                 case 0x063: /* VIS I fnors */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_nor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_nor_i32);
>                     break;
>                 case 0x064: /* VIS I fandnot2 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    tcg_gen_andc_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_andc_i64);
>                     break;
>                 case 0x065: /* VIS I fandnot2s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_andc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_andc_i32);
>                     break;
>                 case 0x066: /* VIS I fnot2 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    tcg_gen_not_i64(cpu_dst_64, cpu_src1_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DD(dc, rd, rs2, tcg_gen_not_i64);
>                     break;
>                 case 0x067: /* VIS I fnot2s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FF(dc, rd, rs2, tcg_gen_not_i32);
>                     break;
>                 case 0x068: /* VIS I fandnot1 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    tcg_gen_andc_i64(cpu_dst_64, cpu_src2_64, cpu_src1_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs2, rs1, tcg_gen_andc_i64);
>                     break;
>                 case 0x069: /* VIS I fandnot1s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_andc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs2, rs1, tcg_gen_andc_i32);
>                     break;
>                 case 0x06a: /* VIS I fnot1 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    tcg_gen_not_i64(cpu_dst_64, cpu_src1_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DD(dc, rd, rs1, tcg_gen_not_i64);
>                     break;
>                 case 0x06b: /* VIS I fnot1s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_not_i32(cpu_dst_32, cpu_src1_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FF(dc, rd, rs1, tcg_gen_not_i32);
>                     break;
>                 case 0x06c: /* VIS I fxor */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    tcg_gen_xor_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_xor_i64);
>                     break;
>                 case 0x06d: /* VIS I fxors */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_xor_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_xor_i32);
>                     break;
>                 case 0x06e: /* VIS I fnand */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    tcg_gen_nand_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_nand_i64);
>                     break;
>                 case 0x06f: /* VIS I fnands */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_nand_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_nand_i32);
>                     break;
>                 case 0x070: /* VIS I fand */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    tcg_gen_and_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_and_i64);
>                     break;
>                 case 0x071: /* VIS I fands */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_and_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_and_i32);
>                     break;
>                 case 0x072: /* VIS I fxnor */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    tcg_gen_eqv_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_eqv_i64);
>                     break;
>                 case 0x073: /* VIS I fxnors */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_eqv_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_eqv_i32);
>                     break;
>                 case 0x074: /* VIS I fsrc1 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> @@ -4312,19 +4268,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     break;
>                 case 0x076: /* VIS I fornot2 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    tcg_gen_orc_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_orc_i64);
>                     break;
>                 case 0x077: /* VIS I fornot2s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_orc_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_orc_i32);
>                     break;
>                 case 0x078: /* VIS I fsrc2 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> @@ -4338,35 +4286,19 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     break;
>                 case 0x07a: /* VIS I fornot1 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    tcg_gen_orc_i64(cpu_dst_64, cpu_src2_64, cpu_src1_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs2, rs1, tcg_gen_orc_i64);
>                     break;
>                 case 0x07b: /* VIS I fornot1s */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_orc_i32(cpu_dst_32, cpu_src2_32, cpu_src1_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs2, rs1, tcg_gen_orc_i32);
>                     break;
>                 case 0x07c: /* VIS I for */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> -                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
> -                    cpu_dst_64 = gen_dest_fpr_D();
> -                    tcg_gen_or_i64(cpu_dst_64, cpu_src1_64, cpu_src2_64);
> -                    gen_store_fpr_D(dc, rd, cpu_dst_64);
> +                    gen_ne_fop_DDD(dc, rd, rs1, rs2, tcg_gen_or_i64);
>                     break;
>                 case 0x07d: /* VIS I fors */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> -                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> -                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> -                    cpu_dst_32 = gen_dest_fpr_F();
> -                    tcg_gen_or_i32(cpu_dst_32, cpu_src1_32, cpu_src2_32);
> -                    gen_store_fpr_F(dc, rd, cpu_dst_32);
> +                    gen_ne_fop_FFF(dc, rd, rs1, rs2, tcg_gen_or_i32);
>                     break;
>                 case 0x07e: /* VIS I fone */
>                     CHECK_FPU_FEATURE(dc, VIS1);
> --
> 1.7.6.4
>
>

Excellent cleanup.
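
For readers following the operand order in the fandnot1/fornot1 cases, here is a minimal sketch of the pattern, assuming nothing about the real helpers beyond what the diff shows. It is plain C that computes values directly instead of emitting TCG ops; vis_reg, toy_fop_DDD, op_andc and op_orc are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for a two-operand generator such as
   tcg_gen_andc_i64: it computes the result directly instead of
   emitting an op. */
typedef uint64_t (*binop64_fn)(uint64_t a, uint64_t b);

static uint64_t op_andc(uint64_t a, uint64_t b) { return a & ~b; }
static uint64_t op_orc(uint64_t a, uint64_t b)  { return a | ~b; }

/* Toy register file, indexed directly by double-register number. */
static uint64_t vis_reg[32];

/* Same shape as the gen_ne_fop_DDD call sites: load both sources,
   apply the operation passed in, store the result. */
static void toy_fop_DDD(unsigned rd, unsigned rs1, unsigned rs2,
                        binop64_fn op)
{
    vis_reg[rd] = op(vis_reg[rs1], vis_reg[rs2]);
}
```

With this shape, the "not2" forms pass (rs1, rs2) and the "not1" forms pass (rs2, rs1) to the same andc/orc operation, which is what the swapped arguments in the 0x068, 0x069, 0x07a and 0x07b cases do.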


* Re: [Qemu-devel] [PATCH 09/21] target-sparc: Change fpr representation to doubles.
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 09/21] target-sparc: Change fpr representation to doubles Richard Henderson
@ 2011-10-18 20:28   ` Blue Swirl
  2011-10-18 22:25     ` Richard Henderson
  0 siblings, 1 reply; 37+ messages in thread
From: Blue Swirl @ 2011-10-18 20:28 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Tue, Oct 18, 2011 at 6:50 PM, Richard Henderson <rth@twiddle.net> wrote:
> This allows a more efficient representation for 64-bit hosts.
> It should be about the same for 32-bit hosts, as we can still
> access the individual pieces of the double.
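
A minimal sketch of the layout described above, under the assumption (taken from the accessors further down in this patch) that even-numbered single-precision registers occupy the upper 32 bits of a double slot and odd-numbered ones the lower 32 bits. It uses plain shifts on uint64_t rather than CPU_DoubleU so that it does not depend on host byte order; dbl_fpr, toy_load_F, toy_store_F and TOY_DPREGS are hypothetical names.

```c
#include <stdint.h>

#define TOY_DPREGS 32   /* hypothetical; mirrors TARGET_DPREGS on sparc64 */

/* One 64-bit slot per double-precision register. */
static uint64_t dbl_fpr[TOY_DPREGS];

/* Read single-precision register %f<n>: even n is the upper half of
   slot n/2, odd n is the lower half. */
static uint32_t toy_load_F(unsigned n)
{
    uint64_t d = dbl_fpr[n / 2];
    return (n & 1) ? (uint32_t)d : (uint32_t)(d >> 32);
}

/* Write single-precision register %f<n>, leaving the other half of
   the slot untouched (the same effect as a 32-bit deposit). */
static void toy_store_F(unsigned n, uint32_t v)
{
    unsigned shift = (n & 1) ? 0 : 32;
    uint64_t mask = (uint64_t)0xffffffffu << shift;

    dbl_fpr[n / 2] = (dbl_fpr[n / 2] & ~mask) | ((uint64_t)v << shift);
}
```

On a 64-bit host a double-precision access is then a single 64-bit load or store of the slot, while single-precision access costs a shift or a deposit.
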
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  gdbstub.c                  |   35 +++++++---
>  linux-user/signal.c        |   28 +++++----
>  monitor.c                  |   96 ++++++++++++++--------------
>  target-sparc/cpu.h         |    7 +-
>  target-sparc/cpu_init.c    |    6 +-
>  target-sparc/ldst_helper.c |   71 +++++++++------------
>  target-sparc/machine.c     |   20 ++----
>  target-sparc/translate.c   |  146 ++++++++++++++++++++-----------------------
>  8 files changed, 199 insertions(+), 210 deletions(-)
>
> diff --git a/gdbstub.c b/gdbstub.c
> index 1d99e19..6c18634 100644
> --- a/gdbstub.c
> +++ b/gdbstub.c
> @@ -814,7 +814,11 @@ static int cpu_gdb_read_register(CPUState *env, uint8_t *mem_buf, int n)
>  #if defined(TARGET_ABI32) || !defined(TARGET_SPARC64)
>     if (n < 64) {
>         /* fprs */
> -        GET_REG32(*((uint32_t *)&env->fpr[n - 32]));
> +        if (n & 1) {
> +            GET_REG32(env->fpr[(n - 32) / 2].l.lower);
> +        } else {
> +            GET_REG32(env->fpr[(n - 32) / 2].l.upper);
> +        }
>     }
>     /* Y, PSR, WIM, TBR, PC, NPC, FPSR, CPSR */
>     switch (n) {
> @@ -831,15 +835,15 @@ static int cpu_gdb_read_register(CPUState *env, uint8_t *mem_buf, int n)
>  #else
>     if (n < 64) {
>         /* f0-f31 */
> -        GET_REG32(*((uint32_t *)&env->fpr[n - 32]));
> +        if (n & 1) {
> +            GET_REG32(env->fpr[(n - 32) / 2].l.lower);
> +        } else {
> +            GET_REG32(env->fpr[(n - 32) / 2].l.upper);
> +        }
>     }
>     if (n < 80) {
>         /* f32-f62 (double width, even numbers only) */
> -        uint64_t val;
> -
> -        val = (uint64_t)*((uint32_t *)&env->fpr[(n - 64) * 2 + 32]) << 32;
> -        val |= *((uint32_t *)&env->fpr[(n - 64) * 2 + 33]);
> -        GET_REG64(val);
> +        GET_REG64(env->fpr[(n - 64) + 16].ll);
>     }
>     switch (n) {
>     case 80: GET_REGL(env->pc);
> @@ -878,7 +882,12 @@ static int cpu_gdb_write_register(CPUState *env, uint8_t *mem_buf, int n)
>  #if defined(TARGET_ABI32) || !defined(TARGET_SPARC64)
>     else if (n < 64) {
>         /* fprs */
> -        *((uint32_t *)&env->fpr[n - 32]) = tmp;
> +        /* f0-f31 */
> +        if (n & 1) {
> +            env->fpr[(n - 32) / 2].l.lower = tmp;
> +        } else {
> +            env->fpr[(n - 32) / 2].l.upper = tmp;
> +        }
>     } else {
>         /* Y, PSR, WIM, TBR, PC, NPC, FPSR, CPSR */
>         switch (n) {
> @@ -896,12 +905,16 @@ static int cpu_gdb_write_register(CPUState *env, uint8_t *mem_buf, int n)
>  #else
>     else if (n < 64) {
>         /* f0-f31 */
> -        env->fpr[n] = ldfl_p(mem_buf);
> +        tmp = ldl_p(mem_buf);
> +        if (n & 1) {
> +            env->fpr[(n - 32) / 2].l.lower = tmp;
> +        } else {
> +            env->fpr[(n - 32) / 2].l.upper = tmp;
> +        }
>         return 4;
>     } else if (n < 80) {
>         /* f32-f62 (double width, even numbers only) */
> -        *((uint32_t *)&env->fpr[(n - 64) * 2 + 32]) = tmp >> 32;
> -        *((uint32_t *)&env->fpr[(n - 64) * 2 + 33]) = tmp;
> +        env->fpr[(n - 64) + 16].ll = tmp;
>     } else {
>         switch (n) {
>         case 80: env->pc = tmp; break;
> diff --git a/linux-user/signal.c b/linux-user/signal.c
> index 89276eb..d68dc94 100644
> --- a/linux-user/signal.c
> +++ b/linux-user/signal.c
> @@ -2299,12 +2299,14 @@ void sparc64_set_context(CPUSPARCState *env)
>      */
>     err |= __get_user(env->fprs, &(ucp->tuc_mcontext.mc_fpregs.mcfpu_fprs));
>     {
> -        uint32_t *src, *dst;
> -        src = ucp->tuc_mcontext.mc_fpregs.mcfpu_fregs.sregs;
> -        dst = env->fpr;
> -        /* XXX: check that the CPU storage is the same as user context */
> -        for (i = 0; i < 64; i++, dst++, src++)
> -            err |= __get_user(*dst, src);
> +        uint32_t *src = ucp->tuc_mcontext.mc_fpregs.mcfpu_fregs.sregs;
> +        for (i = 0; i < 64; i++, src++) {
> +            if (i & 1) {
> +                err |= __get_user(env->fpr[i/2].l.lower, src);
> +            } else {
> +                err |= __get_user(env->fpr[i/2].l.upper, src);
> +            }
> +        }
>     }
>     err |= __get_user(env->fsr,
>                       &(ucp->tuc_mcontext.mc_fpregs.mcfpu_fsr));
> @@ -2393,12 +2395,14 @@ void sparc64_get_context(CPUSPARCState *env)
>     err |= __put_user(i7, &(mcp->mc_i7));
>
>     {
> -        uint32_t *src, *dst;
> -        src = env->fpr;
> -        dst = ucp->tuc_mcontext.mc_fpregs.mcfpu_fregs.sregs;
> -        /* XXX: check that the CPU storage is the same as user context */
> -        for (i = 0; i < 64; i++, dst++, src++)
> -            err |= __put_user(*src, dst);
> +        uint32_t *dst = ucp->tuc_mcontext.mc_fpregs.mcfpu_fregs.sregs;
> +        for (i = 0; i < 64; i++, dst++) {
> +            if (i & 1) {
> +                err |= __put_user(env->fpr[i/2].l.lower, dst);
> +            } else {
> +                err |= __put_user(env->fpr[i/2].l.upper, dst);
> +            }
> +        }
>     }
>     err |= __put_user(env->fsr, &(mcp->mc_fpregs.mcfpu_fsr));
>     err |= __put_user(env->gsr, &(mcp->mc_fpregs.mcfpu_gsr));
> diff --git a/monitor.c b/monitor.c
> index da13471..02d7e2e 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -3657,55 +3657,55 @@ static const MonitorDef monitor_defs[] = {
>  #endif
>     { "tbr", offsetof(CPUState, tbr) },
>     { "fsr", offsetof(CPUState, fsr) },
> -    { "f0", offsetof(CPUState, fpr[0]) },
> -    { "f1", offsetof(CPUState, fpr[1]) },
> -    { "f2", offsetof(CPUState, fpr[2]) },
> -    { "f3", offsetof(CPUState, fpr[3]) },
> -    { "f4", offsetof(CPUState, fpr[4]) },
> -    { "f5", offsetof(CPUState, fpr[5]) },
> -    { "f6", offsetof(CPUState, fpr[6]) },
> -    { "f7", offsetof(CPUState, fpr[7]) },
> -    { "f8", offsetof(CPUState, fpr[8]) },
> -    { "f9", offsetof(CPUState, fpr[9]) },
> -    { "f10", offsetof(CPUState, fpr[10]) },
> -    { "f11", offsetof(CPUState, fpr[11]) },
> -    { "f12", offsetof(CPUState, fpr[12]) },
> -    { "f13", offsetof(CPUState, fpr[13]) },
> -    { "f14", offsetof(CPUState, fpr[14]) },
> -    { "f15", offsetof(CPUState, fpr[15]) },
> -    { "f16", offsetof(CPUState, fpr[16]) },
> -    { "f17", offsetof(CPUState, fpr[17]) },
> -    { "f18", offsetof(CPUState, fpr[18]) },
> -    { "f19", offsetof(CPUState, fpr[19]) },
> -    { "f20", offsetof(CPUState, fpr[20]) },
> -    { "f21", offsetof(CPUState, fpr[21]) },
> -    { "f22", offsetof(CPUState, fpr[22]) },
> -    { "f23", offsetof(CPUState, fpr[23]) },
> -    { "f24", offsetof(CPUState, fpr[24]) },
> -    { "f25", offsetof(CPUState, fpr[25]) },
> -    { "f26", offsetof(CPUState, fpr[26]) },
> -    { "f27", offsetof(CPUState, fpr[27]) },
> -    { "f28", offsetof(CPUState, fpr[28]) },
> -    { "f29", offsetof(CPUState, fpr[29]) },
> -    { "f30", offsetof(CPUState, fpr[30]) },
> -    { "f31", offsetof(CPUState, fpr[31]) },
> +    { "f0", offsetof(CPUState, fpr[0].l.upper) },
> +    { "f1", offsetof(CPUState, fpr[0].l.lower) },
> +    { "f2", offsetof(CPUState, fpr[1].l.upper) },
> +    { "f3", offsetof(CPUState, fpr[1].l.lower) },
> +    { "f4", offsetof(CPUState, fpr[2].l.upper) },
> +    { "f5", offsetof(CPUState, fpr[2].l.lower) },
> +    { "f6", offsetof(CPUState, fpr[3].l.upper) },
> +    { "f7", offsetof(CPUState, fpr[3].l.lower) },
> +    { "f8", offsetof(CPUState, fpr[4].l.upper) },
> +    { "f9", offsetof(CPUState, fpr[4].l.lower) },
> +    { "f10", offsetof(CPUState, fpr[5].l.upper) },
> +    { "f11", offsetof(CPUState, fpr[5].l.lower) },
> +    { "f12", offsetof(CPUState, fpr[6].l.upper) },
> +    { "f13", offsetof(CPUState, fpr[6].l.lower) },
> +    { "f14", offsetof(CPUState, fpr[7].l.upper) },
> +    { "f15", offsetof(CPUState, fpr[7].l.lower) },
> +    { "f16", offsetof(CPUState, fpr[8].l.upper) },
> +    { "f17", offsetof(CPUState, fpr[8].l.lower) },
> +    { "f18", offsetof(CPUState, fpr[9].l.upper) },
> +    { "f19", offsetof(CPUState, fpr[9].l.lower) },
> +    { "f20", offsetof(CPUState, fpr[10].l.upper) },
> +    { "f21", offsetof(CPUState, fpr[10].l.lower) },
> +    { "f22", offsetof(CPUState, fpr[11].l.upper) },
> +    { "f23", offsetof(CPUState, fpr[11].l.lower) },
> +    { "f24", offsetof(CPUState, fpr[12].l.upper) },
> +    { "f25", offsetof(CPUState, fpr[12].l.lower) },
> +    { "f26", offsetof(CPUState, fpr[13].l.upper) },
> +    { "f27", offsetof(CPUState, fpr[13].l.lower) },
> +    { "f28", offsetof(CPUState, fpr[14].l.upper) },
> +    { "f29", offsetof(CPUState, fpr[14].l.lower) },
> +    { "f30", offsetof(CPUState, fpr[15].l.upper) },
> +    { "f31", offsetof(CPUState, fpr[15].l.lower) },
>  #ifdef TARGET_SPARC64
> -    { "f32", offsetof(CPUState, fpr[32]) },
> -    { "f34", offsetof(CPUState, fpr[34]) },
> -    { "f36", offsetof(CPUState, fpr[36]) },
> -    { "f38", offsetof(CPUState, fpr[38]) },
> -    { "f40", offsetof(CPUState, fpr[40]) },
> -    { "f42", offsetof(CPUState, fpr[42]) },
> -    { "f44", offsetof(CPUState, fpr[44]) },
> -    { "f46", offsetof(CPUState, fpr[46]) },
> -    { "f48", offsetof(CPUState, fpr[48]) },
> -    { "f50", offsetof(CPUState, fpr[50]) },
> -    { "f52", offsetof(CPUState, fpr[52]) },
> -    { "f54", offsetof(CPUState, fpr[54]) },
> -    { "f56", offsetof(CPUState, fpr[56]) },
> -    { "f58", offsetof(CPUState, fpr[58]) },
> -    { "f60", offsetof(CPUState, fpr[60]) },
> -    { "f62", offsetof(CPUState, fpr[62]) },
> +    { "f32", offsetof(CPUState, fpr[16]) },
> +    { "f34", offsetof(CPUState, fpr[17]) },
> +    { "f36", offsetof(CPUState, fpr[18]) },
> +    { "f38", offsetof(CPUState, fpr[19]) },
> +    { "f40", offsetof(CPUState, fpr[20]) },
> +    { "f42", offsetof(CPUState, fpr[21]) },
> +    { "f44", offsetof(CPUState, fpr[22]) },
> +    { "f46", offsetof(CPUState, fpr[23]) },
> +    { "f48", offsetof(CPUState, fpr[24]) },
> +    { "f50", offsetof(CPUState, fpr[25]) },
> +    { "f52", offsetof(CPUState, fpr[26]) },
> +    { "f54", offsetof(CPUState, fpr[27]) },
> +    { "f56", offsetof(CPUState, fpr[28]) },
> +    { "f58", offsetof(CPUState, fpr[29]) },
> +    { "f60", offsetof(CPUState, fpr[30]) },
> +    { "f62", offsetof(CPUState, fpr[31]) },
>     { "asi", offsetof(CPUState, asi) },
>     { "pstate", offsetof(CPUState, pstate) },
>     { "cansave", offsetof(CPUState, cansave) },
> diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h
> index a4419a5..71a890c 100644
> --- a/target-sparc/cpu.h
> +++ b/target-sparc/cpu.h
> @@ -3,16 +3,17 @@
>
>  #include "config.h"
>  #include "qemu-common.h"
> +#include "bswap.h"
>
>  #if !defined(TARGET_SPARC64)
>  #define TARGET_LONG_BITS 32
> -#define TARGET_FPREGS 32
> +#define TARGET_DPREGS 16
>  #define TARGET_PAGE_BITS 12 /* 4k */
>  #define TARGET_PHYS_ADDR_SPACE_BITS 36
>  #define TARGET_VIRT_ADDR_SPACE_BITS 32
>  #else
>  #define TARGET_LONG_BITS 64
> -#define TARGET_FPREGS 64
> +#define TARGET_DPREGS 32
>  #define TARGET_PAGE_BITS 13 /* 8k */
>  #define TARGET_PHYS_ADDR_SPACE_BITS 41
>  # ifdef TARGET_ABI32
> @@ -395,7 +396,7 @@ typedef struct CPUSPARCState {
>
>     uint32_t psr;      /* processor state register */
>     target_ulong fsr;      /* FPU state register */
> -    float32 fpr[TARGET_FPREGS];  /* floating point registers */
> +    CPU_DoubleU fpr[TARGET_DPREGS];  /* floating point registers */
>     uint32_t cwp;      /* index of current register window (extracted
>                           from PSR) */
>  #if !defined(TARGET_SPARC64) || defined(TARGET_ABI32)
> diff --git a/target-sparc/cpu_init.c b/target-sparc/cpu_init.c
> index 08b72a9..1118f31 100644
> --- a/target-sparc/cpu_init.c
> +++ b/target-sparc/cpu_init.c
> @@ -813,11 +813,11 @@ void cpu_dump_state(CPUState *env, FILE *f, fprintf_function cpu_fprintf,
>         }
>     }
>     cpu_fprintf(f, "\nFloating Point Registers:\n");
> -    for (i = 0; i < TARGET_FPREGS; i++) {
> +    for (i = 0; i < TARGET_DPREGS; i++) {
>         if ((i & 3) == 0) {
> -            cpu_fprintf(f, "%%f%02d:", i);
> +            cpu_fprintf(f, "%%f%02d:", i * 2);
>         }
> -        cpu_fprintf(f, " %016f", *(float *)&env->fpr[i]);
> +        cpu_fprintf(f, " %016" PRIx64, env->fpr[i].ll);
>         if ((i & 3) == 3) {
>             cpu_fprintf(f, "\n");
>         }
> diff --git a/target-sparc/ldst_helper.c b/target-sparc/ldst_helper.c
> index ec9b5f2..a4254e7 100644
> --- a/target-sparc/ldst_helper.c
> +++ b/target-sparc/ldst_helper.c
> @@ -2057,7 +2057,7 @@ void helper_ldf_asi(CPUState *env, target_ulong addr, int asi, int size,
>                     int rd)
>  {
>     unsigned int i;
> -    CPU_DoubleU u;
> +    target_ulong val;
>
>     helper_check_align(env, addr, 3);
>     addr = asi_address_mask(env, asi, addr);
> @@ -2072,13 +2072,11 @@ void helper_ldf_asi(CPUState *env, target_ulong addr, int asi, int size,
>             return;
>         }
>         helper_check_align(env, addr, 0x3f);
> -        for (i = 0; i < 16; i++) {
> -            *(uint32_t *)&env->fpr[rd++] = helper_ld_asi(env, addr, asi & 0x8f,
> -                                                         4, 0);
> -            addr += 4;
> +        for (i = 0; i < 8; i++, rd += 2, addr += 8) {
> +            env->fpr[rd/2].ll = helper_ld_asi(env, addr, asi & 0x8f, 8, 0);
>         }
> -
>         return;
> +
>     case 0x16: /* UA2007 Block load primary, user privilege */
>     case 0x17: /* UA2007 Block load secondary, user privilege */
>     case 0x1e: /* UA2007 Block load primary LE, user privilege */
> @@ -2092,13 +2090,11 @@ void helper_ldf_asi(CPUState *env, target_ulong addr, int asi, int size,
>             return;
>         }
>         helper_check_align(env, addr, 0x3f);
> -        for (i = 0; i < 16; i++) {
> -            *(uint32_t *)&env->fpr[rd++] = helper_ld_asi(env, addr, asi & 0x19,
> -                                                         4, 0);
> -            addr += 4;
> +        for (i = 0; i < 8; i++, rd += 2, addr += 8) {
> +            env->fpr[rd/2].ll = helper_ld_asi(env, addr, asi & 0x19, 8, 0);
>         }
> -
>         return;
> +
>     default:
>         break;
>     }
> @@ -2106,20 +2102,19 @@ void helper_ldf_asi(CPUState *env, target_ulong addr, int asi, int size,
>     switch (size) {
>     default:
>     case 4:
> -        *((uint32_t *)&env->fpr[rd]) = helper_ld_asi(env, addr, asi, size, 0);
> +        val = helper_ld_asi(env, addr, asi, size, 0);
> +        if (rd & 1) {
> +            env->fpr[rd/2].l.lower = val;
> +        } else {
> +            env->fpr[rd/2].l.upper = val;
> +        }
>         break;
>     case 8:
> -        u.ll = helper_ld_asi(env, addr, asi, size, 0);
> -        *((uint32_t *)&env->fpr[rd++]) = u.l.upper;
> -        *((uint32_t *)&env->fpr[rd++]) = u.l.lower;
> +        env->fpr[rd/2].ll = helper_ld_asi(env, addr, asi, size, 0);
>         break;
>     case 16:
> -        u.ll = helper_ld_asi(env, addr, asi, 8, 0);
> -        *((uint32_t *)&env->fpr[rd++]) = u.l.upper;
> -        *((uint32_t *)&env->fpr[rd++]) = u.l.lower;
> -        u.ll = helper_ld_asi(env, addr + 8, asi, 8, 0);
> -        *((uint32_t *)&env->fpr[rd++]) = u.l.upper;
> -        *((uint32_t *)&env->fpr[rd++]) = u.l.lower;
> +        env->fpr[rd/2].ll = helper_ld_asi(env, addr, asi, 8, 0);
> +        env->fpr[rd/2 + 1].ll = helper_ld_asi(env, addr + 8, asi, 8, 0);
>         break;
>     }
>  }
> @@ -2128,8 +2123,7 @@ void helper_stf_asi(CPUState *env, target_ulong addr, int asi, int size,
>                     int rd)
>  {
>     unsigned int i;
> -    target_ulong val = 0;
> -    CPU_DoubleU u;
> +    target_ulong val;
>
>     helper_check_align(env, addr, 3);
>     addr = asi_address_mask(env, asi, addr);
> @@ -2146,10 +2140,8 @@ void helper_stf_asi(CPUState *env, target_ulong addr, int asi, int size,
>             return;
>         }
>         helper_check_align(env, addr, 0x3f);
> -        for (i = 0; i < 16; i++) {
> -            val = *(uint32_t *)&env->fpr[rd++];
> -            helper_st_asi(env, addr, val, asi & 0x8f, 4);
> -            addr += 4;
> +        for (i = 0; i < 8; i++, rd += 2, addr += 8) {
> +            helper_st_asi(env, addr, env->fpr[rd/2].ll, asi & 0x8f, 8);
>         }
>
>         return;
> @@ -2166,10 +2158,8 @@ void helper_stf_asi(CPUState *env, target_ulong addr, int asi, int size,
>             return;
>         }
>         helper_check_align(env, addr, 0x3f);
> -        for (i = 0; i < 16; i++) {
> -            val = *(uint32_t *)&env->fpr[rd++];
> -            helper_st_asi(env, addr, val, asi & 0x19, 4);
> -            addr += 4;
> +        for (i = 0; i < 8; i++, rd += 2, addr += 8) {
> +            helper_st_asi(env, addr, env->fpr[rd/2].ll, asi & 0x19, 8);
>         }
>
>         return;
> @@ -2180,20 +2170,19 @@ void helper_stf_asi(CPUState *env, target_ulong addr, int asi, int size,
>     switch (size) {
>     default:
>     case 4:
> -        helper_st_asi(env, addr, *(uint32_t *)&env->fpr[rd], asi, size);
> +        if (rd & 1) {
> +            val = env->fpr[rd/2].l.lower;
> +        } else {
> +            val = env->fpr[rd/2].l.upper;
> +        }
> +        helper_st_asi(env, addr, val, asi, size);
>         break;
>     case 8:
> -        u.l.upper = *(uint32_t *)&env->fpr[rd++];
> -        u.l.lower = *(uint32_t *)&env->fpr[rd++];
> -        helper_st_asi(env, addr, u.ll, asi, size);
> +        helper_st_asi(env, addr, env->fpr[rd/2].ll, asi, size);
>         break;
>     case 16:
> -        u.l.upper = *(uint32_t *)&env->fpr[rd++];
> -        u.l.lower = *(uint32_t *)&env->fpr[rd++];
> -        helper_st_asi(env, addr, u.ll, asi, 8);
> -        u.l.upper = *(uint32_t *)&env->fpr[rd++];
> -        u.l.lower = *(uint32_t *)&env->fpr[rd++];
> -        helper_st_asi(env, addr + 8, u.ll, asi, 8);
> +        helper_st_asi(env, addr, env->fpr[rd/2].ll, asi, 8);
> +        helper_st_asi(env, addr + 8, env->fpr[rd/2 + 1].ll, asi, 8);
>         break;
>     }
>  }
> diff --git a/target-sparc/machine.c b/target-sparc/machine.c
> index 56ae041..235b088 100644
> --- a/target-sparc/machine.c
> +++ b/target-sparc/machine.c
> @@ -21,13 +21,9 @@ void cpu_save(QEMUFile *f, void *opaque)
>         qemu_put_betls(f, &env->regbase[i]);
>
>     /* FPU */
> -    for(i = 0; i < TARGET_FPREGS; i++) {
> -        union {
> -            float32 f;
> -            uint32_t i;
> -        } u;
> -        u.f = env->fpr[i];
> -        qemu_put_be32(f, u.i);
> +    for (i = 0; i < TARGET_DPREGS; i++) {
> +        qemu_put_be32(f, env->fpr[i].l.upper);
> +        qemu_put_be32(f, env->fpr[i].l.lower);
>     }
>
>     qemu_put_betls(f, &env->pc);
> @@ -128,13 +124,9 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
>         qemu_get_betls(f, &env->regbase[i]);
>
>     /* FPU */
> -    for(i = 0; i < TARGET_FPREGS; i++) {
> -        union {
> -            float32 f;
> -            uint32_t i;
> -        } u;
> -        u.i = qemu_get_be32(f);
> -        env->fpr[i] = u.f;
> +    for (i = 0; i < TARGET_DPREGS; i++) {
> +        env->fpr[i].l.upper = qemu_get_be32(f);
> +        env->fpr[i].l.lower = qemu_get_be32(f);
>     }
>
>     qemu_get_betls(f, &env->pc);
> diff --git a/target-sparc/translate.c b/target-sparc/translate.c
> index f8d3bf2..97a462b 100644
> --- a/target-sparc/translate.c
> +++ b/target-sparc/translate.c
> @@ -68,7 +68,7 @@ static TCGv cpu_tmp0;
>  static TCGv_i32 cpu_tmp32;
>  static TCGv_i64 cpu_tmp64;
>  /* Floating point registers */
> -static TCGv_i32 cpu_fpr[TARGET_FPREGS];
> +static TCGv_i64 cpu_fpr[TARGET_DPREGS];
>
>  static target_ulong gen_opc_npc[OPC_BUF_SIZE];
>  static target_ulong gen_opc_jump_pc[2];
> @@ -87,8 +87,8 @@ typedef struct DisasContext {
>     uint32_t cc_op;  /* current CC operation */
>     struct TranslationBlock *tb;
>     sparc_def_t *def;
> -    TCGv_i64 t64[3];
> -    int n_t64;
> +    TCGv_i32 t32[3];
> +    int n_t32;
>  } DisasContext;
>
>  // This function uses non-native bit order
> @@ -131,12 +131,44 @@ static inline void gen_update_fprs_dirty(int rd)
>  /* floating point registers moves */
>  static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
>  {
> -    return cpu_fpr[src];
> +#if TCG_TARGET_REG_BITS == 32
> +    if (src & 1) {
> +        return TCGV_LOW(cpu_fpr[src / 2]);
> +    } else {
> +        return TCGV_HIGH(cpu_fpr[src / 2]);
> +    }
> +#else
> +    if (src & 1) {
> +        return MAKE_TCGV_I32(GET_TCGV_I64(cpu_fpr[src / 2]));
> +    } else {
> +        TCGv_i32 ret = tcg_temp_local_new_i32();
> +        TCGv_i64 t = tcg_temp_new_i64();
> +
> +        tcg_gen_shri_i64(t, cpu_fpr[src / 2], 32);
> +        tcg_gen_trunc_i64_i32(ret, t);
> +        tcg_temp_free_i64(t);
> +
> +        dc->t32[dc->n_t32++] = ret;
> +        assert(dc->n_t32 <= ARRAY_SIZE(dc->t32));
> +
> +        return ret;
> +    }
> +#endif
>  }
>
>  static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
>  {
> -    tcg_gen_mov_i32(cpu_fpr[dst], v);
> +#if TCG_TARGET_REG_BITS == 32
> +    if (dst & 1) {
> +        tcg_gen_mov_i32(TCGV_LOW(cpu_fpr[dst / 2]), v);
> +    } else {
> +        tcg_gen_mov_i32(TCGV_HIGH(cpu_fpr[dst / 2]), v);
> +    }
> +#else
> +    TCGv_i64 t = MAKE_TCGV_I64(GET_TCGV_I32(v));
> +    tcg_gen_deposit_i64(cpu_fpr[dst / 2], cpu_fpr[dst / 2], t,
> +                        (dst & 1 ? 0 : 32), 32);
> +#endif
>     gen_update_fprs_dirty(dst);
>  }
>
> @@ -147,42 +179,14 @@ static TCGv_i32 gen_dest_fpr_F(void)
>
>  static TCGv_i64 gen_load_fpr_D(DisasContext *dc, unsigned int src)
>  {
> -    TCGv_i64 ret = tcg_temp_new_i64();
>     src = DFPREG(src);
> -
> -#if TCG_TARGET_REG_BITS == 32
> -    tcg_gen_mov_i32(TCGV_HIGH(ret), cpu_fpr[src]);
> -    tcg_gen_mov_i32(TCGV_LOW(ret), cpu_fpr[src + 1]);
> -#else
> -    {
> -        TCGv_i64 t = tcg_temp_new_i64();
> -        tcg_gen_extu_i32_i64(ret, cpu_fpr[src]);
> -        tcg_gen_extu_i32_i64(t, cpu_fpr[src + 1]);
> -        tcg_gen_shli_i64(ret, ret, 32);
> -        tcg_gen_or_i64(ret, ret, t);
> -        tcg_temp_free_i64(t);
> -    }
> -#endif
> -
> -    dc->t64[dc->n_t64++] = ret;
> -    assert(dc->n_t64 <= ARRAY_SIZE(dc->t64));
> -
> -    return ret;
> +    return cpu_fpr[src / 2];
>  }
>
>  static void gen_store_fpr_D(DisasContext *dc, unsigned int dst, TCGv_i64 v)
>  {
>     dst = DFPREG(dst);
> -
> -#if TCG_TARGET_REG_BITS == 32
> -    tcg_gen_mov_i32(cpu_fpr[dst], TCGV_HIGH(v));
> -    tcg_gen_mov_i32(cpu_fpr[dst + 1], TCGV_LOW(v));
> -#else
> -    tcg_gen_trunc_i64_i32(cpu_fpr[dst + 1], v);
> -    tcg_gen_shri_i64(v, v, 32);
> -    tcg_gen_trunc_i64_i32(cpu_fpr[dst], v);
> -#endif
> -
> +    tcg_gen_mov_i64(cpu_fpr[dst / 2], v);
>     gen_update_fprs_dirty(dst);
>  }
>
> @@ -193,50 +197,36 @@ static TCGv_i64 gen_dest_fpr_D(void)
>
>  static void gen_op_load_fpr_QT0(unsigned int src)
>  {
> -    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt0) +
> -                   offsetof(CPU_QuadU, l.upmost));
> -    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
> -                   offsetof(CPU_QuadU, l.upper));
> -    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
> -                   offsetof(CPU_QuadU, l.lower));
> -    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
> -                   offsetof(CPU_QuadU, l.lowest));
> +    tcg_gen_st_i64(cpu_fpr[src / 2], cpu_env, offsetof(CPUSPARCState, qt0) +
> +                   offsetof(CPU_QuadU, ll.upper));
> +    tcg_gen_st_i64(cpu_fpr[src/2 + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
> +                   offsetof(CPU_QuadU, ll.lower));
>  }
>
>  static void gen_op_load_fpr_QT1(unsigned int src)
>  {
> -    tcg_gen_st_i32(cpu_fpr[src], cpu_env, offsetof(CPUSPARCState, qt1) +
> -                   offsetof(CPU_QuadU, l.upmost));
> -    tcg_gen_st_i32(cpu_fpr[src + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
> -                   offsetof(CPU_QuadU, l.upper));
> -    tcg_gen_st_i32(cpu_fpr[src + 2], cpu_env, offsetof(CPUSPARCState, qt1) +
> -                   offsetof(CPU_QuadU, l.lower));
> -    tcg_gen_st_i32(cpu_fpr[src + 3], cpu_env, offsetof(CPUSPARCState, qt1) +
> -                   offsetof(CPU_QuadU, l.lowest));
> +    tcg_gen_st_i64(cpu_fpr[src / 2], cpu_env, offsetof(CPUSPARCState, qt1) +
> +                   offsetof(CPU_QuadU, ll.upper));
> +    tcg_gen_st_i64(cpu_fpr[src/2 + 1], cpu_env, offsetof(CPUSPARCState, qt1) +
> +                   offsetof(CPU_QuadU, ll.lower));
>  }
>
>  static void gen_op_store_QT0_fpr(unsigned int dst)
>  {
> -    tcg_gen_ld_i32(cpu_fpr[dst], cpu_env, offsetof(CPUSPARCState, qt0) +
> -                   offsetof(CPU_QuadU, l.upmost));
> -    tcg_gen_ld_i32(cpu_fpr[dst + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
> -                   offsetof(CPU_QuadU, l.upper));
> -    tcg_gen_ld_i32(cpu_fpr[dst + 2], cpu_env, offsetof(CPUSPARCState, qt0) +
> -                   offsetof(CPU_QuadU, l.lower));
> -    tcg_gen_ld_i32(cpu_fpr[dst + 3], cpu_env, offsetof(CPUSPARCState, qt0) +
> -                   offsetof(CPU_QuadU, l.lowest));
> +    tcg_gen_ld_i64(cpu_fpr[dst / 2], cpu_env, offsetof(CPUSPARCState, qt0) +
> +                   offsetof(CPU_QuadU, ll.upper));
> +    tcg_gen_ld_i64(cpu_fpr[dst/2 + 1], cpu_env, offsetof(CPUSPARCState, qt0) +
> +                   offsetof(CPU_QuadU, ll.lower));
>  }
>
>  #ifdef TARGET_SPARC64
> -static void gen_move_Q(int rd, int rs)
> +static void gen_move_Q(unsigned int rd, unsigned int rs)
>  {
>     rd = QFPREG(rd);
>     rs = QFPREG(rs);
>
> -    tcg_gen_mov_i32(cpu_fpr[rd], cpu_fpr[rs]);
> -    tcg_gen_mov_i32(cpu_fpr[rd + 1], cpu_fpr[rs + 1]);
> -    tcg_gen_mov_i32(cpu_fpr[rd + 2], cpu_fpr[rs + 2]);
> -    tcg_gen_mov_i32(cpu_fpr[rd + 3], cpu_fpr[rs + 3]);
> +    tcg_gen_mov_i64(cpu_fpr[rd / 2], cpu_fpr[rs / 2]);
> +    tcg_gen_mov_i64(cpu_fpr[rd / 2 + 1], cpu_fpr[rs / 2 + 1]);
>     gen_update_fprs_dirty(rd);
>  }
>  #endif
> @@ -5008,6 +4998,13 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>  egress:
>     tcg_temp_free(cpu_tmp1);
>     tcg_temp_free(cpu_tmp2);
> +    if (dc->n_t32 != 0) {
> +        int i;
> +        for (i = dc->n_t32 - 1; i >= 0; --i) {
> +            tcg_temp_free_i32(dc->t32[i]);
> +        }
> +        dc->n_t32 = 0;
> +    }
>  }
>
>  static inline void gen_intermediate_code_internal(TranslationBlock * tb,
> @@ -5109,9 +5106,6 @@ static inline void gen_intermediate_code_internal(TranslationBlock * tb,
>     tcg_temp_free_i64(cpu_tmp64);
>     tcg_temp_free_i32(cpu_tmp32);
>     tcg_temp_free(cpu_tmp0);
> -    for (j = dc->n_t64 - 1; j >= 0; --j) {
> -        tcg_temp_free_i64(dc->t64[j]);
> -    }
>
>     if (tb->cflags & CF_LAST_IO)
>         gen_io_end();
> @@ -5177,15 +5171,11 @@ void gen_intermediate_code_init(CPUSPARCState *env)
>         "g6",
>         "g7",
>     };
> -    static const char * const fregnames[64] = {
> -        "f0", "f1", "f2", "f3", "f4", "f5", "f6", "f7",
> -        "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",
> -        "f16", "f17", "f18", "f19", "f20", "f21", "f22", "f23",
> -        "f24", "f25", "f26", "f27", "f28", "f29", "f30", "f31",
> -        "f32", "f33", "f34", "f35", "f36", "f37", "f38", "f39",
> -        "f40", "f41", "f42", "f43", "f44", "f45", "f46", "f47",
> -        "f48", "f49", "f50", "f51", "f52", "f53", "f54", "f55",
> -        "f56", "f57", "f58", "f59", "f60", "f61", "f62", "f63",
> +    static const char * const fregnames[32] = {
> +        "f0", "f2", "f4", "f6", "f8", "f10", "f12", "f14",
> +        "f16", "f18", "f20", "f22", "f24", "f26", "f28", "f30",
> +        "f32", "f34", "f36", "f38", "f40", "f42", "f44", "f46",
> +        "f48", "f50", "f52", "f54", "f56", "f58", "f60", "f62",

Shouldn't these become "d0" etc?

>     };
>
>     /* init various static tables */
> @@ -5259,8 +5249,8 @@ void gen_intermediate_code_init(CPUSPARCState *env)
>             cpu_gregs[i] = tcg_global_mem_new(TCG_AREG0,
>                                               offsetof(CPUState, gregs[i]),
>                                               gregnames[i]);
> -        for (i = 0; i < TARGET_FPREGS; i++)
> -            cpu_fpr[i] = tcg_global_mem_new_i32(TCG_AREG0,
> +        for (i = 0; i < TARGET_DPREGS; i++)

Please add braces.

> +            cpu_fpr[i] = tcg_global_mem_new_i64(TCG_AREG0,
>                                                 offsetof(CPUState, fpr[i]),
>                                                 fregnames[i]);
>
> --
> 1.7.6.4
>
>


* Re: [Qemu-devel] [PATCH 10/21] tcg: Optimize some forms of deposit.
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 10/21] tcg: Optimize some forms of deposit Richard Henderson
@ 2011-10-18 20:30   ` Blue Swirl
  2011-10-18 22:27     ` Richard Henderson
  0 siblings, 1 reply; 37+ messages in thread
From: Blue Swirl @ 2011-10-18 20:30 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Tue, Oct 18, 2011 at 6:50 PM, Richard Henderson <rth@twiddle.net> wrote:
> If the deposit replaces the entire word, optimize to a move.
>
> If we're inserting to the top of the word, avoid the mask of arg2
> as we'll be shifting out all of the garbage and shifting in zeros.
>
> If the host is 32-bit, reduce a 64-bit deposit to a 32-bit deposit
> when possible.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>

Nice patch, but why would it belong to this series?
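
For reference, the three cases in the commit message can be sketched in plain C on ordinary integers. This is only an illustration of the arithmetic, not the TCG code from the patch; the names deposit32_ref and deposit64_via_halves are made up for the example, and it assumes len >= 1 and ofs + len no larger than the word size.

```c
#include <assert.h>
#include <stdint.h>

/* Replace LEN bits of ARG1, starting at bit OFS, with the low LEN bits
   of ARG2.  Assumes len >= 1 and ofs + len <= 32.  */
static uint32_t deposit32_ref(uint32_t arg1, uint32_t arg2,
                              unsigned ofs, unsigned len)
{
    uint32_t mask;

    if (ofs == 0 && len == 32) {
        return arg2;            /* whole word replaced: just a move */
    }
    mask = (1u << len) - 1;     /* len < 32 here, so the shift is defined */
    if (ofs + len < 32) {
        arg2 &= mask;           /* field in the middle: mask arg2 first */
    }
    /* When the field reaches the top of the word, the left shift
       already discards the high bits of arg2, so no mask is needed.  */
    return (arg1 & ~(mask << ofs)) | (arg2 << ofs);
}

/* A 64-bit deposit expressed on two 32-bit halves where possible.  */
static uint64_t deposit64_via_halves(uint64_t arg1, uint64_t arg2,
                                     unsigned ofs, unsigned len)
{
    uint32_t lo = (uint32_t)arg1;
    uint32_t hi = (uint32_t)(arg1 >> 32);

    if (ofs >= 32) {
        hi = deposit32_ref(hi, (uint32_t)arg2, ofs - 32, len);
    } else if (ofs + len <= 32) {
        lo = deposit32_ref(lo, (uint32_t)arg2, ofs, len);
    } else {
        /* Field straddles both halves: generic mask-and-shift.  */
        uint64_t mask = (len == 64) ? ~0ull : ((1ull << len) - 1);
        return (arg1 & ~(mask << ofs)) | ((arg2 & mask) << ofs);
    }
    return ((uint64_t)hi << 32) | lo;
}
```

The second function only mirrors the idea behind the 32-bit host path: a field that lies wholly inside one half needs to touch only that half.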

> ---
>  tcg/tcg-op.h |   65 +++++++++++++++++++++++++++++++++++++++++++++------------
>  1 files changed, 51 insertions(+), 14 deletions(-)
>
> diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
> index fea5983..2276c72 100644
> --- a/tcg/tcg-op.h
> +++ b/tcg/tcg-op.h
> @@ -2045,38 +2045,75 @@ static inline void tcg_gen_deposit_i32(TCGv_i32 ret, TCGv_i32 arg1,
>                                       TCGv_i32 arg2, unsigned int ofs,
>                                       unsigned int len)
>  {
> +    uint32_t mask;
> +    TCGv_i32 t1;
> +
> +    if (ofs == 0 && len == 32) {
> +        tcg_gen_mov_i32(ret, arg2);
> +        return;
> +    }
>     if (TCG_TARGET_HAS_deposit_i32 && TCG_TARGET_deposit_i32_valid(ofs, len)) {
>         tcg_gen_op5ii_i32(INDEX_op_deposit_i32, ret, arg1, arg2, ofs, len);
> -    } else {
> -        uint32_t mask = (1u << len) - 1;
> -        TCGv_i32 t1 = tcg_temp_new_i32 ();
> +        return;
> +    }
> +
> +    mask = (1u << len) - 1;
> +    t1 = tcg_temp_new_i32 ();
>
> +    if (ofs + len < 32) {
>         tcg_gen_andi_i32(t1, arg2, mask);
>         tcg_gen_shli_i32(t1, t1, ofs);
> -        tcg_gen_andi_i32(ret, arg1, ~(mask << ofs));
> -        tcg_gen_or_i32(ret, ret, t1);
> -
> -        tcg_temp_free_i32(t1);
> +    } else {
> +        tcg_gen_shli_i32(t1, arg2, ofs);
>     }
> +    tcg_gen_andi_i32(ret, arg1, ~(mask << ofs));
> +    tcg_gen_or_i32(ret, ret, t1);
> +
> +    tcg_temp_free_i32(t1);
>  }
>
>  static inline void tcg_gen_deposit_i64(TCGv_i64 ret, TCGv_i64 arg1,
>                                       TCGv_i64 arg2, unsigned int ofs,
>                                       unsigned int len)
>  {
> +    uint64_t mask;
> +    TCGv_i64 t1;
> +
> +    if (ofs == 0 && len == 64) {
> +        tcg_gen_mov_i64(ret, arg2);
> +        return;
> +    }
>     if (TCG_TARGET_HAS_deposit_i64 && TCG_TARGET_deposit_i64_valid(ofs, len)) {
>         tcg_gen_op5ii_i64(INDEX_op_deposit_i64, ret, arg1, arg2, ofs, len);
> -    } else {
> -        uint64_t mask = (1ull << len) - 1;
> -        TCGv_i64 t1 = tcg_temp_new_i64 ();
> +        return;
> +    }
> +
> +#if TCG_TARGET_REG_BITS == 32
> +    if (ofs >= 32) {
> +        tcg_gen_deposit_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1),
> +                            TCGV_LOW(arg2), ofs - 32, len);
> +        return;
> +    }
> +    if (ofs + len <= 32) {
> +        tcg_gen_deposit_i32(TCGV_LOW(ret), TCGV_LOW(arg1),
> +                            TCGV_LOW(arg2), ofs, len);
> +        return;
> +    }
> +#endif
> +
> +    mask = (1ull << len) - 1;
> +    t1 = tcg_temp_new_i64 ();
>
> +    if (ofs + len < 64) {
>         tcg_gen_andi_i64(t1, arg2, mask);
>         tcg_gen_shli_i64(t1, t1, ofs);
> -        tcg_gen_andi_i64(ret, arg1, ~(mask << ofs));
> -        tcg_gen_or_i64(ret, ret, t1);
> -
> -        tcg_temp_free_i64(t1);
> +    } else {
> +        tcg_gen_shli_i64(t1, arg2, ofs);
>     }
> +    tcg_gen_andi_i64(ret, arg1, ~(mask << ofs));
> +    tcg_gen_or_i64(ret, ret, t1);
> +
> +    tcg_temp_free_i64(t1);
>  }
>
>  /***************************************/
> --
> 1.7.6.4
>
>


* Re: [Qemu-devel] [PATCH 12/21] sparc-linux-user: Handle SIGILL.
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 12/21] sparc-linux-user: Handle SIGILL Richard Henderson
@ 2011-10-18 20:32   ` Blue Swirl
  2011-10-18 22:27     ` Richard Henderson
  0 siblings, 1 reply; 37+ messages in thread
From: Blue Swirl @ 2011-10-18 20:32 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Riku Voipio, qemu-devel

On Tue, Oct 18, 2011 at 6:50 PM, Richard Henderson <rth@twiddle.net> wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> Cc: Riku Voipio <riku.voipio@iki.fi>
> ---
>  linux-user/main.c |    9 +++++++++
>  1 files changed, 9 insertions(+), 0 deletions(-)
>
> diff --git a/linux-user/main.c b/linux-user/main.c
> index 186358b..686f6f6 100644
> --- a/linux-user/main.c
> +++ b/linux-user/main.c
> @@ -1191,6 +1191,15 @@ void cpu_loop (CPUSPARCState *env)
>         case EXCP_INTERRUPT:
>             /* just indicate that signals should be handled asap */
>             break;
> +        case TT_ILL_INSN:
> +            {
> +                info.si_signo = SIGILL;

TARGET_SIGILL

> +                info.si_errno = 0;
> +                info.si_code = TARGET_ILL_ILLOPC;
> +                info._sifields._sigfault._addr = env->pc;
> +                queue_signal(env, info.si_signo, &info);
> +            }
> +            break;
>         case EXCP_DEBUG:
>             {
>                 int sig;
> --
> 1.7.6.4
>
>


* Re: [Qemu-devel] [PATCH 17/21] target-sparc: Implement BMASK/BSHUFFLE.
  2011-10-18 18:50 ` [Qemu-devel] [PATCH 17/21] target-sparc: Implement BMASK/BSHUFFLE Richard Henderson
@ 2011-10-18 20:36   ` Blue Swirl
  0 siblings, 0 replies; 37+ messages in thread
From: Blue Swirl @ 2011-10-18 20:36 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Tue, Oct 18, 2011 at 6:50 PM, Richard Henderson <rth@twiddle.net> wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-sparc/helper.h     |    1 +
>  target-sparc/translate.c  |   28 ++++++++++++++++++++++++----
>  target-sparc/vis_helper.c |   29 +++++++++++++++++++++++++++++
>  3 files changed, 54 insertions(+), 4 deletions(-)
>
> diff --git a/target-sparc/helper.h b/target-sparc/helper.h
> index 4a61b77..ec00436 100644
> --- a/target-sparc/helper.h
> +++ b/target-sparc/helper.h
> @@ -140,6 +140,7 @@ DEF_HELPER_FLAGS_3(pdist, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
>  DEF_HELPER_FLAGS_2(fpack16, TCG_CALL_CONST | TCG_CALL_PURE, i32, i64, i64)
>  DEF_HELPER_FLAGS_3(fpack32, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
>  DEF_HELPER_FLAGS_2(fpackfix, TCG_CALL_CONST | TCG_CALL_PURE, i32, i64, i64)
> +DEF_HELPER_FLAGS_3(bshuffle, TCG_CALL_CONST | TCG_CALL_PURE, i64, i64, i64, i64)
>  #define VIS_HELPER(name)                                                 \
>     DEF_HELPER_FLAGS_2(f ## name ## 16, TCG_CALL_CONST | TCG_CALL_PURE,  \
>                        i64, i64, i64)                                    \
> diff --git a/target-sparc/translate.c b/target-sparc/translate.c
> index e955bf3..66107ee 100644
> --- a/target-sparc/translate.c
> +++ b/target-sparc/translate.c
> @@ -1744,6 +1744,20 @@ static void gen_ne_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
>     gen_store_fpr_D(dc, rd, dst);
>  }
>
> +static void gen_gsr_fop_DDD(DisasContext *dc, int rd, int rs1, int rs2,
> +                            void (*gen)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64))
> +{
> +    TCGv_i64 dst, src1, src2;
> +
> +    src1 = gen_load_fpr_D(dc, rs1);
> +    src2 = gen_load_fpr_D(dc, rs2);
> +    dst = gen_dest_fpr_D();
> +
> +    gen(dst, cpu_gsr, src1, src2);
> +
> +    gen_store_fpr_D(dc, rd, dst);
> +}

This could be introduced together with the fpack functions, so that the
next patch could be squashed into that one.

> +
>  static void gen_ne_fop_DDDD(DisasContext *dc, int rd, int rs1, int rs2,
>                             void (*gen)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64))
>  {
> @@ -4183,8 +4197,13 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     gen_movl_TN_reg(rd, cpu_dst);
>                     break;
>                 case 0x019: /* VIS II bmask */
> -                    // XXX
> -                    goto illegal_insn;
> +                    CHECK_FPU_FEATURE(dc, VIS2);
> +                    cpu_src1 = get_src1(insn, cpu_src1);
> +                    cpu_src2 = get_src2(insn, cpu_src2);
> +                    tcg_gen_add_tl(cpu_dst, cpu_src1, cpu_src2);
> +                    tcg_gen_deposit_tl(cpu_gsr, cpu_gsr, cpu_dst, 32, 32);
> +                    gen_movl_TN_reg(rd, cpu_dst);
> +                    break;
>                 case 0x020: /* VIS I fcmple16 */
>                     CHECK_FPU_FEATURE(dc, VIS1);
>                     cpu_src1_64 = gen_load_fpr_D(dc, rs1);
> @@ -4310,8 +4329,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
>                     gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpmerge);
>                     break;
>                 case 0x04c: /* VIS II bshuffle */
> -                    // XXX
> -                    goto illegal_insn;
> +                    CHECK_FPU_FEATURE(dc, VIS2);
> +                    gen_gsr_fop_DDD(dc, rd, rs1, rs2, gen_helper_bshuffle);
> +                    break;
>                 case 0x04d: /* VIS I fexpand */
>                     CHECK_FPU_FEATURE(dc, VIS1);
>                     gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fexpand);
> diff --git a/target-sparc/vis_helper.c b/target-sparc/vis_helper.c
> index 40adb47..7830120 100644
> --- a/target-sparc/vis_helper.c
> +++ b/target-sparc/vis_helper.c
> @@ -470,3 +470,32 @@ uint32_t helper_fpackfix(uint64_t gsr, uint64_t rs2)
>
>     return ret;
>  }
> +
> +uint64_t helper_bshuffle(uint64_t gsr, uint64_t src1, uint64_t src2)
> +{
> +    union {
> +        uint64_t ll[2];
> +        uint8_t b[16];
> +    } s;
> +    VIS64 r;
> +    uint32_t i, mask, host;
> +
> +    /* Set up S such that we can index across all of the bytes.  */
> +#ifdef HOST_WORDS_BIGENDIAN
> +    s.ll[0] = src1;
> +    s.ll[1] = src2;
> +    host = 0;
> +#else
> +    s.ll[1] = src1;
> +    s.ll[0] = src2;
> +    host = 15;
> +#endif
> +    mask = gsr >> 32;
> +
> +    for (i = 0; i < 8; ++i) {
> +        unsigned e = (mask >> (28 - i*4)) & 0xf;
> +        r.VIS_B64(i) = s.b[e ^ host];
> +    }
> +
> +    return r.ll;
> +}
> --
> 1.7.6.4
>
>


* Re: [Qemu-devel] [PATCH 06/21] target-sparc: Extract common code for floating-point operations.
  2011-10-18 20:24   ` Blue Swirl
@ 2011-10-18 22:21     ` Richard Henderson
  2011-10-23 11:34       ` Blue Swirl
  0 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 22:21 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

On 10/18/2011 01:24 PM, Blue Swirl wrote:
>>  #ifdef TARGET_SPARC64
>> -float64 helper_fabsd(CPUState *env, float64 src)
>> +float64 helper_fabsd(float64 src)
> 
> This probably should go to previous patch.

Sure.

>> +/* Turn off the stupid always-inline hack in osdep.h.  This gets in the
>> +   way of the callback mechanisms we use in this file, generating warnings
>> +   for always-inline functions called indirectly.  */
>> +#define always_inline inline
> 
> It would be better to just delete the offending (or all) inlines.

I certainly would like to delete the offending hack in osdep.h.

The inline markers themselves are generated by def-helper.h, and are required
so that we don't wind up with a corresponding number of defined-but-not-used
errors from the helper.h definitions.

I really didn't know any one way to handle this situation that would be
immediately acceptable to everyone.  I assumed limiting the change to 
the sparc front-end would minimize the pushback.

>> +static void gen_ne_fop_FF(DisasContext *dc, int rd, int rs,
> 
> 'ne' is for no exception? How about noexcp or something?

no-exception when it's first introduced.  Then after patch 11 it would
become no-env.  Preferences for the intermediate stage?


r~


* Re: [Qemu-devel] [PATCH 09/21] target-sparc: Change fpr representation to doubles.
  2011-10-18 20:28   ` Blue Swirl
@ 2011-10-18 22:25     ` Richard Henderson
  0 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 22:25 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

On 10/18/2011 01:28 PM, Blue Swirl wrote:
>> > +    static const char * const fregnames[32] = {
>> > +        "f0", "f2", "f4", "f6", "f8", "f10", "f12", "f14",
>> > +        "f16", "f18", "f20", "f22", "f24", "f26", "f28", "f30",
>> > +        "f32", "f34", "f36", "f38", "f40", "f42", "f44", "f46",
>> > +        "f48", "f50", "f52", "f54", "f56", "f58", "f60", "f62",
> Shouldn't these become "d0" etc?
> 

That's what I had at first, but after looking at the dumps for a few
additional patches I went back and changed it.  My feeling is that
these should match the disassembler, and it always uses f%d.
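
To make the naming concrete, here is a small sketch in plain C of how single-precision registers map onto the 64-bit array, and why entry i ends up with the name f(2*i).  It is an illustration on ordinary integers, not the TCG code; dreg, load_single, store_single and dreg_name are hypothetical names, and NUM_DPREGS stands in for TARGET_DPREGS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_DPREGS 32           /* hypothetical stand-in for TARGET_DPREGS */

static uint64_t dreg[NUM_DPREGS];

/* Even-numbered singles live in the high 32 bits of the containing
   double, odd-numbered singles in the low 32 bits.  */
static uint32_t load_single(unsigned n)
{
    uint64_t d = dreg[n / 2];
    return (n & 1) ? (uint32_t)d : (uint32_t)(d >> 32);
}

static void store_single(unsigned n, uint32_t v)
{
    unsigned shift = (n & 1) ? 0 : 32;
    uint64_t mask = 0xffffffffull << shift;

    dreg[n / 2] = (dreg[n / 2] & ~mask) | ((uint64_t)v << shift);
}

/* Name for dreg[i] as it would appear in a dump: "f0", "f2", ... "f62".  */
static void dreg_name(unsigned i, char *buf, size_t len)
{
    snprintf(buf, len, "f%u", 2 * i);
}
```

With this layout a double-precision access is a plain 64-bit load or store of dreg[n / 2], while a single-precision store is a 32-bit deposit into one half.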

>> -        for (i = 0; i < TARGET_FPREGS; i++)
>> -            cpu_fpr[i] = tcg_global_mem_new_i32(TCG_AREG0,
>> +        for (i = 0; i < TARGET_DPREGS; i++)
> Please add braces.
> 

Sure.


r~


* Re: [Qemu-devel] [PATCH 10/21] tcg: Optimize some forms of deposit.
  2011-10-18 20:30   ` Blue Swirl
@ 2011-10-18 22:27     ` Richard Henderson
  0 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 22:27 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel

On 10/18/2011 01:30 PM, Blue Swirl wrote:
> On Tue, Oct 18, 2011 at 6:50 PM, Richard Henderson <rth@twiddle.net> wrote:
>> If the deposit replaces the entire word, optimize to a move.
>>
>> If we're inserting to the top of the word, avoid the mask of arg2
>> as we'll be shifting out all of the garbage and shifting in zeros.
>>
>> If the host is 32-bit, reduce a 64-bit deposit to a 32-bit deposit
>> when possible.
>>
>> Signed-off-by: Richard Henderson <rth@twiddle.net>
> 
> Nice patch, but why would it belong to this series?

Only because we start generating a lot of these special cases in this
series.  It's certainly otherwise independent.


r~


* Re: [Qemu-devel] [PATCH 12/21] sparc-linux-user: Handle SIGILL.
  2011-10-18 20:32   ` Blue Swirl
@ 2011-10-18 22:27     ` Richard Henderson
  0 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2011-10-18 22:27 UTC (permalink / raw)
  To: Blue Swirl; +Cc: Riku Voipio, qemu-devel

On 10/18/2011 01:32 PM, Blue Swirl wrote:
>> > +                info.si_signo = SIGILL;
> TARGET_SIGILL
> 

Doh.


r~


* Re: [Qemu-devel] [PATCH 06/21] target-sparc: Extract common code for floating-point operations.
  2011-10-18 22:21     ` Richard Henderson
@ 2011-10-23 11:34       ` Blue Swirl
  0 siblings, 0 replies; 37+ messages in thread
From: Blue Swirl @ 2011-10-23 11:34 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Tue, Oct 18, 2011 at 22:21, Richard Henderson <rth@twiddle.net> wrote:
> On 10/18/2011 01:24 PM, Blue Swirl wrote:
>>>  #ifdef TARGET_SPARC64
>>> -float64 helper_fabsd(CPUState *env, float64 src)
>>> +float64 helper_fabsd(float64 src)
>>
>> This probably should go to previous patch.
>
> Sure.
>
>>> +/* Turn off the stupid always-inline hack in osdep.h.  This gets in the
>>> +   way of the callback mechanisms we use in this file, generating warnings
>>> +   for always-inline functions called indirectly.  */
>>> +#define always_inline inline
>>
>> It would be better to just delete the offending (or all) inlines.
>
> I certainly would like to delete the offending hack in osdep.h.
>
> The inline markers themselves are generated by def-helper.h, and are required
> so that we don't wind up with a corresponding number of defined-but-not-used
> errors from the helper.h definitions.
>
> I really didn't know any one way to handle this situation that would be
> immediately acceptable to everyone.  I assumed limiting the change to
> the sparc front-end would minimize the pushback.

It should also be possible to add non-inlined wrapper functions around
the inlined functions and pass those as the callbacks instead.
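
A minimal sketch of that idea, assuming GCC or Clang for the always_inline attribute.  All names here (add_one, add_one_wrapper, apply) are hypothetical and only stand in for a generated helper and a gen_*_fop-style caller.

```c
#include <assert.h>

/* Stand-in for a helper that a header forces to be always-inline.  */
static inline __attribute__((always_inline)) int add_one(int x)
{
    return x + 1;
}

/* Ordinary out-of-line wrapper.  Its address can be taken without the
   compiler having to call an always-inline function indirectly.  */
static int add_one_wrapper(int x)
{
    return add_one(x);
}

typedef int (*unary_fn)(int);

/* Stand-in for code that receives the operation as a callback.  */
static int apply(unary_fn fn, int x)
{
    return fn(x);
}
```

The cost is one extra function per helper that is used as a callback; the inline definition itself stays untouched.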

>>> +static void gen_ne_fop_FF(DisasContext *dc, int rd, int rs,
>>
>> 'ne' is for no exception? How about noexcp or something?
>
> no-exception when it's first introduced.  Then after patch 11 it would
> become no-env.  Preferences for the intermediate stage?

Never mind then.


Thread overview: 37+ messages
2011-10-18 18:50 [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 01/21] target-sparc: Add accessors for single-precision fpr access Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 02/21] target-sparc: Mark fprs dirty in store accessor Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 03/21] target-sparc: Add accessors for double-precision fpr access Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 04/21] target-sparc: Pass float64 parameters instead of dt0/1 temporaries Richard Henderson
2011-10-18 20:04   ` Blue Swirl
2011-10-18 20:07     ` Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 05/21] target-sparc: Make VIS helpers const when possible Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 06/21] target-sparc: Extract common code for floating-point operations Richard Henderson
2011-10-18 20:24   ` Blue Swirl
2011-10-18 22:21     ` Richard Henderson
2011-10-23 11:34       ` Blue Swirl
2011-10-18 18:50 ` [Qemu-devel] [PATCH 07/21] target-sparc: Extract float128 move to a function Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 08/21] target-sparc: Undo cpu_fpr rename Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 09/21] target-sparc: Change fpr representation to doubles Richard Henderson
2011-10-18 20:28   ` Blue Swirl
2011-10-18 22:25     ` Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 10/21] tcg: Optimize some forms of deposit Richard Henderson
2011-10-18 20:30   ` Blue Swirl
2011-10-18 22:27     ` Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 11/21] target-sparc: Do exceptions management fully inside the helpers Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 12/21] sparc-linux-user: Handle SIGILL Richard Henderson
2011-10-18 20:32   ` Blue Swirl
2011-10-18 22:27     ` Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 13/21] target-sparc: Implement PDIST Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 14/21] target-sparc: Implement fpack{16, 32, fix} Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 15/21] target-sparc: Implement EDGE* instructions Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 16/21] target-sparc: Implement ALIGNADDR* inline Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 17/21] target-sparc: Implement BMASK/BSHUFFLE Richard Henderson
2011-10-18 20:36   ` Blue Swirl
2011-10-18 18:50 ` [Qemu-devel] [PATCH 18/21] target-sparc: Tidy fpack32 Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 19/21] target-sparc: Implement FALIGNDATA inline Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 20/21] sparc-linux-user: Add some missing syscall numbers Richard Henderson
2011-10-18 18:50 ` [Qemu-devel] [PATCH 21/21] sparc-linux-user: Enable NPTL Richard Henderson
2011-10-18 19:50 ` [Qemu-devel] [PATCH 00/21] Sparc FPU/VIS improvements Blue Swirl
2011-10-18 20:03   ` Richard Henderson
2011-10-18 20:19     ` Blue Swirl
