[Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions
@ 2014-08-11 19:23 Tom Musta
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 1/8] target-ppc: Bug Fix: rlwinm Tom Musta
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Tom Musta @ 2014-08-11 19:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc; +Cc: Tom Musta, agraf

These patches fix assorted bugs in the emulation of Power Fixed Point Unit
instructions.

All instructions have been thorougly tested by running millions of random
patterns through actual hardware and comparing the results against QEMU.
The bugs all appear to be limited to 64 bit implementations.

Tom Musta (8):
  target-ppc: Bug Fix: rlwinm
  target-ppc: Bug Fix: rlwnm
  target-ppc: Bug Fix: rlwimi
  target-ppc: Bug Fix: mullw
  target-ppc: Bug Fix: mullwo
  target-ppc: Bug Fix: mulldo OV Detection
  target-ppc: Bug Fix: srawi
  target-ppc: Bug Fix: srad

 target-ppc/int_helper.c |   16 ++++++++++--
 target-ppc/translate.c  |   60 +++++++++++++++++++++++++++++++---------------
 2 files changed, 53 insertions(+), 23 deletions(-)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH 1/8] target-ppc: Bug Fix: rlwinm
  2014-08-11 19:23 [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions Tom Musta
@ 2014-08-11 19:23 ` Tom Musta
  2014-08-15 18:34   ` Richard Henderson
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 2/8] target-ppc: Bug Fix: rlwnm Tom Musta
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Tom Musta @ 2014-08-11 19:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc; +Cc: Tom Musta, agraf

The rlwinm specification includes the ROTL32 operation, which is defined
to be a left rotation of two copies of the least significant 32 bits of
the source GPR.

The current implementation is incorrect on 64-bit implementations in that
it rotates a single copy of the least significant 32 bits, padding with
zeroes in the most significant bits.

Fix the code to properly implement this ROTL32 operation.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
 target-ppc/translate.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index b23933f..a27d063 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -1672,11 +1672,9 @@ static void gen_rlwinm(DisasContext *ctx)
     } else {
         TCGv t0 = tcg_temp_new();
 #if defined(TARGET_PPC64)
-        TCGv_i32 t1 = tcg_temp_new_i32();
-        tcg_gen_trunc_i64_i32(t1, cpu_gpr[rS(ctx->opcode)]);
-        tcg_gen_rotli_i32(t1, t1, sh);
-        tcg_gen_extu_i32_i64(t0, t1);
-        tcg_temp_free_i32(t1);
+        tcg_gen_deposit_i64(t0, cpu_gpr[rS(ctx->opcode)],
+            cpu_gpr[rS(ctx->opcode)], 32, 32);
+        tcg_gen_rotli_i64(t0, t0, sh);
 #else
         tcg_gen_rotli_i32(t0, cpu_gpr[rS(ctx->opcode)], sh);
 #endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH 2/8] target-ppc: Bug Fix: rlwnm
  2014-08-11 19:23 [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions Tom Musta
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 1/8] target-ppc: Bug Fix: rlwinm Tom Musta
@ 2014-08-11 19:23 ` Tom Musta
  2014-08-15 19:54   ` Richard Henderson
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 3/8] target-ppc: Bug Fix: rlwimi Tom Musta
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Tom Musta @ 2014-08-11 19:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc; +Cc: Tom Musta, agraf

The rlwnm specification includes the ROTL32 operation, which is defined
to be a left rotation of two copies of the least significant 32 bits of
the source GPR.

The current implementation is incorrect on 64-bit implementations in that
it rotates a single copy of the least significant 32 bits, padding with
zeroes in the most significant bits.

Fix the code to properly implement this ROTL32 operation.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
 target-ppc/translate.c |   18 +++++++++---------
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index a27d063..48f13a9 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -1695,7 +1695,7 @@ static void gen_rlwnm(DisasContext *ctx)
     uint32_t mb, me;
     TCGv t0;
 #if defined(TARGET_PPC64)
-    TCGv_i32 t1, t2;
+    TCGv t1;
 #endif
 
     mb = MB(ctx->opcode);
@@ -1703,14 +1703,11 @@ static void gen_rlwnm(DisasContext *ctx)
     t0 = tcg_temp_new();
     tcg_gen_andi_tl(t0, cpu_gpr[rB(ctx->opcode)], 0x1f);
 #if defined(TARGET_PPC64)
-    t1 = tcg_temp_new_i32();
-    t2 = tcg_temp_new_i32();
-    tcg_gen_trunc_i64_i32(t1, cpu_gpr[rS(ctx->opcode)]);
-    tcg_gen_trunc_i64_i32(t2, t0);
-    tcg_gen_rotl_i32(t1, t1, t2);
-    tcg_gen_extu_i32_i64(t0, t1);
-    tcg_temp_free_i32(t1);
-    tcg_temp_free_i32(t2);
+    t1 = tcg_temp_new_i64();
+    tcg_gen_deposit_i64(t1, cpu_gpr[rS(ctx->opcode)],
+        cpu_gpr[rS(ctx->opcode)], 32, 32);
+    tcg_gen_rotl_i64(t0, t1, t0);
+    tcg_temp_free_i64(t1);
 #else
     tcg_gen_rotl_i32(t0, cpu_gpr[rS(ctx->opcode)], t0);
 #endif
@@ -1721,6 +1718,9 @@ static void gen_rlwnm(DisasContext *ctx)
 #endif
         tcg_gen_andi_tl(cpu_gpr[rA(ctx->opcode)], t0, MASK(mb, me));
     } else {
+#if defined(TARGET_PPC64)
+        tcg_gen_andi_tl(t0, t0, MASK(32, 63));
+#endif
         tcg_gen_mov_tl(cpu_gpr[rA(ctx->opcode)], t0);
     }
     tcg_temp_free(t0);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH 3/8] target-ppc: Bug Fix: rlwimi
  2014-08-11 19:23 [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions Tom Musta
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 1/8] target-ppc: Bug Fix: rlwinm Tom Musta
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 2/8] target-ppc: Bug Fix: rlwnm Tom Musta
@ 2014-08-11 19:23 ` Tom Musta
  2014-08-15 20:05   ` Richard Henderson
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 4/8] target-ppc: Bug Fix: mullw Tom Musta
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Tom Musta @ 2014-08-11 19:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc; +Cc: Tom Musta, agraf

The rlwimi specification includes the ROTL32 operation, which is defined
to be a left rotation of two copies of the least significant 32 bits of
the source GPR.

The current implementation is incorrect on 64-bit implementations in that
it rotates a single copy of the least significant 32 bits, padding with
zeroes in the most significant bits.

Fix the code to properly implement this ROTL32 operation.

Also fix the special case of MB=31 and ME=0 to copy the entire contents
of the source GPR.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
 target-ppc/translate.c |   12 +++++++-----
 1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 48f13a9..f4cc495 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -1614,17 +1614,19 @@ static void gen_rlwimi(DisasContext *ctx)
     me = ME(ctx->opcode);
     sh = SH(ctx->opcode);
     if (likely(sh == 0 && mb == 0 && me == 31)) {
+#if defined(TARGET_PPC64)
+        tcg_gen_mov_i64(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)]);
+#else
         tcg_gen_ext32u_tl(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)]);
+#endif
     } else {
         target_ulong mask;
         TCGv t1;
         TCGv t0 = tcg_temp_new();
 #if defined(TARGET_PPC64)
-        TCGv_i32 t2 = tcg_temp_new_i32();
-        tcg_gen_trunc_i64_i32(t2, cpu_gpr[rS(ctx->opcode)]);
-        tcg_gen_rotli_i32(t2, t2, sh);
-        tcg_gen_extu_i32_i64(t0, t2);
-        tcg_temp_free_i32(t2);
+        tcg_gen_deposit_i64(t0, cpu_gpr[rS(ctx->opcode)],
+            cpu_gpr[rS(ctx->opcode)], 32, 32);
+        tcg_gen_rotli_i64(t0, t0, sh);
 #else
         tcg_gen_rotli_i32(t0, cpu_gpr[rS(ctx->opcode)], sh);
 #endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH 4/8] target-ppc: Bug Fix: mullw
  2014-08-11 19:23 [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions Tom Musta
                   ` (2 preceding siblings ...)
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 3/8] target-ppc: Bug Fix: rlwimi Tom Musta
@ 2014-08-11 19:23 ` Tom Musta
  2014-08-15 20:07   ` Richard Henderson
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 5/8] target-ppc: Bug Fix: mullwo Tom Musta
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Tom Musta @ 2014-08-11 19:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc; +Cc: Tom Musta, agraf

For 64-bit implementations, the mullw result is the 64 bit product
of the sign-extended least significant 32 bits of the source
registers.

Fix the code to properly sign extend the source operands and produce
a 64 bit product.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
 target-ppc/translate.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index f4cc495..41a5aea 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -1125,9 +1125,20 @@ static void gen_mulhwu(DisasContext *ctx)
 /* mullw  mullw. */
 static void gen_mullw(DisasContext *ctx)
 {
+#if defined(TARGET_PPC64)
+    TCGv_i64 t0, t1;
+    t0 = tcg_temp_new_i64();
+    t1 = tcg_temp_new_i64();
+    tcg_gen_ext32s_tl(t0, cpu_gpr[rA(ctx->opcode)]);
+    tcg_gen_ext32s_tl(t1, cpu_gpr[rB(ctx->opcode)]);
+    tcg_gen_mul_i64(cpu_gpr[rD(ctx->opcode)], t0, t1);
+    tcg_temp_free(t0);
+    tcg_temp_free(t1);
+#else
     tcg_gen_mul_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)],
                    cpu_gpr[rB(ctx->opcode)]);
     tcg_gen_ext32s_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rD(ctx->opcode)]);
+#endif
     if (unlikely(Rc(ctx->opcode) != 0))
         gen_set_Rc0(ctx, cpu_gpr[rD(ctx->opcode)]);
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH 5/8] target-ppc: Bug Fix: mullwo
  2014-08-11 19:23 [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions Tom Musta
                   ` (3 preceding siblings ...)
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 4/8] target-ppc: Bug Fix: mullw Tom Musta
@ 2014-08-11 19:23 ` Tom Musta
  2014-08-15 20:11   ` Richard Henderson
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 6/8] target-ppc: Bug Fix: mulldo OV Detection Tom Musta
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Tom Musta @ 2014-08-11 19:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc; +Cc: Tom Musta, agraf

On 64-bit implementations, the mullwo result is the 64 bit product of
the signed 32 bit operands.  Fix the implementation to properly deposit
the upper 32 bits into the target register.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
 target-ppc/translate.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 41a5aea..4904665 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -1148,11 +1148,20 @@ static void gen_mullwo(DisasContext *ctx)
 {
     TCGv_i32 t0 = tcg_temp_new_i32();
     TCGv_i32 t1 = tcg_temp_new_i32();
+#if defined(TARGET_PPC64)
+    TCGv_i64 t2 = tcg_temp_new_i64();
+#endif
 
     tcg_gen_trunc_tl_i32(t0, cpu_gpr[rA(ctx->opcode)]);
     tcg_gen_trunc_tl_i32(t1, cpu_gpr[rB(ctx->opcode)]);
     tcg_gen_muls2_i32(t0, t1, t0, t1);
     tcg_gen_ext_i32_tl(cpu_gpr[rD(ctx->opcode)], t0);
+#if defined(TARGET_PPC64)
+    tcg_gen_ext_i32_tl(t2, t1);
+    tcg_gen_deposit_i64(cpu_gpr[rD(ctx->opcode)],
+                        cpu_gpr[rD(ctx->opcode)], t2, 32, 32);
+    tcg_temp_free(t2);
+#endif
 
     tcg_gen_sari_i32(t0, t0, 31);
     tcg_gen_setcond_i32(TCG_COND_NE, t0, t0, t1);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH 6/8] target-ppc: Bug Fix: mulldo OV Detection
  2014-08-11 19:23 [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions Tom Musta
                   ` (4 preceding siblings ...)
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 5/8] target-ppc: Bug Fix: mullwo Tom Musta
@ 2014-08-11 19:23 ` Tom Musta
  2014-08-15 20:16   ` Richard Henderson
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 7/8] target-ppc: Bug Fix: srawi Tom Musta
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Tom Musta @ 2014-08-11 19:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc; +Cc: Tom Musta, agraf

Fix the code to properly detect overflow; the 128 bit signed
product must have all zeroes or all ones in the first 65 bits
otherwise OV should be set.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
 target-ppc/int_helper.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index f6e8846..e83a25d 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -32,12 +32,22 @@ uint64_t helper_mulldo(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
     uint64_t tl;
 
     muls64(&tl, (uint64_t *)&th, arg1, arg2);
-    /* If th != 0 && th != -1, then we had an overflow */
-    if (likely((uint64_t)(th + 1) <= 1)) {
+
+    /* th should either contain all 1 bits or all 0 bits and should
+     * match the sign bit of tl; otherwise we have overflowed. */
+
+    if ((int64_t)tl < 0) {
+        if (likely(th == -1LL)) {
+            env->ov = 0;
+        } else {
+            env->so = env->ov = 1;
+        }
+    } else if (likely(th == 0LL)) {
         env->ov = 0;
     } else {
         env->so = env->ov = 1;
     }
+
     return (int64_t)tl;
 }
 #endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH 7/8] target-ppc: Bug Fix: srawi
  2014-08-11 19:23 [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions Tom Musta
                   ` (5 preceding siblings ...)
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 6/8] target-ppc: Bug Fix: mulldo OV Detection Tom Musta
@ 2014-08-11 19:23 ` Tom Musta
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 8/8] target-ppc: Bug Fix: srad Tom Musta
  2014-08-12  3:06 ` [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions David Gibson
  8 siblings, 0 replies; 20+ messages in thread
From: Tom Musta @ 2014-08-11 19:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc; +Cc: Tom Musta, agraf

For 64 bit implementations, the special case of a shift by zero
should result in the sign extension of the least significant 32 bits
of the source GPR (not a direct copy of the 64 bit source GPR).

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
 target-ppc/translate.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 4904665..61fa42d 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -1941,7 +1941,7 @@ static void gen_srawi(DisasContext *ctx)
     TCGv dst = cpu_gpr[rA(ctx->opcode)];
     TCGv src = cpu_gpr[rS(ctx->opcode)];
     if (sh == 0) {
-        tcg_gen_mov_tl(dst, src);
+        tcg_gen_ext32s_tl(dst, src);
         tcg_gen_movi_tl(cpu_ca, 0);
     } else {
         TCGv t0;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH 8/8] target-ppc: Bug Fix: srad
  2014-08-11 19:23 [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions Tom Musta
                   ` (6 preceding siblings ...)
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 7/8] target-ppc: Bug Fix: srawi Tom Musta
@ 2014-08-11 19:23 ` Tom Musta
  2014-08-15 20:17   ` Richard Henderson
  2014-08-12  3:06 ` [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions David Gibson
  8 siblings, 1 reply; 20+ messages in thread
From: Tom Musta @ 2014-08-11 19:23 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc; +Cc: Tom Musta, agraf

Fix the check for carry in the srad helper to properly construct
the mask -- a "1ULL" must be used (instead of "1") in order to
get the desired result.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
 target-ppc/int_helper.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index e83a25d..e5b103b 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -248,7 +248,7 @@ target_ulong helper_srad(CPUPPCState *env, target_ulong value,
         if (likely((uint64_t)shift != 0)) {
             shift &= 0x3f;
             ret = (int64_t)value >> shift;
-            if (likely(ret >= 0 || (value & ((1 << shift) - 1)) == 0)) {
+            if (likely(ret >= 0 || (value & ((1ULL << shift) - 1)) == 0)) {
                 env->ca = 0;
             } else {
                 env->ca = 1;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions
  2014-08-11 19:23 [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions Tom Musta
                   ` (7 preceding siblings ...)
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 8/8] target-ppc: Bug Fix: srad Tom Musta
@ 2014-08-12  3:06 ` David Gibson
  2014-08-12 11:40   ` Tom Musta
  8 siblings, 1 reply; 20+ messages in thread
From: David Gibson @ 2014-08-12  3:06 UTC (permalink / raw)
  To: Tom Musta; +Cc: qemu-ppc, qemu-devel, agraf

[-- Attachment #1: Type: text/plain, Size: 736 bytes --]

On Mon, Aug 11, 2014 at 02:23:21PM -0500, Tom Musta wrote:
> These patches fix assorted bugs in the emulation of Power Fixed Point Unit
> instructions.
> 
> All instructions have been thorougly tested by running millions of random
> patterns through actual hardware and comparing the results against QEMU.
> The bugs all appear to be limited to 64 bit implementations.

I think understanding these fixes would be easier if each commit
message included some example inputs for which the existing code
generates the wrong results.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions
  2014-08-12  3:06 ` [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions David Gibson
@ 2014-08-12 11:40   ` Tom Musta
  0 siblings, 0 replies; 20+ messages in thread
From: Tom Musta @ 2014-08-12 11:40 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, agraf

On 8/11/2014 10:06 PM, David Gibson wrote:
> On Mon, Aug 11, 2014 at 02:23:21PM -0500, Tom Musta wrote:
>> These patches fix assorted bugs in the emulation of Power Fixed Point Unit
>> instructions.
>>
>> All instructions have been thorougly tested by running millions of random
>> patterns through actual hardware and comparing the results against QEMU.
>> The bugs all appear to be limited to 64 bit implementations.
> 
> I think understanding these fixes would be easier if each commit
> message included some example inputs for which the existing code
> generates the wrong results.
> 

Thanks, David, for the feedback.  I will add some sample data patterns.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] target-ppc: Bug Fix: rlwinm
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 1/8] target-ppc: Bug Fix: rlwinm Tom Musta
@ 2014-08-15 18:34   ` Richard Henderson
  2014-08-15 19:52     ` Richard Henderson
  0 siblings, 1 reply; 20+ messages in thread
From: Richard Henderson @ 2014-08-15 18:34 UTC (permalink / raw)
  To: Tom Musta, qemu-devel, qemu-ppc; +Cc: agraf

On 08/11/2014 09:23 AM, Tom Musta wrote:
> The rlwinm specification includes the ROTL32 operation, which is defined
> to be a left rotation of two copies of the least significant 32 bits of
> the source GPR.
> 
> The current implementation is incorrect on 64-bit implementations in that
> it rotates a single copy of the least significant 32 bits, padding with
> zeroes in the most significant bits.

Yes, it does describe rotate_32 as a double-copy of the low 32 bits.  But it
also describes the mask as having "0 bits elsewhere".  Thus, post mask, I don't
see how you could distinguish the implementations.

Have you an example that doesn't work with the current code?


r~

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 1/8] target-ppc: Bug Fix: rlwinm
  2014-08-15 18:34   ` Richard Henderson
@ 2014-08-15 19:52     ` Richard Henderson
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Henderson @ 2014-08-15 19:52 UTC (permalink / raw)
  To: Tom Musta, qemu-devel, qemu-ppc; +Cc: agraf

On 08/15/2014 08:34 AM, Richard Henderson wrote:
> On 08/11/2014 09:23 AM, Tom Musta wrote:
>> The rlwinm specification includes the ROTL32 operation, which is defined
>> to be a left rotation of two copies of the least significant 32 bits of
>> the source GPR.
>>
>> The current implementation is incorrect on 64-bit implementations in that
>> it rotates a single copy of the least significant 32 bits, padding with
>> zeroes in the most significant bits.
> 
> Yes, it does describe rotate_32 as a double-copy of the low 32 bits.  But it
> also describes the mask as having "0 bits elsewhere".  Thus, post mask, I don't
> see how you could distinguish the implementations.
> 
> Have you an example that doesn't work with the current code?

Let me guess the answer myself -- it's MB > ME, and wraparound of the mask.

I think I was distracted by the text "For all the uses given above, the
high-order 32 bits of register RA are cleared." without clearly reading the
examples.


r~

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 2/8] target-ppc: Bug Fix: rlwnm
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 2/8] target-ppc: Bug Fix: rlwnm Tom Musta
@ 2014-08-15 19:54   ` Richard Henderson
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Henderson @ 2014-08-15 19:54 UTC (permalink / raw)
  To: Tom Musta, qemu-devel, qemu-ppc; +Cc: agraf

On 08/11/2014 09:23 AM, Tom Musta wrote:
> +#if defined(TARGET_PPC64)
> +        tcg_gen_andi_tl(t0, t0, MASK(32, 63));
> +#endif

Err. this is just simple zero-extension.

>          tcg_gen_mov_tl(cpu_gpr[rA(ctx->opcode)], t0);

Why not combine with the mov as

  tcg_gen_ext32u_tl

which will just expand to a mov for PPC32.


r~

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 3/8] target-ppc: Bug Fix: rlwimi
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 3/8] target-ppc: Bug Fix: rlwimi Tom Musta
@ 2014-08-15 20:05   ` Richard Henderson
  2014-08-18 19:51     ` Tom Musta
  0 siblings, 1 reply; 20+ messages in thread
From: Richard Henderson @ 2014-08-15 20:05 UTC (permalink / raw)
  To: Tom Musta, qemu-devel, qemu-ppc; +Cc: agraf

On 08/11/2014 09:23 AM, Tom Musta wrote:
> Also fix the special case of MB=31 and ME=0 to copy the entire contents
> of the source GPR.

Err, that's not what you did.

>      if (likely(sh == 0 && mb == 0 && me == 31)) {
> +#if defined(TARGET_PPC64)
> +        tcg_gen_mov_i64(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)]);
> +#else
>          tcg_gen_ext32u_tl(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)]);
> +#endif

This is the reverse condition.  Which, true enough, should not be implemented
with ext32u for PPC64.  But a MOV isn't right either, it is

  deposit(ra, rs, 0, 32)

Which does point out that we should probably implement anything MB <= ME and SH
== 31 - ME with the deposit opcode.


r~

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 4/8] target-ppc: Bug Fix: mullw
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 4/8] target-ppc: Bug Fix: mullw Tom Musta
@ 2014-08-15 20:07   ` Richard Henderson
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Henderson @ 2014-08-15 20:07 UTC (permalink / raw)
  To: Tom Musta, qemu-devel, qemu-ppc; +Cc: agraf

On 08/11/2014 09:23 AM, Tom Musta wrote:
> For 64-bit implementations, the mullw result is the 64 bit product
> of the sign-extended least significant 32 bits of the source
> registers.
> 
> Fix the code to properly sign extend the source operands and produce
> a 64 bit product.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
>  target-ppc/translate.c |   11 +++++++++++
>  1 files changed, 11 insertions(+), 0 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 5/8] target-ppc: Bug Fix: mullwo
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 5/8] target-ppc: Bug Fix: mullwo Tom Musta
@ 2014-08-15 20:11   ` Richard Henderson
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Henderson @ 2014-08-15 20:11 UTC (permalink / raw)
  To: Tom Musta, qemu-devel, qemu-ppc; +Cc: agraf

On 08/11/2014 09:23 AM, Tom Musta wrote:
>      tcg_gen_muls2_i32(t0, t1, t0, t1);
>      tcg_gen_ext_i32_tl(cpu_gpr[rD(ctx->opcode)], t0);
> +#if defined(TARGET_PPC64)
> +    tcg_gen_ext_i32_tl(t2, t1);
> +    tcg_gen_deposit_i64(cpu_gpr[rD(ctx->opcode)],
> +                        cpu_gpr[rD(ctx->opcode)], t2, 32, 32);
> +    tcg_temp_free(t2);
> +#endif

This is

  tcg_gen_muls2_i32(t0, t0, t0, t1);
#if defined(TARGET_PPC64)
  tcg_gen_concat_i32_i64(cpu_gpr[rD(ctx->opcode)], t0, t1);
#else
  tcg_gen_mov_i32(cpu_gpr[rD(ctx->opcode)], t0);
#endif


r~

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 6/8] target-ppc: Bug Fix: mulldo OV Detection
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 6/8] target-ppc: Bug Fix: mulldo OV Detection Tom Musta
@ 2014-08-15 20:16   ` Richard Henderson
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Henderson @ 2014-08-15 20:16 UTC (permalink / raw)
  To: Tom Musta, qemu-devel, qemu-ppc; +Cc: agraf

On 08/11/2014 09:23 AM, Tom Musta wrote:
> Fix the code to properly detect overflow; the 128 bit signed
> product must have all zeroes or all ones in the first 65 bits
> otherwise OV should be set.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
>  target-ppc/int_helper.c |   14 ++++++++++++--
>  1 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index f6e8846..e83a25d 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -32,12 +32,22 @@ uint64_t helper_mulldo(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
>      uint64_t tl;
>  
>      muls64(&tl, (uint64_t *)&th, arg1, arg2);
> -    /* If th != 0 && th != -1, then we had an overflow */
> -    if (likely((uint64_t)(th + 1) <= 1)) {
> +
> +    /* th should either contain all 1 bits or all 0 bits and should
> +     * match the sign bit of tl; otherwise we have overflowed. */
> +
> +    if ((int64_t)tl < 0) {
> +        if (likely(th == -1LL)) {
> +            env->ov = 0;
> +        } else {
> +            env->so = env->ov = 1;
> +        }
> +    } else if (likely(th == 0LL)) {
>          env->ov = 0;
>      } else {
>          env->so = env->ov = 1;
>      }
> +

As far as it goes,
Reviewed-by: Richard Henderson <rth@twiddle.net>

But we can just as easily implement this inline, as we do for mullwo.  We've
added a tcg_gen_muls2_i64 since this code was written.


r~

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 8/8] target-ppc: Bug Fix: srad
  2014-08-11 19:23 ` [Qemu-devel] [PATCH 8/8] target-ppc: Bug Fix: srad Tom Musta
@ 2014-08-15 20:17   ` Richard Henderson
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Henderson @ 2014-08-15 20:17 UTC (permalink / raw)
  To: Tom Musta, qemu-devel, qemu-ppc; +Cc: agraf

On 08/11/2014 09:23 AM, Tom Musta wrote:
> Fix the check for carry in the srad helper to properly construct
> the mask -- a "1ULL" must be used (instead of "1") in order to
> get the desired result.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
>  target-ppc/int_helper.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 3/8] target-ppc: Bug Fix: rlwimi
  2014-08-15 20:05   ` Richard Henderson
@ 2014-08-18 19:51     ` Tom Musta
  0 siblings, 0 replies; 20+ messages in thread
From: Tom Musta @ 2014-08-18 19:51 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc; +Cc: agraf

On 8/15/2014 3:05 PM, Richard Henderson wrote:
> On 08/11/2014 09:23 AM, Tom Musta wrote:
>> Also fix the special case of MB=31 and ME=0 to copy the entire contents
>> of the source GPR.
> 
> Err, that's not what you did.
> 
>>      if (likely(sh == 0 && mb == 0 && me == 31)) {
>> +#if defined(TARGET_PPC64)
>> +        tcg_gen_mov_i64(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)]);
>> +#else
>>          tcg_gen_ext32u_tl(cpu_gpr[rA(ctx->opcode)], cpu_gpr[rS(ctx->opcode)]);
>> +#endif
> 
> This is the reverse condition.  Which, true enough, should not be implemented
> with ext32u for PPC64.  But a MOV isn't right either, it is
> 
>   deposit(ra, rs, 0, 32)
> 
> Which does point out that we should probably implement anything MB <= ME and SH
> == 31 - ME with the deposit opcode.
> 
> 
> r~
> 

Richard:

Good catch.  I found a bug in my test generator ... rlwimi is unusual in that the
"RA" register is both a source and a target.  A fix is forthcoming.

Thanks also for your other comments.  Unlike this one, I believe they are optimizations.
I will investigate and potentially publish some additional changes.  Alex has already
taken this series into his ppc-next, so the new patches will be relative to these.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2014-08-18 19:51 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-08-11 19:23 [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions Tom Musta
2014-08-11 19:23 ` [Qemu-devel] [PATCH 1/8] target-ppc: Bug Fix: rlwinm Tom Musta
2014-08-15 18:34   ` Richard Henderson
2014-08-15 19:52     ` Richard Henderson
2014-08-11 19:23 ` [Qemu-devel] [PATCH 2/8] target-ppc: Bug Fix: rlwnm Tom Musta
2014-08-15 19:54   ` Richard Henderson
2014-08-11 19:23 ` [Qemu-devel] [PATCH 3/8] target-ppc: Bug Fix: rlwimi Tom Musta
2014-08-15 20:05   ` Richard Henderson
2014-08-18 19:51     ` Tom Musta
2014-08-11 19:23 ` [Qemu-devel] [PATCH 4/8] target-ppc: Bug Fix: mullw Tom Musta
2014-08-15 20:07   ` Richard Henderson
2014-08-11 19:23 ` [Qemu-devel] [PATCH 5/8] target-ppc: Bug Fix: mullwo Tom Musta
2014-08-15 20:11   ` Richard Henderson
2014-08-11 19:23 ` [Qemu-devel] [PATCH 6/8] target-ppc: Bug Fix: mulldo OV Detection Tom Musta
2014-08-15 20:16   ` Richard Henderson
2014-08-11 19:23 ` [Qemu-devel] [PATCH 7/8] target-ppc: Bug Fix: srawi Tom Musta
2014-08-11 19:23 ` [Qemu-devel] [PATCH 8/8] target-ppc: Bug Fix: srad Tom Musta
2014-08-15 20:17   ` Richard Henderson
2014-08-12  3:06 ` [Qemu-devel] [PATCH 0/8] target-ppc: Bug Fixes for 64 Bit FXU Instructions David Gibson
2014-08-12 11:40   ` Tom Musta

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).