qemu-devel.nongnu.org archive mirror
* [PATCH for-9.0 0/3] target/hppa: Fix DCOR, UADDCM conditions
@ 2024-03-25  3:04 Richard Henderson
  2024-03-25  3:04 ` [PATCH 1/3] target/hppa: Fix DCOR reconstruction of carry bits Richard Henderson
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Richard Henderson @ 2024-03-25  3:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller

Two problems, both related to the reconstruction and computation
of carry bits.  Simplify UXOR a bit, since no carry is involved.
While in the area, optimize UADDCM without condition, as that's
the common case for inverting a register.


r~


Richard Henderson (3):
  target/hppa: Fix DCOR reconstruction of carry bits
  target/hppa: Optimize UADDCM with no condition
  target/hppa: Fix unit carry conditions

 target/hppa/translate.c | 240 ++++++++++++++++++++++------------------
 1 file changed, 132 insertions(+), 108 deletions(-)

-- 
2.34.1




* [PATCH 1/3] target/hppa: Fix DCOR reconstruction of carry bits
  2024-03-25  3:04 [PATCH for-9.0 0/3] target/hppa: Fix DCOR, UADDCM conditions Richard Henderson
@ 2024-03-25  3:04 ` Richard Henderson
  2024-03-25  9:48   ` Helge Deller
  2024-03-25  3:04 ` [PATCH 2/3] target/hppa: Optimize UADDCM with no condition Richard Henderson
  2024-03-25  3:04 ` [PATCH 3/3] target/hppa: Fix unit carry conditions Richard Henderson
  2 siblings, 1 reply; 7+ messages in thread
From: Richard Henderson @ 2024-03-25  3:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller

The carry bit for each nibble N is located in bit (N+1)*4,
so the shift by 3 was off by one.  Furthermore, the carry bit
for the most significant nibble is located in bit 64, which
lives in a different storage word.

Use a double-word shift-right to reassemble the carry bits into
a single word and place them all at bit 0 of their respective nibbles.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
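As an illustration of the layout described above, the following C sketch
models the two storage words on the host.  The helper names are
hypothetical and are not QEMU functions.  It assumes that the carry word
holds the carry into each bit position, that the carry out of bit 63 is
kept as 0 or 1 in a separate word, and that extract2() here matches what
a double-word shift-right such as tcg_gen_extract2_i64 computes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: 64 bits taken from the 128-bit pair hi:lo,
 * starting at bit ofs, with 0 < ofs < 64. */
static uint64_t extract2(uint64_t lo, uint64_t hi, unsigned ofs)
{
    return (lo >> ofs) | (hi << (64 - ofs));
}

/* Bit i of the result is the carry into bit i of a + b. */
static uint64_t carry_in_bits(uint64_t a, uint64_t b)
{
    return a ^ b ^ (a + b);
}

/* Carry out of bit 63 of a + b, as 0 or 1. */
static uint64_t carry_out_msb(uint64_t a, uint64_t b)
{
    return (uint64_t)(a + b < a);
}

/* Reference: add nibble by nibble; bit 4*n is the carry out of nibble n. */
static uint64_t nibble_carries_ref(uint64_t a, uint64_t b)
{
    uint64_t out = 0, c = 0;
    for (unsigned n = 0; n < 16; n++) {
        uint64_t s = ((a >> (4 * n)) & 0xf) + ((b >> (4 * n)) & 0xf) + c;
        c = s >> 4;
        out |= c << (4 * n);
    }
    return out;
}

/* Reconstruction after the fix: shift the 128-bit pair right by 4. */
static uint64_t nibble_carries_fixed(uint64_t a, uint64_t b)
{
    uint64_t cb = carry_in_bits(a, b);
    uint64_t cb_msb = carry_out_msb(a, b);
    return extract2(cb, cb_msb, 4) & 0x1111111111111111ull;
}

/* Reconstruction before the fix: single-word shift by 3. */
static uint64_t nibble_carries_old(uint64_t a, uint64_t b)
{
    return (carry_in_bits(a, b) >> 3) & 0x1111111111111111ull;
}
```

With operands 0x8 + 0x8 the shift by 3 misses the carry out of nibble 0,
and with 0x7 + 0x1 it reports a carry that never left the nibble; the
double-word shift agrees with the nibble-by-nibble reference in both
cases, including a carry out of the top nibble.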
 target/hppa/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index e041310207..a3f425d861 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -2791,7 +2791,7 @@ static bool do_dcor(DisasContext *ctx, arg_rr_cf_d *a, bool is_i)
     nullify_over(ctx);
 
     tmp = tcg_temp_new_i64();
-    tcg_gen_shri_i64(tmp, cpu_psw_cb, 3);
+    tcg_gen_extract2_i64(tmp, cpu_psw_cb, cpu_psw_cb_msb, 4);
     if (!is_i) {
         tcg_gen_not_i64(tmp, tmp);
     }
-- 
2.34.1




* [PATCH 2/3] target/hppa: Optimize UADDCM with no condition
  2024-03-25  3:04 [PATCH for-9.0 0/3] target/hppa: Fix DCOR, UADDCM conditions Richard Henderson
  2024-03-25  3:04 ` [PATCH 1/3] target/hppa: Fix DCOR reconstruction of carry bits Richard Henderson
@ 2024-03-25  3:04 ` Richard Henderson
  2024-03-25  9:48   ` Helge Deller
  2024-03-25  3:04 ` [PATCH 3/3] target/hppa: Fix unit carry conditions Richard Henderson
  2 siblings, 1 reply; 7+ messages in thread
From: Richard Henderson @ 2024-03-25  3:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller

UADDCM with r1 as zero is by far the most common usage, as it is
the easiest way to invert a register.  The compiler does occasionally
use the addition step as well, and we can simplify that to avoid a
temp and write directly into the destination.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
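The identity this optimization relies on can be checked on the host with
a small C sketch.  The function names are hypothetical; unsigned 64-bit
arithmetic stands in for the wrap-around behaviour of the generated ops.

```c
#include <assert.h>
#include <stdint.h>

/* Literal form: r1 + ~r2, which needs a temporary to hold ~r2. */
static uint64_t uaddcm_literal(uint64_t r1, uint64_t r2)
{
    uint64_t not_r2 = ~r2;
    return r1 + not_r2;
}

/*
 * Rewritten form: r1 - r2 == r1 + ~r2 + 1 (mod 2^64),
 * hence r1 + ~r2 == r1 - r2 - 1, computed without the temporary.
 */
static uint64_t uaddcm_rewritten(uint64_t r1, uint64_t r2)
{
    return r1 - r2 - 1;
}
```

With r1 equal to zero both forms reduce to ~r2, which is the register
inversion idiom the commit message refers to.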
 target/hppa/translate.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index a3f425d861..3fc3e7754c 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -2763,9 +2763,29 @@ static bool do_uaddcm(DisasContext *ctx, arg_rrr_cf_d *a, bool is_tc)
 {
     TCGv_i64 tcg_r1, tcg_r2, tmp;
 
-    if (a->cf) {
-        nullify_over(ctx);
+    if (a->cf == 0) {
+        tcg_r2 = load_gpr(ctx, a->r2);
+        tmp = dest_gpr(ctx, a->t);
+
+        if (a->r1 == 0) {
+            /* UADDCM r0,src,dst is the common idiom for dst = ~src. */
+            tcg_gen_not_i64(tmp, tcg_r2);
+        } else {
+            /*
+             * Recall that r1 - r2 == r1 + ~r2 + 1.
+             * Thus r1 + ~r2 == r1 - r2 - 1,
+             * which does not require an extra temporary.
+             */
+            tcg_r1 = load_gpr(ctx, a->r1);
+            tcg_gen_sub_i64(tmp, tcg_r1, tcg_r2);
+            tcg_gen_subi_i64(tmp, tmp, 1);
+        }
+        save_gpr(ctx, a->t, tmp);
+        cond_free(&ctx->null_cond);
+        return true;
     }
+
+    nullify_over(ctx);
     tcg_r1 = load_gpr(ctx, a->r1);
     tcg_r2 = load_gpr(ctx, a->r2);
     tmp = tcg_temp_new_i64();
-- 
2.34.1




* [PATCH 3/3] target/hppa: Fix unit carry conditions
  2024-03-25  3:04 [PATCH for-9.0 0/3] target/hppa: Fix DCOR, UADDCM conditions Richard Henderson
  2024-03-25  3:04 ` [PATCH 1/3] target/hppa: Fix DCOR reconstruction of carry bits Richard Henderson
  2024-03-25  3:04 ` [PATCH 2/3] target/hppa: Optimize UADDCM with no condition Richard Henderson
@ 2024-03-25  3:04 ` Richard Henderson
  2024-03-25 10:36   ` Helge Deller
  2 siblings, 1 reply; 7+ messages in thread
From: Richard Henderson @ 2024-03-25  3:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: deller

Reduce do_unit_cond to do_unit_zero_cond, which handles only
conditions versus zero.  These are the only ones that
are legal for UXOR.  Simplify trans_uxor accordingly.

Rename do_unit to do_unit_addsub, since xor has been split.
Properly compute carry-out bits for add and subtract,
mirroring the code in do_add and do_sub.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
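The zero conditions handled by do_unit_zero_cond all use the same
"does any unit equal zero" test with different constants.  A minimal
host-side sketch in C follows; the function names are hypothetical, and
the ones/sgns constants are built the same way as in the patch (low bit
and high bit of each unit, replicated into the upper word when d is set).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if some unit of res is zero.  ones has the lowest bit of every
 * unit set, sgns the highest bit of every unit.  Subtracting ones
 * borrows into the top bit of a unit only if that unit was zero (or a
 * lower unit borrowed), and ~res filters out units whose top bit was
 * already set.
 */
static bool some_unit_zero(uint64_t res, uint64_t ones, uint64_t sgns)
{
    return ((res - ones) & ~res & sgns) != 0;
}

/* SBZ: some byte zero, over 32 bits (d false) or 64 bits (d true). */
static bool some_byte_zero(uint64_t res, bool d)
{
    uint64_t d_repl = d ? 0x0000000100000001ull : 1;
    uint64_t ones = d_repl * 0x01010101u;
    return some_unit_zero(res, ones, ones << 7);
}

/* SHZ: some halfword zero, over 32 bits (d false) or 64 bits (d true). */
static bool some_halfword_zero(uint64_t res, bool d)
{
    uint64_t d_repl = d ? 0x0000000100000001ull : 1;
    uint64_t ones = d_repl * 0x00010001u;
    return some_unit_zero(res, ones, ones << 15);
}
```

When d is false the masks cover only the low 32 bits, so whatever sits
in the upper half of the 64-bit register does not affect the result.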
 target/hppa/translate.c | 214 ++++++++++++++++++++--------------------
 1 file changed, 109 insertions(+), 105 deletions(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 3fc3e7754c..2bf213c938 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -936,98 +936,44 @@ static DisasCond do_sed_cond(DisasContext *ctx, unsigned orig, bool d,
     return do_log_cond(ctx, c * 2 + f, d, res);
 }
 
-/* Similar, but for unit conditions.  */
-
-static DisasCond do_unit_cond(unsigned cf, bool d, TCGv_i64 res,
-                              TCGv_i64 in1, TCGv_i64 in2)
+/* Similar, but for unit zero conditions.  */
+static DisasCond do_unit_zero_cond(unsigned cf, bool d, TCGv_i64 res)
 {
-    DisasCond cond;
-    TCGv_i64 tmp, cb = NULL;
+    TCGv_i64 tmp;
     uint64_t d_repl = d ? 0x0000000100000001ull : 1;
-
-    if (cf & 8) {
-        /* Since we want to test lots of carry-out bits all at once, do not
-         * do our normal thing and compute carry-in of bit B+1 since that
-         * leaves us with carry bits spread across two words.
-         */
-        cb = tcg_temp_new_i64();
-        tmp = tcg_temp_new_i64();
-        tcg_gen_or_i64(cb, in1, in2);
-        tcg_gen_and_i64(tmp, in1, in2);
-        tcg_gen_andc_i64(cb, cb, res);
-        tcg_gen_or_i64(cb, cb, tmp);
-    }
+    uint64_t ones = 0, sgns = 0;
 
     switch (cf >> 1) {
-    case 0: /* never / TR */
-        cond = cond_make_f();
-        break;
-
     case 1: /* SBW / NBW */
         if (d) {
-            tmp = tcg_temp_new_i64();
-            tcg_gen_subi_i64(tmp, res, d_repl * 0x00000001u);
-            tcg_gen_andc_i64(tmp, tmp, res);
-            tcg_gen_andi_i64(tmp, tmp, d_repl * 0x80000000u);
-            cond = cond_make_0(TCG_COND_NE, tmp);
-        } else {
-            /* undefined */
-            cond = cond_make_f();
+            ones = d_repl;
+            sgns = d_repl << 31;
         }
         break;
-
     case 2: /* SBZ / NBZ */
-        /* See hasless(v,1) from
-         * https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord
-         */
-        tmp = tcg_temp_new_i64();
-        tcg_gen_subi_i64(tmp, res, d_repl * 0x01010101u);
-        tcg_gen_andc_i64(tmp, tmp, res);
-        tcg_gen_andi_i64(tmp, tmp, d_repl * 0x80808080u);
-        cond = cond_make_0(TCG_COND_NE, tmp);
+        ones = d_repl * 0x01010101u;
+        sgns = ones << 7;
         break;
-
     case 3: /* SHZ / NHZ */
-        tmp = tcg_temp_new_i64();
-        tcg_gen_subi_i64(tmp, res, d_repl * 0x00010001u);
-        tcg_gen_andc_i64(tmp, tmp, res);
-        tcg_gen_andi_i64(tmp, tmp, d_repl * 0x80008000u);
-        cond = cond_make_0(TCG_COND_NE, tmp);
+        ones = d_repl * 0x00010001u;
+        sgns = ones << 15;
         break;
-
-    case 4: /* SDC / NDC */
-        tcg_gen_andi_i64(cb, cb, d_repl * 0x88888888u);
-        cond = cond_make_0(TCG_COND_NE, cb);
-        break;
-
-    case 5: /* SWC / NWC */
-        if (d) {
-            tcg_gen_andi_i64(cb, cb, d_repl * 0x80000000u);
-            cond = cond_make_0(TCG_COND_NE, cb);
-        } else {
-            /* undefined */
-            cond = cond_make_f();
-        }
-        break;
-
-    case 6: /* SBC / NBC */
-        tcg_gen_andi_i64(cb, cb, d_repl * 0x80808080u);
-        cond = cond_make_0(TCG_COND_NE, cb);
-        break;
-
-    case 7: /* SHC / NHC */
-        tcg_gen_andi_i64(cb, cb, d_repl * 0x80008000u);
-        cond = cond_make_0(TCG_COND_NE, cb);
-        break;
-
-    default:
-        g_assert_not_reached();
     }
-    if (cf & 1) {
-        cond.c = tcg_invert_cond(cond.c);
+    if (ones == 0) {
+        /* Undefined, or 0/1 (never/always). */
+        return cf & 1 ? cond_make_t() : cond_make_f();
     }
 
-    return cond;
+    /*
+     * See hasless(v,1) from
+     * https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord
+     */
+    tmp = tcg_temp_new_i64();
+    tcg_gen_subi_i64(tmp, res, ones);
+    tcg_gen_andc_i64(tmp, tmp, res);
+
+    return cond_make_tmp(cf & 1 ? TCG_COND_TSTEQ : TCG_COND_TSTNE,
+                         tmp, tcg_constant_i64(sgns));
 }
 
 static TCGv_i64 get_carry(DisasContext *ctx, bool d,
@@ -1330,34 +1276,82 @@ static bool do_log_reg(DisasContext *ctx, arg_rrr_cf_d *a,
     return nullify_end(ctx);
 }
 
-static void do_unit(DisasContext *ctx, unsigned rt, TCGv_i64 in1,
-                    TCGv_i64 in2, unsigned cf, bool d, bool is_tc,
-                    void (*fn)(TCGv_i64, TCGv_i64, TCGv_i64))
+static void do_unit_addsub(DisasContext *ctx, unsigned rt, TCGv_i64 in1,
+                           TCGv_i64 in2, unsigned cf, bool d,
+                           bool is_tc, bool is_add)
 {
-    TCGv_i64 dest;
+    TCGv_i64 dest, cb = NULL;
+    uint64_t test_cb = 0;
     DisasCond cond;
 
-    if (cf == 0) {
-        dest = dest_gpr(ctx, rt);
-        fn(dest, in1, in2);
-        save_gpr(ctx, rt, dest);
-        cond_free(&ctx->null_cond);
-    } else {
-        dest = tcg_temp_new_i64();
-        fn(dest, in1, in2);
-
-        cond = do_unit_cond(cf, d, dest, in1, in2);
-
-        if (is_tc) {
-            TCGv_i64 tmp = tcg_temp_new_i64();
-            tcg_gen_setcond_i64(cond.c, tmp, cond.a0, cond.a1);
-            gen_helper_tcond(tcg_env, tmp);
+    /* Select which carry-out bits to test. */
+    switch (cf >> 1) {
+    case 4: /* NDC / SDC -- 4-bit carries */
+        test_cb = 0x8888888888888888ull;
+        break;
+    case 5: /* NWC / SWC -- 32-bit carries */
+        if (d) {
+            test_cb = 0x8000000080000000ull;
+        } else {
+            cf &= 1; /* undefined -- map to never/always */
         }
-        save_gpr(ctx, rt, dest);
-
-        cond_free(&ctx->null_cond);
-        ctx->null_cond = cond;
+        break;
+    case 6: /* NBC / SBC -- 8-bit carries */
+        test_cb = 0x8080808080808080ull;
+        break;
+    case 7: /* NHC / SHC -- 16-bit carries */
+        test_cb = 0x8000800080008000ull;
+        break;
     }
+
+    dest = tcg_temp_new_i64();
+    if (test_cb) {
+        cb = tcg_temp_new_i64();
+        if (d) {
+            TCGv_i64 cb_msb = tcg_temp_new_i64();
+            if (is_add) {
+                tcg_gen_add2_i64(dest, cb_msb, in1, ctx->zero, in2, ctx->zero);
+                tcg_gen_xor_i64(cb, in1, in2);
+            } else {
+                /* See do_sub, !is_b. */
+                TCGv_i64 one = tcg_constant_i64(1);
+                tcg_gen_sub2_i64(dest, cb_msb, in1, one, in2, ctx->zero);
+                tcg_gen_eqv_i64(cb, in1, in2);
+            }
+            tcg_gen_xor_i64(cb, cb, dest);
+            /* For 64-bit tests, put all carry-out bits back in one word. */
+            tcg_gen_extract2_i64(cb, cb, cb_msb, 1);
+        } else {
+            if (is_add) {
+                tcg_gen_add_i64(dest, in1, in2);
+                tcg_gen_xor_i64(cb, in1, in2);
+            } else {
+                tcg_gen_sub_i64(dest, in1, in2);
+                tcg_gen_eqv_i64(cb, in1, in2);
+            }
+            /* For 32-bit tests, test carry-in instead of carry-out. */
+            test_cb = (uint64_t)(uint32_t)test_cb << 1;
+        }
+        cond = cond_make_tmp(cf & 1 ? TCG_COND_TSTEQ : TCG_COND_TSTNE,
+                             cb, tcg_constant_i64(test_cb));
+    } else {
+        if (is_add) {
+            tcg_gen_add_i64(dest, in1, in2);
+        } else {
+            tcg_gen_sub_i64(dest, in1, in2);
+        }
+        cond = do_unit_zero_cond(cf, d, dest);
+    }
+
+    if (is_tc) {
+        TCGv_i64 tmp = tcg_temp_new_i64();
+        tcg_gen_setcond_i64(cond.c, tmp, cond.a0, cond.a1);
+        gen_helper_tcond(tcg_env, tmp);
+    }
+    save_gpr(ctx, rt, dest);
+
+    cond_free(&ctx->null_cond);
+    ctx->null_cond = cond;
 }
 
 #ifndef CONFIG_USER_ONLY
@@ -2748,14 +2742,24 @@ static bool trans_cmpclr(DisasContext *ctx, arg_rrr_cf_d *a)
 
 static bool trans_uxor(DisasContext *ctx, arg_rrr_cf_d *a)
 {
-    TCGv_i64 tcg_r1, tcg_r2;
+    TCGv_i64 tcg_r1, tcg_r2, dest;
 
     if (a->cf) {
         nullify_over(ctx);
     }
+
     tcg_r1 = load_gpr(ctx, a->r1);
     tcg_r2 = load_gpr(ctx, a->r2);
-    do_unit(ctx, a->t, tcg_r1, tcg_r2, a->cf, a->d, false, tcg_gen_xor_i64);
+    dest = dest_gpr(ctx, a->t);
+
+    tcg_gen_xor_i64(dest, tcg_r1, tcg_r2);
+    save_gpr(ctx, a->t, dest);
+
+    cond_free(&ctx->null_cond);
+    if (a->cf) {
+        ctx->null_cond = do_unit_zero_cond(a->cf, a->d, dest);
+    }
+
     return nullify_end(ctx);
 }
 
@@ -2790,7 +2794,7 @@ static bool do_uaddcm(DisasContext *ctx, arg_rrr_cf_d *a, bool is_tc)
     tcg_r2 = load_gpr(ctx, a->r2);
     tmp = tcg_temp_new_i64();
     tcg_gen_not_i64(tmp, tcg_r2);
-    do_unit(ctx, a->t, tcg_r1, tmp, a->cf, a->d, is_tc, tcg_gen_add_i64);
+    do_unit_addsub(ctx, a->t, tcg_r1, tmp, a->cf, a->d, is_tc, true);
     return nullify_end(ctx);
 }
 
@@ -2817,8 +2821,8 @@ static bool do_dcor(DisasContext *ctx, arg_rr_cf_d *a, bool is_i)
     }
     tcg_gen_andi_i64(tmp, tmp, (uint64_t)0x1111111111111111ull);
     tcg_gen_muli_i64(tmp, tmp, 6);
-    do_unit(ctx, a->t, load_gpr(ctx, a->r), tmp, a->cf, a->d, false,
-            is_i ? tcg_gen_add_i64 : tcg_gen_sub_i64);
+    do_unit_addsub(ctx, a->t, load_gpr(ctx, a->r), tmp,
+                   a->cf, a->d, false, is_i);
     return nullify_end(ctx);
 }
 
-- 
2.34.1




* Re: [PATCH 1/3] target/hppa: Fix DCOR reconstruction of carry bits
  2024-03-25  3:04 ` [PATCH 1/3] target/hppa: Fix DCOR reconstruction of carry bits Richard Henderson
@ 2024-03-25  9:48   ` Helge Deller
  0 siblings, 0 replies; 7+ messages in thread
From: Helge Deller @ 2024-03-25  9:48 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/25/24 04:04, Richard Henderson wrote:
> The carry bit for each nibble N is located in bit (N+1)*4,
> so the shift by 3 was off by one.  Furthermore, the carry bit
> for the most significant nibble is located in bit 64, which
> lives in a different storage word.
>
> Use a double-word shift-right to reassemble the carry bits into
> a single word and place them all at bit 0 of their respective nibbles.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>

Helge

> ---
>   target/hppa/translate.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/hppa/translate.c b/target/hppa/translate.c
> index e041310207..a3f425d861 100644
> --- a/target/hppa/translate.c
> +++ b/target/hppa/translate.c
> @@ -2791,7 +2791,7 @@ static bool do_dcor(DisasContext *ctx, arg_rr_cf_d *a, bool is_i)
>       nullify_over(ctx);
>
>       tmp = tcg_temp_new_i64();
> -    tcg_gen_shri_i64(tmp, cpu_psw_cb, 3);
> +    tcg_gen_extract2_i64(tmp, cpu_psw_cb, cpu_psw_cb_msb, 4);
>       if (!is_i) {
>           tcg_gen_not_i64(tmp, tmp);
>       }




* Re: [PATCH 2/3] target/hppa: Optimize UADDCM with no condition
  2024-03-25  3:04 ` [PATCH 2/3] target/hppa: Optimize UADDCM with no condition Richard Henderson
@ 2024-03-25  9:48   ` Helge Deller
  0 siblings, 0 replies; 7+ messages in thread
From: Helge Deller @ 2024-03-25  9:48 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/25/24 04:04, Richard Henderson wrote:
> UADDCM with r1 as zero is by far the most common usage, as it is
> the easiest way to invert a register.  The compiler does occasionally
> use the addition step as well, and we can simplify that to avoid a
> temp and write directly into the destination.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>

Helge


> ---
>   target/hppa/translate.c | 24 ++++++++++++++++++++++--
>   1 file changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/target/hppa/translate.c b/target/hppa/translate.c
> index a3f425d861..3fc3e7754c 100644
> --- a/target/hppa/translate.c
> +++ b/target/hppa/translate.c
> @@ -2763,9 +2763,29 @@ static bool do_uaddcm(DisasContext *ctx, arg_rrr_cf_d *a, bool is_tc)
>   {
>       TCGv_i64 tcg_r1, tcg_r2, tmp;
>
> -    if (a->cf) {
> -        nullify_over(ctx);
> +    if (a->cf == 0) {
> +        tcg_r2 = load_gpr(ctx, a->r2);
> +        tmp = dest_gpr(ctx, a->t);
> +
> +        if (a->r1 == 0) {
> +            /* UADDCM r0,src,dst is the common idiom for dst = ~src. */
> +            tcg_gen_not_i64(tmp, tcg_r2);
> +        } else {
> +            /*
> +             * Recall that r1 - r2 == r1 + ~r2 + 1.
> +             * Thus r1 + ~r2 == r1 - r2 - 1,
> +             * which does not require an extra temporary.
> +             */
> +            tcg_r1 = load_gpr(ctx, a->r1);
> +            tcg_gen_sub_i64(tmp, tcg_r1, tcg_r2);
> +            tcg_gen_subi_i64(tmp, tmp, 1);
> +        }
> +        save_gpr(ctx, a->t, tmp);
> +        cond_free(&ctx->null_cond);
> +        return true;
>       }
> +
> +    nullify_over(ctx);
>       tcg_r1 = load_gpr(ctx, a->r1);
>       tcg_r2 = load_gpr(ctx, a->r2);
>       tmp = tcg_temp_new_i64();




* Re: [PATCH 3/3] target/hppa: Fix unit carry conditions
  2024-03-25  3:04 ` [PATCH 3/3] target/hppa: Fix unit carry conditions Richard Henderson
@ 2024-03-25 10:36   ` Helge Deller
  0 siblings, 0 replies; 7+ messages in thread
From: Helge Deller @ 2024-03-25 10:36 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 3/25/24 04:04, Richard Henderson wrote:
> Reduce do_unit_cond to do_unit_zero_cond, which handles only
> conditions versus zero.  These are the only ones that
> are legal for UXOR.  Simplify trans_uxor accordingly.
>
> Rename do_unit to do_unit_addsub, since xor has been split.
> Properly compute carry-out bits for add and subtract,
> mirroring the code in do_add and do_sub.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

This patch triggers a failure in SECTION 055 (32-bit)
ERROR 0999 IN SECTION 055
UNEXPECTED TRAP# 13
IN:
0x001a2b2c:  uaddcm,tc,shc r13,r14,r15
r13..r15: 55555555 55555555 00000000
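
One possible reading of this failure, which is not confirmed anywhere in
this thread: in the 32-bit path of do_unit_addsub as posted, cb is built
from in1 ^ in2 alone and is never combined with dest, whereas the 64-bit
path does fold dest in.  The C sketch below models that reading on the
host for the operands reported above.  The function names are
hypothetical, the upper 32 bits of the registers are assumed to be zero,
and ,tc is assumed to trap when the condition holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * 32-bit SHC mask as derived in the patch: halfword carry-outs sit at
 * bits 15 and 31, shifted left by one so that the carry into bits 16
 * and 32 is what gets tested.
 */
#define TEST_CB_SHC32 ((uint64_t)(uint32_t)0x8000800080008000ull << 1)

/* UADDCM feeds in1 = r1 and in2 = ~r2 into the adder. */
static bool shc32_as_posted(uint64_t r1, uint64_t r2)
{
    uint64_t in1 = r1, in2 = ~r2;
    uint64_t cb = in1 ^ in2;            /* dest is not folded in */
    return (cb & TEST_CB_SHC32) != 0;
}

static bool shc32_with_dest(uint64_t r1, uint64_t r2)
{
    uint64_t in1 = r1, in2 = ~r2;
    uint64_t dest = in1 + in2;
    uint64_t cb = in1 ^ in2 ^ dest;     /* carry into each bit */
    return (cb & TEST_CB_SHC32) != 0;
}
```

For r13 = r14 = 0x55555555 the sum r13 + ~r14 is all ones with no
halfword carry, so the condition should be false.  The first model
reports it as true, which would match the unexpected trap; the second
reports false.  If this reading holds, folding dest into cb in the
32-bit path would make the reported case pass, but the thread itself
does not state the cause.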



> ---
>   target/hppa/translate.c | 214 ++++++++++++++++++++--------------------
>   1 file changed, 109 insertions(+), 105 deletions(-)
>
> diff --git a/target/hppa/translate.c b/target/hppa/translate.c
> index 3fc3e7754c..2bf213c938 100644
> --- a/target/hppa/translate.c
> +++ b/target/hppa/translate.c
> @@ -936,98 +936,44 @@ static DisasCond do_sed_cond(DisasContext *ctx, unsigned orig, bool d,
>       return do_log_cond(ctx, c * 2 + f, d, res);
>   }
>
> -/* Similar, but for unit conditions.  */
> -
> -static DisasCond do_unit_cond(unsigned cf, bool d, TCGv_i64 res,
> -                              TCGv_i64 in1, TCGv_i64 in2)
> +/* Similar, but for unit zero conditions.  */
> +static DisasCond do_unit_zero_cond(unsigned cf, bool d, TCGv_i64 res)
>   {
> -    DisasCond cond;
> -    TCGv_i64 tmp, cb = NULL;
> +    TCGv_i64 tmp;
>       uint64_t d_repl = d ? 0x0000000100000001ull : 1;
> -
> -    if (cf & 8) {
> -        /* Since we want to test lots of carry-out bits all at once, do not
> -         * do our normal thing and compute carry-in of bit B+1 since that
> -         * leaves us with carry bits spread across two words.
> -         */
> -        cb = tcg_temp_new_i64();
> -        tmp = tcg_temp_new_i64();
> -        tcg_gen_or_i64(cb, in1, in2);
> -        tcg_gen_and_i64(tmp, in1, in2);
> -        tcg_gen_andc_i64(cb, cb, res);
> -        tcg_gen_or_i64(cb, cb, tmp);
> -    }
> +    uint64_t ones = 0, sgns = 0;
>
>       switch (cf >> 1) {
> -    case 0: /* never / TR */
> -        cond = cond_make_f();
> -        break;
> -
>       case 1: /* SBW / NBW */
>           if (d) {
> -            tmp = tcg_temp_new_i64();
> -            tcg_gen_subi_i64(tmp, res, d_repl * 0x00000001u);
> -            tcg_gen_andc_i64(tmp, tmp, res);
> -            tcg_gen_andi_i64(tmp, tmp, d_repl * 0x80000000u);
> -            cond = cond_make_0(TCG_COND_NE, tmp);
> -        } else {
> -            /* undefined */
> -            cond = cond_make_f();
> +            ones = d_repl;
> +            sgns = d_repl << 31;
>           }
>           break;
> -
>       case 2: /* SBZ / NBZ */
> -        /* See hasless(v,1) from
> -         * https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord
> -         */
> -        tmp = tcg_temp_new_i64();
> -        tcg_gen_subi_i64(tmp, res, d_repl * 0x01010101u);
> -        tcg_gen_andc_i64(tmp, tmp, res);
> -        tcg_gen_andi_i64(tmp, tmp, d_repl * 0x80808080u);
> -        cond = cond_make_0(TCG_COND_NE, tmp);
> +        ones = d_repl * 0x01010101u;
> +        sgns = ones << 7;
>           break;
> -
>       case 3: /* SHZ / NHZ */
> -        tmp = tcg_temp_new_i64();
> -        tcg_gen_subi_i64(tmp, res, d_repl * 0x00010001u);
> -        tcg_gen_andc_i64(tmp, tmp, res);
> -        tcg_gen_andi_i64(tmp, tmp, d_repl * 0x80008000u);
> -        cond = cond_make_0(TCG_COND_NE, tmp);
> +        ones = d_repl * 0x00010001u;
> +        sgns = ones << 15;
>           break;
> -
> -    case 4: /* SDC / NDC */
> -        tcg_gen_andi_i64(cb, cb, d_repl * 0x88888888u);
> -        cond = cond_make_0(TCG_COND_NE, cb);
> -        break;
> -
> -    case 5: /* SWC / NWC */
> -        if (d) {
> -            tcg_gen_andi_i64(cb, cb, d_repl * 0x80000000u);
> -            cond = cond_make_0(TCG_COND_NE, cb);
> -        } else {
> -            /* undefined */
> -            cond = cond_make_f();
> -        }
> -        break;
> -
> -    case 6: /* SBC / NBC */
> -        tcg_gen_andi_i64(cb, cb, d_repl * 0x80808080u);
> -        cond = cond_make_0(TCG_COND_NE, cb);
> -        break;
> -
> -    case 7: /* SHC / NHC */
> -        tcg_gen_andi_i64(cb, cb, d_repl * 0x80008000u);
> -        cond = cond_make_0(TCG_COND_NE, cb);
> -        break;
> -
> -    default:
> -        g_assert_not_reached();
>       }
> -    if (cf & 1) {
> -        cond.c = tcg_invert_cond(cond.c);
> +    if (ones == 0) {
> +        /* Undefined, or 0/1 (never/always). */
> +        return cf & 1 ? cond_make_t() : cond_make_f();
>       }
>
> -    return cond;
> +    /*
> +     * See hasless(v,1) from
> +     * https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord
> +     */
> +    tmp = tcg_temp_new_i64();
> +    tcg_gen_subi_i64(tmp, res, ones);
> +    tcg_gen_andc_i64(tmp, tmp, res);
> +
> +    return cond_make_tmp(cf & 1 ? TCG_COND_TSTEQ : TCG_COND_TSTNE,
> +                         tmp, tcg_constant_i64(sgns));
>   }
>
>   static TCGv_i64 get_carry(DisasContext *ctx, bool d,
> @@ -1330,34 +1276,82 @@ static bool do_log_reg(DisasContext *ctx, arg_rrr_cf_d *a,
>       return nullify_end(ctx);
>   }
>
> -static void do_unit(DisasContext *ctx, unsigned rt, TCGv_i64 in1,
> -                    TCGv_i64 in2, unsigned cf, bool d, bool is_tc,
> -                    void (*fn)(TCGv_i64, TCGv_i64, TCGv_i64))
> +static void do_unit_addsub(DisasContext *ctx, unsigned rt, TCGv_i64 in1,
> +                           TCGv_i64 in2, unsigned cf, bool d,
> +                           bool is_tc, bool is_add)
>   {
> -    TCGv_i64 dest;
> +    TCGv_i64 dest, cb = NULL;
> +    uint64_t test_cb = 0;
>       DisasCond cond;
>
> -    if (cf == 0) {
> -        dest = dest_gpr(ctx, rt);
> -        fn(dest, in1, in2);
> -        save_gpr(ctx, rt, dest);
> -        cond_free(&ctx->null_cond);
> -    } else {
> -        dest = tcg_temp_new_i64();
> -        fn(dest, in1, in2);
> -
> -        cond = do_unit_cond(cf, d, dest, in1, in2);
> -
> -        if (is_tc) {
> -            TCGv_i64 tmp = tcg_temp_new_i64();
> -            tcg_gen_setcond_i64(cond.c, tmp, cond.a0, cond.a1);
> -            gen_helper_tcond(tcg_env, tmp);
> +    /* Select which carry-out bits to test. */
> +    switch (cf >> 1) {
> +    case 4: /* NDC / SDC -- 4-bit carries */
> +        test_cb = 0x8888888888888888ull;
> +        break;
> +    case 5: /* NWC / SWC -- 32-bit carries */
> +        if (d) {
> +            test_cb = 0x8000000080000000ull;
> +        } else {
> +            cf &= 1; /* undefined -- map to never/always */
>           }
> -        save_gpr(ctx, rt, dest);
> -
> -        cond_free(&ctx->null_cond);
> -        ctx->null_cond = cond;
> +        break;
> +    case 6: /* NBC / SBC -- 8-bit carries */
> +        test_cb = 0x8080808080808080ull;
> +        break;
> +    case 7: /* NHC / SHC -- 16-bit carries */
> +        test_cb = 0x8000800080008000ull;
> +        break;
>       }
> +
> +    dest = tcg_temp_new_i64();
> +    if (test_cb) {
> +        cb = tcg_temp_new_i64();
> +        if (d) {
> +            TCGv_i64 cb_msb = tcg_temp_new_i64();
> +            if (is_add) {
> +                tcg_gen_add2_i64(dest, cb_msb, in1, ctx->zero, in2, ctx->zero);
> +                tcg_gen_xor_i64(cb, in1, in2);
> +            } else {
> +                /* See do_sub, !is_b. */
> +                TCGv_i64 one = tcg_constant_i64(1);
> +                tcg_gen_sub2_i64(dest, cb_msb, in1, one, in2, ctx->zero);
> +                tcg_gen_eqv_i64(cb, in1, in2);
> +            }
> +            tcg_gen_xor_i64(cb, cb, dest);
> +            /* For 64-bit tests, put all carry-out bits back in one word. */
> +            tcg_gen_extract2_i64(cb, cb, cb_msb, 1);
> +        } else {
> +            if (is_add) {
> +                tcg_gen_add_i64(dest, in1, in2);
> +                tcg_gen_xor_i64(cb, in1, in2);
> +            } else {
> +                tcg_gen_sub_i64(dest, in1, in2);
> +                tcg_gen_eqv_i64(cb, in1, in2);
> +            }
> +            /* For 32-bit tests, test carry-in instead of carry-out. */
> +            test_cb = (uint64_t)(uint32_t)test_cb << 1;
> +        }
> +        cond = cond_make_tmp(cf & 1 ? TCG_COND_TSTEQ : TCG_COND_TSTNE,
> +                             cb, tcg_constant_i64(test_cb));
> +    } else {
> +        if (is_add) {
> +            tcg_gen_add_i64(dest, in1, in2);
> +        } else {
> +            tcg_gen_sub_i64(dest, in1, in2);
> +        }
> +        cond = do_unit_zero_cond(cf, d, dest);
> +    }
> +
> +    if (is_tc) {
> +        TCGv_i64 tmp = tcg_temp_new_i64();
> +        tcg_gen_setcond_i64(cond.c, tmp, cond.a0, cond.a1);
> +        gen_helper_tcond(tcg_env, tmp);
> +    }
> +    save_gpr(ctx, rt, dest);
> +
> +    cond_free(&ctx->null_cond);
> +    ctx->null_cond = cond;
>   }
>
>   #ifndef CONFIG_USER_ONLY
> @@ -2748,14 +2742,24 @@ static bool trans_cmpclr(DisasContext *ctx, arg_rrr_cf_d *a)
>
>   static bool trans_uxor(DisasContext *ctx, arg_rrr_cf_d *a)
>   {
> -    TCGv_i64 tcg_r1, tcg_r2;
> +    TCGv_i64 tcg_r1, tcg_r2, dest;
>
>       if (a->cf) {
>           nullify_over(ctx);
>       }
> +
>       tcg_r1 = load_gpr(ctx, a->r1);
>       tcg_r2 = load_gpr(ctx, a->r2);
> -    do_unit(ctx, a->t, tcg_r1, tcg_r2, a->cf, a->d, false, tcg_gen_xor_i64);
> +    dest = dest_gpr(ctx, a->t);
> +
> +    tcg_gen_xor_i64(dest, tcg_r1, tcg_r2);
> +    save_gpr(ctx, a->t, dest);
> +
> +    cond_free(&ctx->null_cond);
> +    if (a->cf) {
> +        ctx->null_cond = do_unit_zero_cond(a->cf, a->d, dest);
> +    }
> +
>       return nullify_end(ctx);
>   }
>
> @@ -2790,7 +2794,7 @@ static bool do_uaddcm(DisasContext *ctx, arg_rrr_cf_d *a, bool is_tc)
>       tcg_r2 = load_gpr(ctx, a->r2);
>       tmp = tcg_temp_new_i64();
>       tcg_gen_not_i64(tmp, tcg_r2);
> -    do_unit(ctx, a->t, tcg_r1, tmp, a->cf, a->d, is_tc, tcg_gen_add_i64);
> +    do_unit_addsub(ctx, a->t, tcg_r1, tmp, a->cf, a->d, is_tc, true);
>       return nullify_end(ctx);
>   }
>
> @@ -2817,8 +2821,8 @@ static bool do_dcor(DisasContext *ctx, arg_rr_cf_d *a, bool is_i)
>       }
>       tcg_gen_andi_i64(tmp, tmp, (uint64_t)0x1111111111111111ull);
>       tcg_gen_muli_i64(tmp, tmp, 6);
> -    do_unit(ctx, a->t, load_gpr(ctx, a->r), tmp, a->cf, a->d, false,
> -            is_i ? tcg_gen_add_i64 : tcg_gen_sub_i64);
> +    do_unit_addsub(ctx, a->t, load_gpr(ctx, a->r), tmp,
> +                   a->cf, a->d, false, is_i);
>       return nullify_end(ctx);
>   }
>



