qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] tcg: Canonicalize SUBI to ANDI
@ 2023-10-26  1:39 Richard Henderson
  2023-10-26  1:39 ` [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation Richard Henderson
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Richard Henderson @ 2023-10-26  1:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini

Supercedes: 20231025185956.72677-1-pbonzini@redhat.com
("tcg: allow a target to request canonicalization of SUBI to ADDI")

Based-on: 20231025072707.833943-1-richard.henderson@linaro.org
("tcg: Introduce TCG_COND_TST{EQ,NE}")
There a couple of optimize.c routines introduced which I use here.

The final NOTFORMERGE patch suggests that I've caught all of the
cases, but I certainly wouldn't want to merge the assert without
a lot more testing.


r~


Richard Henderson (4):
  tcg: Canonicalize subi to addi during opcode generation
  tcg/optimize: Canonicalize subi to addi during optimization
  tcg/optimize: Canonicalize sub2 with constants to add2
  NOTFORMERGE tcg/i386: Assert sub of immediate has been folded

 tcg/optimize.c            | 35 +++++++++++++++++++++++++---
 tcg/tcg-op.c              | 25 +++++++-------------
 tcg/tcg.c                 | 49 ++++++++++++++++++++++++++-------------
 tcg/i386/tcg-target.c.inc | 13 ++++++++---
 4 files changed, 84 insertions(+), 38 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation
  2023-10-26  1:39 [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
@ 2023-10-26  1:39 ` Richard Henderson
  2023-11-06 14:37   ` Philippe Mathieu-Daudé
  2023-10-26  1:39 ` [PATCH 2/4] tcg/optimize: Canonicalize subi to addi during optimization Richard Henderson
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Richard Henderson @ 2023-10-26  1:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg-op.c | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 59deb3cbbb..3fce9bd8f4 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -147,12 +147,7 @@ void tcg_gen_subfi_i32(TCGv_i32 ret, int32_t arg1, TCGv_i32 arg2)
 
 void tcg_gen_subi_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
 {
-    /* some cases can be optimized here */
-    if (arg2 == 0) {
-        tcg_gen_mov_i32(ret, arg1);
-    } else {
-        tcg_gen_sub_i32(ret, arg1, tcg_constant_i32(arg2));
-    }
+    tcg_gen_addi_i32(ret, arg1, -arg2);
 }
 
 void tcg_gen_andi_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
@@ -1258,6 +1253,13 @@ void tcg_gen_add_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 
 void tcg_gen_sub_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
+    TCGTemp *ts2 = tcgv_i64_temp(arg2);
+
+    /* Canonicalize subi into addi. */
+    if (ts2->kind == TEMP_CONST) {
+        tcg_gen_addi_i64(ret, arg1, -ts2->val);
+        return;
+    }
     tcg_gen_sub2_i32(TCGV_LOW(ret), TCGV_HIGH(ret), TCGV_LOW(arg1),
                      TCGV_HIGH(arg1), TCGV_LOW(arg2), TCGV_HIGH(arg2));
 }
@@ -1354,16 +1356,7 @@ void tcg_gen_subfi_i64(TCGv_i64 ret, int64_t arg1, TCGv_i64 arg2)
 
 void tcg_gen_subi_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
 {
-    /* some cases can be optimized here */
-    if (arg2 == 0) {
-        tcg_gen_mov_i64(ret, arg1);
-    } else if (TCG_TARGET_REG_BITS == 64) {
-        tcg_gen_sub_i64(ret, arg1, tcg_constant_i64(arg2));
-    } else {
-        tcg_gen_sub2_i32(TCGV_LOW(ret), TCGV_HIGH(ret),
-                         TCGV_LOW(arg1), TCGV_HIGH(arg1),
-                         tcg_constant_i32(arg2), tcg_constant_i32(arg2 >> 32));
-    }
+    tcg_gen_addi_i64(ret, arg1, -arg2);
 }
 
 void tcg_gen_andi_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/4] tcg/optimize: Canonicalize subi to addi during optimization
  2023-10-26  1:39 [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
  2023-10-26  1:39 ` [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation Richard Henderson
@ 2023-10-26  1:39 ` Richard Henderson
  2023-11-06 14:42   ` Philippe Mathieu-Daudé
  2023-10-26  1:39 ` [PATCH 3/4] tcg/optimize: Canonicalize sub2 with constants to add2 Richard Henderson
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Richard Henderson @ 2023-10-26  1:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/optimize.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 5e16800cfa..9d91c25f68 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -2237,7 +2237,19 @@ static bool fold_sub_vec(OptContext *ctx, TCGOp *op)
 
 static bool fold_sub(OptContext *ctx, TCGOp *op)
 {
-    return fold_const2(ctx, op) || fold_sub_vec(ctx, op);
+    if (fold_const2(ctx, op) || fold_sub_vec(ctx, op)) {
+        return true;
+    }
+
+    /* Fold sub r,x,i to add r,x,-i */
+    if (arg_is_const(op->args[2])) {
+        uint64_t val = arg_info(op->args[2])->val;
+
+        op->opc = (ctx->type == TCG_TYPE_I32
+                   ? INDEX_op_add_i32 : INDEX_op_add_i64);
+        op->args[2] = arg_new_constant(ctx, -val);
+    }
+    return false;
 }
 
 static bool fold_sub2(OptContext *ctx, TCGOp *op)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/4] tcg/optimize: Canonicalize sub2 with constants to add2
  2023-10-26  1:39 [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
  2023-10-26  1:39 ` [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation Richard Henderson
  2023-10-26  1:39 ` [PATCH 2/4] tcg/optimize: Canonicalize subi to addi during optimization Richard Henderson
@ 2023-10-26  1:39 ` Richard Henderson
  2023-11-06 14:43   ` Philippe Mathieu-Daudé
  2023-10-26  1:39 ` [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded Richard Henderson
  2023-10-31 22:20 ` [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
  4 siblings, 1 reply; 11+ messages in thread
From: Richard Henderson @ 2023-10-26  1:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/optimize.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 9d91c25f68..280e6089a6 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -1030,8 +1030,10 @@ static bool fold_add_vec(OptContext *ctx, TCGOp *op)
 
 static bool fold_addsub2(OptContext *ctx, TCGOp *op, bool add)
 {
-    if (arg_is_const(op->args[2]) && arg_is_const(op->args[3]) &&
-        arg_is_const(op->args[4]) && arg_is_const(op->args[5])) {
+    bool a_const = arg_is_const(op->args[2]) && arg_is_const(op->args[3]);
+    bool b_const = arg_is_const(op->args[4]) && arg_is_const(op->args[5]);
+
+    if (a_const && b_const) {
         uint64_t al = arg_info(op->args[2])->val;
         uint64_t ah = arg_info(op->args[3])->val;
         uint64_t bl = arg_info(op->args[4])->val;
@@ -1075,6 +1077,21 @@ static bool fold_addsub2(OptContext *ctx, TCGOp *op, bool add)
         tcg_opt_gen_movi(ctx, op2, rh, ah);
         return true;
     }
+
+    /* Fold sub2 r,x,i to add2 r,x,-i */
+    if (!add && b_const) {
+        uint64_t bl = arg_info(op->args[4])->val;
+        uint64_t bh = arg_info(op->args[5])->val;
+
+        /* Negate the two parts without assembling and disassembling. */
+        bl = -bl;
+        bh = ~bh + !bl;
+
+        op->opc = (ctx->type == TCG_TYPE_I32
+                   ? INDEX_op_add2_i32 : INDEX_op_add2_i64);
+        op->args[4] = arg_new_constant(ctx, bl);
+        op->args[5] = arg_new_constant(ctx, bh);
+    }
     return false;
 }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded
  2023-10-26  1:39 [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
                   ` (2 preceding siblings ...)
  2023-10-26  1:39 ` [PATCH 3/4] tcg/optimize: Canonicalize sub2 with constants to add2 Richard Henderson
@ 2023-10-26  1:39 ` Richard Henderson
  2023-10-26  1:53   ` Richard Henderson
  2023-11-06 14:49   ` Philippe Mathieu-Daudé
  2023-10-31 22:20 ` [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
  4 siblings, 2 replies; 11+ messages in thread
From: Richard Henderson @ 2023-10-26  1:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini

A release build should simply accept and emit the subtract.
I'm not even sure if this is reasonable to keep for debug.

Not-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg.c                 | 49 ++++++++++++++++++++++++++-------------
 tcg/i386/tcg-target.c.inc | 13 ++++++++---
 2 files changed, 43 insertions(+), 19 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index a507c111cf..408647af7e 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -3618,23 +3618,40 @@ liveness_pass_1(TCGContext *s)
         do_addsub2:
             nb_iargs = 4;
             nb_oargs = 2;
-            /* Test if the high part of the operation is dead, but not
-               the low part.  The result can be optimized to a simple
-               add or sub.  This happens often for x86_64 guest when the
-               cpu mode is set to 32 bit.  */
-            if (arg_temp(op->args[1])->state == TS_DEAD) {
-                if (arg_temp(op->args[0])->state == TS_DEAD) {
-                    goto do_remove;
-                }
-                /* Replace the opcode and adjust the args in place,
-                   leaving 3 unused args at the end.  */
-                op->opc = opc = opc_new;
-                op->args[1] = op->args[2];
-                op->args[2] = op->args[4];
-                /* Fall through and mark the single-word operation live.  */
-                nb_iargs = 2;
-                nb_oargs = 1;
+            /*
+             * Test if the high part of the operation is dead, but the low
+             * part is still live.  The result can be optimized to a simple
+             * add or sub.
+             */
+            if (arg_temp(op->args[1])->state != TS_DEAD) {
+                goto do_not_remove;
             }
+            if (arg_temp(op->args[0])->state == TS_DEAD) {
+                goto do_remove;
+            }
+            /*
+             * Replace the opcode and adjust the args in place, leaving 3
+             * unused args at the end.  Canonicalize subi to andi.
+             */
+            op->args[1] = op->args[2];
+            {
+                TCGTemp *src2 = arg_temp(op->args[4]);
+                if (src2->kind == TEMP_CONST) {
+                    if (opc_new == INDEX_op_sub_i32) {
+                        src2 = tcg_constant_internal(TCG_TYPE_I32,
+                                                     (int32_t)-src2->val);
+                        opc_new = INDEX_op_add_i32;
+                    } else if (opc_new == INDEX_op_sub_i64) {
+                        src2 = tcg_constant_internal(TCG_TYPE_I64, -src2->val);
+                        opc_new = INDEX_op_add_i64;
+                    }
+                }
+                op->args[2] = temp_arg(src2);
+            }
+            op->opc = opc = opc_new;
+            /* Mark the single-word operation live.  */
+            nb_iargs = 2;
+            nb_oargs = 1;
             goto do_not_remove;
 
         case INDEX_op_mulu2_i32:
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 0d97864174..39d03fa698 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -2544,8 +2544,14 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         c = ARITH_ADD;
         goto gen_arith;
     OP_32_64(sub):
-        c = ARITH_SUB;
-        goto gen_arith;
+        /*
+         * Should have canonicalized all constants to add.
+         * Keep the constraint allowing any constant so that we catch this
+         * case without register allocation loading the constant first.
+         */
+        tcg_debug_assert(!const_a2);
+        tgen_arithr(s, ARITH_SUB + rexw, a0, a2);
+        break;
     OP_32_64(and):
         c = ARITH_AND;
         goto gen_arith;
@@ -3325,7 +3331,6 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
         return C_O1_I2(r, r, re);
 
     case INDEX_op_sub_i32:
-    case INDEX_op_sub_i64:
     case INDEX_op_mul_i32:
     case INDEX_op_mul_i64:
     case INDEX_op_or_i32:
@@ -3333,6 +3338,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_xor_i32:
     case INDEX_op_xor_i64:
         return C_O1_I2(r, 0, re);
+    case INDEX_op_sub_i64:
+        return C_O1_I2(r, 0, ri);
 
     case INDEX_op_and_i32:
     case INDEX_op_and_i64:
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded
  2023-10-26  1:39 ` [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded Richard Henderson
@ 2023-10-26  1:53   ` Richard Henderson
  2023-11-06 14:49   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 11+ messages in thread
From: Richard Henderson @ 2023-10-26  1:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini

On 10/25/23 18:39, Richard Henderson wrote:
> A release build should simply accept and emit the subtract.
> I'm not even sure if this is reasonable to keep for debug.
> 
> Not-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/tcg.c                 | 49 ++++++++++++++++++++++++++-------------
>   tcg/i386/tcg-target.c.inc | 13 ++++++++---
>   2 files changed, 43 insertions(+), 19 deletions(-)

Oops, accidental merge of two patches.
The tcg.c change *is* required.


r~

> 
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index a507c111cf..408647af7e 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -3618,23 +3618,40 @@ liveness_pass_1(TCGContext *s)
>           do_addsub2:
>               nb_iargs = 4;
>               nb_oargs = 2;
> -            /* Test if the high part of the operation is dead, but not
> -               the low part.  The result can be optimized to a simple
> -               add or sub.  This happens often for x86_64 guest when the
> -               cpu mode is set to 32 bit.  */
> -            if (arg_temp(op->args[1])->state == TS_DEAD) {
> -                if (arg_temp(op->args[0])->state == TS_DEAD) {
> -                    goto do_remove;
> -                }
> -                /* Replace the opcode and adjust the args in place,
> -                   leaving 3 unused args at the end.  */
> -                op->opc = opc = opc_new;
> -                op->args[1] = op->args[2];
> -                op->args[2] = op->args[4];
> -                /* Fall through and mark the single-word operation live.  */
> -                nb_iargs = 2;
> -                nb_oargs = 1;
> +            /*
> +             * Test if the high part of the operation is dead, but the low
> +             * part is still live.  The result can be optimized to a simple
> +             * add or sub.
> +             */
> +            if (arg_temp(op->args[1])->state != TS_DEAD) {
> +                goto do_not_remove;
>               }
> +            if (arg_temp(op->args[0])->state == TS_DEAD) {
> +                goto do_remove;
> +            }
> +            /*
> +             * Replace the opcode and adjust the args in place, leaving 3
> +             * unused args at the end.  Canonicalize subi to andi.
> +             */
> +            op->args[1] = op->args[2];
> +            {
> +                TCGTemp *src2 = arg_temp(op->args[4]);
> +                if (src2->kind == TEMP_CONST) {
> +                    if (opc_new == INDEX_op_sub_i32) {
> +                        src2 = tcg_constant_internal(TCG_TYPE_I32,
> +                                                     (int32_t)-src2->val);
> +                        opc_new = INDEX_op_add_i32;
> +                    } else if (opc_new == INDEX_op_sub_i64) {
> +                        src2 = tcg_constant_internal(TCG_TYPE_I64, -src2->val);
> +                        opc_new = INDEX_op_add_i64;
> +                    }
> +                }
> +                op->args[2] = temp_arg(src2);
> +            }
> +            op->opc = opc = opc_new;
> +            /* Mark the single-word operation live.  */
> +            nb_iargs = 2;
> +            nb_oargs = 1;
>               goto do_not_remove;
>   
>           case INDEX_op_mulu2_i32:
> diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
> index 0d97864174..39d03fa698 100644
> --- a/tcg/i386/tcg-target.c.inc
> +++ b/tcg/i386/tcg-target.c.inc
> @@ -2544,8 +2544,14 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>           c = ARITH_ADD;
>           goto gen_arith;
>       OP_32_64(sub):
> -        c = ARITH_SUB;
> -        goto gen_arith;
> +        /*
> +         * Should have canonicalized all constants to add.
> +         * Keep the constraint allowing any constant so that we catch this
> +         * case without register allocation loading the constant first.
> +         */
> +        tcg_debug_assert(!const_a2);
> +        tgen_arithr(s, ARITH_SUB + rexw, a0, a2);
> +        break;
>       OP_32_64(and):
>           c = ARITH_AND;
>           goto gen_arith;
> @@ -3325,7 +3331,6 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
>           return C_O1_I2(r, r, re);
>   
>       case INDEX_op_sub_i32:
> -    case INDEX_op_sub_i64:
>       case INDEX_op_mul_i32:
>       case INDEX_op_mul_i64:
>       case INDEX_op_or_i32:
> @@ -3333,6 +3338,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
>       case INDEX_op_xor_i32:
>       case INDEX_op_xor_i64:
>           return C_O1_I2(r, 0, re);
> +    case INDEX_op_sub_i64:
> +        return C_O1_I2(r, 0, ri);
>   
>       case INDEX_op_and_i32:
>       case INDEX_op_and_i64:



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/4] tcg: Canonicalize SUBI to ANDI
  2023-10-26  1:39 [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
                   ` (3 preceding siblings ...)
  2023-10-26  1:39 ` [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded Richard Henderson
@ 2023-10-31 22:20 ` Richard Henderson
  4 siblings, 0 replies; 11+ messages in thread
From: Richard Henderson @ 2023-10-31 22:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini

Ping.

On 10/25/23 18:39, Richard Henderson wrote:
> Supercedes: 20231025185956.72677-1-pbonzini@redhat.com
> ("tcg: allow a target to request canonicalization of SUBI to ADDI")
> 
> Based-on: 20231025072707.833943-1-richard.henderson@linaro.org
> ("tcg: Introduce TCG_COND_TST{EQ,NE}")
> There a couple of optimize.c routines introduced which I use here.
> 
> The final NOTFORMERGE patch suggests that I've caught all of the
> cases, but I certainly wouldn't want to merge the assert without
> a lot more testing.
> 
> 
> r~
> 
> 
> Richard Henderson (4):
>    tcg: Canonicalize subi to addi during opcode generation
>    tcg/optimize: Canonicalize subi to addi during optimization
>    tcg/optimize: Canonicalize sub2 with constants to add2
>    NOTFORMERGE tcg/i386: Assert sub of immediate has been folded
> 
>   tcg/optimize.c            | 35 +++++++++++++++++++++++++---
>   tcg/tcg-op.c              | 25 +++++++-------------
>   tcg/tcg.c                 | 49 ++++++++++++++++++++++++++-------------
>   tcg/i386/tcg-target.c.inc | 13 ++++++++---
>   4 files changed, 84 insertions(+), 38 deletions(-)
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation
  2023-10-26  1:39 ` [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation Richard Henderson
@ 2023-11-06 14:37   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-11-06 14:37 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: pbonzini

On 26/10/23 03:39, Richard Henderson wrote:
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/tcg-op.c | 25 +++++++++----------------
>   1 file changed, 9 insertions(+), 16 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/4] tcg/optimize: Canonicalize subi to addi during optimization
  2023-10-26  1:39 ` [PATCH 2/4] tcg/optimize: Canonicalize subi to addi during optimization Richard Henderson
@ 2023-11-06 14:42   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-11-06 14:42 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: pbonzini

On 26/10/23 03:39, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/optimize.c | 14 +++++++++++++-
>   1 file changed, 13 insertions(+), 1 deletion(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/4] tcg/optimize: Canonicalize sub2 with constants to add2
  2023-10-26  1:39 ` [PATCH 3/4] tcg/optimize: Canonicalize sub2 with constants to add2 Richard Henderson
@ 2023-11-06 14:43   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-11-06 14:43 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: pbonzini

On 26/10/23 03:39, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/optimize.c | 21 +++++++++++++++++++--
>   1 file changed, 19 insertions(+), 2 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded
  2023-10-26  1:39 ` [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded Richard Henderson
  2023-10-26  1:53   ` Richard Henderson
@ 2023-11-06 14:49   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-11-06 14:49 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: pbonzini

On 26/10/23 03:39, Richard Henderson wrote:
> A release build should simply accept and emit the subtract.
> I'm not even sure if this is reasonable to keep for debug.
> 
> Not-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/tcg.c                 | 49 ++++++++++++++++++++++++++-------------
>   tcg/i386/tcg-target.c.inc | 13 ++++++++---
>   2 files changed, 43 insertions(+), 19 deletions(-)
> 
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index a507c111cf..408647af7e 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -3618,23 +3618,40 @@ liveness_pass_1(TCGContext *s)
>           do_addsub2:
>               nb_iargs = 4;
>               nb_oargs = 2;
> -            /* Test if the high part of the operation is dead, but not
> -               the low part.  The result can be optimized to a simple
> -               add or sub.  This happens often for x86_64 guest when the
> -               cpu mode is set to 32 bit.  */
> -            if (arg_temp(op->args[1])->state == TS_DEAD) {
> -                if (arg_temp(op->args[0])->state == TS_DEAD) {
> -                    goto do_remove;
> -                }
> -                /* Replace the opcode and adjust the args in place,
> -                   leaving 3 unused args at the end.  */
> -                op->opc = opc = opc_new;
> -                op->args[1] = op->args[2];
> -                op->args[2] = op->args[4];
> -                /* Fall through and mark the single-word operation live.  */
> -                nb_iargs = 2;
> -                nb_oargs = 1;
> +            /*
> +             * Test if the high part of the operation is dead, but the low
> +             * part is still live.  The result can be optimized to a simple
> +             * add or sub.
> +             */
> +            if (arg_temp(op->args[1])->state != TS_DEAD) {
> +                goto do_not_remove;
>               }
> +            if (arg_temp(op->args[0])->state == TS_DEAD) {
> +                goto do_remove;
> +            }
> +            /*
> +             * Replace the opcode and adjust the args in place, leaving 3
> +             * unused args at the end.  Canonicalize subi to andi.

Typo s/andi/addi/.

> +             */
> +            op->args[1] = op->args[2];
> +            {
> +                TCGTemp *src2 = arg_temp(op->args[4]);
> +                if (src2->kind == TEMP_CONST) {
> +                    if (opc_new == INDEX_op_sub_i32) {
> +                        src2 = tcg_constant_internal(TCG_TYPE_I32,
> +                                                     (int32_t)-src2->val);
> +                        opc_new = INDEX_op_add_i32;
> +                    } else if (opc_new == INDEX_op_sub_i64) {
> +                        src2 = tcg_constant_internal(TCG_TYPE_I64, -src2->val);
> +                        opc_new = INDEX_op_add_i64;
> +                    }
> +                }
> +                op->args[2] = temp_arg(src2);
> +            }
> +            op->opc = opc = opc_new;
> +            /* Mark the single-word operation live.  */
> +            nb_iargs = 2;
> +            nb_oargs = 1;
>               goto do_not_remove;

For tcg/tcg.c with your S-o-b and a reworded description:

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-11-06 14:50 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-26  1:39 [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
2023-10-26  1:39 ` [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation Richard Henderson
2023-11-06 14:37   ` Philippe Mathieu-Daudé
2023-10-26  1:39 ` [PATCH 2/4] tcg/optimize: Canonicalize subi to addi during optimization Richard Henderson
2023-11-06 14:42   ` Philippe Mathieu-Daudé
2023-10-26  1:39 ` [PATCH 3/4] tcg/optimize: Canonicalize sub2 with constants to add2 Richard Henderson
2023-11-06 14:43   ` Philippe Mathieu-Daudé
2023-10-26  1:39 ` [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded Richard Henderson
2023-10-26  1:53   ` Richard Henderson
2023-11-06 14:49   ` Philippe Mathieu-Daudé
2023-10-31 22:20 ` [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).