* [PATCH 0/4] tcg: Canonicalize SUBI to ANDI
@ 2023-10-26 1:39 Richard Henderson
2023-10-26 1:39 ` [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation Richard Henderson
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: Richard Henderson @ 2023-10-26 1:39 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini
Supercedes: 20231025185956.72677-1-pbonzini@redhat.com
("tcg: allow a target to request canonicalization of SUBI to ADDI")
Based-on: 20231025072707.833943-1-richard.henderson@linaro.org
("tcg: Introduce TCG_COND_TST{EQ,NE}")
There a couple of optimize.c routines introduced which I use here.
The final NOTFORMERGE patch suggests that I've caught all of the
cases, but I certainly wouldn't want to merge the assert without
a lot more testing.
r~
Richard Henderson (4):
tcg: Canonicalize subi to addi during opcode generation
tcg/optimize: Canonicalize subi to addi during optimization
tcg/optimize: Canonicalize sub2 with constants to add2
NOTFORMERGE tcg/i386: Assert sub of immediate has been folded
tcg/optimize.c | 35 +++++++++++++++++++++++++---
tcg/tcg-op.c | 25 +++++++-------------
tcg/tcg.c | 49 ++++++++++++++++++++++++++-------------
tcg/i386/tcg-target.c.inc | 13 ++++++++---
4 files changed, 84 insertions(+), 38 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation
2023-10-26 1:39 [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
@ 2023-10-26 1:39 ` Richard Henderson
2023-11-06 14:37 ` Philippe Mathieu-Daudé
2023-10-26 1:39 ` [PATCH 2/4] tcg/optimize: Canonicalize subi to addi during optimization Richard Henderson
` (3 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Richard Henderson @ 2023-10-26 1:39 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg-op.c | 25 +++++++++----------------
1 file changed, 9 insertions(+), 16 deletions(-)
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 59deb3cbbb..3fce9bd8f4 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -147,12 +147,7 @@ void tcg_gen_subfi_i32(TCGv_i32 ret, int32_t arg1, TCGv_i32 arg2)
void tcg_gen_subi_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
{
- /* some cases can be optimized here */
- if (arg2 == 0) {
- tcg_gen_mov_i32(ret, arg1);
- } else {
- tcg_gen_sub_i32(ret, arg1, tcg_constant_i32(arg2));
- }
+ tcg_gen_addi_i32(ret, arg1, -arg2);
}
void tcg_gen_andi_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
@@ -1258,6 +1253,13 @@ void tcg_gen_add_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
void tcg_gen_sub_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
{
+ TCGTemp *ts2 = tcgv_i64_temp(arg2);
+
+ /* Canonicalize subi into addi. */
+ if (ts2->kind == TEMP_CONST) {
+ tcg_gen_addi_i64(ret, arg1, -ts2->val);
+ return;
+ }
tcg_gen_sub2_i32(TCGV_LOW(ret), TCGV_HIGH(ret), TCGV_LOW(arg1),
TCGV_HIGH(arg1), TCGV_LOW(arg2), TCGV_HIGH(arg2));
}
@@ -1354,16 +1356,7 @@ void tcg_gen_subfi_i64(TCGv_i64 ret, int64_t arg1, TCGv_i64 arg2)
void tcg_gen_subi_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
{
- /* some cases can be optimized here */
- if (arg2 == 0) {
- tcg_gen_mov_i64(ret, arg1);
- } else if (TCG_TARGET_REG_BITS == 64) {
- tcg_gen_sub_i64(ret, arg1, tcg_constant_i64(arg2));
- } else {
- tcg_gen_sub2_i32(TCGV_LOW(ret), TCGV_HIGH(ret),
- TCGV_LOW(arg1), TCGV_HIGH(arg1),
- tcg_constant_i32(arg2), tcg_constant_i32(arg2 >> 32));
- }
+ tcg_gen_addi_i64(ret, arg1, -arg2);
}
void tcg_gen_andi_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
--
2.34.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/4] tcg/optimize: Canonicalize subi to addi during optimization
2023-10-26 1:39 [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
2023-10-26 1:39 ` [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation Richard Henderson
@ 2023-10-26 1:39 ` Richard Henderson
2023-11-06 14:42 ` Philippe Mathieu-Daudé
2023-10-26 1:39 ` [PATCH 3/4] tcg/optimize: Canonicalize sub2 with constants to add2 Richard Henderson
` (2 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Richard Henderson @ 2023-10-26 1:39 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/optimize.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 5e16800cfa..9d91c25f68 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -2237,7 +2237,19 @@ static bool fold_sub_vec(OptContext *ctx, TCGOp *op)
static bool fold_sub(OptContext *ctx, TCGOp *op)
{
- return fold_const2(ctx, op) || fold_sub_vec(ctx, op);
+ if (fold_const2(ctx, op) || fold_sub_vec(ctx, op)) {
+ return true;
+ }
+
+ /* Fold sub r,x,i to add r,x,-i */
+ if (arg_is_const(op->args[2])) {
+ uint64_t val = arg_info(op->args[2])->val;
+
+ op->opc = (ctx->type == TCG_TYPE_I32
+ ? INDEX_op_add_i32 : INDEX_op_add_i64);
+ op->args[2] = arg_new_constant(ctx, -val);
+ }
+ return false;
}
static bool fold_sub2(OptContext *ctx, TCGOp *op)
--
2.34.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 3/4] tcg/optimize: Canonicalize sub2 with constants to add2
2023-10-26 1:39 [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
2023-10-26 1:39 ` [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation Richard Henderson
2023-10-26 1:39 ` [PATCH 2/4] tcg/optimize: Canonicalize subi to addi during optimization Richard Henderson
@ 2023-10-26 1:39 ` Richard Henderson
2023-11-06 14:43 ` Philippe Mathieu-Daudé
2023-10-26 1:39 ` [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded Richard Henderson
2023-10-31 22:20 ` [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
4 siblings, 1 reply; 11+ messages in thread
From: Richard Henderson @ 2023-10-26 1:39 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/optimize.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 9d91c25f68..280e6089a6 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -1030,8 +1030,10 @@ static bool fold_add_vec(OptContext *ctx, TCGOp *op)
static bool fold_addsub2(OptContext *ctx, TCGOp *op, bool add)
{
- if (arg_is_const(op->args[2]) && arg_is_const(op->args[3]) &&
- arg_is_const(op->args[4]) && arg_is_const(op->args[5])) {
+ bool a_const = arg_is_const(op->args[2]) && arg_is_const(op->args[3]);
+ bool b_const = arg_is_const(op->args[4]) && arg_is_const(op->args[5]);
+
+ if (a_const && b_const) {
uint64_t al = arg_info(op->args[2])->val;
uint64_t ah = arg_info(op->args[3])->val;
uint64_t bl = arg_info(op->args[4])->val;
@@ -1075,6 +1077,21 @@ static bool fold_addsub2(OptContext *ctx, TCGOp *op, bool add)
tcg_opt_gen_movi(ctx, op2, rh, ah);
return true;
}
+
+ /* Fold sub2 r,x,i to add2 r,x,-i */
+ if (!add && b_const) {
+ uint64_t bl = arg_info(op->args[4])->val;
+ uint64_t bh = arg_info(op->args[5])->val;
+
+ /* Negate the two parts without assembling and disassembling. */
+ bl = -bl;
+ bh = ~bh + !bl;
+
+ op->opc = (ctx->type == TCG_TYPE_I32
+ ? INDEX_op_add2_i32 : INDEX_op_add2_i64);
+ op->args[4] = arg_new_constant(ctx, bl);
+ op->args[5] = arg_new_constant(ctx, bh);
+ }
return false;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded
2023-10-26 1:39 [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
` (2 preceding siblings ...)
2023-10-26 1:39 ` [PATCH 3/4] tcg/optimize: Canonicalize sub2 with constants to add2 Richard Henderson
@ 2023-10-26 1:39 ` Richard Henderson
2023-10-26 1:53 ` Richard Henderson
2023-11-06 14:49 ` Philippe Mathieu-Daudé
2023-10-31 22:20 ` [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
4 siblings, 2 replies; 11+ messages in thread
From: Richard Henderson @ 2023-10-26 1:39 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini
A release build should simply accept and emit the subtract.
I'm not even sure if this is reasonable to keep for debug.
Not-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg.c | 49 ++++++++++++++++++++++++++-------------
tcg/i386/tcg-target.c.inc | 13 ++++++++---
2 files changed, 43 insertions(+), 19 deletions(-)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index a507c111cf..408647af7e 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -3618,23 +3618,40 @@ liveness_pass_1(TCGContext *s)
do_addsub2:
nb_iargs = 4;
nb_oargs = 2;
- /* Test if the high part of the operation is dead, but not
- the low part. The result can be optimized to a simple
- add or sub. This happens often for x86_64 guest when the
- cpu mode is set to 32 bit. */
- if (arg_temp(op->args[1])->state == TS_DEAD) {
- if (arg_temp(op->args[0])->state == TS_DEAD) {
- goto do_remove;
- }
- /* Replace the opcode and adjust the args in place,
- leaving 3 unused args at the end. */
- op->opc = opc = opc_new;
- op->args[1] = op->args[2];
- op->args[2] = op->args[4];
- /* Fall through and mark the single-word operation live. */
- nb_iargs = 2;
- nb_oargs = 1;
+ /*
+ * Test if the high part of the operation is dead, but the low
+ * part is still live. The result can be optimized to a simple
+ * add or sub.
+ */
+ if (arg_temp(op->args[1])->state != TS_DEAD) {
+ goto do_not_remove;
}
+ if (arg_temp(op->args[0])->state == TS_DEAD) {
+ goto do_remove;
+ }
+ /*
+ * Replace the opcode and adjust the args in place, leaving 3
+ * unused args at the end. Canonicalize subi to andi.
+ */
+ op->args[1] = op->args[2];
+ {
+ TCGTemp *src2 = arg_temp(op->args[4]);
+ if (src2->kind == TEMP_CONST) {
+ if (opc_new == INDEX_op_sub_i32) {
+ src2 = tcg_constant_internal(TCG_TYPE_I32,
+ (int32_t)-src2->val);
+ opc_new = INDEX_op_add_i32;
+ } else if (opc_new == INDEX_op_sub_i64) {
+ src2 = tcg_constant_internal(TCG_TYPE_I64, -src2->val);
+ opc_new = INDEX_op_add_i64;
+ }
+ }
+ op->args[2] = temp_arg(src2);
+ }
+ op->opc = opc = opc_new;
+ /* Mark the single-word operation live. */
+ nb_iargs = 2;
+ nb_oargs = 1;
goto do_not_remove;
case INDEX_op_mulu2_i32:
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 0d97864174..39d03fa698 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -2544,8 +2544,14 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
c = ARITH_ADD;
goto gen_arith;
OP_32_64(sub):
- c = ARITH_SUB;
- goto gen_arith;
+ /*
+ * Should have canonicalized all constants to add.
+ * Keep the constraint allowing any constant so that we catch this
+ * case without register allocation loading the constant first.
+ */
+ tcg_debug_assert(!const_a2);
+ tgen_arithr(s, ARITH_SUB + rexw, a0, a2);
+ break;
OP_32_64(and):
c = ARITH_AND;
goto gen_arith;
@@ -3325,7 +3331,6 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
return C_O1_I2(r, r, re);
case INDEX_op_sub_i32:
- case INDEX_op_sub_i64:
case INDEX_op_mul_i32:
case INDEX_op_mul_i64:
case INDEX_op_or_i32:
@@ -3333,6 +3338,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_xor_i32:
case INDEX_op_xor_i64:
return C_O1_I2(r, 0, re);
+ case INDEX_op_sub_i64:
+ return C_O1_I2(r, 0, ri);
case INDEX_op_and_i32:
case INDEX_op_and_i64:
--
2.34.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded
2023-10-26 1:39 ` [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded Richard Henderson
@ 2023-10-26 1:53 ` Richard Henderson
2023-11-06 14:49 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 11+ messages in thread
From: Richard Henderson @ 2023-10-26 1:53 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini
On 10/25/23 18:39, Richard Henderson wrote:
> A release build should simply accept and emit the subtract.
> I'm not even sure if this is reasonable to keep for debug.
>
> Not-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> tcg/tcg.c | 49 ++++++++++++++++++++++++++-------------
> tcg/i386/tcg-target.c.inc | 13 ++++++++---
> 2 files changed, 43 insertions(+), 19 deletions(-)
Oops, accidental merge of two patches.
The tcg.c change *is* required.
r~
>
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index a507c111cf..408647af7e 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -3618,23 +3618,40 @@ liveness_pass_1(TCGContext *s)
> do_addsub2:
> nb_iargs = 4;
> nb_oargs = 2;
> - /* Test if the high part of the operation is dead, but not
> - the low part. The result can be optimized to a simple
> - add or sub. This happens often for x86_64 guest when the
> - cpu mode is set to 32 bit. */
> - if (arg_temp(op->args[1])->state == TS_DEAD) {
> - if (arg_temp(op->args[0])->state == TS_DEAD) {
> - goto do_remove;
> - }
> - /* Replace the opcode and adjust the args in place,
> - leaving 3 unused args at the end. */
> - op->opc = opc = opc_new;
> - op->args[1] = op->args[2];
> - op->args[2] = op->args[4];
> - /* Fall through and mark the single-word operation live. */
> - nb_iargs = 2;
> - nb_oargs = 1;
> + /*
> + * Test if the high part of the operation is dead, but the low
> + * part is still live. The result can be optimized to a simple
> + * add or sub.
> + */
> + if (arg_temp(op->args[1])->state != TS_DEAD) {
> + goto do_not_remove;
> }
> + if (arg_temp(op->args[0])->state == TS_DEAD) {
> + goto do_remove;
> + }
> + /*
> + * Replace the opcode and adjust the args in place, leaving 3
> + * unused args at the end. Canonicalize subi to andi.
> + */
> + op->args[1] = op->args[2];
> + {
> + TCGTemp *src2 = arg_temp(op->args[4]);
> + if (src2->kind == TEMP_CONST) {
> + if (opc_new == INDEX_op_sub_i32) {
> + src2 = tcg_constant_internal(TCG_TYPE_I32,
> + (int32_t)-src2->val);
> + opc_new = INDEX_op_add_i32;
> + } else if (opc_new == INDEX_op_sub_i64) {
> + src2 = tcg_constant_internal(TCG_TYPE_I64, -src2->val);
> + opc_new = INDEX_op_add_i64;
> + }
> + }
> + op->args[2] = temp_arg(src2);
> + }
> + op->opc = opc = opc_new;
> + /* Mark the single-word operation live. */
> + nb_iargs = 2;
> + nb_oargs = 1;
> goto do_not_remove;
>
> case INDEX_op_mulu2_i32:
> diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
> index 0d97864174..39d03fa698 100644
> --- a/tcg/i386/tcg-target.c.inc
> +++ b/tcg/i386/tcg-target.c.inc
> @@ -2544,8 +2544,14 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
> c = ARITH_ADD;
> goto gen_arith;
> OP_32_64(sub):
> - c = ARITH_SUB;
> - goto gen_arith;
> + /*
> + * Should have canonicalized all constants to add.
> + * Keep the constraint allowing any constant so that we catch this
> + * case without register allocation loading the constant first.
> + */
> + tcg_debug_assert(!const_a2);
> + tgen_arithr(s, ARITH_SUB + rexw, a0, a2);
> + break;
> OP_32_64(and):
> c = ARITH_AND;
> goto gen_arith;
> @@ -3325,7 +3331,6 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
> return C_O1_I2(r, r, re);
>
> case INDEX_op_sub_i32:
> - case INDEX_op_sub_i64:
> case INDEX_op_mul_i32:
> case INDEX_op_mul_i64:
> case INDEX_op_or_i32:
> @@ -3333,6 +3338,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
> case INDEX_op_xor_i32:
> case INDEX_op_xor_i64:
> return C_O1_I2(r, 0, re);
> + case INDEX_op_sub_i64:
> + return C_O1_I2(r, 0, ri);
>
> case INDEX_op_and_i32:
> case INDEX_op_and_i64:
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/4] tcg: Canonicalize SUBI to ANDI
2023-10-26 1:39 [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
` (3 preceding siblings ...)
2023-10-26 1:39 ` [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded Richard Henderson
@ 2023-10-31 22:20 ` Richard Henderson
4 siblings, 0 replies; 11+ messages in thread
From: Richard Henderson @ 2023-10-31 22:20 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini
Ping.
On 10/25/23 18:39, Richard Henderson wrote:
> Supercedes: 20231025185956.72677-1-pbonzini@redhat.com
> ("tcg: allow a target to request canonicalization of SUBI to ADDI")
>
> Based-on: 20231025072707.833943-1-richard.henderson@linaro.org
> ("tcg: Introduce TCG_COND_TST{EQ,NE}")
> There a couple of optimize.c routines introduced which I use here.
>
> The final NOTFORMERGE patch suggests that I've caught all of the
> cases, but I certainly wouldn't want to merge the assert without
> a lot more testing.
>
>
> r~
>
>
> Richard Henderson (4):
> tcg: Canonicalize subi to addi during opcode generation
> tcg/optimize: Canonicalize subi to addi during optimization
> tcg/optimize: Canonicalize sub2 with constants to add2
> NOTFORMERGE tcg/i386: Assert sub of immediate has been folded
>
> tcg/optimize.c | 35 +++++++++++++++++++++++++---
> tcg/tcg-op.c | 25 +++++++-------------
> tcg/tcg.c | 49 ++++++++++++++++++++++++++-------------
> tcg/i386/tcg-target.c.inc | 13 ++++++++---
> 4 files changed, 84 insertions(+), 38 deletions(-)
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation
2023-10-26 1:39 ` [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation Richard Henderson
@ 2023-11-06 14:37 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-11-06 14:37 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: pbonzini
On 26/10/23 03:39, Richard Henderson wrote:
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> tcg/tcg-op.c | 25 +++++++++----------------
> 1 file changed, 9 insertions(+), 16 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/4] tcg/optimize: Canonicalize subi to addi during optimization
2023-10-26 1:39 ` [PATCH 2/4] tcg/optimize: Canonicalize subi to addi during optimization Richard Henderson
@ 2023-11-06 14:42 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-11-06 14:42 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: pbonzini
On 26/10/23 03:39, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> tcg/optimize.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/4] tcg/optimize: Canonicalize sub2 with constants to add2
2023-10-26 1:39 ` [PATCH 3/4] tcg/optimize: Canonicalize sub2 with constants to add2 Richard Henderson
@ 2023-11-06 14:43 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-11-06 14:43 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: pbonzini
On 26/10/23 03:39, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> tcg/optimize.c | 21 +++++++++++++++++++--
> 1 file changed, 19 insertions(+), 2 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded
2023-10-26 1:39 ` [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded Richard Henderson
2023-10-26 1:53 ` Richard Henderson
@ 2023-11-06 14:49 ` Philippe Mathieu-Daudé
1 sibling, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-11-06 14:49 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: pbonzini
On 26/10/23 03:39, Richard Henderson wrote:
> A release build should simply accept and emit the subtract.
> I'm not even sure if this is reasonable to keep for debug.
>
> Not-Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> tcg/tcg.c | 49 ++++++++++++++++++++++++++-------------
> tcg/i386/tcg-target.c.inc | 13 ++++++++---
> 2 files changed, 43 insertions(+), 19 deletions(-)
>
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index a507c111cf..408647af7e 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -3618,23 +3618,40 @@ liveness_pass_1(TCGContext *s)
> do_addsub2:
> nb_iargs = 4;
> nb_oargs = 2;
> - /* Test if the high part of the operation is dead, but not
> - the low part. The result can be optimized to a simple
> - add or sub. This happens often for x86_64 guest when the
> - cpu mode is set to 32 bit. */
> - if (arg_temp(op->args[1])->state == TS_DEAD) {
> - if (arg_temp(op->args[0])->state == TS_DEAD) {
> - goto do_remove;
> - }
> - /* Replace the opcode and adjust the args in place,
> - leaving 3 unused args at the end. */
> - op->opc = opc = opc_new;
> - op->args[1] = op->args[2];
> - op->args[2] = op->args[4];
> - /* Fall through and mark the single-word operation live. */
> - nb_iargs = 2;
> - nb_oargs = 1;
> + /*
> + * Test if the high part of the operation is dead, but the low
> + * part is still live. The result can be optimized to a simple
> + * add or sub.
> + */
> + if (arg_temp(op->args[1])->state != TS_DEAD) {
> + goto do_not_remove;
> }
> + if (arg_temp(op->args[0])->state == TS_DEAD) {
> + goto do_remove;
> + }
> + /*
> + * Replace the opcode and adjust the args in place, leaving 3
> + * unused args at the end. Canonicalize subi to andi.
Typo s/andi/addi/.
> + */
> + op->args[1] = op->args[2];
> + {
> + TCGTemp *src2 = arg_temp(op->args[4]);
> + if (src2->kind == TEMP_CONST) {
> + if (opc_new == INDEX_op_sub_i32) {
> + src2 = tcg_constant_internal(TCG_TYPE_I32,
> + (int32_t)-src2->val);
> + opc_new = INDEX_op_add_i32;
> + } else if (opc_new == INDEX_op_sub_i64) {
> + src2 = tcg_constant_internal(TCG_TYPE_I64, -src2->val);
> + opc_new = INDEX_op_add_i64;
> + }
> + }
> + op->args[2] = temp_arg(src2);
> + }
> + op->opc = opc = opc_new;
> + /* Mark the single-word operation live. */
> + nb_iargs = 2;
> + nb_oargs = 1;
> goto do_not_remove;
For tcg/tcg.c with your S-o-b and a reworded description:
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2023-11-06 14:50 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-26 1:39 [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
2023-10-26 1:39 ` [PATCH 1/4] tcg: Canonicalize subi to addi during opcode generation Richard Henderson
2023-11-06 14:37 ` Philippe Mathieu-Daudé
2023-10-26 1:39 ` [PATCH 2/4] tcg/optimize: Canonicalize subi to addi during optimization Richard Henderson
2023-11-06 14:42 ` Philippe Mathieu-Daudé
2023-10-26 1:39 ` [PATCH 3/4] tcg/optimize: Canonicalize sub2 with constants to add2 Richard Henderson
2023-11-06 14:43 ` Philippe Mathieu-Daudé
2023-10-26 1:39 ` [PATCH 4/4] NOTFORMERGE tcg/i386: Assert sub of immediate has been folded Richard Henderson
2023-10-26 1:53 ` Richard Henderson
2023-11-06 14:49 ` Philippe Mathieu-Daudé
2023-10-31 22:20 ` [PATCH 0/4] tcg: Canonicalize SUBI to ANDI Richard Henderson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).