* [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize
@ 2024-03-12 14:38 Richard Henderson
2024-03-12 14:38 ` [PATCH 01/15] tcg/optimize: Fold andc with immediate to and Richard Henderson
` (14 more replies)
0 siblings, 15 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
This is a follow-on to 6334a968eec3 ("tcg/optimize: Canonicalize
subi to addi during optimization"), which I wrote at the end of
the previous devel cycle and then forgot about during the current one.
In addition to sub->add, canonicalize andc->and, orc->or and eqv->xor
with a constant second operand, and and->ext*u for matching masks.
The early expansion that we produce for deposit does not fold
constants well; lower unsupported deposit during optimize instead.
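
For reference, a stand-alone sketch (not part of the series, names
illustrative only) of the bitwise identities that make these rewrites
safe; each inverted-operand op with a constant becomes the plain op
with a transformed constant:

#include <assert.h>
#include <stdint.h>

/* Reference semantics of the two-operand ops being canonicalized. */
static uint64_t andc(uint64_t x, uint64_t y) { return x & ~y; }
static uint64_t orc (uint64_t x, uint64_t y) { return x | ~y; }
static uint64_t eqv (uint64_t x, uint64_t y) { return ~(x ^ y); }

int main(void)
{
    uint64_t x = 0x123456789abcdef0ull, i = 0xff00ull;

    assert(andc(x, i) == (x & ~i));   /* andc r,x,i -> and r,x,~i */
    assert(orc (x, i) == (x | ~i));   /* orc  r,x,i -> or  r,x,~i */
    assert(eqv (x, i) == (x ^ ~i));   /* eqv  r,x,i -> xor r,x,~i */
    assert((x - i)    == (x + -i));   /* sub r,x,i  -> add r,x,-i */
    return 0;
}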
r~
Richard Henderson (15):
tcg/optimize: Fold andc with immediate to and
tcg/optimize: Fold orc with immediate to or
tcg/optimize: Fold eqv with immediate to xor
tcg/i386: Do not accept immediate operand for andc
tcg/aarch64: Do not accept immediate operand for andc, orc, eqv
tcg/arm: Do not accept immediate operand for andc
tcg/ppc: Do not accept immediate operand for andc, orc, eqv
tcg/loongarch64: Do not accept immediate operand for andc, orc
tcg/s390x: Do not accept immediate operand for andc, orc
tcg/riscv: Do not accept immediate operand for andc, orc, eqv
tcg/riscv: Do not accept immediate operands for sub
tcg/riscv: Do not accept zero operands for logicals, multiply or
divide
tcg/optimize: Fold and to extu during optimize
tcg: Use arg_is_const_val in fold_sub_to_neg
tcg/optimize: Lower unsupported deposit during optimize
tcg/i386/tcg-target-con-set.h | 3 +-
tcg/i386/tcg-target-con-str.h | 1 -
tcg/loongarch64/tcg-target-con-set.h | 2 +-
tcg/loongarch64/tcg-target-con-str.h | 1 -
tcg/riscv/tcg-target-con-set.h | 4 +-
tcg/riscv/tcg-target-con-str.h | 2 -
tcg/optimize.c | 318 +++++++++++++++++++++++----
tcg/tcg-op.c | 244 +++++---------------
tcg/aarch64/tcg-target.c.inc | 50 ++---
tcg/arm/tcg-target.c.inc | 6 +-
tcg/i386/tcg-target.c.inc | 20 +-
tcg/loongarch64/tcg-target.c.inc | 31 +--
tcg/ppc/tcg-target.c.inc | 32 +--
tcg/riscv/tcg-target.c.inc | 58 +----
tcg/s390x/tcg-target.c.inc | 56 +----
15 files changed, 393 insertions(+), 435 deletions(-)
--
2.34.1
* [PATCH 01/15] tcg/optimize: Fold andc with immediate to and
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-13 1:29 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 02/15] tcg/optimize: Fold orc with immediate to or Richard Henderson
` (13 subsequent siblings)
14 siblings, 1 reply; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/optimize.c | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 752cc5c56b..2ec52df368 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -1324,17 +1324,23 @@ static bool fold_andc(OptContext *ctx, TCGOp *op)
z1 = arg_info(op->args[1])->z_mask;
- /*
- * Known-zeros does not imply known-ones. Therefore unless
- * arg2 is constant, we can't infer anything from it.
- */
if (arg_is_const(op->args[2])) {
- uint64_t z2 = ~arg_info(op->args[2])->z_mask;
- ctx->a_mask = z1 & ~z2;
- z1 &= z2;
- }
- ctx->z_mask = z1;
+ uint64_t val = ~arg_info(op->args[2])->val;
+ /* Fold andc r,x,i to and r,x,~i. */
+ op->opc = (ctx->type == TCG_TYPE_I32
+ ? INDEX_op_and_i32 : INDEX_op_and_i64);
+ op->args[2] = arg_new_constant(ctx, val);
+
+ /*
+ * Known-zeros does not imply known-ones. Therefore unless
+ * arg2 is constant, we can't infer anything from it.
+ */
+ ctx->a_mask = z1 & ~val;
+ z1 &= val;
+ }
+
+ ctx->z_mask = z1;
ctx->s_mask = arg_info(op->args[1])->s_mask
& arg_info(op->args[2])->s_mask;
return fold_masks(ctx, op);
--
2.34.1
* [PATCH 02/15] tcg/optimize: Fold orc with immediate to or
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
2024-03-12 14:38 ` [PATCH 01/15] tcg/optimize: Fold andc with immediate to and Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 03/15] tcg/optimize: Fold eqv with immediate to xor Richard Henderson
` (12 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/optimize.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 2ec52df368..5729433548 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -2065,6 +2065,15 @@ static bool fold_orc(OptContext *ctx, TCGOp *op)
return true;
}
+ /* Fold orc r,x,i to or r,x,~i. */
+ if (arg_is_const(op->args[2])) {
+ uint64_t val = ~arg_info(op->args[2])->val;
+
+ op->opc = (ctx->type == TCG_TYPE_I32
+ ? INDEX_op_or_i32 : INDEX_op_or_i64);
+ op->args[2] = arg_new_constant(ctx, val);
+ }
+
ctx->s_mask = arg_info(op->args[1])->s_mask
& arg_info(op->args[2])->s_mask;
return false;
--
2.34.1
* [PATCH 03/15] tcg/optimize: Fold eqv with immediate to xor
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
2024-03-12 14:38 ` [PATCH 01/15] tcg/optimize: Fold andc with immediate to and Richard Henderson
2024-03-12 14:38 ` [PATCH 02/15] tcg/optimize: Fold orc with immediate to or Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 04/15] tcg/i386: Do not accept immediate operand for andc Richard Henderson
` (11 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/optimize.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 5729433548..c6b0ab35c8 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -1687,6 +1687,15 @@ static bool fold_eqv(OptContext *ctx, TCGOp *op)
return true;
}
+ /* Fold eqv r,x,i to xor r,x,~i. */
+ if (arg_is_const(op->args[2])) {
+ uint64_t val = ~arg_info(op->args[2])->val;
+
+ op->opc = (ctx->type == TCG_TYPE_I32
+ ? INDEX_op_xor_i32 : INDEX_op_xor_i64);
+ op->args[2] = arg_new_constant(ctx, val);
+ }
+
ctx->s_mask = arg_info(op->args[1])->s_mask
& arg_info(op->args[2])->s_mask;
return false;
--
2.34.1
* [PATCH 04/15] tcg/i386: Do not accept immediate operand for andc
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
` (2 preceding siblings ...)
2024-03-12 14:38 ` [PATCH 03/15] tcg/optimize: Fold eqv with immediate to xor Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 05/15] tcg/aarch64: Do not accept immediate operand for andc, orc, eqv Richard Henderson
` (10 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
The transformation of andc with immediate to and is now
done generically and need not be handled by the backend.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/i386/tcg-target-con-set.h | 3 +--
tcg/i386/tcg-target-con-str.h | 1 -
tcg/i386/tcg-target.c.inc | 20 +++++---------------
3 files changed, 6 insertions(+), 18 deletions(-)
diff --git a/tcg/i386/tcg-target-con-set.h b/tcg/i386/tcg-target-con-set.h
index e24241cfa2..69d2d38570 100644
--- a/tcg/i386/tcg-target-con-set.h
+++ b/tcg/i386/tcg-target-con-set.h
@@ -40,11 +40,10 @@ C_O1_I2(r, 0, r)
C_O1_I2(r, 0, re)
C_O1_I2(r, 0, reZ)
C_O1_I2(r, 0, ri)
-C_O1_I2(r, 0, rI)
C_O1_I2(r, L, L)
+C_O1_I2(r, r, r)
C_O1_I2(r, r, re)
C_O1_I2(r, r, ri)
-C_O1_I2(r, r, rI)
C_O1_I2(x, x, x)
C_N1_I2(r, r, r)
C_N1_I2(r, r, rW)
diff --git a/tcg/i386/tcg-target-con-str.h b/tcg/i386/tcg-target-con-str.h
index cc22db227b..0c766eac7e 100644
--- a/tcg/i386/tcg-target-con-str.h
+++ b/tcg/i386/tcg-target-con-str.h
@@ -27,7 +27,6 @@ REGS('s', ALL_BYTEL_REGS & ~SOFTMMU_RESERVE_REGS) /* qemu_st8_i32 data */
* CONST(letter, TCG_CT_CONST_* bit set)
*/
CONST('e', TCG_CT_CONST_S32)
-CONST('I', TCG_CT_CONST_I32)
CONST('T', TCG_CT_CONST_TST)
CONST('W', TCG_CT_CONST_WSZ)
CONST('Z', TCG_CT_CONST_U32)
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index c6ba498623..ed70524864 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -130,9 +130,8 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
/* Constants we accept. */
#define TCG_CT_CONST_S32 0x100
#define TCG_CT_CONST_U32 0x200
-#define TCG_CT_CONST_I32 0x400
-#define TCG_CT_CONST_WSZ 0x800
-#define TCG_CT_CONST_TST 0x1000
+#define TCG_CT_CONST_WSZ 0x400
+#define TCG_CT_CONST_TST 0x800
/* Registers used with L constraint, which are the first argument
registers on x86_64, and two random call clobbered registers on
@@ -203,8 +202,7 @@ static bool tcg_target_const_match(int64_t val, int ct,
return 1;
}
if (type == TCG_TYPE_I32) {
- if (ct & (TCG_CT_CONST_S32 | TCG_CT_CONST_U32 |
- TCG_CT_CONST_I32 | TCG_CT_CONST_TST)) {
+ if (ct & (TCG_CT_CONST_S32 | TCG_CT_CONST_U32 | TCG_CT_CONST_TST)) {
return 1;
}
} else {
@@ -214,9 +212,6 @@ static bool tcg_target_const_match(int64_t val, int ct,
if ((ct & TCG_CT_CONST_U32) && val == (uint32_t)val) {
return 1;
}
- if ((ct & TCG_CT_CONST_I32) && ~val == (int32_t)~val) {
- return 1;
- }
/*
* This will be used in combination with TCG_CT_CONST_S32,
* so "normal" TESTQ is already matched. Also accept:
@@ -2666,12 +2661,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
break;
OP_32_64(andc):
- if (const_a2) {
- tcg_out_mov(s, rexw ? TCG_TYPE_I64 : TCG_TYPE_I32, a0, a1);
- tgen_arithi(s, ARITH_AND + rexw, a0, ~a2, 0);
- } else {
- tcg_out_vex_modrm(s, OPC_ANDN + rexw, a0, a2, a1);
- }
+ tcg_out_vex_modrm(s, OPC_ANDN + rexw, a0, a2, a1);
break;
OP_32_64(mul):
@@ -3442,7 +3432,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_andc_i32:
case INDEX_op_andc_i64:
- return C_O1_I2(r, r, rI);
+ return C_O1_I2(r, r, r);
case INDEX_op_shl_i32:
case INDEX_op_shl_i64:
--
2.34.1
* [PATCH 05/15] tcg/aarch64: Do not accept immediate operand for andc, orc, eqv
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
` (3 preceding siblings ...)
2024-03-12 14:38 ` [PATCH 04/15] tcg/i386: Do not accept immediate operand for andc Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 06/15] tcg/arm: Do not accept immediate operand for andc Richard Henderson
` (9 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
The transformations with inverted immediate are now done
generically and need not be handled by the backend.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/aarch64/tcg-target.c.inc | 50 +++++++++++-------------------------
1 file changed, 15 insertions(+), 35 deletions(-)
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index dec8ecc1b6..68a381e4af 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -2216,17 +2216,6 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
}
break;
- case INDEX_op_andc_i32:
- a2 = (int32_t)a2;
- /* FALLTHRU */
- case INDEX_op_andc_i64:
- if (c2) {
- tcg_out_logicali(s, I3404_ANDI, ext, a0, a1, ~a2);
- } else {
- tcg_out_insn(s, 3510, BIC, ext, a0, a1, a2);
- }
- break;
-
case INDEX_op_or_i32:
a2 = (int32_t)a2;
/* FALLTHRU */
@@ -2238,17 +2227,6 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
}
break;
- case INDEX_op_orc_i32:
- a2 = (int32_t)a2;
- /* FALLTHRU */
- case INDEX_op_orc_i64:
- if (c2) {
- tcg_out_logicali(s, I3404_ORRI, ext, a0, a1, ~a2);
- } else {
- tcg_out_insn(s, 3510, ORN, ext, a0, a1, a2);
- }
- break;
-
case INDEX_op_xor_i32:
a2 = (int32_t)a2;
/* FALLTHRU */
@@ -2260,15 +2238,17 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
}
break;
+ case INDEX_op_andc_i32:
+ case INDEX_op_andc_i64:
+ tcg_out_insn(s, 3510, BIC, ext, a0, a1, a2);
+ break;
+ case INDEX_op_orc_i32:
+ case INDEX_op_orc_i64:
+ tcg_out_insn(s, 3510, ORN, ext, a0, a1, a2);
+ break;
case INDEX_op_eqv_i32:
- a2 = (int32_t)a2;
- /* FALLTHRU */
case INDEX_op_eqv_i64:
- if (c2) {
- tcg_out_logicali(s, I3404_EORI, ext, a0, a1, ~a2);
- } else {
- tcg_out_insn(s, 3510, EON, ext, a0, a1, a2);
- }
+ tcg_out_insn(s, 3510, EON, ext, a0, a1, a2);
break;
case INDEX_op_not_i64:
@@ -2995,6 +2975,12 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_negsetcond_i64:
return C_O1_I2(r, r, rC);
+ case INDEX_op_andc_i32:
+ case INDEX_op_andc_i64:
+ case INDEX_op_orc_i32:
+ case INDEX_op_orc_i64:
+ case INDEX_op_eqv_i32:
+ case INDEX_op_eqv_i64:
case INDEX_op_mul_i32:
case INDEX_op_mul_i64:
case INDEX_op_div_i32:
@@ -3015,12 +3001,6 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_or_i64:
case INDEX_op_xor_i32:
case INDEX_op_xor_i64:
- case INDEX_op_andc_i32:
- case INDEX_op_andc_i64:
- case INDEX_op_orc_i32:
- case INDEX_op_orc_i64:
- case INDEX_op_eqv_i32:
- case INDEX_op_eqv_i64:
return C_O1_I2(r, r, rL);
case INDEX_op_shl_i32:
--
2.34.1
* [PATCH 06/15] tcg/arm: Do not accept immediate operand for andc
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
` (4 preceding siblings ...)
2024-03-12 14:38 ` [PATCH 05/15] tcg/aarch64: Do not accept immediate operand for andc, orc, eqv Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 07/15] tcg/ppc: Do not accept immediate operand for andc, orc, eqv Richard Henderson
` (8 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
The transformation of andc with immediate to and is now
done generically and need not be handled by the backend.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/arm/tcg-target.c.inc | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 6a04c73c76..a0c5887579 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1869,8 +1869,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
args[0], args[1], args[2], const_args[2]);
break;
case INDEX_op_andc_i32:
- tcg_out_dat_rIK(s, COND_AL, ARITH_BIC, ARITH_AND,
- args[0], args[1], args[2], const_args[2]);
+ tcg_out_dat_reg(s, COND_AL, ARITH_BIC, args[0], args[1],
+ args[2], SHIFT_IMM_LSL(0));
break;
case INDEX_op_or_i32:
c = ARITH_ORR;
@@ -2152,11 +2152,11 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
return C_O1_I2(r, r, rIN);
case INDEX_op_and_i32:
- case INDEX_op_andc_i32:
case INDEX_op_clz_i32:
case INDEX_op_ctz_i32:
return C_O1_I2(r, r, rIK);
+ case INDEX_op_andc_i32:
case INDEX_op_mul_i32:
case INDEX_op_div_i32:
case INDEX_op_divu_i32:
--
2.34.1
* [PATCH 07/15] tcg/ppc: Do not accept immediate operand for andc, orc, eqv
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
` (5 preceding siblings ...)
2024-03-12 14:38 ` [PATCH 06/15] tcg/arm: Do not accept immediate operand for andc Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 08/15] tcg/loongarch64: Do not accept immediate operand for andc, orc Richard Henderson
` (7 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
The transformations with inverted immediate are now done
generically and need not be handled by the backend.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/ppc/tcg-target.c.inc | 32 +++++---------------------------
1 file changed, 5 insertions(+), 27 deletions(-)
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 7f3829beeb..336b8a28ba 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -3070,36 +3070,14 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
}
break;
case INDEX_op_andc_i32:
- a0 = args[0], a1 = args[1], a2 = args[2];
- if (const_args[2]) {
- tcg_out_andi32(s, a0, a1, ~a2);
- } else {
- tcg_out32(s, ANDC | SAB(a1, a0, a2));
- }
- break;
case INDEX_op_andc_i64:
- a0 = args[0], a1 = args[1], a2 = args[2];
- if (const_args[2]) {
- tcg_out_andi64(s, a0, a1, ~a2);
- } else {
- tcg_out32(s, ANDC | SAB(a1, a0, a2));
- }
+ tcg_out32(s, ANDC | SAB(args[1], args[0], args[2]));
break;
case INDEX_op_orc_i32:
- if (const_args[2]) {
- tcg_out_ori32(s, args[0], args[1], ~args[2]);
- break;
- }
- /* FALLTHRU */
case INDEX_op_orc_i64:
tcg_out32(s, ORC | SAB(args[1], args[0], args[2]));
break;
case INDEX_op_eqv_i32:
- if (const_args[2]) {
- tcg_out_xori32(s, args[0], args[1], ~args[2]);
- break;
- }
- /* FALLTHRU */
case INDEX_op_eqv_i64:
tcg_out32(s, EQV | SAB(args[1], args[0], args[2]));
break;
@@ -4120,16 +4098,12 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_and_i32:
case INDEX_op_or_i32:
case INDEX_op_xor_i32:
- case INDEX_op_andc_i32:
- case INDEX_op_orc_i32:
- case INDEX_op_eqv_i32:
case INDEX_op_shl_i32:
case INDEX_op_shr_i32:
case INDEX_op_sar_i32:
case INDEX_op_rotl_i32:
case INDEX_op_rotr_i32:
case INDEX_op_and_i64:
- case INDEX_op_andc_i64:
case INDEX_op_shl_i64:
case INDEX_op_shr_i64:
case INDEX_op_sar_i64:
@@ -4145,10 +4119,14 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_divu_i32:
case INDEX_op_rem_i32:
case INDEX_op_remu_i32:
+ case INDEX_op_andc_i32:
+ case INDEX_op_orc_i32:
+ case INDEX_op_eqv_i32:
case INDEX_op_nand_i32:
case INDEX_op_nor_i32:
case INDEX_op_muluh_i32:
case INDEX_op_mulsh_i32:
+ case INDEX_op_andc_i64:
case INDEX_op_orc_i64:
case INDEX_op_eqv_i64:
case INDEX_op_nand_i64:
--
2.34.1
* [PATCH 08/15] tcg/loongarch64: Do not accept immediate operand for andc, orc
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
` (6 preceding siblings ...)
2024-03-12 14:38 ` [PATCH 07/15] tcg/ppc: Do not accept immediate operand for andc, orc, eqv Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 09/15] tcg/s390x: " Richard Henderson
` (6 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
The transformations with inverted immediate are now done
generically and need not be handled by the backend.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/loongarch64/tcg-target-con-set.h | 2 +-
tcg/loongarch64/tcg-target-con-str.h | 1 -
tcg/loongarch64/tcg-target.c.inc | 31 ++++++----------------------
3 files changed, 7 insertions(+), 27 deletions(-)
diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index cae6c2aad6..272f33c1e4 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -22,7 +22,7 @@ C_O0_I3(r, r, r)
C_O1_I1(r, r)
C_O1_I1(w, r)
C_O1_I1(w, w)
-C_O1_I2(r, r, rC)
+C_O1_I2(r, r, r)
C_O1_I2(r, r, ri)
C_O1_I2(r, r, rI)
C_O1_I2(r, r, rJ)
diff --git a/tcg/loongarch64/tcg-target-con-str.h b/tcg/loongarch64/tcg-target-con-str.h
index 2ba9c135ac..e7d2686db3 100644
--- a/tcg/loongarch64/tcg-target-con-str.h
+++ b/tcg/loongarch64/tcg-target-con-str.h
@@ -24,7 +24,6 @@ CONST('I', TCG_CT_CONST_S12)
CONST('J', TCG_CT_CONST_S32)
CONST('U', TCG_CT_CONST_U12)
CONST('Z', TCG_CT_CONST_ZERO)
-CONST('C', TCG_CT_CONST_C12)
CONST('W', TCG_CT_CONST_WSZ)
CONST('M', TCG_CT_CONST_VCMP)
CONST('A', TCG_CT_CONST_VADD)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 69c5b8ac4f..e343d33dba 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -169,10 +169,9 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
#define TCG_CT_CONST_S12 0x200
#define TCG_CT_CONST_S32 0x400
#define TCG_CT_CONST_U12 0x800
-#define TCG_CT_CONST_C12 0x1000
-#define TCG_CT_CONST_WSZ 0x2000
-#define TCG_CT_CONST_VCMP 0x4000
-#define TCG_CT_CONST_VADD 0x8000
+#define TCG_CT_CONST_WSZ 0x1000
+#define TCG_CT_CONST_VCMP 0x2000
+#define TCG_CT_CONST_VADD 0x4000
#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
#define ALL_VECTOR_REGS MAKE_64BIT_MASK(32, 32)
@@ -201,9 +200,6 @@ static bool tcg_target_const_match(int64_t val, int ct,
if ((ct & TCG_CT_CONST_U12) && val >= 0 && val <= 0xfff) {
return true;
}
- if ((ct & TCG_CT_CONST_C12) && ~val >= 0 && ~val <= 0xfff) {
- return true;
- }
if ((ct & TCG_CT_CONST_WSZ) && val == (type == TCG_TYPE_I32 ? 32 : 64)) {
return true;
}
@@ -1236,22 +1232,12 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
case INDEX_op_andc_i32:
case INDEX_op_andc_i64:
- if (c2) {
- /* guaranteed to fit due to constraint */
- tcg_out_opc_andi(s, a0, a1, ~a2);
- } else {
- tcg_out_opc_andn(s, a0, a1, a2);
- }
+ tcg_out_opc_andn(s, a0, a1, a2);
break;
case INDEX_op_orc_i32:
case INDEX_op_orc_i64:
- if (c2) {
- /* guaranteed to fit due to constraint */
- tcg_out_opc_ori(s, a0, a1, ~a2);
- } else {
- tcg_out_opc_orn(s, a0, a1, a2);
- }
+ tcg_out_opc_orn(s, a0, a1, a2);
break;
case INDEX_op_and_i32:
@@ -2120,12 +2106,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_andc_i64:
case INDEX_op_orc_i32:
case INDEX_op_orc_i64:
- /*
- * LoongArch insns for these ops don't have reg-imm forms, but we
- * can express using andi/ori if ~constant satisfies
- * TCG_CT_CONST_U12.
- */
- return C_O1_I2(r, r, rC);
+ return C_O1_I2(r, r, r);
case INDEX_op_shl_i32:
case INDEX_op_shl_i64:
--
2.34.1
* [PATCH 09/15] tcg/s390x: Do not accept immediate operand for andc, orc
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
` (7 preceding siblings ...)
2024-03-12 14:38 ` [PATCH 08/15] tcg/loongarch64: Do not accept immediate operand for andc, orc Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 10/15] tcg/riscv: Do not accept immediate operand for andc, orc, eqv Richard Henderson
` (5 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
The transformations with inverted immediate are now done
generically and need not be handled by the backend.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/s390x/tcg-target.c.inc | 56 ++++++--------------------------------
1 file changed, 8 insertions(+), 48 deletions(-)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index ad587325fc..b9a3e6e56a 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -2216,31 +2216,13 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
break;
case INDEX_op_andc_i32:
- a0 = args[0], a1 = args[1], a2 = (uint32_t)args[2];
- if (const_args[2]) {
- tcg_out_mov(s, TCG_TYPE_I32, a0, a1);
- tgen_andi(s, TCG_TYPE_I32, a0, (uint32_t)~a2);
- } else {
- tcg_out_insn(s, RRFa, NCRK, a0, a1, a2);
- }
+ tcg_out_insn(s, RRFa, NCRK, args[0], args[1], args[2]);
break;
case INDEX_op_orc_i32:
- a0 = args[0], a1 = args[1], a2 = (uint32_t)args[2];
- if (const_args[2]) {
- tcg_out_mov(s, TCG_TYPE_I32, a0, a1);
- tgen_ori(s, a0, (uint32_t)~a2);
- } else {
- tcg_out_insn(s, RRFa, OCRK, a0, a1, a2);
- }
+ tcg_out_insn(s, RRFa, OCRK, args[0], args[1], args[2]);
break;
case INDEX_op_eqv_i32:
- a0 = args[0], a1 = args[1], a2 = (uint32_t)args[2];
- if (const_args[2]) {
- tcg_out_mov(s, TCG_TYPE_I32, a0, a1);
- tcg_out_insn(s, RIL, XILF, a0, ~a2);
- } else {
- tcg_out_insn(s, RRFa, NXRK, a0, a1, a2);
- }
+ tcg_out_insn(s, RRFa, NXRK, args[0], args[1], args[2]);
break;
case INDEX_op_nand_i32:
tcg_out_insn(s, RRFa, NNRK, args[0], args[1], args[2]);
@@ -2517,31 +2499,13 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
break;
case INDEX_op_andc_i64:
- a0 = args[0], a1 = args[1], a2 = args[2];
- if (const_args[2]) {
- tcg_out_mov(s, TCG_TYPE_I64, a0, a1);
- tgen_andi(s, TCG_TYPE_I64, a0, ~a2);
- } else {
- tcg_out_insn(s, RRFa, NCGRK, a0, a1, a2);
- }
+ tcg_out_insn(s, RRFa, NCGRK, args[0], args[1], args[2]);
break;
case INDEX_op_orc_i64:
- a0 = args[0], a1 = args[1], a2 = args[2];
- if (const_args[2]) {
- tcg_out_mov(s, TCG_TYPE_I64, a0, a1);
- tgen_ori(s, a0, ~a2);
- } else {
- tcg_out_insn(s, RRFa, OCGRK, a0, a1, a2);
- }
+ tcg_out_insn(s, RRFa, OCGRK, args[0], args[1], args[2]);
break;
case INDEX_op_eqv_i64:
- a0 = args[0], a1 = args[1], a2 = args[2];
- if (const_args[2]) {
- tcg_out_mov(s, TCG_TYPE_I64, a0, a1);
- tgen_xori(s, a0, ~a2);
- } else {
- tcg_out_insn(s, RRFa, NXGRK, a0, a1, a2);
- }
+ tcg_out_insn(s, RRFa, NXGRK, args[0], args[1], args[2]);
break;
case INDEX_op_nand_i64:
tcg_out_insn(s, RRFa, NNGRK, args[0], args[1], args[2]);
@@ -3244,15 +3208,11 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
return C_O1_I2(r, r, rK);
case INDEX_op_andc_i32:
- case INDEX_op_orc_i32:
- case INDEX_op_eqv_i32:
- return C_O1_I2(r, r, ri);
case INDEX_op_andc_i64:
- return C_O1_I2(r, r, rKR);
+ case INDEX_op_orc_i32:
case INDEX_op_orc_i64:
+ case INDEX_op_eqv_i32:
case INDEX_op_eqv_i64:
- return C_O1_I2(r, r, rNK);
-
case INDEX_op_nand_i32:
case INDEX_op_nand_i64:
case INDEX_op_nor_i32:
--
2.34.1
* [PATCH 10/15] tcg/riscv: Do not accept immediate operand for andc, orc, eqv
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
` (8 preceding siblings ...)
2024-03-12 14:38 ` [PATCH 09/15] tcg/s390x: " Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 11/15] tcg/riscv: Do not accept immediate operands for sub Richard Henderson
` (4 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
The transformations with inverted immediate are now done
generically and need not be handled by the backend.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/riscv/tcg-target-con-set.h | 1 -
tcg/riscv/tcg-target-con-str.h | 1 -
tcg/riscv/tcg-target.c.inc | 36 +++++++---------------------------
3 files changed, 7 insertions(+), 31 deletions(-)
diff --git a/tcg/riscv/tcg-target-con-set.h b/tcg/riscv/tcg-target-con-set.h
index aac5ceee2b..0f72281a08 100644
--- a/tcg/riscv/tcg-target-con-set.h
+++ b/tcg/riscv/tcg-target-con-set.h
@@ -15,7 +15,6 @@ C_O0_I2(rZ, rZ)
C_O1_I1(r, r)
C_O1_I2(r, r, ri)
C_O1_I2(r, r, rI)
-C_O1_I2(r, r, rJ)
C_O1_I2(r, rZ, rN)
C_O1_I2(r, rZ, rZ)
C_N1_I2(r, r, rM)
diff --git a/tcg/riscv/tcg-target-con-str.h b/tcg/riscv/tcg-target-con-str.h
index d5c419dff1..6f1cfb976c 100644
--- a/tcg/riscv/tcg-target-con-str.h
+++ b/tcg/riscv/tcg-target-con-str.h
@@ -15,7 +15,6 @@ REGS('r', ALL_GENERAL_REGS)
* CONST(letter, TCG_CT_CONST_* bit set)
*/
CONST('I', TCG_CT_CONST_S12)
-CONST('J', TCG_CT_CONST_J12)
CONST('N', TCG_CT_CONST_N12)
CONST('M', TCG_CT_CONST_M12)
CONST('Z', TCG_CT_CONST_ZERO)
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 639363039b..2b889486e4 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -138,7 +138,6 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
#define TCG_CT_CONST_S12 0x200
#define TCG_CT_CONST_N12 0x400
#define TCG_CT_CONST_M12 0x800
-#define TCG_CT_CONST_J12 0x1000
#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
@@ -176,13 +175,6 @@ static bool tcg_target_const_match(int64_t val, int ct,
if ((ct & TCG_CT_CONST_M12) && val >= -0x7ff && val <= 0x7ff) {
return 1;
}
- /*
- * Inverse of sign extended from 12 bits: ~[-0x800, 0x7ff].
- * Used to map ANDN back to ANDI, etc.
- */
- if ((ct & TCG_CT_CONST_J12) && ~val >= -0x800 && ~val <= 0x7ff) {
- return 1;
- }
return 0;
}
@@ -1610,27 +1602,15 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
case INDEX_op_andc_i32:
case INDEX_op_andc_i64:
- if (c2) {
- tcg_out_opc_imm(s, OPC_ANDI, a0, a1, ~a2);
- } else {
- tcg_out_opc_reg(s, OPC_ANDN, a0, a1, a2);
- }
+ tcg_out_opc_reg(s, OPC_ANDN, a0, a1, a2);
break;
case INDEX_op_orc_i32:
case INDEX_op_orc_i64:
- if (c2) {
- tcg_out_opc_imm(s, OPC_ORI, a0, a1, ~a2);
- } else {
- tcg_out_opc_reg(s, OPC_ORN, a0, a1, a2);
- }
+ tcg_out_opc_reg(s, OPC_ORN, a0, a1, a2);
break;
case INDEX_op_eqv_i32:
case INDEX_op_eqv_i64:
- if (c2) {
- tcg_out_opc_imm(s, OPC_XORI, a0, a1, ~a2);
- } else {
- tcg_out_opc_reg(s, OPC_XNOR, a0, a1, a2);
- }
+ tcg_out_opc_reg(s, OPC_XNOR, a0, a1, a2);
break;
case INDEX_op_not_i32:
@@ -1963,18 +1943,16 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_negsetcond_i64:
return C_O1_I2(r, r, rI);
+ case INDEX_op_sub_i32:
+ case INDEX_op_sub_i64:
+ return C_O1_I2(r, rZ, rN);
+
case INDEX_op_andc_i32:
case INDEX_op_andc_i64:
case INDEX_op_orc_i32:
case INDEX_op_orc_i64:
case INDEX_op_eqv_i32:
case INDEX_op_eqv_i64:
- return C_O1_I2(r, r, rJ);
-
- case INDEX_op_sub_i32:
- case INDEX_op_sub_i64:
- return C_O1_I2(r, rZ, rN);
-
case INDEX_op_mul_i32:
case INDEX_op_mulsh_i32:
case INDEX_op_muluh_i32:
--
2.34.1
* [PATCH 11/15] tcg/riscv: Do not accept immediate operands for sub
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
` (9 preceding siblings ...)
2024-03-12 14:38 ` [PATCH 10/15] tcg/riscv: Do not accept immediate operand for andc, orc, eqv Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 12/15] tcg/riscv: Do not accept zero operands for logicals, multiply or divide Richard Henderson
` (3 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
The transformations to neg and add immediate are now done
generically and need not be handled by the backend.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/riscv/tcg-target-con-set.h | 2 +-
tcg/riscv/tcg-target-con-str.h | 1 -
tcg/riscv/tcg-target.c.inc | 24 ++++--------------------
3 files changed, 5 insertions(+), 22 deletions(-)
diff --git a/tcg/riscv/tcg-target-con-set.h b/tcg/riscv/tcg-target-con-set.h
index 0f72281a08..13a383aeb1 100644
--- a/tcg/riscv/tcg-target-con-set.h
+++ b/tcg/riscv/tcg-target-con-set.h
@@ -13,9 +13,9 @@ C_O0_I1(r)
C_O0_I2(rZ, r)
C_O0_I2(rZ, rZ)
C_O1_I1(r, r)
+C_O1_I2(r, r, r)
C_O1_I2(r, r, ri)
C_O1_I2(r, r, rI)
-C_O1_I2(r, rZ, rN)
C_O1_I2(r, rZ, rZ)
C_N1_I2(r, r, rM)
C_O1_I4(r, r, rI, rM, rM)
diff --git a/tcg/riscv/tcg-target-con-str.h b/tcg/riscv/tcg-target-con-str.h
index 6f1cfb976c..a8d57c0e37 100644
--- a/tcg/riscv/tcg-target-con-str.h
+++ b/tcg/riscv/tcg-target-con-str.h
@@ -15,6 +15,5 @@ REGS('r', ALL_GENERAL_REGS)
* CONST(letter, TCG_CT_CONST_* bit set)
*/
CONST('I', TCG_CT_CONST_S12)
-CONST('N', TCG_CT_CONST_N12)
CONST('M', TCG_CT_CONST_M12)
CONST('Z', TCG_CT_CONST_ZERO)
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 2b889486e4..6b28f2f85d 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -136,8 +136,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
#define TCG_CT_CONST_ZERO 0x100
#define TCG_CT_CONST_S12 0x200
-#define TCG_CT_CONST_N12 0x400
-#define TCG_CT_CONST_M12 0x800
+#define TCG_CT_CONST_M12 0x400
#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
@@ -160,13 +159,6 @@ static bool tcg_target_const_match(int64_t val, int ct,
if ((ct & TCG_CT_CONST_S12) && val >= -0x800 && val <= 0x7ff) {
return 1;
}
- /*
- * Sign extended from 12 bits, negated: [-0x7ff, 0x800].
- * Used for subtraction, where a constant must be handled by ADDI.
- */
- if ((ct & TCG_CT_CONST_N12) && val >= -0x7ff && val <= 0x800) {
- return 1;
- }
/*
* Sign extended from 12 bits, +/- matching: [-0x7ff, 0x7ff].
* Used by addsub2 and movcond, which may need the negative value,
@@ -1559,18 +1551,10 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
break;
case INDEX_op_sub_i32:
- if (c2) {
- tcg_out_opc_imm(s, OPC_ADDIW, a0, a1, -a2);
- } else {
- tcg_out_opc_reg(s, OPC_SUBW, a0, a1, a2);
- }
+ tcg_out_opc_reg(s, OPC_SUBW, a0, a1, a2);
break;
case INDEX_op_sub_i64:
- if (c2) {
- tcg_out_opc_imm(s, OPC_ADDI, a0, a1, -a2);
- } else {
- tcg_out_opc_reg(s, OPC_SUB, a0, a1, a2);
- }
+ tcg_out_opc_reg(s, OPC_SUB, a0, a1, a2);
break;
case INDEX_op_and_i32:
@@ -1945,7 +1929,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_sub_i32:
case INDEX_op_sub_i64:
- return C_O1_I2(r, rZ, rN);
+ return C_O1_I2(r, r, r);
case INDEX_op_andc_i32:
case INDEX_op_andc_i64:
--
2.34.1
* [PATCH 12/15] tcg/riscv: Do not accept zero operands for logicals, multiply or divide
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
` (10 preceding siblings ...)
2024-03-12 14:38 ` [PATCH 11/15] tcg/riscv: Do not accept immediate operands for sub Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 13/15] tcg/optimize: Fold and to extu during optimize Richard Henderson
` (2 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
Trust that the optimizer has folded all of these away.
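
As a rough illustration (not from this patch), these are the kinds of
zero-operand identities that constant folding is trusted to apply
before the backend ever sees the op:

#include <assert.h>
#include <stdint.h>

int main(void)
{
    uint64_t x = 0xdeadbeefcafef00dull;   /* any non-zero value */

    assert((x & 0) == 0);       /* and  with zero -> zero */
    assert((x | 0) == x);       /* or   with zero -> x    */
    assert((x ^ 0) == x);       /* xor  with zero -> x    */
    assert((x & ~0ull) == x);   /* andc with zero -> x    */
    assert((x * 0) == 0);       /* mul  by zero   -> zero */
    assert((0 / x) == 0);       /* div  of zero   -> zero (x != 0) */
    assert((0 % x) == 0);       /* rem  of zero   -> zero (x != 0) */
    return 0;
}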
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/riscv/tcg-target-con-set.h | 1 -
tcg/riscv/tcg-target.c.inc | 4 +---
2 files changed, 1 insertion(+), 4 deletions(-)
diff --git a/tcg/riscv/tcg-target-con-set.h b/tcg/riscv/tcg-target-con-set.h
index 13a383aeb1..527d2fd4d9 100644
--- a/tcg/riscv/tcg-target-con-set.h
+++ b/tcg/riscv/tcg-target-con-set.h
@@ -16,7 +16,6 @@ C_O1_I1(r, r)
C_O1_I2(r, r, r)
C_O1_I2(r, r, ri)
C_O1_I2(r, r, rI)
-C_O1_I2(r, rZ, rZ)
C_N1_I2(r, r, rM)
C_O1_I4(r, r, rI, rM, rM)
C_O2_I4(r, r, rZ, rZ, rM, rM)
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 6b28f2f85d..0dc1b2d8f7 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -1929,8 +1929,6 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_sub_i32:
case INDEX_op_sub_i64:
- return C_O1_I2(r, r, r);
-
case INDEX_op_andc_i32:
case INDEX_op_andc_i64:
case INDEX_op_orc_i32:
@@ -1951,7 +1949,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_divu_i64:
case INDEX_op_rem_i64:
case INDEX_op_remu_i64:
- return C_O1_I2(r, rZ, rZ);
+ return C_O1_I2(r, r, r);
case INDEX_op_shl_i32:
case INDEX_op_shr_i32:
--
2.34.1
* [PATCH 13/15] tcg/optimize: Fold and to extu during optimize
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
` (11 preceding siblings ...)
2024-03-12 14:38 ` [PATCH 12/15] tcg/riscv: Do not accept zero operands for logicals, multiply or divide Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 14/15] tcg: Use arg_is_const_val in fold_sub_to_neg Richard Henderson
2024-03-12 14:38 ` [PATCH 15/15] tcg/optimize: Lower unsupported deposit during optimize Richard Henderson
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/optimize.c | 43 +++++++++++++++++++++++++++++++++++++++----
1 file changed, 39 insertions(+), 4 deletions(-)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index c6b0ab35c8..39bcd32f72 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -1300,11 +1300,46 @@ static bool fold_and(OptContext *ctx, TCGOp *op)
ctx->s_mask = arg_info(op->args[1])->s_mask
& arg_info(op->args[2])->s_mask;
- /*
- * Known-zeros does not imply known-ones. Therefore unless
- * arg2 is constant, we can't infer affected bits from it.
- */
if (arg_is_const(op->args[2])) {
+ TCGOpcode ext8 = 0, ext16 = 0, ext32 = 0;
+
+ /* Canonicalize as zero-extend, if supported. */
+ switch (ctx->type) {
+ case TCG_TYPE_I32:
+ ext8 = TCG_TARGET_HAS_ext8u_i32 ? INDEX_op_ext8u_i32 : 0;
+ ext16 = TCG_TARGET_HAS_ext16u_i32 ? INDEX_op_ext16u_i32 : 0;
+ break;
+ case TCG_TYPE_I64:
+ ext8 = TCG_TARGET_HAS_ext8u_i64 ? INDEX_op_ext8u_i64 : 0;
+ ext16 = TCG_TARGET_HAS_ext16u_i64 ? INDEX_op_ext16u_i64 : 0;
+ ext32 = TCG_TARGET_HAS_ext32u_i64 ? INDEX_op_ext32u_i64 : 0;
+ break;
+ default:
+ break;
+ }
+
+ switch (arg_info(op->args[2])->val) {
+ case 0xff:
+ if (ext8) {
+ op->opc = ext8;
+ }
+ break;
+ case 0xffff:
+ if (ext16) {
+ op->opc = ext16;
+ }
+ break;
+ case UINT32_MAX:
+ if (ext32) {
+ op->opc = ext32;
+ }
+ break;
+ }
+
+ /*
+ * Known-zeros does not imply known-ones. Therefore unless
+ * arg2 is constant, we can't infer affected bits from it.
+ */
ctx->a_mask = z1 & ~z2;
}
--
2.34.1
* [PATCH 14/15] tcg: Use arg_is_const_val in fold_sub_to_neg
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
` (12 preceding siblings ...)
2024-03-12 14:38 ` [PATCH 13/15] tcg/optimize: Fold and to extu during optimize Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
2024-03-12 14:38 ` [PATCH 15/15] tcg/optimize: Lower unsupported deposit during optimize Richard Henderson
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/optimize.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 39bcd32f72..f3867ce9e6 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -2451,7 +2451,7 @@ static bool fold_sub_to_neg(OptContext *ctx, TCGOp *op)
TCGOpcode neg_op;
bool have_neg;
- if (!arg_is_const(op->args[1]) || arg_info(op->args[1])->val != 0) {
+ if (!arg_is_const_val(op->args[1], 0)) {
return false;
}
--
2.34.1
* [PATCH 15/15] tcg/optimize: Lower unsupported deposit during optimize
2024-03-12 14:38 [PATCH for-9.1 00/15] tcg: Canonicalize operations during optimize Richard Henderson
` (13 preceding siblings ...)
2024-03-12 14:38 ` [PATCH 14/15] tcg: Use arg_is_const_val in fold_sub_to_neg Richard Henderson
@ 2024-03-12 14:38 ` Richard Henderson
14 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-12 14:38 UTC (permalink / raw)
To: qemu-devel
The expansions that we chose in tcg-op.c may be less than optimal.
Delay lowering until optimize, so that we have propagated constants
and have computed known zero masks.
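
For illustration only (not part of the patch), a sketch of the deposit
semantics and of the constant special cases that only become visible
once constants have propagated; deposit64 here is a local stand-in for
QEMU's helper:

#include <assert.h>
#include <stdint.h>

/* Stand-in for deposit64(): insert the low LEN bits of FIELD into VAL
   at bit offset OFS. */
static uint64_t deposit64(uint64_t val, int ofs, int len, uint64_t field)
{
    uint64_t mask = (len == 64 ? ~0ull : (1ull << len) - 1) << ofs;
    return (val & ~mask) | ((field << ofs) & mask);
}

int main(void)
{
    uint64_t x = 0x1122334455667788ull;
    uint64_t m = 0xffull << 8;

    /* Insert all-zero field: reduces to AND with the inverted mask. */
    assert(deposit64(x, 8, 8, 0) == (x & ~m));
    /* Insert all-one field: reduces to OR with the mask. */
    assert(deposit64(x, 8, 8, 0xff) == (x | m));
    /* Insert into zero: reduces to AND (or ext*u) plus a shift. */
    assert(deposit64(0, 8, 8, 0x5a) == ((0x5aull & 0xff) << 8));
    return 0;
}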
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/optimize.c | 231 +++++++++++++++++++++++++++++++++++++++++-----
tcg/tcg-op.c | 244 ++++++++++++-------------------------------------
2 files changed, 266 insertions(+), 209 deletions(-)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index f3867ce9e6..ce1dbab097 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -1632,51 +1632,234 @@ static bool fold_ctpop(OptContext *ctx, TCGOp *op)
static bool fold_deposit(OptContext *ctx, TCGOp *op)
{
- TCGOpcode and_opc;
+ TCGOpcode and_opc, or_opc, ex2_opc, shl_opc, rotl_opc;
+ TCGOp *op2;
+ TCGArg ret = op->args[0];
+ TCGArg arg1 = op->args[1];
+ TCGArg arg2 = op->args[2];
+ int ofs = op->args[3];
+ int len = op->args[4];
+ int width;
+ uint64_t type_mask;
+ bool valid;
- if (arg_is_const(op->args[1]) && arg_is_const(op->args[2])) {
- uint64_t t1 = arg_info(op->args[1])->val;
- uint64_t t2 = arg_info(op->args[2])->val;
+ if (arg_is_const(arg1) && arg_is_const(arg2)) {
+ uint64_t t1 = arg_info(arg1)->val;
+ uint64_t t2 = arg_info(arg2)->val;
- t1 = deposit64(t1, op->args[3], op->args[4], t2);
- return tcg_opt_gen_movi(ctx, op, op->args[0], t1);
+ t1 = deposit64(t1, ofs, len, t2);
+ return tcg_opt_gen_movi(ctx, op, ret, t1);
}
switch (ctx->type) {
case TCG_TYPE_I32:
and_opc = INDEX_op_and_i32;
+ or_opc = INDEX_op_or_i32;
+ shl_opc = INDEX_op_shl_i32;
+ ex2_opc = TCG_TARGET_HAS_extract2_i32 ? INDEX_op_extract2_i32 : 0;
+ rotl_opc = TCG_TARGET_HAS_rot_i32 ? INDEX_op_rotl_i32 : 0;
+ valid = (TCG_TARGET_HAS_deposit_i32 &&
+ TCG_TARGET_deposit_i32_valid(ofs, len));
+ width = 32;
+ type_mask = UINT32_MAX;
break;
case TCG_TYPE_I64:
and_opc = INDEX_op_and_i64;
+ or_opc = INDEX_op_or_i64;
+ shl_opc = INDEX_op_shl_i64;
+ ex2_opc = TCG_TARGET_HAS_extract2_i64 ? INDEX_op_extract2_i64 : 0;
+ rotl_opc = TCG_TARGET_HAS_rot_i64 ? INDEX_op_rotl_i64 : 0;
+ valid = (TCG_TARGET_HAS_deposit_i64 &&
+ TCG_TARGET_deposit_i64_valid(ofs, len));
+ width = 64;
+ type_mask = UINT64_MAX;
break;
default:
g_assert_not_reached();
}
- /* Inserting a value into zero at offset 0. */
- if (arg_is_const_val(op->args[1], 0) && op->args[3] == 0) {
- uint64_t mask = MAKE_64BIT_MASK(0, op->args[4]);
+ if (arg_is_const(arg2)) {
+ uint64_t val = arg_info(arg2)->val;
+ uint64_t mask = MAKE_64BIT_MASK(0, len);
- op->opc = and_opc;
- op->args[1] = op->args[2];
- op->args[2] = arg_new_constant(ctx, mask);
- ctx->z_mask = mask & arg_info(op->args[1])->z_mask;
- return false;
+ /* Inserting all-zero into a value. */
+ if ((val & mask) == 0) {
+ op->opc = and_opc;
+ op->args[2] = arg_new_constant(ctx, ~(mask << ofs));
+ return fold_and(ctx, op);
+ }
+
+ /* Inserting all-one into a value. */
+ if ((val & mask) == mask) {
+ op->opc = or_opc;
+ op->args[2] = arg_new_constant(ctx, mask << ofs);
+ goto done;
+ }
+
+ /* Lower invalid deposit of constant as AND + OR. */
+ if (!valid) {
+ op2 = tcg_op_insert_before(ctx->tcg, op, and_opc, 3);
+ op2->args[0] = ret;
+ op2->args[1] = arg1;
+ op2->args[2] = arg_new_constant(ctx, ~(mask << ofs));
+ fold_and(ctx, op2); /* fold to ext*u */
+
+ op->opc = or_opc;
+ op->args[1] = ret;
+ op->args[2] = arg_new_constant(ctx, (val & mask) << ofs);
+ goto done;
+ }
}
- /* Inserting zero into a value. */
- if (arg_is_const_val(op->args[2], 0)) {
- uint64_t mask = deposit64(-1, op->args[3], op->args[4], 0);
+ /* Inserting a value into zero. */
+ if (arg_is_const_val(arg1, 0)) {
+ uint64_t mask = MAKE_64BIT_MASK(0, len);
+ uint64_t need_mask = arg_info(arg2)->z_mask & ~mask & type_mask;
- op->opc = and_opc;
- op->args[2] = arg_new_constant(ctx, mask);
- ctx->z_mask = mask & arg_info(op->args[1])->z_mask;
- return false;
+ /* Always lower deposit into zero at 0 as AND. */
+ if (ofs == 0) {
+ if (!need_mask) {
+ return tcg_opt_gen_mov(ctx, op, ret, arg2);
+ }
+ op->opc = and_opc;
+ op->args[1] = arg2;
+ op->args[2] = arg_new_constant(ctx, mask);
+ return fold_and(ctx, op);
+ }
+
+ /* If no mask required, fold as SHL. */
+ if (!((need_mask << ofs) & type_mask)) {
+ op->opc = shl_opc;
+ op->args[1] = arg2;
+ op->args[2] = arg_new_constant(ctx, ofs);
+ goto done;
+ }
+
+ /* Lower invalid deposit into zero as AND + SHL. */
+ if (!valid) {
+ /*
+ * ret = arg2 & mask
+ * ret = ret << ofs
+ */
+ TCGOpcode ext_second_opc = 0;
+
+ switch (ofs + len) {
+ case 8:
+ ext_second_opc =
+ (ctx->type == TCG_TYPE_I32
+ ? (TCG_TARGET_HAS_ext8u_i32 ? INDEX_op_ext8u_i32 : 0)
+ : (TCG_TARGET_HAS_ext8u_i64 ? INDEX_op_ext8u_i64 : 0));
+ break;
+ case 16:
+ ext_second_opc =
+ (ctx->type == TCG_TYPE_I32
+ ? (TCG_TARGET_HAS_ext16u_i32 ? INDEX_op_ext16u_i32 : 0)
+ : (TCG_TARGET_HAS_ext16u_i64 ? INDEX_op_ext16u_i64 : 0));
+ break;
+ case 32:
+ ext_second_opc =
+ TCG_TARGET_HAS_ext32u_i64 ? INDEX_op_ext32u_i64 : 0;
+ break;
+ }
+
+ if (ext_second_opc) {
+ op2 = tcg_op_insert_before(ctx->tcg, op, shl_opc, 3);
+ op2->args[0] = ret;
+ op2->args[1] = arg2;
+ op2->args[2] = arg_new_constant(ctx, ofs);
+
+ op->opc = ext_second_opc;
+ op->args[1] = ret;
+ } else {
+ op2 = tcg_op_insert_before(ctx->tcg, op, and_opc, 3);
+ op2->args[0] = ret;
+ op2->args[1] = arg2;
+ op2->args[2] = arg_new_constant(ctx, mask);
+ fold_and(ctx, op2);
+
+ op->opc = shl_opc;
+ op->args[1] = ret;
+ op->args[2] = arg_new_constant(ctx, ofs);
+ }
+ goto done;
+ }
}
- ctx->z_mask = deposit64(arg_info(op->args[1])->z_mask,
- op->args[3], op->args[4],
- arg_info(op->args[2])->z_mask);
+ /* After special cases, lower invalid deposit. */
+ if (!valid) {
+ uint64_t mask = MAKE_64BIT_MASK(0, len);
+ TCGArg tmp;
+
+ /*
+ * ret = arg2:arg1 >> len
+ * ret = rotl(ret, len)
+ */
+ if (ex2_opc && rotl_opc && ofs == 0) {
+ op2 = tcg_op_insert_before(ctx->tcg, op, ex2_opc, 4);
+ op2->args[0] = ret;
+ op2->args[1] = arg1;
+ op2->args[2] = arg2;
+ op2->args[3] = len;
+
+ op->opc = rotl_opc;
+ op->args[1] = ret;
+ op->args[2] = arg_new_constant(ctx, len);
+ goto done;
+ }
+
+ /*
+ * tmp = arg1 << len
+ * ret = arg2:tmp >> len
+ */
+ if (ex2_opc && ofs + len == width) {
+ tmp = ret == arg2 ? arg_new_temp(ctx) : ret;
+
+ op2 = tcg_op_insert_before(ctx->tcg, op, shl_opc, 4);
+ op2->args[0] = tmp;
+ op2->args[1] = arg1;
+ op2->args[2] = arg_new_constant(ctx, len);
+
+ op->opc = ex2_opc;
+ op->args[0] = ret;
+ op->args[1] = tmp;
+ op->args[2] = arg2;
+ op->args[3] = len;
+ goto done;
+ }
+
+ /*
+ * tmp = arg2 & mask
+ * ret = arg1 & ~(mask << ofs)
+ * tmp = tmp << ofs
+ * ret = ret | tmp
+ */
+ tmp = arg_new_temp(ctx);
+
+ op2 = tcg_op_insert_before(ctx->tcg, op, and_opc, 3);
+ op2->args[0] = tmp;
+ op2->args[1] = arg2;
+ op2->args[2] = arg_new_constant(ctx, mask);
+ fold_and(ctx, op2);
+
+ op2 = tcg_op_insert_before(ctx->tcg, op, shl_opc, 3);
+ op2->args[0] = tmp;
+ op2->args[1] = tmp;
+ op2->args[2] = arg_new_constant(ctx, ofs);
+
+ op2 = tcg_op_insert_before(ctx->tcg, op, and_opc, 3);
+ op2->args[0] = ret;
+ op2->args[1] = arg1;
+ op2->args[2] = arg_new_constant(ctx, ~(mask << ofs));
+ fold_and(ctx, op2);
+
+ op->opc = or_opc;
+ op->args[1] = ret;
+ op->args[2] = tmp;
+ }
+
+ done:
+ ctx->z_mask = deposit64(arg_info(arg1)->z_mask, ofs, len,
+ arg_info(arg2)->z_mask);
return false;
}
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index aa6bc6f57d..76a1f5e296 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -874,9 +874,6 @@ void tcg_gen_rotri_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
void tcg_gen_deposit_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2,
unsigned int ofs, unsigned int len)
{
- uint32_t mask;
- TCGv_i32 t1;
-
tcg_debug_assert(ofs < 32);
tcg_debug_assert(len > 0);
tcg_debug_assert(len <= 32);
@@ -886,37 +883,7 @@ void tcg_gen_deposit_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2,
tcg_gen_mov_i32(ret, arg2);
return;
}
- if (TCG_TARGET_HAS_deposit_i32 && TCG_TARGET_deposit_i32_valid(ofs, len)) {
- tcg_gen_op5ii_i32(INDEX_op_deposit_i32, ret, arg1, arg2, ofs, len);
- return;
- }
-
- t1 = tcg_temp_ebb_new_i32();
-
- if (TCG_TARGET_HAS_extract2_i32) {
- if (ofs + len == 32) {
- tcg_gen_shli_i32(t1, arg1, len);
- tcg_gen_extract2_i32(ret, t1, arg2, len);
- goto done;
- }
- if (ofs == 0) {
- tcg_gen_extract2_i32(ret, arg1, arg2, len);
- tcg_gen_rotli_i32(ret, ret, len);
- goto done;
- }
- }
-
- mask = (1u << len) - 1;
- if (ofs + len < 32) {
- tcg_gen_andi_i32(t1, arg2, mask);
- tcg_gen_shli_i32(t1, t1, ofs);
- } else {
- tcg_gen_shli_i32(t1, arg2, ofs);
- }
- tcg_gen_andi_i32(ret, arg1, ~(mask << ofs));
- tcg_gen_or_i32(ret, ret, t1);
- done:
- tcg_temp_free_i32(t1);
+ tcg_gen_op5ii_i32(INDEX_op_deposit_i32, ret, arg1, arg2, ofs, len);
}
void tcg_gen_deposit_z_i32(TCGv_i32 ret, TCGv_i32 arg,
@@ -931,48 +898,9 @@ void tcg_gen_deposit_z_i32(TCGv_i32 ret, TCGv_i32 arg,
tcg_gen_shli_i32(ret, arg, ofs);
} else if (ofs == 0) {
tcg_gen_andi_i32(ret, arg, (1u << len) - 1);
- } else if (TCG_TARGET_HAS_deposit_i32
- && TCG_TARGET_deposit_i32_valid(ofs, len)) {
+ } else {
TCGv_i32 zero = tcg_constant_i32(0);
tcg_gen_op5ii_i32(INDEX_op_deposit_i32, ret, zero, arg, ofs, len);
- } else {
- /* To help two-operand hosts we prefer to zero-extend first,
- which allows ARG to stay live. */
- switch (len) {
- case 16:
- if (TCG_TARGET_HAS_ext16u_i32) {
- tcg_gen_ext16u_i32(ret, arg);
- tcg_gen_shli_i32(ret, ret, ofs);
- return;
- }
- break;
- case 8:
- if (TCG_TARGET_HAS_ext8u_i32) {
- tcg_gen_ext8u_i32(ret, arg);
- tcg_gen_shli_i32(ret, ret, ofs);
- return;
- }
- break;
- }
- /* Otherwise prefer zero-extension over AND for code size. */
- switch (ofs + len) {
- case 16:
- if (TCG_TARGET_HAS_ext16u_i32) {
- tcg_gen_shli_i32(ret, arg, ofs);
- tcg_gen_ext16u_i32(ret, ret);
- return;
- }
- break;
- case 8:
- if (TCG_TARGET_HAS_ext8u_i32) {
- tcg_gen_shli_i32(ret, arg, ofs);
- tcg_gen_ext8u_i32(ret, ret);
- return;
- }
- break;
- }
- tcg_gen_andi_i32(ret, arg, (1u << len) - 1);
- tcg_gen_shli_i32(ret, ret, ofs);
}
}
@@ -2611,9 +2539,6 @@ void tcg_gen_rotri_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
void tcg_gen_deposit_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2,
unsigned int ofs, unsigned int len)
{
- uint64_t mask;
- TCGv_i64 t1;
-
tcg_debug_assert(ofs < 64);
tcg_debug_assert(len > 0);
tcg_debug_assert(len <= 64);
@@ -2623,52 +2548,41 @@ void tcg_gen_deposit_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2,
tcg_gen_mov_i64(ret, arg2);
return;
}
- if (TCG_TARGET_HAS_deposit_i64 && TCG_TARGET_deposit_i64_valid(ofs, len)) {
+
+ if (TCG_TARGET_REG_BITS == 64) {
tcg_gen_op5ii_i64(INDEX_op_deposit_i64, ret, arg1, arg2, ofs, len);
- return;
- }
-
- if (TCG_TARGET_REG_BITS == 32) {
- if (ofs >= 32) {
- tcg_gen_deposit_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1),
- TCGV_LOW(arg2), ofs - 32, len);
- tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg1));
- return;
- }
- if (ofs + len <= 32) {
- tcg_gen_deposit_i32(TCGV_LOW(ret), TCGV_LOW(arg1),
- TCGV_LOW(arg2), ofs, len);
- tcg_gen_mov_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1));
- return;
- }
- }
-
- t1 = tcg_temp_ebb_new_i64();
-
- if (TCG_TARGET_HAS_extract2_i64) {
- if (ofs + len == 64) {
- tcg_gen_shli_i64(t1, arg1, len);
- tcg_gen_extract2_i64(ret, t1, arg2, len);
- goto done;
- }
- if (ofs == 0) {
- tcg_gen_extract2_i64(ret, arg1, arg2, len);
- tcg_gen_rotli_i64(ret, ret, len);
- goto done;
- }
- }
-
- mask = (1ull << len) - 1;
- if (ofs + len < 64) {
- tcg_gen_andi_i64(t1, arg2, mask);
- tcg_gen_shli_i64(t1, t1, ofs);
+ } else if (ofs >= 32) {
+ tcg_gen_deposit_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1),
+ TCGV_LOW(arg2), ofs - 32, len);
+ tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg1));
+ } else if (ofs + len <= 32) {
+ tcg_gen_deposit_i32(TCGV_LOW(ret), TCGV_LOW(arg1),
+ TCGV_LOW(arg2), ofs, len);
+ tcg_gen_mov_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1));
+ } else if (ofs == 0) {
+ tcg_gen_deposit_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1),
+ TCGV_HIGH(arg2), 0, len - 32);
+ tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg2));
} else {
- tcg_gen_shli_i64(t1, arg2, ofs);
+ /* The 64-bit deposit is split across the 32-bit halves. */
+ unsigned lo_len = 32 - ofs;
+ unsigned hi_len = len - lo_len;
+ TCGv_i32 tl = tcg_temp_ebb_new_i32();
+ TCGv_i32 th = tcg_temp_ebb_new_i32();
+
+ tcg_gen_deposit_i32(tl, TCGV_LOW(arg1), TCGV_LOW(arg2), ofs, lo_len);
+ if (len <= 32) {
+ tcg_gen_shri_i32(th, TCGV_LOW(arg2), lo_len);
+ } else {
+ tcg_gen_extract2_i32(th, TCGV_LOW(arg2), TCGV_HIGH(arg2), lo_len);
+ }
+ tcg_gen_deposit_i32(th, TCGV_HIGH(arg1), th, 0, hi_len);
+
+ tcg_gen_mov_i32(TCGV_LOW(ret), tl);
+ tcg_gen_mov_i32(TCGV_HIGH(ret), th);
+ tcg_temp_free_i32(tl);
+ tcg_temp_free_i32(th);
}
- tcg_gen_andi_i64(ret, arg1, ~(mask << ofs));
- tcg_gen_or_i64(ret, ret, t1);
- done:
- tcg_temp_free_i64(t1);
}
void tcg_gen_deposit_z_i64(TCGv_i64 ret, TCGv_i64 arg,
@@ -2683,75 +2597,35 @@ void tcg_gen_deposit_z_i64(TCGv_i64 ret, TCGv_i64 arg,
tcg_gen_shli_i64(ret, arg, ofs);
} else if (ofs == 0) {
tcg_gen_andi_i64(ret, arg, (1ull << len) - 1);
- } else if (TCG_TARGET_HAS_deposit_i64
- && TCG_TARGET_deposit_i64_valid(ofs, len)) {
+ } else if (TCG_TARGET_REG_BITS == 64) {
TCGv_i64 zero = tcg_constant_i64(0);
tcg_gen_op5ii_i64(INDEX_op_deposit_i64, ret, zero, arg, ofs, len);
+ } else if (ofs >= 32) {
+ tcg_gen_deposit_z_i32(TCGV_HIGH(ret), TCGV_LOW(arg), ofs - 32, len);
+ tcg_gen_movi_i32(TCGV_LOW(ret), 0);
+ } else if (ofs + len <= 32) {
+ tcg_gen_deposit_z_i32(TCGV_LOW(ret), TCGV_LOW(arg), ofs, len);
+ tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+ } else if (ofs == 0) {
+ tcg_gen_deposit_z_i32(TCGV_HIGH(ret), TCGV_HIGH(arg), 0, len - 32);
+ tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg));
} else {
- if (TCG_TARGET_REG_BITS == 32) {
- if (ofs >= 32) {
- tcg_gen_deposit_z_i32(TCGV_HIGH(ret), TCGV_LOW(arg),
- ofs - 32, len);
- tcg_gen_movi_i32(TCGV_LOW(ret), 0);
- return;
- }
- if (ofs + len <= 32) {
- tcg_gen_deposit_z_i32(TCGV_LOW(ret), TCGV_LOW(arg), ofs, len);
- tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
- return;
- }
+ /* The 64-bit deposit is split across the 32-bit halves. */
+ unsigned lo_len = 32 - ofs;
+ unsigned hi_len = len - lo_len;
+ TCGv_i32 tl = tcg_temp_ebb_new_i32();
+ TCGv_i32 th = TCGV_HIGH(ret);
+
+ tcg_gen_deposit_z_i32(tl, TCGV_LOW(arg), ofs, lo_len);
+ if (len <= 32) {
+ tcg_gen_shri_i32(th, TCGV_LOW(arg), lo_len);
+ } else {
+ tcg_gen_extract2_i32(th, TCGV_LOW(arg), TCGV_HIGH(arg), lo_len);
}
- /* To help two-operand hosts we prefer to zero-extend first,
- which allows ARG to stay live. */
- switch (len) {
- case 32:
- if (TCG_TARGET_HAS_ext32u_i64) {
- tcg_gen_ext32u_i64(ret, arg);
- tcg_gen_shli_i64(ret, ret, ofs);
- return;
- }
- break;
- case 16:
- if (TCG_TARGET_HAS_ext16u_i64) {
- tcg_gen_ext16u_i64(ret, arg);
- tcg_gen_shli_i64(ret, ret, ofs);
- return;
- }
- break;
- case 8:
- if (TCG_TARGET_HAS_ext8u_i64) {
- tcg_gen_ext8u_i64(ret, arg);
- tcg_gen_shli_i64(ret, ret, ofs);
- return;
- }
- break;
- }
- /* Otherwise prefer zero-extension over AND for code size. */
- switch (ofs + len) {
- case 32:
- if (TCG_TARGET_HAS_ext32u_i64) {
- tcg_gen_shli_i64(ret, arg, ofs);
- tcg_gen_ext32u_i64(ret, ret);
- return;
- }
- break;
- case 16:
- if (TCG_TARGET_HAS_ext16u_i64) {
- tcg_gen_shli_i64(ret, arg, ofs);
- tcg_gen_ext16u_i64(ret, ret);
- return;
- }
- break;
- case 8:
- if (TCG_TARGET_HAS_ext8u_i64) {
- tcg_gen_shli_i64(ret, arg, ofs);
- tcg_gen_ext8u_i64(ret, ret);
- return;
- }
- break;
- }
- tcg_gen_andi_i64(ret, arg, (1ull << len) - 1);
- tcg_gen_shli_i64(ret, ret, ofs);
+ tcg_gen_deposit_z_i32(th, th, 0, hi_len);
+
+ tcg_gen_mov_i32(TCGV_LOW(ret), tl);
+ tcg_temp_free_i32(tl);
}
}
--
2.34.1
* Re: [PATCH 01/15] tcg/optimize: Fold andc with immediate to and
2024-03-12 14:38 ` [PATCH 01/15] tcg/optimize: Fold andc with immediate to and Richard Henderson
@ 2024-03-13 1:29 ` Richard Henderson
0 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2024-03-13 1:29 UTC (permalink / raw)
To: qemu-devel
On 3/12/24 04:38, Richard Henderson wrote:
> + /* Fold andc r,x,i to and r,x,~i. */
> + op->opc = (ctx->type == TCG_TYPE_I32
> + ? INDEX_op_and_i32 : INDEX_op_and_i64);
This and the next two patches also need to handle vector types.
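
A hypothetical sketch (not from the series) of what the opcode
selection might look like once vector types are handled as well;
building the inverted constant for a vector type is not shown:

    switch (ctx->type) {
    case TCG_TYPE_I32:
        op->opc = INDEX_op_and_i32;
        break;
    case TCG_TYPE_I64:
        op->opc = INDEX_op_and_i64;
        break;
    case TCG_TYPE_V64:
    case TCG_TYPE_V128:
    case TCG_TYPE_V256:
        op->opc = INDEX_op_and_vec;
        break;
    default:
        g_assert_not_reached();
    }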
r~