* [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes
@ 2023-08-08  3:11 Richard Henderson
  2023-08-08  3:11 ` [PATCH 01/24] " Richard Henderson
                   ` (23 more replies)
  0 siblings, 24 replies; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Introduce two new setcond opcode variants which produce -1 instead
of 1 when the condition is true.  For most of our hosts, producing -1
is just as easy as producing 1, and avoids requiring a separate negate
instruction.

Use the new opcode in tcg/tcg-op-gvec.c for integral expansion of
generic vector operations.  I also looked through target/ for obvious
pairings of setcond and neg and converted them.
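
For readers new to the opcode, the intended result can be modelled in
plain C.  This is only an illustrative sketch, not QEMU code: the
model_* names are hypothetical, and the condition is fixed to unsigned
less-than for brevity, whereas the real opcode takes any TCGCond.

```c
#include <stdint.h>

/* Hypothetical model of setcond: result is 1 if a < b, otherwise 0. */
static uint32_t model_setcond_ltu(uint32_t a, uint32_t b)
{
    return a < b;
}

/*
 * Hypothetical model of negsetcond: result is -1 (all bits set) if
 * a < b, otherwise 0.  This equals setcond followed by neg.
 */
static uint32_t model_negsetcond_ltu(uint32_t a, uint32_t b)
{
    return -(uint32_t)(a < b);
}
```

The all-ones result is what makes the value directly usable as a mask,
which is why it pairs so often with and/andc in the targets.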


r~


Richard Henderson (24):
  tcg: Introduce negsetcond opcodes
  tcg: Use tcg_gen_negsetcond_*
  target/alpha: Use tcg_gen_movcond_i64 in gen_fold_mzero
  target/arm: Use tcg_gen_negsetcond_*
  target/m68k: Use tcg_gen_negsetcond_*
  target/openrisc: Use tcg_gen_negsetcond_*
  target/ppc: Use tcg_gen_negsetcond_*
  target/sparc: Use tcg_gen_movcond_i64 in gen_edge
  target/tricore: Replace gen_cond_w with tcg_gen_negsetcond_tl
  tcg/ppc: Implement negsetcond_*
  tcg/ppc: Use the Set Boolean Extension
  tcg/aarch64: Implement negsetcond_*
  tcg/arm: Implement negsetcond_i32
  tcg/riscv: Implement negsetcond_*
  tcg/s390x: Implement negsetcond_*
  tcg/sparc64: Implement negsetcond_*
  tcg/i386: Merge tcg_out_brcond{32,64}
  tcg/i386: Merge tcg_out_setcond{32,64}
  tcg/i386: Merge tcg_out_movcond{32,64}
  tcg/i386: Add cf parameter to tcg_out_cmp
  tcg/i386: Use CMP+SBB in tcg_out_setcond
  tcg/i386: Clear dest first in tcg_out_setcond if possible
  tcg/i386: Use shift in tcg_out_setcond
  tcg/i386: Implement negsetcond_*

 docs/devel/tcg-ops.rst                     |   6 +
 include/tcg/tcg-op-common.h                |   4 +
 include/tcg/tcg-op.h                       |   2 +
 include/tcg/tcg-opc.h                      |   2 +
 include/tcg/tcg.h                          |   1 +
 tcg/aarch64/tcg-target.h                   |   2 +
 tcg/arm/tcg-target.h                       |   1 +
 tcg/i386/tcg-target.h                      |   2 +
 tcg/loongarch64/tcg-target.h               |   3 +
 tcg/mips/tcg-target.h                      |   2 +
 tcg/ppc/tcg-target.h                       |   2 +
 tcg/riscv/tcg-target.h                     |   2 +
 tcg/s390x/tcg-target.h                     |   2 +
 tcg/sparc64/tcg-target.h                   |   2 +
 tcg/tci/tcg-target.h                       |   2 +
 target/alpha/translate.c                   |   7 +-
 target/arm/tcg/translate-a64.c             |  22 +-
 target/arm/tcg/translate.c                 |  12 +-
 target/m68k/translate.c                    |  24 +-
 target/openrisc/translate.c                |   6 +-
 target/sparc/translate.c                   |  17 +-
 target/tricore/translate.c                 |  16 +-
 tcg/optimize.c                             |  41 +++-
 tcg/tcg-op-gvec.c                          |   6 +-
 tcg/tcg-op.c                               |  42 +++-
 tcg/tcg.c                                  |   6 +
 target/ppc/translate/fixedpoint-impl.c.inc |   6 +-
 target/ppc/translate/vmx-impl.c.inc        |   8 +-
 tcg/aarch64/tcg-target.c.inc               |  12 +
 tcg/arm/tcg-target.c.inc                   |   9 +
 tcg/i386/tcg-target.c.inc                  | 265 +++++++++++++--------
 tcg/ppc/tcg-target.c.inc                   | 149 ++++++++----
 tcg/riscv/tcg-target.c.inc                 |  45 ++++
 tcg/s390x/tcg-target.c.inc                 |  78 ++++--
 tcg/sparc64/tcg-target.c.inc               |  36 ++-
 35 files changed, 572 insertions(+), 270 deletions(-)

-- 
2.34.1




* [PATCH 01/24] tcg: Introduce negsetcond opcodes
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-10 16:12   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 02/24] tcg: Use tcg_gen_negsetcond_* Richard Henderson
                   ` (22 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

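Introduce a new opcode for negative setcond, which sets the
destination to -1 rather than 1 when the condition is true.
Hosts that do not implement it fall back to setcond + neg.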

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
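Note for reviewers: on 32-bit hosts the i64 fallback below computes the
comparison into the low half, negates it, and copies it to the high
half.  A rough stand-alone model of that sequence follows.  It is a
sketch under assumptions, not the patch itself: pair64 and the model_*
name are hypothetical, and the condition is fixed to unsigned
less-than.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit value held as two 32-bit halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} pair64;

static pair64 model_negsetcond_i64_on_32(pair64 a, pair64 b)
{
    /* setcond2-style unsigned 64-bit compare: yields 0 or 1. */
    uint32_t lt = (a.hi < b.hi) || (a.hi == b.hi && a.lo < b.lo);
    pair64 r;

    r.lo = -lt;     /* neg: 0 or 0xffffffff */
    r.hi = r.lo;    /* mov: replicate so the full value is 0 or -1 */
    return r;
}
```

Copying the low half works because the negated result is either all
zeros or all ones, so both halves are always identical.
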
 docs/devel/tcg-ops.rst       |  6 ++++++
 include/tcg/tcg-op-common.h  |  4 ++++
 include/tcg/tcg-op.h         |  2 ++
 include/tcg/tcg-opc.h        |  2 ++
 include/tcg/tcg.h            |  1 +
 tcg/aarch64/tcg-target.h     |  2 ++
 tcg/arm/tcg-target.h         |  1 +
 tcg/i386/tcg-target.h        |  2 ++
 tcg/loongarch64/tcg-target.h |  3 +++
 tcg/mips/tcg-target.h        |  2 ++
 tcg/ppc/tcg-target.h         |  2 ++
 tcg/riscv/tcg-target.h       |  2 ++
 tcg/s390x/tcg-target.h       |  2 ++
 tcg/sparc64/tcg-target.h     |  2 ++
 tcg/tci/tcg-target.h         |  2 ++
 tcg/optimize.c               | 41 +++++++++++++++++++++++++++++++++++-
 tcg/tcg-op.c                 | 36 +++++++++++++++++++++++++++++++
 tcg/tcg.c                    |  6 ++++++
 18 files changed, 117 insertions(+), 1 deletion(-)

diff --git a/docs/devel/tcg-ops.rst b/docs/devel/tcg-ops.rst
index 6a166c5665..fbde8040d7 100644
--- a/docs/devel/tcg-ops.rst
+++ b/docs/devel/tcg-ops.rst
@@ -498,6 +498,12 @@ Conditional moves
        |
        | Set *dest* to 1 if (*t1* *cond* *t2*) is true, otherwise set to 0.
 
+   * - negsetcond_i32/i64 *dest*, *t1*, *t2*, *cond*
+
+     - | *dest* = -(*t1* *cond* *t2*)
+       |
+       | Set *dest* to -1 if (*t1* *cond* *t2*) is true, otherwise set to 0.
+
    * - movcond_i32/i64 *dest*, *c1*, *c2*, *v1*, *v2*, *cond*
 
      - | *dest* = (*c1* *cond* *c2* ? *v1* : *v2*)
diff --git a/include/tcg/tcg-op-common.h b/include/tcg/tcg-op-common.h
index be382bbf77..a53b15933b 100644
--- a/include/tcg/tcg-op-common.h
+++ b/include/tcg/tcg-op-common.h
@@ -344,6 +344,8 @@ void tcg_gen_setcond_i32(TCGCond cond, TCGv_i32 ret,
                          TCGv_i32 arg1, TCGv_i32 arg2);
 void tcg_gen_setcondi_i32(TCGCond cond, TCGv_i32 ret,
                           TCGv_i32 arg1, int32_t arg2);
+void tcg_gen_negsetcond_i32(TCGCond cond, TCGv_i32 ret,
+                            TCGv_i32 arg1, TCGv_i32 arg2);
 void tcg_gen_movcond_i32(TCGCond cond, TCGv_i32 ret, TCGv_i32 c1,
                          TCGv_i32 c2, TCGv_i32 v1, TCGv_i32 v2);
 void tcg_gen_add2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 al,
@@ -540,6 +542,8 @@ void tcg_gen_setcond_i64(TCGCond cond, TCGv_i64 ret,
                          TCGv_i64 arg1, TCGv_i64 arg2);
 void tcg_gen_setcondi_i64(TCGCond cond, TCGv_i64 ret,
                           TCGv_i64 arg1, int64_t arg2);
+void tcg_gen_negsetcond_i64(TCGCond cond, TCGv_i64 ret,
+                            TCGv_i64 arg1, TCGv_i64 arg2);
 void tcg_gen_movcond_i64(TCGCond cond, TCGv_i64 ret, TCGv_i64 c1,
                          TCGv_i64 c2, TCGv_i64 v1, TCGv_i64 v2);
 void tcg_gen_add2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 al,
diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
index d63683c47b..80cfcf8104 100644
--- a/include/tcg/tcg-op.h
+++ b/include/tcg/tcg-op.h
@@ -200,6 +200,7 @@ DEF_ATOMIC2(tcg_gen_atomic_umax_fetch, i64)
 #define tcg_gen_brcondi_tl tcg_gen_brcondi_i64
 #define tcg_gen_setcond_tl tcg_gen_setcond_i64
 #define tcg_gen_setcondi_tl tcg_gen_setcondi_i64
+#define tcg_gen_negsetcond_tl tcg_gen_negsetcond_i64
 #define tcg_gen_mul_tl tcg_gen_mul_i64
 #define tcg_gen_muli_tl tcg_gen_muli_i64
 #define tcg_gen_div_tl tcg_gen_div_i64
@@ -317,6 +318,7 @@ DEF_ATOMIC2(tcg_gen_atomic_umax_fetch, i64)
 #define tcg_gen_brcondi_tl tcg_gen_brcondi_i32
 #define tcg_gen_setcond_tl tcg_gen_setcond_i32
 #define tcg_gen_setcondi_tl tcg_gen_setcondi_i32
+#define tcg_gen_negsetcond_tl tcg_gen_negsetcond_i32
 #define tcg_gen_mul_tl tcg_gen_mul_i32
 #define tcg_gen_muli_tl tcg_gen_muli_i32
 #define tcg_gen_div_tl tcg_gen_div_i32
diff --git a/include/tcg/tcg-opc.h b/include/tcg/tcg-opc.h
index acfa5ba753..5044814d15 100644
--- a/include/tcg/tcg-opc.h
+++ b/include/tcg/tcg-opc.h
@@ -46,6 +46,7 @@ DEF(mb, 0, 0, 1, 0)
 
 DEF(mov_i32, 1, 1, 0, TCG_OPF_NOT_PRESENT)
 DEF(setcond_i32, 1, 2, 1, 0)
+DEF(negsetcond_i32, 1, 2, 1, IMPL(TCG_TARGET_HAS_negsetcond_i32))
 DEF(movcond_i32, 1, 4, 1, IMPL(TCG_TARGET_HAS_movcond_i32))
 /* load/store */
 DEF(ld8u_i32, 1, 1, 1, 0)
@@ -111,6 +112,7 @@ DEF(ctpop_i32, 1, 1, 0, IMPL(TCG_TARGET_HAS_ctpop_i32))
 
 DEF(mov_i64, 1, 1, 0, TCG_OPF_64BIT | TCG_OPF_NOT_PRESENT)
 DEF(setcond_i64, 1, 2, 1, IMPL64)
+DEF(negsetcond_i64, 1, 2, 1, IMPL64 | IMPL(TCG_TARGET_HAS_negsetcond_i64))
 DEF(movcond_i64, 1, 4, 1, IMPL64 | IMPL(TCG_TARGET_HAS_movcond_i64))
 /* load/store */
 DEF(ld8u_i64, 1, 1, 1, IMPL64)
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 0875971719..f00bff9c85 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -104,6 +104,7 @@ typedef uint64_t TCGRegSet;
 #define TCG_TARGET_HAS_muls2_i64        0
 #define TCG_TARGET_HAS_muluh_i64        0
 #define TCG_TARGET_HAS_mulsh_i64        0
+#define TCG_TARGET_HAS_negsetcond_i64   0
 /* Turn some undef macros into true macros.  */
 #define TCG_TARGET_HAS_add2_i32         1
 #define TCG_TARGET_HAS_sub2_i32         1
diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index ce64de06e5..6080fddf73 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -94,6 +94,7 @@ typedef enum {
 #define TCG_TARGET_HAS_mulsh_i32        0
 #define TCG_TARGET_HAS_extrl_i64_i32    0
 #define TCG_TARGET_HAS_extrh_i64_i32    0
+#define TCG_TARGET_HAS_negsetcond_i32   0
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 #define TCG_TARGET_HAS_div_i64          1
@@ -129,6 +130,7 @@ typedef enum {
 #define TCG_TARGET_HAS_muls2_i64        0
 #define TCG_TARGET_HAS_muluh_i64        1
 #define TCG_TARGET_HAS_mulsh_i64        1
+#define TCG_TARGET_HAS_negsetcond_i64   0
 
 /*
  * Without FEAT_LSE2, we must use LDXP+STXP to implement atomic 128-bit load,
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index c649db72a6..b076d033a9 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -122,6 +122,7 @@ extern bool use_neon_instructions;
 #define TCG_TARGET_HAS_mulsh_i32        0
 #define TCG_TARGET_HAS_div_i32          use_idiv_instructions
 #define TCG_TARGET_HAS_rem_i32          0
+#define TCG_TARGET_HAS_negsetcond_i32   0
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 #define TCG_TARGET_HAS_qemu_ldst_i128   0
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 2a2e3fffa8..41df0e5ae1 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -156,6 +156,7 @@ typedef enum {
 #define TCG_TARGET_HAS_muls2_i32        1
 #define TCG_TARGET_HAS_muluh_i32        0
 #define TCG_TARGET_HAS_mulsh_i32        0
+#define TCG_TARGET_HAS_negsetcond_i32   0
 
 #if TCG_TARGET_REG_BITS == 64
 /* Keep 32-bit values zero-extended in a register.  */
@@ -193,6 +194,7 @@ typedef enum {
 #define TCG_TARGET_HAS_muls2_i64        1
 #define TCG_TARGET_HAS_muluh_i64        0
 #define TCG_TARGET_HAS_mulsh_i64        0
+#define TCG_TARGET_HAS_negsetcond_i64   0
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 #else
 #define TCG_TARGET_HAS_qemu_st8_i32     1
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index 26f1aab780..ce8fa3507e 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -118,6 +118,7 @@ typedef enum {
 #define TCG_TARGET_HAS_ctpop_i32        0
 #define TCG_TARGET_HAS_brcond2          0
 #define TCG_TARGET_HAS_setcond2         0
+#define TCG_TARGET_HAS_negsetcond_i32   0
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 /* 64-bit operations */
@@ -157,6 +158,8 @@ typedef enum {
 #define TCG_TARGET_HAS_muls2_i64        0
 #define TCG_TARGET_HAS_muluh_i64        1
 #define TCG_TARGET_HAS_mulsh_i64        1
+#define TCG_TARGET_HAS_negsetcond_i64   0
+
 #define TCG_TARGET_HAS_qemu_ldst_i128   0
 
 #define TCG_TARGET_DEFAULT_MO (0)
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index dd2efa795c..68e6cc33cc 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -128,6 +128,7 @@ extern bool use_mips32r2_instructions;
 #define TCG_TARGET_HAS_muluh_i32        1
 #define TCG_TARGET_HAS_mulsh_i32        1
 #define TCG_TARGET_HAS_bswap32_i32      1
+#define TCG_TARGET_HAS_negsetcond_i32   0
 
 #if TCG_TARGET_REG_BITS == 64
 #define TCG_TARGET_HAS_add2_i32         0
@@ -150,6 +151,7 @@ extern bool use_mips32r2_instructions;
 #define TCG_TARGET_HAS_mulsh_i64        1
 #define TCG_TARGET_HAS_ext32s_i64       1
 #define TCG_TARGET_HAS_ext32u_i64       1
+#define TCG_TARGET_HAS_negsetcond_i64   0
 #endif
 
 /* optional instructions detected at runtime */
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index 9a41fab8cc..ba4fd3eb3a 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -101,6 +101,7 @@ typedef enum {
 #define TCG_TARGET_HAS_muls2_i32        0
 #define TCG_TARGET_HAS_muluh_i32        1
 #define TCG_TARGET_HAS_mulsh_i32        1
+#define TCG_TARGET_HAS_negsetcond_i32   0
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 #if TCG_TARGET_REG_BITS == 64
@@ -141,6 +142,7 @@ typedef enum {
 #define TCG_TARGET_HAS_muls2_i64        0
 #define TCG_TARGET_HAS_muluh_i64        1
 #define TCG_TARGET_HAS_mulsh_i64        1
+#define TCG_TARGET_HAS_negsetcond_i64   0
 #endif
 
 #define TCG_TARGET_HAS_qemu_ldst_i128   \
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index e1d8110ee4..b2961fec8e 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -120,6 +120,7 @@ extern bool have_zbb;
 #define TCG_TARGET_HAS_ctpop_i32        have_zbb
 #define TCG_TARGET_HAS_brcond2          1
 #define TCG_TARGET_HAS_setcond2         1
+#define TCG_TARGET_HAS_negsetcond_i32   0
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 #define TCG_TARGET_HAS_movcond_i64      1
@@ -158,6 +159,7 @@ extern bool have_zbb;
 #define TCG_TARGET_HAS_muls2_i64        0
 #define TCG_TARGET_HAS_muluh_i64        1
 #define TCG_TARGET_HAS_mulsh_i64        1
+#define TCG_TARGET_HAS_negsetcond_i64   0
 
 #define TCG_TARGET_HAS_qemu_ldst_i128   0
 
diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 9a405003b9..24e207c2d4 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -104,6 +104,7 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_mulsh_i32      0
 #define TCG_TARGET_HAS_extrl_i64_i32  0
 #define TCG_TARGET_HAS_extrh_i64_i32  0
+#define TCG_TARGET_HAS_negsetcond_i32 0
 #define TCG_TARGET_HAS_qemu_st8_i32   0
 
 #define TCG_TARGET_HAS_div2_i64       1
@@ -138,6 +139,7 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_muls2_i64      HAVE_FACILITY(MISC_INSN_EXT2)
 #define TCG_TARGET_HAS_muluh_i64      0
 #define TCG_TARGET_HAS_mulsh_i64      0
+#define TCG_TARGET_HAS_negsetcond_i64 0
 
 #define TCG_TARGET_HAS_qemu_ldst_i128 1
 
diff --git a/tcg/sparc64/tcg-target.h b/tcg/sparc64/tcg-target.h
index d454278811..1faadc704b 100644
--- a/tcg/sparc64/tcg-target.h
+++ b/tcg/sparc64/tcg-target.h
@@ -112,6 +112,7 @@ extern bool use_vis3_instructions;
 #define TCG_TARGET_HAS_muls2_i32        1
 #define TCG_TARGET_HAS_muluh_i32        0
 #define TCG_TARGET_HAS_mulsh_i32        0
+#define TCG_TARGET_HAS_negsetcond_i32   0
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 #define TCG_TARGET_HAS_extrl_i64_i32    1
@@ -149,6 +150,7 @@ extern bool use_vis3_instructions;
 #define TCG_TARGET_HAS_muls2_i64        0
 #define TCG_TARGET_HAS_muluh_i64        use_vis3_instructions
 #define TCG_TARGET_HAS_mulsh_i64        0
+#define TCG_TARGET_HAS_negsetcond_i64   0
 
 #define TCG_TARGET_HAS_qemu_ldst_i128   0
 
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
index 37ee10c959..ca18ddaaad 100644
--- a/tcg/tci/tcg-target.h
+++ b/tcg/tci/tcg-target.h
@@ -73,6 +73,7 @@
 #define TCG_TARGET_HAS_muls2_i32        1
 #define TCG_TARGET_HAS_muluh_i32        0
 #define TCG_TARGET_HAS_mulsh_i32        0
+#define TCG_TARGET_HAS_negsetcond_i32   0
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 #if TCG_TARGET_REG_BITS == 64
@@ -114,6 +115,7 @@
 #define TCG_TARGET_HAS_mulu2_i64        1
 #define TCG_TARGET_HAS_muluh_i64        0
 #define TCG_TARGET_HAS_mulsh_i64        0
+#define TCG_TARGET_HAS_negsetcond_i64   0
 #else
 #define TCG_TARGET_HAS_mulu2_i32        1
 #endif /* TCG_TARGET_REG_BITS == 64 */
diff --git a/tcg/optimize.c b/tcg/optimize.c
index d2156367a3..0b4590ec7a 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -1530,14 +1530,22 @@ static bool fold_movcond(OptContext *ctx, TCGOp *op)
     if (arg_is_const(op->args[3]) && arg_is_const(op->args[4])) {
         uint64_t tv = arg_info(op->args[3])->val;
         uint64_t fv = arg_info(op->args[4])->val;
-        TCGOpcode opc;
+        TCGOpcode opc, negopc = 0;
 
         switch (ctx->type) {
         case TCG_TYPE_I32:
             opc = INDEX_op_setcond_i32;
+            if (TCG_TARGET_HAS_negsetcond_i32) {
+                negopc = INDEX_op_negsetcond_i32;
+            }
+            tv = (int32_t)tv;
+            fv = (int32_t)fv;
             break;
         case TCG_TYPE_I64:
             opc = INDEX_op_setcond_i64;
+            if (TCG_TARGET_HAS_negsetcond_i64) {
+                negopc = INDEX_op_negsetcond_i64;
+            }
             break;
         default:
             g_assert_not_reached();
@@ -1549,6 +1557,14 @@ static bool fold_movcond(OptContext *ctx, TCGOp *op)
         } else if (fv == 1 && tv == 0) {
             op->opc = opc;
             op->args[3] = tcg_invert_cond(cond);
+        } else if (negopc) {
+            if (tv == -1 && fv == 0) {
+                op->opc = negopc;
+                op->args[3] = cond;
+            } else if (fv == -1 && tv == 0) {
+                op->opc = negopc;
+                op->args[3] = tcg_invert_cond(cond);
+            }
         }
     }
     return false;
@@ -1759,6 +1775,26 @@ static bool fold_setcond(OptContext *ctx, TCGOp *op)
     return false;
 }
 
+static bool fold_negsetcond(OptContext *ctx, TCGOp *op)
+{
+    TCGCond cond = op->args[3];
+    int i;
+
+    if (swap_commutative(op->args[0], &op->args[1], &op->args[2])) {
+        op->args[3] = cond = tcg_swap_cond(cond);
+    }
+
+    i = do_constant_folding_cond(ctx->type, op->args[1], op->args[2], cond);
+    if (i >= 0) {
+        return tcg_opt_gen_movi(ctx, op, op->args[0], -i);
+    }
+
+    /* Value is {0,-1} so all bits are repetitions of the sign. */
+    ctx->s_mask = -1;
+    return false;
+}
+
+
 static bool fold_setcond2(OptContext *ctx, TCGOp *op)
 {
     TCGCond cond = op->args[5];
@@ -2216,6 +2252,9 @@ void tcg_optimize(TCGContext *s)
         CASE_OP_32_64(setcond):
             done = fold_setcond(&ctx, op);
             break;
+        CASE_OP_32_64(negsetcond):
+            done = fold_negsetcond(&ctx, op);
+            break;
         case INDEX_op_setcond2_i32:
             done = fold_setcond2(&ctx, op);
             break;
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 7aadb37756..76d2377669 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -276,6 +276,21 @@ void tcg_gen_setcondi_i32(TCGCond cond, TCGv_i32 ret,
     tcg_gen_setcond_i32(cond, ret, arg1, tcg_constant_i32(arg2));
 }
 
+void tcg_gen_negsetcond_i32(TCGCond cond, TCGv_i32 ret,
+                            TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (cond == TCG_COND_ALWAYS) {
+        tcg_gen_movi_i32(ret, -1);
+    } else if (cond == TCG_COND_NEVER) {
+        tcg_gen_movi_i32(ret, 0);
+    } else if (TCG_TARGET_HAS_negsetcond_i32) {
+        tcg_gen_op4i_i32(INDEX_op_negsetcond_i32, ret, arg1, arg2, cond);
+    } else {
+        tcg_gen_setcond_i32(cond, ret, arg1, arg2);
+        tcg_gen_neg_i32(ret, ret);
+    }
+}
+
 void tcg_gen_muli_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
 {
     if (arg2 == 0) {
@@ -1567,6 +1582,27 @@ void tcg_gen_setcondi_i64(TCGCond cond, TCGv_i64 ret,
     }
 }
 
+void tcg_gen_negsetcond_i64(TCGCond cond, TCGv_i64 ret,
+                            TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    if (cond == TCG_COND_ALWAYS) {
+        tcg_gen_movi_i64(ret, -1);
+    } else if (cond == TCG_COND_NEVER) {
+        tcg_gen_movi_i64(ret, 0);
+    } else if (TCG_TARGET_HAS_negsetcond_i64) {
+        tcg_gen_op4i_i64(INDEX_op_negsetcond_i64, ret, arg1, arg2, cond);
+    } else if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_op6i_i32(INDEX_op_setcond2_i32, TCGV_LOW(ret),
+                         TCGV_LOW(arg1), TCGV_HIGH(arg1),
+                         TCGV_LOW(arg2), TCGV_HIGH(arg2), cond);
+        tcg_gen_neg_i32(TCGV_LOW(ret), TCGV_LOW(ret));
+        tcg_gen_mov_i32(TCGV_HIGH(ret), TCGV_LOW(ret));
+    } else {
+        tcg_gen_setcond_i64(cond, ret, arg1, arg2);
+        tcg_gen_neg_i64(ret, ret);
+    }
+}
+
 void tcg_gen_muli_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
 {
     if (arg2 == 0) {
diff --git a/tcg/tcg.c b/tcg/tcg.c
index ddfe9a96cb..b7f8f007ca 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1879,6 +1879,8 @@ bool tcg_op_supported(TCGOpcode op)
     case INDEX_op_sar_i32:
         return true;
 
+    case INDEX_op_negsetcond_i32:
+        return TCG_TARGET_HAS_negsetcond_i32;
     case INDEX_op_movcond_i32:
         return TCG_TARGET_HAS_movcond_i32;
     case INDEX_op_div_i32:
@@ -1977,6 +1979,8 @@ bool tcg_op_supported(TCGOpcode op)
     case INDEX_op_extu_i32_i64:
         return TCG_TARGET_REG_BITS == 64;
 
+    case INDEX_op_negsetcond_i64:
+        return TCG_TARGET_HAS_negsetcond_i64;
     case INDEX_op_movcond_i64:
         return TCG_TARGET_HAS_movcond_i64;
     case INDEX_op_div_i64:
@@ -2510,11 +2514,13 @@ static void tcg_dump_ops(TCGContext *s, FILE *f, bool have_prefs)
             switch (c) {
             case INDEX_op_brcond_i32:
             case INDEX_op_setcond_i32:
+            case INDEX_op_negsetcond_i32:
             case INDEX_op_movcond_i32:
             case INDEX_op_brcond2_i32:
             case INDEX_op_setcond2_i32:
             case INDEX_op_brcond_i64:
             case INDEX_op_setcond_i64:
+            case INDEX_op_negsetcond_i64:
             case INDEX_op_movcond_i64:
             case INDEX_op_cmp_vec:
             case INDEX_op_cmpsel_vec:
-- 
2.34.1




* [PATCH 02/24] tcg: Use tcg_gen_negsetcond_*
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
  2023-08-08  3:11 ` [PATCH 01/24] " Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-08 15:55   ` Peter Maydell
  2023-08-10 16:13   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 03/24] target/alpha: Use tcg_gen_movcond_i64 in gen_fold_mzero Richard Henderson
                   ` (21 subsequent siblings)
  23 siblings, 2 replies; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
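Note for reviewers: the movcond fallback touched below selects between
two values with a mask instead of a branch, and the mask is exactly
what negsetcond produces.  A minimal model of that and/andc/or
sequence, assuming a fixed "not equal" condition and a hypothetical
function name:

```c
#include <stdint.h>

/* Hypothetical model: return v1 if c1 != c2, otherwise v2. */
static uint64_t model_movcond_ne(uint64_t c1, uint64_t c2,
                                 uint64_t v1, uint64_t v2)
{
    uint64_t mask = -(uint64_t)(c1 != c2);  /* negsetcond: 0 or all ones */
    uint64_t t1 = v1 & mask;                /* and:  keep v1 if true     */
    uint64_t ret = v2 & ~mask;              /* andc: keep v2 if false    */

    return ret | t1;                        /* or:   merge the two       */
}
```
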
 tcg/tcg-op-gvec.c | 6 ++----
 tcg/tcg-op.c      | 6 ++----
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index a062239804..e260a07c61 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -3692,8 +3692,7 @@ static void expand_cmp_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
     for (i = 0; i < oprsz; i += 4) {
         tcg_gen_ld_i32(t0, cpu_env, aofs + i);
         tcg_gen_ld_i32(t1, cpu_env, bofs + i);
-        tcg_gen_setcond_i32(cond, t0, t0, t1);
-        tcg_gen_neg_i32(t0, t0);
+        tcg_gen_negsetcond_i32(cond, t0, t0, t1);
         tcg_gen_st_i32(t0, cpu_env, dofs + i);
     }
     tcg_temp_free_i32(t1);
@@ -3710,8 +3709,7 @@ static void expand_cmp_i64(uint32_t dofs, uint32_t aofs, uint32_t bofs,
     for (i = 0; i < oprsz; i += 8) {
         tcg_gen_ld_i64(t0, cpu_env, aofs + i);
         tcg_gen_ld_i64(t1, cpu_env, bofs + i);
-        tcg_gen_setcond_i64(cond, t0, t0, t1);
-        tcg_gen_neg_i64(t0, t0);
+        tcg_gen_negsetcond_i64(cond, t0, t0, t1);
         tcg_gen_st_i64(t0, cpu_env, dofs + i);
     }
     tcg_temp_free_i64(t1);
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 76d2377669..b4f1f24cab 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -863,8 +863,7 @@ void tcg_gen_movcond_i32(TCGCond cond, TCGv_i32 ret, TCGv_i32 c1,
     } else {
         TCGv_i32 t0 = tcg_temp_ebb_new_i32();
         TCGv_i32 t1 = tcg_temp_ebb_new_i32();
-        tcg_gen_setcond_i32(cond, t0, c1, c2);
-        tcg_gen_neg_i32(t0, t0);
+        tcg_gen_negsetcond_i32(cond, t0, c1, c2);
         tcg_gen_and_i32(t1, v1, t0);
         tcg_gen_andc_i32(ret, v2, t0);
         tcg_gen_or_i32(ret, ret, t1);
@@ -2563,8 +2562,7 @@ void tcg_gen_movcond_i64(TCGCond cond, TCGv_i64 ret, TCGv_i64 c1,
     } else {
         TCGv_i64 t0 = tcg_temp_ebb_new_i64();
         TCGv_i64 t1 = tcg_temp_ebb_new_i64();
-        tcg_gen_setcond_i64(cond, t0, c1, c2);
-        tcg_gen_neg_i64(t0, t0);
+        tcg_gen_negsetcond_i64(cond, t0, c1, c2);
         tcg_gen_and_i64(t1, v1, t0);
         tcg_gen_andc_i64(ret, v2, t0);
         tcg_gen_or_i64(ret, ret, t1);
-- 
2.34.1




* [PATCH 03/24] target/alpha: Use tcg_gen_movcond_i64 in gen_fold_mzero
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
  2023-08-08  3:11 ` [PATCH 01/24] " Richard Henderson
  2023-08-08  3:11 ` [PATCH 02/24] tcg: Use tcg_gen_negsetcond_* Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-10 16:19   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 04/24] target/arm: Use tcg_gen_negsetcond_* Richard Henderson
                   ` (20 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

The setcond + neg + and sequence is a complex method of
performing a conditional move.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
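Note for reviewers: the old mask sequence and the new conditional move
give the same result for every input.  The sketch below models both in
plain C.  The function names are hypothetical, and MODEL_MZERO assumes
mzero is the IEEE double bit pattern of -0.0 (only the sign bit set).

```c
#include <stdint.h>

/* Assumed value of mzero: the bit pattern of -0.0 as an IEEE double. */
#define MODEL_MZERO (UINT64_C(1) << 63)

/* Old sequence: setcond NE, neg, and. */
static uint64_t model_fold_mzero_mask(uint64_t src)
{
    uint64_t mask = -(uint64_t)(src != MODEL_MZERO);
    return mask & src;
}

/* New sequence: movcond NE selecting src or 0. */
static uint64_t model_fold_mzero_select(uint64_t src)
{
    return src != MODEL_MZERO ? src : 0;
}
```

Both map -0.0 to +0.0 and leave every other bit pattern unchanged.
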
 target/alpha/translate.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index 846f3d8091..0839182a1f 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -517,10 +517,9 @@ static void gen_fold_mzero(TCGCond cond, TCGv dest, TCGv src)
 
     case TCG_COND_GE:
     case TCG_COND_LT:
-        /* For >= or <, map -0.0 to +0.0 via comparison and mask.  */
-        tcg_gen_setcondi_i64(TCG_COND_NE, dest, src, mzero);
-        tcg_gen_neg_i64(dest, dest);
-        tcg_gen_and_i64(dest, dest, src);
+        /* For >= or <, map -0.0 to +0.0. */
+        tcg_gen_movcond_i64(TCG_COND_NE, dest, src, tcg_constant_i64(mzero),
+                            src, tcg_constant_i64(0));
         break;
 
     default:
-- 
2.34.1




* [PATCH 04/24] target/arm: Use tcg_gen_negsetcond_*
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (2 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 03/24] target/alpha: Use tcg_gen_movcond_i64 in gen_fold_mzero Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-10 16:22   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 05/24] target/m68k: " Richard Henderson
                   ` (19 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate-a64.c | 22 +++++++++-------------
 target/arm/tcg/translate.c     | 12 ++++--------
 2 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 5fa1257d32..ac16593699 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -4935,9 +4935,12 @@ static void disas_cond_select(DisasContext *s, uint32_t insn)
 
     if (rn == 31 && rm == 31 && (else_inc ^ else_inv)) {
         /* CSET & CSETM.  */
-        tcg_gen_setcond_i64(tcg_invert_cond(c.cond), tcg_rd, c.value, zero);
         if (else_inv) {
-            tcg_gen_neg_i64(tcg_rd, tcg_rd);
+            tcg_gen_negsetcond_i64(tcg_invert_cond(c.cond),
+                                   tcg_rd, c.value, zero);
+        } else {
+            tcg_gen_setcond_i64(tcg_invert_cond(c.cond),
+                                tcg_rd, c.value, zero);
         }
     } else {
         TCGv_i64 t_true = cpu_reg(s, rn);
@@ -8670,13 +8673,10 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u,
         }
         break;
     case 0x6: /* CMGT, CMHI */
-        /* 64 bit integer comparison, result = test ? (2^64 - 1) : 0.
-         * We implement this using setcond (test) and then negating.
-         */
         cond = u ? TCG_COND_GTU : TCG_COND_GT;
     do_cmop:
-        tcg_gen_setcond_i64(cond, tcg_rd, tcg_rn, tcg_rm);
-        tcg_gen_neg_i64(tcg_rd, tcg_rd);
+        /* 64 bit integer comparison, result = test ? -1 : 0. */
+        tcg_gen_negsetcond_i64(cond, tcg_rd, tcg_rn, tcg_rm);
         break;
     case 0x7: /* CMGE, CMHS */
         cond = u ? TCG_COND_GEU : TCG_COND_GE;
@@ -9265,14 +9265,10 @@ static void handle_2misc_64(DisasContext *s, int opcode, bool u,
         }
         break;
     case 0xa: /* CMLT */
-        /* 64 bit integer comparison against zero, result is
-         * test ? (2^64 - 1) : 0. We implement via setcond(!test) and
-         * subtracting 1.
-         */
+        /* 64 bit integer comparison against zero, result is test ? -1 : 0. */
         cond = TCG_COND_LT;
     do_cmop:
-        tcg_gen_setcondi_i64(cond, tcg_rd, tcg_rn, 0);
-        tcg_gen_neg_i64(tcg_rd, tcg_rd);
+        tcg_gen_negsetcond_i64(cond, tcg_rd, tcg_rn, tcg_constant_i64(0));
         break;
     case 0x8: /* CMGT, CMGE */
         cond = u ? TCG_COND_GE : TCG_COND_GT;
diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
index b71ac2d0d5..31d3130e4c 100644
--- a/target/arm/tcg/translate.c
+++ b/target/arm/tcg/translate.c
@@ -2946,13 +2946,11 @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 #define GEN_CMP0(NAME, COND)                                            \
     static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a)               \
     {                                                                   \
-        tcg_gen_setcondi_i32(COND, d, a, 0);                            \
-        tcg_gen_neg_i32(d, d);                                          \
+        tcg_gen_negsetcond_i32(COND, d, a, tcg_constant_i32(0));        \
     }                                                                   \
     static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a)               \
     {                                                                   \
-        tcg_gen_setcondi_i64(COND, d, a, 0);                            \
-        tcg_gen_neg_i64(d, d);                                          \
+        tcg_gen_negsetcond_i64(COND, d, a, tcg_constant_i64(0));        \
     }                                                                   \
     static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \
     {                                                                   \
@@ -3863,15 +3861,13 @@ void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 {
     tcg_gen_and_i32(d, a, b);
-    tcg_gen_setcondi_i32(TCG_COND_NE, d, d, 0);
-    tcg_gen_neg_i32(d, d);
+    tcg_gen_negsetcond_i32(TCG_COND_NE, d, d, tcg_constant_i32(0));
 }
 
 void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 {
     tcg_gen_and_i64(d, a, b);
-    tcg_gen_setcondi_i64(TCG_COND_NE, d, d, 0);
-    tcg_gen_neg_i64(d, d);
+    tcg_gen_negsetcond_i64(TCG_COND_NE, d, d, tcg_constant_i64(0));
 }
 
 static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
-- 
2.34.1




* [PATCH 05/24] target/m68k: Use tcg_gen_negsetcond_*
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (3 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 04/24] target/arm: Use tcg_gen_negsetcond_* Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-10 16:24   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 06/24] target/openrisc: " Richard Henderson
                   ` (18 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/m68k/translate.c | 24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index e07161d76f..37954d11a6 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -1357,8 +1357,7 @@ static void gen_cc_cond(DisasCompare *c, DisasContext *s, int cond)
     case 14: /* GT (!(Z || (N ^ V))) */
     case 15: /* LE (Z || (N ^ V)) */
         c->v1 = tmp = tcg_temp_new();
-        tcg_gen_setcond_i32(TCG_COND_EQ, tmp, QREG_CC_Z, c->v2);
-        tcg_gen_neg_i32(tmp, tmp);
+        tcg_gen_negsetcond_i32(TCG_COND_EQ, tmp, QREG_CC_Z, c->v2);
         tmp2 = tcg_temp_new();
         tcg_gen_xor_i32(tmp2, QREG_CC_N, QREG_CC_V);
         tcg_gen_or_i32(tmp, tmp, tmp2);
@@ -1437,9 +1436,8 @@ DISAS_INSN(scc)
     gen_cc_cond(&c, s, cond);
 
     tmp = tcg_temp_new();
-    tcg_gen_setcond_i32(c.tcond, tmp, c.v1, c.v2);
+    tcg_gen_negsetcond_i32(c.tcond, tmp, c.v1, c.v2);
 
-    tcg_gen_neg_i32(tmp, tmp);
     DEST_EA(env, insn, OS_BYTE, tmp, NULL);
 }
 
@@ -2771,13 +2769,14 @@ DISAS_INSN(mull)
             tcg_gen_muls2_i32(QREG_CC_N, QREG_CC_V, src1, DREG(ext, 12));
             /* QREG_CC_V is -(QREG_CC_V != (QREG_CC_N >> 31)) */
             tcg_gen_sari_i32(QREG_CC_Z, QREG_CC_N, 31);
-            tcg_gen_setcond_i32(TCG_COND_NE, QREG_CC_V, QREG_CC_V, QREG_CC_Z);
+            tcg_gen_negsetcond_i32(TCG_COND_NE, QREG_CC_V,
+                                   QREG_CC_V, QREG_CC_Z);
         } else {
             tcg_gen_mulu2_i32(QREG_CC_N, QREG_CC_V, src1, DREG(ext, 12));
             /* QREG_CC_V is -(QREG_CC_V != 0), use QREG_CC_C as 0 */
-            tcg_gen_setcond_i32(TCG_COND_NE, QREG_CC_V, QREG_CC_V, QREG_CC_C);
+            tcg_gen_negsetcond_i32(TCG_COND_NE, QREG_CC_V,
+                                   QREG_CC_V, QREG_CC_C);
         }
-        tcg_gen_neg_i32(QREG_CC_V, QREG_CC_V);
         tcg_gen_mov_i32(DREG(ext, 12), QREG_CC_N);
 
         tcg_gen_mov_i32(QREG_CC_Z, QREG_CC_N);
@@ -3346,14 +3345,13 @@ static inline void shift_im(DisasContext *s, uint16_t insn, int opsize)
         if (!logical && m68k_feature(s->env, M68K_FEATURE_M68K)) {
             /* if shift count >= bits, V is (reg != 0) */
             if (count >= bits) {
-                tcg_gen_setcond_i32(TCG_COND_NE, QREG_CC_V, reg, QREG_CC_V);
+                tcg_gen_negsetcond_i32(TCG_COND_NE, QREG_CC_V, reg, QREG_CC_V);
             } else {
                 TCGv t0 = tcg_temp_new();
                 tcg_gen_sari_i32(QREG_CC_V, reg, bits - 1);
                 tcg_gen_sari_i32(t0, reg, bits - count - 1);
-                tcg_gen_setcond_i32(TCG_COND_NE, QREG_CC_V, QREG_CC_V, t0);
+                tcg_gen_negsetcond_i32(TCG_COND_NE, QREG_CC_V, QREG_CC_V, t0);
             }
-            tcg_gen_neg_i32(QREG_CC_V, QREG_CC_V);
         }
     } else {
         tcg_gen_shri_i32(QREG_CC_C, reg, count - 1);
@@ -3437,9 +3435,8 @@ static inline void shift_reg(DisasContext *s, uint16_t insn, int opsize)
             /* Ignore the bits below the sign bit.  */
             tcg_gen_andi_i64(t64, t64, -1ULL << (bits - 1));
             /* If any bits remain set, we have overflow.  */
-            tcg_gen_setcondi_i64(TCG_COND_NE, t64, t64, 0);
+            tcg_gen_negsetcond_i64(TCG_COND_NE, t64, t64, tcg_constant_i64(0));
             tcg_gen_extrl_i64_i32(QREG_CC_V, t64);
-            tcg_gen_neg_i32(QREG_CC_V, QREG_CC_V);
         }
     } else {
         tcg_gen_shli_i64(t64, t64, 32);
@@ -5318,9 +5315,8 @@ DISAS_INSN(fscc)
     gen_fcc_cond(&c, s, cond);
 
     tmp = tcg_temp_new();
-    tcg_gen_setcond_i32(c.tcond, tmp, c.v1, c.v2);
+    tcg_gen_negsetcond_i32(c.tcond, tmp, c.v1, c.v2);
 
-    tcg_gen_neg_i32(tmp, tmp);
     DEST_EA(env, insn, OS_BYTE, tmp, NULL);
 }
 
-- 
2.34.1




* [PATCH 06/24] target/openrisc: Use tcg_gen_negsetcond_*
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (4 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 05/24] target/m68k: " Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-10 16:24   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 07/24] target/ppc: " Richard Henderson
                   ` (17 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/openrisc/translate.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index a86360d4f5..7c6f80daf1 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -253,9 +253,8 @@ static void gen_mul(DisasContext *dc, TCGv dest, TCGv srca, TCGv srcb)
 
     tcg_gen_muls2_tl(dest, cpu_sr_ov, srca, srcb);
     tcg_gen_sari_tl(t0, dest, TARGET_LONG_BITS - 1);
-    tcg_gen_setcond_tl(TCG_COND_NE, cpu_sr_ov, cpu_sr_ov, t0);
+    tcg_gen_negsetcond_tl(TCG_COND_NE, cpu_sr_ov, cpu_sr_ov, t0);
 
-    tcg_gen_neg_tl(cpu_sr_ov, cpu_sr_ov);
     gen_ove_ov(dc);
 }
 
@@ -309,9 +308,8 @@ static void gen_muld(DisasContext *dc, TCGv srca, TCGv srcb)
 
         tcg_gen_muls2_i64(cpu_mac, high, t1, t2);
         tcg_gen_sari_i64(t1, cpu_mac, 63);
-        tcg_gen_setcond_i64(TCG_COND_NE, t1, t1, high);
+        tcg_gen_negsetcond_i64(TCG_COND_NE, t1, t1, high);
         tcg_gen_trunc_i64_tl(cpu_sr_ov, t1);
-        tcg_gen_neg_tl(cpu_sr_ov, cpu_sr_ov);
 
         gen_ove_ov(dc);
     }
-- 
2.34.1




* [PATCH 07/24] target/ppc: Use tcg_gen_negsetcond_*
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (5 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 06/24] target/openrisc: " Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-08 16:51   ` Daniel Henrique Barboza
  2023-08-15 12:54   ` Nicholas Piggin
  2023-08-08  3:11 ` [PATCH 08/24] target/sparc: Use tcg_gen_movcond_i64 in gen_edge Richard Henderson
                   ` (16 subsequent siblings)
  23 siblings, 2 replies; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/ppc/translate/fixedpoint-impl.c.inc | 6 ++++--
 target/ppc/translate/vmx-impl.c.inc        | 8 +++-----
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/target/ppc/translate/fixedpoint-impl.c.inc b/target/ppc/translate/fixedpoint-impl.c.inc
index f47f1a50e8..4ce02fd3a4 100644
--- a/target/ppc/translate/fixedpoint-impl.c.inc
+++ b/target/ppc/translate/fixedpoint-impl.c.inc
@@ -342,12 +342,14 @@ static bool do_set_bool_cond(DisasContext *ctx, arg_X_bi *a, bool neg, bool rev)
     uint32_t mask = 0x08 >> (a->bi & 0x03);
     TCGCond cond = rev ? TCG_COND_EQ : TCG_COND_NE;
     TCGv temp = tcg_temp_new();
+    TCGv zero = tcg_constant_tl(0);
 
     tcg_gen_extu_i32_tl(temp, cpu_crf[a->bi >> 2]);
     tcg_gen_andi_tl(temp, temp, mask);
-    tcg_gen_setcondi_tl(cond, cpu_gpr[a->rt], temp, 0);
     if (neg) {
-        tcg_gen_neg_tl(cpu_gpr[a->rt], cpu_gpr[a->rt]);
+        tcg_gen_negsetcond_tl(cond, cpu_gpr[a->rt], temp, zero);
+    } else {
+        tcg_gen_setcond_tl(cond, cpu_gpr[a->rt], temp, zero);
     }
     return true;
 }
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index c8712dd7d8..6d7669aabd 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1341,8 +1341,7 @@ static bool trans_VCMPEQUQ(DisasContext *ctx, arg_VC *a)
     tcg_gen_xor_i64(t1, t0, t1);
 
     tcg_gen_or_i64(t1, t1, t2);
-    tcg_gen_setcondi_i64(TCG_COND_EQ, t1, t1, 0);
-    tcg_gen_neg_i64(t1, t1);
+    tcg_gen_negsetcond_i64(TCG_COND_EQ, t1, t1, tcg_constant_i64(0));
 
     set_avr64(a->vrt, t1, true);
     set_avr64(a->vrt, t1, false);
@@ -1365,15 +1364,14 @@ static bool do_vcmpgtq(DisasContext *ctx, arg_VC *a, bool sign)
 
     get_avr64(t0, a->vra, false);
     get_avr64(t1, a->vrb, false);
-    tcg_gen_setcond_i64(TCG_COND_GTU, t2, t0, t1);
+    tcg_gen_negsetcond_i64(TCG_COND_GTU, t2, t0, t1);
 
     get_avr64(t0, a->vra, true);
     get_avr64(t1, a->vrb, true);
     tcg_gen_movcond_i64(TCG_COND_EQ, t2, t0, t1, t2, tcg_constant_i64(0));
-    tcg_gen_setcond_i64(sign ? TCG_COND_GT : TCG_COND_GTU, t1, t0, t1);
+    tcg_gen_negsetcond_i64(sign ? TCG_COND_GT : TCG_COND_GTU, t1, t0, t1);
 
     tcg_gen_or_i64(t1, t1, t2);
-    tcg_gen_neg_i64(t1, t1);
 
     set_avr64(a->vrt, t1, true);
     set_avr64(a->vrt, t1, false);
-- 
2.34.1




* [PATCH 08/24] target/sparc: Use tcg_gen_movcond_i64 in gen_edge
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (6 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 07/24] target/ppc: " Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-10 16:29   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 09/24] target/tricore: Replace gen_cond_w with tcg_gen_negsetcond_tl Richard Henderson
                   ` (15 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

The setcond + neg + or sequence is a complex method of
performing a conditional move.
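
For illustration only, the equivalence this change relies on can be
checked in plain C.  The sketch below is not QEMU code: edge_mask and
edge_select are hypothetical names that model the old
setcond + neg + or + and sequence and the new and + movcond sequence
on 64-bit values.

```c
#include <stdint.h>

/* Old sequence: turn the comparison into a 0 / ~0 mask, then mask. */
static uint64_t edge_mask(uint64_t s1, uint64_t s2,
                          uint64_t lo1, uint64_t lo2)
{
    uint64_t dst = lo1;
    uint64_t m = -(uint64_t)(s1 == s2);   /* setcond + neg */

    lo2 |= m;                             /* or */
    dst &= lo2;                           /* and */
    return dst;
}

/* New sequence: compute lo1 & lo2 unconditionally, then select. */
static uint64_t edge_select(uint64_t s1, uint64_t s2,
                            uint64_t lo1, uint64_t lo2)
{
    lo2 &= lo1;                           /* and */
    return s1 == s2 ? lo1 : lo2;          /* movcond */
}
```

Both functions compute dst = (s1 == s2 ? lo1 : lo1 & lo2); the second
form needs two operations instead of four.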

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/sparc/translate.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index bd877a5e4a..fa80a91161 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -2916,7 +2916,7 @@ static void gen_edge(DisasContext *dc, TCGv dst, TCGv s1, TCGv s2,
 
     tcg_gen_shr_tl(lo1, tcg_constant_tl(tabl), lo1);
     tcg_gen_shr_tl(lo2, tcg_constant_tl(tabr), lo2);
-    tcg_gen_andi_tl(dst, lo1, omask);
+    tcg_gen_andi_tl(lo1, lo1, omask);
     tcg_gen_andi_tl(lo2, lo2, omask);
 
     amask = -8;
@@ -2926,18 +2926,9 @@ static void gen_edge(DisasContext *dc, TCGv dst, TCGv s1, TCGv s2,
     tcg_gen_andi_tl(s1, s1, amask);
     tcg_gen_andi_tl(s2, s2, amask);
 
-    /* We want to compute
-        dst = (s1 == s2 ? lo1 : lo1 & lo2).
-       We've already done dst = lo1, so this reduces to
-        dst &= (s1 == s2 ? -1 : lo2)
-       Which we perform by
-        lo2 |= -(s1 == s2)
-        dst &= lo2
-    */
-    tcg_gen_setcond_tl(TCG_COND_EQ, lo1, s1, s2);
-    tcg_gen_neg_tl(lo1, lo1);
-    tcg_gen_or_tl(lo2, lo2, lo1);
-    tcg_gen_and_tl(dst, dst, lo2);
+    /* Compute dst = (s1 == s2 ? lo1 : lo1 & lo2). */
+    tcg_gen_and_tl(lo2, lo2, lo1);
+    tcg_gen_movcond_tl(TCG_COND_EQ, dst, s1, s2, lo1, lo2);
 }
 
 static void gen_alignaddr(TCGv dst, TCGv s1, TCGv s2, bool left)
-- 
2.34.1




* [PATCH 09/24] target/tricore: Replace gen_cond_w with tcg_gen_negsetcond_tl
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (7 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 08/24] target/sparc: Use tcg_gen_movcond_i64 in gen_edge Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-08 15:42   ` Bastian Koppelmann
  2023-08-08  3:11 ` [PATCH 10/24] tcg/ppc: Implement negsetcond_* Richard Henderson
                   ` (14 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/tricore/translate.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/target/tricore/translate.c b/target/tricore/translate.c
index 1947733870..6ae5ccbf72 100644
--- a/target/tricore/translate.c
+++ b/target/tricore/translate.c
@@ -2680,13 +2680,6 @@ gen_accumulating_condi(int cond, TCGv ret, TCGv r1, int32_t con,
     gen_accumulating_cond(cond, ret, r1, temp, op);
 }
 
-/* ret = (r1 cond r2) ? 0xFFFFFFFF ? 0x00000000;*/
-static inline void gen_cond_w(TCGCond cond, TCGv ret, TCGv r1, TCGv r2)
-{
-    tcg_gen_setcond_tl(cond, ret, r1, r2);
-    tcg_gen_neg_tl(ret, ret);
-}
-
 static inline void gen_eqany_bi(TCGv ret, TCGv r1, int32_t con)
 {
     TCGv b0 = tcg_temp_new();
@@ -5692,7 +5685,8 @@ static void decode_rr_accumulator(DisasContext *ctx)
         gen_helper_eq_h(cpu_gpr_d[r3], cpu_gpr_d[r1], cpu_gpr_d[r2]);
         break;
     case OPC2_32_RR_EQ_W:
-        gen_cond_w(TCG_COND_EQ, cpu_gpr_d[r3], cpu_gpr_d[r1], cpu_gpr_d[r2]);
+        tcg_gen_negsetcond_tl(TCG_COND_EQ, cpu_gpr_d[r3],
+                              cpu_gpr_d[r1], cpu_gpr_d[r2]);
         break;
     case OPC2_32_RR_EQANY_B:
         gen_helper_eqany_b(cpu_gpr_d[r3], cpu_gpr_d[r1], cpu_gpr_d[r2]);
@@ -5729,10 +5723,12 @@ static void decode_rr_accumulator(DisasContext *ctx)
         gen_helper_lt_hu(cpu_gpr_d[r3], cpu_gpr_d[r1], cpu_gpr_d[r2]);
         break;
     case OPC2_32_RR_LT_W:
-        gen_cond_w(TCG_COND_LT, cpu_gpr_d[r3], cpu_gpr_d[r1], cpu_gpr_d[r2]);
+        tcg_gen_negsetcond_tl(TCG_COND_LT, cpu_gpr_d[r3],
+                              cpu_gpr_d[r1], cpu_gpr_d[r2]);
         break;
     case OPC2_32_RR_LT_WU:
-        gen_cond_w(TCG_COND_LTU, cpu_gpr_d[r3], cpu_gpr_d[r1], cpu_gpr_d[r2]);
+        tcg_gen_negsetcond_tl(TCG_COND_LTU, cpu_gpr_d[r3],
+                              cpu_gpr_d[r1], cpu_gpr_d[r2]);
         break;
     case OPC2_32_RR_MAX:
         tcg_gen_movcond_tl(TCG_COND_GT, cpu_gpr_d[r3], cpu_gpr_d[r1],
-- 
2.34.1




* [PATCH 10/24] tcg/ppc: Implement negsetcond_*
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (8 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 09/24] target/tricore: Replace gen_cond_w with tcg_gen_negsetcond_tl Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-08 16:55   ` Daniel Henrique Barboza
  2023-08-08  3:11 ` [PATCH 11/24] tcg/ppc: Use the Set Boolean Extension Richard Henderson
                   ` (13 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

In the general case we simply negate.  However, with isel we
may load -1 instead of 1 at no extra cost.

Consolidate EQ0 and NE0 logic.  Replace the NE0 zero-extension
with inversion+negation of EQ0, which is never worse and may
eliminate one insn.  Provide a special case for -EQ0.
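
As a rough sketch of the carry arithmetic involved, the following C
model may help.  It is not the backend code: addic_m1 and subfe are
hypothetical helpers that mimic ADDIC rt,ra,-1 and SUBFE rt,ra,rb on a
64-bit register, and the last function models the EQ0 - 1 fallback
used when the carry trick does not apply.

```c
#include <stdbool.h>
#include <stdint.h>

/* ADDIC rt, x, -1: returns x - 1 and reports the carry out of x + ~0. */
static uint64_t addic_m1(uint64_t x, bool *ca)
{
    uint64_t r = x + UINT64_MAX;

    *ca = r < x;                 /* carry is set exactly when x != 0 */
    return r;
}

/* SUBFE rt, ra, rb: rt = ~ra + rb + CA. */
static uint64_t subfe(uint64_t ra, uint64_t rb, bool ca)
{
    return ~ra + rb + ca;
}

/* -(x == 0): ADDIC r0, x, -1; SUBFE dst, x, x. */
static uint64_t neg_eq0(uint64_t x)
{
    bool ca;

    (void)addic_m1(x, &ca);
    return subfe(x, x, ca);      /* (~x + x) + CA == -1 + CA */
}

/* (x != 0): ADDIC r0, x, -1; SUBFE dst, r0, x. */
static uint64_t ne0(uint64_t x)
{
    bool ca;
    uint64_t r0 = addic_m1(x, &ca);

    return subfe(r0, x, ca);     /* ~(x - 1) + x + CA == CA */
}

/* -(x != 0) derived from EQ0: (x == 0) - 1, i.e. the ADDI dst, dst, -1 step. */
static uint64_t neg_ne0_from_eq0(uint64_t x)
{
    uint64_t eq0 = (x == 0);

    return eq0 - 1;
}
```

The carry is produced from the full register width, which is
presumably why the patch restricts the ADDIC/SUBFE forms to
TCG_TARGET_REG_BITS == 32 or TCG_TYPE_I64.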

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/ppc/tcg-target.h     |   4 +-
 tcg/ppc/tcg-target.c.inc | 127 ++++++++++++++++++++++++---------------
 2 files changed, 82 insertions(+), 49 deletions(-)

diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index ba4fd3eb3a..a143b8f1e0 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -101,7 +101,7 @@ typedef enum {
 #define TCG_TARGET_HAS_muls2_i32        0
 #define TCG_TARGET_HAS_muluh_i32        1
 #define TCG_TARGET_HAS_mulsh_i32        1
-#define TCG_TARGET_HAS_negsetcond_i32   0
+#define TCG_TARGET_HAS_negsetcond_i32   1
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 #if TCG_TARGET_REG_BITS == 64
@@ -142,7 +142,7 @@ typedef enum {
 #define TCG_TARGET_HAS_muls2_i64        0
 #define TCG_TARGET_HAS_muluh_i64        1
 #define TCG_TARGET_HAS_mulsh_i64        1
-#define TCG_TARGET_HAS_negsetcond_i64   0
+#define TCG_TARGET_HAS_negsetcond_i64   1
 #endif
 
 #define TCG_TARGET_HAS_qemu_ldst_i128   \
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 511e14b180..10448aa0e6 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -1548,8 +1548,20 @@ static void tcg_out_cmp(TCGContext *s, int cond, TCGArg arg1, TCGArg arg2,
 }
 
 static void tcg_out_setcond_eq0(TCGContext *s, TCGType type,
-                                TCGReg dst, TCGReg src)
+                                TCGReg dst, TCGReg src, bool neg)
 {
+    if (neg && (TCG_TARGET_REG_BITS == 32 || type == TCG_TYPE_I64)) {
+        /*
+         * X != 0 implies X + -1 generates a carry.
+         * RT = (~X + X) + CA
+         *    = -1 + CA
+         *    = CA ? 0 : -1
+         */
+        tcg_out32(s, ADDIC | TAI(TCG_REG_R0, src, -1));
+        tcg_out32(s, SUBFE | TAB(dst, src, src));
+        return;
+    }
+
     if (type == TCG_TYPE_I32) {
         tcg_out32(s, CNTLZW | RS(src) | RA(dst));
         tcg_out_shri32(s, dst, dst, 5);
@@ -1557,18 +1569,28 @@ static void tcg_out_setcond_eq0(TCGContext *s, TCGType type,
         tcg_out32(s, CNTLZD | RS(src) | RA(dst));
         tcg_out_shri64(s, dst, dst, 6);
     }
+    if (neg) {
+        tcg_out32(s, NEG | RT(dst) | RA(dst));
+    }
 }
 
-static void tcg_out_setcond_ne0(TCGContext *s, TCGReg dst, TCGReg src)
+static void tcg_out_setcond_ne0(TCGContext *s, TCGType type,
+                                TCGReg dst, TCGReg src, bool neg)
 {
-    /* X != 0 implies X + -1 generates a carry.  Extra addition
-       trickery means: R = X-1 + ~X + C = X-1 + (-X+1) + C = C.  */
-    if (dst != src) {
-        tcg_out32(s, ADDIC | TAI(dst, src, -1));
-        tcg_out32(s, SUBFE | TAB(dst, dst, src));
-    } else {
+    if (!neg && (TCG_TARGET_REG_BITS == 32 || type == TCG_TYPE_I64)) {
+        /*
+         * X != 0 implies X + -1 generates a carry.  Extra addition
+         * trickery means: R = X-1 + ~X + C = X-1 + (-X+1) + C = C.
+         */
         tcg_out32(s, ADDIC | TAI(TCG_REG_R0, src, -1));
         tcg_out32(s, SUBFE | TAB(dst, TCG_REG_R0, src));
+        return;
+    }
+    tcg_out_setcond_eq0(s, type, dst, src, false);
+    if (neg) {
+        tcg_out32(s, ADDI | TAI(dst, dst, -1));
+    } else {
+        tcg_out_xori32(s, dst, dst, 1);
     }
 }
 
@@ -1590,9 +1612,10 @@ static TCGReg tcg_gen_setcond_xor(TCGContext *s, TCGReg arg1, TCGArg arg2,
 
 static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
                             TCGArg arg0, TCGArg arg1, TCGArg arg2,
-                            int const_arg2)
+                            int const_arg2, bool neg)
 {
-    int crop, sh;
+    int sh;
+    bool inv;
 
     tcg_debug_assert(TCG_TARGET_REG_BITS == 64 || type == TCG_TYPE_I32);
 
@@ -1605,14 +1628,10 @@ static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
     if (arg2 == 0) {
         switch (cond) {
         case TCG_COND_EQ:
-            tcg_out_setcond_eq0(s, type, arg0, arg1);
+            tcg_out_setcond_eq0(s, type, arg0, arg1, neg);
             return;
         case TCG_COND_NE:
-            if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I32) {
-                tcg_out_ext32u(s, TCG_REG_R0, arg1);
-                arg1 = TCG_REG_R0;
-            }
-            tcg_out_setcond_ne0(s, arg0, arg1);
+            tcg_out_setcond_ne0(s, type, arg0, arg1, neg);
             return;
         case TCG_COND_GE:
             tcg_out32(s, NOR | SAB(arg1, arg0, arg1));
@@ -1621,9 +1640,17 @@ static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
         case TCG_COND_LT:
             /* Extract the sign bit.  */
             if (type == TCG_TYPE_I32) {
-                tcg_out_shri32(s, arg0, arg1, 31);
+                if (neg) {
+                    tcg_out_sari32(s, arg0, arg1, 31);
+                } else {
+                    tcg_out_shri32(s, arg0, arg1, 31);
+                }
             } else {
-                tcg_out_shri64(s, arg0, arg1, 63);
+                if (neg) {
+                    tcg_out_sari64(s, arg0, arg1, 63);
+                } else {
+                    tcg_out_shri64(s, arg0, arg1, 63);
+                }
             }
             return;
         default:
@@ -1641,7 +1668,7 @@ static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
 
         isel = tcg_to_isel[cond];
 
-        tcg_out_movi(s, type, arg0, 1);
+        tcg_out_movi(s, type, arg0, neg ? -1 : 1);
         if (isel & 1) {
             /* arg0 = (bc ? 0 : 1) */
             tab = TAB(arg0, 0, arg0);
@@ -1655,51 +1682,47 @@ static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
         return;
     }
 
+    inv = false;
     switch (cond) {
     case TCG_COND_EQ:
         arg1 = tcg_gen_setcond_xor(s, arg1, arg2, const_arg2);
-        tcg_out_setcond_eq0(s, type, arg0, arg1);
-        return;
+        tcg_out_setcond_eq0(s, type, arg0, arg1, neg);
+        break;
 
     case TCG_COND_NE:
         arg1 = tcg_gen_setcond_xor(s, arg1, arg2, const_arg2);
-        /* Discard the high bits only once, rather than both inputs.  */
-        if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I32) {
-            tcg_out_ext32u(s, TCG_REG_R0, arg1);
-            arg1 = TCG_REG_R0;
-        }
-        tcg_out_setcond_ne0(s, arg0, arg1);
-        return;
+        tcg_out_setcond_ne0(s, type, arg0, arg1, neg);
+        break;
 
+    case TCG_COND_LE:
+    case TCG_COND_LEU:
+        inv = true;
+        /* fall through */
     case TCG_COND_GT:
     case TCG_COND_GTU:
-        sh = 30;
-        crop = 0;
-        goto crtest;
-
-    case TCG_COND_LT:
-    case TCG_COND_LTU:
-        sh = 29;
-        crop = 0;
+        sh = 30; /* CR7 CR_GT */
         goto crtest;
 
     case TCG_COND_GE:
     case TCG_COND_GEU:
-        sh = 31;
-        crop = CRNOR | BT(7, CR_EQ) | BA(7, CR_LT) | BB(7, CR_LT);
+        inv = true;
+        /* fall through */
+    case TCG_COND_LT:
+    case TCG_COND_LTU:
+        sh = 29; /* CR7 CR_LT */
         goto crtest;
 
-    case TCG_COND_LE:
-    case TCG_COND_LEU:
-        sh = 31;
-        crop = CRNOR | BT(7, CR_EQ) | BA(7, CR_GT) | BB(7, CR_GT);
     crtest:
         tcg_out_cmp(s, cond, arg1, arg2, const_arg2, 7, type);
-        if (crop) {
-            tcg_out32(s, crop);
-        }
         tcg_out32(s, MFOCRF | RT(TCG_REG_R0) | FXM(7));
         tcg_out_rlw(s, RLWINM, arg0, TCG_REG_R0, sh, 31, 31);
+        if (neg && inv) {
+            tcg_out32(s, ADDI | TAI(arg0, arg0, -1));
+        } else if (neg) {
+            tcg_out32(s, NEG | RT(arg0) | RA(arg0));
+        } else if (inv) {
+            tcg_out_xori32(s, arg0, arg0, 1);
+        }
         break;
 
     default:
@@ -2982,11 +3005,19 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_setcond_i32:
         tcg_out_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1], args[2],
-                        const_args[2]);
+                        const_args[2], false);
         break;
     case INDEX_op_setcond_i64:
         tcg_out_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1], args[2],
-                        const_args[2]);
+                        const_args[2], false);
+        break;
+    case INDEX_op_negsetcond_i32:
+        tcg_out_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1], args[2],
+                        const_args[2], true);
+        break;
+    case INDEX_op_negsetcond_i64:
+        tcg_out_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1], args[2],
+                        const_args[2], true);
         break;
     case INDEX_op_setcond2_i32:
         tcg_out_setcond2(s, args, const_args);
@@ -3724,6 +3755,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_rotl_i32:
     case INDEX_op_rotr_i32:
     case INDEX_op_setcond_i32:
+    case INDEX_op_negsetcond_i32:
     case INDEX_op_and_i64:
     case INDEX_op_andc_i64:
     case INDEX_op_shl_i64:
@@ -3732,6 +3764,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_rotl_i64:
     case INDEX_op_rotr_i64:
     case INDEX_op_setcond_i64:
+    case INDEX_op_negsetcond_i64:
         return C_O1_I2(r, r, ri);
 
     case INDEX_op_mul_i32:
-- 
2.34.1




* [PATCH 11/24] tcg/ppc: Use the Set Boolean Extension
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (9 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 10/24] tcg/ppc: Implement negsetcond_* Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-08 16:56   ` Daniel Henrique Barboza
  2023-08-15 13:16   ` Nicholas Piggin
  2023-08-08  3:11 ` [PATCH 12/24] tcg/aarch64: Implement negsetcond_* Richard Henderson
                   ` (12 subsequent siblings)
  23 siblings, 2 replies; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

With the SETBC family of instructions, every comparison takes exactly
two insns, saving 0-3 insns per (neg)setcond.
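
A small C model of the four instructions, for illustration: the enum
and function names are hypothetical, and the semantics are those I
understand from Power ISA v3.1 (SETBC gives 1/0, SETBCR gives 0/1,
SETNBC gives -1/0, SETNBCR gives 0/-1 for a set/clear CR bit).

```c
#include <stdbool.h>
#include <stdint.h>

enum setbc_op { OP_SETBC, OP_SETBCR, OP_SETNBC, OP_SETNBCR };

/*
 * Pick the instruction.  test_bit_set says whether the condition is
 * expressed as "CR bit set" (e.g. LT) or "CR bit clear" (e.g. GE,
 * which is "LT clear"); the latter needs a reverse form.
 */
static enum setbc_op pick_setbc(bool test_bit_set, bool neg)
{
    if (test_bit_set) {
        return neg ? OP_SETNBC : OP_SETBC;
    }
    return neg ? OP_SETNBCR : OP_SETBCR;
}

/* Result of executing the chosen instruction against one CR bit. */
static int64_t exec_setbc(enum setbc_op op, bool cr_bit)
{
    switch (op) {
    case OP_SETBC:   return cr_bit ? 1 : 0;
    case OP_SETBCR:  return cr_bit ? 0 : 1;
    case OP_SETNBC:  return cr_bit ? -1 : 0;
    case OP_SETNBCR: return cr_bit ? 0 : -1;
    }
    return 0;
}
```

So a negsetcond with GE would be a compare followed by SETNBCR on the
LT bit: the result is -1 when LT is clear and 0 otherwise.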

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/ppc/tcg-target.c.inc | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 10448aa0e6..090f11e71c 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -447,6 +447,11 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
 #define TW     XO31( 4)
 #define TRAP   (TW | TO(31))
 
+#define SETBC    XO31(384)  /* v3.10 */
+#define SETBCR   XO31(416)  /* v3.10 */
+#define SETNBC   XO31(448)  /* v3.10 */
+#define SETNBCR  XO31(480)  /* v3.10 */
+
 #define NOP    ORI  /* ori 0,0,0 */
 
 #define LVX        XO31(103)
@@ -1624,6 +1629,23 @@ static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
         arg2 = (uint32_t)arg2;
     }
 
+    /* With SETBC/SETBCR, we can always implement with 2 insns. */
+    if (have_isa_3_10) {
+        tcg_insn_unit bi, opc;
+
+        tcg_out_cmp(s, cond, arg1, arg2, const_arg2, 7, type);
+
+        /* Re-use tcg_to_bc for BI and BO_COND_{TRUE,FALSE}. */
+        bi = tcg_to_bc[cond] & (0x1f << 16);
+        if (tcg_to_bc[cond] & BO(8)) {
+            opc = neg ? SETNBC : SETBC;
+        } else {
+            opc = neg ? SETNBCR : SETBCR;
+        }
+        tcg_out32(s, opc | RT(arg0) | bi);
+        return;
+    }
+
     /* Handle common and trivial cases before handling anything else.  */
     if (arg2 == 0) {
         switch (cond) {
-- 
2.34.1




* [PATCH 12/24] tcg/aarch64: Implement negsetcond_*
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (10 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 11/24] tcg/ppc: Use the Set Boolean Extension Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-10 16:39   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 13/24] tcg/arm: Implement negsetcond_i32 Richard Henderson
                   ` (11 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Trivial, as aarch64 has an instruction for this: CSETM.
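
For readers less familiar with the alias, a minimal C model follows.
The function names are made up for this note; the semantics assumed
are CSINV Rd, Rn, Rm, cond: Rd = cond ? Rn : ~Rm, with both source
operands being the zero register.

```c
#include <stdbool.h>
#include <stdint.h>

/* CSINV Rd, Rn, Rm, cond: Rd = cond ? Rn : ~Rm. */
static uint64_t csinv(bool cond, uint64_t rn, uint64_t rm)
{
    return cond ? rn : ~rm;
}

/* CSETM Rd, cond is the alias CSINV Rd, ZR, ZR, invert(cond). */
static uint64_t csetm(bool cond)
{
    return csinv(!cond, 0, 0);
}
```

With the condition inverted, the "true" arm yields 0 and the "false"
arm yields ~0, so the original condition produces -1.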

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/aarch64/tcg-target.h     |  4 ++--
 tcg/aarch64/tcg-target.c.inc | 12 ++++++++++++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 6080fddf73..e3faa9cff4 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -94,7 +94,7 @@ typedef enum {
 #define TCG_TARGET_HAS_mulsh_i32        0
 #define TCG_TARGET_HAS_extrl_i64_i32    0
 #define TCG_TARGET_HAS_extrh_i64_i32    0
-#define TCG_TARGET_HAS_negsetcond_i32   0
+#define TCG_TARGET_HAS_negsetcond_i32   1
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 #define TCG_TARGET_HAS_div_i64          1
@@ -130,7 +130,7 @@ typedef enum {
 #define TCG_TARGET_HAS_muls2_i64        0
 #define TCG_TARGET_HAS_muluh_i64        1
 #define TCG_TARGET_HAS_mulsh_i64        1
-#define TCG_TARGET_HAS_negsetcond_i64   0
+#define TCG_TARGET_HAS_negsetcond_i64   1
 
 /*
  * Without FEAT_LSE2, we must use LDXP+STXP to implement atomic 128-bit load,
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 35ca80cd56..7d8d114c9e 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -2262,6 +2262,16 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
                      TCG_REG_XZR, tcg_invert_cond(args[3]));
         break;
 
+    case INDEX_op_negsetcond_i32:
+        a2 = (int32_t)a2;
+        /* FALLTHRU */
+    case INDEX_op_negsetcond_i64:
+        tcg_out_cmp(s, ext, a1, a2, c2);
+        /* Use CSETM alias of CSINV Wd, WZR, WZR, invert(cond).  */
+        tcg_out_insn(s, 3506, CSINV, ext, a0, TCG_REG_XZR,
+                     TCG_REG_XZR, tcg_invert_cond(args[3]));
+        break;
+
     case INDEX_op_movcond_i32:
         a2 = (int32_t)a2;
         /* FALLTHRU */
@@ -2868,6 +2878,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_sub_i64:
     case INDEX_op_setcond_i32:
     case INDEX_op_setcond_i64:
+    case INDEX_op_negsetcond_i32:
+    case INDEX_op_negsetcond_i64:
         return C_O1_I2(r, r, rA);
 
     case INDEX_op_mul_i32:
-- 
2.34.1




* [PATCH 13/24] tcg/arm: Implement negsetcond_i32
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (11 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 12/24] tcg/aarch64: Implement negsetcond_* Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-10 16:41   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 14/24] tcg/riscv: Implement negsetcond_* Richard Henderson
                   ` (10 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Trivial, as we simply need to load a different constant
in the conditional move.
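
A minimal sketch of the emitted sequence, modelled in C under the
assumption that it is CMP followed by MVN<cond> rd, #0 and
MOV<!cond> rd, #0.  The function name is hypothetical and the flags
are reduced to a single "condition holds" boolean.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models: CMP a, b; MVN<cond> rd, #0; MOV<!cond> rd, #0. */
static uint32_t negsetcond_arm_model(bool cond_holds, uint32_t rd)
{
    if (cond_holds) {
        rd = ~UINT32_C(0);   /* MVN rd, #0 writes all ones (-1) */
    }
    if (!cond_holds) {
        rd = 0;              /* MOV rd, #0 */
    }
    return rd;
}
```

Exactly one of the two predicated moves executes, so the previous
value of rd never survives.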

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/arm/tcg-target.h     | 2 +-
 tcg/arm/tcg-target.c.inc | 9 +++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index b076d033a9..b064bbda9f 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -122,7 +122,7 @@ extern bool use_neon_instructions;
 #define TCG_TARGET_HAS_mulsh_i32        0
 #define TCG_TARGET_HAS_div_i32          use_idiv_instructions
 #define TCG_TARGET_HAS_rem_i32          0
-#define TCG_TARGET_HAS_negsetcond_i32   0
+#define TCG_TARGET_HAS_negsetcond_i32   1
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 #define TCG_TARGET_HAS_qemu_ldst_i128   0
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 83e286088f..162df38c73 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1975,6 +1975,14 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_dat_imm(s, tcg_cond_to_arm_cond[tcg_invert_cond(args[3])],
                         ARITH_MOV, args[0], 0, 0);
         break;
+    case INDEX_op_negsetcond_i32:
+        tcg_out_dat_rIN(s, COND_AL, ARITH_CMP, ARITH_CMN, 0,
+                        args[1], args[2], const_args[2]);
+        tcg_out_dat_imm(s, tcg_cond_to_arm_cond[args[3]],
+                        ARITH_MVN, args[0], 0, 0);
+        tcg_out_dat_imm(s, tcg_cond_to_arm_cond[tcg_invert_cond(args[3])],
+                        ARITH_MOV, args[0], 0, 0);
+        break;
 
     case INDEX_op_brcond2_i32:
         c = tcg_out_cmp2(s, args, const_args);
@@ -2112,6 +2120,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_add_i32:
     case INDEX_op_sub_i32:
     case INDEX_op_setcond_i32:
+    case INDEX_op_negsetcond_i32:
         return C_O1_I2(r, r, rIN);
 
     case INDEX_op_and_i32:
-- 
2.34.1
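
A rough C model of the three-instruction sequence in the 32-bit Arm hunk above (CMP, then a predicated MVN, then a predicated MOV on the inverted condition). This is a sketch under stated assumptions: the condition is fixed to signed less-than for brevity, and arm_negsetcond_lt and check_arm_negsetcond are illustrative names that do not exist in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MVN Rd, #imm writes the bitwise NOT of the immediate. */
static uint32_t mvn_imm(uint32_t imm)
{
    return ~imm;
}

/*
 * Model of:
 *   CMP    a, b
 *   MVNlt  rd, #0    ; runs only when the condition holds  -> rd = ~0
 *   MOVge  rd, #0    ; runs only when the condition fails  -> rd = 0
 * Exactly one of the two predicated instructions executes.
 */
static uint32_t arm_negsetcond_lt(int32_t a, int32_t b)
{
    bool cond = a < b;           /* stands in for the flags set by CMP */
    uint32_t rd = 0xdeadbeefu;   /* stale register contents, always overwritten */

    if (cond) {
        rd = mvn_imm(0);
    }
    if (!cond) {
        rd = 0;
    }
    return rd;
}

/* Hypothetical self-check showing the 0 / -1 results. */
static void check_arm_negsetcond(void)
{
    assert(arm_negsetcond_lt(1, 2) == UINT32_MAX);
    assert(arm_negsetcond_lt(2, 1) == 0);
}
```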




* [PATCH 14/24] tcg/riscv: Implement negsetcond_*
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (12 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 13/24] tcg/arm: Implement negsetcond_i32 Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-08 16:47   ` Daniel Henrique Barboza
  2023-08-08  3:11 ` [PATCH 15/24] tcg/s390x: " Richard Henderson
                   ` (9 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/riscv/tcg-target.h     |  4 ++--
 tcg/riscv/tcg-target.c.inc | 45 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index b2961fec8e..7e8ac48a7d 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -120,7 +120,7 @@ extern bool have_zbb;
 #define TCG_TARGET_HAS_ctpop_i32        have_zbb
 #define TCG_TARGET_HAS_brcond2          1
 #define TCG_TARGET_HAS_setcond2         1
-#define TCG_TARGET_HAS_negsetcond_i32   0
+#define TCG_TARGET_HAS_negsetcond_i32   1
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 #define TCG_TARGET_HAS_movcond_i64      1
@@ -159,7 +159,7 @@ extern bool have_zbb;
 #define TCG_TARGET_HAS_muls2_i64        0
 #define TCG_TARGET_HAS_muluh_i64        1
 #define TCG_TARGET_HAS_mulsh_i64        1
-#define TCG_TARGET_HAS_negsetcond_i64   0
+#define TCG_TARGET_HAS_negsetcond_i64   1
 
 #define TCG_TARGET_HAS_qemu_ldst_i128   0
 
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index eeaeb6b6e3..232b616af3 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -936,6 +936,44 @@ static void tcg_out_setcond(TCGContext *s, TCGCond cond, TCGReg ret,
     }
 }
 
+static void tcg_out_negsetcond(TCGContext *s, TCGCond cond, TCGReg ret,
+                               TCGReg arg1, tcg_target_long arg2, bool c2)
+{
+    int tmpflags;
+    TCGReg tmp;
+
+    /* For LT/GE comparison against 0, replicate the sign bit. */
+    if (c2 && arg2 == 0) {
+        switch (cond) {
+        case TCG_COND_GE:
+            tcg_out_opc_imm(s, OPC_XORI, ret, arg1, -1);
+            arg1 = ret;
+            /* fall through */
+        case TCG_COND_LT:
+            tcg_out_opc_imm(s, OPC_SRAI, ret, arg1, TCG_TARGET_REG_BITS - 1);
+            return;
+        default:
+            break;
+        }
+    }
+
+    tmpflags = tcg_out_setcond_int(s, cond, ret, arg1, arg2, c2);
+    tmp = tmpflags & ~SETCOND_FLAGS;
+
+    /* If intermediate result is zero/non-zero: test != 0. */
+    if (tmpflags & SETCOND_NEZ) {
+        tcg_out_opc_reg(s, OPC_SLTU, ret, TCG_REG_ZERO, tmp);
+        tmp = ret;
+    }
+
+    /* Produce the 0/-1 result. */
+    if (tmpflags & SETCOND_INV) {
+        tcg_out_opc_imm(s, OPC_ADDI, ret, tmp, -1);
+    } else {
+        tcg_out_opc_reg(s, OPC_SUB, ret, TCG_REG_ZERO, tmp);
+    }
+}
+
 static void tcg_out_movcond_zicond(TCGContext *s, TCGReg ret, TCGReg test_ne,
                                    int val1, bool c_val1,
                                    int val2, bool c_val2)
@@ -1782,6 +1820,11 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_setcond(s, args[3], a0, a1, a2, c2);
         break;
 
+    case INDEX_op_negsetcond_i32:
+    case INDEX_op_negsetcond_i64:
+        tcg_out_negsetcond(s, args[3], a0, a1, a2, c2);
+        break;
+
     case INDEX_op_movcond_i32:
     case INDEX_op_movcond_i64:
         tcg_out_movcond(s, args[5], a0, a1, a2, c2,
@@ -1910,6 +1953,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_xor_i64:
     case INDEX_op_setcond_i32:
     case INDEX_op_setcond_i64:
+    case INDEX_op_negsetcond_i32:
+    case INDEX_op_negsetcond_i64:
         return C_O1_I2(r, r, rI);
 
     case INDEX_op_andc_i32:
-- 
2.34.1
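
The RISC-V patch above uses two ideas: replicating the sign bit for LT/GE against zero, and turning a 0/1 intermediate into 0/-1 with either ADDI -1 (when the intermediate holds the inverted condition) or a subtraction from zero. The C sketch below models only that arithmetic. The function names are hypothetical, and the right shift of a negative value is implementation-defined in ISO C; the sketch assumes an arithmetic shift, as GCC and Clang provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SRAI by 63: replicate the sign bit, giving -1 for x < 0, else 0. */
static int64_t neg_lt0(int64_t x)
{
    return x >> 63;   /* assumes arithmetic shift of signed values */
}

/* XORI x, -1 followed by SRAI by 63: -1 for x >= 0, else 0. */
static int64_t neg_ge0(int64_t x)
{
    return ~x >> 63;
}

/*
 * General path: t is a 0/1 intermediate.  When "inverted" is set,
 * t holds the result of the opposite condition (the SETCOND_INV case).
 */
static int64_t finish_negsetcond(int64_t t, bool inverted)
{
    assert(t == 0 || t == 1);
    return inverted ? t - 1     /* ADDI ret, tmp, -1 */
                    : 0 - t;    /* SUB  ret, zero, tmp */
}
```

In the inverted case t is 1 exactly when the requested condition is false, so t - 1 gives 0 for false and -1 for true without a separate XORI.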




* [PATCH 15/24] tcg/s390x: Implement negsetcond_*
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (13 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 14/24] tcg/riscv: Implement negsetcond_* Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-08  3:11 ` [PATCH 16/24] tcg/sparc64: " Richard Henderson
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/s390x/tcg-target.h     |  4 +-
 tcg/s390x/tcg-target.c.inc | 78 +++++++++++++++++++++++++-------------
 2 files changed, 54 insertions(+), 28 deletions(-)

diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 24e207c2d4..cd3d245be0 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -104,7 +104,7 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_mulsh_i32      0
 #define TCG_TARGET_HAS_extrl_i64_i32  0
 #define TCG_TARGET_HAS_extrh_i64_i32  0
-#define TCG_TARGET_HAS_negsetcond_i32 0
+#define TCG_TARGET_HAS_negsetcond_i32 1
 #define TCG_TARGET_HAS_qemu_st8_i32   0
 
 #define TCG_TARGET_HAS_div2_i64       1
@@ -139,7 +139,7 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_muls2_i64      HAVE_FACILITY(MISC_INSN_EXT2)
 #define TCG_TARGET_HAS_muluh_i64      0
 #define TCG_TARGET_HAS_mulsh_i64      0
-#define TCG_TARGET_HAS_negsetcond_i64 0
+#define TCG_TARGET_HAS_negsetcond_i64 1
 
 #define TCG_TARGET_HAS_qemu_ldst_i128 1
 
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index a94f7908d6..ecd8aaf2a1 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -1266,7 +1266,8 @@ static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
 }
 
 static void tgen_setcond(TCGContext *s, TCGType type, TCGCond cond,
-                         TCGReg dest, TCGReg c1, TCGArg c2, int c2const)
+                         TCGReg dest, TCGReg c1, TCGArg c2,
+                         bool c2const, bool neg)
 {
     int cc;
 
@@ -1275,11 +1276,27 @@ static void tgen_setcond(TCGContext *s, TCGType type, TCGCond cond,
         /* Emit: d = 0, d = (cc ? 1 : d).  */
         cc = tgen_cmp(s, type, cond, c1, c2, c2const, false);
         tcg_out_movi(s, TCG_TYPE_I64, dest, 0);
-        tcg_out_insn(s, RIEg, LOCGHI, dest, 1, cc);
+        tcg_out_insn(s, RIEg, LOCGHI, dest, neg ? -1 : 1, cc);
         return;
     }
 
- restart:
+    switch (cond) {
+    case TCG_COND_GEU:
+    case TCG_COND_LTU:
+    case TCG_COND_LT:
+    case TCG_COND_GE:
+        /* Swap operands so that we can use LEU/GTU/GT/LE.  */
+        if (!c2const) {
+            TCGReg t = c1;
+            c1 = c2;
+            c2 = t;
+            cond = tcg_swap_cond(cond);
+        }
+        break;
+    default:
+        break;
+    }
+
     switch (cond) {
     case TCG_COND_NE:
         /* X != 0 is X > 0.  */
@@ -1292,11 +1309,20 @@ static void tgen_setcond(TCGContext *s, TCGType type, TCGCond cond,
 
     case TCG_COND_GTU:
     case TCG_COND_GT:
-        /* The result of a compare has CC=2 for GT and CC=3 unused.
-           ADD LOGICAL WITH CARRY considers (CC & 2) the carry bit.  */
+        /*
+         * The result of a compare has CC=2 for GT and CC=3 unused.
+         * ADD LOGICAL WITH CARRY considers (CC & 2) the carry bit.
+         */
         tgen_cmp(s, type, cond, c1, c2, c2const, true);
         tcg_out_movi(s, type, dest, 0);
         tcg_out_insn(s, RRE, ALCGR, dest, dest);
+        if (neg) {
+            if (type == TCG_TYPE_I32) {
+                tcg_out_insn(s, RR, LCR, dest, dest);
+            } else {
+                tcg_out_insn(s, RRE, LCGR, dest, dest);
+            }
+        }
         return;
 
     case TCG_COND_EQ:
@@ -1310,27 +1336,17 @@ static void tgen_setcond(TCGContext *s, TCGType type, TCGCond cond,
 
     case TCG_COND_LEU:
     case TCG_COND_LE:
-        /* As above, but we're looking for borrow, or !carry.
-           The second insn computes d - d - borrow, or -1 for true
-           and 0 for false.  So we must mask to 1 bit afterward.  */
+        /*
+         * As above, but we're looking for borrow, or !carry.
+         * The second insn computes d - d - borrow, or -1 for true
+         * and 0 for false.  So we must mask to 1 bit afterward.
+         */
         tgen_cmp(s, type, cond, c1, c2, c2const, true);
         tcg_out_insn(s, RRE, SLBGR, dest, dest);
-        tgen_andi(s, type, dest, 1);
-        return;
-
-    case TCG_COND_GEU:
-    case TCG_COND_LTU:
-    case TCG_COND_LT:
-    case TCG_COND_GE:
-        /* Swap operands so that we can use LEU/GTU/GT/LE.  */
-        if (!c2const) {
-            TCGReg t = c1;
-            c1 = c2;
-            c2 = t;
-            cond = tcg_swap_cond(cond);
-            goto restart;
+        if (!neg) {
+            tgen_andi(s, type, dest, 1);
         }
-        break;
+        return;
 
     default:
         g_assert_not_reached();
@@ -1339,7 +1355,7 @@ static void tgen_setcond(TCGContext *s, TCGType type, TCGCond cond,
     cc = tgen_cmp(s, type, cond, c1, c2, c2const, false);
     /* Emit: d = 0, t = 1, d = (cc ? t : d).  */
     tcg_out_movi(s, TCG_TYPE_I64, dest, 0);
-    tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, 1);
+    tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, neg ? -1 : 1);
     tcg_out_insn(s, RRFc, LOCGR, dest, TCG_TMP0, cc);
 }
 
@@ -2288,7 +2304,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
     case INDEX_op_setcond_i32:
         tgen_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1],
-                     args[2], const_args[2]);
+                     args[2], const_args[2], false);
+        break;
+    case INDEX_op_negsetcond_i32:
+        tgen_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1],
+                     args[2], const_args[2], true);
         break;
     case INDEX_op_movcond_i32:
         tgen_movcond(s, TCG_TYPE_I32, args[5], args[0], args[1],
@@ -2566,7 +2586,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
     case INDEX_op_setcond_i64:
         tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1],
-                     args[2], const_args[2]);
+                     args[2], const_args[2], false);
+        break;
+    case INDEX_op_negsetcond_i64:
+        tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1],
+                     args[2], const_args[2], true);
         break;
     case INDEX_op_movcond_i64:
         tgen_movcond(s, TCG_TYPE_I64, args[5], args[0], args[1],
@@ -3109,8 +3133,10 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_rotr_i32:
     case INDEX_op_rotr_i64:
     case INDEX_op_setcond_i32:
+    case INDEX_op_negsetcond_i32:
         return C_O1_I2(r, r, ri);
     case INDEX_op_setcond_i64:
+    case INDEX_op_negsetcond_i64:
         return C_O1_I2(r, r, rA);
 
     case INDEX_op_clz_i64:
-- 
2.34.1
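
To illustrate the carry/borrow paths in the s390x patch above, here is a small C model based only on what the patch comments state: a compare yields CC=2 for "greater", ADD LOGICAL WITH CARRY treats (CC & 2) as the carry, and SUBTRACT LOGICAL WITH BORROW computes d - d - borrow. Everything else (the function names, the restriction to signed 64-bit operands) is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Condition code after a compare: 0 = equal, 1 = low, 2 = high. */
static int cmp_cc(int64_t a, int64_t b)
{
    return a == b ? 0 : (a < b ? 1 : 2);
}

/* GT path: d = 0; ALCGR d,d; optionally LCGR d,d for the negated form. */
static uint64_t model_setcond_gt(int64_t a, int64_t b, bool neg)
{
    uint64_t carry = (cmp_cc(a, b) & 2) != 0;
    uint64_t d = 0 + 0 + carry;      /* ALCGR with d == 0 */

    if (neg) {
        d = 0 - d;                   /* LCGR: load complement */
    }
    return d;
}

/* LE path: SLBGR d,d gives 0 or -1; mask to one bit unless negated. */
static uint64_t model_setcond_le(int64_t a, int64_t b, bool neg)
{
    uint64_t borrow = (cmp_cc(a, b) & 2) == 0;
    uint64_t d = 0 - borrow;         /* d - d - borrow */

    if (!neg) {
        d &= 1;                      /* tgen_andi(dest, 1) */
    }
    return d;
}

/* Hypothetical self-check of the four combinations. */
static void check_s390x_model(void)
{
    assert(model_setcond_gt(3, 2, false) == 1);
    assert(model_setcond_gt(3, 2, true) == UINT64_MAX);
    assert(model_setcond_le(2, 2, false) == 1);
    assert(model_setcond_le(2, 2, true) == UINT64_MAX);
}
```

The model shows why the LE path gets cheaper for negsetcond (the mask is simply skipped) while the GT path needs one extra negate.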




* [PATCH 16/24] tcg/sparc64: Implement negsetcond_*
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (14 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 15/24] tcg/s390x: " Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-11 12:24   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 17/24] tcg/i386: Merge tcg_out_brcond{32,64} Richard Henderson
                   ` (7 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/sparc64/tcg-target.h     |  4 ++--
 tcg/sparc64/tcg-target.c.inc | 36 ++++++++++++++++++++++++++----------
 2 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/tcg/sparc64/tcg-target.h b/tcg/sparc64/tcg-target.h
index 1faadc704b..4bbd825bd8 100644
--- a/tcg/sparc64/tcg-target.h
+++ b/tcg/sparc64/tcg-target.h
@@ -112,7 +112,7 @@ extern bool use_vis3_instructions;
 #define TCG_TARGET_HAS_muls2_i32        1
 #define TCG_TARGET_HAS_muluh_i32        0
 #define TCG_TARGET_HAS_mulsh_i32        0
-#define TCG_TARGET_HAS_negsetcond_i32   0
+#define TCG_TARGET_HAS_negsetcond_i32   1
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 #define TCG_TARGET_HAS_extrl_i64_i32    1
@@ -150,7 +150,7 @@ extern bool use_vis3_instructions;
 #define TCG_TARGET_HAS_muls2_i64        0
 #define TCG_TARGET_HAS_muluh_i64        use_vis3_instructions
 #define TCG_TARGET_HAS_mulsh_i64        0
-#define TCG_TARGET_HAS_negsetcond_i64   0
+#define TCG_TARGET_HAS_negsetcond_i64   1
 
 #define TCG_TARGET_HAS_qemu_ldst_i128   0
 
diff --git a/tcg/sparc64/tcg-target.c.inc b/tcg/sparc64/tcg-target.c.inc
index ffcb879211..37839f9a21 100644
--- a/tcg/sparc64/tcg-target.c.inc
+++ b/tcg/sparc64/tcg-target.c.inc
@@ -720,7 +720,7 @@ static void tcg_out_movcond_i64(TCGContext *s, TCGCond cond, TCGReg ret,
 }
 
 static void tcg_out_setcond_i32(TCGContext *s, TCGCond cond, TCGReg ret,
-                                TCGReg c1, int32_t c2, int c2const)
+                                TCGReg c1, int32_t c2, int c2const, bool neg)
 {
     /* For 32-bit comparisons, we can play games with ADDC/SUBC.  */
     switch (cond) {
@@ -760,22 +760,30 @@ static void tcg_out_setcond_i32(TCGContext *s, TCGCond cond, TCGReg ret,
     default:
         tcg_out_cmp(s, c1, c2, c2const);
         tcg_out_movi_s13(s, ret, 0);
-        tcg_out_movcc(s, cond, MOVCC_ICC, ret, 1, 1);
+        tcg_out_movcc(s, cond, MOVCC_ICC, ret, neg ? -1 : 1, 1);
         return;
     }
 
     tcg_out_cmp(s, c1, c2, c2const);
     if (cond == TCG_COND_LTU) {
-        tcg_out_arithi(s, ret, TCG_REG_G0, 0, ARITH_ADDC);
+        if (neg) {
+            tcg_out_arithi(s, ret, TCG_REG_G0, 0, ARITH_SUBC);
+        } else {
+            tcg_out_arithi(s, ret, TCG_REG_G0, 0, ARITH_ADDC);
+        }
     } else {
-        tcg_out_arithi(s, ret, TCG_REG_G0, -1, ARITH_SUBC);
+        if (neg) {
+            tcg_out_arithi(s, ret, TCG_REG_G0, -1, ARITH_ADDC);
+        } else {
+            tcg_out_arithi(s, ret, TCG_REG_G0, -1, ARITH_SUBC);
+        }
     }
 }
 
 static void tcg_out_setcond_i64(TCGContext *s, TCGCond cond, TCGReg ret,
-                                TCGReg c1, int32_t c2, int c2const)
+                                TCGReg c1, int32_t c2, int c2const, bool neg)
 {
-    if (use_vis3_instructions) {
+    if (use_vis3_instructions && !neg) {
         switch (cond) {
         case TCG_COND_NE:
             if (c2 != 0) {
@@ -796,11 +804,11 @@ static void tcg_out_setcond_i64(TCGContext *s, TCGCond cond, TCGReg ret,
        if the input does not overlap the output.  */
     if (c2 == 0 && !is_unsigned_cond(cond) && c1 != ret) {
         tcg_out_movi_s13(s, ret, 0);
-        tcg_out_movr(s, cond, ret, c1, 1, 1);
+        tcg_out_movr(s, cond, ret, c1, neg ? -1 : 1, 1);
     } else {
         tcg_out_cmp(s, c1, c2, c2const);
         tcg_out_movi_s13(s, ret, 0);
-        tcg_out_movcc(s, cond, MOVCC_XCC, ret, 1, 1);
+        tcg_out_movcc(s, cond, MOVCC_XCC, ret, neg ? -1 : 1, 1);
     }
 }
 
@@ -1355,7 +1363,10 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_brcond_i32(s, a2, a0, a1, const_args[1], arg_label(args[3]));
         break;
     case INDEX_op_setcond_i32:
-        tcg_out_setcond_i32(s, args[3], a0, a1, a2, c2);
+        tcg_out_setcond_i32(s, args[3], a0, a1, a2, c2, false);
+        break;
+    case INDEX_op_negsetcond_i32:
+        tcg_out_setcond_i32(s, args[3], a0, a1, a2, c2, true);
         break;
     case INDEX_op_movcond_i32:
         tcg_out_movcond_i32(s, args[5], a0, a1, a2, c2, args[3], const_args[3]);
@@ -1437,7 +1448,10 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_brcond_i64(s, a2, a0, a1, const_args[1], arg_label(args[3]));
         break;
     case INDEX_op_setcond_i64:
-        tcg_out_setcond_i64(s, args[3], a0, a1, a2, c2);
+        tcg_out_setcond_i64(s, args[3], a0, a1, a2, c2, false);
+        break;
+    case INDEX_op_negsetcond_i64:
+        tcg_out_setcond_i64(s, args[3], a0, a1, a2, c2, true);
         break;
     case INDEX_op_movcond_i64:
         tcg_out_movcond_i64(s, args[5], a0, a1, a2, c2, args[3], const_args[3]);
@@ -1564,6 +1578,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_sar_i64:
     case INDEX_op_setcond_i32:
     case INDEX_op_setcond_i64:
+    case INDEX_op_negsetcond_i32:
+    case INDEX_op_negsetcond_i64:
         return C_O1_I2(r, rZ, rJ);
 
     case INDEX_op_brcond_i32:
-- 
2.34.1
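
The ADDC/SUBC swap in the sparc64 32-bit path above can be checked with modular arithmetic. In the sketch below the carry flag after the compare is modelled as "a < b" (unsigned), ADDC as rs1 + imm + C and SUBC as rs1 - imm - C, all on 32-bit values. The function names are hypothetical and the model ignores the upper half of the 64-bit registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Carry (borrow) left by the compare: set when a < b, unsigned. */
static uint32_t carry_after_cmp(uint32_t a, uint32_t b)
{
    return a < b;
}

/* LTU: ADDC %g0, 0 gives C; SUBC %g0, 0 gives -C. */
static uint32_t model_ltu(uint32_t a, uint32_t b, bool neg)
{
    uint32_t c = carry_after_cmp(a, b);
    return neg ? 0u - 0u - c
               : 0u + 0u + c;
}

/* GEU: SUBC %g0, -1 gives 1 - C; ADDC %g0, -1 gives C - 1. */
static uint32_t model_geu(uint32_t a, uint32_t b, bool neg)
{
    uint32_t c = carry_after_cmp(a, b);
    uint32_t m1 = UINT32_MAX;        /* the immediate -1 */
    return neg ? 0u + m1 + c
               : 0u - m1 - c;
}

/* Hypothetical self-check of both conditions in both forms. */
static void check_sparc_model(void)
{
    assert(model_ltu(1, 2, false) == 1);
    assert(model_ltu(1, 2, true) == UINT32_MAX);
    assert(model_geu(2, 2, false) == 1);
    assert(model_geu(2, 2, true) == UINT32_MAX);
}
```

So the negated forms cost nothing extra: the same carry is consumed, only the opcode changes.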




* [PATCH 17/24] tcg/i386: Merge tcg_out_brcond{32,64}
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (15 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 16/24] tcg/sparc64: " Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-11 10:20   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 18/24] tcg/i386: Merge tcg_out_setcond{32,64} Richard Henderson
                   ` (6 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Pass a rexw parameter instead of duplicating the functions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 110 +++++++++++++++++---------------------
 1 file changed, 49 insertions(+), 61 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 77482da070..b9673b55bd 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1433,99 +1433,89 @@ static void tcg_out_cmp(TCGContext *s, TCGArg arg1, TCGArg arg2,
     }
 }
 
-static void tcg_out_brcond32(TCGContext *s, TCGCond cond,
-                             TCGArg arg1, TCGArg arg2, int const_arg2,
-                             TCGLabel *label, int small)
+static void tcg_out_brcond(TCGContext *s, int rexw, TCGCond cond,
+                           TCGArg arg1, TCGArg arg2, int const_arg2,
+                           TCGLabel *label, bool small)
 {
-    tcg_out_cmp(s, arg1, arg2, const_arg2, 0);
+    tcg_out_cmp(s, arg1, arg2, const_arg2, rexw);
     tcg_out_jxx(s, tcg_cond_to_jcc[cond], label, small);
 }
 
-#if TCG_TARGET_REG_BITS == 64
-static void tcg_out_brcond64(TCGContext *s, TCGCond cond,
-                             TCGArg arg1, TCGArg arg2, int const_arg2,
-                             TCGLabel *label, int small)
-{
-    tcg_out_cmp(s, arg1, arg2, const_arg2, P_REXW);
-    tcg_out_jxx(s, tcg_cond_to_jcc[cond], label, small);
-}
-#else
-/* XXX: we implement it at the target level to avoid having to
-   handle cross basic blocks temporaries */
+#if TCG_TARGET_REG_BITS == 32
 static void tcg_out_brcond2(TCGContext *s, const TCGArg *args,
-                            const int *const_args, int small)
+                            const int *const_args, bool small)
 {
     TCGLabel *label_next = gen_new_label();
     TCGLabel *label_this = arg_label(args[5]);
 
     switch(args[4]) {
     case TCG_COND_EQ:
-        tcg_out_brcond32(s, TCG_COND_NE, args[0], args[2], const_args[2],
-                         label_next, 1);
-        tcg_out_brcond32(s, TCG_COND_EQ, args[1], args[3], const_args[3],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_NE, args[0], args[2], const_args[2],
+                       label_next, 1);
+        tcg_out_brcond(s, 0, TCG_COND_EQ, args[1], args[3], const_args[3],
+                       label_this, small);
         break;
     case TCG_COND_NE:
-        tcg_out_brcond32(s, TCG_COND_NE, args[0], args[2], const_args[2],
-                         label_this, small);
-        tcg_out_brcond32(s, TCG_COND_NE, args[1], args[3], const_args[3],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_NE, args[0], args[2], const_args[2],
+                       label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_NE, args[1], args[3], const_args[3],
+                       label_this, small);
         break;
     case TCG_COND_LT:
-        tcg_out_brcond32(s, TCG_COND_LT, args[1], args[3], const_args[3],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_LT, args[1], args[3], const_args[3],
+                       label_this, small);
         tcg_out_jxx(s, JCC_JNE, label_next, 1);
-        tcg_out_brcond32(s, TCG_COND_LTU, args[0], args[2], const_args[2],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_LTU, args[0], args[2], const_args[2],
+                       label_this, small);
         break;
     case TCG_COND_LE:
-        tcg_out_brcond32(s, TCG_COND_LT, args[1], args[3], const_args[3],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_LT, args[1], args[3], const_args[3],
+                       label_this, small);
         tcg_out_jxx(s, JCC_JNE, label_next, 1);
-        tcg_out_brcond32(s, TCG_COND_LEU, args[0], args[2], const_args[2],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_LEU, args[0], args[2], const_args[2],
+                       label_this, small);
         break;
     case TCG_COND_GT:
-        tcg_out_brcond32(s, TCG_COND_GT, args[1], args[3], const_args[3],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_GT, args[1], args[3], const_args[3],
+                       label_this, small);
         tcg_out_jxx(s, JCC_JNE, label_next, 1);
-        tcg_out_brcond32(s, TCG_COND_GTU, args[0], args[2], const_args[2],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_GTU, args[0], args[2], const_args[2],
+                       label_this, small);
         break;
     case TCG_COND_GE:
-        tcg_out_brcond32(s, TCG_COND_GT, args[1], args[3], const_args[3],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_GT, args[1], args[3], const_args[3],
+                       label_this, small);
         tcg_out_jxx(s, JCC_JNE, label_next, 1);
-        tcg_out_brcond32(s, TCG_COND_GEU, args[0], args[2], const_args[2],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_GEU, args[0], args[2], const_args[2],
+                       label_this, small);
         break;
     case TCG_COND_LTU:
-        tcg_out_brcond32(s, TCG_COND_LTU, args[1], args[3], const_args[3],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_LTU, args[1], args[3], const_args[3],
+                       label_this, small);
         tcg_out_jxx(s, JCC_JNE, label_next, 1);
-        tcg_out_brcond32(s, TCG_COND_LTU, args[0], args[2], const_args[2],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_LTU, args[0], args[2], const_args[2],
+                       label_this, small);
         break;
     case TCG_COND_LEU:
-        tcg_out_brcond32(s, TCG_COND_LTU, args[1], args[3], const_args[3],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_LTU, args[1], args[3], const_args[3],
+                       label_this, small);
         tcg_out_jxx(s, JCC_JNE, label_next, 1);
-        tcg_out_brcond32(s, TCG_COND_LEU, args[0], args[2], const_args[2],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_LEU, args[0], args[2], const_args[2],
+                       label_this, small);
         break;
     case TCG_COND_GTU:
-        tcg_out_brcond32(s, TCG_COND_GTU, args[1], args[3], const_args[3],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_GTU, args[1], args[3], const_args[3],
+                       label_this, small);
         tcg_out_jxx(s, JCC_JNE, label_next, 1);
-        tcg_out_brcond32(s, TCG_COND_GTU, args[0], args[2], const_args[2],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_GTU, args[0], args[2], const_args[2],
+                       label_this, small);
         break;
     case TCG_COND_GEU:
-        tcg_out_brcond32(s, TCG_COND_GTU, args[1], args[3], const_args[3],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_GTU, args[1], args[3], const_args[3],
+                       label_this, small);
         tcg_out_jxx(s, JCC_JNE, label_next, 1);
-        tcg_out_brcond32(s, TCG_COND_GEU, args[0], args[2], const_args[2],
-                         label_this, small);
+        tcg_out_brcond(s, 0, TCG_COND_GEU, args[0], args[2], const_args[2],
+                       label_this, small);
         break;
     default:
         g_assert_not_reached();
@@ -2571,8 +2561,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_modrm(s, OPC_POPCNT + rexw, a0, a1);
         break;
 
-    case INDEX_op_brcond_i32:
-        tcg_out_brcond32(s, a2, a0, a1, const_args[1], arg_label(args[3]), 0);
+    OP_32_64(brcond):
+        tcg_out_brcond(s, rexw, a2, a0, a1, const_args[1],
+                       arg_label(args[3]), 0);
         break;
     case INDEX_op_setcond_i32:
         tcg_out_setcond32(s, args[3], a0, a1, a2, const_a2);
@@ -2727,9 +2718,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
-    case INDEX_op_brcond_i64:
-        tcg_out_brcond64(s, a2, a0, a1, const_args[1], arg_label(args[3]), 0);
-        break;
     case INDEX_op_setcond_i64:
         tcg_out_setcond64(s, args[3], a0, a1, a2, const_a2);
         break;
-- 
2.34.1
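
For context on tcg_out_brcond2, which the patch above touches only mechanically: on a 32-bit host a 64-bit comparison is built from two 32-bit ones, with the high words deciding unless they are equal. The sketch below models the TCG_COND_LT case (branch on signed LT of the high words, JNE to skip, then unsigned LTU on the low words). The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Signed 64-bit "a < b" computed from 32-bit halves. */
static bool lt64_from_halves(uint32_t al, int32_t ah,
                             uint32_t bl, int32_t bh)
{
    if (ah < bh) {
        return true;    /* brcond LT on the high words: taken */
    }
    if (ah != bh) {
        return false;   /* JNE label_next: high words decide "not less" */
    }
    return al < bl;     /* brcond LTU on the low words */
}

/* Split two 64-bit values and run the double-word comparison. */
static bool lt64_split(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;

    return lt64_from_halves((uint32_t)ua, (int32_t)(uint32_t)(ua >> 32),
                            (uint32_t)ub, (int32_t)(uint32_t)(ub >> 32));
}

/* Hypothetical self-check against the native comparison. */
static void check_lt64(void)
{
    assert(lt64_split(-1, 0));
    assert(!lt64_split(0, -1));
    assert(lt64_split(INT64_C(0xFFFFFFFF), INT64_C(0x100000000)));
    assert(!lt64_split(5, 5));
}
```

Only the high-word comparison is signed; the low words are always compared unsigned, which matches the LT/LTU pairing visible in the hunk.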




* [PATCH 18/24] tcg/i386: Merge tcg_out_setcond{32,64}
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (16 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 17/24] tcg/i386: Merge tcg_out_brcond{32,64} Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-11 10:21   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 19/24] tcg/i386: Merge tcg_out_movcond{32,64} Richard Henderson
                   ` (5 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Pass a rexw parameter instead of duplicating the functions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 24 +++++++-----------------
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index b9673b55bd..ec3c7012d4 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1524,23 +1524,16 @@ static void tcg_out_brcond2(TCGContext *s, const TCGArg *args,
 }
 #endif
 
-static void tcg_out_setcond32(TCGContext *s, TCGCond cond, TCGArg dest,
-                              TCGArg arg1, TCGArg arg2, int const_arg2)
+static void tcg_out_setcond(TCGContext *s, int rexw, TCGCond cond,
+                            TCGArg dest, TCGArg arg1, TCGArg arg2,
+                            int const_arg2)
 {
-    tcg_out_cmp(s, arg1, arg2, const_arg2, 0);
+    tcg_out_cmp(s, arg1, arg2, const_arg2, rexw);
     tcg_out_modrm(s, OPC_SETCC | tcg_cond_to_jcc[cond], 0, dest);
     tcg_out_ext8u(s, dest, dest);
 }
 
-#if TCG_TARGET_REG_BITS == 64
-static void tcg_out_setcond64(TCGContext *s, TCGCond cond, TCGArg dest,
-                              TCGArg arg1, TCGArg arg2, int const_arg2)
-{
-    tcg_out_cmp(s, arg1, arg2, const_arg2, P_REXW);
-    tcg_out_modrm(s, OPC_SETCC | tcg_cond_to_jcc[cond], 0, dest);
-    tcg_out_ext8u(s, dest, dest);
-}
-#else
+#if TCG_TARGET_REG_BITS == 32
 static void tcg_out_setcond2(TCGContext *s, const TCGArg *args,
                              const int *const_args)
 {
@@ -2565,8 +2558,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_brcond(s, rexw, a2, a0, a1, const_args[1],
                        arg_label(args[3]), 0);
         break;
-    case INDEX_op_setcond_i32:
-        tcg_out_setcond32(s, args[3], a0, a1, a2, const_a2);
+    OP_32_64(setcond):
+        tcg_out_setcond(s, rexw, args[3], a0, a1, a2, const_a2);
         break;
     case INDEX_op_movcond_i32:
         tcg_out_movcond32(s, args[5], a0, a1, a2, const_a2, args[3]);
@@ -2718,9 +2711,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
-    case INDEX_op_setcond_i64:
-        tcg_out_setcond64(s, args[3], a0, a1, a2, const_a2);
-        break;
     case INDEX_op_movcond_i64:
         tcg_out_movcond64(s, args[5], a0, a1, a2, const_a2, args[3]);
         break;
-- 
2.34.1
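
A small model of what the merged tcg_out_setcond above emits may help show why one function with a rexw argument is sufficient: rexw only changes the width of the compare, while SETcc always writes the low byte and the zero-extension clears the rest. The sketch fixes the condition to EQ; model_setcond_eq and its old_dest parameter are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * rexw == false: compare the low 32 bits only (no REX.W prefix).
 * rexw == true:  compare all 64 bits.
 */
static uint64_t model_setcond_eq(bool rexw, uint64_t a, uint64_t b,
                                 uint64_t old_dest)
{
    /* CMP with the selected operand size. */
    bool cond = rexw ? a == b : (uint32_t)a == (uint32_t)b;

    /* SETcc writes only the low 8 bits of the destination. */
    uint64_t dest = (old_dest & ~UINT64_C(0xff)) | (cond ? 1 : 0);

    /* tcg_out_ext8u: zero-extend the low byte, dropping stale bits. */
    return dest & 0xff;
}

/* Hypothetical self-check: same inputs, different compare width. */
static void check_setcond_model(void)
{
    uint64_t a = UINT64_C(0x100000001), b = 1;

    assert(model_setcond_eq(false, a, b, UINT64_MAX) == 1);
    assert(model_setcond_eq(true, a, b, UINT64_MAX) == 0);
}
```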




* [PATCH 19/24] tcg/i386: Merge tcg_out_movcond{32,64}
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (17 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 18/24] tcg/i386: Merge tcg_out_setcond{32,64} Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-11 10:22   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 20/24] tcg/i386: Add cf parameter to tcg_out_cmp Richard Henderson
                   ` (4 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Pass a rexw parameter instead of duplicating the functions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 28 +++++++---------------------
 1 file changed, 7 insertions(+), 21 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index ec3c7012d4..b88fc14afd 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1590,24 +1590,14 @@ static void tcg_out_cmov(TCGContext *s, TCGCond cond, int rexw,
     }
 }
 
-static void tcg_out_movcond32(TCGContext *s, TCGCond cond, TCGReg dest,
-                              TCGReg c1, TCGArg c2, int const_c2,
-                              TCGReg v1)
+static void tcg_out_movcond(TCGContext *s, int rexw, TCGCond cond,
+                            TCGReg dest, TCGReg c1, TCGArg c2, int const_c2,
+                            TCGReg v1)
 {
-    tcg_out_cmp(s, c1, c2, const_c2, 0);
-    tcg_out_cmov(s, cond, 0, dest, v1);
+    tcg_out_cmp(s, c1, c2, const_c2, rexw);
+    tcg_out_cmov(s, cond, rexw, dest, v1);
 }
 
-#if TCG_TARGET_REG_BITS == 64
-static void tcg_out_movcond64(TCGContext *s, TCGCond cond, TCGReg dest,
-                              TCGReg c1, TCGArg c2, int const_c2,
-                              TCGReg v1)
-{
-    tcg_out_cmp(s, c1, c2, const_c2, P_REXW);
-    tcg_out_cmov(s, cond, P_REXW, dest, v1);
-}
-#endif
-
 static void tcg_out_ctz(TCGContext *s, int rexw, TCGReg dest, TCGReg arg1,
                         TCGArg arg2, bool const_a2)
 {
@@ -2561,8 +2551,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     OP_32_64(setcond):
         tcg_out_setcond(s, rexw, args[3], a0, a1, a2, const_a2);
         break;
-    case INDEX_op_movcond_i32:
-        tcg_out_movcond32(s, args[5], a0, a1, a2, const_a2, args[3]);
+    OP_32_64(movcond):
+        tcg_out_movcond(s, rexw, args[5], a0, a1, a2, const_a2, args[3]);
         break;
 
     OP_32_64(bswap16):
@@ -2711,10 +2701,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
-    case INDEX_op_movcond_i64:
-        tcg_out_movcond64(s, args[5], a0, a1, a2, const_a2, args[3]);
-        break;
-
     case INDEX_op_bswap64_i64:
         tcg_out_bswap64(s, a0);
         break;
-- 
2.34.1




* [PATCH 20/24] tcg/i386: Add cf parameter to tcg_out_cmp
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (18 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 19/24] tcg/i386: Merge tcg_out_movcond{32,64} Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-11 10:26   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 21/24] tcg/i386: Use CMP+SBB in tcg_out_setcond Richard Henderson
                   ` (3 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Add a cf parameter which, when true, avoids the TEST shortcut, and pass it
along to tgen_arithi.  All current users pass false.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index b88fc14afd..56549ff2a0 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1418,15 +1418,15 @@ static void tcg_out_jxx(TCGContext *s, int opc, TCGLabel *l, bool small)
     }
 }
 
-static void tcg_out_cmp(TCGContext *s, TCGArg arg1, TCGArg arg2,
-                        int const_arg2, int rexw)
+static void tcg_out_cmp(TCGContext *s, int rexw, TCGArg arg1, TCGArg arg2,
+                        int const_arg2, bool cf)
 {
     if (const_arg2) {
-        if (arg2 == 0) {
+        if (arg2 == 0 && !cf) {
             /* test r, r */
             tcg_out_modrm(s, OPC_TESTL + rexw, arg1, arg1);
         } else {
-            tgen_arithi(s, ARITH_CMP + rexw, arg1, arg2, 0);
+            tgen_arithi(s, ARITH_CMP + rexw, arg1, arg2, cf);
         }
     } else {
         tgen_arithr(s, ARITH_CMP + rexw, arg1, arg2);
@@ -1437,7 +1437,7 @@ static void tcg_out_brcond(TCGContext *s, int rexw, TCGCond cond,
                            TCGArg arg1, TCGArg arg2, int const_arg2,
                            TCGLabel *label, bool small)
 {
-    tcg_out_cmp(s, arg1, arg2, const_arg2, rexw);
+    tcg_out_cmp(s, rexw, arg1, arg2, const_arg2, false);
     tcg_out_jxx(s, tcg_cond_to_jcc[cond], label, small);
 }
 
@@ -1528,7 +1528,7 @@ static void tcg_out_setcond(TCGContext *s, int rexw, TCGCond cond,
                             TCGArg dest, TCGArg arg1, TCGArg arg2,
                             int const_arg2)
 {
-    tcg_out_cmp(s, arg1, arg2, const_arg2, rexw);
+    tcg_out_cmp(s, rexw, arg1, arg2, const_arg2, false);
     tcg_out_modrm(s, OPC_SETCC | tcg_cond_to_jcc[cond], 0, dest);
     tcg_out_ext8u(s, dest, dest);
 }
@@ -1594,7 +1594,7 @@ static void tcg_out_movcond(TCGContext *s, int rexw, TCGCond cond,
                             TCGReg dest, TCGReg c1, TCGArg c2, int const_c2,
                             TCGReg v1)
 {
-    tcg_out_cmp(s, c1, c2, const_c2, rexw);
+    tcg_out_cmp(s, rexw, c1, c2, const_c2, false);
     tcg_out_cmov(s, cond, rexw, dest, v1);
 }
 
@@ -1637,7 +1637,7 @@ static void tcg_out_clz(TCGContext *s, int rexw, TCGReg dest, TCGReg arg1,
         tgen_arithi(s, ARITH_XOR + rexw, dest, rexw ? 63 : 31, 0);
 
         /* Since we have destroyed the flags from BSR, we have to re-test.  */
-        tcg_out_cmp(s, arg1, 0, 1, rexw);
+        tcg_out_cmp(s, rexw, arg1, 0, 1, false);
         tcg_out_cmov(s, TCG_COND_EQ, rexw, dest, arg2);
     }
 }
-- 
2.34.1




* [PATCH 21/24] tcg/i386: Use CMP+SBB in tcg_out_setcond
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (19 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 20/24] tcg/i386: Add cf parameter to tcg_out_cmp Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-11 12:07   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 22/24] tcg/i386: Clear dest first in tcg_out_setcond if possible Richard Henderson
                   ` (2 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Use the carry bit to optimize some forms of setcond.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 50 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 56549ff2a0..e06ac638b0 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1528,6 +1528,56 @@ static void tcg_out_setcond(TCGContext *s, int rexw, TCGCond cond,
                             TCGArg dest, TCGArg arg1, TCGArg arg2,
                             int const_arg2)
 {
+    bool inv = false;
+
+    switch (cond) {
+    case TCG_COND_NE:
+        inv = true;
+        /* fall through */
+    case TCG_COND_EQ:
+        /* If arg2 is 0, convert to LTU/GEU vs 1. */
+        if (const_arg2 && arg2 == 0) {
+            arg2 = 1;
+            goto do_ltu;
+        }
+        break;
+
+    case TCG_COND_LEU:
+        inv = true;
+        /* fall through */
+    case TCG_COND_GTU:
+        /* If arg2 is a register, swap for LTU/GEU. */
+        if (!const_arg2) {
+            TCGReg t = arg1;
+            arg1 = arg2;
+            arg2 = t;
+            goto do_ltu;
+        }
+        break;
+
+    case TCG_COND_GEU:
+        inv = true;
+        /* fall through */
+    case TCG_COND_LTU:
+    do_ltu:
+        /*
+         * Relying on the carry bit, use SBB to produce -1 if LTU, 0 if GEU.
+         * We can then use NEG or INC to produce the desired result.
+         * This is always smaller than the SETCC expansion.
+         */
+        tcg_out_cmp(s, rexw, arg1, arg2, const_arg2, true);
+        tgen_arithr(s, ARITH_SBB, dest, dest);              /* T:-1 F:0 */
+        if (inv) {
+            tgen_arithi(s, ARITH_ADD, dest, 1, 0);          /* T:0  F:1 */
+        } else {
+            tcg_out_modrm(s, OPC_GRP3_Ev, EXT3_NEG, dest);  /* T:1  F:0 */
+        }
+        return;
+
+    default:
+        break;
+    }
+
     tcg_out_cmp(s, rexw, arg1, arg2, const_arg2, false);
     tcg_out_modrm(s, OPC_SETCC | tcg_cond_to_jcc[cond], 0, dest);
     tcg_out_ext8u(s, dest, dest);
-- 
2.34.1
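
For readers following the flag arithmetic, the CMP+SBB sequence in the
patch above can be modelled in plain C.  This is only a sketch: the
function names (model_cmp_carry, model_sbb_self, model_setcond_ltu and
friends) are invented for illustration and are not part of QEMU, and the
carry flag is modelled as a plain bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: CMP a, b sets the carry flag iff a < b (unsigned). */
static inline bool model_cmp_carry(uint32_t a, uint32_t b)
{
    return a < b;
}

/* Hypothetical model: SBB dest, dest computes dest - dest - CF = -CF. */
static inline uint32_t model_sbb_self(bool cf)
{
    return cf ? UINT32_MAX : 0;
}

/*
 * Model of the do_ltu path: inv == false yields LTU, inv == true yields GEU.
 * Both produce a 0/1 result, as setcond requires.
 */
static inline uint32_t model_setcond_ltu(uint32_t a, uint32_t b, bool inv)
{
    uint32_t dest = model_sbb_self(model_cmp_carry(a, b)); /* T:-1 F:0 */

    if (inv) {
        dest += 1;          /* ADD 1: T:0 F:1 */
    } else {
        dest = 0u - dest;   /* NEG:   T:1 F:0 */
    }
    return dest;
}

/* EQ/NE against zero reuse the same path by comparing against 1. */
static inline uint32_t model_setcond_eq0(uint32_t a, bool ne)
{
    return model_setcond_ltu(a, 1, ne);
}

/* GTU/LEU with a register operand reuse it by swapping the operands. */
static inline uint32_t model_setcond_gtu(uint32_t a, uint32_t b, bool leu)
{
    return model_setcond_ltu(b, a, leu);
}
```

The T/F annotations mirror the comments in the patch and always refer to
the truth of the underlying LTU test, not of the inverted condition.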




* [PATCH 22/24] tcg/i386: Clear dest first in tcg_out_setcond if possible
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (20 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 21/24] tcg/i386: Use CMP+SBB in tcg_out_setcond Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-11 12:09   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 23/24] tcg/i386: Use shift in tcg_out_setcond Richard Henderson
  2023-08-08  3:11 ` [PATCH 24/24] tcg/i386: Implement negsetcond_* Richard Henderson
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Using XOR first is both smaller and more efficient,
though it cannot be applied if it clobbers an input.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index e06ac638b0..cca49fe63a 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1529,6 +1529,7 @@ static void tcg_out_setcond(TCGContext *s, int rexw, TCGCond cond,
                             int const_arg2)
 {
     bool inv = false;
+    bool cleared;
 
     switch (cond) {
     case TCG_COND_NE:
@@ -1578,9 +1579,23 @@ static void tcg_out_setcond(TCGContext *s, int rexw, TCGCond cond,
         break;
     }
 
+    /*
+     * If dest does not overlap the inputs, clearing it first is preferred.
+     * The XOR breaks any false dependency for the low-byte write to dest,
+     * and is also one byte smaller than MOVZBL.
+     */
+    cleared = false;
+    if (dest != arg1 && (const_arg2 || dest != arg2)) {
+        tgen_arithr(s, ARITH_XOR, dest, dest);
+        cleared = true;
+    }
+
     tcg_out_cmp(s, rexw, arg1, arg2, const_arg2, false);
     tcg_out_modrm(s, OPC_SETCC | tcg_cond_to_jcc[cond], 0, dest);
-    tcg_out_ext8u(s, dest, dest);
+
+    if (!cleared) {
+        tcg_out_ext8u(s, dest, dest);
+    }
 }
 
 #if TCG_TARGET_REG_BITS == 32
-- 
2.34.1
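
The overlap test that guards the early XOR can be read as a small
predicate.  The sketch below restates it in standalone C; the name
model_can_clear_first and the plain integer types are assumptions made
for illustration, not QEMU definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical restatement of the guard in the patch above.
 * dest and arg1 are register numbers.  arg2 is a register number only
 * when const_arg2 is false; otherwise it is an immediate, and a numeric
 * coincidence with dest does not indicate an overlap.
 */
static inline bool model_can_clear_first(int dest, int arg1,
                                         int64_t arg2, bool const_arg2)
{
    return dest != arg1 && (const_arg2 || dest != arg2);
}
```

When the predicate is false, clearing dest before the compare would
destroy one of the compare inputs, so the zero-extension after SETCC is
kept instead.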




* [PATCH 23/24] tcg/i386: Use shift in tcg_out_setcond
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (21 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 22/24] tcg/i386: Clear dest first in tcg_out_setcond if possible Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-11 12:10   ` Peter Maydell
  2023-08-08  3:11 ` [PATCH 24/24] tcg/i386: Implement negsetcond_* Richard Henderson
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

For LT/GE vs zero, shift down the sign bit.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index cca49fe63a..f68722b8a5 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1575,6 +1575,21 @@ static void tcg_out_setcond(TCGContext *s, int rexw, TCGCond cond,
         }
         return;
 
+    case TCG_COND_GE:
+        inv = true;
+        /* fall through */
+    case TCG_COND_LT:
+        /* If arg2 is 0, extract the sign bit. */
+        if (const_arg2 && arg2 == 0) {
+            tcg_out_mov(s, rexw ? TCG_TYPE_I64 : TCG_TYPE_I32, dest, arg1);
+            if (inv) {
+                tcg_out_modrm(s, OPC_GRP3_Ev + rexw, EXT3_NOT, dest);
+            }
+            tcg_out_shifti(s, SHIFT_SHR + rexw, dest, rexw ? 63 : 31);
+            return;
+        }
+        break;
+
     default:
         break;
     }
-- 
2.34.1
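
As a sanity check of the sign-bit trick in the patch above, here is a
minimal C model of the MOV / optional NOT / SHR sequence.  The names
model_setcond_sign32 and model_setcond_sign64 are hypothetical and exist
only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the LT/GE-versus-zero path for a 32-bit operand:
 * copy the input, optionally invert it, then shift the sign bit down to
 * bit 0 with a logical right shift.
 */
static inline uint32_t model_setcond_sign32(int32_t arg1, bool ge)
{
    uint32_t dest = (uint32_t)arg1;     /* MOV dest, arg1 */

    if (ge) {
        dest = ~dest;                   /* NOT: sign bit set iff arg1 >= 0 */
    }
    return dest >> 31;                  /* SHR 31: result is 0 or 1 */
}

/* The same idea for a 64-bit operand, shifting by 63. */
static inline uint64_t model_setcond_sign64(int64_t arg1, bool ge)
{
    uint64_t dest = (uint64_t)arg1;     /* MOV dest, arg1 */

    if (ge) {
        dest = ~dest;                   /* NOT */
    }
    return dest >> 63;                  /* SHR 63 */
}
```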




* [PATCH 24/24] tcg/i386: Implement negsetcond_*
  2023-08-08  3:11 [PATCH for-8.2 00/24] tcg: Introduce negsetcond opcodes Richard Henderson
                   ` (22 preceding siblings ...)
  2023-08-08  3:11 ` [PATCH 23/24] tcg/i386: Use shift in tcg_out_setcond Richard Henderson
@ 2023-08-08  3:11 ` Richard Henderson
  2023-08-11 12:13   ` Peter Maydell
  23 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-08  3:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.h     |  4 ++--
 tcg/i386/tcg-target.c.inc | 27 +++++++++++++++++++--------
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 41df0e5ae1..1a9025d786 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -156,7 +156,7 @@ typedef enum {
 #define TCG_TARGET_HAS_muls2_i32        1
 #define TCG_TARGET_HAS_muluh_i32        0
 #define TCG_TARGET_HAS_mulsh_i32        0
-#define TCG_TARGET_HAS_negsetcond_i32   0
+#define TCG_TARGET_HAS_negsetcond_i32   1
 
 #if TCG_TARGET_REG_BITS == 64
 /* Keep 32-bit values zero-extended in a register.  */
@@ -194,7 +194,7 @@ typedef enum {
 #define TCG_TARGET_HAS_muls2_i64        1
 #define TCG_TARGET_HAS_muluh_i64        0
 #define TCG_TARGET_HAS_mulsh_i64        0
-#define TCG_TARGET_HAS_negsetcond_i64   0
+#define TCG_TARGET_HAS_negsetcond_i64   1
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 #else
 #define TCG_TARGET_HAS_qemu_st8_i32     1
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index f68722b8a5..cc75653bb8 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1526,7 +1526,7 @@ static void tcg_out_brcond2(TCGContext *s, const TCGArg *args,
 
 static void tcg_out_setcond(TCGContext *s, int rexw, TCGCond cond,
                             TCGArg dest, TCGArg arg1, TCGArg arg2,
-                            int const_arg2)
+                            int const_arg2, bool neg)
 {
     bool inv = false;
     bool cleared;
@@ -1567,11 +1567,13 @@ static void tcg_out_setcond(TCGContext *s, int rexw, TCGCond cond,
          * This is always smaller than the SETCC expansion.
          */
         tcg_out_cmp(s, rexw, arg1, arg2, const_arg2, true);
-        tgen_arithr(s, ARITH_SBB, dest, dest);              /* T:-1 F:0 */
-        if (inv) {
-            tgen_arithi(s, ARITH_ADD, dest, 1, 0);          /* T:0  F:1 */
-        } else {
-            tcg_out_modrm(s, OPC_GRP3_Ev, EXT3_NEG, dest);  /* T:1  F:0 */
+        tgen_arithr(s, ARITH_SBB + (neg ? rexw : 0), dest, dest); /* T:-1 F:0 */
+        if (inv && neg) {
+            tcg_out_modrm(s, OPC_GRP3_Ev + rexw, EXT3_NOT, dest); /* T:0 F:-1 */
+        } else if (inv) {
+            tgen_arithi(s, ARITH_ADD, dest, 1, 0);                /* T:0  F:1 */
+        } else if (!neg) {
+            tcg_out_modrm(s, OPC_GRP3_Ev, EXT3_NEG, dest);        /* T:1  F:0 */
         }
         return;
 
@@ -1585,7 +1587,8 @@ static void tcg_out_setcond(TCGContext *s, int rexw, TCGCond cond,
             if (inv) {
                 tcg_out_modrm(s, OPC_GRP3_Ev + rexw, EXT3_NOT, dest);
             }
-            tcg_out_shifti(s, SHIFT_SHR + rexw, dest, rexw ? 63 : 31);
+            tcg_out_shifti(s, (neg ? SHIFT_SAR : SHIFT_SHR) + rexw,
+                           dest, rexw ? 63 : 31);
             return;
         }
         break;
@@ -1611,6 +1614,9 @@ static void tcg_out_setcond(TCGContext *s, int rexw, TCGCond cond,
     if (!cleared) {
         tcg_out_ext8u(s, dest, dest);
     }
+    if (neg) {
+        tcg_out_modrm(s, OPC_GRP3_Ev + rexw, EXT3_NEG, dest);
+    }
 }
 
 #if TCG_TARGET_REG_BITS == 32
@@ -2629,7 +2635,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                        arg_label(args[3]), 0);
         break;
     OP_32_64(setcond):
-        tcg_out_setcond(s, rexw, args[3], a0, a1, a2, const_a2);
+        tcg_out_setcond(s, rexw, args[3], a0, a1, a2, const_a2, false);
+        break;
+    OP_32_64(negsetcond):
+        tcg_out_setcond(s, rexw, args[3], a0, a1, a2, const_a2, true);
         break;
     OP_32_64(movcond):
         tcg_out_movcond(s, rexw, args[5], a0, a1, a2, const_a2, args[3]);
@@ -3357,6 +3366,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 
     case INDEX_op_setcond_i32:
     case INDEX_op_setcond_i64:
+    case INDEX_op_negsetcond_i32:
+    case INDEX_op_negsetcond_i64:
         return C_O1_I2(q, r, re);
 
     case INDEX_op_movcond_i32:
-- 
2.34.1
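
Since this patch carries no description, a small C model of how the neg
flag changes each path may help.  It is a sketch under stated
assumptions: the function names are invented, only the 32-bit case is
shown, and SAR is modelled without relying on C's implementation-defined
right shift of negative values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the carry path (CMP; SBB dest, dest; fix-up). */
static inline uint32_t model_ltu_path(uint32_t a, uint32_t b,
                                      bool inv, bool neg)
{
    uint32_t dest = (a < b) ? UINT32_MAX : 0;   /* SBB: T:-1 F:0 */

    if (inv && neg) {
        dest = ~dest;           /* NOT:   T:0  F:-1 */
    } else if (inv) {
        dest += 1;              /* ADD 1: T:0  F:1  */
    } else if (!neg) {
        dest = 0u - dest;       /* NEG:   T:1  F:0  */
    }
    /* For neg without inv, the SBB result is already the answer. */
    return dest;
}

/* Hypothetical model of the LT/GE-versus-zero path (MOV; NOT; shift). */
static inline uint32_t model_sign_path(int32_t arg1, bool ge, bool neg)
{
    uint32_t dest = (uint32_t)arg1;
    uint32_t sign;

    if (ge) {
        dest = ~dest;
    }
    sign = dest >> 31;                      /* SHR 31: 0 or 1 */
    return neg ? 0u - sign : sign;          /* SAR 31 replicates the bit */
}

/* Hypothetical model of the generic fallback: SETCC, then NEG if neg. */
static inline uint32_t model_generic_path(bool cond_true, bool neg)
{
    uint32_t dest = cond_true ? 1 : 0;

    return neg ? 0u - dest : dest;
}
```

In every path the neg variant returns either 0 or all-ones, which is
the negsetcond contract described in the cover letter.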




* Re: [PATCH 09/24] target/tricore: Replace gen_cond_w with tcg_gen_negsetcond_tl
  2023-08-08  3:11 ` [PATCH 09/24] target/tricore: Replace gen_cond_w with tcg_gen_negsetcond_tl Richard Henderson
@ 2023-08-08 15:42   ` Bastian Koppelmann
  0 siblings, 0 replies; 59+ messages in thread
From: Bastian Koppelmann @ 2023-08-08 15:42 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Mon, Aug 07, 2023 at 08:11:28PM -0700, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/tricore/translate.c | 16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/target/tricore/translate.c b/target/tricore/translate.c
> index 1947733870..6ae5ccbf72 100644
> --- a/target/tricore/translate.c
> +++ b/target/tricore/translate.c

Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Cheers,
Bastian



* Re: [PATCH 02/24] tcg: Use tcg_gen_negsetcond_*
  2023-08-08  3:11 ` [PATCH 02/24] tcg: Use tcg_gen_negsetcond_* Richard Henderson
@ 2023-08-08 15:55   ` Peter Maydell
  2023-08-08 16:04     ` Richard Henderson
  2023-08-10 16:13   ` Peter Maydell
  1 sibling, 1 reply; 59+ messages in thread
From: Peter Maydell @ 2023-08-08 15:55 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:13, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/tcg-op-gvec.c | 6 ++----
>  tcg/tcg-op.c      | 6 ++----
>  2 files changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
> index a062239804..e260a07c61 100644
> --- a/tcg/tcg-op-gvec.c
> +++ b/tcg/tcg-op-gvec.c
> @@ -3692,8 +3692,7 @@ static void expand_cmp_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
>      for (i = 0; i < oprsz; i += 4) {
>          tcg_gen_ld_i32(t0, cpu_env, aofs + i);
>          tcg_gen_ld_i32(t1, cpu_env, bofs + i);
> -        tcg_gen_setcond_i32(cond, t0, t0, t1);
> -        tcg_gen_neg_i32(t0, t0);
> +        tcg_gen_negsetcond_i32(cond, t0, t0, t1);
>          tcg_gen_st_i32(t0, cpu_env, dofs + i);
>      }

Is it not possible for the optimizer to notice "you did
a setcond followed by a neg, let me turn it into negsetcond
for you" ?

thanks
-- PMM



* Re: [PATCH 02/24] tcg: Use tcg_gen_negsetcond_*
  2023-08-08 15:55   ` Peter Maydell
@ 2023-08-08 16:04     ` Richard Henderson
  0 siblings, 0 replies; 59+ messages in thread
From: Richard Henderson @ 2023-08-08 16:04 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On 8/8/23 08:55, Peter Maydell wrote:
> On Tue, 8 Aug 2023 at 04:13, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   tcg/tcg-op-gvec.c | 6 ++----
>>   tcg/tcg-op.c      | 6 ++----
>>   2 files changed, 4 insertions(+), 8 deletions(-)
>>
>> diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
>> index a062239804..e260a07c61 100644
>> --- a/tcg/tcg-op-gvec.c
>> +++ b/tcg/tcg-op-gvec.c
>> @@ -3692,8 +3692,7 @@ static void expand_cmp_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
>>       for (i = 0; i < oprsz; i += 4) {
>>           tcg_gen_ld_i32(t0, cpu_env, aofs + i);
>>           tcg_gen_ld_i32(t1, cpu_env, bofs + i);
>> -        tcg_gen_setcond_i32(cond, t0, t0, t1);
>> -        tcg_gen_neg_i32(t0, t0);
>> +        tcg_gen_negsetcond_i32(cond, t0, t0, t1);
>>           tcg_gen_st_i32(t0, cpu_env, dofs + i);
>>       }
> 
> Is it not possible for the optimizer to notice "you did
> a setcond followed by a neg, let me turn it into negsetcond
> for you" ?

Not at present, no.  That sort of peephole optimization is a bit more
difficult than what we do at present.


r~
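
For context, the setcond+neg pair under discussion computes a per-lane
mask.  The following C sketch models that for 32-bit lanes with the
condition fixed to signed less-than; the function names are hypothetical
and the loop is only a scalar stand-in for the expansion quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one lane: setcond followed by neg. */
static inline uint32_t model_negsetcond_lt32(int32_t a, int32_t b)
{
    uint32_t t0 = (a < b);  /* setcond: 1 if true, 0 if false */

    return 0u - t0;         /* neg: 0xffffffff if true, 0 if false */
}

/* Hypothetical model of the loop: one mask per 32-bit lane. */
static inline void model_expand_cmp_lt32(uint32_t *d, const int32_t *a,
                                         const int32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        d[i] = model_negsetcond_lt32(a[i], b[i]);
    }
}
```

A fused negsetcond produces the same mask in one opcode, which is what
the series emits directly at translation time.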




* Re: [PATCH 14/24] tcg/riscv: Implement negsetcond_*
  2023-08-08  3:11 ` [PATCH 14/24] tcg/riscv: Implement negsetcond_* Richard Henderson
@ 2023-08-08 16:47   ` Daniel Henrique Barboza
  0 siblings, 0 replies; 59+ messages in thread
From: Daniel Henrique Barboza @ 2023-08-08 16:47 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x



On 8/8/23 00:11, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>

>   tcg/riscv/tcg-target.h     |  4 ++--
>   tcg/riscv/tcg-target.c.inc | 45 ++++++++++++++++++++++++++++++++++++++
>   2 files changed, 47 insertions(+), 2 deletions(-)
> 
> diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
> index b2961fec8e..7e8ac48a7d 100644
> --- a/tcg/riscv/tcg-target.h
> +++ b/tcg/riscv/tcg-target.h
> @@ -120,7 +120,7 @@ extern bool have_zbb;
>   #define TCG_TARGET_HAS_ctpop_i32        have_zbb
>   #define TCG_TARGET_HAS_brcond2          1
>   #define TCG_TARGET_HAS_setcond2         1
> -#define TCG_TARGET_HAS_negsetcond_i32   0
> +#define TCG_TARGET_HAS_negsetcond_i32   1
>   #define TCG_TARGET_HAS_qemu_st8_i32     0
>   
>   #define TCG_TARGET_HAS_movcond_i64      1
> @@ -159,7 +159,7 @@ extern bool have_zbb;
>   #define TCG_TARGET_HAS_muls2_i64        0
>   #define TCG_TARGET_HAS_muluh_i64        1
>   #define TCG_TARGET_HAS_mulsh_i64        1
> -#define TCG_TARGET_HAS_negsetcond_i64   0
> +#define TCG_TARGET_HAS_negsetcond_i64   1
>   
>   #define TCG_TARGET_HAS_qemu_ldst_i128   0
>   
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index eeaeb6b6e3..232b616af3 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -936,6 +936,44 @@ static void tcg_out_setcond(TCGContext *s, TCGCond cond, TCGReg ret,
>       }
>   }
>   
> +static void tcg_out_negsetcond(TCGContext *s, TCGCond cond, TCGReg ret,
> +                               TCGReg arg1, tcg_target_long arg2, bool c2)
> +{
> +    int tmpflags;
> +    TCGReg tmp;
> +
> +    /* For LT/GE comparison against 0, replicate the sign bit. */
> +    if (c2 && arg2 == 0) {
> +        switch (cond) {
> +        case TCG_COND_GE:
> +            tcg_out_opc_imm(s, OPC_XORI, ret, arg1, -1);
> +            arg1 = ret;
> +            /* fall through */
> +        case TCG_COND_LT:
> +            tcg_out_opc_imm(s, OPC_SRAI, ret, arg1, TCG_TARGET_REG_BITS - 1);
> +            return;
> +        default:
> +            break;
> +        }
> +    }
> +
> +    tmpflags = tcg_out_setcond_int(s, cond, ret, arg1, arg2, c2);
> +    tmp = tmpflags & ~SETCOND_FLAGS;
> +
> +    /* If intermediate result is zero/non-zero: test != 0. */
> +    if (tmpflags & SETCOND_NEZ) {
> +        tcg_out_opc_reg(s, OPC_SLTU, ret, TCG_REG_ZERO, tmp);
> +        tmp = ret;
> +    }
> +
> +    /* Produce the 0/-1 result. */
> +    if (tmpflags & SETCOND_INV) {
> +        tcg_out_opc_imm(s, OPC_ADDI, ret, tmp, -1);
> +    } else {
> +        tcg_out_opc_reg(s, OPC_SUB, ret, TCG_REG_ZERO, tmp);
> +    }
> +}
> +
>   static void tcg_out_movcond_zicond(TCGContext *s, TCGReg ret, TCGReg test_ne,
>                                      int val1, bool c_val1,
>                                      int val2, bool c_val2)
> @@ -1782,6 +1820,11 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
>           tcg_out_setcond(s, args[3], a0, a1, a2, c2);
>           break;
>   
> +    case INDEX_op_negsetcond_i32:
> +    case INDEX_op_negsetcond_i64:
> +        tcg_out_negsetcond(s, args[3], a0, a1, a2, c2);
> +        break;
> +
>       case INDEX_op_movcond_i32:
>       case INDEX_op_movcond_i64:
>           tcg_out_movcond(s, args[5], a0, a1, a2, c2,
> @@ -1910,6 +1953,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
>       case INDEX_op_xor_i64:
>       case INDEX_op_setcond_i32:
>       case INDEX_op_setcond_i64:
> +    case INDEX_op_negsetcond_i32:
> +    case INDEX_op_negsetcond_i64:
>           return C_O1_I2(r, r, rI);
>   
>       case INDEX_op_andc_i32:
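
The tail of tcg_out_negsetcond quoted above can be modelled in a few
lines of C.  This is a sketch, assuming that the NEZ flag means the
intermediate is zero/non-zero rather than 0/1 and that the INV flag
means the intermediate is the logical inverse of the condition; the
MODEL_* constants and the function name are invented and their values
are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the SETCOND_INV / SETCOND_NEZ flags. */
enum { MODEL_INV = 1, MODEL_NEZ = 2 };

/* Hypothetical model of the final steps, given the intermediate tmp. */
static inline uint64_t model_negsetcond_finish(uint64_t tmp, int flags)
{
    if (flags & MODEL_NEZ) {
        tmp = (0 < tmp);        /* SLTU ret, zero, tmp: normalise to 0/1 */
    }
    if (flags & MODEL_INV) {
        return tmp - 1;         /* ADDI ret, tmp, -1: 0 -> -1, 1 -> 0 */
    }
    return 0 - tmp;             /* SUB ret, zero, tmp: 0 -> 0, 1 -> -1 */
}
```

Under those assumptions, the inverted case needs no separate XOR: a
single add of -1 both inverts and negates.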



* Re: [PATCH 07/24] target/ppc: Use tcg_gen_negsetcond_*
  2023-08-08  3:11 ` [PATCH 07/24] target/ppc: " Richard Henderson
@ 2023-08-08 16:51   ` Daniel Henrique Barboza
  2023-08-15 12:54   ` Nicholas Piggin
  1 sibling, 0 replies; 59+ messages in thread
From: Daniel Henrique Barboza @ 2023-08-08 16:51 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x



On 8/8/23 00:11, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>


>   target/ppc/translate/fixedpoint-impl.c.inc | 6 ++++--
>   target/ppc/translate/vmx-impl.c.inc        | 8 +++-----
>   2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/target/ppc/translate/fixedpoint-impl.c.inc b/target/ppc/translate/fixedpoint-impl.c.inc
> index f47f1a50e8..4ce02fd3a4 100644
> --- a/target/ppc/translate/fixedpoint-impl.c.inc
> +++ b/target/ppc/translate/fixedpoint-impl.c.inc
> @@ -342,12 +342,14 @@ static bool do_set_bool_cond(DisasContext *ctx, arg_X_bi *a, bool neg, bool rev)
>       uint32_t mask = 0x08 >> (a->bi & 0x03);
>       TCGCond cond = rev ? TCG_COND_EQ : TCG_COND_NE;
>       TCGv temp = tcg_temp_new();
> +    TCGv zero = tcg_constant_tl(0);
>   
>       tcg_gen_extu_i32_tl(temp, cpu_crf[a->bi >> 2]);
>       tcg_gen_andi_tl(temp, temp, mask);
> -    tcg_gen_setcondi_tl(cond, cpu_gpr[a->rt], temp, 0);
>       if (neg) {
> -        tcg_gen_neg_tl(cpu_gpr[a->rt], cpu_gpr[a->rt]);
> +        tcg_gen_negsetcond_tl(cond, cpu_gpr[a->rt], temp, zero);
> +    } else {
> +        tcg_gen_setcond_tl(cond, cpu_gpr[a->rt], temp, zero);
>       }
>       return true;
>   }
> diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
> index c8712dd7d8..6d7669aabd 100644
> --- a/target/ppc/translate/vmx-impl.c.inc
> +++ b/target/ppc/translate/vmx-impl.c.inc
> @@ -1341,8 +1341,7 @@ static bool trans_VCMPEQUQ(DisasContext *ctx, arg_VC *a)
>       tcg_gen_xor_i64(t1, t0, t1);
>   
>       tcg_gen_or_i64(t1, t1, t2);
> -    tcg_gen_setcondi_i64(TCG_COND_EQ, t1, t1, 0);
> -    tcg_gen_neg_i64(t1, t1);
> +    tcg_gen_negsetcond_i64(TCG_COND_EQ, t1, t1, tcg_constant_i64(0));
>   
>       set_avr64(a->vrt, t1, true);
>       set_avr64(a->vrt, t1, false);
> @@ -1365,15 +1364,14 @@ static bool do_vcmpgtq(DisasContext *ctx, arg_VC *a, bool sign)
>   
>       get_avr64(t0, a->vra, false);
>       get_avr64(t1, a->vrb, false);
> -    tcg_gen_setcond_i64(TCG_COND_GTU, t2, t0, t1);
> +    tcg_gen_negsetcond_i64(TCG_COND_GTU, t2, t0, t1);
>   
>       get_avr64(t0, a->vra, true);
>       get_avr64(t1, a->vrb, true);
>       tcg_gen_movcond_i64(TCG_COND_EQ, t2, t0, t1, t2, tcg_constant_i64(0));
> -    tcg_gen_setcond_i64(sign ? TCG_COND_GT : TCG_COND_GTU, t1, t0, t1);
> +    tcg_gen_negsetcond_i64(sign ? TCG_COND_GT : TCG_COND_GTU, t1, t0, t1);
>   
>       tcg_gen_or_i64(t1, t1, t2);
> -    tcg_gen_neg_i64(t1, t1);
>   
>       set_avr64(a->vrt, t1, true);
>       set_avr64(a->vrt, t1, false);
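
The do_vcmpgtq change quoted above is easier to check with a scalar
model.  The sketch below assumes that the boolean argument of get_avr64
selects the high doubleword when true, and it uses an invented function
name; the signed comparison relies on the usual two's complement
conversion from uint64_t to int64_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: returns all-ones if the 128-bit value a_hi:a_lo
 * is greater than b_hi:b_lo, and zero otherwise.
 */
static inline uint64_t model_vcmpgtq(uint64_t a_hi, uint64_t a_lo,
                                     uint64_t b_hi, uint64_t b_lo,
                                     bool sign)
{
    /* negsetcond GTU on the low halves: already a 0/-1 mask. */
    uint64_t t2 = (a_lo > b_lo) ? UINT64_MAX : 0;
    uint64_t t1;
    bool hi_gt;

    /* movcond EQ: the low halves only matter when the high halves tie. */
    t2 = (a_hi == b_hi) ? t2 : 0;

    /* negsetcond GT or GTU on the high halves. */
    if (sign) {
        hi_gt = (int64_t)a_hi > (int64_t)b_hi;
    } else {
        hi_gt = a_hi > b_hi;
    }
    t1 = hi_gt ? UINT64_MAX : 0;

    /* Both operands are masks, so OR needs no trailing negate. */
    return t1 | t2;
}
```

Because both partial results are already masks, the final neg from the
old code becomes unnecessary, which matches the removed line in the
diff.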



* Re: [PATCH 10/24] tcg/ppc: Implement negsetcond_*
  2023-08-08  3:11 ` [PATCH 10/24] tcg/ppc: Implement negsetcond_* Richard Henderson
@ 2023-08-08 16:55   ` Daniel Henrique Barboza
  0 siblings, 0 replies; 59+ messages in thread
From: Daniel Henrique Barboza @ 2023-08-08 16:55 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x



On 8/8/23 00:11, Richard Henderson wrote:
> In the general case we simply negate.  However with isel we
> may load -1 instead of 1 with no extra effort.
> 
> Consolidate EQ0 and NE0 logic.  Replace the NE0 zero-extension
> with inversion+negation of EQ0, which is never worse and may
> eliminate one insn.  Provide a special case for -EQ0.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>

>   tcg/ppc/tcg-target.h     |   4 +-
>   tcg/ppc/tcg-target.c.inc | 127 ++++++++++++++++++++++++---------------
>   2 files changed, 82 insertions(+), 49 deletions(-)
> 
> diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
> index ba4fd3eb3a..a143b8f1e0 100644
> --- a/tcg/ppc/tcg-target.h
> +++ b/tcg/ppc/tcg-target.h
> @@ -101,7 +101,7 @@ typedef enum {
>   #define TCG_TARGET_HAS_muls2_i32        0
>   #define TCG_TARGET_HAS_muluh_i32        1
>   #define TCG_TARGET_HAS_mulsh_i32        1
> -#define TCG_TARGET_HAS_negsetcond_i32   0
> +#define TCG_TARGET_HAS_negsetcond_i32   1
>   #define TCG_TARGET_HAS_qemu_st8_i32     0
>   
>   #if TCG_TARGET_REG_BITS == 64
> @@ -142,7 +142,7 @@ typedef enum {
>   #define TCG_TARGET_HAS_muls2_i64        0
>   #define TCG_TARGET_HAS_muluh_i64        1
>   #define TCG_TARGET_HAS_mulsh_i64        1
> -#define TCG_TARGET_HAS_negsetcond_i64   0
> +#define TCG_TARGET_HAS_negsetcond_i64   1
>   #endif
>   
>   #define TCG_TARGET_HAS_qemu_ldst_i128   \
> diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
> index 511e14b180..10448aa0e6 100644
> --- a/tcg/ppc/tcg-target.c.inc
> +++ b/tcg/ppc/tcg-target.c.inc
> @@ -1548,8 +1548,20 @@ static void tcg_out_cmp(TCGContext *s, int cond, TCGArg arg1, TCGArg arg2,
>   }
>   
>   static void tcg_out_setcond_eq0(TCGContext *s, TCGType type,
> -                                TCGReg dst, TCGReg src)
> +                                TCGReg dst, TCGReg src, bool neg)
>   {
> +    if (neg && (TCG_TARGET_REG_BITS == 32 || type == TCG_TYPE_I64)) {
> +        /*
> +         * X != 0 implies X + -1 generates a carry.
> +         * RT = (~X + X) + CA
> +         *    = -1 + CA
> +         *    = CA ? 0 : -1
> +         */
> +        tcg_out32(s, ADDIC | TAI(TCG_REG_R0, src, -1));
> +        tcg_out32(s, SUBFE | TAB(dst, src, src));
> +        return;
> +    }
> +
>       if (type == TCG_TYPE_I32) {
>           tcg_out32(s, CNTLZW | RS(src) | RA(dst));
>           tcg_out_shri32(s, dst, dst, 5);
> @@ -1557,18 +1569,28 @@ static void tcg_out_setcond_eq0(TCGContext *s, TCGType type,
>           tcg_out32(s, CNTLZD | RS(src) | RA(dst));
>           tcg_out_shri64(s, dst, dst, 6);
>       }
> +    if (neg) {
> +        tcg_out32(s, NEG | RT(dst) | RA(dst));
> +    }
>   }
>   
> -static void tcg_out_setcond_ne0(TCGContext *s, TCGReg dst, TCGReg src)
> +static void tcg_out_setcond_ne0(TCGContext *s, TCGType type,
> +                                TCGReg dst, TCGReg src, bool neg)
>   {
> -    /* X != 0 implies X + -1 generates a carry.  Extra addition
> -       trickery means: R = X-1 + ~X + C = X-1 + (-X+1) + C = C.  */
> -    if (dst != src) {
> -        tcg_out32(s, ADDIC | TAI(dst, src, -1));
> -        tcg_out32(s, SUBFE | TAB(dst, dst, src));
> -    } else {
> +    if (!neg && (TCG_TARGET_REG_BITS == 32 || type == TCG_TYPE_I64)) {
> +        /*
> +         * X != 0 implies X + -1 generates a carry.  Extra addition
> +         * trickery means: R = X-1 + ~X + C = X-1 + (-X+1) + C = C.
> +         */
>           tcg_out32(s, ADDIC | TAI(TCG_REG_R0, src, -1));
>           tcg_out32(s, SUBFE | TAB(dst, TCG_REG_R0, src));
> +        return;
> +    }
> +    tcg_out_setcond_eq0(s, type, dst, src, false);
> +    if (neg) {
> +        tcg_out32(s, ADDI | TAI(dst, dst, -1));
> +    } else {
> +        tcg_out_xori32(s, dst, dst, 1);
>       }
>   }
>   
> @@ -1590,9 +1612,10 @@ static TCGReg tcg_gen_setcond_xor(TCGContext *s, TCGReg arg1, TCGArg arg2,
>   
>   static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
>                               TCGArg arg0, TCGArg arg1, TCGArg arg2,
> -                            int const_arg2)
> +                            int const_arg2, bool neg)
>   {
> -    int crop, sh;
> +    int sh;
> +    bool inv;
>   
>       tcg_debug_assert(TCG_TARGET_REG_BITS == 64 || type == TCG_TYPE_I32);
>   
> @@ -1605,14 +1628,10 @@ static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
>       if (arg2 == 0) {
>           switch (cond) {
>           case TCG_COND_EQ:
> -            tcg_out_setcond_eq0(s, type, arg0, arg1);
> +            tcg_out_setcond_eq0(s, type, arg0, arg1, neg);
>               return;
>           case TCG_COND_NE:
> -            if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I32) {
> -                tcg_out_ext32u(s, TCG_REG_R0, arg1);
> -                arg1 = TCG_REG_R0;
> -            }
> -            tcg_out_setcond_ne0(s, arg0, arg1);
> +            tcg_out_setcond_ne0(s, type, arg0, arg1, neg);
>               return;
>           case TCG_COND_GE:
>               tcg_out32(s, NOR | SAB(arg1, arg0, arg1));
> @@ -1621,9 +1640,17 @@ static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
>           case TCG_COND_LT:
>               /* Extract the sign bit.  */
>               if (type == TCG_TYPE_I32) {
> -                tcg_out_shri32(s, arg0, arg1, 31);
> +                if (neg) {
> +                    tcg_out_sari32(s, arg0, arg1, 31);
> +                } else {
> +                    tcg_out_shri32(s, arg0, arg1, 31);
> +                }
>               } else {
> -                tcg_out_shri64(s, arg0, arg1, 63);
> +                if (neg) {
> +                    tcg_out_sari64(s, arg0, arg1, 63);
> +                } else {
> +                    tcg_out_shri64(s, arg0, arg1, 63);
> +                }
>               }
>               return;
>           default:
> @@ -1641,7 +1668,7 @@ static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
>   
>           isel = tcg_to_isel[cond];
>   
> -        tcg_out_movi(s, type, arg0, 1);
> +        tcg_out_movi(s, type, arg0, neg ? -1 : 1);
>           if (isel & 1) {
>               /* arg0 = (bc ? 0 : 1) */
>               tab = TAB(arg0, 0, arg0);
> @@ -1655,51 +1682,47 @@ static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
>           return;
>       }
>   
> +    inv = false;
>       switch (cond) {
>       case TCG_COND_EQ:
>           arg1 = tcg_gen_setcond_xor(s, arg1, arg2, const_arg2);
> -        tcg_out_setcond_eq0(s, type, arg0, arg1);
> -        return;
> +        tcg_out_setcond_eq0(s, type, arg0, arg1, neg);
> +        break;
>   
>       case TCG_COND_NE:
>           arg1 = tcg_gen_setcond_xor(s, arg1, arg2, const_arg2);
> -        /* Discard the high bits only once, rather than both inputs.  */
> -        if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I32) {
> -            tcg_out_ext32u(s, TCG_REG_R0, arg1);
> -            arg1 = TCG_REG_R0;
> -        }
> -        tcg_out_setcond_ne0(s, arg0, arg1);
> -        return;
> +        tcg_out_setcond_ne0(s, type, arg0, arg1, neg);
> +        break;
>   
> +    case TCG_COND_LE:
> +    case TCG_COND_LEU:
> +        inv = true;
> +        /* fall through */
>       case TCG_COND_GT:
>       case TCG_COND_GTU:
> -        sh = 30;
> -        crop = 0;
> -        goto crtest;
> -
> -    case TCG_COND_LT:
> -    case TCG_COND_LTU:
> -        sh = 29;
> -        crop = 0;
> +        sh = 30; /* CR7 CR_GT */
>           goto crtest;
>   
>       case TCG_COND_GE:
>       case TCG_COND_GEU:
> -        sh = 31;
> -        crop = CRNOR | BT(7, CR_EQ) | BA(7, CR_LT) | BB(7, CR_LT);
> +        inv = true;
> +        /* fall through */
> +    case TCG_COND_LT:
> +    case TCG_COND_LTU:
> +        sh = 29; /* CR7 CR_LT */
>           goto crtest;
>   
> -    case TCG_COND_LE:
> -    case TCG_COND_LEU:
> -        sh = 31;
> -        crop = CRNOR | BT(7, CR_EQ) | BA(7, CR_GT) | BB(7, CR_GT);
>       crtest:
>           tcg_out_cmp(s, cond, arg1, arg2, const_arg2, 7, type);
> -        if (crop) {
> -            tcg_out32(s, crop);
> -        }
>           tcg_out32(s, MFOCRF | RT(TCG_REG_R0) | FXM(7));
>           tcg_out_rlw(s, RLWINM, arg0, TCG_REG_R0, sh, 31, 31);
> +        if (neg && inv) {
> +            tcg_out32(s, ADDI | TAI(arg0, arg0, -1));
> +        } else if (neg) {
> +            tcg_out32(s, NEG | RT(arg0) | RA(arg0));
> +        } else if (inv) {
> +            tcg_out_xori32(s, arg0, arg0, 1);
> +        }
>           break;
>   
>       default:
> @@ -2982,11 +3005,19 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
>   
>       case INDEX_op_setcond_i32:
>           tcg_out_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1], args[2],
> -                        const_args[2]);
> +                        const_args[2], false);
>           break;
>       case INDEX_op_setcond_i64:
>           tcg_out_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1], args[2],
> -                        const_args[2]);
> +                        const_args[2], false);
> +        break;
> +    case INDEX_op_negsetcond_i32:
> +        tcg_out_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1], args[2],
> +                        const_args[2], true);
> +        break;
> +    case INDEX_op_negsetcond_i64:
> +        tcg_out_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1], args[2],
> +                        const_args[2], true);
>           break;
>       case INDEX_op_setcond2_i32:
>           tcg_out_setcond2(s, args, const_args);
> @@ -3724,6 +3755,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
>       case INDEX_op_rotl_i32:
>       case INDEX_op_rotr_i32:
>       case INDEX_op_setcond_i32:
> +    case INDEX_op_negsetcond_i32:
>       case INDEX_op_and_i64:
>       case INDEX_op_andc_i64:
>       case INDEX_op_shl_i64:
> @@ -3732,6 +3764,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
>       case INDEX_op_rotl_i64:
>       case INDEX_op_rotr_i64:
>       case INDEX_op_setcond_i64:
> +    case INDEX_op_negsetcond_i64:
>           return C_O1_I2(r, r, ri);
>   
>       case INDEX_op_mul_i32:
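
For anyone checking the carry arithmetic above, here is a small host-side C model. It is only a sketch: the helper names are invented for illustration, the model is fixed at 64 bits, and it assumes the usual reading of ADDIC (sets CA from the carry-out) and SUBFE (RT = ~RA + RB + CA). The patch only takes the carry path when the operation covers the full register width, presumably because CA is computed over the whole register.

```c
#include <assert.h>
#include <stdint.h>

/* Model of ADDIC rt,ra,-1: returns ra - 1 and stores the carry-out in *ca. */
static uint64_t addic_m1(uint64_t ra, unsigned *ca)
{
    *ca = (ra != 0);            /* ra + 0xff..ff carries unless ra == 0 */
    return ra - 1;
}

/* Model of SUBFE rt,ra,rb: rt = ~ra + rb + CA. */
static uint64_t subfe(uint64_t ra, uint64_t rb, unsigned ca)
{
    return ~ra + rb + ca;
}

/* negsetcond(EQ, x, 0): ADDIC r0,x,-1 ; SUBFE dst,x,x gives x == 0 ? -1 : 0. */
static uint64_t negsetcond_eq0(uint64_t x)
{
    unsigned ca;
    (void)addic_m1(x, &ca);
    return subfe(x, x, ca);     /* ~x + x + CA = -1 + CA */
}

/* setcond(NE, x, 0): ADDIC r0,x,-1 ; SUBFE dst,r0,x gives x != 0 ? 1 : 0. */
static uint64_t setcond_ne0(uint64_t x)
{
    unsigned ca;
    uint64_t r0 = addic_m1(x, &ca);
    return subfe(r0, x, ca);    /* ~(x-1) + x + CA = -x + x + CA = CA */
}

/* Fix-ups applied to a 0/1 result in the fallback paths. */
static uint64_t fix_neg(uint64_t bit)     { return -bit; }     /* NEG:     1 -> -1        */
static uint64_t fix_inv(uint64_t bit)     { return bit ^ 1; }  /* XORI 1:  0 <-> 1        */
static uint64_t fix_neg_inv(uint64_t bit) { return bit - 1; }  /* ADDI -1: 1 -> 0, 0 -> -1 */
```

The last helper is why the combined negate-and-invert case costs a single ADDI rather than an XORI followed by a NEG.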



* Re: [PATCH 11/24] tcg/ppc: Use the Set Boolean Extension
  2023-08-08  3:11 ` [PATCH 11/24] tcg/ppc: Use the Set Boolean Extension Richard Henderson
@ 2023-08-08 16:56   ` Daniel Henrique Barboza
  2023-08-15 13:16   ` Nicholas Piggin
  1 sibling, 0 replies; 59+ messages in thread
From: Daniel Henrique Barboza @ 2023-08-08 16:56 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x



On 8/8/23 00:11, Richard Henderson wrote:
> The SETBC family of instructions requires exactly two insns for
> all comparisons, saving 0-3 insns per (neg)setcond.

Nice.

> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---


Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>

>   tcg/ppc/tcg-target.c.inc | 22 ++++++++++++++++++++++
>   1 file changed, 22 insertions(+)
> 
> diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
> index 10448aa0e6..090f11e71c 100644
> --- a/tcg/ppc/tcg-target.c.inc
> +++ b/tcg/ppc/tcg-target.c.inc
> @@ -447,6 +447,11 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
>   #define TW     XO31( 4)
>   #define TRAP   (TW | TO(31))
>   
> +#define SETBC    XO31(384)  /* v3.10 */
> +#define SETBCR   XO31(416)  /* v3.10 */
> +#define SETNBC   XO31(448)  /* v3.10 */
> +#define SETNBCR  XO31(480)  /* v3.10 */
> +
>   #define NOP    ORI  /* ori 0,0,0 */
>   
>   #define LVX        XO31(103)
> @@ -1624,6 +1629,23 @@ static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
>           arg2 = (uint32_t)arg2;
>       }
>   
> +    /* With SETBC/SETBCR, we can always implement with 2 insns. */
> +    if (have_isa_3_10) {
> +        tcg_insn_unit bi, opc;
> +
> +        tcg_out_cmp(s, cond, arg1, arg2, const_arg2, 7, type);
> +
> +        /* Re-use tcg_to_bc for BI and BO_COND_{TRUE,FALSE}. */
> +        bi = tcg_to_bc[cond] & (0x1f << 16);
> +        if (tcg_to_bc[cond] & BO(8)) {
> +            opc = neg ? SETNBC : SETBC;
> +        } else {
> +            opc = neg ? SETNBCR : SETBCR;
> +        }
> +        tcg_out32(s, opc | RT(arg0) | bi);
> +        return;
> +    }
> +
>       /* Handle common and trivial cases before handling anything else.  */
>       if (arg2 == 0) {
>           switch (cond) {
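
A rough C model of the opcode selection above, in case it helps review. The enum and function names are hypothetical, and the instruction semantics are my reading of the v3.10 SETBC family (set 1 or -1 if the CR bit is set, with the R forms reversing the test); treat it as a sketch rather than a reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { OP_SETBC, OP_SETBCR, OP_SETNBC, OP_SETNBCR } SetbOp;

/*
 * Mirror the selection in the patch: a "branch if true" encoding picks
 * the plain form, a "branch if false" encoding picks the reversed form.
 */
static SetbOp pick_setb(bool branch_if_true, bool neg)
{
    if (branch_if_true) {
        return neg ? OP_SETNBC : OP_SETBC;
    }
    return neg ? OP_SETNBCR : OP_SETBCR;
}

/* Assumed result of each instruction for a given value of the tested CR bit. */
static int64_t exec_setb(SetbOp op, bool cr_bit)
{
    switch (op) {
    case OP_SETBC:
        return cr_bit ? 1 : 0;
    case OP_SETBCR:
        return cr_bit ? 0 : 1;
    case OP_SETNBC:
        return cr_bit ? -1 : 0;
    default: /* OP_SETNBCR */
        return cr_bit ? 0 : -1;
    }
}
```

Under these assumptions every (condition sense, neg) pair maps to exactly one instruction after the compare, which matches the "always 2 insns" comment in the patch.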



* Re: [PATCH 01/24] tcg: Introduce negsetcond opcodes
  2023-08-08  3:11 ` [PATCH 01/24] " Richard Henderson
@ 2023-08-10 16:12   ` Peter Maydell
  2023-08-10 16:39     ` Richard Henderson
  0 siblings, 1 reply; 59+ messages in thread
From: Peter Maydell @ 2023-08-10 16:12 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:12, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Introduce a new opcode for negative setcond.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>




> +static bool fold_negsetcond(OptContext *ctx, TCGOp *op)
> +{
> +    TCGCond cond = op->args[3];
> +    int i;
> +
> +    if (swap_commutative(op->args[0], &op->args[1], &op->args[2])) {
> +        op->args[3] = cond = tcg_swap_cond(cond);
> +    }
> +
> +    i = do_constant_folding_cond(ctx->type, op->args[1], op->args[2], cond);
> +    if (i >= 0) {
> +        return tcg_opt_gen_movi(ctx, op, op->args[0], -i);
> +    }
> +
> +    /* Value is {0,-1} so all bits are repititions of the sign. */

"repetitions"

> +    ctx->s_mask = -1;

Do we not also need to set z_mask to something here (presumably -1)?
(I'm not very familiar with the optimizer internals.)

> +    return false;
> +}

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 02/24] tcg: Use tcg_gen_negsetcond_*
  2023-08-08  3:11 ` [PATCH 02/24] tcg: Use tcg_gen_negsetcond_* Richard Henderson
  2023-08-08 15:55   ` Peter Maydell
@ 2023-08-10 16:13   ` Peter Maydell
  1 sibling, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-10 16:13 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:13, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 03/24] target/alpha: Use tcg_gen_movcond_i64 in gen_fold_mzero
  2023-08-08  3:11 ` [PATCH 03/24] target/alpha: Use tcg_gen_movcond_i64 in gen_fold_mzero Richard Henderson
@ 2023-08-10 16:19   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-10 16:19 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:14, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The setcond + neg + and sequence is a complex method of
> performing a conditional move.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/alpha/translate.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/target/alpha/translate.c b/target/alpha/translate.c
> index 846f3d8091..0839182a1f 100644
> --- a/target/alpha/translate.c
> +++ b/target/alpha/translate.c
> @@ -517,10 +517,9 @@ static void gen_fold_mzero(TCGCond cond, TCGv dest, TCGv src)
>
>      case TCG_COND_GE:
>      case TCG_COND_LT:
> -        /* For >= or <, map -0.0 to +0.0 via comparison and mask.  */
> -        tcg_gen_setcondi_i64(TCG_COND_NE, dest, src, mzero);
> -        tcg_gen_neg_i64(dest, dest);
> -        tcg_gen_and_i64(dest, dest, src);
> +        /* For >= or <, map -0.0 to +0.0. */
> +        tcg_gen_movcond_i64(TCG_COND_NE, dest, src, tcg_constant_i64(mzero),
> +                            src, tcg_constant_i64(0));
>          break;
>
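
The equivalence claimed in the commit message can be checked on plain integers. A minimal sketch, assuming mzero is the bit pattern of -0.0 (sign bit only); the function names are made up for illustration and are not from the tree.

```c
#include <assert.h>
#include <stdint.h>

#define MZERO (UINT64_C(1) << 63)   /* assumed bit pattern of -0.0 */

/* Old sequence: setcond(NE) + neg + and. */
static uint64_t fold_mzero_mask(uint64_t src)
{
    uint64_t m = (src != MZERO);    /* 0 or 1 */
    m = -m;                         /* 0 or all-ones */
    return m & src;
}

/* New sequence: a single conditional move. */
static uint64_t fold_mzero_movcond(uint64_t src)
{
    return src != MZERO ? src : 0;
}
```

Both map -0.0 to +0.0 and pass every other bit pattern through unchanged.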

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 04/24] target/arm: Use tcg_gen_negsetcond_*
  2023-08-08  3:11 ` [PATCH 04/24] target/arm: Use tcg_gen_negsetcond_* Richard Henderson
@ 2023-08-10 16:22   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-10 16:22 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:15, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/tcg/translate-a64.c | 22 +++++++++-------------
>  target/arm/tcg/translate.c     | 12 ++++--------
>  2 files changed, 13 insertions(+), 21 deletions(-)
>
> diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
> index 5fa1257d32..ac16593699 100644
> --- a/target/arm/tcg/translate-a64.c
> +++ b/target/arm/tcg/translate-a64.c
> @@ -4935,9 +4935,12 @@ static void disas_cond_select(DisasContext *s, uint32_t insn)
>
>      if (rn == 31 && rm == 31 && (else_inc ^ else_inv)) {
>          /* CSET & CSETM.  */
> -        tcg_gen_setcond_i64(tcg_invert_cond(c.cond), tcg_rd, c.value, zero);
>          if (else_inv) {
> -            tcg_gen_neg_i64(tcg_rd, tcg_rd);
> +            tcg_gen_negsetcond_i64(tcg_invert_cond(c.cond),
> +                                   tcg_rd, c.value, zero);
> +        } else {
> +            tcg_gen_setcond_i64(tcg_invert_cond(c.cond),
> +                                tcg_rd, c.value, zero);
>          }
>      } else {
>          TCGv_i64 t_true = cpu_reg(s, rn);
> @@ -8670,13 +8673,10 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u,
>          }
>          break;
>      case 0x6: /* CMGT, CMHI */
> -        /* 64 bit integer comparison, result = test ? (2^64 - 1) : 0.
> -         * We implement this using setcond (test) and then negating.
> -         */
>          cond = u ? TCG_COND_GTU : TCG_COND_GT;
>      do_cmop:
> -        tcg_gen_setcond_i64(cond, tcg_rd, tcg_rn, tcg_rm);
> -        tcg_gen_neg_i64(tcg_rd, tcg_rd);
> +        /* 64 bit integer comparison, result = test ? -1 : 0. */
> +        tcg_gen_negsetcond_i64(cond, tcg_rd, tcg_rn, tcg_rm);
>          break;
>      case 0x7: /* CMGE, CMHS */
>          cond = u ? TCG_COND_GEU : TCG_COND_GE;
> @@ -9265,14 +9265,10 @@ static void handle_2misc_64(DisasContext *s, int opcode, bool u,
>          }
>          break;
>      case 0xa: /* CMLT */
> -        /* 64 bit integer comparison against zero, result is
> -         * test ? (2^64 - 1) : 0. We implement via setcond(!test) and
> -         * subtracting 1.
> -         */
> +        /* 64 bit integer comparison against zero, result is test ? 1 : 0. */

surely "-1" ?

>          cond = TCG_COND_LT;
>      do_cmop:
> -        tcg_gen_setcondi_i64(cond, tcg_rd, tcg_rn, 0);
> -        tcg_gen_neg_i64(tcg_rd, tcg_rd);
> +        tcg_gen_negsetcond_i64(cond, tcg_rd, tcg_rn, tcg_constant_i64(0));
>          break;
>      case 0x8: /* CMGT, CMGE */
>          cond = u ? TCG_COND_GE : TCG_COND_GT;
> diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
> index b71ac2d0d5..31d3130e4c 100644
> --- a/target/arm/tcg/translate.c
> +++ b/target/arm/tcg/translate.c
> @@ -2946,13 +2946,11 @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
>  #define GEN_CMP0(NAME, COND)                                            \
>      static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a)               \
>      {                                                                   \
> -        tcg_gen_setcondi_i32(COND, d, a, 0);                            \
> -        tcg_gen_neg_i32(d, d);                                          \
> +        tcg_gen_negsetcond_i32(COND, d, a, tcg_constant_i32(0));        \
>      }                                                                   \
>      static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a)               \
>      {                                                                   \
> -        tcg_gen_setcondi_i64(COND, d, a, 0);                            \
> -        tcg_gen_neg_i64(d, d);                                          \
> +        tcg_gen_negsetcond_i64(COND, d, a, tcg_constant_i64(0));        \
>      }                                                                   \
>      static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \
>      {                                                                   \
> @@ -3863,15 +3861,13 @@ void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
>  static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
>  {
>      tcg_gen_and_i32(d, a, b);
> -    tcg_gen_setcondi_i32(TCG_COND_NE, d, d, 0);
> -    tcg_gen_neg_i32(d, d);
> +    tcg_gen_negsetcond_i32(TCG_COND_NE, d, d, tcg_constant_i32(0));
>  }
>
>  void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
>  {
>      tcg_gen_and_i64(d, a, b);
> -    tcg_gen_setcondi_i64(TCG_COND_NE, d, d, 0);
> -    tcg_gen_neg_i64(d, d);
> +    tcg_gen_negsetcond_i64(TCG_COND_NE, d, d, tcg_constant_i64(0));
>  }
>
>  static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
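
As a sanity check on the gen_cmtst_* change, here is the same transformation on host integers. This is a sketch only: setcond_ne and negsetcond_ne are stand-ins written for this example, not the TCG generators themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for setcond(NE): produces 0 or 1. */
static uint64_t setcond_ne(uint64_t a, uint64_t b)
{
    return a != b;
}

/* Stand-in for negsetcond(NE): produces 0 or -1 (all-ones). */
static uint64_t negsetcond_ne(uint64_t a, uint64_t b)
{
    return a != b ? UINT64_MAX : 0;
}

/* Old CMTST expansion: and + setcond + neg. */
static uint64_t cmtst_old(uint64_t a, uint64_t b)
{
    uint64_t d = a & b;
    d = setcond_ne(d, 0);
    return -d;
}

/* New CMTST expansion: and + negsetcond. */
static uint64_t cmtst_new(uint64_t a, uint64_t b)
{
    return negsetcond_ne(a & b, 0);
}
```

The result is all-ones when the inputs share any set bit and zero otherwise, which is also why the CMLT comment should read "test ? -1 : 0".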

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 05/24] target/m68k: Use tcg_gen_negsetcond_*
  2023-08-08  3:11 ` [PATCH 05/24] target/m68k: " Richard Henderson
@ 2023-08-10 16:24   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-10 16:24 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:17, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/m68k/translate.c | 24 ++++++++++--------------
>  1 file changed, 10 insertions(+), 14 deletions(-)
>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 06/24] target/openrisc: Use tcg_gen_negsetcond_*
  2023-08-08  3:11 ` [PATCH 06/24] target/openrisc: " Richard Henderson
@ 2023-08-10 16:24   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-10 16:24 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:13, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/openrisc/translate.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 08/24] target/sparc: Use tcg_gen_movcond_i64 in gen_edge
  2023-08-08  3:11 ` [PATCH 08/24] target/sparc: Use tcg_gen_movcond_i64 in gen_edge Richard Henderson
@ 2023-08-10 16:29   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-10 16:29 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:14, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The setcond + neg + or sequence is a complex method of
> performing a conditional move.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/sparc/translate.c | 17 ++++-------------
>  1 file changed, 4 insertions(+), 13 deletions(-)
>
> diff --git a/target/sparc/translate.c b/target/sparc/translate.c
> index bd877a5e4a..fa80a91161 100644
> --- a/target/sparc/translate.c
> +++ b/target/sparc/translate.c
> @@ -2916,7 +2916,7 @@ static void gen_edge(DisasContext *dc, TCGv dst, TCGv s1, TCGv s2,
>
>      tcg_gen_shr_tl(lo1, tcg_constant_tl(tabl), lo1);
>      tcg_gen_shr_tl(lo2, tcg_constant_tl(tabr), lo2);
> -    tcg_gen_andi_tl(dst, lo1, omask);
> +    tcg_gen_andi_tl(lo1, lo1, omask);
>      tcg_gen_andi_tl(lo2, lo2, omask);
>
>      amask = -8;
> @@ -2926,18 +2926,9 @@ static void gen_edge(DisasContext *dc, TCGv dst, TCGv s1, TCGv s2,
>      tcg_gen_andi_tl(s1, s1, amask);
>      tcg_gen_andi_tl(s2, s2, amask);
>
> -    /* We want to compute
> -        dst = (s1 == s2 ? lo1 : lo1 & lo2).
> -       We've already done dst = lo1, so this reduces to
> -        dst &= (s1 == s2 ? -1 : lo2)
> -       Which we perform by
> -        lo2 |= -(s1 == s2)
> -        dst &= lo2
> -    */
> -    tcg_gen_setcond_tl(TCG_COND_EQ, lo1, s1, s2);
> -    tcg_gen_neg_tl(lo1, lo1);
> -    tcg_gen_or_tl(lo2, lo2, lo1);
> -    tcg_gen_and_tl(dst, dst, lo2);
> +    /* Compute dst = (s1 == s2 ? lo1 : lo1 & lo2). */
> +    tcg_gen_and_tl(lo2, lo2, lo1);
> +    tcg_gen_movcond_tl(TCG_COND_EQ, dst, s1, s2, lo1, lo2);
>  }
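
The tail of gen_edge can be compared before and after on host integers without knowing what the instruction is for. A minimal sketch with made-up function names, covering only the last few operations shown in the hunk:

```c
#include <assert.h>
#include <stdint.h>

/* Old tail: dst = lo1; lo2 |= -(s1 == s2); dst &= lo2. */
static uint64_t edge_old(uint64_t s1, uint64_t s2, uint64_t lo1, uint64_t lo2)
{
    uint64_t dst = lo1;
    uint64_t m = -(uint64_t)(s1 == s2);   /* 0 or all-ones */
    lo2 |= m;
    return dst & lo2;
}

/* New tail: lo2 &= lo1; dst = (s1 == s2 ? lo1 : lo2). */
static uint64_t edge_new(uint64_t s1, uint64_t s2, uint64_t lo1, uint64_t lo2)
{
    lo2 &= lo1;
    return s1 == s2 ? lo1 : lo2;
}
```

Both compute dst = (s1 == s2 ? lo1 : lo1 & lo2), as the new comment states.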

I'm glad I didn't have to figure out exactly what this
exciting instruction actually does to review this :-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 12/24] tcg/aarch64: Implement negsetcond_*
  2023-08-08  3:11 ` [PATCH 12/24] tcg/aarch64: Implement negsetcond_* Richard Henderson
@ 2023-08-10 16:39   ` Peter Maydell
  2023-08-10 16:55     ` Richard Henderson
  0 siblings, 1 reply; 59+ messages in thread
From: Peter Maydell @ 2023-08-10 16:39 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:13, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Trivial, as aarch64 has an instruction for this: CSETM.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/aarch64/tcg-target.h     |  4 ++--
>  tcg/aarch64/tcg-target.c.inc | 12 ++++++++++++
>  2 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
> index 6080fddf73..e3faa9cff4 100644
> --- a/tcg/aarch64/tcg-target.h
> +++ b/tcg/aarch64/tcg-target.h
> @@ -94,7 +94,7 @@ typedef enum {
>  #define TCG_TARGET_HAS_mulsh_i32        0
>  #define TCG_TARGET_HAS_extrl_i64_i32    0
>  #define TCG_TARGET_HAS_extrh_i64_i32    0
> -#define TCG_TARGET_HAS_negsetcond_i32   0
> +#define TCG_TARGET_HAS_negsetcond_i32   1
>  #define TCG_TARGET_HAS_qemu_st8_i32     0
>
>  #define TCG_TARGET_HAS_div_i64          1
> @@ -130,7 +130,7 @@ typedef enum {
>  #define TCG_TARGET_HAS_muls2_i64        0
>  #define TCG_TARGET_HAS_muluh_i64        1
>  #define TCG_TARGET_HAS_mulsh_i64        1
> -#define TCG_TARGET_HAS_negsetcond_i64   0
> +#define TCG_TARGET_HAS_negsetcond_i64   1
>
>  /*
>   * Without FEAT_LSE2, we must use LDXP+STXP to implement atomic 128-bit load,
> diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
> index 35ca80cd56..7d8d114c9e 100644
> --- a/tcg/aarch64/tcg-target.c.inc
> +++ b/tcg/aarch64/tcg-target.c.inc
> @@ -2262,6 +2262,16 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
>                       TCG_REG_XZR, tcg_invert_cond(args[3]));
>          break;
>
> +    case INDEX_op_negsetcond_i32:
> +        a2 = (int32_t)a2;
> +        /* FALLTHRU */

I see this is what we already do for setcond and movcond,
but how does it work when the 2nd input is a register?
Or is reg-reg guaranteed to always use the _i64 op?

> +    case INDEX_op_negsetcond_i64:
> +        tcg_out_cmp(s, ext, a1, a2, c2);
> +        /* Use CSETM alias of CSINV Wd, WZR, WZR, invert(cond).  */
> +        tcg_out_insn(s, 3506, CSINV, ext, a0, TCG_REG_XZR,
> +                     TCG_REG_XZR, tcg_invert_cond(args[3]));
> +        break;
> +
>      case INDEX_op_movcond_i32:
>          a2 = (int32_t)a2;
>          /* FALLTHRU */
> @@ -2868,6 +2878,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
>      case INDEX_op_sub_i64:
>      case INDEX_op_setcond_i32:
>      case INDEX_op_setcond_i64:
> +    case INDEX_op_negsetcond_i32:
> +    case INDEX_op_negsetcond_i64:
>          return C_O1_I2(r, r, rA);
>
>      case INDEX_op_mul_i32:
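
To spell out the CSETM alias used above: as I understand the instruction, CSINV Rd, Rn, Rm, cond yields Rn when cond holds and ~Rm otherwise, so with both sources being the zero register and the condition inverted the result is all-ones exactly when the original condition holds. A small C model, with hypothetical helper names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed semantics of CSINV Rd, Rn, Rm, cond. */
static uint64_t csinv(bool cond, uint64_t rn, uint64_t rm)
{
    return cond ? rn : ~rm;
}

/* CSETM Rd, cond == CSINV Rd, XZR, XZR, invert(cond). */
static uint64_t csetm(bool cond)
{
    return csinv(!cond, 0, 0);
}
```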

thanks
-- PMM



* Re: [PATCH 01/24] tcg: Introduce negsetcond opcodes
  2023-08-10 16:12   ` Peter Maydell
@ 2023-08-10 16:39     ` Richard Henderson
  0 siblings, 0 replies; 59+ messages in thread
From: Richard Henderson @ 2023-08-10 16:39 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On 8/10/23 09:12, Peter Maydell wrote:
>> +    ctx->s_mask = -1;
> 
> Do we not also need to set z_mask to something here (presumably -1)?
> (I'm not very familiar with the optimizer internals.)

It is set to -1 by default before folding all operations.
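
For context, a toy model of the folding step quoted in the review. It is a sketch under assumptions: fold_cond_lt stands in for do_constant_folding_cond and only handles one condition, and the mask reading (z_mask = bits that may be nonzero, s_mask = bits known to repeat the sign) is my understanding of the optimizer, not something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: 0 or 1 when both inputs are known, -1 otherwise. */
static int fold_cond_lt(bool known, int64_t a, int64_t b)
{
    return known ? (a < b) : -1;
}

/* Folded negsetcond result: the 0/1 truth value is negated to 0/-1. */
static int64_t fold_negsetcond_value(int i)
{
    return -(int64_t)i;
}

/* True if every bit of v is a copy of the sign bit, i.e. v is 0 or -1. */
static bool all_bits_repeat_sign(int64_t v)
{
    return v == 0 || v == -1;
}
```

When the fold fails (i < 0), the result is still known to be 0 or -1, so marking every bit as a sign repetition is safe while nothing can be said about which bits are zero.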


r~



* Re: [PATCH 13/24] tcg/arm: Implement negsetcond_i32
  2023-08-08  3:11 ` [PATCH 13/24] tcg/arm: Implement negsetcond_i32 Richard Henderson
@ 2023-08-10 16:41   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-10 16:41 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:14, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Trivial, as we simply need to load a different constant
> in the conditional move.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/arm/tcg-target.h     | 2 +-
>  tcg/arm/tcg-target.c.inc | 9 +++++++++
>  2 files changed, 10 insertions(+), 1 deletion(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 12/24] tcg/aarch64: Implement negsetcond_*
  2023-08-10 16:39   ` Peter Maydell
@ 2023-08-10 16:55     ` Richard Henderson
  2023-08-10 16:58       ` Peter Maydell
  0 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-10 16:55 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On 8/10/23 09:39, Peter Maydell wrote:
>> +    case INDEX_op_negsetcond_i32:
>> +        a2 = (int32_t)a2;
>> +        /* FALLTHRU */
> 
> I see this is what we already do for setcond and movcond,
> but how does it work when the 2nd input is a register?
> Or is reg-reg guaranteed to always use the _i64 op?

For reg-reg, a2 < 31, so the sign-extend does nothing.
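
To illustrate the point with a sketch: the same argument slot holds either a small register number or a constant, and the cast only changes values that have bit 31 set. The function name is made up, uint64_t stands in for TCGArg on a 64-bit host, and the narrowing conversion relies on the usual two's-complement wraparound.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "a2 = (int32_t)a2;" applied to a 64-bit argument slot. */
static uint64_t arg_as_i32(uint64_t a2)
{
    return (uint64_t)(int64_t)(int32_t)a2;
}
```

A register number passes through unchanged, while a 32-bit constant such as 0xffffffff becomes -1 as a 64-bit value, which is what the compare against an immediate wants.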


r~



* Re: [PATCH 12/24] tcg/aarch64: Implement negsetcond_*
  2023-08-10 16:55     ` Richard Henderson
@ 2023-08-10 16:58       ` Peter Maydell
  2023-08-10 17:01         ` Richard Henderson
  0 siblings, 1 reply; 59+ messages in thread
From: Peter Maydell @ 2023-08-10 16:58 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Thu, 10 Aug 2023 at 17:55, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 8/10/23 09:39, Peter Maydell wrote:
> >> +    case INDEX_op_negsetcond_i32:
> >> +        a2 = (int32_t)a2;
> >> +        /* FALLTHRU */
> >
> > I see this is what we already do for setcond and movcond,
> > but how does it work when the 2nd input is a register?
> > Or is reg-reg guaranteed to always use the _i64 op?
>
> For reg-reg, a2 < 31, so the sign-extend does nothing.

OK. Do we document somewhere what a TCGArg is?

Anyway,
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 12/24] tcg/aarch64: Implement negsetcond_*
  2023-08-10 16:58       ` Peter Maydell
@ 2023-08-10 17:01         ` Richard Henderson
  0 siblings, 0 replies; 59+ messages in thread
From: Richard Henderson @ 2023-08-10 17:01 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On 8/10/23 09:58, Peter Maydell wrote:
> On Thu, 10 Aug 2023 at 17:55, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> On 8/10/23 09:39, Peter Maydell wrote:
>>>> +    case INDEX_op_negsetcond_i32:
>>>> +        a2 = (int32_t)a2;
>>>> +        /* FALLTHRU */
>>>
>>> I see this is what we already do for setcond and movcond,
>>> but how does it work when the 2nd input is a register?
>>> Or is reg-reg guaranteed to always use the _i64 op?
>>
>> For reg-reg, a2 < 31, so the sign-extend does nothing.
> 
> OK. Do we document somewhere what a TCGArg is?

No.  I should do that...


r~




* Re: [PATCH 17/24] tcg/i386: Merge tcg_out_brcond{32,64}
  2023-08-08  3:11 ` [PATCH 17/24] tcg/i386: Merge tcg_out_brcond{32,64} Richard Henderson
@ 2023-08-11 10:20   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-11 10:20 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:12, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Pass a rexw parameter instead of duplicating the functions.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/i386/tcg-target.c.inc | 110 +++++++++++++++++---------------------
>  1 file changed, 49 insertions(+), 61 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 18/24] tcg/i386: Merge tcg_out_setcond{32,64}
  2023-08-08  3:11 ` [PATCH 18/24] tcg/i386: Merge tcg_out_setcond{32,64} Richard Henderson
@ 2023-08-11 10:21   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-11 10:21 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:14, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Pass a rexw parameter instead of duplicating the functions.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 19/24] tcg/i386: Merge tcg_out_movcond{32,64}
  2023-08-08  3:11 ` [PATCH 19/24] tcg/i386: Merge tcg_out_movcond{32,64} Richard Henderson
@ 2023-08-11 10:22   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-11 10:22 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:16, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Pass a rexw parameter instead of duplicating the functions.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 20/24] tcg/i386: Add cf parameter to tcg_out_cmp
  2023-08-08  3:11 ` [PATCH 20/24] tcg/i386: Add cf parameter to tcg_out_cmp Richard Henderson
@ 2023-08-11 10:26   ` Peter Maydell
  2023-08-11 10:45     ` Peter Maydell
  0 siblings, 1 reply; 59+ messages in thread
From: Peter Maydell @ 2023-08-11 10:26 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:13, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Add the parameter to avoid TEST and pass along to tgen_arithi.
> All current users pass false.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/i386/tcg-target.c.inc | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
> index b88fc14afd..56549ff2a0 100644
> --- a/tcg/i386/tcg-target.c.inc
> +++ b/tcg/i386/tcg-target.c.inc
> @@ -1418,15 +1418,15 @@ static void tcg_out_jxx(TCGContext *s, int opc, TCGLabel *l, bool small)
>      }
>  }
>
> -static void tcg_out_cmp(TCGContext *s, TCGArg arg1, TCGArg arg2,
> -                        int const_arg2, int rexw)
> +static void tcg_out_cmp(TCGContext *s, int rexw, TCGArg arg1, TCGArg arg2,
> +                        int const_arg2, bool cf)
>  {
>      if (const_arg2) {
> -        if (arg2 == 0) {
> +        if (arg2 == 0 && !cf) {
>              /* test r, r */
>              tcg_out_modrm(s, OPC_TESTL + rexw, arg1, arg1);
>          } else {
> -            tgen_arithi(s, ARITH_CMP + rexw, arg1, arg2, 0);
> +            tgen_arithi(s, ARITH_CMP + rexw, arg1, arg2, cf);
>          }
>      } else {
>          tgen_arithr(s, ARITH_CMP + rexw, arg1, arg2);

I don't really understand the motivation here.
Why are some uses of this function fine with using the TEST
insn, but some must avoid it? What does 'cf' stand for?
A comment would help here if there isn't a clearer argument
name available...

thanks
-- PMM



* Re: [PATCH 20/24] tcg/i386: Add cf parameter to tcg_out_cmp
  2023-08-11 10:26   ` Peter Maydell
@ 2023-08-11 10:45     ` Peter Maydell
  2023-08-11 15:06       ` Richard Henderson
  0 siblings, 1 reply; 59+ messages in thread
From: Peter Maydell @ 2023-08-11 10:45 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Fri, 11 Aug 2023 at 11:26, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Tue, 8 Aug 2023 at 04:13, Richard Henderson
> <richard.henderson@linaro.org> wrote:
> >
> > Add the parameter to avoid TEST and pass along to tgen_arithi.
> > All current users pass false.
> >
> > Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> > ---
> >  tcg/i386/tcg-target.c.inc | 16 ++++++++--------
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
> > index b88fc14afd..56549ff2a0 100644
> > --- a/tcg/i386/tcg-target.c.inc
> > +++ b/tcg/i386/tcg-target.c.inc
> > @@ -1418,15 +1418,15 @@ static void tcg_out_jxx(TCGContext *s, int opc, TCGLabel *l, bool small)
> >      }
> >  }
> >
> > -static void tcg_out_cmp(TCGContext *s, TCGArg arg1, TCGArg arg2,
> > -                        int const_arg2, int rexw)
> > +static void tcg_out_cmp(TCGContext *s, int rexw, TCGArg arg1, TCGArg arg2,
> > +                        int const_arg2, bool cf)
> >  {
> >      if (const_arg2) {
> > -        if (arg2 == 0) {
> > +        if (arg2 == 0 && !cf) {
> >              /* test r, r */
> >              tcg_out_modrm(s, OPC_TESTL + rexw, arg1, arg1);
> >          } else {
> > -            tgen_arithi(s, ARITH_CMP + rexw, arg1, arg2, 0);
> > +            tgen_arithi(s, ARITH_CMP + rexw, arg1, arg2, cf);
> >          }
> >      } else {
> >          tgen_arithr(s, ARITH_CMP + rexw, arg1, arg2);
>
> I don't really understand the motivation here.
> Why are some uses of this function fine with using the TEST
> insn, but some must avoid it? What does 'cf' stand for?
> A comment would help here if there isn't a clearer argument
> name available...

Looking at the following patch suggests perhaps:

/**
 * tcg_out_cmp: Emit a compare, setting the X, Y, Z flags accordingly.
 * @need_cf : true if the comparison must also set CF
 */

(fill in which XYZ flags you can rely on even if need_cf is false)

?

-- PMM



* Re: [PATCH 21/24] tcg/i386: Use CMP+SBB in tcg_out_setcond
  2023-08-08  3:11 ` [PATCH 21/24] tcg/i386: Use CMP+SBB in tcg_out_setcond Richard Henderson
@ 2023-08-11 12:07   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-11 12:07 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:16, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Use the carry bit to optimize some forms of setcond.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM
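
The patch body is not quoted in this reply, so the following is only a
sketch of the general CMP+SBB idiom, not the code from the patch. After
"cmp a, b" the carry flag holds the unsigned borrow (a < b), and
"sbb r, r" computes r - r - CF, which is 0 or all-ones. The model_*
names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of "sbb r, r": r - r - CF, i.e. 0 if CF is clear, all-ones if set. */
static uint32_t model_sbb_same(bool cf)
{
    return 0u - (uint32_t)cf;
}

/* negsetcond LTU modelled as: cmp a, b ; sbb r, r */
static uint32_t model_negsetcond_ltu(uint32_t a, uint32_t b)
{
    bool cf = a < b;    /* CF after "cmp a, b" is the unsigned borrow */
    return model_sbb_same(cf);
}

/* setcond LTU modelled as the same pair followed by "neg r" (-1 becomes 1). */
static uint32_t model_setcond_ltu(uint32_t a, uint32_t b)
{
    return 0u - model_negsetcond_ltu(a, b);
}
```

In this model the -1 form falls out of the carry directly, and the 0/1
form costs one extra negate.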



* Re: [PATCH 22/24] tcg/i386: Clear dest first in tcg_out_setcond if possible
  2023-08-08  3:11 ` [PATCH 22/24] tcg/i386: Clear dest first in tcg_out_setcond if possible Richard Henderson
@ 2023-08-11 12:09   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-11 12:09 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:16, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Using XOR first is both smaller and more efficient,
> though it cannot be applied if it clobbers an input.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 23/24] tcg/i386: Use shift in tcg_out_setcond
  2023-08-08  3:11 ` [PATCH 23/24] tcg/i386: Use shift in tcg_out_setcond Richard Henderson
@ 2023-08-11 12:10   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-11 12:10 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:13, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> For LT/GE vs zero, shift down the sign bit.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM
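
As a rough illustration of "shift down the sign bit" (a sketch of the
arithmetic only, since the patch body is not quoted here): for a 32-bit
value, x < 0 is exactly bit 31 of x, and x >= 0 is bit 31 of ~x. The
model_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* setcond LT vs zero: the result is the sign bit moved to bit 0. */
static uint32_t model_setcond_lt0(int32_t x)
{
    return (uint32_t)x >> 31;
}

/* setcond GE vs zero: invert first, then move the sign bit to bit 0. */
static uint32_t model_setcond_ge0(int32_t x)
{
    return (~(uint32_t)x) >> 31;
}
```

Neither form needs a compare, so no flags are involved at all.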



* Re: [PATCH 24/24] tcg/i386: Implement negsetcond_*
  2023-08-08  3:11 ` [PATCH 24/24] tcg/i386: Implement negsetcond_* Richard Henderson
@ 2023-08-11 12:13   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-11 12:13 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:15, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/i386/tcg-target.h     |  4 ++--
>  tcg/i386/tcg-target.c.inc | 27 +++++++++++++++++++--------
>  2 files changed, 21 insertions(+), 10 deletions(-)
>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 16/24] tcg/sparc64: Implement negsetcond_*
  2023-08-08  3:11 ` [PATCH 16/24] tcg/sparc64: " Richard Henderson
@ 2023-08-11 12:24   ` Peter Maydell
  0 siblings, 0 replies; 59+ messages in thread
From: Peter Maydell @ 2023-08-11 12:24 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue, 8 Aug 2023 at 04:14, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 20/24] tcg/i386: Add cf parameter to tcg_out_cmp
  2023-08-11 10:45     ` Peter Maydell
@ 2023-08-11 15:06       ` Richard Henderson
  2023-08-12 17:21         ` Richard Henderson
  0 siblings, 1 reply; 59+ messages in thread
From: Richard Henderson @ 2023-08-11 15:06 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On 8/11/23 03:45, Peter Maydell wrote:
> On Fri, 11 Aug 2023 at 11:26, Peter Maydell <peter.maydell@linaro.org> wrote:
>>
>> On Tue, 8 Aug 2023 at 04:13, Richard Henderson
>> <richard.henderson@linaro.org> wrote:
>>>
>>> Add the parameter to avoid TEST and pass along to tgen_arithi.
>>> All current users pass false.
>>>
>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>> ---
>>>   tcg/i386/tcg-target.c.inc | 16 ++++++++--------
>>>   1 file changed, 8 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
>>> index b88fc14afd..56549ff2a0 100644
>>> --- a/tcg/i386/tcg-target.c.inc
>>> +++ b/tcg/i386/tcg-target.c.inc
>>> @@ -1418,15 +1418,15 @@ static void tcg_out_jxx(TCGContext *s, int opc, TCGLabel *l, bool small)
>>>       }
>>>   }
>>>
>>> -static void tcg_out_cmp(TCGContext *s, TCGArg arg1, TCGArg arg2,
>>> -                        int const_arg2, int rexw)
>>> +static void tcg_out_cmp(TCGContext *s, int rexw, TCGArg arg1, TCGArg arg2,
>>> +                        int const_arg2, bool cf)
>>>   {
>>>       if (const_arg2) {
>>> -        if (arg2 == 0) {
>>> +        if (arg2 == 0 && !cf) {
>>>               /* test r, r */
>>>               tcg_out_modrm(s, OPC_TESTL + rexw, arg1, arg1);
>>>           } else {
>>> -            tgen_arithi(s, ARITH_CMP + rexw, arg1, arg2, 0);
>>> +            tgen_arithi(s, ARITH_CMP + rexw, arg1, arg2, cf);
>>>           }
>>>       } else {
>>>           tgen_arithr(s, ARITH_CMP + rexw, arg1, arg2);
>>
>> I don't really understand the motivation here.
>> Why are some uses of this function fine with using the TEST
>> insn, but some must avoid it? What does 'cf' stand for?
>> A comment would help here if there isn't a clearer argument
>> name available...
> 
> Looking at the following patch suggests perhaps:
> 
> /**
>   * tcg_out_cmp: Emit a compare, setting the X, Y, Z flags accordingly.
>   * @need_cf : true if the comparison must also set CF
>   */
> 
> (fill in which XYZ flags you can rely on even if need_cf is false)

I can add that, yes.

Basically, test sets SZ flags, where cmp sets SZCO.  I want to add an optimization using C,
so "cmp 0,x" should not be silently replaced by "test x,x".


r~



* Re: [PATCH 20/24] tcg/i386: Add cf parameter to tcg_out_cmp
  2023-08-11 15:06       ` Richard Henderson
@ 2023-08-12 17:21         ` Richard Henderson
  0 siblings, 0 replies; 59+ messages in thread
From: Richard Henderson @ 2023-08-12 17:21 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On 8/11/23 08:06, Richard Henderson wrote:
> Basically, test sets SZ flags, where cmp sets SZCO.  I want to add an optimization using C,
> so "cmp 0,x" should not be silently replaced by "test x,x".

This patch can be dropped entirely.

TEST clears C (which cmp vs 0 would also do).  I was misremembering INC/DEC, which leave C
unchanged and thus uninitialized with respect to the current operation.
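
The flag behaviour described above can be checked with a small model.
This is a sketch, not QEMU code: the Flags struct and the model_*
helpers are hypothetical, track only ZF, SF and CF, and assume 32-bit
operands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of three x86 flags; not part of QEMU. */
typedef struct {
    bool zf, sf, cf;
} Flags;

/* cmp x, imm: flags come from x - imm; CF is the unsigned borrow. */
static Flags model_cmp(uint32_t x, uint32_t imm)
{
    uint32_t r = x - imm;
    return (Flags){ .zf = r == 0, .sf = (r >> 31) != 0, .cf = x < imm };
}

/* test x, x: flags come from x & x; CF is always cleared. */
static Flags model_test(uint32_t x)
{
    return (Flags){ .zf = x == 0, .sf = (x >> 31) != 0, .cf = false };
}

/* inc x: ZF and SF come from x + 1; CF keeps whatever value it had. */
static Flags model_inc(uint32_t x, Flags prev)
{
    uint32_t r = x + 1;
    return (Flags){ .zf = r == 0, .sf = (r >> 31) != 0, .cf = prev.cf };
}

static bool flags_equal(Flags a, Flags b)
{
    return a.zf == b.zf && a.sf == b.sf && a.cf == b.cf;
}
```

In this model "cmp x, 0" and "test x, x" agree on all three flags for
every x, because an unsigned compare against zero can never borrow.
Only model_inc lets a stale CF through.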


r~




* Re: [PATCH 07/24] target/ppc: Use tcg_gen_negsetcond_*
  2023-08-08  3:11 ` [PATCH 07/24] target/ppc: " Richard Henderson
  2023-08-08 16:51   ` Daniel Henrique Barboza
@ 2023-08-15 12:54   ` Nicholas Piggin
  1 sibling, 0 replies; 59+ messages in thread
From: Nicholas Piggin @ 2023-08-15 12:54 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue Aug 8, 2023 at 1:11 PM AEST, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Nicholas Piggin <npiggin@gmail.com>

> ---
>  target/ppc/translate/fixedpoint-impl.c.inc | 6 ++++--
>  target/ppc/translate/vmx-impl.c.inc        | 8 +++-----
>  2 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/target/ppc/translate/fixedpoint-impl.c.inc b/target/ppc/translate/fixedpoint-impl.c.inc
> index f47f1a50e8..4ce02fd3a4 100644
> --- a/target/ppc/translate/fixedpoint-impl.c.inc
> +++ b/target/ppc/translate/fixedpoint-impl.c.inc
> @@ -342,12 +342,14 @@ static bool do_set_bool_cond(DisasContext *ctx, arg_X_bi *a, bool neg, bool rev)
>      uint32_t mask = 0x08 >> (a->bi & 0x03);
>      TCGCond cond = rev ? TCG_COND_EQ : TCG_COND_NE;
>      TCGv temp = tcg_temp_new();
> +    TCGv zero = tcg_constant_tl(0);
>  
>      tcg_gen_extu_i32_tl(temp, cpu_crf[a->bi >> 2]);
>      tcg_gen_andi_tl(temp, temp, mask);
> -    tcg_gen_setcondi_tl(cond, cpu_gpr[a->rt], temp, 0);
>      if (neg) {
> -        tcg_gen_neg_tl(cpu_gpr[a->rt], cpu_gpr[a->rt]);
> +        tcg_gen_negsetcond_tl(cond, cpu_gpr[a->rt], temp, zero);
> +    } else {
> +        tcg_gen_setcond_tl(cond, cpu_gpr[a->rt], temp, zero);
>      }
>      return true;
>  }
> diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
> index c8712dd7d8..6d7669aabd 100644
> --- a/target/ppc/translate/vmx-impl.c.inc
> +++ b/target/ppc/translate/vmx-impl.c.inc
> @@ -1341,8 +1341,7 @@ static bool trans_VCMPEQUQ(DisasContext *ctx, arg_VC *a)
>      tcg_gen_xor_i64(t1, t0, t1);
>  
>      tcg_gen_or_i64(t1, t1, t2);
> -    tcg_gen_setcondi_i64(TCG_COND_EQ, t1, t1, 0);
> -    tcg_gen_neg_i64(t1, t1);
> +    tcg_gen_negsetcond_i64(TCG_COND_EQ, t1, t1, tcg_constant_i64(0));
>  
>      set_avr64(a->vrt, t1, true);
>      set_avr64(a->vrt, t1, false);
> @@ -1365,15 +1364,14 @@ static bool do_vcmpgtq(DisasContext *ctx, arg_VC *a, bool sign)
>  
>      get_avr64(t0, a->vra, false);
>      get_avr64(t1, a->vrb, false);
> -    tcg_gen_setcond_i64(TCG_COND_GTU, t2, t0, t1);
> +    tcg_gen_negsetcond_i64(TCG_COND_GTU, t2, t0, t1);
>  
>      get_avr64(t0, a->vra, true);
>      get_avr64(t1, a->vrb, true);
>      tcg_gen_movcond_i64(TCG_COND_EQ, t2, t0, t1, t2, tcg_constant_i64(0));
> -    tcg_gen_setcond_i64(sign ? TCG_COND_GT : TCG_COND_GTU, t1, t0, t1);
> +    tcg_gen_negsetcond_i64(sign ? TCG_COND_GT : TCG_COND_GTU, t1, t0, t1);
>  
>      tcg_gen_or_i64(t1, t1, t2);
> -    tcg_gen_neg_i64(t1, t1);
>  
>      set_avr64(a->vrt, t1, true);
>      set_avr64(a->vrt, t1, false);
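
The do_vcmpgtq hunk moves the negation from the end of the sequence into
each setcond. A plain-C model of the unsigned case suggests why that is
safe: for values restricted to 0 and 1, -(x | y) equals (-x) | (-y), and
the movcond in between passes 0 or -1 through just as it passed 0 or 1.
The function names are hypothetical, and the mapping of TCG ops to C
operators is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Old sequence: setcond, movcond, setcond, or, neg. */
static uint64_t model_vcmpgtuq_old(uint64_t ah, uint64_t al,
                                   uint64_t bh, uint64_t bl)
{
    uint64_t t2 = al > bl;              /* setcond GTU on the low halves */
    t2 = (ah == bh) ? t2 : 0;           /* movcond EQ on the high halves */
    uint64_t t1 = ah > bh;              /* setcond GTU on the high halves */
    t1 |= t2;
    return 0 - t1;                      /* neg: 1 becomes all-ones */
}

/* New sequence: negsetcond, movcond, negsetcond, or. */
static uint64_t model_vcmpgtuq_new(uint64_t ah, uint64_t al,
                                   uint64_t bh, uint64_t bl)
{
    uint64_t t2 = 0 - (uint64_t)(al > bl);  /* negsetcond GTU, low halves */
    t2 = (ah == bh) ? t2 : 0;               /* movcond EQ is unchanged */
    uint64_t t1 = 0 - (uint64_t)(ah > bh);  /* negsetcond GTU, high halves */
    return t1 | t2;
}
```

Both functions return all-ones when the 128-bit value a is greater than
b and zero otherwise; the new form simply has one operation fewer.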




* Re: [PATCH 11/24] tcg/ppc: Use the Set Boolean Extension
  2023-08-08  3:11 ` [PATCH 11/24] tcg/ppc: Use the Set Boolean Extension Richard Henderson
  2023-08-08 16:56   ` Daniel Henrique Barboza
@ 2023-08-15 13:16   ` Nicholas Piggin
  1 sibling, 0 replies; 59+ messages in thread
From: Nicholas Piggin @ 2023-08-15 13:16 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: qemu-arm, qemu-ppc, qemu-riscv, qemu-s390x

On Tue Aug 8, 2023 at 1:11 PM AEST, Richard Henderson wrote:
> The SETBC family of instructions requires exactly two insns for
> all comparisons, saving 0-3 insns per (neg)setcond.
>

Nice patch. Tested on a POWER10.

Tested-by: Nicholas Piggin <npiggin@gmail.com>

> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/ppc/tcg-target.c.inc | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
> index 10448aa0e6..090f11e71c 100644
> --- a/tcg/ppc/tcg-target.c.inc
> +++ b/tcg/ppc/tcg-target.c.inc
> @@ -447,6 +447,11 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
>  #define TW     XO31( 4)
>  #define TRAP   (TW | TO(31))
>  
> +#define SETBC    XO31(384)  /* v3.10 */
> +#define SETBCR   XO31(416)  /* v3.10 */
> +#define SETNBC   XO31(448)  /* v3.10 */
> +#define SETNBCR  XO31(480)  /* v3.10 */
> +
>  #define NOP    ORI  /* ori 0,0,0 */
>  
>  #define LVX        XO31(103)
> @@ -1624,6 +1629,23 @@ static void tcg_out_setcond(TCGContext *s, TCGType type, TCGCond cond,
>          arg2 = (uint32_t)arg2;
>      }
>  
> +    /* With SETBC/SETBCR, we can always implement with 2 insns. */
> +    if (have_isa_3_10) {
> +        tcg_insn_unit bi, opc;
> +
> +        tcg_out_cmp(s, cond, arg1, arg2, const_arg2, 7, type);
> +
> +        /* Re-use tcg_to_bc for BI and BO_COND_{TRUE,FALSE}. */
> +        bi = tcg_to_bc[cond] & (0x1f << 16);
> +        if (tcg_to_bc[cond] & BO(8)) {
> +            opc = neg ? SETNBC : SETBC;
> +        } else {
> +            opc = neg ? SETNBCR : SETBCR;
> +        }
> +        tcg_out32(s, opc | RT(arg0) | bi);
> +        return;
> +    }
> +
>      /* Handle common and trivial cases before handling anything else.  */
>      if (arg2 == 0) {
>          switch (cond) {
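
For readers unfamiliar with the four instructions, here is a sketch of
their results as a function of the selected CR bit, together with a
selection helper that mirrors the if/else in the hunk above. The enum,
the function names and the want_bit_set parameter are hypothetical;
want_bit_set stands in for the "tcg_to_bc[cond] & BO(8)" test, on the
assumption that this test distinguishes BO_COND_TRUE from BO_COND_FALSE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tags for the four Set Boolean instructions. */
typedef enum { OP_SETBC, OP_SETBCR, OP_SETNBC, OP_SETNBCR } SetBoolOp;

/* Value written to RT for a given state of the selected CR bit. */
static int64_t model_set_bool(SetBoolOp op, bool cr_bit)
{
    switch (op) {
    case OP_SETBC:   return cr_bit ? 1 : 0;
    case OP_SETBCR:  return cr_bit ? 0 : 1;
    case OP_SETNBC:  return cr_bit ? -1 : 0;
    case OP_SETNBCR: return cr_bit ? 0 : -1;
    }
    return 0;
}

/* Mirror of the selection in the patch: sense of the CR bit, then neg. */
static SetBoolOp pick_set_bool_op(bool want_bit_set, bool neg)
{
    if (want_bit_set) {
        return neg ? OP_SETNBC : OP_SETBC;
    }
    return neg ? OP_SETNBCR : OP_SETBCR;
}
```

With this selection, the condition holds exactly when the CR bit matches
want_bit_set, and the result is then 1 for setcond or -1 for negsetcond.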


