qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH 00/12] tcg-s390 updates
@ 2013-03-27 18:52 Richard Henderson
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 01/12] tcg-s390: Fix movi Richard Henderson
                   ` (11 more replies)
  0 siblings, 12 replies; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

The first patch has been posted previously; it fixes a bug in
how constants are constructed.

The second patch fixes how parameters beyond the first 5 are
passed to functions.  I'm not sure how this got missed during
the initial creation of the port.  :-P

The remaining patches are improvements, either for new opcodes
or for additional ISA support.


r~


Richard Henderson (12):
  tcg-s390: Fix movi
  tcg-s390: Properly allocate a stack frame.
  tcg-s390: Remove useless preprocessor conditions
  tcg-s390: Implement add2/sub2 opcodes
  tcg-s390: Implement mulu2_i64 opcode
  tcg-s390: Implement movcond opcodes
  tcg-s390: Implement deposit opcodes
  tcg-s390: Remove constraint letters for and
  tcg-s390: Use risbgz for andi
  tcg-s390: Cleanup argument shuffling fixme in softmmu code
  tcg-s390: Use load-address for addition
  tcg-s390: Use all 20 bits of the offset in tcg_out_mem

 tcg/s390/tcg-target.c | 541 ++++++++++++++++++++++++++++----------------------
 tcg/s390/tcg-target.h |  26 +--
 2 files changed, 314 insertions(+), 253 deletions(-)

-- 
1.8.1.4


* [Qemu-devel] [PATCH 01/12] tcg-s390: Fix movi
  2013-03-27 18:52 [Qemu-devel] [PATCH 00/12] tcg-s390 updates Richard Henderson
@ 2013-03-27 18:52 ` Richard Henderson
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 02/12] tcg-s390: Properly allocate a stack frame Richard Henderson
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

The code to load the high 32 bits assumed that the insn used to
load the low 32 bits zero-extended.  Enforce that.
---
 tcg/s390/tcg-target.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
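
To illustrate the commit message above, here is a minimal sketch of why
the low half has to be zero-extended.  It is a simplified model, not the
QEMU code: the helper names are hypothetical, and it assumes that the
high half is inserted 16 bits at a time and that all-zero chunks are
skipped because the register is expected to hold zero there already.

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit register; not QEMU code. */

/* Old behaviour, modelled as a 32-bit load that sign-extends
   into the upper 32 bits (an assumption for illustration). */
static uint64_t load_low_sign_extended(uint64_t val)
{
    return (uint64_t)(int64_t)(int32_t)(uint32_t)val;
}

/* Fixed behaviour: the upper 32 bits are guaranteed to be zero. */
static uint64_t load_low_zero_extended(uint64_t val)
{
    return val & 0xffffffffu;
}

/* Insert the high 32 bits one 16-bit chunk at a time.  Chunks that
   are zero are skipped, which is only correct if the register
   already holds zero in those positions. */
static uint64_t insert_high(uint64_t reg, uint64_t val)
{
    uint64_t hi = val >> 32;

    if (hi & 0xffff) {
        reg = (reg & ~(0xffffull << 32)) | ((hi & 0xffff) << 32);
    }
    if (hi & 0xffff0000) {
        reg = (reg & ~(0xffffull << 48)) | ((hi >> 16) << 48);
    }
    return reg;
}
```

With val = 0x0001000080000000, the zero-extended path reproduces val,
while the sign-extended path leaves stale ones in bits 32..47.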

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index e12a152..0132010 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -770,7 +770,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
     /* If we get here, both the high and low parts have non-zero bits.  */
 
     /* Recurse to load the lower 32-bits.  */
-    tcg_out_movi(s, TCG_TYPE_I32, ret, sval);
+    tcg_out_movi(s, TCG_TYPE_I64, ret, uval & 0xffffffff);
 
     /* Insert data into the high 32-bits.  */
     uval = uval >> 31 >> 1;
-- 
1.8.1.4


* [Qemu-devel] [PATCH 02/12] tcg-s390: Properly allocate a stack frame.
  2013-03-27 18:52 [Qemu-devel] [PATCH 00/12] tcg-s390 updates Richard Henderson
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 01/12] tcg-s390: Fix movi Richard Henderson
@ 2013-03-27 18:52 ` Richard Henderson
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 03/12] tcg-s390: Remove useless preprocessor conditions Richard Henderson
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

Set TCG_TARGET_CALL_STACK_OFFSET properly for the ABI.  Allocate the
standard TCG_STATIC_CALL_ARGS_SIZE.  And while we're at it, allocate
space for CPU_TEMP_BUF_NLONGS.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c | 20 ++++++++++++++------
 tcg/s390/tcg-target.h |  2 +-
 2 files changed, 15 insertions(+), 7 deletions(-)
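
The frame arithmetic described above can be sketched as follows.  This
is an illustration only: the enum values for the call-args size and the
temp buffer length are assumptions (the real values come from the QEMU
headers), and struct frame_layout, compute_frame() and fits_aghi() are
hypothetical names.

```c
#include <stddef.h>

/* Values assumed for illustration only. */
enum {
    CALL_STACK_OFFSET     = 160, /* TCG_TARGET_CALL_STACK_OFFSET after this patch */
    STATIC_CALL_ARGS_SIZE = 128, /* assumed TCG_STATIC_CALL_ARGS_SIZE */
    TEMP_BUF_NLONGS       = 128  /* assumed CPU_TEMP_BUF_NLONGS */
};

struct frame_layout {
    long frame_size;   /* amount subtracted from %r15 */
    long temp_offset;  /* start of the temp buffer, relative to %r15 */
    long restore_disp; /* displacement used by lmg to reload %r6..%r15 */
};

static struct frame_layout compute_frame(size_t long_size)
{
    struct frame_layout f;
    long temp_bytes = (long)(TEMP_BUF_NLONGS * long_size);

    f.frame_size = CALL_STACK_OFFSET + STATIC_CALL_ARGS_SIZE + temp_bytes;
    f.temp_offset = CALL_STACK_OFFSET + STATIC_CALL_ARGS_SIZE;
    /* Registers were saved at 48(%r15) before the frame was allocated. */
    f.restore_disp = f.frame_size + 48;
    return f;
}

/* aghi takes a signed 16-bit immediate, so -frame_size must fit. */
static int fits_aghi(long frame_size)
{
    return -frame_size >= -32768 && -frame_size <= 32767;
}
```

Under these assumed values and 8-byte longs, the frame is 1312 bytes,
the temp buffer starts at 288(%r15), and the restore displacement is
1360, all of which stay within the range of a 16-bit immediate.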

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 0132010..d91b894 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -2302,17 +2302,24 @@ static void tcg_target_init(TCGContext *s)
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_CALL_STACK);
 
     tcg_add_target_add_op_defs(s390_op_defs);
-    tcg_set_frame(s, TCG_AREG0, offsetof(CPUArchState, temp_buf),
-                  CPU_TEMP_BUF_NLONGS * sizeof(long));
 }
 
 static void tcg_target_qemu_prologue(TCGContext *s)
 {
+    tcg_target_long frame_size;
+
     /* stmg %r6,%r15,48(%r15) (save registers) */
     tcg_out_insn(s, RXY, STMG, TCG_REG_R6, TCG_REG_R15, TCG_REG_R15, 48);
 
-    /* aghi %r15,-160 (stack frame) */
-    tcg_out_insn(s, RI, AGHI, TCG_REG_R15, -160);
+    /* aghi %r15,-frame_size */
+    frame_size = TCG_TARGET_CALL_STACK_OFFSET;
+    frame_size += TCG_STATIC_CALL_ARGS_SIZE;
+    frame_size += CPU_TEMP_BUF_NLONGS * sizeof(long);
+    tcg_out_insn(s, RI, AGHI, TCG_REG_R15, -frame_size);
+
+    tcg_set_frame(s, TCG_REG_CALL_STACK,
+                  TCG_STATIC_CALL_ARGS_SIZE + TCG_TARGET_CALL_STACK_OFFSET,
+                  CPU_TEMP_BUF_NLONGS * sizeof(long));
 
     if (GUEST_BASE >= 0x80000) {
         tcg_out_movi(s, TCG_TYPE_PTR, TCG_GUEST_BASE_REG, GUEST_BASE);
@@ -2325,8 +2332,9 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 
     tb_ret_addr = s->code_ptr;
 
-    /* lmg %r6,%r15,208(%r15) (restore registers) */
-    tcg_out_insn(s, RXY, LMG, TCG_REG_R6, TCG_REG_R15, TCG_REG_R15, 208);
+    /* lmg %r6,%r15,fs+48(%r15) (restore registers) */
+    tcg_out_insn(s, RXY, LMG, TCG_REG_R6, TCG_REG_R15, TCG_REG_R15,
+                 frame_size + 48);
 
     /* br %r14 (return) */
     tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R14);
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 40211e6..c6d9e84 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -100,7 +100,7 @@ typedef enum TCGReg {
 /* used for function call generation */
 #define TCG_REG_CALL_STACK		TCG_REG_R15
 #define TCG_TARGET_STACK_ALIGN		8
-#define TCG_TARGET_CALL_STACK_OFFSET	0
+#define TCG_TARGET_CALL_STACK_OFFSET	160
 
 #define TCG_TARGET_EXTEND_ARGS 1
 
-- 
1.8.1.4


* [Qemu-devel] [PATCH 03/12] tcg-s390: Remove useless preprocessor conditions
  2013-03-27 18:52 [Qemu-devel] [PATCH 00/12] tcg-s390 updates Richard Henderson
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 01/12] tcg-s390: Fix movi Richard Henderson
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 02/12] tcg-s390: Properly allocate a stack frame Richard Henderson
@ 2013-03-27 18:52 ` Richard Henderson
  2013-03-28  0:14   ` Aurelien Jarno
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 04/12] tcg-s390: Implement add2/sub2 opcodes Richard Henderson
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

We only support 64-bit code generation for s390x.
Don't clutter the code with ifdefs that suggest otherwise.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c | 17 +++++------------
 tcg/s390/tcg-target.h |  2 --
 2 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index d91b894..ba314b3 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -24,6 +24,11 @@
  * THE SOFTWARE.
  */
 
+/* We only support generating code for 64-bit mode.  */
+#if TCG_TARGET_REG_BITS != 64
+#error "unsupported code generation mode"
+#endif
+
 /* ??? The translation blocks produced by TCG are generally small enough to
    be entirely reachable with a 16-bit displacement.  Leaving the option for
    a 32-bit displacement here Just In Case.  */
@@ -252,9 +257,6 @@ static const int tcg_target_call_iarg_regs[] = {
 
 static const int tcg_target_call_oarg_regs[] = {
     TCG_REG_R2,
-#if TCG_TARGET_REG_BITS == 32
-    TCG_REG_R3
-#endif
 };
 
 #define S390_CC_EQ      8
@@ -1620,14 +1622,9 @@ static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
 #endif
 }
 
-#if TCG_TARGET_REG_BITS == 64
 # define OP_32_64(x) \
         case glue(glue(INDEX_op_,x),_i32): \
         case glue(glue(INDEX_op_,x),_i64)
-#else
-# define OP_32_64(x) \
-        case glue(glue(INDEX_op_,x),_i32)
-#endif
 
 static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                 const TCGArg *args, const int *const_args)
@@ -1870,7 +1867,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_qemu_st(s, args, LD_UINT64);
         break;
 
-#if TCG_TARGET_REG_BITS == 64
     case INDEX_op_mov_i64:
         tcg_out_mov(s, TCG_TYPE_I64, args[0], args[1]);
         break;
@@ -2035,7 +2031,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_qemu_ld32s:
         tcg_out_qemu_ld(s, args, LD_INT32);
         break;
-#endif /* TCG_TARGET_REG_BITS == 64 */
 
     default:
         fprintf(stderr,"unimplemented opc 0x%x\n",opc);
@@ -2104,7 +2099,6 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_qemu_st32, { "L", "L" } },
     { INDEX_op_qemu_st64, { "L", "L" } },
 
-#if defined(__s390x__)
     { INDEX_op_mov_i64, { "r", "r" } },
     { INDEX_op_movi_i64, { "r" } },
 
@@ -2157,7 +2151,6 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_qemu_ld32u, { "r", "L" } },
     { INDEX_op_qemu_ld32s, { "r", "L" } },
-#endif
 
     { -1 },
 };
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index c6d9e84..0929d55 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -70,7 +70,6 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_mulu2_i32        0
 #define TCG_TARGET_HAS_muls2_i32        0
 
-#if TCG_TARGET_REG_BITS == 64
 #define TCG_TARGET_HAS_div2_i64         1
 #define TCG_TARGET_HAS_rot_i64          1
 #define TCG_TARGET_HAS_ext8s_i64        1
@@ -95,7 +94,6 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_sub2_i64         0
 #define TCG_TARGET_HAS_mulu2_i64        0
 #define TCG_TARGET_HAS_muls2_i64        0
-#endif
 
 /* used for function call generation */
 #define TCG_REG_CALL_STACK		TCG_REG_R15
-- 
1.8.1.4


* [Qemu-devel] [PATCH 04/12] tcg-s390: Implement add2/sub2 opcodes
  2013-03-27 18:52 [Qemu-devel] [PATCH 00/12] tcg-s390 updates Richard Henderson
                   ` (2 preceding siblings ...)
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 03/12] tcg-s390: Remove useless preprocessor conditions Richard Henderson
@ 2013-03-27 18:52 ` Richard Henderson
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 05/12] tcg-s390: Implement mulu2_i64 opcode Richard Henderson
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c | 38 ++++++++++++++++++++++++++++++++++++++
 tcg/s390/tcg-target.h |  8 ++++----
 2 files changed, 42 insertions(+), 4 deletions(-)
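
This patch carries no description, so here is a minimal sketch of what
the add2/sub2 opcodes compute: a double-word add or subtract where the
low half produces a carry or borrow that the high half consumes (the
role played by the add-logical / add-logical-with-carry pairs in the
diff).  The type u128 and both helpers are hypothetical names, not
QEMU code.

```c
#include <stdint.h>

/* A double-word value held as two 64-bit halves. */
typedef struct { uint64_t lo, hi; } u128;

/* add2: the low add produces a carry that the high add consumes. */
static u128 add2(u128 a, u128 b)
{
    u128 r;
    uint64_t carry;

    r.lo = a.lo + b.lo;
    carry = r.lo < a.lo;          /* unsigned wrap-around means carry out */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* sub2: the low subtract produces a borrow that the high one consumes. */
static u128 sub2(u128 a, u128 b)
{
    u128 r;
    uint64_t borrow = a.lo < b.lo;

    r.lo = a.lo - b.lo;
    r.hi = a.hi - b.hi - borrow;
    return r;
}
```

The i32 variants follow the same pattern with 32-bit halves.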

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index ba314b3..b007763 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -68,6 +68,7 @@
 typedef enum S390Opcode {
     RIL_AFI     = 0xc209,
     RIL_AGFI    = 0xc208,
+    RIL_ALFI    = 0xc20b,
     RIL_ALGFI   = 0xc20a,
     RIL_BRASL   = 0xc005,
     RIL_BRCL    = 0xc004,
@@ -89,6 +90,7 @@ typedef enum S390Opcode {
     RIL_NILF    = 0xc00b,
     RIL_OIHF    = 0xc00c,
     RIL_OILF    = 0xc00d,
+    RIL_SLFI    = 0xc205,
     RIL_XIHF    = 0xc006,
     RIL_XILF    = 0xc007,
 
@@ -125,6 +127,9 @@ typedef enum S390Opcode {
     RIE_CRJ     = 0xec76,
 
     RRE_AGR     = 0xb908,
+    RRE_ALGR    = 0xb90a,
+    RRE_ALCR    = 0xb998,
+    RRE_ALCGR   = 0xb988,
     RRE_CGR     = 0xb920,
     RRE_CLGR    = 0xb921,
     RRE_DLGR    = 0xb987,
@@ -147,9 +152,13 @@ typedef enum S390Opcode {
     RRE_NGR     = 0xb980,
     RRE_OGR     = 0xb981,
     RRE_SGR     = 0xb909,
+    RRE_SLGR    = 0xb90b,
+    RRE_SLBR    = 0xb999,
+    RRE_SLBGR   = 0xb989,
     RRE_XGR     = 0xb982,
 
     RR_AR       = 0x1a,
+    RR_ALR      = 0x1e,
     RR_BASR     = 0x0d,
     RR_BCR      = 0x07,
     RR_CLR      = 0x15,
@@ -161,6 +170,7 @@ typedef enum S390Opcode {
     RR_NR       = 0x14,
     RR_OR       = 0x16,
     RR_SR       = 0x1b,
+    RR_SLR      = 0x1f,
     RR_XR       = 0x17,
 
     RSY_RLL     = 0xeb1d,
@@ -1821,6 +1831,17 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_insn(s, RRE, LRVR, args[0], args[1]);
         break;
 
+    case INDEX_op_add2_i32:
+        /* ??? Make use of ALFI.  */
+        tcg_out_insn(s, RR, ALR, args[0], args[4]);
+        tcg_out_insn(s, RRE, ALCR, args[1], args[5]);
+        break;
+    case INDEX_op_sub2_i32:
+        /* ??? Make use of SLFI.  */
+        tcg_out_insn(s, RR, SLR, args[0], args[4]);
+        tcg_out_insn(s, RRE, SLBR, args[1], args[5]);
+        break;
+
     case INDEX_op_br:
         tgen_branch(s, S390_CC_ALWAYS, args[0]);
         break;
@@ -2016,6 +2037,17 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tgen_ext32u(s, args[0], args[1]);
         break;
 
+    case INDEX_op_add2_i64:
+        /* ??? Make use of ALGFI and SLGFI.  */
+        tcg_out_insn(s, RRE, ALGR, args[0], args[4]);
+        tcg_out_insn(s, RRE, ALCGR, args[1], args[5]);
+        break;
+    case INDEX_op_sub2_i64:
+        /* ??? Make use of ALGFI and SLGFI.  */
+        tcg_out_insn(s, RRE, SLGR, args[0], args[4]);
+        tcg_out_insn(s, RRE, SLBGR, args[1], args[5]);
+        break;
+
     case INDEX_op_brcond_i64:
         tgen_brcond(s, TCG_TYPE_I64, args[2], args[0],
                     args[1], const_args[1], args[3]);
@@ -2084,6 +2116,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_bswap16_i32, { "r", "r" } },
     { INDEX_op_bswap32_i32, { "r", "r" } },
 
+    { INDEX_op_add2_i32, { "r", "r", "0", "1", "r", "r" } },
+    { INDEX_op_sub2_i32, { "r", "r", "0", "1", "r", "r" } },
+
     { INDEX_op_brcond_i32, { "r", "rWC" } },
     { INDEX_op_setcond_i32, { "r", "r", "rWC" } },
 
@@ -2146,6 +2181,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_bswap32_i64, { "r", "r" } },
     { INDEX_op_bswap64_i64, { "r", "r" } },
 
+    { INDEX_op_add2_i64, { "r", "r", "0", "1", "r", "r" } },
+    { INDEX_op_sub2_i64, { "r", "r", "0", "1", "r", "r" } },
+
     { INDEX_op_brcond_i64, { "r", "rC" } },
     { INDEX_op_setcond_i64, { "r", "r", "rC" } },
 
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 0929d55..da726bd 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -65,8 +65,8 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_nor_i32          0
 #define TCG_TARGET_HAS_deposit_i32      0
 #define TCG_TARGET_HAS_movcond_i32      0
-#define TCG_TARGET_HAS_add2_i32         0
-#define TCG_TARGET_HAS_sub2_i32         0
+#define TCG_TARGET_HAS_add2_i32         1
+#define TCG_TARGET_HAS_sub2_i32         1
 #define TCG_TARGET_HAS_mulu2_i32        0
 #define TCG_TARGET_HAS_muls2_i32        0
 
@@ -90,8 +90,8 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_nor_i64          0
 #define TCG_TARGET_HAS_deposit_i64      0
 #define TCG_TARGET_HAS_movcond_i64      0
-#define TCG_TARGET_HAS_add2_i64         0
-#define TCG_TARGET_HAS_sub2_i64         0
+#define TCG_TARGET_HAS_add2_i64         1
+#define TCG_TARGET_HAS_sub2_i64         1
 #define TCG_TARGET_HAS_mulu2_i64        0
 #define TCG_TARGET_HAS_muls2_i64        0
 
-- 
1.8.1.4


* [Qemu-devel] [PATCH 05/12] tcg-s390: Implement mulu2_i64 opcode
  2013-03-27 18:52 [Qemu-devel] [PATCH 00/12] tcg-s390 updates Richard Henderson
                   ` (3 preceding siblings ...)
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 04/12] tcg-s390: Implement add2/sub2 opcodes Richard Henderson
@ 2013-03-27 18:52 ` Richard Henderson
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 06/12] tcg-s390: Implement movcond opcodes Richard Henderson
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c | 5 +++++
 tcg/s390/tcg-target.h | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)
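
For reference, mulu2_i64 yields the full 128-bit unsigned product of two
64-bit inputs as a low and a high half.  The sketch below computes the
same result in portable C from 32-bit pieces; it is only a model of the
opcode's semantics, and prod128 and mulu2() are hypothetical names.

```c
#include <stdint.h>

typedef struct { uint64_t lo, hi; } prod128;

/* Unsigned 64x64 -> 128 multiply built from four 32x32 -> 64 products. */
static prod128 mulu2(uint64_t a, uint64_t b)
{
    uint64_t a0 = (uint32_t)a, a1 = a >> 32;
    uint64_t b0 = (uint32_t)b, b1 = b >> 32;

    uint64_t p00 = a0 * b0;
    uint64_t p01 = a0 * b1;
    uint64_t p10 = a1 * b0;
    uint64_t p11 = a1 * b1;

    /* Middle column: three values below 2^32, so no 64-bit overflow. */
    uint64_t mid = (p00 >> 32) + (uint32_t)p01 + (uint32_t)p10;

    prod128 r;
    r.lo = (mid << 32) | (uint32_t)p00;
    r.hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
    return r;
}
```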

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index b007763..81e2f6a 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -147,6 +147,7 @@ typedef enum S390Opcode {
     RRE_LRVR    = 0xb91f,
     RRE_LRVGR   = 0xb90f,
     RRE_LTGR    = 0xb902,
+    RRE_MLGR    = 0xb986,
     RRE_MSGR    = 0xb90c,
     RRE_MSR     = 0xb252,
     RRE_NGR     = 0xb980,
@@ -1981,6 +1982,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_divu2_i64:
         tcg_out_insn(s, RRE, DLGR, TCG_REG_R2, args[4]);
         break;
+    case INDEX_op_mulu2_i64:
+        tcg_out_insn(s, RRE, MLGR, TCG_REG_R2, args[3]);
+        break;
 
     case INDEX_op_shl_i64:
         op = RSY_SLLG;
@@ -2156,6 +2160,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
+    { INDEX_op_mulu2_i64, { "b", "a", "0", "r" } },
 
     { INDEX_op_and_i64, { "r", "0", "rA" } },
     { INDEX_op_or_i64, { "r", "0", "rO" } },
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index da726bd..c0cb714 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -92,7 +92,7 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_movcond_i64      0
 #define TCG_TARGET_HAS_add2_i64         1
 #define TCG_TARGET_HAS_sub2_i64         1
-#define TCG_TARGET_HAS_mulu2_i64        0
+#define TCG_TARGET_HAS_mulu2_i64        1
 #define TCG_TARGET_HAS_muls2_i64        0
 
 /* used for function call generation */
-- 
1.8.1.4


* [Qemu-devel] [PATCH 06/12] tcg-s390: Implement movcond opcodes
  2013-03-27 18:52 [Qemu-devel] [PATCH 00/12] tcg-s390 updates Richard Henderson
                   ` (4 preceding siblings ...)
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 05/12] tcg-s390: Implement mulu2_i64 opcode Richard Henderson
@ 2013-03-27 18:52 ` Richard Henderson
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 07/12] tcg-s390: Implement deposit opcodes Richard Henderson
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c | 41 +++++++++++++++++++++++++++++++++++++++--
 tcg/s390/tcg-target.h |  4 ++--
 2 files changed, 41 insertions(+), 4 deletions(-)
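
Two ideas in this patch can be sketched in isolation.  First, the
facility macros count bits from the most significant end, so facility n
is bit (63 - n).  Second, movcond keeps the old destination unless the
condition holds, and the fallback path gets there by inverting the
condition and branching over a plain move.  FACILITY_BIT(),
has_facility() and the two movcond helpers are hypothetical names used
only for this illustration.

```c
#include <stdint.h>

/* Facility n lives at bit (63 - n) of the facility doubleword. */
#define FACILITY_BIT(n)  (1ULL << (63 - (n)))

static int has_facility(uint64_t facilities, int n)
{
    return (facilities & FACILITY_BIT(n)) != 0;
}

/* movcond semantics: dest keeps its old value unless cond holds. */
static uint64_t movcond_select(int cond, uint64_t dest, uint64_t src)
{
    return cond ? src : dest;
}

/* Fallback shape: test the inverted condition, branch over the move. */
static uint64_t movcond_branch(int cond, uint64_t dest, uint64_t src)
{
    int inverted = !cond;

    if (inverted) {
        goto over;
    }
    dest = src;
over:
    return dest;
}
```

Both helpers return the same value for any input, which is the property
the two code paths in tgen_movcond need to share.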

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 81e2f6a..dbe2fa6 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -158,6 +158,9 @@ typedef enum S390Opcode {
     RRE_SLBGR   = 0xb989,
     RRE_XGR     = 0xb982,
 
+    RRF_LOCR    = 0xb9f2,
+    RRF_LOCGR   = 0xb9e2,
+
     RR_AR       = 0x1a,
     RR_ALR      = 0x1e,
     RR_BASR     = 0x0d,
@@ -342,6 +345,7 @@ static uint8_t *tb_ret_addr;
 #define FACILITY_LONG_DISP	(1ULL << (63 - 18))
 #define FACILITY_EXT_IMM	(1ULL << (63 - 21))
 #define FACILITY_GEN_INST_EXT	(1ULL << (63 - 34))
+#define FACILITY_LOAD_ON_COND   (1ULL << (63 - 45))
 
 static uint64_t facilities;
 
@@ -638,6 +642,12 @@ static void tcg_out_insn_RRE(TCGContext *s, S390Opcode op,
     tcg_out32(s, (op << 16) | (r1 << 4) | r2);
 }
 
+static void tcg_out_insn_RRF(TCGContext *s, S390Opcode op,
+                             TCGReg r1, TCGReg r2, int m3)
+{
+    tcg_out32(s, (op << 16) | (m3 << 12) | (r1 << 4) | r2);
+}
+
 static void tcg_out_insn_RI(TCGContext *s, S390Opcode op, TCGReg r1, int i2)
 {
     tcg_out32(s, (op << 16) | (r1 << 20) | (i2 & 0xffff));
@@ -1169,9 +1179,9 @@ static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
 }
 
 static void tgen_setcond(TCGContext *s, TCGType type, TCGCond c,
-                         TCGReg dest, TCGReg r1, TCGArg c2, int c2const)
+                         TCGReg dest, TCGReg c1, TCGArg c2, int c2const)
 {
-    int cc = tgen_cmp(s, type, c, r1, c2, c2const);
+    int cc = tgen_cmp(s, type, c, c1, c2, c2const);
 
     /* Emit: r1 = 1; if (cc) goto over; r1 = 0; over:  */
     tcg_out_movi(s, type, dest, 1);
@@ -1179,6 +1189,23 @@ static void tgen_setcond(TCGContext *s, TCGType type, TCGCond c,
     tcg_out_movi(s, type, dest, 0);
 }
 
+static void tgen_movcond(TCGContext *s, TCGType type, TCGCond c, TCGReg dest,
+                         TCGReg c1, TCGArg c2, int c2const, TCGReg r3)
+{
+    int cc;
+    if (facilities & FACILITY_LOAD_ON_COND) {
+        cc = tgen_cmp(s, type, c, c1, c2, c2const);
+        tcg_out_insn(s, RRF, LOCGR, dest, r3, cc);
+    } else {
+        c = tcg_invert_cond(c);
+        cc = tgen_cmp(s, type, c, c1, c2, c2const);
+
+        /* Emit: if (cc) goto over; dest = r3; over:  */
+        tcg_out_insn(s, RI, BRC, cc, (4 + 4) >> 1);
+        tcg_out_insn(s, RRE, LGR, dest, r3);
+    }
+}
+
 static void tgen_gotoi(TCGContext *s, int cc, tcg_target_long dest)
 {
     tcg_target_long off = (dest - (tcg_target_long)s->code_ptr) >> 1;
@@ -1855,6 +1882,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tgen_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1],
                      args[2], const_args[2]);
         break;
+    case INDEX_op_movcond_i32:
+        tgen_movcond(s, TCG_TYPE_I32, args[5], args[0], args[1],
+                     args[2], const_args[2], args[3]);
+        break;
 
     case INDEX_op_qemu_ld8u:
         tcg_out_qemu_ld(s, args, LD_UINT8);
@@ -2060,6 +2091,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1],
                      args[2], const_args[2]);
         break;
+    case INDEX_op_movcond_i64:
+        tgen_movcond(s, TCG_TYPE_I64, args[5], args[0], args[1],
+                     args[2], const_args[2], args[3]);
+        break;
 
     case INDEX_op_qemu_ld32u:
         tcg_out_qemu_ld(s, args, LD_UINT32);
@@ -2125,6 +2160,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_brcond_i32, { "r", "rWC" } },
     { INDEX_op_setcond_i32, { "r", "r", "rWC" } },
+    { INDEX_op_movcond_i32, { "r", "r", "rWC", "r", "0" } },
 
     { INDEX_op_qemu_ld8u, { "r", "L" } },
     { INDEX_op_qemu_ld8s, { "r", "L" } },
@@ -2191,6 +2227,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_brcond_i64, { "r", "rC" } },
     { INDEX_op_setcond_i64, { "r", "r", "rC" } },
+    { INDEX_op_movcond_i64, { "r", "r", "rC", "r", "0" } },
 
     { INDEX_op_qemu_ld32u, { "r", "L" } },
     { INDEX_op_qemu_ld32s, { "r", "L" } },
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index c0cb714..5e1ac8b 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -64,7 +64,7 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          0
 #define TCG_TARGET_HAS_deposit_i32      0
-#define TCG_TARGET_HAS_movcond_i32      0
+#define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_add2_i32         1
 #define TCG_TARGET_HAS_sub2_i32         1
 #define TCG_TARGET_HAS_mulu2_i32        0
@@ -89,7 +89,7 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_nand_i64         0
 #define TCG_TARGET_HAS_nor_i64          0
 #define TCG_TARGET_HAS_deposit_i64      0
-#define TCG_TARGET_HAS_movcond_i64      0
+#define TCG_TARGET_HAS_movcond_i64      1
 #define TCG_TARGET_HAS_add2_i64         1
 #define TCG_TARGET_HAS_sub2_i64         1
 #define TCG_TARGET_HAS_mulu2_i64        1
-- 
1.8.1.4


* [Qemu-devel] [PATCH 07/12] tcg-s390: Implement deposit opcodes
  2013-03-27 18:52 [Qemu-devel] [PATCH 00/12] tcg-s390 updates Richard Henderson
                   ` (5 preceding siblings ...)
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 06/12] tcg-s390: Implement movcond opcodes Richard Henderson
@ 2013-03-27 18:52 ` Richard Henderson
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 08/12] tcg-s390: Remove constraint letters for and Richard Henderson
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c | 24 ++++++++++++++++++++++++
 tcg/s390/tcg-target.h |  8 ++++++--
 2 files changed, 30 insertions(+), 2 deletions(-)
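
The operand arithmetic in tgen_deposit (lsb = 63 - ofs, msb = lsb -
(len - 1), rotate by ofs) can be checked against a small software model
of rotate-then-insert-selected-bits.  The model below is a sketch under
the assumption that bit 0 is the most significant bit and that the
selected range does not wrap; risbg_model() and deposit() are
hypothetical helpers, not QEMU code.

```c
#include <stdint.h>

static uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63;
    return n ? (x << n) | (x >> (64 - n)) : x;
}

/* Rotate r2 left by rot, then copy bits start..end (bit 0 is the
   most significant bit, start <= end) into r1; other bits of r1
   are left unchanged. */
static uint64_t risbg_model(uint64_t r1, uint64_t r2,
                            int start, int end, int rot)
{
    uint64_t mask = (~0ull >> start) & (~0ull << (63 - end));

    return (r1 & ~mask) | (rotl64(r2, (unsigned)rot) & mask);
}

/* Deposit the low len bits of src into dest at bit offset ofs,
   using the same operand arithmetic as the patch. */
static uint64_t deposit(uint64_t dest, uint64_t src, int ofs, int len)
{
    int lsb = 63 - ofs;
    int msb = lsb - (len - 1);

    return risbg_model(dest, src, msb, lsb, ofs);
}
```

For example, depositing 0xAB at offset 8 with length 8 into
0x1111111111111111 gives 0x111111111111AB11.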

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index dbe2fa6..673a568 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -125,6 +125,7 @@ typedef enum S390Opcode {
     RIE_CLGIJ   = 0xec7d,
     RIE_CLRJ    = 0xec77,
     RIE_CRJ     = 0xec76,
+    RIE_RISBG   = 0xec55,
 
     RRE_AGR     = 0xb908,
     RRE_ALGR    = 0xb90a,
@@ -1206,6 +1207,23 @@ static void tgen_movcond(TCGContext *s, TCGType type, TCGCond c, TCGReg dest,
     }
 }
 
+bool tcg_target_deposit_valid(int ofs, int len)
+{
+    return (facilities & FACILITY_GEN_INST_EXT) != 0;
+}
+
+static void tgen_deposit(TCGContext *s, TCGReg dest, TCGReg src,
+                         int ofs, int len)
+{
+    int lsb = (63 - ofs);
+    int msb = lsb - (len - 1);
+
+    /* Format RIE-f */
+    tcg_out16(s, (RIE_RISBG & 0xff00) | (dest << 4) | src);
+    tcg_out16(s, (msb << 8) | lsb);
+    tcg_out16(s, (ofs << 8) | (RIE_RISBG & 0xff));
+}
+
 static void tgen_gotoi(TCGContext *s, int cc, tcg_target_long dest)
 {
     tcg_target_long off = (dest - (tcg_target_long)s->code_ptr) >> 1;
@@ -2103,6 +2121,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_qemu_ld(s, args, LD_INT32);
         break;
 
+    OP_32_64(deposit):
+        tgen_deposit(s, args[0], args[2], args[3], args[4]);
+        break;
+
     default:
         fprintf(stderr,"unimplemented opc 0x%x\n",opc);
         tcg_abort();
@@ -2161,6 +2183,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_brcond_i32, { "r", "rWC" } },
     { INDEX_op_setcond_i32, { "r", "r", "rWC" } },
     { INDEX_op_movcond_i32, { "r", "r", "rWC", "r", "0" } },
+    { INDEX_op_deposit_i32, { "r", "0", "r" } },
 
     { INDEX_op_qemu_ld8u, { "r", "L" } },
     { INDEX_op_qemu_ld8s, { "r", "L" } },
@@ -2228,6 +2251,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_brcond_i64, { "r", "rC" } },
     { INDEX_op_setcond_i64, { "r", "r", "rC" } },
     { INDEX_op_movcond_i64, { "r", "r", "rC", "r", "0" } },
+    { INDEX_op_deposit_i64, { "r", "0", "r" } },
 
     { INDEX_op_qemu_ld32u, { "r", "L" } },
     { INDEX_op_qemu_ld32s, { "r", "L" } },
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 5e1ac8b..42ca36c 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -63,7 +63,7 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_eqv_i32          0
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          0
-#define TCG_TARGET_HAS_deposit_i32      0
+#define TCG_TARGET_HAS_deposit_i32      1
 #define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_add2_i32         1
 #define TCG_TARGET_HAS_sub2_i32         1
@@ -88,13 +88,17 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_eqv_i64          0
 #define TCG_TARGET_HAS_nand_i64         0
 #define TCG_TARGET_HAS_nor_i64          0
-#define TCG_TARGET_HAS_deposit_i64      0
+#define TCG_TARGET_HAS_deposit_i64      1
 #define TCG_TARGET_HAS_movcond_i64      1
 #define TCG_TARGET_HAS_add2_i64         1
 #define TCG_TARGET_HAS_sub2_i64         1
 #define TCG_TARGET_HAS_mulu2_i64        1
 #define TCG_TARGET_HAS_muls2_i64        0
 
+extern bool tcg_target_deposit_valid(int ofs, int len);
+#define TCG_TARGET_deposit_i32_valid  tcg_target_deposit_valid
+#define TCG_TARGET_deposit_i64_valid  tcg_target_deposit_valid
+
 /* used for function call generation */
 #define TCG_REG_CALL_STACK		TCG_REG_R15
 #define TCG_TARGET_STACK_ALIGN		8
-- 
1.8.1.4


* [Qemu-devel] [PATCH 08/12] tcg-s390: Remove constraint letters for and
  2013-03-27 18:52 [Qemu-devel] [PATCH 00/12] tcg-s390 updates Richard Henderson
                   ` (6 preceding siblings ...)
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 07/12] tcg-s390: Implement deposit opcodes Richard Henderson
@ 2013-03-27 18:52 ` Richard Henderson
  2013-03-28 15:03   ` Aurelien Jarno
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 09/12] tcg-s390: Use risbgz for andi Richard Henderson
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

Since we have a free temporary and can always just load the constant, we
ought to do so, rather than spending the same effort constraining the
constant.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c | 149 +++++++++++---------------------------------------
 1 file changed, 32 insertions(+), 117 deletions(-)
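
The mask test that appears in the diff, (val & mask) == mask with
mask = ~(0xffff << i*16), asks whether every bit outside one 16-bit
chunk is set, in which case a single AND-immediate on that chunk is
enough.  A minimal standalone sketch of that check follows;
single_chunk_and() is a hypothetical helper.  Per the commit message,
constants that fail such cheap checks can simply be loaded into the
free temporary and combined with a register-register AND.

```c
#include <stdint.h>

/* Return the index (0..3) of the only 16-bit chunk in which val
   clears any bits, or -1 if more than one chunk is affected.
   An all-ones val (a no-op AND) reports chunk 0. */
static int single_chunk_and(uint64_t val)
{
    int i;

    for (i = 0; i < 4; i++) {
        uint64_t mask = ~(0xffffull << (i * 16));

        if ((val & mask) == mask) {
            return i;
        }
    }
    return -1;
}
```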

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 673a568..203cbb5 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -38,7 +38,6 @@
 #define TCG_CT_CONST_NEG   0x0200
 #define TCG_CT_CONST_ADDI  0x0400
 #define TCG_CT_CONST_MULI  0x0800
-#define TCG_CT_CONST_ANDI  0x1000
 #define TCG_CT_CONST_ORI   0x2000
 #define TCG_CT_CONST_XORI  0x4000
 #define TCG_CT_CONST_CMPI  0x8000
@@ -417,9 +416,6 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
     case 'K':
         ct->ct |= TCG_CT_CONST_MULI;
         break;
-    case 'A':
-        ct->ct |= TCG_CT_CONST_ANDI;
-        break;
     case 'O':
         ct->ct |= TCG_CT_CONST_ORI;
         break;
@@ -438,63 +434,6 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
     return 0;
 }
 
-/* Immediates to be used with logical AND.  This is an optimization only,
-   since a full 64-bit immediate AND can always be performed with 4 sequential
-   NI[LH][LH] instructions.  What we're looking for is immediates that we
-   can load efficiently, and the immediate load plus the reg-reg AND is
-   smaller than the sequential NI's.  */
-
-static int tcg_match_andi(int ct, tcg_target_ulong val)
-{
-    int i;
-
-    if (facilities & FACILITY_EXT_IMM) {
-        if (ct & TCG_CT_CONST_32) {
-            /* All 32-bit ANDs can be performed with 1 48-bit insn.  */
-            return 1;
-        }
-
-        /* Zero-extensions.  */
-        if (val == 0xff || val == 0xffff || val == 0xffffffff) {
-            return 1;
-        }
-    } else {
-        if (ct & TCG_CT_CONST_32) {
-            val = (uint32_t)val;
-        } else if (val == 0xffffffff) {
-            return 1;
-        }
-    }
-
-    /* Try all 32-bit insns that can perform it in one go.  */
-    for (i = 0; i < 4; i++) {
-        tcg_target_ulong mask = ~(0xffffull << i*16);
-        if ((val & mask) == mask) {
-            return 1;
-        }
-    }
-
-    /* Look for 16-bit values performing the mask.  These are better
-       to load with LLI[LH][LH].  */
-    for (i = 0; i < 4; i++) {
-        tcg_target_ulong mask = 0xffffull << i*16;
-        if ((val & mask) == val) {
-            return 0;
-        }
-    }
-
-    /* Look for 32-bit values performing the 64-bit mask.  These
-       are better to load with LLI[LH]F, or if extended immediates
-       not available, with a pair of LLI insns.  */
-    if ((ct & TCG_CT_CONST_32) == 0) {
-        if (val <= 0xffffffff || (val & 0xffffffff) == 0) {
-            return 0;
-        }
-    }
-
-    return 1;
-}
-
 /* Immediates to be used with logical OR.  This is an optimization only,
    since a full 64-bit immediate OR can always be performed with 4 sequential
    OI[LH][LH] instructions.  What we're looking for is immediates that we
@@ -617,8 +556,6 @@ static int tcg_target_const_match(tcg_target_long val,
         } else {
             return val == (int16_t)val;
         }
-    } else if (ct & TCG_CT_CONST_ANDI) {
-        return tcg_match_andi(ct, val);
     } else if (ct & TCG_CT_CONST_ORI) {
         return tcg_match_ori(ct, val);
     } else if (ct & TCG_CT_CONST_XORI) {
@@ -1003,7 +940,7 @@ static inline void tgen64_addi(TCGContext *s, TCGReg dest, int64_t val)
 
 }
 
-static void tgen64_andi(TCGContext *s, TCGReg dest, tcg_target_ulong val)
+static void tgen_andi(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
 {
     static const S390Opcode ni_insns[4] = {
         RI_NILL, RI_NILH, RI_NIHL, RI_NIHH
@@ -1011,63 +948,51 @@ static void tgen64_andi(TCGContext *s, TCGReg dest, tcg_target_ulong val)
     static const S390Opcode nif_insns[2] = {
         RIL_NILF, RIL_NIHF
     };
-
+    uint64_t valid = (type == TCG_TYPE_I32 ? 0xffffffffull : -1ull);
     int i;
 
-    /* Look for no-op.  */
-    if (val == -1) {
-        return;
-    }
-
     /* Look for the zero-extensions.  */
-    if (val == 0xffffffff) {
+    if ((val & valid) == 0xffffffff) {
         tgen_ext32u(s, dest, dest);
         return;
     }
-
     if (facilities & FACILITY_EXT_IMM) {
-        if (val == 0xff) {
+        if ((val & valid) == 0xff) {
             tgen_ext8u(s, TCG_TYPE_I64, dest, dest);
             return;
         }
-        if (val == 0xffff) {
+        if ((val & valid) == 0xffff) {
             tgen_ext16u(s, TCG_TYPE_I64, dest, dest);
             return;
         }
+    }
 
-        /* Try all 32-bit insns that can perform it in one go.  */
-        for (i = 0; i < 4; i++) {
-            tcg_target_ulong mask = ~(0xffffull << i*16);
-            if ((val & mask) == mask) {
-                tcg_out_insn_RI(s, ni_insns[i], dest, val >> i*16);
-                return;
-            }
+    /* Try all 32-bit insns that can perform it in one go.  */
+    for (i = 0; i < 4; i++) {
+        tcg_target_ulong mask = ~(0xffffull << i*16);
+        if (((val | ~valid) & mask) == mask) {
+            tcg_out_insn_RI(s, ni_insns[i], dest, val >> i*16);
+            return;
         }
+    }
 
-        /* Try all 48-bit insns that can perform it in one go.  */
-        if (facilities & FACILITY_EXT_IMM) {
-            for (i = 0; i < 2; i++) {
-                tcg_target_ulong mask = ~(0xffffffffull << i*32);
-                if ((val & mask) == mask) {
-                    tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
-                    return;
-                }
+    /* Try all 48-bit insns that can perform it in one go.  */
+    if (facilities & FACILITY_EXT_IMM) {
+        for (i = 0; i < 2; i++) {
+            tcg_target_ulong mask = ~(0xffffffffull << i*32);
+            if (((val | ~valid) & mask) == mask) {
+                tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
+                return;
             }
         }
+    }
 
-        /* Perform the AND via sequential modifications to the high and low
-           parts.  Do this via recursion to handle 16-bit vs 32-bit masks in
-           each half.  */
-        tgen64_andi(s, dest, val | 0xffffffff00000000ull);
-        tgen64_andi(s, dest, val | 0x00000000ffffffffull);
+    /* Fall back to loading the constant.  */
+    tcg_out_movi(s, type, TCG_TMP0, val);
+    if (type == TCG_TYPE_I32) {
+        tcg_out_insn(s, RR, NR, dest, TCG_TMP0);
     } else {
-        /* With no extended-immediate facility, just emit the sequence.  */
-        for (i = 0; i < 4; i++) {
-            tcg_target_ulong mask = 0xffffull << i*16;
-            if ((val & mask) != mask) {
-                tcg_out_insn_RI(s, ni_insns[i], dest, val >> i*16);
-            }
-        }
+        tcg_out_insn(s, RRE, NGR, dest, TCG_TMP0);
     }
 }
 
@@ -1463,16 +1388,6 @@ static void tcg_out_qemu_st_direct(TCGContext *s, int opc, TCGReg data,
 }
 
 #if defined(CONFIG_SOFTMMU)
-static void tgen64_andi_tmp(TCGContext *s, TCGReg dest, tcg_target_ulong val)
-{
-    if (tcg_match_andi(0, val)) {
-        tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, val);
-        tcg_out_insn(s, RRE, NGR, dest, TCG_TMP0);
-    } else {
-        tgen64_andi(s, dest, val);
-    }
-}
-
 static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
                                   TCGReg addr_reg, int mem_index, int opc,
                                   uint16_t **label2_ptr_p, int is_store)
@@ -1492,8 +1407,8 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
     tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, TCG_REG_NONE,
                  TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
 
-    tgen64_andi_tmp(s, arg0, TARGET_PAGE_MASK | ((1 << s_bits) - 1));
-    tgen64_andi_tmp(s, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
+    tgen_andi(s, TCG_TYPE_I64, arg0, TARGET_PAGE_MASK | ((1 << s_bits) - 1));
+    tgen_andi(s, TCG_TYPE_I64, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
 
     if (is_store) {
         ofs = offsetof(CPUArchState, tlb_table[mem_index][0].addr_write);
@@ -1777,7 +1692,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_and_i32:
         if (const_args[2]) {
-            tgen64_andi(s, args[0], args[2] | 0xffffffff00000000ull);
+            tgen_andi(s, TCG_TYPE_I32, args[0], args[2]);
         } else {
             tcg_out_insn(s, RR, NR, args[0], args[2]);
         }
@@ -1982,7 +1897,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_and_i64:
         if (const_args[2]) {
-            tgen64_andi(s, args[0], args[2]);
+            tgen_andi(s, TCG_TYPE_I64, args[0], args[2]);
         } else {
             tcg_out_insn(s, RRE, NGR, args[0], args[2]);
         }
@@ -2156,7 +2071,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
 
-    { INDEX_op_and_i32, { "r", "0", "rWA" } },
+    { INDEX_op_and_i32, { "r", "0", "ri" } },
     { INDEX_op_or_i32, { "r", "0", "rWO" } },
     { INDEX_op_xor_i32, { "r", "0", "rWX" } },
 
@@ -2221,7 +2136,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
     { INDEX_op_mulu2_i64, { "b", "a", "0", "r" } },
 
-    { INDEX_op_and_i64, { "r", "0", "rA" } },
+    { INDEX_op_and_i64, { "r", "0", "ri" } },
     { INDEX_op_or_i64, { "r", "0", "rO" } },
     { INDEX_op_xor_i64, { "r", "0", "rX" } },
 
-- 
1.8.1.4
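
Editorial note on the halfword test in tgen_andi above: a minimal
sketch, assuming a standalone helper (single_ni_halfword and
self_check are hypothetical names, not part of the patch). It shows
how OR-ing in ~valid lets a 32-bit AND ignore whatever sits in the
upper half of the constant, so one NI[LH][LH] insn can still match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return the index (0..3) of the one 16-bit halfword that a single
   NI[LH][LH] insn would have to modify, or -1 if more than one
   halfword of the mask contains zero bits.  Illustrative only.  */
static int single_ni_halfword(uint64_t val, bool is_32bit)
{
    /* Bits outside 'valid' are don't-care for a 32-bit operation.  */
    uint64_t valid = is_32bit ? 0xffffffffull : ~0ull;

    for (int i = 0; i < 4; i++) {
        uint64_t mask = ~(0xffffull << (i * 16));
        /* Every bit outside halfword i must be one (or don't-care).  */
        if (((val | ~valid) & mask) == mask) {
            return i;
        }
    }
    return -1;
}

static void self_check(void)
{
    /* Only the low halfword clears bits.  */
    assert(single_ni_halfword(0xffffffffffff00ffull, false) == 0);
    /* As a 32-bit mask the zero upper half is ignored ...  */
    assert(single_ni_halfword(0xffff00f0ull, true) == 0);
    /* ... but as a 64-bit mask it is not.  */
    assert(single_ni_halfword(0xffff00f0ull, false) == -1);
}
```

When the helper returns -1 the patch goes on to try the 48-bit forms
and finally falls back to loading the constant into TCG_TMP0.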

* [Qemu-devel] [PATCH 09/12] tcg-s390: Use risbgz for andi
  2013-03-27 18:52 [Qemu-devel] [PATCH 00/12] tcg-s390 updates Richard Henderson
                   ` (7 preceding siblings ...)
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 08/12] tcg-s390: Remove constraint letters for and Richard Henderson
@ 2013-03-27 18:52 ` Richard Henderson
  2013-03-27 19:03   ` Paolo Bonzini
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 10/12] tcg-s390: Cleanup argument shuffling fixme in softmmu code Richard Henderson
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

This is immediately usable by the tlb lookup code.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 53 insertions(+), 5 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 203cbb5..2bab245 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -827,6 +827,15 @@ static void tcg_out_ld_abs(TCGContext *s, TCGType type, TCGReg dest, void *abs)
     tcg_out_ld(s, type, dest, dest, addr & 0xffff);
 }
 
+static inline void tcg_out_risbg(TCGContext *s, TCGReg dest, TCGReg src,
+                                 int msb, int lsb, int ofs, int z)
+{
+    /* Format RIE-f */
+    tcg_out16(s, (RIE_RISBG & 0xff00) | (dest << 4) | src);
+    tcg_out16(s, (msb << 8) | (z << 7) | lsb);
+    tcg_out16(s, (ofs << 8) | (RIE_RISBG & 0xff));
+}
+
 static void tgen_ext8s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
 {
     if (facilities & FACILITY_EXT_IMM) {
@@ -940,6 +949,36 @@ static inline void tgen64_addi(TCGContext *s, TCGReg dest, int64_t val)
 
 }
 
+/* Accept bit patterns like these:
+    0....01....1
+    1....10....0
+    1..10..01..1
+    0..01..10..0
+   Copied from gcc sources.  */
+static inline bool risbg_mask(uint64_t c)
+{
+    uint64_t lsb;
+    /* We don't change the number of transitions by inverting,
+       so make sure we start with the LSB zero.  */
+    if (c & 1) {
+        c = ~c;
+    }
+    /* Reject all zeros or all ones.  */
+    if (c == 0) {
+        return false;
+    }
+    /* Find the first transition.  */
+    lsb = c & -c;
+    /* Invert to look for a second transition.  */
+    c = ~c;
+    /* Erase the first transition.  */
+    c &= -lsb;
+    /* Find the second transition, if any.  */
+    lsb = c & -c;
+    /* Match if all the bits are 1's, or if c is zero.  */
+    return c == -lsb;
+}
+
 static void tgen_andi(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
 {
     static const S390Opcode ni_insns[4] = {
@@ -986,6 +1025,19 @@ static void tgen_andi(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
             }
         }
     }
+    if ((facilities & FACILITY_GEN_INST_EXT) && risbg_mask(val)) {
+        int msb, lsb;
+        if ((val & 0x8000000000000001ull) == 0x8000000000000001ull) {
+            /* Achieve wraparound by swapping msb and lsb.  */
+            msb = 64 - ctz64(~val);
+            lsb = clz64(~val) - 1;
+        } else {
+            msb = clz64(val);
+            lsb = 63 - ctz64(val);
+        }
+        tcg_out_risbg(s, dest, dest, msb, lsb, 0, 1);
+        return;
+    }
 
     /* Fall back to loading the constant.  */
     tcg_out_movi(s, type, TCG_TMP0, val);
@@ -1142,11 +1194,7 @@ static void tgen_deposit(TCGContext *s, TCGReg dest, TCGReg src,
 {
     int lsb = (63 - ofs);
     int msb = lsb - (len - 1);
-
-    /* Format RIE-f */
-    tcg_out16(s, (RIE_RISBG & 0xff00) | (dest << 4) | src);
-    tcg_out16(s, (msb << 8) | lsb);
-    tcg_out16(s, (ofs << 8) | (RIE_RISBG & 0xff));
+    tcg_out_risbg(s, dest, src, msb, lsb, ofs, 0);
 }
 
 static void tgen_gotoi(TCGContext *s, int cc, tcg_target_long dest)
-- 
1.8.1.4
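
Editorial note on the mask test and bit-range conversion above: a
minimal standalone sketch, assuming portable stand-ins for clz64 and
ctz64 (clz64_sketch, ctz64_sketch, is_risbg_mask and mask_to_range
are hypothetical names). Bits are numbered with bit 0 as the most
significant, and a range whose start exceeds its end wraps around
from bit 63 to bit 0. The wraparound branch here uses
64 - ctz64(~val) and clz64(~val) - 1, which is what that numbering
requires for a mask such as 0xff000000000000ff (start 56, end 7).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if c is one contiguous run of 1 bits, possibly wrapping
   around from bit 63 to bit 0.  Same steps as risbg_mask in the
   patch, with explicit unsigned negation.  */
static bool is_risbg_mask(uint64_t c)
{
    uint64_t lsb;

    if (c & 1) {
        c = ~c;                 /* start with the low bit clear */
    }
    if (c == 0) {
        return false;           /* all zeros or all ones */
    }
    lsb = c & (0 - c);          /* first 0 -> 1 transition */
    c = ~c;
    c &= 0 - lsb;               /* erase everything below it */
    lsb = c & (0 - c);          /* second transition, if any */
    return c == 0 - lsb;        /* nothing but ones above it */
}

/* Portable leading/trailing zero counts; v must be nonzero.  */
static int clz64_sketch(uint64_t v)
{
    int n = 0;
    while (!(v >> 63)) { v <<= 1; n++; }
    return n;
}

static int ctz64_sketch(uint64_t v)
{
    int n = 0;
    while (!(v & 1)) { v >>= 1; n++; }
    return n;
}

/* Convert an accepted mask to a (start, end) bit range, bit 0 = MSB. */
static void mask_to_range(uint64_t val, int *start, int *end)
{
    assert(is_risbg_mask(val));
    if ((val >> 63) && (val & 1)) {
        /* Ones at both ends: the range wraps, so start > end.  */
        *start = 64 - ctz64_sketch(~val);
        *end = clz64_sketch(~val) - 1;
    } else {
        *start = clz64_sketch(val);
        *end = 63 - ctz64_sketch(val);
    }
}
```

For example, 0x0ff0 maps to the range 52..59, and
0x8000000000000001 maps to start 63, end 0.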

* [Qemu-devel] [PATCH 10/12] tcg-s390: Cleanup argument shuffling fixme in softmmu code
  2013-03-27 18:52 [Qemu-devel] [PATCH 00/12] tcg-s390 updates Richard Henderson
                   ` (8 preceding siblings ...)
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 09/12] tcg-s390: Use risbgz for andi Richard Henderson
@ 2013-03-27 18:52 ` Richard Henderson
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 11/12] tcg-s390: Use load-address for addition Richard Henderson
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 12/12] tcg-s390: Use all 20 bits of the offset in tcg_out_mem Richard Henderson
  11 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c | 85 +++++++++++++++++++++++----------------------------
 1 file changed, 38 insertions(+), 47 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 2bab245..43a0de8 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -393,6 +393,7 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         tcg_regset_set32(ct->u.regs, 0, 0xffff);
         tcg_regset_reset_reg (ct->u.regs, TCG_REG_R2);
         tcg_regset_reset_reg (ct->u.regs, TCG_REG_R3);
+        tcg_regset_reset_reg (ct->u.regs, TCG_REG_R4);
         break;
     case 'a':                  /* force R2 for division */
         ct->ct |= TCG_CT_REG;
@@ -1436,27 +1437,29 @@ static void tcg_out_qemu_st_direct(TCGContext *s, int opc, TCGReg data,
 }
 
 #if defined(CONFIG_SOFTMMU)
-static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
-                                  TCGReg addr_reg, int mem_index, int opc,
-                                  uint16_t **label2_ptr_p, int is_store)
+static TCGReg tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
+                                    TCGReg addr_reg, int mem_index, int opc,
+                                    uint16_t **label2_ptr_p, int is_store)
 {
-    const TCGReg arg0 = TCG_REG_R2;
-    const TCGReg arg1 = TCG_REG_R3;
+    const TCGReg arg0 = tcg_target_call_iarg_regs[0];
+    const TCGReg arg1 = tcg_target_call_iarg_regs[1];
+    const TCGReg arg2 = tcg_target_call_iarg_regs[2];
+    const TCGReg arg3 = tcg_target_call_iarg_regs[3];
     int s_bits = opc & 3;
     uint16_t *label1_ptr;
     tcg_target_long ofs;
 
     if (TARGET_LONG_BITS == 32) {
-        tgen_ext32u(s, arg0, addr_reg);
+        tgen_ext32u(s, arg1, addr_reg);
     } else {
-        tcg_out_mov(s, TCG_TYPE_I64, arg0, addr_reg);
+        tcg_out_mov(s, TCG_TYPE_I64, arg1, addr_reg);
     }
 
-    tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, TCG_REG_NONE,
+    tcg_out_sh64(s, RSY_SRLG, arg2, addr_reg, TCG_REG_NONE,
                  TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
 
-    tgen_andi(s, TCG_TYPE_I64, arg0, TARGET_PAGE_MASK | ((1 << s_bits) - 1));
-    tgen_andi(s, TCG_TYPE_I64, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
+    tgen_andi(s, TCG_TYPE_I64, arg1, TARGET_PAGE_MASK | ((1 << s_bits) - 1));
+    tgen_andi(s, TCG_TYPE_I64, arg2, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
 
     if (is_store) {
         ofs = offsetof(CPUArchState, tlb_table[mem_index][0].addr_write);
@@ -1466,15 +1469,15 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
     assert(ofs < 0x80000);
 
     if (TARGET_LONG_BITS == 32) {
-        tcg_out_mem(s, RX_C, RXY_CY, arg0, arg1, TCG_AREG0, ofs);
+        tcg_out_mem(s, RX_C, RXY_CY, arg1, arg2, TCG_AREG0, ofs);
     } else {
-        tcg_out_mem(s, 0, RXY_CG, arg0, arg1, TCG_AREG0, ofs);
+        tcg_out_mem(s, 0, RXY_CG, arg1, arg2, TCG_AREG0, ofs);
     }
 
     if (TARGET_LONG_BITS == 32) {
-        tgen_ext32u(s, arg0, addr_reg);
+        tgen_ext32u(s, arg1, addr_reg);
     } else {
-        tcg_out_mov(s, TCG_TYPE_I64, arg0, addr_reg);
+        tcg_out_mov(s, TCG_TYPE_I64, arg1, addr_reg);
     }
 
     label1_ptr = (uint16_t*)s->code_ptr;
@@ -1488,56 +1491,42 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
            for the calling convention.  */
         switch (opc) {
         case LD_UINT8:
-            tgen_ext8u(s, TCG_TYPE_I64, arg1, data_reg);
+            tgen_ext8u(s, TCG_TYPE_I64, arg2, data_reg);
             break;
         case LD_UINT16:
-            tgen_ext16u(s, TCG_TYPE_I64, arg1, data_reg);
+            tgen_ext16u(s, TCG_TYPE_I64, arg2, data_reg);
             break;
         case LD_UINT32:
-            tgen_ext32u(s, arg1, data_reg);
+            tgen_ext32u(s, arg2, data_reg);
             break;
         case LD_UINT64:
-            tcg_out_mov(s, TCG_TYPE_I64, arg1, data_reg);
+            tcg_out_mov(s, TCG_TYPE_I64, arg2, data_reg);
             break;
         default:
             tcg_abort();
         }
-        tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R4, mem_index);
-        /* XXX/FIXME: suboptimal */
-        tcg_out_mov(s, TCG_TYPE_I64, tcg_target_call_iarg_regs[3],
-                    tcg_target_call_iarg_regs[2]);
-        tcg_out_mov(s, TCG_TYPE_I64, tcg_target_call_iarg_regs[2],
-                    tcg_target_call_iarg_regs[1]);
-        tcg_out_mov(s, TCG_TYPE_I64, tcg_target_call_iarg_regs[1],
-                    tcg_target_call_iarg_regs[0]);
-        tcg_out_mov(s, TCG_TYPE_I64, tcg_target_call_iarg_regs[0],
-                    TCG_AREG0);
+        tcg_out_movi(s, TCG_TYPE_I32, arg3, mem_index);
+        tcg_out_mov(s, TCG_TYPE_I64, arg0, TCG_AREG0);
         tgen_calli(s, (tcg_target_ulong)qemu_st_helpers[s_bits]);
     } else {
-        tcg_out_movi(s, TCG_TYPE_I32, arg1, mem_index);
-        /* XXX/FIXME: suboptimal */
-        tcg_out_mov(s, TCG_TYPE_I64, tcg_target_call_iarg_regs[2],
-                    tcg_target_call_iarg_regs[1]);
-        tcg_out_mov(s, TCG_TYPE_I64, tcg_target_call_iarg_regs[1],
-                    tcg_target_call_iarg_regs[0]);
-        tcg_out_mov(s, TCG_TYPE_I64, tcg_target_call_iarg_regs[0],
-                    TCG_AREG0);
+        tcg_out_movi(s, TCG_TYPE_I32, arg2, mem_index);
+        tcg_out_mov(s, TCG_TYPE_I64, arg0, TCG_AREG0);
         tgen_calli(s, (tcg_target_ulong)qemu_ld_helpers[s_bits]);
 
         /* sign extension */
         switch (opc) {
         case LD_INT8:
-            tgen_ext8s(s, TCG_TYPE_I64, data_reg, arg0);
+            tgen_ext8s(s, TCG_TYPE_I64, data_reg, TCG_REG_R2);
             break;
         case LD_INT16:
-            tgen_ext16s(s, TCG_TYPE_I64, data_reg, arg0);
+            tgen_ext16s(s, TCG_TYPE_I64, data_reg, TCG_REG_R2);
             break;
         case LD_INT32:
-            tgen_ext32s(s, data_reg, arg0);
+            tgen_ext32s(s, data_reg, TCG_REG_R2);
             break;
         default:
             /* unsigned -> just copy */
-            tcg_out_mov(s, TCG_TYPE_I64, data_reg, arg0);
+            tcg_out_mov(s, TCG_TYPE_I64, data_reg, TCG_REG_R2);
             break;
         }
     }
@@ -1554,7 +1543,9 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
     ofs = offsetof(CPUArchState, tlb_table[mem_index][0].addend);
     assert(ofs < 0x80000);
 
-    tcg_out_mem(s, 0, RXY_AG, arg0, arg1, TCG_AREG0, ofs);
+    tcg_out_mem(s, 0, RXY_AG, arg1, arg2, TCG_AREG0, ofs);
+
+    return arg1;
 }
 
 static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
@@ -1600,10 +1591,10 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 #if defined(CONFIG_SOFTMMU)
     mem_index = *args;
 
-    tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
-                          opc, &label2_ptr, 0);
+    addr_reg = tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
+                                     opc, &label2_ptr, 0);
 
-    tcg_out_qemu_ld_direct(s, opc, data_reg, TCG_REG_R2, TCG_REG_NONE, 0);
+    tcg_out_qemu_ld_direct(s, opc, data_reg, addr_reg, TCG_REG_NONE, 0);
 
     tcg_finish_qemu_ldst(s, label2_ptr);
 #else
@@ -1629,10 +1620,10 @@ static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
 #if defined(CONFIG_SOFTMMU)
     mem_index = *args;
 
-    tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
-                          opc, &label2_ptr, 1);
+    addr_reg = tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
+                                     opc, &label2_ptr, 1);
 
-    tcg_out_qemu_st_direct(s, opc, data_reg, TCG_REG_R2, TCG_REG_NONE, 0);
+    tcg_out_qemu_st_direct(s, opc, data_reg, addr_reg, TCG_REG_NONE, 0);
 
     tcg_finish_qemu_ldst(s, label2_ptr);
 #else
-- 
1.8.1.4

* [Qemu-devel] [PATCH 11/12] tcg-s390: Use load-address for addition
  2013-03-27 18:52 [Qemu-devel] [PATCH 00/12] tcg-s390 updates Richard Henderson
                   ` (9 preceding siblings ...)
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 10/12] tcg-s390: Cleanup argument shuffling fixme in softmmu code Richard Henderson
@ 2013-03-27 18:52 ` Richard Henderson
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 12/12] tcg-s390: Use all 20 bits of the offset in tcg_out_mem Richard Henderson
  11 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

Since we're always in 64-bit mode, load address performs a full
64-bit add.  Use that for 3-address addition, as well as for
larger constant addends when we lack the extended-immediates facility.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c | 113 +++++++++++++++++++++++++-------------------------
 1 file changed, 56 insertions(+), 57 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 43a0de8..22927df 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -35,8 +35,6 @@
 #define USE_LONG_BRANCHES 0
 
 #define TCG_CT_CONST_32    0x0100
-#define TCG_CT_CONST_NEG   0x0200
-#define TCG_CT_CONST_ADDI  0x0400
 #define TCG_CT_CONST_MULI  0x0800
 #define TCG_CT_CONST_ORI   0x2000
 #define TCG_CT_CONST_XORI  0x4000
@@ -90,6 +88,7 @@ typedef enum S390Opcode {
     RIL_OIHF    = 0xc00c,
     RIL_OILF    = 0xc00d,
     RIL_SLFI    = 0xc205,
+    RIL_SLGFI   = 0xc204,
     RIL_XIHF    = 0xc006,
     RIL_XILF    = 0xc007,
 
@@ -191,6 +190,7 @@ typedef enum S390Opcode {
     RXY_AY      = 0xe35a,
     RXY_CG      = 0xe320,
     RXY_CY      = 0xe359,
+    RXY_LAY     = 0xe371,
     RXY_LB      = 0xe376,
     RXY_LG      = 0xe304,
     RXY_LGB     = 0xe377,
@@ -217,6 +217,7 @@ typedef enum S390Opcode {
     RX_A        = 0x5a,
     RX_C        = 0x59,
     RX_L        = 0x58,
+    RX_LA       = 0x41,
     RX_LH       = 0x48,
     RX_ST       = 0x50,
     RX_STC      = 0x42,
@@ -405,15 +406,9 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         tcg_regset_clear(ct->u.regs);
         tcg_regset_set_reg(ct->u.regs, TCG_REG_R3);
         break;
-    case 'N':                  /* force immediate negate */
-        ct->ct |= TCG_CT_CONST_NEG;
-        break;
     case 'W':                  /* force 32-bit ("word") immediate */
         ct->ct |= TCG_CT_CONST_32;
         break;
-    case 'I':
-        ct->ct |= TCG_CT_CONST_ADDI;
-        break;
     case 'K':
         ct->ct |= TCG_CT_CONST_MULI;
         break;
@@ -529,25 +524,12 @@ static int tcg_target_const_match(tcg_target_long val,
     }
 
     /* Handle the modifiers.  */
-    if (ct & TCG_CT_CONST_NEG) {
-        val = -val;
-    }
     if (ct & TCG_CT_CONST_32) {
         val = (int32_t)val;
     }
 
     /* The following are mutually exclusive.  */
-    if (ct & TCG_CT_CONST_ADDI) {
-        /* Immediates that may be used with add.  If we have the
-           extended-immediates facility then we have ADD IMMEDIATE
-           with signed and unsigned 32-bit, otherwise we have only
-           ADD HALFWORD IMMEDIATE with a signed 16-bit.  */
-        if (facilities & FACILITY_EXT_IMM) {
-            return val == (int32_t)val || val == (uint32_t)val;
-        } else {
-            return val == (int16_t)val;
-        }
-    } else if (ct & TCG_CT_CONST_MULI) {
+    if (ct & TCG_CT_CONST_MULI) {
         /* Immediates that may be used with multiply.  If we have the
            general-instruction-extensions, then we have MULTIPLY SINGLE
            IMMEDIATE with a signed 32-bit, otherwise we have only
@@ -927,29 +909,6 @@ static inline void tgen_ext32u(TCGContext *s, TCGReg dest, TCGReg src)
     tcg_out_insn(s, RRE, LLGFR, dest, src);
 }
 
-static inline void tgen32_addi(TCGContext *s, TCGReg dest, int32_t val)
-{
-    if (val == (int16_t)val) {
-        tcg_out_insn(s, RI, AHI, dest, val);
-    } else {
-        tcg_out_insn(s, RIL, AFI, dest, val);
-    }
-}
-
-static inline void tgen64_addi(TCGContext *s, TCGReg dest, int64_t val)
-{
-    if (val == (int16_t)val) {
-        tcg_out_insn(s, RI, AGHI, dest, val);
-    } else if (val == (int32_t)val) {
-        tcg_out_insn(s, RIL, AGFI, dest, val);
-    } else if (val == (uint32_t)val) {
-        tcg_out_insn(s, RIL, ALGFI, dest, val);
-    } else {
-        tcg_abort();
-    }
-
-}
-
 /* Accept bit patterns like these:
     0....01....1
     1....10....0
@@ -1640,6 +1599,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                 const TCGArg *args, const int *const_args)
 {
     S390Opcode op;
+    TCGArg a0, a1, a2;
 
     switch (opc) {
     case INDEX_op_exit_tb:
@@ -1715,18 +1675,33 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_add_i32:
+        a0 = args[0], a1 = args[1], a2 = (int32_t)args[2];
         if (const_args[2]) {
-            tgen32_addi(s, args[0], args[2]);
+        do_addi_32:
+            if (a0 == a1) {
+                if (a2 == (int16_t)a2) {
+                    tcg_out_insn(s, RI, AHI, a0, a2);
+                    break;
+                }
+                if (facilities & FACILITY_EXT_IMM) {
+                    tcg_out_insn(s, RIL, AFI, a0, a2);
+                    break;
+                }
+            }
+            tcg_out_mem(s, RX_LA, RXY_LAY, a0, a1, TCG_REG_NONE, a2);
+        } else if (a0 == a1) {
+            tcg_out_insn(s, RR, AR, a0, a2);
         } else {
-            tcg_out_insn(s, RR, AR, args[0], args[2]);
+            tcg_out_insn(s, RX, LA, a0, a1, a2, 0);
         }
         break;
     case INDEX_op_sub_i32:
+        a0 = args[0], a1 = args[1], a2 = (int32_t)args[2];
         if (const_args[2]) {
-            tgen32_addi(s, args[0], -args[2]);
-        } else {
-            tcg_out_insn(s, RR, SR, args[0], args[2]);
+            a2 = -a2;
+            goto do_addi_32;
         }
+        tcg_out_insn(s, RR, SR, args[0], args[2]);
         break;
 
     case INDEX_op_and_i32:
@@ -1920,15 +1895,39 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_add_i64:
+        a0 = args[0], a1 = args[1], a2 = args[2];
         if (const_args[2]) {
-            tgen64_addi(s, args[0], args[2]);
+        do_addi_64:
+            if (a0 == a1) {
+                if (a2 == (int16_t)a2) {
+                    tcg_out_insn(s, RI, AGHI, a0, a2);
+                    break;
+                }
+                if (facilities & FACILITY_EXT_IMM) {
+                    if (a2 == (int32_t)a2) {
+                        tcg_out_insn(s, RIL, AGFI, a0, a2);
+                        break;
+                    } else if (a2 == (uint32_t)a2) {
+                        tcg_out_insn(s, RIL, ALGFI, a0, a2);
+                        break;
+                    } else if (-a2 == (uint32_t)-a2) {
+                        tcg_out_insn(s, RIL, SLGFI, a0, -a2);
+                        break;
+                    }
+                }
+            }
+            tcg_out_mem(s, RX_LA, RXY_LAY, a0, a1, TCG_REG_NONE, a2);
+        } else if (a0 == a1) {
+            tcg_out_insn(s, RRE, AGR, a0, a2);
         } else {
-            tcg_out_insn(s, RRE, AGR, args[0], args[2]);
+            tcg_out_insn(s, RX, LA, a0, a1, a2, 0);
         }
         break;
     case INDEX_op_sub_i64:
+        a0 = args[0], a1 = args[1], a2 = args[2];
         if (const_args[2]) {
-            tgen64_addi(s, args[0], -args[2]);
+            a2 = -a2;
+            goto do_addi_64;
         } else {
             tcg_out_insn(s, RRE, SGR, args[0], args[2]);
         }
@@ -2103,8 +2102,8 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_st16_i32, { "r", "r" } },
     { INDEX_op_st_i32, { "r", "r" } },
 
-    { INDEX_op_add_i32, { "r", "0", "rWI" } },
-    { INDEX_op_sub_i32, { "r", "0", "rWNI" } },
+    { INDEX_op_add_i32, { "r", "r", "ri" } },
+    { INDEX_op_sub_i32, { "r", "0", "ri" } },
     { INDEX_op_mul_i32, { "r", "0", "rK" } },
 
     { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
@@ -2167,8 +2166,8 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_st32_i64, { "r", "r" } },
     { INDEX_op_st_i64, { "r", "r" } },
 
-    { INDEX_op_add_i64, { "r", "0", "rI" } },
-    { INDEX_op_sub_i64, { "r", "0", "rNI" } },
+    { INDEX_op_add_i64, { "r", "r", "ri" } },
+    { INDEX_op_sub_i64, { "r", "0", "ri" } },
     { INDEX_op_mul_i64, { "r", "0", "rK" } },
 
     { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
-- 
1.8.1.4
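
Editorial note on the 64-bit add-immediate selection above: a minimal
sketch of the decision order in the do_addi_64 path, assuming a
standalone classifier (AddChoice, pick_addi64 and self_check are
hypothetical names). It only reports which form would be chosen; it
does not emit code, and it lumps LA, LAY and any constant load done
inside tcg_out_mem into a single "load address" outcome.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    ADD_AGHI,           /* signed 16-bit immediate */
    ADD_AGFI,           /* signed 32-bit immediate */
    ADD_ALGFI,          /* unsigned 32-bit immediate */
    ADD_SLGFI,          /* subtract unsigned 32-bit immediate */
    ADD_LOAD_ADDRESS    /* LA/LAY via tcg_out_mem */
} AddChoice;

static AddChoice pick_addi64(bool same_reg, bool ext_imm, int64_t imm)
{
    /* The two-address forms are only usable when dest == src.  */
    if (same_reg) {
        if (imm == (int16_t)imm) {
            return ADD_AGHI;
        }
        if (ext_imm) {
            /* Negate as unsigned to avoid overflow on INT64_MIN.  */
            uint64_t neg = 0 - (uint64_t)imm;
            if (imm == (int32_t)imm) {
                return ADD_AGFI;
            }
            if (imm == (uint32_t)imm) {
                return ADD_ALGFI;
            }
            if (neg == (uint32_t)neg) {
                return ADD_SLGFI;
            }
        }
    }
    /* Three-address add, or no suitable immediate form.  */
    return ADD_LOAD_ADDRESS;
}

static void self_check(void)
{
    assert(pick_addi64(true, true, 100) == ADD_AGHI);
    assert(pick_addi64(true, true, 0x12345) == ADD_AGFI);
    assert(pick_addi64(true, true, 0xffffffffLL) == ADD_ALGFI);
    assert(pick_addi64(true, true, -0xffffffffLL) == ADD_SLGFI);
    assert(pick_addi64(false, true, 100) == ADD_LOAD_ADDRESS);
    assert(pick_addi64(true, false, 0x12345) == ADD_LOAD_ADDRESS);
}
```

The last two checks correspond to the two cases named in the commit
message: a three-address add, and a larger addend without the
extended-immediates facility.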

* [Qemu-devel] [PATCH 12/12] tcg-s390: Use all 20 bits of the offset in tcg_out_mem
  2013-03-27 18:52 [Qemu-devel] [PATCH 00/12] tcg-s390 updates Richard Henderson
                   ` (10 preceding siblings ...)
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 11/12] tcg-s390: Use load-address for addition Richard Henderson
@ 2013-03-27 18:52 ` Richard Henderson
  11 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 18:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf

This can save one insn if the constant has any of bits 32-63 set
but none of bits 20-31 set.  It never results in more insns.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 22927df..8e660b3 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -748,10 +748,11 @@ static void tcg_out_mem(TCGContext *s, S390Opcode opc_rx, S390Opcode opc_rxy,
                         tcg_target_long ofs)
 {
     if (ofs < -0x80000 || ofs >= 0x80000) {
-        /* Combine the low 16 bits of the offset with the actual load insn;
-           the high 48 bits must come from an immediate load.  */
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, ofs & ~0xffff);
-        ofs &= 0xffff;
+        /* Combine the low 20 bits of the offset with the actual load insn;
+           the high 44 bits must come from an immediate load.  */
+        tcg_target_long low = ((ofs & 0xfffff) ^ 0x80000) - 0x80000;
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, ofs - low);
+        ofs = low;
 
         /* If we were already given an index register, add it in.  */
         if (index != TCG_REG_NONE) {
-- 
1.8.1.4
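
Editorial note on the offset split above: a minimal sketch, assuming
a standalone helper (split_disp20 is a hypothetical name). It shows
how the xor/subtract pair sign-extends the low 20 bits, so that the
remainder loaded into the temporary has its low 20 bits clear and
the two parts add back up to the original offset.

```c
#include <assert.h>
#include <stdint.h>

/* Split ofs into a signed 20-bit displacement (*low) and a
   remainder (*high) such that high + low == ofs.  */
static void split_disp20(int64_t ofs, int64_t *high, int64_t *low)
{
    /* Keep the low 20 bits, then sign-extend from bit 19: the xor
       flips the sign bit and the subtraction turns it into a
       borrow.  */
    *low = ((ofs & 0xfffff) ^ 0x80000) - 0x80000;

    /* Subtract as unsigned so extreme offsets cannot overflow.  */
    *high = (int64_t)((uint64_t)ofs - (uint64_t)*low);

    assert(*low >= -0x80000 && *low < 0x80000);
    assert((*high & 0xfffff) == 0);
}
```

For ofs = 0x100070000 this yields low = 0x70000 and
high = 0x100000000, whereas the previous 16-bit split would leave
0x100070000 to be loaded as the immediate. For ofs = 0x1000f0000 the
low part is negative (-0x10000) and the remainder becomes
0x100100000.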

* Re: [Qemu-devel] [PATCH 09/12] tcg-s390: Use risbgz for andi
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 09/12] tcg-s390: Use risbgz for andi Richard Henderson
@ 2013-03-27 19:03   ` Paolo Bonzini
  2013-03-27 19:27     ` Richard Henderson
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-03-27 19:03 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On 27/03/2013 19:52, Richard Henderson wrote:
> This is immediately usable by the tlb lookup code.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 53 insertions(+), 5 deletions(-)
> 
> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index 203cbb5..2bab245 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -827,6 +827,15 @@ static void tcg_out_ld_abs(TCGContext *s, TCGType type, TCGReg dest, void *abs)
>      tcg_out_ld(s, type, dest, dest, addr & 0xffff);
>  }
>  
> +static inline void tcg_out_risbg(TCGContext *s, TCGReg dest, TCGReg src,
> +                                 int msb, int lsb, int ofs, int z)
> +{
> +    /* Format RIE-f */
> +    tcg_out16(s, (RIE_RISBG & 0xff00) | (dest << 4) | src);
> +    tcg_out16(s, (msb << 8) | (z << 7) | lsb);
> +    tcg_out16(s, (ofs << 8) | (RIE_RISBG & 0xff));
> +}
> +
>  static void tgen_ext8s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
>  {
>      if (facilities & FACILITY_EXT_IMM) {
> @@ -940,6 +949,36 @@ static inline void tgen64_addi(TCGContext *s, TCGReg dest, int64_t val)
>  
>  }
>  
> +/* Accept bit patterns like these:
> +    0....01....1
> +    1....10....0
> +    1..10..01..1
> +    0..01..10..0
> +   Copied from gcc sources.  */
> +static inline bool risbg_mask(uint64_t c)
> +{
> +    uint64_t lsb;
> +    /* We don't change the number of transitions by inverting,
> +       so make sure we start with the LSB zero.  */
> +    if (c & 1) {
> +        c = ~c;
> +    }
> +    /* Reject all zeros or all ones.  */
> +    if (c == 0) {
> +        return false;
> +    }
> +    /* Find the first transition.  */
> +    lsb = c & -c;
> +    /* Invert to look for a second transition.  */
> +    c = ~c;
> +    /* Erase the first transition.  */
> +    c &= -lsb;
> +    /* Find the second transition, if any.  */
> +    lsb = c & -c;
> +    /* Match if all the bits are 1's, or if c is zero.  */
> +    return c == -lsb;
> +}
> +
>  static void tgen_andi(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
>  {
>      static const S390Opcode ni_insns[4] = {
> @@ -986,6 +1025,19 @@ static void tgen_andi(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
>              }
>          }
>      }
> +    if ((facilities & FACILITY_GEN_INST_EXT) && risbg_mask(val)) {
> +        int msb, lsb;
> +        if ((val & 0x8000000000000001ull) == 0x8000000000000001ull) {
> +            /* Achieve wraparound by swapping msb and lsb.  */
> +            msb = 64 - ctz64(~val);
> +            lsb = clz64(~val) - 1;
> +        } else {
> +            msb = clz64(val);
> +            lsb = 63 - ctz64(val);
> +        }
> +        tcg_out_risbg(s, dest, dest, msb, lsb, 0, 1);
> +        return;
> +    }
>  
>      /* Fall back to loading the constant.  */
>      tcg_out_movi(s, type, TCG_TMP0, val);
> @@ -1142,11 +1194,7 @@ static void tgen_deposit(TCGContext *s, TCGReg dest, TCGReg src,
>  {
>      int lsb = (63 - ofs);
>      int msb = lsb - (len - 1);
> -
> -    /* Format RIE-f */
> -    tcg_out16(s, (RIE_RISBG & 0xff00) | (dest << 4) | src);
> -    tcg_out16(s, (msb << 8) | lsb);
> -    tcg_out16(s, (ofs << 8) | (RIE_RISBG & 0xff));
> +    tcg_out_risbg(s, dest, src, msb, lsb, ofs, 0);
>  }
>  
>  static void tgen_gotoi(TCGContext *s, int cc, tcg_target_long dest)
> 

I wonder if PPC can use this too.

Paolo
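
The mask test and the msb/lsb selection in the hunk above can be tried
outside QEMU.  What follows is a minimal standalone sketch, not the patch
itself: clz64() and ctz64() are simple loop-based stand-ins for QEMU's
helpers, risbg_fields() is a hypothetical name, and the wraparound fields
are worked out from the z/Architecture convention that bit 0 is the most
significant bit, so they are worth checking against the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Loop-based stand-ins for QEMU's helpers; v must be nonzero.  */
static int ctz64(uint64_t v)
{
    int n = 0;
    while (!(v & 1)) { v >>= 1; n++; }
    return n;
}

static int clz64(uint64_t v)
{
    int n = 0;
    while (!(v >> 63)) { v <<= 1; n++; }
    return n;
}

/* True if c is one contiguous run of 1 bits, possibly wrapping around
   from bit 63 to bit 0.  Same steps as risbg_mask() in the patch.  */
static bool risbg_mask(uint64_t c)
{
    uint64_t lsb;

    if (c & 1) {
        c = ~c;          /* inverting keeps the number of transitions */
    }
    if (c == 0) {
        return false;    /* all zeros or all ones */
    }
    lsb = c & -c;        /* first transition */
    c = ~c;              /* invert to look for a second one */
    c &= -lsb;           /* erase the first transition */
    lsb = c & -c;        /* second transition, if any */
    return c == -lsb;    /* all 1s above it, or c is zero */
}

/* Start (msb) and end (lsb) bit of the run, in big-endian bit numbering
   where 0 is the most significant bit.  Hypothetical helper name.  */
static void risbg_fields(uint64_t val, int *msb, int *lsb)
{
    assert(risbg_mask(val));
    if ((val & 0x8000000000000001ull) == 0x8000000000000001ull) {
        /* Wrapping run: it starts just below the lowest 0 bit and
           ends just above the highest 0 bit.  */
        *msb = 64 - ctz64(~val);
        *lsb = clz64(~val) - 1;
    } else {
        *msb = clz64(val);
        *lsb = 63 - ctz64(val);
    }
}
```

With these definitions, 0xffff selects bits 48..63, and the wrapping mask
0xf00000000000000f selects bits 60..63 followed by bits 0..3.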

* Re: [Qemu-devel] [PATCH 09/12] tcg-s390: Use risbgz for andi
  2013-03-27 19:03   ` Paolo Bonzini
@ 2013-03-27 19:27     ` Richard Henderson
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2013-03-27 19:27 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, agraf

On 03/27/2013 12:03 PM, Paolo Bonzini wrote:
> I wonder if PPC can use this too.

Yep, though it's more complicated for ppc64.


r~
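
For reference, the RIE-f packing done by tcg_out_risbg() in patch 09 can be
sketched on its own.  This is only an illustration: encode_risbg() is a
hypothetical name, the three halfwords are written to an array rather than
to a TCGContext, and the opcode value 0xec55 for RISBG is an assumption
that does not appear in the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed opcode: the high byte goes first, the low byte goes last.  */
#define RIE_RISBG 0xec55

/* Pack one ROTATE THEN INSERT SELECTED BITS instruction into three
   16-bit halfwords, following the layout used in the patch.  */
static void encode_risbg(uint16_t out[3], int dest, int src,
                         int msb, int lsb, int ofs, int z)
{
    assert(dest >= 0 && dest < 16 && src >= 0 && src < 16);
    assert(msb >= 0 && msb < 64 && lsb >= 0 && lsb < 64);
    assert(ofs >= 0 && ofs < 64 && (z == 0 || z == 1));

    out[0] = (uint16_t)((RIE_RISBG & 0xff00) | (dest << 4) | src);
    /* The zero flag sits in the top bit of the end-position byte.  */
    out[1] = (uint16_t)((msb << 8) | (z << 7) | lsb);
    out[2] = (uint16_t)((ofs << 8) | (RIE_RISBG & 0xff));
}
```

Under these assumptions, an AND of r2 with 0xffff (msb 48, lsb 63, no
rotation, zero flag set, r3 as source) comes out as the halfwords 0xec23,
0x30bf, 0x0055.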

* Re: [Qemu-devel] [PATCH 03/12] tcg-s390: Remove useless preprocessor conditions
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 03/12] tcg-s390: Remove useless preprocessor conditions Richard Henderson
@ 2013-03-28  0:14   ` Aurelien Jarno
  2013-03-28  0:54     ` Richard Henderson
  0 siblings, 1 reply; 19+ messages in thread
From: Aurelien Jarno @ 2013-03-28  0:14 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Wed, Mar 27, 2013 at 11:52:24AM -0700, Richard Henderson wrote:
> We only support 64-bit code generation for s390x.
> Don't clutter the code with ifdefs that suggest otherwise.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c | 17 +++++------------
>  tcg/s390/tcg-target.h |  2 --
>  2 files changed, 5 insertions(+), 14 deletions(-)
> 
> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index d91b894..ba314b3 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -24,6 +24,11 @@
>   * THE SOFTWARE.
>   */
>  
> +/* We only support generating code for 64-bit mode.  */
> +#if TCG_TARGET_REG_BITS != 64
> +#error "unsupported code generation mode"
> +#endif
> +

I don't know when s390 support was removed, but it was not removed from
the configure script at that time. It seems better to me to do that
than to add the error message here.

>  /* ??? The translation blocks produced by TCG are generally small enough to
>     be entirely reachable with a 16-bit displacement.  Leaving the option for
>     a 32-bit displacement here Just In Case.  */
> @@ -252,9 +257,6 @@ static const int tcg_target_call_iarg_regs[] = {
>  
>  static const int tcg_target_call_oarg_regs[] = {
>      TCG_REG_R2,
> -#if TCG_TARGET_REG_BITS == 32
> -    TCG_REG_R3
> -#endif
>  };
>  
>  #define S390_CC_EQ      8
> @@ -1620,14 +1622,9 @@ static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
>  #endif
>  }
>  
> -#if TCG_TARGET_REG_BITS == 64
>  # define OP_32_64(x) \
>          case glue(glue(INDEX_op_,x),_i32): \
>          case glue(glue(INDEX_op_,x),_i64)
> -#else
> -# define OP_32_64(x) \
> -        case glue(glue(INDEX_op_,x),_i32)
> -#endif
>  
>  static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>                  const TCGArg *args, const int *const_args)
> @@ -1870,7 +1867,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>          tcg_out_qemu_st(s, args, LD_UINT64);
>          break;
>  
> -#if TCG_TARGET_REG_BITS == 64
>      case INDEX_op_mov_i64:
>          tcg_out_mov(s, TCG_TYPE_I64, args[0], args[1]);
>          break;
> @@ -2035,7 +2031,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>      case INDEX_op_qemu_ld32s:
>          tcg_out_qemu_ld(s, args, LD_INT32);
>          break;
> -#endif /* TCG_TARGET_REG_BITS == 64 */
>  
>      default:
>          fprintf(stderr,"unimplemented opc 0x%x\n",opc);
> @@ -2104,7 +2099,6 @@ static const TCGTargetOpDef s390_op_defs[] = {
>      { INDEX_op_qemu_st32, { "L", "L" } },
>      { INDEX_op_qemu_st64, { "L", "L" } },
>  
> -#if defined(__s390x__)
>      { INDEX_op_mov_i64, { "r", "r" } },
>      { INDEX_op_movi_i64, { "r" } },
>  
> @@ -2157,7 +2151,6 @@ static const TCGTargetOpDef s390_op_defs[] = {
>  
>      { INDEX_op_qemu_ld32u, { "r", "L" } },
>      { INDEX_op_qemu_ld32s, { "r", "L" } },
> -#endif
>  
>      { -1 },
>  };
> diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
> index c6d9e84..0929d55 100644
> --- a/tcg/s390/tcg-target.h
> +++ b/tcg/s390/tcg-target.h
> @@ -70,7 +70,6 @@ typedef enum TCGReg {
>  #define TCG_TARGET_HAS_mulu2_i32        0
>  #define TCG_TARGET_HAS_muls2_i32        0
>  
> -#if TCG_TARGET_REG_BITS == 64
>  #define TCG_TARGET_HAS_div2_i64         1
>  #define TCG_TARGET_HAS_rot_i64          1
>  #define TCG_TARGET_HAS_ext8s_i64        1
> @@ -95,7 +94,6 @@ typedef enum TCGReg {
>  #define TCG_TARGET_HAS_sub2_i64         0
>  #define TCG_TARGET_HAS_mulu2_i64        0
>  #define TCG_TARGET_HAS_muls2_i64        0
> -#endif
>  
>  /* used for function call generation */
>  #define TCG_REG_CALL_STACK		TCG_REG_R15


-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH 03/12] tcg-s390: Remove useless preprocessor conditions
  2013-03-28  0:14   ` Aurelien Jarno
@ 2013-03-28  0:54     ` Richard Henderson
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2013-03-28  0:54 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, agraf

On 2013-03-27 17:14, Aurelien Jarno wrote:
> > +/* We only support generating code for 64-bit mode.  */
> > +#if TCG_TARGET_REG_BITS != 64
> > +#error "unsupported code generation mode"
> > +#endif
> > +
> I don't know when the s390 support has been removed, but it has not been
> removed from the configure script at that time. It looks better to me
> doing so than adding the error message there.
>

Adjusting the configure script would be good (use the interpreter?),
but I think the error is good documentation.


r~

* Re: [Qemu-devel] [PATCH 08/12] tcg-s390: Remove constraint letters for and
  2013-03-27 18:52 ` [Qemu-devel] [PATCH 08/12] tcg-s390: Remove constraint letters for and Richard Henderson
@ 2013-03-28 15:03   ` Aurelien Jarno
  2013-03-28 15:08     ` Richard Henderson
  0 siblings, 1 reply; 19+ messages in thread
From: Aurelien Jarno @ 2013-03-28 15:03 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, agraf

On Wed, Mar 27, 2013 at 11:52:29AM -0700, Richard Henderson wrote:
> Since we have a free temporary and can always just load the constant, we
> ought to do so, rather than spending the same effort constraining the const.

Is it really a good idea to do so? If a constraint can't be satisfied,
the TCG code will also load the constant into a register, with the
difference that the register is not trashed and might be reused later
instead of reloading the constant. Of course it means one more
register available, but the S390 target doesn't really have issues
with the number of available registers.

> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/s390/tcg-target.c | 149 +++++++++++---------------------------------------
>  1 file changed, 32 insertions(+), 117 deletions(-)
> 
> diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
> index 673a568..203cbb5 100644
> --- a/tcg/s390/tcg-target.c
> +++ b/tcg/s390/tcg-target.c
> @@ -38,7 +38,6 @@
>  #define TCG_CT_CONST_NEG   0x0200
>  #define TCG_CT_CONST_ADDI  0x0400
>  #define TCG_CT_CONST_MULI  0x0800
> -#define TCG_CT_CONST_ANDI  0x1000
>  #define TCG_CT_CONST_ORI   0x2000
>  #define TCG_CT_CONST_XORI  0x4000
>  #define TCG_CT_CONST_CMPI  0x8000
> @@ -417,9 +416,6 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
>      case 'K':
>          ct->ct |= TCG_CT_CONST_MULI;
>          break;
> -    case 'A':
> -        ct->ct |= TCG_CT_CONST_ANDI;
> -        break;
>      case 'O':
>          ct->ct |= TCG_CT_CONST_ORI;
>          break;
> @@ -438,63 +434,6 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
>      return 0;
>  }
>  
> -/* Immediates to be used with logical AND.  This is an optimization only,
> -   since a full 64-bit immediate AND can always be performed with 4 sequential
> -   NI[LH][LH] instructions.  What we're looking for is immediates that we
> -   can load efficiently, and the immediate load plus the reg-reg AND is
> -   smaller than the sequential NI's.  */
> -
> -static int tcg_match_andi(int ct, tcg_target_ulong val)
> -{
> -    int i;
> -
> -    if (facilities & FACILITY_EXT_IMM) {
> -        if (ct & TCG_CT_CONST_32) {
> -            /* All 32-bit ANDs can be performed with 1 48-bit insn.  */
> -            return 1;
> -        }
> -
> -        /* Zero-extensions.  */
> -        if (val == 0xff || val == 0xffff || val == 0xffffffff) {
> -            return 1;
> -        }
> -    } else {
> -        if (ct & TCG_CT_CONST_32) {
> -            val = (uint32_t)val;
> -        } else if (val == 0xffffffff) {
> -            return 1;
> -        }
> -    }
> -
> -    /* Try all 32-bit insns that can perform it in one go.  */
> -    for (i = 0; i < 4; i++) {
> -        tcg_target_ulong mask = ~(0xffffull << i*16);
> -        if ((val & mask) == mask) {
> -            return 1;
> -        }
> -    }
> -
> -    /* Look for 16-bit values performing the mask.  These are better
> -       to load with LLI[LH][LH].  */
> -    for (i = 0; i < 4; i++) {
> -        tcg_target_ulong mask = 0xffffull << i*16;
> -        if ((val & mask) == val) {
> -            return 0;
> -        }
> -    }
> -
> -    /* Look for 32-bit values performing the 64-bit mask.  These
> -       are better to load with LLI[LH]F, or if extended immediates
> -       not available, with a pair of LLI insns.  */
> -    if ((ct & TCG_CT_CONST_32) == 0) {
> -        if (val <= 0xffffffff || (val & 0xffffffff) == 0) {
> -            return 0;
> -        }
> -    }
> -
> -    return 1;
> -}
> -
>  /* Immediates to be used with logical OR.  This is an optimization only,
>     since a full 64-bit immediate OR can always be performed with 4 sequential
>     OI[LH][LH] instructions.  What we're looking for is immediates that we
> @@ -617,8 +556,6 @@ static int tcg_target_const_match(tcg_target_long val,
>          } else {
>              return val == (int16_t)val;
>          }
> -    } else if (ct & TCG_CT_CONST_ANDI) {
> -        return tcg_match_andi(ct, val);
>      } else if (ct & TCG_CT_CONST_ORI) {
>          return tcg_match_ori(ct, val);
>      } else if (ct & TCG_CT_CONST_XORI) {
> @@ -1003,7 +940,7 @@ static inline void tgen64_addi(TCGContext *s, TCGReg dest, int64_t val)
>  
>  }
>  
> -static void tgen64_andi(TCGContext *s, TCGReg dest, tcg_target_ulong val)
> +static void tgen_andi(TCGContext *s, TCGType type, TCGReg dest, uint64_t val)
>  {
>      static const S390Opcode ni_insns[4] = {
>          RI_NILL, RI_NILH, RI_NIHL, RI_NIHH
> @@ -1011,63 +948,51 @@ static void tgen64_andi(TCGContext *s, TCGReg dest, tcg_target_ulong val)
>      static const S390Opcode nif_insns[2] = {
>          RIL_NILF, RIL_NIHF
>      };
> -
> +    uint64_t valid = (type == TCG_TYPE_I32 ? 0xffffffffull : -1ull);
>      int i;
>  
> -    /* Look for no-op.  */
> -    if (val == -1) {
> -        return;
> -    }
> -
>      /* Look for the zero-extensions.  */
> -    if (val == 0xffffffff) {
> +    if ((val & valid) == 0xffffffff) {
>          tgen_ext32u(s, dest, dest);
>          return;
>      }
> -
>      if (facilities & FACILITY_EXT_IMM) {
> -        if (val == 0xff) {
> +        if ((val & valid) == 0xff) {
>              tgen_ext8u(s, TCG_TYPE_I64, dest, dest);
>              return;
>          }
> -        if (val == 0xffff) {
> +        if ((val & valid) == 0xffff) {
>              tgen_ext16u(s, TCG_TYPE_I64, dest, dest);
>              return;
>          }
> +    }
>  
> -        /* Try all 32-bit insns that can perform it in one go.  */
> -        for (i = 0; i < 4; i++) {
> -            tcg_target_ulong mask = ~(0xffffull << i*16);
> -            if ((val & mask) == mask) {
> -                tcg_out_insn_RI(s, ni_insns[i], dest, val >> i*16);
> -                return;
> -            }
> +    /* Try all 32-bit insns that can perform it in one go.  */
> +    for (i = 0; i < 4; i++) {
> +        tcg_target_ulong mask = ~(0xffffull << i*16);
> +        if (((val | ~valid) & mask) == mask) {
> +            tcg_out_insn_RI(s, ni_insns[i], dest, val >> i*16);
> +            return;
>          }
> +    }
>  
> -        /* Try all 48-bit insns that can perform it in one go.  */
> -        if (facilities & FACILITY_EXT_IMM) {
> -            for (i = 0; i < 2; i++) {
> -                tcg_target_ulong mask = ~(0xffffffffull << i*32);
> -                if ((val & mask) == mask) {
> -                    tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
> -                    return;
> -                }
> +    /* Try all 48-bit insns that can perform it in one go.  */
> +    if (facilities & FACILITY_EXT_IMM) {
> +        for (i = 0; i < 2; i++) {
> +            tcg_target_ulong mask = ~(0xffffffffull << i*32);
> +            if (((val | ~valid) & mask) == mask) {
> +                tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
> +                return;
>              }
>          }
> +    }
>  
> -        /* Perform the AND via sequential modifications to the high and low
> -           parts.  Do this via recursion to handle 16-bit vs 32-bit masks in
> -           each half.  */
> -        tgen64_andi(s, dest, val | 0xffffffff00000000ull);
> -        tgen64_andi(s, dest, val | 0x00000000ffffffffull);
> +    /* Fall back to loading the constant.  */
> +    tcg_out_movi(s, type, TCG_TMP0, val);
> +    if (type == TCG_TYPE_I32) {
> +        tcg_out_insn(s, RR, NR, dest, TCG_TMP0);
>      } else {
> -        /* With no extended-immediate facility, just emit the sequence.  */
> -        for (i = 0; i < 4; i++) {
> -            tcg_target_ulong mask = 0xffffull << i*16;
> -            if ((val & mask) != mask) {
> -                tcg_out_insn_RI(s, ni_insns[i], dest, val >> i*16);
> -            }
> -        }
> +        tcg_out_insn(s, RRE, NGR, dest, TCG_TMP0);
>      }
>  }
>  
> @@ -1463,16 +1388,6 @@ static void tcg_out_qemu_st_direct(TCGContext *s, int opc, TCGReg data,
>  }
>  
>  #if defined(CONFIG_SOFTMMU)
> -static void tgen64_andi_tmp(TCGContext *s, TCGReg dest, tcg_target_ulong val)
> -{
> -    if (tcg_match_andi(0, val)) {
> -        tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, val);
> -        tcg_out_insn(s, RRE, NGR, dest, TCG_TMP0);
> -    } else {
> -        tgen64_andi(s, dest, val);
> -    }
> -}
> -
>  static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
>                                    TCGReg addr_reg, int mem_index, int opc,
>                                    uint16_t **label2_ptr_p, int is_store)
> @@ -1492,8 +1407,8 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
>      tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, TCG_REG_NONE,
>                   TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
>  
> -    tgen64_andi_tmp(s, arg0, TARGET_PAGE_MASK | ((1 << s_bits) - 1));
> -    tgen64_andi_tmp(s, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
> +    tgen_andi(s, TCG_TYPE_I64, arg0, TARGET_PAGE_MASK | ((1 << s_bits) - 1));
> +    tgen_andi(s, TCG_TYPE_I64, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
>  
>      if (is_store) {
>          ofs = offsetof(CPUArchState, tlb_table[mem_index][0].addr_write);
> @@ -1777,7 +1692,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>  
>      case INDEX_op_and_i32:
>          if (const_args[2]) {
> -            tgen64_andi(s, args[0], args[2] | 0xffffffff00000000ull);
> +            tgen_andi(s, TCG_TYPE_I32, args[0], args[2]);
>          } else {
>              tcg_out_insn(s, RR, NR, args[0], args[2]);
>          }
> @@ -1982,7 +1897,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
>  
>      case INDEX_op_and_i64:
>          if (const_args[2]) {
> -            tgen64_andi(s, args[0], args[2]);
> +            tgen_andi(s, TCG_TYPE_I64, args[0], args[2]);
>          } else {
>              tcg_out_insn(s, RRE, NGR, args[0], args[2]);
>          }
> @@ -2156,7 +2071,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
>      { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
>      { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
>  
> -    { INDEX_op_and_i32, { "r", "0", "rWA" } },
> +    { INDEX_op_and_i32, { "r", "0", "ri" } },
>      { INDEX_op_or_i32, { "r", "0", "rWO" } },
>      { INDEX_op_xor_i32, { "r", "0", "rWX" } },
>  
> @@ -2221,7 +2136,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
>      { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
>      { INDEX_op_mulu2_i64, { "b", "a", "0", "r" } },
>  
> -    { INDEX_op_and_i64, { "r", "0", "rA" } },
> +    { INDEX_op_and_i64, { "r", "0", "ri" } },
>      { INDEX_op_or_i64, { "r", "0", "rO" } },
>      { INDEX_op_xor_i64, { "r", "0", "rX" } },
>  
> -- 
> 1.8.1.4
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH 08/12] tcg-s390: Remove constraint letters for and
  2013-03-28 15:03   ` Aurelien Jarno
@ 2013-03-28 15:08     ` Richard Henderson
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2013-03-28 15:08 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, agraf

On 03/28/2013 08:03 AM, Aurelien Jarno wrote:
> > Since we have a free temporary and can always just load the constant, we
> > ought to do so, rather than spending the same effort constraining the const.
> Is it really a good idea doing so? If a constraint can't be satisfied
> the TCG code will also load the constant in a register, with the
> difference that the register is not trashed and might be reused later 
> instead of reloading the constant again. Of course it means one more
> register available, but the S390 target doesn't really have issues
> with the number of available registers.
> 

My main thinking is along the lines you yourself pointed out when the code
was first written -- it's really quite hard to figure out what constants are
implementable for AND.

It gets even worse with a patch further in the series that uses ROTATE AND
INSERT SELECTED BITS.

It's complicated enough that it *seems* better to just go ahead and accept
all constants.  Even from a maintenance point of view -- we no longer have
to keep two big functions matched up.


r~
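
The "accept all constants" approach means the backend itself decides, at
emit time, whether a cheap form exists before falling back to loading the
constant into the temporary.  The first of those checks in the patch, the
search for a single NI[LH][LH] instruction, can be sketched standalone as
below.  single_ni_halfword() is a hypothetical name and returns an index
instead of emitting code; the mask arithmetic follows the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Return the index (0..3) of the one 16-bit halfword that a single
   NI[LH][LH] would have to modify to perform "AND val", or -1 if bits
   must be cleared in more than one halfword.  bits is 32 or 64; for a
   32-bit operation the upper half of val is ignored.  */
static int single_ni_halfword(uint64_t val, int bits)
{
    uint64_t valid;
    int i;

    assert(bits == 32 || bits == 64);
    valid = (bits == 32 ? 0xffffffffull : ~0ull);

    for (i = 0; i < 4; i++) {
        /* Every halfword other than i must be all ones.  */
        uint64_t mask = ~(0xffffull << (i * 16));
        if (((val | ~valid) & mask) == mask) {
            return i;
        }
    }
    return -1;
}
```

A result of -1 is the point where the real code moves on to the 48-bit
forms and, finally, to the load-constant fallback.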

Thread overview: 19+ messages
2013-03-27 18:52 [Qemu-devel] [PATCH 00/12] tcg-s390 updates Richard Henderson
2013-03-27 18:52 ` [Qemu-devel] [PATCH 01/12] tcg-s390: Fix movi Richard Henderson
2013-03-27 18:52 ` [Qemu-devel] [PATCH 02/12] tcg-s390: Properly allocate a stack frame Richard Henderson
2013-03-27 18:52 ` [Qemu-devel] [PATCH 03/12] tcg-s390: Remove useless preprocessor conditions Richard Henderson
2013-03-28  0:14   ` Aurelien Jarno
2013-03-28  0:54     ` Richard Henderson
2013-03-27 18:52 ` [Qemu-devel] [PATCH 04/12] tcg-s390: Implement add2/sub2 opcodes Richard Henderson
2013-03-27 18:52 ` [Qemu-devel] [PATCH 05/12] tcg-s390: Implement mulu2_i64 opcode Richard Henderson
2013-03-27 18:52 ` [Qemu-devel] [PATCH 06/12] tcg-s390: Implement movcond opcodes Richard Henderson
2013-03-27 18:52 ` [Qemu-devel] [PATCH 07/12] tcg-s390: Implement deposit opcodes Richard Henderson
2013-03-27 18:52 ` [Qemu-devel] [PATCH 08/12] tcg-s390: Remove constraint letters for and Richard Henderson
2013-03-28 15:03   ` Aurelien Jarno
2013-03-28 15:08     ` Richard Henderson
2013-03-27 18:52 ` [Qemu-devel] [PATCH 09/12] tcg-s390: Use risbgz for andi Richard Henderson
2013-03-27 19:03   ` Paolo Bonzini
2013-03-27 19:27     ` Richard Henderson
2013-03-27 18:52 ` [Qemu-devel] [PATCH 10/12] tcg-s390: Cleanup argument shuffling fixme in softmmu code Richard Henderson
2013-03-27 18:52 ` [Qemu-devel] [PATCH 11/12] tcg-s390: Use load-address for addition Richard Henderson
2013-03-27 18:52 ` [Qemu-devel] [PATCH 12/12] tcg-s390: Use all 20 bits of the offset in tcg_out_mem Richard Henderson
