qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH 00/62] s390x tcg target
@ 2010-05-27 20:45 Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 01/62] S390 TCG target Richard Henderson
                   ` (62 more replies)
  0 siblings, 63 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC
  To: qemu-devel; +Cc: agraf, aurelien

The following patch series is available at

  git://repo.or.cz/qemu/rth.git tcg-s390-2

It begins with Uli Hecht's original patch, posted by Alexander
sometime last year.  I then make incremental changes to

  (1) Make it compile -- first patch that compiles is tagged
      as tcg-s390-2-first-compile and is

      d142103... tcg-s390: Define tcg_target_reg_names.

  (2) Make it work -- the first patch with which i386-linux-user
      successfully completes linux-user-test-0.2 is tagged
      as tcg-s390-2-first-working and is

      3571f8d... tcg-s390: Implement setcond.

  (3) Make it work for other targets.  I don't tag this,
      but there are lots of load/store aborts and an
      incorrect division routine until

      9798371... tcg-s390: Implement div2.

  (4) Make it work well.  The balance of the patches incrementally
      add support for new instructions.  At

      7bfaa9e... tcg-s390: Query instruction extensions that are installed.

      I add support for detecting the instruction set extensions
      present in the host and then start disabling some of those
      new instructions that may not be present.
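
As a rough illustration of step (4), the sketch below shows how an
emitter might pick an encoding for a 32-bit add-immediate depending on
which facilities the host reports.  It is not code from the series:
choose_add_imm and AddStrategy are hypothetical names, and the facility
mask is assumed to hold the first doubleword of the host's facility
list with the extended-immediate facility at bit 21, counting from the
most significant bit.

```c
#include <stdint.h>

/* Assumption: bit 21 of the facility list (numbered from the most
   significant bit of the first doubleword) reports the
   extended-immediate facility.  */
#define FACILITY_EXT_IMM (1ULL << (63 - 21))

/* Hypothetical result type, used only for this illustration.  */
typedef enum {
    ADD_AHI,           /* 16-bit signed immediate, always available */
    ADD_AFI,           /* 32-bit immediate, needs extended-immediate */
    ADD_VIA_REGISTER   /* load the constant into a temporary, then add */
} AddStrategy;

/* Pick the cheapest encoding that the host is known to support.  */
static AddStrategy choose_add_imm(uint64_t facilities, int32_t imm)
{
    if (imm >= -0x8000 && imm < 0x8000) {
        return ADD_AHI;
    }
    if (facilities & FACILITY_EXT_IMM) {
        return ADD_AFI;
    }
    return ADD_VIA_REGISTER;
}
```

With an empty facility mask, a constant such as 100000 falls through to
ADD_VIA_REGISTER, which is the kind of fallback the "Conditionalize"
patches have to provide.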

Once things started working, each step was tested with an --enable-debug
compile, by running the linux-user-test suite, and by booting the
{arm,coldfire,sparc}-linux test kernels and freedos.

Unfortunately, each step was only built without optimization, and it
was only at the end that we discovered that TCG was not properly honoring
the host ABI.  This is solved by the last patch, which adds proper sign
extension for the 32-bit function arguments.  With the final patch
everything works for an optimized build as well.
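
To make the ABI problem concrete, here is a minimal model of it in
plain C.  It is only a sketch under stated assumptions, not code from
the series: the three helper names are hypothetical, and a uint64_t
stands in for a 64-bit argument register.  The callee is assumed to
trust the caller and read the whole register, which is what an
optimized build is free to do.

```c
#include <stdint.h>

/* Caller that only writes the low 32 bits of the argument register;
   the upper half keeps whatever value was there before.  */
static uint64_t pass_unextended(int32_t v, uint64_t stale)
{
    return (stale & 0xffffffff00000000ULL) | (uint32_t)v;
}

/* Caller that sign-extends the 32-bit value to the full register.  */
static uint64_t pass_sign_extended(int32_t v)
{
    return (uint64_t)(int64_t)v;
}

/* Callee that relies on the caller having extended the argument and
   therefore uses all 64 bits of the register.  */
static int64_t callee_reads_full_register(uint64_t reg)
{
    return (int64_t)reg;
}
```

Passing -1 through pass_unextended with a non-zero stale upper half
gives the callee a value other than -1, whereas pass_sign_extended
always preserves it.  An unoptimized callee tends to truncate its
arguments again before use, which would hide the difference.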

The current state is that the TCG compiler works for an s390x host,
that is, with a 64-bit userland binary.  It will *compile* for a
32-bit userland binary, but that facility is only retained for the
purpose of running the s390 kvm guest.  If kvm is not used, the
32-bit binary will exit with an error message.

Given that this is the beginning of proper support for s390, I don't
know whether bisectability is really an issue.  I suppose we could
fairly easily re-base the patches that touch files outside tcg/s390/
and then squash the rest, but I suspect the history may be useful.



r~



Alexander Graf (2):
  S390 TCG target
  add lost chunks from the original patch

Richard Henderson (60):
  tcg-s390: Only validate CPUTLBEntry for system mode.
  tcg-s390: Fix tcg_prepare_qemu_ldst for user mode.
  tcg-s390: Move opcode defines to tcg-target.c.
  s390x: Avoid _llseek.
  s390x: Don't use a linker script for user-only.
  tcg-s390: Avoid set-but-not-used werrors.
  tcg-s390: Mark R0 & R15 reserved.
  tcg-s390: R6 is a function argument register
  tcg-s390: Move tcg_out_mov up and use it throughout.
  tcg-s390: Eliminate the S constraint.
  tcg-s390: Add -m64 and -march to s390x compilation.
  tcg-s390: Define tcg_target_reg_names.
  tcg-s390: Update disassembler from binutils head.
  tcg-s390: Compute is_write in cpu_signal_handler.
  tcg-s390: Reorganize instruction emission
  tcg-s390: Use matching constraints.
  tcg-s390: Fixup qemu_ld/st opcodes.
  tcg-s390: Implement setcond.
  tcg-s390: Generalize the direct load/store emission.
  tcg-s390: Tidy branches.
  tcg-s390: Add tgen_calli.
  tcg-s390: Implement div2.
  tcg-s390: Re-implement tcg_out_movi.
  tcg-s390: Implement sign and zero-extension operations.
  tcg-s390: Implement bswap operations.
  tcg-s390: Implement rotates.
  tcg-s390: Use LOAD COMPLIMENT for negate.
  tcg-s390: Tidy unimplemented opcodes.
  tcg-s390: Use the extended-immediate facility for add/sub.
  tcg-s390: Implement immediate ANDs.
  tcg-s390: Implement immediate ORs.
  tcg-s390: Implement immediate MULs.
  tcg-s390: Implement immediate XORs.
  tcg-s390: Icache flush is a no-op.
  tcg-s390: Define TCG_TMP0.
  tcg-s390: Tidy regset initialization; use R14 as temporary.
  tcg-s390: Rearrange register allocation order.
  tcg-s390: Tidy goto_tb.
  tcg-s390: Allocate the code_gen_buffer near the main program.
  tcg-s390: Rearrange qemu_ld/st to avoid register copy.
  tcg-s390: Tidy tcg_prepare_qemu_ldst.
  tcg-s390: Tidy user qemu_ld/st.
  tcg-s390: Implement GUEST_BASE.
  tcg-s390: Query instruction extensions that are installed.
  tcg-s390: Conditionalize general-instruction-extension insns.
  tcg-s390: Conditionalize ADD IMMEDIATE instructions.
  tcg-s390: Conditionalize LOAD IMMEDIATE instructions.
  tcg-s390: Conditionalize 8 and 16 bit extensions.
  tcg-s390: Conditionalize AND IMMEDIATE instructions.
  tcg-s390: Conditionalize OR IMMEDIATE instructions.
  tcg-s390: Conditionalize XOR IMMEDIATE instructions.
  tcg-s390: Do not require the extended-immediate facility.
  tcg-s390: Use 16-bit branches for forward jumps.
  tcg-s390: Use the LOAD AND TEST instruction for compares.
  tcg-s390: Use the COMPARE IMMEDIATE instrucions for compares.
  tcg-s390: Use COMPARE AND BRANCH instructions.
  tcg-s390: Generalize load/store support.
  tcg-s390: Fix TLB comparison width.
  tcg-s390: Enable compile in 32-bit mode.
  tcg: Optionally sign-extend 32-bit arguments for 64-bit host.

 configure                    |   12 +-
 cpu-exec.c                   |   42 +-
 def-helper.h                 |   38 +-
 exec.c                       |    7 +
 linux-user/syscall.c         |    4 +-
 s390-dis.c                   |  818 +++++++++++++---
 target-i386/ops_sse_header.h |    3 +
 target-ppc/helper.h          |    1 +
 tcg/s390/tcg-target.c        | 2240 +++++++++++++++++++++++++++++++++++++++++-
 tcg/s390/tcg-target.h        |   63 +-
 tcg/tcg-op.h                 |   34 +-
 tcg/tcg.c                    |   41 +-
 12 files changed, 3063 insertions(+), 240 deletions(-)


* [Qemu-devel] [PATCH 01/62] S390 TCG target
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 02/62] add lost chunks from the original patch Richard Henderson
                   ` (61 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC
  To: qemu-devel; +Cc: Alexander Graf, aurelien

From: Alexander Graf <agraf@suse.de>

We already have stubs for a TCG target on S390, but were missing code that
would actually generate instructions.

So I took Uli's patch, cleaned it up and present it to you again :-).

I hope I found all odd coding style and unprettiness issues, but if you
still spot one feel free to nag about it.

Signed-off-by: Alexander Graf <agraf@suse.de>
CC: Uli Hecht <uli@suse.de>
---
 tcg/s390/tcg-target.c | 1176 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 1162 insertions(+), 14 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 265194a..d2a93c2 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -2,6 +2,7 @@
  * Tiny Code Generator for QEMU
  *
  * Copyright (c) 2009 Ulrich Hecht <uli@suse.de>
+ * Copyright (c) 2009 Alexander Graf <agraf@suse.de>
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
@@ -22,31 +23,146 @@
  * THE SOFTWARE.
  */
 
+/* #define DEBUG_S390_TCG */
+
+#ifdef DEBUG_S390_TCG
+#define dprintf(fmt, ...) \
+    do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
+#else
+#define dprintf(fmt, ...) \
+    do { } while (0)
+#endif
+
 static const int tcg_target_reg_alloc_order[] = {
+    TCG_REG_R6,
+    TCG_REG_R7,
+    TCG_REG_R8,
+    TCG_REG_R9,
+    TCG_REG_R10,
+    TCG_REG_R11,
+    TCG_REG_R12,
+    TCG_REG_R13,
+    TCG_REG_R14,
+    /* XXX many insns can't be used with R0, so we better avoid it for now */
+    /* TCG_REG_R0 */
+    TCG_REG_R1,
+    TCG_REG_R2,
+    TCG_REG_R3,
+    TCG_REG_R4,
+    TCG_REG_R5,
 };
 
 static const int tcg_target_call_iarg_regs[] = {
+    TCG_REG_R2,
+    TCG_REG_R3,
+    TCG_REG_R4,
+    TCG_REG_R5,
 };
 
 static const int tcg_target_call_oarg_regs[] = {
+    TCG_REG_R2,
+    TCG_REG_R3,
+};
+
+/* signed/unsigned is handled by using COMPARE and COMPARE LOGICAL,
+   respectively */
+static const uint8_t tcg_cond_to_s390_cond[10] = {
+    [TCG_COND_EQ]  = 8,
+    [TCG_COND_LT]  = 4,
+    [TCG_COND_LTU] = 4,
+    [TCG_COND_LE]  = 8 | 4,
+    [TCG_COND_LEU] = 8 | 4,
+    [TCG_COND_GT]  = 2,
+    [TCG_COND_GTU] = 2,
+    [TCG_COND_GE]  = 8 | 2,
+    [TCG_COND_GEU] = 8 | 2,
+    [TCG_COND_NE]  = 4 | 2 | 1,
+};
+
+#ifdef CONFIG_SOFTMMU
+
+#include "../../softmmu_defs.h"
+
+static void *qemu_ld_helpers[4] = {
+    __ldb_mmu,
+    __ldw_mmu,
+    __ldl_mmu,
+    __ldq_mmu,
+};
+
+static void *qemu_st_helpers[4] = {
+    __stb_mmu,
+    __stw_mmu,
+    __stl_mmu,
+    __stq_mmu,
 };
 
 static void patch_reloc(uint8_t *code_ptr, int type,
                 tcg_target_long value, tcg_target_long addend)
 {
-    tcg_abort();
+    uint32_t *code_ptr_32 = (uint32_t*)code_ptr;
+    tcg_target_long code_ptr_tlong = (tcg_target_long)code_ptr;
+
+    switch (type) {
+    case R_390_PC32DBL:
+        *code_ptr_32 = (value - (code_ptr_tlong + addend)) >> 1;
+        break;
+    default:
+        tcg_abort();
+        break;
+    }
 }
 
-static inline int tcg_target_get_call_iarg_regs_count(int flags)
-{
-    tcg_abort();
-    return 0;
+static int tcg_target_get_call_iarg_regs_count(int flags)
+  {
+    return sizeof(tcg_target_call_iarg_regs) / sizeof(int);
 }
 
+static void constraint_softmmu(TCGArgConstraint *ct, const char c)
+{
+#ifdef CONFIG_SOFTMMU
+    switch (c) {
+    case 'S':                   /* qemu_st constraint */
+        tcg_regset_reset_reg (ct->u.regs, TCG_REG_R4);
+        /* fall through */
+    case 'L':                   /* qemu_ld constraint */
+        tcg_regset_reset_reg (ct->u.regs, TCG_REG_R3);
+        break;
+    }
+#endif
+  }
+
 /* parse target specific constraints */
 static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
 {
-    tcg_abort();
+    const char *ct_str;
+
+    ct->ct |= TCG_CT_REG;
+    tcg_regset_set32(ct->u.regs, 0, 0xffff);
+    ct_str = *pct_str;
+
+    switch (ct_str[0]) {
+    case 'L':                   /* qemu_ld constraint */
+        tcg_regset_reset_reg (ct->u.regs, TCG_REG_R2);
+        constraint_softmmu(ct, 'L');
+        break;
+    case 'S':                   /* qemu_st constraint */
+        tcg_regset_reset_reg (ct->u.regs, TCG_REG_R2);
+        constraint_softmmu(ct, 'S');
+        break;
+    case 'R':                        /* not R0 */
+        tcg_regset_reset_reg(ct->u.regs, TCG_REG_R0);
+        break;
+    case 'I':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_S16;
+        break;
+    default:
+        break;
+    }
+    ct_str++;
+    *pct_str = ct_str;
+
     return 0;
 }
 
@@ -54,49 +170,1081 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
 static inline int tcg_target_const_match(tcg_target_long val,
                 const TCGArgConstraint *arg_ct)
 {
-    tcg_abort();
+    int ct = arg_ct->ct;
+
+    if ((ct & TCG_CT_CONST) ||
+       ((ct & TCG_CT_CONST_S16) && val == (int16_t)val) ||
+       ((ct & TCG_CT_CONST_U12) && val == (val & 0xfff))) {
+        return 1;
+    }
+
     return 0;
 }
 
+/* emit load/store (and then some) instructions (E3 prefix) */
+static void tcg_out_e3(TCGContext* s, int op, int r1, int r2, int disp)
+{
+    tcg_out16(s, 0xe300 | (r1 << 4));
+    tcg_out32(s, op | (r2 << 28) | ((disp & 0xfff) << 16) | ((disp >> 12) << 8));
+}
+
+/* emit 64-bit register/register insns (B9 prefix) */
+static void tcg_out_b9(TCGContext* s, int op, int r1, int r2)
+{
+    tcg_out32(s, 0xb9000000 | (op << 16) | (r1 << 4) | r2);
+}
+
+/* emit (mostly) 32-bit register/register insns */
+static void tcg_out_rr(TCGContext* s, int op, int r1, int r2)
+{
+    tcg_out16(s, (op << 8) | (r1 << 4) | r2);
+}
+
+static void tcg_out_a7(TCGContext *s, int op, int r1, int16_t i2)
+{
+    tcg_out32(s, 0xa7000000UL | (r1 << 20) | (op << 16) | ((uint16_t)i2));
+}
+
+/* emit 64-bit shifts (EB prefix) */
+static void tcg_out_sh64(TCGContext* s, int op, int r0, int r1, int r2, int imm)
+{
+    tcg_out16(s, 0xeb00 | (r0 << 4) | r1);
+    tcg_out32(s, op | (r2 << 28) | ((imm & 0xfff) << 16) | ((imm >> 12) << 8));
+}
+
+/* emit 32-bit shifts */
+static void tcg_out_sh32(TCGContext* s, int op, int r0, int r1, int imm)
+{
+    tcg_out32(s, 0x80000000 | (op << 24) | (r0 << 20) | (r1 << 12) | imm);
+}
+
+/* branch to relative address (long) */
+static void tcg_out_brasl(TCGContext* s, int r, tcg_target_long raddr)
+{
+    tcg_out16(s, 0xc005 | (r << 4));
+    tcg_out32(s, raddr >> 1);
+}
+
+/* store 8/16/32 bits */
+static void tcg_out_store(TCGContext* s, int op, int r0, int r1, int off)
+{
+    tcg_out32(s, (op << 24) | (r0 << 20) | (r1 << 12) | off);
+}
+
 /* load a register with an immediate value */
 static inline void tcg_out_movi(TCGContext *s, TCGType type,
                 int ret, tcg_target_long arg)
 {
-    tcg_abort();
+    if (arg >= -0x8000 && arg < 0x8000) { /* signed immediate load */
+        /* lghi %rret, arg */
+        tcg_out32(s, S390_INS_LGHI | (ret << 20) | (arg & 0xffff));
+    } else if (!(arg & 0xffffffffffff0000UL)) {
+        /* llill %rret, arg */
+        tcg_out32(s, S390_INS_LLILL | (ret << 20) | arg);
+    } else if (!(arg & 0xffffffff00000000UL) || type == TCG_TYPE_I32) {
+        /* llill %rret, arg */
+        tcg_out32(s, S390_INS_LLILL | (ret << 20) | (arg & 0xffff));
+        /* iilh %rret, arg */
+        tcg_out32(s, S390_INS_IILH | (ret << 20) | ((arg & 0xffffffff) >> 16));
+    } else {
+        /* branch over constant and store its address in R13 */
+        tcg_out_brasl(s, TCG_REG_R13, 14);
+        /* 64-bit constant */
+        tcg_out32(s,arg >> 32);
+        tcg_out32(s,arg);
+        /* load constant to ret */
+        tcg_out_e3(s, E3_LG, ret, TCG_REG_R13, 0);
+    }
 }
 
 /* load data without address translation or endianness conversion */
 static inline void tcg_out_ld(TCGContext *s, TCGType type, int arg,
                 int arg1, tcg_target_long arg2)
 {
-    tcg_abort();
+    int op;
+
+    dprintf("tcg_out_ld type %d arg %d arg1 %d arg2 %ld\n",
+            type, arg, arg1, arg2);
+
+    op = (type == TCG_TYPE_I32) ? E3_LLGF : E3_LG;
+
+    if (arg2 < -0x80000 || arg2 > 0x7ffff) {
+        /* load the displacement */
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, arg2);
+        /* add the address */
+        tcg_out_b9(s, B9_AGR, TCG_REG_R13, arg1);
+        /* load the data */
+        tcg_out_e3(s, op, arg, TCG_REG_R13, 0);
+    } else {
+        /* load the data */
+        tcg_out_e3(s, op, arg, arg1, arg2);
+    }
+}
+
+#if defined(CONFIG_SOFTMMU)
+static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
+                                  int mem_index, int opc,
+                                  uint16_t **label2_ptr_p, int is_store)
+  {
+    int arg0 = TCG_REG_R2;
+    int arg1 = TCG_REG_R3;
+    int arg2 = TCG_REG_R4;
+    int s_bits;
+    uint16_t *label1_ptr;
+
+    if (is_store) {
+        s_bits = opc;
+    } else {
+        s_bits = opc & 3;
+    }
+
+#if TARGET_LONG_BITS == 32
+    tcg_out_b9(s, B9_LLGFR, arg1, addr_reg);
+    tcg_out_b9(s, B9_LLGFR, arg0, addr_reg);
+#else
+    tcg_out_b9(s, B9_LGR, arg1, addr_reg);
+    tcg_out_b9(s, B9_LGR, arg0, addr_reg);
+#endif
+
+    tcg_out_sh64(s, SH64_SRLG, arg1, addr_reg, SH64_REG_NONE,
+                 TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+
+    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                 TARGET_PAGE_MASK | ((1 << s_bits) - 1));
+    tcg_out_b9(s, B9_NGR, arg0, TCG_REG_R13);
+
+    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                 (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
+    tcg_out_b9(s, B9_NGR, arg1, TCG_REG_R13);
+
+    if (is_store) {
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                     offsetof(CPUState, tlb_table[mem_index][0].addr_write));
+    } else {
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                     offsetof(CPUState, tlb_table[mem_index][0].addr_read));
+    }
+    tcg_out_b9(s, B9_AGR, arg1, TCG_REG_R13);
+
+    tcg_out_b9(s, B9_AGR, arg1, TCG_AREG0);
+
+    tcg_out_e3(s, E3_CG, arg0, arg1, 0);
+
+    label1_ptr = (uint16_t*)s->code_ptr;
+
+    /* je label1 (offset will be patched in later) */
+    tcg_out32(s, 0xa7840000);
+
+    /* call load/store helper */
+#if TARGET_LONG_BITS == 32
+    tcg_out_b9(s, B9_LLGFR, arg0, addr_reg);
+#else
+    tcg_out_b9(s, B9_LGR, arg0, addr_reg);
+#endif
+
+    if (is_store) {
+        tcg_out_b9(s, B9_LGR, arg1, data_reg);
+        tcg_out_movi(s, TCG_TYPE_I32, arg2, mem_index);
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                     (tcg_target_ulong)qemu_st_helpers[s_bits]);
+        tcg_out_rr(s, RR_BASR, TCG_REG_R14, TCG_REG_R13);
+    } else {
+        tcg_out_movi(s, TCG_TYPE_I32, arg1, mem_index);
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                     (tcg_target_ulong)qemu_ld_helpers[s_bits]);
+        tcg_out_rr(s, RR_BASR, TCG_REG_R14, TCG_REG_R13);
+
+        /* sign extension */
+        switch (opc) {
+        case LD_INT8:
+            tcg_out_sh64(s, SH64_SLLG, data_reg, arg0, SH64_REG_NONE, 56);
+            tcg_out_sh64(s, SH64_SRAG, data_reg, data_reg, SH64_REG_NONE, 56);
+            break;
+        case LD_INT16:
+            tcg_out_sh64(s, SH64_SLLG, data_reg, arg0, SH64_REG_NONE, 48);
+            tcg_out_sh64(s, SH64_SRAG, data_reg, data_reg, SH64_REG_NONE, 48);
+            break;
+        case LD_INT32:
+            tcg_out_b9(s, B9_LGFR, data_reg, arg0);
+            break;
+        default:
+            /* unsigned -> just copy */
+            tcg_out_b9(s, B9_LGR, data_reg, arg0);
+            break;
+        }
+    }
+
+    /* jump to label2 (end) */
+    *label2_ptr_p = (uint16_t*)s->code_ptr;
+
+    /* bras %r13, label2 */
+    tcg_out32(s, 0xa7d50000);
+
+    /* this is label1, patch branch */
+    *(label1_ptr + 1) = ((unsigned long)s->code_ptr -
+                         (unsigned long)label1_ptr) >> 1;
+
+    if (is_store) {
+        tcg_out_e3(s, E3_LG, arg1, arg1, offsetof(CPUTLBEntry, addend) -
+                                         offsetof(CPUTLBEntry, addr_write));
+    } else {
+        tcg_out_e3(s, E3_LG, arg1, arg1, offsetof(CPUTLBEntry, addend) -
+                                         offsetof(CPUTLBEntry, addr_read));
+    }
+
+#if TARGET_LONG_BITS == 32
+    /* zero upper 32 bits */
+    tcg_out_b9(s, B9_LLGFR, arg0, addr_reg);
+#else
+    /* just copy */
+    tcg_out_b9(s, B9_LGR, arg0, addr_reg);
+#endif
+    tcg_out_b9(s, B9_AGR, arg0, arg1);
+  }
+  
+static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
+  {
+    /* patch branch */
+    *(label2_ptr + 1) = ((unsigned long)s->code_ptr -
+                         (unsigned long)label2_ptr) >> 1;
+}
+
+#else /* CONFIG_SOFTMMU */
+
+static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
+                                int mem_index, int opc,
+                                uint16_t **label2_ptr_p, int is_store)
+{
+    /* user mode, no address translation required */
+    *arg0 = addr_reg;
+}
+
+static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
+{
+}
+
+#endif /* CONFIG_SOFTMMU */
+
+/* load data with address translation (if applicable)
+   and endianness conversion */
+static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
+{
+    int addr_reg, data_reg, mem_index, s_bits;
+    int arg0 = TCG_REG_R2;
+    uint16_t *label2_ptr;
+
+    data_reg = *args++;
+    addr_reg = *args++;
+    mem_index = *args;
+
+    s_bits = opc & 3;
+
+    dprintf("tcg_out_qemu_ld opc %d data_reg %d addr_reg %d mem_index %d "
+            "s_bits %d\n", opc, data_reg, addr_reg, mem_index, s_bits);
+
+    tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
+                          opc, &label2_ptr, 0);
+
+    switch (opc) {
+    case LD_UINT8:
+        tcg_out_e3(s, E3_LLGC, data_reg, arg0, 0);
+        break;
+    case LD_INT8:
+        tcg_out_e3(s, E3_LGB, data_reg, arg0, 0);
+        break;
+    case LD_UINT16:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_e3(s, E3_LLGH, data_reg, arg0, 0);
+#else
+        /* swapped unsigned halfword load with upper bits zeroed */
+        tcg_out_e3(s, E3_LRVH, data_reg, arg0, 0);
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, 0xffffL);
+        tcg_out_b9(s, B9_NGR, data_reg, 13);
+#endif
+        break;
+    case LD_INT16:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_e3(s, E3_LGH, data_reg, arg0, 0);
+#else
+        /* swapped sign-extended halfword load */
+        tcg_out_e3(s, E3_LRVH, data_reg, arg0, 0);
+        tcg_out_sh64(s, SH64_SLLG, data_reg, data_reg, SH64_REG_NONE, 48);
+        tcg_out_sh64(s, SH64_SRAG, data_reg, data_reg, SH64_REG_NONE, 48);
+#endif
+        break;
+    case LD_UINT32:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_e3(s, E3_LLGF, data_reg, arg0, 0);
+#else
+        /* swapped unsigned int load with upper bits zeroed */
+        tcg_out_e3(s, E3_LRV, data_reg, arg0, 0);
+        tcg_out_b9(s, B9_LLGFR, data_reg, data_reg);
+#endif
+        break;
+    case LD_INT32:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_e3(s, E3_LGF, data_reg, arg0, 0);
+#else
+        /* swapped sign-extended int load */
+        tcg_out_e3(s, E3_LRV, data_reg, arg0, 0);
+        tcg_out_b9(s, B9_LGFR, data_reg, data_reg);
+#endif
+        break;
+    case LD_UINT64:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_e3(s, E3_LG, data_reg, arg0, 0);
+#else
+        tcg_out_e3(s, E3_LRVG, data_reg, arg0, 0);
+#endif
+        break;
+    default:
+        tcg_abort();
+    }
+
+    tcg_finish_qemu_ldst(s, label2_ptr);
+}
+
+static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
+{
+    int addr_reg, data_reg, mem_index, s_bits;
+    uint16_t *label2_ptr;
+    int arg0 = TCG_REG_R2;
+
+    data_reg = *args++;
+    addr_reg = *args++;
+    mem_index = *args;
+
+    s_bits = opc;
+
+    dprintf("tcg_out_qemu_st opc %d data_reg %d addr_reg %d mem_index %d "
+            "s_bits %d\n", opc, data_reg, addr_reg, mem_index, s_bits);
+
+    tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
+                          opc, &label2_ptr, 1);
+
+    switch (opc) {
+    case LD_UINT8:
+        tcg_out_store(s, ST_STC, data_reg, arg0, 0);
+        break;
+    case LD_UINT16:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_store(s, ST_STH, data_reg, arg0, 0);
+#else
+        tcg_out_e3(s, E3_STRVH, data_reg, arg0, 0);
+#endif
+        break;
+    case LD_UINT32:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_store(s, ST_ST, data_reg, arg0, 0);
+#else
+        tcg_out_e3(s, E3_STRV, data_reg, arg0, 0);
+#endif
+        break;
+    case LD_UINT64:
+#ifdef TARGET_WORDS_BIGENDIAN
+        tcg_out_e3(s, E3_STG, data_reg, arg0, 0);
+#else
+        tcg_out_e3(s, E3_STRVG, data_reg, arg0, 0);
+#endif
+        break;
+    default:
+        tcg_abort();
+    }
+
+    tcg_finish_qemu_ldst(s, label2_ptr);
 }
 
 static inline void tcg_out_st(TCGContext *s, TCGType type, int arg,
                               int arg1, tcg_target_long arg2)
 {
-    tcg_abort();
+    dprintf("tcg_out_st arg 0x%x arg1 0x%x arg2 0x%lx\n", arg, arg1, arg2);
+
+    if (type == TCG_TYPE_I32) {
+        if (((long)arg2) < -0x800 || ((long)arg2) > 0x7ff) {
+            tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, arg2);
+            tcg_out_b9(s, B9_AGR, 13, arg1);
+            tcg_out_store(s, ST_ST, arg, TCG_REG_R13, 0);
+        } else {
+            tcg_out_store(s, ST_ST, arg, arg1, arg2);
+        }
+    }
+    else {
+        if (((long)arg2) < -0x80000 || ((long)arg2) > 0x7ffff) {
+            tcg_abort();
+        }
+        tcg_out_e3(s, E3_STG, arg, arg1, arg2);
+    }
 }
 
 static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                 const TCGArg *args, const int *const_args)
 {
-    tcg_abort();
+    TCGLabel* l;
+    int op;
+    int op2;
+
+    dprintf("0x%x\n", INDEX_op_divu_i32);
+
+    switch (opc) {
+    case INDEX_op_exit_tb:
+        dprintf("op 0x%x exit_tb 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        /* return value */
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R2, args[0]);
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, (unsigned long)tb_ret_addr);
+        /* br %r13 */
+        tcg_out16(s, S390_INS_BR | TCG_REG_R13);
+        break;
+
+    case INDEX_op_goto_tb:
+        dprintf("op 0x%x goto_tb 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        if (s->tb_jmp_offset) {
+            tcg_abort();
+        } else {
+            tcg_target_long off = ((tcg_target_long)(s->tb_next + args[0]) -
+                                   (tcg_target_long)s->code_ptr) >> 1;
+            if (off > -0x80000000L && off < 0x7fffffffL) {
+                /* load address relative to PC */
+                /* larl %r13, off */
+                tcg_out16(s, S390_INS_LARL | (TCG_REG_R13 << 4));
+                tcg_out32(s, off);
+            } else {
+                /* too far for larl */
+                tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                             (tcg_target_long)(s->tb_next + args[0]));
+            }
+            /* load address stored at s->tb_next + args[0] */
+            tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R13, TCG_REG_R13, 0);
+            /* and go there */
+            tcg_out_rr(s, RR_BASR, TCG_REG_R13, TCG_REG_R13);
+        }
+        s->tb_next_offset[args[0]] = s->code_ptr - s->code_buf;
+        break;
+
+    case INDEX_op_call:
+        dprintf("op 0x%x call 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        if (const_args[0]) {
+            tcg_target_long off;
+
+            /* FIXME: + 4? Where did that come from? */
+            off = (args[0] - (tcg_target_long)s->code_ptr + 4) >> 1;
+            if (off > -0x80000000 && off < 0x7fffffff) {
+                /* relative call */
+                tcg_out_brasl(s, TCG_REG_R14, off << 1);
+                /* XXX untested */
+                tcg_abort();
+            } else {
+                /* too far for a relative call, load full address */
+                tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, args[0]);
+                tcg_out_rr(s, RR_BASR, TCG_REG_R14, TCG_REG_R13);
+            }
+        } else {
+            /* call function in register args[0] */
+            tcg_out_rr(s, RR_BASR, TCG_REG_R14, args[0]);
+        }
+        break;
+
+    case INDEX_op_jmp:
+        dprintf("op 0x%x jmp 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        /* XXX */
+        tcg_abort();
+        break;
+
+    case INDEX_op_ld8u_i32:
+    case INDEX_op_ld8u_i64:
+        dprintf("op 0x%x ld8u_i32 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        if ((long)args[2] > -0x80000 && (long)args[2] < 0x7ffff) {
+            tcg_out_e3(s, E3_LLGC, args[0], args[1], args[2]);
+        } else {
+            /* XXX displacement too large, have to calculate address manually */
+            tcg_abort();
+        }
+        break;
+
+    case INDEX_op_ld8s_i32:
+        dprintf("op 0x%x ld8s_i32 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        /* XXX */
+        tcg_abort();
+        break;
+
+    case INDEX_op_ld16u_i32:
+        dprintf("op 0x%x ld16u_i32 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        if ((long)args[2] > -0x80000 && (long)args[2] < 0x7ffff) {
+            tcg_out_e3(s, E3_LLGH, args[0], args[1], args[2]);
+        } else {
+            /* XXX displacement too large, have to calculate address manually */
+            tcg_abort();
+        }
+        break;
+
+    case INDEX_op_ld16s_i32:
+        dprintf("op 0x%x ld16s_i32 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        /* XXX */
+        tcg_abort();
+        break;
+
+    case INDEX_op_ld_i32:
+    case INDEX_op_ld32u_i64:
+        tcg_out_ld(s, TCG_TYPE_I32, args[0], args[1], args[2]);
+        break;
+
+    case INDEX_op_ld32s_i64:
+        if (args[2] < -0x80000 || args[2] > 0x7ffff) {
+            /* load the displacement */
+            tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, args[2]);
+            /* add the address */
+            tcg_out_b9(s, B9_AGR, TCG_REG_R13, args[1]);
+            /* load the data (sign-extended) */
+            tcg_out_e3(s, E3_LGF, args[0], TCG_REG_R13, 0);
+        } else {
+            /* load the data (sign-extended) */
+            tcg_out_e3(s, E3_LGF, args[0], args[1], args[2]);
+        }
+        break;
+
+    case INDEX_op_ld_i64:
+        tcg_out_ld(s, TCG_TYPE_I64, args[0], args[1], args[2]);
+        break;
+
+    case INDEX_op_st8_i32:
+    case INDEX_op_st8_i64:
+        dprintf("op 0x%x st8_i32 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        if (((long)args[2]) >= -0x800 && ((long)args[2]) < 0x800) {
+            tcg_out_store(s, ST_STC, args[0], args[1], args[2]);
+        } else if (((long)args[2]) >= -0x80000 && ((long)args[2]) < 0x80000) {
+            /* FIXME: requires long displacement facility */
+            tcg_out_e3(s, E3_STCY, args[0], args[1], args[2]);
+            tcg_abort();
+        } else {
+            tcg_abort();
+        }
+        break;
+
+    case INDEX_op_st16_i32:
+    case INDEX_op_st16_i64:
+        dprintf("op 0x%x st16_i32 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        if (((long)args[2]) >= -0x800 && ((long)args[2]) < 0x800) {
+            tcg_out_store(s, ST_STH, args[0], args[1], args[2]);
+        } else if (((long)args[2]) >= -0x80000 && ((long)args[2]) < 0x80000) {
+            /* FIXME: requires long displacement facility */
+            tcg_out_e3(s, E3_STHY, args[0], args[1], args[2]);
+            tcg_abort();
+        } else {
+            tcg_abort();
+        }
+        break;
+
+    case INDEX_op_st_i32:
+    case INDEX_op_st32_i64:
+        tcg_out_st(s, TCG_TYPE_I32, args[0], args[1], args[2]);
+        break;
+
+    case INDEX_op_st_i64:
+        tcg_out_st(s, TCG_TYPE_I64, args[0], args[1], args[2]);
+        break;
+
+    case INDEX_op_mov_i32:
+        dprintf("op 0x%x mov_i32 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        /* XXX */
+        tcg_abort();
+        break;
+
+    case INDEX_op_movi_i32:
+        dprintf("op 0x%x movi_i32 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        /* XXX */
+        tcg_abort();
+        break;
+
+    case INDEX_op_add_i32:
+        if (const_args[2]) {
+            if (args[0] == args[1]) {
+                tcg_out_a7(s, A7_AHI, args[1], args[2]);
+            } else {
+                tcg_out_rr(s, RR_LR, args[0], args[1]);
+                tcg_out_a7(s, A7_AHI, args[0], args[2]);
+            }
+        } else if (args[0] == args[1]) {
+            tcg_out_rr(s, RR_AR, args[1], args[2]);
+        } else if (args[0] == args[2]) {
+            tcg_out_rr(s, RR_AR, args[0], args[1]);
+        } else {
+            tcg_out_rr(s, RR_LR, args[0], args[1]);
+            tcg_out_rr(s, RR_AR, args[0], args[2]);
+        }
+        break;
+
+    case INDEX_op_sub_i32:
+        dprintf("op 0x%x sub_i32 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        if (args[0] == args[1]) {
+            /* sr %ra0/1, %ra2 */
+            tcg_out_rr(s, RR_SR, args[1], args[2]);
+        } else if (args[0] == args[2]) {
+            /* lr %r13, %ra0/2 */
+            tcg_out_rr(s, RR_LR, TCG_REG_R13, args[2]);
+            /* lr %ra0/2, %ra1 */
+            tcg_out_rr(s, RR_LR, args[0], args[1]);
+            /* sr %ra0/2, %r13 */
+            tcg_out_rr(s, RR_SR, args[0], TCG_REG_R13);
+        } else {
+            /* lr %ra0, %ra1 */
+            tcg_out_rr(s, RR_LR, args[0], args[1]);
+            /* sr %ra0, %ra2 */
+            tcg_out_rr(s, RR_SR, args[0], args[2]);
+        }
+        break;
+
+    case INDEX_op_sub_i64:
+        dprintf("op 0x%x sub_i64 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        if (args[0] == args[1]) {
+            /* sgr %ra0/1, %ra2 */
+            tcg_out_b9(s, B9_SGR, args[1], args[2]);
+        } else if (args[0] == args[2]) {
+            /* lgr %r13, %ra0/2 */
+            tcg_out_b9(s, B9_LGR, TCG_REG_R13, args[2]);
+            /* lgr %ra0/2, %ra1 */
+            tcg_out_b9(s, B9_LGR, args[0], args[1]);
+            /* sgr %ra0/2, %r13 */
+            tcg_out_b9(s, B9_SGR, args[0], TCG_REG_R13);
+        } else {
+            /* lgr %ra0, %ra1 */
+            tcg_out_b9(s, B9_LGR, args[0], args[1]);
+            /* sgr %ra0, %ra2 */
+            tcg_out_b9(s, B9_SGR, args[0], args[2]);
+        }
+        break;
+
+    case INDEX_op_add_i64:
+        dprintf("op 0x%x add_i64 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        if (args[0] == args[1]) {
+            tcg_out_b9(s, B9_AGR, args[1], args[2]);
+        } else if (args[0] == args[2]) {
+            tcg_out_b9(s, B9_AGR, args[0], args[1]);
+        } else {
+            tcg_out_b9(s, B9_LGR, args[0], args[1]);
+            tcg_out_b9(s, B9_AGR, args[0], args[2]);
+        }
+        break;
+
+    case INDEX_op_and_i32:
+        op = RR_NR;
+do_logic_i32:
+        if (args[0] == args[1]) {
+            /* <op> %ra0/1, %ra2  (op is nr, or, or xr) */
+            tcg_out_rr(s, op, args[1], args[2]);
+        } else if (args[0] == args[2]) {
+            /* <op> %ra0/2, %ra1  (operands commute) */
+            tcg_out_rr(s, op, args[0], args[1]);
+        } else {
+            /* lr %ra0, %ra1 */
+            tcg_out_rr(s, RR_LR, args[0], args[1]);
+            /* <op> %ra0, %ra2 */
+            tcg_out_rr(s, op, args[0], args[2]);
+        }
+        break;
+
+    case INDEX_op_or_i32:
+        op = RR_OR;
+        goto do_logic_i32;
+
+    case INDEX_op_xor_i32:
+        op = RR_XR;
+        goto do_logic_i32;
+
+    case INDEX_op_and_i64:
+        dprintf("op 0x%x and_i64 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        op = B9_NGR;
+do_logic_i64:
+        if (args[0] == args[1]) {
+            tcg_out_b9(s, op, args[0], args[2]);
+        } else if (args[0] == args[2]) {
+            tcg_out_b9(s, op, args[0], args[1]);
+        } else {
+            tcg_out_b9(s, B9_LGR, args[0], args[1]);
+            tcg_out_b9(s, op, args[0], args[2]);
+        }
+        break;
+
+    case INDEX_op_or_i64:
+        op = B9_OGR;
+        goto do_logic_i64;
+
+    case INDEX_op_xor_i64:
+        op = B9_XGR;
+        goto do_logic_i64;
+
+    case INDEX_op_neg_i32:
+        dprintf("op 0x%x neg_i32 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        /* FIXME: optimize args[0] != args[1] case */
+        tcg_out_rr(s, RR_LR, 13, args[1]);
+        /* lghi %ra0, 0 */
+        tcg_out32(s, S390_INS_LGHI | (args[0] << 20));
+        tcg_out_rr(s, RR_SR, args[0], 13);
+        break;
+
+    case INDEX_op_neg_i64:
+        dprintf("op 0x%x neg_i64 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        /* FIXME: optimize args[0] != args[1] case */
+        tcg_out_b9(s, B9_LGR, 13, args[1]);
+        /* lghi %ra0, 0 */
+        tcg_out32(s, S390_INS_LGHI | (args[0] << 20));
+        tcg_out_b9(s, B9_SGR, args[0], 13);
+        break;
+
+    case INDEX_op_mul_i32:
+        dprintf("op 0x%x mul_i32 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        if (args[0] == args[1])
+          /* msr %ra0/1, %ra2 */
+          tcg_out32(s, S390_INS_MSR | (args[0] << 4) | args[2]);
+        else if (args[0] == args[2])
+          /* msr %ra0/2, %ra1 */
+          tcg_out32(s, S390_INS_MSR | (args[0] << 4) | args[1]);
+        else {
+          tcg_out_rr(s, RR_LR, args[0], args[1]);
+          /* msr %ra0, %ra2 */
+          tcg_out32(s, S390_INS_MSR | (args[0] << 4) | args[2]);
+        }
+        break;
+
+    case INDEX_op_mul_i64:
+        dprintf("op 0x%x mul_i64 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        if (args[0] == args[1]) {
+            tcg_out_b9(s, B9_MSGR, args[0], args[2]);
+        } else if (args[0] == args[2]) {
+            tcg_out_b9(s, B9_MSGR, args[0], args[1]);
+        } else {
+            /* XXX */
+            tcg_abort();
+        }
+        break;
+
+    case INDEX_op_divu_i32:
+    case INDEX_op_remu_i32:
+        dprintf("op 0x%x div/remu_i32 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R12, 0);
+        tcg_out_rr(s, RR_LR, TCG_REG_R13, args[1]);
+        tcg_out_b9(s, B9_DLR, TCG_REG_R12, args[2]);
+        if (opc == INDEX_op_divu_i32) {
+          tcg_out_rr(s, RR_LR, args[0], TCG_REG_R13);        /* quotient */
+        } else {
+          tcg_out_rr(s, RR_LR, args[0], TCG_REG_R12);        /* remainder */
+        }
+        break;
+
+    case INDEX_op_shl_i32:
+        op = SH32_SLL;
+        op2 = SH64_SLLG;
+ do_shift32:
+        if (const_args[2]) {
+            if (args[0] == args[1]) {
+                tcg_out_sh32(s, op, args[0], SH32_REG_NONE, args[2]);
+            } else {
+                tcg_out_rr(s, RR_LR, args[0], args[1]);
+                tcg_out_sh32(s, op, args[0], SH32_REG_NONE, args[2]);
+            }
+        } else {
+            if (args[0] == args[1]) {
+                tcg_out_sh32(s, op, args[0], args[2], 0);
+            } else {
+                tcg_out_sh64(s, op2, args[0], args[1], args[2], 0);
+            }
+        }
+        break;
+
+    case INDEX_op_shr_i32:
+        op = SH32_SRL;
+        op2 = SH64_SRLG;
+        goto do_shift32;
+
+    case INDEX_op_sar_i32:
+        op = SH32_SRA;
+        op2 = SH64_SRAG;
+        goto do_shift32;
+
+    case INDEX_op_shl_i64:
+        op = SH64_SLLG;
+ do_shift64:
+        if (const_args[2]) {
+            tcg_out_sh64(s, op, args[0], args[1], SH64_REG_NONE, args[2]);
+        } else {
+            tcg_out_sh64(s, op, args[0], args[1], args[2], 0);
+        }
+        break;
+
+    case INDEX_op_shr_i64:
+        op = SH64_SRLG;
+        goto do_shift64;
+
+    case INDEX_op_sar_i64:
+        op = SH64_SRAG;
+        goto do_shift64;
+
+    case INDEX_op_br:
+        dprintf("op 0x%x br 0x%lx 0x%lx 0x%lx\n",
+                opc, args[0], args[1], args[2]);
+        l = &s->labels[args[0]];
+        if (l->has_value) {
+            tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, l->u.value);
+        } else {
+            /* larl %r13, ... */
+            tcg_out16(s, S390_INS_LARL | (TCG_REG_R13 << 4));
+            tcg_out_reloc(s, s->code_ptr, R_390_PC32DBL, args[0], -2);
+            s->code_ptr += 4;
+        }
+        tcg_out_rr(s, RR_BASR, TCG_REG_R13, TCG_REG_R13);
+        break;
+
+    case INDEX_op_brcond_i64:
+        dprintf("op 0x%x brcond_i64 0x%lx 0x%lx (c %d) 0x%lx\n",
+                opc, args[0], args[1], const_args[1], args[2]);
+        if (args[2] > TCG_COND_GT) {
+          /* unsigned */
+          /* clgr %ra0, %ra1 */
+          tcg_out_b9(s, B9_CLGR, args[0], args[1]);
+        } else {
+          /* signed */
+          /* cgr %ra0, %ra1 */
+          tcg_out_b9(s, B9_CGR, args[0], args[1]);
+        }
+        goto do_brcond;
+
+    case INDEX_op_brcond_i32:
+        dprintf("op 0x%x brcond_i32 0x%lx 0x%lx (c %d) 0x%lx\n",
+                opc, args[0], args[1], const_args[1], args[2]);
+        if (args[2] > TCG_COND_GT) {
+          /* unsigned */
+          /* clr %ra0, %ra1 */
+          tcg_out_rr(s, RR_CLR, args[0], args[1]);
+        } else {
+          /* signed */
+          /* cr %ra0, %ra1 */
+          tcg_out_rr(s, RR_CR, args[0], args[1]);
+        }
+ do_brcond:
+        l = &s->labels[args[3]];
+        if (l->has_value) {
+            tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, l->u.value);
+        } else {
+            /* larl %r13, ... */
+            tcg_out16(s, S390_INS_LARL | (TCG_REG_R13 << 4));
+            tcg_out_reloc(s, s->code_ptr, R_390_PC32DBL, args[3], -2);
+            s->code_ptr += 4;
+        }
+        /* bcr cond, %r13 */
+        tcg_out16(s, S390_INS_BCR | TCG_REG_R13 |
+                     (tcg_cond_to_s390_cond[args[2]] << 4));
+        break;
+
+    case INDEX_op_qemu_ld8u:
+        tcg_out_qemu_ld(s, args, LD_UINT8);
+        break;
+
+    case INDEX_op_qemu_ld8s:
+        tcg_out_qemu_ld(s, args, LD_INT8);
+        break;
+
+    case INDEX_op_qemu_ld16u:
+        tcg_out_qemu_ld(s, args, LD_UINT16);
+        break;
+
+    case INDEX_op_qemu_ld16s:
+        tcg_out_qemu_ld(s, args, LD_INT16);
+        break;
+
+    case INDEX_op_qemu_ld32u:
+        tcg_out_qemu_ld(s, args, LD_UINT32);
+        break;
+
+    case INDEX_op_qemu_ld32s:
+        tcg_out_qemu_ld(s, args, LD_INT32);
+        break;
+
+    case INDEX_op_qemu_ld64:
+        tcg_out_qemu_ld(s, args, LD_UINT64);
+        break;
+
+    case INDEX_op_qemu_st8:
+        tcg_out_qemu_st(s, args, LD_UINT8);
+        break;
+
+    case INDEX_op_qemu_st16:
+        tcg_out_qemu_st(s, args, LD_UINT16);
+        break;
+
+    case INDEX_op_qemu_st32:
+        tcg_out_qemu_st(s, args, LD_UINT32);
+        break;
+
+    case INDEX_op_qemu_st64:
+        tcg_out_qemu_st(s, args, LD_UINT64);
+        break;
+
+    default:
+        fprintf(stderr,"unimplemented opc 0x%x\n",opc);
+        tcg_abort();
+    }
 }
 
+ static const TCGTargetOpDef s390_op_defs[] = {
+    { INDEX_op_exit_tb, { } },
+    { INDEX_op_goto_tb, { } },
+    { INDEX_op_call, { "ri" } },
+    { INDEX_op_jmp, { "ri" } },
+    { INDEX_op_br, { } },
+
+    { INDEX_op_mov_i32, { "r", "r" } },
+    { INDEX_op_movi_i32, { "r" } },
+
+    { INDEX_op_ld8u_i32, { "r", "r" } },
+    { INDEX_op_ld8s_i32, { "r", "r" } },
+    { INDEX_op_ld16u_i32, { "r", "r" } },
+    { INDEX_op_ld16s_i32, { "r", "r" } },
+    { INDEX_op_ld_i32, { "r", "r" } },
+    { INDEX_op_st8_i32, { "r", "r" } },
+    { INDEX_op_st16_i32, { "r", "r" } },
+    { INDEX_op_st_i32, { "r", "r" } },
+
+    { INDEX_op_add_i32, { "r", "r", "rI" } },
+    { INDEX_op_sub_i32, { "r", "r", "r" } },
+    { INDEX_op_mul_i32, { "r", "r", "r" } },
+
+    { INDEX_op_div_i32, { "r", "r", "r" } },
+    { INDEX_op_divu_i32, { "r", "r", "r" } },
+    { INDEX_op_rem_i32, { "r", "r", "r" } },
+    { INDEX_op_remu_i32, { "r", "r", "r" } },
+
+    { INDEX_op_and_i32, { "r", "r", "r" } },
+    { INDEX_op_or_i32, { "r", "r", "r" } },
+    { INDEX_op_xor_i32, { "r", "r", "r" } },
+    { INDEX_op_neg_i32, { "r", "r" } },
+
+    { INDEX_op_shl_i32, { "r", "r", "Ri" } },
+    { INDEX_op_shr_i32, { "r", "r", "Ri" } },
+    { INDEX_op_sar_i32, { "r", "r", "Ri" } },
+
+    { INDEX_op_brcond_i32, { "r", "r" } },
+
+    { INDEX_op_qemu_ld8u, { "r", "L" } },
+    { INDEX_op_qemu_ld8s, { "r", "L" } },
+    { INDEX_op_qemu_ld16u, { "r", "L" } },
+    { INDEX_op_qemu_ld16s, { "r", "L" } },
+    { INDEX_op_qemu_ld32u, { "r", "L" } },
+    { INDEX_op_qemu_ld32s, { "r", "L" } },
+
+    { INDEX_op_qemu_st8, { "S", "S" } },
+    { INDEX_op_qemu_st16, { "S", "S" } },
+    { INDEX_op_qemu_st32, { "S", "S" } },
+
+#if defined(__s390x__)
+    { INDEX_op_mov_i64, { "r", "r" } },
+    { INDEX_op_movi_i64, { "r" } },
+
+    { INDEX_op_ld8u_i64, { "r", "r" } },
+    { INDEX_op_ld8s_i64, { "r", "r" } },
+    { INDEX_op_ld16u_i64, { "r", "r" } },
+    { INDEX_op_ld16s_i64, { "r", "r" } },
+    { INDEX_op_ld32u_i64, { "r", "r" } },
+    { INDEX_op_ld32s_i64, { "r", "r" } },
+    { INDEX_op_ld_i64, { "r", "r" } },
+
+    { INDEX_op_st8_i64, { "r", "r" } },
+    { INDEX_op_st16_i64, { "r", "r" } },
+    { INDEX_op_st32_i64, { "r", "r" } },
+    { INDEX_op_st_i64, { "r", "r" } },
+
+    { INDEX_op_qemu_ld64, { "L", "L" } },
+    { INDEX_op_qemu_st64, { "S", "S" } },
+
+    { INDEX_op_add_i64, { "r", "r", "r" } },
+    { INDEX_op_mul_i64, { "r", "r", "r" } },
+    { INDEX_op_sub_i64, { "r", "r", "r" } },
+
+    { INDEX_op_and_i64, { "r", "r", "r" } },
+    { INDEX_op_or_i64, { "r", "r", "r" } },
+    { INDEX_op_xor_i64, { "r", "r", "r" } },
+    { INDEX_op_neg_i64, { "r", "r" } },
+
+    { INDEX_op_shl_i64, { "r", "r", "Ri" } },
+    { INDEX_op_shr_i64, { "r", "r", "Ri" } },
+    { INDEX_op_sar_i64, { "r", "r", "Ri" } },
+
+    { INDEX_op_brcond_i64, { "r", "r" } },
+#endif
+
+    { -1 },
+ };
+
 void tcg_target_init(TCGContext *s)
 {
-    /* gets called with KVM */
+    /* fail safe */
+    if ((1 << CPU_TLB_ENTRY_BITS) != sizeof(CPUTLBEntry)) {
+        tcg_abort();
+    }
+
+    tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffff);
+    tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffff);
+    tcg_regset_set32(tcg_target_call_clobber_regs, 0,
+                     (1 << TCG_REG_R0) |
+                     (1 << TCG_REG_R1) |
+                     (1 << TCG_REG_R2) |
+                     (1 << TCG_REG_R3) |
+                     (1 << TCG_REG_R4) |
+                     (1 << TCG_REG_R5) |
+                     (1 << TCG_REG_R14)); /* link register */
+
+    tcg_regset_clear(s->reserved_regs);
+    /* frequently used as a temporary */
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13);
+    /* another temporary */
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R12);
+
+    tcg_add_target_add_op_defs(s390_op_defs);
 }
 
 void tcg_target_qemu_prologue(TCGContext *s)
 {
-    /* gets called with KVM */
+    /* stmg %r6,%r15,48(%r15) (save registers) */
+    tcg_out16(s, 0xeb6f);
+    tcg_out32(s, 0xf0300024);
+
+    /* aghi %r15,-160 (stack frame) */
+    tcg_out32(s, 0xa7fbff60);
+
+    /* br %r2 (go to TB) */
+    tcg_out16(s, S390_INS_BR | TCG_REG_R2);
+
+    tb_ret_addr = s->code_ptr;
+
+    /* lmg %r6,%r15,208(%r15) (restore registers) */
+    tcg_out16(s, 0xeb6f);
+    tcg_out32(s, 0xf0d00004);
+
+    /* br %r14 (return) */
+    tcg_out16(s, S390_INS_BR | TCG_REG_R14);
 }
 
 static inline void tcg_out_mov(TCGContext *s, int ret, int arg)
 {
-    tcg_abort();
+    tcg_out_b9(s, B9_LGR, ret, arg);
 }
 
 static inline void tcg_out_addi(TCGContext *s, int reg, tcg_target_long val)
-- 
1.7.0.1
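
The sub_i32 and sub_i64 cases in the patch above lower a three-operand
TCG op onto two-operand host instructions, using %r13 as a scratch
register when the destination aliases the second source.  A minimal
sketch of that case analysis, simulated on a plain register array (the
names lr, sr, lower_sub and SCRATCH are illustrative and not part of
the patch):

```c
#include <assert.h>
#include <stdint.h>

enum { NREGS = 16, SCRATCH = 13 };   /* %r13 is reserved as a temporary */

/* Two-operand primitives: the first operand is source and destination. */
static void lr(uint32_t *r, int dst, int src) { r[dst] = r[src]; }
static void sr(uint32_t *r, int dst, int src) { r[dst] -= r[src]; }

/* Lower "d = a - b" onto lr/sr, mirroring the three cases in the patch. */
static void lower_sub(uint32_t *r, int d, int a, int b)
{
    /* The register allocator must never hand out the scratch register. */
    assert(d != SCRATCH && a != SCRATCH && b != SCRATCH);

    if (d == a) {
        sr(r, d, b);                 /* sr %d, %b */
    } else if (d == b) {
        lr(r, SCRATCH, b);           /* save b before d is overwritten */
        lr(r, d, a);                 /* lr %d, %a */
        sr(r, d, SCRATCH);           /* sr %d, %r13 */
    } else {
        lr(r, d, a);                 /* lr %d, %a */
        sr(r, d, b);                 /* sr %d, %b */
    }
}
```

Subtraction needs the scratch copy because it does not commute; the
add and logic cases can simply swap the operands instead.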

* [Qemu-devel] [PATCH 02/62] add lost chunks from the original patch
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 01/62] S390 TCG target Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-28 16:49   ` Andreas Färber
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 03/62] tcg-s390: Only validate CPUTLBEntry for system mode Richard Henderson
                   ` (60 subsequent siblings)
  62 siblings, 1 reply; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexander Graf, aurelien

From: Alexander Graf <agraf@suse.de>

---
 tcg/s390/tcg-target.c |    3 ++
 tcg/s390/tcg-target.h |   86 +++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 86 insertions(+), 3 deletions(-)
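
For readers of the S390_INS_BCR and S390_INS_BR defines added below:
BCR is a two-byte RR-format instruction with opcode 0x07, a 4-bit
condition mask and a 4-bit target register, and BR is the same
instruction with the mask set to 15 (branch always).  A small sketch
of the encoding; encode_bcr is a hypothetical helper, not a function
in the patch:

```c
#include <assert.h>
#include <stdint.h>

#define S390_INS_BCR 0x0700

/* Build "bcr mask, %reg": opcode 0x07, then mask and register nibbles. */
static uint16_t encode_bcr(unsigned mask, unsigned reg)
{
    assert(mask < 16 && reg < 16);
    return (uint16_t)(S390_INS_BCR | (mask << 4) | reg);
}
```

With mask 15 and register 14 this yields 0x07fe, the "br %r14" that
the prologue in patch 01 emits as S390_INS_BR | TCG_REG_R14.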

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index d2a93c2..45c1bf7 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -96,6 +96,9 @@ static void *qemu_st_helpers[4] = {
     __stl_mmu,
     __stq_mmu,
 };
+#endif
+
+static uint8_t *tb_ret_addr;
 
 static void patch_reloc(uint8_t *code_ptr, int type,
                 tcg_target_long value, tcg_target_long addend)
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index d8a2955..bd72115 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -47,7 +47,7 @@ enum {
 #define TCG_TARGET_NB_REGS 16
 
 /* optional instructions */
-// #define TCG_TARGET_HAS_div_i32
+#define TCG_TARGET_HAS_div_i32
 // #define TCG_TARGET_HAS_rot_i32
 // #define TCG_TARGET_HAS_ext8s_i32
 // #define TCG_TARGET_HAS_ext16s_i32
@@ -56,7 +56,7 @@ enum {
 // #define TCG_TARGET_HAS_bswap16_i32
 // #define TCG_TARGET_HAS_bswap32_i32
 // #define TCG_TARGET_HAS_not_i32
-// #define TCG_TARGET_HAS_neg_i32
+#define TCG_TARGET_HAS_neg_i32
 // #define TCG_TARGET_HAS_andc_i32
 // #define TCG_TARGET_HAS_orc_i32
 // #define TCG_TARGET_HAS_eqv_i32
@@ -75,7 +75,7 @@ enum {
 // #define TCG_TARGET_HAS_bswap32_i64
 // #define TCG_TARGET_HAS_bswap64_i64
 // #define TCG_TARGET_HAS_not_i64
-// #define TCG_TARGET_HAS_neg_i64
+#define TCG_TARGET_HAS_neg_i64
 // #define TCG_TARGET_HAS_andc_i64
 // #define TCG_TARGET_HAS_orc_i64
 // #define TCG_TARGET_HAS_eqv_i64
@@ -87,6 +87,86 @@ enum {
 #define TCG_TARGET_STACK_ALIGN		8
 #define TCG_TARGET_CALL_STACK_OFFSET	0
 
+#define TCG_CT_CONST_S16                0x100
+#define TCG_CT_CONST_U12                0x200
+
+#define E3_LG          0x04
+#define E3_LRVG        0x0f
+#define E3_LGF         0x14
+#define E3_LGH         0x15
+#define E3_LLGF        0x16
+#define E3_LRV         0x1e
+#define E3_LRVH        0x1f
+#define E3_CG          0x20
+#define E3_STG         0x24
+#define E3_STRVG       0x2f
+#define E3_STRV        0x3e
+#define E3_STRVH       0x3f
+#define E3_STHY        0x70
+#define E3_STCY        0x72
+#define E3_LGB         0x77
+#define E3_LLGC        0x90
+#define E3_LLGH        0x91
+
+#define B9_LGR         0x04
+#define B9_AGR         0x08
+#define B9_SGR         0x09
+#define B9_MSGR        0x0c
+#define B9_LGFR        0x14
+#define B9_LLGFR       0x16
+#define B9_CGR         0x20
+#define B9_CLGR        0x21
+#define B9_NGR         0x80
+#define B9_OGR         0x81
+#define B9_XGR         0x82
+#define B9_DLGR        0x87
+#define B9_DLR         0x97
+
+#define RR_BASR        0x0d
+#define RR_NR          0x14
+#define RR_CLR         0x15
+#define RR_OR          0x16
+#define RR_XR          0x17
+#define RR_LR          0x18
+#define RR_CR          0x19
+#define RR_AR          0x1a
+#define RR_SR          0x1b
+
+#define A7_AHI         0xa
+#define A7_AHGI        0xb
+
+#define SH64_REG_NONE  0x00 /* use immediate only (not R0!) */
+#define SH64_SRAG      0x0a
+#define SH64_SRLG      0x0c
+#define SH64_SLLG      0x0d
+
+#define SH32_REG_NONE  0x00 /* use immediate only (not R0!) */
+#define SH32_SRL       0x08
+#define SH32_SLL       0x09
+#define SH32_SRA       0x0a
+
+#define ST_STH         0x40
+#define ST_STC         0x42
+#define ST_ST          0x50
+
+#define LD_SIGNED      0x04
+#define LD_UINT8       0x00
+#define LD_INT8        (LD_UINT8 | LD_SIGNED)
+#define LD_UINT16      0x01
+#define LD_INT16       (LD_UINT16 | LD_SIGNED)
+#define LD_UINT32      0x02
+#define LD_INT32       (LD_UINT32 | LD_SIGNED)
+#define LD_UINT64      0x03
+#define LD_INT64       (LD_UINT64 | LD_SIGNED)
+
+#define S390_INS_BCR   0x0700
+#define S390_INS_BR    (S390_INS_BCR | 0x00f0)
+#define S390_INS_IILH  0xa5020000
+#define S390_INS_LLILL 0xa50f0000
+#define S390_INS_LGHI  0xa7090000
+#define S390_INS_MSR   0xb2520000
+#define S390_INS_LARL  0xc000
+
 enum {
     /* Note: must be synced with dyngen-exec.h */
     TCG_AREG0 = TCG_REG_R10,
-- 
1.7.0.1

* [Qemu-devel] [PATCH 03/62] tcg-s390: Only validate CPUTLBEntry for system mode.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 01/62] S390 TCG target Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 02/62] add lost chunks from the original patch Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 04/62] tcg-s390: Fix tcg_prepare_qemu_ldst for user mode Richard Henderson
                   ` (59 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)
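
The check being guarded compares sizeof(CPUTLBEntry) with
1 << CPU_TLB_ENTRY_BITS, presumably so that a TLB index can be turned
into a byte offset with a shift.  A sketch of that relationship with a
hypothetical stand-in structure (DemoTLBEntry and DEMO_TLB_ENTRY_BITS
are assumptions for illustration; the real definitions live in QEMU's
headers and only matter for system emulation):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-byte entry standing in for CPUTLBEntry. */
typedef struct {
    uint64_t addr_read;
    uint64_t addr_write;
    uint64_t addr_code;
    uint64_t addend;
} DemoTLBEntry;

#define DEMO_TLB_ENTRY_BITS 5

/* The fail-safe condition: the shift count must match the entry size. */
static int tlb_entry_size_ok(void)
{
    return ((size_t)1 << DEMO_TLB_ENTRY_BITS) == sizeof(DemoTLBEntry);
}

/* Byte offset of entry 'index' computed by shifting, not multiplying. */
static size_t tlb_entry_offset(size_t index)
{
    return index << DEMO_TLB_ENTRY_BITS;
}
```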

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 45c1bf7..9ab1d96 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -1198,10 +1198,12 @@ do_logic_i64:
 
 void tcg_target_init(TCGContext *s)
 {
+#if !defined(CONFIG_USER_ONLY)
     /* fail safe */
     if ((1 << CPU_TLB_ENTRY_BITS) != sizeof(CPUTLBEntry)) {
         tcg_abort();
     }
+#endif
 
     tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffff);
     tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffff);
-- 
1.7.0.1

* [Qemu-devel] [PATCH 04/62] tcg-s390: Fix tcg_prepare_qemu_ldst for user mode.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (2 preceding siblings ...)
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 03/62] tcg-s390: Only validate CPUTLBEntry for system mode Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 05/62] tcg-s390: Move opcode defines to tcg-target.c Richard Henderson
                   ` (58 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

This isn't the most efficient way to implement user
memory accesses, but it's the minimal change to fix
the compilation error.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)
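
The choice between LLGFR and LGR below depends on the guest address
width: a 32-bit guest address must be zero-extended before it is used
as a 64-bit host address, while a 64-bit one is copied unchanged.  A
sketch that models the two instructions as C functions (the function
names are illustrative only):

```c
#include <stdint.h>

/* Models LLGFR: keep the low 32 bits and zero-extend to 64 bits. */
static uint64_t llgfr(uint64_t src) { return (uint32_t)src; }

/* Models LGR: plain 64-bit register copy. */
static uint64_t lgr(uint64_t src) { return src; }

/* Pick the move that the patch selects via TARGET_LONG_BITS. */
static uint64_t user_addr_to_host_reg(uint64_t addr_reg, int target_long_bits)
{
    return target_long_bits == 32 ? llgfr(addr_reg) : lgr(addr_reg);
}
```

The zero-extension matters because the upper half of a host register
holding a 32-bit guest value is not guaranteed to be clear.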

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 9ab1d96..f0013e7 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -418,8 +418,14 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
                                 int mem_index, int opc,
                                 uint16_t **label2_ptr_p, int is_store)
 {
+    int arg0 = TCG_REG_R2;
+
     /* user mode, no address translation required */
-    *arg0 = addr_reg;
+    if (TARGET_LONG_BITS == 32) {
+        tcg_out_b9(s, B9_LLGFR, arg0, addr_reg);
+    } else {
+        tcg_out_b9(s, B9_LGR, arg0, addr_reg);
+    }
 }
 
 static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
-- 
1.7.0.1

* [Qemu-devel] [PATCH 05/62] tcg-s390: Move opcode defines to tcg-target.c.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (3 preceding siblings ...)
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 04/62] tcg-s390: Fix tcg_prepare_qemu_ldst for user mode Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 06/62] s390x: Avoid _llseek Richard Henderson
                   ` (57 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

In addition to being the Right Thing, some of the RR_* defines
conflict with RR_* enumerations in target-mips/cpu.h.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   81 +++++++++++++++++++++++++++++++++++++++++++++++++
 tcg/s390/tcg-target.h |   80 ------------------------------------------------
 2 files changed, 81 insertions(+), 80 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index f0013e7..1f961ad 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -33,6 +33,87 @@
     do { } while (0)
 #endif
 
+#define TCG_CT_CONST_S16                0x100
+#define TCG_CT_CONST_U12                0x200
+
+#define E3_LG          0x04
+#define E3_LRVG        0x0f
+#define E3_LGF         0x14
+#define E3_LGH         0x15
+#define E3_LLGF        0x16
+#define E3_LRV         0x1e
+#define E3_LRVH        0x1f
+#define E3_CG          0x20
+#define E3_STG         0x24
+#define E3_STRVG       0x2f
+#define E3_STRV        0x3e
+#define E3_STRVH       0x3f
+#define E3_STHY        0x70
+#define E3_STCY        0x72
+#define E3_LGB         0x77
+#define E3_LLGC        0x90
+#define E3_LLGH        0x91
+
+#define B9_LGR         0x04
+#define B9_AGR         0x08
+#define B9_SGR         0x09
+#define B9_MSGR        0x0c
+#define B9_LGFR        0x14
+#define B9_LLGFR       0x16
+#define B9_CGR         0x20
+#define B9_CLGR        0x21
+#define B9_NGR         0x80
+#define B9_OGR         0x81
+#define B9_XGR         0x82
+#define B9_DLGR        0x87
+#define B9_DLR         0x97
+
+#define RR_BASR        0x0d
+#define RR_NR          0x14
+#define RR_CLR         0x15
+#define RR_OR          0x16
+#define RR_XR          0x17
+#define RR_LR          0x18
+#define RR_CR          0x19
+#define RR_AR          0x1a
+#define RR_SR          0x1b
+
+#define A7_AHI         0xa
+#define A7_AHGI        0xb
+
+#define SH64_REG_NONE  0x00 /* use immediate only (not R0!) */
+#define SH64_SRAG      0x0a
+#define SH64_SRLG      0x0c
+#define SH64_SLLG      0x0d
+
+#define SH32_REG_NONE  0x00 /* use immediate only (not R0!) */
+#define SH32_SRL       0x08
+#define SH32_SLL       0x09
+#define SH32_SRA       0x0a
+
+#define ST_STH         0x40
+#define ST_STC         0x42
+#define ST_ST          0x50
+
+#define LD_SIGNED      0x04
+#define LD_UINT8       0x00
+#define LD_INT8        (LD_UINT8 | LD_SIGNED)
+#define LD_UINT16      0x01
+#define LD_INT16       (LD_UINT16 | LD_SIGNED)
+#define LD_UINT32      0x02
+#define LD_INT32       (LD_UINT32 | LD_SIGNED)
+#define LD_UINT64      0x03
+#define LD_INT64       (LD_UINT64 | LD_SIGNED)
+
+#define S390_INS_BCR   0x0700
+#define S390_INS_BR    (S390_INS_BCR | 0x00f0)
+#define S390_INS_IILH  0xa5020000
+#define S390_INS_LLILL 0xa50f0000
+#define S390_INS_LGHI  0xa7090000
+#define S390_INS_MSR   0xb2520000
+#define S390_INS_LARL  0xc000
+
+
 static const int tcg_target_reg_alloc_order[] = {
     TCG_REG_R6,
     TCG_REG_R7,
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index bd72115..7495258 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -87,86 +87,6 @@ enum {
 #define TCG_TARGET_STACK_ALIGN		8
 #define TCG_TARGET_CALL_STACK_OFFSET	0
 
-#define TCG_CT_CONST_S16                0x100
-#define TCG_CT_CONST_U12                0x200
-
-#define E3_LG          0x04
-#define E3_LRVG        0x0f
-#define E3_LGF         0x14
-#define E3_LGH         0x15
-#define E3_LLGF        0x16
-#define E3_LRV         0x1e
-#define E3_LRVH        0x1f
-#define E3_CG          0x20
-#define E3_STG         0x24
-#define E3_STRVG       0x2f
-#define E3_STRV        0x3e
-#define E3_STRVH       0x3f
-#define E3_STHY        0x70
-#define E3_STCY        0x72
-#define E3_LGB         0x77
-#define E3_LLGC        0x90
-#define E3_LLGH        0x91
-
-#define B9_LGR         0x04
-#define B9_AGR         0x08
-#define B9_SGR         0x09
-#define B9_MSGR        0x0c
-#define B9_LGFR        0x14
-#define B9_LLGFR       0x16
-#define B9_CGR         0x20
-#define B9_CLGR        0x21
-#define B9_NGR         0x80
-#define B9_OGR         0x81
-#define B9_XGR         0x82
-#define B9_DLGR        0x87
-#define B9_DLR         0x97
-
-#define RR_BASR        0x0d
-#define RR_NR          0x14
-#define RR_CLR         0x15
-#define RR_OR          0x16
-#define RR_XR          0x17
-#define RR_LR          0x18
-#define RR_CR          0x19
-#define RR_AR          0x1a
-#define RR_SR          0x1b
-
-#define A7_AHI         0xa
-#define A7_AHGI        0xb
-
-#define SH64_REG_NONE  0x00 /* use immediate only (not R0!) */
-#define SH64_SRAG      0x0a
-#define SH64_SRLG      0x0c
-#define SH64_SLLG      0x0d
-
-#define SH32_REG_NONE  0x00 /* use immediate only (not R0!) */
-#define SH32_SRL       0x08
-#define SH32_SLL       0x09
-#define SH32_SRA       0x0a
-
-#define ST_STH         0x40
-#define ST_STC         0x42
-#define ST_ST          0x50
-
-#define LD_SIGNED      0x04
-#define LD_UINT8       0x00
-#define LD_INT8        (LD_UINT8 | LD_SIGNED)
-#define LD_UINT16      0x01
-#define LD_INT16       (LD_UINT16 | LD_SIGNED)
-#define LD_UINT32      0x02
-#define LD_INT32       (LD_UINT32 | LD_SIGNED)
-#define LD_UINT64      0x03
-#define LD_INT64       (LD_UINT64 | LD_SIGNED)
-
-#define S390_INS_BCR   0x0700
-#define S390_INS_BR    (S390_INS_BCR | 0x00f0)
-#define S390_INS_IILH  0xa5020000
-#define S390_INS_LLILL 0xa50f0000
-#define S390_INS_LGHI  0xa7090000
-#define S390_INS_MSR   0xb2520000
-#define S390_INS_LARL  0xc000
-
 enum {
     /* Note: must be synced with dyngen-exec.h */
     TCG_AREG0 = TCG_REG_R10,
-- 
1.7.0.1

* [Qemu-devel] [PATCH 06/62] s390x: Avoid _llseek.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (4 preceding siblings ...)
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 05/62] tcg-s390: Move opcode defines to tcg-target.c Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 07/62] s390x: Don't use a linker script for user-only Richard Henderson
                   ` (56 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

There's no _llseek on s390x either.  Replace the existing
test for __x86_64__ with a functional test for __NR_llseek.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 linux-user/syscall.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
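
On hosts without __NR_llseek the fallback path below calls lseek()
with the two 32-bit halves of the guest offset joined into one 64-bit
value.  A sketch of just that combination step; llseek_offset is a
hypothetical helper, and the halves are taken as uint32_t here (an
assumption of this sketch) so that the low half cannot sign-extend
into the high half:

```c
#include <stdint.h>

/* Join the high and low 32-bit words of an _llseek offset. */
static int64_t llseek_offset(uint32_t hi, uint32_t lo)
{
    return (int64_t)(((uint64_t)hi << 32) | lo);
}
```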

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 8222cb9..e94f1ee 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -208,7 +208,7 @@ _syscall3(int, sys_getdents, uint, fd, struct linux_dirent *, dirp, uint, count)
 _syscall3(int, sys_getdents64, uint, fd, struct linux_dirent64 *, dirp, uint, count);
 #endif
 _syscall2(int, sys_getpriority, int, which, int, who);
-#if defined(TARGET_NR__llseek) && !defined (__x86_64__)
+#if defined(TARGET_NR__llseek) && defined(__NR_llseek)
 _syscall5(int, _llseek,  uint,  fd, ulong, hi, ulong, lo,
           loff_t *, res, uint, wh);
 #endif
@@ -5933,7 +5933,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #ifdef TARGET_NR__llseek /* Not on alpha */
     case TARGET_NR__llseek:
         {
-#if defined (__x86_64__)
+#if !defined(__NR_llseek)
             ret = get_errno(lseek(arg1, ((uint64_t )arg2 << 32) | arg3, arg5));
             if (put_user_s64(ret, arg4))
                 goto efault;
-- 
1.7.0.1

* [Qemu-devel] [PATCH 07/62] s390x: Don't use a linker script for user-only.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (5 preceding siblings ...)
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 06/62] s390x: Avoid _llseek Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 08/62] tcg-s390: Avoid set-but-not-used werrors Richard Henderson
                   ` (55 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The default placement of the application at 0x80000000 is fine,
and will avoid the default placement for most other guests.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 configure |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/configure b/configure
index 3cd2c5f..e2b389d 100755
--- a/configure
+++ b/configure
@@ -2753,6 +2753,9 @@ if test "$target_linux_user" = "yes" -o "$target_bsd_user" = "yes" ; then
     # -static is used to avoid g1/g3 usage by the dynamic linker
     ldflags="$linker_script -static $ldflags"
     ;;
+  alpha | s390x)
+    # The default placement of the application is fine.
+    ;;
   *)
     ldflags="$linker_script $ldflags"
     ;;
-- 
1.7.0.1

* [Qemu-devel] [PATCH 08/62] tcg-s390: Avoid set-but-not-used werrors.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (6 preceding siblings ...)
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 07/62] s390x: Don't use a linker script for user-only Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 09/62] tcg-s390: Mark R0 & R15 reserved Richard Henderson
                   ` (54 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The s_bits variable was only used in a dprintf, and isn't
really informative since we already dump 'opc' from which
s_bits is trivially derived.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   16 ++++++----------
 1 files changed, 6 insertions(+), 10 deletions(-)
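
To illustrate why s_bits adds nothing once opc is printed, here is a sketch
of the derivation.  It assumes the encoding suggested by the removed
"opc & 3" line: the low two bits of opc hold log2 of the access size, and
bit 2 marks a sign-extending load.  The helper names are hypothetical.

```c
/* Assumed encoding: bits 0-1 of opc are log2 of the access size in bytes,
   bit 2 marks a sign-extending load.  */
static int opc_size_bits(int opc)
{
    return opc & 3;
}

static int opc_is_signed(int opc)
{
    return (opc & 4) != 0;
}

/* Access width in bytes: 1, 2, 4 or 8.  */
static int opc_access_bytes(int opc)
{
    return 1 << opc_size_bits(opc);
}
```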

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 1f961ad..eb3ca38 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -519,7 +519,7 @@ static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
    and endianness conversion */
 static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 {
-    int addr_reg, data_reg, mem_index, s_bits;
+    int addr_reg, data_reg, mem_index;
     int arg0 = TCG_REG_R2;
     uint16_t *label2_ptr;
 
@@ -527,10 +527,8 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
     addr_reg = *args++;
     mem_index = *args;
 
-    s_bits = opc & 3;
-
-    dprintf("tcg_out_qemu_ld opc %d data_reg %d addr_reg %d mem_index %d "
-            "s_bits %d\n", opc, data_reg, addr_reg, mem_index, s_bits);
+    dprintf("tcg_out_qemu_ld opc %d data_reg %d addr_reg %d mem_index %d\n",
+            opc, data_reg, addr_reg, mem_index);
 
     tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
                           opc, &label2_ptr, 0);
@@ -596,7 +594,7 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 
 static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
 {
-    int addr_reg, data_reg, mem_index, s_bits;
+    int addr_reg, data_reg, mem_index;
     uint16_t *label2_ptr;
     int arg0 = TCG_REG_R2;
 
@@ -604,10 +602,8 @@ static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
     addr_reg = *args++;
     mem_index = *args;
 
-    s_bits = opc;
-
-    dprintf("tcg_out_qemu_st opc %d data_reg %d addr_reg %d mem_index %d "
-            "s_bits %d\n", opc, data_reg, addr_reg, mem_index, s_bits);
+    dprintf("tcg_out_qemu_st opc %d data_reg %d addr_reg %d mem_index %d\n",
+            opc, data_reg, addr_reg, mem_index);
 
     tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
                           opc, &label2_ptr, 1);
-- 
1.7.0.1

* [Qemu-devel] [PATCH 09/62] tcg-s390: Mark R0 & R15 reserved.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (7 preceding siblings ...)
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 08/62] tcg-s390: Avoid set-but-not-used werrors Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 10/62] tcg-s390: R6 is a function argument register Richard Henderson
                   ` (53 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Don't merely exclude them from the register allocation order.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)
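
A sketch of the distinction, not the TCG implementation: a register is
excluded by being in a reserved set, so it stays excluded even when it
appears in the preference order.  The regset_t type and all function names
here are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdint.h>

enum { NB_REGS = 16 };
typedef uint32_t regset_t;      /* hypothetical: one bit per register */

static void regset_set_reg(regset_t *set, int reg)
{
    assert(reg >= 0 && reg < NB_REGS);
    *set |= (regset_t)1 << reg;
}

static int regset_test_reg(regset_t set, int reg)
{
    return (set >> reg) & 1;
}

/* Return the first register in `order` that is neither reserved nor
   already in use, or -1 if there is none.  */
static int pick_reg(const int *order, int n, regset_t reserved,
                    regset_t in_use)
{
    for (int i = 0; i < n; i++) {
        if (!regset_test_reg(reserved | in_use, order[i])) {
            return order[i];
        }
    }
    return -1;
}
```

With R0 and R15 in the reserved set, pick_reg skips them regardless of
where they sit in the order array.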

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index eb3ca38..6988937 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -124,8 +124,7 @@ static const int tcg_target_reg_alloc_order[] = {
     TCG_REG_R12,
     TCG_REG_R13,
     TCG_REG_R14,
-    /* XXX many insns can't be used with R0, so we better avoid it for now */
-    /* TCG_REG_R0 */
+    TCG_REG_R0,
     TCG_REG_R1,
     TCG_REG_R2,
     TCG_REG_R3,
@@ -1304,6 +1303,10 @@ void tcg_target_init(TCGContext *s)
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13);
     /* another temporary */
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_R12);
+    /* XXX many insns can't be used with R0, so we better avoid it for now */
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R0);
+    /* The stack pointer.  */
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R15);
 
     tcg_add_target_add_op_defs(s390_op_defs);
 }
-- 
1.7.0.1

* [Qemu-devel] [PATCH 10/62] tcg-s390: R6 is a function argument register
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (8 preceding siblings ...)
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 09/62] tcg-s390: Mark R0 & R15 reserved Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 11/62] tcg-s390: Move tcg_out_mov up and use it throughout Richard Henderson
                   ` (52 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
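
For context, the s390x ELF ABI passes the first five integer arguments in
%r2 through %r6.  A minimal sketch of the table and its element count,
using plain register numbers and hypothetical names in place of the
TCG_REG_* constants:

```c
#include <assert.h>

/* Integer argument registers in call order: %r2 .. %r6.  */
static const int call_iarg_regs[] = { 2, 3, 4, 5, 6 };

static int call_iarg_regs_count(void)
{
    return (int)(sizeof(call_iarg_regs) / sizeof(call_iarg_regs[0]));
}

/* Register that carries integer argument number n (0-based).  */
static int call_iarg_reg(int n)
{
    assert(n >= 0 && n < call_iarg_regs_count());
    return call_iarg_regs[n];
}
```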

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 6988937..25c80e6 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -137,6 +137,7 @@ static const int tcg_target_call_iarg_regs[] = {
     TCG_REG_R3,
     TCG_REG_R4,
     TCG_REG_R5,
+    TCG_REG_R6,
 };
 
 static const int tcg_target_call_oarg_regs[] = {
-- 
1.7.0.1

* [Qemu-devel] [PATCH 11/62] tcg-s390: Move tcg_out_mov up and use it throughout.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (9 preceding siblings ...)
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 10/62] tcg-s390: R6 is a function argument register Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 12/62] tcg-s390: Eliminate the S constraint Richard Henderson
                   ` (51 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   42 ++++++++++++++++++++----------------------
 1 files changed, 20 insertions(+), 22 deletions(-)
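
The "???" comment in the hunk below notes that a type-aware move could use
the shorter LR instruction.  A sketch of the two encodings, with
hypothetical function names: LGR is a 4-byte RRE-format instruction with
opcode 0xB904, and LR is a 2-byte RR-format instruction with opcode 0x18.

```c
#include <assert.h>
#include <stdint.h>

/* LGR r1,r2: 64-bit register copy, RRE format, 4 bytes.  */
static uint32_t encode_lgr(unsigned r1, unsigned r2)
{
    assert(r1 < 16 && r2 < 16);
    return 0xB9040000u | (r1 << 4) | r2;
}

/* LR r1,r2: 32-bit register copy, RR format, 2 bytes.  */
static uint16_t encode_lr(unsigned r1, unsigned r2)
{
    assert(r1 < 16 && r2 < 16);
    return (uint16_t)(0x1800u | (r1 << 4) | r2);
}
```

LR leaves the upper 32 bits of the destination untouched, so it is only a
safe substitute when the value is known to be a 32-bit quantity.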

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 25c80e6..455cf6a 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -315,6 +315,12 @@ static void tcg_out_store(TCGContext* s, int op, int r0, int r1, int off)
     tcg_out32(s, (op << 24) | (r0 << 20) | (r1 << 12) | off);
 }
 
+static inline void tcg_out_mov(TCGContext *s, int ret, int arg)
+{
+    /* ??? With a TCGType argument, we could emit the smaller LR insn.  */
+    tcg_out_b9(s, B9_LGR, ret, arg);
+}
+
 /* load a register with an immediate value */
 static inline void tcg_out_movi(TCGContext *s, TCGType type,
                 int ret, tcg_target_long arg)
@@ -386,8 +392,8 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     tcg_out_b9(s, B9_LLGFR, arg1, addr_reg);
     tcg_out_b9(s, B9_LLGFR, arg0, addr_reg);
 #else
-    tcg_out_b9(s, B9_LGR, arg1, addr_reg);
-    tcg_out_b9(s, B9_LGR, arg0, addr_reg);
+    tcg_out_mov(s, arg1, addr_reg);
+    tcg_out_mov(s, arg0, addr_reg);
 #endif
 
     tcg_out_sh64(s, SH64_SRLG, arg1, addr_reg, SH64_REG_NONE,
@@ -423,11 +429,11 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
 #if TARGET_LONG_BITS == 32
     tcg_out_b9(s, B9_LLGFR, arg0, addr_reg);
 #else
-    tcg_out_b9(s, B9_LGR, arg0, addr_reg);
+    tcg_out_mov(s, arg0, addr_reg);
 #endif
 
     if (is_store) {
-        tcg_out_b9(s, B9_LGR, arg1, data_reg);
+        tcg_out_mov(s, arg1, data_reg);
         tcg_out_movi(s, TCG_TYPE_I32, arg2, mem_index);
         tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
                      (tcg_target_ulong)qemu_st_helpers[s_bits]);
@@ -453,7 +459,7 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
             break;
         default:
             /* unsigned -> just copy */
-            tcg_out_b9(s, B9_LGR, data_reg, arg0);
+            tcg_out_mov(s, data_reg, arg0);
             break;
         }
     }
@@ -481,7 +487,7 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     tcg_out_b9(s, B9_LLGFR, arg0, addr_reg);
 #else
     /* just copy */
-    tcg_out_b9(s, B9_LGR, arg0, addr_reg);
+    tcg_out_mov(s, arg0, addr_reg);
 #endif
     tcg_out_b9(s, B9_AGR, arg0, arg1);
   }
@@ -505,7 +511,7 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     if (TARGET_LONG_BITS == 32) {
         tcg_out_b9(s, B9_LLGFR, arg0, addr_reg);
     } else {
-        tcg_out_b9(s, B9_LGR, arg0, addr_reg);
+        tcg_out_mov(s, arg0, addr_reg);
     }
 }
 
@@ -898,15 +904,12 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
             /* sgr %ra0/1, %ra2 */
             tcg_out_b9(s, B9_SGR, args[1], args[2]);
         } else if (args[0] == args[2]) {
-            /* lgr %r13, %raa0/2 */
-            tcg_out_b9(s, B9_LGR, TCG_REG_R13, args[2]);
-            /* lgr %ra0/2, %ra1 */
-            tcg_out_b9(s, B9_LGR, args[0], args[1]);
+            tcg_out_mov(s, TCG_REG_R13, args[2]);
+            tcg_out_mov(s, args[0], args[1]);
             /* sgr %ra0/2, %r13 */
             tcg_out_b9(s, B9_SGR, args[0], TCG_REG_R13);
         } else {
-            /* lgr %ra0, %ra1 */
-            tcg_out_b9(s, B9_LGR, args[0], args[1]);
+            tcg_out_mov(s, args[0], args[1]);
             /* sgr %ra0, %ra2 */
             tcg_out_b9(s, B9_SGR, args[0], args[2]);
         }
@@ -920,7 +923,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         } else if (args[0] == args[2]) {
             tcg_out_b9(s, B9_AGR, args[0], args[1]);
         } else {
-            tcg_out_b9(s, B9_LGR, args[0], args[1]);
+            tcg_out_mov(s, args[0], args[1]);
             tcg_out_b9(s, B9_AGR, args[0], args[2]);
         }
         break;
@@ -960,7 +963,7 @@ do_logic_i64:
         } else if (args[0] == args[2]) {
             tcg_out_b9(s, op, args[0], args[1]);
         } else {
-            tcg_out_b9(s, B9_LGR, args[0], args[1]);
+            tcg_out_mov(s, args[0], args[1]);
             tcg_out_b9(s, op, args[0], args[2]);
         }
         break;
@@ -987,10 +990,10 @@ do_logic_i64:
         dprintf("op 0x%x neg_i64 0x%lx 0x%lx 0x%lx\n",
                 opc, args[0], args[1], args[2]);
         /* FIXME: optimize args[0] != args[1] case */
-        tcg_out_b9(s, B9_LGR, 13, args[1]);
+        tcg_out_mov(s, TCG_REG_R13, args[1]);
         /* lghi %ra0, 0 */
         tcg_out32(s, S390_INS_LGHI | (args[0] << 20));
-        tcg_out_b9(s, B9_SGR, args[0], 13);
+        tcg_out_b9(s, B9_SGR, args[0], TCG_REG_R13);
         break;
 
     case INDEX_op_mul_i32:
@@ -1334,11 +1337,6 @@ void tcg_target_qemu_prologue(TCGContext *s)
     tcg_out16(s, S390_INS_BR | TCG_REG_R14);
 }
 
-static inline void tcg_out_mov(TCGContext *s, int ret, int arg)
-{
-    tcg_out_b9(s, B9_LGR, ret, arg);
-}
-
 static inline void tcg_out_addi(TCGContext *s, int reg, tcg_target_long val)
 {
     tcg_abort();
-- 
1.7.0.1

* [Qemu-devel] [PATCH 12/62] tcg-s390: Eliminate the S constraint.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (10 preceding siblings ...)
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 11/62] tcg-s390: Move tcg_out_mov up and use it throughout Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 13/62] tcg-s390: Add -m64 and -march to s390x compilation Richard Henderson
                   ` (50 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

R4 is not clobbered until all of the inputs are consumed,
so there's no need to avoid R4 in the qemu_st paths.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   30 ++++++------------------------
 1 files changed, 6 insertions(+), 24 deletions(-)
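
A sketch of what the merged constraint amounts to, assuming each constraint
starts from all sixteen registers and then clears the ones it must avoid.
constraint_mask is a hypothetical name, and the bitmask stands in for
ct->u.regs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: bit n set means %rn may satisfy the operand.  */
static uint32_t constraint_mask(char c)
{
    uint32_t regs = 0xffff;

    switch (c) {
    case 'r':                   /* any register */
        break;
    case 'L':                   /* qemu_ld/st: avoid R2 and R3 */
        regs &= ~((1u << 2) | (1u << 3));
        break;
    case 'R':                   /* not R0 */
        regs &= ~(1u << 0);
        break;
    default:
        assert(!"unknown constraint letter");
    }
    return regs;
}
```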

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 455cf6a..2f29728 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -202,20 +202,6 @@ static int tcg_target_get_call_iarg_regs_count(int flags)
     return sizeof(tcg_target_call_iarg_regs) / sizeof(int);
 }
 
-static void constraint_softmmu(TCGArgConstraint *ct, const char c)
-{
-#ifdef CONFIG_SOFTMMU
-    switch (c) {
-    case 'S':                   /* qemu_st constraint */
-        tcg_regset_reset_reg (ct->u.regs, TCG_REG_R4);
-        /* fall through */
-    case 'L':                   /* qemu_ld constraint */
-        tcg_regset_reset_reg (ct->u.regs, TCG_REG_R3);
-        break;
-    }
-#endif
-  }
-
 /* parse target specific constraints */
 static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
 {
@@ -226,13 +212,9 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
     ct_str = *pct_str;
 
     switch (ct_str[0]) {
-    case 'L':                   /* qemu_ld constraint */
+    case 'L':                   /* qemu_ld/st constraint */
         tcg_regset_reset_reg (ct->u.regs, TCG_REG_R2);
-        constraint_softmmu(ct, 'L');
-        break;
-    case 'S':                   /* qemu_st constraint */
-        tcg_regset_reset_reg (ct->u.regs, TCG_REG_R2);
-        constraint_softmmu(ct, 'S');
+        tcg_regset_reset_reg (ct->u.regs, TCG_REG_R3);
         break;
     case 'R':                        /* not R0 */
         tcg_regset_reset_reg(ct->u.regs, TCG_REG_R0);
@@ -1239,9 +1221,9 @@ do_logic_i64:
     { INDEX_op_qemu_ld32u, { "r", "L" } },
     { INDEX_op_qemu_ld32s, { "r", "L" } },
 
-    { INDEX_op_qemu_st8, { "S", "S" } },
-    { INDEX_op_qemu_st16, { "S", "S" } },
-    { INDEX_op_qemu_st32, { "S", "S" } },
+    { INDEX_op_qemu_st8, { "L", "L" } },
+    { INDEX_op_qemu_st16, { "L", "L" } },
+    { INDEX_op_qemu_st32, { "L", "L" } },
 
 #if defined(__s390x__)
     { INDEX_op_mov_i64, { "r", "r" } },
@@ -1261,7 +1243,7 @@ do_logic_i64:
     { INDEX_op_st_i64, { "r", "r" } },
 
     { INDEX_op_qemu_ld64, { "L", "L" } },
-    { INDEX_op_qemu_st64, { "S", "S" } },
+    { INDEX_op_qemu_st64, { "L", "L" } },
 
     { INDEX_op_add_i64, { "r", "r", "r" } },
     { INDEX_op_mul_i64, { "r", "r", "r" } },
-- 
1.7.0.1

* [Qemu-devel] [PATCH 13/62] tcg-s390: Add -m64 and -march to s390x compilation.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (11 preceding siblings ...)
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 12/62] tcg-s390: Eliminate the S constraint Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 14/62] tcg-s390: Define tcg_target_reg_names Richard Henderson
                   ` (49 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 configure |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index e2b389d..72d3df8 100755
--- a/configure
+++ b/configure
@@ -697,7 +697,11 @@ case "$cpu" in
            fi
            ;;
     s390)
-           QEMU_CFLAGS="-march=z900 $QEMU_CFLAGS"
+           QEMU_CFLAGS="-march=z990 $QEMU_CFLAGS"
+           ;;
+    s390x)
+           QEMU_CFLAGS="-m64 -march=z9-109 $QEMU_CFLAGS"
+           LDFLAGS="-m64 $LDFLAGS"
            ;;
     i386)
            QEMU_CFLAGS="-m32 $QEMU_CFLAGS"
-- 
1.7.0.1

* [Qemu-devel] [PATCH 14/62] tcg-s390: Define tcg_target_reg_names.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (12 preceding siblings ...)
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 13/62] tcg-s390: Add -m64 and -march to s390x compilation Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 15/62] tcg-s390: Update disassembler from binutils head Richard Henderson
                   ` (48 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)
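
Name tables like this are easy to get subtly wrong, because C concatenates
adjacent string literals: a dropped comma merges two entries, and with an
explicit array size the missing trailing entries silently become NULL.  One
possible guard, sketched with hypothetical names and C11 _Static_assert, is
to leave the array unsized and check its element count at compile time.

```c
#include <assert.h>

enum { NB_REGS = 16 };          /* stand-in for TCG_TARGET_NB_REGS */

/* Unsized on purpose, so sizeof reflects the number of initializers.  */
static const char * const reg_names[] = {
    "%r0", "%r1", "%r2",  "%r3",  "%r4",  "%r5",  "%r6",  "%r7",
    "%r8", "%r9", "%r10", "%r11", "%r12", "%r13", "%r14", "%r15"
};

_Static_assert(sizeof(reg_names) / sizeof(reg_names[0]) == NB_REGS,
               "reg_names needs exactly one entry per register");

static const char *reg_name(int reg)
{
    assert(reg >= 0 && reg < NB_REGS);
    return reg_names[reg];
}
```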

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 2f29728..e0a0e73 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -113,6 +113,12 @@
 #define S390_INS_MSR   0xb2520000
 #define S390_INS_LARL  0xc000
 
+#ifndef NDEBUG
+static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
+    "%r0", "%r1", "%r2", "%r3", "%r4", "%r5", "%r6", "%r7",
+    "%r8", "%r9", "%r10", "%r11", "%r12", "%r13", "%r14", "%r15"
+};
+#endif
 
 static const int tcg_target_reg_alloc_order[] = {
     TCG_REG_R6,
-- 
1.7.0.1

* [Qemu-devel] [PATCH 15/62] tcg-s390: Update disassembler from binutils head.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (13 preceding siblings ...)
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 14/62] tcg-s390: Define tcg_target_reg_names Richard Henderson
@ 2010-05-27 20:45 ` Richard Henderson
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 16/62] tcg-s390: Compute is_write in cpu_signal_handler Richard Henderson
                   ` (47 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:45 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 s390-dis.c |  818 +++++++++++++++++++++++++++++++++++++++++++++++++-----------
 1 files changed, 669 insertions(+), 149 deletions(-)
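
One behavioural change in the update below: PC-relative operands are no
longer doubled inside s390_extract_operand, where the result is a 32-bit
unsigned int and the shift can overflow.  The doubling now happens when the
address is printed.  A sketch of the same idea using 64-bit arithmetic;
pcrel_target is a hypothetical name, and value is assumed to be the
already sign-extended 32-bit operand, counted in halfwords.

```c
#include <stdint.h>

/* value: sign-extended operand as extracted from the instruction,
   counted in halfwords relative to the instruction address.  */
static uint64_t pcrel_target(uint64_t memaddr, uint32_t value)
{
    int64_t halfwords = (int32_t)value;

    /* Widen first, then double, so the product cannot wrap at 32 bits.  */
    return memaddr + (uint64_t)(halfwords * 2);
}
```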

diff --git a/s390-dis.c b/s390-dis.c
index 86dd84f..1284f20 100644
--- a/s390-dis.c
+++ b/s390-dis.c
@@ -1,34 +1,39 @@
+/* opcodes/s390-dis.c revision 1.18 */
 /* s390-dis.c -- Disassemble S390 instructions
-   Copyright 2000, 2001, 2002, 2003, 2005 Free Software Foundation, Inc.
+   Copyright 2000, 2001, 2002, 2003, 2005, 2007, 2008
+   Free Software Foundation, Inc.
    Contributed by Martin Schwidefsky (schwidefsky@de.ibm.com).
 
-   This file is part of GDB, GAS and the GNU binutils.
+   This file is part of the GNU opcodes library.
 
-   This program is free software; you can redistribute it and/or modify
+   This library is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2 of the License, or
-   (at your option) any later version.
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
 
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
+   It is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
 
    You should have received a copy of the GNU General Public License
-   along with this program; if not, see <http://www.gnu.org/licenses/>.  */
+   along with this file; see the file COPYING.  If not, write to the
+   Free Software Foundation, 51 Franklin Street - Fifth Floor, Boston,
+   MA 02110-1301, USA.  */
 
-#include <stdio.h>
+#include "qemu-common.h"
 #include "dis-asm.h"
 
+/* include/opcode/s390.h revision 1.10 */
 /* s390.h -- Header file for S390 opcode table
-   Copyright 2000, 2001, 2003 Free Software Foundation, Inc.
+   Copyright 2000, 2001, 2003, 2010 Free Software Foundation, Inc.
    Contributed by Martin Schwidefsky (schwidefsky@de.ibm.com).
 
    This file is part of BFD, the Binary File Descriptor library.
 
    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2 of the License, or
+   the Free Software Foundation; either version 3 of the License, or
    (at your option) any later version.
 
    This program is distributed in the hope that it will be useful,
@@ -37,7 +42,9 @@
    GNU General Public License for more details.
 
    You should have received a copy of the GNU General Public License
-   along with this program; if not, see <http://www.gnu.org/licenses/>.  */
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
+   02110-1301, USA.  */
 
 #ifndef S390_H
 #define S390_H
@@ -57,7 +64,8 @@ enum s390_opcode_cpu_val
     S390_OPCODE_Z900,
     S390_OPCODE_Z990,
     S390_OPCODE_Z9_109,
-    S390_OPCODE_Z9_EC
+    S390_OPCODE_Z9_EC,
+    S390_OPCODE_Z10
   };
 
 /* The opcode table is an array of struct s390_opcode.  */
@@ -95,12 +103,13 @@ struct s390_opcode
 /* The table itself is sorted by major opcode number, and is otherwise
    in the order in which the disassembler should consider
    instructions.  */
-extern const struct s390_opcode s390_opcodes[];
-extern const int                s390_num_opcodes;
+/* QEMU: Mark these static.  */
+static const struct s390_opcode s390_opcodes[];
+static const int                s390_num_opcodes;
 
 /* A opcode format table for the .insn pseudo mnemonic.  */
-extern const struct s390_opcode s390_opformats[];
-extern const int                s390_num_opformats;
+static const struct s390_opcode s390_opformats[];
+static const int                s390_num_opformats;
 
 /* Values defined for the flags field of a struct powerpc_opcode.  */
 
@@ -164,12 +173,13 @@ extern const struct s390_operand s390_operands[];
    the instruction may be optional.  */
 #define S390_OPERAND_OPTIONAL 0x400
 
-	#endif /* S390_H */
-
+#endif /* S390_H */
 
 static int init_flag = 0;
 static int opc_index[256];
-static int current_arch_mask = 0;
+
+/* QEMU: Just set mask to include all architectures.  */
+static int current_arch_mask = -1;
 
 /* Set up index table for first opcode byte.  */
 
@@ -178,6 +188,9 @@ init_disasm (struct disassemble_info *info)
 {
   const struct s390_opcode *opcode;
   const struct s390_opcode *opcode_end;
+#ifdef QEMU_DISABLE
+  const char *p;
+#endif
 
   memset (opc_index, 0, sizeof (opc_index));
   opcode_end = s390_opcodes + s390_num_opcodes;
@@ -188,21 +201,44 @@ init_disasm (struct disassemble_info *info)
 	     (opcode[1].opcode[0] == opcode->opcode[0]))
 	opcode++;
     }
-//  switch (info->mach)
-//    {
-//    case bfd_mach_s390_31:
-      current_arch_mask = 1 << S390_OPCODE_ESA;
-//      break;
-//    case bfd_mach_s390_64:
-//      current_arch_mask = 1 << S390_OPCODE_ZARCH;
-//      break;
-//    default:
-//      abort ();
-//    }
+
+#ifdef QEMU_DISABLE
+  for (p = info->disassembler_options; p != NULL; )
+    {
+      if (CONST_STRNEQ (p, "esa"))
+	current_arch_mask = 1 << S390_OPCODE_ESA;
+      else if (CONST_STRNEQ (p, "zarch"))
+	current_arch_mask = 1 << S390_OPCODE_ZARCH;
+      else
+	fprintf (stderr, "Unknown S/390 disassembler option: %s\n", p);
+
+      p = strchr (p, ',');
+      if (p != NULL)
+	p++;
+    }
+
+  if (!current_arch_mask)
+    switch (info->mach)
+      {
+      case bfd_mach_s390_31:
+	current_arch_mask = 1 << S390_OPCODE_ESA;
+	break;
+      case bfd_mach_s390_64:
+	current_arch_mask = 1 << S390_OPCODE_ZARCH;
+	break;
+      default:
+	abort ();
+      }
+#endif /* QEMU_DISABLE */
+
   init_flag = 1;
 }
 
 /* Extracts an operand value from an instruction.  */
+/* We do not perform the shift operation for larl-type address
+   operands here since that would lead to an overflow of the 32 bit
+   integer value.  Instead the shift operation is done when printing
+   the operand in print_insn_s390.  */
 
 static inline unsigned int
 s390_extract_operand (unsigned char *insn, const struct s390_operand *operand)
@@ -233,10 +269,6 @@ s390_extract_operand (unsigned char *insn, const struct s390_operand *operand)
       && (val & (1U << (operand->bits - 1))))
     val |= (-1U << (operand->bits - 1)) << 1;
 
-  /* Double value if the operand is pc relative.  */
-  if (operand->flags & S390_OPERAND_PCREL)
-    val <<= 1;
-
   /* Length x in an instructions has real length x + 1.  */
   if (operand->flags & S390_OPERAND_LENGTH)
     val++;
@@ -318,8 +350,6 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
 	  separator = 0;
 	  for (opindex = opcode->operands; *opindex != 0; opindex++)
 	    {
-	      unsigned int value;
-
 	      operand = s390_operands + *opindex;
 	      value = s390_extract_operand (buffer, operand);
 
@@ -344,7 +374,8 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
 	      else if (operand->flags & S390_OPERAND_CR)
 		(*info->fprintf_func) (info->stream, "%%c%i", value);
 	      else if (operand->flags & S390_OPERAND_PCREL)
-		(*info->print_address_func) (memaddr + (int) value, info);
+		(*info->print_address_func) (memaddr + (int)value + (int)value,
+					     info);
 	      else if (operand->flags & S390_OPERAND_SIGNED)
 		(*info->fprintf_func) (info->stream, "%i", (int) value);
 	      else
@@ -392,26 +423,48 @@ print_insn_s390 (bfd_vma memaddr, struct disassemble_info *info)
       return 1;
     }
 }
+
+#ifdef QEMU_DISABLE
+void
+print_s390_disassembler_options (FILE *stream)
+{
+  fprintf (stream, _("\n\
+The following S/390 specific disassembler options are supported for use\n\
+with the -M switch (multiple options should be separated by commas):\n"));
+
+  fprintf (stream, _("  esa         Disassemble in ESA architecture mode\n"));
+  fprintf (stream, _("  zarch       Disassemble in z/Architecture mode\n"));
+}
+#endif
+
+/* include opcodes/s390-opc.c revision 1.24 */
 /* s390-opc.c -- S390 opcode list
-   Copyright 2000, 2001, 2003 Free Software Foundation, Inc.
+   Copyright 2000, 2001, 2003, 2005, 2007, 2008, 2009
+   Free Software Foundation, Inc.
    Contributed by Martin Schwidefsky (schwidefsky@de.ibm.com).
 
-   This file is part of GDB, GAS, and the GNU binutils.
+   This file is part of the GNU opcodes library.
 
-   This program is free software; you can redistribute it and/or modify
+   This library is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2 of the License, or
-   (at your option) any later version.
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
 
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
+   It is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
 
    You should have received a copy of the GNU General Public License
-   along with this program; if not, see <http://www.gnu.org/licenses/>.  */
+   along with this file; see the file COPYING.  If not, write to the
+   Free Software Foundation, 51 Franklin Street - Fifth Floor, Boston,
+   MA 02110-1301, USA.  */
 
+#ifdef QEMU_DISABLE
 #include <stdio.h>
+#include "ansidecl.h"
+#include "opcode/s390.h"
+#endif
 
 /* This file holds the S390 opcode table.  The opcode table
    includes almost all of the extended instruction mnemonics.  This
@@ -432,100 +485,143 @@ const struct s390_operand s390_operands[] =
 #define UNUSED 0
   { 0, 0, 0 },                    /* Indicates the end of the operand list */
 
+/* General purpose register operands.  */
+
 #define R_8    1                  /* GPR starting at position 8 */
   { 4, 8, S390_OPERAND_GPR },
 #define R_12   2                  /* GPR starting at position 12 */
   { 4, 12, S390_OPERAND_GPR },
-#define R_16   3                  /* GPR starting at position 16 */
+#define RO_12  3                 /* optional GPR starting at position 12 */
+  { 4, 12, S390_OPERAND_GPR|S390_OPERAND_OPTIONAL },
+#define R_16   4                  /* GPR starting at position 16 */
   { 4, 16, S390_OPERAND_GPR },
-#define R_20   4                  /* GPR starting at position 20 */
+#define R_20   5                  /* GPR starting at position 20 */
   { 4, 20, S390_OPERAND_GPR },
-#define R_24   5                  /* GPR starting at position 24 */
+#define R_24   6                  /* GPR starting at position 24 */
   { 4, 24, S390_OPERAND_GPR },
-#define R_28   6                  /* GPR starting at position 28 */
+#define R_28   7                  /* GPR starting at position 28 */
   { 4, 28, S390_OPERAND_GPR },
-#define R_32   7                  /* GPR starting at position 32 */
+#define RO_28  8                  /* optional GPR starting at position 28 */
+  { 4, 28, (S390_OPERAND_GPR | S390_OPERAND_OPTIONAL) },
+#define R_32   9                  /* GPR starting at position 32 */
   { 4, 32, S390_OPERAND_GPR },
 
-#define F_8    8                  /* FPR starting at position 8 */
+/* Floating point register operands.  */
+
+#define F_8    10                 /* FPR starting at position 8 */
   { 4, 8, S390_OPERAND_FPR },
-#define F_12   9                  /* FPR starting at position 12 */
+#define F_12   11                 /* FPR starting at position 12 */
   { 4, 12, S390_OPERAND_FPR },
-#define F_16   10                 /* FPR starting at position 16 */
+#define F_16   12                 /* FPR starting at position 16 */
   { 4, 16, S390_OPERAND_FPR },
-#define F_20   11                 /* FPR starting at position 16 */
+#define F_20   13                 /* FPR starting at position 16 */
   { 4, 16, S390_OPERAND_FPR },
-#define F_24   12                 /* FPR starting at position 24 */
+#define F_24   14                 /* FPR starting at position 24 */
   { 4, 24, S390_OPERAND_FPR },
-#define F_28   13                 /* FPR starting at position 28 */
+#define F_28   15                 /* FPR starting at position 28 */
   { 4, 28, S390_OPERAND_FPR },
-#define F_32   14                 /* FPR starting at position 32 */
+#define F_32   16                 /* FPR starting at position 32 */
   { 4, 32, S390_OPERAND_FPR },
 
-#define A_8    15                 /* Access reg. starting at position 8 */
+/* Access register operands.  */
+
+#define A_8    17                 /* Access reg. starting at position 8 */
   { 4, 8, S390_OPERAND_AR },
-#define A_12   16                 /* Access reg. starting at position 12 */
+#define A_12   18                 /* Access reg. starting at position 12 */
   { 4, 12, S390_OPERAND_AR },
-#define A_24   17                 /* Access reg. starting at position 24 */
+#define A_24   19                 /* Access reg. starting at position 24 */
   { 4, 24, S390_OPERAND_AR },
-#define A_28   18                 /* Access reg. starting at position 28 */
+#define A_28   20                 /* Access reg. starting at position 28 */
   { 4, 28, S390_OPERAND_AR },
 
-#define C_8    19                 /* Control reg. starting at position 8 */
+/* Control register operands.  */
+
+#define C_8    21                 /* Control reg. starting at position 8 */
   { 4, 8, S390_OPERAND_CR },
-#define C_12   20                 /* Control reg. starting at position 12 */
+#define C_12   22                 /* Control reg. starting at position 12 */
   { 4, 12, S390_OPERAND_CR },
 
-#define B_16   21                 /* Base register starting at position 16 */
+/* Base register operands.  */
+
+#define B_16   23                 /* Base register starting at position 16 */
   { 4, 16, S390_OPERAND_BASE|S390_OPERAND_GPR },
-#define B_32   22                 /* Base register starting at position 32 */
+#define B_32   24                 /* Base register starting at position 32 */
   { 4, 32, S390_OPERAND_BASE|S390_OPERAND_GPR },
 
-#define X_12   23                 /* Index register starting at position 12 */
+#define X_12   25                 /* Index register starting at position 12 */
   { 4, 12, S390_OPERAND_INDEX|S390_OPERAND_GPR },
 
-#define D_20   24                 /* Displacement starting at position 20 */
+/* Address displacement operands.  */
+
+#define D_20   26                 /* Displacement starting at position 20 */
   { 12, 20, S390_OPERAND_DISP },
-#define D_36   25                 /* Displacement starting at position 36 */
+#define DO_20  27                 /* optional Displ. starting at position 20 */
+  { 12, 20, S390_OPERAND_DISP|S390_OPERAND_OPTIONAL },
+#define D_36   28                 /* Displacement starting at position 36 */
   { 12, 36, S390_OPERAND_DISP },
-#define D20_20 26		  /* 20 bit displacement starting at 20 */
+#define D20_20 29		  /* 20 bit displacement starting at 20 */
   { 20, 20, S390_OPERAND_DISP|S390_OPERAND_SIGNED },
 
-#define L4_8   27                 /* 4 bit length starting at position 8 */
+/* Length operands.  */
+
+#define L4_8   30                 /* 4 bit length starting at position 8 */
   { 4, 8, S390_OPERAND_LENGTH },
-#define L4_12  28                 /* 4 bit length starting at position 12 */
+#define L4_12  31                 /* 4 bit length starting at position 12 */
   { 4, 12, S390_OPERAND_LENGTH },
-#define L8_8   29                 /* 8 bit length starting at position 8 */
+#define L8_8   32                 /* 8 bit length starting at position 8 */
   { 8, 8, S390_OPERAND_LENGTH },
 
-#define U4_8   30                 /* 4 bit unsigned value starting at 8 */
+/* Signed immediate operands.  */
+
+#define I8_8   33		  /* 8 bit signed value starting at 8 */
+  { 8, 8, S390_OPERAND_SIGNED },
+#define I8_32  34		  /* 8 bit signed value starting at 32 */
+  { 8, 32, S390_OPERAND_SIGNED },
+#define I16_16 35                 /* 16 bit signed value starting at 16 */
+  { 16, 16, S390_OPERAND_SIGNED },
+#define I16_32 36                 /* 16 bit signed value starting at 32 */
+  { 16, 32, S390_OPERAND_SIGNED },
+#define I32_16 37		  /* 32 bit signed value starting at 16 */
+  { 32, 16, S390_OPERAND_SIGNED },
+
+/* Unsigned immediate operands.  */
+
+#define U4_8   38                 /* 4 bit unsigned value starting at 8 */
   { 4, 8, 0 },
-#define U4_12  31                 /* 4 bit unsigned value starting at 12 */
+#define U4_12  39                 /* 4 bit unsigned value starting at 12 */
   { 4, 12, 0 },
-#define U4_16  32                 /* 4 bit unsigned value starting at 16 */
+#define U4_16  40                 /* 4 bit unsigned value starting at 16 */
   { 4, 16, 0 },
-#define U4_20  33                 /* 4 bit unsigned value starting at 20 */
+#define U4_20  41                 /* 4 bit unsigned value starting at 20 */
   { 4, 20, 0 },
-#define U8_8   34                 /* 8 bit unsigned value starting at 8 */
+#define U4_32  42                 /* 4 bit unsigned value starting at 32 */
+  { 4, 32, 0 },
+#define U8_8   43                 /* 8 bit unsigned value starting at 8 */
   { 8, 8, 0 },
-#define U8_16  35                 /* 8 bit unsigned value starting at 16 */
+#define U8_16  44                 /* 8 bit unsigned value starting at 16 */
   { 8, 16, 0 },
-#define I16_16 36                 /* 16 bit signed value starting at 16 */
-  { 16, 16, S390_OPERAND_SIGNED },
-#define U16_16 37                 /* 16 bit unsigned value starting at 16 */
+#define U8_24  45                 /* 8 bit unsigned value starting at 24 */
+  { 8, 24, 0 },
+#define U8_32  46                 /* 8 bit unsigned value starting at 32 */
+  { 8, 32, 0 },
+#define U16_16 47                 /* 16 bit unsigned value starting at 16 */
   { 16, 16, 0 },
-#define J16_16 38                 /* PC relative jump offset at 16 */
+#define U16_32 48		  /* 16 bit unsigned value starting at 32 */
+  { 16, 32, 0 },
+#define U32_16 49		  /* 32 bit unsigned value starting at 16 */
+  { 32, 16, 0 },
+
+/* PC-relative address operands.  */
+
+#define J16_16 50                 /* PC relative jump offset at 16 */
   { 16, 16, S390_OPERAND_PCREL },
-#define J32_16 39                 /* PC relative long offset at 16 */
+#define J32_16 51                 /* PC relative long offset at 16 */
   { 32, 16, S390_OPERAND_PCREL },
-#define I32_16 40		  /* 32 bit signed value starting at 16 */
-  { 32, 16, S390_OPERAND_SIGNED },
-#define U32_16 41		  /* 32 bit unsigned value starting at 16 */
-  { 32, 16, 0 },
-#define M_16   42                 /* 4 bit optional mask starting at 16 */
+
+/* Conditional mask operands.  */
+
+#define M_16   52                 /* 4 bit optional mask starting at 16 */
   { 4, 16, S390_OPERAND_OPTIONAL },
-#define RO_28  43                 /* optional GPR starting at position 28 */
-  { 4, 28, (S390_OPERAND_GPR | S390_OPERAND_OPTIONAL) }
 
 };
 
@@ -563,7 +659,7 @@ const struct s390_operand s390_operands[] =
       quite close.
 
       For example the instruction "mvo" is defined in the PoP as follows:
-
+      
       MVO  D1(L1,B1),D2(L2,B2)   [SS]
 
       --------------------------------------
@@ -575,6 +671,17 @@ const struct s390_operand s390_operands[] =
 
 #define INSTR_E          2, { 0,0,0,0,0,0 }                    /* e.g. pr    */
 #define INSTR_RIE_RRP    6, { R_8,R_12,J16_16,0,0,0 }          /* e.g. brxhg */
+#define INSTR_RIE_RRPU   6, { R_8,R_12,U4_32,J16_16,0,0 }      /* e.g. crj   */
+#define INSTR_RIE_RRP0   6, { R_8,R_12,J16_16,0,0,0 }          /* e.g. crjne */
+#define INSTR_RIE_RUPI   6, { R_8,I8_32,U4_12,J16_16,0,0 }     /* e.g. cij   */
+#define INSTR_RIE_R0PI   6, { R_8,I8_32,J16_16,0,0,0 }         /* e.g. cijne */
+#define INSTR_RIE_RUPU   6, { R_8,U8_32,U4_12,J16_16,0,0 }     /* e.g. clij  */
+#define INSTR_RIE_R0PU   6, { R_8,U8_32,J16_16,0,0,0 }         /* e.g. clijne */
+#define INSTR_RIE_R0IU   6, { R_8,I16_16,U4_32,0,0,0 }         /* e.g. cit   */
+#define INSTR_RIE_R0I0   6, { R_8,I16_16,0,0,0,0 }             /* e.g. citne */
+#define INSTR_RIE_R0UU   6, { R_8,U16_16,U4_32,0,0,0 }         /* e.g. clfit */
+#define INSTR_RIE_R0U0   6, { R_8,U16_16,0,0,0,0 }             /* e.g. clfitne */
+#define INSTR_RIE_RRUUU  6, { R_8,R_12,U8_16,U8_24,U8_32,0 }   /* e.g. rnsbg */
 #define INSTR_RIL_0P     6, { J32_16,0,0,0,0 }                 /* e.g. jg    */
 #define INSTR_RIL_RP     6, { R_8,J32_16,0,0,0,0 }             /* e.g. brasl */
 #define INSTR_RIL_UP     6, { U4_8,J32_16,0,0,0,0 }            /* e.g. brcl  */
@@ -585,6 +692,10 @@ const struct s390_operand s390_operands[] =
 #define INSTR_RI_RP      4, { R_8,J16_16,0,0,0,0 }             /* e.g. brct  */
 #define INSTR_RI_RU      4, { R_8,U16_16,0,0,0,0 }             /* e.g. tml   */
 #define INSTR_RI_UP      4, { U4_8,J16_16,0,0,0,0 }            /* e.g. brc   */
+#define INSTR_RIS_RURDI  6, { R_8,I8_32,U4_12,D_20,B_16,0 }    /* e.g. cib   */
+#define INSTR_RIS_R0RDI  6, { R_8,I8_32,D_20,B_16,0,0 }        /* e.g. cibne */
+#define INSTR_RIS_RURDU  6, { R_8,U8_32,U4_12,D_20,B_16,0 }    /* e.g. clib  */
+#define INSTR_RIS_R0RDU  6, { R_8,U8_32,D_20,B_16,0,0 }        /* e.g. clibne */
 #define INSTR_RRE_00     4, { 0,0,0,0,0,0 }                    /* e.g. palb  */
 #define INSTR_RRE_0R     4, { R_28,0,0,0,0,0 }                 /* e.g. tb    */
 #define INSTR_RRE_AA     4, { A_24,A_28,0,0,0,0 }              /* e.g. cpya  */
@@ -604,24 +715,29 @@ const struct s390_operand s390_operands[] =
 #define INSTR_RRF_F0FR   4, { F_24,F_16,R_28,0,0,0 }           /* e.g. iedtr */
 #define INSTR_RRF_FUFF   4, { F_24,F_16,F_28,U4_20,0,0 }       /* e.g. didbr */
 #define INSTR_RRF_RURR   4, { R_24,R_28,R_16,U4_20,0,0 }       /* e.g. .insn */
-#define INSTR_RRF_R0RR   4, { R_24,R_28,R_16,0,0,0 }           /* e.g. idte  */
+#define INSTR_RRF_R0RR   4, { R_24,R_16,R_28,0,0,0 }           /* e.g. idte  */
 #define INSTR_RRF_U0FF   4, { F_24,U4_16,F_28,0,0,0 }          /* e.g. fixr  */
 #define INSTR_RRF_U0RF   4, { R_24,U4_16,F_28,0,0,0 }          /* e.g. cfebr */
 #define INSTR_RRF_UUFF   4, { F_24,U4_16,F_28,U4_20,0,0 }      /* e.g. fidtr */
 #define INSTR_RRF_0UFF   4, { F_24,F_28,U4_20,0,0,0 }          /* e.g. ldetr */
-#define INSTR_RRF_FFFU   4, { F_24,F_16,F_28,U4_20,0,0 }       /* e.g. qadtr */
+#define INSTR_RRF_FFRU   4, { F_24,F_16,R_28,U4_20,0,0 }       /* e.g. rrdtr */
 #define INSTR_RRF_M0RR   4, { R_24,R_28,M_16,0,0,0 }           /* e.g. sske  */
+#define INSTR_RRF_U0RR   4, { R_24,R_28,U4_16,0,0,0 }          /* e.g. clrt  */
+#define INSTR_RRF_00RR   4, { R_24,R_28,0,0,0,0 }              /* e.g. clrtne */
 #define INSTR_RR_0R      2, { R_12, 0,0,0,0,0 }                /* e.g. br    */
+#define INSTR_RR_0R_OPT  2, { RO_12, 0,0,0,0,0 }               /* e.g. nopr  */
 #define INSTR_RR_FF      2, { F_8,F_12,0,0,0,0 }               /* e.g. adr   */
 #define INSTR_RR_R0      2, { R_8, 0,0,0,0,0 }                 /* e.g. spm   */
 #define INSTR_RR_RR      2, { R_8,R_12,0,0,0,0 }               /* e.g. lr    */
 #define INSTR_RR_U0      2, { U8_8, 0,0,0,0,0 }                /* e.g. svc   */
 #define INSTR_RR_UR      2, { U4_8,R_12,0,0,0,0 }              /* e.g. bcr   */
 #define INSTR_RRR_F0FF   4, { F_24,F_28,F_16,0,0,0 }           /* e.g. ddtr  */
+#define INSTR_RRS_RRRDU  6, { R_8,R_12,U4_32,D_20,B_16 }       /* e.g. crb   */
+#define INSTR_RRS_RRRD0  6, { R_8,R_12,D_20,B_16,0 }           /* e.g. crbne */
 #define INSTR_RSE_RRRD   6, { R_8,R_12,D_20,B_16,0,0 }         /* e.g. lmh   */
 #define INSTR_RSE_CCRD   6, { C_8,C_12,D_20,B_16,0,0 }         /* e.g. lmh   */
 #define INSTR_RSE_RURD   6, { R_8,U4_12,D_20,B_16,0,0 }        /* e.g. icmh  */
-#define INSTR_RSL_R0RD   6, { R_8,D_20,B_16,0,0,0 }            /* e.g. tp    */
+#define INSTR_RSL_R0RD   6, { D_20,L4_8,B_16,0,0,0 }           /* e.g. tp    */
 #define INSTR_RSI_RRP    4, { R_8,R_12,J16_16,0,0,0 }          /* e.g. brxh  */
 #define INSTR_RSY_RRRD   6, { R_8,R_12,D20_20,B_16,0,0 }       /* e.g. stmy  */
 #define INSTR_RSY_RURD   6, { R_8,U4_12,D20_20,B_16,0,0 }      /* e.g. icmh  */
@@ -638,12 +754,17 @@ const struct s390_operand s390_operands[] =
 #define INSTR_RXF_RRRDR  6, { R_32,R_8,D_20,X_12,B_16,0 }      /* e.g. .insn */
 #define INSTR_RXY_RRRD   6, { R_8,D20_20,X_12,B_16,0,0 }       /* e.g. ly    */
 #define INSTR_RXY_FRRD   6, { F_8,D20_20,X_12,B_16,0,0 }       /* e.g. ley   */
+#define INSTR_RXY_URRD   6, { U4_8,D20_20,X_12,B_16,0,0 }      /* e.g. pfd   */
 #define INSTR_RX_0RRD    4, { D_20,X_12,B_16,0,0,0 }           /* e.g. be    */
+#define INSTR_RX_0RRD_OPT 4, { DO_20,X_12,B_16,0,0,0 }         /* e.g. nop   */
 #define INSTR_RX_FRRD    4, { F_8,D_20,X_12,B_16,0,0 }         /* e.g. ae    */
 #define INSTR_RX_RRRD    4, { R_8,D_20,X_12,B_16,0,0 }         /* e.g. l     */
 #define INSTR_RX_URRD    4, { U4_8,D_20,X_12,B_16,0,0 }        /* e.g. bc    */
 #define INSTR_SI_URD     4, { D_20,B_16,U8_8,0,0,0 }           /* e.g. cli   */
 #define INSTR_SIY_URD    6, { D20_20,B_16,U8_8,0,0,0 }         /* e.g. tmy   */
+#define INSTR_SIY_IRD    6, { D20_20,B_16,I8_8,0,0,0 }         /* e.g. asi   */
+#define INSTR_SIL_RDI    6, { D_20,B_16,I16_32,0,0,0 }         /* e.g. chhsi */
+#define INSTR_SIL_RDU    6, { D_20,B_16,U16_32,0,0,0 }         /* e.g. clfhsi */
 #define INSTR_SSE_RDRD   6, { D_20,B_16,D_36,B_32,0,0 }        /* e.g. mvsdk */
 #define INSTR_SS_L0RDRD  6, { D_20,L8_8,B_16,D_36,B_32,0     } /* e.g. mvc   */
 #define INSTR_SS_L2RDRD  6, { D_20,B_16,D_36,L8_8,B_32,0     } /* e.g. pka   */
@@ -652,12 +773,23 @@ const struct s390_operand s390_operands[] =
 #define INSTR_SS_RRRDRD  6, { D_20,R_8,B_16,D_36,B_32,R_12 }   /* e.g. mvck  */
 #define INSTR_SS_RRRDRD2 6, { R_8,D_20,B_16,R_12,D_36,B_32 }   /* e.g. plo   */
 #define INSTR_SS_RRRDRD3 6, { R_8,R_12,D_20,B_16,D_36,B_32 }   /* e.g. lmd   */
+#define INSTR_SSF_RRDRD  6, { D_20,B_16,D_36,B_32,R_8,0 }      /* e.g. mvcos */
 #define INSTR_S_00       4, { 0,0,0,0,0,0 }                    /* e.g. hsch  */
 #define INSTR_S_RD       4, { D_20,B_16,0,0,0,0 }              /* e.g. lpsw  */
-#define INSTR_SSF_RRDRD  6, { D_20,B_16,D_36,B_32,R_8,0 }      /* e.g. mvcos */
 
 #define MASK_E           { 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RIE_RRP     { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
+#define MASK_RIE_RRPU    { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
+#define MASK_RIE_RRP0    { 0xff, 0x00, 0x00, 0x00, 0xf0, 0xff }
+#define MASK_RIE_RUPI    { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
+#define MASK_RIE_R0PI    { 0xff, 0x00, 0x00, 0x00, 0xf0, 0xff }
+#define MASK_RIE_RUPU    { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
+#define MASK_RIE_R0PU    { 0xff, 0x00, 0x00, 0x00, 0xf0, 0xff }
+#define MASK_RIE_R0IU    { 0xff, 0x0f, 0x00, 0x00, 0x0f, 0xff }
+#define MASK_RIE_R0I0    { 0xff, 0x0f, 0x00, 0x00, 0xff, 0xff }
+#define MASK_RIE_R0UU    { 0xff, 0x0f, 0x00, 0x00, 0x0f, 0xff }
+#define MASK_RIE_R0U0    { 0xff, 0x0f, 0x00, 0x00, 0xff, 0xff }
+#define MASK_RIE_RRUUU   { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
 #define MASK_RIL_0P      { 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RIL_RP      { 0xff, 0x0f, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RIL_UP      { 0xff, 0x0f, 0x00, 0x00, 0x00, 0x00 }
@@ -668,6 +800,10 @@ const struct s390_operand s390_operands[] =
 #define MASK_RI_RP       { 0xff, 0x0f, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RI_RU       { 0xff, 0x0f, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RI_UP       { 0xff, 0x0f, 0x00, 0x00, 0x00, 0x00 }
+#define MASK_RIS_RURDI   { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
+#define MASK_RIS_R0RDI   { 0xff, 0x0f, 0x00, 0x00, 0x00, 0xff }
+#define MASK_RIS_RURDU   { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
+#define MASK_RIS_R0RDU   { 0xff, 0x0f, 0x00, 0x00, 0x00, 0xff }
 #define MASK_RRE_00      { 0xff, 0xff, 0xff, 0xff, 0x00, 0x00 }
 #define MASK_RRE_0R      { 0xff, 0xff, 0xff, 0xf0, 0x00, 0x00 }
 #define MASK_RRE_AA      { 0xff, 0xff, 0xff, 0x00, 0x00, 0x00 }
@@ -690,15 +826,20 @@ const struct s390_operand s390_operands[] =
 #define MASK_RRF_U0RF    { 0xff, 0xff, 0x0f, 0x00, 0x00, 0x00 }
 #define MASK_RRF_UUFF    { 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RRF_0UFF    { 0xff, 0xff, 0xf0, 0x00, 0x00, 0x00 }
-#define MASK_RRF_FFFU    { 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 }
+#define MASK_RRF_FFRU    { 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RRF_M0RR    { 0xff, 0xff, 0x0f, 0x00, 0x00, 0x00 }
+#define MASK_RRF_U0RR    { 0xff, 0xff, 0x0f, 0x00, 0x00, 0x00 }
+#define MASK_RRF_00RR    { 0xff, 0xff, 0xff, 0x00, 0x00, 0x00 }
 #define MASK_RR_0R       { 0xff, 0xf0, 0x00, 0x00, 0x00, 0x00 }
+#define MASK_RR_0R_OPT   { 0xff, 0xf0, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RR_FF       { 0xff, 0x00, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RR_R0       { 0xff, 0x0f, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RR_RR       { 0xff, 0x00, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RR_U0       { 0xff, 0x00, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RR_UR       { 0xff, 0x00, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RRR_F0FF    { 0xff, 0xff, 0x0f, 0x00, 0x00, 0x00 }
+#define MASK_RRS_RRRDU   { 0xff, 0x00, 0x00, 0x00, 0x0f, 0xff }
+#define MASK_RRS_RRRD0   { 0xff, 0x00, 0x00, 0x00, 0xff, 0xff }
 #define MASK_RSE_RRRD    { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
 #define MASK_RSE_CCRD    { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
 #define MASK_RSE_RURD    { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
@@ -719,12 +860,17 @@ const struct s390_operand s390_operands[] =
 #define MASK_RXF_RRRDR   { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
 #define MASK_RXY_RRRD    { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
 #define MASK_RXY_FRRD    { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
+#define MASK_RXY_URRD    { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
 #define MASK_RX_0RRD     { 0xff, 0xf0, 0x00, 0x00, 0x00, 0x00 }
+#define MASK_RX_0RRD_OPT { 0xff, 0xf0, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RX_FRRD     { 0xff, 0x00, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RX_RRRD     { 0xff, 0x00, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_RX_URRD     { 0xff, 0x00, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_SI_URD      { 0xff, 0x00, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_SIY_URD     { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
+#define MASK_SIY_IRD     { 0xff, 0x00, 0x00, 0x00, 0x00, 0xff }
+#define MASK_SIL_RDI     { 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 }
+#define MASK_SIL_RDU     { 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_SSE_RDRD    { 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_SS_L0RDRD   { 0xff, 0x00, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_SS_L2RDRD   { 0xff, 0x00, 0x00, 0x00, 0x00, 0x00 }
@@ -733,22 +879,26 @@ const struct s390_operand s390_operands[] =
 #define MASK_SS_RRRDRD   { 0xff, 0x00, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_SS_RRRDRD2  { 0xff, 0x00, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_SS_RRRDRD3  { 0xff, 0x00, 0x00, 0x00, 0x00, 0x00 }
+#define MASK_SSF_RRDRD   { 0xff, 0x0f, 0x00, 0x00, 0x00, 0x00 }
 #define MASK_S_00        { 0xff, 0xff, 0xff, 0xff, 0x00, 0x00 }
 #define MASK_S_RD        { 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 }
-#define MASK_SSF_RRDRD   { 0xff, 0x0f, 0x00, 0x00, 0x00, 0x00 }
+
 
 /* The opcode formats table (blueprints for .insn pseudo mnemonic).  */
 
-const struct s390_opcode s390_opformats[] =
+/* QEMU: Mark these static.  */
+static const struct s390_opcode s390_opformats[] =
   {
   { "e",	OP8(0x00LL),	MASK_E,		INSTR_E,	3, 0 },
   { "ri",	OP8(0x00LL),	MASK_RI_RI,	INSTR_RI_RI,	3, 0 },
   { "rie",	OP8(0x00LL),	MASK_RIE_RRP,	INSTR_RIE_RRP,	3, 0 },
   { "ril",	OP8(0x00LL),	MASK_RIL_RP,	INSTR_RIL_RP,	3, 0 },
   { "rilu",	OP8(0x00LL),	MASK_RIL_RU,	INSTR_RIL_RU,	3, 0 },
+  { "ris",	OP8(0x00LL),	MASK_RIS_RURDI,	INSTR_RIS_RURDI,3, 6 },
   { "rr",	OP8(0x00LL),	MASK_RR_RR,	INSTR_RR_RR,	3, 0 },
   { "rre",	OP8(0x00LL),	MASK_RRE_RR,	INSTR_RRE_RR,	3, 0 },
   { "rrf",	OP8(0x00LL),	MASK_RRF_RURR,	INSTR_RRF_RURR,	3, 0 },
+  { "rrs",	OP8(0x00LL),	MASK_RRS_RRRDU,	INSTR_RRS_RRRDU,3, 6 },
   { "rs",	OP8(0x00LL),	MASK_RS_RRRD,	INSTR_RS_RRRD,	3, 0 },
   { "rse",	OP8(0x00LL),	MASK_RSE_RRRD,	INSTR_RSE_RRRD,	3, 0 },
   { "rsi",	OP8(0x00LL),	MASK_RSI_RRP,	INSTR_RSI_RRP,	3, 0 },
@@ -760,14 +910,16 @@ const struct s390_opcode s390_opformats[] =
   { "s",	OP8(0x00LL),	MASK_S_RD,	INSTR_S_RD,	3, 0 },
   { "si",	OP8(0x00LL),	MASK_SI_URD,	INSTR_SI_URD,	3, 0 },
   { "siy",	OP8(0x00LL),	MASK_SIY_URD,	INSTR_SIY_URD,	3, 3 },
+  { "sil",	OP8(0x00LL),	MASK_SIL_RDI,	INSTR_SIL_RDI,	3, 6 },
   { "ss",	OP8(0x00LL),	MASK_SS_RRRDRD,	INSTR_SS_RRRDRD,3, 0 },
   { "sse",	OP8(0x00LL),	MASK_SSE_RDRD,	INSTR_SSE_RDRD,	3, 0 },
   { "ssf",	OP8(0x00LL),	MASK_SSF_RRDRD,	INSTR_SSF_RRDRD,3, 0 },
 };
 
-const int s390_num_opformats =
+static const int s390_num_opformats =
   sizeof (s390_opformats) / sizeof (s390_opformats[0]);
 
+/* include "s390-opc.tab", generated from opcodes/s390-opc.txt revision 1.28 */
 /* The opcode table. This file was generated by s390-mkopc.
 
    The format of the opcode table is:
@@ -783,7 +935,8 @@ const int s390_num_opformats =
    The disassembler reads the table in order and prints the first
    instruction which matches.  */
 
-const struct s390_opcode s390_opcodes[] =
+/* QEMU: Mark these static.  */
+static const struct s390_opcode s390_opcodes[] =
   {
   { "dp", OP8(0xfdLL), MASK_SS_LLRDRD, INSTR_SS_LLRDRD, 3, 0},
   { "mp", OP8(0xfcLL), MASK_SS_LLRDRD, INSTR_SS_LLRDRD, 3, 0},
@@ -801,12 +954,12 @@ const struct s390_opcode s390_opcodes[] =
   { "stey", OP48(0xed0000000066LL), MASK_RXY_FRRD, INSTR_RXY_FRRD, 2, 3},
   { "ldy", OP48(0xed0000000065LL), MASK_RXY_FRRD, INSTR_RXY_FRRD, 2, 3},
   { "ley", OP48(0xed0000000064LL), MASK_RXY_FRRD, INSTR_RXY_FRRD, 2, 3},
-  { "tgxt", OP48(0xed0000000059LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 2, 5},
-  { "tcxt", OP48(0xed0000000058LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 2, 5},
-  { "tgdt", OP48(0xed0000000055LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 2, 5},
-  { "tcdt", OP48(0xed0000000054LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 2, 5},
-  { "tget", OP48(0xed0000000051LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 2, 5},
-  { "tcet", OP48(0xed0000000050LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 2, 5},
+  { "tdgxt", OP48(0xed0000000059LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 2, 5},
+  { "tdcxt", OP48(0xed0000000058LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 2, 5},
+  { "tdgdt", OP48(0xed0000000055LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 2, 5},
+  { "tdcdt", OP48(0xed0000000054LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 2, 5},
+  { "tdget", OP48(0xed0000000051LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 2, 5},
+  { "tdcet", OP48(0xed0000000050LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 2, 5},
   { "srxt", OP48(0xed0000000049LL), MASK_RXF_FRRDF, INSTR_RXF_FRRDF, 2, 5},
   { "slxt", OP48(0xed0000000048LL), MASK_RXF_FRRDF, INSTR_RXF_FRRDF, 2, 5},
   { "srdt", OP48(0xed0000000041LL), MASK_RXF_FRRDF, INSTR_RXF_FRRDF, 2, 5},
@@ -820,6 +973,7 @@ const struct s390_opcode s390_opcodes[] =
   { "myl", OP48(0xed0000000039LL), MASK_RXF_FRRDF, INSTR_RXF_FRRDF, 2, 4},
   { "mayl", OP48(0xed0000000038LL), MASK_RXF_FRRDF, INSTR_RXF_FRRDF, 2, 4},
   { "mee", OP48(0xed0000000037LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 3, 0},
+  { "sqd", OP48(0xed0000000035LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 3, 0},
   { "sqe", OP48(0xed0000000034LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 3, 0},
   { "mse", OP48(0xed000000002fLL), MASK_RXF_FRRDF, INSTR_RXF_FRRDF, 3, 3},
   { "mae", OP48(0xed000000002eLL), MASK_RXF_FRRDF, INSTR_RXF_FRRDF, 3, 3},
@@ -852,6 +1006,270 @@ const struct s390_opcode s390_opcodes[] =
   { "lxeb", OP48(0xed0000000006LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 3, 0},
   { "lxdb", OP48(0xed0000000005LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 3, 0},
   { "ldeb", OP48(0xed0000000004LL), MASK_RXE_FRRD, INSTR_RXE_FRRD, 3, 0},
+  { "clibnh", OP48(0xec0c000000ffLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "clible", OP48(0xec0c000000ffLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "cibnh", OP48(0xec0c000000feLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "cible", OP48(0xec0c000000feLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "clgibnh", OP48(0xec0c000000fdLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "clgible", OP48(0xec0c000000fdLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "cgibnh", OP48(0xec0c000000fcLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "cgible", OP48(0xec0c000000fcLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "clijnh", OP48(0xec0c0000007fLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "clijle", OP48(0xec0c0000007fLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "cijnh", OP48(0xec0c0000007eLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "cijle", OP48(0xec0c0000007eLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "clgijnh", OP48(0xec0c0000007dLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "clgijle", OP48(0xec0c0000007dLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "cgijnh", OP48(0xec0c0000007cLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "cgijle", OP48(0xec0c0000007cLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "clibnl", OP48(0xec0a000000ffLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "clibhe", OP48(0xec0a000000ffLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "cibnl", OP48(0xec0a000000feLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "cibhe", OP48(0xec0a000000feLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "clgibnl", OP48(0xec0a000000fdLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "clgibhe", OP48(0xec0a000000fdLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "cgibnl", OP48(0xec0a000000fcLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "cgibhe", OP48(0xec0a000000fcLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "clijnl", OP48(0xec0a0000007fLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "clijhe", OP48(0xec0a0000007fLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "cijnl", OP48(0xec0a0000007eLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "cijhe", OP48(0xec0a0000007eLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "clgijnl", OP48(0xec0a0000007dLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "clgijhe", OP48(0xec0a0000007dLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "cgijnl", OP48(0xec0a0000007cLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "cgijhe", OP48(0xec0a0000007cLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "clibe", OP48(0xec08000000ffLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "clibnlh", OP48(0xec08000000ffLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "cibe", OP48(0xec08000000feLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "cibnlh", OP48(0xec08000000feLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "clgibe", OP48(0xec08000000fdLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "clgibnlh", OP48(0xec08000000fdLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "cgibe", OP48(0xec08000000fcLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "cgibnlh", OP48(0xec08000000fcLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "clije", OP48(0xec080000007fLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "clijnlh", OP48(0xec080000007fLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "cije", OP48(0xec080000007eLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "cijnlh", OP48(0xec080000007eLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "clgije", OP48(0xec080000007dLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "clgijnlh", OP48(0xec080000007dLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "cgije", OP48(0xec080000007cLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "cgijnlh", OP48(0xec080000007cLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "clibne", OP48(0xec06000000ffLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "cliblh", OP48(0xec06000000ffLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "cibne", OP48(0xec06000000feLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "ciblh", OP48(0xec06000000feLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "clgibne", OP48(0xec06000000fdLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "clgiblh", OP48(0xec06000000fdLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "cgibne", OP48(0xec06000000fcLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "cgiblh", OP48(0xec06000000fcLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "clijne", OP48(0xec060000007fLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "clijlh", OP48(0xec060000007fLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "cijne", OP48(0xec060000007eLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "cijlh", OP48(0xec060000007eLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "clgijne", OP48(0xec060000007dLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "clgijlh", OP48(0xec060000007dLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "cgijne", OP48(0xec060000007cLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "cgijlh", OP48(0xec060000007cLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "clibl", OP48(0xec04000000ffLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "clibnhe", OP48(0xec04000000ffLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "cibl", OP48(0xec04000000feLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "cibnhe", OP48(0xec04000000feLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "clgibl", OP48(0xec04000000fdLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "clgibnhe", OP48(0xec04000000fdLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "cgibl", OP48(0xec04000000fcLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "cgibnhe", OP48(0xec04000000fcLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "clijl", OP48(0xec040000007fLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "clijnhe", OP48(0xec040000007fLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "cijl", OP48(0xec040000007eLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "cijnhe", OP48(0xec040000007eLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "clgijl", OP48(0xec040000007dLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "clgijnhe", OP48(0xec040000007dLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "cgijl", OP48(0xec040000007cLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "cgijnhe", OP48(0xec040000007cLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "clibh", OP48(0xec02000000ffLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "clibnle", OP48(0xec02000000ffLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "cibh", OP48(0xec02000000feLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "cibnle", OP48(0xec02000000feLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "clgibh", OP48(0xec02000000fdLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "clgibnle", OP48(0xec02000000fdLL), MASK_RIS_R0RDU, INSTR_RIS_R0RDU, 2, 6},
+  { "cgibh", OP48(0xec02000000fcLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "cgibnle", OP48(0xec02000000fcLL), MASK_RIS_R0RDI, INSTR_RIS_R0RDI, 2, 6},
+  { "clijh", OP48(0xec020000007fLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "clijnle", OP48(0xec020000007fLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "cijh", OP48(0xec020000007eLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "cijnle", OP48(0xec020000007eLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "clgijh", OP48(0xec020000007dLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "clgijnle", OP48(0xec020000007dLL), MASK_RIE_R0PU, INSTR_RIE_R0PU, 2, 6},
+  { "cgijh", OP48(0xec020000007cLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "cgijnle", OP48(0xec020000007cLL), MASK_RIE_R0PI, INSTR_RIE_R0PI, 2, 6},
+  { "clrbnh", OP48(0xec000000c0f7LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clrble", OP48(0xec000000c0f7LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "crbnh", OP48(0xec000000c0f6LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "crble", OP48(0xec000000c0f6LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clgrbnh", OP48(0xec000000c0e5LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clgrble", OP48(0xec000000c0e5LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "cgrbnh", OP48(0xec000000c0e4LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "cgrble", OP48(0xec000000c0e4LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clrjnh", OP48(0xec000000c077LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clrjle", OP48(0xec000000c077LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "crjnh", OP48(0xec000000c076LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "crjle", OP48(0xec000000c076LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clfitnh", OP48(0xec000000c073LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "clfitle", OP48(0xec000000c073LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "citnh", OP48(0xec000000c072LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "citle", OP48(0xec000000c072LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "clgitnh", OP48(0xec000000c071LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "clgitle", OP48(0xec000000c071LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "cgitnh", OP48(0xec000000c070LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "cgitle", OP48(0xec000000c070LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "clgrjnh", OP48(0xec000000c065LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clgrjle", OP48(0xec000000c065LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "cgrjnh", OP48(0xec000000c064LL), MASK_RIE_RRP0, INSTR_RIE_RRP0, 2, 6},
+  { "cgrjle", OP48(0xec000000c064LL), MASK_RIE_RRP0, INSTR_RIE_RRP0, 2, 6},
+  { "clrbnl", OP48(0xec000000a0f7LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clrbhe", OP48(0xec000000a0f7LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "crbnl", OP48(0xec000000a0f6LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "crbhe", OP48(0xec000000a0f6LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clgrbnl", OP48(0xec000000a0e5LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clgrbhe", OP48(0xec000000a0e5LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "cgrbnl", OP48(0xec000000a0e4LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "cgrbhe", OP48(0xec000000a0e4LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clrjnl", OP48(0xec000000a077LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clrjhe", OP48(0xec000000a077LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "crjnl", OP48(0xec000000a076LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "crjhe", OP48(0xec000000a076LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clfitnl", OP48(0xec000000a073LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "clfithe", OP48(0xec000000a073LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "citnl", OP48(0xec000000a072LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "cithe", OP48(0xec000000a072LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "clgitnl", OP48(0xec000000a071LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "clgithe", OP48(0xec000000a071LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "cgitnl", OP48(0xec000000a070LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "cgithe", OP48(0xec000000a070LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "clgrjnl", OP48(0xec000000a065LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clgrjhe", OP48(0xec000000a065LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "cgrjnl", OP48(0xec000000a064LL), MASK_RIE_RRP0, INSTR_RIE_RRP0, 2, 6},
+  { "cgrjhe", OP48(0xec000000a064LL), MASK_RIE_RRP0, INSTR_RIE_RRP0, 2, 6},
+  { "clrbe", OP48(0xec00000080f7LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clrbnlh", OP48(0xec00000080f7LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "crbe", OP48(0xec00000080f6LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "crbnlh", OP48(0xec00000080f6LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clgrbe", OP48(0xec00000080e5LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clgrbnlh", OP48(0xec00000080e5LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "cgrbe", OP48(0xec00000080e4LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "cgrbnlh", OP48(0xec00000080e4LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clrje", OP48(0xec0000008077LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clrjnlh", OP48(0xec0000008077LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "crje", OP48(0xec0000008076LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "crjnlh", OP48(0xec0000008076LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clfite", OP48(0xec0000008073LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "clfitnlh", OP48(0xec0000008073LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "cite", OP48(0xec0000008072LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "citnlh", OP48(0xec0000008072LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "clgite", OP48(0xec0000008071LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "clgitnlh", OP48(0xec0000008071LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "cgite", OP48(0xec0000008070LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "cgitnlh", OP48(0xec0000008070LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "clgrje", OP48(0xec0000008065LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clgrjnlh", OP48(0xec0000008065LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "cgrje", OP48(0xec0000008064LL), MASK_RIE_RRP0, INSTR_RIE_RRP0, 2, 6},
+  { "cgrjnlh", OP48(0xec0000008064LL), MASK_RIE_RRP0, INSTR_RIE_RRP0, 2, 6},
+  { "clrbne", OP48(0xec00000060f7LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clrblh", OP48(0xec00000060f7LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "crbne", OP48(0xec00000060f6LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "crblh", OP48(0xec00000060f6LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clgrbne", OP48(0xec00000060e5LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clgrblh", OP48(0xec00000060e5LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "cgrbne", OP48(0xec00000060e4LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "cgrblh", OP48(0xec00000060e4LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clrjne", OP48(0xec0000006077LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clrjlh", OP48(0xec0000006077LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "crjne", OP48(0xec0000006076LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "crjlh", OP48(0xec0000006076LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clfitne", OP48(0xec0000006073LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "clfitlh", OP48(0xec0000006073LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "citne", OP48(0xec0000006072LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "citlh", OP48(0xec0000006072LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "clgitne", OP48(0xec0000006071LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "clgitlh", OP48(0xec0000006071LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "cgitne", OP48(0xec0000006070LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "cgitlh", OP48(0xec0000006070LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "clgrjne", OP48(0xec0000006065LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clgrjlh", OP48(0xec0000006065LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "cgrjne", OP48(0xec0000006064LL), MASK_RIE_RRP0, INSTR_RIE_RRP0, 2, 6},
+  { "cgrjlh", OP48(0xec0000006064LL), MASK_RIE_RRP0, INSTR_RIE_RRP0, 2, 6},
+  { "clrbl", OP48(0xec00000040f7LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clrbnhe", OP48(0xec00000040f7LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "crbl", OP48(0xec00000040f6LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "crbnhe", OP48(0xec00000040f6LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clgrbl", OP48(0xec00000040e5LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clgrbnhe", OP48(0xec00000040e5LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "cgrbl", OP48(0xec00000040e4LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "cgrbnhe", OP48(0xec00000040e4LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clrjl", OP48(0xec0000004077LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clrjnhe", OP48(0xec0000004077LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "crjl", OP48(0xec0000004076LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "crjnhe", OP48(0xec0000004076LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clfitl", OP48(0xec0000004073LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "clfitnhe", OP48(0xec0000004073LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "citl", OP48(0xec0000004072LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "citnhe", OP48(0xec0000004072LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "clgitl", OP48(0xec0000004071LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "clgitnhe", OP48(0xec0000004071LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "cgitl", OP48(0xec0000004070LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "cgitnhe", OP48(0xec0000004070LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "clgrjl", OP48(0xec0000004065LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clgrjnhe", OP48(0xec0000004065LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "cgrjl", OP48(0xec0000004064LL), MASK_RIE_RRP0, INSTR_RIE_RRP0, 2, 6},
+  { "cgrjnhe", OP48(0xec0000004064LL), MASK_RIE_RRP0, INSTR_RIE_RRP0, 2, 6},
+  { "clrbh", OP48(0xec00000020f7LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clrbnle", OP48(0xec00000020f7LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "crbh", OP48(0xec00000020f6LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "crbnle", OP48(0xec00000020f6LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clgrbh", OP48(0xec00000020e5LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clgrbnle", OP48(0xec00000020e5LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "cgrbh", OP48(0xec00000020e4LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "cgrbnle", OP48(0xec00000020e4LL), MASK_RRS_RRRD0, INSTR_RRS_RRRD0, 2, 6},
+  { "clrjh", OP48(0xec0000002077LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clrjnle", OP48(0xec0000002077LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "crjh", OP48(0xec0000002076LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "crjnle", OP48(0xec0000002076LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clfith", OP48(0xec0000002073LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "clfitnle", OP48(0xec0000002073LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "cith", OP48(0xec0000002072LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "citnle", OP48(0xec0000002072LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "clgith", OP48(0xec0000002071LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "clgitnle", OP48(0xec0000002071LL), MASK_RIE_R0U0, INSTR_RIE_R0U0, 2, 6},
+  { "cgith", OP48(0xec0000002070LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "cgitnle", OP48(0xec0000002070LL), MASK_RIE_R0I0, INSTR_RIE_R0I0, 2, 6},
+  { "clgrjh", OP48(0xec0000002065LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "clgrjnle", OP48(0xec0000002065LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 6},
+  { "cgrjh", OP48(0xec0000002064LL), MASK_RIE_RRP0, INSTR_RIE_RRP0, 2, 6},
+  { "cgrjnle", OP48(0xec0000002064LL), MASK_RIE_RRP0, INSTR_RIE_RRP0, 2, 6},
+  { "clib", OP48(0xec00000000ffLL), MASK_RIS_RURDU, INSTR_RIS_RURDU, 2, 6},
+  { "cib", OP48(0xec00000000feLL), MASK_RIS_RURDI, INSTR_RIS_RURDI, 2, 6},
+  { "clgib", OP48(0xec00000000fdLL), MASK_RIS_RURDU, INSTR_RIS_RURDU, 2, 6},
+  { "cgib", OP48(0xec00000000fcLL), MASK_RIS_RURDI, INSTR_RIS_RURDI, 2, 6},
+  { "clrb", OP48(0xec00000000f7LL), MASK_RRS_RRRDU, INSTR_RRS_RRRDU, 2, 6},
+  { "crb", OP48(0xec00000000f6LL), MASK_RRS_RRRDU, INSTR_RRS_RRRDU, 2, 6},
+  { "clgrb", OP48(0xec00000000e5LL), MASK_RRS_RRRDU, INSTR_RRS_RRRDU, 2, 6},
+  { "cgrb", OP48(0xec00000000e4LL), MASK_RRS_RRRDU, INSTR_RRS_RRRDU, 2, 6},
+  { "clij", OP48(0xec000000007fLL), MASK_RIE_RUPU, INSTR_RIE_RUPU, 2, 6},
+  { "cij", OP48(0xec000000007eLL), MASK_RIE_RUPI, INSTR_RIE_RUPI, 2, 6},
+  { "clgij", OP48(0xec000000007dLL), MASK_RIE_RUPU, INSTR_RIE_RUPU, 2, 6},
+  { "cgij", OP48(0xec000000007cLL), MASK_RIE_RUPI, INSTR_RIE_RUPI, 2, 6},
+  { "clrj", OP48(0xec0000000077LL), MASK_RIE_RRPU, INSTR_RIE_RRPU, 2, 6},
+  { "crj", OP48(0xec0000000076LL), MASK_RIE_RRPU, INSTR_RIE_RRPU, 2, 6},
+  { "clfit", OP48(0xec0000000073LL), MASK_RIE_R0UU, INSTR_RIE_R0UU, 2, 6},
+  { "cit", OP48(0xec0000000072LL), MASK_RIE_R0IU, INSTR_RIE_R0IU, 2, 6},
+  { "clgit", OP48(0xec0000000071LL), MASK_RIE_R0UU, INSTR_RIE_R0UU, 2, 6},
+  { "cgit", OP48(0xec0000000070LL), MASK_RIE_R0IU, INSTR_RIE_R0IU, 2, 6},
+  { "clgrj", OP48(0xec0000000065LL), MASK_RIE_RRPU, INSTR_RIE_RRPU, 2, 6},
+  { "cgrj", OP48(0xec0000000064LL), MASK_RIE_RRPU, INSTR_RIE_RRPU, 2, 6},
+  { "rxsbg", OP48(0xec0000000057LL), MASK_RIE_RRUUU, INSTR_RIE_RRUUU, 2, 6},
+  { "rosbg", OP48(0xec0000000056LL), MASK_RIE_RRUUU, INSTR_RIE_RRUUU, 2, 6},
+  { "risbg", OP48(0xec0000000055LL), MASK_RIE_RRUUU, INSTR_RIE_RRUUU, 2, 6},
+  { "rnsbg", OP48(0xec0000000054LL), MASK_RIE_RRUUU, INSTR_RIE_RRUUU, 2, 6},
   { "brxlg", OP48(0xec0000000045LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 2},
   { "brxhg", OP48(0xec0000000044LL), MASK_RIE_RRP, INSTR_RIE_RRP, 2, 2},
   { "tp", OP48(0xeb00000000c0LL), MASK_RSL_R0RD, INSTR_RSL_R0RD, 3, 0},
@@ -861,18 +1279,23 @@ const struct s390_opcode s390_opcodes[] =
   { "lmh", OP48(0xeb0000000096LL), MASK_RSY_RRRD, INSTR_RSY_RRRD, 2, 3},
   { "lmh", OP48(0xeb0000000096LL), MASK_RSE_RRRD, INSTR_RSE_RRRD, 2, 2},
   { "stmy", OP48(0xeb0000000090LL), MASK_RSY_RRRD, INSTR_RSY_RRRD, 2, 3},
-  { "clclu", OP48(0xeb000000008fLL), MASK_RSY_RRRD, INSTR_RSY_RRRD, 2, 3},
+  { "clclu", OP48(0xeb000000008fLL), MASK_RSY_RRRD, INSTR_RSY_RRRD, 3, 3},
   { "mvclu", OP48(0xeb000000008eLL), MASK_RSY_RRRD, INSTR_RSY_RRRD, 3, 3},
   { "mvclu", OP48(0xeb000000008eLL), MASK_RSE_RRRD, INSTR_RSE_RRRD, 3, 0},
   { "icmy", OP48(0xeb0000000081LL), MASK_RSY_RURD, INSTR_RSY_RURD, 2, 3},
   { "icmh", OP48(0xeb0000000080LL), MASK_RSY_RURD, INSTR_RSY_RURD, 2, 3},
   { "icmh", OP48(0xeb0000000080LL), MASK_RSE_RURD, INSTR_RSE_RURD, 2, 2},
+  { "algsi", OP48(0xeb000000007eLL), MASK_SIY_IRD, INSTR_SIY_IRD, 2, 6},
+  { "agsi", OP48(0xeb000000007aLL), MASK_SIY_IRD, INSTR_SIY_IRD, 2, 6},
+  { "alsi", OP48(0xeb000000006eLL), MASK_SIY_IRD, INSTR_SIY_IRD, 2, 6},
+  { "asi", OP48(0xeb000000006aLL), MASK_SIY_IRD, INSTR_SIY_IRD, 2, 6},
   { "xiy", OP48(0xeb0000000057LL), MASK_SIY_URD, INSTR_SIY_URD, 2, 3},
   { "oiy", OP48(0xeb0000000056LL), MASK_SIY_URD, INSTR_SIY_URD, 2, 3},
   { "cliy", OP48(0xeb0000000055LL), MASK_SIY_URD, INSTR_SIY_URD, 2, 3},
   { "niy", OP48(0xeb0000000054LL), MASK_SIY_URD, INSTR_SIY_URD, 2, 3},
   { "mviy", OP48(0xeb0000000052LL), MASK_SIY_URD, INSTR_SIY_URD, 2, 3},
   { "tmy", OP48(0xeb0000000051LL), MASK_SIY_URD, INSTR_SIY_URD, 2, 3},
+  { "ecag", OP48(0xeb000000004cLL), MASK_RSY_RRRD, INSTR_RSY_RRRD, 2, 6},
   { "bxleg", OP48(0xeb0000000045LL), MASK_RSY_RRRD, INSTR_RSY_RRRD, 2, 3},
   { "bxleg", OP48(0xeb0000000045LL), MASK_RSE_RRRD, INSTR_RSE_RRRD, 2, 2},
   { "bxhg", OP48(0xeb0000000044LL), MASK_RSY_RRRD, INSTR_RSY_RRRD, 2, 3},
@@ -916,6 +1339,15 @@ const struct s390_opcode s390_opcodes[] =
   { "unpka", OP8(0xeaLL), MASK_SS_L0RDRD, INSTR_SS_L0RDRD, 3, 0},
   { "pka", OP8(0xe9LL), MASK_SS_L2RDRD, INSTR_SS_L2RDRD, 3, 0},
   { "mvcin", OP8(0xe8LL), MASK_SS_L0RDRD, INSTR_SS_L0RDRD, 3, 0},
+  { "clfhsi", OP16(0xe55dLL), MASK_SIL_RDU, INSTR_SIL_RDU, 2, 6},
+  { "chsi", OP16(0xe55cLL), MASK_SIL_RDI, INSTR_SIL_RDI, 2, 6},
+  { "clghsi", OP16(0xe559LL), MASK_SIL_RDU, INSTR_SIL_RDU, 2, 6},
+  { "cghsi", OP16(0xe558LL), MASK_SIL_RDI, INSTR_SIL_RDI, 2, 6},
+  { "clhhsi", OP16(0xe555LL), MASK_SIL_RDU, INSTR_SIL_RDU, 2, 6},
+  { "chhsi", OP16(0xe554LL), MASK_SIL_RDI, INSTR_SIL_RDI, 2, 6},
+  { "mvhi", OP16(0xe54cLL), MASK_SIL_RDI, INSTR_SIL_RDI, 2, 6},
+  { "mvghi", OP16(0xe548LL), MASK_SIL_RDI, INSTR_SIL_RDI, 2, 6},
+  { "mvhhi", OP16(0xe544LL), MASK_SIL_RDI, INSTR_SIL_RDI, 2, 6},
   { "mvcdk", OP16(0xe50fLL), MASK_SSE_RDRD, INSTR_SSE_RDRD, 3, 0},
   { "mvcsk", OP16(0xe50eLL), MASK_SSE_RDRD, INSTR_SSE_RDRD, 3, 0},
   { "tprot", OP16(0xe501LL), MASK_SSE_RDRD, INSTR_SSE_RDRD, 3, 0},
@@ -953,18 +1385,21 @@ const struct s390_opcode s390_opcodes[] =
   { "og", OP48(0xe30000000081LL), MASK_RXE_RRRD, INSTR_RXE_RRRD, 2, 2},
   { "ng", OP48(0xe30000000080LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "ng", OP48(0xe30000000080LL), MASK_RXE_RRRD, INSTR_RXE_RRRD, 2, 2},
+  { "mhy", OP48(0xe3000000007cLL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 6},
   { "shy", OP48(0xe3000000007bLL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "ahy", OP48(0xe3000000007aLL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "chy", OP48(0xe30000000079LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "lhy", OP48(0xe30000000078LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "lgb", OP48(0xe30000000077LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "lb", OP48(0xe30000000076LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
+  { "laey", OP48(0xe30000000075LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 6},
   { "icy", OP48(0xe30000000073LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "stcy", OP48(0xe30000000072LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "lay", OP48(0xe30000000071LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "sthy", OP48(0xe30000000070LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "sly", OP48(0xe3000000005fLL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "aly", OP48(0xe3000000005eLL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
+  { "mfy", OP48(0xe3000000005cLL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 6},
   { "sy", OP48(0xe3000000005bLL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "ay", OP48(0xe3000000005aLL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "cy", OP48(0xe30000000059LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
@@ -981,6 +1416,9 @@ const struct s390_opcode s390_opcodes[] =
   { "strvh", OP48(0xe3000000003fLL), MASK_RXE_RRRD, INSTR_RXE_RRRD, 3, 2},
   { "strv", OP48(0xe3000000003eLL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 3, 3},
   { "strv", OP48(0xe3000000003eLL), MASK_RXE_RRRD, INSTR_RXE_RRRD, 3, 2},
+  { "pfd", OP48(0xe30000000036LL), MASK_RXY_URRD, INSTR_RXY_URRD, 2, 6},
+  { "cgh", OP48(0xe30000000034LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 6},
+  { "ltgf", OP48(0xe30000000032LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 6},
   { "clgf", OP48(0xe30000000031LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
   { "clgf", OP48(0xe30000000031LL), MASK_RXE_RRRD, INSTR_RXE_RRRD, 2, 2},
   { "cgf", OP48(0xe30000000030LL), MASK_RXY_RRRD, INSTR_RXY_RRRD, 2, 3},
@@ -1063,6 +1501,29 @@ const struct s390_opcode s390_opcodes[] =
   { "csst", OP16(0xc802LL), MASK_SSF_RRDRD, INSTR_SSF_RRDRD, 2, 5},
   { "ectg", OP16(0xc801LL), MASK_SSF_RRDRD, INSTR_SSF_RRDRD, 2, 5},
   { "mvcos", OP16(0xc800LL), MASK_SSF_RRDRD, INSTR_SSF_RRDRD, 2, 4},
+  { "clrl", OP16(0xc60fLL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "clgfrl", OP16(0xc60eLL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "crl", OP16(0xc60dLL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "cgfrl", OP16(0xc60cLL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "clgrl", OP16(0xc60aLL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "cgrl", OP16(0xc608LL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "clhrl", OP16(0xc607LL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "clghrl", OP16(0xc606LL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "chrl", OP16(0xc605LL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "cghrl", OP16(0xc604LL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "pfdrl", OP16(0xc602LL), MASK_RIL_UP, INSTR_RIL_UP, 2, 6},
+  { "exrl", OP16(0xc600LL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "strl", OP16(0xc40fLL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "llgfrl", OP16(0xc40eLL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "lrl", OP16(0xc40dLL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "lgfrl", OP16(0xc40cLL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "stgrl", OP16(0xc40bLL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "lgrl", OP16(0xc408LL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "sthrl", OP16(0xc407LL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "llghrl", OP16(0xc406LL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "lhrl", OP16(0xc405LL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "lghrl", OP16(0xc404LL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
+  { "llhrl", OP16(0xc402LL), MASK_RIL_RP, INSTR_RIL_RP, 2, 6},
   { "clfi", OP16(0xc20fLL), MASK_RIL_RU, INSTR_RIL_RU, 2, 4},
   { "clgfi", OP16(0xc20eLL), MASK_RIL_RU, INSTR_RIL_RU, 2, 4},
   { "cfi", OP16(0xc20dLL), MASK_RIL_RI, INSTR_RIL_RI, 2, 4},
@@ -1073,6 +1534,8 @@ const struct s390_opcode s390_opcodes[] =
   { "agfi", OP16(0xc208LL), MASK_RIL_RI, INSTR_RIL_RI, 2, 4},
   { "slfi", OP16(0xc205LL), MASK_RIL_RU, INSTR_RIL_RU, 2, 4},
   { "slgfi", OP16(0xc204LL), MASK_RIL_RU, INSTR_RIL_RU, 2, 4},
+  { "msfi", OP16(0xc201LL), MASK_RIL_RI, INSTR_RIL_RI, 2, 6},
+  { "msgfi", OP16(0xc200LL), MASK_RIL_RI, INSTR_RIL_RI, 2, 6},
   { "jg", OP16(0xc0f4LL), MASK_RIL_0P, INSTR_RIL_0P, 3, 2},
   { "jgno", OP16(0xc0e4LL), MASK_RIL_0P, INSTR_RIL_0P, 3, 2},
   { "jgnh", OP16(0xc0d4LL), MASK_RIL_0P, INSTR_RIL_0P, 3, 2},
@@ -1113,11 +1576,15 @@ const struct s390_opcode s390_opcodes[] =
   { "clm", OP8(0xbdLL), MASK_RS_RURD, INSTR_RS_RURD, 3, 0},
   { "cds", OP8(0xbbLL), MASK_RS_RRRD, INSTR_RS_RRRD, 3, 0},
   { "cs", OP8(0xbaLL), MASK_RS_RRRD, INSTR_RS_RRRD, 3, 0},
-  { "cu42", OP16(0xb9b3LL), MASK_RRF_M0RR, INSTR_RRF_M0RR, 2, 4},
-  { "cu41", OP16(0xb9b2LL), MASK_RRF_M0RR, INSTR_RRF_M0RR, 2, 4},
+  { "trte", OP16(0xb9bfLL), MASK_RRF_M0RR, INSTR_RRF_M0RR, 2, 6},
+  { "trtre", OP16(0xb9bdLL), MASK_RRF_M0RR, INSTR_RRF_M0RR, 2, 6},
+  { "cu42", OP16(0xb9b3LL), MASK_RRE_RR, INSTR_RRE_RR, 2, 4},
+  { "cu41", OP16(0xb9b2LL), MASK_RRE_RR, INSTR_RRE_RR, 2, 4},
   { "cu24", OP16(0xb9b1LL), MASK_RRF_M0RR, INSTR_RRF_M0RR, 2, 4},
   { "cu14", OP16(0xb9b0LL), MASK_RRF_M0RR, INSTR_RRF_M0RR, 2, 4},
+  { "pfmf", OP16(0xb9afLL), MASK_RRE_RR, INSTR_RRE_RR, 2, 6},
   { "lptea", OP16(0xb9aaLL), MASK_RRF_RURR, INSTR_RRF_RURR, 2, 4},
+  { "ptf", OP16(0xb9a2LL), MASK_RRE_R0, INSTR_RRE_R0, 2, 6},
   { "esea", OP16(0xb99dLL), MASK_RRE_R0, INSTR_RRE_R0, 2, 2},
   { "slbr", OP16(0xb999LL), MASK_RRE_RR, INSTR_RRE_RR, 3, 2},
   { "alcr", OP16(0xb998LL), MASK_RRE_RR, INSTR_RRE_RR, 3, 2},
@@ -1146,6 +1613,58 @@ const struct s390_opcode s390_opcodes[] =
   { "xgr", OP16(0xb982LL), MASK_RRE_RR, INSTR_RRE_RR, 2, 2},
   { "ogr", OP16(0xb981LL), MASK_RRE_RR, INSTR_RRE_RR, 2, 2},
   { "ngr", OP16(0xb980LL), MASK_RRE_RR, INSTR_RRE_RR, 2, 2},
+  { "clrtnh", OP48(0xb973c0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clrtle", OP48(0xb973c0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clrtnl", OP48(0xb973a0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clrthe", OP48(0xb973a0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clrte", OP48(0xb97380000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clrtnlh", OP48(0xb97380000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clrtne", OP48(0xb97360000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clrtlh", OP48(0xb97360000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clrtl", OP48(0xb97340000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clrtnhe", OP48(0xb97340000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clrth", OP48(0xb97320000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clrtnle", OP48(0xb97320000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clrt", OP16(0xb973LL), MASK_RRF_U0RR, INSTR_RRF_U0RR, 2, 6},
+  { "crtnh", OP48(0xb972c0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "crtle", OP48(0xb972c0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "crtnl", OP48(0xb972a0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "crthe", OP48(0xb972a0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "crte", OP48(0xb97280000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "crtnlh", OP48(0xb97280000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "crtne", OP48(0xb97260000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "crtlh", OP48(0xb97260000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "crtl", OP48(0xb97240000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "crtnhe", OP48(0xb97240000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "crth", OP48(0xb97220000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "crtnle", OP48(0xb97220000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "crt", OP16(0xb972LL), MASK_RRF_U0RR, INSTR_RRF_U0RR, 2, 6},
+  { "clgrtnh", OP48(0xb961c0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clgrtle", OP48(0xb961c0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clgrtnl", OP48(0xb961a0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clgrthe", OP48(0xb961a0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clgrte", OP48(0xb96180000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clgrtnlh", OP48(0xb96180000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clgrtne", OP48(0xb96160000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clgrtlh", OP48(0xb96160000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clgrtl", OP48(0xb96140000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clgrtnhe", OP48(0xb96140000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clgrth", OP48(0xb96120000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clgrtnle", OP48(0xb96120000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "clgrt", OP16(0xb961LL), MASK_RRF_U0RR, INSTR_RRF_U0RR, 2, 6},
+  { "cgrtnh", OP48(0xb960c0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "cgrtle", OP48(0xb960c0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "cgrtnl", OP48(0xb960a0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "cgrthe", OP48(0xb960a0000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "cgrte", OP48(0xb96080000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "cgrtnlh", OP48(0xb96080000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "cgrtne", OP48(0xb96060000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "cgrtlh", OP48(0xb96060000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "cgrtl", OP48(0xb96040000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "cgrtnhe", OP48(0xb96040000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "cgrth", OP48(0xb96020000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "cgrtnle", OP48(0xb96020000000LL), MASK_RRF_00RR, INSTR_RRF_00RR, 2, 6},
+  { "cgrt", OP16(0xb960LL), MASK_RRF_U0RR, INSTR_RRF_U0RR, 2, 6},
   { "bctgr", OP16(0xb946LL), MASK_RRE_RR, INSTR_RRE_RR, 2, 2},
   { "klmd", OP16(0xb93fLL), MASK_RRE_RR, INSTR_RRE_RR, 3, 3},
   { "kimd", OP16(0xb93eLL), MASK_RRE_RR, INSTR_RRE_RR, 3, 3},
@@ -1191,16 +1710,16 @@ const struct s390_opcode s390_opcodes[] =
   { "lpgr", OP16(0xb900LL), MASK_RRE_RR, INSTR_RRE_RR, 2, 2},
   { "lctl", OP8(0xb7LL), MASK_RS_CCRD, INSTR_RS_CCRD, 3, 0},
   { "stctl", OP8(0xb6LL), MASK_RS_CCRD, INSTR_RS_CCRD, 3, 0},
-  { "rrxtr", OP16(0xb3ffLL), MASK_RRF_FFFU, INSTR_RRF_FFFU, 2, 5},
+  { "rrxtr", OP16(0xb3ffLL), MASK_RRF_FFRU, INSTR_RRF_FFRU, 2, 5},
   { "iextr", OP16(0xb3feLL), MASK_RRF_F0FR, INSTR_RRF_F0FR, 2, 5},
-  { "qaxtr", OP16(0xb3fdLL), MASK_RRF_FFFU, INSTR_RRF_FFFU, 2, 5},
+  { "qaxtr", OP16(0xb3fdLL), MASK_RRF_FUFF, INSTR_RRF_FUFF, 2, 5},
   { "cextr", OP16(0xb3fcLL), MASK_RRE_FF, INSTR_RRE_FF, 2, 5},
   { "cxstr", OP16(0xb3fbLL), MASK_RRE_FR, INSTR_RRE_FR, 2, 5},
   { "cxutr", OP16(0xb3faLL), MASK_RRE_FR, INSTR_RRE_FR, 2, 5},
   { "cxgtr", OP16(0xb3f9LL), MASK_RRE_FR, INSTR_RRE_FR, 2, 5},
-  { "rrdtr", OP16(0xb3f7LL), MASK_RRF_FFFU, INSTR_RRF_FFFU, 2, 5},
+  { "rrdtr", OP16(0xb3f7LL), MASK_RRF_FFRU, INSTR_RRF_FFRU, 2, 5},
   { "iedtr", OP16(0xb3f6LL), MASK_RRF_F0FR, INSTR_RRF_F0FR, 2, 5},
-  { "qadtr", OP16(0xb3f5LL), MASK_RRF_FFFU, INSTR_RRF_FFFU, 2, 5},
+  { "qadtr", OP16(0xb3f5LL), MASK_RRF_FUFF, INSTR_RRF_FUFF, 2, 5},
   { "cedtr", OP16(0xb3f4LL), MASK_RRE_FF, INSTR_RRE_FF, 2, 5},
   { "cdstr", OP16(0xb3f3LL), MASK_RRE_FR, INSTR_RRE_FR, 2, 5},
   { "cdutr", OP16(0xb3f2LL), MASK_RRE_FR, INSTR_RRE_FR, 2, 5},
@@ -1239,52 +1758,52 @@ const struct s390_opcode s390_opcodes[] =
   { "cgxr", OP16(0xb3caLL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 2, 2},
   { "cgdr", OP16(0xb3c9LL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 2, 2},
   { "cger", OP16(0xb3c8LL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 2, 2},
-  { "cxgr", OP16(0xb3c6LL), MASK_RRE_RR, INSTR_RRE_RR, 2, 2},
-  { "cdgr", OP16(0xb3c5LL), MASK_RRE_RR, INSTR_RRE_RR, 2, 2},
-  { "cegr", OP16(0xb3c4LL), MASK_RRE_RR, INSTR_RRE_RR, 2, 2},
+  { "cxgr", OP16(0xb3c6LL), MASK_RRE_FR, INSTR_RRE_FR, 2, 2},
+  { "cdgr", OP16(0xb3c5LL), MASK_RRE_FR, INSTR_RRE_FR, 2, 2},
+  { "cegr", OP16(0xb3c4LL), MASK_RRE_FR, INSTR_RRE_FR, 2, 2},
   { "ldgr", OP16(0xb3c1LL), MASK_RRE_FR, INSTR_RRE_FR, 2, 5},
-  { "cfxr", OP16(0xb3baLL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 2, 2},
-  { "cfdr", OP16(0xb3b9LL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 2, 2},
-  { "cfer", OP16(0xb3b8LL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 2, 2},
-  { "cxfr", OP16(0xb3b6LL), MASK_RRE_RF, INSTR_RRE_RF, 3, 0},
-  { "cdfr", OP16(0xb3b5LL), MASK_RRE_RF, INSTR_RRE_RF, 3, 0},
-  { "cefr", OP16(0xb3b4LL), MASK_RRE_RF, INSTR_RRE_RF, 3, 0},
+  { "cfxr", OP16(0xb3baLL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 3, 2},
+  { "cfdr", OP16(0xb3b9LL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 3, 2},
+  { "cfer", OP16(0xb3b8LL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 3, 2},
+  { "cxfr", OP16(0xb3b6LL), MASK_RRE_FR, INSTR_RRE_FR, 3, 0},
+  { "cdfr", OP16(0xb3b5LL), MASK_RRE_FR, INSTR_RRE_FR, 3, 0},
+  { "cefr", OP16(0xb3b4LL), MASK_RRE_FR, INSTR_RRE_FR, 3, 0},
   { "cgxbr", OP16(0xb3aaLL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 2, 2},
   { "cgdbr", OP16(0xb3a9LL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 2, 2},
   { "cgebr", OP16(0xb3a8LL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 2, 2},
-  { "cxgbr", OP16(0xb3a6LL), MASK_RRE_RR, INSTR_RRE_RR, 2, 2},
-  { "cdgbr", OP16(0xb3a5LL), MASK_RRE_RR, INSTR_RRE_RR, 2, 2},
-  { "cegbr", OP16(0xb3a4LL), MASK_RRE_RR, INSTR_RRE_RR, 2, 2},
+  { "cxgbr", OP16(0xb3a6LL), MASK_RRE_FR, INSTR_RRE_FR, 2, 2},
+  { "cdgbr", OP16(0xb3a5LL), MASK_RRE_FR, INSTR_RRE_FR, 2, 2},
+  { "cegbr", OP16(0xb3a4LL), MASK_RRE_FR, INSTR_RRE_FR, 2, 2},
   { "cfxbr", OP16(0xb39aLL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 3, 0},
   { "cfdbr", OP16(0xb399LL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 3, 0},
   { "cfebr", OP16(0xb398LL), MASK_RRF_U0RF, INSTR_RRF_U0RF, 3, 0},
-  { "cxfbr", OP16(0xb396LL), MASK_RRE_RF, INSTR_RRE_RF, 3, 0},
-  { "cdfbr", OP16(0xb395LL), MASK_RRE_RF, INSTR_RRE_RF, 3, 0},
-  { "cefbr", OP16(0xb394LL), MASK_RRE_RF, INSTR_RRE_RF, 3, 0},
+  { "cxfbr", OP16(0xb396LL), MASK_RRE_FR, INSTR_RRE_FR, 3, 0},
+  { "cdfbr", OP16(0xb395LL), MASK_RRE_FR, INSTR_RRE_FR, 3, 0},
+  { "cefbr", OP16(0xb394LL), MASK_RRE_FR, INSTR_RRE_FR, 3, 0},
   { "efpc", OP16(0xb38cLL), MASK_RRE_RR_OPT, INSTR_RRE_RR_OPT, 3, 0},
   { "sfasr", OP16(0xb385LL), MASK_RRE_R0, INSTR_RRE_R0, 2, 5},
   { "sfpc", OP16(0xb384LL), MASK_RRE_RR_OPT, INSTR_RRE_RR_OPT, 3, 0},
-  { "fidr", OP16(0xb37fLL), MASK_RRF_U0FF, INSTR_RRF_U0FF, 3, 0},
-  { "fier", OP16(0xb377LL), MASK_RRF_U0FF, INSTR_RRF_U0FF, 3, 0},
-  { "lzxr", OP16(0xb376LL), MASK_RRE_R0, INSTR_RRE_R0, 3, 0},
-  { "lzdr", OP16(0xb375LL), MASK_RRE_R0, INSTR_RRE_R0, 3, 0},
-  { "lzer", OP16(0xb374LL), MASK_RRE_R0, INSTR_RRE_R0, 3, 0},
+  { "fidr", OP16(0xb37fLL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
+  { "fier", OP16(0xb377LL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
+  { "lzxr", OP16(0xb376LL), MASK_RRE_F0, INSTR_RRE_F0, 3, 0},
+  { "lzdr", OP16(0xb375LL), MASK_RRE_F0, INSTR_RRE_F0, 3, 0},
+  { "lzer", OP16(0xb374LL), MASK_RRE_F0, INSTR_RRE_F0, 3, 0},
   { "lcdfr", OP16(0xb373LL), MASK_RRE_FF, INSTR_RRE_FF, 2, 5},
   { "cpsdr", OP16(0xb372LL), MASK_RRF_F0FF2, INSTR_RRF_F0FF2, 2, 5},
   { "lndfr", OP16(0xb371LL), MASK_RRE_FF, INSTR_RRE_FF, 2, 5},
   { "lpdfr", OP16(0xb370LL), MASK_RRE_FF, INSTR_RRE_FF, 2, 5},
   { "cxr", OP16(0xb369LL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
-  { "fixr", OP16(0xb367LL), MASK_RRF_U0FF, INSTR_RRF_U0FF, 3, 0},
+  { "fixr", OP16(0xb367LL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
   { "lexr", OP16(0xb366LL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
-  { "lxr", OP16(0xb365LL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
+  { "lxr", OP16(0xb365LL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
   { "lcxr", OP16(0xb363LL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
   { "ltxr", OP16(0xb362LL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
   { "lnxr", OP16(0xb361LL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
   { "lpxr", OP16(0xb360LL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
   { "fidbr", OP16(0xb35fLL), MASK_RRF_U0FF, INSTR_RRF_U0FF, 3, 0},
   { "didbr", OP16(0xb35bLL), MASK_RRF_FUFF, INSTR_RRF_FUFF, 3, 0},
-  { "thdr", OP16(0xb359LL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
-  { "thder", OP16(0xb358LL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
+  { "thdr", OP16(0xb359LL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
+  { "thder", OP16(0xb358LL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
   { "fiebr", OP16(0xb357LL), MASK_RRF_U0FF, INSTR_RRF_U0FF, 3, 0},
   { "diebr", OP16(0xb353LL), MASK_RRF_FUFF, INSTR_RRF_FUFF, 3, 0},
   { "tbdr", OP16(0xb351LL), MASK_RRF_U0FF, INSTR_RRF_U0FF, 3, 0},
@@ -1374,7 +1893,6 @@ const struct s390_opcode s390_opcodes[] =
   { "xsch", OP16(0xb276LL), MASK_S_00, INSTR_S_00, 3, 0},
   { "siga", OP16(0xb274LL), MASK_S_RD, INSTR_S_RD, 3, 0},
   { "cmpsc", OP16(0xb263LL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
-  { "cmpsc", OP16(0xb263LL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
   { "srst", OP16(0xb25eLL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
   { "clst", OP16(0xb25dLL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
   { "bsa", OP16(0xb25aLL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
@@ -1394,8 +1912,8 @@ const struct s390_opcode s390_opcodes[] =
   { "palb", OP16(0xb248LL), MASK_RRE_00, INSTR_RRE_00, 3, 0},
   { "msta", OP16(0xb247LL), MASK_RRE_R0, INSTR_RRE_R0, 3, 0},
   { "stura", OP16(0xb246LL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
-  { "sqer", OP16(0xb245LL), MASK_RRE_F0, INSTR_RRE_F0, 3, 0},
-  { "sqdr", OP16(0xb244LL), MASK_RRE_F0, INSTR_RRE_F0, 3, 0},
+  { "sqer", OP16(0xb245LL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
+  { "sqdr", OP16(0xb244LL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
   { "cksm", OP16(0xb241LL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
   { "bakr", OP16(0xb240LL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
   { "schm", OP16(0xb23cLL), MASK_S_00, INSTR_S_00, 3, 0},
@@ -1413,7 +1931,7 @@ const struct s390_opcode s390_opcodes[] =
   { "csch", OP16(0xb230LL), MASK_S_00, INSTR_S_00, 3, 0},
   { "pgout", OP16(0xb22fLL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
   { "pgin", OP16(0xb22eLL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
-  { "dxr", OP16(0xb22dLL), MASK_RRE_F0, INSTR_RRE_F0, 3, 0},
+  { "dxr", OP16(0xb22dLL), MASK_RRE_FF, INSTR_RRE_FF, 3, 0},
   { "tb", OP16(0xb22cLL), MASK_RRE_0R, INSTR_RRE_0R, 3, 0},
   { "sske", OP16(0xb22bLL), MASK_RRF_M0RR, INSTR_RRF_M0RR, 2, 4},
   { "sske", OP16(0xb22bLL), MASK_RRE_RR, INSTR_RRE_RR, 3, 0},
@@ -1445,6 +1963,7 @@ const struct s390_opcode s390_opcodes[] =
   { "sck", OP16(0xb204LL), MASK_S_RD, INSTR_S_RD, 3, 0},
   { "stidp", OP16(0xb202LL), MASK_S_RD, INSTR_S_RD, 3, 0},
   { "lra", OP8(0xb1LL), MASK_RX_RRRD, INSTR_RX_RRRD, 3, 0},
+  { "mc", OP16(0xaf00LL), MASK_SI_URD, INSTR_SI_URD, 2, 6},
   { "mc", OP8(0xafLL), MASK_SI_URD, INSTR_SI_URD, 3, 0},
   { "sigp", OP8(0xaeLL), MASK_RS_RRRD, INSTR_RS_RRRD, 3, 0},
   { "stosm", OP8(0xadLL), MASK_SI_URD, INSTR_SI_URD, 3, 0},
@@ -1598,7 +2117,7 @@ const struct s390_opcode s390_opcodes[] =
   { "bp", OP16(0x4720LL), MASK_RX_0RRD, INSTR_RX_0RRD, 3, 0},
   { "bo", OP16(0x4710LL), MASK_RX_0RRD, INSTR_RX_0RRD, 3, 0},
   { "bc", OP8(0x47LL), MASK_RX_URRD, INSTR_RX_URRD, 3, 0},
-  { "nop", OP16(0x4700LL), MASK_RX_0RRD, INSTR_RX_0RRD, 3, 0},
+  { "nop", OP16(0x4700LL), MASK_RX_0RRD_OPT, INSTR_RX_0RRD_OPT, 3, 0},
   { "bct", OP8(0x46LL), MASK_RX_RRRD, INSTR_RX_RRRD, 3, 0},
   { "bal", OP8(0x45LL), MASK_RX_RRRD, INSTR_RX_RRRD, 3, 0},
   { "ex", OP8(0x44LL), MASK_RX_RRRD, INSTR_RX_RRRD, 3, 0},
@@ -1685,7 +2204,7 @@ const struct s390_opcode s390_opcodes[] =
   { "bpr", OP16(0x0720LL), MASK_RR_0R, INSTR_RR_0R, 3, 0},
   { "bor", OP16(0x0710LL), MASK_RR_0R, INSTR_RR_0R, 3, 0},
   { "bcr", OP8(0x07LL), MASK_RR_UR, INSTR_RR_UR, 3, 0},
-  { "nopr", OP16(0x0700LL), MASK_RR_0R, INSTR_RR_0R, 3, 0},
+  { "nopr", OP16(0x0700LL), MASK_RR_0R_OPT, INSTR_RR_0R_OPT, 3, 0},
   { "bctr", OP8(0x06LL), MASK_RR_RR, INSTR_RR_RR, 3, 0},
   { "balr", OP8(0x05LL), MASK_RR_RR, INSTR_RR_RR, 3, 0},
   { "spm", OP8(0x04LL), MASK_RR_R0, INSTR_RR_R0, 3, 0},
@@ -1696,9 +2215,10 @@ const struct s390_opcode s390_opcodes[] =
   { "tam", OP16(0x010bLL), MASK_E, INSTR_E, 3, 2},
   { "pfpo", OP16(0x010aLL), MASK_E, INSTR_E, 2, 5},
   { "sckpf", OP16(0x0107LL), MASK_E, INSTR_E, 3, 0},
+  { "ptff", OP16(0x0104LL), MASK_E, INSTR_E, 2, 4},
   { "upt", OP16(0x0102LL), MASK_E, INSTR_E, 3, 0},
   { "pr", OP16(0x0101LL), MASK_E, INSTR_E, 3, 0}
 };
 
-const int s390_num_opcodes =
+static const int s390_num_opcodes =
   sizeof (s390_opcodes) / sizeof (s390_opcodes[0]);
-- 
1.7.0.1

* [Qemu-devel] [PATCH 16/62] tcg-s390: Compute is_write in cpu_signal_handler.
From: Richard Henderson @ 2010-05-27 20:45 UTC
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 cpu-exec.c |   42 +++++++++++++++++++++++++++++++++++++++---
 1 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index c776605..026980a 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -1156,11 +1156,47 @@ int cpu_signal_handler(int host_signum, void *pinfo,
     siginfo_t *info = pinfo;
     struct ucontext *uc = puc;
     unsigned long pc;
-    int is_write;
+    uint16_t *pinsn;
+    int is_write = 0;
 
     pc = uc->uc_mcontext.psw.addr;
-    /* XXX: compute is_write */
-    is_write = 0;
+
+    /* ??? On Linux, the non-rt signal handler has 4 (!) arguments instead
+       of the normal 2 arguments.  The 3rd argument contains the "int_code"
+       from the hardware which does in fact contain the is_write value.
+       The rt signal handler, as far as I can tell, does not give this value
+       at all.  Not that we could get to it from here even if it did.  */
+    /* ??? This is not even close to complete, since it ignores all
+       of the read-modify-write instructions.  */
+    pinsn = (uint16_t *)pc;
+    switch (pinsn[0] >> 8) {
+    case 0x50: /* ST */
+    case 0x42: /* STC */
+    case 0x40: /* STH */
+        is_write = 1;
+        break;
+    case 0xc4: /* RIL format insns */
+        switch (pinsn[0] & 0xf) {
+        case 0xf: /* STRL */
+        case 0xb: /* STGRL */
+        case 0x7: /* STHRL */
+            is_write = 1;
+        }
+        break;
+    case 0xe3: /* RXY format insns */
+        switch (pinsn[2] & 0xff) {
+        case 0x50: /* STY */
+        case 0x24: /* STG */
+        case 0x72: /* STCY */
+        case 0x70: /* STHY */
+        case 0x8e: /* STPQ */
+        case 0x3f: /* STRVH */
+        case 0x3e: /* STRV */
+        case 0x2f: /* STRVG */
+            is_write = 1;
+        }
+        break;
+    }
     return handle_cpu_signal(pc, (unsigned long)info->si_addr,
                              is_write, &uc->uc_sigmask, puc);
 }
-- 
1.7.0.1
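
For illustration, the opcode test in the patch above can be written as a
standalone predicate over instruction bytes.  This is a minimal sketch and
not the patch's code: the name insn_is_store is hypothetical, and it indexes
bytes instead of casting the PC to uint16_t *, so it does not depend on host
byte order.  Like the patch, it ignores all read-modify-write instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return 1 if the instruction at INSN is one of the
   stores recognized by the patch, 0 otherwise.  INSN must point to at
   least 6 readable bytes when the first byte is 0xe3.  */
static int insn_is_store(const uint8_t *insn)
{
    assert(insn != 0);

    switch (insn[0]) {
    case 0x50: /* ST */
    case 0x42: /* STC */
    case 0x40: /* STH */
        return 1;

    case 0xc4: /* RIL format: opcode extension is the low nibble of byte 1 */
        switch (insn[1] & 0xf) {
        case 0xf: /* STRL */
        case 0xb: /* STGRL */
        case 0x7: /* STHRL */
            return 1;
        }
        return 0;

    case 0xe3: /* RXY format: opcode extension is byte 5 */
        switch (insn[5]) {
        case 0x50: /* STY */
        case 0x24: /* STG */
        case 0x72: /* STCY */
        case 0x70: /* STHY */
        case 0x8e: /* STPQ */
        case 0x3f: /* STRVH */
        case 0x3e: /* STRV */
        case 0x2f: /* STRVG */
            return 1;
        }
        return 0;
    }
    return 0;
}
```

On a big-endian host, insn[5] is the same byte as the patch's
pinsn[2] & 0xff, and insn[1] & 0xf is the same nibble as pinsn[0] & 0xf.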

* [Qemu-devel] [PATCH 17/62] tcg-s390: Reorganize instruction emission
From: Richard Henderson @ 2010-05-27 20:45 UTC
  To: qemu-devel; +Cc: agraf, aurelien

Tie the opcode names to the format, and arrange for moderate
compile-time checking that the instruction format output routine
matches the format used by the opcode.
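
As a rough sketch of the idea, separate from the code in this patch: each
opcode name carries its format as a prefix, and one macro pastes the format
name onto both the emitter function and the opcode constant.  The Emitter
type, emit16/emit32, and the EMIT and PASTE macros below are hypothetical
stand-ins for TCGContext, tcg_out16/tcg_out32, and the glue-based
tcg_out_insn macro; only three opcodes are shown.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode names are prefixed with their instruction format.  */
typedef enum {
    RR_LR    = 0x18,
    RRE_LGR  = 0xb904,
    RSY_SLLG = 0xeb0d,
} Opcode;

/* Hypothetical output buffer standing in for TCGContext.  */
typedef struct {
    uint8_t buf[16];
    unsigned len;
} Emitter;

/* Append a big-endian halfword.  */
static void emit16(Emitter *e, uint16_t v)
{
    assert(e->len + 2 <= sizeof(e->buf));
    e->buf[e->len++] = v >> 8;
    e->buf[e->len++] = v & 0xff;
}

/* Append a big-endian word.  */
static void emit32(Emitter *e, uint32_t v)
{
    emit16(e, v >> 16);
    emit16(e, v & 0xffff);
}

static void emit_RR(Emitter *e, Opcode op, int r1, int r2)
{
    emit16(e, (op << 8) | (r1 << 4) | r2);
}

static void emit_RRE(Emitter *e, Opcode op, int r1, int r2)
{
    emit32(e, ((uint32_t)op << 16) | (r1 << 4) | r2);
}

/* RSY: the two opcode bytes sit at opposite ends of the insn, and the
   20-bit displacement is split into a low 12-bit and a high 8-bit part.  */
static void emit_RSY(Emitter *e, Opcode op, int r1, int b2, int r3, int disp)
{
    emit16(e, (op & 0xff00) | (r1 << 4) | r3);
    emit32(e, (op & 0xff) | ((uint32_t)b2 << 28)
              | ((disp & 0xfff) << 16) | ((disp & 0xff000) >> 4));
}

/* Two-level paste so that macro arguments are expanded before ##.  */
#define PASTE_(a, b) a##b
#define PASTE(a, b)  PASTE_(a, b)

/* Select emitter and opcode constant from the same format name.  */
#define EMIT(E, FMT, OP, ...) \
    PASTE(emit_, FMT)(E, PASTE(PASTE(FMT, _), OP), __VA_ARGS__)
```

With this arrangement, EMIT(&e, RR, LR, 2, 3) expands to
emit_RR(&e, RR_LR, 2, 3), while EMIT(&e, RRE, LR, 2, 3) fails to compile
because no RRE_LR enumerator exists.  The check is only "moderate": it
catches a mismatched format name, not mis-ordered register operands.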

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  625 +++++++++++++++++++++++--------------------------
 tcg/s390/tcg-target.h |    5 +-
 2 files changed, 297 insertions(+), 333 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index e0a0e73..0deb332 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -36,64 +36,86 @@
 #define TCG_CT_CONST_S16                0x100
 #define TCG_CT_CONST_U12                0x200
 
-#define E3_LG          0x04
-#define E3_LRVG        0x0f
-#define E3_LGF         0x14
-#define E3_LGH         0x15
-#define E3_LLGF        0x16
-#define E3_LRV         0x1e
-#define E3_LRVH        0x1f
-#define E3_CG          0x20
-#define E3_STG         0x24
-#define E3_STRVG       0x2f
-#define E3_STRV        0x3e
-#define E3_STRVH       0x3f
-#define E3_STHY        0x70
-#define E3_STCY        0x72
-#define E3_LGB         0x77
-#define E3_LLGC        0x90
-#define E3_LLGH        0x91
-
-#define B9_LGR         0x04
-#define B9_AGR         0x08
-#define B9_SGR         0x09
-#define B9_MSGR        0x0c
-#define B9_LGFR        0x14
-#define B9_LLGFR       0x16
-#define B9_CGR         0x20
-#define B9_CLGR        0x21
-#define B9_NGR         0x80
-#define B9_OGR         0x81
-#define B9_XGR         0x82
-#define B9_DLGR        0x87
-#define B9_DLR         0x97
-
-#define RR_BASR        0x0d
-#define RR_NR          0x14
-#define RR_CLR         0x15
-#define RR_OR          0x16
-#define RR_XR          0x17
-#define RR_LR          0x18
-#define RR_CR          0x19
-#define RR_AR          0x1a
-#define RR_SR          0x1b
-
-#define A7_AHI         0xa
-#define A7_AHGI        0xb
-
-#define SH64_REG_NONE  0x00 /* use immediate only (not R0!) */
-#define SH64_SRAG      0x0a
-#define SH64_SRLG      0x0c
-#define SH64_SLLG      0x0d
-
-#define SH32_REG_NONE  0x00 /* use immediate only (not R0!) */
-#define SH32_SRL       0x08
-#define SH32_SLL       0x09
-#define SH32_SRA       0x0a
-
-#define ST_STH         0x40
-#define ST_STC         0x42
-#define ST_ST          0x50
+/* All of the following instructions are prefixed with their instruction
+   format, and are defined as 8- or 16-bit quantities, even when the two
+   halves of the 16-bit quantity may appear 32 bits apart in the insn.
+   This makes it easy to copy the values from the tables in Appendix B.  */
+typedef enum S390Opcode {
+    RIL_LARL    = 0xc000,
+    RIL_BRASL   = 0xc005,
+
+    RI_AGHI     = 0xa70b,
+    RI_AHI      = 0xa70a,
+    RI_BRC      = 0xa704,
+    RI_IILH     = 0xa502,
+    RI_LGHI     = 0xa709,
+    RI_LLILL    = 0xa50f,
+
+    RRE_AGR     = 0xb908,
+    RRE_CGR     = 0xb920,
+    RRE_CLGR    = 0xb921,
+    RRE_DLGR    = 0xb987,
+    RRE_DLR     = 0xb997,
+    RRE_DSGFR   = 0xb91d,
+    RRE_DSGR    = 0xb90d,
+    RRE_LCGR    = 0xb903,
+    RRE_LGFR    = 0xb914,
+    RRE_LGR     = 0xb904,
+    RRE_LLGFR   = 0xb916,
+    RRE_MSGR    = 0xb90c,
+    RRE_MSR     = 0xb252,
+    RRE_NGR     = 0xb980,
+    RRE_OGR     = 0xb981,
+    RRE_SGR     = 0xb909,
+    RRE_XGR     = 0xb982,
+
+    RR_AR       = 0x1a,
+    RR_BASR     = 0x0d,
+    RR_BCR      = 0x07,
+    RR_CLR      = 0x15,
+    RR_CR       = 0x19,
+    RR_LCR      = 0x13,
+    RR_LR       = 0x18,
+    RR_NR       = 0x14,
+    RR_OR       = 0x16,
+    RR_SR       = 0x1b,
+    RR_XR       = 0x17,
+
+    RSY_SLLG    = 0xeb0d,
+    RSY_SRAG    = 0xeb0a,
+    RSY_SRLG    = 0xeb0c,
+
+    RS_SLL      = 0x89,
+    RS_SRA      = 0x8a,
+    RS_SRL      = 0x88,
+
+    RXY_CG      = 0xe320,
+    RXY_LG      = 0xe304,
+    RXY_LGB     = 0xe377,
+    RXY_LGF     = 0xe314,
+    RXY_LGH     = 0xe315,
+    RXY_LLGC    = 0xe390,
+    RXY_LLGF    = 0xe316,
+    RXY_LLGH    = 0xe391,
+    RXY_LMG     = 0xeb04,
+    RXY_LRV     = 0xe31e,
+    RXY_LRVG    = 0xe30f,
+    RXY_LRVH    = 0xe31f,
+    RXY_STCY    = 0xe372,
+    RXY_STG     = 0xe324,
+    RXY_STHY    = 0xe370,
+    RXY_STMG    = 0xeb24,
+    RXY_STRV    = 0xe33e,
+    RXY_STRVG   = 0xe32f,
+    RXY_STRVH   = 0xe33f,
+
+    RX_ST       = 0x50,
+    RX_STC      = 0x42,
+    RX_STH      = 0x40,
+} S390Opcode;
+
+#define SH32_REG_NONE  0
+#define SH64_REG_NONE  0
 
 #define LD_SIGNED      0x04
 #define LD_UINT8       0x00
@@ -105,14 +127,6 @@
 #define LD_UINT64      0x03
 #define LD_INT64       (LD_UINT64 | LD_SIGNED)
 
-#define S390_INS_BCR   0x0700
-#define S390_INS_BR    (S390_INS_BCR | 0x00f0)
-#define S390_INS_IILH  0xa5020000
-#define S390_INS_LLILL 0xa50f0000
-#define S390_INS_LGHI  0xa7090000
-#define S390_INS_MSR   0xb2520000
-#define S390_INS_LARL  0xc000
-
 #ifndef NDEBUG
 static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
     "%r0", "%r1", "%r2", "%r3", "%r4", "%r5", "%r6", "%r7",
@@ -204,7 +218,7 @@ static void patch_reloc(uint8_t *code_ptr, int type,
 }
 
 static int tcg_target_get_call_iarg_regs_count(int flags)
-  {
+{
     return sizeof(tcg_target_call_iarg_regs) / sizeof(int);
 }
 
@@ -240,7 +254,7 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
 
 /* Test if a constant matches the constraint. */
 static inline int tcg_target_const_match(tcg_target_long val,
-                const TCGArgConstraint *arg_ct)
+                                         const TCGArgConstraint *arg_ct)
 {
     int ct = arg_ct->ct;
 
@@ -253,60 +267,77 @@ static inline int tcg_target_const_match(tcg_target_long val,
     return 0;
 }
 
-/* emit load/store (and then some) instructions (E3 prefix) */
-static void tcg_out_e3(TCGContext* s, int op, int r1, int r2, int disp)
+/* Emit instructions according to the given instruction format.  */
+
+static void tcg_out_insn_RR(TCGContext *s, S390Opcode op, TCGReg r1, TCGReg r2)
 {
-    tcg_out16(s, 0xe300 | (r1 << 4));
-    tcg_out32(s, op | (r2 << 28) | ((disp & 0xfff) << 16) | ((disp >> 12) << 8));
+    tcg_out16(s, (op << 8) | (r1 << 4) | r2);
 }
 
-/* emit 64-bit register/register insns (B9 prefix) */
-static void tcg_out_b9(TCGContext* s, int op, int r1, int r2)
+static void tcg_out_insn_RRE(TCGContext *s, S390Opcode op,
+                             TCGReg r1, TCGReg r2)
 {
-    tcg_out32(s, 0xb9000000 | (op << 16) | (r1 << 4) | r2);
+    tcg_out32(s, (op << 16) | (r1 << 4) | r2);
 }
 
-/* emit (mostly) 32-bit register/register insns */
-static void tcg_out_rr(TCGContext* s, int op, int r1, int r2)
+static void tcg_out_insn_RI(TCGContext *s, S390Opcode op, TCGReg r1, int i2)
 {
-    tcg_out16(s, (op << 8) | (r1 << 4) | r2);
+    tcg_out32(s, (op << 16) | (r1 << 20) | (i2 & 0xffff));
 }
 
-static void tcg_out_a7(TCGContext *s, int op, int r1, int16_t i2)
+static void tcg_out_insn_RIL(TCGContext *s, S390Opcode op, TCGReg r1, int i2)
 {
-    tcg_out32(s, 0xa7000000UL | (r1 << 20) | (op << 16) | ((uint16_t)i2));
+    tcg_out16(s, op | (r1 << 4));
+    tcg_out32(s, i2);
 }
 
-/* emit 64-bit shifts (EB prefix) */
-static void tcg_out_sh64(TCGContext* s, int op, int r0, int r1, int r2, int imm)
+static void tcg_out_insn_RS(TCGContext *s, S390Opcode op, TCGReg r1,
+                            TCGReg b2, TCGReg r3, int disp)
 {
-    tcg_out16(s, 0xeb00 | (r0 << 4) | r1);
-    tcg_out32(s, op | (r2 << 28) | ((imm & 0xfff) << 16) | ((imm >> 12) << 8));
+    tcg_out32(s, (op << 24) | (r1 << 20) | (r3 << 16) | (b2 << 12)
+              | (disp & 0xfff));
 }
 
-/* emit 32-bit shifts */
-static void tcg_out_sh32(TCGContext* s, int op, int r0, int r1, int imm)
+static void tcg_out_insn_RSY(TCGContext *s, S390Opcode op, TCGReg r1,
+                             TCGReg b2, TCGReg r3, int disp)
 {
-    tcg_out32(s, 0x80000000 | (op << 24) | (r0 << 20) | (r1 << 12) | imm);
+    tcg_out16(s, (op & 0xff00) | (r1 << 4) | r3);
+    tcg_out32(s, (op & 0xff) | (b2 << 28)
+              | ((disp & 0xfff) << 16) | ((disp & 0xff000) >> 4));
 }
 
-/* branch to relative address (long) */
-static void tcg_out_brasl(TCGContext* s, int r, tcg_target_long raddr)
+#define tcg_out_insn_RX   tcg_out_insn_RS
+#define tcg_out_insn_RXY  tcg_out_insn_RSY
+
+/* Emit an opcode with "type-checking" of the format.  */
+#define tcg_out_insn(S, FMT, OP, ...) \
+    glue(tcg_out_insn_,FMT)(S, glue(glue(FMT,_),OP), ## __VA_ARGS__)
+
+
+/* emit 64-bit shifts */
+static void tcg_out_sh64(TCGContext* s, S390Opcode op, TCGReg dest,
+                         TCGReg src, TCGReg sh_reg, int sh_imm)
+{
+    tcg_out_insn_RSY(s, op, dest, sh_reg, src, sh_imm);
+}
+
+/* emit 32-bit shifts */
+static void tcg_out_sh32(TCGContext* s, S390Opcode op, TCGReg dest,
+                         TCGReg sh_reg, int sh_imm)
 {
-    tcg_out16(s, 0xc005 | (r << 4));
-    tcg_out32(s, raddr >> 1);
+    tcg_out_insn_RS(s, op, dest, sh_reg, 0, sh_imm);
 }
 
-/* store 8/16/32 bits */
-static void tcg_out_store(TCGContext* s, int op, int r0, int r1, int off)
+/* branch to relative address (long) */
+static void tcg_out_brasl(TCGContext *s, TCGReg r, tcg_target_long raddr)
 {
-    tcg_out32(s, (op << 24) | (r0 << 20) | (r1 << 12) | off);
+    tcg_out_insn(s, RIL, BRASL, r, raddr >> 1);
 }
 
 static inline void tcg_out_mov(TCGContext *s, int ret, int arg)
 {
     /* ??? With a TCGType argument, we could emit the smaller LR insn.  */
-    tcg_out_b9(s, B9_LGR, ret, arg);
+    tcg_out_insn(s, RRE, LGR, ret, arg);
 }
 
 /* load a register with an immediate value */
@@ -314,48 +345,39 @@ static inline void tcg_out_movi(TCGContext *s, TCGType type,
                 int ret, tcg_target_long arg)
 {
     if (arg >= -0x8000 && arg < 0x8000) { /* signed immediate load */
-        /* lghi %rret, arg */
-        tcg_out32(s, S390_INS_LGHI | (ret << 20) | (arg & 0xffff));
+        tcg_out_insn(s, RI, LGHI, ret, arg);
     } else if (!(arg & 0xffffffffffff0000UL)) {
-        /* llill %rret, arg */
-        tcg_out32(s, S390_INS_LLILL | (ret << 20) | arg);
+        tcg_out_insn(s, RI, LLILL, ret, arg);
     } else if (!(arg & 0xffffffff00000000UL) || type == TCG_TYPE_I32) {
-        /* llill %rret, arg */
-        tcg_out32(s, S390_INS_LLILL | (ret << 20) | (arg & 0xffff));
-        /* iilh %rret, arg */
-        tcg_out32(s, S390_INS_IILH | (ret << 20) | ((arg & 0xffffffff) >> 16));
+        tcg_out_insn(s, RI, LLILL, ret, arg);
+        tcg_out_insn(s, RI, IILH, ret, arg >> 16);
     } else {
         /* branch over constant and store its address in R13 */
         tcg_out_brasl(s, TCG_REG_R13, 14);
         /* 64-bit constant */
-        tcg_out32(s,arg >> 32);
-        tcg_out32(s,arg);
+        tcg_out32(s, arg >> 32);
+        tcg_out32(s, arg);
         /* load constant to ret */
-        tcg_out_e3(s, E3_LG, ret, TCG_REG_R13, 0);
+        tcg_out_insn(s, RXY, LG, ret, TCG_REG_R13, 0, 0);
     }
 }
 
 /* load data without address translation or endianness conversion */
-static inline void tcg_out_ld(TCGContext *s, TCGType type, int arg,
-                int arg1, tcg_target_long arg2)
+static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg data,
+                       TCGReg base, tcg_target_long ofs)
 {
-    int op;
-
-    dprintf("tcg_out_ld type %d arg %d arg1 %d arg2 %ld\n",
-            type, arg, arg1, arg2);
+    S390Opcode op;
 
-    op = (type == TCG_TYPE_I32) ? E3_LLGF : E3_LG;
+    op = (type == TCG_TYPE_I32) ? RXY_LLGF : RXY_LG;
 
-    if (arg2 < -0x80000 || arg2 > 0x7ffff) {
+    if (ofs < -0x80000 || ofs > 0x7ffff) {
         /* load the displacement */
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, arg2);
-        /* add the address */
-        tcg_out_b9(s, B9_AGR, TCG_REG_R13, arg1);
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, ofs);
         /* load the data */
-        tcg_out_e3(s, op, arg, TCG_REG_R13, 0);
+        tcg_out_insn_RXY(s, op, data, base, TCG_REG_R13, 0);
     } else {
         /* load the data */
-        tcg_out_e3(s, op, arg, arg1, arg2);
+        tcg_out_insn_RXY(s, op, data, base, 0, ofs);
     }
 }
 
@@ -377,23 +399,23 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     }
 
 #if TARGET_LONG_BITS == 32
-    tcg_out_b9(s, B9_LLGFR, arg1, addr_reg);
-    tcg_out_b9(s, B9_LLGFR, arg0, addr_reg);
+    tcg_out_insn(s, RRE, LLGFR, arg1, addr_reg);
+    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
 #else
     tcg_out_mov(s, arg1, addr_reg);
     tcg_out_mov(s, arg0, addr_reg);
 #endif
 
-    tcg_out_sh64(s, SH64_SRLG, arg1, addr_reg, SH64_REG_NONE,
+    tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, SH64_REG_NONE,
                  TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
 
     tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
                  TARGET_PAGE_MASK | ((1 << s_bits) - 1));
-    tcg_out_b9(s, B9_NGR, arg0, TCG_REG_R13);
+    tcg_out_insn(s, RRE, NGR, arg0, TCG_REG_R13);
 
     tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
                  (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
-    tcg_out_b9(s, B9_NGR, arg1, TCG_REG_R13);
+    tcg_out_insn(s, RRE, NGR, arg1, TCG_REG_R13);
 
     if (is_store) {
         tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
@@ -402,20 +424,20 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
         tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
                      offsetof(CPUState, tlb_table[mem_index][0].addr_read));
     }
-    tcg_out_b9(s, B9_AGR, arg1, TCG_REG_R13);
+    tcg_out_insn(s, RRE, AGR, arg1, TCG_REG_R13);
 
-    tcg_out_b9(s, B9_AGR, arg1, TCG_AREG0);
+    tcg_out_insn(s, RRE, AGR, arg1, TCG_AREG0);
 
-    tcg_out_e3(s, E3_CG, arg0, arg1, 0);
+    tcg_out_insn(s, RXY, CG, arg0, arg1, 0, 0);
 
     label1_ptr = (uint16_t*)s->code_ptr;
 
     /* je label1 (offset will be patched in later) */
-    tcg_out32(s, 0xa7840000);
+    tcg_out_insn(s, RI, BRC, 8, 0);
 
     /* call load/store helper */
 #if TARGET_LONG_BITS == 32
-    tcg_out_b9(s, B9_LLGFR, arg0, addr_reg);
+    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
 #else
     tcg_out_mov(s, arg0, addr_reg);
 #endif
@@ -425,25 +447,25 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
         tcg_out_movi(s, TCG_TYPE_I32, arg2, mem_index);
         tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
                      (tcg_target_ulong)qemu_st_helpers[s_bits]);
-        tcg_out_rr(s, RR_BASR, TCG_REG_R14, TCG_REG_R13);
+        tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_REG_R13);
     } else {
         tcg_out_movi(s, TCG_TYPE_I32, arg1, mem_index);
         tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
                      (tcg_target_ulong)qemu_ld_helpers[s_bits]);
-        tcg_out_rr(s, RR_BASR, TCG_REG_R14, TCG_REG_R13);
+        tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_REG_R13);
 
         /* sign extension */
         switch (opc) {
         case LD_INT8:
-            tcg_out_sh64(s, SH64_SLLG, data_reg, arg0, SH64_REG_NONE, 56);
-            tcg_out_sh64(s, SH64_SRAG, data_reg, data_reg, SH64_REG_NONE, 56);
+            tcg_out_sh64(s, RSY_SLLG, data_reg, arg0, SH64_REG_NONE, 56);
+            tcg_out_sh64(s, RSY_SRAG, data_reg, data_reg, SH64_REG_NONE, 56);
             break;
         case LD_INT16:
-            tcg_out_sh64(s, SH64_SLLG, data_reg, arg0, SH64_REG_NONE, 48);
-            tcg_out_sh64(s, SH64_SRAG, data_reg, data_reg, SH64_REG_NONE, 48);
+            tcg_out_insn(s, RSY, SLLG, data_reg, arg0, SH64_REG_NONE, 48);
+            tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, SH64_REG_NONE, 48);
             break;
         case LD_INT32:
-            tcg_out_b9(s, B9_LGFR, data_reg, arg0);
+            tcg_out_insn(s, RRE, LGFR, data_reg, arg0);
             break;
         default:
             /* unsigned -> just copy */
@@ -455,33 +477,34 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     /* jump to label2 (end) */
     *label2_ptr_p = (uint16_t*)s->code_ptr;
 
-    /* bras %r13, label2 */
-    tcg_out32(s, 0xa7d50000);
+    tcg_out_insn(s, RI, BRC, 15, 0);
 
     /* this is label1, patch branch */
     *(label1_ptr + 1) = ((unsigned long)s->code_ptr -
                          (unsigned long)label1_ptr) >> 1;
 
     if (is_store) {
-        tcg_out_e3(s, E3_LG, arg1, arg1, offsetof(CPUTLBEntry, addend) -
-                                         offsetof(CPUTLBEntry, addr_write));
+        tcg_out_insn(s, RXY, LG, arg1, arg1, 0,
+                     offsetof(CPUTLBEntry, addend)
+                     - offsetof(CPUTLBEntry, addr_write));
     } else {
-        tcg_out_e3(s, E3_LG, arg1, arg1, offsetof(CPUTLBEntry, addend) -
-                                         offsetof(CPUTLBEntry, addr_read));
+        tcg_out_insn(s, RXY, LG, arg1, arg1, 0,
+                     offsetof(CPUTLBEntry, addend)
+                     - offsetof(CPUTLBEntry, addr_read));
     }
 
 #if TARGET_LONG_BITS == 32
     /* zero upper 32 bits */
-    tcg_out_b9(s, B9_LLGFR, arg0, addr_reg);
+    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
 #else
     /* just copy */
     tcg_out_mov(s, arg0, addr_reg);
 #endif
-    tcg_out_b9(s, B9_AGR, arg0, arg1);
-  }
-  
+    tcg_out_insn(s, RRE, AGR, arg0, arg1);
+}
+
 static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
-  {
+{
     /* patch branch */
     *(label2_ptr + 1) = ((unsigned long)s->code_ptr -
                          (unsigned long)label2_ptr) >> 1;
@@ -497,7 +520,7 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
 
     /* user mode, no address translation required */
     if (TARGET_LONG_BITS == 32) {
-        tcg_out_b9(s, B9_LLGFR, arg0, addr_reg);
+        tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
     } else {
         tcg_out_mov(s, arg0, addr_reg);
     }
@@ -529,54 +552,54 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 
     switch (opc) {
     case LD_UINT8:
-        tcg_out_e3(s, E3_LLGC, data_reg, arg0, 0);
+        tcg_out_insn(s, RXY, LLGC, data_reg, arg0, 0, 0);
         break;
     case LD_INT8:
-        tcg_out_e3(s, E3_LGB, data_reg, arg0, 0);
+        tcg_out_insn(s, RXY, LGB, data_reg, arg0, 0, 0);
         break;
     case LD_UINT16:
 #ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_e3(s, E3_LLGH, data_reg, arg0, 0);
+        tcg_out_insn(s, RXY, LLGH, data_reg, arg0, 0, 0);
 #else
         /* swapped unsigned halfword load with upper bits zeroed */
-        tcg_out_e3(s, E3_LRVH, data_reg, arg0, 0);
+        tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
         tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, 0xffffL);
-        tcg_out_b9(s, B9_NGR, data_reg, 13);
+        tcg_out_insn(s, RRE, NGR, data_reg, TCG_REG_R13);
 #endif
         break;
     case LD_INT16:
 #ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_e3(s, E3_LGH, data_reg, arg0, 0);
+        tcg_out_insn(s, RXY, LGH, data_reg, arg0, 0, 0);
 #else
         /* swapped sign-extended halfword load */
-        tcg_out_e3(s, E3_LRVH, data_reg, arg0, 0);
-        tcg_out_sh64(s, SH64_SLLG, data_reg, data_reg, SH64_REG_NONE, 48);
-        tcg_out_sh64(s, SH64_SRAG, data_reg, data_reg, SH64_REG_NONE, 48);
+        tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
+        tcg_out_sh64(s, RSY_SLLG, data_reg, data_reg, SH64_REG_NONE, 48);
+        tcg_out_sh64(s, RSY_SRAG, data_reg, data_reg, SH64_REG_NONE, 48);
 #endif
         break;
     case LD_UINT32:
 #ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_e3(s, E3_LLGF, data_reg, arg0, 0);
+        tcg_out_insn(s, RXY, LLGF, data_reg, arg0, 0, 0);
 #else
         /* swapped unsigned int load with upper bits zeroed */
-        tcg_out_e3(s, E3_LRV, data_reg, arg0, 0);
-        tcg_out_b9(s, B9_LLGFR, data_reg, data_reg);
+        tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
+        tcg_out_insn(s, RRE, LLGFR, data_reg, data_reg);
 #endif
         break;
     case LD_INT32:
 #ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_e3(s, E3_LGF, data_reg, arg0, 0);
+        tcg_out_insn(s, RXY, LGF, data_reg, arg0, 0, 0);
 #else
         /* swapped sign-extended int load */
-        tcg_out_e3(s, E3_LRV, data_reg, arg0, 0);
-        tcg_out_b9(s, B9_LGFR, data_reg, data_reg);
+        tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
+        tcg_out_insn(s, RRE, LGFR, data_reg, data_reg);
 #endif
         break;
     case LD_UINT64:
 #ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_e3(s, E3_LG, data_reg, arg0, 0);
+        tcg_out_insn(s, RXY, LG, data_reg, arg0, 0, 0);
 #else
-        tcg_out_e3(s, E3_LRVG, data_reg, arg0, 0);
+        tcg_out_insn(s, RXY, LRVG, data_reg, arg0, 0, 0);
 #endif
         break;
     default:
@@ -604,27 +627,27 @@ static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
 
     switch (opc) {
     case LD_UINT8:
-        tcg_out_store(s, ST_STC, data_reg, arg0, 0);
+        tcg_out_insn(s, RX, STC, data_reg, arg0, 0, 0);
         break;
     case LD_UINT16:
 #ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_store(s, ST_STH, data_reg, arg0, 0);
+        tcg_out_insn(s, RX, STH, data_reg, arg0, 0, 0);
 #else
-        tcg_out_e3(s, E3_STRVH, data_reg, arg0, 0);
+        tcg_out_insn(s, RXY, STRVH, data_reg, arg0, 0, 0);
 #endif
         break;
     case LD_UINT32:
 #ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_store(s, ST_ST, data_reg, arg0, 0);
+        tcg_out_insn(s, RX, ST, data_reg, arg0, 0, 0);
 #else
-        tcg_out_e3(s, E3_STRV, data_reg, arg0, 0);
+        tcg_out_insn(s, RXY, STRV, data_reg, arg0, 0, 0);
 #endif
         break;
     case LD_UINT64:
 #ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_e3(s, E3_STG, data_reg, arg0, 0);
+        tcg_out_insn(s, RXY, STG, data_reg, arg0, 0, 0);
 #else
-        tcg_out_e3(s, E3_STRVG, data_reg, arg0, 0);
+        tcg_out_insn(s, RXY, STRVG, data_reg, arg0, 0, 0);
 #endif
         break;
     default:
@@ -642,17 +665,17 @@ static inline void tcg_out_st(TCGContext *s, TCGType type, int arg,
     if (type == TCG_TYPE_I32) {
         if (((long)arg2) < -0x800 || ((long)arg2) > 0x7ff) {
             tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, arg2);
-            tcg_out_b9(s, B9_AGR, 13, arg1);
-            tcg_out_store(s, ST_ST, arg, TCG_REG_R13, 0);
+            tcg_out_insn(s, RRE, AGR, TCG_REG_R13, arg1);
+            tcg_out_insn(s, RX, ST, arg, TCG_REG_R13, 0, 0);
         } else {
-            tcg_out_store(s, ST_ST, arg, arg1, arg2);
+            tcg_out_insn(s, RX, ST, arg, arg1, 0, arg2);
         }
     }
     else {
         if (((long)arg2) < -0x80000 || ((long)arg2) > 0x7ffff) {
             tcg_abort();
         }
-        tcg_out_e3(s, E3_STG, arg, arg1, arg2);
+        tcg_out_insn(s, RXY, STG, arg, arg1, 0, arg2);
     }
 }
 
@@ -660,25 +683,18 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                 const TCGArg *args, const int *const_args)
 {
     TCGLabel* l;
-    int op;
-    int op2;
-
-    dprintf("0x%x\n", INDEX_op_divu_i32);
+    S390Opcode op, op2;
 
     switch (opc) {
     case INDEX_op_exit_tb:
-        dprintf("op 0x%x exit_tb 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         /* return value */
         tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R2, args[0]);
         tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, (unsigned long)tb_ret_addr);
         /* br %r13 */
-        tcg_out16(s, S390_INS_BR | TCG_REG_R13);
+        tcg_out_insn(s, RR, BCR, 15, TCG_REG_R13);
         break;
 
     case INDEX_op_goto_tb:
-        dprintf("op 0x%x goto_tb 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         if (s->tb_jmp_offset) {
             tcg_abort();
         } else {
@@ -686,9 +702,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                                    (tcg_target_long)s->code_ptr) >> 1;
             if (off > -0x80000000L && off < 0x7fffffffL) {
                 /* load address relative to PC */
-                /* larl %r13, off */
-                tcg_out16(s, S390_INS_LARL | (TCG_REG_R13 << 4));
-                tcg_out32(s, off);
+                tcg_out_insn(s, RIL, LARL, TCG_REG_R13, off);
             } else {
                 /* too far for larl */
                 tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
@@ -697,14 +711,12 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
             /* load address stored at s->tb_next + args[0] */
             tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R13, TCG_REG_R13, 0);
             /* and go there */
-            tcg_out_rr(s, RR_BASR, TCG_REG_R13, TCG_REG_R13);
+            tcg_out_insn(s, RR, BASR, TCG_REG_R13, TCG_REG_R13);
         }
         s->tb_next_offset[args[0]] = s->code_ptr - s->code_buf;
         break;
 
     case INDEX_op_call:
-        dprintf("op 0x%x call 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         if (const_args[0]) {
             tcg_target_long off;
 
@@ -718,27 +730,23 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
             } else {
                 /* too far for a relative call, load full address */
                 tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, args[0]);
-                tcg_out_rr(s, RR_BASR, TCG_REG_R14, TCG_REG_R13);
+                tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_REG_R13);
             }
         } else {
             /* call function in register args[0] */
-            tcg_out_rr(s, RR_BASR, TCG_REG_R14, args[0]);
+            tcg_out_insn(s, RR, BASR, TCG_REG_R14, args[0]);
         }
         break;
 
     case INDEX_op_jmp:
-        dprintf("op 0x%x jmp 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         /* XXX */
         tcg_abort();
         break;
 
     case INDEX_op_ld8u_i32:
     case INDEX_op_ld8u_i64:
-        dprintf("op 0x%x ld8u_i32 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         if ((long)args[2] > -0x80000 && (long)args[2] < 0x7ffff) {
-            tcg_out_e3(s, E3_LLGC, args[0], args[1], args[2]);
+            tcg_out_insn(s, RXY, LLGC, args[0], args[1], 0, args[2]);
         } else {
             /* XXX displacement too large, have to calculate address manually */
             tcg_abort();
@@ -746,17 +754,13 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_ld8s_i32:
-        dprintf("op 0x%x ld8s_i32 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         /* XXX */
         tcg_abort();
         break;
 
     case INDEX_op_ld16u_i32:
-        dprintf("op 0x%x ld16u_i32 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         if ((long)args[2] > -0x80000 && (long)args[2] < 0x7ffff) {
-            tcg_out_e3(s, E3_LLGH, args[0], args[1], args[2]);
+            tcg_out_insn(s, RXY, LLGH, args[0], args[1], 0, args[2]);
         } else {
             /* XXX displacement too large, have to calculate address manually */
             tcg_abort();
@@ -764,8 +768,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_ld16s_i32:
-        dprintf("op 0x%x ld16s_i32 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         /* XXX */
         tcg_abort();
         break;
@@ -780,12 +782,12 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
             /* load the displacement */
             tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, args[2]);
             /* add the address */
-            tcg_out_b9(s, B9_AGR, TCG_REG_R13, args[1]);
+            tcg_out_insn(s, RRE, AGR, TCG_REG_R13, args[1]);
             /* load the data (sign-extended) */
-            tcg_out_e3(s, E3_LGF, args[0], TCG_REG_R13, 0);
+            tcg_out_insn(s, RXY, LGF, args[0], TCG_REG_R13, 0, 0);
         } else {
             /* load the data (sign-extended) */
-            tcg_out_e3(s, E3_LGF, args[0], args[1], args[2]);
+            tcg_out_insn(s, RXY, LGF, args[0], args[1], 0, args[2]);
         }
         break;
 
@@ -795,13 +797,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_st8_i32:
     case INDEX_op_st8_i64:
-        dprintf("op 0x%x st8_i32 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         if (((long)args[2]) >= -0x800 && ((long)args[2]) < 0x800) {
-            tcg_out_store(s, ST_STC, args[0], args[1], args[2]);
+            tcg_out_insn(s, RX, STC, args[0], args[1], 0, args[2]);
         } else if (((long)args[2]) >= -0x80000 && ((long)args[2]) < 0x80000) {
             /* FIXME: requires long displacement facility */
-            tcg_out_e3(s, E3_STCY, args[0], args[1], args[2]);
+            tcg_out_insn(s, RXY, STCY, args[0], args[1], 0, args[2]);
             tcg_abort();
         } else {
             tcg_abort();
@@ -810,13 +810,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_st16_i32:
     case INDEX_op_st16_i64:
-        dprintf("op 0x%x st16_i32 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         if (((long)args[2]) >= -0x800 && ((long)args[2]) < 0x800) {
-            tcg_out_store(s, ST_STH, args[0], args[1], args[2]);
+            tcg_out_insn(s, RX, STH, args[0], args[1], 0, args[2]);
         } else if (((long)args[2]) >= -0x80000 && ((long)args[2]) < 0x80000) {
             /* FIXME: requires long displacement facility */
-            tcg_out_e3(s, E3_STHY, args[0], args[1], args[2]);
+            tcg_out_insn(s, RXY, STHY, args[0], args[1], 0, args[2]);
             tcg_abort();
         } else {
             tcg_abort();
@@ -833,15 +831,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_mov_i32:
-        dprintf("op 0x%x mov_i32 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         /* XXX */
         tcg_abort();
         break;
 
     case INDEX_op_movi_i32:
-        dprintf("op 0x%x movi_i32 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         /* XXX */
         tcg_abort();
         break;
@@ -849,164 +843,144 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_add_i32:
         if (const_args[2]) {
             if (args[0] == args[1]) {
-                tcg_out_a7(s, A7_AHI, args[1], args[2]);
+                tcg_out_insn(s, RI, AHI, args[1], args[2]);
             } else {
-                tcg_out_rr(s, RR_LR, args[0], args[1]);
-                tcg_out_a7(s, A7_AHI, args[0], args[2]);
+                tcg_out_insn(s, RR, LR, args[0], args[1]);
+                tcg_out_insn(s, RI, AHI, args[0], args[2]);
             }
         } else if (args[0] == args[1]) {
-            tcg_out_rr(s, RR_AR, args[1], args[2]);
+            tcg_out_insn(s, RR, AR, args[1], args[2]);
         } else if (args[0] == args[2]) {
-            tcg_out_rr(s, RR_AR, args[0], args[1]);
+            tcg_out_insn(s, RR, AR, args[0], args[1]);
         } else {
-            tcg_out_rr(s, RR_LR, args[0], args[1]);
-            tcg_out_rr(s, RR_AR, args[0], args[2]);
+            tcg_out_insn(s, RR, LR, args[0], args[1]);
+            tcg_out_insn(s, RR, AR, args[0], args[2]);
         }
         break;
 
     case INDEX_op_sub_i32:
-        dprintf("op 0x%x sub_i32 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         if (args[0] == args[1]) {
             /* sr %ra0/1, %ra2 */
-            tcg_out_rr(s, RR_SR, args[1], args[2]);
+            tcg_out_insn(s, RR, SR, args[1], args[2]);
         } else if (args[0] == args[2]) {
             /* lr %r13, %raa0/2 */
-            tcg_out_rr(s, RR_LR, TCG_REG_R13, args[2]);
+            tcg_out_insn(s, RR, LR, TCG_REG_R13, args[2]);
             /* lr %ra0/2, %ra1 */
-            tcg_out_rr(s, RR_LR, args[0], args[1]);
+            tcg_out_insn(s, RR, LR, args[0], args[1]);
             /* sr %ra0/2, %r13 */
-            tcg_out_rr(s, RR_SR, args[0], TCG_REG_R13);
+            tcg_out_insn(s, RR, SR, args[0], TCG_REG_R13);
         } else {
             /* lr %ra0, %ra1 */
-            tcg_out_rr(s, RR_LR, args[0], args[1]);
+            tcg_out_insn(s, RR, LR, args[0], args[1]);
             /* sr %ra0, %ra2 */
-            tcg_out_rr(s, RR_SR, args[0], args[2]);
+            tcg_out_insn(s, RR, SR, args[0], args[2]);
         }
         break;
 
     case INDEX_op_sub_i64:
-        dprintf("op 0x%x sub_i64 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         if (args[0] == args[1]) {
             /* sgr %ra0/1, %ra2 */
-            tcg_out_b9(s, B9_SGR, args[1], args[2]);
+            tcg_out_insn(s, RRE, SGR, args[1], args[2]);
         } else if (args[0] == args[2]) {
             tcg_out_mov(s, TCG_REG_R13, args[2]);
             tcg_out_mov(s, args[0], args[1]);
             /* sgr %ra0/2, %r13 */
-            tcg_out_b9(s, B9_SGR, args[0], TCG_REG_R13);
+            tcg_out_insn(s, RRE, SGR, args[0], TCG_REG_R13);
         } else {
             tcg_out_mov(s, args[0], args[1]);
             /* sgr %ra0, %ra2 */
-            tcg_out_b9(s, B9_SGR, args[0], args[2]);
+            tcg_out_insn(s, RRE, SGR, args[0], args[2]);
         }
         break;
 
     case INDEX_op_add_i64:
-        dprintf("op 0x%x add_i64 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         if (args[0] == args[1]) {
-            tcg_out_b9(s, B9_AGR, args[1], args[2]);
+            tcg_out_insn(s, RRE, AGR, args[1], args[2]);
         } else if (args[0] == args[2]) {
-            tcg_out_b9(s, B9_AGR, args[0], args[1]);
+            tcg_out_insn(s, RRE, AGR, args[0], args[1]);
         } else {
             tcg_out_mov(s, args[0], args[1]);
-            tcg_out_b9(s, B9_AGR, args[0], args[2]);
+            tcg_out_insn(s, RRE, AGR, args[0], args[2]);
         }
         break;
 
     case INDEX_op_and_i32:
         op = RR_NR;
-do_logic_i32:
+    do_logic_i32:
         if (args[0] == args[1]) {
             /* xr %ra0/1, %ra2 */
-            tcg_out_rr(s, op, args[1], args[2]);
+            tcg_out_insn_RR(s, op, args[1], args[2]);
         } else if (args[0] == args[2]) {
             /* xr %ra0/2, %ra1 */
-            tcg_out_rr(s, op, args[0], args[1]);
+            tcg_out_insn_RR(s, op, args[0], args[1]);
         } else {
             /* lr %ra0, %ra1 */
-            tcg_out_rr(s, RR_LR, args[0], args[1]);
+            tcg_out_insn(s, RR, LR, args[0], args[1]);
             /* xr %ra0, %ra2 */
-            tcg_out_rr(s, op, args[0], args[2]);
+            tcg_out_insn_RR(s, op, args[0], args[2]);
         }
         break;
 
     case INDEX_op_or_i32:
         op = RR_OR;
         goto do_logic_i32;
-
     case INDEX_op_xor_i32:
         op = RR_XR;
         goto do_logic_i32;
 
     case INDEX_op_and_i64:
-        dprintf("op 0x%x and_i64 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
-        op = B9_NGR;
-do_logic_i64:
+        op = RRE_NGR;
+    do_logic_i64:
         if (args[0] == args[1]) {
-            tcg_out_b9(s, op, args[0], args[2]);
+            tcg_out_insn_RRE(s, op, args[0], args[2]);
         } else if (args[0] == args[2]) {
-            tcg_out_b9(s, op, args[0], args[1]);
+            tcg_out_insn_RRE(s, op, args[0], args[1]);
         } else {
             tcg_out_mov(s, args[0], args[1]);
-            tcg_out_b9(s, op, args[0], args[2]);
+            tcg_out_insn_RRE(s, op, args[0], args[2]);
         }
         break;
 
     case INDEX_op_or_i64:
-        op = B9_OGR;
+        op = RRE_OGR;
         goto do_logic_i64;
-
     case INDEX_op_xor_i64:
-        op = B9_XGR;
+        op = RRE_XGR;
         goto do_logic_i64;
 
     case INDEX_op_neg_i32:
-        dprintf("op 0x%x neg_i32 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         /* FIXME: optimize args[0] != args[1] case */
-        tcg_out_rr(s, RR_LR, 13, args[1]);
-        /* lghi %ra0, 0 */
-        tcg_out32(s, S390_INS_LGHI | (args[0] << 20));
-        tcg_out_rr(s, RR_SR, args[0], 13);
+        tcg_out_insn(s, RR, LR, 13, args[1]);
+        tcg_out_movi(s, TCG_TYPE_I32, args[0], 0);
+        tcg_out_insn(s, RR, SR, args[0], 13);
         break;
 
     case INDEX_op_neg_i64:
-        dprintf("op 0x%x neg_i64 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         /* FIXME: optimize args[0] != args[1] case */
         tcg_out_mov(s, TCG_REG_R13, args[1]);
-        /* lghi %ra0, 0 */
-        tcg_out32(s, S390_INS_LGHI | (args[0] << 20));
-        tcg_out_b9(s, B9_SGR, args[0], TCG_REG_R13);
+        tcg_out_movi(s, TCG_TYPE_I64, args[0], 0);
+        tcg_out_insn(s, RRE, SGR, args[0], TCG_REG_R13);
         break;
 
     case INDEX_op_mul_i32:
-        dprintf("op 0x%x mul_i32 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         if (args[0] == args[1])
           /* msr %ra0/1, %ra2 */
-          tcg_out32(s, S390_INS_MSR | (args[0] << 4) | args[2]);
+          tcg_out_insn(s, RRE, MSR, args[0], args[2]);
         else if (args[0] == args[2])
           /* msr %ra0/2, %ra1 */
-          tcg_out32(s, S390_INS_MSR | (args[0] << 4) | args[1]);
+          tcg_out_insn(s, RRE, MSR, args[0], args[1]);
         else {
-          tcg_out_rr(s, RR_LR, args[0], args[1]);
+          tcg_out_insn(s, RR, LR, args[0], args[1]);
           /* msr %ra0, %ra2 */
-          tcg_out32(s, S390_INS_MSR | (args[0] << 4) | args[2]);
+          tcg_out_insn(s, RRE, MSR, args[0], args[2]);
         }
         break;
 
     case INDEX_op_mul_i64:
-        dprintf("op 0x%x mul_i64 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         if (args[0] == args[1]) {
-            tcg_out_b9(s, B9_MSGR, args[0], args[2]);
+            tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
         } else if (args[0] == args[2]) {
-            tcg_out_b9(s, B9_MSGR, args[0], args[1]);
+            tcg_out_insn(s, RRE, MSGR, args[0], args[1]);
         } else {
             /* XXX */
             tcg_abort();
@@ -1015,27 +989,25 @@ do_logic_i64:
 
     case INDEX_op_divu_i32:
     case INDEX_op_remu_i32:
-        dprintf("op 0x%x div/remu_i32 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R12, 0);
-        tcg_out_rr(s, RR_LR, TCG_REG_R13, args[1]);
-        tcg_out_b9(s, B9_DLR, TCG_REG_R12, args[2]);
+        tcg_out_insn(s, RR, LR, TCG_REG_R13, args[1]);
+        tcg_out_insn(s, RRE, DLR, TCG_REG_R12, args[2]);
         if (opc == INDEX_op_divu_i32) {
-          tcg_out_rr(s, RR_LR, args[0], TCG_REG_R13);        /* quotient */
+          tcg_out_insn(s, RR, LR, args[0], TCG_REG_R13);        /* quotient */
         } else {
-          tcg_out_rr(s, RR_LR, args[0], TCG_REG_R12);        /* remainder */
+          tcg_out_insn(s, RR, LR, args[0], TCG_REG_R12);        /* remainder */
         }
         break;
 
     case INDEX_op_shl_i32:
-        op = SH32_SLL;
-        op2 = SH64_SLLG;
- do_shift32:
+        op = RS_SLL;
+        op2 = RSY_SLLG;
+    do_shift32:
         if (const_args[2]) {
             if (args[0] == args[1]) {
                 tcg_out_sh32(s, op, args[0], SH32_REG_NONE, args[2]);
             } else {
-                tcg_out_rr(s, RR_LR, args[0], args[1]);
+                tcg_out_insn(s, RR, LR, args[0], args[1]);
                 tcg_out_sh32(s, op, args[0], SH32_REG_NONE, args[2]);
             }
         } else {
@@ -1048,18 +1020,18 @@ do_logic_i64:
         break;
 
     case INDEX_op_shr_i32:
-        op = SH32_SRL;
-        op2 = SH64_SRLG;
+        op = RS_SRL;
+        op2 = RSY_SRLG;
         goto do_shift32;
 
     case INDEX_op_sar_i32:
-        op = SH32_SRA;
-        op2 = SH64_SRAG;
+        op = RS_SRA;
+        op2 = RSY_SRAG;
         goto do_shift32;
 
     case INDEX_op_shl_i64:
-        op = SH64_SLLG;
- do_shift64:
+        op = RSY_SLLG;
+    do_shift64:
         if (const_args[2]) {
             tcg_out_sh64(s, op, args[0], args[1], SH64_REG_NONE, args[2]);
         } else {
@@ -1068,67 +1040,60 @@ do_logic_i64:
         break;
 
     case INDEX_op_shr_i64:
-        op = SH64_SRLG;
+        op = RSY_SRLG;
         goto do_shift64;
 
     case INDEX_op_sar_i64:
-        op = SH64_SRAG;
+        op = RSY_SRAG;
         goto do_shift64;
 
     case INDEX_op_br:
-        dprintf("op 0x%x br 0x%lx 0x%lx 0x%lx\n",
-                opc, args[0], args[1], args[2]);
         l = &s->labels[args[0]];
         if (l->has_value) {
             tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, l->u.value);
         } else {
             /* larl %r13, ... */
-            tcg_out16(s, S390_INS_LARL | (TCG_REG_R13 << 4));
+            tcg_out16(s, RIL_LARL | (TCG_REG_R13 << 4));
             tcg_out_reloc(s, s->code_ptr, R_390_PC32DBL, args[0], -2);
             s->code_ptr += 4;
         }
-        tcg_out_rr(s, RR_BASR, TCG_REG_R13, TCG_REG_R13);
+        tcg_out_insn(s, RR, BASR, TCG_REG_R13, TCG_REG_R13);
         break;
 
     case INDEX_op_brcond_i64:
-        dprintf("op 0x%x brcond_i64 0x%lx 0x%lx (c %d) 0x%lx\n",
-                opc, args[0], args[1], const_args[1], args[2]);
         if (args[2] > TCG_COND_GT) {
-          /* unsigned */
-          /* clgr %ra0, %ra1 */
-          tcg_out_b9(s, B9_CLGR, args[0], args[1]);
+            /* unsigned */
+            /* clgr %ra0, %ra1 */
+            tcg_out_insn(s, RRE, CLGR, args[0], args[1]);
         } else {
-          /* signed */
-          /* cgr %ra0, %ra1 */
-          tcg_out_b9(s, B9_CGR, args[0], args[1]);
+            /* signed */
+            /* cgr %ra0, %ra1 */
+            tcg_out_insn(s, RRE, CGR, args[0], args[1]);
         }
         goto do_brcond;
 
     case INDEX_op_brcond_i32:
-        dprintf("op 0x%x brcond_i32 0x%lx 0x%lx (c %d) 0x%lx\n",
-                opc, args[0], args[1], const_args[1], args[2]);
         if (args[2] > TCG_COND_GT) {
-          /* unsigned */
-          /* clr %ra0, %ra1 */
-          tcg_out_rr(s, RR_CLR, args[0], args[1]);
+            /* unsigned */
+            /* clr %ra0, %ra1 */
+            tcg_out_insn(s, RR, CLR, args[0], args[1]);
         } else {
-          /* signed */
-          /* cr %ra0, %ra1 */
-          tcg_out_rr(s, RR_CR, args[0], args[1]);
+            /* signed */
+            /* cr %ra0, %ra1 */
+            tcg_out_insn(s, RR, CR, args[0], args[1]);
         }
- do_brcond:
+    do_brcond:
         l = &s->labels[args[3]];
         if (l->has_value) {
             tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, l->u.value);
         } else {
             /* larl %r13, ... */
-            tcg_out16(s, S390_INS_LARL | (TCG_REG_R13 << 4));
+            tcg_out16(s, RIL_LARL | (TCG_REG_R13 << 4));
             tcg_out_reloc(s, s->code_ptr, R_390_PC32DBL, args[3], -2);
             s->code_ptr += 4;
         }
         /* bcr cond, %r13 */
-        tcg_out16(s, S390_INS_BCR | TCG_REG_R13 |
-                     (tcg_cond_to_s390_cond[args[2]] << 4));
+        tcg_out_insn(s, RR, BCR, tcg_cond_to_s390_cond[args[2]], TCG_REG_R13);
         break;
 
     case INDEX_op_qemu_ld8u:
@@ -1306,23 +1271,21 @@ void tcg_target_init(TCGContext *s)
 void tcg_target_qemu_prologue(TCGContext *s)
 {
     /* stmg %r6,%r15,48(%r15) (save registers) */
-    tcg_out16(s, 0xeb6f);
-    tcg_out32(s, 0xf0300024);
+    tcg_out_insn(s, RXY, STMG, TCG_REG_R6, TCG_REG_R15, TCG_REG_R15, 48);
 
     /* aghi %r15,-160 (stack frame) */
-    tcg_out32(s, 0xa7fbff60);
+    tcg_out_insn(s, RI, AGHI, TCG_REG_R15, -160);
 
     /* br %r2 (go to TB) */
-    tcg_out16(s, S390_INS_BR | TCG_REG_R2);
+    tcg_out_insn(s, RR, BCR, 15, TCG_REG_R2);
 
     tb_ret_addr = s->code_ptr;
 
     /* lmg %r6,%r15,208(%r15) (restore registers) */
-    tcg_out16(s, 0xeb6f);
-    tcg_out32(s, 0xf0d00004);
+    tcg_out_insn(s, RXY, LMG, TCG_REG_R6, TCG_REG_R15, TCG_REG_R15, 208);
 
     /* br %r14 (return) */
-    tcg_out16(s, S390_INS_BR | TCG_REG_R14);
+    tcg_out_insn(s, RR, BCR, 15, TCG_REG_R14);
 }
 
 static inline void tcg_out_addi(TCGContext *s, int reg, tcg_target_long val)
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 7495258..c81f886 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -26,7 +26,7 @@
 #define TCG_TARGET_REG_BITS 64
 #define TCG_TARGET_WORDS_BIGENDIAN
 
-enum {
+typedef enum TCGReg {
     TCG_REG_R0 = 0,
     TCG_REG_R1,
     TCG_REG_R2,
@@ -43,7 +43,8 @@ enum {
     TCG_REG_R13,
     TCG_REG_R14,
     TCG_REG_R15
-};
+} TCGReg;
+
 #define TCG_TARGET_NB_REGS 16
 
 /* optional instructions */
-- 
1.7.0.1

* [Qemu-devel] [PATCH 18/62] tcg-s390: Use matching constraints.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (16 preceding siblings ...)
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 17/62] tcg-s390: Reorganize instruction emission Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 19/62] tcg-s390: Fixup qemu_ld/st opcodes Richard Henderson
                   ` (44 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Simplify code generation within tcg_out_op by using matching ("0")
constraints to force arg1 == arg0 for the two-operand instructions.

In addition, stop using the 64-bit shift insns to implement the
32-bit shifts.  Doing so would yield incorrect results for the right shifts.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
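Editorial sketch, not part of the patch: the commit message says that
forcing arg1 == arg0 simplifies emission for the two-operand insns.
The toy model below counts the host insns needed for "dst = a - b" with
and without that guarantee.  The function names are hypothetical and the
insn counts mirror the sub_i32 cases removed by this patch; it is an
illustration under those assumptions, not QEMU code.

```c
#include <assert.h>

/* Toy model: number of host insns a two-operand backend emits for
   "dst = a - b" when the register allocator gives no aliasing guarantee. */
static int emit_sub_unconstrained(int dst, int a, int b)
{
    if (dst == a) {
        return 1;   /* sr dst, b */
    } else if (dst == b) {
        return 3;   /* lr tmp, b ; lr dst, a ; sr dst, tmp */
    }
    return 2;       /* lr dst, a ; sr dst, b */
}

/* With a matching constraint on input 1 the allocator has already made
   dst == a, so a single case remains. */
static int emit_sub_matching(int dst, int a, int b)
{
    (void)b;
    assert(dst == a);   /* guaranteed by the constraint in this model */
    return 1;           /* sr dst, b */
}
```

In this model any copy needed to make the operands match becomes the
register allocator's job rather than the emitter's, which is why the
per-opcode case analysis can be dropped.
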
 tcg/s390/tcg-target.c |  181 +++++++++++--------------------------------------
 1 files changed, 39 insertions(+), 142 deletions(-)
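
Editorial sketch, not part of the patch: one way to see why a 64-bit
shift is unsafe for a 32-bit right shift.  It assumes (my reading, not
stated in the commit message) that the high half of a 64-bit register
holding a 32-bit value is not guaranteed to be zero.  The helpers are
hypothetical models of SRL-like and SRLG-like behaviour, written in
portable C.

```c
#include <stdint.h>

/* 32-bit logical right shift: only the low half of the register is used. */
static uint32_t shr32_narrow(uint64_t reg, unsigned n)
{
    return (uint32_t)reg >> n;          /* n must be in 0..31 */
}

/* 64-bit logical right shift, then read the low half: bits from the high
   half of the register are shifted down into the 32-bit result. */
static uint32_t shr32_via_wide(uint64_t reg, unsigned n)
{
    return (uint32_t)(reg >> n);
}

/* Left shifts move bits upward, so the high half never reaches the
   low 32 bits and both forms agree. */
static uint32_t shl32_narrow(uint64_t reg, unsigned n)
{
    return (uint32_t)reg << n;
}

static uint32_t shl32_via_wide(uint64_t reg, unsigned n)
{
    return (uint32_t)(reg << n);
}
```

With reg = 0xdeadbeef80000000 and n = 4 the narrow shift gives
0x08000000 while the wide one gives 0xf8000000; the left-shift pair
agrees for any high half.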

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 0deb332..c45d8b5 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -683,7 +683,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                 const TCGArg *args, const int *const_args)
 {
     TCGLabel* l;
-    S390Opcode op, op2;
+    S390Opcode op;
 
     switch (opc) {
     case INDEX_op_exit_tb:
@@ -842,111 +842,43 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_add_i32:
         if (const_args[2]) {
-            if (args[0] == args[1]) {
-                tcg_out_insn(s, RI, AHI, args[1], args[2]);
-            } else {
-                tcg_out_insn(s, RR, LR, args[0], args[1]);
-                tcg_out_insn(s, RI, AHI, args[0], args[2]);
-            }
-        } else if (args[0] == args[1]) {
-            tcg_out_insn(s, RR, AR, args[1], args[2]);
-        } else if (args[0] == args[2]) {
-            tcg_out_insn(s, RR, AR, args[0], args[1]);
+            tcg_out_insn(s, RI, AHI, args[0], args[2]);
         } else {
-            tcg_out_insn(s, RR, LR, args[0], args[1]);
             tcg_out_insn(s, RR, AR, args[0], args[2]);
         }
         break;
 
-    case INDEX_op_sub_i32:
-        if (args[0] == args[1]) {
-            /* sr %ra0/1, %ra2 */
-            tcg_out_insn(s, RR, SR, args[1], args[2]);
-        } else if (args[0] == args[2]) {
-            /* lr %r13, %raa0/2 */
-            tcg_out_insn(s, RR, LR, TCG_REG_R13, args[2]);
-            /* lr %ra0/2, %ra1 */
-            tcg_out_insn(s, RR, LR, args[0], args[1]);
-            /* sr %ra0/2, %r13 */
-            tcg_out_insn(s, RR, SR, args[0], TCG_REG_R13);
-        } else {
-            /* lr %ra0, %ra1 */
-            tcg_out_insn(s, RR, LR, args[0], args[1]);
-            /* sr %ra0, %ra2 */
-            tcg_out_insn(s, RR, SR, args[0], args[2]);
-        }
+    case INDEX_op_add_i64:
+        tcg_out_insn(s, RRE, AGR, args[0], args[2]);
         break;
 
-    case INDEX_op_sub_i64:
-        if (args[0] == args[1]) {
-            /* sgr %ra0/1, %ra2 */
-            tcg_out_insn(s, RRE, SGR, args[1], args[2]);
-        } else if (args[0] == args[2]) {
-            tcg_out_mov(s, TCG_REG_R13, args[2]);
-            tcg_out_mov(s, args[0], args[1]);
-            /* sgr %ra0/2, %r13 */
-            tcg_out_insn(s, RRE, SGR, args[0], TCG_REG_R13);
-        } else {
-            tcg_out_mov(s, args[0], args[1]);
-            /* sgr %ra0, %ra2 */
-            tcg_out_insn(s, RRE, SGR, args[0], args[2]);
-        }
+    case INDEX_op_sub_i32:
+        tcg_out_insn(s, RR, SR, args[0], args[2]);
         break;
 
-    case INDEX_op_add_i64:
-        if (args[0] == args[1]) {
-            tcg_out_insn(s, RRE, AGR, args[1], args[2]);
-        } else if (args[0] == args[2]) {
-            tcg_out_insn(s, RRE, AGR, args[0], args[1]);
-        } else {
-            tcg_out_mov(s, args[0], args[1]);
-            tcg_out_insn(s, RRE, AGR, args[0], args[2]);
-        }
+    case INDEX_op_sub_i64:
+        tcg_out_insn(s, RRE, SGR, args[0], args[2]);
         break;
 
     case INDEX_op_and_i32:
-        op = RR_NR;
-    do_logic_i32:
-        if (args[0] == args[1]) {
-            /* xr %ra0/1, %ra2 */
-            tcg_out_insn_RR(s, op, args[1], args[2]);
-        } else if (args[0] == args[2]) {
-            /* xr %ra0/2, %ra1 */
-            tcg_out_insn_RR(s, op, args[0], args[1]);
-        } else {
-            /* lr %ra0, %ra1 */
-            tcg_out_insn(s, RR, LR, args[0], args[1]);
-            /* xr %ra0, %ra2 */
-            tcg_out_insn_RR(s, op, args[0], args[2]);
-        }
+        tcg_out_insn(s, RR, NR, args[0], args[2]);
         break;
-
     case INDEX_op_or_i32:
-        op = RR_OR;
-        goto do_logic_i32;
+        tcg_out_insn(s, RR, OR, args[0], args[2]);
+        break;
     case INDEX_op_xor_i32:
-        op = RR_XR;
-        goto do_logic_i32;
+        tcg_out_insn(s, RR, XR, args[0], args[2]);
+        break;
 
     case INDEX_op_and_i64:
-        op = RRE_NGR;
-    do_logic_i64:
-        if (args[0] == args[1]) {
-            tcg_out_insn_RRE(s, op, args[0], args[2]);
-        } else if (args[0] == args[2]) {
-            tcg_out_insn_RRE(s, op, args[0], args[1]);
-        } else {
-            tcg_out_mov(s, args[0], args[1]);
-            tcg_out_insn_RRE(s, op, args[0], args[2]);
-        }
+        tcg_out_insn(s, RRE, NGR, args[0], args[2]);
         break;
-
     case INDEX_op_or_i64:
-        op = RRE_OGR;
-        goto do_logic_i64;
+        tcg_out_insn(s, RRE, OGR, args[0], args[2]);
+        break;
     case INDEX_op_xor_i64:
-        op = RRE_XGR;
-        goto do_logic_i64;
+        tcg_out_insn(s, RRE, XGR, args[0], args[2]);
+        break;
 
     case INDEX_op_neg_i32:
         /* FIXME: optimize args[0] != args[1] case */
@@ -954,7 +886,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_movi(s, TCG_TYPE_I32, args[0], 0);
         tcg_out_insn(s, RR, SR, args[0], 13);
         break;
-
     case INDEX_op_neg_i64:
         /* FIXME: optimize args[0] != args[1] case */
         tcg_out_mov(s, TCG_REG_R13, args[1]);
@@ -963,28 +894,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_mul_i32:
-        if (args[0] == args[1])
-          /* msr %ra0/1, %ra2 */
-          tcg_out_insn(s, RRE, MSR, args[0], args[2]);
-        else if (args[0] == args[2])
-          /* msr %ra0/2, %ra1 */
-          tcg_out_insn(s, RRE, MSR, args[0], args[1]);
-        else {
-          tcg_out_insn(s, RR, LR, args[0], args[1]);
-          /* msr %ra0, %ra2 */
-          tcg_out_insn(s, RRE, MSR, args[0], args[2]);
-        }
+        tcg_out_insn(s, RRE, MSR, args[0], args[2]);
         break;
-
     case INDEX_op_mul_i64:
-        if (args[0] == args[1]) {
-            tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
-        } else if (args[0] == args[2]) {
-            tcg_out_insn(s, RRE, MSGR, args[0], args[1]);
-        } else {
-            /* XXX */
-            tcg_abort();
-        }
+        tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
         break;
 
     case INDEX_op_divu_i32:
@@ -1001,32 +914,18 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_shl_i32:
         op = RS_SLL;
-        op2 = RSY_SLLG;
     do_shift32:
         if (const_args[2]) {
-            if (args[0] == args[1]) {
-                tcg_out_sh32(s, op, args[0], SH32_REG_NONE, args[2]);
-            } else {
-                tcg_out_insn(s, RR, LR, args[0], args[1]);
-                tcg_out_sh32(s, op, args[0], SH32_REG_NONE, args[2]);
-            }
+            tcg_out_sh32(s, op, args[0], SH32_REG_NONE, args[2]);
         } else {
-            if (args[0] == args[1]) {
-                tcg_out_sh32(s, op, args[0], args[2], 0);
-            } else {
-                tcg_out_sh64(s, op2, args[0], args[1], args[2], 0);
-            }
+            tcg_out_sh32(s, op, args[0], args[2], 0);
         }
         break;
-
     case INDEX_op_shr_i32:
         op = RS_SRL;
-        op2 = RSY_SRLG;
         goto do_shift32;
-
     case INDEX_op_sar_i32:
         op = RS_SRA;
-        op2 = RSY_SRAG;
         goto do_shift32;
 
     case INDEX_op_shl_i64:
@@ -1038,11 +937,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
             tcg_out_sh64(s, op, args[0], args[1], args[2], 0);
         }
         break;
-
     case INDEX_op_shr_i64:
         op = RSY_SRLG;
         goto do_shift64;
-
     case INDEX_op_sar_i64:
         op = RSY_SRAG;
         goto do_shift64;
@@ -1146,7 +1043,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     }
 }
 
- static const TCGTargetOpDef s390_op_defs[] = {
+static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_exit_tb, { } },
     { INDEX_op_goto_tb, { } },
     { INDEX_op_call, { "ri" } },
@@ -1165,23 +1062,23 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     { INDEX_op_st16_i32, { "r", "r" } },
     { INDEX_op_st_i32, { "r", "r" } },
 
-    { INDEX_op_add_i32, { "r", "r", "rI" } },
-    { INDEX_op_sub_i32, { "r", "r", "r" } },
-    { INDEX_op_mul_i32, { "r", "r", "r" } },
+    { INDEX_op_add_i32, { "r", "0", "rI" } },
+    { INDEX_op_sub_i32, { "r", "0", "r" } },
+    { INDEX_op_mul_i32, { "r", "0", "r" } },
 
     { INDEX_op_div_i32, { "r", "r", "r" } },
     { INDEX_op_divu_i32, { "r", "r", "r" } },
     { INDEX_op_rem_i32, { "r", "r", "r" } },
     { INDEX_op_remu_i32, { "r", "r", "r" } },
 
-    { INDEX_op_and_i32, { "r", "r", "r" } },
-    { INDEX_op_or_i32, { "r", "r", "r" } },
-    { INDEX_op_xor_i32, { "r", "r", "r" } },
+    { INDEX_op_and_i32, { "r", "0", "r" } },
+    { INDEX_op_or_i32, { "r", "0", "r" } },
+    { INDEX_op_xor_i32, { "r", "0", "r" } },
     { INDEX_op_neg_i32, { "r", "r" } },
 
-    { INDEX_op_shl_i32, { "r", "r", "Ri" } },
-    { INDEX_op_shr_i32, { "r", "r", "Ri" } },
-    { INDEX_op_sar_i32, { "r", "r", "Ri" } },
+    { INDEX_op_shl_i32, { "r", "0", "Ri" } },
+    { INDEX_op_shr_i32, { "r", "0", "Ri" } },
+    { INDEX_op_sar_i32, { "r", "0", "Ri" } },
 
     { INDEX_op_brcond_i32, { "r", "r" } },
 
@@ -1216,13 +1113,13 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     { INDEX_op_qemu_ld64, { "L", "L" } },
     { INDEX_op_qemu_st64, { "L", "L" } },
 
-    { INDEX_op_add_i64, { "r", "r", "r" } },
-    { INDEX_op_mul_i64, { "r", "r", "r" } },
-    { INDEX_op_sub_i64, { "r", "r", "r" } },
+    { INDEX_op_add_i64, { "r", "0", "r" } },
+    { INDEX_op_sub_i64, { "r", "0", "r" } },
+    { INDEX_op_mul_i64, { "r", "0", "r" } },
 
-    { INDEX_op_and_i64, { "r", "r", "r" } },
-    { INDEX_op_or_i64, { "r", "r", "r" } },
-    { INDEX_op_xor_i64, { "r", "r", "r" } },
+    { INDEX_op_and_i64, { "r", "0", "r" } },
+    { INDEX_op_or_i64, { "r", "0", "r" } },
+    { INDEX_op_xor_i64, { "r", "0", "r" } },
     { INDEX_op_neg_i64, { "r", "r" } },
 
     { INDEX_op_shl_i64, { "r", "r", "Ri" } },
@@ -1233,7 +1130,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 #endif
 
     { -1 },
- };
+};
 
 void tcg_target_init(TCGContext *s)
 {
-- 
1.7.0.1

* [Qemu-devel] [PATCH 19/62] tcg-s390: Fixup qemu_ld/st opcodes.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (17 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 18/62] tcg-s390: Use matching constraints Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 20/62] tcg-s390: Implement setcond Richard Henderson
                   ` (43 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Implement INDEX_op_qemu_ld32.  Fix constraints on qemu_ld64.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)
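
Editorial sketch, not part of the patch: the new qemu_ld32 case reuses
the zero-extending load and its comment notes that a non-extending
instruction would also do.  The model below illustrates why, assuming
the consumer of a 32-bit value reads only the low 32 bits of the
register.  All helper names are hypothetical.

```c
#include <stdint.h>

/* Zero-extending 32-bit load into a 64-bit register. */
static uint64_t load32_zero_extend(uint32_t mem)
{
    return (uint64_t)mem;
}

/* Sign-extending 32-bit load into a 64-bit register. */
static uint64_t load32_sign_extend(uint32_t mem)
{
    return (uint64_t)(int64_t)(int32_t)mem;
}

/* Non-extending load: the old high half of the register is kept. */
static uint64_t load32_non_extending(uint64_t old_reg, uint32_t mem)
{
    return (old_reg & 0xffffffff00000000ULL) | mem;
}

/* A 32-bit consumer only looks at the low half. */
static uint32_t consume32(uint64_t reg)
{
    return (uint32_t)reg;
}
```

All three loads produce the same value through consume32, even though
the full 64-bit register contents differ.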

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index c45d8b5..f21a9ca 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -1009,6 +1009,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_qemu_ld(s, args, LD_INT16);
         break;
 
+    case INDEX_op_qemu_ld32:
+        /* ??? Technically we can use a non-extending instruction.  */
     case INDEX_op_qemu_ld32u:
         tcg_out_qemu_ld(s, args, LD_UINT32);
         break;
@@ -1088,10 +1090,13 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_qemu_ld16s, { "r", "L" } },
     { INDEX_op_qemu_ld32u, { "r", "L" } },
     { INDEX_op_qemu_ld32s, { "r", "L" } },
+    { INDEX_op_qemu_ld32, { "r", "L" } },
+    { INDEX_op_qemu_ld64, { "r", "L" } },
 
     { INDEX_op_qemu_st8, { "L", "L" } },
     { INDEX_op_qemu_st16, { "L", "L" } },
     { INDEX_op_qemu_st32, { "L", "L" } },
+    { INDEX_op_qemu_st64, { "L", "L" } },
 
 #if defined(__s390x__)
     { INDEX_op_mov_i64, { "r", "r" } },
@@ -1110,9 +1115,6 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_st32_i64, { "r", "r" } },
     { INDEX_op_st_i64, { "r", "r" } },
 
-    { INDEX_op_qemu_ld64, { "L", "L" } },
-    { INDEX_op_qemu_st64, { "L", "L" } },
-
     { INDEX_op_add_i64, { "r", "0", "r" } },
     { INDEX_op_sub_i64, { "r", "0", "r" } },
     { INDEX_op_mul_i64, { "r", "0", "r" } },
-- 
1.7.0.1

* [Qemu-devel] [PATCH 20/62] tcg-s390: Implement setcond.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (18 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 19/62] tcg-s390: Fixup qemu_ld/st opcodes Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 21/62] tcg-s390: Generalize the direct load/store emission Richard Henderson
                   ` (42 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   66 ++++++++++++++++++++++++++++++++++--------------
 1 files changed, 47 insertions(+), 19 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index f21a9ca..b150d1a 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -381,6 +381,42 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg data,
     }
 }
 
+static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
+{
+    if (c > TCG_COND_GT) {
+        /* unsigned */
+        tcg_out_insn(s, RR, CLR, r1, r2);
+    } else {
+        /* signed */
+        tcg_out_insn(s, RR, CR, r1, r2);
+    }
+}
+
+static void tgen64_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
+{
+    if (c > TCG_COND_GT) {
+        /* unsigned */
+        tcg_out_insn(s, RRE, CLGR, r1, r2);
+    } else {
+        /* signed */
+        tcg_out_insn(s, RRE, CGR, r1, r2);
+    }
+}
+
+static void tgen_setcond(TCGContext *s, TCGType type, TCGCond c,
+                         TCGReg dest, TCGReg r1, TCGReg r2)
+{
+    if (type == TCG_TYPE_I32) {
+        tgen32_cmp(s, c, r1, r2);
+    } else {
+        tgen64_cmp(s, c, r1, r2);
+    }
+    /* Emit: dest = 1; if (cc) goto over; dest = 0; over:  */
+    tcg_out_movi(s, type, dest, 1);
+    tcg_out_insn(s, RI, BRC, tcg_cond_to_s390_cond[c], (4 + 4) >> 1);
+    tcg_out_movi(s, type, dest, 0);
+}
+
 #if defined(CONFIG_SOFTMMU)
 static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
                                   int mem_index, int opc,
@@ -958,27 +994,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_brcond_i64:
-        if (args[2] > TCG_COND_GT) {
-            /* unsigned */
-            /* clgr %ra0, %ra1 */
-            tcg_out_insn(s, RRE, CLGR, args[0], args[1]);
-        } else {
-            /* signed */
-            /* cgr %ra0, %ra1 */
-            tcg_out_insn(s, RRE, CGR, args[0], args[1]);
-        }
+        tgen64_cmp(s, args[2], args[0], args[1]);
         goto do_brcond;
-
     case INDEX_op_brcond_i32:
-        if (args[2] > TCG_COND_GT) {
-            /* unsigned */
-            /* clr %ra0, %ra1 */
-            tcg_out_insn(s, RR, CLR, args[0], args[1]);
-        } else {
-            /* signed */
-            /* cr %ra0, %ra1 */
-            tcg_out_insn(s, RR, CR, args[0], args[1]);
-        }
+        tgen32_cmp(s, args[2], args[0], args[1]);
     do_brcond:
         l = &s->labels[args[3]];
         if (l->has_value) {
@@ -993,6 +1012,13 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_insn(s, RR, BCR, tcg_cond_to_s390_cond[args[2]], TCG_REG_R13);
         break;
 
+    case INDEX_op_setcond_i32:
+        tgen_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1], args[2]);
+        break;
+    case INDEX_op_setcond_i64:
+        tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1], args[2]);
+        break;
+
     case INDEX_op_qemu_ld8u:
         tcg_out_qemu_ld(s, args, LD_UINT8);
         break;
@@ -1083,6 +1109,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_sar_i32, { "r", "0", "Ri" } },
 
     { INDEX_op_brcond_i32, { "r", "r" } },
+    { INDEX_op_setcond_i32, { "r", "r", "r" } },
 
     { INDEX_op_qemu_ld8u, { "r", "L" } },
     { INDEX_op_qemu_ld8s, { "r", "L" } },
@@ -1129,6 +1156,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_sar_i64, { "r", "r", "Ri" } },
 
     { INDEX_op_brcond_i64, { "r", "r" } },
+    { INDEX_op_setcond_i64, { "r", "r", "r" } },
 #endif
 
     { -1 },
-- 
1.7.0.1

* [Qemu-devel] [PATCH 21/62] tcg-s390: Generalize the direct load/store emission.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (19 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 20/62] tcg-s390: Implement setcond Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 22/62] tcg-s390: Tidy branches Richard Henderson
                   ` (41 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Define tcg_out_ldst which can properly choose between RX and RXY
format instructions based on the offset used, and also handles
large offsets.  Use it to implement all the INDEX_op_ld/st operations.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  152 +++++++++++++++++++++++--------------------------
 1 files changed, 71 insertions(+), 81 deletions(-)
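
The format selection described above can be sketched as a small standalone model. This is only an illustration of the decision tcg_out_ldst makes, not the emitter itself: LdstPlan and plan_ldst are hypothetical names, and no instruction bytes are produced. It assumes the RX displacement is an unsigned 12-bit field and the RXY displacement a signed 20-bit field, as the range checks in the patch imply.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; none of these exist in QEMU. */
typedef enum { FMT_RX, FMT_RXY } LdstFormat;

typedef struct {
    LdstFormat fmt;     /* instruction format that would be emitted */
    int needs_index;    /* nonzero if a scratch index register must be loaded */
    int64_t index_imm;  /* value for the scratch register, 0 if unused */
    int64_t disp;       /* displacement left in the instruction itself */
} LdstPlan;

static LdstPlan plan_ldst(int has_rx, int64_t ofs)
{
    LdstPlan p = { FMT_RXY, 0, 0, ofs };

    /* Outside the signed 20-bit RXY range: keep the low 16 bits in the
       instruction and move the remaining bits into an index register. */
    if (ofs < -0x80000 || ofs >= 0x80000) {
        p.needs_index = 1;
        p.index_imm = ofs & ~(int64_t)0xffff;
        p.disp = ofs & 0xffff;
    }

    /* Prefer the shorter RX encoding when the operation has one and
       the displacement fits its unsigned 12-bit field. */
    if (has_rx && p.disp >= 0 && p.disp < 0x1000) {
        p.fmt = FMT_RX;
    }

    /* The split must always reconstruct the original offset. */
    assert(p.index_imm + p.disp == ofs);
    return p;
}
```

For example, plan_ldst(1, 0x100) picks RX, plan_ldst(1, -8) falls back to RXY because RX has no negative displacements, and plan_ldst(0, 0x123456) asks for 0x120000 in the index register with 0x3456 left as the displacement.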

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index b150d1a..21ad1a3 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -90,17 +90,22 @@ typedef enum S390Opcode {
     RS_SRL      = 0x88,
 
     RXY_CG      = 0xe320,
+    RXY_LB      = 0xe376,
     RXY_LG      = 0xe304,
     RXY_LGB     = 0xe377,
     RXY_LGF     = 0xe314,
     RXY_LGH     = 0xe315,
+    RXY_LHY     = 0xe378,
+    RXY_LLC     = 0xe394,
     RXY_LLGC    = 0xe390,
     RXY_LLGF    = 0xe316,
     RXY_LLGH    = 0xe391,
+    RXY_LLH     = 0xe395,
     RXY_LMG     = 0xeb04,
     RXY_LRV     = 0xe31e,
     RXY_LRVG    = 0xe30f,
     RXY_LRVH    = 0xe31f,
+    RXY_LY      = 0xe358,
     RXY_STCY    = 0xe372,
     RXY_STG     = 0xe324,
     RXY_STHY    = 0xe370,
@@ -108,7 +113,10 @@ typedef enum S390Opcode {
     RXY_STRV    = 0xe33e,
     RXY_STRVG   = 0xe32f,
     RXY_STRVH   = 0xe33f,
+    RXY_STY     = 0xe350,
 
+    RX_L        = 0x58,
+    RX_LH       = 0x48,
     RX_ST       = 0x50,
     RX_STC      = 0x42,
     RX_STH      = 0x40,
@@ -362,22 +370,52 @@ static inline void tcg_out_movi(TCGContext *s, TCGType type,
     }
 }
 
-/* load data without address translation or endianness conversion */
-static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg data,
-                       TCGReg base, tcg_target_long ofs)
+
+/* Emit a load/store type instruction.  Inputs are:
+   DATA:     The register to be loaded or stored.
+   BASE+OFS: The effective address.
+   OPC_RX:   If the operation has an RX format opcode (e.g. STC), otherwise 0.
+   OPC_RXY:  The RXY format opcode for the operation (e.g. STCY).  */
+
+static void tcg_out_ldst(TCGContext *s, S390Opcode opc_rx, S390Opcode opc_rxy,
+                         TCGReg data, TCGReg base, tcg_target_long ofs)
 {
-    S390Opcode op;
+    TCGReg index = 0;
+
+    if (ofs < -0x80000 || ofs >= 0x80000) {
+        /* Combine the low 16 bits of the offset with the actual load insn;
+           the high 48 bits must come from an immediate load.  */
+        index = TCG_REG_R13;
+        tcg_out_movi(s, TCG_TYPE_PTR, index, ofs & ~0xffff);
+        ofs &= 0xffff;
+    }
 
-    op = (type == TCG_TYPE_I32) ? RXY_LLGF : RXY_LG;
+    if (opc_rx && ofs >= 0 && ofs < 0x1000) {
+        tcg_out_insn_RX(s, opc_rx, data, base, index, ofs);
+    } else {
+        tcg_out_insn_RXY(s, opc_rxy, data, base, index, ofs);
+    }
+}
 
-    if (ofs < -0x80000 || ofs > 0x7ffff) {
-        /* load the displacement */
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, ofs);
-        /* load the data */
-        tcg_out_insn_RXY(s, op, data, base, TCG_REG_R13, 0);
+
+/* load data without address translation or endianness conversion */
+static inline void tcg_out_ld(TCGContext *s, TCGType type, TCGReg data,
+                              TCGReg base, tcg_target_long ofs)
+{
+    if (type == TCG_TYPE_I32) {
+        tcg_out_ldst(s, RX_L, RXY_LY, data, base, ofs);
     } else {
-        /* load the data */
-        tcg_out_insn_RXY(s, op, data, base, 0, ofs);
+        tcg_out_ldst(s, 0, RXY_LG, data, base, ofs);
+    }
+}
+
+static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg data,
+                              TCGReg base, tcg_target_long ofs)
+{
+    if (type == TCG_TYPE_I32) {
+        tcg_out_ldst(s, RX_ST, RXY_STY, data, base, ofs);
+    } else {
+        tcg_out_ldst(s, 0, RXY_STG, data, base, ofs);
     }
 }
 
@@ -693,28 +731,6 @@ static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
     tcg_finish_qemu_ldst(s, label2_ptr);
 }
 
-static inline void tcg_out_st(TCGContext *s, TCGType type, int arg,
-                              int arg1, tcg_target_long arg2)
-{
-    dprintf("tcg_out_st arg 0x%x arg1 0x%x arg2 0x%lx\n", arg, arg1, arg2);
-
-    if (type == TCG_TYPE_I32) {
-        if (((long)arg2) < -0x800 || ((long)arg2) > 0x7ff) {
-            tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, arg2);
-            tcg_out_insn(s, RRE, AGR, 13, arg1);
-            tcg_out_insn(s, RX, ST, arg, TCG_REG_R13, 0, 0);
-        } else {
-            tcg_out_insn(s, RX, ST, arg, arg1, 0, arg2);
-        }
-    }
-    else {
-        if (((long)arg2) < -0x80000 || ((long)arg2) > 0x7ffff) {
-            tcg_abort();
-        }
-        tcg_out_insn(s, RXY, STG, arg, arg1, 0, arg2);
-    }
-}
-
 static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                 const TCGArg *args, const int *const_args)
 {
@@ -780,51 +796,41 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_ld8u_i32:
+        tcg_out_ldst(s, 0, RXY_LLC, args[0], args[1], args[2]);
+        break;
     case INDEX_op_ld8u_i64:
-        if ((long)args[2] > -0x80000 && (long)args[2] < 0x7ffff) {
-            tcg_out_insn(s, RXY, LLGC, args[0], args[1], 0, args[2]);
-        } else {
-            /* XXX displacement too large, have to calculate address manually */
-            tcg_abort();
-        }
+        tcg_out_ldst(s, 0, RXY_LLGC, args[0], args[1], args[2]);
         break;
 
     case INDEX_op_ld8s_i32:
-        /* XXX */
-        tcg_abort();
+        tcg_out_ldst(s, 0, RXY_LB, args[0], args[1], args[2]);
+        break;
+    case INDEX_op_ld8s_i64:
+        tcg_out_ldst(s, 0, RXY_LGB, args[0], args[1], args[2]);
         break;
 
     case INDEX_op_ld16u_i32:
-        if ((long)args[2] > -0x80000 && (long)args[2] < 0x7ffff) {
-            tcg_out_insn(s, RXY, LLGH, args[0], args[1], 0, args[2]);
-        } else {
-            /* XXX displacement too large, have to calculate address manually */
-            tcg_abort();
-        }
+        tcg_out_ldst(s, 0, RXY_LLH, args[0], args[1], args[2]);
+        break;
+    case INDEX_op_ld16u_i64:
+        tcg_out_ldst(s, 0, RXY_LLGH, args[0], args[1], args[2]);
         break;
 
     case INDEX_op_ld16s_i32:
-        /* XXX */
-        tcg_abort();
+        tcg_out_ldst(s, RX_LH, RXY_LHY, args[0], args[1], args[2]);
+        break;
+    case INDEX_op_ld16s_i64:
+        tcg_out_ldst(s, 0, RXY_LGH, args[0], args[1], args[2]);
         break;
 
     case INDEX_op_ld_i32:
-    case INDEX_op_ld32u_i64:
         tcg_out_ld(s, TCG_TYPE_I32, args[0], args[1], args[2]);
         break;
-
+    case INDEX_op_ld32u_i64:
+        tcg_out_ldst(s, 0, RXY_LLGF, args[0], args[1], args[2]);
+        break;
     case INDEX_op_ld32s_i64:
-        if (args[2] < -0x80000 || args[2] > 0x7ffff) {
-            /* load the displacement */
-            tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, args[2]);
-            /* add the address */
-            tcg_out_insn(s, RRE, AGR, TCG_REG_R13, args[1]);
-            /* load the data (sign-extended) */
-            tcg_out_insn(s, RXY, LGF, args[0], TCG_REG_R13, 0, 0);
-        } else {
-            /* load the data (sign-extended) */
-            tcg_out_insn(s, RXY, LGF, args[0], args[1], 0, args[2]);
-        }
+        tcg_out_ldst(s, 0, RXY_LGF, args[0], args[1], args[2]);
         break;
 
     case INDEX_op_ld_i64:
@@ -833,28 +839,12 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_st8_i32:
     case INDEX_op_st8_i64:
-        if (((long)args[2]) >= -0x800 && ((long)args[2]) < 0x800) {
-            tcg_out_insn(s, RX, STC, args[0], args[1], 0, args[2]);
-        } else if (((long)args[2]) >= -0x80000 && ((long)args[2]) < 0x80000) {
-            /* FIXME: requires long displacement facility */
-            tcg_out_insn(s, RXY, STCY, args[0], args[1], 0, args[2]);
-            tcg_abort();
-        } else {
-            tcg_abort();
-        }
+        tcg_out_ldst(s, RX_STC, RXY_STCY, args[0], args[1], args[2]);
         break;
 
     case INDEX_op_st16_i32:
     case INDEX_op_st16_i64:
-        if (((long)args[2]) >= -0x800 && ((long)args[2]) < 0x800) {
-            tcg_out_insn(s, RX, STH, args[0], args[1], 0, args[2]);
-        } else if (((long)args[2]) >= -0x80000 && ((long)args[2]) < 0x80000) {
-            /* FIXME: requires long displacement facility */
-            tcg_out_insn(s, RXY, STHY, args[0], args[1], 0, args[2]);
-            tcg_abort();
-        } else {
-            tcg_abort();
-        }
+        tcg_out_ldst(s, RX_STH, RXY_STHY, args[0], args[1], args[2]);
         break;
 
     case INDEX_op_st_i32:
-- 
1.7.0.1

* [Qemu-devel] [PATCH 22/62] tcg-s390: Tidy branches.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (20 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 21/62] tcg-s390: Generalize the direct load/store emission Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 23/62] tcg-s390: Add tgen_calli Richard Henderson
                   ` (40 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Add tgen_gotoi to implement conditional and unconditional direct
branches.  Add tgen_branch to implement branches to labels.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   96 ++++++++++++++++++++++++++++---------------------
 1 files changed, 55 insertions(+), 41 deletions(-)
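
As a rough standalone model of the choice tgen_gotoi makes between the three branch forms: BranchForm and pick_branch_form are hypothetical names, and the sketch assumes dest and pc are both halfword aligned so that the byte distance divides evenly by two. The bounds mirror the comparisons in the patch, which are slightly narrower than what the signed 16-bit field can hold.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; none of these exist in QEMU. */
typedef enum { BR_BRC, BR_BRCL, BR_INDIRECT } BranchForm;

static BranchForm pick_branch_form(int64_t dest, int64_t pc)
{
    int64_t off;

    /* s390 instructions are halfword aligned, so the distance is even. */
    assert(((dest - pc) & 1) == 0);

    /* Relative branches encode the distance in halfwords, not bytes. */
    off = (dest - pc) / 2;

    if (off > -0x8000 && off < 0x7fff) {
        return BR_BRC;       /* 16-bit relative offset, 4-byte insn */
    }
    if (off >= INT32_MIN && off <= INT32_MAX) {
        return BR_BRCL;      /* 32-bit relative offset, 6-byte insn */
    }
    return BR_INDIRECT;      /* load the address, then BCR via register */
}
```

With pc = 0, a destination of 1000 selects BRC, 0x20000 selects BRCL, and a destination 2^40 bytes away falls through to the register-indirect form.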

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 21ad1a3..f4dab1a 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -41,8 +41,9 @@
    halves of the 16-bit quantity may appear 32 bits apart in the insn.
    This makes it easy to copy the values from the tables in Appendix B.  */
 typedef enum S390Opcode {
-    RIL_LARL    = 0xc000,
     RIL_BRASL   = 0xc005,
+    RIL_BRCL    = 0xc004,
+    RIL_LARL    = 0xc000,
 
     RI_AGHI     = 0xa70b,
     RI_AHI      = 0xa70a,
@@ -175,17 +176,27 @@ static const int tcg_target_call_oarg_regs[] = {
 
 /* signed/unsigned is handled by using COMPARE and COMPARE LOGICAL,
    respectively */
+
+#define S390_CC_EQ      8
+#define S390_CC_LT      4
+#define S390_CC_GT      2
+#define S390_CC_OV      1
+#define S390_CC_NE      (S390_CC_LT | S390_CC_GT)
+#define S390_CC_LE      (S390_CC_LT | S390_CC_EQ)
+#define S390_CC_GE      (S390_CC_GT | S390_CC_EQ)
+#define S390_CC_ALWAYS  15
+
 static const uint8_t tcg_cond_to_s390_cond[10] = {
-    [TCG_COND_EQ]  = 8,
-    [TCG_COND_LT]  = 4,
-    [TCG_COND_LTU] = 4,
-    [TCG_COND_LE]  = 8 | 4,
-    [TCG_COND_LEU] = 8 | 4,
-    [TCG_COND_GT]  = 2,
-    [TCG_COND_GTU] = 2,
-    [TCG_COND_GE]  = 8 | 2,
-    [TCG_COND_GEU] = 8 | 2,
-    [TCG_COND_NE]  = 4 | 2 | 1,
+    [TCG_COND_EQ]  = S390_CC_EQ,
+    [TCG_COND_LT]  = S390_CC_LT,
+    [TCG_COND_LTU] = S390_CC_LT,
+    [TCG_COND_LE]  = S390_CC_LE,
+    [TCG_COND_LEU] = S390_CC_LE,
+    [TCG_COND_GT]  = S390_CC_GT,
+    [TCG_COND_GTU] = S390_CC_GT,
+    [TCG_COND_GE]  = S390_CC_GE,
+    [TCG_COND_GEU] = S390_CC_GE,
+    [TCG_COND_NE]  = S390_CC_NE,
 };
 
 #ifdef CONFIG_SOFTMMU
@@ -455,6 +466,31 @@ static void tgen_setcond(TCGContext *s, TCGType type, TCGCond c,
     tcg_out_movi(s, type, dest, 0);
 }
 
+static void tgen_gotoi(TCGContext *s, int cc, tcg_target_long dest)
+{
+    tcg_target_long off = (dest - (tcg_target_long)s->code_ptr) >> 1;
+    if (off > -0x8000 && off < 0x7fff) {
+        tcg_out_insn(s, RI, BRC, cc, off);
+    } else if (off == (int32_t)off) {
+        tcg_out_insn(s, RIL, BRCL, cc, off);
+    } else {
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, dest);
+        tcg_out_insn(s, RR, BCR, cc, TCG_REG_R13);
+    }
+}
+
+static void tgen_branch(TCGContext *s, int cc, int labelno)
+{
+    TCGLabel* l = &s->labels[labelno];
+    if (l->has_value) {
+        tgen_gotoi(s, cc, l->u.value);
+    } else {
+        tcg_out16(s, RIL_BRCL | (cc << 4));
+        tcg_out_reloc(s, s->code_ptr, R_390_PC32DBL, labelno, -2);
+        s->code_ptr += 4;
+    }
+}
+
 #if defined(CONFIG_SOFTMMU)
 static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
                                   int mem_index, int opc,
@@ -507,7 +543,7 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     label1_ptr = (uint16_t*)s->code_ptr;
 
     /* je label1 (offset will be patched in later) */
-    tcg_out_insn(s, RI, BRC, 8, 0);
+    tcg_out_insn(s, RI, BRC, S390_CC_EQ, 0);
 
     /* call load/store helper */
 #if TARGET_LONG_BITS == 32
@@ -551,7 +587,7 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     /* jump to label2 (end) */
     *label2_ptr_p = (uint16_t*)s->code_ptr;
 
-    tcg_out_insn(s, RI, BRC, 15, 0);
+    tcg_out_insn(s, RI, BRC, S390_CC_ALWAYS, 0);
 
     /* this is label1, patch branch */
     *(label1_ptr + 1) = ((unsigned long)s->code_ptr -
@@ -734,16 +770,13 @@ static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
 static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                 const TCGArg *args, const int *const_args)
 {
-    TCGLabel* l;
     S390Opcode op;
 
     switch (opc) {
     case INDEX_op_exit_tb:
         /* return value */
         tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R2, args[0]);
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, (unsigned long)tb_ret_addr);
-        /* br %r13 */
-        tcg_out_insn(s, RR, BCR, 15, TCG_REG_R13);
+        tgen_gotoi(s, S390_CC_ALWAYS, (unsigned long)tb_ret_addr);
         break;
 
     case INDEX_op_goto_tb:
@@ -763,7 +796,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
             /* load address stored at s->tb_next + args[0] */
             tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R13, TCG_REG_R13, 0);
             /* and go there */
-            tcg_out_insn(s, RR, BASR, TCG_REG_R13, TCG_REG_R13);
+            tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R13);
         }
         s->tb_next_offset[args[0]] = s->code_ptr - s->code_buf;
         break;
@@ -971,16 +1004,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         goto do_shift64;
 
     case INDEX_op_br:
-        l = &s->labels[args[0]];
-        if (l->has_value) {
-            tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, l->u.value);
-        } else {
-            /* larl %r13, ... */
-            tcg_out16(s, RIL_LARL | (TCG_REG_R13 << 4));
-            tcg_out_reloc(s, s->code_ptr, R_390_PC32DBL, args[0], -2);
-            s->code_ptr += 4;
-        }
-        tcg_out_insn(s, RR, BASR, TCG_REG_R13, TCG_REG_R13);
+        tgen_branch(s, S390_CC_ALWAYS, args[0]);
         break;
 
     case INDEX_op_brcond_i64:
@@ -989,17 +1013,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_brcond_i32:
         tgen32_cmp(s, args[2], args[0], args[1]);
     do_brcond:
-        l = &s->labels[args[3]];
-        if (l->has_value) {
-            tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, l->u.value);
-        } else {
-            /* larl %r13, ... */
-            tcg_out16(s, RIL_LARL | (TCG_REG_R13 << 4));
-            tcg_out_reloc(s, s->code_ptr, R_390_PC32DBL, args[3], -2);
-            s->code_ptr += 4;
-        }
-        /* bcr cond, %r13 */
-        tcg_out_insn(s, RR, BCR, tcg_cond_to_s390_cond[args[2]], TCG_REG_R13);
+        tgen_branch(s, tcg_cond_to_s390_cond[args[2]], args[3]);
         break;
 
     case INDEX_op_setcond_i32:
@@ -1194,7 +1208,7 @@ void tcg_target_qemu_prologue(TCGContext *s)
     tcg_out_insn(s, RI, AGHI, TCG_REG_R15, -160);
 
     /* br %r2 (go to TB) */
-    tcg_out_insn(s, RR, BCR, 15, TCG_REG_R2);
+    tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R2);
 
     tb_ret_addr = s->code_ptr;
 
@@ -1202,7 +1216,7 @@ void tcg_target_qemu_prologue(TCGContext *s)
     tcg_out_insn(s, RXY, LMG, TCG_REG_R6, TCG_REG_R15, TCG_REG_R15, 208);
 
     /* br %r14 (return) */
-    tcg_out_insn(s, RR, BCR, 15, TCG_REG_R14);
+    tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R14);
 }
 
 static inline void tcg_out_addi(TCGContext *s, int reg, tcg_target_long val)
-- 
1.7.0.1

* [Qemu-devel] [PATCH 23/62] tcg-s390: Add tgen_calli.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (21 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 22/62] tcg-s390: Tidy branches Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 24/62] tcg-s390: Implement div2 Richard Henderson
                   ` (39 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Use it in the softmmu code paths, and INDEX_op_call.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   45 ++++++++++++++++-----------------------------
 1 files changed, 16 insertions(+), 29 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index f4dab1a..0bd4276 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -347,12 +347,6 @@ static void tcg_out_sh32(TCGContext* s, S390Opcode op, TCGReg dest,
     tcg_out_insn_RS(s, op, dest, sh_reg, 0, sh_imm);
 }
 
-/* branch to relative address (long) */
-static void tcg_out_brasl(TCGContext *s, TCGReg r, tcg_target_long raddr)
-{
-    tcg_out_insn(s, RIL, BRASL, r, raddr >> 1);
-}
-
 static inline void tcg_out_mov(TCGContext *s, int ret, int arg)
 {
     /* ??? With a TCGType argument, we could emit the smaller LR insn.  */
@@ -372,7 +366,7 @@ static inline void tcg_out_movi(TCGContext *s, TCGType type,
         tcg_out_insn(s, RI, IILH, ret, arg >> 16);
     } else {
         /* branch over constant and store its address in R13 */
-        tcg_out_brasl(s, TCG_REG_R13, 14);
+        tcg_out_insn(s, RIL, BRASL, TCG_REG_R13, (6 + 8) >> 1);
         /* 64-bit constant */
         tcg_out32(s, arg >> 32);
         tcg_out32(s, arg);
@@ -491,6 +485,17 @@ static void tgen_branch(TCGContext *s, int cc, int labelno)
     }
 }
 
+static void tgen_calli(TCGContext *s, tcg_target_long dest)
+{
+    tcg_target_long off = (dest - (tcg_target_long)s->code_ptr) >> 1;
+    if (off == (int32_t)off) {
+        tcg_out_insn(s, RIL, BRASL, TCG_REG_R14, off);
+    } else {
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, dest);
+        tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_REG_R13);
+    }
+}
+
 #if defined(CONFIG_SOFTMMU)
 static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
                                   int mem_index, int opc,
@@ -555,14 +560,10 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     if (is_store) {
         tcg_out_mov(s, arg1, data_reg);
         tcg_out_movi(s, TCG_TYPE_I32, arg2, mem_index);
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
-                     (tcg_target_ulong)qemu_st_helpers[s_bits]);
-        tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_REG_R13);
+        tgen_calli(s, (tcg_target_ulong)qemu_st_helpers[s_bits]);
     } else {
         tcg_out_movi(s, TCG_TYPE_I32, arg1, mem_index);
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
-                     (tcg_target_ulong)qemu_ld_helpers[s_bits]);
-        tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_REG_R13);
+        tgen_calli(s, (tcg_target_ulong)qemu_ld_helpers[s_bits]);
 
         /* sign extension */
         switch (opc) {
@@ -785,7 +786,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         } else {
             tcg_target_long off = ((tcg_target_long)(s->tb_next + args[0]) -
                                    (tcg_target_long)s->code_ptr) >> 1;
-            if (off > -0x80000000L && off < 0x7fffffffL) {
+            if (off == (int32_t)off) {
                 /* load address relative to PC */
                 tcg_out_insn(s, RIL, LARL, TCG_REG_R13, off);
             } else {
@@ -803,22 +804,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_call:
         if (const_args[0]) {
-            tcg_target_long off;
-
-            /* FIXME: + 4? Where did that come from? */
-            off = (args[0] - (tcg_target_long)s->code_ptr + 4) >> 1;
-            if (off > -0x80000000 && off < 0x7fffffff) {
-                /* relative call */
-                tcg_out_brasl(s, TCG_REG_R14, off << 1);
-                /* XXX untested */
-                tcg_abort();
-            } else {
-                /* too far for a relative call, load full address */
-                tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, args[0]);
-                tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_REG_R13);
-            }
+            tgen_calli(s, args[0]);
         } else {
-            /* call function in register args[0] */
             tcg_out_insn(s, RR, BASR, TCG_REG_R14, args[0]);
         }
         break;
-- 
1.7.0.1

* [Qemu-devel] [PATCH 24/62] tcg-s390: Implement div2.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (22 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 23/62] tcg-s390: Add tgen_calli Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 25/62] tcg-s390: Re-implement tcg_out_movi Richard Henderson
                   ` (38 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The s390 divide instructions always produce both remainder and quotient.
Since TCG has no mechanism for allocating even+odd register pairs, force
the use of the R2/R3 register pair.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   44 ++++++++++++++++++++++++++++++--------------
 tcg/s390/tcg-target.h |    4 ++--
 2 files changed, 32 insertions(+), 16 deletions(-)
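
To make the register-pair convention concrete, here is a small model of the unsigned 32-bit divide (DLR) as this patch uses it. RegPair and model_dlr are hypothetical names. The model assumes the even register of the pair (R2 here) holds the high half of the dividend on input and the remainder on output, while the odd register (R3) holds the low half on input and the quotient on output, which is what the "b", "a", "0", "1" constraint ordering relies on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an even/odd register pair such as R2/R3. */
typedef struct {
    uint32_t even;  /* in: high half of dividend, out: remainder */
    uint32_t odd;   /* in: low half of dividend,  out: quotient  */
} RegPair;

static RegPair model_dlr(RegPair pair, uint32_t divisor)
{
    uint64_t dividend = ((uint64_t)pair.even << 32) | pair.odd;
    uint64_t quotient;
    RegPair out;

    /* The real instruction raises an exception in both of these
       cases; the model simply refuses them. */
    assert(divisor != 0);
    quotient = dividend / divisor;
    assert(quotient <= UINT32_MAX);

    out.even = (uint32_t)(dividend % divisor);
    out.odd = (uint32_t)quotient;
    return out;
}
```

Dividing the pair {0, 17} by 5 leaves remainder 2 in the even register and quotient 3 in the odd one. One instruction yields both results, which is why div2 is a better fit here than separate div and rem ops.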

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 0bd4276..4c2acca 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -75,6 +75,7 @@ typedef enum S390Opcode {
     RR_BCR      = 0x07,
     RR_CLR      = 0x15,
     RR_CR       = 0x19,
+    RR_DR       = 0x1d,
     RR_LCR      = 0x13,
     RR_LR       = 0x18,
     RR_NR       = 0x14,
@@ -258,6 +259,14 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
     case 'R':                        /* not R0 */
         tcg_regset_reset_reg(ct->u.regs, TCG_REG_R0);
         break;
+    case 'a':                  /* force R2 for division */
+        tcg_regset_clear(ct->u.regs);
+        tcg_regset_set_reg(ct->u.regs, TCG_REG_R2);
+        break;
+    case 'b':                  /* force R3 for division */
+        tcg_regset_clear(ct->u.regs);
+        tcg_regset_set_reg(ct->u.regs, TCG_REG_R3);
+        break;
     case 'I':
         ct->ct &= ~TCG_CT_REG;
         ct->ct |= TCG_CT_CONST_S16;
@@ -946,16 +955,22 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
         break;
 
-    case INDEX_op_divu_i32:
-    case INDEX_op_remu_i32:
-        tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R12, 0);
-        tcg_out_insn(s, RR, LR, TCG_REG_R13, args[1]);
-        tcg_out_insn(s, RRE, DLR, TCG_REG_R12, args[2]);
-        if (opc == INDEX_op_divu_i32) {
-          tcg_out_insn(s, RR, LR, args[0], TCG_REG_R13);        /* quotient */
-        } else {
-          tcg_out_insn(s, RR, LR, args[0], TCG_REG_R12);        /* remainder */
-        }
+    case INDEX_op_div2_i32:
+        tcg_out_insn(s, RR, DR, TCG_REG_R2, args[4]);
+        break;
+    case INDEX_op_divu2_i32:
+        tcg_out_insn(s, RRE, DLR, TCG_REG_R2, args[4]);
+        break;
+
+    case INDEX_op_div2_i64:
+        /* ??? We get an unnecessary sign-extension of the dividend
+           into R3 with this definition, but as we do in fact always
+           produce both quotient and remainder, using INDEX_op_div_i64
+           instead requires jumping through even more hoops.  */
+        tcg_out_insn(s, RRE, DSGR, TCG_REG_R2, args[4]);
+        break;
+    case INDEX_op_divu2_i64:
+        tcg_out_insn(s, RRE, DLGR, TCG_REG_R2, args[4]);
         break;
 
     case INDEX_op_shl_i32:
@@ -1085,10 +1100,8 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_sub_i32, { "r", "0", "r" } },
     { INDEX_op_mul_i32, { "r", "0", "r" } },
 
-    { INDEX_op_div_i32, { "r", "r", "r" } },
-    { INDEX_op_divu_i32, { "r", "r", "r" } },
-    { INDEX_op_rem_i32, { "r", "r", "r" } },
-    { INDEX_op_remu_i32, { "r", "r", "r" } },
+    { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
+    { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
 
     { INDEX_op_and_i32, { "r", "0", "r" } },
     { INDEX_op_or_i32, { "r", "0", "r" } },
@@ -1137,6 +1150,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_sub_i64, { "r", "0", "r" } },
     { INDEX_op_mul_i64, { "r", "0", "r" } },
 
+    { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
+    { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
+
     { INDEX_op_and_i64, { "r", "0", "r" } },
     { INDEX_op_or_i64, { "r", "0", "r" } },
     { INDEX_op_xor_i64, { "r", "0", "r" } },
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index c81f886..b987a7e 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -48,7 +48,7 @@ typedef enum TCGReg {
 #define TCG_TARGET_NB_REGS 16
 
 /* optional instructions */
-#define TCG_TARGET_HAS_div_i32
+#define TCG_TARGET_HAS_div2_i32
 // #define TCG_TARGET_HAS_rot_i32
 // #define TCG_TARGET_HAS_ext8s_i32
 // #define TCG_TARGET_HAS_ext16s_i32
@@ -64,7 +64,7 @@ typedef enum TCGReg {
 // #define TCG_TARGET_HAS_nand_i32
 // #define TCG_TARGET_HAS_nor_i32
 
-// #define TCG_TARGET_HAS_div_i64
+#define TCG_TARGET_HAS_div2_i64
 // #define TCG_TARGET_HAS_rot_i64
 // #define TCG_TARGET_HAS_ext8s_i64
 // #define TCG_TARGET_HAS_ext16s_i64
-- 
1.7.0.1


* [Qemu-devel] [PATCH 25/62] tcg-s390: Re-implement tcg_out_movi.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (23 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 24/62] tcg-s390: Implement div2 Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 26/62] tcg-s390: Implement sign and zero-extension operations Richard Henderson
                   ` (37 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Make better use of the LOAD HALFWORD IMMEDIATE, LOAD IMMEDIATE,
and INSERT IMMEDIATE instruction groups.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
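For readers following the selection order in the new tcg_out_movi, here
is a minimal standalone sketch of the single-instruction cases.  The
enum labels and classify_movi are hypothetical names used only for
illustration; they are not part of the patch.  The sketch leaves out
the TCG_TYPE_I32 truncation and the PC-relative LARL attempt, since the
latter depends on the current code pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the strategies; these are not patch names. */
typedef enum {
    MOVI_LGHI,       /* signed 16-bit immediate */
    MOVI_LLIXX,      /* LLILL/LLILH/LLIHL/LLIHH: one non-zero 16-bit chunk */
    MOVI_LGFI,       /* signed 32-bit immediate */
    MOVI_LLILF,      /* unsigned 32-bit value in the low word */
    MOVI_LLIHF,      /* 32-bit value in the high word, low word zero */
    MOVI_MULTI       /* falls through to LARL or a two-insn sequence */
} MoviKind;

/* Mirrors the order of the checks in the patch for a 64-bit constant. */
static MoviKind classify_movi(int64_t sval)
{
    uint64_t uval = (uint64_t)sval;
    int i;

    if (sval >= -0x8000 && sval < 0x8000) {
        return MOVI_LGHI;
    }
    for (i = 0; i < 4; i++) {
        uint64_t mask = (uint64_t)0xffff << (i * 16);
        if ((uval & mask) != 0 && (uval & ~mask) == 0) {
            return MOVI_LLIXX;
        }
    }
    /* Same condition as "sval == (int32_t)sval", written as a range test. */
    if (sval >= INT32_MIN && sval <= INT32_MAX) {
        return MOVI_LGFI;
    }
    if (uval <= 0xffffffffu) {
        return MOVI_LLILF;
    }
    if ((uval & 0xffffffffu) == 0) {
        assert((uval >> 32) != 0);   /* zero was already handled by LGHI */
        return MOVI_LLIHF;
    }
    return MOVI_MULTI;
}
```

A value such as 0x80000001 is neither a signed 32-bit quantity nor a
single 16-bit chunk, so it lands on LLILF; only constants with non-zero
bits in both halves reach the multi-instruction path.
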
 tcg/s390/tcg-target.c |   90 ++++++++++++++++++++++++++++++++++++++++---------
 1 files changed, 74 insertions(+), 16 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 4c2acca..fe83415 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -44,12 +44,23 @@ typedef enum S390Opcode {
     RIL_BRASL   = 0xc005,
     RIL_BRCL    = 0xc004,
     RIL_LARL    = 0xc000,
+    RIL_IIHF    = 0xc008,
+    RIL_IILF    = 0xc009,
+    RIL_LGFI    = 0xc001,
+    RIL_LLIHF   = 0xc00e,
+    RIL_LLILF   = 0xc00f,
 
     RI_AGHI     = 0xa70b,
     RI_AHI      = 0xa70a,
     RI_BRC      = 0xa704,
+    RI_IIHH     = 0xa500,
+    RI_IIHL     = 0xa501,
     RI_IILH     = 0xa502,
+    RI_IILL     = 0xa503,
     RI_LGHI     = 0xa709,
+    RI_LLIHH    = 0xa50c,
+    RI_LLIHL    = 0xa50d,
+    RI_LLILH    = 0xa50e,
     RI_LLILL    = 0xa50f,
 
     RRE_AGR     = 0xb908,
@@ -363,24 +374,71 @@ static inline void tcg_out_mov(TCGContext *s, int ret, int arg)
 }
 
 /* load a register with an immediate value */
-static inline void tcg_out_movi(TCGContext *s, TCGType type,
-                int ret, tcg_target_long arg)
+static void tcg_out_movi(TCGContext *s, TCGType type,
+                         TCGReg ret, tcg_target_long sval)
 {
-    if (arg >= -0x8000 && arg < 0x8000) { /* signed immediate load */
-        tcg_out_insn(s, RI, LGHI, ret, arg);
-    } else if (!(arg & 0xffffffffffff0000UL)) {
-        tcg_out_insn(s, RI, LLILL, ret, arg);
-    } else if (!(arg & 0xffffffff00000000UL) || type == TCG_TYPE_I32) {
-        tcg_out_insn(s, RI, LLILL, ret, arg);
-        tcg_out_insn(s, RI, IILH, ret, arg >> 16);
+    static const S390Opcode lli_insns[4] = {
+        RI_LLILL, RI_LLILH, RI_LLIHL, RI_LLIHH
+    };
+
+    tcg_target_ulong uval = sval;
+    int i;
+
+    if (type == TCG_TYPE_I32) {
+        uval = (uint32_t)sval;
+        sval = (int32_t)sval;
+    }
+
+    /* First, try all 32-bit insns that can load it in one go.  */
+    if (sval >= -0x8000 && sval < 0x8000) {
+        tcg_out_insn(s, RI, LGHI, ret, sval);
+        return;
+    }
+
+    for (i = 0; i < 4; i++) {
+        tcg_target_long mask = 0xffffull << i*16;
+        if ((uval & mask) != 0 && (uval & ~mask) == 0) {
+            tcg_out_insn_RI(s, lli_insns[i], ret, uval >> i*16);
+            return;
+        }
+    }
+
+    /* Second, try all 48-bit insns that can load it in one go.  */
+    if (sval == (int32_t)sval) {
+        tcg_out_insn(s, RIL, LGFI, ret, sval);
+        return;
+    }
+    if (uval <= 0xffffffff) {
+        tcg_out_insn(s, RIL, LLILF, ret, uval);
+        return;
+    }
+    if ((uval & 0xffffffff) == 0) {
+        tcg_out_insn(s, RIL, LLIHF, ret, uval >> 32);
+        return;
+    }
+
+    /* If we get here, both the high and low parts have non-zero bits.  */
+
+    /* Try for PC-relative address load.  */
+    if ((sval & 1) == 0) {
+        intptr_t off = (sval - (intptr_t)s->code_ptr) >> 1;
+        if (off == (int32_t)off) {
+            tcg_out_insn(s, RIL, LARL, ret, off);
+            return;
+        }
+    }
+
+    /* Recurse to load the lower 32-bits.  */
+    tcg_out_movi(s, TCG_TYPE_I32, ret, sval);
+
+    /* Insert data into the high 32-bits.  */
+    uval >>= 32;
+    if (uval < 0x10000) {
+        tcg_out_insn(s, RI, IIHL, ret, uval);
+    } else if ((uval & 0xffff) == 0) {
+        tcg_out_insn(s, RI, IIHH, ret, uval >> 16);
     } else {
-        /* branch over constant and store its address in R13 */
-        tcg_out_insn(s, RIL, BRASL, TCG_REG_R13, (6 + 8) >> 1);
-        /* 64-bit constant */
-        tcg_out32(s, arg >> 32);
-        tcg_out32(s, arg);
-        /* load constant to ret */
-        tcg_out_insn(s, RXY, LG, ret, TCG_REG_R13, 0, 0);
+        tcg_out_insn(s, RIL, IIHF, ret, uval);
     }
 }
 
-- 
1.7.0.1


* [Qemu-devel] [PATCH 26/62] tcg-s390: Implement sign and zero-extension operations.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (24 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 25/62] tcg-s390: Re-implement tcg_out_movi Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 27/62] tcg-s390: Implement bswap operations Richard Henderson
                   ` (36 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
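As a rough illustration of why the shift pairs and the mask-with-scratch
sequence can be replaced by single instructions, the sketch below models
the old and new sequences on plain integers.  The function names are
hypothetical.  It assumes a compiler where right-shifting a negative
signed value is arithmetic and where narrowing conversions wrap, as gcc
does; it is a model of the arithmetic, not of the emitted code.

```c
#include <assert.h>
#include <stdint.h>

/* Old LD_INT8 path: SLLG by 56, then SRAG by 56. */
static int64_t ext8s_shift_pair(uint64_t x)
{
    return (int64_t)(x << 56) >> 56;
}

/* New path: LGBR, modelled by truncating to int8_t and widening. */
static int64_t ext8s_direct(uint64_t x)
{
    return (int8_t)(x & 0xff);
}

/* Old byte-swapped LD_UINT16 path: load 0xffff into a scratch, then NGR. */
static uint64_t ext16u_mask(uint64_t x)
{
    return x & 0xffff;
}

/* New path: LLGHR, modelled by truncating to uint16_t and widening. */
static uint64_t ext16u_direct(uint64_t x)
{
    return (uint16_t)x;
}

/* Check that both formulations agree for one input. */
static void check_ext(uint64_t x)
{
    assert(ext8s_shift_pair(x) == ext8s_direct(x));
    assert(ext16u_mask(x) == ext16u_direct(x));
}
```

Besides saving an instruction, the direct forms no longer need R13 as a
scratch register for the 0xffff mask.
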
 tcg/s390/tcg-target.c |   94 +++++++++++++++++++++++++++++++++++++++++-------
 tcg/s390/tcg-target.h |   20 +++++-----
 2 files changed, 90 insertions(+), 24 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index fe83415..3f7d08d 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -70,10 +70,14 @@ typedef enum S390Opcode {
     RRE_DLR     = 0xb997,
     RRE_DSGFR   = 0xb91d,
     RRE_DSGR    = 0xb90d,
+    RRE_LGBR    = 0xb906,
     RRE_LCGR    = 0xb903,
     RRE_LGFR    = 0xb914,
+    RRE_LGHR    = 0xb907,
     RRE_LGR     = 0xb904,
+    RRE_LLGCR   = 0xb984,
     RRE_LLGFR   = 0xb916,
+    RRE_LLGHR   = 0xb985,
     RRE_MSGR    = 0xb90c,
     RRE_MSR     = 0xb252,
     RRE_NGR     = 0xb980,
@@ -491,6 +495,36 @@ static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg data,
     }
 }
 
+static inline void tgen_ext8s(TCGContext *s, TCGReg dest, TCGReg src)
+{
+    tcg_out_insn(s, RRE, LGBR, dest, src);
+}
+
+static inline void tgen_ext8u(TCGContext *s, TCGReg dest, TCGReg src)
+{
+    tcg_out_insn(s, RRE, LLGCR, dest, src);
+}
+
+static inline void tgen_ext16s(TCGContext *s, TCGReg dest, TCGReg src)
+{
+    tcg_out_insn(s, RRE, LGHR, dest, src);
+}
+
+static inline void tgen_ext16u(TCGContext *s, TCGReg dest, TCGReg src)
+{
+    tcg_out_insn(s, RRE, LLGHR, dest, src);
+}
+
+static inline void tgen_ext32s(TCGContext *s, TCGReg dest, TCGReg src)
+{
+    tcg_out_insn(s, RRE, LGFR, dest, src);
+}
+
+static inline void tgen_ext32u(TCGContext *s, TCGReg dest, TCGReg src)
+{
+    tcg_out_insn(s, RRE, LLGFR, dest, src);
+}
+
 static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
 {
     if (c > TCG_COND_GT) {
@@ -581,8 +615,8 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     }
 
 #if TARGET_LONG_BITS == 32
-    tcg_out_insn(s, RRE, LLGFR, arg1, addr_reg);
-    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
+    tgen_ext32u(s, arg1, addr_reg);
+    tgen_ext32u(s, arg0, addr_reg);
 #else
     tcg_out_mov(s, arg1, addr_reg);
     tcg_out_mov(s, arg0, addr_reg);
@@ -619,7 +653,7 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
 
     /* call load/store helper */
 #if TARGET_LONG_BITS == 32
-    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
+    tgen_ext32u(s, arg0, addr_reg);
 #else
     tcg_out_mov(s, arg0, addr_reg);
 #endif
@@ -635,15 +669,13 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
         /* sign extension */
         switch (opc) {
         case LD_INT8:
-            tcg_out_insn(s, RSY, SLLG, data_reg, arg0, SH64_REG_NONE, 56);
-            tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, SH64_REG_NONE, 56);
+            tgen_ext8s(s, data_reg, arg0);
             break;
         case LD_INT16:
-            tcg_out_insn(s, RSY, SLLG, data_reg, arg0, SH64_REG_NONE, 48);
-            tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, SH64_REG_NONE, 48);
+            tgen_ext16s(s, data_reg, arg0);
             break;
         case LD_INT32:
-            tcg_out_insn(s, RRE, LGFR, data_reg, arg0);
+            tgen_ext32s(s, data_reg, arg0);
             break;
         default:
             /* unsigned -> just copy */
@@ -741,8 +773,7 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 #else
         /* swapped unsigned halfword load with upper bits zeroed */
         tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, 0xffffL);
-        tcg_out_insn(s, RRE, NGR, data_reg, 13);
+        tgen_ext16u(s, data_reg, data_reg);
 #endif
         break;
     case LD_INT16:
@@ -751,8 +782,7 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 #else
         /* swapped sign-extended halfword load */
         tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
-        tcg_out_insn(s, RSY, SLLG, data_reg, data_reg, SH64_REG_NONE, 48);
-        tcg_out_insn(s, RSY, SRAG, data_reg, data_reg, SH64_REG_NONE, 48);
+        tgen_ext16s(s, data_reg, data_reg);
 #endif
         break;
     case LD_UINT32:
@@ -761,7 +791,7 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 #else
         /* swapped unsigned int load with upper bits zeroed */
         tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
-        tcg_out_insn(s, RRE, LLGFR, data_reg, data_reg);
+        tgen_ext32u(s, data_reg, data_reg);
 #endif
         break;
     case LD_INT32:
@@ -770,7 +800,7 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 #else
         /* swapped sign-extended int load */
         tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
-        tcg_out_insn(s, RRE, LGFR, data_reg, data_reg);
+        tgen_ext32s(s, data_reg, data_reg);
 #endif
         break;
     case LD_UINT64:
@@ -1063,6 +1093,30 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         op = RSY_SRAG;
         goto do_shift64;
 
+    case INDEX_op_ext8s_i32:
+    case INDEX_op_ext8s_i64:
+        tgen_ext8s(s, args[0], args[1]);
+        break;
+    case INDEX_op_ext16s_i32:
+    case INDEX_op_ext16s_i64:
+        tgen_ext16s(s, args[0], args[1]);
+        break;
+    case INDEX_op_ext32s_i64:
+        tgen_ext32s(s, args[0], args[1]);
+        break;
+
+    case INDEX_op_ext8u_i32:
+    case INDEX_op_ext8u_i64:
+        tgen_ext8u(s, args[0], args[1]);
+        break;
+    case INDEX_op_ext16u_i32:
+    case INDEX_op_ext16u_i64:
+        tgen_ext16u(s, args[0], args[1]);
+        break;
+    case INDEX_op_ext32u_i64:
+        tgen_ext32u(s, args[0], args[1]);
+        break;
+
     case INDEX_op_br:
         tgen_branch(s, S390_CC_ALWAYS, args[0]);
         break;
@@ -1170,6 +1224,11 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_shr_i32, { "r", "0", "Ri" } },
     { INDEX_op_sar_i32, { "r", "0", "Ri" } },
 
+    { INDEX_op_ext8s_i32, { "r", "r" } },
+    { INDEX_op_ext8u_i32, { "r", "r" } },
+    { INDEX_op_ext16s_i32, { "r", "r" } },
+    { INDEX_op_ext16u_i32, { "r", "r" } },
+
     { INDEX_op_brcond_i32, { "r", "r" } },
     { INDEX_op_setcond_i32, { "r", "r", "r" } },
 
@@ -1220,6 +1279,13 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_shr_i64, { "r", "r", "Ri" } },
     { INDEX_op_sar_i64, { "r", "r", "Ri" } },
 
+    { INDEX_op_ext8s_i64, { "r", "r" } },
+    { INDEX_op_ext8u_i64, { "r", "r" } },
+    { INDEX_op_ext16s_i64, { "r", "r" } },
+    { INDEX_op_ext16u_i64, { "r", "r" } },
+    { INDEX_op_ext32s_i64, { "r", "r" } },
+    { INDEX_op_ext32u_i64, { "r", "r" } },
+
     { INDEX_op_brcond_i64, { "r", "r" } },
     { INDEX_op_setcond_i64, { "r", "r", "r" } },
 #endif
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index b987a7e..76a13fc 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -50,10 +50,10 @@ typedef enum TCGReg {
 /* optional instructions */
 #define TCG_TARGET_HAS_div2_i32
 // #define TCG_TARGET_HAS_rot_i32
-// #define TCG_TARGET_HAS_ext8s_i32
-// #define TCG_TARGET_HAS_ext16s_i32
-// #define TCG_TARGET_HAS_ext8u_i32
-// #define TCG_TARGET_HAS_ext16u_i32
+#define TCG_TARGET_HAS_ext8s_i32
+#define TCG_TARGET_HAS_ext16s_i32
+#define TCG_TARGET_HAS_ext8u_i32
+#define TCG_TARGET_HAS_ext16u_i32
 // #define TCG_TARGET_HAS_bswap16_i32
 // #define TCG_TARGET_HAS_bswap32_i32
 // #define TCG_TARGET_HAS_not_i32
@@ -66,12 +66,12 @@ typedef enum TCGReg {
 
 #define TCG_TARGET_HAS_div2_i64
 // #define TCG_TARGET_HAS_rot_i64
-// #define TCG_TARGET_HAS_ext8s_i64
-// #define TCG_TARGET_HAS_ext16s_i64
-// #define TCG_TARGET_HAS_ext32s_i64
-// #define TCG_TARGET_HAS_ext8u_i64
-// #define TCG_TARGET_HAS_ext16u_i64
-// #define TCG_TARGET_HAS_ext32u_i64
+#define TCG_TARGET_HAS_ext8s_i64
+#define TCG_TARGET_HAS_ext16s_i64
+#define TCG_TARGET_HAS_ext32s_i64
+#define TCG_TARGET_HAS_ext8u_i64
+#define TCG_TARGET_HAS_ext16u_i64
+#define TCG_TARGET_HAS_ext32u_i64
 // #define TCG_TARGET_HAS_bswap16_i64
 // #define TCG_TARGET_HAS_bswap32_i64
 // #define TCG_TARGET_HAS_bswap64_i64
-- 
1.7.0.1


* [Qemu-devel] [PATCH 27/62] tcg-s390: Implement bswap operations.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (25 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 26/62] tcg-s390: Implement sign and zero-extension operations Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 28/62] tcg-s390: Implement rotates Richard Henderson
                   ` (35 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
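A small model of the bswap16 sequence may help: LRVR reverses all four
bytes of the low word, which leaves the swapped halfword in the upper
16 bits, and the SRL by 16 brings it back down.  The helper names are
hypothetical, and the model covers only the low 32 bits of the register.

```c
#include <assert.h>
#include <stdint.h>

/* LRVR: reverse the four bytes of a 32-bit value. */
static uint32_t lrvr(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* bswap16 as emitted by the patch: LRVR, then SRL by 16. */
static uint32_t bswap16_model(uint32_t x)
{
    /* TCG precondition: everything above the low 16 bits is zero. */
    assert((x & 0xffff0000u) == 0);
    return lrvr(x) >> 16;
}
```

With an input of 0xabcd the intermediate value is 0xcdab0000 and the
result is 0xcdab.
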
 tcg/s390/tcg-target.c |   24 ++++++++++++++++++++++++
 tcg/s390/tcg-target.h |   10 +++++-----
 2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 3f7d08d..7c7adb3 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -78,6 +78,8 @@ typedef enum S390Opcode {
     RRE_LLGCR   = 0xb984,
     RRE_LLGFR   = 0xb916,
     RRE_LLGHR   = 0xb985,
+    RRE_LRVR    = 0xb91f,
+    RRE_LRVGR   = 0xb90f,
     RRE_MSGR    = 0xb90c,
     RRE_MSR     = 0xb252,
     RRE_NGR     = 0xb980,
@@ -1117,6 +1119,21 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tgen_ext32u(s, args[0], args[1]);
         break;
 
+    case INDEX_op_bswap16_i32:
+    case INDEX_op_bswap16_i64:
+        /* The TCG bswap definition requires bits 0-47 already be zero.
+           Thus we don't need the G-type insns to implement bswap16_i64.  */
+        tcg_out_insn(s, RRE, LRVR, args[0], args[1]);
+        tcg_out_insn(s, RS, SRL, args[0], 0, SH32_REG_NONE, 16);
+        break;
+    case INDEX_op_bswap32_i32:
+    case INDEX_op_bswap32_i64:
+        tcg_out_insn(s, RRE, LRVR, args[0], args[1]);
+        break;
+    case INDEX_op_bswap64_i64:
+        tcg_out_insn(s, RRE, LRVGR, args[0], args[1]);
+        break;
+
     case INDEX_op_br:
         tgen_branch(s, S390_CC_ALWAYS, args[0]);
         break;
@@ -1229,6 +1246,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_ext16s_i32, { "r", "r" } },
     { INDEX_op_ext16u_i32, { "r", "r" } },
 
+    { INDEX_op_bswap16_i32, { "r", "r" } },
+    { INDEX_op_bswap32_i32, { "r", "r" } },
+
     { INDEX_op_brcond_i32, { "r", "r" } },
     { INDEX_op_setcond_i32, { "r", "r", "r" } },
 
@@ -1286,6 +1306,10 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_ext32s_i64, { "r", "r" } },
     { INDEX_op_ext32u_i64, { "r", "r" } },
 
+    { INDEX_op_bswap16_i64, { "r", "r" } },
+    { INDEX_op_bswap32_i64, { "r", "r" } },
+    { INDEX_op_bswap64_i64, { "r", "r" } },
+
     { INDEX_op_brcond_i64, { "r", "r" } },
     { INDEX_op_setcond_i64, { "r", "r", "r" } },
 #endif
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 76a13fc..76f1d03 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -54,8 +54,8 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_ext16s_i32
 #define TCG_TARGET_HAS_ext8u_i32
 #define TCG_TARGET_HAS_ext16u_i32
-// #define TCG_TARGET_HAS_bswap16_i32
-// #define TCG_TARGET_HAS_bswap32_i32
+#define TCG_TARGET_HAS_bswap16_i32
+#define TCG_TARGET_HAS_bswap32_i32
 // #define TCG_TARGET_HAS_not_i32
 #define TCG_TARGET_HAS_neg_i32
 // #define TCG_TARGET_HAS_andc_i32
@@ -72,9 +72,9 @@ typedef enum TCGReg {
 #define TCG_TARGET_HAS_ext8u_i64
 #define TCG_TARGET_HAS_ext16u_i64
 #define TCG_TARGET_HAS_ext32u_i64
-// #define TCG_TARGET_HAS_bswap16_i64
-// #define TCG_TARGET_HAS_bswap32_i64
-// #define TCG_TARGET_HAS_bswap64_i64
+#define TCG_TARGET_HAS_bswap16_i64
+#define TCG_TARGET_HAS_bswap32_i64
+#define TCG_TARGET_HAS_bswap64_i64
 // #define TCG_TARGET_HAS_not_i64
 #define TCG_TARGET_HAS_neg_i64
 // #define TCG_TARGET_HAS_andc_i64
-- 
1.7.0.1


* [Qemu-devel] [PATCH 28/62] tcg-s390: Implement rotates.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (26 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 27/62] tcg-s390: Implement bswap operations Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 29/62] tcg-s390: Use LOAD COMPLEMENT for negate Richard Henderson
                   ` (34 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
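Since only rotate-left instructions (RLL, RLLG) are used, a right rotate
is expressed as a left rotate by the negated count.  The sketch below
checks that identity on plain integers, including the claim that a
32-bit negate is enough for the 64-bit case because only the low bits of
the count matter.  All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, uint32_t n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

static uint32_t rotr32(uint32_t x, uint32_t n)
{
    n &= 31;
    return n ? (x >> n) | (x << (32 - n)) : x;
}

static uint64_t rotl64(uint64_t x, uint32_t n)
{
    n &= 63;
    return n ? (x << n) | (x >> (64 - n)) : x;
}

static uint64_t rotr64(uint64_t x, uint32_t n)
{
    n &= 63;
    return n ? (x >> n) | (x << (64 - n)) : x;
}

/* Register-count rotr as emitted: LCR into a scratch, then rotate left. */
static uint32_t rotr32_via_rotl(uint32_t x, uint32_t n)
{
    return rotl32(x, 0u - n);
}

/* The negate is only 32 bits wide; rotl64 looks at the low 6 bits. */
static uint64_t rotr64_via_rotl(uint64_t x, uint64_t n)
{
    uint32_t neg = 0u - (uint32_t)n;
    return rotl64(x, neg);
}

/* Compare against the direct right rotates for every count 0..63. */
static void check_rotates(void)
{
    uint32_t n;
    for (n = 0; n < 64; n++) {
        assert(rotr32_via_rotl(0x12345678u, n) == rotr32(0x12345678u, n));
        assert(rotr64_via_rotl(0x0123456789abcdefull, n)
               == rotr64(0x0123456789abcdefull, n));
    }
}
```

The constant-count cases in the patch use the same identity, folded at
translation time as (32 - n) & 31 and (64 - n) & 63.
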
 tcg/s390/tcg-target.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 tcg/s390/tcg-target.h |    4 ++--
 2 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 7c7adb3..f85063e 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -100,6 +100,8 @@ typedef enum S390Opcode {
     RR_SR       = 0x1b,
     RR_XR       = 0x17,
 
+    RSY_RLL     = 0xeb1d,
+    RSY_RLLG    = 0xeb1c,
     RSY_SLLG    = 0xeb0d,
     RSY_SRAG    = 0xeb0a,
     RSY_SRLG    = 0xeb0c,
@@ -1095,6 +1097,44 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         op = RSY_SRAG;
         goto do_shift64;
 
+    case INDEX_op_rotl_i32:
+        /* ??? Using tcg_out_sh64 here for the format; it is a 32-bit rol.  */
+        if (const_args[2]) {
+            tcg_out_sh64(s, RSY_RLL, args[0], args[1], SH32_REG_NONE, args[2]);
+        } else {
+            tcg_out_sh64(s, RSY_RLL, args[0], args[1], args[2], 0);
+        }
+        break;
+    case INDEX_op_rotr_i32:
+        if (const_args[2]) {
+            tcg_out_sh64(s, RSY_RLL, args[0], args[1],
+                         SH32_REG_NONE, (32 - args[2]) & 31);
+        } else {
+            tcg_out_insn(s, RR, LCR, TCG_REG_R13, args[2]);
+            tcg_out_sh64(s, RSY_RLL, args[0], args[1], TCG_REG_R13, 0);
+        }
+        break;
+
+    case INDEX_op_rotl_i64:
+        if (const_args[2]) {
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
+                         SH64_REG_NONE, args[2]);
+        } else {
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], args[2], 0);
+        }
+        break;
+    case INDEX_op_rotr_i64:
+        if (const_args[2]) {
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
+                         SH64_REG_NONE, (64 - args[2]) & 63);
+        } else {
+            /* We can use the smaller 32-bit negate because only the
+               low 6 bits are examined for the rotate.  */
+            tcg_out_insn(s, RR, LCR, TCG_REG_R13, args[2]);
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], TCG_REG_R13, 0);
+        }
+        break;
+
     case INDEX_op_ext8s_i32:
     case INDEX_op_ext8s_i64:
         tgen_ext8s(s, args[0], args[1]);
@@ -1241,6 +1281,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_shr_i32, { "r", "0", "Ri" } },
     { INDEX_op_sar_i32, { "r", "0", "Ri" } },
 
+    { INDEX_op_rotl_i32, { "r", "r", "Ri" } },
+    { INDEX_op_rotr_i32, { "r", "r", "Ri" } },
+
     { INDEX_op_ext8s_i32, { "r", "r" } },
     { INDEX_op_ext8u_i32, { "r", "r" } },
     { INDEX_op_ext16s_i32, { "r", "r" } },
@@ -1299,6 +1342,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_shr_i64, { "r", "r", "Ri" } },
     { INDEX_op_sar_i64, { "r", "r", "Ri" } },
 
+    { INDEX_op_rotl_i64, { "r", "r", "Ri" } },
+    { INDEX_op_rotr_i64, { "r", "r", "Ri" } },
+
     { INDEX_op_ext8s_i64, { "r", "r" } },
     { INDEX_op_ext8u_i64, { "r", "r" } },
     { INDEX_op_ext16s_i64, { "r", "r" } },
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 76f1d03..0af4d38 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -49,7 +49,7 @@ typedef enum TCGReg {
 
 /* optional instructions */
 #define TCG_TARGET_HAS_div2_i32
-// #define TCG_TARGET_HAS_rot_i32
+#define TCG_TARGET_HAS_rot_i32
 #define TCG_TARGET_HAS_ext8s_i32
 #define TCG_TARGET_HAS_ext16s_i32
 #define TCG_TARGET_HAS_ext8u_i32
@@ -65,7 +65,7 @@ typedef enum TCGReg {
 // #define TCG_TARGET_HAS_nor_i32
 
 #define TCG_TARGET_HAS_div2_i64
-// #define TCG_TARGET_HAS_rot_i64
+#define TCG_TARGET_HAS_rot_i64
 #define TCG_TARGET_HAS_ext8s_i64
 #define TCG_TARGET_HAS_ext16s_i64
 #define TCG_TARGET_HAS_ext32s_i64
-- 
1.7.0.1


* [Qemu-devel] [PATCH 29/62] tcg-s390: Use LOAD COMPLEMENT for negate.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (27 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 28/62] tcg-s390: Implement rotates Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 30/62] tcg-s390: Tidy unimplemented opcodes Richard Henderson
                   ` (33 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   10 ++--------
 1 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index f85063e..97ac66d 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -1028,16 +1028,10 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_neg_i32:
-        /* FIXME: optimize args[0] != args[1] case */
-        tcg_out_insn(s, RR, LR, 13, args[1]);
-        tcg_out_movi(s, TCG_TYPE_I32, args[0], 0);
-        tcg_out_insn(s, RR, SR, args[0], 13);
+        tcg_out_insn(s, RR, LCR, args[0], args[1]);
         break;
     case INDEX_op_neg_i64:
-        /* FIXME: optimize args[0] != args[1] case */
-        tcg_out_mov(s, TCG_REG_R13, args[1]);
-        tcg_out_movi(s, TCG_TYPE_I64, args[0], 0);
-        tcg_out_insn(s, RRE, SGR, args[0], TCG_REG_R13);
+        tcg_out_insn(s, RRE, LCGR, args[0], args[1]);
         break;
 
     case INDEX_op_mul_i32:
-- 
1.7.0.1


* [Qemu-devel] [PATCH 30/62] tcg-s390: Tidy unimplemented opcodes.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (28 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 29/62] tcg-s390: Use LOAD COMPLEMENT for negate Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 31/62] tcg-s390: Use the extended-immediate facility for add/sub Richard Henderson
                   ` (32 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   25 ++++++++++---------------
 1 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 97ac66d..cf70cc2 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -911,11 +911,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
-    case INDEX_op_jmp:
-        /* XXX */
-        tcg_abort();
-        break;
-
     case INDEX_op_ld8u_i32:
         tcg_out_ldst(s, 0, RXY_LLC, args[0], args[1], args[2]);
         break;
@@ -977,16 +972,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_st(s, TCG_TYPE_I64, args[0], args[1], args[2]);
         break;
 
-    case INDEX_op_mov_i32:
-        /* XXX */
-        tcg_abort();
-        break;
-
-    case INDEX_op_movi_i32:
-        /* XXX */
-        tcg_abort();
-        break;
-
     case INDEX_op_add_i32:
         if (const_args[2]) {
             tcg_out_insn(s, RI, AHI, args[0], args[2]);
@@ -1234,6 +1219,16 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_qemu_st(s, args, LD_UINT64);
         break;
 
+    case INDEX_op_mov_i32:
+    case INDEX_op_mov_i64:
+    case INDEX_op_movi_i32:
+    case INDEX_op_movi_i64:
+        /* These are always emitted by TCG directly.  */
+    case INDEX_op_jmp:
+        /* This one is obsolete and never emitted.  */
+        tcg_abort();
+        break;
+
     default:
         fprintf(stderr,"unimplemented opc 0x%x\n",opc);
         tcg_abort();
-- 
1.7.0.1


* [Qemu-devel] [PATCH 31/62] tcg-s390: Use the extended-immediate facility for add/sub.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (29 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 30/62] tcg-s390: Tidy unimplemented opcodes Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 32/62] tcg-s390: Implement immediate ANDs Richard Henderson
                   ` (31 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

This gives us 32-bit immediate addends.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
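The two new constraint flags differ only at the edges of the 32-bit
range: subtraction of a constant is emitted as addition of its negation,
so the 'J' constraint has to test the negated value.  The following
sketch models the two predicates and the short/long form choice made by
tgen32_addi and tgen64_addi.  The function names are hypothetical, and
the explicit INT64_MIN guard is an addition of this sketch to avoid
signed overflow in standalone C; the patch itself compares
-val == (int32_t)-val.

```c
#include <assert.h>
#include <stdint.h>

static int fits_s16(int64_t v) { return v >= INT16_MIN && v <= INT16_MAX; }
static int fits_s32(int64_t v) { return v >= INT32_MIN && v <= INT32_MAX; }

/* 'I' constraint (TCG_CT_CONST_S32): the constant itself fits in 32 bits. */
static int match_s32(int64_t v)
{
    return fits_s32(v);
}

/* 'J' constraint (TCG_CT_CONST_N32): the negated constant fits in 32 bits. */
static int match_n32(int64_t v)
{
    return v != INT64_MIN && fits_s32(-v);
}

/* Returns 1 for the short RI form (AHI/AGHI), 0 for the RIL form
   (AFI/AGFI).  The caller has already matched the constraint. */
static int addi_uses_short_form(int64_t v)
{
    assert(fits_s32(v));
    return fits_s16(v);
}
```

For example, -2147483648 matches 'I' but not 'J', while 2147483648
matches 'J' but not 'I'.
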
 tcg/s390/tcg-target.c |   68 +++++++++++++++++++++++++++++++++++++-----------
 1 files changed, 52 insertions(+), 16 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index cf70cc2..caa2d0d 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -33,14 +33,16 @@
     do { } while (0)
 #endif
 
-#define TCG_CT_CONST_S16                0x100
-#define TCG_CT_CONST_U12                0x200
+#define TCG_CT_CONST_S32                0x100
+#define TCG_CT_CONST_N32                0x200
 
 /* All of the following instructions are prefixed with their instruction
    format, and are defined as 8- or 16-bit quantities, even when the two
    halves of the 16-bit quantity may appear 32 bits apart in the insn.
    This makes it easy to copy the values from the tables in Appendix B.  */
 typedef enum S390Opcode {
+    RIL_AFI     = 0xc209,
+    RIL_AGFI    = 0xc208,
     RIL_BRASL   = 0xc005,
     RIL_BRCL    = 0xc004,
     RIL_LARL    = 0xc000,
@@ -288,7 +290,11 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         break;
     case 'I':
         ct->ct &= ~TCG_CT_REG;
-        ct->ct |= TCG_CT_CONST_S16;
+        ct->ct |= TCG_CT_CONST_S32;
+        break;
+    case 'J':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_N32;
         break;
     default:
         break;
@@ -305,10 +311,12 @@ static inline int tcg_target_const_match(tcg_target_long val,
 {
     int ct = arg_ct->ct;
 
-    if ((ct & TCG_CT_CONST) ||
-       ((ct & TCG_CT_CONST_S16) && val == (int16_t)val) ||
-       ((ct & TCG_CT_CONST_U12) && val == (val & 0xfff))) {
+    if (ct & TCG_CT_CONST) {
         return 1;
+    } else if (ct & TCG_CT_CONST_S32) {
+        return val == (int32_t)val;
+    } else if (ct & TCG_CT_CONST_N32) {
+        return -val == (int32_t)-val;
     }
 
     return 0;
@@ -529,6 +537,24 @@ static inline void tgen_ext32u(TCGContext *s, TCGReg dest, TCGReg src)
     tcg_out_insn(s, RRE, LLGFR, dest, src);
 }
 
+static inline void tgen32_addi(TCGContext *s, TCGReg dest, tcg_target_long val)
+{
+    if (val == (int16_t)val) {
+        tcg_out_insn(s, RI, AHI, dest, val);
+    } else {
+        tcg_out_insn(s, RIL, AFI, dest, val);
+    }
+}
+
+static inline void tgen64_addi(TCGContext *s, TCGReg dest, tcg_target_long val)
+{
+    if (val == (int16_t)val) {
+        tcg_out_insn(s, RI, AGHI, dest, val);
+    } else {
+        tcg_out_insn(s, RIL, AGFI, dest, val);
+    }
+}
+
 static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
 {
     if (c > TCG_COND_GT) {
@@ -974,22 +1000,32 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_add_i32:
         if (const_args[2]) {
-            tcg_out_insn(s, RI, AHI, args[0], args[2]);
+            tgen32_addi(s, args[0], args[2]);
         } else {
             tcg_out_insn(s, RR, AR, args[0], args[2]);
         }
         break;
-
     case INDEX_op_add_i64:
-        tcg_out_insn(s, RRE, AGR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_addi(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, AGR, args[0], args[2]);
+        }
         break;
 
     case INDEX_op_sub_i32:
-        tcg_out_insn(s, RR, SR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen32_addi(s, args[0], -args[2]);
+        } else {
+            tcg_out_insn(s, RR, SR, args[0], args[2]);
+        }
         break;
-
     case INDEX_op_sub_i64:
-        tcg_out_insn(s, RRE, SGR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_addi(s, args[0], -args[2]);
+        } else {
+            tcg_out_insn(s, RRE, SGR, args[0], args[2]);
+        }
         break;
 
     case INDEX_op_and_i32:
@@ -1254,8 +1290,8 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_st16_i32, { "r", "r" } },
     { INDEX_op_st_i32, { "r", "r" } },
 
-    { INDEX_op_add_i32, { "r", "0", "rI" } },
-    { INDEX_op_sub_i32, { "r", "0", "r" } },
+    { INDEX_op_add_i32, { "r", "0", "ri" } },
+    { INDEX_op_sub_i32, { "r", "0", "ri" } },
     { INDEX_op_mul_i32, { "r", "0", "r" } },
 
     { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
@@ -1315,8 +1351,8 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_st32_i64, { "r", "r" } },
     { INDEX_op_st_i64, { "r", "r" } },
 
-    { INDEX_op_add_i64, { "r", "0", "r" } },
-    { INDEX_op_sub_i64, { "r", "0", "r" } },
+    { INDEX_op_add_i64, { "r", "0", "rI" } },
+    { INDEX_op_sub_i64, { "r", "0", "rJ" } },
     { INDEX_op_mul_i64, { "r", "0", "r" } },
 
     { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
-- 
1.7.0.1


* [Qemu-devel] [PATCH 32/62] tcg-s390: Implement immediate ANDs.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (30 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 31/62] tcg-s390: Use the extended-immediate facility for add/sub Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 33/62] tcg-s390: Implement immediate ORs Richard Henderson
                   ` (30 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  138 +++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 127 insertions(+), 11 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index caa2d0d..2fd58bd 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -51,6 +51,8 @@ typedef enum S390Opcode {
     RIL_LGFI    = 0xc001,
     RIL_LLIHF   = 0xc00e,
     RIL_LLILF   = 0xc00f,
+    RIL_NIHF    = 0xc00a,
+    RIL_NILF    = 0xc00b,
 
     RI_AGHI     = 0xa70b,
     RI_AHI      = 0xa70a,
@@ -64,6 +66,10 @@ typedef enum S390Opcode {
     RI_LLIHL    = 0xa50d,
     RI_LLILH    = 0xa50e,
     RI_LLILL    = 0xa50f,
+    RI_NIHH     = 0xa504,
+    RI_NIHL     = 0xa505,
+    RI_NILH     = 0xa506,
+    RI_NILL     = 0xa507,
 
     RRE_AGR     = 0xb908,
     RRE_CGR     = 0xb920,
@@ -555,6 +561,113 @@ static inline void tgen64_addi(TCGContext *s, TCGReg dest, tcg_target_long val)
     }
 }
 
+static void tgen32_andi(TCGContext *s, TCGReg dest, uint32_t val)
+{
+    /* Zero-th, look for no-op.  */
+    if (val == -1) {
+        return;
+    }
+
+    /* First, look for the zero-extensions.  */
+    if (val == 0xff) {
+        tgen_ext8u(s, dest, dest);
+        return;
+    }
+    if (val == 0xffff) {
+        tgen_ext16u(s, dest, dest);
+        return;
+    }
+
+    /* Second, try all 32-bit insns that can perform it in one go.  */
+    if ((val & 0xffff0000) == 0xffff0000) {
+        tcg_out_insn(s, RI, NILL, dest, val);
+        return;
+    }
+    if ((val & 0x0000ffff) == 0x0000ffff) {
+        tcg_out_insn(s, RI, NILH, dest, val >> 16);
+        return;
+    }
+
+    /* Lastly, perform the entire operation with a 48-bit insn.  */
+    tcg_out_insn(s, RIL, NILF, dest, val);
+}
+
+static void tgen64_andi(TCGContext *s, TCGReg dest, tcg_target_ulong val)
+{
+    static const S390Opcode ni_insns[4] = {
+        RI_NILL, RI_NILH, RI_NIHL, RI_NIHH
+    };
+    static const S390Opcode nif_insns[2] = {
+        RIL_NILF, RIL_NIHF
+    };
+
+    int i;
+
+    /* Zero-th, look for no-op.  */
+    if (val == -1) {
+        return;
+    }
+
+    /* First, look for the zero-extensions.  */
+    if (val == 0xff) {
+        tgen_ext8u(s, dest, dest);
+        return;
+    }
+    if (val == 0xffff) {
+        tgen_ext16u(s, dest, dest);
+        return;
+    }
+    if (val == 0xffffffff) {
+        tgen_ext32u(s, dest, dest);
+        return;
+    }
+
+    /* Second, try all 32-bit insns that can perform it in one go.  */
+    for (i = 0; i < 4; i++) {
+        tcg_target_ulong mask = ~(0xffffull << i*16);
+        if ((val & mask) == mask) {
+            tcg_out_insn_RI(s, ni_insns[i], dest, val >> i*16);
+            return;
+        }
+    }
+
+    /* Third, try all 48-bit insns that can perform it in one go.  */
+    for (i = 0; i < 2; i++) {
+        tcg_target_ulong mask = ~(0xffffffffull << i*32);
+        if ((val & mask) == mask) {
+            tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
+            return;
+        }
+    }
+
+    /* Fourth, look for masks that can be loaded with one instruction
+       into a register.  This is slightly smaller than using two 48-bit
+       masks, as below.  */
+    for (i = 0; i < 4; i++) {
+        tcg_target_ulong mask = ~(0xffffull << i*16);
+        if ((val & mask) == 0) {
+            tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_R13, val);
+            tcg_out_insn(s, RRE, NGR, dest, TCG_REG_R13);
+            return;
+        }
+    }
+
+    for (i = 0; i < 2; i++) {
+        tcg_target_ulong mask = ~(0xffffffffull << i*32);
+        if ((val & mask) == 0) {
+            tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_R13, val);
+            tcg_out_insn(s, RRE, NGR, dest, TCG_REG_R13);
+            return;
+        }
+    }
+
+    /* Last, perform the AND via sequential modifications to the
+       high and low parts.  Do this via recursion to handle 16-bit
+       vs 32-bit masks in each half.  */
+    tgen64_andi(s, dest, val | 0xffffffff00000000ull);
+    tgen64_andi(s, dest, val | 0x00000000ffffffffull);
+}
+
 static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
 {
     if (c > TCG_COND_GT) {
@@ -655,13 +768,8 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, SH64_REG_NONE,
                  TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
 
-    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
-                 TARGET_PAGE_MASK | ((1 << s_bits) - 1));
-    tcg_out_insn(s, RRE, NGR, arg0, TCG_REG_R13);
-
-    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
-                 (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
-    tcg_out_insn(s, RRE, NGR, arg1, TCG_REG_R13);
+    tgen64_andi(s, arg0, TARGET_PAGE_MASK | ((1 << s_bits) - 1));
+    tgen64_andi(s, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
 
     if (is_store) {
         tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
@@ -1029,7 +1137,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_and_i32:
-        tcg_out_insn(s, RR, NR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen32_andi(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RR, NR, args[0], args[2]);
+        }
         break;
     case INDEX_op_or_i32:
         tcg_out_insn(s, RR, OR, args[0], args[2]);
@@ -1039,7 +1151,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_and_i64:
-        tcg_out_insn(s, RRE, NGR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_andi(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, NGR, args[0], args[2]);
+        }
         break;
     case INDEX_op_or_i64:
         tcg_out_insn(s, RRE, OGR, args[0], args[2]);
@@ -1297,7 +1413,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
 
-    { INDEX_op_and_i32, { "r", "0", "r" } },
+    { INDEX_op_and_i32, { "r", "0", "ri" } },
     { INDEX_op_or_i32, { "r", "0", "r" } },
     { INDEX_op_xor_i32, { "r", "0", "r" } },
     { INDEX_op_neg_i32, { "r", "r" } },
@@ -1358,7 +1474,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
 
-    { INDEX_op_and_i64, { "r", "0", "r" } },
+    { INDEX_op_and_i64, { "r", "0", "ri" } },
     { INDEX_op_or_i64, { "r", "0", "r" } },
     { INDEX_op_xor_i64, { "r", "0", "r" } },
     { INDEX_op_neg_i64, { "r", "r" } },
-- 
1.7.0.1
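
The selection order in tgen64_andi above can be restated as a
standalone classifier.  The sketch below is not part of the patch:
AndStrategy and classify_and_mask are hypothetical names, and the
function only reports which step of the cascade would fire for a given
mask instead of emitting any instruction.

```c
#include <stdint.h>

typedef enum {
    AND_NOP,         /* mask is all ones, nothing to emit */
    AND_ZEXT,        /* mask is 0xff, 0xffff or 0xffffffff */
    AND_ONE_16,      /* one NIxx insn: only one 16-bit field is changed */
    AND_ONE_32,      /* one NIxF insn: only one 32-bit half is changed */
    AND_VIA_REG,     /* load the mask into a temporary, then NGR */
    AND_TWO_HALVES   /* recurse on the low half and the high half */
} AndStrategy;

static AndStrategy classify_and_mask(uint64_t val)
{
    int i;

    if (val == UINT64_MAX) {
        return AND_NOP;
    }
    if (val == 0xff || val == 0xffff || val == 0xffffffffull) {
        return AND_ZEXT;
    }
    /* Every bit outside one 16-bit field is set: a 32-bit insn suffices. */
    for (i = 0; i < 4; i++) {
        uint64_t keep = ~(0xffffull << (i * 16));
        if ((val & keep) == keep) {
            return AND_ONE_16;
        }
    }
    /* Every bit outside one 32-bit half is set: a 48-bit insn suffices. */
    for (i = 0; i < 2; i++) {
        uint64_t keep = ~(0xffffffffull << (i * 32));
        if ((val & keep) == keep) {
            return AND_ONE_32;
        }
    }
    /* All set bits lie inside one field, so the mask itself is cheap to load. */
    for (i = 0; i < 4; i++) {
        if ((val & ~(0xffffull << (i * 16))) == 0) {
            return AND_VIA_REG;
        }
    }
    for (i = 0; i < 2; i++) {
        if ((val & ~(0xffffffffull << (i * 32))) == 0) {
            return AND_VIA_REG;
        }
    }
    return AND_TWO_HALVES;
}
```

For example, 0xffffffff12345678 leaves the high half untouched, so a
single NILF is enough, whereas 0x00ff00ff00ff00ff falls through to the
two-halves recursion.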


* [Qemu-devel] [PATCH 33/62] tcg-s390: Implement immediate ORs.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (31 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 32/62] tcg-s390: Implement immediate ANDs Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 34/62] tcg-s390: Implement immediate MULs Richard Henderson
                   ` (29 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   63 +++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 2fd58bd..2a9d64d 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -53,6 +53,8 @@ typedef enum S390Opcode {
     RIL_LLILF   = 0xc00f,
     RIL_NIHF    = 0xc00a,
     RIL_NILF    = 0xc00b,
+    RIL_OIHF    = 0xc00c,
+    RIL_OILF    = 0xc00d,
 
     RI_AGHI     = 0xa70b,
     RI_AHI      = 0xa70a,
@@ -70,6 +72,10 @@ typedef enum S390Opcode {
     RI_NIHL     = 0xa505,
     RI_NILH     = 0xa506,
     RI_NILL     = 0xa507,
+    RI_OIHH     = 0xa508,
+    RI_OIHL     = 0xa509,
+    RI_OILH     = 0xa50a,
+    RI_OILL     = 0xa50b,
 
     RRE_AGR     = 0xb908,
     RRE_CGR     = 0xb920,
@@ -668,6 +674,47 @@ static void tgen64_andi(TCGContext *s, TCGReg dest, tcg_target_ulong val)
     tgen64_andi(s, dest, val | 0x00000000ffffffffull);
 }
 
+static void tgen64_ori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
+{
+    static const S390Opcode oi_insns[4] = {
+        RI_OILL, RI_OILH, RI_OIHL, RI_OIHH
+    };
+    static const S390Opcode nif_insns[2] = {
+        RIL_OILF, RIL_OIHF
+    };
+
+    int i;
+
+    /* Zero-th, look for no-op.  */
+    if (val == 0) {
+        return;
+    }
+
+    /* First, try all 32-bit insns that can perform it in one go.  */
+    for (i = 0; i < 4; i++) {
+        tcg_target_ulong mask = (0xffffull << i*16);
+        if ((val & mask) != 0 && (val & ~mask) == 0) {
+            tcg_out_insn_RI(s, oi_insns[i], dest, val >> i*16);
+            return;
+        }
+    }
+
+    /* Second, try all 48-bit insns that can perform it in one go.  */
+    for (i = 0; i < 2; i++) {
+        tcg_target_ulong mask = (0xffffffffull << i*32);
+        if ((val & mask) != 0 && (val & ~mask) == 0) {
+            tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
+            return;
+        }
+    }
+
+    /* Last, perform the OR via sequential modifications to the
+       high and low parts.  Do this via recursion to handle 16-bit
+       vs 32-bit masks in each half.  */
+    tgen64_ori(s, dest, val & 0x00000000ffffffffull);
+    tgen64_ori(s, dest, val & 0xffffffff00000000ull);
+}
+
 static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
 {
     if (c > TCG_COND_GT) {
@@ -1144,7 +1191,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
     case INDEX_op_or_i32:
-        tcg_out_insn(s, RR, OR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_ori(s, args[0], args[2] & 0xffffffff);
+        } else {
+            tcg_out_insn(s, RR, OR, args[0], args[2]);
+        }
         break;
     case INDEX_op_xor_i32:
         tcg_out_insn(s, RR, XR, args[0], args[2]);
@@ -1158,7 +1209,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
     case INDEX_op_or_i64:
-        tcg_out_insn(s, RRE, OGR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_ori(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, OGR, args[0], args[2]);
+        }
         break;
     case INDEX_op_xor_i64:
         tcg_out_insn(s, RRE, XGR, args[0], args[2]);
@@ -1414,7 +1469,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
 
     { INDEX_op_and_i32, { "r", "0", "ri" } },
-    { INDEX_op_or_i32, { "r", "0", "r" } },
+    { INDEX_op_or_i32, { "r", "0", "ri" } },
     { INDEX_op_xor_i32, { "r", "0", "r" } },
     { INDEX_op_neg_i32, { "r", "r" } },
 
@@ -1475,7 +1530,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
 
     { INDEX_op_and_i64, { "r", "0", "ri" } },
-    { INDEX_op_or_i64, { "r", "0", "r" } },
+    { INDEX_op_or_i64, { "r", "0", "ri" } },
     { INDEX_op_xor_i64, { "r", "0", "r" } },
     { INDEX_op_neg_i64, { "r", "r" } },
 
-- 
1.7.0.1
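
To see what the cascade in tgen64_ori above costs for a given constant,
the same tests can be replayed while counting bytes instead of emitting
code.  This is an illustrative sketch, not code from the patch:
or_imm_code_bytes is a hypothetical name, and the sizes (4 bytes for
the 32-bit OIxx forms, 6 bytes for the 48-bit OIxF forms) are taken
from the "32-bit insns" and "48-bit insns" comments in the function.

```c
#include <stdint.h>

/* Hypothetical helper: number of code bytes the OR-immediate cascade
   would emit for val.  */
static int or_imm_code_bytes(uint64_t val)
{
    int i;

    /* OR with zero is a no-op.  */
    if (val == 0) {
        return 0;
    }
    /* All set bits inside one 16-bit field: one 32-bit insn.  */
    for (i = 0; i < 4; i++) {
        uint64_t field = 0xffffull << (i * 16);
        if ((val & field) != 0 && (val & ~field) == 0) {
            return 4;
        }
    }
    /* All set bits inside one 32-bit half: one 48-bit insn.  */
    for (i = 0; i < 2; i++) {
        uint64_t field = 0xffffffffull << (i * 32);
        if ((val & field) != 0 && (val & ~field) == 0) {
            return 6;
        }
    }
    /* Bits in both halves: handle each half on its own.  Each half is
       nonzero and confined to 32 bits, so the recursion ends here.  */
    return or_imm_code_bytes(val & 0x00000000ffffffffull)
         + or_imm_code_bytes(val & 0xffffffff00000000ull);
}
```

Under these assumptions 0x0000000100000001 costs 8 bytes (two 16-bit
forms), while an all-ones constant costs 12 bytes (OILF plus OIHF).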


* [Qemu-devel] [PATCH 34/62] tcg-s390: Implement immediate MULs.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (32 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 33/62] tcg-s390: Implement immediate ORs Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 35/62] tcg-s390: Implement immediate XORs Richard Henderson
                   ` (28 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   28 ++++++++++++++++++++++++----
 1 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 2a9d64d..1bc9b4c 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -51,6 +51,8 @@ typedef enum S390Opcode {
     RIL_LGFI    = 0xc001,
     RIL_LLIHF   = 0xc00e,
     RIL_LLILF   = 0xc00f,
+    RIL_MSFI    = 0xc201,
+    RIL_MSGFI   = 0xc200,
     RIL_NIHF    = 0xc00a,
     RIL_NILF    = 0xc00b,
     RIL_OIHF    = 0xc00c,
@@ -68,6 +70,8 @@ typedef enum S390Opcode {
     RI_LLIHL    = 0xa50d,
     RI_LLILH    = 0xa50e,
     RI_LLILL    = 0xa50f,
+    RI_MGHI     = 0xa70d,
+    RI_MHI      = 0xa70c,
     RI_NIHH     = 0xa504,
     RI_NIHL     = 0xa505,
     RI_NILH     = 0xa506,
@@ -1227,10 +1231,26 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_mul_i32:
-        tcg_out_insn(s, RRE, MSR, args[0], args[2]);
+        if (const_args[2]) {
+            if (args[2] == (int16_t)args[2]) {
+                tcg_out_insn(s, RI, MHI, args[0], args[2]);
+            } else {
+                tcg_out_insn(s, RIL, MSFI, args[0], args[2]);
+            }
+        } else {
+            tcg_out_insn(s, RRE, MSR, args[0], args[2]);
+        }
         break;
     case INDEX_op_mul_i64:
-        tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
+        if (const_args[2]) {
+            if (args[2] == (int16_t)args[2]) {
+                tcg_out_insn(s, RI, MGHI, args[0], args[2]);
+            } else {
+                tcg_out_insn(s, RIL, MSGFI, args[0], args[2]);
+            }
+        } else {
+            tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
+        }
         break;
 
     case INDEX_op_div2_i32:
@@ -1463,7 +1483,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_add_i32, { "r", "0", "ri" } },
     { INDEX_op_sub_i32, { "r", "0", "ri" } },
-    { INDEX_op_mul_i32, { "r", "0", "r" } },
+    { INDEX_op_mul_i32, { "r", "0", "ri" } },
 
     { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
@@ -1524,7 +1544,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_add_i64, { "r", "0", "rI" } },
     { INDEX_op_sub_i64, { "r", "0", "rJ" } },
-    { INDEX_op_mul_i64, { "r", "0", "r" } },
+    { INDEX_op_mul_i64, { "r", "0", "rI" } },
 
     { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
-- 
1.7.0.1
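
Both mul cases above pick between a 16-bit and a 32-bit immediate form
with the same round-trip test.  A minimal sketch of that test in
isolation follows; MulImmForm and mul_imm_form are hypothetical names
used only for illustration.

```c
#include <stdint.h>

typedef enum {
    MUL_IMM16,   /* short form: MHI or MGHI */
    MUL_IMM32    /* long form: MSFI or MSGFI */
} MulImmForm;

/* The short form applies when the constant survives a round trip
   through int16_t, i.e. it lies in [-32768, 32767].  */
static MulImmForm mul_imm_form(int64_t imm)
{
    return imm == (int16_t)imm ? MUL_IMM16 : MUL_IMM32;
}
```

The boundary values behave as expected: 32767 and -32768 take the
short form, while 32768 and -32769 need the long one.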


* [Qemu-devel] [PATCH 35/62] tcg-s390: Implement immediate XORs.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (33 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 34/62] tcg-s390: Implement immediate MULs Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 36/62] tcg-s390: Icache flush is a no-op Richard Henderson
                   ` (27 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   45 +++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 1bc9b4c..ec8c84d 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -57,6 +57,8 @@ typedef enum S390Opcode {
     RIL_NILF    = 0xc00b,
     RIL_OIHF    = 0xc00c,
     RIL_OILF    = 0xc00d,
+    RIL_XIHF    = 0xc006,
+    RIL_XILF    = 0xc007,
 
     RI_AGHI     = 0xa70b,
     RI_AHI      = 0xa70a,
@@ -719,6 +721,33 @@ static void tgen64_ori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
     tgen64_ori(s, dest, val & 0xffffffff00000000ull);
 }
 
+static void tgen64_xori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
+{
+    tcg_target_long sval = val;
+
+    /* Zero-th, look for no-op.  */
+    if (val == 0) {
+        return;
+    }
+
+    /* First, look for 64-bit values for which it is better to load the
+       value first and perform the xor via registers.  This is true for
+       any 32-bit negative value, where the high 32-bits get flipped too.  */
+    if (sval < 0 && sval == (int32_t)sval) {
+        tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_R13, sval);
+        tcg_out_insn(s, RRE, XGR, dest, TCG_REG_R13);
+        return;
+    }
+
+    /* Second, perform the xor by parts.  */
+    if (val & 0xffffffff) {
+        tcg_out_insn(s, RIL, XILF, dest, val);
+    }
+    if (val > 0xffffffff) {
+        tcg_out_insn(s, RIL, XIHF, dest, val >> 32);
+    }
+}
+
 static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
 {
     if (c > TCG_COND_GT) {
@@ -1202,7 +1231,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
     case INDEX_op_xor_i32:
-        tcg_out_insn(s, RR, XR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_xori(s, args[0], args[2] & 0xffffffff);
+        } else {
+            tcg_out_insn(s, RR, XR, args[0], args[2]);
+        }
         break;
 
     case INDEX_op_and_i64:
@@ -1220,7 +1253,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
     case INDEX_op_xor_i64:
-        tcg_out_insn(s, RRE, XGR, args[0], args[2]);
+        if (const_args[2]) {
+            tgen64_xori(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, XGR, args[0], args[2]);
+        }
         break;
 
     case INDEX_op_neg_i32:
@@ -1490,7 +1527,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_and_i32, { "r", "0", "ri" } },
     { INDEX_op_or_i32, { "r", "0", "ri" } },
-    { INDEX_op_xor_i32, { "r", "0", "r" } },
+    { INDEX_op_xor_i32, { "r", "0", "ri" } },
     { INDEX_op_neg_i32, { "r", "r" } },
 
     { INDEX_op_shl_i32, { "r", "0", "Ri" } },
@@ -1551,7 +1588,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_and_i64, { "r", "0", "ri" } },
     { INDEX_op_or_i64, { "r", "0", "ri" } },
-    { INDEX_op_xor_i64, { "r", "0", "r" } },
+    { INDEX_op_xor_i64, { "r", "0", "ri" } },
     { INDEX_op_neg_i64, { "r", "r" } },
 
     { INDEX_op_shl_i64, { "r", "r", "Ri" } },
-- 
1.7.0.1
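
The decision made by tgen64_xori above can be modelled without any
code emission.  The sketch below is illustrative only: XorPlan,
plan_xor_imm and xor_by_parts are hypothetical names.  The first
function mirrors the order of the tests in the patch, and the second
checks that XORing the two 32-bit halves separately gives the same
result as one 64-bit XOR.

```c
#include <stdint.h>

typedef enum {
    XOR_NOP,          /* constant is zero */
    XOR_VIA_REG,      /* negative 32-bit value: load it, then XGR */
    XOR_LOW_ONLY,     /* XILF */
    XOR_HIGH_ONLY,    /* XIHF */
    XOR_BOTH_HALVES   /* XILF followed by XIHF */
} XorPlan;

static XorPlan plan_xor_imm(uint64_t val)
{
    int64_t sval = (int64_t)val;
    int low = (val & 0xffffffffull) != 0;
    int high = val > 0xffffffffull;

    if (val == 0) {
        return XOR_NOP;
    }
    /* Sign-extended negative constant: the high 32 bits are all ones.  */
    if (sval < 0 && sval == (int32_t)sval) {
        return XOR_VIA_REG;
    }
    if (low && high) {
        return XOR_BOTH_HALVES;
    }
    return low ? XOR_LOW_ONLY : XOR_HIGH_ONLY;
}

/* XOR applied half by half, as the by-parts path does.  */
static uint64_t xor_by_parts(uint64_t x, uint64_t val)
{
    x ^= val & 0xffffffffull;     /* low half */
    x ^= (val >> 32) << 32;       /* high half */
    return x;
}
```

Note that the xor_i32 case masks its constant with 0xffffffff first,
so a 32-bit -1 arrives as 0xffffffff and takes the XOR_LOW_ONLY path,
not the register path.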


* [Qemu-devel] [PATCH 36/62] tcg-s390: Icache flush is a no-op.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (34 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 35/62] tcg-s390: Implement immediate XORs Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 37/62] tcg-s390: Define TCG_TMP0 Richard Henderson
                   ` (26 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Before gcc 4.2, __builtin___clear_cache doesn't exist, and
afterward the gcc s390 backend implements it as nothing, so
flush_icache_range can simply be left empty.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.h |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 0af4d38..fae8ed7 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -95,9 +95,4 @@ enum {
 
 static inline void flush_icache_range(unsigned long start, unsigned long stop)
 {
-#if QEMU_GNUC_PREREQ(4, 1)
-    __builtin___clear_cache((char *) start, (char *) stop);
-#else
-#error not implemented
-#endif
 }
-- 
1.7.0.1


* [Qemu-devel] [PATCH 37/62] tcg-s390: Define TCG_TMP0.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (35 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 36/62] tcg-s390: Icache flush is a no-op Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 38/62] tcg-s390: Tidy regset initialization; use R14 as temporary Richard Henderson
                   ` (25 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Use a define for the temp register instead of hard-coding it.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   49 ++++++++++++++++++++++++++-----------------------
 1 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index ec8c84d..ee2e879 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -36,6 +36,9 @@
 #define TCG_CT_CONST_S32                0x100
 #define TCG_CT_CONST_N32                0x200
 
+#define TCG_TMP0                        TCG_REG_R13
+
+
 /* All of the following instructions are prefixed with their instruction
    format, and are defined as 8- or 16-bit quantities, even when the two
    halves of the 16-bit quantity may appear 32 bits apart in the insn.
@@ -491,7 +494,7 @@ static void tcg_out_ldst(TCGContext *s, S390Opcode opc_rx, S390Opcode opc_rxy,
     if (ofs < -0x80000 || ofs >= 0x80000) {
         /* Combine the low 16 bits of the offset with the actual load insn;
            the high 48 bits must come from an immediate load.  */
-        index = TCG_REG_R13;
+        index = TCG_TMP0;
         tcg_out_movi(s, TCG_TYPE_PTR, index, ofs & ~0xffff);
         ofs &= 0xffff;
     }
@@ -658,8 +661,8 @@ static void tgen64_andi(TCGContext *s, TCGReg dest, tcg_target_ulong val)
     for (i = 0; i < 4; i++) {
         tcg_target_ulong mask = ~(0xffffull << i*16);
         if ((val & mask) == 0) {
-            tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_R13, val);
-            tcg_out_insn(s, RRE, NGR, dest, TCG_REG_R13);
+            tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, val);
+            tcg_out_insn(s, RRE, NGR, dest, TCG_TMP0);
             return;
         }
     }
@@ -667,8 +670,8 @@ static void tgen64_andi(TCGContext *s, TCGReg dest, tcg_target_ulong val)
     for (i = 0; i < 2; i++) {
         tcg_target_ulong mask = ~(0xffffffffull << i*32);
         if ((val & mask) == 0) {
-            tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_R13, val);
-            tcg_out_insn(s, RRE, NGR, dest, TCG_REG_R13);
+            tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, val);
+            tcg_out_insn(s, RRE, NGR, dest, TCG_TMP0);
             return;
         }
     }
@@ -734,8 +737,8 @@ static void tgen64_xori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
        value first and perform the xor via registers.  This is true for
        any 32-bit negative value, where the high 32-bits get flipped too.  */
     if (sval < 0 && sval == (int32_t)sval) {
-        tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_R13, sval);
-        tcg_out_insn(s, RRE, XGR, dest, TCG_REG_R13);
+        tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, sval);
+        tcg_out_insn(s, RRE, XGR, dest, TCG_TMP0);
         return;
     }
 
@@ -792,8 +795,8 @@ static void tgen_gotoi(TCGContext *s, int cc, tcg_target_long dest)
     } else if (off == (int32_t)off) {
         tcg_out_insn(s, RIL, BRCL, cc, off);
     } else {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, dest);
-        tcg_out_insn(s, RR, BCR, cc, TCG_REG_R13);
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, dest);
+        tcg_out_insn(s, RR, BCR, cc, TCG_TMP0);
     }
 }
 
@@ -815,8 +818,8 @@ static void tgen_calli(TCGContext *s, tcg_target_long dest)
     if (off == (int32_t)off) {
         tcg_out_insn(s, RIL, BRASL, TCG_REG_R14, off);
     } else {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13, dest);
-        tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_REG_R13);
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, dest);
+        tcg_out_insn(s, RR, BASR, TCG_REG_R14, TCG_TMP0);
     }
 }
 
@@ -852,13 +855,13 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
     tgen64_andi(s, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
 
     if (is_store) {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
                      offsetof(CPUState, tlb_table[mem_index][0].addr_write));
     } else {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
                      offsetof(CPUState, tlb_table[mem_index][0].addr_read));
     }
-    tcg_out_insn(s, RRE, AGR, arg1, TCG_REG_R13);
+    tcg_out_insn(s, RRE, AGR, arg1, TCG_TMP0);
 
     tcg_out_insn(s, RRE, AGR, arg1, TCG_AREG0);
 
@@ -1103,16 +1106,16 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                                    (tcg_target_long)s->code_ptr) >> 1;
             if (off == (int32_t)off) {
                 /* load address relative to PC */
-                tcg_out_insn(s, RIL, LARL, TCG_REG_R13, off);
+                tcg_out_insn(s, RIL, LARL, TCG_TMP0, off);
             } else {
                 /* too far for larl */
-                tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R13,
+                tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
                              (tcg_target_long)(s->tb_next + args[0]));
             }
             /* load address stored at s->tb_next + args[0] */
-            tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R13, TCG_REG_R13, 0);
+            tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP0, TCG_TMP0, 0);
             /* and go there */
-            tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R13);
+            tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_TMP0);
         }
         s->tb_next_offset[args[0]] = s->code_ptr - s->code_buf;
         break;
@@ -1353,8 +1356,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
             tcg_out_sh64(s, RSY_RLL, args[0], args[1],
                          SH32_REG_NONE, (32 - args[2]) & 31);
         } else {
-            tcg_out_insn(s, RR, LCR, TCG_REG_R13, args[2]);
-            tcg_out_sh64(s, RSY_RLL, args[0], args[1], TCG_REG_R13, 0);
+            tcg_out_insn(s, RR, LCR, TCG_TMP0, args[2]);
+            tcg_out_sh64(s, RSY_RLL, args[0], args[1], TCG_TMP0, 0);
         }
         break;
 
@@ -1373,8 +1376,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         } else {
             /* We can use the smaller 32-bit negate because only the
                low 6 bits are examined for the rotate.  */
-            tcg_out_insn(s, RR, LCR, TCG_REG_R13, args[2]);
-            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], TCG_REG_R13, 0);
+            tcg_out_insn(s, RR, LCR, TCG_TMP0, args[2]);
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], TCG_TMP0, 0);
         }
         break;
 
@@ -1638,7 +1641,7 @@ void tcg_target_init(TCGContext *s)
 
     tcg_regset_clear(s->reserved_regs);
     /* frequently used as a temporary */
-    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13);
+    tcg_regset_set_reg(s->reserved_regs, TCG_TMP0);
     /* another temporary */
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_R12);
     /* XXX many insns can't be used with R0, so we better avoid it for now */
-- 
1.7.0.1


* [Qemu-devel] [PATCH 38/62] tcg-s390: Tidy regset initialization; use R14 as temporary.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (36 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 37/62] tcg-s390: Define TCG_TMP0 Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 39/62] tcg-s390: Rearrange register allocation order Richard Henderson
                   ` (24 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   26 ++++++++++++--------------
 1 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index ee2e879..a26c963 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -36,7 +36,7 @@
 #define TCG_CT_CONST_S32                0x100
 #define TCG_CT_CONST_N32                0x200
 
-#define TCG_TMP0                        TCG_REG_R13
+#define TCG_TMP0                        TCG_REG_R14
 
 
 /* All of the following instructions are prefixed with their instruction
@@ -1630,24 +1630,22 @@ void tcg_target_init(TCGContext *s)
 
     tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffff);
     tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffff);
-    tcg_regset_set32(tcg_target_call_clobber_regs, 0,
-                     (1 << TCG_REG_R0) |
-                     (1 << TCG_REG_R1) |
-                     (1 << TCG_REG_R2) |
-                     (1 << TCG_REG_R3) |
-                     (1 << TCG_REG_R4) |
-                     (1 << TCG_REG_R5) |
-                     (1 << TCG_REG_R14)); /* link register */
+
+    tcg_regset_clear(tcg_target_call_clobber_regs);
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R0);
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R1);
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R2);
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R3);
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R4);
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R5);
+    /* The return register can be considered call-clobbered.  */
+    tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_R14);
 
     tcg_regset_clear(s->reserved_regs);
-    /* frequently used as a temporary */
     tcg_regset_set_reg(s->reserved_regs, TCG_TMP0);
-    /* another temporary */
-    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R12);
     /* XXX many insns can't be used with R0, so we better avoid it for now */
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_R0);
-    /* The stack pointer.  */
-    tcg_regset_set_reg(s->reserved_regs, TCG_REG_R15);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_CALL_STACK);
 
     tcg_add_target_add_op_defs(s390_op_defs);
 }
-- 
1.7.0.1
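
The regset change above is intended to be behaviour-preserving for the
call-clobbered set: the per-register calls build the same bitmask as
the old OR of shifted ones.  The sketch below models that with a plain
integer.  RegSet, regset_set_reg and the three builder functions are
hypothetical stand-ins, not the TCG definitions, and the reserved set
assumes TCG_REG_CALL_STACK is R15, as the removed "stack pointer" line
suggests.

```c
#include <stdint.h>

typedef uint32_t RegSet;   /* hypothetical stand-in for TCGRegSet */

enum { R0 = 0, R1, R2, R3, R4, R5, R14 = 14, R15 = 15 };

static void regset_set_reg(RegSet *set, int reg)
{
    *set |= (RegSet)1 << reg;
}

/* Old style: one expression OR-ing the shifted bits together.  */
static RegSet call_clobber_old(void)
{
    return (1u << R0) | (1u << R1) | (1u << R2) | (1u << R3)
         | (1u << R4) | (1u << R5) | (1u << R14);
}

/* New style: clear the set, then add one register per call.  */
static RegSet call_clobber_new(void)
{
    static const int regs[] = { R0, R1, R2, R3, R4, R5, R14 };
    RegSet set = 0;
    unsigned i;

    for (i = 0; i < sizeof(regs) / sizeof(regs[0]); i++) {
        regset_set_reg(&set, regs[i]);
    }
    return set;
}

/* Reserved registers after the patch: TCG_TMP0 (now R14), R0, and the
   stack pointer.  R12 and R13 are no longer reserved.  */
static RegSet reserved_after_patch(void)
{
    RegSet set = 0;

    regset_set_reg(&set, R14);
    regset_set_reg(&set, R0);
    regset_set_reg(&set, R15);
    return set;
}
```

Both builders yield 0x403f.  The reserved set shrinks to 0xc001, which
frees R12 and R13 for the register allocator.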


* [Qemu-devel] [PATCH 39/62] tcg-s390: Rearrange register allocation order.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (37 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 38/62] tcg-s390: Tidy regset initialization; use R14 as temporary Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 40/62] tcg-s390: Tidy goto_tb Richard Henderson
                   ` (23 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Try to avoid conflicting with the outgoing function call arguments.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   23 +++++++++++++----------
 1 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index a26c963..eb57e24 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -186,22 +186,25 @@ static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
 };
 #endif
 
+/* Since R6 is a potential argument register, choose it last of the
+   call-saved registers.  Likewise prefer the call-clobbered registers
+   in reverse order to maximize the chance of avoiding the arguments.  */
 static const int tcg_target_reg_alloc_order[] = {
-    TCG_REG_R6,
-    TCG_REG_R7,
-    TCG_REG_R8,
-    TCG_REG_R9,
-    TCG_REG_R10,
-    TCG_REG_R11,
-    TCG_REG_R12,
     TCG_REG_R13,
+    TCG_REG_R12,
+    TCG_REG_R11,
+    TCG_REG_R10,
+    TCG_REG_R9,
+    TCG_REG_R8,
+    TCG_REG_R7,
+    TCG_REG_R6,
     TCG_REG_R14,
     TCG_REG_R0,
     TCG_REG_R1,
-    TCG_REG_R2,
-    TCG_REG_R3,
-    TCG_REG_R4,
     TCG_REG_R5,
+    TCG_REG_R4,
+    TCG_REG_R3,
+    TCG_REG_R2,
 };
 
 static const int tcg_target_call_iarg_regs[] = {
-- 
1.7.0.1

* [Qemu-devel] [PATCH 40/62] tcg-s390: Tidy goto_tb.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (38 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 39/62] tcg-s390: Rearrange register allocation order Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 41/62] tcg-s390: Allocate the code_gen_buffer near the main program Richard Henderson
                   ` (22 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Add tcg_out_ld_abs, which uses the LOAD RELATIVE LONG instructions
(LRL, LGRL) when the address is in range, and use it for goto_tb.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
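Note: the reachability test in tcg_out_ld_abs can be sketched in
isolation as below.  This is a minimal illustration rather than code
from the patch; pcrel_disp_fits and split_abs are hypothetical names,
and the sketch assumes two's-complement conversion from uint64_t to
int64_t.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compute the signed halfword displacement from the
   instruction at PC to TARGET, and report whether it fits the 32-bit
   immediate field of an RIL-format instruction such as LRL or LGRL.
   Both addresses are assumed to be halfword aligned.  */
static bool pcrel_disp_fits(uint64_t pc, uint64_t target, int32_t *disp)
{
    /* Divide rather than shift so negative distances are well defined.  */
    int64_t d = (int64_t)(target - pc) / 2;

    if (d < INT32_MIN || d > INT32_MAX) {
        return false;
    }
    *disp = (int32_t)d;
    return true;
}

/* Hypothetical helper mirroring the fallback path: split an absolute
   address into a base with the low 16 bits cleared plus a small
   non-negative displacement.  */
static void split_abs(uint64_t addr, uint64_t *base, uint32_t *low)
{
    *base = addr & ~UINT64_C(0xffff);
    *low = (uint32_t)(addr & 0xffff);
}
```

A displacement of exactly 4GB forward does not fit, which is why the
fallback path is still needed for distant addresses.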
 tcg/s390/tcg-target.c |   34 ++++++++++++++++++++++------------
 1 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index eb57e24..627f7b7 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -48,12 +48,14 @@ typedef enum S390Opcode {
     RIL_AGFI    = 0xc208,
     RIL_BRASL   = 0xc005,
     RIL_BRCL    = 0xc004,
-    RIL_LARL    = 0xc000,
     RIL_IIHF    = 0xc008,
     RIL_IILF    = 0xc009,
+    RIL_LARL    = 0xc000,
     RIL_LGFI    = 0xc001,
+    RIL_LGRL    = 0xc408,
     RIL_LLIHF   = 0xc00e,
     RIL_LLILF   = 0xc00f,
+    RIL_LRL     = 0xc40d,
     RIL_MSFI    = 0xc201,
     RIL_MSGFI   = 0xc200,
     RIL_NIHF    = 0xc00a,
@@ -531,6 +533,24 @@ static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg data,
     }
 }
 
+/* load data from an absolute host address */
+static void tcg_out_ld_abs(TCGContext *s, TCGType type, TCGReg dest, void *abs)
+{
+    tcg_target_long addr = (tcg_target_long)abs;
+    tcg_target_long disp = (addr - (tcg_target_long)s->code_ptr) >> 1;
+
+    if (disp == (int32_t)disp) {
+        if (type == TCG_TYPE_I32) {
+            tcg_out_insn(s, RIL, LRL, dest, disp);
+        } else {
+            tcg_out_insn(s, RIL, LGRL, dest, disp);
+        }
+    } else {
+        tcg_out_movi(s, TCG_TYPE_PTR, dest, addr & ~0xffff);
+        tcg_out_ld(s, type, dest, dest, addr & 0xffff);
+    }
+}
+
 static inline void tgen_ext8s(TCGContext *s, TCGReg dest, TCGReg src)
 {
     tcg_out_insn(s, RRE, LGBR, dest, src);
@@ -1105,18 +1125,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         if (s->tb_jmp_offset) {
             tcg_abort();
         } else {
-            tcg_target_long off = ((tcg_target_long)(s->tb_next + args[0]) -
-                                   (tcg_target_long)s->code_ptr) >> 1;
-            if (off == (int32_t)off) {
-                /* load address relative to PC */
-                tcg_out_insn(s, RIL, LARL, TCG_TMP0, off);
-            } else {
-                /* too far for larl */
-                tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
-                             (tcg_target_long)(s->tb_next + args[0]));
-            }
             /* load address stored at s->tb_next + args[0] */
-            tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP0, TCG_TMP0, 0);
+            tcg_out_ld_abs(s, TCG_TYPE_PTR, TCG_TMP0, s->tb_next + args[0]);
             /* and go there */
             tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_TMP0);
         }
-- 
1.7.0.1

* [Qemu-devel] [PATCH 41/62] tcg-s390: Allocate the code_gen_buffer near the main program.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (39 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 40/62] tcg-s390: Tidy goto_tb Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 42/62] tcg-s390: Rearrange qemu_ld/st to avoid register copy Richard Henderson
                   ` (21 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

This allows the use of direct calls to the helpers,
and a direct branch back to the epilogue.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
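Note: the range reasoning behind the 3GB cap can be sketched as below.
This is an illustration only; clamp_buffer_size and
buffer_reachable_from are hypothetical names, and the sketch assumes
that start + size does not wrap around.

```c
#include <stdbool.h>
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)

/* Hypothetical helper: cap the requested buffer size at 3GB, as the
   patch does, leaving slop inside the +-4GB branch range.  */
static uint64_t clamp_buffer_size(uint64_t size)
{
    return size > 3 * GiB ? 3 * GiB : size;
}

/* Hypothetical helper: true if every byte of [start, start + size) lies
   within 4GB of FROM, the address of a branch or call instruction.  */
static bool buffer_reachable_from(uint64_t from, uint64_t start,
                                  uint64_t size)
{
    uint64_t end = start + size;                          /* one past the end */
    uint64_t back = from > start ? from - start : 0;      /* farthest backward */
    uint64_t fwd = end > from ? end - from : 0;           /* farthest forward */

    return back <= 4 * GiB && fwd < 4 * GiB;
}
```

With the buffer mapped at 0x90000000, a caller located just below that
address can still reach the far end of a full 3GB buffer.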
 exec.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/exec.c b/exec.c
index bb3dcad..7bbfe60 100644
--- a/exec.c
+++ b/exec.c
@@ -519,6 +519,13 @@ static void code_gen_alloc(unsigned long tb_size)
         start = (void *) 0x01000000UL;
         if (code_gen_buffer_size > 16 * 1024 * 1024)
             code_gen_buffer_size = 16 * 1024 * 1024;
+#elif defined(__s390x__)
+        /* Map the buffer so that we can use direct calls and branches.  */
+        /* We have a +- 4GB range on the branches; leave some slop.  */
+        if (code_gen_buffer_size > (3ul * 1024 * 1024 * 1024)) {
+            code_gen_buffer_size = 3ul * 1024 * 1024 * 1024;
+        }
+        start = (void *)0x90000000UL;
 #endif
         code_gen_buffer = mmap(start, code_gen_buffer_size,
                                PROT_WRITE | PROT_READ | PROT_EXEC,
-- 
1.7.0.1

* [Qemu-devel] [PATCH 42/62] tcg-s390: Rearrange qemu_ld/st to avoid register copy.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (40 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 41/62] tcg-s390: Allocate the code_gen_buffer near the main program Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 43/62] tcg-s390: Tidy tcg_prepare_qemu_ldst Richard Henderson
                   ` (20 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Split out qemu_ld/st_direct with full address components.
Avoid copy from addr_reg to R2 for 64-bit guests.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
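Note: the store path below picks the short RX encoding when the
displacement allows it and the long-displacement RXY encoding
otherwise.  A minimal sketch of that choice follows; DispForm and
pick_disp_form are hypothetical names that do not appear in the patch.

```c
/* Hypothetical classification of a displacement by the instruction
   format that can encode it.  */
typedef enum {
    FORM_RX,    /* 12-bit unsigned displacement */
    FORM_RXY,   /* 20-bit signed displacement */
    FORM_NONE   /* needs the address computed into a register first */
} DispForm;

static DispForm pick_disp_form(long disp)
{
    if (disp >= 0 && disp < 0x1000) {
        return FORM_RX;
    }
    if (disp >= -0x80000 && disp < 0x80000) {
        return FORM_RXY;
    }
    return FORM_NONE;
}
```

The byte-swapping stores (STRVH, STRV, STRVG) exist only in the RXY
format, so the patch does not test the displacement for them.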
 tcg/s390/tcg-target.c |  270 ++++++++++++++++++++++++++-----------------------
 1 files changed, 145 insertions(+), 125 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 627f7b7..5d2efaa 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -846,14 +846,123 @@ static void tgen_calli(TCGContext *s, tcg_target_long dest)
     }
 }
 
+static void tcg_out_qemu_ld_direct(TCGContext *s, int opc, TCGReg data,
+                                   TCGReg base, TCGReg index, int disp)
+{
+#ifdef TARGET_WORDS_BIGENDIAN
+    const int bswap = 0;
+#else
+    const int bswap = 1;
+#endif
+    switch (opc) {
+    case LD_UINT8:
+        tcg_out_insn(s, RXY, LLGC, data, base, index, disp);
+        break;
+    case LD_INT8:
+        tcg_out_insn(s, RXY, LGB, data, base, index, disp);
+        break;
+    case LD_UINT16:
+        if (bswap) {
+            /* swapped unsigned halfword load with upper bits zeroed */
+            tcg_out_insn(s, RXY, LRVH, data, base, index, disp);
+            tgen_ext16u(s, data, data);
+        } else {
+            tcg_out_insn(s, RXY, LLGH, data, base, index, disp);
+        }
+        break;
+    case LD_INT16:
+        if (bswap) {
+            /* swapped sign-extended halfword load */
+            tcg_out_insn(s, RXY, LRVH, data, base, index, disp);
+            tgen_ext16s(s, data, data);
+        } else {
+            tcg_out_insn(s, RXY, LGH, data, base, index, disp);
+        }
+        break;
+    case LD_UINT32:
+        if (bswap) {
+            /* swapped unsigned int load with upper bits zeroed */
+            tcg_out_insn(s, RXY, LRV, data, base, index, disp);
+            tgen_ext32u(s, data, data);
+        } else {
+            tcg_out_insn(s, RXY, LLGF, data, base, index, disp);
+        }
+        break;
+    case LD_INT32:
+        if (bswap) {
+            /* swapped sign-extended int load */
+            tcg_out_insn(s, RXY, LRV, data, base, index, disp);
+            tgen_ext32s(s, data, data);
+        } else {
+            tcg_out_insn(s, RXY, LGF, data, base, index, disp);
+        }
+        break;
+    case LD_UINT64:
+        if (bswap) {
+            tcg_out_insn(s, RXY, LRVG, data, base, index, disp);
+        } else {
+            tcg_out_insn(s, RXY, LG, data, base, index, disp);
+        }
+        break;
+    default:
+        tcg_abort();
+    }
+}
+
+static void tcg_out_qemu_st_direct(TCGContext *s, int opc, TCGReg data,
+                                   TCGReg base, TCGReg index, int disp)
+{
+#ifdef TARGET_WORDS_BIGENDIAN
+    const int bswap = 0;
+#else
+    const int bswap = 1;
+#endif
+    switch (opc) {
+    case LD_UINT8:
+        if (disp >= 0 && disp < 0x1000) {
+            tcg_out_insn(s, RX, STC, data, base, index, disp);
+        } else {
+            tcg_out_insn(s, RXY, STCY, data, base, index, disp);
+        }
+        break;
+    case LD_UINT16:
+        if (bswap) {
+            tcg_out_insn(s, RXY, STRVH, data, base, index, disp);
+        } else if (disp >= 0 && disp < 0x1000) {
+            tcg_out_insn(s, RX, STH, data, base, index, disp);
+        } else {
+            tcg_out_insn(s, RXY, STHY, data, base, index, disp);
+        }
+        break;
+    case LD_UINT32:
+        if (bswap) {
+            tcg_out_insn(s, RXY, STRV, data, base, index, disp);
+        } else if (disp >= 0 && disp < 0x1000) {
+            tcg_out_insn(s, RX, ST, data, base, index, disp);
+        } else {
+            tcg_out_insn(s, RXY, STY, data, base, index, disp);
+        }
+        break;
+    case LD_UINT64:
+        if (bswap) {
+            tcg_out_insn(s, RXY, STRVG, data, base, index, disp);
+        } else {
+            tcg_out_insn(s, RXY, STG, data, base, index, disp);
+        }
+        break;
+    default:
+        tcg_abort();
+    }
+}
+
 #if defined(CONFIG_SOFTMMU)
-static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
-                                  int mem_index, int opc,
+static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
+                                  TCGReg addr_reg, int mem_index, int opc,
                                   uint16_t **label2_ptr_p, int is_store)
-  {
-    int arg0 = TCG_REG_R2;
-    int arg1 = TCG_REG_R3;
-    int arg2 = TCG_REG_R4;
+{
+    const TCGReg arg0 = TCG_REG_R2;
+    const TCGReg arg1 = TCG_REG_R3;
+    const TCGReg arg2 = TCG_REG_R4;
     int s_bits;
     uint16_t *label1_ptr;
 
@@ -947,13 +1056,6 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
                      - offsetof(CPUTLBEntry, addr_read));
     }
 
-#if TARGET_LONG_BITS == 32
-    /* zero upper 32 bits */
-    tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
-#else
-    /* just copy */
-    tcg_out_mov(s, arg0, addr_reg);
-#endif
     tcg_out_insn(s, RRE, AGR, arg0, arg1);
 }
 
@@ -963,150 +1065,68 @@ static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
     *(label2_ptr + 1) = ((unsigned long)s->code_ptr -
                          (unsigned long)label2_ptr) >> 1;
 }
-
-#else /* CONFIG_SOFTMMU */
-
-static void tcg_prepare_qemu_ldst(TCGContext* s, int data_reg, int addr_reg,
-                                int mem_index, int opc,
-                                uint16_t **label2_ptr_p, int is_store)
-{
-    int arg0 = TCG_REG_R2;
-
-    /* user mode, no address translation required */
-    if (TARGET_LONG_BITS == 32) {
-        tcg_out_insn(s, RRE, LLGFR, arg0, addr_reg);
-    } else {
-        tcg_out_mov(s, arg0, addr_reg);
-    }
-}
-
-static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
-{
-}
-
 #endif /* CONFIG_SOFTMMU */
 
 /* load data with address translation (if applicable)
    and endianness conversion */
 static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 {
-    int addr_reg, data_reg, mem_index;
-    int arg0 = TCG_REG_R2;
+    TCGReg addr_reg, data_reg;
+#if defined(CONFIG_SOFTMMU)
+    int mem_index;
     uint16_t *label2_ptr;
+#endif
 
     data_reg = *args++;
     addr_reg = *args++;
-    mem_index = *args;
 
-    dprintf("tcg_out_qemu_ld opc %d data_reg %d addr_reg %d mem_index %d\n"
-            opc, data_reg, addr_reg, mem_index);
+#if defined(CONFIG_SOFTMMU)
+    mem_index = *args;
 
     tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
                           opc, &label2_ptr, 0);
 
-    switch (opc) {
-    case LD_UINT8:
-        tcg_out_insn(s, RXY, LLGC, data_reg, arg0, 0, 0);
-        break;
-    case LD_INT8:
-        tcg_out_insn(s, RXY, LGB, data_reg, arg0, 0, 0);
-        break;
-    case LD_UINT16:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RXY, LLGH, data_reg, arg0, 0, 0);
-#else
-        /* swapped unsigned halfword load with upper bits zeroed */
-        tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
-        tgen_ext16u(s, data_reg, data_reg);
-#endif
-        break;
-    case LD_INT16:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RXY, LGH, data_reg, arg0, 0, 0);
-#else
-        /* swapped sign-extended halfword load */
-        tcg_out_insn(s, RXY, LRVH, data_reg, arg0, 0, 0);
-        tgen_ext16s(s, data_reg, data_reg);
-#endif
-        break;
-    case LD_UINT32:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RXY, LLGF, data_reg, arg0, 0, 0);
-#else
-        /* swapped unsigned int load with upper bits zeroed */
-        tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
-        tgen_ext32u(s, data_reg, data_reg);
-#endif
-        break;
-    case LD_INT32:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RXY, LGF, data_reg, arg0, 0, 0);
-#else
-        /* swapped sign-extended int load */
-        tcg_out_insn(s, RXY, LRV, data_reg, arg0, 0, 0);
-        tgen_ext32s(s, data_reg, data_reg);
-#endif
-        break;
-    case LD_UINT64:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RXY, LG, data_reg, arg0, 0, 0);
-#else
-        tcg_out_insn(s, RXY, LRVG, data_reg, arg0, 0, 0);
-#endif
-        break;
-    default:
-        tcg_abort();
-    }
+    tcg_out_qemu_ld_direct(s, opc, data_reg, TCG_REG_R2, 0, 0);
 
     tcg_finish_qemu_ldst(s, label2_ptr);
+#else
+    if (TARGET_LONG_BITS == 32) {
+        tgen_ext32u(s, TCG_TMP0, addr_reg);
+        tcg_out_qemu_ld_direct(s, opc, data_reg, TCG_TMP0, 0, 0);
+    } else {
+        tcg_out_qemu_ld_direct(s, opc, data_reg, addr_reg, 0, 0);
+    }
+#endif
 }
 
 static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
 {
-    int addr_reg, data_reg, mem_index;
+    TCGReg addr_reg, data_reg;
+#if defined(CONFIG_SOFTMMU)
+    int mem_index;
     uint16_t *label2_ptr;
-    int arg0 = TCG_REG_R2;
+#endif
 
     data_reg = *args++;
     addr_reg = *args++;
-    mem_index = *args;
 
-    dprintf("tcg_out_qemu_st opc %d data_reg %d addr_reg %d mem_index %d\n"
-            opc, data_reg, addr_reg, mem_index);
+#if defined(CONFIG_SOFTMMU)
+    mem_index = *args;
 
     tcg_prepare_qemu_ldst(s, data_reg, addr_reg, mem_index,
                           opc, &label2_ptr, 1);
 
-    switch (opc) {
-    case LD_UINT8:
-        tcg_out_insn(s, RX, STC, data_reg, arg0, 0, 0);
-        break;
-    case LD_UINT16:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RX, STH, data_reg, arg0, 0, 0);
-#else
-        tcg_out_insn(s, RXY, STRVH, data_reg, arg0, 0, 0);
-#endif
-        break;
-    case LD_UINT32:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RX, ST, data_reg, arg0, 0, 0);
-#else
-        tcg_out_insn(s, RXY, STRV, data_reg, arg0, 0, 0);
-#endif
-        break;
-    case LD_UINT64:
-#ifdef TARGET_WORDS_BIGENDIAN
-        tcg_out_insn(s, RXY, STG, data_reg, arg0, 0, 0);
-#else
-        tcg_out_insn(s, RXY, STRVG, data_reg, arg0, 0, 0);
-#endif
-        break;
-    default:
-        tcg_abort();
-    }
+    tcg_out_qemu_st_direct(s, opc, data_reg, TCG_REG_R2, 0, 0);
 
     tcg_finish_qemu_ldst(s, label2_ptr);
+#else
+    if (TARGET_LONG_BITS == 32) {
+        tgen_ext32u(s, TCG_TMP0, addr_reg);
+        tcg_out_qemu_st_direct(s, opc, data_reg, TCG_TMP0, 0, 0);
+    } else {
+        tcg_out_qemu_st_direct(s, opc, data_reg, addr_reg, 0, 0);
+    }
+#endif
 }
 
 static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
-- 
1.7.0.1

* [Qemu-devel] [PATCH 43/62] tcg-s390: Tidy tcg_prepare_qemu_ldst.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (41 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 42/62] tcg-s390: Rearrange qemu_ld/st to avoid register copy Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 44/62] tcg-s390: Tidy user qemu_ld/st Richard Henderson
                   ` (19 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Make use of the reg+reg+disp addressing mode to eliminate
redundant additions.  Make use of the load-and-operate insns.
Avoid an extra register copy when using the 64-bit shift insns.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
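Note: the shift-and-mask sequence that forms the TLB entry offset can
be checked against the more obvious index-times-size form.  The sketch
below is an illustration only; the SK_* constants are assumed values
standing in for TARGET_PAGE_BITS, CPU_TLB_ENTRY_BITS and CPU_TLB_SIZE,
and both function names are hypothetical.

```c
#include <stdint.h>

/* Assumed values for illustration; the real ones come from the target
   and host configuration.  */
#define SK_PAGE_BITS       12
#define SK_TLB_ENTRY_BITS  5
#define SK_TLB_SIZE        256

/* Offset as the generated code computes it: one shift, one mask.  */
static uint64_t tlb_offset_shift_mask(uint64_t addr)
{
    return (addr >> (SK_PAGE_BITS - SK_TLB_ENTRY_BITS))
           & ((uint64_t)(SK_TLB_SIZE - 1) << SK_TLB_ENTRY_BITS);
}

/* Reference form: page number, reduced to a table index, scaled by the
   entry size.  */
static uint64_t tlb_offset_reference(uint64_t addr)
{
    uint64_t index = (addr >> SK_PAGE_BITS) & (SK_TLB_SIZE - 1);

    return index << SK_TLB_ENTRY_BITS;
}
```

Because the result is already a byte offset, it can serve directly as
the index register of a reg+reg+disp operand, with the env pointer as
base and the field offset as displacement.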
 tcg/s390/tcg-target.c |   56 ++++++++++++++++--------------------------------
 1 files changed, 19 insertions(+), 37 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 5d2efaa..000a646 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -135,6 +135,7 @@ typedef enum S390Opcode {
     RS_SRA      = 0x8a,
     RS_SRL      = 0x88,
 
+    RXY_AG      = 0xe308,
     RXY_CG      = 0xe320,
     RXY_LB      = 0xe376,
     RXY_LG      = 0xe304,
@@ -962,24 +963,16 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
 {
     const TCGReg arg0 = TCG_REG_R2;
     const TCGReg arg1 = TCG_REG_R3;
-    const TCGReg arg2 = TCG_REG_R4;
-    int s_bits;
+    int s_bits = opc & 3;
     uint16_t *label1_ptr;
+    tcg_target_long ofs;
 
-    if (is_store) {
-        s_bits = opc;
+    if (TARGET_LONG_BITS == 32) {
+        tgen_ext32u(s, arg0, addr_reg);
     } else {
-        s_bits = opc & 3;
+        tcg_out_mov(s, arg0, addr_reg);
     }
 
-#if TARGET_LONG_BITS == 32
-    tgen_ext32u(s, arg1, addr_reg);
-    tgen_ext32u(s, arg0, addr_reg);
-#else
-    tcg_out_mov(s, arg1, addr_reg);
-    tcg_out_mov(s, arg0, addr_reg);
-#endif
-
     tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, SH64_REG_NONE,
                  TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
 
@@ -987,17 +980,19 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
     tgen64_andi(s, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
 
     if (is_store) {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
-                     offsetof(CPUState, tlb_table[mem_index][0].addr_write));
+        ofs = offsetof(CPUState, tlb_table[mem_index][0].addr_write);
     } else {
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0,
-                     offsetof(CPUState, tlb_table[mem_index][0].addr_read));
+        ofs = offsetof(CPUState, tlb_table[mem_index][0].addr_read);
     }
-    tcg_out_insn(s, RRE, AGR, arg1, TCG_TMP0);
+    assert(ofs < 0x80000);
 
-    tcg_out_insn(s, RRE, AGR, arg1, TCG_AREG0);
+    tcg_out_insn(s, RXY, CG, arg0, arg1, TCG_AREG0, ofs);
 
-    tcg_out_insn(s, RXY, CG, arg0, arg1, 0, 0);
+    if (TARGET_LONG_BITS == 32) {
+        tgen_ext32u(s, arg0, addr_reg);
+    } else {
+        tcg_out_mov(s, arg0, addr_reg);
+    }
 
     label1_ptr = (uint16_t*)s->code_ptr;
 
@@ -1005,15 +1000,9 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
     tcg_out_insn(s, RI, BRC, S390_CC_EQ, 0);
 
     /* call load/store helper */
-#if TARGET_LONG_BITS == 32
-    tgen_ext32u(s, arg0, addr_reg);
-#else
-    tcg_out_mov(s, arg0, addr_reg);
-#endif
-
     if (is_store) {
         tcg_out_mov(s, arg1, data_reg);
-        tcg_out_movi(s, TCG_TYPE_I32, arg2, mem_index);
+        tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R4, mem_index);
         tgen_calli(s, (tcg_target_ulong)qemu_st_helpers[s_bits]);
     } else {
         tcg_out_movi(s, TCG_TYPE_I32, arg1, mem_index);
@@ -1046,17 +1035,10 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
     *(label1_ptr + 1) = ((unsigned long)s->code_ptr -
                          (unsigned long)label1_ptr) >> 1;
 
-    if (is_store) {
-        tcg_out_insn(s, RXY, LG, arg1, arg1, 0,
-                     offsetof(CPUTLBEntry, addend)
-                     - offsetof(CPUTLBEntry, addr_write));
-    } else {
-        tcg_out_insn(s, RXY, LG, arg1, arg1, 0,
-                     offsetof(CPUTLBEntry, addend)
-                     - offsetof(CPUTLBEntry, addr_read));
-    }
+    ofs = offsetof(CPUState, tlb_table[mem_index][0].addend);
+    assert(ofs < 0x80000);
 
-    tcg_out_insn(s, RRE, AGR, arg0, arg1);
+    tcg_out_insn(s, RXY, AG, arg0, arg1, TCG_AREG0, ofs);
 }
 
 static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
-- 
1.7.0.1

* [Qemu-devel] [PATCH 44/62] tcg-s390: Tidy user qemu_ld/st.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (42 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 43/62] tcg-s390: Tidy tcg_prepare_qemu_ldst Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 45/62] tcg-s390: Implement GUEST_BASE Richard Henderson
                   ` (18 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Create tcg_prepare_user_ldst to prepare the host address
used to implement the guest memory operation.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   33 +++++++++++++++++++++------------
 1 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 000a646..fa089ab 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -1047,6 +1047,17 @@ static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
     *(label2_ptr + 1) = ((unsigned long)s->code_ptr -
                          (unsigned long)label2_ptr) >> 1;
 }
+#else
+static void tcg_prepare_user_ldst(TCGContext *s, TCGReg *addr_reg,
+                                  TCGReg *index_reg, tcg_target_long *disp)
+{
+    *index_reg = 0;
+    *disp = 0;
+    if (TARGET_LONG_BITS == 32) {
+        tgen_ext32u(s, TCG_TMP0, *addr_reg);
+        *addr_reg = TCG_TMP0;
+    }
+}
 #endif /* CONFIG_SOFTMMU */
 
 /* load data with address translation (if applicable)
@@ -1057,6 +1068,9 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 #if defined(CONFIG_SOFTMMU)
     int mem_index;
     uint16_t *label2_ptr;
+#else
+    TCGReg index_reg;
+    tcg_target_long disp;
 #endif
 
     data_reg = *args++;
@@ -1072,12 +1086,8 @@ static void tcg_out_qemu_ld(TCGContext* s, const TCGArg* args, int opc)
 
     tcg_finish_qemu_ldst(s, label2_ptr);
 #else
-    if (TARGET_LONG_BITS == 32) {
-        tgen_ext32u(s, TCG_TMP0, addr_reg);
-        tcg_out_qemu_ld_direct(s, opc, data_reg, TCG_TMP0, 0, 0);
-    } else {
-        tcg_out_qemu_ld_direct(s, opc, data_reg, addr_reg, 0, 0);
-    }
+    tcg_prepare_user_ldst(s, &addr_reg, &index_reg, &disp);
+    tcg_out_qemu_ld_direct(s, opc, data_reg, addr_reg, index_reg, disp);
 #endif
 }
 
@@ -1087,6 +1097,9 @@ static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
 #if defined(CONFIG_SOFTMMU)
     int mem_index;
     uint16_t *label2_ptr;
+#else
+    TCGReg index_reg;
+    tcg_target_long disp;
 #endif
 
     data_reg = *args++;
@@ -1102,12 +1115,8 @@ static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
 
     tcg_finish_qemu_ldst(s, label2_ptr);
 #else
-    if (TARGET_LONG_BITS == 32) {
-        tgen_ext32u(s, TCG_TMP0, addr_reg);
-        tcg_out_qemu_st_direct(s, opc, data_reg, TCG_TMP0, 0, 0);
-    } else {
-        tcg_out_qemu_st_direct(s, opc, data_reg, addr_reg, 0, 0);
-    }
+    tcg_prepare_user_ldst(s, &addr_reg, &index_reg, &disp);
+    tcg_out_qemu_st_direct(s, opc, data_reg, addr_reg, index_reg, disp);
 #endif
 }
 
-- 
1.7.0.1

* [Qemu-devel] [PATCH 45/62] tcg-s390: Implement GUEST_BASE.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (43 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 44/62] tcg-s390: Tidy user qemu_ld/st Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 46/62] tcg-s390: Query instruction extensions that are installed Richard Henderson
                   ` (17 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
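Note: the way tcg_prepare_user_ldst folds GUEST_BASE into the address
can be sketched as below.  This is an illustration only; GuestBaseMode
and guest_base_mode are hypothetical names that do not appear in the
patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of how the guest base reaches the address.  */
typedef struct {
    bool use_base_reg;   /* true: add the reserved guest-base register */
    int64_t disp;        /* displacement encoded in the instruction */
} GuestBaseMode;

static GuestBaseMode guest_base_mode(uint64_t guest_base)
{
    GuestBaseMode m;

    if (guest_base < 0x80000) {
        /* Small enough for the 20-bit signed displacement field.  */
        m.use_base_reg = false;
        m.disp = (int64_t)guest_base;
    } else {
        /* Too large: the prologue loads it into a reserved register,
           which is then used as the index.  */
        m.use_base_reg = true;
        m.disp = 0;
    }
    return m;
}
```

A zero GUEST_BASE takes the first branch, so builds without a guest
base emit the same instructions as before.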
 configure             |    2 ++
 tcg/s390/tcg-target.c |   30 +++++++++++++++++++++++++-----
 tcg/s390/tcg-target.h |    2 ++
 3 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/configure b/configure
index 72d3df8..56dee88 100755
--- a/configure
+++ b/configure
@@ -698,10 +698,12 @@ case "$cpu" in
            ;;
     s390)
            QEMU_CFLAGS="-march=z990 $QEMU_CFLAGS"
+           host_guest_base="yes"
            ;;
     s390x)
            QEMU_CFLAGS="-m64 -march=z9-109 $QEMU_CFLAGS"
            LDFLAGS="-m64 $LDFLAGS"
+           host_guest_base="yes"
            ;;
     i386)
            QEMU_CFLAGS="-m32 $QEMU_CFLAGS"
diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index fa089ab..4a3235c 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -33,10 +33,20 @@
     do { } while (0)
 #endif
 
-#define TCG_CT_CONST_S32                0x100
-#define TCG_CT_CONST_N32                0x200
+#define TCG_CT_CONST_S32   0x100
+#define TCG_CT_CONST_N32   0x200
 
-#define TCG_TMP0                        TCG_REG_R14
+#define TCG_TMP0           TCG_REG_R14
+
+#ifdef CONFIG_USE_GUEST_BASE
+#define TCG_GUEST_BASE_REG TCG_REG_R13
+#else
+#define TCG_GUEST_BASE_REG TCG_REG_R0
+#endif
+
+#ifndef GUEST_BASE
+#define GUEST_BASE 0
+#endif
 
 
 /* All of the following instructions are prefixed with their instruction
@@ -1051,12 +1061,17 @@ static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
 static void tcg_prepare_user_ldst(TCGContext *s, TCGReg *addr_reg,
                                   TCGReg *index_reg, tcg_target_long *disp)
 {
-    *index_reg = 0;
-    *disp = 0;
     if (TARGET_LONG_BITS == 32) {
         tgen_ext32u(s, TCG_TMP0, *addr_reg);
         *addr_reg = TCG_TMP0;
     }
+    if (GUEST_BASE < 0x80000) {
+        *index_reg = 0;
+        *disp = GUEST_BASE;
+    } else {
+        *index_reg = TCG_GUEST_BASE_REG;
+        *disp = 0;
+    }
 }
 #endif /* CONFIG_SOFTMMU */
 
@@ -1682,6 +1697,11 @@ void tcg_target_qemu_prologue(TCGContext *s)
     /* aghi %r15,-160 (stack frame) */
     tcg_out_insn(s, RI, AGHI, TCG_REG_R15, -160);
 
+    if (GUEST_BASE >= 0x80000) {
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_GUEST_BASE_REG, GUEST_BASE);
+        tcg_regset_set_reg(s->reserved_regs, TCG_GUEST_BASE_REG);
+    }
+
     /* br %r2 (go to TB) */
     tcg_out_insn(s, RR, BCR, S390_CC_ALWAYS, TCG_REG_R2);
 
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index fae8ed7..940f530 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -83,6 +83,8 @@ typedef enum TCGReg {
 // #define TCG_TARGET_HAS_nand_i64
 // #define TCG_TARGET_HAS_nor_i64
 
+#define TCG_TARGET_HAS_GUEST_BASE
+
 /* used for function call generation */
 #define TCG_REG_CALL_STACK		TCG_REG_R15
 #define TCG_TARGET_STACK_ALIGN		8
-- 
1.7.0.1

* [Qemu-devel] [PATCH 46/62] tcg-s390: Query instruction extensions that are installed.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (44 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 45/62] tcg-s390: Implement GUEST_BASE Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 47/62] tcg-s390: Conditionalize general-instruction-extension insns Richard Henderson
                   ` (16 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Verify that we have all the instruction extensions that we generate.
Future patches can tailor code generation to the set of instructions
that are present.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
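Note: the FACILITY_* masks use big-endian bit numbering, where bit 0
is the most significant bit of the first doubleword.  A minimal sketch
of that mapping follows; facility_mask, has_facility and
missing_facilities are hypothetical names that do not appear in the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mask for facility bit BIT (0..63) within the
   first doubleword of the facility list.  */
static uint64_t facility_mask(unsigned bit)
{
    assert(bit < 64);
    return UINT64_C(1) << (63 - bit);
}

/* Hypothetical helper: test one facility bit.  */
static bool has_facility(uint64_t facilities, unsigned bit)
{
    return (facilities & facility_mask(bit)) != 0;
}

/* Hypothetical helper: the required facilities that are not installed.  */
static uint64_t missing_facilities(uint64_t have, uint64_t required)
{
    return required & ~have;
}
```

For example, facility_mask(18) equals FACILITY_LONG_DISP as defined in
the patch, and a nonzero result from missing_facilities corresponds to
the startup failure case.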
 tcg/s390/tcg-target.c |  122 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 122 insertions(+), 0 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 4a3235c..4807bca 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -279,6 +279,17 @@ static void *qemu_st_helpers[4] = {
 
 static uint8_t *tb_ret_addr;
 
+/* A list of relevant facilities used by this translator.  Some of these
+   are required for proper operation, and these are checked at startup.  */
+
+#define FACILITY_ZARCH		(1ULL << (63 - 1))
+#define FACILITY_ZARCH_ACTIVE	(1ULL << (63 - 2))
+#define FACILITY_LONG_DISP	(1ULL << (63 - 18))
+#define FACILITY_EXT_IMM	(1ULL << (63 - 21))
+#define FACILITY_GEN_INST_EXT	(1ULL << (63 - 34))
+
+static uint64_t facilities;
+
 static void patch_reloc(uint8_t *code_ptr, int type,
                 tcg_target_long value, tcg_target_long addend)
 {
@@ -1658,6 +1669,115 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { -1 },
 };
 
+/* ??? Linux kernels provide an AUXV entry AT_HWCAP that provides most of
+   this information.  However, getting at that entry is not easy this far
+   away from main.  Our options are: start searching from environ, which
+   fails as soon as someone calls setenv in between; read the data from
+   /proc/self/auxv; or do the probing ourselves.  The only extra thing
+   that AT_HWCAP gives us is HWCAP_S390_HIGH_GPRS, which indicates
+   that the kernel saves all 64 bits of the registers around traps while
+   in 31-bit mode.  But this is true of all "recent" kernels (ought to dig
+   back and see from when this might not be true).  */
+
+#include <signal.h>
+
+static volatile sig_atomic_t got_sigill;
+
+static void sigill_handler(int sig)
+{
+    got_sigill = 1;
+}
+
+static void query_facilities(void)
+{
+    struct sigaction sa_old, sa_new;
+    register int r0 __asm__("0");
+    register void *r1 __asm__("1");
+    int fail;
+
+    memset(&sa_new, 0, sizeof(sa_new));
+    sa_new.sa_handler = sigill_handler;
+    sigaction(SIGILL, &sa_new, &sa_old);
+
+    /* First, try STORE FACILITY LIST EXTENDED.  If this is present, then
+       we need not do any more probing.  Unfortunately, this itself is an
+       extension and the original STORE FACILITY LIST instruction is
+       kernel-only, storing its results at absolute address 200.  */
+    /* stfle 0(%r1) */
+    r1 = &facilities;
+    asm volatile(".word 0xb2b0,0x1000"
+                 : "=r"(r0) : "0"(0), "r"(r1) : "memory", "cc");
+
+    if (got_sigill) {
+        /* STORE FACILITY LIST EXTENDED is not available.  Probe for one of
+           each kind of instruction that we're interested in.  */
+        /* ??? Possibly some of these are in practice never present unless
+           the store-facility-extended facility is also present.  But since
+           that isn't documented it's just better to probe for each.  */
+       
+        /* Test for z/Architecture.  Required even in 31-bit mode.  */
+        got_sigill = 0;
+        /* agr %r0,%r0 */
+        asm volatile(".word 0xb908,0x0000" : "=r"(r0) : : "cc");
+        if (!got_sigill) {
+            facilities |= FACILITY_ZARCH | FACILITY_ZARCH_ACTIVE;
+        }
+
+        /* Test for long displacement.  */
+        got_sigill = 0;
+        /* ly %r0,0(%r1) */
+        r1 = &facilities;
+        asm volatile(".word 0xe300,0x1000,0x0058"
+                     : "=r"(r0) : "r"(r1) : "cc");
+        if (!got_sigill) {
+            facilities |= FACILITY_LONG_DISP;
+        }
+
+        /* Test for extended immediates.  */
+        got_sigill = 0;
+        /* afi %r0,0 */
+        asm volatile(".word 0xc209,0x0000,0x0000" : : : "cc");
+        if (!got_sigill) {
+            facilities |= FACILITY_EXT_IMM;
+        }
+
+        /* Test for general-instructions-extension.  */
+        got_sigill = 0;
+        /* msfi %r0,1 */
+        asm volatile(".word 0xc201,0x0000,0x0001");
+        if (!got_sigill) {
+            facilities |= FACILITY_GEN_INST_EXT;
+        }
+    }
+
+    sigaction(SIGILL, &sa_old, NULL);
+
+    /* ??? The translator currently uses all of these extensions
+       unconditionally.  This list could be pruned back to just
+       z/Arch and long displacement with some work.  */
+    fail = 0;
+    if ((facilities & FACILITY_ZARCH_ACTIVE) == 0) {
+        fprintf(stderr, "TCG: z/Arch facility is required\n");
+        fail = 1;
+    }
+    if ((facilities & FACILITY_LONG_DISP) == 0) {
+        fprintf(stderr, "TCG: long-displacement facility is required\n");
+        fail = 1;
+    }
+    if ((facilities & FACILITY_EXT_IMM) == 0) {
+        fprintf(stderr, "TCG: extended-immediate facility is required\n");
+        fail = 1;
+    }
+    if ((facilities & FACILITY_GEN_INST_EXT) == 0) {
+        fprintf(stderr, "TCG: general-instructions-extension "
+                "facility is required\n");
+        fail = 1;
+    }
+    if (fail) {
+        exit(-1);
+    }
+}
+
 void tcg_target_init(TCGContext *s)
 {
 #if !defined(CONFIG_USER_ONLY)
@@ -1667,6 +1787,8 @@ void tcg_target_init(TCGContext *s)
     }
 #endif
 
+    query_facilities();
+
     tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffff);
     tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffff);
 
-- 
1.7.0.1
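
Editorial note on the FACILITY_* masks in the patch above: they follow the
big-endian bit numbering of the facility list, where facility N of the first
doubleword is stored at (1ULL << (63 - N)).  A minimal sketch of that mapping
follows.  The helper names facility_mask and have_facility are hypothetical
and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Facility bits are numbered from the most significant bit of the first
   doubleword, so facility BIT lives at (1ULL << (63 - BIT)).
   Hypothetical helper, not part of the patch.  */
static uint64_t facility_mask(unsigned bit)
{
    assert(bit < 64);   /* only the first doubleword is modelled here */
    return 1ULL << (63 - bit);
}

/* Hypothetical helper: test facility BIT in a stored facility list.  */
static bool have_facility(uint64_t list, unsigned bit)
{
    return (list & facility_mask(bit)) != 0;
}
```

With these helpers, have_facility(facilities, 34) would be the same test as
(facilities & FACILITY_GEN_INST_EXT) != 0 in the patch.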


* [Qemu-devel] [PATCH 47/62] tcg-s390: Conditionalize general-instruction-extension insns.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (45 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 46/62] tcg-s390: Query instruction extensions that are installed Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 48/62] tcg-s390: Conditionalize ADD IMMEDIATE instructions Richard Henderson
                   ` (15 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The LOAD RELATIVE LONG and MULTIPLY SINGLE IMMEDIATE instructions
are currently the only insns we use from that extension.  It's easy
enough to test for that facility and avoid emitting them.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   51 +++++++++++++++++++++++++++++-------------------
 1 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 4807bca..aecabf9 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -35,6 +35,7 @@
 
 #define TCG_CT_CONST_S32   0x100
 #define TCG_CT_CONST_N32   0x200
+#define TCG_CT_CONST_MULI  0x400
 
 #define TCG_TMP0           TCG_REG_R14
 
@@ -344,6 +345,10 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         ct->ct &= ~TCG_CT_REG;
         ct->ct |= TCG_CT_CONST_N32;
         break;
+    case 'K':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_MULI;
+        break;
     default:
         break;
     }
@@ -365,6 +370,16 @@ static inline int tcg_target_const_match(tcg_target_long val,
         return val == (int32_t)val;
     } else if (ct & TCG_CT_CONST_N32) {
         return -val == (int32_t)-val;
+    } else if (ct & TCG_CT_CONST_MULI) {
+        /* Immediates that may be used with multiply.  If we have the
+           general-instruction-extensions, then we have MULTIPLY SINGLE
+           IMMEDIATE with a signed 32-bit, otherwise we have only 
+           MULTIPLY HALFWORD IMMEDIATE, with a signed 16-bit.  */
+        if (facilities & FACILITY_GEN_INST_EXT) {
+            return val == (int32_t)val;
+        } else {
+            return val == (int16_t)val;
+        }
     }
 
     return 0;
@@ -559,18 +574,21 @@ static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg data,
 static void tcg_out_ld_abs(TCGContext *s, TCGType type, TCGReg dest, void *abs)
 {
     tcg_target_long addr = (tcg_target_long)abs;
-    tcg_target_long disp = (addr - (tcg_target_long)s->code_ptr) >> 1;
 
-    if (disp == (int32_t)disp) {
-        if (type == TCG_TYPE_I32) {
-            tcg_out_insn(s, RIL, LRL, dest, disp);
-        } else {
-            tcg_out_insn(s, RIL, LGRL, dest, disp);
+    if (facilities & FACILITY_GEN_INST_EXT) {
+        tcg_target_long disp = (addr - (tcg_target_long)s->code_ptr) >> 1;
+        if (disp == (int32_t)disp) {
+            if (type == TCG_TYPE_I32) {
+                tcg_out_insn(s, RIL, LRL, dest, disp);
+            } else {
+                tcg_out_insn(s, RIL, LGRL, dest, disp);
+            }
+            return;
         }
-    } else {
-        tcg_out_movi(s, TCG_TYPE_PTR, dest, addr & ~0xffff);
-        tcg_out_ld(s, type, dest, dest, addr & 0xffff);
     }
+
+    tcg_out_movi(s, TCG_TYPE_PTR, dest, addr & ~0xffff);
+    tcg_out_ld(s, type, dest, dest, addr & 0xffff);
 }
 
 static inline void tgen_ext8s(TCGContext *s, TCGReg dest, TCGReg src)
@@ -1322,7 +1340,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_mul_i32:
         if (const_args[2]) {
-            if (args[2] == (int16_t)args[2]) {
+            if ((int32_t)args[2] == (int16_t)args[2]) {
                 tcg_out_insn(s, RI, MHI, args[0], args[2]);
             } else {
                 tcg_out_insn(s, RIL, MSFI, args[0], args[2]);
@@ -1573,7 +1591,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_add_i32, { "r", "0", "ri" } },
     { INDEX_op_sub_i32, { "r", "0", "ri" } },
-    { INDEX_op_mul_i32, { "r", "0", "ri" } },
+    { INDEX_op_mul_i32, { "r", "0", "rK" } },
 
     { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
@@ -1634,7 +1652,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_add_i64, { "r", "0", "rI" } },
     { INDEX_op_sub_i64, { "r", "0", "rJ" } },
-    { INDEX_op_mul_i64, { "r", "0", "rI" } },
+    { INDEX_op_mul_i64, { "r", "0", "rK" } },
 
     { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
@@ -1752,9 +1770,7 @@ static void query_facilities(void)
 
     sigaction(SIGILL, &sa_old, NULL);
 
-    /* ??? The translator currently uses all of these extensions
-       unconditionally.  This list could be pruned back to just
-       z/Arch and long displacement with some work.  */
+    /* The translator currently uses these extensions unconditionally.  */
     fail = 0;
     if ((facilities & FACILITY_ZARCH_ACTIVE) == 0) {
         fprintf(stderr, "TCG: z/Arch facility is required\n");
@@ -1768,11 +1784,6 @@ static void query_facilities(void)
         fprintf(stderr, "TCG: extended-immediate facility is required\n");
         fail = 1;
     }
-    if ((facilities & FACILITY_GEN_INST_EXT) == 0) {
-        fprintf(stderr, "TCG: general-instructions-extension "
-                "facility is required\n");
-        fail = 1;
-    }
     if (fail) {
         exit(-1);
     }
-- 
1.7.0.1
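
Editorial note on the 'K' constraint in the patch above: the accepted
immediate range depends on which multiply instruction the host provides.
The sketch below restates that test in isolation.  The function name
muli_const_ok and the boolean parameter are assumptions made for
illustration; the patch tests the global facilities word instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the TCG_CT_CONST_MULI test.  With the
   general-instructions-extension facility, MSFI takes a signed 32-bit
   immediate; without it only MHI with a signed 16-bit immediate exists.  */
static bool muli_const_ok(int64_t val, bool have_gen_inst_ext)
{
    if (have_gen_inst_ext) {
        return val == (int32_t)val;   /* fits MSFI */
    }
    return val == (int16_t)val;       /* fits MHI */
}
```

A constant that fails this test is not accepted as an immediate, so the
register alternative of the "rK" constraint is used instead.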


* [Qemu-devel] [PATCH 48/62] tcg-s390: Conditionalize ADD IMMEDIATE instructions.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (46 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 47/62] tcg-s390: Conditionalize general-instruction-extension insns Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 49/62] tcg-s390: Conditionalize LOAD " Richard Henderson
                   ` (14 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The ADD IMMEDIATE instructions are in the extended-immediate facility.
Begin making that facility optional by using these only if present.
This requires rearranging the way constant constraints are handled,
so that we properly canonicalize constants for 32-bit operations.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   64 +++++++++++++++++++++++++++++++++++-------------
 1 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index aecabf9..b66778a 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -33,9 +33,10 @@
     do { } while (0)
 #endif
 
-#define TCG_CT_CONST_S32   0x100
-#define TCG_CT_CONST_N32   0x200
-#define TCG_CT_CONST_MULI  0x400
+#define TCG_CT_CONST_32    0x100
+#define TCG_CT_CONST_NEG   0x200
+#define TCG_CT_CONST_ADDI  0x400
+#define TCG_CT_CONST_MULI  0x800
 
 #define TCG_TMP0           TCG_REG_R14
 
@@ -57,6 +58,7 @@
 typedef enum S390Opcode {
     RIL_AFI     = 0xc209,
     RIL_AGFI    = 0xc208,
+    RIL_ALGFI   = 0xc20a,
     RIL_BRASL   = 0xc005,
     RIL_BRCL    = 0xc004,
     RIL_IIHF    = 0xc008,
@@ -337,13 +339,17 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         tcg_regset_clear(ct->u.regs);
         tcg_regset_set_reg(ct->u.regs, TCG_REG_R3);
         break;
-    case 'I':
+    case 'N':                  /* force immediate negate */
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_NEG;
+        break;
+    case 'W':                  /* force 32-bit ("word") immediate */
         ct->ct &= ~TCG_CT_REG;
-        ct->ct |= TCG_CT_CONST_S32;
+        ct->ct |= TCG_CT_CONST_32;
         break;
-    case 'J':
+    case 'I':
         ct->ct &= ~TCG_CT_REG;
-        ct->ct |= TCG_CT_CONST_N32;
+        ct->ct |= TCG_CT_CONST_ADDI;
         break;
     case 'K':
         ct->ct &= ~TCG_CT_REG;
@@ -366,10 +372,27 @@ static inline int tcg_target_const_match(tcg_target_long val,
 
     if (ct & TCG_CT_CONST) {
         return 1;
-    } else if (ct & TCG_CT_CONST_S32) {
-        return val == (int32_t)val;
-    } else if (ct & TCG_CT_CONST_N32) {
-        return -val == (int32_t)-val;
+    }
+
+    /* Handle the modifiers.  */
+    if (ct & TCG_CT_CONST_NEG) {
+        val = -val;
+    }
+    if (ct & TCG_CT_CONST_32) {
+        val = (int32_t)val;
+    }
+
+    /* The following are mutually exclusive.  */
+    if (ct & TCG_CT_CONST_ADDI) {
+        /* Immediates that may be used with add.  If we have the
+           extended-immediates facility then we have ADD IMMEDIATE
+           with signed and unsigned 32-bit, otherwise we have only
+           ADD HALFWORD IMMEDIATE with a signed 16-bit.  */
+        if (facilities & FACILITY_EXT_IMM) {
+            return val == (int32_t)val || val == (uint32_t)val;
+        } else {
+            return val == (int16_t)val;
+        }
     } else if (ct & TCG_CT_CONST_MULI) {
         /* Immediates that may be used with multiply.  If we have the
            general-instruction-extensions, then we have MULTIPLY SINGLE
@@ -621,7 +644,7 @@ static inline void tgen_ext32u(TCGContext *s, TCGReg dest, TCGReg src)
     tcg_out_insn(s, RRE, LLGFR, dest, src);
 }
 
-static inline void tgen32_addi(TCGContext *s, TCGReg dest, tcg_target_long val)
+static void tgen32_addi(TCGContext *s, TCGReg dest, int32_t val)
 {
     if (val == (int16_t)val) {
         tcg_out_insn(s, RI, AHI, dest, val);
@@ -630,13 +653,18 @@ static inline void tgen32_addi(TCGContext *s, TCGReg dest, tcg_target_long val)
     }
 }
 
-static inline void tgen64_addi(TCGContext *s, TCGReg dest, tcg_target_long val)
+static void tgen64_addi(TCGContext *s, TCGReg dest, int64_t val)
 {
     if (val == (int16_t)val) {
         tcg_out_insn(s, RI, AGHI, dest, val);
-    } else {
+    } else if (val == (int32_t)val) {
         tcg_out_insn(s, RIL, AGFI, dest, val);
+    } else if (val == (uint32_t)val) {
+        tcg_out_insn(s, RIL, ALGFI, dest, val);
+    } else {
+        tcg_abort();
     }
+
 }
 
 static void tgen32_andi(TCGContext *s, TCGReg dest, uint32_t val)
@@ -1589,9 +1617,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_st16_i32, { "r", "r" } },
     { INDEX_op_st_i32, { "r", "r" } },
 
-    { INDEX_op_add_i32, { "r", "0", "ri" } },
-    { INDEX_op_sub_i32, { "r", "0", "ri" } },
-    { INDEX_op_mul_i32, { "r", "0", "rK" } },
+    { INDEX_op_add_i32, { "r", "0", "rWI" } },
+    { INDEX_op_sub_i32, { "r", "0", "rWNI" } },
+    { INDEX_op_mul_i32, { "r", "0", "rWK" } },
 
     { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
@@ -1651,7 +1679,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_st_i64, { "r", "r" } },
 
     { INDEX_op_add_i64, { "r", "0", "rI" } },
-    { INDEX_op_sub_i64, { "r", "0", "rJ" } },
+    { INDEX_op_sub_i64, { "r", "0", "rNI" } },
     { INDEX_op_mul_i64, { "r", "0", "rK" } },
 
     { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
-- 
1.7.0.1
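
Editorial note on the constraint rework in the patch above: the 'N' and 'W'
letters act as modifiers that canonicalize the constant first, and only then
is the range test for 'I' applied.  A minimal sketch of that order of
operations follows.  The flag values, the function name addi_const_ok and
the boolean parameter are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {                  /* hypothetical flag values for this sketch */
    CT_32  = 1 << 0,    /* 'W': canonicalize to a 32-bit value */
    CT_NEG = 1 << 1     /* 'N': negate before the range test */
};

static bool addi_const_ok(int64_t val, int ct, bool have_ext_imm)
{
    /* Apply the modifiers first.  Negate in unsigned arithmetic so that
       the most negative value does not overflow.  */
    if (ct & CT_NEG) {
        val = (int64_t)(0 - (uint64_t)val);
    }
    if (ct & CT_32) {
        val = (int32_t)val;
    }

    /* Then apply the range test for ADD IMMEDIATE.  */
    if (have_ext_imm) {
        return val == (int32_t)val || val == (uint32_t)val;
    }
    return val == (int16_t)val;   /* ADD HALFWORD IMMEDIATE only */
}
```

For example, a 32-bit add of 0xffffffff canonicalizes to -1 under 'W' and so
fits the 16-bit form even without the extended-immediate facility.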


* [Qemu-devel] [PATCH 49/62] tcg-s390: Conditionalize LOAD IMMEDIATE instructions.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (47 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 48/62] tcg-s390: Conditionalize ADD IMMEDIATE instructions Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 50/62] tcg-s390: Conditionalize 8 and 16 bit extensions Richard Henderson
                   ` (13 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The LOAD IMMEDIATE and (some of) the LOAD LOGICAL IMMEDIATE instructions
are in the extended-immediate facility.  Continue making that facility
optional by using these only if present.  Thankfully, the LOAD ADDRESS
RELATIVE LONG and the LOAD LOGICAL IMMEDIATE insns with 16-bit constants
are always available.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   79 ++++++++++++++++++++++++++++++++++++------------
 1 files changed, 59 insertions(+), 20 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index b66778a..491de07 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -491,7 +491,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
         sval = (int32_t)sval;
     }
 
-    /* First, try all 32-bit insns that can load it in one go.  */
+    /* Try all 32-bit insns that can load it in one go.  */
     if (sval >= -0x8000 && sval < 0x8000) {
         tcg_out_insn(s, RI, LGHI, ret, sval);
         return;
@@ -505,22 +505,22 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
         }
     }
 
-    /* Second, try all 48-bit insns that can load it in one go.  */
-    if (sval == (int32_t)sval) {
-        tcg_out_insn(s, RIL, LGFI, ret, sval);
-        return;
-    }
-    if (uval <= 0xffffffff) {
-        tcg_out_insn(s, RIL, LLILF, ret, uval);
-        return;
-    }
-    if ((uval & 0xffffffff) == 0) {
-        tcg_out_insn(s, RIL, LLIHF, ret, uval >> 32);
-        return;
+    /* Try all 48-bit insns that can load it in one go.  */
+    if (facilities & FACILITY_EXT_IMM) {
+        if (sval == (int32_t)sval) {
+            tcg_out_insn(s, RIL, LGFI, ret, sval);
+            return;
+        }
+        if (uval <= 0xffffffff) {
+            tcg_out_insn(s, RIL, LLILF, ret, uval);
+            return;
+        }
+        if ((uval & 0xffffffff) == 0) {
+            tcg_out_insn(s, RIL, LLIHF, ret, uval >> 32);
+            return;
+        }
     }
 
-    /* If we get here, both the high and low parts have non-zero bits.  */
-
     /* Try for PC-relative address load.  */
     if ((sval & 1) == 0) {
         intptr_t off = (sval - (intptr_t)s->code_ptr) >> 1;
@@ -530,17 +530,56 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
         }
     }
 
+    /* If extended immediates are not present, then we may have to issue
+       several instructions to load the low 32 bits.  */
+    if (!(facilities & FACILITY_EXT_IMM)) {
+        /* A 32-bit unsigned value can be loaded in 2 insns.  And given
+           that the lli_insns loop above did not succeed, we know that
+           both insns are required.  */
+        if (uval <= 0xffffffff) {
+            tcg_out_insn(s, RI, LLILL, ret, uval);
+            tcg_out_insn(s, RI, IILH, ret, uval >> 16);
+            return;
+        }
+
+        /* If all high bits are set, the value can be loaded in 2 or 3 insns.
+           We first want to make sure that all the high bits get set.  With
+           luck the low 16-bits can be considered negative to perform that for
+           free, otherwise we load an explicit -1.  */
+        if (sval >> 32 == -1) {
+            if (uval & 0x8000) {
+                tcg_out_insn(s, RI, LGHI, ret, uval);
+            } else {
+                tcg_out_insn(s, RI, LGHI, ret, -1);
+                tcg_out_insn(s, RI, IILL, ret, uval);
+            }
+            tcg_out_insn(s, RI, IILH, ret, uval >> 16);
+            return;
+        }
+    }
+
+    /* If we get here, both the high and low parts have non-zero bits.  */
+
     /* Recurse to load the lower 32-bits.  */
     tcg_out_movi(s, TCG_TYPE_I32, ret, sval);
 
     /* Insert data into the high 32-bits.  */
     uval >>= 32;
-    if (uval < 0x10000) {
-        tcg_out_insn(s, RI, IIHL, ret, uval);
-    } else if ((uval & 0xffff) == 0) {
-        tcg_out_insn(s, RI, IIHH, ret, uval >> 16);
+    if (facilities & FACILITY_EXT_IMM) {
+        if (uval < 0x10000) {
+            tcg_out_insn(s, RI, IIHL, ret, uval);
+        } else if ((uval & 0xffff) == 0) {
+            tcg_out_insn(s, RI, IIHH, ret, uval >> 16);
+        } else {
+            tcg_out_insn(s, RIL, IIHF, ret, uval);
+        }
     } else {
-        tcg_out_insn(s, RIL, IIHF, ret, uval);
+        if (uval & 0xffff) {
+            tcg_out_insn(s, RI, IIHL, ret, uval);
+        }
+        if (uval & 0xffff0000) {
+            tcg_out_insn(s, RI, IIHH, ret, uval >> 16);
+        }
     }
 }
 
-- 
1.7.0.1


* [Qemu-devel] [PATCH 50/62] tcg-s390: Conditionalize 8 and 16 bit extensions.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (48 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 49/62] tcg-s390: Conditionalize LOAD " Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 51/62] tcg-s390: Conditionalize AND IMMEDIATE instructions Richard Henderson
                   ` (12 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

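The register-to-register 8- and 16-bit sign- and zero-extension
instructions (LGBR, LGHR, LLGCR, LLGHR) are part of the
extended-immediate facility.  When it is absent, fall back to a
shift pair for sign extension and an AND with a loaded mask for
zero extension.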

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  115 ++++++++++++++++++++++++++++++++++++++-----------
 1 files changed, 90 insertions(+), 25 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 491de07..8a7c9ae 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -156,11 +156,9 @@ typedef enum S390Opcode {
     RXY_LGF     = 0xe314,
     RXY_LGH     = 0xe315,
     RXY_LHY     = 0xe378,
-    RXY_LLC     = 0xe394,
     RXY_LLGC    = 0xe390,
     RXY_LLGF    = 0xe316,
     RXY_LLGH    = 0xe391,
-    RXY_LLH     = 0xe395,
     RXY_LMG     = 0xeb04,
     RXY_LRV     = 0xe31e,
     RXY_LRVG    = 0xe30f,
@@ -653,24 +651,84 @@ static void tcg_out_ld_abs(TCGContext *s, TCGType type, TCGReg dest, void *abs)
     tcg_out_ld(s, type, dest, dest, addr & 0xffff);
 }
 
-static inline void tgen_ext8s(TCGContext *s, TCGReg dest, TCGReg src)
+static void tgen_ext8s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
 {
-    tcg_out_insn(s, RRE, LGBR, dest, src);
+    if (facilities & FACILITY_EXT_IMM) {
+        tcg_out_insn(s, RRE, LGBR, dest, src);
+        return;
+    }
+
+    if (type == TCG_TYPE_I32) {
+        if (dest == src) {
+            tcg_out_sh32(s, RS_SLL, dest, SH32_REG_NONE, 24);
+        } else {
+            tcg_out_sh64(s, RSY_SLLG, dest, src, SH64_REG_NONE, 24);
+        }
+        tcg_out_sh32(s, RS_SRA, dest, SH32_REG_NONE, 24);
+    } else {
+        tcg_out_sh64(s, RSY_SLLG, dest, src, SH64_REG_NONE, 56);
+        tcg_out_sh64(s, RSY_SRAG, dest, dest, SH64_REG_NONE, 56);
+    }
 }
 
-static inline void tgen_ext8u(TCGContext *s, TCGReg dest, TCGReg src)
+static void tgen_ext8u(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
 {
-    tcg_out_insn(s, RRE, LLGCR, dest, src);
+    if (facilities & FACILITY_EXT_IMM) {
+        tcg_out_insn(s, RRE, LLGCR, dest, src);
+        return;
+    }
+
+    if (dest == src) {
+        tcg_out_movi(s, type, TCG_TMP0, 0xff);
+        src = TCG_TMP0;
+    } else {
+        tcg_out_movi(s, type, dest, 0xff);
+    }
+    if (type == TCG_TYPE_I32) {
+        tcg_out_insn(s, RR, NR, dest, src);
+    } else {
+        tcg_out_insn(s, RRE, NGR, dest, src);
+    }
 }
 
-static inline void tgen_ext16s(TCGContext *s, TCGReg dest, TCGReg src)
+static void tgen_ext16s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
 {
-    tcg_out_insn(s, RRE, LGHR, dest, src);
+    if (facilities & FACILITY_EXT_IMM) {
+        tcg_out_insn(s, RRE, LGHR, dest, src);
+        return;
+    }
+
+    if (type == TCG_TYPE_I32) {
+        if (dest == src) {
+            tcg_out_sh32(s, RS_SLL, dest, SH32_REG_NONE, 16);
+        } else {
+            tcg_out_sh64(s, RSY_SLLG, dest, src, SH64_REG_NONE, 16);
+        }
+        tcg_out_sh32(s, RS_SRA, dest, SH32_REG_NONE, 16);
+    } else {
+        tcg_out_sh64(s, RSY_SLLG, dest, src, SH64_REG_NONE, 48);
+        tcg_out_sh64(s, RSY_SRAG, dest, dest, SH64_REG_NONE, 48);
+    }
 }
 
-static inline void tgen_ext16u(TCGContext *s, TCGReg dest, TCGReg src)
+static void tgen_ext16u(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
 {
-    tcg_out_insn(s, RRE, LLGHR, dest, src);
+    if (facilities & FACILITY_EXT_IMM) {
+        tcg_out_insn(s, RRE, LLGHR, dest, src);
+        return;
+    }
+
+    if (dest == src) {
+        tcg_out_movi(s, type, TCG_TMP0, 0xffff);
+        src = TCG_TMP0;
+    } else {
+        tcg_out_movi(s, type, dest, 0xffff);
+    }
+    if (type == TCG_TYPE_I32) {
+        tcg_out_insn(s, RR, NR, dest, src);
+    } else {
+        tcg_out_insn(s, RRE, NGR, dest, src);
+    }
 }
 
 static inline void tgen_ext32s(TCGContext *s, TCGReg dest, TCGReg src)
@@ -972,7 +1030,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, int opc, TCGReg data,
         if (bswap) {
             /* swapped unsigned halfword load with upper bits zeroed */
             tcg_out_insn(s, RXY, LRVH, data, base, index, disp);
-            tgen_ext16u(s, data, data);
+            tgen_ext16u(s, TCG_TYPE_I64, data, data);
         } else {
             tcg_out_insn(s, RXY, LLGH, data, base, index, disp);
         }
@@ -981,7 +1039,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, int opc, TCGReg data,
         if (bswap) {
             /* swapped sign-extended halfword load */
             tcg_out_insn(s, RXY, LRVH, data, base, index, disp);
-            tgen_ext16s(s, data, data);
+            tgen_ext16s(s, TCG_TYPE_I64, data, data);
         } else {
             tcg_out_insn(s, RXY, LGH, data, base, index, disp);
         }
@@ -1117,10 +1175,10 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
         /* sign extension */
         switch (opc) {
         case LD_INT8:
-            tgen_ext8s(s, data_reg, arg0);
+            tgen_ext8s(s, TCG_TYPE_I64, data_reg, arg0);
             break;
         case LD_INT16:
-            tgen_ext16s(s, data_reg, arg0);
+            tgen_ext16s(s, TCG_TYPE_I64, data_reg, arg0);
             break;
         case LD_INT32:
             tgen_ext32s(s, data_reg, arg0);
@@ -1264,23 +1322,22 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_ld8u_i32:
-        tcg_out_ldst(s, 0, RXY_LLC, args[0], args[1], args[2]);
-        break;
     case INDEX_op_ld8u_i64:
+        /* ??? LLC (RXY format) is only present with the extended-immediate
+           facility, whereas LLGC is always present.  */
         tcg_out_ldst(s, 0, RXY_LLGC, args[0], args[1], args[2]);
         break;
 
     case INDEX_op_ld8s_i32:
-        tcg_out_ldst(s, 0, RXY_LB, args[0], args[1], args[2]);
-        break;
     case INDEX_op_ld8s_i64:
+        /* ??? LB is no smaller than LGB, so no point to using it.  */
         tcg_out_ldst(s, 0, RXY_LGB, args[0], args[1], args[2]);
         break;
 
     case INDEX_op_ld16u_i32:
-        tcg_out_ldst(s, 0, RXY_LLH, args[0], args[1], args[2]);
-        break;
     case INDEX_op_ld16u_i64:
+        /* ??? LLH (RXY format) is only present with the extended-immediate
+           facility, whereas LLGH is always present.  */
         tcg_out_ldst(s, 0, RXY_LLGH, args[0], args[1], args[2]);
         break;
 
@@ -1517,24 +1574,32 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_ext8s_i32:
+        tgen_ext8s(s, TCG_TYPE_I32, args[0], args[1]);
+        break;
     case INDEX_op_ext8s_i64:
-        tgen_ext8s(s, args[0], args[1]);
+        tgen_ext8s(s, TCG_TYPE_I64, args[0], args[1]);
         break;
     case INDEX_op_ext16s_i32:
+        tgen_ext16s(s, TCG_TYPE_I32, args[0], args[1]);
+        break;
     case INDEX_op_ext16s_i64:
-        tgen_ext16s(s, args[0], args[1]);
+        tgen_ext16s(s, TCG_TYPE_I64, args[0], args[1]);
         break;
     case INDEX_op_ext32s_i64:
         tgen_ext32s(s, args[0], args[1]);
         break;
 
     case INDEX_op_ext8u_i32:
+        tgen_ext8u(s, TCG_TYPE_I32, args[0], args[1]);
+        break;
     case INDEX_op_ext8u_i64:
-        tgen_ext8u(s, args[0], args[1]);
+        tgen_ext8u(s, TCG_TYPE_I64, args[0], args[1]);
         break;
     case INDEX_op_ext16u_i32:
+        tgen_ext16u(s, TCG_TYPE_I32, args[0], args[1]);
+        break;
     case INDEX_op_ext16u_i64:
-        tgen_ext16u(s, args[0], args[1]);
+        tgen_ext16u(s, TCG_TYPE_I64, args[0], args[1]);
         break;
     case INDEX_op_ext32u_i64:
         tgen_ext32u(s, args[0], args[1]);
@@ -1545,7 +1610,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         /* The TCG bswap definition requires bits 0-47 already be zero.
            Thus we don't need the G-type insns to implement bswap16_i64.  */
         tcg_out_insn(s, RRE, LRVR, args[0], args[1]);
-        tcg_out_insn(s, RS, SRL, args[0], 0, SH32_REG_NONE, 16);
+        tcg_out_sh32(s, RS_SRL, args[0], SH32_REG_NONE, 16);
         break;
     case INDEX_op_bswap32_i32:
     case INDEX_op_bswap32_i64:
-- 
1.7.0.1


* [Qemu-devel] [PATCH 51/62] tcg-s390: Conditionalize AND IMMEDIATE instructions.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (49 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 50/62] tcg-s390: Conditionalize 8 and 16 bit extensions Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 52/62] tcg-s390: Conditionalize OR " Richard Henderson
                   ` (11 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The 32-bit immediate AND instructions are in the extended-immediate
facility.  Use these only if present.

At the same time, pull the logic to load immediates into registers
into a constraint letter for TCG.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  209 ++++++++++++++++++++++++++++--------------------
 1 files changed, 122 insertions(+), 87 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 8a7c9ae..359f6d1 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -33,10 +33,11 @@
     do { } while (0)
 #endif
 
-#define TCG_CT_CONST_32    0x100
-#define TCG_CT_CONST_NEG   0x200
-#define TCG_CT_CONST_ADDI  0x400
-#define TCG_CT_CONST_MULI  0x800
+#define TCG_CT_CONST_32    0x0100
+#define TCG_CT_CONST_NEG   0x0200
+#define TCG_CT_CONST_ADDI  0x0400
+#define TCG_CT_CONST_MULI  0x0800
+#define TCG_CT_CONST_ANDI  0x1000
 
 #define TCG_TMP0           TCG_REG_R14
 
@@ -353,6 +354,10 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         ct->ct &= ~TCG_CT_REG;
         ct->ct |= TCG_CT_CONST_MULI;
         break;
+    case 'A':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_ANDI;
+        break;
     default:
         break;
     }
@@ -362,9 +367,66 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
     return 0;
 }
 
+/* Immediates to be used with logical AND.  This is an optimization only,
+   since a full 64-bit immediate AND can always be performed with 4 sequential
+   NI[LH][LH] instructions.  What we're looking for is immediates that we
+   can load efficiently, and the immediate load plus the reg-reg AND is
+   smaller than the sequential NI's.  */
+
+static int tcg_match_andi(int ct, tcg_target_ulong val)
+{
+    int i;
+
+    if (facilities & FACILITY_EXT_IMM) {
+        if (ct & TCG_CT_CONST_32) {
+            /* All 32-bit ANDs can be performed with 1 48-bit insn.  */
+            return 1;
+        }
+
+        /* Zero-extensions.  */
+        if (val == 0xff || val == 0xffff || val == 0xffffffff) {
+            return 1;
+        }
+    } else {
+        if (ct & TCG_CT_CONST_32) {
+            val = (uint32_t)val;
+        } else if (val == 0xffffffff) {
+            return 1;
+        }
+    }
+
+    /* Try all 32-bit insns that can perform it in one go.  */
+    for (i = 0; i < 4; i++) {
+        tcg_target_ulong mask = ~(0xffffull << i*16);
+        if ((val & mask) == mask) {
+            return 1;
+        }
+    }
+
+    /* Look for 16-bit values performing the mask.  These are better
+       to load with LLI[LH][LH].  */
+    for (i = 0; i < 4; i++) {
+        tcg_target_ulong mask = 0xffffull << i*16;
+        if ((val & mask) == val) {
+            return 0;
+        }
+    }
+
+    /* Look for 32-bit values performing the 64-bit mask.  These
+       are better to load with LLI[LH]F, or if extended immediates
+       not available, with a pair of LLI insns.  */
+    if ((ct & TCG_CT_CONST_32) == 0) {
+        if (val <= 0xffffffff || (val & 0xffffffff) == 0) {
+            return 0;
+        }
+    }
+
+    return 1;
+}
+
 /* Test if a constant matches the constraint. */
-static inline int tcg_target_const_match(tcg_target_long val,
-                                         const TCGArgConstraint *arg_ct)
+static int tcg_target_const_match(tcg_target_long val,
+                                  const TCGArgConstraint *arg_ct)
 {
     int ct = arg_ct->ct;
 
@@ -401,6 +463,8 @@ static inline int tcg_target_const_match(tcg_target_long val,
         } else {
             return val == (int16_t)val;
         }
+    } else if (ct & TCG_CT_CONST_ANDI) {
+        return tcg_match_andi(ct, val);
     }
 
     return 0;
@@ -764,37 +828,6 @@ static void tgen64_addi(TCGContext *s, TCGReg dest, int64_t val)
 
 }
 
-static void tgen32_andi(TCGContext *s, TCGReg dest, uint32_t val)
-{
-    /* Zero-th, look for no-op.  */
-    if (val == -1) {
-        return;
-    }
-
-    /* First, look for the zero-extensions.  */
-    if (val == 0xff) {
-        tgen_ext8u(s, dest, dest);
-        return;
-    }
-    if (val == 0xffff) {
-        tgen_ext16u(s, dest, dest);
-        return;
-    }
-
-    /* Second, try all 32-bit insns that can perform it in one go.  */
-    if ((val & 0xffff0000) == 0xffff0000) {
-        tcg_out_insn(s, RI, NILL, dest, val);
-        return;
-    }
-    if ((val & 0x0000ffff) == 0x0000ffff) {
-        tcg_out_insn(s, RI, NILH, dest, val >> 16);
-        return;
-    }
-
-    /* Lastly, perform the entire operation with a 48-bit insn.  */
-    tcg_out_insn(s, RIL, NILF, dest, val);
-}
-
 static void tgen64_andi(TCGContext *s, TCGReg dest, tcg_target_ulong val)
 {
     static const S390Opcode ni_insns[4] = {
@@ -806,69 +839,61 @@ static void tgen64_andi(TCGContext *s, TCGReg dest, tcg_target_ulong val)
 
     int i;
 
-    /* Zero-th, look for no-op.  */
+    /* Look for no-op.  */
     if (val == -1) {
         return;
     }
 
-    /* First, look for the zero-extensions.  */
-    if (val == 0xff) {
-        tgen_ext8u(s, dest, dest);
-        return;
-    }
-    if (val == 0xffff) {
-        tgen_ext16u(s, dest, dest);
-        return;
-    }
+    /* Look for the zero-extensions.  */
     if (val == 0xffffffff) {
         tgen_ext32u(s, dest, dest);
         return;
     }
 
-    /* Second, try all 32-bit insns that can perform it in one go.  */
-    for (i = 0; i < 4; i++) {
-        tcg_target_ulong mask = ~(0xffffull << i*16);
-        if ((val & mask) == mask) {
-            tcg_out_insn_RI(s, ni_insns[i], dest, val >> i*16);
+    if (facilities & FACILITY_EXT_IMM) {
+        if (val == 0xff) {
+            tgen_ext8u(s, TCG_TYPE_I64, dest, dest);
             return;
         }
-    }
-
-    /* Third, try all 48-bit insns that can perform it in one go.  */
-    for (i = 0; i < 2; i++) {
-        tcg_target_ulong mask = ~(0xffffffffull << i*32);
-        if ((val & mask) == mask) {
-            tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
+        if (val == 0xffff) {
+            tgen_ext16u(s, TCG_TYPE_I64, dest, dest);
             return;
         }
-    }
 
-    /* Fourth, look for masks that can be loaded with one instruction
-       into a register.  This is slightly smaller than using two 48-bit
-       masks, as below.  */
-    for (i = 0; i < 4; i++) {
-        tcg_target_ulong mask = ~(0xffffull << i*16);
-        if ((val & mask) == 0) {
-            tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, val);
-            tcg_out_insn(s, RRE, NGR, dest, TCG_TMP0);
-            return;
+        /* Try all 32-bit insns that can perform it in one go.  */
+        for (i = 0; i < 4; i++) {
+            tcg_target_ulong mask = ~(0xffffull << i*16);
+            if ((val & mask) == mask) {
+                tcg_out_insn_RI(s, ni_insns[i], dest, val >> i*16);
+                return;
+            }
         }
-    }
 
-    for (i = 0; i < 2; i++) {
-        tcg_target_ulong mask = ~(0xffffffffull << i*32);
-        if ((val & mask) == 0) {
-            tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, val);
-            tcg_out_insn(s, RRE, NGR, dest, TCG_TMP0);
-            return;
+        /* Try all 48-bit insns that can perform it in one go.  */
+        if (facilities & FACILITY_EXT_IMM) {
+            for (i = 0; i < 2; i++) {
+                tcg_target_ulong mask = ~(0xffffffffull << i*32);
+                if ((val & mask) == mask) {
+                    tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
+                    return;
+                }
+            }
         }
-    }
 
-    /* Last, perform the AND via sequential modifications to the
-       high and low parts.  Do this via recursion to handle 16-bit
-       vs 32-bit masks in each half.  */
-    tgen64_andi(s, dest, val | 0xffffffff00000000ull);
-    tgen64_andi(s, dest, val | 0x00000000ffffffffull);
+        /* Perform the AND via sequential modifications to the high and low
+           parts.  Do this via recursion to handle 16-bit vs 32-bit masks in
+           each half.  */
+        tgen64_andi(s, dest, val | 0xffffffff00000000ull);
+        tgen64_andi(s, dest, val | 0x00000000ffffffffull);
+    } else {
+        /* With no extended-immediate facility, just emit the sequence.  */
+        for (i = 0; i < 4; i++) {
+            tcg_target_ulong mask = 0xffffull << i*16;
+            if ((val & mask) != mask) {
+                tcg_out_insn_RI(s, ni_insns[i], dest, val >> i*16);
+            }
+        }
+    }
 }
 
 static void tgen64_ori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
@@ -1121,6 +1146,16 @@ static void tcg_out_qemu_st_direct(TCGContext *s, int opc, TCGReg data,
 }
 
 #if defined(CONFIG_SOFTMMU)
+static void tgen64_andi_tmp(TCGContext *s, TCGReg dest, tcg_target_ulong val)
+{
+    if (tcg_match_andi(0, val)) {
+        tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, val);
+        tcg_out_insn(s, RRE, NGR, dest, TCG_TMP0);
+    } else {
+        tgen64_andi(s, dest, val);
+    }
+}
+
 static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
                                   TCGReg addr_reg, int mem_index, int opc,
                                   uint16_t **label2_ptr_p, int is_store)
@@ -1140,8 +1175,8 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
     tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, SH64_REG_NONE,
                  TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
 
-    tgen64_andi(s, arg0, TARGET_PAGE_MASK | ((1 << s_bits) - 1));
-    tgen64_andi(s, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
+    tgen64_andi_tmp(s, arg0, TARGET_PAGE_MASK | ((1 << s_bits) - 1));
+    tgen64_andi_tmp(s, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
 
     if (is_store) {
         ofs = offsetof(CPUState, tlb_table[mem_index][0].addr_write);
@@ -1413,7 +1448,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_and_i32:
         if (const_args[2]) {
-            tgen32_andi(s, args[0], args[2]);
+            tgen64_andi(s, args[0], args[2] | 0xffffffff00000000ull);
         } else {
             tcg_out_insn(s, RR, NR, args[0], args[2]);
         }
@@ -1728,7 +1763,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_div2_i32, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
 
-    { INDEX_op_and_i32, { "r", "0", "ri" } },
+    { INDEX_op_and_i32, { "r", "0", "rWA" } },
     { INDEX_op_or_i32, { "r", "0", "ri" } },
     { INDEX_op_xor_i32, { "r", "0", "ri" } },
     { INDEX_op_neg_i32, { "r", "r" } },
@@ -1789,7 +1824,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_div2_i64, { "b", "a", "0", "1", "r" } },
     { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
 
-    { INDEX_op_and_i64, { "r", "0", "ri" } },
+    { INDEX_op_and_i64, { "r", "0", "rA" } },
     { INDEX_op_or_i64, { "r", "0", "ri" } },
     { INDEX_op_xor_i64, { "r", "0", "ri" } },
     { INDEX_op_neg_i64, { "r", "r" } },
-- 
1.7.0.1
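
As a rough illustration of the "one NI[LH][LH] insn" test used by
tcg_match_andi and tgen64_andi above, here is a standalone sketch.  The
helper name andi_single_slice is hypothetical and not part of the patch;
it only reports which 16-bit slice a single AND-immediate would touch.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the patch.  Returns the index (0..3)
   of the one 16-bit slice that a single NI[LH][LH] insn would modify,
   or -1 if VAL clears bits in more than one slice.  */
static int andi_single_slice(uint64_t val)
{
    for (int i = 0; i < 4; i++) {
        /* All bits outside slice i must already be 1.  */
        uint64_t mask = ~(0xffffull << (i * 16));
        if ((val & mask) == mask) {
            return i;
        }
    }
    return -1;
}
```

For example, 0xffffffffffff00ff only clears bits in slice 0, so a single
NILL suffices, while 0x00000000ffffffff fails this test and is handled
by the zero-extension case instead.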

* [Qemu-devel] [PATCH 52/62] tcg-s390: Conditionalize OR IMMEDIATE instructions.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (50 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 51/62] tcg-s390: Conditionalize AND IMMEDIATE instructions Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 53/62] tcg-s390: Conditionalize XOR " Richard Henderson
                   ` (10 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The 32-bit immediate OR instructions are in the extended-immediate
facility.  Use these only if present.

At the same time, move the decision of when to load an immediate
into a register out of tgen64_ori and into a new TCG constraint
letter ('O').

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   92 +++++++++++++++++++++++++++++++++++++------------
 1 files changed, 70 insertions(+), 22 deletions(-)
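
The fallback path taken without the extended-immediate facility can be
sketched as below.  This is a minimal model, not the patch's code:
ori_fallback_plan is a hypothetical name, and it records the slices
instead of emitting OI[LH][LH] instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch.  For each non-zero 16-bit
   slice of VAL, record the slice index in idx[] and the 16-bit immediate
   in imm[].  Returns the number of OI[LH][LH] insns that would be needed.  */
static size_t ori_fallback_plan(uint64_t val, int idx[4], uint16_t imm[4])
{
    size_t n = 0;

    for (int i = 0; i < 4; i++) {
        uint64_t mask = 0xffffull << (i * 16);
        if ((val & mask) != 0) {
            idx[n] = i;
            imm[n] = (uint16_t)(val >> (i * 16));
            n++;
        }
    }
    return n;
}
```

A constant such as 0x0000123400000056 therefore costs two instructions,
one for slice 0 and one for slice 2; slices that are zero are skipped.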

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 359f6d1..36d4ad0 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -38,6 +38,7 @@
 #define TCG_CT_CONST_ADDI  0x0400
 #define TCG_CT_CONST_MULI  0x0800
 #define TCG_CT_CONST_ANDI  0x1000
+#define TCG_CT_CONST_ORI   0x2000
 
 #define TCG_TMP0           TCG_REG_R14
 
@@ -358,6 +359,10 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         ct->ct &= ~TCG_CT_REG;
         ct->ct |= TCG_CT_CONST_ANDI;
         break;
+    case 'O':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_ORI;
+        break;
     default:
         break;
     }
@@ -424,6 +429,36 @@ static int tcg_match_andi(int ct, tcg_target_ulong val)
     return 1;
 }
 
+/* Immediates to be used with logical OR.  This is an optimization only,
+   since a full 64-bit immediate OR can always be performed with 4 sequential
+   OI[LH][LH] instructions.  What we're looking for is immediates that we
+   can load efficiently, and the immediate load plus the reg-reg OR is
+   smaller than the sequential OI's.  */
+
+static int tcg_match_ori(int ct, tcg_target_long val)
+{
+    if (facilities & FACILITY_EXT_IMM) {
+        if (ct & TCG_CT_CONST_32) {
+            /* All 32-bit ORs can be performed with 1 48-bit insn.  */
+            return 1;
+        }
+    }
+
+    /* Look for negative values.  These are best to load with LGHI.  */
+    if (val < 0) {
+        if (val == (int16_t)val) {
+            return 0;
+        }
+        if (facilities & FACILITY_EXT_IMM) {
+            if (val == (int32_t)val) {
+                return 0;
+            }
+        }
+    }
+
+    return 1;
+}
+
 /* Test if a constant matches the constraint. */
 static int tcg_target_const_match(tcg_target_long val,
                                   const TCGArgConstraint *arg_ct)
@@ -465,6 +500,8 @@ static int tcg_target_const_match(tcg_target_long val,
         }
     } else if (ct & TCG_CT_CONST_ANDI) {
         return tcg_match_andi(ct, val);
+    } else if (ct & TCG_CT_CONST_ORI) {
+        return tcg_match_ori(ct, val);
     }
 
     return 0;
@@ -907,34 +944,45 @@ static void tgen64_ori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
 
     int i;
 
-    /* Zero-th, look for no-op.  */
+    /* Look for no-op.  */
     if (val == 0) {
         return;
     }
 
-    /* First, try all 32-bit insns that can perform it in one go.  */
-    for (i = 0; i < 4; i++) {
-        tcg_target_ulong mask = (0xffffull << i*16);
-        if ((val & mask) != 0 && (val & ~mask) == 0) {
-            tcg_out_insn_RI(s, oi_insns[i], dest, val >> i*16);
-            return;
+    if (facilities & FACILITY_EXT_IMM) {
+        /* Try all 32-bit insns that can perform it in one go.  */
+        for (i = 0; i < 4; i++) {
+            tcg_target_ulong mask = (0xffffull << i*16);
+            if ((val & mask) != 0 && (val & ~mask) == 0) {
+                tcg_out_insn_RI(s, oi_insns[i], dest, val >> i*16);
+                return;
+            }
         }
-    }
 
-    /* Second, try all 48-bit insns that can perform it in one go.  */
-    for (i = 0; i < 2; i++) {
-        tcg_target_ulong mask = (0xffffffffull << i*32);
-        if ((val & mask) != 0 && (val & ~mask) == 0) {
-            tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
-            return;
+        /* Try all 48-bit insns that can perform it in one go.  */
+        for (i = 0; i < 2; i++) {
+            tcg_target_ulong mask = (0xffffffffull << i*32);
+            if ((val & mask) != 0 && (val & ~mask) == 0) {
+                tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
+                return;
+            }
         }
-    }
 
-    /* Last, perform the OR via sequential modifications to the
-       high and low parts.  Do this via recursion to handle 16-bit
-       vs 32-bit masks in each half.  */
-    tgen64_ori(s, dest, val & 0x00000000ffffffffull);
-    tgen64_ori(s, dest, val & 0xffffffff00000000ull);
+        /* Perform the OR via sequential modifications to the high and
+           low parts.  Do this via recursion to handle 16-bit vs 32-bit
+           masks in each half.  */
+        tgen64_ori(s, dest, val & 0x00000000ffffffffull);
+        tgen64_ori(s, dest, val & 0xffffffff00000000ull);
+    } else {
+        /* With no extended-immediate facility, we don't need to be so
+           clever.  Just iterate over the insns and mask in the constant.  */
+        for (i = 0; i < 4; i++) {
+            tcg_target_ulong mask = (0xffffull << i*16);
+            if ((val & mask) != 0) {
+                tcg_out_insn_RI(s, oi_insns[i], dest, val >> i*16);
+            }
+        }
+    }
 }
 
 static void tgen64_xori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
@@ -1764,7 +1812,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
 
     { INDEX_op_and_i32, { "r", "0", "rWA" } },
-    { INDEX_op_or_i32, { "r", "0", "ri" } },
+    { INDEX_op_or_i32, { "r", "0", "rWO" } },
     { INDEX_op_xor_i32, { "r", "0", "ri" } },
     { INDEX_op_neg_i32, { "r", "r" } },
 
@@ -1825,7 +1873,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
 
     { INDEX_op_and_i64, { "r", "0", "rA" } },
-    { INDEX_op_or_i64, { "r", "0", "ri" } },
+    { INDEX_op_or_i64, { "r", "0", "rO" } },
     { INDEX_op_xor_i64, { "r", "0", "ri" } },
     { INDEX_op_neg_i64, { "r", "r" } },
 
-- 
1.7.0.1

* [Qemu-devel] [PATCH 53/62] tcg-s390: Conditionalize XOR IMMEDIATE instructions.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (51 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 52/62] tcg-s390: Conditionalize OR " Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 54/62] tcg-s390: Do not require the extended-immediate facility Richard Henderson
                   ` (9 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The immediate XOR instructions are in the extended-immediate
facility.  Use these only if present.

At the same time, move the decision of when to load an immediate
into a register out of tgen64_xori and into a new TCG constraint
letter ('X').

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   53 +++++++++++++++++++++++++++++++-----------------
 1 files changed, 34 insertions(+), 19 deletions(-)
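
To show why negative 32-bit constants are rejected by the constraint,
here is a small size comparison.  It is only a sketch: the function
names are hypothetical, and it assumes the usual instruction lengths of
6 bytes for the RIL format (XILF, XIHF, LGFI) and 4 bytes for the RRE
format (XGR).

```c
#include <stdint.h>

enum { RIL_BYTES = 6, RRE_BYTES = 4 };

/* Bytes needed to XOR by parts: XILF for the low half, XIHF for the
   high half, each only if that half is non-zero.  */
static int xori_by_parts_bytes(uint64_t val)
{
    int bytes = 0;

    if (val & 0xffffffffull) {
        bytes += RIL_BYTES;
    }
    if (val >> 32) {
        bytes += RIL_BYTES;
    }
    return bytes;
}

/* Bytes needed for LGFI + XGR, or -1 if VAL does not fit in a
   sign-extended 32-bit immediate.  */
static int xori_via_register_bytes(int64_t val)
{
    if (val < INT32_MIN || val > INT32_MAX) {
        return -1;
    }
    return RIL_BYTES + RRE_BYTES;
}
```

A sign-extended negative value has both halves non-zero, so XOR by
parts costs 12 bytes against 10 for the load plus register XOR.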

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 36d4ad0..084448a 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -39,6 +39,7 @@
 #define TCG_CT_CONST_MULI  0x0800
 #define TCG_CT_CONST_ANDI  0x1000
 #define TCG_CT_CONST_ORI   0x2000
+#define TCG_CT_CONST_XORI  0x4000
 
 #define TCG_TMP0           TCG_REG_R14
 
@@ -363,6 +364,10 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         ct->ct &= ~TCG_CT_REG;
         ct->ct |= TCG_CT_CONST_ORI;
         break;
+    case 'X':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_XORI;
+        break;
     default:
         break;
     }
@@ -459,6 +464,30 @@ static int tcg_match_ori(int ct, tcg_target_long val)
     return 1;
 }
 
+/* Immediates to be used with logical XOR.  This is almost, but not quite,
+   only an optimization.  XOR with immediate is only supported with the
+   extended-immediate facility.  That said, there are a few patterns for
+   which it is better to load the value into a register first.  */
+
+static int tcg_match_xori(int ct, tcg_target_long val)
+{
+    if ((facilities & FACILITY_EXT_IMM) == 0) {
+        return 0;
+    }
+
+    if (ct & TCG_CT_CONST_32) {
+        /* All 32-bit XORs can be performed with 1 48-bit insn.  */
+        return 1;
+    }
+
+    /* Look for negative values.  These are best to load with LGHI or LGFI.  */
+    if (val < 0 && val == (int32_t)val) {
+        return 0;
+    }
+
+    return 1;
+}
+
 /* Test if a constant matches the constraint. */
 static int tcg_target_const_match(tcg_target_long val,
                                   const TCGArgConstraint *arg_ct)
@@ -502,6 +531,8 @@ static int tcg_target_const_match(tcg_target_long val,
         return tcg_match_andi(ct, val);
     } else if (ct & TCG_CT_CONST_ORI) {
         return tcg_match_ori(ct, val);
+    } else if (ct & TCG_CT_CONST_XORI) {
+        return tcg_match_xori(ct, val);
     }
 
     return 0;
@@ -987,23 +1018,7 @@ static void tgen64_ori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
 
 static void tgen64_xori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
 {
-    tcg_target_long sval = val;
-
-    /* Zero-th, look for no-op.  */
-    if (val == 0) {
-        return;
-    }
-
-    /* First, look for 64-bit values for which it is better to load the
-       value first and perform the xor via registers.  This is true for
-       any 32-bit negative value, where the high 32-bits get flipped too.  */
-    if (sval < 0 && sval == (int32_t)sval) {
-        tcg_out_movi(s, TCG_TYPE_I64, TCG_TMP0, sval);
-        tcg_out_insn(s, RRE, XGR, dest, TCG_TMP0);
-        return;
-    }
-
-    /* Second, perform the xor by parts.  */
+    /* Perform the xor by parts.  */
     if (val & 0xffffffff) {
         tcg_out_insn(s, RIL, XILF, dest, val);
     }
@@ -1813,7 +1828,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_and_i32, { "r", "0", "rWA" } },
     { INDEX_op_or_i32, { "r", "0", "rWO" } },
-    { INDEX_op_xor_i32, { "r", "0", "ri" } },
+    { INDEX_op_xor_i32, { "r", "0", "rWX" } },
     { INDEX_op_neg_i32, { "r", "r" } },
 
     { INDEX_op_shl_i32, { "r", "0", "Ri" } },
@@ -1874,7 +1889,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_and_i64, { "r", "0", "rA" } },
     { INDEX_op_or_i64, { "r", "0", "rO" } },
-    { INDEX_op_xor_i64, { "r", "0", "ri" } },
+    { INDEX_op_xor_i64, { "r", "0", "rX" } },
     { INDEX_op_neg_i64, { "r", "r" } },
 
     { INDEX_op_shl_i64, { "r", "r", "Ri" } },
-- 
1.7.0.1

* [Qemu-devel] [PATCH 54/62] tcg-s390: Do not require the extended-immediate facility.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (52 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 53/62] tcg-s390: Conditionalize XOR " Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 55/62] tcg-s390: Use 16-bit branches for forward jumps Richard Henderson
                   ` (8 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

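All uses of the extended-immediate instructions are now conditional
on the facility being present, so drop the startup check that required
it and lower -march from z9-109 to z990.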

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 configure             |    2 +-
 tcg/s390/tcg-target.c |    4 ----
 2 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/configure b/configure
index 56dee88..f818198 100755
--- a/configure
+++ b/configure
@@ -701,7 +701,7 @@ case "$cpu" in
            host_guest_base="yes"
            ;;
     s390x)
-           QEMU_CFLAGS="-m64 -march=z9-109 $QEMU_CFLAGS"
+           QEMU_CFLAGS="-m64 -march=z990 $QEMU_CFLAGS"
            LDFLAGS="-m64 $LDFLAGS"
            host_guest_base="yes"
            ;;
diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 084448a..0dc71e2 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -2010,10 +2010,6 @@ static void query_facilities(void)
         fprintf(stderr, "TCG: long-displacement facility is required\n");
         fail = 1;
     }
-    if ((facilities & FACILITY_EXT_IMM) == 0) {
-        fprintf(stderr, "TCG: extended-immediate facility is required\n");
-        fail = 1;
-    }
     if (fail) {
         exit(-1);
     }
-- 
1.7.0.1

* [Qemu-devel] [PATCH 55/62] tcg-s390: Use 16-bit branches for forward jumps.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (53 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 54/62] tcg-s390: Do not require the extended-immediate facility Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 56/62] tcg-s390: Use the LOAD AND TEST instruction for compares Richard Henderson
                   ` (7 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

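Translation blocks are generally small enough that every label is
reachable with a 16-bit displacement.  Keep the 32-bit form available
behind USE_LONG_BRANCHES, just in case.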

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   27 ++++++++++++++++++++++-----
 1 files changed, 22 insertions(+), 5 deletions(-)
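
The displacement computed for R_390_PC16DBL can be sketched as follows.
This is an illustration under stated assumptions, not the patch's code:
encode_pc16dbl is a hypothetical helper that takes plain integer
addresses and returns failure where patch_reloc would assert.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch.  FIELD_ADDR is the address
   of the 16-bit displacement field, which sits 2 bytes after the start
   of the BRC insn (hence the -2 addend passed to tcg_out_reloc).  The
   displacement is counted in halfwords from the start of the insn.  */
static bool encode_pc16dbl(int64_t field_addr, int64_t target, int16_t *disp)
{
    int64_t delta = target - (field_addr - 2);

    if (delta % 2 != 0) {
        return false;           /* insns are halfword aligned */
    }
    delta /= 2;
    if (delta < INT16_MIN || delta > INT16_MAX) {
        return false;           /* would need the 32-bit form */
    }
    *disp = (int16_t)delta;
    return true;
}
```

With a signed 16-bit halfword count, the short form reaches roughly
64 KiB in either direction from the branch instruction.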

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 0dc71e2..697c5e4 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -33,6 +33,11 @@
     do { } while (0)
 #endif
 
+/* ??? The translation blocks produced by TCG are generally small enough to
+   be entirely reachable with a 16-bit displacement.  Leaving the option for
+   a 32-bit displacement here Just In Case.  */
+#define USE_LONG_BRANCHES 0
+
 #define TCG_CT_CONST_32    0x0100
 #define TCG_CT_CONST_NEG   0x0200
 #define TCG_CT_CONST_ADDI  0x0400
@@ -295,14 +300,22 @@ static uint8_t *tb_ret_addr;
 static uint64_t facilities;
 
 static void patch_reloc(uint8_t *code_ptr, int type,
-                tcg_target_long value, tcg_target_long addend)
+                        tcg_target_long value, tcg_target_long addend)
 {
-    uint32_t *code_ptr_32 = (uint32_t*)code_ptr;
-    tcg_target_long code_ptr_tlong = (tcg_target_long)code_ptr;
+    tcg_target_long code_ptr_tl = (tcg_target_long)code_ptr;
+    tcg_target_long pcrel2;
 
+    /* ??? Not the usual definition of "addend".  */
+    pcrel2 = (value - (code_ptr_tl + addend)) >> 1;
+    
     switch (type) {
+    case R_390_PC16DBL:
+        assert(pcrel2 == (int16_t)pcrel2);
+        *(int16_t *)code_ptr = pcrel2;
+        break;
     case R_390_PC32DBL:
-        *code_ptr_32 = (value - (code_ptr_tlong + addend)) >> 1;
+        assert(pcrel2 == (int32_t)pcrel2);
+        *(int32_t *)code_ptr = pcrel2;
         break;
     default:
         tcg_abort();
@@ -1081,10 +1094,14 @@ static void tgen_branch(TCGContext *s, int cc, int labelno)
     TCGLabel* l = &s->labels[labelno];
     if (l->has_value) {
         tgen_gotoi(s, cc, l->u.value);
-    } else {
+    } else if (USE_LONG_BRANCHES) {
         tcg_out16(s, RIL_BRCL | (cc << 4));
         tcg_out_reloc(s, s->code_ptr, R_390_PC32DBL, labelno, -2);
         s->code_ptr += 4;
+    } else {
+        tcg_out16(s, RI_BRC | (cc << 4));
+        tcg_out_reloc(s, s->code_ptr, R_390_PC16DBL, labelno, -2);
+        s->code_ptr += 2;
     }
 }
 
-- 
1.7.0.1

* [Qemu-devel] [PATCH 56/62] tcg-s390: Use the LOAD AND TEST instruction for compares.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (54 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 55/62] tcg-s390: Use 16-bit branches for forward jumps Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 57/62] tcg-s390: Use the COMPARE IMMEDIATE instructions " Richard Henderson
                   ` (6 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

This instruction is always available, and nicely eliminates
the constant load for comparisons against zero.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  133 +++++++++++++++++++++++++++++++++---------------
 1 files changed, 91 insertions(+), 42 deletions(-)
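
The remapping of unsigned conditions for a test against zero can be
checked with a small model.  This is a sketch only: the enum ucond and
all function names are hypothetical stand-ins, and ltr_cc_bit merely
models the signed classification that LOAD AND TEST performs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Branch-mask bits, with the same values as the S390_CC_* defines.  */
enum {
    CC_EQ = 8, CC_LT = 4, CC_GT = 2,
    CC_NE = CC_LT | CC_GT, CC_NEVER = 0, CC_ALWAYS = 15
};

/* Hypothetical stand-ins for the unsigned TCG conditions.  */
enum ucond { U_LT, U_LE, U_GT, U_GE };

static const uint8_t ucond_to_ltr_mask[4] = {
    [U_LT] = CC_NEVER,
    [U_LE] = CC_EQ,
    [U_GT] = CC_NE,
    [U_GE] = CC_ALWAYS,
};

/* Model of LOAD AND TEST: zero, negative (top bit set) or positive.  */
static int ltr_cc_bit(uint64_t v)
{
    return v == 0 ? CC_EQ : ((v >> 63) ? CC_LT : CC_GT);
}

/* Would a branch on the remapped mask be taken?  */
static bool ltr_branch_taken(enum ucond c, uint64_t v)
{
    return (ucond_to_ltr_mask[c] & ltr_cc_bit(v)) != 0;
}

/* Reference: the unsigned comparison of V against zero.  */
static bool unsigned_vs_zero(enum ucond c, uint64_t v)
{
    switch (c) {
    case U_LT: return false;    /* nothing is below zero */
    case U_LE: return v == 0;
    case U_GT: return v != 0;
    default:   return true;     /* U_GE always holds */
    }
}
```

A value with the top bit set is "negative" to LOAD AND TEST but still
greater than zero as an unsigned number, which is why GTU maps to NE
rather than GT.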

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 697c5e4..edae6a8 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -45,6 +45,7 @@
 #define TCG_CT_CONST_ANDI  0x1000
 #define TCG_CT_CONST_ORI   0x2000
 #define TCG_CT_CONST_XORI  0x4000
+#define TCG_CT_CONST_CMPI  0x8000
 
 #define TCG_TMP0           TCG_REG_R14
 
@@ -126,6 +127,7 @@ typedef enum S390Opcode {
     RRE_LLGHR   = 0xb985,
     RRE_LRVR    = 0xb91f,
     RRE_LRVGR   = 0xb90f,
+    RRE_LTGR    = 0xb902,
     RRE_MSGR    = 0xb90c,
     RRE_MSR     = 0xb252,
     RRE_NGR     = 0xb980,
@@ -141,6 +143,7 @@ typedef enum S390Opcode {
     RR_DR       = 0x1d,
     RR_LCR      = 0x13,
     RR_LR       = 0x18,
+    RR_LTR      = 0x12,
     RR_NR       = 0x14,
     RR_OR       = 0x16,
     RR_SR       = 0x1b,
@@ -242,9 +245,6 @@ static const int tcg_target_call_oarg_regs[] = {
     TCG_REG_R3,
 };
 
-/* signed/unsigned is handled by using COMPARE and COMPARE LOGICAL,
-   respectively */
-
 #define S390_CC_EQ      8
 #define S390_CC_LT      4
 #define S390_CC_GT      2
@@ -252,19 +252,37 @@ static const int tcg_target_call_oarg_regs[] = {
 #define S390_CC_NE      (S390_CC_LT | S390_CC_GT)
 #define S390_CC_LE      (S390_CC_LT | S390_CC_EQ)
 #define S390_CC_GE      (S390_CC_GT | S390_CC_EQ)
+#define S390_CC_NEVER   0
 #define S390_CC_ALWAYS  15
 
+/* Condition codes that result from COMPARE and COMPARE LOGICAL.  */
 static const uint8_t tcg_cond_to_s390_cond[10] = {
     [TCG_COND_EQ]  = S390_CC_EQ,
+    [TCG_COND_NE]  = S390_CC_NE,
     [TCG_COND_LT]  = S390_CC_LT,
-    [TCG_COND_LTU] = S390_CC_LT,
     [TCG_COND_LE]  = S390_CC_LE,
-    [TCG_COND_LEU] = S390_CC_LE,
     [TCG_COND_GT]  = S390_CC_GT,
-    [TCG_COND_GTU] = S390_CC_GT,
     [TCG_COND_GE]  = S390_CC_GE,
+    [TCG_COND_LTU] = S390_CC_LT,
+    [TCG_COND_LEU] = S390_CC_LE,
+    [TCG_COND_GTU] = S390_CC_GT,
     [TCG_COND_GEU] = S390_CC_GE,
+};
+
+/* Condition codes that result from a LOAD AND TEST.  There is no unsigned
+   variant of the instruction; however, since the test is against zero, we
+   can remap the outcomes appropriately.  */
+static const uint8_t tcg_cond_to_ltr_cond[10] = {
+    [TCG_COND_EQ]  = S390_CC_EQ,
     [TCG_COND_NE]  = S390_CC_NE,
+    [TCG_COND_LT]  = S390_CC_LT,
+    [TCG_COND_LE]  = S390_CC_LE,
+    [TCG_COND_GT]  = S390_CC_GT,
+    [TCG_COND_GE]  = S390_CC_GE,
+    [TCG_COND_LTU] = S390_CC_NEVER,
+    [TCG_COND_LEU] = S390_CC_EQ,
+    [TCG_COND_GTU] = S390_CC_NE,
+    [TCG_COND_GEU] = S390_CC_ALWAYS,
 };
 
 #ifdef CONFIG_SOFTMMU
@@ -381,6 +399,10 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         ct->ct &= ~TCG_CT_REG;
         ct->ct |= TCG_CT_CONST_XORI;
         break;
+    case 'C':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_CMPI;
+        break;
     default:
         break;
     }
@@ -501,6 +523,13 @@ static int tcg_match_xori(int ct, tcg_target_long val)
     return 1;
 }
 
+/* Immediates to be used with comparisons.  */
+
+static int tcg_match_cmpi(int ct, tcg_target_long val)
+{
+    return (val == 0);
+}
+
 /* Test if a constant matches the constraint. */
 static int tcg_target_const_match(tcg_target_long val,
                                   const TCGArgConstraint *arg_ct)
@@ -546,6 +575,8 @@ static int tcg_target_const_match(tcg_target_long val,
         return tcg_match_ori(ct, val);
     } else if (ct & TCG_CT_CONST_XORI) {
         return tcg_match_xori(ct, val);
+    } else if (ct & TCG_CT_CONST_CMPI) {
+        return tcg_match_cmpi(ct, val);
     }
 
     return 0;
@@ -1040,39 +1071,48 @@ static void tgen64_xori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
     }
 }
 
-static void tgen32_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
-{
-    if (c > TCG_COND_GT) {
-        /* unsigned */
-        tcg_out_insn(s, RR, CLR, r1, r2);
-    } else {
-        /* signed */
-        tcg_out_insn(s, RR, CR, r1, r2);
-    }
-}
-
-static void tgen64_cmp(TCGContext *s, TCGCond c, TCGReg r1, TCGReg r2)
+static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
+                    TCGArg c2, int c2const)
 {
-    if (c > TCG_COND_GT) {
-        /* unsigned */
-        tcg_out_insn(s, RRE, CLGR, r1, r2);
+    if (c2const) {
+        if (c2 == 0) {
+            if (type == TCG_TYPE_I32) {
+                tcg_out_insn(s, RR, LTR, r1, r1);
+            } else {
+                tcg_out_insn(s, RRE, LTGR, r1, r1);
+            }
+            return tcg_cond_to_ltr_cond[c];
+        } else {
+            tcg_abort();
+        }
     } else {
-        /* signed */
-        tcg_out_insn(s, RRE, CGR, r1, r2);
+        if (c > TCG_COND_GT) {
+            /* unsigned */
+            if (type == TCG_TYPE_I32) {
+                tcg_out_insn(s, RR, CLR, r1, c2);
+            } else {
+                tcg_out_insn(s, RRE, CLGR, r1, c2);
+            }
+        } else {
+            /* signed */
+            if (type == TCG_TYPE_I32) {
+                tcg_out_insn(s, RR, CR, r1, c2);
+            } else {
+                tcg_out_insn(s, RRE, CGR, r1, c2);
+            }
+        }
     }
+    return tcg_cond_to_s390_cond[c];
 }
 
 static void tgen_setcond(TCGContext *s, TCGType type, TCGCond c,
-                         TCGReg dest, TCGReg r1, TCGReg r2)
+                         TCGReg dest, TCGReg r1, TCGArg c2, int c2const)
 {
-    if (type == TCG_TYPE_I32) {
-        tgen32_cmp(s, c, r1, r2);
-    } else {
-        tgen64_cmp(s, c, r1, r2);
-    }
+    int cc = tgen_cmp(s, type, c, r1, c2, c2const);
+
     /* Emit: r1 = 1; if (cc) goto over; r1 = 0; over:  */
     tcg_out_movi(s, type, dest, 1);
-    tcg_out_insn(s, RI, BRC, tcg_cond_to_s390_cond[c], (4 + 4) >> 1);
+    tcg_out_insn(s, RI, BRC, cc, (4 + 4) >> 1);
     tcg_out_movi(s, type, dest, 0);
 }
 
@@ -1105,6 +1145,13 @@ static void tgen_branch(TCGContext *s, int cc, int labelno)
     }
 }
 
+static void tgen_brcond(TCGContext *s, TCGType type, TCGCond c,
+                        TCGReg r1, TCGArg c2, int c2const, int labelno)
+{
+    int cc = tgen_cmp(s, type, c, r1, c2, c2const);
+    tgen_branch(s, cc, labelno);
+}
+
 static void tgen_calli(TCGContext *s, tcg_target_long dest)
 {
     tcg_target_long off = (dest - (tcg_target_long)s->code_ptr) >> 1;
@@ -1739,20 +1786,22 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tgen_branch(s, S390_CC_ALWAYS, args[0]);
         break;
 
-    case INDEX_op_brcond_i64:
-        tgen64_cmp(s, args[2], args[0], args[1]);
-        goto do_brcond;
     case INDEX_op_brcond_i32:
-        tgen32_cmp(s, args[2], args[0], args[1]);
-    do_brcond:
-        tgen_branch(s, tcg_cond_to_s390_cond[args[2]], args[3]);
+        tgen_brcond(s, TCG_TYPE_I32, args[2], args[0],
+                    args[1], const_args[1], args[3]);
+        break;
+    case INDEX_op_brcond_i64:
+        tgen_brcond(s, TCG_TYPE_I64, args[2], args[0],
+                    args[1], const_args[1], args[3]);
         break;
 
     case INDEX_op_setcond_i32:
-        tgen_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1], args[2]);
+        tgen_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1],
+                     args[2], const_args[2]);
         break;
     case INDEX_op_setcond_i64:
-        tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1], args[2]);
+        tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1],
+                     args[2], const_args[2]);
         break;
 
     case INDEX_op_qemu_ld8u:
@@ -1863,8 +1912,8 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_bswap16_i32, { "r", "r" } },
     { INDEX_op_bswap32_i32, { "r", "r" } },
 
-    { INDEX_op_brcond_i32, { "r", "r" } },
-    { INDEX_op_setcond_i32, { "r", "r", "r" } },
+    { INDEX_op_brcond_i32, { "r", "rWC" } },
+    { INDEX_op_setcond_i32, { "r", "r", "rWC" } },
 
     { INDEX_op_qemu_ld8u, { "r", "L" } },
     { INDEX_op_qemu_ld8s, { "r", "L" } },
@@ -1927,8 +1976,8 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_bswap32_i64, { "r", "r" } },
     { INDEX_op_bswap64_i64, { "r", "r" } },
 
-    { INDEX_op_brcond_i64, { "r", "r" } },
-    { INDEX_op_setcond_i64, { "r", "r", "r" } },
+    { INDEX_op_brcond_i64, { "r", "rC" } },
+    { INDEX_op_setcond_i64, { "r", "r", "rC" } },
 #endif
 
     { -1 },
-- 
1.7.0.1


* [Qemu-devel] [PATCH 57/62] tcg-s390: Use the COMPARE IMMEDIATE instructions for compares.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (55 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 56/62] tcg-s390: Use the LOAD AND TEST instruction for compares Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 58/62] tcg-s390: Use COMPARE AND BRANCH instructions Richard Henderson
                   ` (5 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

These instructions are available with the extended-immediate facility.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   44 ++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 42 insertions(+), 2 deletions(-)
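
A note on the constant ranges accepted by tcg_match_cmpi, as a minimal
standalone sketch that is not part of the patch.  The name cmpi_const_ok
and its two flag parameters are hypothetical stand-ins for the
FACILITY_EXT_IMM test and the TCG_CT_CONST_32 bit.  It illustrates why
the 64-bit case accepts only 0 through 0x7fffffff: that is the overlap
of the signed 32-bit immediate range (CGFI) and the unsigned 32-bit
immediate range (CLGFI).

```c
#include <stdint.h>

/* Hypothetical helper mirroring the decision made in tcg_match_cmpi.
   have_ext_imm: nonzero if the extended-immediate facility is present.
   is_32bit:     nonzero for a 32-bit comparison (TCG_CT_CONST_32).  */
static int cmpi_const_ok(int have_ext_imm, int is_32bit, int64_t val)
{
    if (!have_ext_imm) {
        /* Only LOAD AND TEST is usable, i.e. a compare against zero.  */
        return val == 0;
    }
    if (is_32bit) {
        /* Any 32-bit pattern can be encoded in either CFI or CLFI.  */
        return 1;
    }
    /* The signedness of the comparison is unknown here, so accept only
       values that are valid as both a signed and an unsigned 32-bit
       immediate.  */
    int fits_signed = (val == (int32_t)val);
    int fits_unsigned = (val == (int64_t)(uint32_t)val);
    return fits_signed && fits_unsigned;
}
```

Written this way, the intersection comes out the same as the
"val >= 0 && val <= 0x7fffffff" test used in the patch.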

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index edae6a8..5af8bc9 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -70,6 +70,10 @@ typedef enum S390Opcode {
     RIL_ALGFI   = 0xc20a,
     RIL_BRASL   = 0xc005,
     RIL_BRCL    = 0xc004,
+    RIL_CFI     = 0xc20d,
+    RIL_CGFI    = 0xc20c,
+    RIL_CLFI    = 0xc20f,
+    RIL_CLGFI   = 0xc20e,
     RIL_IIHF    = 0xc008,
     RIL_IILF    = 0xc009,
     RIL_LARL    = 0xc000,
@@ -527,7 +531,29 @@ static int tcg_match_xori(int ct, tcg_target_long val)
 
 static int tcg_match_cmpi(int ct, tcg_target_long val)
 {
-    return (val == 0);
+    if (facilities & FACILITY_EXT_IMM) {
+        /* The COMPARE IMMEDIATE instruction is available.  */
+        if (ct & TCG_CT_CONST_32) {
+            /* We have a 32-bit immediate and can compare against anything.  */
+            return 1;
+        } else {
+            /* ??? We have no insight here into whether the comparison is
+               signed or unsigned.  The COMPARE IMMEDIATE insn uses a 32-bit
+               signed immediate, and the COMPARE LOGICAL IMMEDIATE insn uses
+               a 32-bit unsigned immediate.  If we were to use the (semi)
+               obvious "val == (int32_t)val" we would be enabling unsigned
+               comparisons vs very large numbers.  The only solution is to
+               take the intersection of the ranges.  */
+            /* ??? Another possible solution is to simply lie and allow all
+               constants here and force the out-of-range values into a temp
+               register in tgen_cmp when we have knowledge of the actual
+               comparison code in use.  */
+            return val >= 0 && val <= 0x7fffffff;
+        }
+    } else {
+        /* Only the LOAD AND TEST instruction is available.  */
+        return val == 0;
+    }
 }
 
 /* Test if a constant matches the constraint. */
@@ -1083,7 +1109,21 @@ static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
             }
             return tcg_cond_to_ltr_cond[c];
         } else {
-            tcg_abort();
+            if (c > TCG_COND_GT) {
+                /* unsigned */
+                if (type == TCG_TYPE_I32) {
+                    tcg_out_insn(s, RIL, CLFI, r1, c2);
+                } else {
+                    tcg_out_insn(s, RIL, CLGFI, r1, c2);
+                }
+            } else {
+                /* signed */
+                if (type == TCG_TYPE_I32) {
+                    tcg_out_insn(s, RIL, CFI, r1, c2);
+                } else {
+                    tcg_out_insn(s, RIL, CGFI, r1, c2);
+                }
+            }
         }
     } else {
         if (c > TCG_COND_GT) {
-- 
1.7.0.1


* [Qemu-devel] [PATCH 58/62] tcg-s390: Use COMPARE AND BRANCH instructions.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (56 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 57/62] tcg-s390: Use the COMPARE IMMEDIATE instructions " Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 59/62] tcg-s390: Generalize load/store support Richard Henderson
                   ` (4 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

These instructions are available with the general-instructions-extension
facility.  Use them if available.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |  102 +++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 95 insertions(+), 7 deletions(-)
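
For readers following the range checks and the instruction layout, here
is a small standalone sketch; it is not part of the patch.  The helper
names fits_imm8 and pack_rie_rr are hypothetical.  The first mirrors the
casts tgen_brcond uses to decide whether a constant fits the 8-bit
immediate field; the second packs the three halfwords the same way
tgen_compare_branch does for the register-register form.

```c
#include <stdint.h>

/* Hypothetical helper: does c2 fit the 8-bit immediate of CIJ/CGIJ
   (signed) or CLIJ/CLGIJ (unsigned)?  */
static int fits_imm8(int is_64bit, int is_unsigned, int64_t c2)
{
    if (is_64bit) {
        return is_unsigned ? (uint64_t)c2 == (uint8_t)c2
                           : c2 == (int8_t)c2;
    }
    return is_unsigned ? (uint32_t)c2 == (uint8_t)c2
                       : (int32_t)c2 == (int8_t)c2;
}

/* Hypothetical helper: pack the three halfwords of a register-register
   compare-and-branch.  opc is the split 16-bit opcode value from the
   S390Opcode enum (e.g. 0xec76 for CRJ); off is the branch offset in
   halfwords.  */
static void pack_rie_rr(uint16_t out[3], unsigned opc, unsigned cc,
                        unsigned r1, unsigned r2, int16_t off)
{
    out[0] = (opc & 0xff00) | (r1 << 4) | r2;   /* opcode high, r1, r2 */
    out[1] = (uint16_t)off;                     /* 16-bit relative offset */
    out[2] = (cc << 12) | (opc & 0xff);         /* mask, opcode low */
}
```

Constants that fail fits_imm8 take the fallback path in the patch: a
separate compare via tgen_cmp followed by tgen_branch.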

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 5af8bc9..4e3fb8b 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -114,6 +114,15 @@ typedef enum S390Opcode {
     RI_OILH     = 0xa50a,
     RI_OILL     = 0xa50b,
 
+    RIE_CGIJ    = 0xec7c,
+    RIE_CGRJ    = 0xec64,
+    RIE_CIJ     = 0xec7e,
+    RIE_CLGRJ   = 0xec65,
+    RIE_CLIJ    = 0xec7f,
+    RIE_CLGIJ   = 0xec7d,
+    RIE_CLRJ    = 0xec77,
+    RIE_CRJ     = 0xec76,
+
     RRE_AGR     = 0xb908,
     RRE_CGR     = 0xb920,
     RRE_CLGR    = 0xb921,
@@ -1100,6 +1109,7 @@ static void tgen64_xori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
 static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
                     TCGArg c2, int c2const)
 {
+    _Bool is_unsigned = (c > TCG_COND_GT);
     if (c2const) {
         if (c2 == 0) {
             if (type == TCG_TYPE_I32) {
@@ -1109,15 +1119,13 @@ static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
             }
             return tcg_cond_to_ltr_cond[c];
         } else {
-            if (c > TCG_COND_GT) {
-                /* unsigned */
+            if (is_unsigned) {
                 if (type == TCG_TYPE_I32) {
                     tcg_out_insn(s, RIL, CLFI, r1, c2);
                 } else {
                     tcg_out_insn(s, RIL, CLGFI, r1, c2);
                 }
             } else {
-                /* signed */
                 if (type == TCG_TYPE_I32) {
                     tcg_out_insn(s, RIL, CFI, r1, c2);
                 } else {
@@ -1126,15 +1134,13 @@ static int tgen_cmp(TCGContext *s, TCGType type, TCGCond c, TCGReg r1,
             }
         }
     } else {
-        if (c > TCG_COND_GT) {
-            /* unsigned */
+        if (is_unsigned) {
             if (type == TCG_TYPE_I32) {
                 tcg_out_insn(s, RR, CLR, r1, c2);
             } else {
                 tcg_out_insn(s, RRE, CLGR, r1, c2);
             }
         } else {
-            /* signed */
             if (type == TCG_TYPE_I32) {
                 tcg_out_insn(s, RR, CR, r1, c2);
             } else {
@@ -1185,10 +1191,92 @@ static void tgen_branch(TCGContext *s, int cc, int labelno)
     }
 }
 
+static void tgen_compare_branch(TCGContext *s, S390Opcode opc, int cc,
+                                TCGReg r1, TCGReg r2, int labelno)
+{
+    TCGLabel* l = &s->labels[labelno];
+    tcg_target_long off;
+
+    if (l->has_value) {
+        off = (l->u.value - (tcg_target_long)s->code_ptr) >> 1;
+    } else {
+        /* We need to keep the offset unchanged for retranslation.  */
+        off = ((int16_t *)s->code_ptr)[1];
+        tcg_out_reloc(s, s->code_ptr + 2, R_390_PC16DBL, labelno, -2);
+    }
+
+    tcg_out16(s, (opc & 0xff00) | (r1 << 4) | r2);
+    tcg_out16(s, off);
+    tcg_out16(s, cc << 12 | (opc & 0xff));
+}
+
+static void tgen_compare_imm_branch(TCGContext *s, S390Opcode opc, int cc,
+                                    TCGReg r1, int i2, int labelno)
+{
+    TCGLabel* l = &s->labels[labelno];
+    tcg_target_long off;
+
+    if (l->has_value) {
+        off = (l->u.value - (tcg_target_long)s->code_ptr) >> 1;
+    } else {
+        /* We need to keep the offset unchanged for retranslation.  */
+        off = ((int16_t *)s->code_ptr)[1];
+        tcg_out_reloc(s, s->code_ptr + 2, R_390_PC16DBL, labelno, -2);
+    }
+
+    tcg_out16(s, (opc & 0xff00) | (r1 << 4) | cc);
+    tcg_out16(s, off);
+    tcg_out16(s, (i2 << 8) | (opc & 0xff));
+}
+
 static void tgen_brcond(TCGContext *s, TCGType type, TCGCond c,
                         TCGReg r1, TCGArg c2, int c2const, int labelno)
 {
-    int cc = tgen_cmp(s, type, c, r1, c2, c2const);
+    int cc;
+
+    if (facilities & FACILITY_GEN_INST_EXT) {
+        _Bool is_unsigned = (c > TCG_COND_GT);
+        _Bool in_range;
+        S390Opcode opc;
+
+        cc = tcg_cond_to_s390_cond[c];
+
+        if (!c2const) {
+            opc = (type == TCG_TYPE_I32
+                   ? (is_unsigned ? RIE_CLRJ : RIE_CRJ)
+                   : (is_unsigned ? RIE_CLGRJ : RIE_CGRJ));
+            tgen_compare_branch(s, opc, cc, r1, c2, labelno);
+            return;
+        }
+
+        /* COMPARE IMMEDIATE AND BRANCH RELATIVE has an 8-bit immediate field.
+           If the immediate we've been given does not fit that range, we'll
+           fall back to separate compare and branch instructions using the
+           larger comparison range afforded by COMPARE IMMEDIATE.  */
+        if (type == TCG_TYPE_I32) {
+            if (is_unsigned) {
+                opc = RIE_CLIJ;
+                in_range = (uint32_t)c2 == (uint8_t)c2;
+            } else {
+                opc = RIE_CIJ;
+                in_range = (int32_t)c2 == (int8_t)c2;
+            }
+        } else {
+            if (is_unsigned) {
+                opc = RIE_CLGIJ;
+                in_range = (uint64_t)c2 == (uint8_t)c2;
+            } else {
+                opc = RIE_CGIJ;
+                in_range = (int64_t)c2 == (int8_t)c2;
+            }
+        }
+        if (in_range) {
+            tgen_compare_imm_branch(s, opc, cc, r1, c2, labelno);
+            return;
+        }
+    }
+
+    cc = tgen_cmp(s, type, c, r1, c2, c2const);
     tgen_branch(s, cc, labelno);
 }
 
-- 
1.7.0.1


* [Qemu-devel] [PATCH 59/62] tcg-s390: Generalize load/store support.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (57 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 58/62] tcg-s390: Use COMPARE AND BRANCH instructions Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 60/62] tcg-s390: Fix TLB comparison width Richard Henderson
                   ` (3 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Rename tcg_out_ldst to tcg_out_mem and add an index parameter.  If an
index register is supplied and the offset is too large for the
displacement field, add the index into the temporary that holds the
(partially) loaded offset.

Rename SH{32,64}_REG_NONE to TCG_REG_NONE, as the concept of a missing
register is not unique to the shift operations.

Adjust all users of tcg_out_mem to add TCG_REG_NONE as the index.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   94 +++++++++++++++++++++++++-----------------------
 1 files changed, 49 insertions(+), 45 deletions(-)
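
The large-offset handling can be illustrated with a standalone sketch;
it is not part of the patch, and the helper names fits_disp20 and
split_offset are hypothetical.  It shows the range test tcg_out_mem
applies and how an out-of-range offset is divided between the temporary
register and the displacement field.

```c
#include <stdint.h>

/* Hypothetical helper: nonzero if ofs fits the signed 20-bit
   displacement that tcg_out_mem tests for.  */
static int fits_disp20(int64_t ofs)
{
    return ofs >= -0x80000 && ofs < 0x80000;
}

/* Hypothetical helper: split an out-of-range offset as tcg_out_mem
   does.  *hi is the part loaded into the temporary register (and, if
   an index register was supplied, added to it); *lo is the low 16 bits
   that remain in the instruction's displacement field.  */
static void split_offset(int64_t ofs, int64_t *hi, int64_t *lo)
{
    *hi = ofs & ~(int64_t)0xffff;
    *lo = ofs & 0xffff;
}
```

Since hi + lo always equals the original offset, the effective address
base + index + ofs is unchanged by the split.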

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 4e3fb8b..6101255 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -47,6 +47,7 @@
 #define TCG_CT_CONST_XORI  0x4000
 #define TCG_CT_CONST_CMPI  0x8000
 
+#define TCG_REG_NONE       TCG_REG_R0
 #define TCG_TMP0           TCG_REG_R14
 
 #ifdef CONFIG_USE_GUEST_BASE
@@ -204,9 +205,6 @@ typedef enum S390Opcode {
     RX_STH      = 0x40,
 } S390Opcode;
 
-#define SH32_REG_NONE  0
-#define SH64_REG_NONE  0
-
 #define LD_SIGNED      0x04
 #define LD_UINT8       0x00
 #define LD_INT8        (LD_UINT8 | LD_SIGNED)
@@ -338,7 +336,7 @@ static void patch_reloc(uint8_t *code_ptr, int type,
 
     /* ??? Not the usual definition of "addend".  */
     pcrel2 = (value - (code_ptr_tl + addend)) >> 1;
-    
+
     switch (type) {
     case R_390_PC16DBL:
         assert(pcrel2 == (int16_t)pcrel2);
@@ -597,7 +595,7 @@ static int tcg_target_const_match(tcg_target_long val,
     } else if (ct & TCG_CT_CONST_MULI) {
         /* Immediates that may be used with multiply.  If we have the
            general-instruction-extensions, then we have MULTIPLY SINGLE
-           IMMEDIATE with a signed 32-bit, otherwise we have only 
+           IMMEDIATE with a signed 32-bit, otherwise we have only
            MULTIPLY HALFWORD IMMEDIATE, with a signed 16-bit.  */
         if (facilities & FACILITY_GEN_INST_EXT) {
             return val == (int32_t)val;
@@ -799,17 +797,21 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
    OPC_RX:   If the operation has an RX format opcode (e.g. STC), otherwise 0.
    OPC_RXY:  The RXY format opcode for the operation (e.g. STCY).  */
 
-static void tcg_out_ldst(TCGContext *s, S390Opcode opc_rx, S390Opcode opc_rxy,
-                         TCGReg data, TCGReg base, tcg_target_long ofs)
+static void tcg_out_mem(TCGContext *s, S390Opcode opc_rx, S390Opcode opc_rxy,
+                        TCGReg data, TCGReg base, TCGReg index,
+                        tcg_target_long ofs)
 {
-    TCGReg index = 0;
-
     if (ofs < -0x80000 || ofs >= 0x80000) {
         /* Combine the low 16 bits of the offset with the actual load insn;
            the high 48 bits must come from an immediate load.  */
-        index = TCG_TMP0;
-        tcg_out_movi(s, TCG_TYPE_PTR, index, ofs & ~0xffff);
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_TMP0, ofs & ~0xffff);
         ofs &= 0xffff;
+
+        /* If we were already given an index register, add it in.  */
+        if (index != TCG_REG_NONE) {
+            tcg_out_insn(s, RRE, AGR, TCG_TMP0, index);
+        }
+        index = TCG_TMP0;
     }
 
     if (opc_rx && ofs >= 0 && ofs < 0x1000) {
@@ -825,9 +827,9 @@ static inline void tcg_out_ld(TCGContext *s, TCGType type, TCGReg data,
                               TCGReg base, tcg_target_long ofs)
 {
     if (type == TCG_TYPE_I32) {
-        tcg_out_ldst(s, RX_L, RXY_LY, data, base, ofs);
+        tcg_out_mem(s, RX_L, RXY_LY, data, base, TCG_REG_NONE, ofs);
     } else {
-        tcg_out_ldst(s, 0, RXY_LG, data, base, ofs);
+        tcg_out_mem(s, 0, RXY_LG, data, base, TCG_REG_NONE, ofs);
     }
 }
 
@@ -835,9 +837,9 @@ static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg data,
                               TCGReg base, tcg_target_long ofs)
 {
     if (type == TCG_TYPE_I32) {
-        tcg_out_ldst(s, RX_ST, RXY_STY, data, base, ofs);
+        tcg_out_mem(s, RX_ST, RXY_STY, data, base, TCG_REG_NONE, ofs);
     } else {
-        tcg_out_ldst(s, 0, RXY_STG, data, base, ofs);
+        tcg_out_mem(s, 0, RXY_STG, data, base, TCG_REG_NONE, ofs);
     }
 }
 
@@ -871,14 +873,14 @@ static void tgen_ext8s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
 
     if (type == TCG_TYPE_I32) {
         if (dest == src) {
-            tcg_out_sh32(s, RS_SLL, dest, SH32_REG_NONE, 24);
+            tcg_out_sh32(s, RS_SLL, dest, TCG_REG_NONE, 24);
         } else {
-            tcg_out_sh64(s, RSY_SLLG, dest, src, SH64_REG_NONE, 24);
+            tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 24);
         }
-        tcg_out_sh32(s, RS_SRA, dest, SH32_REG_NONE, 24);
+        tcg_out_sh32(s, RS_SRA, dest, TCG_REG_NONE, 24);
     } else {
-        tcg_out_sh64(s, RSY_SLLG, dest, src, SH64_REG_NONE, 56);
-        tcg_out_sh64(s, RSY_SRAG, dest, dest, SH64_REG_NONE, 56);
+        tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 56);
+        tcg_out_sh64(s, RSY_SRAG, dest, dest, TCG_REG_NONE, 56);
     }
 }
 
@@ -911,14 +913,14 @@ static void tgen_ext16s(TCGContext *s, TCGType type, TCGReg dest, TCGReg src)
 
     if (type == TCG_TYPE_I32) {
         if (dest == src) {
-            tcg_out_sh32(s, RS_SLL, dest, SH32_REG_NONE, 16);
+            tcg_out_sh32(s, RS_SLL, dest, TCG_REG_NONE, 16);
         } else {
-            tcg_out_sh64(s, RSY_SLLG, dest, src, SH64_REG_NONE, 16);
+            tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 16);
         }
-        tcg_out_sh32(s, RS_SRA, dest, SH32_REG_NONE, 24);
+        tcg_out_sh32(s, RS_SRA, dest, TCG_REG_NONE, 24);
     } else {
-        tcg_out_sh64(s, RSY_SLLG, dest, src, SH64_REG_NONE, 48);
-        tcg_out_sh64(s, RSY_SRAG, dest, dest, SH64_REG_NONE, 48);
+        tcg_out_sh64(s, RSY_SLLG, dest, src, TCG_REG_NONE, 48);
+        tcg_out_sh64(s, RSY_SRAG, dest, dest, TCG_REG_NONE, 48);
     }
 }
 
@@ -1427,7 +1429,7 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
         tcg_out_mov(s, arg0, addr_reg);
     }
 
-    tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, SH64_REG_NONE,
+    tcg_out_sh64(s, RSY_SRLG, arg1, addr_reg, TCG_REG_NONE,
                  TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
 
     tgen64_andi_tmp(s, arg0, TARGET_PAGE_MASK | ((1 << s_bits) - 1));
@@ -1615,37 +1617,37 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_ld8u_i64:
         /* ??? LLC (RXY format) is only present with the extended-immediate
            facility, whereas LLGC is always present.  */
-        tcg_out_ldst(s, 0, RXY_LLGC, args[0], args[1], args[2]);
+        tcg_out_mem(s, 0, RXY_LLGC, args[0], args[1], TCG_REG_NONE, args[2]);
         break;
 
     case INDEX_op_ld8s_i32:
     case INDEX_op_ld8s_i64:
         /* ??? LB is no smaller than LGB, so no point to using it.  */
-        tcg_out_ldst(s, 0, RXY_LGB, args[0], args[1], args[2]);
+        tcg_out_mem(s, 0, RXY_LGB, args[0], args[1], TCG_REG_NONE, args[2]);
         break;
 
     case INDEX_op_ld16u_i32:
     case INDEX_op_ld16u_i64:
         /* ??? LLH (RXY format) is only present with the extended-immediate
            facility, whereas LLGH is always present.  */
-        tcg_out_ldst(s, 0, RXY_LLGH, args[0], args[1], args[2]);
+        tcg_out_mem(s, 0, RXY_LLGH, args[0], args[1], TCG_REG_NONE, args[2]);
         break;
 
     case INDEX_op_ld16s_i32:
-        tcg_out_ldst(s, RX_LH, RXY_LHY, args[0], args[1], args[2]);
+        tcg_out_mem(s, RX_LH, RXY_LHY, args[0], args[1], TCG_REG_NONE, args[2]);
         break;
     case INDEX_op_ld16s_i64:
-        tcg_out_ldst(s, 0, RXY_LGH, args[0], args[1], args[2]);
+        tcg_out_mem(s, 0, RXY_LGH, args[0], args[1], TCG_REG_NONE, args[2]);
         break;
 
     case INDEX_op_ld_i32:
         tcg_out_ld(s, TCG_TYPE_I32, args[0], args[1], args[2]);
         break;
     case INDEX_op_ld32u_i64:
-        tcg_out_ldst(s, 0, RXY_LLGF, args[0], args[1], args[2]);
+        tcg_out_mem(s, 0, RXY_LLGF, args[0], args[1], TCG_REG_NONE, args[2]);
         break;
     case INDEX_op_ld32s_i64:
-        tcg_out_ldst(s, 0, RXY_LGF, args[0], args[1], args[2]);
+        tcg_out_mem(s, 0, RXY_LGF, args[0], args[1], TCG_REG_NONE, args[2]);
         break;
 
     case INDEX_op_ld_i64:
@@ -1654,12 +1656,14 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_st8_i32:
     case INDEX_op_st8_i64:
-        tcg_out_ldst(s, RX_STC, RXY_STCY, args[0], args[1], args[2]);
+        tcg_out_mem(s, RX_STC, RXY_STCY, args[0], args[1],
+                    TCG_REG_NONE, args[2]);
         break;
 
     case INDEX_op_st16_i32:
     case INDEX_op_st16_i64:
-        tcg_out_ldst(s, RX_STH, RXY_STHY, args[0], args[1], args[2]);
+        tcg_out_mem(s, RX_STH, RXY_STHY, args[0], args[1],
+                    TCG_REG_NONE, args[2]);
         break;
 
     case INDEX_op_st_i32:
@@ -1797,7 +1801,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         op = RS_SLL;
     do_shift32:
         if (const_args[2]) {
-            tcg_out_sh32(s, op, args[0], SH32_REG_NONE, args[2]);
+            tcg_out_sh32(s, op, args[0], TCG_REG_NONE, args[2]);
         } else {
             tcg_out_sh32(s, op, args[0], args[2], 0);
         }
@@ -1813,7 +1817,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         op = RSY_SLLG;
     do_shift64:
         if (const_args[2]) {
-            tcg_out_sh64(s, op, args[0], args[1], SH64_REG_NONE, args[2]);
+            tcg_out_sh64(s, op, args[0], args[1], TCG_REG_NONE, args[2]);
         } else {
             tcg_out_sh64(s, op, args[0], args[1], args[2], 0);
         }
@@ -1828,7 +1832,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_rotl_i32:
         /* ??? Using tcg_out_sh64 here for the format; it is a 32-bit rol.  */
         if (const_args[2]) {
-            tcg_out_sh64(s, RSY_RLL, args[0], args[1], SH32_REG_NONE, args[2]);
+            tcg_out_sh64(s, RSY_RLL, args[0], args[1], TCG_REG_NONE, args[2]);
         } else {
             tcg_out_sh64(s, RSY_RLL, args[0], args[1], args[2], 0);
         }
@@ -1836,7 +1840,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_rotr_i32:
         if (const_args[2]) {
             tcg_out_sh64(s, RSY_RLL, args[0], args[1],
-                         SH32_REG_NONE, (32 - args[2]) & 31);
+                         TCG_REG_NONE, (32 - args[2]) & 31);
         } else {
             tcg_out_insn(s, RR, LCR, TCG_TMP0, args[2]);
             tcg_out_sh64(s, RSY_RLL, args[0], args[1], TCG_TMP0, 0);
@@ -1846,7 +1850,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_rotl_i64:
         if (const_args[2]) {
             tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
-                         SH64_REG_NONE, args[2]);
+                         TCG_REG_NONE, args[2]);
         } else {
             tcg_out_sh64(s, RSY_RLLG, args[0], args[1], args[2], 0);
         }
@@ -1854,7 +1858,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_rotr_i64:
         if (const_args[2]) {
             tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
-                         SH64_REG_NONE, (64 - args[2]) & 63);
+                         TCG_REG_NONE, (64 - args[2]) & 63);
         } else {
             /* We can use the smaller 32-bit negate because only the
                low 6 bits are examined for the rotate.  */
@@ -1900,7 +1904,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         /* The TCG bswap definition requires bits 0-47 already be zero.
            Thus we don't need the G-type insns to implement bswap16_i64.  */
         tcg_out_insn(s, RRE, LRVR, args[0], args[1]);
-        tcg_out_sh32(s, RS_SRL, args[0], SH32_REG_NONE, 16);
+        tcg_out_sh32(s, RS_SRL, args[0], TCG_REG_NONE, 16);
         break;
     case INDEX_op_bswap32_i32:
     case INDEX_op_bswap32_i64:
@@ -2113,9 +2117,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
 /* ??? Linux kernels provide an AUXV entry AT_HWCAP that provides most of
    this information.  However, getting at that entry is not easy this far
-   away from main.  Our options are: start searching from environ, but 
+   away from main.  Our options are: start searching from environ, but
    that fails as soon as someone does a setenv in between.  Read the data
-   from /proc/self/auxv.  Or do the probing ourselves.  The only thing 
+   from /proc/self/auxv.  Or do the probing ourselves.  The only thing
    extra that AT_HWCAP gives us is HWCAP_S390_HIGH_GPRS, which indicates
    that the kernel saves all 64-bits of the registers around traps while
    in 31-bit mode.  But this is true of all "recent" kernels (ought to dig
@@ -2156,7 +2160,7 @@ static void query_facilities(void)
         /* ??? Possibly some of these are in practice never present unless
            the store-facility-extended facility is also present.  But since
            that isn't documented it's just better to probe for each.  */
-       
+
         /* Test for z/Architecture.  Required even in 31-bit mode.  */
         got_sigill = 0;
         /* agr %r0,%r0 */
-- 
1.7.0.1


* [Qemu-devel] [PATCH 60/62] tcg-s390: Fix TLB comparison width.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (58 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 59/62] tcg-s390: Generalize load/store support Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 61/62] tcg-s390: Enable compile in 32-bit mode Richard Henderson
                   ` (2 subsequent siblings)
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The TLB comparator is sized for the target address width.
Use a 32-bit compare when TARGET_LONG_BITS is 32.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/s390/tcg-target.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)
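
As a rough standalone illustration (not part of the patch), the opcode
that ends up emitted for the TLB compare can be modelled as below.  The
function tlb_cmp_opcode is hypothetical; the opcode values are copied
from the S390Opcode enum in this series, and the RX-versus-RXY choice
follows the "opc_rx && ofs >= 0 && ofs < 0x1000" test in tcg_out_mem.

```c
/* Opcode values as they appear in the S390Opcode enum of this series.  */
enum { RX_C = 0x59, RXY_CY = 0xe359, RXY_CG = 0xe320 };

/* Hypothetical model of the opcode selected for the TLB comparison.
   target_long_bits stands in for TARGET_LONG_BITS.  */
static unsigned tlb_cmp_opcode(int target_long_bits, long ofs)
{
    unsigned opc_rx = 0;
    unsigned opc_rxy;

    if (target_long_bits == 32) {
        /* 32-bit comparator: C or CY.  */
        opc_rx = RX_C;
        opc_rxy = RXY_CY;
    } else {
        /* 64-bit comparator: only the RXY-format CG exists.  */
        opc_rxy = RXY_CG;
    }

    /* Prefer the shorter RX form when one exists and the offset fits
       its unsigned 12-bit displacement.  */
    if (opc_rx && ofs >= 0 && ofs < 0x1000) {
        return opc_rx;
    }
    return opc_rxy;
}
```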

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 6101255..ec4c72a 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -174,7 +174,9 @@ typedef enum S390Opcode {
     RS_SRL      = 0x88,
 
     RXY_AG      = 0xe308,
+    RXY_AY      = 0xe35a,
     RXY_CG      = 0xe320,
+    RXY_CY      = 0xe359,
     RXY_LB      = 0xe376,
     RXY_LG      = 0xe304,
     RXY_LGB     = 0xe377,
@@ -198,6 +200,8 @@ typedef enum S390Opcode {
     RXY_STRVH   = 0xe33f,
     RXY_STY     = 0xe350,
 
+    RX_A        = 0x5a,
+    RX_C        = 0x59,
     RX_L        = 0x58,
     RX_LH       = 0x48,
     RX_ST       = 0x50,
@@ -1442,7 +1446,11 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
     }
     assert(ofs < 0x80000);
 
-    tcg_out_insn(s, RXY, CG, arg0, arg1, TCG_AREG0, ofs);
+    if (TARGET_LONG_BITS == 32) {
+        tcg_out_mem(s, RX_C, RXY_CY, arg0, arg1, TCG_AREG0, ofs);
+    } else {
+        tcg_out_mem(s, 0, RXY_CG, arg0, arg1, TCG_AREG0, ofs);
+    }
 
     if (TARGET_LONG_BITS == 32) {
         tgen_ext32u(s, arg0, addr_reg);
@@ -1494,7 +1502,7 @@ static void tcg_prepare_qemu_ldst(TCGContext* s, TCGReg data_reg,
     ofs = offsetof(CPUState, tlb_table[mem_index][0].addend);
     assert(ofs < 0x80000);
 
-    tcg_out_insn(s, RXY, AG, arg0, arg1, TCG_AREG0, ofs);
+    tcg_out_mem(s, 0, RXY_AG, arg0, arg1, TCG_AREG0, ofs);
 }
 
 static void tcg_finish_qemu_ldst(TCGContext* s, uint16_t *label2_ptr)
-- 
1.7.0.1


* [Qemu-devel] [PATCH 61/62] tcg-s390: Enable compile in 32-bit mode.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (59 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 60/62] tcg-s390: Fix TLB comparison width Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 62/62] tcg: Optionally sign-extend 32-bit arguments for 64-bit host Richard Henderson
  2010-05-27 21:00 ` [Qemu-devel] [PATCH 00/62] s390x tcg target Blue Swirl
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

The TCG translator will *not* work in 32-bit mode; a check added to
query_facilities enforces that.

However, QEMU can run in KVM mode when built in 32-bit mode, and
this patch does just enough to keep that configuration building.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 configure             |    3 +-
 tcg/s390/tcg-target.c |  386 +++++++++++++++++++++++++------------------------
 tcg/s390/tcg-target.h |    7 +
 3 files changed, 207 insertions(+), 189 deletions(-)
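
A short standalone sketch of the "x >> 31 >> 1" idiom used throughout
this patch; it is not part of the patch, and the names below are
hypothetical.  When tcg_target_ulong is only 32 bits wide, a single
shift by 32 is undefined behaviour in C, whereas two shifts totalling 32
are well defined and simply yield zero.  On a 64-bit type the two forms
agree.

```c
#include <stdint.h>

/* Hypothetical stand-in for tcg_target_ulong on a 31-bit build.  */
typedef uint32_t ulong32;

/* High 32 bits of a 32-bit value: always zero, with no undefined shift.  */
static uint32_t high_half_32(ulong32 v)
{
    return v >> 31 >> 1;
}

/* High 32 bits of a 64-bit value: identical to v >> 32.  */
static uint32_t high_half_64(uint64_t v)
{
    return (uint32_t)(v >> 31 >> 1);
}
```

This lets the same source compile cleanly for both host widths without
wrapping each shift in a preprocessor conditional.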

diff --git a/configure b/configure
index f818198..f565026 100755
--- a/configure
+++ b/configure
@@ -697,7 +697,8 @@ case "$cpu" in
            fi
            ;;
     s390)
-           QEMU_CFLAGS="-march=z990 $QEMU_CFLAGS"
+           QEMU_CFLAGS="-m31 -march=z990 $QEMU_CFLAGS"
+           LDFLAGS="-m31 $LDFLAGS"
            host_guest_base="yes"
            ;;
     s390x)
diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index ec4c72a..cb1b013 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -727,7 +727,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
             return;
         }
         if ((uval & 0xffffffff) == 0) {
-            tcg_out_insn(s, RIL, LLIHF, ret, uval >> 32);
+            tcg_out_insn(s, RIL, LLIHF, ret, uval >> 31 >> 1);
             return;
         }
     }
@@ -757,7 +757,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
            We first want to make sure that all the high bits get set.  With
            luck the low 16-bits can be considered negative to perform that for
            free, otherwise we load an explicit -1.  */
-        if (sval >> 32 == -1) {
+        if (sval >> 31 >> 1 == -1) {
             if (uval & 0x8000) {
                 tcg_out_insn(s, RI, LGHI, ret, uval);
             } else {
@@ -775,7 +775,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
     tcg_out_movi(s, TCG_TYPE_I32, ret, sval);
 
     /* Insert data into the high 32-bits.  */
-    uval >>= 32;
+    uval = uval >> 31 >> 1;
     if (facilities & FACILITY_EXT_IMM) {
         if (uval < 0x10000) {
             tcg_out_insn(s, RI, IIHL, ret, uval);
@@ -958,7 +958,7 @@ static inline void tgen_ext32u(TCGContext *s, TCGReg dest, TCGReg src)
     tcg_out_insn(s, RRE, LLGFR, dest, src);
 }
 
-static void tgen32_addi(TCGContext *s, TCGReg dest, int32_t val)
+static inline void tgen32_addi(TCGContext *s, TCGReg dest, int32_t val)
 {
     if (val == (int16_t)val) {
         tcg_out_insn(s, RI, AHI, dest, val);
@@ -967,7 +967,7 @@ static void tgen32_addi(TCGContext *s, TCGReg dest, int32_t val)
     }
 }
 
-static void tgen64_addi(TCGContext *s, TCGReg dest, int64_t val)
+static inline void tgen64_addi(TCGContext *s, TCGReg dest, int64_t val)
 {
     if (val == (int16_t)val) {
         tcg_out_insn(s, RI, AGHI, dest, val);
@@ -1108,7 +1108,7 @@ static void tgen64_xori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
         tcg_out_insn(s, RIL, XILF, dest, val);
     }
     if (val > 0xffffffff) {
-        tcg_out_insn(s, RIL, XIHF, dest, val >> 32);
+        tcg_out_insn(s, RIL, XIHF, dest, val >> 31 >> 1);
     }
 }
 
@@ -1589,6 +1589,15 @@ static void tcg_out_qemu_st(TCGContext* s, const TCGArg* args, int opc)
 #endif
 }
 
+#if TCG_TARGET_REG_BITS == 64
+# define OP_32_64(x) \
+        case glue(glue(INDEX_op_,x),_i32): \
+        case glue(glue(INDEX_op_,x),_i64)
+#else
+# define OP_32_64(x) \
+        case glue(glue(INDEX_op_,x),_i32)
+#endif
+
 static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
                 const TCGArg *args, const int *const_args)
 {
@@ -1621,21 +1630,18 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
-    case INDEX_op_ld8u_i32:
-    case INDEX_op_ld8u_i64:
+    OP_32_64(ld8u):
         /* ??? LLC (RXY format) is only present with the extended-immediate
            facility, whereas LLGC is always present.  */
         tcg_out_mem(s, 0, RXY_LLGC, args[0], args[1], TCG_REG_NONE, args[2]);
         break;
 
-    case INDEX_op_ld8s_i32:
-    case INDEX_op_ld8s_i64:
+    OP_32_64(ld8s):
         /* ??? LB is no smaller than LGB, so no point to using it.  */
         tcg_out_mem(s, 0, RXY_LGB, args[0], args[1], TCG_REG_NONE, args[2]);
         break;
 
-    case INDEX_op_ld16u_i32:
-    case INDEX_op_ld16u_i64:
+    OP_32_64(ld16u):
         /* ??? LLH (RXY format) is only present with the extended-immediate
            facility, whereas LLGH is always present.  */
         tcg_out_mem(s, 0, RXY_LLGH, args[0], args[1], TCG_REG_NONE, args[2]);
@@ -1644,45 +1650,25 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_ld16s_i32:
         tcg_out_mem(s, RX_LH, RXY_LHY, args[0], args[1], TCG_REG_NONE, args[2]);
         break;
-    case INDEX_op_ld16s_i64:
-        tcg_out_mem(s, 0, RXY_LGH, args[0], args[1], TCG_REG_NONE, args[2]);
-        break;
 
     case INDEX_op_ld_i32:
         tcg_out_ld(s, TCG_TYPE_I32, args[0], args[1], args[2]);
         break;
-    case INDEX_op_ld32u_i64:
-        tcg_out_mem(s, 0, RXY_LLGF, args[0], args[1], TCG_REG_NONE, args[2]);
-        break;
-    case INDEX_op_ld32s_i64:
-        tcg_out_mem(s, 0, RXY_LGF, args[0], args[1], TCG_REG_NONE, args[2]);
-        break;
-
-    case INDEX_op_ld_i64:
-        tcg_out_ld(s, TCG_TYPE_I64, args[0], args[1], args[2]);
-        break;
 
-    case INDEX_op_st8_i32:
-    case INDEX_op_st8_i64:
+    OP_32_64(st8):
         tcg_out_mem(s, RX_STC, RXY_STCY, args[0], args[1],
                     TCG_REG_NONE, args[2]);
         break;
 
-    case INDEX_op_st16_i32:
-    case INDEX_op_st16_i64:
+    OP_32_64(st16):
         tcg_out_mem(s, RX_STH, RXY_STHY, args[0], args[1],
                     TCG_REG_NONE, args[2]);
         break;
 
     case INDEX_op_st_i32:
-    case INDEX_op_st32_i64:
         tcg_out_st(s, TCG_TYPE_I32, args[0], args[1], args[2]);
         break;
 
-    case INDEX_op_st_i64:
-        tcg_out_st(s, TCG_TYPE_I64, args[0], args[1], args[2]);
-        break;
-
     case INDEX_op_add_i32:
         if (const_args[2]) {
             tgen32_addi(s, args[0], args[2]);
@@ -1690,14 +1676,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
             tcg_out_insn(s, RR, AR, args[0], args[2]);
         }
         break;
-    case INDEX_op_add_i64:
-        if (const_args[2]) {
-            tgen64_addi(s, args[0], args[2]);
-        } else {
-            tcg_out_insn(s, RRE, AGR, args[0], args[2]);
-        }
-        break;
-
     case INDEX_op_sub_i32:
         if (const_args[2]) {
             tgen32_addi(s, args[0], -args[2]);
@@ -1705,13 +1683,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
             tcg_out_insn(s, RR, SR, args[0], args[2]);
         }
         break;
-    case INDEX_op_sub_i64:
-        if (const_args[2]) {
-            tgen64_addi(s, args[0], -args[2]);
-        } else {
-            tcg_out_insn(s, RRE, SGR, args[0], args[2]);
-        }
-        break;
 
     case INDEX_op_and_i32:
         if (const_args[2]) {
@@ -1735,34 +1706,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
-    case INDEX_op_and_i64:
-        if (const_args[2]) {
-            tgen64_andi(s, args[0], args[2]);
-        } else {
-            tcg_out_insn(s, RRE, NGR, args[0], args[2]);
-        }
-        break;
-    case INDEX_op_or_i64:
-        if (const_args[2]) {
-            tgen64_ori(s, args[0], args[2]);
-        } else {
-            tcg_out_insn(s, RRE, OGR, args[0], args[2]);
-        }
-        break;
-    case INDEX_op_xor_i64:
-        if (const_args[2]) {
-            tgen64_xori(s, args[0], args[2]);
-        } else {
-            tcg_out_insn(s, RRE, XGR, args[0], args[2]);
-        }
-        break;
-
     case INDEX_op_neg_i32:
         tcg_out_insn(s, RR, LCR, args[0], args[1]);
         break;
-    case INDEX_op_neg_i64:
-        tcg_out_insn(s, RRE, LCGR, args[0], args[1]);
-        break;
 
     case INDEX_op_mul_i32:
         if (const_args[2]) {
@@ -1775,17 +1721,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
             tcg_out_insn(s, RRE, MSR, args[0], args[2]);
         }
         break;
-    case INDEX_op_mul_i64:
-        if (const_args[2]) {
-            if (args[2] == (int16_t)args[2]) {
-                tcg_out_insn(s, RI, MGHI, args[0], args[2]);
-            } else {
-                tcg_out_insn(s, RIL, MSGFI, args[0], args[2]);
-            }
-        } else {
-            tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
-        }
-        break;
 
     case INDEX_op_div2_i32:
         tcg_out_insn(s, RR, DR, TCG_REG_R2, args[4]);
@@ -1794,17 +1729,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_insn(s, RRE, DLR, TCG_REG_R2, args[4]);
         break;
 
-    case INDEX_op_div2_i64:
-        /* ??? We get an unnecessary sign-extension of the dividend
-           into R3 with this definition, but as we do in fact always
-           produce both quotient and remainder using INDEX_op_div_i64
-           instead requires jumping through even more hoops.  */
-        tcg_out_insn(s, RRE, DSGR, TCG_REG_R2, args[4]);
-        break;
-    case INDEX_op_divu2_i64:
-        tcg_out_insn(s, RRE, DLGR, TCG_REG_R2, args[4]);
-        break;
-
     case INDEX_op_shl_i32:
         op = RS_SLL;
     do_shift32:
@@ -1821,22 +1745,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         op = RS_SRA;
         goto do_shift32;
 
-    case INDEX_op_shl_i64:
-        op = RSY_SLLG;
-    do_shift64:
-        if (const_args[2]) {
-            tcg_out_sh64(s, op, args[0], args[1], TCG_REG_NONE, args[2]);
-        } else {
-            tcg_out_sh64(s, op, args[0], args[1], args[2], 0);
-        }
-        break;
-    case INDEX_op_shr_i64:
-        op = RSY_SRLG;
-        goto do_shift64;
-    case INDEX_op_sar_i64:
-        op = RSY_SRAG;
-        goto do_shift64;
-
     case INDEX_op_rotl_i32:
         /* ??? Using tcg_out_sh64 here for the format; it is a 32-bit rol.  */
         if (const_args[2]) {
@@ -1855,72 +1763,28 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
-    case INDEX_op_rotl_i64:
-        if (const_args[2]) {
-            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
-                         TCG_REG_NONE, args[2]);
-        } else {
-            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], args[2], 0);
-        }
-        break;
-    case INDEX_op_rotr_i64:
-        if (const_args[2]) {
-            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
-                         TCG_REG_NONE, (64 - args[2]) & 63);
-        } else {
-            /* We can use the smaller 32-bit negate because only the
-               low 6 bits are examined for the rotate.  */
-            tcg_out_insn(s, RR, LCR, TCG_TMP0, args[2]);
-            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], TCG_TMP0, 0);
-        }
-        break;
-
     case INDEX_op_ext8s_i32:
         tgen_ext8s(s, TCG_TYPE_I32, args[0], args[1]);
         break;
-    case INDEX_op_ext8s_i64:
-        tgen_ext8s(s, TCG_TYPE_I64, args[0], args[1]);
-        break;
     case INDEX_op_ext16s_i32:
         tgen_ext16s(s, TCG_TYPE_I32, args[0], args[1]);
         break;
-    case INDEX_op_ext16s_i64:
-        tgen_ext16s(s, TCG_TYPE_I64, args[0], args[1]);
-        break;
-    case INDEX_op_ext32s_i64:
-        tgen_ext32s(s, args[0], args[1]);
-        break;
-
     case INDEX_op_ext8u_i32:
         tgen_ext8u(s, TCG_TYPE_I32, args[0], args[1]);
         break;
-    case INDEX_op_ext8u_i64:
-        tgen_ext8u(s, TCG_TYPE_I64, args[0], args[1]);
-        break;
     case INDEX_op_ext16u_i32:
         tgen_ext16u(s, TCG_TYPE_I32, args[0], args[1]);
         break;
-    case INDEX_op_ext16u_i64:
-        tgen_ext16u(s, TCG_TYPE_I64, args[0], args[1]);
-        break;
-    case INDEX_op_ext32u_i64:
-        tgen_ext32u(s, args[0], args[1]);
-        break;
 
-    case INDEX_op_bswap16_i32:
-    case INDEX_op_bswap16_i64:
+    OP_32_64(bswap16):
         /* The TCG bswap definition requires bits 0-47 already be zero.
            Thus we don't need the G-type insns to implement bswap16_i64.  */
         tcg_out_insn(s, RRE, LRVR, args[0], args[1]);
         tcg_out_sh32(s, RS_SRL, args[0], TCG_REG_NONE, 16);
         break;
-    case INDEX_op_bswap32_i32:
-    case INDEX_op_bswap32_i64:
+    OP_32_64(bswap32):
         tcg_out_insn(s, RRE, LRVR, args[0], args[1]);
         break;
-    case INDEX_op_bswap64_i64:
-        tcg_out_insn(s, RRE, LRVGR, args[0], args[1]);
-        break;
 
     case INDEX_op_br:
         tgen_branch(s, S390_CC_ALWAYS, args[0]);
@@ -1930,46 +1794,27 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tgen_brcond(s, TCG_TYPE_I32, args[2], args[0],
                     args[1], const_args[1], args[3]);
         break;
-    case INDEX_op_brcond_i64:
-        tgen_brcond(s, TCG_TYPE_I64, args[2], args[0],
-                    args[1], const_args[1], args[3]);
-        break;
-
     case INDEX_op_setcond_i32:
         tgen_setcond(s, TCG_TYPE_I32, args[3], args[0], args[1],
                      args[2], const_args[2]);
         break;
-    case INDEX_op_setcond_i64:
-        tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1],
-                     args[2], const_args[2]);
-        break;
 
     case INDEX_op_qemu_ld8u:
         tcg_out_qemu_ld(s, args, LD_UINT8);
         break;
-
     case INDEX_op_qemu_ld8s:
         tcg_out_qemu_ld(s, args, LD_INT8);
         break;
-
     case INDEX_op_qemu_ld16u:
         tcg_out_qemu_ld(s, args, LD_UINT16);
         break;
-
     case INDEX_op_qemu_ld16s:
         tcg_out_qemu_ld(s, args, LD_INT16);
         break;
-
     case INDEX_op_qemu_ld32:
         /* ??? Technically we can use a non-extending instruction.  */
-    case INDEX_op_qemu_ld32u:
         tcg_out_qemu_ld(s, args, LD_UINT32);
         break;
-
-    case INDEX_op_qemu_ld32s:
-        tcg_out_qemu_ld(s, args, LD_INT32);
-        break;
-
     case INDEX_op_qemu_ld64:
         tcg_out_qemu_ld(s, args, LD_UINT64);
         break;
@@ -1977,23 +1822,178 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_qemu_st8:
         tcg_out_qemu_st(s, args, LD_UINT8);
         break;
-
     case INDEX_op_qemu_st16:
         tcg_out_qemu_st(s, args, LD_UINT16);
         break;
-
     case INDEX_op_qemu_st32:
         tcg_out_qemu_st(s, args, LD_UINT32);
         break;
-
     case INDEX_op_qemu_st64:
         tcg_out_qemu_st(s, args, LD_UINT64);
         break;
 
-    case INDEX_op_mov_i32:
-    case INDEX_op_mov_i64:
-    case INDEX_op_movi_i32:
-    case INDEX_op_movi_i64:
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_ld16s_i64:
+        tcg_out_mem(s, 0, RXY_LGH, args[0], args[1], TCG_REG_NONE, args[2]);
+        break;
+    case INDEX_op_ld32u_i64:
+        tcg_out_mem(s, 0, RXY_LLGF, args[0], args[1], TCG_REG_NONE, args[2]);
+        break;
+    case INDEX_op_ld32s_i64:
+        tcg_out_mem(s, 0, RXY_LGF, args[0], args[1], TCG_REG_NONE, args[2]);
+        break;
+    case INDEX_op_ld_i64:
+        tcg_out_ld(s, TCG_TYPE_I64, args[0], args[1], args[2]);
+        break;
+
+    case INDEX_op_st32_i64:
+        tcg_out_st(s, TCG_TYPE_I32, args[0], args[1], args[2]);
+        break;
+    case INDEX_op_st_i64:
+        tcg_out_st(s, TCG_TYPE_I64, args[0], args[1], args[2]);
+        break;
+
+    case INDEX_op_add_i64:
+        if (const_args[2]) {
+            tgen64_addi(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, AGR, args[0], args[2]);
+        }
+        break;
+    case INDEX_op_sub_i64:
+        if (const_args[2]) {
+            tgen64_addi(s, args[0], -args[2]);
+        } else {
+            tcg_out_insn(s, RRE, SGR, args[0], args[2]);
+        }
+        break;
+
+    case INDEX_op_and_i64:
+        if (const_args[2]) {
+            tgen64_andi(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, NGR, args[0], args[2]);
+        }
+        break;
+    case INDEX_op_or_i64:
+        if (const_args[2]) {
+            tgen64_ori(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, OGR, args[0], args[2]);
+        }
+        break;
+    case INDEX_op_xor_i64:
+        if (const_args[2]) {
+            tgen64_xori(s, args[0], args[2]);
+        } else {
+            tcg_out_insn(s, RRE, XGR, args[0], args[2]);
+        }
+        break;
+
+    case INDEX_op_neg_i64:
+        tcg_out_insn(s, RRE, LCGR, args[0], args[1]);
+        break;
+    case INDEX_op_bswap64_i64:
+        tcg_out_insn(s, RRE, LRVGR, args[0], args[1]);
+        break;
+
+    case INDEX_op_mul_i64:
+        if (const_args[2]) {
+            if (args[2] == (int16_t)args[2]) {
+                tcg_out_insn(s, RI, MGHI, args[0], args[2]);
+            } else {
+                tcg_out_insn(s, RIL, MSGFI, args[0], args[2]);
+            }
+        } else {
+            tcg_out_insn(s, RRE, MSGR, args[0], args[2]);
+        }
+        break;
+
+    case INDEX_op_div2_i64:
+        /* ??? We get an unnecessary sign-extension of the dividend
+           into R3 with this definition, but as we do in fact always
+           produce both quotient and remainder using INDEX_op_div_i64
+           instead requires jumping through even more hoops.  */
+        tcg_out_insn(s, RRE, DSGR, TCG_REG_R2, args[4]);
+        break;
+    case INDEX_op_divu2_i64:
+        tcg_out_insn(s, RRE, DLGR, TCG_REG_R2, args[4]);
+        break;
+
+    case INDEX_op_shl_i64:
+        op = RSY_SLLG;
+    do_shift64:
+        if (const_args[2]) {
+            tcg_out_sh64(s, op, args[0], args[1], TCG_REG_NONE, args[2]);
+        } else {
+            tcg_out_sh64(s, op, args[0], args[1], args[2], 0);
+        }
+        break;
+    case INDEX_op_shr_i64:
+        op = RSY_SRLG;
+        goto do_shift64;
+    case INDEX_op_sar_i64:
+        op = RSY_SRAG;
+        goto do_shift64;
+
+    case INDEX_op_rotl_i64:
+        if (const_args[2]) {
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
+                         TCG_REG_NONE, args[2]);
+        } else {
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], args[2], 0);
+        }
+        break;
+    case INDEX_op_rotr_i64:
+        if (const_args[2]) {
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1],
+                         TCG_REG_NONE, (64 - args[2]) & 63);
+        } else {
+            /* We can use the smaller 32-bit negate because only the
+               low 6 bits are examined for the rotate.  */
+            tcg_out_insn(s, RR, LCR, TCG_TMP0, args[2]);
+            tcg_out_sh64(s, RSY_RLLG, args[0], args[1], TCG_TMP0, 0);
+        }
+        break;
+
+    case INDEX_op_ext8s_i64:
+        tgen_ext8s(s, TCG_TYPE_I64, args[0], args[1]);
+        break;
+    case INDEX_op_ext16s_i64:
+        tgen_ext16s(s, TCG_TYPE_I64, args[0], args[1]);
+        break;
+    case INDEX_op_ext32s_i64:
+        tgen_ext32s(s, args[0], args[1]);
+        break;
+    case INDEX_op_ext8u_i64:
+        tgen_ext8u(s, TCG_TYPE_I64, args[0], args[1]);
+        break;
+    case INDEX_op_ext16u_i64:
+        tgen_ext16u(s, TCG_TYPE_I64, args[0], args[1]);
+        break;
+    case INDEX_op_ext32u_i64:
+        tgen_ext32u(s, args[0], args[1]);
+        break;
+
+    case INDEX_op_brcond_i64:
+        tgen_brcond(s, TCG_TYPE_I64, args[2], args[0],
+                    args[1], const_args[1], args[3]);
+        break;
+    case INDEX_op_setcond_i64:
+        tgen_setcond(s, TCG_TYPE_I64, args[3], args[0], args[1],
+                     args[2], const_args[2]);
+        break;
+
+    case INDEX_op_qemu_ld32u:
+        tcg_out_qemu_ld(s, args, LD_UINT32);
+        break;
+    case INDEX_op_qemu_ld32s:
+        tcg_out_qemu_ld(s, args, LD_INT32);
+        break;
+#endif /* TCG_TARGET_REG_BITS == 64 */
+
+    OP_32_64(mov):
+    OP_32_64(movi):
         /* These are always emitted by TCG directly.  */
     case INDEX_op_jmp:
         /* This one is obsolete and never emitted.  */
@@ -2059,8 +2059,6 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_qemu_ld8s, { "r", "L" } },
     { INDEX_op_qemu_ld16u, { "r", "L" } },
     { INDEX_op_qemu_ld16s, { "r", "L" } },
-    { INDEX_op_qemu_ld32u, { "r", "L" } },
-    { INDEX_op_qemu_ld32s, { "r", "L" } },
     { INDEX_op_qemu_ld32, { "r", "L" } },
     { INDEX_op_qemu_ld64, { "r", "L" } },
 
@@ -2118,6 +2116,9 @@ static const TCGTargetOpDef s390_op_defs[] = {
 
     { INDEX_op_brcond_i64, { "r", "rC" } },
     { INDEX_op_setcond_i64, { "r", "r", "rC" } },
+
+    { INDEX_op_qemu_ld32u, { "r", "L" } },
+    { INDEX_op_qemu_ld32s, { "r", "L" } },
 #endif
 
     { -1 },
@@ -2209,13 +2210,22 @@ static void query_facilities(void)
     /* The translator currently uses these extensions unconditionally.  */
     fail = 0;
     if ((facilities & FACILITY_ZARCH_ACTIVE) == 0) {
-        fprintf(stderr, "TCG: z/Arch facility is required\n");
+        fprintf(stderr, "TCG: z/Arch facility is required.\n");
+        fprintf(stderr, "TCG: Boot with a 64-bit enabled kernel.\n");
         fail = 1;
     }
     if ((facilities & FACILITY_LONG_DISP) == 0) {
-        fprintf(stderr, "TCG: long-displacement facility is required\n");
+        fprintf(stderr, "TCG: long-displacement facility is required.\n");
         fail = 1;
     }
+
+    /* So far there's just enough support for 31-bit mode to let the
+       compile succeed.  This is good enough to run QEMU with KVM.  */
+    if (sizeof(void *) != 8) {
+        fprintf(stderr, "TCG: 31-bit mode is not supported.\n");
+        fail = 1;
+    }
+
     if (fail) {
         exit(-1);
     }
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 940f530..451f1f5 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -23,7 +23,12 @@
  */
 #define TCG_TARGET_S390 1
 
+#ifdef __s390x__
 #define TCG_TARGET_REG_BITS 64
+#else
+#define TCG_TARGET_REG_BITS 32
+#endif
+
 #define TCG_TARGET_WORDS_BIGENDIAN
 
 typedef enum TCGReg {
@@ -64,6 +69,7 @@ typedef enum TCGReg {
 // #define TCG_TARGET_HAS_nand_i32
 // #define TCG_TARGET_HAS_nor_i32
 
+#if TCG_TARGET_REG_BITS == 64
 #define TCG_TARGET_HAS_div2_i64
 #define TCG_TARGET_HAS_rot_i64
 #define TCG_TARGET_HAS_ext8s_i64
@@ -82,6 +88,7 @@ typedef enum TCGReg {
 // #define TCG_TARGET_HAS_eqv_i64
 // #define TCG_TARGET_HAS_nand_i64
 // #define TCG_TARGET_HAS_nor_i64
+#endif
 
 #define TCG_TARGET_HAS_GUEST_BASE
 
-- 
1.7.0.1


* [Qemu-devel] [PATCH 62/62] tcg: Optionally sign-extend 32-bit arguments for 64-bit host.
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (60 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 61/62] tcg-s390: Enable compile in 32-bit mode Richard Henderson
@ 2010-05-27 20:46 ` Richard Henderson
  2010-05-27 21:00 ` [Qemu-devel] [PATCH 00/62] s390x tcg target Blue Swirl
  62 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 20:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: agraf, aurelien

Some hosts (amd64, ia64) have an ABI that ignores the high bits
of the 64-bit register when passing 32-bit arguments.  Others,
like s390x, require the value to be properly extended for its
type: "int32_t" must be sign-extended and "uint32_t" must be
zero-extended to 64 bits.

To effect this, extend the "sizemask" parameter to tcg_gen_callN
to include the signedness of the type of each parameter.  If the
tcg target requires it, extend each 32-bit argument into a 64-bit
temp and pass that to the function call.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
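A rough standalone sketch of the encoding described above, for readers
who want to check the constants used in tcg-op.h.  It assumes the layout
implied by dh_sizemask: slot 0 is the return value, slot n is the n-th
argument, bit 2n means "64-bit" and bit 2n+1 means "signed".  The names
sizemask_slot, sizemask_binop, arg_is_64bit, arg_is_signed and
extend_arg32 are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: flags for one slot of the sizemask.
   Slot 0 is the return value; slot n >= 1 is the n-th argument.
   Bit 2n   is set when the value is 64 bits wide.
   Bit 2n+1 is set when the value is signed.  */
static unsigned sizemask_slot(int slot, int is_64bit, int is_signed)
{
    return ((unsigned)(is_64bit != 0) << (slot * 2))
         | ((unsigned)(is_signed != 0) << (slot * 2 + 1));
}

/* Hypothetical helper: mask for "ret = f(a, b)" where the return value
   and both arguments share the same width and signedness.  */
static unsigned sizemask_binop(int is_64bit, int is_signed)
{
    return sizemask_slot(0, is_64bit, is_signed)
         | sizemask_slot(1, is_64bit, is_signed)
         | sizemask_slot(2, is_64bit, is_signed);
}

/* Decode argument i (0-based), using the same shift expressions that
   the patch adds to tcg_gen_callN.  */
static int arg_is_64bit(unsigned sizemask, int i)
{
    return (sizemask & (1u << (i + 1) * 2)) != 0;
}

static int arg_is_signed(unsigned sizemask, int i)
{
    return (sizemask & (2u << (i + 1) * 2)) != 0;
}

/* Plain-C model of the value a 32-bit argument should have in a 64-bit
   register after extension.  The cast to int32_t assumes the usual
   two's-complement wraparound.  */
static uint64_t extend_arg32(uint32_t v, int is_signed)
{
    return is_signed ? (uint64_t)(int64_t)(int32_t)v : (uint64_t)v;
}

/* The four constants passed by tcg_gen_helper32 and tcg_gen_helper64.  */
static void sizemask_selfcheck(void)
{
    assert(sizemask_binop(0, 1) == 0x2a);   /* signed 32-bit helper */
    assert(sizemask_binop(0, 0) == 0x00);   /* unsigned 32-bit helper */
    assert(sizemask_binop(1, 1) == 0x3f);   /* signed 64-bit helper */
    assert(sizemask_binop(1, 0) == 0x15);   /* unsigned 64-bit helper */
}
```

Calling sizemask_selfcheck() from a test program is enough to confirm
that the literal masks in tcg-op.h agree with this reading of the layout.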
 def-helper.h                 |   38 +++++++++++++++++++++++++++++---------
 target-i386/ops_sse_header.h |    3 +++
 target-ppc/helper.h          |    1 +
 tcg/s390/tcg-target.h        |    2 ++
 tcg/tcg-op.h                 |   34 +++++++++++++++++-----------------
 tcg/tcg.c                    |   41 +++++++++++++++++++++++++++++++++++------
 6 files changed, 87 insertions(+), 32 deletions(-)

diff --git a/def-helper.h b/def-helper.h
index 8a88c5b..8a822c7 100644
--- a/def-helper.h
+++ b/def-helper.h
@@ -81,9 +81,29 @@
 #define dh_is_64bit_ptr (TCG_TARGET_REG_BITS == 64)
 #define dh_is_64bit(t) glue(dh_is_64bit_, dh_alias(t))
 
+#define dh_is_signed_void 0
+#define dh_is_signed_i32 0
+#define dh_is_signed_s32 1
+#define dh_is_signed_i64 0
+#define dh_is_signed_s64 1
+#define dh_is_signed_f32 0
+#define dh_is_signed_f64 0
+#define dh_is_signed_tl  0
+#define dh_is_signed_int 1
+/* ??? This is highly specific to the host cpu.  There are even special
+   extension instructions that may be required, e.g. ia64's addp4.  But
+   for now we don't support any 64-bit targets with 32-bit pointers.  */
+#define dh_is_signed_ptr 0
+#define dh_is_signed_env dh_is_signed_ptr
+#define dh_is_signed(t) dh_is_signed_##t
+
+#define dh_sizemask(t, n) \
+  sizemask |= dh_is_64bit(t) << (n*2); \
+  sizemask |= dh_is_signed(t) << (n*2+1)
+
 #define dh_arg(t, n) \
   args[n - 1] = glue(GET_TCGV_, dh_alias(t))(glue(arg, n)); \
-  sizemask |= dh_is_64bit(t) << n
+  dh_sizemask(t, n)
 
 #define dh_arg_decl(t, n) glue(TCGv_, dh_alias(t)) glue(arg, n)
 
@@ -138,8 +158,8 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl0(ret)) \
 static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1)) \
 { \
   TCGArg args[1]; \
-  int sizemask; \
-  sizemask = dh_is_64bit(ret); \
+  int sizemask = 0; \
+  dh_sizemask(ret, 0); \
   dh_arg(t1, 1); \
   tcg_gen_helperN(HELPER(name), flags, sizemask, dh_retvar(ret), 1, args); \
 }
@@ -149,8 +169,8 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1
     dh_arg_decl(t2, 2)) \
 { \
   TCGArg args[2]; \
-  int sizemask; \
-  sizemask = dh_is_64bit(ret); \
+  int sizemask = 0; \
+  dh_sizemask(ret, 0); \
   dh_arg(t1, 1); \
   dh_arg(t2, 2); \
   tcg_gen_helperN(HELPER(name), flags, sizemask, dh_retvar(ret), 2, args); \
@@ -161,8 +181,8 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1
     dh_arg_decl(t2, 2), dh_arg_decl(t3, 3)) \
 { \
   TCGArg args[3]; \
-  int sizemask; \
-  sizemask = dh_is_64bit(ret); \
+  int sizemask = 0; \
+  dh_sizemask(ret, 0); \
   dh_arg(t1, 1); \
   dh_arg(t2, 2); \
   dh_arg(t3, 3); \
@@ -174,8 +194,8 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1
     dh_arg_decl(t2, 2), dh_arg_decl(t3, 3), dh_arg_decl(t4, 4)) \
 { \
   TCGArg args[4]; \
-  int sizemask; \
-  sizemask = dh_is_64bit(ret); \
+  int sizemask = 0; \
+  dh_sizemask(ret, 0); \
   dh_arg(t1, 1); \
   dh_arg(t2, 2); \
   dh_arg(t3, 3); \
diff --git a/target-i386/ops_sse_header.h b/target-i386/ops_sse_header.h
index a0a6361..8d4b2b7 100644
--- a/target-i386/ops_sse_header.h
+++ b/target-i386/ops_sse_header.h
@@ -30,6 +30,9 @@
 #define dh_ctype_Reg Reg *
 #define dh_ctype_XMMReg XMMReg *
 #define dh_ctype_MMXReg MMXReg *
+#define dh_is_signed_Reg dh_is_signed_ptr
+#define dh_is_signed_XMMReg dh_is_signed_ptr
+#define dh_is_signed_MMXReg dh_is_signed_ptr
 
 DEF_HELPER_2(glue(psrlw, SUFFIX), void, Reg, Reg)
 DEF_HELPER_2(glue(psraw, SUFFIX), void, Reg, Reg)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 5cf6cd4..c025a2f 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -95,6 +95,7 @@ DEF_HELPER_3(fsel, i64, i64, i64, i64)
 
 #define dh_alias_avr ptr
 #define dh_ctype_avr ppc_avr_t *
+#define dh_is_signed_avr dh_is_signed_ptr
 
 DEF_HELPER_3(vaddubm, void, avr, avr, avr)
 DEF_HELPER_3(vadduhm, void, avr, avr, avr)
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 451f1f5..4e45cf3 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -97,6 +97,8 @@ typedef enum TCGReg {
 #define TCG_TARGET_STACK_ALIGN		8
 #define TCG_TARGET_CALL_STACK_OFFSET	0
 
+#define TCG_TARGET_EXTEND_ARGS 1
+
 enum {
     /* Note: must be synced with dyngen-exec.h */
     TCG_AREG0 = TCG_REG_R10,
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index bafac2b..fbafa89 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -369,8 +369,8 @@ static inline void tcg_gen_helperN(void *func, int flags, int sizemask,
    and pure, hence the call to tcg_gen_callN() with TCG_CALL_CONST |
    TCG_CALL_PURE. This may need to be adjusted if these functions
    start to be used with other helpers. */
-static inline void tcg_gen_helper32(void *func, TCGv_i32 ret,
-                                    TCGv_i32 a, TCGv_i32 b)
+static inline void tcg_gen_helper32(void *func, TCGv_i32 ret, TCGv_i32 a,
+                                    TCGv_i32 b, _Bool is_signed)
 {
     TCGv_ptr fn;
     TCGArg args[2];
@@ -378,12 +378,12 @@ static inline void tcg_gen_helper32(void *func, TCGv_i32 ret,
     args[0] = GET_TCGV_I32(a);
     args[1] = GET_TCGV_I32(b);
     tcg_gen_callN(&tcg_ctx, fn, TCG_CALL_CONST | TCG_CALL_PURE,
-                  0, GET_TCGV_I32(ret), 2, args);
+                  (is_signed ? 0x2a : 0x00), GET_TCGV_I32(ret), 2, args);
     tcg_temp_free_ptr(fn);
 }
 
-static inline void tcg_gen_helper64(void *func, TCGv_i64 ret,
-                                    TCGv_i64 a, TCGv_i64 b)
+static inline void tcg_gen_helper64(void *func, TCGv_i64 ret, TCGv_i64 a,
+                                    TCGv_i64 b, _Bool is_signed)
 {
     TCGv_ptr fn;
     TCGArg args[2];
@@ -391,7 +391,7 @@ static inline void tcg_gen_helper64(void *func, TCGv_i64 ret,
     args[0] = GET_TCGV_I64(a);
     args[1] = GET_TCGV_I64(b);
     tcg_gen_callN(&tcg_ctx, fn, TCG_CALL_CONST | TCG_CALL_PURE,
-                  7, GET_TCGV_I64(ret), 2, args);
+                  (is_signed ? 0x3f : 0x15), GET_TCGV_I64(ret), 2, args);
     tcg_temp_free_ptr(fn);
 }
 
@@ -692,22 +692,22 @@ static inline void tcg_gen_remu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 #else
 static inline void tcg_gen_div_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
-    tcg_gen_helper32(tcg_helper_div_i32, ret, arg1, arg2);
+    tcg_gen_helper32(tcg_helper_div_i32, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_rem_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
-    tcg_gen_helper32(tcg_helper_rem_i32, ret, arg1, arg2);
+    tcg_gen_helper32(tcg_helper_rem_i32, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_divu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
-    tcg_gen_helper32(tcg_helper_divu_i32, ret, arg1, arg2);
+    tcg_gen_helper32(tcg_helper_divu_i32, ret, arg1, arg2, 0);
 }
 
 static inline void tcg_gen_remu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
-    tcg_gen_helper32(tcg_helper_remu_i32, ret, arg1, arg2);
+    tcg_gen_helper32(tcg_helper_remu_i32, ret, arg1, arg2, 0);
 }
 #endif
 
@@ -867,7 +867,7 @@ static inline void tcg_gen_xori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
    specific code (x86) */
 static inline void tcg_gen_shl_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_shl_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_shl_i64, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
@@ -877,7 +877,7 @@ static inline void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
 
 static inline void tcg_gen_shr_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_shr_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_shr_i64, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
@@ -887,7 +887,7 @@ static inline void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
 
 static inline void tcg_gen_sar_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_sar_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_sar_i64, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_sari_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
@@ -935,22 +935,22 @@ static inline void tcg_gen_mul_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 
 static inline void tcg_gen_div_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_div_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_div_i64, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_rem_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_rem_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_rem_i64, ret, arg1, arg2, 1);
 }
 
 static inline void tcg_gen_divu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_divu_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_divu_i64, ret, arg1, arg2, 0);
 }
 
 static inline void tcg_gen_remu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_helper64(tcg_helper_remu_i64, ret, arg1, arg2);
+    tcg_gen_helper64(tcg_helper_remu_i64, ret, arg1, arg2, 0);
 }
 
 #else
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 880e7ce..d8ddd1f 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -560,6 +560,24 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
     int real_args;
     int nb_rets;
     TCGArg *nparam;
+
+#if defined(TCG_TARGET_EXTEND_ARGS) && TCG_TARGET_REG_BITS == 64
+    for (i = 0; i < nargs; ++i) {
+        int is_64bit = sizemask & (1 << (i+1)*2);
+        int is_signed = sizemask & (2 << (i+1)*2);
+        if (!is_64bit) {
+            TCGv_i64 temp = tcg_temp_new_i64();
+            TCGv_i64 orig = MAKE_TCGV_I64(args[i]);
+            if (is_signed) {
+                tcg_gen_ext32s_i64(temp, orig);
+            } else {
+                tcg_gen_ext32u_i64(temp, orig);
+            }
+            args[i] = GET_TCGV_I64(temp);
+        }
+    }
+#endif /* TCG_TARGET_EXTEND_ARGS */
+
     *gen_opc_ptr++ = INDEX_op_call;
     nparam = gen_opparam_ptr++;
 #ifdef TCG_TARGET_I386
@@ -588,7 +606,8 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
     real_args = 0;
     for (i = 0; i < nargs; i++) {
 #if TCG_TARGET_REG_BITS < 64
-        if (sizemask & (2 << i)) {
+        int is_64bit = sizemask & (1 << (i+1)*2);
+        if (is_64bit) {
 #ifdef TCG_TARGET_I386
             /* REGPARM case: if the third parameter is 64 bit, it is
                allocated on the stack */
@@ -622,12 +641,12 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
             *gen_opparam_ptr++ = args[i] + 1;
 #endif
             real_args += 2;
-        } else
-#endif
-        {
-            *gen_opparam_ptr++ = args[i];
-            real_args++;
+            continue;
         }
+#endif /* TCG_TARGET_REG_BITS < 64 */
+
+        *gen_opparam_ptr++ = args[i];
+        real_args++;
     }
     *gen_opparam_ptr++ = GET_TCGV_PTR(func);
 
@@ -637,6 +656,16 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
 
     /* total parameters, needed to go backward in the instruction stream */
     *gen_opparam_ptr++ = 1 + nb_rets + real_args + 3;
+
+#if defined(TCG_TARGET_EXTEND_ARGS) && TCG_TARGET_REG_BITS == 64
+    for (i = 0; i < nargs; ++i) {
+        int is_64bit = sizemask & (1 << (i+1)*2);
+        if (!is_64bit) {
+            TCGv_i64 temp = MAKE_TCGV_I64(args[i]);
+            tcg_temp_free_i64(temp);
+        }
+    }
+#endif /* TCG_TARGET_EXTEND_ARGS */
 }
 
 #if TCG_TARGET_REG_BITS == 32
-- 
1.7.0.1
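
For readers following the sizemask handling in the patch above, here is a
minimal standalone sketch of the bit layout it appears to use: for argument i,
bit (i+1)*2 flags a 64-bit value and bit (i+1)*2+1 flags a signed one.  The
helper names (arg_is_64bit, arg_is_signed, extend_arg, extend_args_demo) are
hypothetical and not part of tcg.c, and the reading that slot 0 is reserved
for the return value is an assumption based on the (i+1) offset.  It models in
plain C what tcg_gen_ext32s_i64 / tcg_gen_ext32u_i64 would do to a 32-bit
argument before a call on a 64-bit host that defines TCG_TARGET_EXTEND_ARGS.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, mirroring the patch: argument i (0-based) owns the
   two bits starting at (i+1)*2.  The low bit means "64-bit", the high
   bit means "signed".  Slot 0 is assumed to describe the return value. */
int arg_is_64bit(int sizemask, int i)
{
    return (sizemask >> ((i + 1) * 2)) & 1;
}

int arg_is_signed(int sizemask, int i)
{
    return (sizemask >> ((i + 1) * 2 + 1)) & 1;
}

/* Hypothetical helper: widen one raw register value the way the
   TCG_TARGET_EXTEND_ARGS loop would before emitting the call. */
uint64_t extend_arg(uint64_t raw, int sizemask, int i)
{
    uint32_t low = (uint32_t)raw;   /* keep only the 32-bit payload */
    uint64_t wide = low;            /* zero-extension by default */

    if (arg_is_64bit(sizemask, i)) {
        return raw;                 /* 64-bit arguments pass through */
    }
    if (arg_is_signed(sizemask, i) && (low & 0x80000000u)) {
        wide |= 0xFFFFFFFF00000000ull;  /* replicate the sign bit */
    }
    return wide;
}

/* Small self-check: argument 0 is signed 32-bit, argument 1 is 64-bit,
   argument 2 is unsigned 32-bit, which gives sizemask 0x18. */
void extend_args_demo(void)
{
    int sizemask = (2 << 2) | (1 << 4);

    assert(extend_arg(0xFFFFFFFFull, sizemask, 0) == 0xFFFFFFFFFFFFFFFFull);
    assert(extend_arg(0xDEADBEEF00000001ull, sizemask, 1) == 0xDEADBEEF00000001ull);
    assert(extend_arg(0x12345678FFFFFFFFull, sizemask, 2) == 0x00000000FFFFFFFFull);
}
```

The point of the sketch is that garbage in the upper half of a register (as in
the third case) is harmless only if the caller clears or sign-fills it, which
is what the cover letter describes as honoring the host ABI.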


* Re: [Qemu-devel] [PATCH 00/62] s390x tcg target
  2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
                   ` (61 preceding siblings ...)
  2010-05-27 20:46 ` [Qemu-devel] [PATCH 62/62] tcg: Optionally sign-extend 32-bit arguments for 64-bit host Richard Henderson
@ 2010-05-27 21:00 ` Blue Swirl
  2010-05-27 21:14   ` Richard Henderson
  62 siblings, 1 reply; 67+ messages in thread
From: Blue Swirl @ 2010-05-27 21:00 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, aurelien, agraf

On Thu, May 27, 2010 at 8:45 PM, Richard Henderson <rth@twiddle.net> wrote:
> The following patch series is available at
>
>  git://repo.or.cz/qemu/rth.git tcg-s390-2
>
> It begins with Uli Hecht's original patch, posted by Alexander
> sometime last year.  I then make incremental changes to
>
>  (1) Make it compile -- first patch that compiles is tagged
>      as tcg-s390-2-first-compile and is
>
>      d142103... tcg-s390: Define tcg_target_reg_names.
>
>  (2) Make it work -- the first patch that i386-linux-user
>      successfully completes linux-test-user-0.2 is tagged
>      as tcg-s390-2-first-working and is
>
>      3571f8d... tcg-s390: Implement setcond.
>
>  (3) Make it work for other targets.  I don't tag this,
>      but there are lots of load/store aborts and an
>      incorrect division routine until
>
>      9798371... tcg-s390: Implement div2.
>
>  (4) Make it work well.  The balance of the patches incrementally
>      add support for new instructions.  At
>
>      7bfaa9e... tcg-s390: Query instruction extensions that are installed.
>
>      I add support for detecting the instruction set extensions
>      present in the host and then start disabling some of those
>      new instructions that may not be present.
>
> Once things start working, each step was tested with an --enable-debug
> compile, and running the linux-user-test suite as well as booting
> the {arm,coldfire,sparc}-linux test kernels, and booting freedos.
>
> Unfortunately, each step was only built without optimization, and it
> is only at the end that we discovered that TCG was not properly honoring
> the host ABI.  This is solved by the last patch, adding proper sign
> extensions for the 32-bit function arguments.  With the final patch
> everything works for an optimized build as well.
>
> The current state is that the TCG compiler works for an s390x host.
> That is, with a 64-bit userland binary.  It will *compile* for a
> 32-bit userland binary, but that facility is only retained for the
> purpose of running the s390 kvm guest.  If kvm is not used, the
> 32-bit binary will exit with an error message.
>
> Given that this is the beginning of proper support for s390, I don't
> know whether bisectability is really an issue.  I suppose we could
> fairly easily re-base the patches that touch files outside tcg/s390/
> and then squash the rest, but I suspect the history may be useful.
>
>
>
> r~
>
>
>
> Alexander Graf (2):
>  S390 TCG target
>  add lost chunks from the original patch
>
> Richard Henderson (60):
>  tcg-s390: Only validate CPUTLBEntry for system mode.
>  tcg-s390: Fix tcg_prepare_qemu_ldst for user mode.
>  tcg-s390: Move opcode defines to tcg-target.c.
>  s390x: Avoid _llseek.
>  s390x: Don't use a linker script for user-only.
>  tcg-s390: Avoid set-but-not-used werrors.
>  tcg-s390: Mark R0 & R15 reserved.
>  tcg-s390: R6 is a function argument register
>  tcg-s390: Move tcg_out_mov up and use it throughout.
>  tcg-s390: Eliminate the S constraint.
>  tcg-s390: Add -m64 and -march to s390x compilation.
>  tcg-s390: Define tcg_target_reg_names.
>  tcg-s390: Update disassembler from binutils head.

This is GPLv3, which is not OK. Please use the last v2 version, see
88103cfecf5666237fb2e55a7dd666fa66d316ec.

>  tcg-s390: Compute is_write in cpu_signal_handler.
>  tcg-s390: Reorganize instruction emission
>  tcg-s390: Use matching constraints.
>  tcg-s390: Fixup qemu_ld/st opcodes.
>  tcg-s390: Implement setcond.
>  tcg-s390: Generalize the direct load/store emission.
>  tcg-s390: Tidy branches.
>  tcg-s390: Add tgen_calli.
>  tcg-s390: Implement div2.
>  tcg-s390: Re-implement tcg_out_movi.
>  tcg-s390: Implement sign and zero-extension operations.
>  tcg-s390: Implement bswap operations.
>  tcg-s390: Implement rotates.
>  tcg-s390: Use LOAD COMPLIMENT for negate.
>  tcg-s390: Tidy unimplemented opcodes.
>  tcg-s390: Use the extended-immediate facility for add/sub.
>  tcg-s390: Implement immediate ANDs.
>  tcg-s390: Implement immediate ORs.
>  tcg-s390: Implement immediate MULs.
>  tcg-s390: Implement immediate XORs.
>  tcg-s390: Icache flush is a no-op.
>  tcg-s390: Define TCG_TMP0.
>  tcg-s390: Tidy regset initialization; use R14 as temporary.
>  tcg-s390: Rearrange register allocation order.
>  tcg-s390: Tidy goto_tb.
>  tcg-s390: Allocate the code_gen_buffer near the main program.
>  tcg-s390: Rearrange qemu_ld/st to avoid register copy.
>  tcg-s390: Tidy tcg_prepare_qemu_ldst.
>  tcg-s390: Tidy user qemu_ld/st.
>  tcg-s390: Implement GUEST_BASE.
>  tcg-s390: Query instruction extensions that are installed.
>  tcg-s390: Conditionalize general-instruction-extension insns.
>  tcg-s390: Conditionalize ADD IMMEDIATE instructions.
>  tcg-s390: Conditionalize LOAD IMMEDIATE instructions.
>  tcg-s390: Conditionalize 8 and 16 bit extensions.
>  tcg-s390: Conditionalize AND IMMEDIATE instructions.
>  tcg-s390: Conditionalize OR IMMEDIATE instructions.
>  tcg-s390: Conditionalize XOR IMMEDIATE instructions.
>  tcg-s390: Do not require the extended-immediate facility.
>  tcg-s390: Use 16-bit branches for forward jumps.
>  tcg-s390: Use the LOAD AND TEST instruction for compares.
>  tcg-s390: Use the COMPARE IMMEDIATE instrucions for compares.
>  tcg-s390: Use COMPARE AND BRANCH instructions.
>  tcg-s390: Generalize load/store support.
>  tcg-s390: Fix TLB comparison width.
>  tcg-s390: Enable compile in 32-bit mode.
>  tcg: Optionally sign-extend 32-bit arguments for 64-bit host.
>
>  configure                    |   12 +-
>  cpu-exec.c                   |   42 +-
>  def-helper.h                 |   38 +-
>  exec.c                       |    7 +
>  linux-user/syscall.c         |    4 +-
>  s390-dis.c                   |  818 +++++++++++++---
>  target-i386/ops_sse_header.h |    3 +
>  target-ppc/helper.h          |    1 +
>  tcg/s390/tcg-target.c        | 2240 +++++++++++++++++++++++++++++++++++++++++-
>  tcg/s390/tcg-target.h        |   63 +-
>  tcg/tcg-op.h                 |   34 +-
>  tcg/tcg.c                    |   41 +-
>  12 files changed, 3063 insertions(+), 240 deletions(-)
>
>
>


* Re: [Qemu-devel] [PATCH 00/62] s390x tcg target
  2010-05-27 21:00 ` [Qemu-devel] [PATCH 00/62] s390x tcg target Blue Swirl
@ 2010-05-27 21:14   ` Richard Henderson
  0 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-27 21:14 UTC (permalink / raw)
  To: Blue Swirl; +Cc: qemu-devel, aurelien, agraf

On 05/27/2010 02:00 PM, Blue Swirl wrote:
>>  tcg-s390: Update disassembler from binutils head.
> 
> This is GPLv3, which is not OK. Please use the last v2 version, see
> 88103cfecf5666237fb2e55a7dd666fa66d316ec.

Ok.  Thankfully there aren't too many changes since then.

I'll wait for more comments before reorganizing the patches
on the branch.



r~


* Re: [Qemu-devel] [PATCH 02/62] add lost chunks from the original patch
  2010-05-27 20:45 ` [Qemu-devel] [PATCH 02/62] add lost chunks from the original patch Richard Henderson
@ 2010-05-28 16:49   ` Andreas Färber
  2010-05-28 17:13     ` Richard Henderson
  0 siblings, 1 reply; 67+ messages in thread
From: Andreas Färber @ 2010-05-28 16:49 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel Developers, Alexander Graf

On 27.05.2010 at 22:45, Richard Henderson wrote:

> From: Alexander Graf <agraf@suse.de>
>
> ---
> tcg/s390/tcg-target.c |    3 ++
> tcg/s390/tcg-target.h |   86 +++++++++++++++++++++++++++++++++++++++++++++++--
> 2 files changed, 86 insertions(+), 3 deletions(-)

This one's missing an SoB, and fwiw I think it should be squashed with  
the preceding one.

Andreas


* Re: [Qemu-devel] [PATCH 02/62] add lost chunks from the original patch
  2010-05-28 16:49   ` Andreas Färber
@ 2010-05-28 17:13     ` Richard Henderson
  0 siblings, 0 replies; 67+ messages in thread
From: Richard Henderson @ 2010-05-28 17:13 UTC (permalink / raw)
  To: Andreas Färber; +Cc: qemu-devel Developers, Alexander Graf

On 05/28/2010 09:49 AM, Andreas Färber wrote:
> On 27.05.2010 at 22:45, Richard Henderson wrote:
> 
>> From: Alexander Graf <agraf@suse.de>
>>
>> ---
>> tcg/s390/tcg-target.c |    3 ++
>> tcg/s390/tcg-target.h |   86 +++++++++++++++++++++++++++++++++++++++++++++++--
>> 2 files changed, 86 insertions(+), 3 deletions(-)
> 
> This one's missing an SoB, and fwiw I think it should be squashed with
> the preceding one.

I'm intending to squash about 25 of these so that the "base" patch
actually works.  Perhaps it was premature to post the 62-part edition.
That said, Blue caught a license problem, so it hasn't been a complete
waste of time.


r~


end of thread, 2010-05-28 17:13 UTC

Thread overview: 67+ messages
2010-05-27 20:45 [Qemu-devel] [PATCH 00/62] s390x tcg target Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 01/62] S390 TCG target Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 02/62] add lost chunks from the original patch Richard Henderson
2010-05-28 16:49   ` Andreas Färber
2010-05-28 17:13     ` Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 03/62] tcg-s390: Only validate CPUTLBEntry for system mode Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 04/62] tcg-s390: Fix tcg_prepare_qemu_ldst for user mode Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 05/62] tcg-s390: Move opcode defines to tcg-target.c Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 06/62] s390x: Avoid _llseek Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 07/62] s390x: Don't use a linker script for user-only Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 08/62] tcg-s390: Avoid set-but-not-used werrors Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 09/62] tcg-s390: Mark R0 & R15 reserved Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 10/62] tcg-s390: R6 is a function argument register Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 11/62] tcg-s390: Move tcg_out_mov up and use it throughout Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 12/62] tcg-s390: Eliminate the S constraint Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 13/62] tcg-s390: Add -m64 and -march to s390x compilation Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 14/62] tcg-s390: Define tcg_target_reg_names Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 15/62] tcg-s390: Update disassembler from binutils head Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 16/62] tcg-s390: Compute is_write in cpu_signal_handler Richard Henderson
2010-05-27 20:45 ` [Qemu-devel] [PATCH 17/62] tcg-s390: Reorganize instruction emission Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 18/62] tcg-s390: Use matching constraints Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 19/62] tcg-s390: Fixup qemu_ld/st opcodes Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 20/62] tcg-s390: Implement setcond Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 21/62] tcg-s390: Generalize the direct load/store emission Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 22/62] tcg-s390: Tidy branches Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 23/62] tcg-s390: Add tgen_calli Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 24/62] tcg-s390: Implement div2 Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 25/62] tcg-s390: Re-implement tcg_out_movi Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 26/62] tcg-s390: Implement sign and zero-extension operations Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 27/62] tcg-s390: Implement bswap operations Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 28/62] tcg-s390: Implement rotates Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 29/62] tcg-s390: Use LOAD COMPLIMENT for negate Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 30/62] tcg-s390: Tidy unimplemented opcodes Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 31/62] tcg-s390: Use the extended-immediate facility for add/sub Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 32/62] tcg-s390: Implement immediate ANDs Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 33/62] tcg-s390: Implement immediate ORs Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 34/62] tcg-s390: Implement immediate MULs Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 35/62] tcg-s390: Implement immediate XORs Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 36/62] tcg-s390: Icache flush is a no-op Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 37/62] tcg-s390: Define TCG_TMP0 Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 38/62] tcg-s390: Tidy regset initialization; use R14 as temporary Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 39/62] tcg-s390: Rearrange register allocation order Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 40/62] tcg-s390: Tidy goto_tb Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 41/62] tcg-s390: Allocate the code_gen_buffer near the main program Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 42/62] tcg-s390: Rearrange qemu_ld/st to avoid register copy Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 43/62] tcg-s390: Tidy tcg_prepare_qemu_ldst Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 44/62] tcg-s390: Tidy user qemu_ld/st Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 45/62] tcg-s390: Implement GUEST_BASE Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 46/62] tcg-s390: Query instruction extensions that are installed Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 47/62] tcg-s390: Conditionalize general-instruction-extension insns Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 48/62] tcg-s390: Conditionalize ADD IMMEDIATE instructions Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 49/62] tcg-s390: Conditionalize LOAD " Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 50/62] tcg-s390: Conditionalize 8 and 16 bit extensions Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 51/62] tcg-s390: Conditionalize AND IMMEDIATE instructions Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 52/62] tcg-s390: Conditionalize OR " Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 53/62] tcg-s390: Conditionalize XOR " Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 54/62] tcg-s390: Do not require the extended-immediate facility Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 55/62] tcg-s390: Use 16-bit branches for forward jumps Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 56/62] tcg-s390: Use the LOAD AND TEST instruction for compares Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 57/62] tcg-s390: Use the COMPARE IMMEDIATE instrucions " Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 58/62] tcg-s390: Use COMPARE AND BRANCH instructions Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 59/62] tcg-s390: Generalize load/store support Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 60/62] tcg-s390: Fix TLB comparison width Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 61/62] tcg-s390: Enable compile in 32-bit mode Richard Henderson
2010-05-27 20:46 ` [Qemu-devel] [PATCH 62/62] tcg: Optionally sign-extend 32-bit arguments for 64-bit host Richard Henderson
2010-05-27 21:00 ` [Qemu-devel] [PATCH 00/62] s390x tcg target Blue Swirl
2010-05-27 21:14   ` Richard Henderson
