qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH v2 00/11] target-arm improvements for aarch64
@ 2015-09-02 17:57 Richard Henderson
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 01/11] target-arm: Share all common TCG temporaries Richard Henderson
                   ` (10 more replies)
  0 siblings, 11 replies; 31+ messages in thread
From: Richard Henderson @ 2015-09-02 17:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Version 1 was posted back in February.  At the time, Peter was
less than thrilled about extending the aarch64 NZCV tcg temps
to 64 bits.  This revision drops that change, and so should be
less controversial.

The tree has also been updated to mainline, so
tcg_gen_extrh_i64_i32 is now available to us, which allows
one more bit of tidying up.


r~


Richard Henderson (11):
  target-arm: Share all common TCG temporaries
  target-arm: Introduce DisasCompare
  target-arm: Handle always condition codes within arm_test_cc
  target-arm: Use setcond and movcond for csel
  target-arm: Implement ccmp branchless
  target-arm: Implement fcsel with movcond
  target-arm: Recognize SXTB, SXTH, SXTW, ASR
  target-arm: Recognize UXTB, UXTH, LSR, LSL
  target-arm: Eliminate unnecessary zero-extend in disas_bitfield
  target-arm: Recognize ROR
  target-arm: Use tcg_gen_extrh_i64_i32

 target-arm/translate-a64.c | 336 ++++++++++++++++++++++++++-------------------
 target-arm/translate.c     | 129 ++++++++++-------
 target-arm/translate.h     |  17 +++
 3 files changed, 286 insertions(+), 196 deletions(-)

-- 
2.4.3


* [Qemu-devel] [PATCH v2 01/11] target-arm: Share all common TCG temporaries
  2015-09-02 17:57 [Qemu-devel] [PATCH v2 00/11] target-arm improvements for aarch64 Richard Henderson
@ 2015-09-02 17:57 ` Richard Henderson
  2015-09-07 16:57   ` Peter Maydell
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 02/11] target-arm: Introduce DisasCompare Richard Henderson
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2015-09-02 17:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

This is a bug fix for aarch64.  At present, we have branches using
the 32-bit (translate.c) versions of cpu_[NZCV]F, but we set the flags
using the 64-bit (translate-a64.c) versions of cpu_[NZCV]F.  From
the view of the TCG code generator, these are unrelated variables.

The bug is hard to see because we currently only read these variables
from branches, and upon reaching a branch TCG will first spill live
variables and then reload the arguments of the branch.  Since the
32-bit versions were never live until reaching the branch, we'd re-read
the data that had just been spilled from the 64-bit versions.

There is currently no such problem with the cpu_exclusive_* variables,
but there's no point in tempting fate.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
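Reader's sketch of the hazard described above, as a toy model only.
It assumes each TCG global behaves like an independently cached copy of
one CPUARMState field; ToyGlobal and the toy_* helpers are hypothetical
names and are not QEMU data structures or APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a TCG global: a host register caching one memory slot. */
typedef struct {
    int32_t *mem;   /* backing field, e.g. env->ZF */
    int32_t reg;    /* cached value */
    bool live;      /* reg holds a valid copy */
    bool dirty;     /* reg is newer than *mem */
} ToyGlobal;

/* Writing only updates the cached copy of this one handle. */
static void toy_write(ToyGlobal *g, int32_t v)
{
    g->reg = v;
    g->live = true;
    g->dirty = true;
}

/* Reading loads from memory only if this handle has no live copy. */
static int32_t toy_read(ToyGlobal *g)
{
    if (!g->live) {
        g->reg = *g->mem;
        g->live = true;
    }
    return g->reg;
}

/* Modelled branch behaviour: spill a dirty value, then forget the cache. */
static void toy_sync(ToyGlobal *g)
{
    assert(g->mem);
    if (g->dirty) {
        *g->mem = g->reg;
    }
    g->live = false;
    g->dirty = false;
}
```

With two handles on the same field, a read through the second handle is
only correct if a sync happened after the last write through the first.
That is the accident the commit message describes, and sharing a single
handle removes the dependency on it.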
 target-arm/translate-a64.c | 22 ----------------------
 target-arm/translate.c     | 10 +++++-----
 target-arm/translate.h     |  8 ++++++++
 3 files changed, 13 insertions(+), 27 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 5c13e15..1587ab5 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -39,16 +39,9 @@
 
 static TCGv_i64 cpu_X[32];
 static TCGv_i64 cpu_pc;
-static TCGv_i32 cpu_NF, cpu_ZF, cpu_CF, cpu_VF;
 
 /* Load/store exclusive handling */
-static TCGv_i64 cpu_exclusive_addr;
-static TCGv_i64 cpu_exclusive_val;
 static TCGv_i64 cpu_exclusive_high;
-#ifdef CONFIG_USER_ONLY
-static TCGv_i64 cpu_exclusive_test;
-static TCGv_i32 cpu_exclusive_info;
-#endif
 
 static const char *regnames[] = {
     "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7",
@@ -104,23 +97,8 @@ void a64_translate_init(void)
                                           regnames[i]);
     }
 
-    cpu_NF = tcg_global_mem_new_i32(TCG_AREG0, offsetof(CPUARMState, NF), "NF");
-    cpu_ZF = tcg_global_mem_new_i32(TCG_AREG0, offsetof(CPUARMState, ZF), "ZF");
-    cpu_CF = tcg_global_mem_new_i32(TCG_AREG0, offsetof(CPUARMState, CF), "CF");
-    cpu_VF = tcg_global_mem_new_i32(TCG_AREG0, offsetof(CPUARMState, VF), "VF");
-
-    cpu_exclusive_addr = tcg_global_mem_new_i64(TCG_AREG0,
-        offsetof(CPUARMState, exclusive_addr), "exclusive_addr");
-    cpu_exclusive_val = tcg_global_mem_new_i64(TCG_AREG0,
-        offsetof(CPUARMState, exclusive_val), "exclusive_val");
     cpu_exclusive_high = tcg_global_mem_new_i64(TCG_AREG0,
         offsetof(CPUARMState, exclusive_high), "exclusive_high");
-#ifdef CONFIG_USER_ONLY
-    cpu_exclusive_test = tcg_global_mem_new_i64(TCG_AREG0,
-        offsetof(CPUARMState, exclusive_test), "exclusive_test");
-    cpu_exclusive_info = tcg_global_mem_new_i32(TCG_AREG0,
-        offsetof(CPUARMState, exclusive_info), "exclusive_info");
-#endif
 }
 
 static inline ARMMMUIdx get_a64_user_mem_index(DisasContext *s)
diff --git a/target-arm/translate.c b/target-arm/translate.c
index e27634f..3826a02 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -64,12 +64,12 @@ TCGv_ptr cpu_env;
 /* We reuse the same 64-bit temporaries for efficiency.  */
 static TCGv_i64 cpu_V0, cpu_V1, cpu_M0;
 static TCGv_i32 cpu_R[16];
-static TCGv_i32 cpu_CF, cpu_NF, cpu_VF, cpu_ZF;
-static TCGv_i64 cpu_exclusive_addr;
-static TCGv_i64 cpu_exclusive_val;
+TCGv_i32 cpu_CF, cpu_NF, cpu_VF, cpu_ZF;
+TCGv_i64 cpu_exclusive_addr;
+TCGv_i64 cpu_exclusive_val;
 #ifdef CONFIG_USER_ONLY
-static TCGv_i64 cpu_exclusive_test;
-static TCGv_i32 cpu_exclusive_info;
+TCGv_i64 cpu_exclusive_test;
+TCGv_i32 cpu_exclusive_info;
 #endif
 
 /* FIXME:  These should be removed.  */
diff --git a/target-arm/translate.h b/target-arm/translate.h
index 9ab978f..679bdbc 100644
--- a/target-arm/translate.h
+++ b/target-arm/translate.h
@@ -62,7 +62,15 @@ typedef struct DisasContext {
     TCGv_i64 tmp_a64[TMP_A64_MAX];
 } DisasContext;
 
+/* Share the TCG temporaries common between 32 and 64 bit modes.  */
 extern TCGv_ptr cpu_env;
+extern TCGv_i32 cpu_NF, cpu_ZF, cpu_CF, cpu_VF;
+extern TCGv_i64 cpu_exclusive_addr;
+extern TCGv_i64 cpu_exclusive_val;
+#ifdef CONFIG_USER_ONLY
+extern TCGv_i64 cpu_exclusive_test;
+extern TCGv_i32 cpu_exclusive_info;
+#endif
 
 static inline int arm_dc_feature(DisasContext *dc, int feature)
 {
-- 
2.4.3


* [Qemu-devel] [PATCH v2 02/11] target-arm: Introduce DisasCompare
  2015-09-02 17:57 [Qemu-devel] [PATCH v2 00/11] target-arm improvements for aarch64 Richard Henderson
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 01/11] target-arm: Share all common TCG temporaries Richard Henderson
@ 2015-09-02 17:57 ` Richard Henderson
  2015-09-07 17:09   ` Peter Maydell
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 03/11] target-arm: Handle always condition codes within arm_test_cc Richard Henderson
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2015-09-02 17:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Split arm_gen_test_cc into 3 functions, so that it can be reused
for non-branch TCG comparisons.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
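Reader's sketch: a plain-C model of the (cond, value) pairs that
arm_test_cc builds in the diff below, evaluated on concrete flag values
instead of emitting TCG ops.  It assumes the flag layout the diff relies
on (N and V in bit 31 of NF and VF, Z set when ZF == 0, CF holding 0 or 1).
The Toy* names are hypothetical and only cc 0..13 are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint32_t NF, ZF, CF, VF; } ToyFlags;
typedef enum { TOY_EQ, TOY_NE, TOY_LT, TOY_GE } ToyCond; /* compare with 0 */
typedef struct { ToyCond cond; uint32_t value; } ToyCompare;

static ToyCond toy_invert(ToyCond c)
{
    switch (c) {
    case TOY_EQ: return TOY_NE;
    case TOY_NE: return TOY_EQ;
    case TOY_LT: return TOY_GE;
    default:     return TOY_LT;
    }
}

/* Build the comparison for the even code, then invert for the odd one. */
static ToyCompare toy_test_cc(const ToyFlags *f, int cc)
{
    ToyCompare cmp;
    uint32_t ne_mask;

    assert(cc >= 0 && cc <= 13);
    switch (cc >> 1) {
    case 0: cmp.cond = TOY_EQ; cmp.value = f->ZF; break;          /* eq/ne */
    case 1: cmp.cond = TOY_NE; cmp.value = f->CF; break;          /* cs/cc */
    case 2: cmp.cond = TOY_LT; cmp.value = f->NF; break;          /* mi/pl */
    case 3: cmp.cond = TOY_LT; cmp.value = f->VF; break;          /* vs/vc */
    case 4: /* hi/ls: -CF is all-ones when C is set, so value = C ? ZF : 0 */
        cmp.cond = TOY_NE;
        cmp.value = (0u - f->CF) & f->ZF;
        break;
    case 5: /* ge/lt: sign bit of N ^ V */
        cmp.cond = TOY_GE;
        cmp.value = f->VF ^ f->NF;
        break;
    default: /* gt/le: clear ZF when N != V, mirroring sari + andc */
        ne_mask = ((f->VF ^ f->NF) >> 31) ? 0xffffffffu : 0u;
        cmp.cond = TOY_NE;
        cmp.value = f->ZF & ~ne_mask;
        break;
    }
    if (cc & 1) {
        cmp.cond = toy_invert(cmp.cond);
    }
    return cmp;
}

/* Evaluate "value <cond> 0", treating value as signed for LT/GE. */
static bool toy_holds(ToyCompare cmp)
{
    bool negative = (cmp.value >> 31) != 0;

    switch (cmp.cond) {
    case TOY_EQ: return cmp.value == 0;
    case TOY_NE: return cmp.value != 0;
    case TOY_LT: return negative;
    default:     return !negative;
    }
}
```

Reducing every condition to one value compared against zero is what
lets the same result feed brcond, setcond or movcond later in the series.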
 target-arm/translate.c | 110 ++++++++++++++++++++++++++++---------------------
 target-arm/translate.h |   9 ++++
 2 files changed, 73 insertions(+), 46 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 3826a02..1f43777 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -738,81 +738,99 @@ static void gen_thumb2_parallel_addsub(int op1, int op2, TCGv_i32 a, TCGv_i32 b)
 #undef PAS_OP
 
 /*
- * generate a conditional branch based on ARM condition code cc.
+ * Generate a conditional based on ARM condition code cc.
  * This is common between ARM and Aarch64 targets.
  */
-void arm_gen_test_cc(int cc, TCGLabel *label)
+void arm_test_cc(DisasCompare *cmp, int cc)
 {
-    TCGv_i32 tmp;
-    TCGLabel *inv;
+    TCGv_i32 value;
+    TCGCond cond;
+    bool global = true;
 
     switch (cc) {
     case 0: /* eq: Z */
-        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_ZF, 0, label);
-        break;
     case 1: /* ne: !Z */
-        tcg_gen_brcondi_i32(TCG_COND_NE, cpu_ZF, 0, label);
+        cond = TCG_COND_EQ;
+        value = cpu_ZF;
         break;
+
     case 2: /* cs: C */
-        tcg_gen_brcondi_i32(TCG_COND_NE, cpu_CF, 0, label);
-        break;
     case 3: /* cc: !C */
-        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_CF, 0, label);
+        cond = TCG_COND_NE;
+        value = cpu_CF;
         break;
+
     case 4: /* mi: N */
-        tcg_gen_brcondi_i32(TCG_COND_LT, cpu_NF, 0, label);
-        break;
     case 5: /* pl: !N */
-        tcg_gen_brcondi_i32(TCG_COND_GE, cpu_NF, 0, label);
+        cond = TCG_COND_LT;
+        value = cpu_NF;
         break;
+
     case 6: /* vs: V */
-        tcg_gen_brcondi_i32(TCG_COND_LT, cpu_VF, 0, label);
-        break;
     case 7: /* vc: !V */
-        tcg_gen_brcondi_i32(TCG_COND_GE, cpu_VF, 0, label);
+        cond = TCG_COND_LT;
+        value = cpu_VF;
         break;
+
     case 8: /* hi: C && !Z */
-        inv = gen_new_label();
-        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_CF, 0, inv);
-        tcg_gen_brcondi_i32(TCG_COND_NE, cpu_ZF, 0, label);
-        gen_set_label(inv);
-        break;
-    case 9: /* ls: !C || Z */
-        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_CF, 0, label);
-        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_ZF, 0, label);
+    case 9: /* ls: !C || Z -> !(C && !Z) */
+        cond = TCG_COND_NE;
+        value = tcg_temp_new_i32();
+        global = false;
+        tcg_gen_neg_i32(value, cpu_CF);
+        tcg_gen_and_i32(value, value, cpu_ZF);
         break;
+
     case 10: /* ge: N == V -> N ^ V == 0 */
-        tmp = tcg_temp_new_i32();
-        tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
-        tcg_gen_brcondi_i32(TCG_COND_GE, tmp, 0, label);
-        tcg_temp_free_i32(tmp);
-        break;
     case 11: /* lt: N != V -> N ^ V != 0 */
-        tmp = tcg_temp_new_i32();
-        tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
-        tcg_gen_brcondi_i32(TCG_COND_LT, tmp, 0, label);
-        tcg_temp_free_i32(tmp);
+        cond = TCG_COND_GE;
+        value = tcg_temp_new_i32();
+        global = false;
+        tcg_gen_xor_i32(value, cpu_VF, cpu_NF);
         break;
+
     case 12: /* gt: !Z && N == V */
-        inv = gen_new_label();
-        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_ZF, 0, inv);
-        tmp = tcg_temp_new_i32();
-        tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
-        tcg_gen_brcondi_i32(TCG_COND_GE, tmp, 0, label);
-        tcg_temp_free_i32(tmp);
-        gen_set_label(inv);
-        break;
     case 13: /* le: Z || N != V */
-        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_ZF, 0, label);
-        tmp = tcg_temp_new_i32();
-        tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
-        tcg_gen_brcondi_i32(TCG_COND_LT, tmp, 0, label);
-        tcg_temp_free_i32(tmp);
+        cond = TCG_COND_NE;
+        value = tcg_temp_new_i32();
+        global = false;
+        tcg_gen_xor_i32(value, cpu_VF, cpu_NF);
+        tcg_gen_sari_i32(value, value, 31);
+        tcg_gen_andc_i32(value, cpu_ZF, value);
         break;
+
     default:
         fprintf(stderr, "Bad condition code 0x%x\n", cc);
         abort();
     }
+
+    if (cc & 1) {
+        cond = tcg_invert_cond(cond);
+    }
+
+    cmp->cond = cond;
+    cmp->value = value;
+    cmp->value_global = global;
+}
+
+void arm_free_cc(DisasCompare *cmp)
+{
+    if (!cmp->value_global) {
+        tcg_temp_free_i32(cmp->value);
+    }
+}
+
+void arm_jump_cc(DisasCompare *cmp, TCGLabel *label)
+{
+    tcg_gen_brcondi_i32(cmp->cond, cmp->value, 0, label);
+}
+
+void arm_gen_test_cc(int cc, TCGLabel *label)
+{
+    DisasCompare cmp;
+    arm_test_cc(&cmp, cc);
+    arm_jump_cc(&cmp, label);
+    arm_free_cc(&cmp);
 }
 
 static const uint8_t table_logic_cc[16] = {
diff --git a/target-arm/translate.h b/target-arm/translate.h
index 679bdbc..2f30862 100644
--- a/target-arm/translate.h
+++ b/target-arm/translate.h
@@ -62,6 +62,12 @@ typedef struct DisasContext {
     TCGv_i64 tmp_a64[TMP_A64_MAX];
 } DisasContext;
 
+typedef struct DisasCompare {
+    TCGCond cond;
+    TCGv_i32 value;
+    bool value_global;
+} DisasCompare;
+
 /* Share the TCG temporaries common between 32 and 64 bit modes.  */
 extern TCGv_ptr cpu_env;
 extern TCGv_i32 cpu_NF, cpu_ZF, cpu_CF, cpu_VF;
@@ -143,6 +149,9 @@ static inline void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
 }
 #endif
 
+void arm_test_cc(DisasCompare *cmp, int cc);
+void arm_free_cc(DisasCompare *cmp);
+void arm_jump_cc(DisasCompare *cmp, TCGLabel *label);
 void arm_gen_test_cc(int cc, TCGLabel *label);
 
 #endif /* TARGET_ARM_TRANSLATE_H */
-- 
2.4.3


* [Qemu-devel] [PATCH v2 03/11] target-arm: Handle always condition codes within arm_test_cc
  2015-09-02 17:57 [Qemu-devel] [PATCH v2 00/11] target-arm improvements for aarch64 Richard Henderson
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 01/11] target-arm: Share all common TCG temporaries Richard Henderson
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 02/11] target-arm: Introduce DisasCompare Richard Henderson
@ 2015-09-02 17:57 ` Richard Henderson
  2015-09-07 17:11   ` Peter Maydell
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 04/11] target-arm: Use setcond and movcond for csel Richard Henderson
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2015-09-02 17:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Handling these codes with TCG_COND_ALWAYS lets the rest of the
translator deal with these unlikely cases without special-casing
them.  The TCG optimizer ought to be able to fold these ALWAYS
conditions away completely.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
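Reader's sketch of why the diff below jumps past the inversion step.
It assumes that inverting an "always" condition yields "never", which is
my understanding of tcg_invert_cond; ToyCond and the toy_* helpers are
hypothetical and model only cc 0, 1, 14 and 15.

```c
#include <assert.h>

typedef enum { TOY_NEVER, TOY_ALWAYS, TOY_EQ, TOY_NE } ToyCond;

static ToyCond toy_invert(ToyCond c)
{
    switch (c) {
    case TOY_NEVER:  return TOY_ALWAYS;
    case TOY_ALWAYS: return TOY_NEVER;
    case TOY_EQ:     return TOY_NE;
    default:         return TOY_EQ;
    }
}

/* skip_invert_for_always plays the role of the patch's "goto no_invert". */
static ToyCond toy_cond(int cc, int skip_invert_for_always)
{
    ToyCond cond;
    int always = (cc >= 14);

    assert(cc == 0 || cc == 1 || cc == 14 || cc == 15);
    cond = always ? TOY_ALWAYS : TOY_EQ;
    if ((cc & 1) && !(always && skip_invert_for_always)) {
        cond = toy_invert(cond);
    }
    return cond;
}
```

Code 15 is odd, so without the skip it would take the common
"if (cc & 1)" path and turn into a condition that never holds.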
 target-arm/translate.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 1f43777..e2bccef 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -799,6 +799,14 @@ void arm_test_cc(DisasCompare *cmp, int cc)
         tcg_gen_andc_i32(value, cpu_ZF, value);
         break;
 
+    case 14: /* always */
+    case 15: /* always */
+        /* Use the ALWAYS condition, which will fold early.
+           It doesn't matter what we use for the value.  */
+        cond = TCG_COND_ALWAYS;
+        value = cpu_ZF;
+        goto no_invert;
+
     default:
         fprintf(stderr, "Bad condition code 0x%x\n", cc);
         abort();
@@ -808,6 +816,7 @@ void arm_test_cc(DisasCompare *cmp, int cc)
         cond = tcg_invert_cond(cond);
     }
 
+ no_invert:
     cmp->cond = cond;
     cmp->value = value;
     cmp->value_global = global;
-- 
2.4.3


* [Qemu-devel] [PATCH v2 04/11] target-arm: Use setcond and movcond for csel
  2015-09-02 17:57 [Qemu-devel] [PATCH v2 00/11] target-arm improvements for aarch64 Richard Henderson
                   ` (2 preceding siblings ...)
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 03/11] target-arm: Handle always condition codes within arm_test_cc Richard Henderson
@ 2015-09-02 17:57 ` Richard Henderson
  2015-09-07 17:17   ` Peter Maydell
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 05/11] target-arm: Implement ccmp branchless Richard Henderson
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2015-09-02 17:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
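Reader's sketch of the value computation in the diff below, written as
plain C on concrete values.  toy_csel follows the movcond path and
toy_cset the setcond shortcut taken when both sources are XZR; both names
are hypothetical, and "cond" is the already-evaluated condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CSEL / CSINC / CSINV / CSNEG: pick rn, or a modified rm. */
static uint64_t toy_csel(bool cond, uint64_t rn, uint64_t rm,
                         bool else_inv, bool else_inc, bool sf)
{
    uint64_t t_false = rm;
    uint64_t rd;

    if (else_inv && else_inc) {
        t_false = 0 - t_false;      /* CSNEG */
    } else if (else_inv) {
        t_false = ~t_false;         /* CSINV */
    } else if (else_inc) {
        t_false = t_false + 1;      /* CSINC */
    }
    rd = cond ? rn : t_false;       /* the movcond */
    if (!sf) {
        rd &= 0xffffffffu;          /* 32-bit form zero-extends */
    }
    return rd;
}

/* Shortcut for rn == rm == XZR with exactly one of inv/inc set. */
static uint64_t toy_cset(bool cond, bool else_inv, bool sf)
{
    uint64_t rd = cond ? 0 : 1;     /* setcond on the inverted condition */

    if (else_inv) {
        rd = 0 - rd;                /* all-ones instead of 1 */
    }
    if (!sf) {
        rd &= 0xffffffffu;
    }
    return rd;
}
```

For zero sources the two functions agree, which is the property the
shortcut depends on.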
 target-arm/translate-a64.c | 87 +++++++++++++++++++++++++++-------------------
 1 file changed, 51 insertions(+), 36 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 1587ab5..dcac490 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -166,6 +166,33 @@ void gen_a64_set_pc_im(uint64_t val)
     tcg_gen_movi_i64(cpu_pc, val);
 }
 
+typedef struct DisasCompare64 {
+    TCGCond cond;
+    TCGv_i64 value;
+} DisasCompare64;
+
+static void a64_test_cc(DisasCompare64 *c64, int cc)
+{
+    DisasCompare c32;
+
+    arm_test_cc(&c32, cc);
+
+    c64->value = tcg_temp_new_i64();
+    c64->cond = c32.cond;
+    if (c32.cond == TCG_COND_EQ || c32.cond == TCG_COND_NE) {
+        tcg_gen_extu_i32_i64(c64->value, c32.value);
+    } else {
+        tcg_gen_ext_i32_i64(c64->value, c32.value);
+    }
+
+    arm_free_cc(&c32);
+}
+
+static void a64_free_cc(DisasCompare64 *c64)
+{
+    tcg_temp_free_i64(c64->value);
+}
+
 static void gen_exception_internal(int excp)
 {
     TCGv_i32 tcg_excp = tcg_const_i32(excp);
@@ -3587,7 +3614,8 @@ static void disas_cc(DisasContext *s, uint32_t insn)
 static void disas_cond_select(DisasContext *s, uint32_t insn)
 {
     unsigned int sf, else_inv, rm, cond, else_inc, rn, rd;
-    TCGv_i64 tcg_rd, tcg_src;
+    TCGv_i64 tcg_rd, zero;
+    DisasCompare64 c;
 
     if (extract32(insn, 29, 1) || extract32(insn, 11, 1)) {
         /* S == 1 or op2<1> == 1 */
@@ -3602,48 +3630,35 @@ static void disas_cond_select(DisasContext *s, uint32_t insn)
     rn = extract32(insn, 5, 5);
     rd = extract32(insn, 0, 5);
 
-    if (rd == 31) {
-        /* silly no-op write; until we use movcond we must special-case
-         * this to avoid a dead temporary across basic blocks.
-         */
-        return;
-    }
-
     tcg_rd = cpu_reg(s, rd);
 
-    if (cond >= 0x0e) { /* condition "always" */
-        tcg_src = read_cpu_reg(s, rn, sf);
-        tcg_gen_mov_i64(tcg_rd, tcg_src);
-    } else {
-        /* OPTME: we could use movcond here, at the cost of duplicating
-         * a lot of the arm_gen_test_cc() logic.
-         */
-        TCGLabel *label_match = gen_new_label();
-        TCGLabel *label_continue = gen_new_label();
-
-        arm_gen_test_cc(cond, label_match);
-        /* nomatch: */
-        tcg_src = cpu_reg(s, rm);
+    a64_test_cc(&c, cond);
+    zero = tcg_const_i64(0);
 
+    if (rn == 31 && rm == 31 && (else_inc ^ else_inv)) {
+        /* CSET & CSETM.  */
+        tcg_gen_setcond_i64(tcg_invert_cond(c.cond), tcg_rd, c.value, zero);
+        if (else_inv) {
+            tcg_gen_neg_i64(tcg_rd, tcg_rd);
+        }
+    } else {
+        TCGv_i64 t_true = cpu_reg(s, rn);
+        TCGv_i64 t_false = read_cpu_reg(s, rm, 1);
         if (else_inv && else_inc) {
-            tcg_gen_neg_i64(tcg_rd, tcg_src);
+            tcg_gen_neg_i64(t_false, t_false);
         } else if (else_inv) {
-            tcg_gen_not_i64(tcg_rd, tcg_src);
+            tcg_gen_not_i64(t_false, t_false);
         } else if (else_inc) {
-            tcg_gen_addi_i64(tcg_rd, tcg_src, 1);
-        } else {
-            tcg_gen_mov_i64(tcg_rd, tcg_src);
-        }
-        if (!sf) {
-            tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
+            tcg_gen_addi_i64(t_false, t_false, 1);
         }
-        tcg_gen_br(label_continue);
-        /* match: */
-        gen_set_label(label_match);
-        tcg_src = read_cpu_reg(s, rn, sf);
-        tcg_gen_mov_i64(tcg_rd, tcg_src);
-        /* continue: */
-        gen_set_label(label_continue);
+        tcg_gen_movcond_i64(c.cond, tcg_rd, c.value, zero, t_true, t_false);
+    }
+
+    tcg_temp_free_i64(zero);
+    a64_free_cc(&c);
+
+    if (!sf) {
+        tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
     }
 }
 
-- 
2.4.3


* [Qemu-devel] [PATCH v2 05/11] target-arm: Implement ccmp branchless
  2015-09-02 17:57 [Qemu-devel] [PATCH v2 00/11] target-arm improvements for aarch64 Richard Henderson
                   ` (3 preceding siblings ...)
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 04/11] target-arm: Use setcond and movcond for csel Richard Henderson
@ 2015-09-02 17:57 ` Richard Henderson
  2015-09-07 17:31   ` Peter Maydell
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 06/11] target-arm: Implement fcsel with movcond Richard Henderson
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2015-09-02 17:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

This can allow much of a ccmp to be elided when particular
flags are subsequently dead.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
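Reader's sketch of the mask trick used in the diff below, on concrete
values.  It assumes the same flag layout as the rest of the series (N and
V in bit 31, Z set when ZF == 0, CF holding 0 or 1); ToyFlags and the
toy_* helpers are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint32_t NF, ZF, CF, VF; } ToyFlags;

/* Pack the architectural view of the flags into a 4-bit NZCV value. */
static unsigned toy_nzcv(const ToyFlags *f)
{
    return ((f->NF >> 31) << 3) | ((f->ZF == 0) << 2)
         | ((f->CF != 0) << 1) | (f->VF >> 31);
}

/* Keep the freshly computed flags if cond holds, else force them to nzcv. */
static void toy_ccmp_fixup(ToyFlags *f, bool cond, unsigned nzcv)
{
    uint32_t t0 = cond ? 0u : 1u;   /* T0 = !COND */
    uint32_t t1 = 0u - t0;          /* COND ? 0 : -1 */
    uint32_t t2 = t0 - 1u;          /* COND ? -1 : 0 */

    assert(nzcv < 16);
    if (nzcv & 8) { f->NF |= t1; } else { f->NF &= t2; }
    if (nzcv & 4) { f->ZF &= t2; } else { f->ZF |= t0; }
    if (nzcv & 2) { f->CF |= t0; } else { f->CF &= t2; }
    if (nzcv & 1) { f->VF |= t1; } else { f->VF &= t2; }
}
```

When cond holds, every OR uses 0 and every AND uses all-ones, so the
flags pass through unchanged.  Each flag is updated by its own operation,
which is what allows an unused flag update to be dropped later.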
 target-arm/translate-a64.c | 65 +++++++++++++++++++++++++++++++---------------
 1 file changed, 44 insertions(+), 21 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index dcac490..48ecf23 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -3552,8 +3552,9 @@ static void disas_adc_sbc(DisasContext *s, uint32_t insn)
 static void disas_cc(DisasContext *s, uint32_t insn)
 {
     unsigned int sf, op, y, cond, rn, nzcv, is_imm;
-    TCGLabel *label_continue = NULL;
-    TCGv_i64 tcg_tmp, tcg_y, tcg_rn;
+    TCGv_i32 tcg_t0, tcg_t1, tcg_t2;
+    TCGv_i64 tcg_res, tcg_y, tcg_rn;
+    DisasCompare c;
 
     if (!extract32(insn, 29, 1)) {
         unallocated_encoding(s);
@@ -3571,19 +3572,13 @@ static void disas_cc(DisasContext *s, uint32_t insn)
     rn = extract32(insn, 5, 5);
     nzcv = extract32(insn, 0, 4);
 
-    if (cond < 0x0e) { /* not always */
-        TCGLabel *label_match = gen_new_label();
-        label_continue = gen_new_label();
-        arm_gen_test_cc(cond, label_match);
-        /* nomatch: */
-        tcg_tmp = tcg_temp_new_i64();
-        tcg_gen_movi_i64(tcg_tmp, nzcv << 28);
-        gen_set_nzcv(tcg_tmp);
-        tcg_temp_free_i64(tcg_tmp);
-        tcg_gen_br(label_continue);
-        gen_set_label(label_match);
-    }
-    /* match, or condition is always */
+    /* Set T0 = !COND.  */
+    tcg_t0 = tcg_temp_new_i32();
+    arm_test_cc(&c, cond);
+    tcg_gen_setcondi_i32(tcg_invert_cond(c.cond), tcg_t0, c.value, 0);
+    arm_free_cc(&c);
+
+    /* Load the arguments for the new comparison.  */
     if (is_imm) {
         tcg_y = new_tmp_a64(s);
         tcg_gen_movi_i64(tcg_y, y);
@@ -3592,17 +3587,45 @@ static void disas_cc(DisasContext *s, uint32_t insn)
     }
     tcg_rn = cpu_reg(s, rn);
 
-    tcg_tmp = tcg_temp_new_i64();
+    /* Set the flags for the new comparison.  */
+    tcg_res = tcg_temp_new_i64();
     if (op) {
-        gen_sub_CC(sf, tcg_tmp, tcg_rn, tcg_y);
+        gen_sub_CC(sf, tcg_res, tcg_rn, tcg_y);
     } else {
-        gen_add_CC(sf, tcg_tmp, tcg_rn, tcg_y);
+        gen_add_CC(sf, tcg_res, tcg_rn, tcg_y);
     }
-    tcg_temp_free_i64(tcg_tmp);
+    tcg_temp_free_i64(tcg_res);
 
-    if (cond < 0x0e) { /* continue */
-        gen_set_label(label_continue);
+    /* If COND was false, force the flags to #nzcv.
+       Note that T1 = (COND ? 0 : -1), T2 = (COND ? -1 : 0).  */
+    tcg_t1 = tcg_temp_new_i32();
+    tcg_t2 = tcg_temp_new_i32();
+    tcg_gen_neg_i32(tcg_t1, tcg_t0);
+    tcg_gen_subi_i32(tcg_t2, tcg_t0, 1);
+
+    if (nzcv & 8) { /* N */
+        tcg_gen_or_i32(cpu_NF, cpu_NF, tcg_t1);
+    } else {
+        tcg_gen_and_i32(cpu_NF, cpu_NF, tcg_t2);
+    }
+    if (nzcv & 4) { /* Z */
+        tcg_gen_and_i32(cpu_ZF, cpu_ZF, tcg_t2);
+    } else {
+        tcg_gen_or_i32(cpu_ZF, cpu_ZF, tcg_t0);
+    }
+    if (nzcv & 2) { /* C */
+        tcg_gen_or_i32(cpu_CF, cpu_CF, tcg_t0);
+    } else {
+        tcg_gen_and_i32(cpu_CF, cpu_CF, tcg_t2);
+    }
+    if (nzcv & 1) { /* V */
+        tcg_gen_or_i32(cpu_VF, cpu_VF, tcg_t1);
+    } else {
+        tcg_gen_and_i32(cpu_VF, cpu_VF, tcg_t2);
     }
+    tcg_temp_free_i32(tcg_t0);
+    tcg_temp_free_i32(tcg_t1);
+    tcg_temp_free_i32(tcg_t2);
 }
 
 /* C3.5.6 Conditional select
-- 
2.4.3


* [Qemu-devel] [PATCH v2 06/11] target-arm: Implement fcsel with movcond
  2015-09-02 17:57 [Qemu-devel] [PATCH v2 00/11] target-arm improvements for aarch64 Richard Henderson
                   ` (4 preceding siblings ...)
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 05/11] target-arm: Implement ccmp branchless Richard Henderson
@ 2015-09-02 17:57 ` Richard Henderson
  2015-09-07 17:42   ` Peter Maydell
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 07/11] target-arm: Recognize SXTB, SXTH, SXTW, ASR Richard Henderson
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2015-09-02 17:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 48 ++++++++++++++++++++--------------------------
 1 file changed, 21 insertions(+), 27 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 48ecf23..a6e5ccd 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -4168,20 +4168,6 @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
     }
 }
 
-/* copy src FP register to dst FP register; type specifies single or double */
-static void gen_mov_fp2fp(DisasContext *s, int type, int dst, int src)
-{
-    if (type) {
-        TCGv_i64 v = read_fp_dreg(s, src);
-        write_fp_dreg(s, dst, v);
-        tcg_temp_free_i64(v);
-    } else {
-        TCGv_i32 v = read_fp_sreg(s, src);
-        write_fp_sreg(s, dst, v);
-        tcg_temp_free_i32(v);
-    }
-}
-
 /* C3.6.24 Floating point conditional select
  *   31  30  29 28       24 23  22  21 20  16 15  12 11 10 9    5 4    0
  * +---+---+---+-----------+------+---+------+------+-----+------+------+
@@ -4191,7 +4177,8 @@ static void gen_mov_fp2fp(DisasContext *s, int type, int dst, int src)
 static void disas_fp_csel(DisasContext *s, uint32_t insn)
 {
     unsigned int mos, type, rm, cond, rn, rd;
-    TCGLabel *label_continue = NULL;
+    TCGv_i64 t_true, t_false, t_zero;
+    DisasCompare64 c;
 
     mos = extract32(insn, 29, 3);
     type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
@@ -4209,21 +4196,28 @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
         return;
     }
 
-    if (cond < 0x0e) { /* not always */
-        TCGLabel *label_match = gen_new_label();
-        label_continue = gen_new_label();
-        arm_gen_test_cc(cond, label_match);
-        /* nomatch: */
-        gen_mov_fp2fp(s, type, rd, rm);
-        tcg_gen_br(label_continue);
-        gen_set_label(label_match);
+    if (type) {
+        t_true = read_fp_dreg(s, rn);
+        t_false = read_fp_dreg(s, rm);
+    } else {
+        /* Zero-extend sreg inputs to 64-bits now.  */
+        t_true = tcg_temp_new_i64();
+        t_false = tcg_temp_new_i64();
+        tcg_gen_ld32u_i64(t_true, cpu_env, fp_reg_offset(s, rn, MO_32));
+        tcg_gen_ld32u_i64(t_false, cpu_env, fp_reg_offset(s, rm, MO_32));
     }
 
-    gen_mov_fp2fp(s, type, rd, rn);
+    a64_test_cc(&c, cond);
+    t_zero = tcg_const_i64(0);
+    tcg_gen_movcond_i64(c.cond, t_true, c.value, t_zero, t_true, t_false);
+    tcg_temp_free_i64(t_zero);
+    tcg_temp_free_i64(t_false);
+    a64_free_cc(&c);
 
-    if (cond < 0x0e) { /* continue */
-        gen_set_label(label_continue);
-    }
+    /* Note that sregs write back zeros to the high bits,
+       and we've already done the zero-extension.  */
+    write_fp_dreg(s, rd, t_true);
+    tcg_temp_free_i64(t_true);
 }
 
 /* C3.6.25 Floating-point data-processing (1 source) - single precision */
-- 
2.4.3


* [Qemu-devel] [PATCH v2 07/11] target-arm: Recognize SXTB, SXTH, SXTW, ASR
  2015-09-02 17:57 [Qemu-devel] [PATCH v2 00/11] target-arm improvements for aarch64 Richard Henderson
                   ` (5 preceding siblings ...)
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 06/11] target-arm: Implement fcsel with movcond Richard Henderson
@ 2015-09-02 17:57 ` Richard Henderson
  2015-09-07 17:47   ` Peter Maydell
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 08/11] target-arm: Recognize UXTB, UXTH, LSR, LSL Richard Henderson
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2015-09-02 17:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Recognize these as aliases of SBFM, and special-case them.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
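Reader's sketch comparing a reference 64-bit SBFM with the shortcuts the
diff below takes.  The reference follows my reading of the architectural
description (imms >= immr extracts a field, otherwise the low imms+1 bits
are placed at bit 64-immr); all toy_* names are hypothetical.  The final
cast to int64_t relies on the usual two's-complement conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low n bits of v, for 1 <= n <= 64. */
static int64_t toy_sext(uint64_t v, unsigned n)
{
    uint64_t m;

    assert(n >= 1 && n <= 64);
    m = 1ULL << (n - 1);
    if (n < 64) {
        v &= (1ULL << n) - 1;
    }
    return (int64_t)((v ^ m) - m);
}

/* Reference SBFM for sf == 1; ri is immr, si is imms. */
static int64_t toy_sbfm64(uint64_t src, unsigned ri, unsigned si)
{
    assert(ri < 64 && si < 64);
    if (si >= ri) {
        /* Extract bits [si:ri] and sign-extend them. */
        return toy_sext(src >> ri, si - ri + 1);
    } else {
        /* Place the low si+1 bits at bit 64-ri and sign-extend. */
        unsigned pos = 64 - ri;
        unsigned len = si + 1;
        uint64_t field = src & ((1ULL << len) - 1);
        return toy_sext(field << pos, pos + len);
    }
}

/* Shortcut for ri == 0, si == 7. */
static int64_t toy_sxtb(uint64_t src)
{
    return toy_sext(src, 8);
}

/* Shortcut for si == 63: arithmetic shift right by ri. */
static int64_t toy_asr64(uint64_t src, unsigned shift)
{
    uint64_t fill;

    assert(shift < 64);
    fill = (src >> 63) ? ~(~0ULL >> shift) : 0;
    return (int64_t)((src >> shift) | fill);
}
```

The si == 31 ASR case in the diff is the same idea after first
sign-extending the low 32 bits of the source.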
 target-arm/translate-a64.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index a6e5ccd..74dd0f8 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -2999,7 +2999,28 @@ static void disas_bitfield(DisasContext *s, uint32_t insn)
     tcg_rd = cpu_reg(s, rd);
     tcg_tmp = read_cpu_reg(s, rn, sf);
 
-    /* OPTME: probably worth recognizing common cases of ext{8,16,32}{u,s} */
+    /* Recognize the common aliases.  */
+    if (opc == 0) { /* SBFM */
+        if (ri == 0) {
+            if (si == 7) { /* SXTB */
+                tcg_gen_ext8s_i64(tcg_rd, tcg_tmp);
+                goto done;
+            } else if (si == 15) { /* SXTH */
+                tcg_gen_ext16s_i64(tcg_rd, tcg_tmp);
+                goto done;
+            } else if (si == 31) { /* SXTW */
+                tcg_gen_ext32s_i64(tcg_rd, tcg_tmp);
+                goto done;
+            }
+        }
+        if (si == 63 || (si == 31 && ri <= si)) { /* ASR */
+            if (si == 31) {
+                tcg_gen_ext32s_i64(tcg_tmp, tcg_tmp);
+            }
+            tcg_gen_sari_i64(tcg_rd, tcg_tmp, ri);
+            goto done;
+        }
+    }
 
     if (opc != 1) { /* SBFM or UBFM */
         tcg_gen_movi_i64(tcg_rd, 0);
@@ -3024,6 +3045,7 @@ static void disas_bitfield(DisasContext *s, uint32_t insn)
         tcg_gen_sari_i64(tcg_rd, tcg_rd, 64 - (pos + len));
     }
 
+ done:
     if (!sf) { /* zero extend final result */
         tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
     }
-- 
2.4.3


* [Qemu-devel] [PATCH v2 08/11] target-arm: Recognize UXTB, UXTH, LSR, LSL
  2015-09-02 17:57 [Qemu-devel] [PATCH v2 00/11] target-arm improvements for aarch64 Richard Henderson
                   ` (6 preceding siblings ...)
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 07/11] target-arm: Recognize SXTB, SXTH, SXTW, ASR Richard Henderson
@ 2015-09-02 17:57 ` Richard Henderson
  2015-09-07 18:00   ` Peter Maydell
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 09/11] target-arm: Eliminate unnecessary zero-extend in disas_bitfield Richard Henderson
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2015-09-02 17:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
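Reader's sketch of the UBFM aliases handled in the diff below, checked
against a reference 64-bit UBFM.  The reference follows my reading of the
architectural description, and the mapping from an LSL shift amount to
(immr, imms) is the one implied by the "si + 1 == ri" test in the diff;
all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Reference UBFM for sf == 1; ri is immr, si is imms. */
static uint64_t toy_ubfm64(uint64_t src, unsigned ri, unsigned si)
{
    assert(ri < 64 && si < 64);
    if (si >= ri) {
        /* Extract bits [si:ri] down to bit 0. */
        unsigned len = si - ri + 1;
        uint64_t mask = (len == 64) ? ~0ULL : (1ULL << len) - 1;
        return (src >> ri) & mask;
    } else {
        /* Place the low si+1 bits at bit 64-ri. */
        unsigned len = si + 1;
        return (src & ((1ULL << len) - 1)) << (64 - ri);
    }
}

/* LSL #shift expressed through UBFM, for 1 <= shift <= 63. */
static uint64_t toy_lsl_via_ubfm(uint64_t src, unsigned shift)
{
    unsigned ri, si;

    assert(shift >= 1 && shift <= 63);
    ri = 64 - shift;
    si = 63 - shift;            /* so si + 1 == ri */
    return toy_ubfm64(src, ri, si);
}
```

With ri == 0 the reference reduces to an AND with the low si+1 bits, and
with si == 63 it reduces to a logical shift right by ri, which are the
other two shortcuts in the diff.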
 target-arm/translate-a64.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 74dd0f8..8c94edf 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -3020,6 +3020,23 @@ static void disas_bitfield(DisasContext *s, uint32_t insn)
             tcg_gen_sari_i64(tcg_rd, tcg_tmp, ri);
             goto done;
         }
+    } else if (opc == 2) { /* UBFM */
+        if (ri == 0) { /* UXTB, UXTH, plus non-canonical AND */
+            tcg_gen_andi_i64(tcg_rd, tcg_tmp, bitmask64(si + 1));
+            return;
+        }
+        if (si == 63 || (si == 31 && ri <= si)) { /* LSR */
+            if (si == 31) {
+                tcg_gen_ext32u_i64(tcg_tmp, tcg_tmp);
+            }
+            tcg_gen_shri_i64(tcg_rd, tcg_tmp, ri);
+            return;
+        }
+        if (si + 1 == ri && si != bitsize - 1) { /* LSL */
+            int shift = bitsize - 1 - si;
+            tcg_gen_shli_i64(tcg_rd, tcg_tmp, shift);
+            goto done;
+        }
     }
 
     if (opc != 1) { /* SBFM or UBFM */
-- 
2.4.3


* [Qemu-devel] [PATCH v2 09/11] target-arm: Eliminate unnecessary zero-extend in disas_bitfield
  2015-09-02 17:57 [Qemu-devel] [PATCH v2 00/11] target-arm improvements for aarch64 Richard Henderson
                   ` (7 preceding siblings ...)
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 08/11] target-arm: Recognize UXTB, UXTH, LSR, LSL Richard Henderson
@ 2015-09-02 17:57 ` Richard Henderson
  2015-09-07 18:02   ` Peter Maydell
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 10/11] target-arm: Recognize ROR Richard Henderson
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 11/11] target-arm: Use tcg_gen_extrh_i64_i32 Richard Henderson
  10 siblings, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2015-09-02 17:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

For !SF, this initial ext32u can't be optimized away by the
current TCG code generator.  (It would require backward bit
liveness propagation.)

But since the range of bits for !SF is already constrained by
unallocated_encoding, we'll never reference the high bits anyway.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 8c94edf..10f8825 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -2997,7 +2997,11 @@ static void disas_bitfield(DisasContext *s, uint32_t insn)
     }
 
     tcg_rd = cpu_reg(s, rd);
-    tcg_tmp = read_cpu_reg(s, rn, sf);
+
+    /* Suppress the zero-extend for !sf.  Since RI and SI are constrained
+       to be smaller than bitsize, we'll never reference data outside the
+       low 32-bits anyway.  */
+    tcg_tmp = read_cpu_reg(s, rn, 1);
 
     /* Recognize the common aliases.  */
     if (opc == 0) { /* SBFM */
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 10/11] target-arm: Recognize ROR
  2015-09-02 17:57 [Qemu-devel] [PATCH v2 00/11] target-arm improvements for aarch64 Richard Henderson
                   ` (8 preceding siblings ...)
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 09/11] target-arm: Eliminate unnecessary zero-extend in disas_bitfield Richard Henderson
@ 2015-09-02 17:57 ` Richard Henderson
  2015-09-07 18:06   ` Peter Maydell
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 11/11] target-arm: Use tcg_gen_extrh_i64_i32 Richard Henderson
  10 siblings, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2015-09-02 17:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 10f8825..815ec7d 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -3099,17 +3099,7 @@ static void disas_extract(DisasContext *s, uint32_t insn)
 
         tcg_rd = cpu_reg(s, rd);
 
-        if (imm) {
-            /* OPTME: we can special case rm==rn as a rotate */
-            tcg_rm = read_cpu_reg(s, rm, sf);
-            tcg_rn = read_cpu_reg(s, rn, sf);
-            tcg_gen_shri_i64(tcg_rm, tcg_rm, imm);
-            tcg_gen_shli_i64(tcg_rn, tcg_rn, bitsize - imm);
-            tcg_gen_or_i64(tcg_rd, tcg_rm, tcg_rn);
-            if (!sf) {
-                tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
-            }
-        } else {
+        if (unlikely(imm == 0)) {
             /* tcg shl_i32/shl_i64 is undefined for 32/64 bit shifts,
              * so an extract from bit 0 is a special case.
              */
@@ -3118,8 +3108,27 @@ static void disas_extract(DisasContext *s, uint32_t insn)
             } else {
                 tcg_gen_ext32u_i64(tcg_rd, cpu_reg(s, rm));
             }
+        } else if (rm == rn) { /* ROR */
+            tcg_rm = cpu_reg(s, rm);
+            if (sf) {
+                tcg_gen_rotri_i64(tcg_rd, tcg_rm, imm);
+            } else {
+                TCGv_i32 tmp = tcg_temp_new_i32();
+                tcg_gen_extrl_i64_i32(tmp, tcg_rm);
+                tcg_gen_rotri_i32(tmp, tmp, imm);
+                tcg_gen_extu_i32_i64(tcg_rd, tmp);
+                tcg_temp_free_i32(tmp);
+            }
+        } else {
+            tcg_rm = read_cpu_reg(s, rm, sf);
+            tcg_rn = read_cpu_reg(s, rn, sf);
+            tcg_gen_shri_i64(tcg_rm, tcg_rm, imm);
+            tcg_gen_shli_i64(tcg_rn, tcg_rn, bitsize - imm);
+            tcg_gen_or_i64(tcg_rd, tcg_rm, tcg_rn);
+            if (!sf) {
+                tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
+            }
         }
-
     }
 }
 
-- 
2.4.3
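
The equivalence this patch relies on, EXTR with rm == rn being a rotate right, can be illustrated with a small C model. It is a sketch only: `extr64`, `extr32` and `ror_bits` are hypothetical names, and the rotate is deliberately written bit by bit so that it does not share code with the EXTR model.

```c
#include <assert.h>
#include <stdint.h>

/* Model of 64-bit EXTR: bits [imm+63:imm] of the 128-bit value rn:rm. */
static uint64_t extr64(uint64_t rn, uint64_t rm, unsigned imm)
{
    if (imm == 0) {
        return rm;  /* a shift by 64 would be undefined in C as well */
    }
    return (rm >> imm) | (rn << (64 - imm));
}

/* 32-bit form: operate on the low halves, result zero-extended. */
static uint64_t extr32(uint64_t rn, uint64_t rm, unsigned imm)
{
    uint32_t n = (uint32_t)rn, m = (uint32_t)rm;

    if (imm == 0) {
        return m;
    }
    return (uint32_t)((m >> imm) | (n << (32 - imm)));
}

/* Rotate right by moving bit i to position (i - imm) mod width;
 * requires imm < width, and only the low `width` bits of x are used. */
static uint64_t ror_bits(uint64_t x, unsigned imm, unsigned width)
{
    uint64_t r = 0;
    unsigned i;

    for (i = 0; i < width; i++) {
        if ((x >> i) & 1) {
            r |= UINT64_C(1) << ((i + width - imm) % width);
        }
    }
    return r;
}
```

For the 32-bit form the rotate has to be done at 32-bit width, which matches the patch narrowing to a TCGv_i32 before rotating and zero-extending afterwards.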

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [Qemu-devel] [PATCH v2 11/11] target-arm: Use tcg_gen_extrh_i64_i32
  2015-09-02 17:57 [Qemu-devel] [PATCH v2 00/11] target-arm improvements for aarch64 Richard Henderson
                   ` (9 preceding siblings ...)
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 10/11] target-arm: Recognize ROR Richard Henderson
@ 2015-09-02 17:57 ` Richard Henderson
  2015-09-07 18:11   ` Peter Maydell
  10 siblings, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2015-09-02 17:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

Usually, eliminate an operation from the translator by combining
a shift with an extract.

In the case of gen_set_NZ64, we don't need a boolean value for cpu_ZF,
merely a non-zero value.  Given that we can extract both halves of a
64-bit input in one call, this simplifies the code.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 34 +++++++++-------------------------
 1 file changed, 9 insertions(+), 25 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 815ec7d..1c10448 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -530,13 +530,8 @@ static TCGv_ptr get_fpstatus_ptr(void)
  */
 static inline void gen_set_NZ64(TCGv_i64 result)
 {
-    TCGv_i64 flag = tcg_temp_new_i64();
-
-    tcg_gen_setcondi_i64(TCG_COND_NE, flag, result, 0);
-    tcg_gen_extrl_i64_i32(cpu_ZF, flag);
-    tcg_gen_shri_i64(flag, result, 32);
-    tcg_gen_extrl_i64_i32(cpu_NF, flag);
-    tcg_temp_free_i64(flag);
+    tcg_gen_extr_i64_i32(cpu_ZF, cpu_NF, result);
+    tcg_gen_or_i32(cpu_ZF, cpu_ZF, cpu_NF);
 }
 
 /* Set NZCV as for a logical operation: NZ as per result, CV cleared. */
@@ -546,7 +541,7 @@ static inline void gen_logic_CC(int sf, TCGv_i64 result)
         gen_set_NZ64(result);
     } else {
         tcg_gen_extrl_i64_i32(cpu_ZF, result);
-        tcg_gen_extrl_i64_i32(cpu_NF, result);
+        tcg_gen_mov_i32(cpu_NF, cpu_ZF);
     }
     tcg_gen_movi_i32(cpu_CF, 0);
     tcg_gen_movi_i32(cpu_VF, 0);
@@ -572,8 +567,7 @@ static void gen_add_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
         tcg_gen_xor_i64(tmp, t0, t1);
         tcg_gen_andc_i64(flag, flag, tmp);
         tcg_temp_free_i64(tmp);
-        tcg_gen_shri_i64(flag, flag, 32);
-        tcg_gen_extrl_i64_i32(cpu_VF, flag);
+        tcg_gen_extrh_i64_i32(cpu_VF, flag);
 
         tcg_gen_mov_i64(dest, result);
         tcg_temp_free_i64(result);
@@ -621,8 +615,7 @@ static void gen_sub_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
         tcg_gen_xor_i64(tmp, t0, t1);
         tcg_gen_and_i64(flag, flag, tmp);
         tcg_temp_free_i64(tmp);
-        tcg_gen_shri_i64(flag, flag, 32);
-        tcg_gen_extrl_i64_i32(cpu_VF, flag);
+        tcg_gen_extrh_i64_i32(cpu_VF, flag);
         tcg_gen_mov_i64(dest, result);
         tcg_temp_free_i64(flag);
         tcg_temp_free_i64(result);
@@ -681,8 +674,7 @@ static void gen_adc_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
         tcg_gen_xor_i64(vf_64, result, t0);
         tcg_gen_xor_i64(tmp, t0, t1);
         tcg_gen_andc_i64(vf_64, vf_64, tmp);
-        tcg_gen_shri_i64(vf_64, vf_64, 32);
-        tcg_gen_extrl_i64_i32(cpu_VF, vf_64);
+        tcg_gen_extrh_i64_i32(cpu_VF, vf_64);
 
         tcg_gen_mov_i64(dest, result);
 
@@ -7743,10 +7735,8 @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
             } else {
                 TCGv_i32 tcg_lo = tcg_temp_new_i32();
                 TCGv_i32 tcg_hi = tcg_temp_new_i32();
-                tcg_gen_extrl_i64_i32(tcg_lo, tcg_op);
+                tcg_gen_extr_i64_i32(tcg_lo, tcg_hi, tcg_op);
                 gen_helper_vfp_fcvt_f32_to_f16(tcg_lo, tcg_lo, cpu_env);
-                tcg_gen_shri_i64(tcg_op, tcg_op, 32);
-                tcg_gen_extrl_i64_i32(tcg_hi, tcg_op);
                 gen_helper_vfp_fcvt_f32_to_f16(tcg_hi, tcg_hi, cpu_env);
                 tcg_gen_deposit_i32(tcg_res[pass], tcg_lo, tcg_hi, 16, 16);
                 tcg_temp_free_i32(tcg_lo);
@@ -8652,16 +8642,10 @@ static void handle_3rd_wide(DisasContext *s, int is_q, int is_u, int size,
     }
 }
 
-static void do_narrow_high_u32(TCGv_i32 res, TCGv_i64 in)
-{
-    tcg_gen_shri_i64(in, in, 32);
-    tcg_gen_extrl_i64_i32(res, in);
-}
-
 static void do_narrow_round_high_u32(TCGv_i32 res, TCGv_i64 in)
 {
     tcg_gen_addi_i64(in, in, 1U << 31);
-    do_narrow_high_u32(res, in);
+    tcg_gen_extrh_i64_i32(res, in);
 }
 
 static void handle_3rd_narrowing(DisasContext *s, int is_q, int is_u, int size,
@@ -8680,7 +8664,7 @@ static void handle_3rd_narrowing(DisasContext *s, int is_q, int is_u, int size,
               gen_helper_neon_narrow_round_high_u8 },
             { gen_helper_neon_narrow_high_u16,
               gen_helper_neon_narrow_round_high_u16 },
-            { do_narrow_high_u32, do_narrow_round_high_u32 },
+            { tcg_gen_extrh_i64_i32, do_narrow_round_high_u32 },
         };
         NeonGenNarrowFn *gennarrow = narrowfns[size][is_u];
 
-- 
2.4.3
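
The gen_set_NZ64 change described in the commit message can be modelled in plain C. This is a sketch that assumes the flag conventions visible elsewhere in the thread: N is the sign bit of cpu_NF, and Z is set when cpu_ZF is zero. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the cpu_NF / cpu_ZF globals. */
struct nz_flags {
    uint32_t nf;  /* N is bit 31 */
    uint32_t zf;  /* Z is set iff this is zero */
};

/* Model of the new gen_set_NZ64: split the result, then OR the halves. */
static struct nz_flags set_nz64(uint64_t result)
{
    struct nz_flags f;
    uint32_t lo = (uint32_t)result;
    uint32_t hi = (uint32_t)(result >> 32);

    f.nf = hi;       /* sign bit of the 64-bit result is bit 31 of hi */
    f.zf = lo | hi;  /* zero iff the whole 64-bit result is zero */
    return f;
}

static int flag_n(struct nz_flags f) { return (f.nf >> 31) & 1; }
static int flag_z(struct nz_flags f) { return f.zf == 0; }
```

Because zf only needs to be non-zero rather than exactly 1, the OR of the two halves is enough and no setcond is required.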

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 01/11] target-arm: Share all common TCG temporaries
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 01/11] target-arm: Share all common TCG temporaries Richard Henderson
@ 2015-09-07 16:57   ` Peter Maydell
  2015-09-08  5:13     ` Richard Henderson
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Maydell @ 2015-09-07 16:57 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
> This is a bug fix for aarch64.  At present, we have branches using
> the 32-bit (translate.c) versions of cpu_[NZCV]F, but we set the flags
> using the 64-bit (translate-a64.c) versions of cpu_[NZCV]F.  From
> the view of the TCG code generator, these are unrelated variables.
>
> The bug is hard to see because we currently only read these variables
> from branches, and upon reaching a branch TCG will first spill live
> variables and then reload the arguments of the branch.  Since the
> 32-bit versions were never live until reaching the branch, we'd re-read
> the data that had just been spilled from the 64-bit versions.
>
> There is currently no such problem with the cpu_exclusive_* variables,
> but there's no point in tempting fate.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

Should this be cc:qemu-stable@nongnu.org ?

thanks
-- PMM

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 02/11] target-arm: Introduce DisasCompare
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 02/11] target-arm: Introduce DisasCompare Richard Henderson
@ 2015-09-07 17:09   ` Peter Maydell
  2015-09-08  5:09     ` Richard Henderson
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Maydell @ 2015-09-07 17:09 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
> Split arm_gen_test_cc into 3 functions, so that it can be reused
> for non-branch TCG comparisons.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-arm/translate.c | 110 ++++++++++++++++++++++++++++---------------------
>  target-arm/translate.h |   9 ++++
>  2 files changed, 73 insertions(+), 46 deletions(-)
>
> diff --git a/target-arm/translate.c b/target-arm/translate.c
> index 3826a02..1f43777 100644
> --- a/target-arm/translate.c
> +++ b/target-arm/translate.c
> @@ -738,81 +738,99 @@ static void gen_thumb2_parallel_addsub(int op1, int op2, TCGv_i32 a, TCGv_i32 b)
>  #undef PAS_OP
>
>  /*
> - * generate a conditional branch based on ARM condition code cc.
> + * Generate a conditional based on ARM condition code cc.
>   * This is common between ARM and Aarch64 targets.
>   */
> -void arm_gen_test_cc(int cc, TCGLabel *label)
> +void arm_test_cc(DisasCompare *cmp, int cc)
>  {
> -    TCGv_i32 tmp;
> -    TCGLabel *inv;
> +    TCGv_i32 value;
> +    TCGCond cond;
> +    bool global = true;
>
>      switch (cc) {
>      case 0: /* eq: Z */
> -        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_ZF, 0, label);
> -        break;
>      case 1: /* ne: !Z */
> -        tcg_gen_brcondi_i32(TCG_COND_NE, cpu_ZF, 0, label);
> +        cond = TCG_COND_EQ;
> +        value = cpu_ZF;
>          break;
> +
>      case 2: /* cs: C */
> -        tcg_gen_brcondi_i32(TCG_COND_NE, cpu_CF, 0, label);
> -        break;
>      case 3: /* cc: !C */
> -        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_CF, 0, label);
> +        cond = TCG_COND_NE;
> +        value = cpu_CF;
>          break;
> +
>      case 4: /* mi: N */
> -        tcg_gen_brcondi_i32(TCG_COND_LT, cpu_NF, 0, label);
> -        break;
>      case 5: /* pl: !N */
> -        tcg_gen_brcondi_i32(TCG_COND_GE, cpu_NF, 0, label);
> +        cond = TCG_COND_LT;
> +        value = cpu_NF;
>          break;
> +
>      case 6: /* vs: V */
> -        tcg_gen_brcondi_i32(TCG_COND_LT, cpu_VF, 0, label);
> -        break;
>      case 7: /* vc: !V */
> -        tcg_gen_brcondi_i32(TCG_COND_GE, cpu_VF, 0, label);
> +        cond = TCG_COND_LT;
> +        value = cpu_VF;
>          break;
> +
>      case 8: /* hi: C && !Z */
> -        inv = gen_new_label();
> -        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_CF, 0, inv);
> -        tcg_gen_brcondi_i32(TCG_COND_NE, cpu_ZF, 0, label);
> -        gen_set_label(inv);
> -        break;
> -    case 9: /* ls: !C || Z */
> -        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_CF, 0, label);
> -        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_ZF, 0, label);
> +    case 9: /* ls: !C || Z -> !(C && !Z) */
> +        cond = TCG_COND_NE;
> +        value = tcg_temp_new_i32();
> +        global = false;
> +        tcg_gen_neg_i32(value, cpu_CF);
> +        tcg_gen_and_i32(value, value, cpu_ZF);
>          break;

The comment says hi is C && !Z, but the code
doesn't seem to line up with that. At least part
of that is presumably because we store ZF inverted,
but why are we negating CF here?
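
One possible reading of the negation, sketched in C. The assumptions here are not stated in the patch: cpu_CF is taken to hold exactly 0 or 1, and cpu_ZF is taken to be any non-zero value when Z is clear. Under those assumptions, negating CF turns it into an all-zeros or all-ones mask, so the AND keeps ZF intact when C is set. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* hi: C && !Z, assuming cf is 0 or 1 and zf is zero iff Z is set. */
static int cond_hi(uint32_t cf, uint32_t zf)
{
    uint32_t value = (0u - cf) & zf;  /* -cf is 0 or 0xffffffff */

    return value != 0;                /* TCG_COND_NE against 0 */
}

/* Without the negation, a zf value with bit 0 clear is lost. */
static int naive_hi(uint32_t cf, uint32_t zf)
{
    return (cf & zf) != 0;
}
```

With cf = 1 and zf = 2 the naive form reports false even though C is set and Z is clear, which is the case the negation guards against.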

> +
>      case 10: /* ge: N == V -> N ^ V == 0 */
> -        tmp = tcg_temp_new_i32();
> -        tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
> -        tcg_gen_brcondi_i32(TCG_COND_GE, tmp, 0, label);
> -        tcg_temp_free_i32(tmp);
> -        break;
>      case 11: /* lt: N != V -> N ^ V != 0 */
> -        tmp = tcg_temp_new_i32();
> -        tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
> -        tcg_gen_brcondi_i32(TCG_COND_LT, tmp, 0, label);
> -        tcg_temp_free_i32(tmp);
> +        cond = TCG_COND_GE;
> +        value = tcg_temp_new_i32();
> +        global = false;
> +        tcg_gen_xor_i32(value, cpu_VF, cpu_NF);
>          break;
> +
>      case 12: /* gt: !Z && N == V */
> -        inv = gen_new_label();
> -        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_ZF, 0, inv);
> -        tmp = tcg_temp_new_i32();
> -        tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
> -        tcg_gen_brcondi_i32(TCG_COND_GE, tmp, 0, label);
> -        tcg_temp_free_i32(tmp);
> -        gen_set_label(inv);
> -        break;
>      case 13: /* le: Z || N != V */
> -        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_ZF, 0, label);
> -        tmp = tcg_temp_new_i32();
> -        tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
> -        tcg_gen_brcondi_i32(TCG_COND_LT, tmp, 0, label);
> -        tcg_temp_free_i32(tmp);
> +        cond = TCG_COND_NE;
> +        value = tcg_temp_new_i32();
> +        global = false;
> +        tcg_gen_xor_i32(value, cpu_VF, cpu_NF);
> +        tcg_gen_sari_i32(value, value, 31);
> +        tcg_gen_andc_i32(value, cpu_ZF, value);

I think this is correct, but it could use some commentary
to explain what it's doing.
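
As a sketch of the kind of commentary that might help, here is the gt/le computation modelled step by step in C. It assumes N and V live in bit 31 of cpu_NF and cpu_VF, and that cpu_ZF is zero iff Z is set; `cond_gt` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* gt: !Z && N == V, following the xor / sari / andc sequence. */
static int cond_gt(uint32_t nf, uint32_t vf, uint32_t zf)
{
    uint32_t value = vf ^ nf;    /* bit 31 set iff N != V */

    value = 0u - (value >> 31);  /* portable stand-in for sari by 31:
                                    all ones iff N != V, else zero */
    value = zf & ~value;         /* andc: keep zf only when N == V */
    return value != 0;           /* TCG_COND_NE against 0 */
}
```

The le condition is the inverse of this test, so the same value can serve both cases.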

Otherwise looks good.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 03/11] target-arm: Handle always condition codes within arm_test_cc
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 03/11] target-arm: Handle always condition codes within arm_test_cc Richard Henderson
@ 2015-09-07 17:11   ` Peter Maydell
  0 siblings, 0 replies; 31+ messages in thread
From: Peter Maydell @ 2015-09-07 17:11 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
> Handling this with TCG_COND_ALWAYS will allow these unlikely
> cases to be handled without special cases in the rest of the
> translator.  The TCG optimizer ought to be able to reduce
> these ALWAYS conditions completely.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-arm/translate.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/target-arm/translate.c b/target-arm/translate.c
> index 1f43777..e2bccef 100644
> --- a/target-arm/translate.c
> +++ b/target-arm/translate.c
> @@ -799,6 +799,14 @@ void arm_test_cc(DisasCompare *cmp, int cc)
>          tcg_gen_andc_i32(value, cpu_ZF, value);
>          break;
>
> +    case 14: /* always */
> +    case 15: /* always */
> +        /* Use the ALWAYS condition, which will fold early.
> +           It doesn't matter what we use for the value.  */
> +        cond = TCG_COND_ALWAYS;
> +        value = cpu_ZF;
> +        goto no_invert;
> +

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

The usual multiline comment style in target-arm is
 /* line one
  * line two
  */

by the way.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 04/11] target-arm: Use setcond and movcond for csel
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 04/11] target-arm: Use setcond and movcond for csel Richard Henderson
@ 2015-09-07 17:17   ` Peter Maydell
  2015-09-08  5:12     ` Richard Henderson
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Maydell @ 2015-09-07 17:17 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-arm/translate-a64.c | 87 +++++++++++++++++++++++++++-------------------
>  1 file changed, 51 insertions(+), 36 deletions(-)
>
> diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
> index 1587ab5..dcac490 100644
> --- a/target-arm/translate-a64.c
> +++ b/target-arm/translate-a64.c
> @@ -166,6 +166,33 @@ void gen_a64_set_pc_im(uint64_t val)
>      tcg_gen_movi_i64(cpu_pc, val);
>  }
>
> +typedef struct DisasCompare64 {
> +    TCGCond cond;
> +    TCGv_i64 value;
> +} DisasCompare64;
> +
> +static void a64_test_cc(DisasCompare64 *c64, int cc)
> +{
> +    DisasCompare c32;
> +
> +    arm_test_cc(&c32, cc);
> +
> +    c64->value = tcg_temp_new_i64();
> +    c64->cond = c32.cond;
> +    if (c32.cond == TCG_COND_EQ || c32.cond == TCG_COND_NE) {
> +        tcg_gen_extu_i32_i64(c64->value, c32.value);
> +    } else {
> +        tcg_gen_ext_i32_i64(c64->value, c32.value);
> +    }

Signed extend would work for EQ and NE as well, wouldn't it?
Why prefer the unsigned extension in those cases? If there's
a reason, it could do with being commented.
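
The point about the two extensions can be checked with a small C model. It is only an illustration of the arithmetic, with hypothetical function names, and says nothing about which extension produces better host code.

```c
#include <assert.h>
#include <stdint.h>

/* Zero test after zero-extension and after sign-extension. */
static int is_zero_zext(uint32_t v) { return (uint64_t)v == 0; }
static int is_zero_sext(uint32_t v) { return (int64_t)(int32_t)v == 0; }

/* Sign test: only sign-extension preserves bit 31 as the 64-bit sign.
 * (The uint32_t to int32_t conversion assumes two's complement.) */
static int is_neg_sext(uint32_t v) { return (int64_t)(int32_t)v < 0; }
static int is_neg_zext(uint32_t v) { return (int64_t)(uint64_t)v < 0; }
```

Both extensions agree for EQ and NE against zero, while LT and GE need the signed one.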

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 05/11] target-arm: Implement ccmp branchless
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 05/11] target-arm: Implement ccmp branchless Richard Henderson
@ 2015-09-07 17:31   ` Peter Maydell
  2015-09-08  5:18     ` Richard Henderson
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Maydell @ 2015-09-07 17:31 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
> This can allow much of a ccmp to be elided when particular
> flags are subsequently dead.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-arm/translate-a64.c | 65 +++++++++++++++++++++++++++++++---------------
>  1 file changed, 44 insertions(+), 21 deletions(-)
>
> diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
> index dcac490..48ecf23 100644
> --- a/target-arm/translate-a64.c
> +++ b/target-arm/translate-a64.c
> @@ -3552,8 +3552,9 @@ static void disas_adc_sbc(DisasContext *s, uint32_t insn)
>  static void disas_cc(DisasContext *s, uint32_t insn)
>  {
>      unsigned int sf, op, y, cond, rn, nzcv, is_imm;
> -    TCGLabel *label_continue = NULL;
> -    TCGv_i64 tcg_tmp, tcg_y, tcg_rn;
> +    TCGv_i32 tcg_t0, tcg_t1, tcg_t2;
> +    TCGv_i64 tcg_res, tcg_y, tcg_rn;
> +    DisasCompare c;
>
>      if (!extract32(insn, 29, 1)) {
>          unallocated_encoding(s);
> @@ -3571,19 +3572,13 @@ static void disas_cc(DisasContext *s, uint32_t insn)
>      rn = extract32(insn, 5, 5);
>      nzcv = extract32(insn, 0, 4);
>
> -    if (cond < 0x0e) { /* not always */
> -        TCGLabel *label_match = gen_new_label();
> -        label_continue = gen_new_label();
> -        arm_gen_test_cc(cond, label_match);
> -        /* nomatch: */
> -        tcg_tmp = tcg_temp_new_i64();
> -        tcg_gen_movi_i64(tcg_tmp, nzcv << 28);
> -        gen_set_nzcv(tcg_tmp);
> -        tcg_temp_free_i64(tcg_tmp);
> -        tcg_gen_br(label_continue);
> -        gen_set_label(label_match);
> -    }
> -    /* match, or condition is always */
> +    /* Set T0 = !COND.  */
> +    tcg_t0 = tcg_temp_new_i32();
> +    arm_test_cc(&c, cond);
> +    tcg_gen_setcondi_i32(tcg_invert_cond(c.cond), tcg_t0, c.value, 0);
> +    arm_free_cc(&c);
> +
> +    /* Load the arguments for the new comparison.  */
>      if (is_imm) {
>          tcg_y = new_tmp_a64(s);
>          tcg_gen_movi_i64(tcg_y, y);
> @@ -3592,17 +3587,45 @@ static void disas_cc(DisasContext *s, uint32_t insn)
>      }
>      tcg_rn = cpu_reg(s, rn);
>
> -    tcg_tmp = tcg_temp_new_i64();
> +    /* Set the flags for the new comparison.  */
> +    tcg_res = tcg_temp_new_i64();
>      if (op) {
> -        gen_sub_CC(sf, tcg_tmp, tcg_rn, tcg_y);
> +        gen_sub_CC(sf, tcg_res, tcg_rn, tcg_y);
>      } else {
> -        gen_add_CC(sf, tcg_tmp, tcg_rn, tcg_y);
> +        gen_add_CC(sf, tcg_res, tcg_rn, tcg_y);
>      }
> -    tcg_temp_free_i64(tcg_tmp);
> +    tcg_temp_free_i64(tcg_res);

Seems a bit unnecessary to bother changing the name of
this TCG temporary.

>
> -    if (cond < 0x0e) { /* continue */
> -        gen_set_label(label_continue);
> +    /* If COND was false, force the flags to #nzcv.
> +       Note that T1 = (COND ? 0 : -1), T2 = (COND ? -1 : 0).  */
> +    tcg_t1 = tcg_temp_new_i32();
> +    tcg_t2 = tcg_temp_new_i32();
> +    tcg_gen_neg_i32(tcg_t1, tcg_t0);
> +    tcg_gen_subi_i32(tcg_t2, tcg_t0, 1);

t2 is ~t1, right? Do we get better/worse code if we use
tcg_gen_andc_i32(..., tcg_t1) rather than creating t2 and
using gen_and_i32 ?
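
The identity behind this question can be sketched in C. Since ~(-x) == x - 1 in two's complement, t2 is indeed ~t1 for any t0, so an AND with t2 and an andc with t1 compute the same value. The function names are hypothetical, and the sketch does not address which form generates better code.

```c
#include <assert.h>
#include <stdint.h>

/* Form used in the patch: t2 = t0 - 1, then AND. */
static uint32_t clear_with_t2(uint32_t flag, uint32_t t0)
{
    uint32_t t2 = t0 - 1;

    return flag & t2;
}

/* Suggested alternative: t1 = -t0, then AND with the complement. */
static uint32_t clear_with_andc(uint32_t flag, uint32_t t0)
{
    uint32_t t1 = 0u - t0;

    return flag & ~t1;
}
```

With t0 = 1 (condition false) both forms clear the flag, and with t0 = 0 both leave it unchanged.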

> +
> +    if (nzcv & 8) { /* N */
> +        tcg_gen_or_i32(cpu_NF, cpu_NF, tcg_t1);
> +    } else {
> +        tcg_gen_and_i32(cpu_NF, cpu_NF, tcg_t2);
> +    }
> +    if (nzcv & 4) { /* Z */
> +        tcg_gen_and_i32(cpu_ZF, cpu_ZF, tcg_t2);
> +    } else {
> +        tcg_gen_or_i32(cpu_ZF, cpu_ZF, tcg_t0);
> +    }
> +    if (nzcv & 2) { /* C */
> +        tcg_gen_or_i32(cpu_CF, cpu_CF, tcg_t0);
> +    } else {
> +        tcg_gen_and_i32(cpu_CF, cpu_CF, tcg_t2);
> +    }
> +    if (nzcv & 1) { /* V */
> +        tcg_gen_or_i32(cpu_VF, cpu_VF, tcg_t1);
> +    } else {
> +        tcg_gen_and_i32(cpu_VF, cpu_VF, tcg_t2);
>      }
> +    tcg_temp_free_i32(tcg_t0);
> +    tcg_temp_free_i32(tcg_t1);
> +    tcg_temp_free_i32(tcg_t2);
>  }

Otherwise looks OK.

-- PMM

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 06/11] target-arm: Implement fcsel with movcond
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 06/11] target-arm: Implement fcsel with movcond Richard Henderson
@ 2015-09-07 17:42   ` Peter Maydell
  2015-09-08 15:21     ` Richard Henderson
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Maydell @ 2015-09-07 17:42 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-arm/translate-a64.c | 48 ++++++++++++++++++++--------------------------
>  1 file changed, 21 insertions(+), 27 deletions(-)
>
> diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
> index 48ecf23..a6e5ccd 100644
> --- a/target-arm/translate-a64.c
> +++ b/target-arm/translate-a64.c
> @@ -4168,20 +4168,6 @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
>      }
>  }
>
> -/* copy src FP register to dst FP register; type specifies single or double */
> -static void gen_mov_fp2fp(DisasContext *s, int type, int dst, int src)
> -{
> -    if (type) {
> -        TCGv_i64 v = read_fp_dreg(s, src);
> -        write_fp_dreg(s, dst, v);
> -        tcg_temp_free_i64(v);
> -    } else {
> -        TCGv_i32 v = read_fp_sreg(s, src);
> -        write_fp_sreg(s, dst, v);
> -        tcg_temp_free_i32(v);
> -    }
> -}
> -
>  /* C3.6.24 Floating point conditional select
>   *   31  30  29 28       24 23  22  21 20  16 15  12 11 10 9    5 4    0
>   * +---+---+---+-----------+------+---+------+------+-----+------+------+
> @@ -4191,7 +4177,8 @@ static void gen_mov_fp2fp(DisasContext *s, int type, int dst, int src)
>  static void disas_fp_csel(DisasContext *s, uint32_t insn)
>  {
>      unsigned int mos, type, rm, cond, rn, rd;
> -    TCGLabel *label_continue = NULL;
> +    TCGv_i64 t_true, t_false, t_zero;
> +    DisasCompare64 c;
>
>      mos = extract32(insn, 29, 3);
>      type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
> @@ -4209,21 +4196,28 @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
>          return;
>      }
>
> -    if (cond < 0x0e) { /* not always */
> -        TCGLabel *label_match = gen_new_label();
> -        label_continue = gen_new_label();
> -        arm_gen_test_cc(cond, label_match);
> -        /* nomatch: */
> -        gen_mov_fp2fp(s, type, rd, rm);
> -        tcg_gen_br(label_continue);
> -        gen_set_label(label_match);
> +    if (type) {
> +        t_true = read_fp_dreg(s, rn);
> +        t_false = read_fp_dreg(s, rm);
> +    } else {
> +        /* Zero-extend sreg inputs to 64-bits now.  */
> +        t_true = tcg_temp_new_i64();
> +        t_false = tcg_temp_new_i64();
> +        tcg_gen_ld32u_i64(t_true, cpu_env, fp_reg_offset(s, rn, MO_32));
> +        tcg_gen_ld32u_i64(t_false, cpu_env, fp_reg_offset(s, rm, MO_32));

You could write these as
    read_vec_element(s, t_true, rn, 0, MO_32);
    read_vec_element(s, t_false, rm, 0, MO_32);

(ie "read the 0th element of size MO_32 from this vector register").
I'm on the fence about whether that's actually any clearer, though.
I suppose it does let you do

   read_vec_element(s, t_true, rn, 0, type ? MO_64 : MO_32);
&c
and avoid the if (type)...

So you could change it, or leave it as-is, whichever you prefer.

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 07/11] target-arm: Recognize SXTB, SXTH, SXTW, ASR
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 07/11] target-arm: Recognize SXTB, SXTH, SXTW, ASR Richard Henderson
@ 2015-09-07 17:47   ` Peter Maydell
  0 siblings, 0 replies; 31+ messages in thread
From: Peter Maydell @ 2015-09-07 17:47 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
> ... as aliases of SBFM, and special-case them.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

(Not a fan of commit messages that start a sentence in the Subject
and continue it in the body.)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 08/11] target-arm: Recognize UXTB, UXTH, LSR, LSL
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 08/11] target-arm: Recognize UXTB, UXTH, LSR, LSL Richard Henderson
@ 2015-09-07 18:00   ` Peter Maydell
  0 siblings, 0 replies; 31+ messages in thread
From: Peter Maydell @ 2015-09-07 18:00 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-arm/translate-a64.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
> index 74dd0f8..8c94edf 100644
> --- a/target-arm/translate-a64.c
> +++ b/target-arm/translate-a64.c
> @@ -3020,6 +3020,23 @@ static void disas_bitfield(DisasContext *s, uint32_t insn)
>              tcg_gen_sari_i64(tcg_rd, tcg_tmp, ri);
>              goto done;
>          }
> +    } else if (opc == 2) { /* UBFM */
> +        if (ri == 0) { /* UXTB, UXTH, plus non-canonical AND */
> +            tcg_gen_andi_i64(tcg_rd, tcg_tmp, bitmask64(si + 1));
> +            return;
> +        }
> +        if (si == 63 || (si == 31 && ri <= si)) { /* LSR */
> +            if (si == 31) {
> +                tcg_gen_ext32u_i64(tcg_tmp, tcg_tmp);
> +            }
> +            tcg_gen_shri_i64(tcg_rd, tcg_tmp, ri);
> +            return;
> +        }
> +        if (si + 1 == ri && si != bitsize - 1) { /* LSL */
> +            int shift = bitsize - 1 - si;
> +            tcg_gen_shli_i64(tcg_rd, tcg_tmp, shift);
> +            goto done;
> +        }
>      }
>
>      if (opc != 1) { /* SBFM or UBFM */

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Qemu-devel] [PATCH v2 09/11] target-arm: Eliminate unnecessary zero-extend in disas_bitfield
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 09/11] target-arm: Eliminate unnecessary zero-extend in disas_bitfield Richard Henderson
@ 2015-09-07 18:02   ` Peter Maydell
  0 siblings, 0 replies; 31+ messages in thread
From: Peter Maydell @ 2015-09-07 18:02 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
> For !SF, this initial ext32u can't be optimized away by the
> current TCG code generator.  (It would require backward bit
> liveness propagation.)
>
> But since the range of bits for !SF is already constrained by
> unallocated_encoding, we'll never reference the high bits anyway.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-arm/translate-a64.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
> index 8c94edf..10f8825 100644
> --- a/target-arm/translate-a64.c
> +++ b/target-arm/translate-a64.c
> @@ -2997,7 +2997,11 @@ static void disas_bitfield(DisasContext *s, uint32_t insn)
>      }
>
>      tcg_rd = cpu_reg(s, rd);
> -    tcg_tmp = read_cpu_reg(s, rn, sf);
> +
> +    /* Suppress the zero-extend for !sf.  Since RI and SI are constrained
> +       to be smaller than bitsize, we'll never reference data outside the
> +       low 32-bits anyway.  */
> +    tcg_tmp = read_cpu_reg(s, rn, 1);
>
>      /* Recognize the common aliases.  */
>      if (opc == 0) { /* SBFM */
> -

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
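
As an illustration of the argument in the commit message, here is a minimal sketch in plain C (not QEMU code; `extract_field` and `high_bits_ignored` are hypothetical names). It covers only the extraction form with `ri <= si < 32`, and shows that the result does not depend on whether the source was zero-extended first.

```c
#include <stdint.h>

/* Extract bits [si:ri] of src; the caller guarantees ri <= si < 32. */
static uint32_t extract_field(uint64_t src, unsigned ri, unsigned si)
{
    unsigned len = si - ri + 1;                 /* 1..32 */
    uint64_t mask = ((uint64_t)1 << len) - 1;
    return (uint32_t)((src >> ri) & mask);
}

/* Returns 1 if garbage in bits 63..32 never changes the extracted field. */
static int high_bits_ignored(uint64_t reg)
{
    uint64_t zext = reg & 0xffffffffu;          /* what ext32u would give */
    for (unsigned si = 0; si < 32; si++) {
        for (unsigned ri = 0; ri <= si; ri++) {
            if (extract_field(reg, ri, si) != extract_field(zext, ri, si)) {
                return 0;
            }
        }
    }
    return 1;
}
```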

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v2 10/11] target-arm: Recognize ROR
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 10/11] target-arm: Recognize ROR Richard Henderson
@ 2015-09-07 18:06   ` Peter Maydell
  0 siblings, 0 replies; 31+ messages in thread
From: Peter Maydell @ 2015-09-07 18:06 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-arm/translate-a64.c | 33 +++++++++++++++++++++------------
>  1 file changed, 21 insertions(+), 12 deletions(-)
>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v2 11/11] target-arm: Use tcg_gen_extrh_i64_i32
  2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 11/11] target-arm: Use tcg_gen_extrh_i64_i32 Richard Henderson
@ 2015-09-07 18:11   ` Peter Maydell
  0 siblings, 0 replies; 31+ messages in thread
From: Peter Maydell @ 2015-09-07 18:11 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
> Usually, eliminate an operation from the translator by combining
> a shift with an extract.
>
> In the case of gen_set_NZ64, we don't need a boolean value for cpu_ZF,
> merely a non-zero value.  Given that we can extract both halves of a
> 64-bit input in one call, this simplifies the code.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
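
The gen_set_NZ64 remark can be modelled in plain C. This is a sketch under assumptions, not the patch itself: the struct and function names are hypothetical, N is taken to live in bit 31 of `nf`, and Z is taken to be set when `zf` is zero (the inverted storage mentioned elsewhere in this thread).

```c
#include <stdint.h>

/* Hypothetical flag storage: N is bit 31 of nf, Z is set iff zf == 0. */
struct nz_flags {
    uint32_t nf;
    uint32_t zf;
};

/* Split a 64-bit result into halves and derive N and Z from them. */
static struct nz_flags set_nz64_model(uint64_t result)
{
    uint32_t lo = (uint32_t)result;
    uint32_t hi = (uint32_t)(result >> 32);
    struct nz_flags f;

    f.nf = hi;       /* sign bit of the 64-bit value is bit 31 of hi */
    f.zf = lo | hi;  /* non-zero iff any bit of the result is set */
    return f;
}

static int flag_n(struct nz_flags f) { return (int)(f.nf >> 31); }
static int flag_z(struct nz_flags f) { return f.zf == 0; }
```

Because `zf` only has to be non-zero rather than exactly 1, a single OR of the two halves is enough and no comparison is needed.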

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v2 02/11] target-arm: Introduce DisasCompare
  2015-09-07 17:09   ` Peter Maydell
@ 2015-09-08  5:09     ` Richard Henderson
  2015-09-08  8:13       ` Peter Maydell
  0 siblings, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2015-09-08  5:09 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers

On 09/07/2015 10:09 AM, Peter Maydell wrote:
> On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
>> +    case 9: /* ls: !C || Z -> !(C && !Z) */
>> +        cond = TCG_COND_NE;
>> +        value = tcg_temp_new_i32();
>> +        global = false;
>> +        tcg_gen_neg_i32(value, cpu_CF);
>> +        tcg_gen_and_i32(value, value, cpu_ZF);
>>           break;
>
> The comment says hi is C && !Z, but the code
> doesn't seem to line up with that. At least part
> of that is presumably because we store ZF inverted,
> but why are we negating CF here?

We're computing CF ? -1 : 0.  ANDing that with !Z (aka cpu_ZF) gets us C & !Z.
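
A plain C model of that computation, for illustration only. It assumes CF is stored as 0 or 1 and that Z is set when the stored ZF value is zero; `hi_value` and `hi_model_holds` are hypothetical names, not QEMU functions.

```c
#include <stdint.h>

/* cf is 0 or 1; zf is any value, with Z set iff zf == 0. */
static uint32_t hi_value(uint32_t cf, uint32_t zf)
{
    uint32_t mask = 0u - cf;   /* cf ? 0xffffffff : 0 */
    return mask & zf;          /* non-zero iff C && !Z */
}

/* Returns 1 if (hi_value != 0) agrees with C && !Z on all samples. */
static int hi_model_holds(void)
{
    static const uint32_t zf_samples[] = { 0, 1, 0x80000000u, 0xffffffffu };

    for (uint32_t cf = 0; cf <= 1; cf++) {
        for (unsigned i = 0; i < 4; i++) {
            uint32_t zf = zf_samples[i];
            int expected = (cf != 0) && (zf != 0);
            int got = hi_value(cf, zf) != 0;
            if (expected != got) {
                return 0;
            }
        }
    }
    return 1;
}
```

Testing the result with NE gives "hi"; testing the same value with EQ gives "ls".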

>>       case 12: /* gt: !Z && N == V */
>>       case 13: /* le: Z || N != V */
>> +        cond = TCG_COND_NE;
>> +        value = tcg_temp_new_i32();
>> +        global = false;
>> +        tcg_gen_xor_i32(value, cpu_VF, cpu_NF);
>> +        tcg_gen_sari_i32(value, value, 31);
>> +        tcg_gen_andc_i32(value, cpu_ZF, value);
>
> I think this is correct, but it could use some commentary
> to explain what it's doing.

Fair enough.
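
For reference, the gt/le sequence can be modelled as follows. This is a sketch in plain C with hypothetical names, assuming N and V live in bit 31 of their variables and Z is set when the stored ZF value is zero.

```c
#include <stdint.h>

/* Model of: xor(V, N); sari 31; andc(ZF, value). */
static uint32_t gt_value(uint32_t nf, uint32_t vf, uint32_t zf)
{
    uint32_t diff = nf ^ vf;
    /* Arithmetic shift right by 31: all ones if N != V, else zero. */
    uint32_t mask = (diff & 0x80000000u) ? 0xffffffffu : 0u;
    return zf & ~mask;         /* non-zero iff !Z && N == V */
}

/* Returns 1 if (gt_value != 0) agrees with !Z && N == V on all samples. */
static int gt_model_holds(void)
{
    static const uint32_t nv[] = { 0, 0x7fffffffu, 0x80000000u, 0xffffffffu };
    static const uint32_t zs[] = { 0, 1, 0x80000000u };

    for (unsigned i = 0; i < 4; i++) {
        for (unsigned j = 0; j < 4; j++) {
            for (unsigned k = 0; k < 3; k++) {
                int expected = (zs[k] != 0) &&
                               ((nv[i] >> 31) == (nv[j] >> 31));
                int got = gt_value(nv[i], nv[j], zs[k]) != 0;
                if (expected != got) {
                    return 0;
                }
            }
        }
    }
    return 1;
}
```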


r~


* Re: [Qemu-devel] [PATCH v2 04/11] target-arm: Use setcond and movcond for csel
  2015-09-07 17:17   ` Peter Maydell
@ 2015-09-08  5:12     ` Richard Henderson
  0 siblings, 0 replies; 31+ messages in thread
From: Richard Henderson @ 2015-09-08  5:12 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers

On 09/07/2015 10:17 AM, Peter Maydell wrote:
>> +    c64->value = tcg_temp_new_i64();
>> +    c64->cond = c32.cond;
>> +    if (c32.cond == TCG_COND_EQ || c32.cond == TCG_COND_NE) {
>> +        tcg_gen_extu_i32_i64(c64->value, c32.value);
>> +    } else {
>> +        tcg_gen_ext_i32_i64(c64->value, c32.value);
>> +    }
>
> Signed extend would work for EQ and NE as well, wouldn't it?

Yes.

> Why prefer the unsigned extension in those cases? If there's
> a reason, it could do with being commented.

I dunno what I was really thinking there.
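
A small plain C check of the point being agreed on, with hypothetical helper names: both extensions preserve comparisons against zero for EQ and NE, while a signed comparison such as LT is only preserved by sign extension.

```c
#include <stdint.h>

/* Widen a 32-bit value to 64 bits, treating it as unsigned. */
static int64_t zext_model(uint32_t v)
{
    return (int64_t)v;
}

/* Widen a 32-bit value to 64 bits, treating bit 31 as the sign. */
static int64_t sext_model(uint32_t v)
{
    return (v & 0x80000000u) ? (int64_t)v - 0x100000000LL : (int64_t)v;
}

/* Returns 1 if EQ-against-zero survives both extensions and LT only sext. */
static int ext_model_holds(void)
{
    static const uint32_t samples[] = { 0, 1, 0x7fffffffu,
                                        0x80000000u, 0xffffffffu };

    for (unsigned i = 0; i < 5; i++) {
        uint32_t v = samples[i];
        int is_zero = (v == 0);
        int is_neg = (v >> 31) != 0;

        if ((zext_model(v) == 0) != is_zero) return 0;
        if ((sext_model(v) == 0) != is_zero) return 0;
        if ((sext_model(v) < 0) != is_neg) return 0;
        if (zext_model(v) < 0) return 0;  /* zero-extension loses the sign */
    }
    return 1;
}
```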


r~


* Re: [Qemu-devel] [PATCH v2 01/11] target-arm: Share all common TCG temporaries
  2015-09-07 16:57   ` Peter Maydell
@ 2015-09-08  5:13     ` Richard Henderson
  0 siblings, 0 replies; 31+ messages in thread
From: Richard Henderson @ 2015-09-08  5:13 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers

On 09/07/2015 09:57 AM, Peter Maydell wrote:
> On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
>> This is a bug fix for aarch64.  At present, we have branches using
>> the 32-bit (translate.c) versions of cpu_[NZCV]F, but we set the flags
>> using the 64-bit (translate-a64.c) versions of cpu_[NZCV]F.  From
>> the view of the TCG code generator, these are unrelated variables.
>>
>> The bug is hard to see because we currently only read these variables
>> from branches, and upon reaching a branch TCG will first spill live
>> variables and then reload the arguments of the branch.  Since the
>> 32-bit versions were never live until reaching the branch, we'd re-read
>> the data that had just been spilled from the 64-bit versions.
>>
>> There is currently no such problem with the cpu_exclusive_* variables,
>> but there's no point in tempting fate.
>>
>> Signed-off-by: Richard Henderson <rth@twiddle.net>
>
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>
> Should this be cc:qemu-stable@nongnu.org ?

Possibly.  It's certainly low risk.


r~


* Re: [Qemu-devel] [PATCH v2 05/11] target-arm: Implement ccmp branchless
  2015-09-07 17:31   ` Peter Maydell
@ 2015-09-08  5:18     ` Richard Henderson
  2015-09-08  8:19       ` Peter Maydell
  0 siblings, 1 reply; 31+ messages in thread
From: Richard Henderson @ 2015-09-08  5:18 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers

On 09/07/2015 10:31 AM, Peter Maydell wrote:
>> -    if (cond < 0x0e) { /* continue */
>> -        gen_set_label(label_continue);
>> +    /* If COND was false, force the flags to #nzcv.
>> +       Note that T1 = (COND ? 0 : -1), T2 = (COND ? -1 : 0).  */
>> +    tcg_t1 = tcg_temp_new_i32();
>> +    tcg_t2 = tcg_temp_new_i32();
>> +    tcg_gen_neg_i32(tcg_t1, tcg_t0);
>> +    tcg_gen_subi_i32(tcg_t2, tcg_t0, 1);
>
> t2 is ~t1, right? Do we get better/worse code if we use
> tcg_gen_andc_i32(..., tcg_t1) rather than creating t2 and
> using gen_and_i32 ?
>
>> +
>> +    if (nzcv & 8) { /* N */
>> +        tcg_gen_or_i32(cpu_NF, cpu_NF, tcg_t1);
>> +    } else {
>> +        tcg_gen_and_i32(cpu_NF, cpu_NF, tcg_t2);
>> +    }
>> +    if (nzcv & 4) { /* Z */
>> +        tcg_gen_and_i32(cpu_ZF, cpu_ZF, tcg_t2);
>> +    } else {
>> +        tcg_gen_or_i32(cpu_ZF, cpu_ZF, tcg_t0);
>> +    }
>> +    if (nzcv & 2) { /* C */
>> +        tcg_gen_or_i32(cpu_CF, cpu_CF, tcg_t0);
>> +    } else {
>> +        tcg_gen_and_i32(cpu_CF, cpu_CF, tcg_t2);
>> +    }
>> +    if (nzcv & 1) { /* V */
>> +        tcg_gen_or_i32(cpu_VF, cpu_VF, tcg_t1);
>> +    } else {
>> +        tcg_gen_and_i32(cpu_VF, cpu_VF, tcg_t2);

If the host supports andc, it's probably better to use only the one temp.  But 
otherwise we may save 4 not insns.  Is it worth complicating the code for that?
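
For readers following along, here is a plain C model of the quoted flag-forcing sequence. It is a sketch, not the patch: the names are hypothetical, and it assumes t0 is 1 when the condition failed and 0 when it held (which is what the T1/T2 comment implies), N and V in bit 31, C stored as 0 or 1, and Z set when `zf` is zero.

```c
#include <stdint.h>

struct flags {
    uint32_t nf, zf, cf, vf;
};

/* Force the flags to #nzcv when t0 == 1; leave them alone when t0 == 0. */
static struct flags force_nzcv(struct flags f, uint32_t t0, unsigned nzcv)
{
    uint32_t t1 = 0u - t0;   /* cond ? 0 : -1 */
    uint32_t t2 = t0 - 1u;   /* cond ? -1 : 0 */

    f.nf = (nzcv & 8) ? (f.nf | t1) : (f.nf & t2);
    f.zf = (nzcv & 4) ? (f.zf & t2) : (f.zf | t0);
    f.cf = (nzcv & 2) ? (f.cf | t0) : (f.cf & t2);
    f.vf = (nzcv & 1) ? (f.vf | t1) : (f.vf & t2);
    return f;
}

/* Decode the stored representation back into a 4-bit NZCV value. */
static unsigned read_nzcv(struct flags f)
{
    return ((f.nf >> 31) << 3) | ((unsigned)(f.zf == 0) << 2) |
           ((f.cf & 1) << 1) | (f.vf >> 31);
}

/* Returns 1 if every #nzcv immediate is forced correctly and left
   untouched when the condition held. */
static int ccmp_model_holds(struct flags start)
{
    for (unsigned nzcv = 0; nzcv < 16; nzcv++) {
        if (read_nzcv(force_nzcv(start, 1, nzcv)) != nzcv) return 0;
        if (read_nzcv(force_nzcv(start, 0, nzcv)) != read_nzcv(start)) return 0;
    }
    return 1;
}
```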


r~


* Re: [Qemu-devel] [PATCH v2 02/11] target-arm: Introduce DisasCompare
  2015-09-08  5:09     ` Richard Henderson
@ 2015-09-08  8:13       ` Peter Maydell
  0 siblings, 0 replies; 31+ messages in thread
From: Peter Maydell @ 2015-09-08  8:13 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 8 September 2015 at 06:09, Richard Henderson <rth@twiddle.net> wrote:
> On 09/07/2015 10:09 AM, Peter Maydell wrote:
>>
>> On 2 September 2015 at 18:57, Richard Henderson <rth@twiddle.net> wrote:
>>>
>>> +    case 9: /* ls: !C || Z -> !(C && !Z) */
>>> +        cond = TCG_COND_NE;
>>> +        value = tcg_temp_new_i32();
>>> +        global = false;
>>> +        tcg_gen_neg_i32(value, cpu_CF);
>>> +        tcg_gen_and_i32(value, value, cpu_ZF);
>>>           break;
>>
>>
>> The comment says hi is C && !Z, but the code
>> doesn't seem to line up with that. At least part
>> of that is presumably because we store ZF inverted,
>> but why are we negating CF here?
>
>
> We're computing CF ? -1 : 0.  ANDing that with !Z (aka cpu_ZF) gets us C &
> !Z.

Ah yes. As with the case below, a comment would be helpful.

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v2 05/11] target-arm: Implement ccmp branchless
  2015-09-08  5:18     ` Richard Henderson
@ 2015-09-08  8:19       ` Peter Maydell
  2015-09-08 15:20         ` Richard Henderson
  0 siblings, 1 reply; 31+ messages in thread
From: Peter Maydell @ 2015-09-08  8:19 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On 8 September 2015 at 06:18, Richard Henderson <rth@twiddle.net> wrote:
> On 09/07/2015 10:31 AM, Peter Maydell wrote:
>>>
>>> -    if (cond < 0x0e) { /* continue */
>>> -        gen_set_label(label_continue);
>>> +    /* If COND was false, force the flags to #nzcv.
>>> +       Note that T1 = (COND ? 0 : -1), T2 = (COND ? -1 : 0).  */
>>> +    tcg_t1 = tcg_temp_new_i32();
>>> +    tcg_t2 = tcg_temp_new_i32();
>>> +    tcg_gen_neg_i32(tcg_t1, tcg_t0);
>>> +    tcg_gen_subi_i32(tcg_t2, tcg_t0, 1);
>>
>>
>> t2 is ~t1, right? Do we get better/worse code if we use
>> tcg_gen_andc_i32(..., tcg_t1) rather than creating t2 and
>> using gen_and_i32 ?
>>
>>> +
>>> +    if (nzcv & 8) { /* N */
>>> +        tcg_gen_or_i32(cpu_NF, cpu_NF, tcg_t1);
>>> +    } else {
>>> +        tcg_gen_and_i32(cpu_NF, cpu_NF, tcg_t2);
>>> +    }
>>> +    if (nzcv & 4) { /* Z */
>>> +        tcg_gen_and_i32(cpu_ZF, cpu_ZF, tcg_t2);
>>> +    } else {
>>> +        tcg_gen_or_i32(cpu_ZF, cpu_ZF, tcg_t0);
>>> +    }
>>> +    if (nzcv & 2) { /* C */
>>> +        tcg_gen_or_i32(cpu_CF, cpu_CF, tcg_t0);
>>> +    } else {
>>> +        tcg_gen_and_i32(cpu_CF, cpu_CF, tcg_t2);
>>> +    }
>>> +    if (nzcv & 1) { /* V */
>>> +        tcg_gen_or_i32(cpu_VF, cpu_VF, tcg_t1);
>>> +    } else {
>>> +        tcg_gen_and_i32(cpu_VF, cpu_VF, tcg_t2);
>
>
> If the host supports andc, it's probably better to use only the one temp.
> But otherwise we may save 4 not insns.

The tcg common code isn't smart enough to notice it only
needs to calculate not(t1) once ?

In the overwhelmingly common case (x86 tcg backend)
we would save an insn every time, right?

>  Is it worth complicating the code
> for that?

I wouldn't bother to make the front-end generate different
code for the backend does/doesn't have andc situations,
certainly.

Anyway, I'm just guessing here, you probably have a better
feel than me for what codegen choices work better, so I'll
leave the choice up to you.

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v2 05/11] target-arm: Implement ccmp branchless
  2015-09-08  8:19       ` Peter Maydell
@ 2015-09-08 15:20         ` Richard Henderson
  0 siblings, 0 replies; 31+ messages in thread
From: Richard Henderson @ 2015-09-08 15:20 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers

On 09/08/2015 01:19 AM, Peter Maydell wrote:
> The tcg common code isn't smart enough to notice it only
> needs to calculate not(t1) once ?

Correct, we do no value numbering or cse.

> In the overwhelmingly common case (x86 tcg backend)
> we would save an insn every time, right?

Yes.  It all depends on what value is used for NZCV, of course.

Thankfully we *do* do dead code elimination, which is why I unconditionally
compute both T1 and T2, and let them be deleted should they be unused.
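
The identity behind the andc suggestion is easy to confirm in plain C. This is only a sketch with a hypothetical helper name: with T1 = -T0 and T2 = T0 - 1, T2 is the bitwise complement of T1, so `and` with T2 and `andc` with T1 give the same result.

```c
#include <stdint.h>

/* Returns 1 if (flag & t2) == (flag & ~t1) for both values of t0. */
static int andc_equivalent(uint32_t flag)
{
    for (uint32_t t0 = 0; t0 <= 1; t0++) {
        uint32_t t1 = 0u - t0;
        uint32_t t2 = t0 - 1u;

        if (t2 != (uint32_t)~t1) {
            return 0;
        }
        if ((flag & t2) != (flag & (uint32_t)~t1)) {
            return 0;
        }
    }
    return 1;
}
```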

> I wouldn't bother to make the front-end generate different
> code for the backend does/doesn't have andc situations,
> certainly.

I'll mock it up and see how much duplication there is.  And also check it out
on a Haswell host, which does have andc.


r~


* Re: [Qemu-devel] [PATCH v2 06/11] target-arm: Implement fcsel with movcond
  2015-09-07 17:42   ` Peter Maydell
@ 2015-09-08 15:21     ` Richard Henderson
  0 siblings, 0 replies; 31+ messages in thread
From: Richard Henderson @ 2015-09-08 15:21 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers

On 09/07/2015 10:42 AM, Peter Maydell wrote:
> I suppose it does let you do
> 
>    read_vec_element(s, t_true, rn, 0, type ? MO_64 : MO_32);
> &c
> and avoid the if (type)...

That looks nice, thanks.


r~


end of thread (last message 2015-09-08 15:21 UTC)

Thread overview: 31+ messages
2015-09-02 17:57 [Qemu-devel] [PATCH v2 00/11] target-arm improvements for aarch64 Richard Henderson
2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 01/11] target-arm: Share all common TCG temporaries Richard Henderson
2015-09-07 16:57   ` Peter Maydell
2015-09-08  5:13     ` Richard Henderson
2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 02/11] target-arm: Introduce DisasCompare Richard Henderson
2015-09-07 17:09   ` Peter Maydell
2015-09-08  5:09     ` Richard Henderson
2015-09-08  8:13       ` Peter Maydell
2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 03/11] target-arm: Handle always condition codes within arm_test_cc Richard Henderson
2015-09-07 17:11   ` Peter Maydell
2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 04/11] target-arm: Use setcond and movcond for csel Richard Henderson
2015-09-07 17:17   ` Peter Maydell
2015-09-08  5:12     ` Richard Henderson
2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 05/11] target-arm: Implement ccmp branchless Richard Henderson
2015-09-07 17:31   ` Peter Maydell
2015-09-08  5:18     ` Richard Henderson
2015-09-08  8:19       ` Peter Maydell
2015-09-08 15:20         ` Richard Henderson
2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 06/11] target-arm: Implement fcsel with movcond Richard Henderson
2015-09-07 17:42   ` Peter Maydell
2015-09-08 15:21     ` Richard Henderson
2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 07/11] target-arm: Recognize SXTB, SXTH, SXTW, ASR Richard Henderson
2015-09-07 17:47   ` Peter Maydell
2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 08/11] target-arm: Recognize UXTB, UXTH, LSR, LSL Richard Henderson
2015-09-07 18:00   ` Peter Maydell
2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 09/11] target-arm: Eliminate unnecessary zero-extend in disas_bitfield Richard Henderson
2015-09-07 18:02   ` Peter Maydell
2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 10/11] target-arm: Recognize ROR Richard Henderson
2015-09-07 18:06   ` Peter Maydell
2015-09-02 17:57 ` [Qemu-devel] [PATCH v2 11/11] target-arm: Use tcg_gen_extrh_i64_i32 Richard Henderson
2015-09-07 18:11   ` Peter Maydell
