[PATCH 00/21] target/arm: Finish neon decodetree conversion

* [PATCH 00/21] target/arm: Finish neon decodetree conversion
@ 2020-06-16 17:08 Peter Maydell
  2020-06-16 17:08 ` [PATCH 01/21] target/arm: Convert Neon 2-reg-misc VREV64 to decodetree Peter Maydell
                   ` (22 more replies)
  0 siblings, 23 replies; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

This patchset completes the conversion of Neon to decodetree
by converting all the instructions in the 2-reg-misc grouping.

There are some potential further cleanups available, which I don't
propose to do (I've spent what feels like too much time on this
refactoring already; I want to move on to implementing FP16 now,
which is what the refactoring was intended to permit):

 * the oddball "TCG temps in global variables" cpu_V0, cpu_V1,
   cpu_M0 are now used only in the iwmmxt codegen; V0 and V1
   would be easy to replace with local temporaries. M0 is
   slightly trickier. The main thing that dissuades me from
   this refactoring is that I don't have an easy way to test the
   iwmmxt codegen.

 * we have a confusingly large number of ways to load and
   store from the Neon/VFP register file:
    - neon_load_reg/neon_store_reg
    - neon_load_reg64/neon_store_reg64
    - neon_load_reg32/neon_store_reg32
    - neon_load_element/neon_store_element
    - neon_load_element64/neon_store_element64
   which all have subtly different semantics. It is particularly
   confusing that neon_load_reg/neon_store_reg follow a "create
   temp on load, destroy temp on store" convention and none of
   the others do. I'd like us to have fewer of these, but it's not
   immediately obvious what the correct small set of primitives
   should be.

 * it would be nice to make the vfp and neon decode really
   separate translation units rather than #including them
   into translate.c someday
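
To illustrate the second point for readers who don't know the code: the
following is a toy model in plain C, not QEMU code. It contrasts the
"create temp on load, destroy temp on store" convention with the
"caller owns the temp" convention. Every name in it (Temp, load_owning,
store_owning, load_borrowed, store_borrowed, live_temps) is hypothetical
and only stands in for the real TCG temporaries and accessors.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

static int live_temps;          /* temporaries currently allocated */
static uint32_t regfile[64];    /* toy register file of 32-bit pieces */

typedef struct Temp { uint32_t val; } Temp;

static Temp *temp_new(void)
{
    Temp *t = calloc(1, sizeof(*t));
    assert(t);
    live_temps++;
    return t;
}

static void temp_free(Temp *t)
{
    free(t);
    live_temps--;
}

/* Convention A: the load allocates a temp, the store consumes (frees) it. */
static Temp *load_owning(int idx)
{
    Temp *t = temp_new();
    t->val = regfile[idx];
    return t;
}

static void store_owning(int idx, Temp *t)
{
    regfile[idx] = t->val;
    temp_free(t);
}

/* Convention B: the caller supplies the temp and keeps ownership. */
static void load_borrowed(Temp *dest, int idx)
{
    dest->val = regfile[idx];
}

static void store_borrowed(const Temp *src, int idx)
{
    regfile[idx] = src->val;
}
```

With convention A a register copy is store_owning(1, load_owning(0)) and
no explicit free is needed. With convention B the caller brackets the
load and store with temp_new() and temp_free(). Mixing the two in one
function is where leaks and double frees tend to creep in.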

thanks
-- PMM

Peter Maydell (21):
  target/arm: Convert Neon 2-reg-misc VREV64 to decodetree
  target/arm: Convert Neon 2-reg-misc pairwise ops to decodetree
  target/arm: Convert VZIP, VUZP to decodetree
  target/arm: Convert Neon narrowing moves to decodetree
  target/arm: Convert Neon 2-reg-misc VSHLL to decodetree
  target/arm: Convert Neon VCVT f16/f32 insns to decodetree
  target/arm: Convert vectorised 2-reg-misc Neon ops to decodetree
  target/arm: Convert Neon 2-reg-misc crypto operations to decodetree
  target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn
  target/arm: Fix capitalization in NeonGenTwo{Single,Double}OPFn
    typedefs
  target/arm: Make gen_swap_half() take separate src and dest
  target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree
  target/arm: Convert remaining simple 2-reg-misc Neon ops
  target/arm: Convert Neon VQABS, VQNEG to decodetree
  target/arm: Convert simple fp Neon 2-reg-misc insns
  target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to
    decodetree
  target/arm: Convert Neon 2-reg-misc VRINT insns to decodetree
  target/arm: Convert Neon 2-reg-misc VCVT insns to decodetree
  target/arm: Convert Neon VSWP to decodetree
  target/arm: Convert Neon VTRN to decodetree
  target/arm: Move some functions used only in translate-neon.inc.c to
    that file

 target/arm/translate.h          |    8 +-
 target/arm/neon-dp.decode       |  106 +++
 target/arm/translate-a64.c      |    8 +-
 target/arm/translate-neon.inc.c | 1191 ++++++++++++++++++++++++++++++-
 target/arm/translate.c          | 1061 +--------------------------
 5 files changed, 1311 insertions(+), 1063 deletions(-)

-- 
2.20.1




* [PATCH 01/21] target/arm: Convert Neon 2-reg-misc VREV64 to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 22:36   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 02/21] target/arm: Convert Neon 2-reg-misc pairwise ops " Peter Maydell
                   ` (21 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the Neon VREV64 insn from the 2-reg-misc grouping to decodetree.
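
For reference, here is a minimal sketch in plain C of what VREV64
computes on one 64-bit D register, following the structure of the new
trans_VREV64() (transform each 32-bit half, then store the halves in
swapped order). It is not part of the patch. The names vrev64_model,
bswap32 and swap_half are hypothetical; in particular swap_half is a
value-level stand-in for the TCG generator gen_swap_half().

```c
#include <assert.h>
#include <stdint.h>

/* Byte-reverse one 32-bit word (the value tcg_gen_bswap32_i32 produces). */
static uint32_t bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Swap the two 16-bit halves of a 32-bit word. */
static uint32_t swap_half(uint32_t x)
{
    return (x >> 16) | (x << 16);
}

/*
 * Hypothetical reference model of VREV64 on one D register.
 * size: 0 = 8-bit elements, 1 = 16-bit, 2 = 32-bit.
 */
static uint64_t vrev64_model(uint64_t dm, unsigned size)
{
    uint32_t half[2] = { (uint32_t)dm, (uint32_t)(dm >> 32) };

    assert(size <= 2);      /* the patch rejects size == 3 as UNDEF */
    for (int i = 0; i < 2; i++) {
        if (size == 0) {
            half[i] = bswap32(half[i]);
        } else if (size == 1) {
            half[i] = swap_half(half[i]);
        }
        /* size == 2: the word swap below is the whole operation */
    }
    /* Store the words in swapped order, like the two neon_store_reg calls. */
    return ((uint64_t)half[0] << 32) | half[1];
}
```

For a Q-register operation the patch runs the same steps twice, once per
D register (pass 0 and pass 1).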

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/neon-dp.decode       | 12 ++++++++
 target/arm/translate-neon.inc.c | 50 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 24 ++--------------
 3 files changed, 64 insertions(+), 22 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 6d890b2161f..e12fdf30957 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -429,6 +429,18 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
                  vm=%vm_dp vd=%vd_dp size=1
     VDUP_scalar  1111 001 1 1 . 11 index:1 100 .... 11 000 q:1 . 0 .... \
                  vm=%vm_dp vd=%vd_dp size=2
+
+    ##################################################################
+    # 2-reg-misc grouping:
+    # 1111 001 11 D 11 size:2 opc1:2 Vd:4 0 opc2:4 q:1 M 0 Vm:4
+    ##################################################################
+
+    &2misc vd vm q size
+
+    @2misc       .... ... .. . .. size:2 .. .... . .... q:1 . . .... \
+                 &2misc vm=%vm_dp vd=%vd_dp
+
+    VREV64       1111 001 11 . 11 .. 00 .... 0 0000 . . 0 .... @2misc
   ]
 
   # Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index a5aa56bbdeb..90431a5383f 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -2970,3 +2970,53 @@ static bool trans_VDUP_scalar(DisasContext *s, arg_VDUP_scalar *a)
                          a->q ? 16 : 8, a->q ? 16 : 8);
     return true;
 }
+
+static bool trans_VREV64(DisasContext *s, arg_VREV64 *a)
+{
+    int pass, half;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vd | a->vm) & a->q) {
+        return false;
+    }
+
+    if (a->size == 3) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    for (pass = 0; pass < (a->q ? 2 : 1); pass++) {
+        TCGv_i32 tmp[2];
+
+        for (half = 0; half < 2; half++) {
+            tmp[half] = neon_load_reg(a->vm, pass * 2 + half);
+            switch (a->size) {
+            case 0:
+                tcg_gen_bswap32_i32(tmp[half], tmp[half]);
+                break;
+            case 1:
+                gen_swap_half(tmp[half]);
+                break;
+            case 2:
+                break;
+            default:
+                g_assert_not_reached();
+            }
+        }
+        neon_store_reg(a->vd, pass * 2, tmp[1]);
+        neon_store_reg(a->vd, pass * 2 + 1, tmp[0]);
+    }
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 6d18892adee..5fca38b5fae 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5092,28 +5092,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 }
                 switch (op) {
                 case NEON_2RM_VREV64:
-                    for (pass = 0; pass < (q ? 2 : 1); pass++) {
-                        tmp = neon_load_reg(rm, pass * 2);
-                        tmp2 = neon_load_reg(rm, pass * 2 + 1);
-                        switch (size) {
-                        case 0: tcg_gen_bswap32_i32(tmp, tmp); break;
-                        case 1: gen_swap_half(tmp); break;
-                        case 2: /* no-op */ break;
-                        default: abort();
-                        }
-                        neon_store_reg(rd, pass * 2 + 1, tmp);
-                        if (size == 2) {
-                            neon_store_reg(rd, pass * 2, tmp2);
-                        } else {
-                            switch (size) {
-                            case 0: tcg_gen_bswap32_i32(tmp2, tmp2); break;
-                            case 1: gen_swap_half(tmp2); break;
-                            default: abort();
-                            }
-                            neon_store_reg(rd, pass * 2, tmp2);
-                        }
-                    }
-                    break;
+                    /* handled by decodetree */
+                    return 1;
                 case NEON_2RM_VPADDL: case NEON_2RM_VPADDL_U:
                 case NEON_2RM_VPADAL: case NEON_2RM_VPADAL_U:
                     for (pass = 0; pass < q + 1; pass++) {
-- 
2.20.1




* Re: [PATCH 01/21] target/arm: Convert Neon 2-reg-misc VREV64 to decodetree
  2020-06-16 17:08 ` [PATCH 01/21] target/arm: Convert Neon 2-reg-misc VREV64 to decodetree Peter Maydell
@ 2020-06-19 22:36   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 22:36 UTC
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon VREV64 insn from the 2-reg-misc grouping to decodetree.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       | 12 ++++++++
>  target/arm/translate-neon.inc.c | 50 +++++++++++++++++++++++++++++++++
>  target/arm/translate.c          | 24 ++--------------
>  3 files changed, 64 insertions(+), 22 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



* [PATCH 02/21] target/arm: Convert Neon 2-reg-misc pairwise ops to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
  2020-06-16 17:08 ` [PATCH 01/21] target/arm: Convert Neon 2-reg-misc VREV64 to decodetree Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 22:42   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 03/21] target/arm: Convert VZIP, VUZP " Peter Maydell
                   ` (20 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the pairwise ops VPADDL and VPADAL in the 2-reg-misc grouping
to decodetree.

At this point we can get rid of the weird CPU_V001 #define that was
used to avoid having to explicitly list all the arguments being
passed to some TCG gen/helper functions.
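
As an illustration only (not part of the patch), the widen / pairwise-add /
accumulate pipeline that do_2misc_pairwise() builds from widenfn, opfn
and accfn can be modelled in plain C for the unsigned 8-bit case on one
D register. The function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of VPADDL.U8 on one D register: add each adjacent
 * pair of unsigned bytes into one 16-bit lane.
 */
static uint64_t vpaddl_u8_model(uint64_t dm)
{
    uint64_t result = 0;

    for (int lane = 0; lane < 4; lane++) {
        uint16_t lo = (dm >> (16 * lane)) & 0xff;       /* widen even byte */
        uint16_t hi = (dm >> (16 * lane + 8)) & 0xff;   /* widen odd byte */
        result |= (uint64_t)(uint16_t)(lo + hi) << (16 * lane);
    }
    return result;
}

/* VPADAL.U8: as above, then accumulate lane-wise into the old destination. */
static uint64_t vpadal_u8_model(uint64_t dd, uint64_t dm)
{
    uint64_t sum = vpaddl_u8_model(dm);
    uint64_t result = 0;

    for (int lane = 0; lane < 4; lane++) {
        /* 16-bit lane addition, wrapping modulo 2^16 */
        uint16_t acc = (uint16_t)((uint16_t)(dd >> (16 * lane)) +
                                  (uint16_t)(sum >> (16 * lane)));
        result |= (uint64_t)acc << (16 * lane);
    }
    return result;
}
```

The signed variants differ only in the widening step (sign extension
instead of zero extension). Once the inputs are widened, the lane
additions are the same modulo the lane width, which is consistent with
the patch using the paddl_u16/paddl_u32 and addl_u16/addl_u32 helpers in
both the _S and _U translators.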

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/neon-dp.decode       |   6 ++
 target/arm/translate-neon.inc.c | 149 ++++++++++++++++++++++++++++++++
 target/arm/translate.c          |  35 +-------
 3 files changed, 157 insertions(+), 33 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index e12fdf30957..dd521baa07d 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -441,6 +441,12 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
                  &2misc vm=%vm_dp vd=%vd_dp
 
     VREV64       1111 001 11 . 11 .. 00 .... 0 0000 . . 0 .... @2misc
+
+    VPADDL_S     1111 001 11 . 11 .. 00 .... 0 0100 . . 0 .... @2misc
+    VPADDL_U     1111 001 11 . 11 .. 00 .... 0 0101 . . 0 .... @2misc
+
+    VPADAL_S     1111 001 11 . 11 .. 00 .... 0 1100 . . 0 .... @2misc
+    VPADAL_U     1111 001 11 . 11 .. 00 .... 0 1101 . . 0 .... @2misc
   ]
 
   # Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 90431a5383f..2f7bd0d556f 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3020,3 +3020,152 @@ static bool trans_VREV64(DisasContext *s, arg_VREV64 *a)
     }
     return true;
 }
+
+static bool do_2misc_pairwise(DisasContext *s, arg_2misc *a,
+                              NeonGenWidenFn *widenfn,
+                              NeonGenTwo64OpFn *opfn,
+                              NeonGenTwo64OpFn *accfn)
+{
+    /*
+     * Pairwise long operations: widen both halves of the pair,
+     * combine the pairs with the opfn, and then possibly accumulate
+     * into the destination with the accfn.
+     */
+    int pass;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vd | a->vm) & a->q) {
+        return false;
+    }
+
+    if (!widenfn) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    for (pass = 0; pass < a->q + 1; pass++) {
+        TCGv_i32 tmp;
+        TCGv_i64 rm0_64, rm1_64, rd_64;
+
+        rm0_64 = tcg_temp_new_i64();
+        rm1_64 = tcg_temp_new_i64();
+        rd_64 = tcg_temp_new_i64();
+        tmp = neon_load_reg(a->vm, pass * 2);
+        widenfn(rm0_64, tmp);
+        tcg_temp_free_i32(tmp);
+        tmp = neon_load_reg(a->vm, pass * 2 + 1);
+        widenfn(rm1_64, tmp);
+        tcg_temp_free_i32(tmp);
+        opfn(rd_64, rm0_64, rm1_64);
+        tcg_temp_free_i64(rm0_64);
+        tcg_temp_free_i64(rm1_64);
+
+        if (accfn) {
+            TCGv_i64 tmp64 = tcg_temp_new_i64();
+            neon_load_reg64(tmp64, a->vd + pass);
+            accfn(rd_64, tmp64, rd_64);
+            tcg_temp_free_i64(tmp64);
+        }
+        neon_store_reg64(rd_64, a->vd + pass);
+        tcg_temp_free_i64(rd_64);
+    }
+    return true;
+}
+
+static bool trans_VPADDL_S(DisasContext *s, arg_2misc *a)
+{
+    static NeonGenWidenFn * const widenfn[] = {
+        gen_helper_neon_widen_s8,
+        gen_helper_neon_widen_s16,
+        tcg_gen_ext_i32_i64,
+        NULL,
+    };
+    static NeonGenTwo64OpFn * const opfn[] = {
+        gen_helper_neon_paddl_u16,
+        gen_helper_neon_paddl_u32,
+        tcg_gen_add_i64,
+        NULL,
+    };
+
+    return do_2misc_pairwise(s, a, widenfn[a->size], opfn[a->size], NULL);
+}
+
+static bool trans_VPADDL_U(DisasContext *s, arg_2misc *a)
+{
+    static NeonGenWidenFn * const widenfn[] = {
+        gen_helper_neon_widen_u8,
+        gen_helper_neon_widen_u16,
+        tcg_gen_extu_i32_i64,
+        NULL,
+    };
+    static NeonGenTwo64OpFn * const opfn[] = {
+        gen_helper_neon_paddl_u16,
+        gen_helper_neon_paddl_u32,
+        tcg_gen_add_i64,
+        NULL,
+    };
+
+    return do_2misc_pairwise(s, a, widenfn[a->size], opfn[a->size], NULL);
+}
+
+static bool trans_VPADAL_S(DisasContext *s, arg_2misc *a)
+{
+    static NeonGenWidenFn * const widenfn[] = {
+        gen_helper_neon_widen_s8,
+        gen_helper_neon_widen_s16,
+        tcg_gen_ext_i32_i64,
+        NULL,
+    };
+    static NeonGenTwo64OpFn * const opfn[] = {
+        gen_helper_neon_paddl_u16,
+        gen_helper_neon_paddl_u32,
+        tcg_gen_add_i64,
+        NULL,
+    };
+    static NeonGenTwo64OpFn * const accfn[] = {
+        gen_helper_neon_addl_u16,
+        gen_helper_neon_addl_u32,
+        tcg_gen_add_i64,
+        NULL,
+    };
+
+    return do_2misc_pairwise(s, a, widenfn[a->size], opfn[a->size],
+                             accfn[a->size]);
+}
+
+static bool trans_VPADAL_U(DisasContext *s, arg_2misc *a)
+{
+    static NeonGenWidenFn * const widenfn[] = {
+        gen_helper_neon_widen_u8,
+        gen_helper_neon_widen_u16,
+        tcg_gen_extu_i32_i64,
+        NULL,
+    };
+    static NeonGenTwo64OpFn * const opfn[] = {
+        gen_helper_neon_paddl_u16,
+        gen_helper_neon_paddl_u32,
+        tcg_gen_add_i64,
+        NULL,
+    };
+    static NeonGenTwo64OpFn * const accfn[] = {
+        gen_helper_neon_addl_u16,
+        gen_helper_neon_addl_u32,
+        tcg_gen_add_i64,
+        NULL,
+    };
+
+    return do_2misc_pairwise(s, a, widenfn[a->size], opfn[a->size],
+                             accfn[a->size]);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 5fca38b5fae..4405b034f77 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -2934,8 +2934,6 @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
     gen_rfe(s, pc, load_cpu_field(spsr));
 }
 
-#define CPU_V001 cpu_V0, cpu_V0, cpu_V1
-
 static int gen_neon_unzip(int rd, int rm, int size, int q)
 {
     TCGv_ptr pd, pm;
@@ -3117,16 +3115,6 @@ static inline void gen_neon_widen(TCGv_i64 dest, TCGv_i32 src, int size, int u)
     tcg_temp_free_i32(src);
 }
 
-static inline void gen_neon_addl(int size)
-{
-    switch (size) {
-    case 0: gen_helper_neon_addl_u16(CPU_V001); break;
-    case 1: gen_helper_neon_addl_u32(CPU_V001); break;
-    case 2: tcg_gen_add_i64(CPU_V001); break;
-    default: abort();
-    }
-}
-
 static void gen_neon_narrow_op(int op, int u, int size,
                                TCGv_i32 dest, TCGv_i64 src)
 {
@@ -5092,29 +5080,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 }
                 switch (op) {
                 case NEON_2RM_VREV64:
-                    /* handled by decodetree */
-                    return 1;
                 case NEON_2RM_VPADDL: case NEON_2RM_VPADDL_U:
                 case NEON_2RM_VPADAL: case NEON_2RM_VPADAL_U:
-                    for (pass = 0; pass < q + 1; pass++) {
-                        tmp = neon_load_reg(rm, pass * 2);
-                        gen_neon_widen(cpu_V0, tmp, size, op & 1);
-                        tmp = neon_load_reg(rm, pass * 2 + 1);
-                        gen_neon_widen(cpu_V1, tmp, size, op & 1);
-                        switch (size) {
-                        case 0: gen_helper_neon_paddl_u16(CPU_V001); break;
-                        case 1: gen_helper_neon_paddl_u32(CPU_V001); break;
-                        case 2: tcg_gen_add_i64(CPU_V001); break;
-                        default: abort();
-                        }
-                        if (op >= NEON_2RM_VPADAL) {
-                            /* Accumulate.  */
-                            neon_load_reg64(cpu_V1, rd + pass);
-                            gen_neon_addl(size);
-                        }
-                        neon_store_reg64(cpu_V0, rd + pass);
-                    }
-                    break;
+                    /* handled by decodetree */
+                    return 1;
                 case NEON_2RM_VTRN:
                     if (size == 2) {
                         int n;
-- 
2.20.1




* Re: [PATCH 02/21] target/arm: Convert Neon 2-reg-misc pairwise ops to decodetree
  2020-06-16 17:08 ` [PATCH 02/21] target/arm: Convert Neon 2-reg-misc pairwise ops " Peter Maydell
@ 2020-06-19 22:42   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 22:42 UTC
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the pairwise ops VPADDL and VPADAL in the 2-reg-misc grouping
> to decodetree.
> 
> At this point we can get rid of the weird CPU_V001 #define that was
> used to avoid having to explicitly list all the arguments being
> passed to some TCG gen/helper functions.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       |   6 ++
>  target/arm/translate-neon.inc.c | 149 ++++++++++++++++++++++++++++++++
>  target/arm/translate.c          |  35 +-------
>  3 files changed, 157 insertions(+), 33 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 03/21] target/arm: Convert VZIP, VUZP to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
  2020-06-16 17:08 ` [PATCH 01/21] target/arm: Convert Neon 2-reg-misc VREV64 to decodetree Peter Maydell
  2020-06-16 17:08 ` [PATCH 02/21] target/arm: Convert Neon 2-reg-misc pairwise ops " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 22:47   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 04/21] target/arm: Convert Neon narrowing moves " Peter Maydell
                   ` (19 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the Neon VZIP and VUZP insns in the 2-reg-misc group to
decodetree.
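
For readers unfamiliar with the two insns, this is a small plain-C sketch
of their data movement for 8-bit elements on a pair of D registers. It
is not taken from the patch or from the QEMU helpers; vzip8_model and
vuzp8_model are hypothetical names, and the model simply requires the
two operands to be distinct buffers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of VZIP.8 Dd, Dm: interleave the two registers. */
static void vzip8_model(uint8_t d[8], uint8_t m[8])
{
    uint8_t z[16];

    assert(d != m);         /* this model assumes distinct operands */
    for (int i = 0; i < 8; i++) {
        z[2 * i] = d[i];
        z[2 * i + 1] = m[i];
    }
    memcpy(d, z, 8);        /* low half of the interleaved result */
    memcpy(m, z + 8, 8);    /* high half */
}

/* Hypothetical model of VUZP.8 Dd, Dm: the inverse de-interleave. */
static void vuzp8_model(uint8_t d[8], uint8_t m[8])
{
    uint8_t cat[16], even[8], odd[8];

    assert(d != m);         /* this model assumes distinct operands */
    memcpy(cat, d, 8);
    memcpy(cat + 8, m, 8);
    for (int i = 0; i < 8; i++) {
        even[i] = cat[2 * i];
        odd[i] = cat[2 * i + 1];
    }
    memcpy(d, even, 8);     /* even-indexed elements go to Dd */
    memcpy(m, odd, 8);      /* odd-indexed elements go to Dm */
}
```

In the patch itself the element size and Q bit select a helper from a
fn[q][size] table, and a NULL entry marks a size or size/q combination
that must UNDEF.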

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/neon-dp.decode       |  3 ++
 target/arm/translate-neon.inc.c | 74 ++++++++++++++++++++++++++
 target/arm/translate.c          | 92 +--------------------------------
 3 files changed, 79 insertions(+), 90 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index dd521baa07d..ad9e17fd737 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -447,6 +447,9 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
 
     VPADAL_S     1111 001 11 . 11 .. 00 .... 0 1100 . . 0 .... @2misc
     VPADAL_U     1111 001 11 . 11 .. 00 .... 0 1101 . . 0 .... @2misc
+
+    VUZP         1111 001 11 . 11 .. 10 .... 0 0010 . . 0 .... @2misc
+    VZIP         1111 001 11 . 11 .. 10 .... 0 0011 . . 0 .... @2misc
   ]
 
   # Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 2f7bd0d556f..f4799dd9770 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3169,3 +3169,77 @@ static bool trans_VPADAL_U(DisasContext *s, arg_2misc *a)
     return do_2misc_pairwise(s, a, widenfn[a->size], opfn[a->size],
                              accfn[a->size]);
 }
+
+typedef void ZipFn(TCGv_ptr, TCGv_ptr);
+
+static bool do_zip_uzp(DisasContext *s, arg_2misc *a,
+                       ZipFn *fn)
+{
+    TCGv_ptr pd, pm;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vd | a->vm) & a->q) {
+        return false;
+    }
+
+    if (!fn) {
+        /* Bad size or size/q combination */
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    pd = vfp_reg_ptr(true, a->vd);
+    pm = vfp_reg_ptr(true, a->vm);
+    fn(pd, pm);
+    tcg_temp_free_ptr(pd);
+    tcg_temp_free_ptr(pm);
+    return true;
+}
+
+static bool trans_VUZP(DisasContext *s, arg_2misc *a)
+{
+    static ZipFn * const fn[2][4] = {
+        {
+            gen_helper_neon_unzip8,
+            gen_helper_neon_unzip16,
+            NULL,
+            NULL,
+        }, {
+            gen_helper_neon_qunzip8,
+            gen_helper_neon_qunzip16,
+            gen_helper_neon_qunzip32,
+            NULL,
+        }
+    };
+    return do_zip_uzp(s, a, fn[a->q][a->size]);
+}
+
+static bool trans_VZIP(DisasContext *s, arg_2misc *a)
+{
+    static ZipFn * const fn[2][4] = {
+        {
+            gen_helper_neon_zip8,
+            gen_helper_neon_zip16,
+            NULL,
+            NULL,
+        }, {
+            gen_helper_neon_qzip8,
+            gen_helper_neon_qzip16,
+            gen_helper_neon_qzip32,
+            NULL,
+        }
+    };
+    return do_zip_uzp(s, a, fn[a->q][a->size]);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 4405b034f77..442f287d861 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -2934,86 +2934,6 @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
     gen_rfe(s, pc, load_cpu_field(spsr));
 }
 
-static int gen_neon_unzip(int rd, int rm, int size, int q)
-{
-    TCGv_ptr pd, pm;
-    
-    if (!q && size == 2) {
-        return 1;
-    }
-    pd = vfp_reg_ptr(true, rd);
-    pm = vfp_reg_ptr(true, rm);
-    if (q) {
-        switch (size) {
-        case 0:
-            gen_helper_neon_qunzip8(pd, pm);
-            break;
-        case 1:
-            gen_helper_neon_qunzip16(pd, pm);
-            break;
-        case 2:
-            gen_helper_neon_qunzip32(pd, pm);
-            break;
-        default:
-            abort();
-        }
-    } else {
-        switch (size) {
-        case 0:
-            gen_helper_neon_unzip8(pd, pm);
-            break;
-        case 1:
-            gen_helper_neon_unzip16(pd, pm);
-            break;
-        default:
-            abort();
-        }
-    }
-    tcg_temp_free_ptr(pd);
-    tcg_temp_free_ptr(pm);
-    return 0;
-}
-
-static int gen_neon_zip(int rd, int rm, int size, int q)
-{
-    TCGv_ptr pd, pm;
-
-    if (!q && size == 2) {
-        return 1;
-    }
-    pd = vfp_reg_ptr(true, rd);
-    pm = vfp_reg_ptr(true, rm);
-    if (q) {
-        switch (size) {
-        case 0:
-            gen_helper_neon_qzip8(pd, pm);
-            break;
-        case 1:
-            gen_helper_neon_qzip16(pd, pm);
-            break;
-        case 2:
-            gen_helper_neon_qzip32(pd, pm);
-            break;
-        default:
-            abort();
-        }
-    } else {
-        switch (size) {
-        case 0:
-            gen_helper_neon_zip8(pd, pm);
-            break;
-        case 1:
-            gen_helper_neon_zip16(pd, pm);
-            break;
-        default:
-            abort();
-        }
-    }
-    tcg_temp_free_ptr(pd);
-    tcg_temp_free_ptr(pm);
-    return 0;
-}
-
 static void gen_neon_trn_u8(TCGv_i32 t0, TCGv_i32 t1)
 {
     TCGv_i32 rd, tmp;
@@ -5082,6 +5002,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_VREV64:
                 case NEON_2RM_VPADDL: case NEON_2RM_VPADDL_U:
                 case NEON_2RM_VPADAL: case NEON_2RM_VPADAL_U:
+                case NEON_2RM_VUZP:
+                case NEON_2RM_VZIP:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -5097,16 +5019,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         goto elementwise;
                     }
                     break;
-                case NEON_2RM_VUZP:
-                    if (gen_neon_unzip(rd, rm, size, q)) {
-                        return 1;
-                    }
-                    break;
-                case NEON_2RM_VZIP:
-                    if (gen_neon_zip(rd, rm, size, q)) {
-                        return 1;
-                    }
-                    break;
                 case NEON_2RM_VMOVN: case NEON_2RM_VQMOVN:
                     /* also VQMOVUN; op field and mnemonics don't line up */
                     if (rm & 1) {
-- 
2.20.1




* Re: [PATCH 03/21] target/arm: Convert VZIP, VUZP to decodetree
  2020-06-16 17:08 ` [PATCH 03/21] target/arm: Convert VZIP, VUZP " Peter Maydell
@ 2020-06-19 22:47   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 22:47 UTC
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon VZIP and VUZP insns in the 2-reg-misc group to
> decodetree.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       |  3 ++
>  target/arm/translate-neon.inc.c | 74 ++++++++++++++++++++++++++
>  target/arm/translate.c          | 92 +--------------------------------
>  3 files changed, 79 insertions(+), 90 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 04/21] target/arm: Convert Neon narrowing moves to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (2 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 03/21] target/arm: Convert VZIP, VUZP " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 22:52   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 05/21] target/arm: Convert Neon 2-reg-misc VSHLL " Peter Maydell
                   ` (18 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the Neon narrowing moves VMOVN, VQMOVN, VQMOVUN in the 2-reg-misc
group to decodetree.
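
To make the difference between the four variants concrete, here is a
per-lane sketch in plain C for the 16-bit to 8-bit case. It is an
illustration, not the QEMU helper code, and the function names are
hypothetical. The real helpers take cpu_env as an argument, presumably
so they can record that saturation occurred; that side effect is left
out here.

```c
#include <assert.h>
#include <stdint.h>

/* VMOVN lane: plain truncation to the low half. */
static uint8_t narrow_u8(uint16_t x)
{
    return (uint8_t)x;
}

/* VQMOVN (signed) lane: signed source, signed saturated result. */
static int8_t narrow_sat_s8(int16_t x)
{
    if (x > INT8_MAX) {
        return INT8_MAX;
    }
    if (x < INT8_MIN) {
        return INT8_MIN;
    }
    return (int8_t)x;
}

/* VQMOVN (unsigned) lane: unsigned source, unsigned saturated result. */
static uint8_t narrow_sat_u8(uint16_t x)
{
    return x > UINT8_MAX ? UINT8_MAX : (uint8_t)x;
}

/* VQMOVUN lane: signed source, unsigned saturated result. */
static uint8_t unarrow_sat8(int16_t x)
{
    if (x < 0) {
        return 0;
    }
    if (x > UINT8_MAX) {
        return UINT8_MAX;
    }
    return (uint8_t)x;
}

/* Usage: a few spot checks on single lanes. */
static void narrow_selfcheck(void)
{
    assert(narrow_u8(0x1234) == 0x34);
    assert(narrow_sat_s8(200) == 127);
    assert(narrow_sat_s8(-200) == -128);
    assert(narrow_sat_u8(300) == 255);
    assert(unarrow_sat8(-5) == 0);
    assert(unarrow_sat8(300) == 255);
}
```

The patch picks the per-size function from a narrowfn[] table indexed by
a->size, with a NULL entry for size == 3 so that encoding UNDEFs.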

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/neon-dp.decode       |  9 ++++
 target/arm/translate-neon.inc.c | 59 ++++++++++++++++++++++++
 target/arm/translate.c          | 81 +--------------------------------
 3 files changed, 70 insertions(+), 79 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index ad9e17fd737..2277b4c7b51 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -439,6 +439,8 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
 
     @2misc       .... ... .. . .. size:2 .. .... . .... q:1 . . .... \
                  &2misc vm=%vm_dp vd=%vd_dp
+    @2misc_q0    .... ... .. . .. size:2 .. .... . .... . . . .... \
+                 &2misc vm=%vm_dp vd=%vd_dp q=0
 
     VREV64       1111 001 11 . 11 .. 00 .... 0 0000 . . 0 .... @2misc
 
@@ -450,6 +452,13 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
 
     VUZP         1111 001 11 . 11 .. 10 .... 0 0010 . . 0 .... @2misc
     VZIP         1111 001 11 . 11 .. 10 .... 0 0011 . . 0 .... @2misc
+
+    VMOVN        1111 001 11 . 11 .. 10 .... 0 0100 0 . 0 .... @2misc_q0
+    # VQMOVUN: unsigned result (source is always signed)
+    VQMOVUN      1111 001 11 . 11 .. 10 .... 0 0100 1 . 0 .... @2misc_q0
+    # VQMOVN: signed result, source may be signed (_S) or unsigned (_U)
+    VQMOVN_S     1111 001 11 . 11 .. 10 .... 0 0101 0 . 0 .... @2misc_q0
+    VQMOVN_U     1111 001 11 . 11 .. 10 .... 0 0101 1 . 0 .... @2misc_q0
   ]
 
   # Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index f4799dd9770..b0620972854 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3243,3 +3243,62 @@ static bool trans_VZIP(DisasContext *s, arg_2misc *a)
     };
     return do_zip_uzp(s, a, fn[a->q][a->size]);
 }
+
+static bool do_vmovn(DisasContext *s, arg_2misc *a,
+                     NeonGenNarrowEnvFn *narrowfn)
+{
+    TCGv_i64 rm;
+    TCGv_i32 rd0, rd1;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (a->vm & 1) {
+        return false;
+    }
+
+    if (!narrowfn) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    rm = tcg_temp_new_i64();
+    rd0 = tcg_temp_new_i32();
+    rd1 = tcg_temp_new_i32();
+
+    neon_load_reg64(rm, a->vm);
+    narrowfn(rd0, cpu_env, rm);
+    neon_load_reg64(rm, a->vm + 1);
+    narrowfn(rd1, cpu_env, rm);
+    neon_store_reg(a->vd, 0, rd0);
+    neon_store_reg(a->vd, 1, rd1);
+    tcg_temp_free_i64(rm);
+    return true;
+}
+
+#define DO_VMOVN(INSN, FUNC)                                    \
+    static bool trans_##INSN(DisasContext *s, arg_2misc *a)     \
+    {                                                           \
+        static NeonGenNarrowEnvFn * const narrowfn[] = {        \
+            FUNC##8,                                            \
+            FUNC##16,                                           \
+            FUNC##32,                                           \
+            NULL,                                               \
+        };                                                      \
+        return do_vmovn(s, a, narrowfn[a->size]);               \
+    }
+
+DO_VMOVN(VMOVN, gen_neon_narrow_u)
+DO_VMOVN(VQMOVUN, gen_helper_neon_unarrow_sat)
+DO_VMOVN(VQMOVN_S, gen_helper_neon_narrow_sat_s)
+DO_VMOVN(VQMOVN_U, gen_helper_neon_narrow_sat_u)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 442f287d861..8ecae264e15 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -2975,46 +2975,6 @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
     tcg_temp_free_i32(rd);
 }
 
-static inline void gen_neon_narrow(int size, TCGv_i32 dest, TCGv_i64 src)
-{
-    switch (size) {
-    case 0: gen_helper_neon_narrow_u8(dest, src); break;
-    case 1: gen_helper_neon_narrow_u16(dest, src); break;
-    case 2: tcg_gen_extrl_i64_i32(dest, src); break;
-    default: abort();
-    }
-}
-
-static inline void gen_neon_narrow_sats(int size, TCGv_i32 dest, TCGv_i64 src)
-{
-    switch (size) {
-    case 0: gen_helper_neon_narrow_sat_s8(dest, cpu_env, src); break;
-    case 1: gen_helper_neon_narrow_sat_s16(dest, cpu_env, src); break;
-    case 2: gen_helper_neon_narrow_sat_s32(dest, cpu_env, src); break;
-    default: abort();
-    }
-}
-
-static inline void gen_neon_narrow_satu(int size, TCGv_i32 dest, TCGv_i64 src)
-{
-    switch (size) {
-    case 0: gen_helper_neon_narrow_sat_u8(dest, cpu_env, src); break;
-    case 1: gen_helper_neon_narrow_sat_u16(dest, cpu_env, src); break;
-    case 2: gen_helper_neon_narrow_sat_u32(dest, cpu_env, src); break;
-    default: abort();
-    }
-}
-
-static inline void gen_neon_unarrow_sats(int size, TCGv_i32 dest, TCGv_i64 src)
-{
-    switch (size) {
-    case 0: gen_helper_neon_unarrow_sat8(dest, cpu_env, src); break;
-    case 1: gen_helper_neon_unarrow_sat16(dest, cpu_env, src); break;
-    case 2: gen_helper_neon_unarrow_sat32(dest, cpu_env, src); break;
-    default: abort();
-    }
-}
-
 static inline void gen_neon_widen(TCGv_i64 dest, TCGv_i32 src, int size, int u)
 {
     if (u) {
@@ -3035,24 +2995,6 @@ static inline void gen_neon_widen(TCGv_i64 dest, TCGv_i32 src, int size, int u)
     tcg_temp_free_i32(src);
 }
 
-static void gen_neon_narrow_op(int op, int u, int size,
-                               TCGv_i32 dest, TCGv_i64 src)
-{
-    if (op) {
-        if (u) {
-            gen_neon_unarrow_sats(size, dest, src);
-        } else {
-            gen_neon_narrow(size, dest, src);
-        }
-    } else {
-        if (u) {
-            gen_neon_narrow_satu(size, dest, src);
-        } else {
-            gen_neon_narrow_sats(size, dest, src);
-        }
-    }
-}
-
 /* Symbolic constants for op fields for Neon 2-register miscellaneous.
  * The values correspond to bits [17:16,10:7]; see the ARM ARM DDI0406B
  * table A7-13.
@@ -4994,8 +4936,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     !arm_dc_feature(s, ARM_FEATURE_V8)) {
                     return 1;
                 }
-                if ((op != NEON_2RM_VMOVN && op != NEON_2RM_VQMOVN) &&
-                    q && ((rm | rd) & 1)) {
+                if (q && ((rm | rd) & 1)) {
                     return 1;
                 }
                 switch (op) {
@@ -5004,6 +4945,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_VPADAL: case NEON_2RM_VPADAL_U:
                 case NEON_2RM_VUZP:
                 case NEON_2RM_VZIP:
+                case NEON_2RM_VMOVN: case NEON_2RM_VQMOVN:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -5019,25 +4961,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         goto elementwise;
                     }
                     break;
-                case NEON_2RM_VMOVN: case NEON_2RM_VQMOVN:
-                    /* also VQMOVUN; op field and mnemonics don't line up */
-                    if (rm & 1) {
-                        return 1;
-                    }
-                    tmp2 = NULL;
-                    for (pass = 0; pass < 2; pass++) {
-                        neon_load_reg64(cpu_V0, rm + pass);
-                        tmp = tcg_temp_new_i32();
-                        gen_neon_narrow_op(op == NEON_2RM_VMOVN, q, size,
-                                           tmp, cpu_V0);
-                        if (pass == 0) {
-                            tmp2 = tmp;
-                        } else {
-                            neon_store_reg(rd, 0, tmp2);
-                            neon_store_reg(rd, 1, tmp);
-                        }
-                    }
-                    break;
                 case NEON_2RM_VSHLL:
                     if (q || (rd & 1)) {
                         return 1;
-- 
2.20.1
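
As an aside for readers following the narrowing helpers that do_vmovn()
dispatches to: the sketch below is a standalone model of the four
flavours for a->size == 0 (16-bit source lanes, 8-bit results). It is
an illustration only, not QEMU code. The model_* names are hypothetical,
and the real saturating helpers also take cpu_env, which is how they
record saturation in the sticky QC flag; the model leaves that out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read 16-bit lane i (0..3) of a 64-bit value. */
static uint32_t model_lane16(uint64_t src, int i)
{
    return (uint32_t)(src >> (16 * i)) & 0xffffu;
}

/* Hypothetical helper: sign-extend a 16-bit lane portably. */
static int32_t model_sext16(uint32_t lane)
{
    return (int32_t)(lane ^ 0x8000u) - 0x8000;
}

/* Models VMOVN: plain truncation, keep the low byte of each lane. */
static uint32_t model_narrow_u8(uint64_t src)
{
    uint32_t res = 0;
    for (int i = 0; i < 4; i++) {
        res |= (model_lane16(src, i) & 0xffu) << (8 * i);
    }
    return res;
}

/* Models VQMOVN_S: signed source, signed result in [-128, 127]. */
static uint32_t model_narrow_sat_s8(uint64_t src)
{
    uint32_t res = 0;
    for (int i = 0; i < 4; i++) {
        int32_t v = model_sext16(model_lane16(src, i));
        if (v > INT8_MAX) {
            v = INT8_MAX;
        } else if (v < INT8_MIN) {
            v = INT8_MIN;
        }
        res |= ((uint32_t)v & 0xffu) << (8 * i);
    }
    return res;
}

/* Models VQMOVN_U: unsigned source, unsigned result in [0, 255]. */
static uint32_t model_narrow_sat_u8(uint64_t src)
{
    uint32_t res = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t v = model_lane16(src, i);
        if (v > 0xffu) {
            v = 0xffu;
        }
        res |= v << (8 * i);
    }
    return res;
}

/* Models VQMOVUN: signed source, unsigned result in [0, 255]. */
static uint32_t model_unarrow_sat8(uint64_t src)
{
    uint32_t res = 0;
    for (int i = 0; i < 4; i++) {
        int32_t v = model_sext16(model_lane16(src, i));
        if (v < 0) {
            v = 0;
        } else if (v > 0xff) {
            v = 0xff;
        }
        res |= (uint32_t)v << (8 * i);
    }
    return res;
}
```

With the source lanes 0x0042, 0x7fff, 0xff80, 0x0123 (low to high) the
four models differ only in how they treat the out-of-range lanes, which
is the distinction the four decode patterns encode.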




* Re: [PATCH 04/21] target/arm: Convert Neon narrowing moves to decodetree
  2020-06-16 17:08 ` [PATCH 04/21] target/arm: Convert Neon narrowing moves " Peter Maydell
@ 2020-06-19 22:52   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 22:52 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon narrowing moves VMOVN, VQMOVN, VQMOVUN in the 2-reg-misc
> group to decodetree.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       |  9 ++++
>  target/arm/translate-neon.inc.c | 59 ++++++++++++++++++++++++
>  target/arm/translate.c          | 81 +--------------------------------
>  3 files changed, 70 insertions(+), 79 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 05/21] target/arm: Convert Neon 2-reg-misc VSHLL to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (3 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 04/21] target/arm: Convert Neon narrowing moves " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 22:55   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 06/21] target/arm: Convert Neon VCVT f16/f32 insns " Peter Maydell
                   ` (17 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the VSHLL insn in the 2-reg-misc Neon group to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/neon-dp.decode       |  2 ++
 target/arm/translate-neon.inc.c | 52 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 35 +---------------------
 3 files changed, 55 insertions(+), 34 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 2277b4c7b51..0102aa7254b 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -459,6 +459,8 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     # VQMOVN: signed result, source may be signed (_S) or unsigned (_U)
     VQMOVN_S     1111 001 11 . 11 .. 10 .... 0 0101 0 . 0 .... @2misc_q0
     VQMOVN_U     1111 001 11 . 11 .. 10 .... 0 0101 1 . 0 .... @2misc_q0
+
+    VSHLL        1111 001 11 . 11 .. 10 .... 0 0110 0 . 0 .... @2misc_q0
   ]
 
   # Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index b0620972854..78239ec1c1b 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3302,3 +3302,55 @@ DO_VMOVN(VMOVN, gen_neon_narrow_u)
 DO_VMOVN(VQMOVUN, gen_helper_neon_unarrow_sat)
 DO_VMOVN(VQMOVN_S, gen_helper_neon_narrow_sat_s)
 DO_VMOVN(VQMOVN_U, gen_helper_neon_narrow_sat_u)
+
+static bool trans_VSHLL(DisasContext *s, arg_2misc *a)
+{
+    TCGv_i32 rm0, rm1;
+    TCGv_i64 rd;
+    static NeonGenWidenFn * const widenfns[] = {
+        gen_helper_neon_widen_u8,
+        gen_helper_neon_widen_u16,
+        tcg_gen_extu_i32_i64,
+        NULL,
+    };
+    NeonGenWidenFn *widenfn = widenfns[a->size];
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (a->vd & 1) {
+        return false;
+    }
+
+    if (!widenfn) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    rd = tcg_temp_new_i64();
+
+    rm0 = neon_load_reg(a->vm, 0);
+    rm1 = neon_load_reg(a->vm, 1);
+
+    widenfn(rd, rm0);
+    tcg_gen_shli_i64(rd, rd, 8 << a->size);
+    neon_store_reg64(rd, a->vd);
+    widenfn(rd, rm1);
+    tcg_gen_shli_i64(rd, rd, 8 << a->size);
+    neon_store_reg64(rd, a->vd + 1);
+
+    tcg_temp_free_i64(rd);
+    tcg_temp_free_i32(rm0);
+    tcg_temp_free_i32(rm1);
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 8ecae264e15..94d5e34fff4 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -2975,26 +2975,6 @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
     tcg_temp_free_i32(rd);
 }
 
-static inline void gen_neon_widen(TCGv_i64 dest, TCGv_i32 src, int size, int u)
-{
-    if (u) {
-        switch (size) {
-        case 0: gen_helper_neon_widen_u8(dest, src); break;
-        case 1: gen_helper_neon_widen_u16(dest, src); break;
-        case 2: tcg_gen_extu_i32_i64(dest, src); break;
-        default: abort();
-        }
-    } else {
-        switch (size) {
-        case 0: gen_helper_neon_widen_s8(dest, src); break;
-        case 1: gen_helper_neon_widen_s16(dest, src); break;
-        case 2: tcg_gen_ext_i32_i64(dest, src); break;
-        default: abort();
-        }
-    }
-    tcg_temp_free_i32(src);
-}
-
 /* Symbolic constants for op fields for Neon 2-register miscellaneous.
  * The values correspond to bits [17:16,10:7]; see the ARM ARM DDI0406B
  * table A7-13.
@@ -4946,6 +4926,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_VUZP:
                 case NEON_2RM_VZIP:
                 case NEON_2RM_VMOVN: case NEON_2RM_VQMOVN:
+                case NEON_2RM_VSHLL:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -4961,20 +4942,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         goto elementwise;
                     }
                     break;
-                case NEON_2RM_VSHLL:
-                    if (q || (rd & 1)) {
-                        return 1;
-                    }
-                    tmp = neon_load_reg(rm, 0);
-                    tmp2 = neon_load_reg(rm, 1);
-                    for (pass = 0; pass < 2; pass++) {
-                        if (pass == 1)
-                            tmp = tmp2;
-                        gen_neon_widen(cpu_V0, tmp, size, 1);
-                        tcg_gen_shli_i64(cpu_V0, cpu_V0, 8 << size);
-                        neon_store_reg64(cpu_V0, rd + pass);
-                    }
-                    break;
                 case NEON_2RM_VCVT_F16_F32:
                 {
                     TCGv_ptr fpst;
-- 
2.20.1
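
A note on why trans_VSHLL() can use a single tcg_gen_shli_i64() on the
whole 64-bit value rather than shifting each lane: after the unsigned
widen, the top half of every lane is zero, so shifting the full register
by the element width cannot spill bits into a neighbouring lane. The
sketch below checks that for the 8-bit case. It is a standalone model
under that assumption about the widen helper; the model_* names are
hypothetical and none of this is QEMU code.

```c
#include <assert.h>
#include <stdint.h>

/* Models the unsigned 8->16 widen: four bytes become four 16-bit lanes. */
static uint64_t model_widen_u8(uint32_t x)
{
    uint64_t res = 0;
    for (int i = 0; i < 4; i++) {
        uint64_t byte = (x >> (8 * i)) & 0xffu;
        res |= byte << (16 * i);
    }
    return res;
}

/* One half of VSHLL.I8 as the patch does it: widen, then one 64-bit shift. */
static uint64_t model_vshll_i8_half(uint32_t x)
{
    return model_widen_u8(x) << 8;
}

/* Reference: shift every widened lane on its own, then reassemble. */
static uint64_t model_vshll_i8_half_ref(uint32_t x)
{
    uint64_t res = 0;
    for (int i = 0; i < 4; i++) {
        uint64_t lane = (((uint64_t)(x >> (8 * i)) & 0xffu) << 8) & 0xffffu;
        res |= lane << (16 * i);
    }
    return res;
}
```

The same argument applies to the 16-bit and 32-bit cases with shifts of
16 and 32, which is what the "8 << a->size" expression selects.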




* Re: [PATCH 05/21] target/arm: Convert Neon 2-reg-misc VSHLL to decodetree
  2020-06-16 17:08 ` [PATCH 05/21] target/arm: Convert Neon 2-reg-misc VSHLL " Peter Maydell
@ 2020-06-19 22:55   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 22:55 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the VSHLL insn in the 2-reg-misc Neon group to decodetree.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       |  2 ++
>  target/arm/translate-neon.inc.c | 52 +++++++++++++++++++++++++++++++++
>  target/arm/translate.c          | 35 +---------------------
>  3 files changed, 55 insertions(+), 34 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 06/21] target/arm: Convert Neon VCVT f16/f32 insns to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (4 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 05/21] target/arm: Convert Neon 2-reg-misc VSHLL " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 23:01   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 07/21] target/arm: Convert vectorised 2-reg-misc Neon ops " Peter Maydell
                   ` (16 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the Neon insns in the 2-reg-misc group which are
VCVT between f32 and f16 to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/neon-dp.decode       |  3 ++
 target/arm/translate-neon.inc.c | 96 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 65 ++--------------------
 3 files changed, 102 insertions(+), 62 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 0102aa7254b..8174f2f92f4 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -461,6 +461,9 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     VQMOVN_U     1111 001 11 . 11 .. 10 .... 0 0101 1 . 0 .... @2misc_q0
 
     VSHLL        1111 001 11 . 11 .. 10 .... 0 0110 0 . 0 .... @2misc_q0
+
+    VCVT_F16_F32 1111 001 11 . 11 .. 10 .... 0 1100 0 . 0 .... @2misc_q0
+    VCVT_F32_F16 1111 001 11 . 11 .. 10 .... 0 1110 0 . 0 .... @2misc_q0
   ]
 
   # Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 78239ec1c1b..d37be597cf4 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3354,3 +3354,99 @@ static bool trans_VSHLL(DisasContext *s, arg_2misc *a)
     tcg_temp_free_i32(rm1);
     return true;
 }
+
+static bool trans_VCVT_F16_F32(DisasContext *s, arg_2misc *a)
+{
+    TCGv_ptr fpst;
+    TCGv_i32 ahp, tmp, tmp2, tmp3;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+        !dc_isar_feature(aa32_fp16_spconv, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vm & 1) || (a->size != 1)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fpst = get_fpstatus_ptr(true);
+    ahp = get_ahp_flag();
+    tmp = neon_load_reg(a->vm, 0);
+    gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp);
+    tmp2 = neon_load_reg(a->vm, 1);
+    gen_helper_vfp_fcvt_f32_to_f16(tmp2, tmp2, fpst, ahp);
+    tcg_gen_shli_i32(tmp2, tmp2, 16);
+    tcg_gen_or_i32(tmp2, tmp2, tmp);
+    tcg_temp_free_i32(tmp);
+    tmp = neon_load_reg(a->vm, 2);
+    gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp);
+    tmp3 = neon_load_reg(a->vm, 3);
+    neon_store_reg(a->vd, 0, tmp2);
+    gen_helper_vfp_fcvt_f32_to_f16(tmp3, tmp3, fpst, ahp);
+    tcg_gen_shli_i32(tmp3, tmp3, 16);
+    tcg_gen_or_i32(tmp3, tmp3, tmp);
+    neon_store_reg(a->vd, 1, tmp3);
+    tcg_temp_free_i32(tmp);
+    tcg_temp_free_i32(ahp);
+    tcg_temp_free_ptr(fpst);
+
+    return true;
+}
+
+static bool trans_VCVT_F32_F16(DisasContext *s, arg_2misc *a)
+{
+    TCGv_ptr fpst;
+    TCGv_i32 ahp, tmp, tmp2, tmp3;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+        !dc_isar_feature(aa32_fp16_spconv, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vd & 1) || (a->size != 1)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fpst = get_fpstatus_ptr(true);
+    ahp = get_ahp_flag();
+    tmp3 = tcg_temp_new_i32();
+    tmp = neon_load_reg(a->vm, 0);
+    tmp2 = neon_load_reg(a->vm, 1);
+    tcg_gen_ext16u_i32(tmp3, tmp);
+    gen_helper_vfp_fcvt_f16_to_f32(tmp3, tmp3, fpst, ahp);
+    neon_store_reg(a->vd, 0, tmp3);
+    tcg_gen_shri_i32(tmp, tmp, 16);
+    gen_helper_vfp_fcvt_f16_to_f32(tmp, tmp, fpst, ahp);
+    neon_store_reg(a->vd, 1, tmp);
+    tmp3 = tcg_temp_new_i32();
+    tcg_gen_ext16u_i32(tmp3, tmp2);
+    gen_helper_vfp_fcvt_f16_to_f32(tmp3, tmp3, fpst, ahp);
+    neon_store_reg(a->vd, 2, tmp3);
+    tcg_gen_shri_i32(tmp2, tmp2, 16);
+    gen_helper_vfp_fcvt_f16_to_f32(tmp2, tmp2, fpst, ahp);
+    neon_store_reg(a->vd, 3, tmp2);
+    tcg_temp_free_i32(ahp);
+    tcg_temp_free_ptr(fpst);
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 94d5e34fff4..1ea09695546 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4860,7 +4860,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     int pass;
     int u;
     int vec_size;
-    TCGv_i32 tmp, tmp2, tmp3;
+    TCGv_i32 tmp, tmp2;
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
         return 1;
@@ -4927,6 +4927,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_VZIP:
                 case NEON_2RM_VMOVN: case NEON_2RM_VQMOVN:
                 case NEON_2RM_VSHLL:
+                case NEON_2RM_VCVT_F16_F32:
+                case NEON_2RM_VCVT_F32_F16:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -4942,67 +4944,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         goto elementwise;
                     }
                     break;
-                case NEON_2RM_VCVT_F16_F32:
-                {
-                    TCGv_ptr fpst;
-                    TCGv_i32 ahp;
-
-                    if (!dc_isar_feature(aa32_fp16_spconv, s) ||
-                        q || (rm & 1)) {
-                        return 1;
-                    }
-                    fpst = get_fpstatus_ptr(true);
-                    ahp = get_ahp_flag();
-                    tmp = neon_load_reg(rm, 0);
-                    gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp);
-                    tmp2 = neon_load_reg(rm, 1);
-                    gen_helper_vfp_fcvt_f32_to_f16(tmp2, tmp2, fpst, ahp);
-                    tcg_gen_shli_i32(tmp2, tmp2, 16);
-                    tcg_gen_or_i32(tmp2, tmp2, tmp);
-                    tcg_temp_free_i32(tmp);
-                    tmp = neon_load_reg(rm, 2);
-                    gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp);
-                    tmp3 = neon_load_reg(rm, 3);
-                    neon_store_reg(rd, 0, tmp2);
-                    gen_helper_vfp_fcvt_f32_to_f16(tmp3, tmp3, fpst, ahp);
-                    tcg_gen_shli_i32(tmp3, tmp3, 16);
-                    tcg_gen_or_i32(tmp3, tmp3, tmp);
-                    neon_store_reg(rd, 1, tmp3);
-                    tcg_temp_free_i32(tmp);
-                    tcg_temp_free_i32(ahp);
-                    tcg_temp_free_ptr(fpst);
-                    break;
-                }
-                case NEON_2RM_VCVT_F32_F16:
-                {
-                    TCGv_ptr fpst;
-                    TCGv_i32 ahp;
-                    if (!dc_isar_feature(aa32_fp16_spconv, s) ||
-                        q || (rd & 1)) {
-                        return 1;
-                    }
-                    fpst = get_fpstatus_ptr(true);
-                    ahp = get_ahp_flag();
-                    tmp3 = tcg_temp_new_i32();
-                    tmp = neon_load_reg(rm, 0);
-                    tmp2 = neon_load_reg(rm, 1);
-                    tcg_gen_ext16u_i32(tmp3, tmp);
-                    gen_helper_vfp_fcvt_f16_to_f32(tmp3, tmp3, fpst, ahp);
-                    neon_store_reg(rd, 0, tmp3);
-                    tcg_gen_shri_i32(tmp, tmp, 16);
-                    gen_helper_vfp_fcvt_f16_to_f32(tmp, tmp, fpst, ahp);
-                    neon_store_reg(rd, 1, tmp);
-                    tmp3 = tcg_temp_new_i32();
-                    tcg_gen_ext16u_i32(tmp3, tmp2);
-                    gen_helper_vfp_fcvt_f16_to_f32(tmp3, tmp3, fpst, ahp);
-                    neon_store_reg(rd, 2, tmp3);
-                    tcg_gen_shri_i32(tmp2, tmp2, 16);
-                    gen_helper_vfp_fcvt_f16_to_f32(tmp2, tmp2, fpst, ahp);
-                    neon_store_reg(rd, 3, tmp2);
-                    tcg_temp_free_i32(ahp);
-                    tcg_temp_free_ptr(fpst);
-                    break;
-                }
                 case NEON_2RM_AESE: case NEON_2RM_AESMC:
                     if (!dc_isar_feature(aa32_aes, s) || ((rm | rd) & 1)) {
                         return 1;
-- 
2.20.1
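
For anyone reading the shift/or sequences in the two VCVT functions: the
f16 values are kept two to a 32-bit element, even-numbered value in the
low half. The sketch below models only that packing and unpacking, not
the floating-point conversion itself (which stays in the existing
gen_helper_vfp_fcvt_* helpers). The model_* names are hypothetical. Note
also that trans_VCVT_F16_F32 loads element 3 of Vm before its first
store to Vd; my reading is that this keeps the result correct when Dd
overlaps Qm, but that is an inference from the code order.

```c
#include <assert.h>
#include <stdint.h>

/* Models tcg_gen_shli_i32(hi, hi, 16) followed by tcg_gen_or_i32(). */
static uint32_t model_pack_f16_pair(uint16_t lo, uint16_t hi)
{
    return (uint32_t)lo | ((uint32_t)hi << 16);
}

/* Models tcg_gen_ext16u_i32(): the even-numbered f16 of the pair. */
static uint16_t model_unpack_f16_lo(uint32_t word)
{
    return (uint16_t)(word & 0xffffu);
}

/* Models tcg_gen_shri_i32(word, word, 16): the odd-numbered f16. */
static uint16_t model_unpack_f16_hi(uint32_t word)
{
    return (uint16_t)(word >> 16);
}
```

In the test values used with this sketch, 0x3c00 and 0xc000 are the
IEEE half-precision encodings of 1.0 and -2.0.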




* Re: [PATCH 06/21] target/arm: Convert Neon VCVT f16/f32 insns to decodetree
  2020-06-16 17:08 ` [PATCH 06/21] target/arm: Convert Neon VCVT f16/f32 insns " Peter Maydell
@ 2020-06-19 23:01   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:01 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon insns in the 2-reg-misc group which are
> VCVT between f32 and f16 to decodetree.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       |  3 ++
>  target/arm/translate-neon.inc.c | 96 +++++++++++++++++++++++++++++++++
>  target/arm/translate.c          | 65 ++--------------------
>  3 files changed, 102 insertions(+), 62 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 07/21] target/arm: Convert vectorised 2-reg-misc Neon ops to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (5 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 06/21] target/arm: Convert Neon VCVT f16/f32 insns " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 23:17   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 08/21] target/arm: Convert Neon 2-reg-misc crypto operations " Peter Maydell
                   ` (15 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert to decodetree the insns in the Neon 2-reg-misc grouping which
we implement using gvec.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/neon-dp.decode       | 11 +++++++
 target/arm/translate-neon.inc.c | 55 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 35 +++++----------------
 3 files changed, 74 insertions(+), 27 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 8174f2f92f4..b5692070d62 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -447,9 +447,20 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     VPADDL_S     1111 001 11 . 11 .. 00 .... 0 0100 . . 0 .... @2misc
     VPADDL_U     1111 001 11 . 11 .. 00 .... 0 0101 . . 0 .... @2misc
 
+    VMVN         1111 001 11 . 11 .. 00 .... 0 1011 . . 0 .... @2misc
+
     VPADAL_S     1111 001 11 . 11 .. 00 .... 0 1100 . . 0 .... @2misc
     VPADAL_U     1111 001 11 . 11 .. 00 .... 0 1101 . . 0 .... @2misc
 
+    VCGT0        1111 001 11 . 11 .. 01 .... 0 0000 . . 0 .... @2misc
+    VCGE0        1111 001 11 . 11 .. 01 .... 0 0001 . . 0 .... @2misc
+    VCEQ0        1111 001 11 . 11 .. 01 .... 0 0010 . . 0 .... @2misc
+    VCLE0        1111 001 11 . 11 .. 01 .... 0 0011 . . 0 .... @2misc
+    VCLT0        1111 001 11 . 11 .. 01 .... 0 0100 . . 0 .... @2misc
+
+    VABS         1111 001 11 . 11 .. 01 .... 0 0110 . . 0 .... @2misc
+    VNEG         1111 001 11 . 11 .. 01 .... 0 0111 . . 0 .... @2misc
+
     VUZP         1111 001 11 . 11 .. 10 .... 0 0010 . . 0 .... @2misc
     VZIP         1111 001 11 . 11 .. 10 .... 0 0011 . . 0 .... @2misc
 
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index d37be597cf4..d80123514c2 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3450,3 +3450,58 @@ static bool trans_VCVT_F32_F16(DisasContext *s, arg_2misc *a)
 
     return true;
 }
+
+static bool do_2misc_vec(DisasContext *s, arg_2misc *a, GVecGen2Fn *fn)
+{
+    int vec_size = a->q ? 16 : 8;
+    int rd_ofs = neon_reg_offset(a->vd, 0);
+    int rm_ofs = neon_reg_offset(a->vm, 0);
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (a->size == 3) {
+        return false;
+    }
+
+    if ((a->vd | a->vm) & a->q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fn(a->size, rd_ofs, rm_ofs, vec_size, vec_size);
+
+    return true;
+}
+
+#define DO_2MISC_VEC(INSN, FN)                                  \
+    static bool trans_##INSN(DisasContext *s, arg_2misc *a)     \
+    {                                                           \
+        return do_2misc_vec(s, a, FN);                          \
+    }
+
+DO_2MISC_VEC(VNEG, tcg_gen_gvec_neg)
+DO_2MISC_VEC(VABS, tcg_gen_gvec_abs)
+DO_2MISC_VEC(VCEQ0, gen_gvec_ceq0)
+DO_2MISC_VEC(VCGT0, gen_gvec_cgt0)
+DO_2MISC_VEC(VCLE0, gen_gvec_cle0)
+DO_2MISC_VEC(VCGE0, gen_gvec_cge0)
+DO_2MISC_VEC(VCLT0, gen_gvec_clt0)
+
+static bool trans_VMVN(DisasContext *s, arg_2misc *a)
+{
+    if (a->size != 0) {
+        return false;
+    }
+    return do_2misc_vec(s, a, tcg_gen_gvec_not);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 1ea09695546..0f0741a37bc 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4859,7 +4859,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     int size;
     int pass;
     int u;
-    int vec_size;
     TCGv_i32 tmp, tmp2;
 
     if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
@@ -4883,7 +4882,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     VFP_DREG_D(rd, insn);
     VFP_DREG_M(rm, insn);
     size = (insn >> 20) & 3;
-    vec_size = q ? 16 : 8;
     rd_ofs = neon_reg_offset(rd, 0);
     rm_ofs = neon_reg_offset(rm, 0);
 
@@ -4929,6 +4927,14 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_VSHLL:
                 case NEON_2RM_VCVT_F16_F32:
                 case NEON_2RM_VCVT_F32_F16:
+                case NEON_2RM_VMVN:
+                case NEON_2RM_VNEG:
+                case NEON_2RM_VABS:
+                case NEON_2RM_VCEQ0:
+                case NEON_2RM_VCGT0:
+                case NEON_2RM_VCLE0:
+                case NEON_2RM_VCGE0:
+                case NEON_2RM_VCLT0:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -4989,31 +4995,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                                        q ? gen_helper_crypto_sha256su0
                                        : gen_helper_crypto_sha1su1);
                     break;
-                case NEON_2RM_VMVN:
-                    tcg_gen_gvec_not(0, rd_ofs, rm_ofs, vec_size, vec_size);
-                    break;
-                case NEON_2RM_VNEG:
-                    tcg_gen_gvec_neg(size, rd_ofs, rm_ofs, vec_size, vec_size);
-                    break;
-                case NEON_2RM_VABS:
-                    tcg_gen_gvec_abs(size, rd_ofs, rm_ofs, vec_size, vec_size);
-                    break;
-
-                case NEON_2RM_VCEQ0:
-                    gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size);
-                    break;
-                case NEON_2RM_VCGT0:
-                    gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
-                    break;
-                case NEON_2RM_VCLE0:
-                    gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size);
-                    break;
-                case NEON_2RM_VCGE0:
-                    gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size);
-                    break;
-                case NEON_2RM_VCLT0:
-                    gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
-                    break;
 
                 default:
                 elementwise:
-- 
2.20.1
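
The UNDEF checks in do_2misc_vec() are compact, so here is a standalone
model of them, plus a per-lane model of one of the compare-with-zero
operations for 8-bit elements on a D register. This is a sketch, not
QEMU code: the model_* names are hypothetical, and has_d32 stands in for
dc_isar_feature(aa32_simd_r32, s).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns false where do_2misc_vec() would return false (UNDEF). */
static bool model_2misc_vec_ok(int vd, int vm, int q, int size, bool has_d32)
{
    /* D16-D31 are only valid when the CPU has 32 D registers. */
    if (!has_d32 && ((vd | vm) & 0x10)) {
        return false;
    }
    /* size == 3 is not a valid element size for these insns. */
    if (size == 3) {
        return false;
    }
    /* With q == 1 both registers are Q registers and must be even. */
    if ((vd | vm) & q) {
        return false;
    }
    return true;
}

/* Models VCGT0 for signed 8-bit lanes: all-ones where lane > 0. */
static uint64_t model_vcgt0_s8(uint64_t d)
{
    uint64_t res = 0;
    for (int i = 0; i < 8; i++) {
        uint32_t lane = (uint32_t)(d >> (8 * i)) & 0xffu;
        int32_t v = (int32_t)(lane ^ 0x80u) - 0x80;
        if (v > 0) {
            res |= (uint64_t)0xff << (8 * i);
        }
    }
    return res;
}
```

The "(vd | vm) & q" test works because q is 0 or 1, so it only ever
inspects bit 0 of the register numbers, and only when q is set.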




* Re: [PATCH 07/21] target/arm: Convert vectorised 2-reg-misc Neon ops to decodetree
  2020-06-16 17:08 ` [PATCH 07/21] target/arm: Convert vectorised 2-reg-misc Neon ops " Peter Maydell
@ 2020-06-19 23:17   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:17 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert to decodetree the insns in the Neon 2-reg-misc grouping which
> we implement using gvec.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       | 11 +++++++
>  target/arm/translate-neon.inc.c | 55 +++++++++++++++++++++++++++++++++
>  target/arm/translate.c          | 35 +++++----------------
>  3 files changed, 74 insertions(+), 27 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 08/21] target/arm: Convert Neon 2-reg-misc crypto operations to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (6 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 07/21] target/arm: Convert vectorised 2-reg-misc Neon ops " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 23:25   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 09/21] target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn Peter Maydell
                   ` (14 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the Neon 2-reg-misc crypto ops (AESE, AESD, AESMC, AESIMC,
SHA1H, SHA1SU1, SHA256SU0) to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/neon-dp.decode       | 12 ++++++++
 target/arm/translate-neon.inc.c | 42 ++++++++++++++++++++++++++
 target/arm/translate.c          | 52 +++------------------------------
 3 files changed, 58 insertions(+), 48 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index b5692070d62..86b1b9e34bf 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -441,12 +441,19 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
                  &2misc vm=%vm_dp vd=%vd_dp
     @2misc_q0    .... ... .. . .. size:2 .. .... . .... . . . .... \
                  &2misc vm=%vm_dp vd=%vd_dp q=0
+    @2misc_q1    .... ... .. . .. size:2 .. .... . .... . . . .... \
+                 &2misc vm=%vm_dp vd=%vd_dp q=1
 
     VREV64       1111 001 11 . 11 .. 00 .... 0 0000 . . 0 .... @2misc
 
     VPADDL_S     1111 001 11 . 11 .. 00 .... 0 0100 . . 0 .... @2misc
     VPADDL_U     1111 001 11 . 11 .. 00 .... 0 0101 . . 0 .... @2misc
 
+    AESE         1111 001 11 . 11 .. 00 .... 0 0110 0 . 0 .... @2misc_q1
+    AESD         1111 001 11 . 11 .. 00 .... 0 0110 1 . 0 .... @2misc_q1
+    AESMC        1111 001 11 . 11 .. 00 .... 0 0111 0 . 0 .... @2misc_q1
+    AESIMC       1111 001 11 . 11 .. 00 .... 0 0111 1 . 0 .... @2misc_q1
+
     VMVN         1111 001 11 . 11 .. 00 .... 0 1011 . . 0 .... @2misc
 
     VPADAL_S     1111 001 11 . 11 .. 00 .... 0 1100 . . 0 .... @2misc
@@ -458,6 +465,8 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     VCLE0        1111 001 11 . 11 .. 01 .... 0 0011 . . 0 .... @2misc
     VCLT0        1111 001 11 . 11 .. 01 .... 0 0100 . . 0 .... @2misc
 
+    SHA1H        1111 001 11 . 11 .. 01 .... 0 0101 1 . 0 .... @2misc_q1
+
     VABS         1111 001 11 . 11 .. 01 .... 0 0110 . . 0 .... @2misc
     VNEG         1111 001 11 . 11 .. 01 .... 0 0111 . . 0 .... @2misc
 
@@ -473,6 +482,9 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
 
     VSHLL        1111 001 11 . 11 .. 10 .... 0 0110 0 . 0 .... @2misc_q0
 
+    SHA1SU1      1111 001 11 . 11 .. 10 .... 0 0111 0 . 0 .... @2misc_q1
+    SHA256SU0    1111 001 11 . 11 .. 10 .... 0 0111 1 . 0 .... @2misc_q1
+
     VCVT_F16_F32 1111 001 11 . 11 .. 10 .... 0 1100 0 . 0 .... @2misc_q0
     VCVT_F32_F16 1111 001 11 . 11 .. 10 .... 0 1110 0 . 0 .... @2misc_q0
   ]
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index d80123514c2..5e2cd18bf71 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3505,3 +3505,45 @@ static bool trans_VMVN(DisasContext *s, arg_2misc *a)
     }
     return do_2misc_vec(s, a, tcg_gen_gvec_not);
 }
+
+#define WRAP_2M_3_OOL_FN(WRAPNAME, FUNC, DATA)                          \
+    static void WRAPNAME(unsigned vece, uint32_t rd_ofs,                \
+                         uint32_t rm_ofs, uint32_t oprsz,               \
+                         uint32_t maxsz)                                \
+    {                                                                   \
+        tcg_gen_gvec_3_ool(rd_ofs, rd_ofs, rm_ofs, oprsz, maxsz,        \
+                           DATA, FUNC);                                 \
+    }
+
+#define WRAP_2M_2_OOL_FN(WRAPNAME, FUNC, DATA)                          \
+    static void WRAPNAME(unsigned vece, uint32_t rd_ofs,                \
+                         uint32_t rm_ofs, uint32_t oprsz,               \
+                         uint32_t maxsz)                                \
+    {                                                                   \
+        tcg_gen_gvec_2_ool(rd_ofs, rm_ofs, oprsz, maxsz, DATA, FUNC);   \
+    }
+
+WRAP_2M_3_OOL_FN(gen_AESE, gen_helper_crypto_aese, 0)
+WRAP_2M_3_OOL_FN(gen_AESD, gen_helper_crypto_aese, 1)
+WRAP_2M_2_OOL_FN(gen_AESMC, gen_helper_crypto_aesmc, 0)
+WRAP_2M_2_OOL_FN(gen_AESIMC, gen_helper_crypto_aesmc, 1)
+WRAP_2M_2_OOL_FN(gen_SHA1H, gen_helper_crypto_sha1h, 0)
+WRAP_2M_2_OOL_FN(gen_SHA1SU1, gen_helper_crypto_sha1su1, 0)
+WRAP_2M_2_OOL_FN(gen_SHA256SU0, gen_helper_crypto_sha256su0, 0)
+
+#define DO_2M_CRYPTO(INSN, FEATURE, SIZE)                       \
+    static bool trans_##INSN(DisasContext *s, arg_2misc *a)     \
+    {                                                           \
+        if (!dc_isar_feature(FEATURE, s) || a->size != SIZE) {  \
+            return false;                                       \
+        }                                                       \
+        return do_2misc_vec(s, a, gen_##INSN);                  \
+    }
+
+DO_2M_CRYPTO(AESE, aa32_aes, 0)
+DO_2M_CRYPTO(AESD, aa32_aes, 0)
+DO_2M_CRYPTO(AESMC, aa32_aes, 0)
+DO_2M_CRYPTO(AESIMC, aa32_aes, 0)
+DO_2M_CRYPTO(SHA1H, aa32_sha1, 2)
+DO_2M_CRYPTO(SHA1SU1, aa32_sha1, 2)
+DO_2M_CRYPTO(SHA256SU0, aa32_sha2, 2)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 0f0741a37bc..38644995ab2 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4855,7 +4855,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
 {
     int op;
     int q;
-    int rd, rm, rd_ofs, rm_ofs;
+    int rd, rm;
     int size;
     int pass;
     int u;
@@ -4882,8 +4882,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
     VFP_DREG_D(rd, insn);
     VFP_DREG_M(rm, insn);
     size = (insn >> 20) & 3;
-    rd_ofs = neon_reg_offset(rd, 0);
-    rm_ofs = neon_reg_offset(rm, 0);
 
     if ((insn & (1 << 23)) == 0) {
         /* Three register same length: handled by decodetree */
@@ -4935,6 +4933,9 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_VCLE0:
                 case NEON_2RM_VCGE0:
                 case NEON_2RM_VCLT0:
+                case NEON_2RM_AESE: case NEON_2RM_AESMC:
+                case NEON_2RM_SHA1H:
+                case NEON_2RM_SHA1SU1:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -4950,51 +4951,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         goto elementwise;
                     }
                     break;
-                case NEON_2RM_AESE: case NEON_2RM_AESMC:
-                    if (!dc_isar_feature(aa32_aes, s) || ((rm | rd) & 1)) {
-                        return 1;
-                    }
-                    /*
-                     * Bit 6 is the lowest opcode bit; it distinguishes
-                     * between encryption (AESE/AESMC) and decryption
-                     * (AESD/AESIMC).
-                     */
-                    if (op == NEON_2RM_AESE) {
-                        tcg_gen_gvec_3_ool(vfp_reg_offset(true, rd),
-                                           vfp_reg_offset(true, rd),
-                                           vfp_reg_offset(true, rm),
-                                           16, 16, extract32(insn, 6, 1),
-                                           gen_helper_crypto_aese);
-                    } else {
-                        tcg_gen_gvec_2_ool(vfp_reg_offset(true, rd),
-                                           vfp_reg_offset(true, rm),
-                                           16, 16, extract32(insn, 6, 1),
-                                           gen_helper_crypto_aesmc);
-                    }
-                    break;
-                case NEON_2RM_SHA1H:
-                    if (!dc_isar_feature(aa32_sha1, s) || ((rm | rd) & 1)) {
-                        return 1;
-                    }
-                    tcg_gen_gvec_2_ool(rd_ofs, rm_ofs, 16, 16, 0,
-                                       gen_helper_crypto_sha1h);
-                    break;
-                case NEON_2RM_SHA1SU1:
-                    if ((rm | rd) & 1) {
-                            return 1;
-                    }
-                    /* bit 6 (q): set -> SHA256SU0, cleared -> SHA1SU1 */
-                    if (q) {
-                        if (!dc_isar_feature(aa32_sha2, s)) {
-                            return 1;
-                        }
-                    } else if (!dc_isar_feature(aa32_sha1, s)) {
-                        return 1;
-                    }
-                    tcg_gen_gvec_2_ool(rd_ofs, rm_ofs, 16, 16, 0,
-                                       q ? gen_helper_crypto_sha256su0
-                                       : gen_helper_crypto_sha1su1);
-                    break;
 
                 default:
                 elementwise:
-- 
2.20.1
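
For readers following the new neon-dp.decode lines in this patch: the
patterns take size from bits [19:18], use bits [17:16] and [10:7] as the
opcode, and for the crypto insns treat bit 6 as an extra opcode bit rather
than as Q. The following is a minimal standalone sketch of that field
extraction, assuming nothing beyond the bit layout visible in the patterns.
The struct and function names are hypothetical and are not QEMU code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fields of interest; not a QEMU type. */
typedef struct {
    unsigned size;  /* insn[19:18] */
    unsigned op;    /* insn[17:16] concatenated with insn[10:7] */
    bool bit6;      /* insn[6]: Q for most insns, an opcode bit for crypto */
} Misc2Fields;

/* Extract 'len' bits of 'insn' starting at bit 'start'. */
static uint32_t bits(uint32_t insn, unsigned start, unsigned len)
{
    return (insn >> start) & ((1u << len) - 1u);
}

/* Pull the size, opcode and bit-6 fields out of a 2-reg-misc insn word. */
static Misc2Fields misc2_decode(uint32_t insn)
{
    Misc2Fields f;

    f.size = bits(insn, 18, 2);
    f.op = (bits(insn, 16, 2) << 4) | bits(insn, 7, 4);
    f.bit6 = bits(insn, 6, 1) != 0;
    return f;
}

/* True for AESD/AESIMC, false for AESE/AESMC, following the patterns. */
static bool misc2_aes_is_decrypt(uint32_t insn)
{
    return misc2_decode(insn).bit6;
}
```

With all register fields zero, the AESE pattern corresponds to the word
0xF3B00300 and the AESD pattern to 0xF3B00340, so the two differ only in
the bit reported by misc2_aes_is_decrypt().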




* Re: [PATCH 08/21] target/arm: Convert Neon 2-reg-misc crypto operations to decodetree
  2020-06-16 17:08 ` [PATCH 08/21] target/arm: Convert Neon 2-reg-misc crypto operations " Peter Maydell
@ 2020-06-19 23:25   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:25 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon 2-reg-misc crypto ops (AESE, AESD, AESMC, AESIMC,
> SHA1H, SHA1SU1, SHA256SU0) to decodetree.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       | 12 ++++++++
>  target/arm/translate-neon.inc.c | 42 ++++++++++++++++++++++++++
>  target/arm/translate.c          | 52 +++------------------------------
>  3 files changed, 58 insertions(+), 48 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
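
The WRAP_2M_*_OOL_FN macros in this patch appear to exist because
do_2misc_vec() takes a callback with one fixed signature, while the
out-of-line helpers need an extra constant data argument (0 or 1 to pick
the encrypt or decrypt variant). A self-contained sketch of the same
binding trick in plain C follows. All names are hypothetical, and a toy
arithmetic helper stands in for the real TCG calls.

```c
#include <stdint.h>

/* Callback shape that the (hypothetical) dispatcher expects. */
typedef uint32_t UnaryFn(uint32_t in);

/* Toy helper taking an extra constant selector, like DATA in the patch. */
static uint32_t toy_helper(uint32_t in, int decrypt)
{
    return decrypt ? in - 1u : in + 1u;
}

/* Generate a wrapper that binds DATA so the result matches UnaryFn. */
#define WRAP_UNARY(WRAPNAME, FUNC, DATA)        \
    static uint32_t WRAPNAME(uint32_t in)       \
    {                                           \
        return FUNC(in, DATA);                  \
    }

WRAP_UNARY(toy_enc, toy_helper, 0)
WRAP_UNARY(toy_dec, toy_helper, 1)

/* Dispatcher that only knows about the common signature. */
static uint32_t apply_unary(UnaryFn *fn, uint32_t in)
{
    return fn(in);
}
```

The dispatcher never sees the selector, which is the property the patch
relies on when it passes gen_AESE and gen_AESD through the same path.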




* [PATCH 09/21] target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (7 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 08/21] target/arm: Convert Neon 2-reg-misc crypto operations " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 23:28   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 10/21] target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs Peter Maydell
                   ` (13 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

The NeonGenOneOpFn typedef breaks with the pattern of the other
NeonGen*Fn typedefs, because it is a TCGv_i64 -> TCGv_i64 operation
but it does not have '64' in its name. Rename it to NeonGenOne64OpFn,
so that the old name is available for a TCGv_i32 -> TCGv_i32 operation
(which we will need in a subsequent commit).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h     | 2 +-
 target/arm/translate-a64.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index 62ed5c4780c..35218b3fdf1 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -374,7 +374,7 @@ typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
 typedef void NeonGenTwoOpWidenFn(TCGv_i64, TCGv_i32, TCGv_i32);
 typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
 typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
-typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
+typedef void NeonGenOne64OpFn(TCGv_i64, TCGv_i64);
 typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
 typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
 typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index a0e72ad6942..7cb5fbfba80 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11917,8 +11917,8 @@ static void handle_2misc_pairwise(DisasContext *s, int opcode, bool u,
     } else {
         for (pass = 0; pass < maxpass; pass++) {
             TCGv_i64 tcg_op = tcg_temp_new_i64();
-            NeonGenOneOpFn *genfn;
-            static NeonGenOneOpFn * const fns[2][2] = {
+            NeonGenOne64OpFn *genfn;
+            static NeonGenOne64OpFn * const fns[2][2] = {
                 { gen_helper_neon_addlp_s8,  gen_helper_neon_addlp_u8 },
                 { gen_helper_neon_addlp_s16,  gen_helper_neon_addlp_u16 },
             };
-- 
2.20.1
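
To illustrate the naming convention this rename restores, here is a small
sketch in which plain uint32_t and uint64_t stand in for TCGv_i32 and
TCGv_i64. The typedef and function names are hypothetical. The point is
only that the '64' suffix marks the 64-bit variant, which leaves the
unsuffixed name free for the 32-bit one.

```c
#include <stdint.h>

/* The '64' in the name marks the 64-bit -> 64-bit variant... */
typedef uint64_t OneOp64Fn(uint64_t);
/* ...so the unsuffixed name can describe the 32-bit -> 32-bit variant. */
typedef uint32_t OneOpFn(uint32_t);

static uint32_t not32(uint32_t x)
{
    return ~x;
}

static uint64_t not64(uint64_t x)
{
    return ~x;
}

/* Each table only accepts functions of the matching width. */
static OneOpFn * const ops32[] = { not32 };
static OneOp64Fn * const ops64[] = { not64 };
```

Putting not64 into ops32 would be diagnosed by the compiler as an
incompatible pointer type, which is the kind of mistake consistent naming
makes easier to spot in review.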




* Re: [PATCH 09/21] target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn
  2020-06-16 17:08 ` [PATCH 09/21] target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn Peter Maydell
@ 2020-06-19 23:28   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:28 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> The NeonGenOneOpFn typedef breaks with the pattern of the other
> NeonGen*Fn typedefs, because it is a TCGv_i64 -> TCGv_i64 operation
> but it does not have '64' in its name. Rename it to NeonGenOne64OpFn,
> so that the old name is available for a TCGv_i32 -> TCGv_i32 operation
> (which we will need in a subsequent commit).
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/translate.h     | 2 +-
>  target/arm/translate-a64.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 10/21] target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (8 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 09/21] target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 23:32   ` [PATCH 10/21] target/arm: Fix capitalization in NeonGenTwo{Single,Double}OPFn typedefs Richard Henderson
  2020-06-16 17:08 ` [PATCH 11/21] target/arm: Make gen_swap_half() take separate src and dest Peter Maydell
                   ` (12 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

All the other typedefs like these spell "Op" with a lowercase 'p';
rename the NeonGenTwoSingleOPFn and NeonGenTwoDoubleOPFn typedefs to
match.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h          | 4 ++--
 target/arm/translate-a64.c      | 4 ++--
 target/arm/translate-neon.inc.c | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index 35218b3fdf1..467c5291101 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -372,8 +372,8 @@ typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
 typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
 typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
 typedef void NeonGenTwoOpWidenFn(TCGv_i64, TCGv_i32, TCGv_i32);
-typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
-typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
+typedef void NeonGenTwoSingleOpFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
+typedef void NeonGenTwoDoubleOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
 typedef void NeonGenOne64OpFn(TCGv_i64, TCGv_i64);
 typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
 typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 7cb5fbfba80..12040984981 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -9534,7 +9534,7 @@ static void handle_2misc_fcmp_zero(DisasContext *s, int opcode,
         TCGv_i64 tcg_op = tcg_temp_new_i64();
         TCGv_i64 tcg_zero = tcg_const_i64(0);
         TCGv_i64 tcg_res = tcg_temp_new_i64();
-        NeonGenTwoDoubleOPFn *genfn;
+        NeonGenTwoDoubleOpFn *genfn;
         bool swap = false;
         int pass;
 
@@ -9576,7 +9576,7 @@ static void handle_2misc_fcmp_zero(DisasContext *s, int opcode,
         TCGv_i32 tcg_op = tcg_temp_new_i32();
         TCGv_i32 tcg_zero = tcg_const_i32(0);
         TCGv_i32 tcg_res = tcg_temp_new_i32();
-        NeonGenTwoSingleOPFn *genfn;
+        NeonGenTwoSingleOpFn *genfn;
         bool swap = false;
         int pass, maxpasses;
 
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 5e2cd18bf71..c39443c8cae 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -1664,7 +1664,7 @@ static bool trans_VSHLL_U_2sh(DisasContext *s, arg_2reg_shift *a)
 }
 
 static bool do_fp_2sh(DisasContext *s, arg_2reg_shift *a,
-                      NeonGenTwoSingleOPFn *fn)
+                      NeonGenTwoSingleOpFn *fn)
 {
     /* FP operations in 2-reg-and-shift group */
     TCGv_i32 tmp, shiftv;
-- 
2.20.1




* Re: [PATCH 10/21] target/arm: Fix capitalization in NeonGenTwo{Single,Double}OPFn typedefs
  2020-06-16 17:08 ` [PATCH 10/21] target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs Peter Maydell
@ 2020-06-19 23:32   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:32 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> All the other typedefs like these spell "Op" with a lowercase 'p';
> rename the NeonGenTwoSingleOPFn and NeonGenTwoDoubleOPFn typedefs to
> match.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/translate.h          | 4 ++--
>  target/arm/translate-a64.c      | 4 ++--
>  target/arm/translate-neon.inc.c | 2 +-
>  3 files changed, 5 insertions(+), 5 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 11/21] target/arm: Make gen_swap_half() take separate src and dest
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (9 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 10/21] target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 23:33   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 12/21] target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree Peter Maydell
                   ` (11 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Make gen_swap_half() take a source and destination TCGv_i32 rather
than modifying the input TCGv_i32; we're going to want to be able to
use it with the more flexible function signature, and this also
brings it into line with other functions like gen_rev16() and
gen_revsh().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-neon.inc.c |  2 +-
 target/arm/translate.c          | 10 +++++-----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index c39443c8cae..4967e974386 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3007,7 +3007,7 @@ static bool trans_VREV64(DisasContext *s, arg_VREV64 *a)
                 tcg_gen_bswap32_i32(tmp[half], tmp[half]);
                 break;
             case 1:
-                gen_swap_half(tmp[half]);
+                gen_swap_half(tmp[half], tmp[half]);
                 break;
             case 2:
                 break;
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 38644995ab2..64b18a95b64 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -378,9 +378,9 @@ static void gen_revsh(TCGv_i32 dest, TCGv_i32 var)
 }
 
 /* Swap low and high halfwords.  */
-static void gen_swap_half(TCGv_i32 var)
+static void gen_swap_half(TCGv_i32 dest, TCGv_i32 var)
 {
-    tcg_gen_rotri_i32(var, var, 16);
+    tcg_gen_rotri_i32(dest, var, 16);
 }
 
 /* Dual 16-bit add.  Result placed in t0 and t1 is marked as dead.
@@ -4960,7 +4960,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                         case NEON_2RM_VREV32:
                             switch (size) {
                             case 0: tcg_gen_bswap32_i32(tmp, tmp); break;
-                            case 1: gen_swap_half(tmp); break;
+                            case 1: gen_swap_half(tmp, tmp); break;
                             default: abort();
                             }
                             break;
@@ -8046,7 +8046,7 @@ static bool op_smlad(DisasContext *s, arg_rrrr *a, bool m_swap, bool sub)
     t1 = load_reg(s, a->rn);
     t2 = load_reg(s, a->rm);
     if (m_swap) {
-        gen_swap_half(t2);
+        gen_swap_half(t2, t2);
     }
     gen_smul_dual(t1, t2);
 
@@ -8104,7 +8104,7 @@ static bool op_smlald(DisasContext *s, arg_rrrr *a, bool m_swap, bool sub)
     t1 = load_reg(s, a->rn);
     t2 = load_reg(s, a->rm);
     if (m_swap) {
-        gen_swap_half(t2);
+        gen_swap_half(t2, t2);
     }
     gen_smul_dual(t1, t2);
 
-- 
2.20.1
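
As a host-side illustration of the signature change in this patch: the
halfword swap is a 32-bit rotate by 16, and the new form simply separates
the destination from the source while still allowing them to alias. This
is a sketch on plain uint32_t values with hypothetical names, not the TCG
code itself.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by 16, i.e. swap its two halfwords. */
static uint32_t swap_half_value(uint32_t x)
{
    return (x >> 16) | (x << 16);
}

/* Old style: modify the operand in place. */
static void swap_half_inplace(uint32_t *var)
{
    *var = swap_half_value(*var);
}

/* New style: separate destination and source; dest may alias src. */
static void swap_half(uint32_t *dest, const uint32_t *src)
{
    *dest = swap_half_value(*src);
}
```

Existing callers that want the old behaviour pass the same operand twice,
which is what the gen_swap_half(tmp, tmp) call sites in the diff do.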




* Re: [PATCH 11/21] target/arm: Make gen_swap_half() take separate src and dest
  2020-06-16 17:08 ` [PATCH 11/21] target/arm: Make gen_swap_half() take separate src and dest Peter Maydell
@ 2020-06-19 23:33   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:33 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Make gen_swap_half() take a source and destination TCGv_i32 rather
> than modifying the input TCGv_i32; we're going to want to be able to
> use it with the more flexible function signature, and this also
> brings it into line with other functions like gen_rev16() and
> gen_revsh().
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/translate-neon.inc.c |  2 +-
>  target/arm/translate.c          | 10 +++++-----
>  2 files changed, 6 insertions(+), 6 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 12/21] target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (10 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 11/21] target/arm: Make gen_swap_half() take separate src and dest Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 23:35   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 13/21] target/arm: Convert remaining simple 2-reg-misc Neon ops Peter Maydell
                   ` (10 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the VREV32 and VREV16 insns in the Neon 2-reg-misc group
to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h          |  1 +
 target/arm/neon-dp.decode       |  2 ++
 target/arm/translate-neon.inc.c | 55 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 12 ++-----
 4 files changed, 60 insertions(+), 10 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index 467c5291101..4dbeee4c89f 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -363,6 +363,7 @@ typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
                         uint32_t, uint32_t, uint32_t);
 
 /* Function prototype for gen_ functions for calling Neon helpers */
+typedef void NeonGenOneOpFn(TCGv_i32, TCGv_i32);
 typedef void NeonGenOneOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32);
 typedef void NeonGenTwoOpFn(TCGv_i32, TCGv_i32, TCGv_i32);
 typedef void NeonGenTwoOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32);
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 86b1b9e34bf..0a791af46c8 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -445,6 +445,8 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
                  &2misc vm=%vm_dp vd=%vd_dp q=1
 
     VREV64       1111 001 11 . 11 .. 00 .... 0 0000 . . 0 .... @2misc
+    VREV32       1111 001 11 . 11 .. 00 .... 0 0001 . . 0 .... @2misc
+    VREV16       1111 001 11 . 11 .. 00 .... 0 0010 . . 0 .... @2misc
 
     VPADDL_S     1111 001 11 . 11 .. 00 .... 0 0100 . . 0 .... @2misc
     VPADDL_U     1111 001 11 . 11 .. 00 .... 0 0101 . . 0 .... @2misc
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 4967e974386..0a779980d01 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3547,3 +3547,58 @@ DO_2M_CRYPTO(AESIMC, aa32_aes, 0)
 DO_2M_CRYPTO(SHA1H, aa32_sha1, 2)
 DO_2M_CRYPTO(SHA1SU1, aa32_sha1, 2)
 DO_2M_CRYPTO(SHA256SU0, aa32_sha2, 2)
+
+static bool do_2misc(DisasContext *s, arg_2misc *a, NeonGenOneOpFn *fn)
+{
+    int pass;
+
+    /* Handle a 2-reg-misc operation by iterating 32 bits at a time */
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (!fn) {
+        return false;
+    }
+
+    if ((a->vd | a->vm) & a->q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
+        TCGv_i32 tmp = neon_load_reg(a->vm, pass);
+        fn(tmp, tmp);
+        neon_store_reg(a->vd, pass, tmp);
+    }
+
+    return true;
+}
+
+static bool trans_VREV32(DisasContext *s, arg_2misc *a)
+{
+    static NeonGenOneOpFn * const fn[] = {
+        tcg_gen_bswap32_i32,
+        gen_swap_half,
+        NULL,
+        NULL,
+    };
+    return do_2misc(s, a, fn[a->size]);
+}
+
+static bool trans_VREV16(DisasContext *s, arg_2misc *a)
+{
+    if (a->size != 0) {
+        return false;
+    }
+    return do_2misc(s, a, gen_rev16);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 64b18a95b64..5b50eddd111 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4936,6 +4936,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_AESE: case NEON_2RM_AESMC:
                 case NEON_2RM_SHA1H:
                 case NEON_2RM_SHA1SU1:
+                case NEON_2RM_VREV32:
+                case NEON_2RM_VREV16:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -4957,16 +4959,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     for (pass = 0; pass < (q ? 4 : 2); pass++) {
                         tmp = neon_load_reg(rm, pass);
                         switch (op) {
-                        case NEON_2RM_VREV32:
-                            switch (size) {
-                            case 0: tcg_gen_bswap32_i32(tmp, tmp); break;
-                            case 1: gen_swap_half(tmp, tmp); break;
-                            default: abort();
-                            }
-                            break;
-                        case NEON_2RM_VREV16:
-                            gen_rev16(tmp, tmp);
-                            break;
                         case NEON_2RM_VCLS:
                             switch (size) {
                             case 0: gen_helper_neon_cls_s8(tmp, tmp); break;
-- 
2.20.1
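
The structure of do_2misc() and trans_VREV32() in this patch (a table of
per-size functions with NULL for invalid sizes, applied 32 bits at a time)
can be modelled on the host as follows. This is a sketch under the
assumption that each pass handles one 32-bit chunk of the register. The
names are hypothetical and plain integers stand in for TCG temporaries.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Word32Fn(uint32_t);

/* Reverse all four bytes of the word (VREV32 with 8-bit elements). */
static uint32_t rev_bytes32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Swap the two halfwords of the word (VREV32 with 16-bit elements). */
static uint32_t rev_halves32(uint32_t x)
{
    return (x >> 16) | (x << 16);
}

/* Swap the bytes inside each halfword (VREV16 with 8-bit elements). */
static uint32_t rev_bytes16(uint32_t x)
{
    return ((x >> 8) & 0x00ff00ffu) | ((x << 8) & 0xff00ff00u);
}

/* Indexed by size; NULL marks sizes with no valid operation. */
static Word32Fn * const vrev32_fn[4] = {
    rev_bytes32, rev_halves32, NULL, NULL,
};

/*
 * Apply the per-word operation to 'nwords' 32-bit chunks (2 for a D
 * register, 4 for a Q register). Returns 0 and leaves dst untouched
 * if the size has no valid operation, otherwise returns 1.
 */
static int model_vrev32(uint32_t *dst, const uint32_t *src,
                        unsigned nwords, unsigned size)
{
    Word32Fn *fn = size < 4 ? vrev32_fn[size] : NULL;

    if (!fn) {
        return 0;
    }
    for (unsigned pass = 0; pass < nwords; pass++) {
        dst[pass] = fn(src[pass]);
    }
    return 1;
}
```

Rejecting the NULL entry before touching any state mirrors the way the
trans function returns false so that the insn is treated as undefined.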




* Re: [PATCH 12/21] target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree
  2020-06-16 17:08 ` [PATCH 12/21] target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree Peter Maydell
@ 2020-06-19 23:35   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:35 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the VREV32 and VREV16 insns in the Neon 2-reg-misc group
> to decodetree.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/translate.h          |  1 +
>  target/arm/neon-dp.decode       |  2 ++
>  target/arm/translate-neon.inc.c | 55 +++++++++++++++++++++++++++++++++
>  target/arm/translate.c          | 12 ++-----
>  4 files changed, 60 insertions(+), 10 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
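
The two register checks at the top of do_2misc() in this patch are compact
enough to be easy to misread, so here they are restated as a standalone
predicate. This is a sketch only: the function name and the has_d32 flag
are hypothetical stand-ins for the aa32_simd_r32 feature test, and vd/vm
are taken to be 5-bit D-register numbers as in the patch.

```c
#include <stdbool.h>

/*
 * vd, vm:  D-register numbers (0..31)
 * q:       1 for a Q-register (128-bit) operation, 0 for a D-register one
 * has_d32: whether D16-D31 exist on this CPU
 */
static bool regs_valid(unsigned vd, unsigned vm, unsigned q, bool has_d32)
{
    /* Bit 4 set means D16 or above, which must exist to be usable. */
    if (!has_d32 && ((vd | vm) & 0x10)) {
        return false;
    }
    /* A Q register is an even/odd D pair, so Q ops need even numbers. */
    if ((vd | vm) & q) {
        return false;
    }
    return true;
}
```

Since q is either 0 or 1, the expression (vd | vm) & q tests the low bit
of both register numbers only when the operation is a Q-register one.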




* [PATCH 13/21] target/arm: Convert remaining simple 2-reg-misc Neon ops
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (11 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 12/21] target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 23:41   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 14/21] target/arm: Convert Neon VQABS, VQNEG to decodetree Peter Maydell
                   ` (9 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the remaining ops in the Neon 2-reg-misc group which
can be implemented simply with our do_2misc() helper.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/neon-dp.decode       | 10 +++++
 target/arm/translate-neon.inc.c | 69 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 38 ++++--------------
 3 files changed, 86 insertions(+), 31 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 0a791af46c8..f947f7d09f0 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -456,6 +456,10 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     AESMC        1111 001 11 . 11 .. 00 .... 0 0111 0 . 0 .... @2misc_q1
     AESIMC       1111 001 11 . 11 .. 00 .... 0 0111 1 . 0 .... @2misc_q1
 
+    VCLS         1111 001 11 . 11 .. 00 .... 0 1000 . . 0 .... @2misc
+    VCLZ         1111 001 11 . 11 .. 00 .... 0 1001 . . 0 .... @2misc
+    VCNT         1111 001 11 . 11 .. 00 .... 0 1010 . . 0 .... @2misc
+
     VMVN         1111 001 11 . 11 .. 00 .... 0 1011 . . 0 .... @2misc
 
     VPADAL_S     1111 001 11 . 11 .. 00 .... 0 1100 . . 0 .... @2misc
@@ -472,6 +476,9 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     VABS         1111 001 11 . 11 .. 01 .... 0 0110 . . 0 .... @2misc
     VNEG         1111 001 11 . 11 .. 01 .... 0 0111 . . 0 .... @2misc
 
+    VABS_F       1111 001 11 . 11 .. 01 .... 0 1110 . . 0 .... @2misc
+    VNEG_F       1111 001 11 . 11 .. 01 .... 0 1111 . . 0 .... @2misc
+
     VUZP         1111 001 11 . 11 .. 10 .... 0 0010 . . 0 .... @2misc
     VZIP         1111 001 11 . 11 .. 10 .... 0 0011 . . 0 .... @2misc
 
@@ -489,6 +496,9 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
 
     VCVT_F16_F32 1111 001 11 . 11 .. 10 .... 0 1100 0 . 0 .... @2misc_q0
     VCVT_F32_F16 1111 001 11 . 11 .. 10 .... 0 1110 0 . 0 .... @2misc_q0
+
+    VRECPE       1111 001 11 . 11 .. 11 .... 0 1000 . . 0 .... @2misc
+    VRSQRTE      1111 001 11 . 11 .. 11 .... 0 1001 . . 0 .... @2misc
   ]
 
   # Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 0a779980d01..336c2b312eb 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3602,3 +3602,72 @@ static bool trans_VREV16(DisasContext *s, arg_2misc *a)
     }
     return do_2misc(s, a, gen_rev16);
 }
+
+static bool trans_VCLS(DisasContext *s, arg_2misc *a)
+{
+    static NeonGenOneOpFn * const fn[] = {
+        gen_helper_neon_cls_s8,
+        gen_helper_neon_cls_s16,
+        gen_helper_neon_cls_s32,
+        NULL,
+    };
+    return do_2misc(s, a, fn[a->size]);
+}
+
+static void do_VCLZ_32(TCGv_i32 rd, TCGv_i32 rm)
+{
+    tcg_gen_clzi_i32(rd, rm, 32);
+}
+
+static bool trans_VCLZ(DisasContext *s, arg_2misc *a)
+{
+    static NeonGenOneOpFn * const fn[] = {
+        gen_helper_neon_clz_u8,
+        gen_helper_neon_clz_u16,
+        do_VCLZ_32,
+        NULL,
+    };
+    return do_2misc(s, a, fn[a->size]);
+}
+
+static bool trans_VCNT(DisasContext *s, arg_2misc *a)
+{
+    if (a->size != 0) {
+        return false;
+    }
+    return do_2misc(s, a, gen_helper_neon_cnt_u8);
+}
+
+static bool trans_VABS_F(DisasContext *s, arg_2misc *a)
+{
+    if (a->size != 2) {
+        return false;
+    }
+    /* TODO: FP16 : size == 1 */
+    return do_2misc(s, a, gen_helper_vfp_abss);
+}
+
+static bool trans_VNEG_F(DisasContext *s, arg_2misc *a)
+{
+    if (a->size != 2) {
+        return false;
+    }
+    /* TODO: FP16 : size == 1 */
+    return do_2misc(s, a, gen_helper_vfp_negs);
+}
+
+static bool trans_VRECPE(DisasContext *s, arg_2misc *a)
+{
+    if (a->size != 2) {
+        return false;
+    }
+    return do_2misc(s, a, gen_helper_recpe_u32);
+}
+
+static bool trans_VRSQRTE(DisasContext *s, arg_2misc *a)
+{
+    if (a->size != 2) {
+        return false;
+    }
+    return do_2misc(s, a, gen_helper_rsqrte_u32);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 5b50eddd111..17373743889 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4938,6 +4938,13 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_SHA1SU1:
                 case NEON_2RM_VREV32:
                 case NEON_2RM_VREV16:
+                case NEON_2RM_VCLS:
+                case NEON_2RM_VCLZ:
+                case NEON_2RM_VCNT:
+                case NEON_2RM_VABS_F:
+                case NEON_2RM_VNEG_F:
+                case NEON_2RM_VRECPE:
+                case NEON_2RM_VRSQRTE:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -4959,25 +4966,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     for (pass = 0; pass < (q ? 4 : 2); pass++) {
                         tmp = neon_load_reg(rm, pass);
                         switch (op) {
-                        case NEON_2RM_VCLS:
-                            switch (size) {
-                            case 0: gen_helper_neon_cls_s8(tmp, tmp); break;
-                            case 1: gen_helper_neon_cls_s16(tmp, tmp); break;
-                            case 2: gen_helper_neon_cls_s32(tmp, tmp); break;
-                            default: abort();
-                            }
-                            break;
-                        case NEON_2RM_VCLZ:
-                            switch (size) {
-                            case 0: gen_helper_neon_clz_u8(tmp, tmp); break;
-                            case 1: gen_helper_neon_clz_u16(tmp, tmp); break;
-                            case 2: tcg_gen_clzi_i32(tmp, tmp, 32); break;
-                            default: abort();
-                            }
-                            break;
-                        case NEON_2RM_VCNT:
-                            gen_helper_neon_cnt_u8(tmp, tmp);
-                            break;
                         case NEON_2RM_VQABS:
                             switch (size) {
                             case 0:
@@ -5051,12 +5039,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             tcg_temp_free_ptr(fpstatus);
                             break;
                         }
-                        case NEON_2RM_VABS_F:
-                            gen_helper_vfp_abss(tmp, tmp);
-                            break;
-                        case NEON_2RM_VNEG_F:
-                            gen_helper_vfp_negs(tmp, tmp);
-                            break;
                         case NEON_2RM_VSWP:
                             tmp2 = neon_load_reg(rd, pass);
                             neon_store_reg(rm, pass, tmp2);
@@ -5137,12 +5119,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             tcg_temp_free_ptr(fpst);
                             break;
                         }
-                        case NEON_2RM_VRECPE:
-                            gen_helper_recpe_u32(tmp, tmp);
-                            break;
-                        case NEON_2RM_VRSQRTE:
-                            gen_helper_rsqrte_u32(tmp, tmp);
-                            break;
                         case NEON_2RM_VRECPE_F:
                         {
                             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-- 
2.20.1




* Re: [PATCH 13/21] target/arm: Convert remaining simple 2-reg-misc Neon ops
  2020-06-16 17:08 ` [PATCH 13/21] target/arm: Convert remaining simple 2-reg-misc Neon ops Peter Maydell
@ 2020-06-19 23:41   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:41 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the remaining ops in the Neon 2-reg-misc group which
> can be implemented simply with our do_2misc() helper.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       | 10 +++++
>  target/arm/translate-neon.inc.c | 69 +++++++++++++++++++++++++++++++++
>  target/arm/translate.c          | 38 ++++--------------
>  3 files changed, 86 insertions(+), 31 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 14/21] target/arm: Convert Neon VQABS, VQNEG to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (12 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 13/21] target/arm: Convert remaining simple 2-reg-misc Neon ops Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 23:42   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 15/21] target/arm: Convert simple fp Neon 2-reg-misc insns Peter Maydell
                   ` (8 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the Neon VQABS and VQNEG insns to decodetree.
Since these are the only ones which need cpu_env passed to
the helper, we wrap the helper rather than creating a whole
new do_2misc_env() function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/neon-dp.decode       |  3 +++
 target/arm/translate-neon.inc.c | 35 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 30 ++--------------------------
 3 files changed, 40 insertions(+), 28 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index f947f7d09f0..f0bb34a49eb 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -465,6 +465,9 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     VPADAL_S     1111 001 11 . 11 .. 00 .... 0 1100 . . 0 .... @2misc
     VPADAL_U     1111 001 11 . 11 .. 00 .... 0 1101 . . 0 .... @2misc
 
+    VQABS        1111 001 11 . 11 .. 00 .... 0 1110 . . 0 .... @2misc
+    VQNEG        1111 001 11 . 11 .. 00 .... 0 1111 . . 0 .... @2misc
+
     VCGT0        1111 001 11 . 11 .. 01 .... 0 0000 . . 0 .... @2misc
     VCGE0        1111 001 11 . 11 .. 01 .... 0 0001 . . 0 .... @2misc
     VCEQ0        1111 001 11 . 11 .. 01 .... 0 0010 . . 0 .... @2misc
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 336c2b312eb..2b5dc86f628 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3671,3 +3671,38 @@ static bool trans_VRSQRTE(DisasContext *s, arg_2misc *a)
     }
     return do_2misc(s, a, gen_helper_rsqrte_u32);
 }
+
+#define WRAP_1OP_ENV_FN(WRAPNAME, FUNC) \
+    static void WRAPNAME(TCGv_i32 d, TCGv_i32 m)        \
+    {                                                   \
+        FUNC(d, cpu_env, m);                            \
+    }
+
+WRAP_1OP_ENV_FN(gen_VQABS_s8, gen_helper_neon_qabs_s8)
+WRAP_1OP_ENV_FN(gen_VQABS_s16, gen_helper_neon_qabs_s16)
+WRAP_1OP_ENV_FN(gen_VQABS_s32, gen_helper_neon_qabs_s32)
+WRAP_1OP_ENV_FN(gen_VQNEG_s8, gen_helper_neon_qneg_s8)
+WRAP_1OP_ENV_FN(gen_VQNEG_s16, gen_helper_neon_qneg_s16)
+WRAP_1OP_ENV_FN(gen_VQNEG_s32, gen_helper_neon_qneg_s32)
+
+static bool trans_VQABS(DisasContext *s, arg_2misc *a)
+{
+    static NeonGenOneOpFn * const fn[] = {
+        gen_VQABS_s8,
+        gen_VQABS_s16,
+        gen_VQABS_s32,
+        NULL,
+    };
+    return do_2misc(s, a, fn[a->size]);
+}
+
+static bool trans_VQNEG(DisasContext *s, arg_2misc *a)
+{
+    static NeonGenOneOpFn * const fn[] = {
+        gen_VQNEG_s8,
+        gen_VQNEG_s16,
+        gen_VQNEG_s32,
+        NULL,
+    };
+    return do_2misc(s, a, fn[a->size]);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 17373743889..3cbd2ab0c96 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4945,6 +4945,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_VNEG_F:
                 case NEON_2RM_VRECPE:
                 case NEON_2RM_VRSQRTE:
+                case NEON_2RM_VQABS:
+                case NEON_2RM_VQNEG:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -4966,34 +4968,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     for (pass = 0; pass < (q ? 4 : 2); pass++) {
                         tmp = neon_load_reg(rm, pass);
                         switch (op) {
-                        case NEON_2RM_VQABS:
-                            switch (size) {
-                            case 0:
-                                gen_helper_neon_qabs_s8(tmp, cpu_env, tmp);
-                                break;
-                            case 1:
-                                gen_helper_neon_qabs_s16(tmp, cpu_env, tmp);
-                                break;
-                            case 2:
-                                gen_helper_neon_qabs_s32(tmp, cpu_env, tmp);
-                                break;
-                            default: abort();
-                            }
-                            break;
-                        case NEON_2RM_VQNEG:
-                            switch (size) {
-                            case 0:
-                                gen_helper_neon_qneg_s8(tmp, cpu_env, tmp);
-                                break;
-                            case 1:
-                                gen_helper_neon_qneg_s16(tmp, cpu_env, tmp);
-                                break;
-                            case 2:
-                                gen_helper_neon_qneg_s32(tmp, cpu_env, tmp);
-                                break;
-                            default: abort();
-                            }
-                            break;
                         case NEON_2RM_VCGT0_F:
                         {
                             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-- 
2.20.1




* Re: [PATCH 14/21] target/arm: Convert Neon VQABS, VQNEG to decodetree
  2020-06-16 17:08 ` [PATCH 14/21] target/arm: Convert Neon VQABS, VQNEG to decodetree Peter Maydell
@ 2020-06-19 23:42   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:42 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon VQABS and VQNEG insns to decodetree.
> Since these are the only ones which need cpu_env passed to
> the helper, we wrap the helper rather than creating a whole
> new do_2misc_env() function.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       |  3 +++
>  target/arm/translate-neon.inc.c | 35 +++++++++++++++++++++++++++++++++
>  target/arm/translate.c          | 30 ++--------------------------
>  3 files changed, 40 insertions(+), 28 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 15/21] target/arm: Convert simple fp Neon 2-reg-misc insns
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (13 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 14/21] target/arm: Convert Neon VQABS, VQNEG to decodetree Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 23:44   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 16/21] target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree Peter Maydell
                   ` (7 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the Neon 2-reg-misc insns which are implemented with
simple calls to functions that take the input, output and
fpstatus pointer.
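
As a rough standalone model of that calling convention (not the QEMU
code itself), the sketch below applies one callback of the shape
"output, input, status pointer" to each 32-bit pass of a register.
FpStatus, toy_sitos() and apply_2misc_fp() are hypothetical names made
up for this illustration; the only parts taken from the patch are the
callback shape, the size == 2 restriction and the 2-or-4 pass count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the float status that fpst points at. */
typedef struct { bool inexact; } FpStatus;

/* Same shape as NeonGenOneSingleOpFn: output, input, status pointer. */
typedef void OneSingleOpFn(uint32_t *d, const uint32_t *m, FpStatus *st);

/* Toy int32 -> float32 conversion, standing in for a helper like vfp_sitos. */
static void toy_sitos(uint32_t *d, const uint32_t *m, FpStatus *st)
{
    int32_t in = (int32_t)*m;
    float out = (float)in;

    if ((int64_t)out != (int64_t)in) {
        st->inexact = true;                /* value was rounded */
    }
    memcpy(d, &out, sizeof(out));          /* store the raw float32 bits */
}

/*
 * Apply fn to each 32-bit pass: 2 passes for a D register, 4 for a Q
 * register. Returns false (the "UNDEF" case) for any element size other
 * than 32 bits, mirroring the size != 2 check.
 */
static bool apply_2misc_fp(uint32_t *vd, const uint32_t *vm, int size, bool q,
                           OneSingleOpFn *fn, FpStatus *st)
{
    assert(fn != NULL && st != NULL);
    if (size != 2) {
        return false;
    }
    for (int pass = 0; pass < (q ? 4 : 2); pass++) {
        fn(&vd[pass], &vm[pass], st);
    }
    return true;
}
```

Because every converted insn in this group fits the same three-argument
shape, each one reduces to a single call that names its helper, which is
what the DO_2MISC_FP macro in the patch expresses.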

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate.h          |  1 +
 target/arm/neon-dp.decode       |  8 +++++
 target/arm/translate-neon.inc.c | 62 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 56 ++++-------------------------
 4 files changed, 78 insertions(+), 49 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index 4dbeee4c89f..19650a9e2d7 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -373,6 +373,7 @@ typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
 typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
 typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
 typedef void NeonGenTwoOpWidenFn(TCGv_i64, TCGv_i32, TCGv_i32);
+typedef void NeonGenOneSingleOpFn(TCGv_i32, TCGv_i32, TCGv_ptr);
 typedef void NeonGenTwoSingleOpFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
 typedef void NeonGenTwoDoubleOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
 typedef void NeonGenOne64OpFn(TCGv_i64, TCGv_i64);
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index f0bb34a49eb..ea8d5fd99c3 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -497,11 +497,19 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     SHA1SU1      1111 001 11 . 11 .. 10 .... 0 0111 0 . 0 .... @2misc_q1
     SHA256SU0    1111 001 11 . 11 .. 10 .... 0 0111 1 . 0 .... @2misc_q1
 
+    VRINTX       1111 001 11 . 11 .. 10 .... 0 1001 . . 0 .... @2misc
+
     VCVT_F16_F32 1111 001 11 . 11 .. 10 .... 0 1100 0 . 0 .... @2misc_q0
     VCVT_F32_F16 1111 001 11 . 11 .. 10 .... 0 1110 0 . 0 .... @2misc_q0
 
     VRECPE       1111 001 11 . 11 .. 11 .... 0 1000 . . 0 .... @2misc
     VRSQRTE      1111 001 11 . 11 .. 11 .... 0 1001 . . 0 .... @2misc
+    VRECPE_F     1111 001 11 . 11 .. 11 .... 0 1010 . . 0 .... @2misc
+    VRSQRTE_F    1111 001 11 . 11 .. 11 .... 0 1011 . . 0 .... @2misc
+    VCVT_FS      1111 001 11 . 11 .. 11 .... 0 1100 . . 0 .... @2misc
+    VCVT_FU      1111 001 11 . 11 .. 11 .... 0 1101 . . 0 .... @2misc
+    VCVT_SF      1111 001 11 . 11 .. 11 .... 0 1110 . . 0 .... @2misc
+    VCVT_UF      1111 001 11 . 11 .. 11 .... 0 1111 . . 0 .... @2misc
   ]
 
   # Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 2b5dc86f628..ab183e47d7d 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3706,3 +3706,65 @@ static bool trans_VQNEG(DisasContext *s, arg_2misc *a)
     };
     return do_2misc(s, a, fn[a->size]);
 }
+
+static bool do_2misc_fp(DisasContext *s, arg_2misc *a,
+                        NeonGenOneSingleOpFn *fn)
+{
+    int pass;
+    TCGv_ptr fpst;
+
+    /* Handle a 2-reg-misc operation by iterating 32 bits at a time */
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (a->size != 2) {
+        /* TODO: FP16 will be the size == 1 case */
+        return false;
+    }
+
+    if ((a->vd | a->vm) & a->q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fpst = get_fpstatus_ptr(1);
+    for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
+        TCGv_i32 tmp = neon_load_reg(a->vm, pass);
+        fn(tmp, tmp, fpst);
+        neon_store_reg(a->vd, pass, tmp);
+    }
+    tcg_temp_free_ptr(fpst);
+
+    return true;
+}
+
+#define DO_2MISC_FP(INSN, FUNC)                                 \
+    static bool trans_##INSN(DisasContext *s, arg_2misc *a)     \
+    {                                                           \
+        return do_2misc_fp(s, a, FUNC);                         \
+    }
+
+DO_2MISC_FP(VRECPE_F, gen_helper_recpe_f32)
+DO_2MISC_FP(VRSQRTE_F, gen_helper_rsqrte_f32)
+DO_2MISC_FP(VCVT_FS, gen_helper_vfp_sitos)
+DO_2MISC_FP(VCVT_FU, gen_helper_vfp_uitos)
+DO_2MISC_FP(VCVT_SF, gen_helper_vfp_tosizs)
+DO_2MISC_FP(VCVT_UF, gen_helper_vfp_touizs)
+
+static bool trans_VRINTX(DisasContext *s, arg_2misc *a)
+{
+    if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
+        return false;
+    }
+    return do_2misc_fp(s, a, gen_helper_rints_exact);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 3cbd2ab0c96..48377860c75 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4947,6 +4947,13 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_VRSQRTE:
                 case NEON_2RM_VQABS:
                 case NEON_2RM_VQNEG:
+                case NEON_2RM_VRECPE_F:
+                case NEON_2RM_VRSQRTE_F:
+                case NEON_2RM_VCVT_FS:
+                case NEON_2RM_VCVT_FU:
+                case NEON_2RM_VCVT_SF:
+                case NEON_2RM_VCVT_UF:
+                case NEON_2RM_VRINTX:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -5052,13 +5059,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             tcg_temp_free_i32(tcg_rmode);
                             break;
                         }
-                        case NEON_2RM_VRINTX:
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            gen_helper_rints_exact(tmp, tmp, fpstatus);
-                            tcg_temp_free_ptr(fpstatus);
-                            break;
-                        }
                         case NEON_2RM_VCVTAU:
                         case NEON_2RM_VCVTAS:
                         case NEON_2RM_VCVTNU:
@@ -5093,48 +5093,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             tcg_temp_free_ptr(fpst);
                             break;
                         }
-                        case NEON_2RM_VRECPE_F:
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            gen_helper_recpe_f32(tmp, tmp, fpstatus);
-                            tcg_temp_free_ptr(fpstatus);
-                            break;
-                        }
-                        case NEON_2RM_VRSQRTE_F:
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            gen_helper_rsqrte_f32(tmp, tmp, fpstatus);
-                            tcg_temp_free_ptr(fpstatus);
-                            break;
-                        }
-                        case NEON_2RM_VCVT_FS: /* VCVT.F32.S32 */
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            gen_helper_vfp_sitos(tmp, tmp, fpstatus);
-                            tcg_temp_free_ptr(fpstatus);
-                            break;
-                        }
-                        case NEON_2RM_VCVT_FU: /* VCVT.F32.U32 */
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            gen_helper_vfp_uitos(tmp, tmp, fpstatus);
-                            tcg_temp_free_ptr(fpstatus);
-                            break;
-                        }
-                        case NEON_2RM_VCVT_SF: /* VCVT.S32.F32 */
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            gen_helper_vfp_tosizs(tmp, tmp, fpstatus);
-                            tcg_temp_free_ptr(fpstatus);
-                            break;
-                        }
-                        case NEON_2RM_VCVT_UF: /* VCVT.U32.F32 */
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            gen_helper_vfp_touizs(tmp, tmp, fpstatus);
-                            tcg_temp_free_ptr(fpstatus);
-                            break;
-                        }
                         default:
                             /* Reserved op values were caught by the
                              * neon_2rm_sizes[] check earlier.
-- 
2.20.1




* Re: [PATCH 15/21] target/arm: Convert simple fp Neon 2-reg-misc insns
  2020-06-16 17:08 ` [PATCH 15/21] target/arm: Convert simple fp Neon 2-reg-misc insns Peter Maydell
@ 2020-06-19 23:44   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:44 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon 2-reg-misc insns which are implemented with
> simple calls to functions that take the input, output and
> fpstatus pointer.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/translate.h          |  1 +
>  target/arm/neon-dp.decode       |  8 +++++
>  target/arm/translate-neon.inc.c | 62 +++++++++++++++++++++++++++++++++
>  target/arm/translate.c          | 56 ++++-------------------------
>  4 files changed, 78 insertions(+), 49 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 16/21] target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (14 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 15/21] target/arm: Convert simple fp Neon 2-reg-misc insns Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 23:46   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 17/21] target/arm: Convert Neon 2-reg-misc VRINT " Peter Maydell
                   ` (6 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the fp-compare-with-zero insns in the Neon 2-reg-misc group to
decodetree.
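
The conversion relies on there being only "greater than", "greater or
equal" and "equal" comparison helpers, so that "x <= 0" is computed as
"0 >= x" and "x < 0" as "0 > x" by swapping the operand order. The
sketch below models that with plain C floats; cmp_gt(), cmp_ge(),
cmp_eq() and the generated names are hypothetical stand-ins for the
real helpers, and only the FWD/REV macro structure follows the patch.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Toy two-operand compares returning an all-ones or all-zeroes mask. */
static uint32_t cmp_gt(float a, float b) { return a > b ? 0xffffffffu : 0; }
static uint32_t cmp_ge(float a, float b) { return a >= b ? 0xffffffffu : 0; }
static uint32_t cmp_eq(float a, float b) { return a == b ? 0xffffffffu : 0; }

/* FWD: compare m against zero. REV: compare zero against m. */
#define WRAP_CMP0_FWD(NAME, FUNC) \
    static uint32_t NAME(float m) { return FUNC(m, 0.0f); }
#define WRAP_CMP0_REV(NAME, FUNC) \
    static uint32_t NAME(float m) { return FUNC(0.0f, m); }

/* Token pasting picks the FWD or REV wrapper generator. */
#define DO_CMP0(NAME, FUNC, REV) WRAP_CMP0_##REV(NAME, FUNC)

DO_CMP0(cgt0, cmp_gt, FWD)   /* m >  0 */
DO_CMP0(cge0, cmp_ge, FWD)   /* m >= 0 */
DO_CMP0(ceq0, cmp_eq, FWD)   /* m == 0 */
DO_CMP0(cle0, cmp_ge, REV)   /* 0 >= m, i.e. m <= 0 */
DO_CMP0(clt0, cmp_gt, REV)   /* 0 >  m, i.e. m <  0 */

/* Small self-check showing that the swapped forms behave as expected. */
static void cmp0_selfcheck(void)
{
    assert(cgt0(1.0f) == 0xffffffffu && cgt0(0.0f) == 0);
    assert(cge0(0.0f) == 0xffffffffu && cge0(-1.0f) == 0);
    assert(ceq0(0.0f) == 0xffffffffu && ceq0(1.0f) == 0);
    assert(cle0(-1.0f) == 0xffffffffu && cle0(0.0f) == 0xffffffffu);
    assert(clt0(-1.0f) == 0xffffffffu && clt0(0.0f) == 0);
    /* Every ordered comparison involving a NaN is false, in either order. */
    assert(cle0(NAN) == 0 && cge0(NAN) == 0);
}
```

Swapping the operands rather than negating the result matters for NaN
inputs: "not (x > 0)" would be true for a NaN, whereas "0 >= x" is
false, as an ordered comparison should be.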

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/neon-dp.decode       |  6 ++++
 target/arm/translate-neon.inc.c | 28 ++++++++++++++++++
 target/arm/translate.c          | 50 ++++-----------------------------
 3 files changed, 39 insertions(+), 45 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index ea8d5fd99c3..c9acd00f1e8 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -479,6 +479,12 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     VABS         1111 001 11 . 11 .. 01 .... 0 0110 . . 0 .... @2misc
     VNEG         1111 001 11 . 11 .. 01 .... 0 0111 . . 0 .... @2misc
 
+    VCGT0_F      1111 001 11 . 11 .. 01 .... 0 1000 . . 0 .... @2misc
+    VCGE0_F      1111 001 11 . 11 .. 01 .... 0 1001 . . 0 .... @2misc
+    VCEQ0_F      1111 001 11 . 11 .. 01 .... 0 1010 . . 0 .... @2misc
+    VCLE0_F      1111 001 11 . 11 .. 01 .... 0 1011 . . 0 .... @2misc
+    VCLT0_F      1111 001 11 . 11 .. 01 .... 0 1100 . . 0 .... @2misc
+
     VABS_F       1111 001 11 . 11 .. 01 .... 0 1110 . . 0 .... @2misc
     VNEG_F       1111 001 11 . 11 .. 01 .... 0 1111 . . 0 .... @2misc
 
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index ab183e47d7d..a62da21b152 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3768,3 +3768,31 @@ static bool trans_VRINTX(DisasContext *s, arg_2misc *a)
     }
     return do_2misc_fp(s, a, gen_helper_rints_exact);
 }
+
+#define WRAP_FP_CMP0_FWD(WRAPNAME, FUNC)                        \
+    static void WRAPNAME(TCGv_i32 d, TCGv_i32 m, TCGv_ptr fpst) \
+    {                                                           \
+        TCGv_i32 zero = tcg_const_i32(0);                       \
+        FUNC(d, m, zero, fpst);                                 \
+        tcg_temp_free_i32(zero);                                \
+    }
+#define WRAP_FP_CMP0_REV(WRAPNAME, FUNC)                        \
+    static void WRAPNAME(TCGv_i32 d, TCGv_i32 m, TCGv_ptr fpst) \
+    {                                                           \
+        TCGv_i32 zero = tcg_const_i32(0);                       \
+        FUNC(d, zero, m, fpst);                                 \
+        tcg_temp_free_i32(zero);                                \
+    }
+
+#define DO_FP_CMP0(INSN, FUNC, REV)                             \
+    WRAP_FP_CMP0_##REV(gen_##INSN, FUNC)                        \
+    static bool trans_##INSN(DisasContext *s, arg_2misc *a)     \
+    {                                                           \
+        return do_2misc_fp(s, a, gen_##INSN);                   \
+    }
+
+DO_FP_CMP0(VCGT0_F, gen_helper_neon_cgt_f32, FWD)
+DO_FP_CMP0(VCGE0_F, gen_helper_neon_cge_f32, FWD)
+DO_FP_CMP0(VCEQ0_F, gen_helper_neon_ceq_f32, FWD)
+DO_FP_CMP0(VCLE0_F, gen_helper_neon_cge_f32, REV)
+DO_FP_CMP0(VCLT0_F, gen_helper_neon_cgt_f32, REV)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 48377860c75..dc98928856d 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4954,6 +4954,11 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_VCVT_SF:
                 case NEON_2RM_VCVT_UF:
                 case NEON_2RM_VRINTX:
+                case NEON_2RM_VCGT0_F:
+                case NEON_2RM_VCGE0_F:
+                case NEON_2RM_VCEQ0_F:
+                case NEON_2RM_VCLE0_F:
+                case NEON_2RM_VCLT0_F:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -4975,51 +4980,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     for (pass = 0; pass < (q ? 4 : 2); pass++) {
                         tmp = neon_load_reg(rm, pass);
                         switch (op) {
-                        case NEON_2RM_VCGT0_F:
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            tmp2 = tcg_const_i32(0);
-                            gen_helper_neon_cgt_f32(tmp, tmp, tmp2, fpstatus);
-                            tcg_temp_free_i32(tmp2);
-                            tcg_temp_free_ptr(fpstatus);
-                            break;
-                        }
-                        case NEON_2RM_VCGE0_F:
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            tmp2 = tcg_const_i32(0);
-                            gen_helper_neon_cge_f32(tmp, tmp, tmp2, fpstatus);
-                            tcg_temp_free_i32(tmp2);
-                            tcg_temp_free_ptr(fpstatus);
-                            break;
-                        }
-                        case NEON_2RM_VCEQ0_F:
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            tmp2 = tcg_const_i32(0);
-                            gen_helper_neon_ceq_f32(tmp, tmp, tmp2, fpstatus);
-                            tcg_temp_free_i32(tmp2);
-                            tcg_temp_free_ptr(fpstatus);
-                            break;
-                        }
-                        case NEON_2RM_VCLE0_F:
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            tmp2 = tcg_const_i32(0);
-                            gen_helper_neon_cge_f32(tmp, tmp2, tmp, fpstatus);
-                            tcg_temp_free_i32(tmp2);
-                            tcg_temp_free_ptr(fpstatus);
-                            break;
-                        }
-                        case NEON_2RM_VCLT0_F:
-                        {
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            tmp2 = tcg_const_i32(0);
-                            gen_helper_neon_cgt_f32(tmp, tmp2, tmp, fpstatus);
-                            tcg_temp_free_i32(tmp2);
-                            tcg_temp_free_ptr(fpstatus);
-                            break;
-                        }
                         case NEON_2RM_VSWP:
                             tmp2 = neon_load_reg(rd, pass);
                             neon_store_reg(rm, pass, tmp2);
-- 
2.20.1




* Re: [PATCH 16/21] target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree
  2020-06-16 17:08 ` [PATCH 16/21] target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree Peter Maydell
@ 2020-06-19 23:46   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:46 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the fp-compare-with-zero insns in the Neon 2-reg-misc group to
> decodetree.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       |  6 ++++
>  target/arm/translate-neon.inc.c | 28 ++++++++++++++++++
>  target/arm/translate.c          | 50 ++++-----------------------------
>  3 files changed, 39 insertions(+), 45 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 17/21] target/arm: Convert Neon 2-reg-misc VRINT insns to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (15 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 16/21] target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 23:49   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 18/21] target/arm: Convert Neon 2-reg-misc VCVT " Peter Maydell
                   ` (5 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the Neon 2-reg-misc VRINT insns to decodetree.
Giving these insns their own do_vrint() function allows us
to set the rounding mode once at the start and restore it once
at the end, rather than doing so for every element in the vector.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/neon-dp.decode       |  8 +++++
 target/arm/translate-neon.inc.c | 61 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 31 +++--------------
 3 files changed, 74 insertions(+), 26 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index c9acd00f1e8..e0717c7e4a6 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -503,11 +503,19 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     SHA1SU1      1111 001 11 . 11 .. 10 .... 0 0111 0 . 0 .... @2misc_q1
     SHA256SU0    1111 001 11 . 11 .. 10 .... 0 0111 1 . 0 .... @2misc_q1
 
+    VRINTN       1111 001 11 . 11 .. 10 .... 0 1000 . . 0 .... @2misc
     VRINTX       1111 001 11 . 11 .. 10 .... 0 1001 . . 0 .... @2misc
+    VRINTA       1111 001 11 . 11 .. 10 .... 0 1010 . . 0 .... @2misc
+    VRINTZ       1111 001 11 . 11 .. 10 .... 0 1011 . . 0 .... @2misc
 
     VCVT_F16_F32 1111 001 11 . 11 .. 10 .... 0 1100 0 . 0 .... @2misc_q0
+
+    VRINTM       1111 001 11 . 11 .. 10 .... 0 1101 . . 0 .... @2misc
+
     VCVT_F32_F16 1111 001 11 . 11 .. 10 .... 0 1110 0 . 0 .... @2misc_q0
 
+    VRINTP       1111 001 11 . 11 .. 10 .... 0 1111 . . 0 .... @2misc
+
     VRECPE       1111 001 11 . 11 .. 11 .... 0 1000 . . 0 .... @2misc
     VRSQRTE      1111 001 11 . 11 .. 11 .... 0 1001 . . 0 .... @2misc
     VRECPE_F     1111 001 11 . 11 .. 11 .... 0 1010 . . 0 .... @2misc
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index a62da21b152..0e7f86ad156 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3796,3 +3796,64 @@ DO_FP_CMP0(VCGE0_F, gen_helper_neon_cge_f32, FWD)
 DO_FP_CMP0(VCEQ0_F, gen_helper_neon_ceq_f32, FWD)
 DO_FP_CMP0(VCLE0_F, gen_helper_neon_cge_f32, REV)
 DO_FP_CMP0(VCLT0_F, gen_helper_neon_cgt_f32, REV)
+
+static bool do_vrint(DisasContext *s, arg_2misc *a, int rmode)
+{
+    /*
+     * Handle a VRINT* operation by iterating 32 bits at a time,
+     * with a specified rounding mode in operation.
+     */
+    int pass;
+    TCGv_ptr fpst;
+    TCGv_i32 tcg_rmode;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+        !arm_dc_feature(s, ARM_FEATURE_V8)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (a->size != 2) {
+        /* TODO: FP16 will be the size == 1 case */
+        return false;
+    }
+
+    if ((a->vd | a->vm) & a->q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fpst = get_fpstatus_ptr(1);
+    tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode));
+    gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, cpu_env);
+    for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
+        TCGv_i32 tmp = neon_load_reg(a->vm, pass);
+        gen_helper_rints(tmp, tmp, fpst);
+        neon_store_reg(a->vd, pass, tmp);
+    }
+    gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, cpu_env);
+    tcg_temp_free_i32(tcg_rmode);
+    tcg_temp_free_ptr(fpst);
+
+    return true;
+}
+
+#define DO_VRINT(INSN, RMODE)                                   \
+    static bool trans_##INSN(DisasContext *s, arg_2misc *a)     \
+    {                                                           \
+        return do_vrint(s, a, RMODE);                           \
+    }
+
+DO_VRINT(VRINTN, FPROUNDING_TIEEVEN)
+DO_VRINT(VRINTA, FPROUNDING_TIEAWAY)
+DO_VRINT(VRINTZ, FPROUNDING_ZERO)
+DO_VRINT(VRINTM, FPROUNDING_NEGINF)
+DO_VRINT(VRINTP, FPROUNDING_POSINF)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index dc98928856d..61dfc3ae7af 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4959,6 +4959,11 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_VCEQ0_F:
                 case NEON_2RM_VCLE0_F:
                 case NEON_2RM_VCLT0_F:
+                case NEON_2RM_VRINTN:
+                case NEON_2RM_VRINTA:
+                case NEON_2RM_VRINTM:
+                case NEON_2RM_VRINTP:
+                case NEON_2RM_VRINTZ:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -4993,32 +4998,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             }
                             neon_store_reg(rm, pass, tmp2);
                             break;
-                        case NEON_2RM_VRINTN:
-                        case NEON_2RM_VRINTA:
-                        case NEON_2RM_VRINTM:
-                        case NEON_2RM_VRINTP:
-                        case NEON_2RM_VRINTZ:
-                        {
-                            TCGv_i32 tcg_rmode;
-                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-                            int rmode;
-
-                            if (op == NEON_2RM_VRINTZ) {
-                                rmode = FPROUNDING_ZERO;
-                            } else {
-                                rmode = fp_decode_rm[((op & 0x6) >> 1) ^ 1];
-                            }
-
-                            tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode));
-                            gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
-                                                      cpu_env);
-                            gen_helper_rints(tmp, tmp, fpstatus);
-                            gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
-                                                      cpu_env);
-                            tcg_temp_free_ptr(fpstatus);
-                            tcg_temp_free_i32(tcg_rmode);
-                            break;
-                        }
                         case NEON_2RM_VCVTAU:
                         case NEON_2RM_VCVTAS:
                         case NEON_2RM_VCVTNU:
-- 
2.20.1




* Re: [PATCH 17/21] target/arm: Convert Neon 2-reg-misc VRINT insns to decodetree
  2020-06-16 17:08 ` [PATCH 17/21] target/arm: Convert Neon 2-reg-misc VRINT " Peter Maydell
@ 2020-06-19 23:49   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:49 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon 2-reg-misc VRINT insns to decodetree.
> Giving these insns their own do_vrint() function allows us
> to set the rounding mode once before the loop and restore it
> once afterwards, rather than doing both for every element in
> the vector.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       |  8 +++++
>  target/arm/translate-neon.inc.c | 61 +++++++++++++++++++++++++++++++++
>  target/arm/translate.c          | 31 +++--------------
>  3 files changed, 74 insertions(+), 26 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 18/21] target/arm: Convert Neon 2-reg-misc VCVT insns to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (16 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 17/21] target/arm: Convert Neon 2-reg-misc VRINT " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-19 23:52   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 19/21] target/arm: Convert Neon VSWP " Peter Maydell
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the VCVT instructions in the 2-reg-misc grouping to
decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
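Illustration only, not part of the patch: the old decoder derived the
rounding mode and signedness from instruction bits, whereas the
converted code names them explicitly per instruction. The sketch below
checks that the two agree for the eight encodings. The enum, struct and
function names are hypothetical, and the contents of decode_rm[] are an
assumption about fp_decode_rm[] (which this patch does not show),
chosen to be consistent with the DO_VCVT() list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for FPROUNDING_*; the numeric values are arbitrary. */
enum rounding { TIEEVEN, POSINF, NEGINF, ZERO, TIEAWAY };

/* Assumed contents of fp_decode_rm[], indexed by insn bits [9:8]. */
static const enum rounding decode_rm[4] = {
    TIEAWAY, TIEEVEN, POSINF, NEGINF
};

struct vcvt_variant {
    enum rounding rmode;
    bool is_signed;
};

/* Old style: derive both properties from bits [10:7] of the insn. */
static struct vcvt_variant vcvt_from_bits(uint32_t bits_10_7)
{
    struct vcvt_variant v;

    assert(bits_10_7 < 8);                      /* 0b0000..0b0111 only */
    v.rmode = decode_rm[(bits_10_7 >> 1) & 3];  /* insn bits [9:8] */
    v.is_signed = !(bits_10_7 & 1);             /* bit 7 clear => signed */
    return v;
}

/* New style: one explicit entry per decode pattern, in encoding order. */
static const struct vcvt_variant vcvt_table[8] = {
    { TIEAWAY, true }, { TIEAWAY, false },      /* VCVTAS, VCVTAU */
    { TIEEVEN, true }, { TIEEVEN, false },      /* VCVTNS, VCVTNU */
    { POSINF,  true }, { POSINF,  false },      /* VCVTPS, VCVTPU */
    { NEGINF,  true }, { NEGINF,  false },      /* VCVTMS, VCVTMU */
};
```

Comparing vcvt_from_bits(b) against vcvt_table[b] for b in 0..7 is one
way to check that an explicit list preserves the bit-derived behaviour.
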
 target/arm/neon-dp.decode       |  9 +++++
 target/arm/translate-neon.inc.c | 70 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 70 ++++-----------------------------
 3 files changed, 87 insertions(+), 62 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index e0717c7e4a6..5507c3e4623 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -516,6 +516,15 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
 
     VRINTP       1111 001 11 . 11 .. 10 .... 0 1111 . . 0 .... @2misc
 
+    VCVTAS       1111 001 11 . 11 .. 11 .... 0 0000 . . 0 .... @2misc
+    VCVTAU       1111 001 11 . 11 .. 11 .... 0 0001 . . 0 .... @2misc
+    VCVTNS       1111 001 11 . 11 .. 11 .... 0 0010 . . 0 .... @2misc
+    VCVTNU       1111 001 11 . 11 .. 11 .... 0 0011 . . 0 .... @2misc
+    VCVTPS       1111 001 11 . 11 .. 11 .... 0 0100 . . 0 .... @2misc
+    VCVTPU       1111 001 11 . 11 .. 11 .... 0 0101 . . 0 .... @2misc
+    VCVTMS       1111 001 11 . 11 .. 11 .... 0 0110 . . 0 .... @2misc
+    VCVTMU       1111 001 11 . 11 .. 11 .... 0 0111 . . 0 .... @2misc
+
     VRECPE       1111 001 11 . 11 .. 11 .... 0 1000 . . 0 .... @2misc
     VRSQRTE      1111 001 11 . 11 .. 11 .... 0 1001 . . 0 .... @2misc
     VRECPE_F     1111 001 11 . 11 .. 11 .... 0 1010 . . 0 .... @2misc
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 0e7f86ad156..29bc161f36a 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3857,3 +3857,73 @@ DO_VRINT(VRINTA, FPROUNDING_TIEAWAY)
 DO_VRINT(VRINTZ, FPROUNDING_ZERO)
 DO_VRINT(VRINTM, FPROUNDING_NEGINF)
 DO_VRINT(VRINTP, FPROUNDING_POSINF)
+
+static bool do_vcvt(DisasContext *s, arg_2misc *a, int rmode, bool is_signed)
+{
+    /*
+     * Handle a VCVT* operation by iterating 32 bits at a time,
+     * with a specified rounding mode in operation.
+     */
+    int pass;
+    TCGv_ptr fpst;
+    TCGv_i32 tcg_rmode, tcg_shift;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+        !arm_dc_feature(s, ARM_FEATURE_V8)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (a->size != 2) {
+        /* TODO: FP16 will be the size == 1 case */
+        return false;
+    }
+
+    if ((a->vd | a->vm) & a->q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fpst = get_fpstatus_ptr(1);
+    tcg_shift = tcg_const_i32(0);
+    tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode));
+    gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, cpu_env);
+    for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
+        TCGv_i32 tmp = neon_load_reg(a->vm, pass);
+        if (is_signed) {
+            gen_helper_vfp_tosls(tmp, tmp, tcg_shift, fpst);
+        } else {
+            gen_helper_vfp_touls(tmp, tmp, tcg_shift, fpst);
+        }
+        neon_store_reg(a->vd, pass, tmp);
+    }
+    gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, cpu_env);
+    tcg_temp_free_i32(tcg_rmode);
+    tcg_temp_free_i32(tcg_shift);
+    tcg_temp_free_ptr(fpst);
+
+    return true;
+}
+
+#define DO_VCVT(INSN, RMODE, SIGNED)                            \
+    static bool trans_##INSN(DisasContext *s, arg_2misc *a)     \
+    {                                                           \
+        return do_vcvt(s, a, RMODE, SIGNED);                    \
+    }
+
+DO_VCVT(VCVTAU, FPROUNDING_TIEAWAY, false)
+DO_VCVT(VCVTAS, FPROUNDING_TIEAWAY, true)
+DO_VCVT(VCVTNU, FPROUNDING_TIEEVEN, false)
+DO_VCVT(VCVTNS, FPROUNDING_TIEEVEN, true)
+DO_VCVT(VCVTPU, FPROUNDING_POSINF, false)
+DO_VCVT(VCVTPS, FPROUNDING_POSINF, true)
+DO_VCVT(VCVTMU, FPROUNDING_NEGINF, false)
+DO_VCVT(VCVTMS, FPROUNDING_NEGINF, true)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 61dfc3ae7af..b0181062020 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -3042,30 +3042,6 @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
 #define NEON_2RM_VCVT_SF 62
 #define NEON_2RM_VCVT_UF 63
 
-static bool neon_2rm_is_v8_op(int op)
-{
-    /* Return true if this neon 2reg-misc op is ARMv8 and up */
-    switch (op) {
-    case NEON_2RM_VRINTN:
-    case NEON_2RM_VRINTA:
-    case NEON_2RM_VRINTM:
-    case NEON_2RM_VRINTP:
-    case NEON_2RM_VRINTZ:
-    case NEON_2RM_VRINTX:
-    case NEON_2RM_VCVTAU:
-    case NEON_2RM_VCVTAS:
-    case NEON_2RM_VCVTNU:
-    case NEON_2RM_VCVTNS:
-    case NEON_2RM_VCVTPU:
-    case NEON_2RM_VCVTPS:
-    case NEON_2RM_VCVTMU:
-    case NEON_2RM_VCVTMS:
-        return true;
-    default:
-        return false;
-    }
-}
-
 /* Each entry in this array has bit n set if the insn allows
  * size value n (otherwise it will UNDEF). Since unallocated
  * op values will have no bits set they always UNDEF.
@@ -4908,10 +4884,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 if ((neon_2rm_sizes[op] & (1 << size)) == 0) {
                     return 1;
                 }
-                if (neon_2rm_is_v8_op(op) &&
-                    !arm_dc_feature(s, ARM_FEATURE_V8)) {
-                    return 1;
-                }
                 if (q && ((rm | rd) & 1)) {
                     return 1;
                 }
@@ -4964,6 +4936,14 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_VRINTM:
                 case NEON_2RM_VRINTP:
                 case NEON_2RM_VRINTZ:
+                case NEON_2RM_VCVTAU:
+                case NEON_2RM_VCVTAS:
+                case NEON_2RM_VCVTNU:
+                case NEON_2RM_VCVTNS:
+                case NEON_2RM_VCVTPU:
+                case NEON_2RM_VCVTPS:
+                case NEON_2RM_VCVTMU:
+                case NEON_2RM_VCVTMS:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -4998,40 +4978,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                             }
                             neon_store_reg(rm, pass, tmp2);
                             break;
-                        case NEON_2RM_VCVTAU:
-                        case NEON_2RM_VCVTAS:
-                        case NEON_2RM_VCVTNU:
-                        case NEON_2RM_VCVTNS:
-                        case NEON_2RM_VCVTPU:
-                        case NEON_2RM_VCVTPS:
-                        case NEON_2RM_VCVTMU:
-                        case NEON_2RM_VCVTMS:
-                        {
-                            bool is_signed = !extract32(insn, 7, 1);
-                            TCGv_ptr fpst = get_fpstatus_ptr(1);
-                            TCGv_i32 tcg_rmode, tcg_shift;
-                            int rmode = fp_decode_rm[extract32(insn, 8, 2)];
-
-                            tcg_shift = tcg_const_i32(0);
-                            tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode));
-                            gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
-                                                      cpu_env);
-
-                            if (is_signed) {
-                                gen_helper_vfp_tosls(tmp, tmp,
-                                                     tcg_shift, fpst);
-                            } else {
-                                gen_helper_vfp_touls(tmp, tmp,
-                                                     tcg_shift, fpst);
-                            }
-
-                            gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
-                                                      cpu_env);
-                            tcg_temp_free_i32(tcg_rmode);
-                            tcg_temp_free_i32(tcg_shift);
-                            tcg_temp_free_ptr(fpst);
-                            break;
-                        }
                         default:
                             /* Reserved op values were caught by the
                              * neon_2rm_sizes[] check earlier.
-- 
2.20.1




* Re: [PATCH 18/21] target/arm: Convert Neon 2-reg-misc VCVT insns to decodetree
  2020-06-16 17:08 ` [PATCH 18/21] target/arm: Convert Neon 2-reg-misc VCVT " Peter Maydell
@ 2020-06-19 23:52   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:52 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the VCVT instructions in the 2-reg-misc grouping to
> decodetree.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       |  9 +++++
>  target/arm/translate-neon.inc.c | 70 +++++++++++++++++++++++++++++++++
>  target/arm/translate.c          | 70 ++++-----------------------------
>  3 files changed, 87 insertions(+), 62 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 19/21] target/arm: Convert Neon VSWP to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (17 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 18/21] target/arm: Convert Neon 2-reg-misc VCVT " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-20  0:16   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 20/21] target/arm: Convert Neon VTRN " Peter Maydell
                   ` (3 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the Neon VSWP insn to decodetree. Since the new implementation
doesn't have to share a pass-loop with the other 2-reg-misc operations,
we can implement the swap with 64-bit accesses rather than 32-bit ones
(which brings us into line with the pseudocode and is more efficient).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
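Illustration only, not part of the patch: a standalone C model of the
two ways of swapping described above. The register-file layout (pass 0
being the low half of a D register, a Q register spanning two
consecutive D registers) is an assumption of this sketch, and dreg[],
load32(), store32(), vswp32() and vswp64() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy register file: 32 D registers of 64 bits each. */
static uint64_t dreg[32];

/* 32-bit view: pass p of register r (a Q register is D[r] and D[r+1]). */
static uint32_t load32(int r, int pass)
{
    return (uint32_t)(dreg[r + pass / 2] >> (32 * (pass % 2)));
}

static void store32(int r, int pass, uint32_t val)
{
    int shift = 32 * (pass % 2);
    uint64_t *d = &dreg[r + pass / 2];

    *d = (*d & ~((uint64_t)0xffffffffu << shift)) | ((uint64_t)val << shift);
}

/* Old shape: 32-bit passes (2 for a D register, 4 for a Q register). */
static void vswp32(int vd, int vm, bool q)
{
    for (int pass = 0; pass < (q ? 4 : 2); pass++) {
        uint32_t m = load32(vm, pass);
        uint32_t d = load32(vd, pass);
        store32(vm, pass, d);
        store32(vd, pass, m);
    }
}

/* New shape: 64-bit passes (1 for a D register, 2 for a Q register). */
static void vswp64(int vd, int vm, bool q)
{
    assert(!q || ((vd | vm) & 1) == 0);   /* Q operands are even-numbered */
    for (int pass = 0; pass < (q ? 2 : 1); pass++) {
        uint64_t m = dreg[vm + pass];
        uint64_t d = dreg[vd + pass];
        dreg[vd + pass] = m;
        dreg[vm + pass] = d;
    }
}
```

Both functions produce the same register contents in this model; the
64-bit form simply needs half as many loads and stores.
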
 target/arm/neon-dp.decode       |  2 ++
 target/arm/translate-neon.inc.c | 41 +++++++++++++++++++++++++++++++++
 target/arm/translate.c          |  5 +---
 3 files changed, 44 insertions(+), 4 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 5507c3e4623..2f64841de52 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -488,6 +488,8 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     VABS_F       1111 001 11 . 11 .. 01 .... 0 1110 . . 0 .... @2misc
     VNEG_F       1111 001 11 . 11 .. 01 .... 0 1111 . . 0 .... @2misc
 
+    VSWP         1111 001 11 . 11 .. 10 .... 0 0000 . . 0 .... @2misc
+
     VUZP         1111 001 11 . 11 .. 10 .... 0 0010 . . 0 .... @2misc
     VZIP         1111 001 11 . 11 .. 10 .... 0 0011 . . 0 .... @2misc
 
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 29bc161f36a..01da7fad462 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3927,3 +3927,44 @@ DO_VCVT(VCVTPU, FPROUNDING_POSINF, false)
 DO_VCVT(VCVTPS, FPROUNDING_POSINF, true)
 DO_VCVT(VCVTMU, FPROUNDING_NEGINF, false)
 DO_VCVT(VCVTMS, FPROUNDING_NEGINF, true)
+
+static bool trans_VSWP(DisasContext *s, arg_2misc *a)
+{
+    TCGv_i64 rm, rd;
+    int pass;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if (a->size != 0) {
+        return false;
+    }
+
+    if ((a->vd | a->vm) & a->q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    rm = tcg_temp_new_i64();
+    rd = tcg_temp_new_i64();
+    for (pass = 0; pass < (a->q ? 2 : 1); pass++) {
+        neon_load_reg64(rm, a->vm + pass);
+        neon_load_reg64(rd, a->vd + pass);
+        neon_store_reg64(rm, a->vd + pass);
+        neon_store_reg64(rd, a->vm + pass);
+    }
+    tcg_temp_free_i64(rm);
+    tcg_temp_free_i64(rd);
+
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index b0181062020..e8cd4a9c61f 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4944,6 +4944,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                 case NEON_2RM_VCVTPS:
                 case NEON_2RM_VCVTMU:
                 case NEON_2RM_VCVTMS:
+                case NEON_2RM_VSWP:
                     /* handled by decodetree */
                     return 1;
                 case NEON_2RM_VTRN:
@@ -4965,10 +4966,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     for (pass = 0; pass < (q ? 4 : 2); pass++) {
                         tmp = neon_load_reg(rm, pass);
                         switch (op) {
-                        case NEON_2RM_VSWP:
-                            tmp2 = neon_load_reg(rd, pass);
-                            neon_store_reg(rm, pass, tmp2);
-                            break;
                         case NEON_2RM_VTRN:
                             tmp2 = neon_load_reg(rd, pass);
                             switch (size) {
-- 
2.20.1




* Re: [PATCH 19/21] target/arm: Convert Neon VSWP to decodetree
  2020-06-16 17:08 ` [PATCH 19/21] target/arm: Convert Neon VSWP " Peter Maydell
@ 2020-06-20  0:16   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-20  0:16 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon VSWP insn to decodetree. Since the new implementation
> doesn't have to share a pass-loop with the other 2-reg-misc operations,
> we can implement the swap with 64-bit accesses rather than 32-bit ones
> (which brings us into line with the pseudocode and is more efficient).
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       |  2 ++
>  target/arm/translate-neon.inc.c | 41 +++++++++++++++++++++++++++++++++
>  target/arm/translate.c          |  5 +---
>  3 files changed, 44 insertions(+), 4 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~




* [PATCH 20/21] target/arm: Convert Neon VTRN to decodetree
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (18 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 19/21] target/arm: Convert Neon VSWP " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-20  0:20   ` Richard Henderson
  2020-06-16 17:08 ` [PATCH 21/21] target/arm: Move some functions used only in translate-neon.inc.c to that file Peter Maydell
                   ` (2 subsequent siblings)
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

Convert the Neon VTRN insn to decodetree. This is the last insn in the
Neon data-processing group, so we can remove all the now-unused old
decoder framework.

It's possible that there's a more efficient implementation of
VTRN, but for this conversion we just copy the existing approach.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
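Illustration only, not part of the patch: a plain-C counterpart of the
"existing approach" that this conversion copies, operating on ordinary
uint32_t values instead of TCG temporaries. trn_u8() and trn_u16()
follow the shift-and-mask sequence of gen_neon_trn_u8() and
gen_neon_trn_u16(); vtrn_pass() is a hypothetical wrapper mirroring one
32-bit pass of trans_VTRN() for the size 0 and size 1 cases.

```c
#include <assert.h>
#include <stdint.h>

/* Transpose the byte lanes of two 32-bit words in place. */
static void trn_u8(uint32_t *t0, uint32_t *t1)
{
    uint32_t rd = ((*t0 << 8) & 0xff00ff00u) | (*t1 & 0x00ff00ffu);

    *t1 = ((*t1 >> 8) & 0x00ff00ffu) | (*t0 & 0xff00ff00u);
    *t0 = rd;
}

/* Transpose the halfword lanes of two 32-bit words in place. */
static void trn_u16(uint32_t *t0, uint32_t *t1)
{
    uint32_t rd = (*t0 << 16) | (*t1 & 0xffffu);

    *t1 = (*t1 >> 16) | (*t0 & 0xffff0000u);
    *t0 = rd;
}

/* One 32-bit pass of VTRN for 8-bit (size 0) or 16-bit (size 1) lanes. */
static void vtrn_pass(uint32_t *vd, uint32_t *vm, int size)
{
    uint32_t tmp = *vm;    /* loaded from Vm, as in trans_VTRN() */
    uint32_t tmp2 = *vd;   /* loaded from Vd */

    assert(size == 0 || size == 1);
    if (size == 0) {
        trn_u8(&tmp, &tmp2);
    } else {
        trn_u16(&tmp, &tmp2);
    }
    *vm = tmp2;
    *vd = tmp;
}
```

With byte lanes d3..d0 in Vd and m3..m0 in Vm, one size 0 pass leaves
Vd holding m2,d2,m0,d0 and Vm holding m3,d3,m1,d1 (most significant
lane first).
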
 target/arm/neon-dp.decode       |   2 +-
 target/arm/translate-neon.inc.c |  90 ++++++++
 target/arm/translate.c          | 363 +-------------------------------
 3 files changed, 93 insertions(+), 362 deletions(-)

diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 2f64841de52..686f9fbf46a 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -489,7 +489,7 @@ Vimm_1r          1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     VNEG_F       1111 001 11 . 11 .. 01 .... 0 1111 . . 0 .... @2misc
 
     VSWP         1111 001 11 . 11 .. 10 .... 0 0000 . . 0 .... @2misc
-
+    VTRN         1111 001 11 . 11 .. 10 .... 0 0001 . . 0 .... @2misc
     VUZP         1111 001 11 . 11 .. 10 .... 0 0010 . . 0 .... @2misc
     VZIP         1111 001 11 . 11 .. 10 .... 0 0011 . . 0 .... @2misc
 
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 01da7fad462..8cc7f5db544 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3968,3 +3968,93 @@ static bool trans_VSWP(DisasContext *s, arg_2misc *a)
 
     return true;
 }
+static void gen_neon_trn_u8(TCGv_i32 t0, TCGv_i32 t1)
+{
+    TCGv_i32 rd, tmp;
+
+    rd = tcg_temp_new_i32();
+    tmp = tcg_temp_new_i32();
+
+    tcg_gen_shli_i32(rd, t0, 8);
+    tcg_gen_andi_i32(rd, rd, 0xff00ff00);
+    tcg_gen_andi_i32(tmp, t1, 0x00ff00ff);
+    tcg_gen_or_i32(rd, rd, tmp);
+
+    tcg_gen_shri_i32(t1, t1, 8);
+    tcg_gen_andi_i32(t1, t1, 0x00ff00ff);
+    tcg_gen_andi_i32(tmp, t0, 0xff00ff00);
+    tcg_gen_or_i32(t1, t1, tmp);
+    tcg_gen_mov_i32(t0, rd);
+
+    tcg_temp_free_i32(tmp);
+    tcg_temp_free_i32(rd);
+}
+
+static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
+{
+    TCGv_i32 rd, tmp;
+
+    rd = tcg_temp_new_i32();
+    tmp = tcg_temp_new_i32();
+
+    tcg_gen_shli_i32(rd, t0, 16);
+    tcg_gen_andi_i32(tmp, t1, 0xffff);
+    tcg_gen_or_i32(rd, rd, tmp);
+    tcg_gen_shri_i32(t1, t1, 16);
+    tcg_gen_andi_i32(tmp, t0, 0xffff0000);
+    tcg_gen_or_i32(t1, t1, tmp);
+    tcg_gen_mov_i32(t0, rd);
+
+    tcg_temp_free_i32(tmp);
+    tcg_temp_free_i32(rd);
+}
+
+static bool trans_VTRN(DisasContext *s, arg_2misc *a)
+{
+    TCGv_i32 tmp, tmp2;
+    int pass;
+
+    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vd | a->vm) & a->q) {
+        return false;
+    }
+
+    if (a->size == 3) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    if (a->size == 2) {
+        for (pass = 0; pass < (a->q ? 4 : 2); pass += 2) {
+            tmp = neon_load_reg(a->vm, pass);
+            tmp2 = neon_load_reg(a->vd, pass + 1);
+            neon_store_reg(a->vm, pass, tmp2);
+            neon_store_reg(a->vd, pass + 1, tmp);
+        }
+    } else {
+        for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
+            tmp = neon_load_reg(a->vm, pass);
+            tmp2 = neon_load_reg(a->vd, pass);
+            if (a->size == 0) {
+                gen_neon_trn_u8(tmp, tmp2);
+            } else {
+                gen_neon_trn_u16(tmp, tmp2);
+            }
+            neon_store_reg(a->vm, pass, tmp2);
+            neon_store_reg(a->vd, pass, tmp);
+        }
+    }
+    return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index e8cd4a9c61f..581b0b5cde4 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -2934,183 +2934,6 @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
     gen_rfe(s, pc, load_cpu_field(spsr));
 }
 
-static void gen_neon_trn_u8(TCGv_i32 t0, TCGv_i32 t1)
-{
-    TCGv_i32 rd, tmp;
-
-    rd = tcg_temp_new_i32();
-    tmp = tcg_temp_new_i32();
-
-    tcg_gen_shli_i32(rd, t0, 8);
-    tcg_gen_andi_i32(rd, rd, 0xff00ff00);
-    tcg_gen_andi_i32(tmp, t1, 0x00ff00ff);
-    tcg_gen_or_i32(rd, rd, tmp);
-
-    tcg_gen_shri_i32(t1, t1, 8);
-    tcg_gen_andi_i32(t1, t1, 0x00ff00ff);
-    tcg_gen_andi_i32(tmp, t0, 0xff00ff00);
-    tcg_gen_or_i32(t1, t1, tmp);
-    tcg_gen_mov_i32(t0, rd);
-
-    tcg_temp_free_i32(tmp);
-    tcg_temp_free_i32(rd);
-}
-
-static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
-{
-    TCGv_i32 rd, tmp;
-
-    rd = tcg_temp_new_i32();
-    tmp = tcg_temp_new_i32();
-
-    tcg_gen_shli_i32(rd, t0, 16);
-    tcg_gen_andi_i32(tmp, t1, 0xffff);
-    tcg_gen_or_i32(rd, rd, tmp);
-    tcg_gen_shri_i32(t1, t1, 16);
-    tcg_gen_andi_i32(tmp, t0, 0xffff0000);
-    tcg_gen_or_i32(t1, t1, tmp);
-    tcg_gen_mov_i32(t0, rd);
-
-    tcg_temp_free_i32(tmp);
-    tcg_temp_free_i32(rd);
-}
-
-/* Symbolic constants for op fields for Neon 2-register miscellaneous.
- * The values correspond to bits [17:16,10:7]; see the ARM ARM DDI0406B
- * table A7-13.
- */
-#define NEON_2RM_VREV64 0
-#define NEON_2RM_VREV32 1
-#define NEON_2RM_VREV16 2
-#define NEON_2RM_VPADDL 4
-#define NEON_2RM_VPADDL_U 5
-#define NEON_2RM_AESE 6 /* Includes AESD */
-#define NEON_2RM_AESMC 7 /* Includes AESIMC */
-#define NEON_2RM_VCLS 8
-#define NEON_2RM_VCLZ 9
-#define NEON_2RM_VCNT 10
-#define NEON_2RM_VMVN 11
-#define NEON_2RM_VPADAL 12
-#define NEON_2RM_VPADAL_U 13
-#define NEON_2RM_VQABS 14
-#define NEON_2RM_VQNEG 15
-#define NEON_2RM_VCGT0 16
-#define NEON_2RM_VCGE0 17
-#define NEON_2RM_VCEQ0 18
-#define NEON_2RM_VCLE0 19
-#define NEON_2RM_VCLT0 20
-#define NEON_2RM_SHA1H 21
-#define NEON_2RM_VABS 22
-#define NEON_2RM_VNEG 23
-#define NEON_2RM_VCGT0_F 24
-#define NEON_2RM_VCGE0_F 25
-#define NEON_2RM_VCEQ0_F 26
-#define NEON_2RM_VCLE0_F 27
-#define NEON_2RM_VCLT0_F 28
-#define NEON_2RM_VABS_F 30
-#define NEON_2RM_VNEG_F 31
-#define NEON_2RM_VSWP 32
-#define NEON_2RM_VTRN 33
-#define NEON_2RM_VUZP 34
-#define NEON_2RM_VZIP 35
-#define NEON_2RM_VMOVN 36 /* Includes VQMOVN, VQMOVUN */
-#define NEON_2RM_VQMOVN 37 /* Includes VQMOVUN */
-#define NEON_2RM_VSHLL 38
-#define NEON_2RM_SHA1SU1 39 /* Includes SHA256SU0 */
-#define NEON_2RM_VRINTN 40
-#define NEON_2RM_VRINTX 41
-#define NEON_2RM_VRINTA 42
-#define NEON_2RM_VRINTZ 43
-#define NEON_2RM_VCVT_F16_F32 44
-#define NEON_2RM_VRINTM 45
-#define NEON_2RM_VCVT_F32_F16 46
-#define NEON_2RM_VRINTP 47
-#define NEON_2RM_VCVTAU 48
-#define NEON_2RM_VCVTAS 49
-#define NEON_2RM_VCVTNU 50
-#define NEON_2RM_VCVTNS 51
-#define NEON_2RM_VCVTPU 52
-#define NEON_2RM_VCVTPS 53
-#define NEON_2RM_VCVTMU 54
-#define NEON_2RM_VCVTMS 55
-#define NEON_2RM_VRECPE 56
-#define NEON_2RM_VRSQRTE 57
-#define NEON_2RM_VRECPE_F 58
-#define NEON_2RM_VRSQRTE_F 59
-#define NEON_2RM_VCVT_FS 60
-#define NEON_2RM_VCVT_FU 61
-#define NEON_2RM_VCVT_SF 62
-#define NEON_2RM_VCVT_UF 63
-
-/* Each entry in this array has bit n set if the insn allows
- * size value n (otherwise it will UNDEF). Since unallocated
- * op values will have no bits set they always UNDEF.
- */
-static const uint8_t neon_2rm_sizes[] = {
-    [NEON_2RM_VREV64] = 0x7,
-    [NEON_2RM_VREV32] = 0x3,
-    [NEON_2RM_VREV16] = 0x1,
-    [NEON_2RM_VPADDL] = 0x7,
-    [NEON_2RM_VPADDL_U] = 0x7,
-    [NEON_2RM_AESE] = 0x1,
-    [NEON_2RM_AESMC] = 0x1,
-    [NEON_2RM_VCLS] = 0x7,
-    [NEON_2RM_VCLZ] = 0x7,
-    [NEON_2RM_VCNT] = 0x1,
-    [NEON_2RM_VMVN] = 0x1,
-    [NEON_2RM_VPADAL] = 0x7,
-    [NEON_2RM_VPADAL_U] = 0x7,
-    [NEON_2RM_VQABS] = 0x7,
-    [NEON_2RM_VQNEG] = 0x7,
-    [NEON_2RM_VCGT0] = 0x7,
-    [NEON_2RM_VCGE0] = 0x7,
-    [NEON_2RM_VCEQ0] = 0x7,
-    [NEON_2RM_VCLE0] = 0x7,
-    [NEON_2RM_VCLT0] = 0x7,
-    [NEON_2RM_SHA1H] = 0x4,
-    [NEON_2RM_VABS] = 0x7,
-    [NEON_2RM_VNEG] = 0x7,
-    [NEON_2RM_VCGT0_F] = 0x4,
-    [NEON_2RM_VCGE0_F] = 0x4,
-    [NEON_2RM_VCEQ0_F] = 0x4,
-    [NEON_2RM_VCLE0_F] = 0x4,
-    [NEON_2RM_VCLT0_F] = 0x4,
-    [NEON_2RM_VABS_F] = 0x4,
-    [NEON_2RM_VNEG_F] = 0x4,
-    [NEON_2RM_VSWP] = 0x1,
-    [NEON_2RM_VTRN] = 0x7,
-    [NEON_2RM_VUZP] = 0x7,
-    [NEON_2RM_VZIP] = 0x7,
-    [NEON_2RM_VMOVN] = 0x7,
-    [NEON_2RM_VQMOVN] = 0x7,
-    [NEON_2RM_VSHLL] = 0x7,
-    [NEON_2RM_SHA1SU1] = 0x4,
-    [NEON_2RM_VRINTN] = 0x4,
-    [NEON_2RM_VRINTX] = 0x4,
-    [NEON_2RM_VRINTA] = 0x4,
-    [NEON_2RM_VRINTZ] = 0x4,
-    [NEON_2RM_VCVT_F16_F32] = 0x2,
-    [NEON_2RM_VRINTM] = 0x4,
-    [NEON_2RM_VCVT_F32_F16] = 0x2,
-    [NEON_2RM_VRINTP] = 0x4,
-    [NEON_2RM_VCVTAU] = 0x4,
-    [NEON_2RM_VCVTAS] = 0x4,
-    [NEON_2RM_VCVTNU] = 0x4,
-    [NEON_2RM_VCVTNS] = 0x4,
-    [NEON_2RM_VCVTPU] = 0x4,
-    [NEON_2RM_VCVTPS] = 0x4,
-    [NEON_2RM_VCVTMU] = 0x4,
-    [NEON_2RM_VCVTMS] = 0x4,
-    [NEON_2RM_VRECPE] = 0x4,
-    [NEON_2RM_VRSQRTE] = 0x4,
-    [NEON_2RM_VRECPE_F] = 0x4,
-    [NEON_2RM_VRSQRTE_F] = 0x4,
-    [NEON_2RM_VCVT_FS] = 0x4,
-    [NEON_2RM_VCVT_FU] = 0x4,
-    [NEON_2RM_VCVT_SF] = 0x4,
-    [NEON_2RM_VCVT_UF] = 0x4,
-};
-
 static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
                             uint32_t opr_sz, uint32_t max_sz,
                             gen_helper_gvec_3_ptr *fn)
@@ -4822,178 +4645,6 @@ void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
 }
 
-/* Translate a NEON data processing instruction.  Return nonzero if the
-   instruction is invalid.
-   We process data in a mixture of 32-bit and 64-bit chunks.
-   Mostly we use 32-bit chunks so we can use normal scalar instructions.  */
-
-static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-{
-    int op;
-    int q;
-    int rd, rm;
-    int size;
-    int pass;
-    int u;
-    TCGv_i32 tmp, tmp2;
-
-    if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
-        return 1;
-    }
-
-    /* FIXME: this access check should not take precedence over UNDEF
-     * for invalid encodings; we will generate incorrect syndrome information
-     * for attempts to execute invalid vfp/neon encodings with FP disabled.
-     */
-    if (s->fp_excp_el) {
-        gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
-                           syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
-        return 0;
-    }
-
-    if (!s->vfp_enabled)
-      return 1;
-    q = (insn & (1 << 6)) != 0;
-    u = (insn >> 24) & 1;
-    VFP_DREG_D(rd, insn);
-    VFP_DREG_M(rm, insn);
-    size = (insn >> 20) & 3;
-
-    if ((insn & (1 << 23)) == 0) {
-        /* Three register same length: handled by decodetree */
-        return 1;
-    } else if (insn & (1 << 4)) {
-        /* Two registers and shift or reg and imm: handled by decodetree */
-        return 1;
-    } else { /* (insn & 0x00800010 == 0x00800000) */
-        if (size != 3) {
-            /*
-             * Three registers of different lengths, or two registers and
-             * a scalar: handled by decodetree
-             */
-            return 1;
-        } else { /* size == 3 */
-            if (!u) {
-                /* Extract: handled by decodetree */
-                return 1;
-            } else if ((insn & (1 << 11)) == 0) {
-                /* Two register misc.  */
-                op = ((insn >> 12) & 0x30) | ((insn >> 7) & 0xf);
-                size = (insn >> 18) & 3;
-                /* UNDEF for unknown op values and bad op-size combinations */
-                if ((neon_2rm_sizes[op] & (1 << size)) == 0) {
-                    return 1;
-                }
-                if (q && ((rm | rd) & 1)) {
-                    return 1;
-                }
-                switch (op) {
-                case NEON_2RM_VREV64:
-                case NEON_2RM_VPADDL: case NEON_2RM_VPADDL_U:
-                case NEON_2RM_VPADAL: case NEON_2RM_VPADAL_U:
-                case NEON_2RM_VUZP:
-                case NEON_2RM_VZIP:
-                case NEON_2RM_VMOVN: case NEON_2RM_VQMOVN:
-                case NEON_2RM_VSHLL:
-                case NEON_2RM_VCVT_F16_F32:
-                case NEON_2RM_VCVT_F32_F16:
-                case NEON_2RM_VMVN:
-                case NEON_2RM_VNEG:
-                case NEON_2RM_VABS:
-                case NEON_2RM_VCEQ0:
-                case NEON_2RM_VCGT0:
-                case NEON_2RM_VCLE0:
-                case NEON_2RM_VCGE0:
-                case NEON_2RM_VCLT0:
-                case NEON_2RM_AESE: case NEON_2RM_AESMC:
-                case NEON_2RM_SHA1H:
-                case NEON_2RM_SHA1SU1:
-                case NEON_2RM_VREV32:
-                case NEON_2RM_VREV16:
-                case NEON_2RM_VCLS:
-                case NEON_2RM_VCLZ:
-                case NEON_2RM_VCNT:
-                case NEON_2RM_VABS_F:
-                case NEON_2RM_VNEG_F:
-                case NEON_2RM_VRECPE:
-                case NEON_2RM_VRSQRTE:
-                case NEON_2RM_VQABS:
-                case NEON_2RM_VQNEG:
-                case NEON_2RM_VRECPE_F:
-                case NEON_2RM_VRSQRTE_F:
-                case NEON_2RM_VCVT_FS:
-                case NEON_2RM_VCVT_FU:
-                case NEON_2RM_VCVT_SF:
-                case NEON_2RM_VCVT_UF:
-                case NEON_2RM_VRINTX:
-                case NEON_2RM_VCGT0_F:
-                case NEON_2RM_VCGE0_F:
-                case NEON_2RM_VCEQ0_F:
-                case NEON_2RM_VCLE0_F:
-                case NEON_2RM_VCLT0_F:
-                case NEON_2RM_VRINTN:
-                case NEON_2RM_VRINTA:
-                case NEON_2RM_VRINTM:
-                case NEON_2RM_VRINTP:
-                case NEON_2RM_VRINTZ:
-                case NEON_2RM_VCVTAU:
-                case NEON_2RM_VCVTAS:
-                case NEON_2RM_VCVTNU:
-                case NEON_2RM_VCVTNS:
-                case NEON_2RM_VCVTPU:
-                case NEON_2RM_VCVTPS:
-                case NEON_2RM_VCVTMU:
-                case NEON_2RM_VCVTMS:
-                case NEON_2RM_VSWP:
-                    /* handled by decodetree */
-                    return 1;
-                case NEON_2RM_VTRN:
-                    if (size == 2) {
-                        int n;
-                        for (n = 0; n < (q ? 4 : 2); n += 2) {
-                            tmp = neon_load_reg(rm, n);
-                            tmp2 = neon_load_reg(rd, n + 1);
-                            neon_store_reg(rm, n, tmp2);
-                            neon_store_reg(rd, n + 1, tmp);
-                        }
-                    } else {
-                        goto elementwise;
-                    }
-                    break;
-
-                default:
-                elementwise:
-                    for (pass = 0; pass < (q ? 4 : 2); pass++) {
-                        tmp = neon_load_reg(rm, pass);
-                        switch (op) {
-                        case NEON_2RM_VTRN:
-                            tmp2 = neon_load_reg(rd, pass);
-                            switch (size) {
-                            case 0: gen_neon_trn_u8(tmp, tmp2); break;
-                            case 1: gen_neon_trn_u16(tmp, tmp2); break;
-                            default: abort();
-                            }
-                            neon_store_reg(rm, pass, tmp2);
-                            break;
-                        default:
-                            /* Reserved op values were caught by the
-                             * neon_2rm_sizes[] check earlier.
-                             */
-                            abort();
-                        }
-                        neon_store_reg(rd, pass, tmp);
-                    }
-                    break;
-                }
-            } else {
-                /* VTBL, VTBX, VDUP: handled by decodetree */
-                return 1;
-            }
-        }
-    }
-    return 0;
-}
-
 static int disas_coproc_insn(DisasContext *s, uint32_t insn)
 {
     int cpnum, is64, crn, crm, opc1, opc2, isread, rt, rt2;
@@ -8694,13 +8345,6 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
         }
         /* fall back to legacy decoder */
 
-        if (((insn >> 25) & 7) == 1) {
-            /* NEON Data processing.  */
-            if (disas_neon_data_insn(s, insn)) {
-                goto illegal_op;
-            }
-            return;
-        }
         if ((insn & 0x0e000f00) == 0x0c000100) {
             if (arm_dc_feature(s, ARM_FEATURE_IWMMXT)) {
                 /* iWMMXt register transfer.  */
@@ -8888,11 +8532,8 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
             break;
         }
         if (((insn >> 24) & 3) == 3) {
-            /* Translate into the equivalent ARM encoding.  */
-            insn = (insn & 0xe2ffffff) | ((insn & (1 << 28)) >> 4) | (1 << 28);
-            if (disas_neon_data_insn(s, insn)) {
-                goto illegal_op;
-            }
+            /* Neon DP, but failed disas_neon_dp() */
+            goto illegal_op;
         } else if (((insn >> 8) & 0xe) == 10) {
             /* VFP, but failed disas_vfp.  */
             goto illegal_op;
-- 
2.20.1




* Re: [PATCH 20/21] target/arm: Convert Neon VTRN to decodetree
  2020-06-16 17:08 ` [PATCH 20/21] target/arm: Convert Neon VTRN " Peter Maydell
@ 2020-06-20  0:20   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-20  0:20 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon VTRN insn to decodetree. This is the last insn in the
> Neon data-processing group, so we can remove all the now-unused old
> decoder framework.
> 
> It's possible that there's a more efficient implementation of
> VTRN, but for this conversion we just copy the existing approach.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/neon-dp.decode       |   2 +-
>  target/arm/translate-neon.inc.c |  90 ++++++++
>  target/arm/translate.c          | 363 +-------------------------------
>  3 files changed, 93 insertions(+), 362 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
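
For readers following the "copy the existing approach" remark quoted above:
the 32-bit-element case of the VTRN loop that this patch removes from
translate.c can be modelled in plain C roughly as below. This is only a
sketch on ordinary arrays, not TCG code; the name vtrn32_model() and the
array-based view of the registers are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical plain-C model of the size == 2 VTRN loop: rd and rm are
 * viewed as arrays of 32-bit words, with nwords == 2 for a D register
 * and nwords == 4 for a Q register (the "q ? 4 : 2" bound in the
 * removed code).
 */
static void vtrn32_model(uint32_t *rd, uint32_t *rm, size_t nwords)
{
    assert(nwords == 2 || nwords == 4);

    for (size_t n = 0; n < nwords; n += 2) {
        /* Swap rm[n] with rd[n + 1], as the removed loop did. */
        uint32_t tmp = rm[n];
        rm[n] = rd[n + 1];
        rd[n + 1] = tmp;
    }
}
```

With rd = {0, 1, 2, 3} and rm = {10, 11, 12, 13}, calling
vtrn32_model(rd, rm, 4) leaves rd = {0, 10, 2, 12} and rm = {1, 11, 3, 13}.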




* [PATCH 21/21] target/arm: Move some functions used only in translate-neon.inc.c to that file
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (19 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 20/21] target/arm: Convert Neon VTRN " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
  2020-06-20  0:21   ` Richard Henderson
  2020-06-16 21:37 ` [PATCH 00/21] target/arm: Finish neon decodetree conversion no-reply
  2020-06-16 21:52 ` no-reply
  22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
  To: qemu-arm, qemu-devel; +Cc: Richard Henderson

The functions neon_element_offset(), neon_load_element(),
neon_load_element64(), neon_store_element() and
neon_store_element64() are used only in the translate-neon.inc.c
file, so move their definitions there.

Since the .inc.c file is #included in translate.c this doesn't make
much difference currently, but it's a more logical place to put the
functions and it might be helpful if we ever decide to try to make
the .inc.c files genuinely separate compilation units.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/translate-neon.inc.c | 101 ++++++++++++++++++++++++++++++++
 target/arm/translate.c          | 101 --------------------------------
 2 files changed, 101 insertions(+), 101 deletions(-)

diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 8cc7f5db544..f6cb9215739 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -54,6 +54,107 @@ static inline int rsub_8(DisasContext *s, int x)
 #include "decode-neon-ls.inc.c"
 #include "decode-neon-shared.inc.c"
 
+/* Return the offset of a 2**SIZE piece of a NEON register, at index ELE,
+ * where 0 is the least significant end of the register.
+ */
+static inline long
+neon_element_offset(int reg, int element, MemOp size)
+{
+    int element_size = 1 << size;
+    int ofs = element * element_size;
+#ifdef HOST_WORDS_BIGENDIAN
+    /* Calculate the offset assuming fully little-endian,
+     * then XOR to account for the order of the 8-byte units.
+     */
+    if (element_size < 8) {
+        ofs ^= 8 - element_size;
+    }
+#endif
+    return neon_reg_offset(reg, 0) + ofs;
+}
+
+static void neon_load_element(TCGv_i32 var, int reg, int ele, MemOp mop)
+{
+    long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
+
+    switch (mop) {
+    case MO_UB:
+        tcg_gen_ld8u_i32(var, cpu_env, offset);
+        break;
+    case MO_UW:
+        tcg_gen_ld16u_i32(var, cpu_env, offset);
+        break;
+    case MO_UL:
+        tcg_gen_ld_i32(var, cpu_env, offset);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static void neon_load_element64(TCGv_i64 var, int reg, int ele, MemOp mop)
+{
+    long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
+
+    switch (mop) {
+    case MO_UB:
+        tcg_gen_ld8u_i64(var, cpu_env, offset);
+        break;
+    case MO_UW:
+        tcg_gen_ld16u_i64(var, cpu_env, offset);
+        break;
+    case MO_UL:
+        tcg_gen_ld32u_i64(var, cpu_env, offset);
+        break;
+    case MO_Q:
+        tcg_gen_ld_i64(var, cpu_env, offset);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static void neon_store_element(int reg, int ele, MemOp size, TCGv_i32 var)
+{
+    long offset = neon_element_offset(reg, ele, size);
+
+    switch (size) {
+    case MO_8:
+        tcg_gen_st8_i32(var, cpu_env, offset);
+        break;
+    case MO_16:
+        tcg_gen_st16_i32(var, cpu_env, offset);
+        break;
+    case MO_32:
+        tcg_gen_st_i32(var, cpu_env, offset);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static void neon_store_element64(int reg, int ele, MemOp size, TCGv_i64 var)
+{
+    long offset = neon_element_offset(reg, ele, size);
+
+    switch (size) {
+    case MO_8:
+        tcg_gen_st8_i64(var, cpu_env, offset);
+        break;
+    case MO_16:
+        tcg_gen_st16_i64(var, cpu_env, offset);
+        break;
+    case MO_32:
+        tcg_gen_st32_i64(var, cpu_env, offset);
+        break;
+    case MO_64:
+        tcg_gen_st_i64(var, cpu_env, offset);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
 static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a)
 {
     int opr_sz;
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 581b0b5cde4..408fb7a492f 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1133,25 +1133,6 @@ neon_reg_offset (int reg, int n)
     return vfp_reg_offset(0, sreg);
 }
 
-/* Return the offset of a 2**SIZE piece of a NEON register, at index ELE,
- * where 0 is the least significant end of the register.
- */
-static inline long
-neon_element_offset(int reg, int element, MemOp size)
-{
-    int element_size = 1 << size;
-    int ofs = element * element_size;
-#ifdef HOST_WORDS_BIGENDIAN
-    /* Calculate the offset assuming fully little-endian,
-     * then XOR to account for the order of the 8-byte units.
-     */
-    if (element_size < 8) {
-        ofs ^= 8 - element_size;
-    }
-#endif
-    return neon_reg_offset(reg, 0) + ofs;
-}
-
 static TCGv_i32 neon_load_reg(int reg, int pass)
 {
     TCGv_i32 tmp = tcg_temp_new_i32();
@@ -1159,94 +1140,12 @@ static TCGv_i32 neon_load_reg(int reg, int pass)
     return tmp;
 }
 
-static void neon_load_element(TCGv_i32 var, int reg, int ele, MemOp mop)
-{
-    long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
-
-    switch (mop) {
-    case MO_UB:
-        tcg_gen_ld8u_i32(var, cpu_env, offset);
-        break;
-    case MO_UW:
-        tcg_gen_ld16u_i32(var, cpu_env, offset);
-        break;
-    case MO_UL:
-        tcg_gen_ld_i32(var, cpu_env, offset);
-        break;
-    default:
-        g_assert_not_reached();
-    }
-}
-
-static void neon_load_element64(TCGv_i64 var, int reg, int ele, MemOp mop)
-{
-    long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
-
-    switch (mop) {
-    case MO_UB:
-        tcg_gen_ld8u_i64(var, cpu_env, offset);
-        break;
-    case MO_UW:
-        tcg_gen_ld16u_i64(var, cpu_env, offset);
-        break;
-    case MO_UL:
-        tcg_gen_ld32u_i64(var, cpu_env, offset);
-        break;
-    case MO_Q:
-        tcg_gen_ld_i64(var, cpu_env, offset);
-        break;
-    default:
-        g_assert_not_reached();
-    }
-}
-
 static void neon_store_reg(int reg, int pass, TCGv_i32 var)
 {
     tcg_gen_st_i32(var, cpu_env, neon_reg_offset(reg, pass));
     tcg_temp_free_i32(var);
 }
 
-static void neon_store_element(int reg, int ele, MemOp size, TCGv_i32 var)
-{
-    long offset = neon_element_offset(reg, ele, size);
-
-    switch (size) {
-    case MO_8:
-        tcg_gen_st8_i32(var, cpu_env, offset);
-        break;
-    case MO_16:
-        tcg_gen_st16_i32(var, cpu_env, offset);
-        break;
-    case MO_32:
-        tcg_gen_st_i32(var, cpu_env, offset);
-        break;
-    default:
-        g_assert_not_reached();
-    }
-}
-
-static void neon_store_element64(int reg, int ele, MemOp size, TCGv_i64 var)
-{
-    long offset = neon_element_offset(reg, ele, size);
-
-    switch (size) {
-    case MO_8:
-        tcg_gen_st8_i64(var, cpu_env, offset);
-        break;
-    case MO_16:
-        tcg_gen_st16_i64(var, cpu_env, offset);
-        break;
-    case MO_32:
-        tcg_gen_st32_i64(var, cpu_env, offset);
-        break;
-    case MO_64:
-        tcg_gen_st_i64(var, cpu_env, offset);
-        break;
-    default:
-        g_assert_not_reached();
-    }
-}
-
 static inline void neon_load_reg64(TCGv_i64 var, int reg)
 {
     tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(1, reg));
-- 
2.20.1




* Re: [PATCH 21/21] target/arm: Move some functions used only in translate-neon.inc.c to that file
  2020-06-16 17:08 ` [PATCH 21/21] target/arm: Move some functions used only in translate-neon.inc.c to that file Peter Maydell
@ 2020-06-20  0:21   ` Richard Henderson
  0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-20  0:21 UTC (permalink / raw)
  To: Peter Maydell, qemu-arm, qemu-devel

On 6/16/20 10:08 AM, Peter Maydell wrote:
> The functions neon_element_offset(), neon_load_element(),
> neon_load_element64(), neon_store_element() and
> neon_store_element64() are used only in the translate-neon.inc.c
> file, so move their definitions there.
> 
> Since the .inc.c file is #included in translate.c this doesn't make
> much difference currently, but it's a more logical place to put the
> functions and it might be helpful if we ever decide to try to make
> the .inc.c files genuinely separate compilation units.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target/arm/translate-neon.inc.c | 101 ++++++++++++++++++++++++++++++++
>  target/arm/translate.c          | 101 --------------------------------
>  2 files changed, 101 insertions(+), 101 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
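
As an aside on what neon_element_offset() computes: the offset arithmetic
in the moved function can be tried out in isolation with a small
standalone model such as the one below. It is a sketch, not the QEMU
function: element_byte_offset() is a hypothetical name, the
host_big_endian flag stands in for the HOST_WORDS_BIGENDIAN #ifdef, and
the neon_reg_offset(reg, 0) base that the real code adds is left out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: byte offset, relative to the start of the
 * register, of element number 'element' when elements are
 * 2**size_log2 bytes wide.
 */
static int element_byte_offset(int element, int size_log2,
                               bool host_big_endian)
{
    assert(size_log2 >= 0 && size_log2 <= 3);

    int element_size = 1 << size_log2;
    int ofs = element * element_size;

    /*
     * Same adjustment as the moved code: compute the offset as if fully
     * little-endian, then XOR to account for the byte order inside each
     * 8-byte unit on a big-endian host.
     */
    if (host_big_endian && element_size < 8) {
        ofs ^= 8 - element_size;
    }
    return ofs;
}
```

For example, 32-bit element 1 is at offset 4 in the little-endian case and
at offset 0 when host_big_endian is set (4 ^ 4), while 64-bit elements are
unaffected by the flag.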




* Re: [PATCH 00/21] target/arm: Finish neon decodetree conversion
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (20 preceding siblings ...)
  2020-06-16 17:08 ` [PATCH 21/21] target/arm: Move some functions used only in translate-neon.inc.c to that file Peter Maydell
@ 2020-06-16 21:37 ` no-reply
  2020-06-16 21:52 ` no-reply
  22 siblings, 0 replies; 45+ messages in thread
From: no-reply @ 2020-06-16 21:37 UTC (permalink / raw)
  To: peter.maydell; +Cc: qemu-arm, richard.henderson, qemu-devel

Patchew URL: https://patchew.org/QEMU/20200616170844.13318-1-peter.maydell@linaro.org/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PATCH 00/21] target/arm: Finish neon decodetree conversion
Type: series
Message-id: 20200616170844.13318-1-peter.maydell@linaro.org

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
f6e3d71 target/arm: Move some functions used only in translate-neon.inc.c to that file
4a2df60 target/arm: Convert Neon VTRN to decodetree
26f74d6 target/arm: Convert Neon VSWP to decodetree
7a0fa02 target/arm: Convert Neon 2-reg-misc VCVT insns to decodetree
73198e7 target/arm: Convert Neon 2-reg-misc VRINT insns to decodetree
0908381 target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree
f6d72da target/arm: Convert simple fp Neon 2-reg-misc insns
3ef1127 target/arm: Convert Neon VQABS, VQNEG to decodetree
624f399 target/arm: Convert remaining simple 2-reg-misc Neon ops
9bb8fa6 target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree
3535d27 target/arm: Make gen_swap_half() take separate src and dest
2eac819 target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs
e30825b target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn
7d1109a target/arm: Convert Neon 2-reg-misc crypto operations to decodetree
9d87342 target/arm: Convert vectorised 2-reg-misc Neon ops to decodetree
605ae75 target/arm: Convert Neon VCVT f16/f32 insns to decodetree
5fb6c16 target/arm: Convert Neon 2-reg-misc VSHLL to decodetree
e2e99ab target/arm: Convert Neon narrowing moves to decodetree
3dde5dd target/arm: Convert VZIP, VUZP to decodetree
3ed7eaf target/arm: Convert Neon 2-reg-misc pairwise ops to decodetree
37f7428 target/arm: Convert Neon 2-reg-misc VREV64 to decodetree

=== OUTPUT BEGIN ===
1/21 Checking commit 37f7428534e5 (target/arm: Convert Neon 2-reg-misc VREV64 to decodetree)
2/21 Checking commit 3ed7eaff8f5f (target/arm: Convert Neon 2-reg-misc pairwise ops to decodetree)
3/21 Checking commit 3dde5ddb764b (target/arm: Convert VZIP, VUZP to decodetree)
4/21 Checking commit e2e99ab61e59 (target/arm: Convert Neon narrowing moves to decodetree)
5/21 Checking commit 5fb6c161af86 (target/arm: Convert Neon 2-reg-misc VSHLL to decodetree)
6/21 Checking commit 605ae75d2431 (target/arm: Convert Neon VCVT f16/f32 insns to decodetree)
7/21 Checking commit 9d87342857c8 (target/arm: Convert vectorised 2-reg-misc Neon ops to decodetree)
8/21 Checking commit 7d1109aae4db (target/arm: Convert Neon 2-reg-misc crypto operations to decodetree)
9/21 Checking commit e30825bbd0a4 (target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn)
10/21 Checking commit 2eac8198e699 (target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs)
11/21 Checking commit 3535d2721010 (target/arm: Make gen_swap_half() take separate src and dest)
ERROR: trailing statements should be on next line
#50: FILE: target/arm/translate.c:4963:
+                            case 1: gen_swap_half(tmp, tmp); break;

total: 1 errors, 0 warnings, 43 lines checked

Patch 11/21 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

12/21 Checking commit 9bb8fa6a3658 (target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree)
13/21 Checking commit 624f399e8573 (target/arm: Convert remaining simple 2-reg-misc Neon ops)
14/21 Checking commit 3ef11275c669 (target/arm: Convert Neon VQABS, VQNEG to decodetree)
15/21 Checking commit f6d72da4c6fc (target/arm: Convert simple fp Neon 2-reg-misc insns)
16/21 Checking commit 090838197cf0 (target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree)
17/21 Checking commit 73198e721bff (target/arm: Convert Neon 2-reg-misc VRINT insns to decodetree)
18/21 Checking commit 7a0fa02240e2 (target/arm: Convert Neon 2-reg-misc VCVT insns to decodetree)
19/21 Checking commit 26f74d6de184 (target/arm: Convert Neon VSWP to decodetree)
20/21 Checking commit 4a2df6081512 (target/arm: Convert Neon VTRN to decodetree)
21/21 Checking commit f6e3d7186b3e (target/arm: Move some functions used only in translate-neon.inc.c to that file)
WARNING: Block comments use a leading /* on a separate line
#28: FILE: target/arm/translate-neon.inc.c:57:
+/* Return the offset of a 2**SIZE piece of a NEON register, at index ELE,

WARNING: Block comments use a leading /* on a separate line
#37: FILE: target/arm/translate-neon.inc.c:66:
+    /* Calculate the offset assuming fully little-endian,

total: 0 errors, 2 warnings, 226 lines checked

Patch 21/21 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1
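
All three reports above are about layout only: one "trailing statements
should be on next line" error and two block-comment warnings. As an
illustration of the forms those messages ask for, here is a small
standalone sketch. The helper swap_bytes_or_halves() is hypothetical and
is not code from this series; whether checkpatch accepts a given file
still depends on the rest of its rules.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Block comment whose opening delimiter sits on a line of its own,
 * which is the layout the two warnings quoted above request.
 */
static uint32_t swap_bytes_or_halves(uint32_t x, int size)
{
    assert(size == 0 || size == 1);

    switch (size) {
    case 0:
        /* Swap the two bytes within each 16-bit half. */
        x = ((x & 0x00ff00ffu) << 8) | ((x >> 8) & 0x00ff00ffu);
        break;
    case 1:
        /* Swap the two 16-bit halves; each statement on its own line. */
        x = (x << 16) | (x >> 16);
        break;
    }
    return x;
}
```

The point of the sketch is the shape of the switch: the statement and the
break each follow the case label on separate lines, rather than sharing
the label's line as in the line flagged for patch 11/21.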


The full log is available at
http://patchew.org/logs/20200616170844.13318-1-peter.maydell@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [PATCH 00/21] target/arm: Finish neon decodetree conversion
  2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
                   ` (21 preceding siblings ...)
  2020-06-16 21:37 ` [PATCH 00/21] target/arm: Finish neon decodetree conversion no-reply
@ 2020-06-16 21:52 ` no-reply
  22 siblings, 0 replies; 45+ messages in thread
From: no-reply @ 2020-06-16 21:52 UTC (permalink / raw)
  To: peter.maydell; +Cc: qemu-arm, richard.henderson, qemu-devel

Patchew URL: https://patchew.org/QEMU/20200616170844.13318-1-peter.maydell@linaro.org/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      qga/guest-agent-command-state.o
  CC      qga/main.o
  CC      qga/commands-posix.o
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  CC      qga/channel-posix.o
  CC      qga/qapi-generated/qga-qapi-types.o
  CC      qga/qapi-generated/qga-qapi-visit.o
---
  AR      libvhost-user.a
  GEN     docs/interop/qemu-ga-ref.html
  GEN     docs/interop/qemu-ga-ref.txt
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  GEN     docs/interop/qemu-ga-ref.7
  LINK    qemu-keymap
  LINK    ivshmem-client
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINK    ivshmem-server
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINK    qemu-nbd
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  AS      pc-bios/optionrom/multiboot.o
  AS      pc-bios/optionrom/linuxboot.o
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINK    qemu-storage-daemon
  CC      pc-bios/optionrom/linuxboot_dma.o
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  AS      pc-bios/optionrom/kvmvapic.o
  AS      pc-bios/optionrom/pvh.o
  CC      pc-bios/optionrom/pvh_main.o
---
  SIGN    pc-bios/optionrom/linuxboot.bin
  SIGN    pc-bios/optionrom/linuxboot_dma.bin
  SIGN    pc-bios/optionrom/kvmvapic.bin
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  BUILD   pc-bios/optionrom/pvh.raw
  SIGN    pc-bios/optionrom/pvh.bin
  LINK    qemu-io
  LINK    qemu-edid
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINK    fsdev/virtfs-proxy-helper
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINK    scsi/qemu-pr-helper
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINK    qemu-bridge-helper
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINK    virtiofsd
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  LINK    vhost-user-input
  LINK    qemu-ga
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
  GEN     x86_64-softmmu/hmp-commands.h
  GEN     x86_64-softmmu/config-target.h
  GEN     x86_64-softmmu/hmp-commands-info.h
---
  CC      x86_64-softmmu/accel/tcg/translator.o
  CC      x86_64-softmmu/accel/xen/xen-all.o
  CC      x86_64-softmmu/dump/dump.o
/tmp/qemu-test/src/fpu/softfloat.c:3365:13: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
    absZ &= ~ ( ( ( roundBits ^ 0x40 ) == 0 ) & roundNearestEven );
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            !
/tmp/qemu-test/src/fpu/softfloat.c:3423:18: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
        absZ0 &= ~ ( ( (uint64_t) ( absZ1<<1 ) == 0 ) & roundNearestEven );
                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                 !
/tmp/qemu-test/src/fpu/softfloat.c:3483:18: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
        absZ0 &= ~(((uint64_t)(absZ1<<1) == 0) & roundNearestEven);
                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                 !
/tmp/qemu-test/src/fpu/softfloat.c:3606:13: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
    zSig &= ~ ( ( ( roundBits ^ 0x40 ) == 0 ) & roundNearestEven );
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            !
/tmp/qemu-test/src/fpu/softfloat.c:3760:13: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
    zSig &= ~ ( ( ( roundBits ^ 0x200 ) == 0 ) & roundNearestEven );
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            !
/tmp/qemu-test/src/fpu/softfloat.c:3987:21: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
                    ~ ( ( (uint64_t) ( zSig1<<1 ) == 0 ) & roundNearestEven );
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                    !
/tmp/qemu-test/src/fpu/softfloat.c:4003:22: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
            zSig0 &= ~ ( ( (uint64_t) ( zSig1<<1 ) == 0 ) & roundNearestEven );
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                     !
/tmp/qemu-test/src/fpu/softfloat.c:4273:18: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
        zSig1 &= ~ ( ( zSig2 + zSig2 == 0 ) & roundNearestEven );
                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                 !
8 errors generated.
make[1]: *** [/tmp/qemu-test/src/rules.mak:69: fpu/softfloat.o] Error 1
make[1]: *** Waiting for unfinished jobs....
/tmp/qemu-test/src/migration/ram.c:919:45: error: implicit conversion from 'unsigned long' to 'double' changes value from 18446744073709551615 to 18446744073709551616 [-Werror,-Wimplicit-int-float-conversion]
            xbzrle_counters.encoding_rate = UINT64_MAX;
                                          ~ ^~~~~~~~~~
/usr/include/stdint.h:130:23: note: expanded from macro 'UINT64_MAX'
---
18446744073709551615UL
^~~~~~~~~~~~~~~~~~~~~~
1 error generated.
make[1]: *** [/tmp/qemu-test/src/rules.mak:69: migration/ram.o] Error 1
make: *** [Makefile:527: x86_64-softmmu/all] Error 2
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 669, in <module>
    sys.exit(main())
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=626d1d5e121543ca8c3bbdb19797f602', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=x86_64-softmmu', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-heh_4z5t/src/docker-src.2020-06-16-17.47.36.4664:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-debug']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=626d1d5e121543ca8c3bbdb19797f602
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-heh_4z5t/src'
make: *** [docker-run-test-debug@fedora] Error 2

real    4m29.254s
user    0m7.869s
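
The eight -Wbool-operation errors in fpu/softfloat.c share one idiom: a
0-or-1 comparison result is bitwise-negated to build a mask that clears the
low bit on a round-to-nearest-even tie. A minimal sketch of two spellings
that should avoid the warning follows. The helper names
(clear_tie_bit_mask, clear_tie_bit_branch) and the uint64_t operand width
are assumptions made for illustration; they are not taken from softfloat.c,
and this is not necessarily how the tree should be fixed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, mask form: widen the 0/1 result to an integer
 * first, then negate. ~0 keeps every bit, ~1 clears only bit 0.
 */
uint64_t clear_tie_bit_mask(uint64_t z, uint64_t round_bits,
                            bool round_nearest_even)
{
    uint64_t tie = (uint64_t)(((round_bits ^ 0x40) == 0) & round_nearest_even);
    return z & ~tie;
}

/*
 * Hypothetical helper, branch form: same result, with the condition
 * spelled out instead of encoded in a mask.
 */
uint64_t clear_tie_bit_branch(uint64_t z, uint64_t round_bits,
                              bool round_nearest_even)
{
    if ((round_bits ^ 0x40) == 0 && round_nearest_even) {
        z &= ~(uint64_t)1;
    }
    return z;
}
```

Both forms leave z unchanged unless round_bits is exactly 0x40 and
round_nearest_even is set, which matches what the flagged expressions
compute. Clang's suggested "!" would change the behaviour and is not an
equivalent rewrite.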


The full log is available at
http://patchew.org/logs/20200616170844.13318-1-peter.maydell@linaro.org/testing.asan/?type=message.
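
The migration/ram.c error is a different class: UINT64_MAX (2^64 - 1) has
no exact double representation, so the implicit conversion rounds up to
2^64, which clang 10 reports under -Wimplicit-int-float-conversion. One
common way to keep the behaviour while making the rounding explicit is a
cast. The sketch below assumes a hypothetical helper, saturated_rate(),
standing in for the assignment to xbzrle_counters.encoding_rate; it is an
illustration, not a proposed patch.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the flagged assignment. The explicit cast
 * states that rounding UINT64_MAX up to 2^64 is intended, so the
 * conversion is no longer implicit.
 */
double saturated_rate(void)
{
    return (double)UINT64_MAX;
}
```

The stored value is the same one the diagnostic mentions,
18446744073709551616; only the conversion is made explicit.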
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com



Thread overview: 45+ messages
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
2020-06-16 17:08 ` [PATCH 01/21] target/arm: Convert Neon 2-reg-misc VREV64 to decodetree Peter Maydell
2020-06-19 22:36   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 02/21] target/arm: Convert Neon 2-reg-misc pairwise ops " Peter Maydell
2020-06-19 22:42   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 03/21] target/arm: Convert VZIP, VUZP " Peter Maydell
2020-06-19 22:47   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 04/21] target/arm: Convert Neon narrowing moves " Peter Maydell
2020-06-19 22:52   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 05/21] target/arm: Convert Neon 2-reg-misc VSHLL " Peter Maydell
2020-06-19 22:55   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 06/21] target/arm: Convert Neon VCVT f16/f32 insns " Peter Maydell
2020-06-19 23:01   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 07/21] target/arm: Convert vectorised 2-reg-misc Neon ops " Peter Maydell
2020-06-19 23:17   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 08/21] target/arm: Convert Neon 2-reg-misc crypto operations " Peter Maydell
2020-06-19 23:25   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 09/21] target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn Peter Maydell
2020-06-19 23:28   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 10/21] target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs Peter Maydell
2020-06-19 23:32   ` [PATCH 10/21] target/arm: Fix capitalization in NeonGenTwo{Single,Double}OPFn typedefs Richard Henderson
2020-06-16 17:08 ` [PATCH 11/21] target/arm: Make gen_swap_half() take separate src and dest Peter Maydell
2020-06-19 23:33   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 12/21] target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree Peter Maydell
2020-06-19 23:35   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 13/21] target/arm: Convert remaining simple 2-reg-misc Neon ops Peter Maydell
2020-06-19 23:41   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 14/21] target/arm: Convert Neon VQABS, VQNEG to decodetree Peter Maydell
2020-06-19 23:42   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 15/21] target/arm: Convert simple fp Neon 2-reg-misc insns Peter Maydell
2020-06-19 23:44   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 16/21] target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree Peter Maydell
2020-06-19 23:46   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 17/21] target/arm: Convert Neon 2-reg-misc VRINT " Peter Maydell
2020-06-19 23:49   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 18/21] target/arm: Convert Neon 2-reg-misc VCVT " Peter Maydell
2020-06-19 23:52   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 19/21] target/arm: Convert Neon VSWP " Peter Maydell
2020-06-20  0:16   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 20/21] target/arm: Convert Neon VTRN " Peter Maydell
2020-06-20  0:20   ` Richard Henderson
2020-06-16 17:08 ` [PATCH 21/21] target/arm: Move some functions used only in translate-neon.inc.c to that file Peter Maydell
2020-06-20  0:21   ` Richard Henderson
2020-06-16 21:37 ` [PATCH 00/21] target/arm: Finish neon decodetree conversion no-reply
2020-06-16 21:52 ` no-reply
