* [PATCH 01/21] target/arm: Convert Neon 2-reg-misc VREV64 to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 22:36 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 02/21] target/arm: Convert Neon 2-reg-misc pairwise ops " Peter Maydell
` (21 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the Neon VREV64 insn from the 2-reg-misc grouping to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 12 ++++++++
target/arm/translate-neon.inc.c | 50 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 24 ++--------------
3 files changed, 64 insertions(+), 22 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 6d890b2161f..e12fdf30957 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -429,6 +429,18 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
vm=%vm_dp vd=%vd_dp size=1
VDUP_scalar 1111 001 1 1 . 11 index:1 100 .... 11 000 q:1 . 0 .... \
vm=%vm_dp vd=%vd_dp size=2
+
+ ##################################################################
+ # 2-reg-misc grouping:
+ # 1111 001 11 D 11 size:2 opc1:2 Vd:4 0 opc2:4 q:1 M 0 Vm:4
+ ##################################################################
+
+ &2misc vd vm q size
+
+ @2misc .... ... .. . .. size:2 .. .... . .... q:1 . . .... \
+ &2misc vm=%vm_dp vd=%vd_dp
+
+ VREV64 1111 001 11 . 11 .. 00 .... 0 0000 . . 0 .... @2misc
]
# Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index a5aa56bbdeb..90431a5383f 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -2970,3 +2970,53 @@ static bool trans_VDUP_scalar(DisasContext *s, arg_VDUP_scalar *a)
a->q ? 16 : 8, a->q ? 16 : 8);
return true;
}
+
+static bool trans_VREV64(DisasContext *s, arg_VREV64 *a)
+{
+ int pass, half;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if ((a->vd | a->vm) & a->q) {
+ return false;
+ }
+
+ if (a->size == 3) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ for (pass = 0; pass < (a->q ? 2 : 1); pass++) {
+ TCGv_i32 tmp[2];
+
+ for (half = 0; half < 2; half++) {
+ tmp[half] = neon_load_reg(a->vm, pass * 2 + half);
+ switch (a->size) {
+ case 0:
+ tcg_gen_bswap32_i32(tmp[half], tmp[half]);
+ break;
+ case 1:
+ gen_swap_half(tmp[half]);
+ break;
+ case 2:
+ break;
+ default:
+ g_assert_not_reached();
+ }
+ }
+ neon_store_reg(a->vd, pass * 2, tmp[1]);
+ neon_store_reg(a->vd, pass * 2 + 1, tmp[0]);
+ }
+ return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 6d18892adee..5fca38b5fae 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5092,28 +5092,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
}
switch (op) {
case NEON_2RM_VREV64:
- for (pass = 0; pass < (q ? 2 : 1); pass++) {
- tmp = neon_load_reg(rm, pass * 2);
- tmp2 = neon_load_reg(rm, pass * 2 + 1);
- switch (size) {
- case 0: tcg_gen_bswap32_i32(tmp, tmp); break;
- case 1: gen_swap_half(tmp); break;
- case 2: /* no-op */ break;
- default: abort();
- }
- neon_store_reg(rd, pass * 2 + 1, tmp);
- if (size == 2) {
- neon_store_reg(rd, pass * 2, tmp2);
- } else {
- switch (size) {
- case 0: tcg_gen_bswap32_i32(tmp2, tmp2); break;
- case 1: gen_swap_half(tmp2); break;
- default: abort();
- }
- neon_store_reg(rd, pass * 2, tmp2);
- }
- }
- break;
+ /* handled by decodetree */
+ return 1;
case NEON_2RM_VPADDL: case NEON_2RM_VPADDL_U:
case NEON_2RM_VPADAL: case NEON_2RM_VPADAL_U:
for (pass = 0; pass < q + 1; pass++) {
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 02/21] target/arm: Convert Neon 2-reg-misc pairwise ops to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
2020-06-16 17:08 ` [PATCH 01/21] target/arm: Convert Neon 2-reg-misc VREV64 to decodetree Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 22:42 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 03/21] target/arm: Convert VZIP, VUZP " Peter Maydell
` (20 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the pairwise ops VPADDL and VPADAL in the 2-reg-misc grouping
to decodetree.
At this point we can get rid of the weird CPU_V001 #define that was
used to avoid having to explicitly list all the arguments being
passed to some TCG gen/helper functions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 6 ++
target/arm/translate-neon.inc.c | 149 ++++++++++++++++++++++++++++++++
target/arm/translate.c | 35 +-------
3 files changed, 157 insertions(+), 33 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index e12fdf30957..dd521baa07d 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -441,6 +441,12 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
&2misc vm=%vm_dp vd=%vd_dp
VREV64 1111 001 11 . 11 .. 00 .... 0 0000 . . 0 .... @2misc
+
+ VPADDL_S 1111 001 11 . 11 .. 00 .... 0 0100 . . 0 .... @2misc
+ VPADDL_U 1111 001 11 . 11 .. 00 .... 0 0101 . . 0 .... @2misc
+
+ VPADAL_S 1111 001 11 . 11 .. 00 .... 0 1100 . . 0 .... @2misc
+ VPADAL_U 1111 001 11 . 11 .. 00 .... 0 1101 . . 0 .... @2misc
]
# Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 90431a5383f..2f7bd0d556f 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3020,3 +3020,152 @@ static bool trans_VREV64(DisasContext *s, arg_VREV64 *a)
}
return true;
}
+
+static bool do_2misc_pairwise(DisasContext *s, arg_2misc *a,
+ NeonGenWidenFn *widenfn,
+ NeonGenTwo64OpFn *opfn,
+ NeonGenTwo64OpFn *accfn)
+{
+ /*
+ * Pairwise long operations: widen both halves of the pair,
+ * combine the pairs with the opfn, and then possibly accumulate
+ * into the destination with the accfn.
+ */
+ int pass;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if ((a->vd | a->vm) & a->q) {
+ return false;
+ }
+
+ if (!widenfn) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ for (pass = 0; pass < a->q + 1; pass++) {
+ TCGv_i32 tmp;
+ TCGv_i64 rm0_64, rm1_64, rd_64;
+
+ rm0_64 = tcg_temp_new_i64();
+ rm1_64 = tcg_temp_new_i64();
+ rd_64 = tcg_temp_new_i64();
+ tmp = neon_load_reg(a->vm, pass * 2);
+ widenfn(rm0_64, tmp);
+ tcg_temp_free_i32(tmp);
+ tmp = neon_load_reg(a->vm, pass * 2 + 1);
+ widenfn(rm1_64, tmp);
+ tcg_temp_free_i32(tmp);
+ opfn(rd_64, rm0_64, rm1_64);
+ tcg_temp_free_i64(rm0_64);
+ tcg_temp_free_i64(rm1_64);
+
+ if (accfn) {
+ TCGv_i64 tmp64 = tcg_temp_new_i64();
+ neon_load_reg64(tmp64, a->vd + pass);
+ accfn(rd_64, tmp64, rd_64);
+ tcg_temp_free_i64(tmp64);
+ }
+ neon_store_reg64(rd_64, a->vd + pass);
+ tcg_temp_free_i64(rd_64);
+ }
+ return true;
+}
+
+static bool trans_VPADDL_S(DisasContext *s, arg_2misc *a)
+{
+ static NeonGenWidenFn * const widenfn[] = {
+ gen_helper_neon_widen_s8,
+ gen_helper_neon_widen_s16,
+ tcg_gen_ext_i32_i64,
+ NULL,
+ };
+ static NeonGenTwo64OpFn * const opfn[] = {
+ gen_helper_neon_paddl_u16,
+ gen_helper_neon_paddl_u32,
+ tcg_gen_add_i64,
+ NULL,
+ };
+
+ return do_2misc_pairwise(s, a, widenfn[a->size], opfn[a->size], NULL);
+}
+
+static bool trans_VPADDL_U(DisasContext *s, arg_2misc *a)
+{
+ static NeonGenWidenFn * const widenfn[] = {
+ gen_helper_neon_widen_u8,
+ gen_helper_neon_widen_u16,
+ tcg_gen_extu_i32_i64,
+ NULL,
+ };
+ static NeonGenTwo64OpFn * const opfn[] = {
+ gen_helper_neon_paddl_u16,
+ gen_helper_neon_paddl_u32,
+ tcg_gen_add_i64,
+ NULL,
+ };
+
+ return do_2misc_pairwise(s, a, widenfn[a->size], opfn[a->size], NULL);
+}
+
+static bool trans_VPADAL_S(DisasContext *s, arg_2misc *a)
+{
+ static NeonGenWidenFn * const widenfn[] = {
+ gen_helper_neon_widen_s8,
+ gen_helper_neon_widen_s16,
+ tcg_gen_ext_i32_i64,
+ NULL,
+ };
+ static NeonGenTwo64OpFn * const opfn[] = {
+ gen_helper_neon_paddl_u16,
+ gen_helper_neon_paddl_u32,
+ tcg_gen_add_i64,
+ NULL,
+ };
+ static NeonGenTwo64OpFn * const accfn[] = {
+ gen_helper_neon_addl_u16,
+ gen_helper_neon_addl_u32,
+ tcg_gen_add_i64,
+ NULL,
+ };
+
+ return do_2misc_pairwise(s, a, widenfn[a->size], opfn[a->size],
+ accfn[a->size]);
+}
+
+static bool trans_VPADAL_U(DisasContext *s, arg_2misc *a)
+{
+ static NeonGenWidenFn * const widenfn[] = {
+ gen_helper_neon_widen_u8,
+ gen_helper_neon_widen_u16,
+ tcg_gen_extu_i32_i64,
+ NULL,
+ };
+ static NeonGenTwo64OpFn * const opfn[] = {
+ gen_helper_neon_paddl_u16,
+ gen_helper_neon_paddl_u32,
+ tcg_gen_add_i64,
+ NULL,
+ };
+ static NeonGenTwo64OpFn * const accfn[] = {
+ gen_helper_neon_addl_u16,
+ gen_helper_neon_addl_u32,
+ tcg_gen_add_i64,
+ NULL,
+ };
+
+ return do_2misc_pairwise(s, a, widenfn[a->size], opfn[a->size],
+ accfn[a->size]);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 5fca38b5fae..4405b034f77 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -2934,8 +2934,6 @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
gen_rfe(s, pc, load_cpu_field(spsr));
}
-#define CPU_V001 cpu_V0, cpu_V0, cpu_V1
-
static int gen_neon_unzip(int rd, int rm, int size, int q)
{
TCGv_ptr pd, pm;
@@ -3117,16 +3115,6 @@ static inline void gen_neon_widen(TCGv_i64 dest, TCGv_i32 src, int size, int u)
tcg_temp_free_i32(src);
}
-static inline void gen_neon_addl(int size)
-{
- switch (size) {
- case 0: gen_helper_neon_addl_u16(CPU_V001); break;
- case 1: gen_helper_neon_addl_u32(CPU_V001); break;
- case 2: tcg_gen_add_i64(CPU_V001); break;
- default: abort();
- }
-}
-
static void gen_neon_narrow_op(int op, int u, int size,
TCGv_i32 dest, TCGv_i64 src)
{
@@ -5092,29 +5080,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
}
switch (op) {
case NEON_2RM_VREV64:
- /* handled by decodetree */
- return 1;
case NEON_2RM_VPADDL: case NEON_2RM_VPADDL_U:
case NEON_2RM_VPADAL: case NEON_2RM_VPADAL_U:
- for (pass = 0; pass < q + 1; pass++) {
- tmp = neon_load_reg(rm, pass * 2);
- gen_neon_widen(cpu_V0, tmp, size, op & 1);
- tmp = neon_load_reg(rm, pass * 2 + 1);
- gen_neon_widen(cpu_V1, tmp, size, op & 1);
- switch (size) {
- case 0: gen_helper_neon_paddl_u16(CPU_V001); break;
- case 1: gen_helper_neon_paddl_u32(CPU_V001); break;
- case 2: tcg_gen_add_i64(CPU_V001); break;
- default: abort();
- }
- if (op >= NEON_2RM_VPADAL) {
- /* Accumulate. */
- neon_load_reg64(cpu_V1, rd + pass);
- gen_neon_addl(size);
- }
- neon_store_reg64(cpu_V0, rd + pass);
- }
- break;
+ /* handled by decodetree */
+ return 1;
case NEON_2RM_VTRN:
if (size == 2) {
int n;
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 02/21] target/arm: Convert Neon 2-reg-misc pairwise ops to decodetree
2020-06-16 17:08 ` [PATCH 02/21] target/arm: Convert Neon 2-reg-misc pairwise ops " Peter Maydell
@ 2020-06-19 22:42 ` Richard Henderson
0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 22:42 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the pairwise ops VPADDL and VPADAL in the 2-reg-misc grouping
> to decodetree.
>
> At this point we can get rid of the weird CPU_V001 #define that was
> used to avoid having to explicitly list all the arguments being
> passed to some TCG gen/helper functions.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/neon-dp.decode | 6 ++
> target/arm/translate-neon.inc.c | 149 ++++++++++++++++++++++++++++++++
> target/arm/translate.c | 35 +-------
> 3 files changed, 157 insertions(+), 33 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH 03/21] target/arm: Convert VZIP, VUZP to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
2020-06-16 17:08 ` [PATCH 01/21] target/arm: Convert Neon 2-reg-misc VREV64 to decodetree Peter Maydell
2020-06-16 17:08 ` [PATCH 02/21] target/arm: Convert Neon 2-reg-misc pairwise ops " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 22:47 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 04/21] target/arm: Convert Neon narrowing moves " Peter Maydell
` (19 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the Neon VZIP and VUZP insns in the 2-reg-misc group to
decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 3 ++
target/arm/translate-neon.inc.c | 74 ++++++++++++++++++++++++++
target/arm/translate.c | 92 +--------------------------------
3 files changed, 79 insertions(+), 90 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index dd521baa07d..ad9e17fd737 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -447,6 +447,9 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
VPADAL_S 1111 001 11 . 11 .. 00 .... 0 1100 . . 0 .... @2misc
VPADAL_U 1111 001 11 . 11 .. 00 .... 0 1101 . . 0 .... @2misc
+
+ VUZP 1111 001 11 . 11 .. 10 .... 0 0010 . . 0 .... @2misc
+ VZIP 1111 001 11 . 11 .. 10 .... 0 0011 . . 0 .... @2misc
]
# Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 2f7bd0d556f..f4799dd9770 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3169,3 +3169,77 @@ static bool trans_VPADAL_U(DisasContext *s, arg_2misc *a)
return do_2misc_pairwise(s, a, widenfn[a->size], opfn[a->size],
accfn[a->size]);
}
+
+typedef void ZipFn(TCGv_ptr, TCGv_ptr);
+
+static bool do_zip_uzp(DisasContext *s, arg_2misc *a,
+ ZipFn *fn)
+{
+ TCGv_ptr pd, pm;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if ((a->vd | a->vm) & a->q) {
+ return false;
+ }
+
+ if (!fn) {
+ /* Bad size or size/q combination */
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ pd = vfp_reg_ptr(true, a->vd);
+ pm = vfp_reg_ptr(true, a->vm);
+ fn(pd, pm);
+ tcg_temp_free_ptr(pd);
+ tcg_temp_free_ptr(pm);
+ return true;
+}
+
+static bool trans_VUZP(DisasContext *s, arg_2misc *a)
+{
+ static ZipFn * const fn[2][4] = {
+ {
+ gen_helper_neon_unzip8,
+ gen_helper_neon_unzip16,
+ NULL,
+ NULL,
+ }, {
+ gen_helper_neon_qunzip8,
+ gen_helper_neon_qunzip16,
+ gen_helper_neon_qunzip32,
+ NULL,
+ }
+ };
+ return do_zip_uzp(s, a, fn[a->q][a->size]);
+}
+
+static bool trans_VZIP(DisasContext *s, arg_2misc *a)
+{
+ static ZipFn * const fn[2][4] = {
+ {
+ gen_helper_neon_zip8,
+ gen_helper_neon_zip16,
+ NULL,
+ NULL,
+ }, {
+ gen_helper_neon_qzip8,
+ gen_helper_neon_qzip16,
+ gen_helper_neon_qzip32,
+ NULL,
+ }
+ };
+ return do_zip_uzp(s, a, fn[a->q][a->size]);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 4405b034f77..442f287d861 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -2934,86 +2934,6 @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
gen_rfe(s, pc, load_cpu_field(spsr));
}
-static int gen_neon_unzip(int rd, int rm, int size, int q)
-{
- TCGv_ptr pd, pm;
-
- if (!q && size == 2) {
- return 1;
- }
- pd = vfp_reg_ptr(true, rd);
- pm = vfp_reg_ptr(true, rm);
- if (q) {
- switch (size) {
- case 0:
- gen_helper_neon_qunzip8(pd, pm);
- break;
- case 1:
- gen_helper_neon_qunzip16(pd, pm);
- break;
- case 2:
- gen_helper_neon_qunzip32(pd, pm);
- break;
- default:
- abort();
- }
- } else {
- switch (size) {
- case 0:
- gen_helper_neon_unzip8(pd, pm);
- break;
- case 1:
- gen_helper_neon_unzip16(pd, pm);
- break;
- default:
- abort();
- }
- }
- tcg_temp_free_ptr(pd);
- tcg_temp_free_ptr(pm);
- return 0;
-}
-
-static int gen_neon_zip(int rd, int rm, int size, int q)
-{
- TCGv_ptr pd, pm;
-
- if (!q && size == 2) {
- return 1;
- }
- pd = vfp_reg_ptr(true, rd);
- pm = vfp_reg_ptr(true, rm);
- if (q) {
- switch (size) {
- case 0:
- gen_helper_neon_qzip8(pd, pm);
- break;
- case 1:
- gen_helper_neon_qzip16(pd, pm);
- break;
- case 2:
- gen_helper_neon_qzip32(pd, pm);
- break;
- default:
- abort();
- }
- } else {
- switch (size) {
- case 0:
- gen_helper_neon_zip8(pd, pm);
- break;
- case 1:
- gen_helper_neon_zip16(pd, pm);
- break;
- default:
- abort();
- }
- }
- tcg_temp_free_ptr(pd);
- tcg_temp_free_ptr(pm);
- return 0;
-}
-
static void gen_neon_trn_u8(TCGv_i32 t0, TCGv_i32 t1)
{
TCGv_i32 rd, tmp;
@@ -5082,6 +5002,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_VREV64:
case NEON_2RM_VPADDL: case NEON_2RM_VPADDL_U:
case NEON_2RM_VPADAL: case NEON_2RM_VPADAL_U:
+ case NEON_2RM_VUZP:
+ case NEON_2RM_VZIP:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -5097,16 +5019,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
goto elementwise;
}
break;
- case NEON_2RM_VUZP:
- if (gen_neon_unzip(rd, rm, size, q)) {
- return 1;
- }
- break;
- case NEON_2RM_VZIP:
- if (gen_neon_zip(rd, rm, size, q)) {
- return 1;
- }
- break;
case NEON_2RM_VMOVN: case NEON_2RM_VQMOVN:
/* also VQMOVUN; op field and mnemonics don't line up */
if (rm & 1) {
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 03/21] target/arm: Convert VZIP, VUZP to decodetree
2020-06-16 17:08 ` [PATCH 03/21] target/arm: Convert VZIP, VUZP " Peter Maydell
@ 2020-06-19 22:47 ` Richard Henderson
0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 22:47 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon VZIP and VUZP insns in the 2-reg-misc group to
> decodetree.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/neon-dp.decode | 3 ++
> target/arm/translate-neon.inc.c | 74 ++++++++++++++++++++++++++
> target/arm/translate.c | 92 +--------------------------------
> 3 files changed, 79 insertions(+), 90 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH 04/21] target/arm: Convert Neon narrowing moves to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (2 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 03/21] target/arm: Convert VZIP, VUZP " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 22:52 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 05/21] target/arm: Convert Neon 2-reg-misc VSHLL " Peter Maydell
` (18 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the Neon narrowing moves VMQNV, VQMOVN, VQMOVUN in the 2-reg-misc
group to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 9 ++++
target/arm/translate-neon.inc.c | 59 ++++++++++++++++++++++++
target/arm/translate.c | 81 +--------------------------------
3 files changed, 70 insertions(+), 79 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index ad9e17fd737..2277b4c7b51 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -439,6 +439,8 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
@2misc .... ... .. . .. size:2 .. .... . .... q:1 . . .... \
&2misc vm=%vm_dp vd=%vd_dp
+ @2misc_q0 .... ... .. . .. size:2 .. .... . .... . . . .... \
+ &2misc vm=%vm_dp vd=%vd_dp q=0
VREV64 1111 001 11 . 11 .. 00 .... 0 0000 . . 0 .... @2misc
@@ -450,6 +452,13 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
VUZP 1111 001 11 . 11 .. 10 .... 0 0010 . . 0 .... @2misc
VZIP 1111 001 11 . 11 .. 10 .... 0 0011 . . 0 .... @2misc
+
+ VMOVN 1111 001 11 . 11 .. 10 .... 0 0100 0 . 0 .... @2misc_q0
+ # VQMOVUN: unsigned result (source is always signed)
+ VQMOVUN 1111 001 11 . 11 .. 10 .... 0 0100 1 . 0 .... @2misc_q0
+ # VQMOVN: signed result, source may be signed (_S) or unsigned (_U)
+ VQMOVN_S 1111 001 11 . 11 .. 10 .... 0 0101 0 . 0 .... @2misc_q0
+ VQMOVN_U 1111 001 11 . 11 .. 10 .... 0 0101 1 . 0 .... @2misc_q0
]
# Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index f4799dd9770..b0620972854 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3243,3 +3243,62 @@ static bool trans_VZIP(DisasContext *s, arg_2misc *a)
};
return do_zip_uzp(s, a, fn[a->q][a->size]);
}
+
+static bool do_vmovn(DisasContext *s, arg_2misc *a,
+ NeonGenNarrowEnvFn *narrowfn)
+{
+ TCGv_i64 rm;
+ TCGv_i32 rd0, rd1;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if (a->vm & 1) {
+ return false;
+ }
+
+ if (!narrowfn) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ rm = tcg_temp_new_i64();
+ rd0 = tcg_temp_new_i32();
+ rd1 = tcg_temp_new_i32();
+
+ neon_load_reg64(rm, a->vm);
+ narrowfn(rd0, cpu_env, rm);
+ neon_load_reg64(rm, a->vm + 1);
+ narrowfn(rd1, cpu_env, rm);
+ neon_store_reg(a->vd, 0, rd0);
+ neon_store_reg(a->vd, 1, rd1);
+ tcg_temp_free_i64(rm);
+ return true;
+}
+
+#define DO_VMOVN(INSN, FUNC) \
+ static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
+ { \
+ static NeonGenNarrowEnvFn * const narrowfn[] = { \
+ FUNC##8, \
+ FUNC##16, \
+ FUNC##32, \
+ NULL, \
+ }; \
+ return do_vmovn(s, a, narrowfn[a->size]); \
+ }
+
+DO_VMOVN(VMOVN, gen_neon_narrow_u)
+DO_VMOVN(VQMOVUN, gen_helper_neon_unarrow_sat)
+DO_VMOVN(VQMOVN_S, gen_helper_neon_narrow_sat_s)
+DO_VMOVN(VQMOVN_U, gen_helper_neon_narrow_sat_u)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 442f287d861..8ecae264e15 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -2975,46 +2975,6 @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
tcg_temp_free_i32(rd);
}
-static inline void gen_neon_narrow(int size, TCGv_i32 dest, TCGv_i64 src)
-{
- switch (size) {
- case 0: gen_helper_neon_narrow_u8(dest, src); break;
- case 1: gen_helper_neon_narrow_u16(dest, src); break;
- case 2: tcg_gen_extrl_i64_i32(dest, src); break;
- default: abort();
- }
-}
-
-static inline void gen_neon_narrow_sats(int size, TCGv_i32 dest, TCGv_i64 src)
-{
- switch (size) {
- case 0: gen_helper_neon_narrow_sat_s8(dest, cpu_env, src); break;
- case 1: gen_helper_neon_narrow_sat_s16(dest, cpu_env, src); break;
- case 2: gen_helper_neon_narrow_sat_s32(dest, cpu_env, src); break;
- default: abort();
- }
-}
-
-static inline void gen_neon_narrow_satu(int size, TCGv_i32 dest, TCGv_i64 src)
-{
- switch (size) {
- case 0: gen_helper_neon_narrow_sat_u8(dest, cpu_env, src); break;
- case 1: gen_helper_neon_narrow_sat_u16(dest, cpu_env, src); break;
- case 2: gen_helper_neon_narrow_sat_u32(dest, cpu_env, src); break;
- default: abort();
- }
-}
-
-static inline void gen_neon_unarrow_sats(int size, TCGv_i32 dest, TCGv_i64 src)
-{
- switch (size) {
- case 0: gen_helper_neon_unarrow_sat8(dest, cpu_env, src); break;
- case 1: gen_helper_neon_unarrow_sat16(dest, cpu_env, src); break;
- case 2: gen_helper_neon_unarrow_sat32(dest, cpu_env, src); break;
- default: abort();
- }
-}
-
static inline void gen_neon_widen(TCGv_i64 dest, TCGv_i32 src, int size, int u)
{
if (u) {
@@ -3035,24 +2995,6 @@ static inline void gen_neon_widen(TCGv_i64 dest, TCGv_i32 src, int size, int u)
tcg_temp_free_i32(src);
}
-static void gen_neon_narrow_op(int op, int u, int size,
- TCGv_i32 dest, TCGv_i64 src)
-{
- if (op) {
- if (u) {
- gen_neon_unarrow_sats(size, dest, src);
- } else {
- gen_neon_narrow(size, dest, src);
- }
- } else {
- if (u) {
- gen_neon_narrow_satu(size, dest, src);
- } else {
- gen_neon_narrow_sats(size, dest, src);
- }
- }
-}
-
/* Symbolic constants for op fields for Neon 2-register miscellaneous.
* The values correspond to bits [17:16,10:7]; see the ARM ARM DDI0406B
* table A7-13.
@@ -4994,8 +4936,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
!arm_dc_feature(s, ARM_FEATURE_V8)) {
return 1;
}
- if ((op != NEON_2RM_VMOVN && op != NEON_2RM_VQMOVN) &&
- q && ((rm | rd) & 1)) {
+ if (q && ((rm | rd) & 1)) {
return 1;
}
switch (op) {
@@ -5004,6 +4945,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_VPADAL: case NEON_2RM_VPADAL_U:
case NEON_2RM_VUZP:
case NEON_2RM_VZIP:
+ case NEON_2RM_VMOVN: case NEON_2RM_VQMOVN:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -5019,25 +4961,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
goto elementwise;
}
break;
- case NEON_2RM_VMOVN: case NEON_2RM_VQMOVN:
- /* also VQMOVUN; op field and mnemonics don't line up */
- if (rm & 1) {
- return 1;
- }
- tmp2 = NULL;
- for (pass = 0; pass < 2; pass++) {
- neon_load_reg64(cpu_V0, rm + pass);
- tmp = tcg_temp_new_i32();
- gen_neon_narrow_op(op == NEON_2RM_VMOVN, q, size,
- tmp, cpu_V0);
- if (pass == 0) {
- tmp2 = tmp;
- } else {
- neon_store_reg(rd, 0, tmp2);
- neon_store_reg(rd, 1, tmp);
- }
- }
- break;
case NEON_2RM_VSHLL:
if (q || (rd & 1)) {
return 1;
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 04/21] target/arm: Convert Neon narrowing moves to decodetree
2020-06-16 17:08 ` [PATCH 04/21] target/arm: Convert Neon narrowing moves " Peter Maydell
@ 2020-06-19 22:52 ` Richard Henderson
0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 22:52 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon narrowing moves VMQNV, VQMOVN, VQMOVUN in the 2-reg-misc
> group to decodetree.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/neon-dp.decode | 9 ++++
> target/arm/translate-neon.inc.c | 59 ++++++++++++++++++++++++
> target/arm/translate.c | 81 +--------------------------------
> 3 files changed, 70 insertions(+), 79 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH 05/21] target/arm: Convert Neon 2-reg-misc VSHLL to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (3 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 04/21] target/arm: Convert Neon narrowing moves " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 22:55 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 06/21] target/arm: Convert Neon VCVT f16/f32 insns " Peter Maydell
` (17 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the VSHLL insn in the 2-reg-misc Neon group to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 2 ++
target/arm/translate-neon.inc.c | 52 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 35 +---------------------
3 files changed, 55 insertions(+), 34 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 2277b4c7b51..0102aa7254b 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -459,6 +459,8 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
# VQMOVN: signed result, source may be signed (_S) or unsigned (_U)
VQMOVN_S 1111 001 11 . 11 .. 10 .... 0 0101 0 . 0 .... @2misc_q0
VQMOVN_U 1111 001 11 . 11 .. 10 .... 0 0101 1 . 0 .... @2misc_q0
+
+ VSHLL 1111 001 11 . 11 .. 10 .... 0 0110 0 . 0 .... @2misc_q0
]
# Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index b0620972854..78239ec1c1b 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3302,3 +3302,55 @@ DO_VMOVN(VMOVN, gen_neon_narrow_u)
DO_VMOVN(VQMOVUN, gen_helper_neon_unarrow_sat)
DO_VMOVN(VQMOVN_S, gen_helper_neon_narrow_sat_s)
DO_VMOVN(VQMOVN_U, gen_helper_neon_narrow_sat_u)
+
+static bool trans_VSHLL(DisasContext *s, arg_2misc *a)
+{
+ TCGv_i32 rm0, rm1;
+ TCGv_i64 rd;
+ static NeonGenWidenFn * const widenfns[] = {
+ gen_helper_neon_widen_u8,
+ gen_helper_neon_widen_u16,
+ tcg_gen_extu_i32_i64,
+ NULL,
+ };
+ NeonGenWidenFn *widenfn = widenfns[a->size];
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if (a->vd & 1) {
+ return false;
+ }
+
+ if (!widenfn) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ rd = tcg_temp_new_i64();
+
+ rm0 = neon_load_reg(a->vm, 0);
+ rm1 = neon_load_reg(a->vm, 1);
+
+ widenfn(rd, rm0);
+ tcg_gen_shli_i64(rd, rd, 8 << a->size);
+ neon_store_reg64(rd, a->vd);
+ widenfn(rd, rm1);
+ tcg_gen_shli_i64(rd, rd, 8 << a->size);
+ neon_store_reg64(rd, a->vd + 1);
+
+ tcg_temp_free_i64(rd);
+ tcg_temp_free_i32(rm0);
+ tcg_temp_free_i32(rm1);
+ return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 8ecae264e15..94d5e34fff4 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -2975,26 +2975,6 @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
tcg_temp_free_i32(rd);
}
-static inline void gen_neon_widen(TCGv_i64 dest, TCGv_i32 src, int size, int u)
-{
- if (u) {
- switch (size) {
- case 0: gen_helper_neon_widen_u8(dest, src); break;
- case 1: gen_helper_neon_widen_u16(dest, src); break;
- case 2: tcg_gen_extu_i32_i64(dest, src); break;
- default: abort();
- }
- } else {
- switch (size) {
- case 0: gen_helper_neon_widen_s8(dest, src); break;
- case 1: gen_helper_neon_widen_s16(dest, src); break;
- case 2: tcg_gen_ext_i32_i64(dest, src); break;
- default: abort();
- }
- }
- tcg_temp_free_i32(src);
-}
-
/* Symbolic constants for op fields for Neon 2-register miscellaneous.
* The values correspond to bits [17:16,10:7]; see the ARM ARM DDI0406B
* table A7-13.
@@ -4946,6 +4926,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_VUZP:
case NEON_2RM_VZIP:
case NEON_2RM_VMOVN: case NEON_2RM_VQMOVN:
+ case NEON_2RM_VSHLL:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -4961,20 +4942,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
goto elementwise;
}
break;
- case NEON_2RM_VSHLL:
- if (q || (rd & 1)) {
- return 1;
- }
- tmp = neon_load_reg(rm, 0);
- tmp2 = neon_load_reg(rm, 1);
- for (pass = 0; pass < 2; pass++) {
- if (pass == 1)
- tmp = tmp2;
- gen_neon_widen(cpu_V0, tmp, size, 1);
- tcg_gen_shli_i64(cpu_V0, cpu_V0, 8 << size);
- neon_store_reg64(cpu_V0, rd + pass);
- }
- break;
case NEON_2RM_VCVT_F16_F32:
{
TCGv_ptr fpst;
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 06/21] target/arm: Convert Neon VCVT f16/f32 insns to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (4 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 05/21] target/arm: Convert Neon 2-reg-misc VSHLL " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 23:01 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 07/21] target/arm: Convert vectorised 2-reg-misc Neon ops " Peter Maydell
` (16 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the Neon insns in the 2-reg-misc group which are
VCVT between f32 and f16 to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 3 ++
target/arm/translate-neon.inc.c | 96 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 65 ++--------------------
3 files changed, 102 insertions(+), 62 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 0102aa7254b..8174f2f92f4 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -461,6 +461,9 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
VQMOVN_U 1111 001 11 . 11 .. 10 .... 0 0101 1 . 0 .... @2misc_q0
VSHLL 1111 001 11 . 11 .. 10 .... 0 0110 0 . 0 .... @2misc_q0
+
+ VCVT_F16_F32 1111 001 11 . 11 .. 10 .... 0 1100 0 . 0 .... @2misc_q0
+ VCVT_F32_F16 1111 001 11 . 11 .. 10 .... 0 1110 0 . 0 .... @2misc_q0
]
# Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 78239ec1c1b..d37be597cf4 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3354,3 +3354,99 @@ static bool trans_VSHLL(DisasContext *s, arg_2misc *a)
tcg_temp_free_i32(rm1);
return true;
}
+
+static bool trans_VCVT_F16_F32(DisasContext *s, arg_2misc *a)
+{
+ TCGv_ptr fpst;
+ TCGv_i32 ahp, tmp, tmp2, tmp3;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+ !dc_isar_feature(aa32_fp16_spconv, s)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if ((a->vm & 1) || (a->size != 1)) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ fpst = get_fpstatus_ptr(true);
+ ahp = get_ahp_flag();
+ tmp = neon_load_reg(a->vm, 0);
+ gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp);
+ tmp2 = neon_load_reg(a->vm, 1);
+ gen_helper_vfp_fcvt_f32_to_f16(tmp2, tmp2, fpst, ahp);
+ tcg_gen_shli_i32(tmp2, tmp2, 16);
+ tcg_gen_or_i32(tmp2, tmp2, tmp);
+ tcg_temp_free_i32(tmp);
+ tmp = neon_load_reg(a->vm, 2);
+ gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp);
+ tmp3 = neon_load_reg(a->vm, 3);
+ neon_store_reg(a->vd, 0, tmp2);
+ gen_helper_vfp_fcvt_f32_to_f16(tmp3, tmp3, fpst, ahp);
+ tcg_gen_shli_i32(tmp3, tmp3, 16);
+ tcg_gen_or_i32(tmp3, tmp3, tmp);
+ neon_store_reg(a->vd, 1, tmp3);
+ tcg_temp_free_i32(tmp);
+ tcg_temp_free_i32(ahp);
+ tcg_temp_free_ptr(fpst);
+
+ return true;
+}
+
+static bool trans_VCVT_F32_F16(DisasContext *s, arg_2misc *a)
+{
+ TCGv_ptr fpst;
+ TCGv_i32 ahp, tmp, tmp2, tmp3;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+ !dc_isar_feature(aa32_fp16_spconv, s)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if ((a->vd & 1) || (a->size != 1)) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ fpst = get_fpstatus_ptr(true);
+ ahp = get_ahp_flag();
+ tmp3 = tcg_temp_new_i32();
+ tmp = neon_load_reg(a->vm, 0);
+ tmp2 = neon_load_reg(a->vm, 1);
+ tcg_gen_ext16u_i32(tmp3, tmp);
+ gen_helper_vfp_fcvt_f16_to_f32(tmp3, tmp3, fpst, ahp);
+ neon_store_reg(a->vd, 0, tmp3);
+ tcg_gen_shri_i32(tmp, tmp, 16);
+ gen_helper_vfp_fcvt_f16_to_f32(tmp, tmp, fpst, ahp);
+ neon_store_reg(a->vd, 1, tmp);
+ tmp3 = tcg_temp_new_i32();
+ tcg_gen_ext16u_i32(tmp3, tmp2);
+ gen_helper_vfp_fcvt_f16_to_f32(tmp3, tmp3, fpst, ahp);
+ neon_store_reg(a->vd, 2, tmp3);
+ tcg_gen_shri_i32(tmp2, tmp2, 16);
+ gen_helper_vfp_fcvt_f16_to_f32(tmp2, tmp2, fpst, ahp);
+ neon_store_reg(a->vd, 3, tmp2);
+ tcg_temp_free_i32(ahp);
+ tcg_temp_free_ptr(fpst);
+
+ return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 94d5e34fff4..1ea09695546 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4860,7 +4860,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
int pass;
int u;
int vec_size;
- TCGv_i32 tmp, tmp2, tmp3;
+ TCGv_i32 tmp, tmp2;
if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
return 1;
@@ -4927,6 +4927,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_VZIP:
case NEON_2RM_VMOVN: case NEON_2RM_VQMOVN:
case NEON_2RM_VSHLL:
+ case NEON_2RM_VCVT_F16_F32:
+ case NEON_2RM_VCVT_F32_F16:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -4942,67 +4944,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
goto elementwise;
}
break;
- case NEON_2RM_VCVT_F16_F32:
- {
- TCGv_ptr fpst;
- TCGv_i32 ahp;
-
- if (!dc_isar_feature(aa32_fp16_spconv, s) ||
- q || (rm & 1)) {
- return 1;
- }
- fpst = get_fpstatus_ptr(true);
- ahp = get_ahp_flag();
- tmp = neon_load_reg(rm, 0);
- gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp);
- tmp2 = neon_load_reg(rm, 1);
- gen_helper_vfp_fcvt_f32_to_f16(tmp2, tmp2, fpst, ahp);
- tcg_gen_shli_i32(tmp2, tmp2, 16);
- tcg_gen_or_i32(tmp2, tmp2, tmp);
- tcg_temp_free_i32(tmp);
- tmp = neon_load_reg(rm, 2);
- gen_helper_vfp_fcvt_f32_to_f16(tmp, tmp, fpst, ahp);
- tmp3 = neon_load_reg(rm, 3);
- neon_store_reg(rd, 0, tmp2);
- gen_helper_vfp_fcvt_f32_to_f16(tmp3, tmp3, fpst, ahp);
- tcg_gen_shli_i32(tmp3, tmp3, 16);
- tcg_gen_or_i32(tmp3, tmp3, tmp);
- neon_store_reg(rd, 1, tmp3);
- tcg_temp_free_i32(tmp);
- tcg_temp_free_i32(ahp);
- tcg_temp_free_ptr(fpst);
- break;
- }
- case NEON_2RM_VCVT_F32_F16:
- {
- TCGv_ptr fpst;
- TCGv_i32 ahp;
- if (!dc_isar_feature(aa32_fp16_spconv, s) ||
- q || (rd & 1)) {
- return 1;
- }
- fpst = get_fpstatus_ptr(true);
- ahp = get_ahp_flag();
- tmp3 = tcg_temp_new_i32();
- tmp = neon_load_reg(rm, 0);
- tmp2 = neon_load_reg(rm, 1);
- tcg_gen_ext16u_i32(tmp3, tmp);
- gen_helper_vfp_fcvt_f16_to_f32(tmp3, tmp3, fpst, ahp);
- neon_store_reg(rd, 0, tmp3);
- tcg_gen_shri_i32(tmp, tmp, 16);
- gen_helper_vfp_fcvt_f16_to_f32(tmp, tmp, fpst, ahp);
- neon_store_reg(rd, 1, tmp);
- tmp3 = tcg_temp_new_i32();
- tcg_gen_ext16u_i32(tmp3, tmp2);
- gen_helper_vfp_fcvt_f16_to_f32(tmp3, tmp3, fpst, ahp);
- neon_store_reg(rd, 2, tmp3);
- tcg_gen_shri_i32(tmp2, tmp2, 16);
- gen_helper_vfp_fcvt_f16_to_f32(tmp2, tmp2, fpst, ahp);
- neon_store_reg(rd, 3, tmp2);
- tcg_temp_free_i32(ahp);
- tcg_temp_free_ptr(fpst);
- break;
- }
case NEON_2RM_AESE: case NEON_2RM_AESMC:
if (!dc_isar_feature(aa32_aes, s) || ((rm | rd) & 1)) {
return 1;
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 07/21] target/arm: Convert vectorised 2-reg-misc Neon ops to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (5 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 06/21] target/arm: Convert Neon VCVT f16/f32 insns " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 23:17 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 08/21] target/arm: Convert Neon 2-reg-misc crypto operations " Peter Maydell
` (15 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert to decodetree the insns in the Neon 2-reg-misc grouping which
we implement using gvec.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 11 +++++++
target/arm/translate-neon.inc.c | 55 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 35 +++++----------------
3 files changed, 74 insertions(+), 27 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 8174f2f92f4..b5692070d62 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -447,9 +447,20 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
VPADDL_S 1111 001 11 . 11 .. 00 .... 0 0100 . . 0 .... @2misc
VPADDL_U 1111 001 11 . 11 .. 00 .... 0 0101 . . 0 .... @2misc
+ VMVN 1111 001 11 . 11 .. 00 .... 0 1011 . . 0 .... @2misc
+
VPADAL_S 1111 001 11 . 11 .. 00 .... 0 1100 . . 0 .... @2misc
VPADAL_U 1111 001 11 . 11 .. 00 .... 0 1101 . . 0 .... @2misc
+ VCGT0 1111 001 11 . 11 .. 01 .... 0 0000 . . 0 .... @2misc
+ VCGE0 1111 001 11 . 11 .. 01 .... 0 0001 . . 0 .... @2misc
+ VCEQ0 1111 001 11 . 11 .. 01 .... 0 0010 . . 0 .... @2misc
+ VCLE0 1111 001 11 . 11 .. 01 .... 0 0011 . . 0 .... @2misc
+ VCLT0 1111 001 11 . 11 .. 01 .... 0 0100 . . 0 .... @2misc
+
+ VABS 1111 001 11 . 11 .. 01 .... 0 0110 . . 0 .... @2misc
+ VNEG 1111 001 11 . 11 .. 01 .... 0 0111 . . 0 .... @2misc
+
VUZP 1111 001 11 . 11 .. 10 .... 0 0010 . . 0 .... @2misc
VZIP 1111 001 11 . 11 .. 10 .... 0 0011 . . 0 .... @2misc
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index d37be597cf4..d80123514c2 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3450,3 +3450,58 @@ static bool trans_VCVT_F32_F16(DisasContext *s, arg_2misc *a)
return true;
}
+
+static bool do_2misc_vec(DisasContext *s, arg_2misc *a, GVecGen2Fn *fn)
+{
+ int vec_size = a->q ? 16 : 8;
+ int rd_ofs = neon_reg_offset(a->vd, 0);
+ int rm_ofs = neon_reg_offset(a->vm, 0);
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if (a->size == 3) {
+ return false;
+ }
+
+ if ((a->vd | a->vm) & a->q) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ fn(a->size, rd_ofs, rm_ofs, vec_size, vec_size);
+
+ return true;
+}
+
+#define DO_2MISC_VEC(INSN, FN) \
+ static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
+ { \
+ return do_2misc_vec(s, a, FN); \
+ }
+
+DO_2MISC_VEC(VNEG, tcg_gen_gvec_neg)
+DO_2MISC_VEC(VABS, tcg_gen_gvec_abs)
+DO_2MISC_VEC(VCEQ0, gen_gvec_ceq0)
+DO_2MISC_VEC(VCGT0, gen_gvec_cgt0)
+DO_2MISC_VEC(VCLE0, gen_gvec_cle0)
+DO_2MISC_VEC(VCGE0, gen_gvec_cge0)
+DO_2MISC_VEC(VCLT0, gen_gvec_clt0)
+
+static bool trans_VMVN(DisasContext *s, arg_2misc *a)
+{
+ if (a->size != 0) {
+ return false;
+ }
+ return do_2misc_vec(s, a, tcg_gen_gvec_not);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 1ea09695546..0f0741a37bc 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4859,7 +4859,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
int size;
int pass;
int u;
- int vec_size;
TCGv_i32 tmp, tmp2;
if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
@@ -4883,7 +4882,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
VFP_DREG_D(rd, insn);
VFP_DREG_M(rm, insn);
size = (insn >> 20) & 3;
- vec_size = q ? 16 : 8;
rd_ofs = neon_reg_offset(rd, 0);
rm_ofs = neon_reg_offset(rm, 0);
@@ -4929,6 +4927,14 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_VSHLL:
case NEON_2RM_VCVT_F16_F32:
case NEON_2RM_VCVT_F32_F16:
+ case NEON_2RM_VMVN:
+ case NEON_2RM_VNEG:
+ case NEON_2RM_VABS:
+ case NEON_2RM_VCEQ0:
+ case NEON_2RM_VCGT0:
+ case NEON_2RM_VCLE0:
+ case NEON_2RM_VCGE0:
+ case NEON_2RM_VCLT0:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -4989,31 +4995,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
q ? gen_helper_crypto_sha256su0
: gen_helper_crypto_sha1su1);
break;
- case NEON_2RM_VMVN:
- tcg_gen_gvec_not(0, rd_ofs, rm_ofs, vec_size, vec_size);
- break;
- case NEON_2RM_VNEG:
- tcg_gen_gvec_neg(size, rd_ofs, rm_ofs, vec_size, vec_size);
- break;
- case NEON_2RM_VABS:
- tcg_gen_gvec_abs(size, rd_ofs, rm_ofs, vec_size, vec_size);
- break;
-
- case NEON_2RM_VCEQ0:
- gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size);
- break;
- case NEON_2RM_VCGT0:
- gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
- break;
- case NEON_2RM_VCLE0:
- gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size);
- break;
- case NEON_2RM_VCGE0:
- gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size);
- break;
- case NEON_2RM_VCLT0:
- gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, vec_size);
- break;
default:
elementwise:
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 08/21] target/arm: Convert Neon 2-reg-misc crypto operations to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (6 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 07/21] target/arm: Convert vectorised 2-reg-misc Neon ops " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 23:25 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 09/21] target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn Peter Maydell
` (14 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the Neon-2-reg misc crypto ops (AESE, AESMC, SHA1H, SHA1SU1)
to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 12 ++++++++
target/arm/translate-neon.inc.c | 42 ++++++++++++++++++++++++++
target/arm/translate.c | 52 +++------------------------------
3 files changed, 58 insertions(+), 48 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index b5692070d62..86b1b9e34bf 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -441,12 +441,19 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
&2misc vm=%vm_dp vd=%vd_dp
@2misc_q0 .... ... .. . .. size:2 .. .... . .... . . . .... \
&2misc vm=%vm_dp vd=%vd_dp q=0
+ @2misc_q1 .... ... .. . .. size:2 .. .... . .... . . . .... \
+ &2misc vm=%vm_dp vd=%vd_dp q=1
VREV64 1111 001 11 . 11 .. 00 .... 0 0000 . . 0 .... @2misc
VPADDL_S 1111 001 11 . 11 .. 00 .... 0 0100 . . 0 .... @2misc
VPADDL_U 1111 001 11 . 11 .. 00 .... 0 0101 . . 0 .... @2misc
+ AESE 1111 001 11 . 11 .. 00 .... 0 0110 0 . 0 .... @2misc_q1
+ AESD 1111 001 11 . 11 .. 00 .... 0 0110 1 . 0 .... @2misc_q1
+ AESMC 1111 001 11 . 11 .. 00 .... 0 0111 0 . 0 .... @2misc_q1
+ AESIMC 1111 001 11 . 11 .. 00 .... 0 0111 1 . 0 .... @2misc_q1
+
VMVN 1111 001 11 . 11 .. 00 .... 0 1011 . . 0 .... @2misc
VPADAL_S 1111 001 11 . 11 .. 00 .... 0 1100 . . 0 .... @2misc
@@ -458,6 +465,8 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
VCLE0 1111 001 11 . 11 .. 01 .... 0 0011 . . 0 .... @2misc
VCLT0 1111 001 11 . 11 .. 01 .... 0 0100 . . 0 .... @2misc
+ SHA1H 1111 001 11 . 11 .. 01 .... 0 0101 1 . 0 .... @2misc_q1
+
VABS 1111 001 11 . 11 .. 01 .... 0 0110 . . 0 .... @2misc
VNEG 1111 001 11 . 11 .. 01 .... 0 0111 . . 0 .... @2misc
@@ -473,6 +482,9 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
VSHLL 1111 001 11 . 11 .. 10 .... 0 0110 0 . 0 .... @2misc_q0
+ SHA1SU1 1111 001 11 . 11 .. 10 .... 0 0111 0 . 0 .... @2misc_q1
+ SHA256SU0 1111 001 11 . 11 .. 10 .... 0 0111 1 . 0 .... @2misc_q1
+
VCVT_F16_F32 1111 001 11 . 11 .. 10 .... 0 1100 0 . 0 .... @2misc_q0
VCVT_F32_F16 1111 001 11 . 11 .. 10 .... 0 1110 0 . 0 .... @2misc_q0
]
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index d80123514c2..5e2cd18bf71 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3505,3 +3505,45 @@ static bool trans_VMVN(DisasContext *s, arg_2misc *a)
}
return do_2misc_vec(s, a, tcg_gen_gvec_not);
}
+
+#define WRAP_2M_3_OOL_FN(WRAPNAME, FUNC, DATA) \
+ static void WRAPNAME(unsigned vece, uint32_t rd_ofs, \
+ uint32_t rm_ofs, uint32_t oprsz, \
+ uint32_t maxsz) \
+ { \
+ tcg_gen_gvec_3_ool(rd_ofs, rd_ofs, rm_ofs, oprsz, maxsz, \
+ DATA, FUNC); \
+ }
+
+#define WRAP_2M_2_OOL_FN(WRAPNAME, FUNC, DATA) \
+ static void WRAPNAME(unsigned vece, uint32_t rd_ofs, \
+ uint32_t rm_ofs, uint32_t oprsz, \
+ uint32_t maxsz) \
+ { \
+ tcg_gen_gvec_2_ool(rd_ofs, rm_ofs, oprsz, maxsz, DATA, FUNC); \
+ }
+
+WRAP_2M_3_OOL_FN(gen_AESE, gen_helper_crypto_aese, 0)
+WRAP_2M_3_OOL_FN(gen_AESD, gen_helper_crypto_aese, 1)
+WRAP_2M_2_OOL_FN(gen_AESMC, gen_helper_crypto_aesmc, 0)
+WRAP_2M_2_OOL_FN(gen_AESIMC, gen_helper_crypto_aesmc, 1)
+WRAP_2M_2_OOL_FN(gen_SHA1H, gen_helper_crypto_sha1h, 0)
+WRAP_2M_2_OOL_FN(gen_SHA1SU1, gen_helper_crypto_sha1su1, 0)
+WRAP_2M_2_OOL_FN(gen_SHA256SU0, gen_helper_crypto_sha256su0, 0)
+
+#define DO_2M_CRYPTO(INSN, FEATURE, SIZE) \
+ static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
+ { \
+ if (!dc_isar_feature(FEATURE, s) || a->size != SIZE) { \
+ return false; \
+ } \
+ return do_2misc_vec(s, a, gen_##INSN); \
+ }
+
+DO_2M_CRYPTO(AESE, aa32_aes, 0)
+DO_2M_CRYPTO(AESD, aa32_aes, 0)
+DO_2M_CRYPTO(AESMC, aa32_aes, 0)
+DO_2M_CRYPTO(AESIMC, aa32_aes, 0)
+DO_2M_CRYPTO(SHA1H, aa32_sha1, 2)
+DO_2M_CRYPTO(SHA1SU1, aa32_sha1, 2)
+DO_2M_CRYPTO(SHA256SU0, aa32_sha2, 2)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 0f0741a37bc..38644995ab2 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4855,7 +4855,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
{
int op;
int q;
- int rd, rm, rd_ofs, rm_ofs;
+ int rd, rm;
int size;
int pass;
int u;
@@ -4882,8 +4882,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
VFP_DREG_D(rd, insn);
VFP_DREG_M(rm, insn);
size = (insn >> 20) & 3;
- rd_ofs = neon_reg_offset(rd, 0);
- rm_ofs = neon_reg_offset(rm, 0);
if ((insn & (1 << 23)) == 0) {
/* Three register same length: handled by decodetree */
@@ -4935,6 +4933,9 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_VCLE0:
case NEON_2RM_VCGE0:
case NEON_2RM_VCLT0:
+ case NEON_2RM_AESE: case NEON_2RM_AESMC:
+ case NEON_2RM_SHA1H:
+ case NEON_2RM_SHA1SU1:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -4950,51 +4951,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
goto elementwise;
}
break;
- case NEON_2RM_AESE: case NEON_2RM_AESMC:
- if (!dc_isar_feature(aa32_aes, s) || ((rm | rd) & 1)) {
- return 1;
- }
- /*
- * Bit 6 is the lowest opcode bit; it distinguishes
- * between encryption (AESE/AESMC) and decryption
- * (AESD/AESIMC).
- */
- if (op == NEON_2RM_AESE) {
- tcg_gen_gvec_3_ool(vfp_reg_offset(true, rd),
- vfp_reg_offset(true, rd),
- vfp_reg_offset(true, rm),
- 16, 16, extract32(insn, 6, 1),
- gen_helper_crypto_aese);
- } else {
- tcg_gen_gvec_2_ool(vfp_reg_offset(true, rd),
- vfp_reg_offset(true, rm),
- 16, 16, extract32(insn, 6, 1),
- gen_helper_crypto_aesmc);
- }
- break;
- case NEON_2RM_SHA1H:
- if (!dc_isar_feature(aa32_sha1, s) || ((rm | rd) & 1)) {
- return 1;
- }
- tcg_gen_gvec_2_ool(rd_ofs, rm_ofs, 16, 16, 0,
- gen_helper_crypto_sha1h);
- break;
- case NEON_2RM_SHA1SU1:
- if ((rm | rd) & 1) {
- return 1;
- }
- /* bit 6 (q): set -> SHA256SU0, cleared -> SHA1SU1 */
- if (q) {
- if (!dc_isar_feature(aa32_sha2, s)) {
- return 1;
- }
- } else if (!dc_isar_feature(aa32_sha1, s)) {
- return 1;
- }
- tcg_gen_gvec_2_ool(rd_ofs, rm_ofs, 16, 16, 0,
- q ? gen_helper_crypto_sha256su0
- : gen_helper_crypto_sha1su1);
- break;
default:
elementwise:
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 09/21] target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (7 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 08/21] target/arm: Convert Neon 2-reg-misc crypto operations " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 23:28 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 10/21] target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs Peter Maydell
` (13 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
The NeonGenOneOpFn typedef breaks with the pattern of the other
NeonGen*Fn typedefs, because it is a TCGv_i64 -> TCGv_i64 operation
but it does not have '64' in its name. Rename it to NeonGenOne64OpFn,
so that the old name is available for a TCGv_i32 -> TCGv_i32 operation
(which we will need in a subsequent commit).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/translate.h | 2 +-
target/arm/translate-a64.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 62ed5c4780c..35218b3fdf1 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -374,7 +374,7 @@ typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
typedef void NeonGenTwoOpWidenFn(TCGv_i64, TCGv_i32, TCGv_i32);
typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
-typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
+typedef void NeonGenOne64OpFn(TCGv_i64, TCGv_i64);
typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
typedef void CryptoThreeOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index a0e72ad6942..7cb5fbfba80 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11917,8 +11917,8 @@ static void handle_2misc_pairwise(DisasContext *s, int opcode, bool u,
} else {
for (pass = 0; pass < maxpass; pass++) {
TCGv_i64 tcg_op = tcg_temp_new_i64();
- NeonGenOneOpFn *genfn;
- static NeonGenOneOpFn * const fns[2][2] = {
+ NeonGenOne64OpFn *genfn;
+ static NeonGenOne64OpFn * const fns[2][2] = {
{ gen_helper_neon_addlp_s8, gen_helper_neon_addlp_u8 },
{ gen_helper_neon_addlp_s16, gen_helper_neon_addlp_u16 },
};
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 09/21] target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn
2020-06-16 17:08 ` [PATCH 09/21] target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn Peter Maydell
@ 2020-06-19 23:28 ` Richard Henderson
0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:28 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 6/16/20 10:08 AM, Peter Maydell wrote:
> The NeonGenOneOpFn typedef breaks with the pattern of the other
> NeonGen*Fn typedefs, because it is a TCGv_i64 -> TCGv_i64 operation
> but it does not have '64' in its name. Rename it to NeonGenOne64OpFn,
> so that the old name is available for a TCGv_i32 -> TCGv_i32 operation
> (which we will need in a subsequent commit).
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/translate.h | 2 +-
> target/arm/translate-a64.c | 4 ++--
> 2 files changed, 3 insertions(+), 3 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH 10/21] target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (8 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 09/21] target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 23:32 ` [PATCH 10/21] target/arm: Fix capitalization in NeonGenTwo{Single,Double}OPFn typedefs Richard Henderson
2020-06-16 17:08 ` [PATCH 11/21] target/arm: Make gen_swap_half() take separate src and dest Peter Maydell
` (12 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
All the other typedefs like these spell "Op" with a lowercase 'p';
remane the NeonGenTwoSingleOPFn and NeonGenTwoDoubleOPFn typedefs to
match.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/translate.h | 4 ++--
target/arm/translate-a64.c | 4 ++--
target/arm/translate-neon.inc.c | 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 35218b3fdf1..467c5291101 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -372,8 +372,8 @@ typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
typedef void NeonGenTwoOpWidenFn(TCGv_i64, TCGv_i32, TCGv_i32);
-typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
-typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
+typedef void NeonGenTwoSingleOpFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
+typedef void NeonGenTwoDoubleOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
typedef void NeonGenOne64OpFn(TCGv_i64, TCGv_i64);
typedef void CryptoTwoOpFn(TCGv_ptr, TCGv_ptr);
typedef void CryptoThreeOpIntFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 7cb5fbfba80..12040984981 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -9534,7 +9534,7 @@ static void handle_2misc_fcmp_zero(DisasContext *s, int opcode,
TCGv_i64 tcg_op = tcg_temp_new_i64();
TCGv_i64 tcg_zero = tcg_const_i64(0);
TCGv_i64 tcg_res = tcg_temp_new_i64();
- NeonGenTwoDoubleOPFn *genfn;
+ NeonGenTwoDoubleOpFn *genfn;
bool swap = false;
int pass;
@@ -9576,7 +9576,7 @@ static void handle_2misc_fcmp_zero(DisasContext *s, int opcode,
TCGv_i32 tcg_op = tcg_temp_new_i32();
TCGv_i32 tcg_zero = tcg_const_i32(0);
TCGv_i32 tcg_res = tcg_temp_new_i32();
- NeonGenTwoSingleOPFn *genfn;
+ NeonGenTwoSingleOpFn *genfn;
bool swap = false;
int pass, maxpasses;
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 5e2cd18bf71..c39443c8cae 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -1664,7 +1664,7 @@ static bool trans_VSHLL_U_2sh(DisasContext *s, arg_2reg_shift *a)
}
static bool do_fp_2sh(DisasContext *s, arg_2reg_shift *a,
- NeonGenTwoSingleOPFn *fn)
+ NeonGenTwoSingleOpFn *fn)
{
/* FP operations in 2-reg-and-shift group */
TCGv_i32 tmp, shiftv;
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 11/21] target/arm: Make gen_swap_half() take separate src and dest
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (9 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 10/21] target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 23:33 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 12/21] target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree Peter Maydell
` (11 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Make gen_swap_half() take a source and destination TCGv_i32 rather
than modifying the input TCGv_i32; we're going to want to be able to
use it with the more flexible function signature, and this also
brings it into line with other functions like gen_rev16() and
gen_revsh().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/translate-neon.inc.c | 2 +-
target/arm/translate.c | 10 +++++-----
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index c39443c8cae..4967e974386 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3007,7 +3007,7 @@ static bool trans_VREV64(DisasContext *s, arg_VREV64 *a)
tcg_gen_bswap32_i32(tmp[half], tmp[half]);
break;
case 1:
- gen_swap_half(tmp[half]);
+ gen_swap_half(tmp[half], tmp[half]);
break;
case 2:
break;
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 38644995ab2..64b18a95b64 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -378,9 +378,9 @@ static void gen_revsh(TCGv_i32 dest, TCGv_i32 var)
}
/* Swap low and high halfwords. */
-static void gen_swap_half(TCGv_i32 var)
+static void gen_swap_half(TCGv_i32 dest, TCGv_i32 var)
{
- tcg_gen_rotri_i32(var, var, 16);
+ tcg_gen_rotri_i32(dest, var, 16);
}
/* Dual 16-bit add. Result placed in t0 and t1 is marked as dead.
@@ -4960,7 +4960,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_VREV32:
switch (size) {
case 0: tcg_gen_bswap32_i32(tmp, tmp); break;
- case 1: gen_swap_half(tmp); break;
+ case 1: gen_swap_half(tmp, tmp); break;
default: abort();
}
break;
@@ -8046,7 +8046,7 @@ static bool op_smlad(DisasContext *s, arg_rrrr *a, bool m_swap, bool sub)
t1 = load_reg(s, a->rn);
t2 = load_reg(s, a->rm);
if (m_swap) {
- gen_swap_half(t2);
+ gen_swap_half(t2, t2);
}
gen_smul_dual(t1, t2);
@@ -8104,7 +8104,7 @@ static bool op_smlald(DisasContext *s, arg_rrrr *a, bool m_swap, bool sub)
t1 = load_reg(s, a->rn);
t2 = load_reg(s, a->rm);
if (m_swap) {
- gen_swap_half(t2);
+ gen_swap_half(t2, t2);
}
gen_smul_dual(t1, t2);
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 11/21] target/arm: Make gen_swap_half() take separate src and dest
2020-06-16 17:08 ` [PATCH 11/21] target/arm: Make gen_swap_half() take separate src and dest Peter Maydell
@ 2020-06-19 23:33 ` Richard Henderson
0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:33 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 6/16/20 10:08 AM, Peter Maydell wrote:
> Make gen_swap_half() take a source and destination TCGv_i32 rather
> than modifying the input TCGv_i32; we're going to want to be able to
> use it with the more flexible function signature, and this also
> brings it into line with other functions like gen_rev16() and
> gen_revsh().
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/translate-neon.inc.c | 2 +-
> target/arm/translate.c | 10 +++++-----
> 2 files changed, 6 insertions(+), 6 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH 12/21] target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (10 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 11/21] target/arm: Make gen_swap_half() take separate src and dest Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 23:35 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 13/21] target/arm: Convert remaining simple 2-reg-misc Neon ops Peter Maydell
` (10 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the VREV32 and VREV16 insns in the Neon 2-reg-misc group
to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/translate.h | 1 +
target/arm/neon-dp.decode | 2 ++
target/arm/translate-neon.inc.c | 55 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 12 ++-----
4 files changed, 60 insertions(+), 10 deletions(-)
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 467c5291101..4dbeee4c89f 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -363,6 +363,7 @@ typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
uint32_t, uint32_t, uint32_t);
/* Function prototype for gen_ functions for calling Neon helpers */
+typedef void NeonGenOneOpFn(TCGv_i32, TCGv_i32);
typedef void NeonGenOneOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32);
typedef void NeonGenTwoOpFn(TCGv_i32, TCGv_i32, TCGv_i32);
typedef void NeonGenTwoOpEnvFn(TCGv_i32, TCGv_ptr, TCGv_i32, TCGv_i32);
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 86b1b9e34bf..0a791af46c8 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -445,6 +445,8 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
&2misc vm=%vm_dp vd=%vd_dp q=1
VREV64 1111 001 11 . 11 .. 00 .... 0 0000 . . 0 .... @2misc
+ VREV32 1111 001 11 . 11 .. 00 .... 0 0001 . . 0 .... @2misc
+ VREV16 1111 001 11 . 11 .. 00 .... 0 0010 . . 0 .... @2misc
VPADDL_S 1111 001 11 . 11 .. 00 .... 0 0100 . . 0 .... @2misc
VPADDL_U 1111 001 11 . 11 .. 00 .... 0 0101 . . 0 .... @2misc
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 4967e974386..0a779980d01 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3547,3 +3547,58 @@ DO_2M_CRYPTO(AESIMC, aa32_aes, 0)
DO_2M_CRYPTO(SHA1H, aa32_sha1, 2)
DO_2M_CRYPTO(SHA1SU1, aa32_sha1, 2)
DO_2M_CRYPTO(SHA256SU0, aa32_sha2, 2)
+
+static bool do_2misc(DisasContext *s, arg_2misc *a, NeonGenOneOpFn *fn)
+{
+ int pass;
+
+ /* Handle a 2-reg-misc operation by iterating 32 bits at a time */
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if (!fn) {
+ return false;
+ }
+
+ if ((a->vd | a->vm) & a->q) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
+ TCGv_i32 tmp = neon_load_reg(a->vm, pass);
+ fn(tmp, tmp);
+ neon_store_reg(a->vd, pass, tmp);
+ }
+
+ return true;
+}
+
+static bool trans_VREV32(DisasContext *s, arg_2misc *a)
+{
+ static NeonGenOneOpFn * const fn[] = {
+ tcg_gen_bswap32_i32,
+ gen_swap_half,
+ NULL,
+ NULL,
+ };
+ return do_2misc(s, a, fn[a->size]);
+}
+
+static bool trans_VREV16(DisasContext *s, arg_2misc *a)
+{
+ if (a->size != 0) {
+ return false;
+ }
+ return do_2misc(s, a, gen_rev16);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 64b18a95b64..5b50eddd111 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4936,6 +4936,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_AESE: case NEON_2RM_AESMC:
case NEON_2RM_SHA1H:
case NEON_2RM_SHA1SU1:
+ case NEON_2RM_VREV32:
+ case NEON_2RM_VREV16:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -4957,16 +4959,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
for (pass = 0; pass < (q ? 4 : 2); pass++) {
tmp = neon_load_reg(rm, pass);
switch (op) {
- case NEON_2RM_VREV32:
- switch (size) {
- case 0: tcg_gen_bswap32_i32(tmp, tmp); break;
- case 1: gen_swap_half(tmp, tmp); break;
- default: abort();
- }
- break;
- case NEON_2RM_VREV16:
- gen_rev16(tmp, tmp);
- break;
case NEON_2RM_VCLS:
switch (size) {
case 0: gen_helper_neon_cls_s8(tmp, tmp); break;
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 13/21] target/arm: Convert remaining simple 2-reg-misc Neon ops
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (11 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 12/21] target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 23:41 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 14/21] target/arm: Convert Neon VQABS, VQNEG to decodetree Peter Maydell
` (9 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the remaining ops in the Neon 2-reg-misc group which
can be implemented simply with our do_2misc() helper.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 10 +++++
target/arm/translate-neon.inc.c | 69 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 38 ++++--------------
3 files changed, 86 insertions(+), 31 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 0a791af46c8..f947f7d09f0 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -456,6 +456,10 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
AESMC 1111 001 11 . 11 .. 00 .... 0 0111 0 . 0 .... @2misc_q1
AESIMC 1111 001 11 . 11 .. 00 .... 0 0111 1 . 0 .... @2misc_q1
+ VCLS 1111 001 11 . 11 .. 00 .... 0 1000 . . 0 .... @2misc
+ VCLZ 1111 001 11 . 11 .. 00 .... 0 1001 . . 0 .... @2misc
+ VCNT 1111 001 11 . 11 .. 00 .... 0 1010 . . 0 .... @2misc
+
VMVN 1111 001 11 . 11 .. 00 .... 0 1011 . . 0 .... @2misc
VPADAL_S 1111 001 11 . 11 .. 00 .... 0 1100 . . 0 .... @2misc
@@ -472,6 +476,9 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
VABS 1111 001 11 . 11 .. 01 .... 0 0110 . . 0 .... @2misc
VNEG 1111 001 11 . 11 .. 01 .... 0 0111 . . 0 .... @2misc
+ VABS_F 1111 001 11 . 11 .. 01 .... 0 1110 . . 0 .... @2misc
+ VNEG_F 1111 001 11 . 11 .. 01 .... 0 1111 . . 0 .... @2misc
+
VUZP 1111 001 11 . 11 .. 10 .... 0 0010 . . 0 .... @2misc
VZIP 1111 001 11 . 11 .. 10 .... 0 0011 . . 0 .... @2misc
@@ -489,6 +496,9 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
VCVT_F16_F32 1111 001 11 . 11 .. 10 .... 0 1100 0 . 0 .... @2misc_q0
VCVT_F32_F16 1111 001 11 . 11 .. 10 .... 0 1110 0 . 0 .... @2misc_q0
+
+ VRECPE 1111 001 11 . 11 .. 11 .... 0 1000 . . 0 .... @2misc
+ VRSQRTE 1111 001 11 . 11 .. 11 .... 0 1001 . . 0 .... @2misc
]
# Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 0a779980d01..336c2b312eb 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3602,3 +3602,72 @@ static bool trans_VREV16(DisasContext *s, arg_2misc *a)
}
return do_2misc(s, a, gen_rev16);
}
+
+static bool trans_VCLS(DisasContext *s, arg_2misc *a)
+{
+ static NeonGenOneOpFn * const fn[] = {
+ gen_helper_neon_cls_s8,
+ gen_helper_neon_cls_s16,
+ gen_helper_neon_cls_s32,
+ NULL,
+ };
+ return do_2misc(s, a, fn[a->size]);
+}
+
+static void do_VCLZ_32(TCGv_i32 rd, TCGv_i32 rm)
+{
+ tcg_gen_clzi_i32(rd, rm, 32);
+}
+
+static bool trans_VCLZ(DisasContext *s, arg_2misc *a)
+{
+ static NeonGenOneOpFn * const fn[] = {
+ gen_helper_neon_clz_u8,
+ gen_helper_neon_clz_u16,
+ do_VCLZ_32,
+ NULL,
+ };
+ return do_2misc(s, a, fn[a->size]);
+}
+
+static bool trans_VCNT(DisasContext *s, arg_2misc *a)
+{
+ if (a->size != 0) {
+ return false;
+ }
+ return do_2misc(s, a, gen_helper_neon_cnt_u8);
+}
+
+static bool trans_VABS_F(DisasContext *s, arg_2misc *a)
+{
+ if (a->size != 2) {
+ return false;
+ }
+ /* TODO: FP16 : size == 1 */
+ return do_2misc(s, a, gen_helper_vfp_abss);
+}
+
+static bool trans_VNEG_F(DisasContext *s, arg_2misc *a)
+{
+ if (a->size != 2) {
+ return false;
+ }
+ /* TODO: FP16 : size == 1 */
+ return do_2misc(s, a, gen_helper_vfp_negs);
+}
+
+static bool trans_VRECPE(DisasContext *s, arg_2misc *a)
+{
+ if (a->size != 2) {
+ return false;
+ }
+ return do_2misc(s, a, gen_helper_recpe_u32);
+}
+
+static bool trans_VRSQRTE(DisasContext *s, arg_2misc *a)
+{
+ if (a->size != 2) {
+ return false;
+ }
+ return do_2misc(s, a, gen_helper_rsqrte_u32);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 5b50eddd111..17373743889 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4938,6 +4938,13 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_SHA1SU1:
case NEON_2RM_VREV32:
case NEON_2RM_VREV16:
+ case NEON_2RM_VCLS:
+ case NEON_2RM_VCLZ:
+ case NEON_2RM_VCNT:
+ case NEON_2RM_VABS_F:
+ case NEON_2RM_VNEG_F:
+ case NEON_2RM_VRECPE:
+ case NEON_2RM_VRSQRTE:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -4959,25 +4966,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
for (pass = 0; pass < (q ? 4 : 2); pass++) {
tmp = neon_load_reg(rm, pass);
switch (op) {
- case NEON_2RM_VCLS:
- switch (size) {
- case 0: gen_helper_neon_cls_s8(tmp, tmp); break;
- case 1: gen_helper_neon_cls_s16(tmp, tmp); break;
- case 2: gen_helper_neon_cls_s32(tmp, tmp); break;
- default: abort();
- }
- break;
- case NEON_2RM_VCLZ:
- switch (size) {
- case 0: gen_helper_neon_clz_u8(tmp, tmp); break;
- case 1: gen_helper_neon_clz_u16(tmp, tmp); break;
- case 2: tcg_gen_clzi_i32(tmp, tmp, 32); break;
- default: abort();
- }
- break;
- case NEON_2RM_VCNT:
- gen_helper_neon_cnt_u8(tmp, tmp);
- break;
case NEON_2RM_VQABS:
switch (size) {
case 0:
@@ -5051,12 +5039,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
tcg_temp_free_ptr(fpstatus);
break;
}
- case NEON_2RM_VABS_F:
- gen_helper_vfp_abss(tmp, tmp);
- break;
- case NEON_2RM_VNEG_F:
- gen_helper_vfp_negs(tmp, tmp);
- break;
case NEON_2RM_VSWP:
tmp2 = neon_load_reg(rd, pass);
neon_store_reg(rm, pass, tmp2);
@@ -5137,12 +5119,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
tcg_temp_free_ptr(fpst);
break;
}
- case NEON_2RM_VRECPE:
- gen_helper_recpe_u32(tmp, tmp);
- break;
- case NEON_2RM_VRSQRTE:
- gen_helper_rsqrte_u32(tmp, tmp);
- break;
case NEON_2RM_VRECPE_F:
{
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 14/21] target/arm: Convert Neon VQABS, VQNEG to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (12 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 13/21] target/arm: Convert remaining simple 2-reg-misc Neon ops Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 23:42 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 15/21] target/arm: Convert simple fp Neon 2-reg-misc insns Peter Maydell
` (8 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the Neon VQABS and VQNEG insns to decodetree.
Since these are the only ones which need cpu_env passing to
the helper, we wrap the helper rather than creating a whole
new do_2misc_env() function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 3 +++
target/arm/translate-neon.inc.c | 35 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 30 ++--------------------------
3 files changed, 40 insertions(+), 28 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index f947f7d09f0..f0bb34a49eb 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -465,6 +465,9 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
VPADAL_S 1111 001 11 . 11 .. 00 .... 0 1100 . . 0 .... @2misc
VPADAL_U 1111 001 11 . 11 .. 00 .... 0 1101 . . 0 .... @2misc
+ VQABS 1111 001 11 . 11 .. 00 .... 0 1110 . . 0 .... @2misc
+ VQNEG 1111 001 11 . 11 .. 00 .... 0 1111 . . 0 .... @2misc
+
VCGT0 1111 001 11 . 11 .. 01 .... 0 0000 . . 0 .... @2misc
VCGE0 1111 001 11 . 11 .. 01 .... 0 0001 . . 0 .... @2misc
VCEQ0 1111 001 11 . 11 .. 01 .... 0 0010 . . 0 .... @2misc
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 336c2b312eb..2b5dc86f628 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3671,3 +3671,38 @@ static bool trans_VRSQRTE(DisasContext *s, arg_2misc *a)
}
return do_2misc(s, a, gen_helper_rsqrte_u32);
}
+
+#define WRAP_1OP_ENV_FN(WRAPNAME, FUNC) \
+ static void WRAPNAME(TCGv_i32 d, TCGv_i32 m) \
+ { \
+ FUNC(d, cpu_env, m); \
+ }
+
+WRAP_1OP_ENV_FN(gen_VQABS_s8, gen_helper_neon_qabs_s8)
+WRAP_1OP_ENV_FN(gen_VQABS_s16, gen_helper_neon_qabs_s16)
+WRAP_1OP_ENV_FN(gen_VQABS_s32, gen_helper_neon_qabs_s32)
+WRAP_1OP_ENV_FN(gen_VQNEG_s8, gen_helper_neon_qneg_s8)
+WRAP_1OP_ENV_FN(gen_VQNEG_s16, gen_helper_neon_qneg_s16)
+WRAP_1OP_ENV_FN(gen_VQNEG_s32, gen_helper_neon_qneg_s32)
+
+static bool trans_VQABS(DisasContext *s, arg_2misc *a)
+{
+ static NeonGenOneOpFn * const fn[] = {
+ gen_VQABS_s8,
+ gen_VQABS_s16,
+ gen_VQABS_s32,
+ NULL,
+ };
+ return do_2misc(s, a, fn[a->size]);
+}
+
+static bool trans_VQNEG(DisasContext *s, arg_2misc *a)
+{
+ static NeonGenOneOpFn * const fn[] = {
+ gen_VQNEG_s8,
+ gen_VQNEG_s16,
+ gen_VQNEG_s32,
+ NULL,
+ };
+ return do_2misc(s, a, fn[a->size]);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 17373743889..3cbd2ab0c96 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4945,6 +4945,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_VNEG_F:
case NEON_2RM_VRECPE:
case NEON_2RM_VRSQRTE:
+ case NEON_2RM_VQABS:
+ case NEON_2RM_VQNEG:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -4966,34 +4968,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
for (pass = 0; pass < (q ? 4 : 2); pass++) {
tmp = neon_load_reg(rm, pass);
switch (op) {
- case NEON_2RM_VQABS:
- switch (size) {
- case 0:
- gen_helper_neon_qabs_s8(tmp, cpu_env, tmp);
- break;
- case 1:
- gen_helper_neon_qabs_s16(tmp, cpu_env, tmp);
- break;
- case 2:
- gen_helper_neon_qabs_s32(tmp, cpu_env, tmp);
- break;
- default: abort();
- }
- break;
- case NEON_2RM_VQNEG:
- switch (size) {
- case 0:
- gen_helper_neon_qneg_s8(tmp, cpu_env, tmp);
- break;
- case 1:
- gen_helper_neon_qneg_s16(tmp, cpu_env, tmp);
- break;
- case 2:
- gen_helper_neon_qneg_s32(tmp, cpu_env, tmp);
- break;
- default: abort();
- }
- break;
case NEON_2RM_VCGT0_F:
{
TCGv_ptr fpstatus = get_fpstatus_ptr(1);
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 14/21] target/arm: Convert Neon VQABS, VQNEG to decodetree
2020-06-16 17:08 ` [PATCH 14/21] target/arm: Convert Neon VQABS, VQNEG to decodetree Peter Maydell
@ 2020-06-19 23:42 ` Richard Henderson
0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:42 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon VQABS and VQNEG insns to decodetree.
> Since these are the only ones which need cpu_env passing to
> the helper, we wrap the helper rather than creating a whole
> new do_2misc_env() function.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/neon-dp.decode | 3 +++
> target/arm/translate-neon.inc.c | 35 +++++++++++++++++++++++++++++++++
> target/arm/translate.c | 30 ++--------------------------
> 3 files changed, 40 insertions(+), 28 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH 15/21] target/arm: Convert simple fp Neon 2-reg-misc insns
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (13 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 14/21] target/arm: Convert Neon VQABS, VQNEG to decodetree Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 23:44 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 16/21] target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree Peter Maydell
` (7 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the Neon 2-reg-misc insns which are implemented with
simple calls to functions that take the input, output and
fpstatus pointer.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/translate.h | 1 +
target/arm/neon-dp.decode | 8 +++++
target/arm/translate-neon.inc.c | 62 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 56 ++++-------------------------
4 files changed, 78 insertions(+), 49 deletions(-)
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 4dbeee4c89f..19650a9e2d7 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -373,6 +373,7 @@ typedef void NeonGenNarrowFn(TCGv_i32, TCGv_i64);
typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
typedef void NeonGenTwoOpWidenFn(TCGv_i64, TCGv_i32, TCGv_i32);
+typedef void NeonGenOneSingleOpFn(TCGv_i32, TCGv_i32, TCGv_ptr);
typedef void NeonGenTwoSingleOpFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
typedef void NeonGenTwoDoubleOpFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
typedef void NeonGenOne64OpFn(TCGv_i64, TCGv_i64);
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index f0bb34a49eb..ea8d5fd99c3 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -497,11 +497,19 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
SHA1SU1 1111 001 11 . 11 .. 10 .... 0 0111 0 . 0 .... @2misc_q1
SHA256SU0 1111 001 11 . 11 .. 10 .... 0 0111 1 . 0 .... @2misc_q1
+ VRINTX 1111 001 11 . 11 .. 10 .... 0 1001 . . 0 .... @2misc
+
VCVT_F16_F32 1111 001 11 . 11 .. 10 .... 0 1100 0 . 0 .... @2misc_q0
VCVT_F32_F16 1111 001 11 . 11 .. 10 .... 0 1110 0 . 0 .... @2misc_q0
VRECPE 1111 001 11 . 11 .. 11 .... 0 1000 . . 0 .... @2misc
VRSQRTE 1111 001 11 . 11 .. 11 .... 0 1001 . . 0 .... @2misc
+ VRECPE_F 1111 001 11 . 11 .. 11 .... 0 1010 . . 0 .... @2misc
+ VRSQRTE_F 1111 001 11 . 11 .. 11 .... 0 1011 . . 0 .... @2misc
+ VCVT_FS 1111 001 11 . 11 .. 11 .... 0 1100 . . 0 .... @2misc
+ VCVT_FU 1111 001 11 . 11 .. 11 .... 0 1101 . . 0 .... @2misc
+ VCVT_SF 1111 001 11 . 11 .. 11 .... 0 1110 . . 0 .... @2misc
+ VCVT_UF 1111 001 11 . 11 .. 11 .... 0 1111 . . 0 .... @2misc
]
# Subgroup for size != 0b11
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 2b5dc86f628..ab183e47d7d 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3706,3 +3706,65 @@ static bool trans_VQNEG(DisasContext *s, arg_2misc *a)
};
return do_2misc(s, a, fn[a->size]);
}
+
+static bool do_2misc_fp(DisasContext *s, arg_2misc *a,
+ NeonGenOneSingleOpFn *fn)
+{
+ int pass;
+ TCGv_ptr fpst;
+
+ /* Handle a 2-reg-misc operation by iterating 32 bits at a time */
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if (a->size != 2) {
+ /* TODO: FP16 will be the size == 1 case */
+ return false;
+ }
+
+ if ((a->vd | a->vm) & a->q) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ fpst = get_fpstatus_ptr(1);
+ for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
+ TCGv_i32 tmp = neon_load_reg(a->vm, pass);
+ fn(tmp, tmp, fpst);
+ neon_store_reg(a->vd, pass, tmp);
+ }
+ tcg_temp_free_ptr(fpst);
+
+ return true;
+}
+
+#define DO_2MISC_FP(INSN, FUNC) \
+ static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
+ { \
+ return do_2misc_fp(s, a, FUNC); \
+ }
+
+DO_2MISC_FP(VRECPE_F, gen_helper_recpe_f32)
+DO_2MISC_FP(VRSQRTE_F, gen_helper_rsqrte_f32)
+DO_2MISC_FP(VCVT_FS, gen_helper_vfp_sitos)
+DO_2MISC_FP(VCVT_FU, gen_helper_vfp_uitos)
+DO_2MISC_FP(VCVT_SF, gen_helper_vfp_tosizs)
+DO_2MISC_FP(VCVT_UF, gen_helper_vfp_touizs)
+
+static bool trans_VRINTX(DisasContext *s, arg_2misc *a)
+{
+ if (!arm_dc_feature(s, ARM_FEATURE_V8)) {
+ return false;
+ }
+ return do_2misc_fp(s, a, gen_helper_rints_exact);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 3cbd2ab0c96..48377860c75 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4947,6 +4947,13 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_VRSQRTE:
case NEON_2RM_VQABS:
case NEON_2RM_VQNEG:
+ case NEON_2RM_VRECPE_F:
+ case NEON_2RM_VRSQRTE_F:
+ case NEON_2RM_VCVT_FS:
+ case NEON_2RM_VCVT_FU:
+ case NEON_2RM_VCVT_SF:
+ case NEON_2RM_VCVT_UF:
+ case NEON_2RM_VRINTX:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -5052,13 +5059,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
tcg_temp_free_i32(tcg_rmode);
break;
}
- case NEON_2RM_VRINTX:
- {
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- gen_helper_rints_exact(tmp, tmp, fpstatus);
- tcg_temp_free_ptr(fpstatus);
- break;
- }
case NEON_2RM_VCVTAU:
case NEON_2RM_VCVTAS:
case NEON_2RM_VCVTNU:
@@ -5093,48 +5093,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
tcg_temp_free_ptr(fpst);
break;
}
- case NEON_2RM_VRECPE_F:
- {
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- gen_helper_recpe_f32(tmp, tmp, fpstatus);
- tcg_temp_free_ptr(fpstatus);
- break;
- }
- case NEON_2RM_VRSQRTE_F:
- {
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- gen_helper_rsqrte_f32(tmp, tmp, fpstatus);
- tcg_temp_free_ptr(fpstatus);
- break;
- }
- case NEON_2RM_VCVT_FS: /* VCVT.F32.S32 */
- {
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- gen_helper_vfp_sitos(tmp, tmp, fpstatus);
- tcg_temp_free_ptr(fpstatus);
- break;
- }
- case NEON_2RM_VCVT_FU: /* VCVT.F32.U32 */
- {
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- gen_helper_vfp_uitos(tmp, tmp, fpstatus);
- tcg_temp_free_ptr(fpstatus);
- break;
- }
- case NEON_2RM_VCVT_SF: /* VCVT.S32.F32 */
- {
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- gen_helper_vfp_tosizs(tmp, tmp, fpstatus);
- tcg_temp_free_ptr(fpstatus);
- break;
- }
- case NEON_2RM_VCVT_UF: /* VCVT.U32.F32 */
- {
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- gen_helper_vfp_touizs(tmp, tmp, fpstatus);
- tcg_temp_free_ptr(fpstatus);
- break;
- }
default:
/* Reserved op values were caught by the
* neon_2rm_sizes[] check earlier.
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 15/21] target/arm: Convert simple fp Neon 2-reg-misc insns
2020-06-16 17:08 ` [PATCH 15/21] target/arm: Convert simple fp Neon 2-reg-misc insns Peter Maydell
@ 2020-06-19 23:44 ` Richard Henderson
0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:44 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon 2-reg-misc insns which are implemented with
> simple calls to functions that take the input, output and
> fpstatus pointer.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/translate.h | 1 +
> target/arm/neon-dp.decode | 8 +++++
> target/arm/translate-neon.inc.c | 62 +++++++++++++++++++++++++++++++++
> target/arm/translate.c | 56 ++++-------------------------
> 4 files changed, 78 insertions(+), 49 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH 16/21] target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (14 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 15/21] target/arm: Convert simple fp Neon 2-reg-misc insns Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 23:46 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 17/21] target/arm: Convert Neon 2-reg-misc VRINT " Peter Maydell
` (6 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the fp-compare-with-zero insns in the Neon 2-reg-misc group to
decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 6 ++++
target/arm/translate-neon.inc.c | 28 ++++++++++++++++++
target/arm/translate.c | 50 ++++-----------------------------
3 files changed, 39 insertions(+), 45 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index ea8d5fd99c3..c9acd00f1e8 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -479,6 +479,12 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
VABS 1111 001 11 . 11 .. 01 .... 0 0110 . . 0 .... @2misc
VNEG 1111 001 11 . 11 .. 01 .... 0 0111 . . 0 .... @2misc
+ VCGT0_F 1111 001 11 . 11 .. 01 .... 0 1000 . . 0 .... @2misc
+ VCGE0_F 1111 001 11 . 11 .. 01 .... 0 1001 . . 0 .... @2misc
+ VCEQ0_F 1111 001 11 . 11 .. 01 .... 0 1010 . . 0 .... @2misc
+ VCLE0_F 1111 001 11 . 11 .. 01 .... 0 1011 . . 0 .... @2misc
+ VCLT0_F 1111 001 11 . 11 .. 01 .... 0 1100 . . 0 .... @2misc
+
VABS_F 1111 001 11 . 11 .. 01 .... 0 1110 . . 0 .... @2misc
VNEG_F 1111 001 11 . 11 .. 01 .... 0 1111 . . 0 .... @2misc
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index ab183e47d7d..a62da21b152 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3768,3 +3768,31 @@ static bool trans_VRINTX(DisasContext *s, arg_2misc *a)
}
return do_2misc_fp(s, a, gen_helper_rints_exact);
}
+
+#define WRAP_FP_CMP0_FWD(WRAPNAME, FUNC) \
+ static void WRAPNAME(TCGv_i32 d, TCGv_i32 m, TCGv_ptr fpst) \
+ { \
+ TCGv_i32 zero = tcg_const_i32(0); \
+ FUNC(d, m, zero, fpst); \
+ tcg_temp_free_i32(zero); \
+ }
+#define WRAP_FP_CMP0_REV(WRAPNAME, FUNC) \
+ static void WRAPNAME(TCGv_i32 d, TCGv_i32 m, TCGv_ptr fpst) \
+ { \
+ TCGv_i32 zero = tcg_const_i32(0); \
+ FUNC(d, zero, m, fpst); \
+ tcg_temp_free_i32(zero); \
+ }
+
+#define DO_FP_CMP0(INSN, FUNC, REV) \
+ WRAP_FP_CMP0_##REV(gen_##INSN, FUNC) \
+ static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
+ { \
+ return do_2misc_fp(s, a, gen_##INSN); \
+ }
+
+DO_FP_CMP0(VCGT0_F, gen_helper_neon_cgt_f32, FWD)
+DO_FP_CMP0(VCGE0_F, gen_helper_neon_cge_f32, FWD)
+DO_FP_CMP0(VCEQ0_F, gen_helper_neon_ceq_f32, FWD)
+DO_FP_CMP0(VCLE0_F, gen_helper_neon_cge_f32, REV)
+DO_FP_CMP0(VCLT0_F, gen_helper_neon_cgt_f32, REV)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 48377860c75..dc98928856d 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4954,6 +4954,11 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_VCVT_SF:
case NEON_2RM_VCVT_UF:
case NEON_2RM_VRINTX:
+ case NEON_2RM_VCGT0_F:
+ case NEON_2RM_VCGE0_F:
+ case NEON_2RM_VCEQ0_F:
+ case NEON_2RM_VCLE0_F:
+ case NEON_2RM_VCLT0_F:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -4975,51 +4980,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
for (pass = 0; pass < (q ? 4 : 2); pass++) {
tmp = neon_load_reg(rm, pass);
switch (op) {
- case NEON_2RM_VCGT0_F:
- {
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- tmp2 = tcg_const_i32(0);
- gen_helper_neon_cgt_f32(tmp, tmp, tmp2, fpstatus);
- tcg_temp_free_i32(tmp2);
- tcg_temp_free_ptr(fpstatus);
- break;
- }
- case NEON_2RM_VCGE0_F:
- {
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- tmp2 = tcg_const_i32(0);
- gen_helper_neon_cge_f32(tmp, tmp, tmp2, fpstatus);
- tcg_temp_free_i32(tmp2);
- tcg_temp_free_ptr(fpstatus);
- break;
- }
- case NEON_2RM_VCEQ0_F:
- {
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- tmp2 = tcg_const_i32(0);
- gen_helper_neon_ceq_f32(tmp, tmp, tmp2, fpstatus);
- tcg_temp_free_i32(tmp2);
- tcg_temp_free_ptr(fpstatus);
- break;
- }
- case NEON_2RM_VCLE0_F:
- {
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- tmp2 = tcg_const_i32(0);
- gen_helper_neon_cge_f32(tmp, tmp2, tmp, fpstatus);
- tcg_temp_free_i32(tmp2);
- tcg_temp_free_ptr(fpstatus);
- break;
- }
- case NEON_2RM_VCLT0_F:
- {
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- tmp2 = tcg_const_i32(0);
- gen_helper_neon_cgt_f32(tmp, tmp2, tmp, fpstatus);
- tcg_temp_free_i32(tmp2);
- tcg_temp_free_ptr(fpstatus);
- break;
- }
case NEON_2RM_VSWP:
tmp2 = neon_load_reg(rd, pass);
neon_store_reg(rm, pass, tmp2);
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 17/21] target/arm: Convert Neon 2-reg-misc VRINT insns to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (15 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 16/21] target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 23:49 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 18/21] target/arm: Convert Neon 2-reg-misc VCVT " Peter Maydell
` (5 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the Neon 2-reg-misc VRINT insns to decodetree.
Giving these insns their own do_vrint() function allows us
to change the rounding mode just once at the start and end
rather than doing it for every element in the vector.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 8 +++++
target/arm/translate-neon.inc.c | 61 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 31 +++--------------
3 files changed, 74 insertions(+), 26 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index c9acd00f1e8..e0717c7e4a6 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -503,11 +503,19 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
SHA1SU1 1111 001 11 . 11 .. 10 .... 0 0111 0 . 0 .... @2misc_q1
SHA256SU0 1111 001 11 . 11 .. 10 .... 0 0111 1 . 0 .... @2misc_q1
+ VRINTN 1111 001 11 . 11 .. 10 .... 0 1000 . . 0 .... @2misc
VRINTX 1111 001 11 . 11 .. 10 .... 0 1001 . . 0 .... @2misc
+ VRINTA 1111 001 11 . 11 .. 10 .... 0 1010 . . 0 .... @2misc
+ VRINTZ 1111 001 11 . 11 .. 10 .... 0 1011 . . 0 .... @2misc
VCVT_F16_F32 1111 001 11 . 11 .. 10 .... 0 1100 0 . 0 .... @2misc_q0
+
+ VRINTM 1111 001 11 . 11 .. 10 .... 0 1101 . . 0 .... @2misc
+
VCVT_F32_F16 1111 001 11 . 11 .. 10 .... 0 1110 0 . 0 .... @2misc_q0
+ VRINTP 1111 001 11 . 11 .. 10 .... 0 1111 . . 0 .... @2misc
+
VRECPE 1111 001 11 . 11 .. 11 .... 0 1000 . . 0 .... @2misc
VRSQRTE 1111 001 11 . 11 .. 11 .... 0 1001 . . 0 .... @2misc
VRECPE_F 1111 001 11 . 11 .. 11 .... 0 1010 . . 0 .... @2misc
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index a62da21b152..0e7f86ad156 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3796,3 +3796,64 @@ DO_FP_CMP0(VCGE0_F, gen_helper_neon_cge_f32, FWD)
DO_FP_CMP0(VCEQ0_F, gen_helper_neon_ceq_f32, FWD)
DO_FP_CMP0(VCLE0_F, gen_helper_neon_cge_f32, REV)
DO_FP_CMP0(VCLT0_F, gen_helper_neon_cgt_f32, REV)
+
+static bool do_vrint(DisasContext *s, arg_2misc *a, int rmode)
+{
+ /*
+ * Handle a VRINT* operation by iterating 32 bits at a time,
+ * with a specified rounding mode in operation.
+ */
+ int pass;
+ TCGv_ptr fpst;
+ TCGv_i32 tcg_rmode;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+ !arm_dc_feature(s, ARM_FEATURE_V8)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if (a->size != 2) {
+ /* TODO: FP16 will be the size == 1 case */
+ return false;
+ }
+
+ if ((a->vd | a->vm) & a->q) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ fpst = get_fpstatus_ptr(1);
+ tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode));
+ gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, cpu_env);
+ for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
+ TCGv_i32 tmp = neon_load_reg(a->vm, pass);
+ gen_helper_rints(tmp, tmp, fpst);
+ neon_store_reg(a->vd, pass, tmp);
+ }
+ gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, cpu_env);
+ tcg_temp_free_i32(tcg_rmode);
+ tcg_temp_free_ptr(fpst);
+
+ return true;
+}
+
+#define DO_VRINT(INSN, RMODE) \
+ static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
+ { \
+ return do_vrint(s, a, RMODE); \
+ }
+
+DO_VRINT(VRINTN, FPROUNDING_TIEEVEN)
+DO_VRINT(VRINTA, FPROUNDING_TIEAWAY)
+DO_VRINT(VRINTZ, FPROUNDING_ZERO)
+DO_VRINT(VRINTM, FPROUNDING_NEGINF)
+DO_VRINT(VRINTP, FPROUNDING_POSINF)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index dc98928856d..61dfc3ae7af 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4959,6 +4959,11 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_VCEQ0_F:
case NEON_2RM_VCLE0_F:
case NEON_2RM_VCLT0_F:
+ case NEON_2RM_VRINTN:
+ case NEON_2RM_VRINTA:
+ case NEON_2RM_VRINTM:
+ case NEON_2RM_VRINTP:
+ case NEON_2RM_VRINTZ:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -4993,32 +4998,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
}
neon_store_reg(rm, pass, tmp2);
break;
- case NEON_2RM_VRINTN:
- case NEON_2RM_VRINTA:
- case NEON_2RM_VRINTM:
- case NEON_2RM_VRINTP:
- case NEON_2RM_VRINTZ:
- {
- TCGv_i32 tcg_rmode;
- TCGv_ptr fpstatus = get_fpstatus_ptr(1);
- int rmode;
-
- if (op == NEON_2RM_VRINTZ) {
- rmode = FPROUNDING_ZERO;
- } else {
- rmode = fp_decode_rm[((op & 0x6) >> 1) ^ 1];
- }
-
- tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode));
- gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
- cpu_env);
- gen_helper_rints(tmp, tmp, fpstatus);
- gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
- cpu_env);
- tcg_temp_free_ptr(fpstatus);
- tcg_temp_free_i32(tcg_rmode);
- break;
- }
case NEON_2RM_VCVTAU:
case NEON_2RM_VCVTAS:
case NEON_2RM_VCVTNU:
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 17/21] target/arm: Convert Neon 2-reg-misc VRINT insns to decodetree
2020-06-16 17:08 ` [PATCH 17/21] target/arm: Convert Neon 2-reg-misc VRINT " Peter Maydell
@ 2020-06-19 23:49 ` Richard Henderson
0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-19 23:49 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon 2-reg-misc VRINT insns to decodetree.
> Giving these insns their own do_vrint() function allows us
> to change the rounding mode just once at the start and end
> rather than doing it for every element in the vector.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/neon-dp.decode | 8 +++++
> target/arm/translate-neon.inc.c | 61 +++++++++++++++++++++++++++++++++
> target/arm/translate.c | 31 +++--------------
> 3 files changed, 74 insertions(+), 26 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH 18/21] target/arm: Convert Neon 2-reg-misc VCVT insns to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (16 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 17/21] target/arm: Convert Neon 2-reg-misc VRINT " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-19 23:52 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 19/21] target/arm: Convert Neon VSWP " Peter Maydell
` (4 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the VCVT instructions in the 2-reg-misc grouping to
decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 9 +++++
target/arm/translate-neon.inc.c | 70 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 70 ++++-----------------------------
3 files changed, 87 insertions(+), 62 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index e0717c7e4a6..5507c3e4623 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -516,6 +516,15 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
VRINTP 1111 001 11 . 11 .. 10 .... 0 1111 . . 0 .... @2misc
+ VCVTAS 1111 001 11 . 11 .. 11 .... 0 0000 . . 0 .... @2misc
+ VCVTAU 1111 001 11 . 11 .. 11 .... 0 0001 . . 0 .... @2misc
+ VCVTNS 1111 001 11 . 11 .. 11 .... 0 0010 . . 0 .... @2misc
+ VCVTNU 1111 001 11 . 11 .. 11 .... 0 0011 . . 0 .... @2misc
+ VCVTPS 1111 001 11 . 11 .. 11 .... 0 0100 . . 0 .... @2misc
+ VCVTPU 1111 001 11 . 11 .. 11 .... 0 0101 . . 0 .... @2misc
+ VCVTMS 1111 001 11 . 11 .. 11 .... 0 0110 . . 0 .... @2misc
+ VCVTMU 1111 001 11 . 11 .. 11 .... 0 0111 . . 0 .... @2misc
+
VRECPE 1111 001 11 . 11 .. 11 .... 0 1000 . . 0 .... @2misc
VRSQRTE 1111 001 11 . 11 .. 11 .... 0 1001 . . 0 .... @2misc
VRECPE_F 1111 001 11 . 11 .. 11 .... 0 1010 . . 0 .... @2misc
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 0e7f86ad156..29bc161f36a 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3857,3 +3857,73 @@ DO_VRINT(VRINTA, FPROUNDING_TIEAWAY)
DO_VRINT(VRINTZ, FPROUNDING_ZERO)
DO_VRINT(VRINTM, FPROUNDING_NEGINF)
DO_VRINT(VRINTP, FPROUNDING_POSINF)
+
+static bool do_vcvt(DisasContext *s, arg_2misc *a, int rmode, bool is_signed)
+{
+ /*
+ * Handle a VCVT* operation by iterating 32 bits at a time,
+ * with a specified rounding mode in operation.
+ */
+ int pass;
+ TCGv_ptr fpst;
+ TCGv_i32 tcg_rmode, tcg_shift;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON) ||
+ !arm_dc_feature(s, ARM_FEATURE_V8)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if (a->size != 2) {
+ /* TODO: FP16 will be the size == 1 case */
+ return false;
+ }
+
+ if ((a->vd | a->vm) & a->q) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ fpst = get_fpstatus_ptr(1);
+ tcg_shift = tcg_const_i32(0);
+ tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode));
+ gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, cpu_env);
+ for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
+ TCGv_i32 tmp = neon_load_reg(a->vm, pass);
+ if (is_signed) {
+ gen_helper_vfp_tosls(tmp, tmp, tcg_shift, fpst);
+ } else {
+ gen_helper_vfp_touls(tmp, tmp, tcg_shift, fpst);
+ }
+ neon_store_reg(a->vd, pass, tmp);
+ }
+ gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, cpu_env);
+ tcg_temp_free_i32(tcg_rmode);
+ tcg_temp_free_i32(tcg_shift);
+ tcg_temp_free_ptr(fpst);
+
+ return true;
+}
+
+#define DO_VCVT(INSN, RMODE, SIGNED) \
+ static bool trans_##INSN(DisasContext *s, arg_2misc *a) \
+ { \
+ return do_vcvt(s, a, RMODE, SIGNED); \
+ }
+
+DO_VCVT(VCVTAU, FPROUNDING_TIEAWAY, false)
+DO_VCVT(VCVTAS, FPROUNDING_TIEAWAY, true)
+DO_VCVT(VCVTNU, FPROUNDING_TIEEVEN, false)
+DO_VCVT(VCVTNS, FPROUNDING_TIEEVEN, true)
+DO_VCVT(VCVTPU, FPROUNDING_POSINF, false)
+DO_VCVT(VCVTPS, FPROUNDING_POSINF, true)
+DO_VCVT(VCVTMU, FPROUNDING_NEGINF, false)
+DO_VCVT(VCVTMS, FPROUNDING_NEGINF, true)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 61dfc3ae7af..b0181062020 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -3042,30 +3042,6 @@ static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
#define NEON_2RM_VCVT_SF 62
#define NEON_2RM_VCVT_UF 63
-static bool neon_2rm_is_v8_op(int op)
-{
- /* Return true if this neon 2reg-misc op is ARMv8 and up */
- switch (op) {
- case NEON_2RM_VRINTN:
- case NEON_2RM_VRINTA:
- case NEON_2RM_VRINTM:
- case NEON_2RM_VRINTP:
- case NEON_2RM_VRINTZ:
- case NEON_2RM_VRINTX:
- case NEON_2RM_VCVTAU:
- case NEON_2RM_VCVTAS:
- case NEON_2RM_VCVTNU:
- case NEON_2RM_VCVTNS:
- case NEON_2RM_VCVTPU:
- case NEON_2RM_VCVTPS:
- case NEON_2RM_VCVTMU:
- case NEON_2RM_VCVTMS:
- return true;
- default:
- return false;
- }
-}
-
/* Each entry in this array has bit n set if the insn allows
* size value n (otherwise it will UNDEF). Since unallocated
* op values will have no bits set they always UNDEF.
@@ -4908,10 +4884,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
if ((neon_2rm_sizes[op] & (1 << size)) == 0) {
return 1;
}
- if (neon_2rm_is_v8_op(op) &&
- !arm_dc_feature(s, ARM_FEATURE_V8)) {
- return 1;
- }
if (q && ((rm | rd) & 1)) {
return 1;
}
@@ -4964,6 +4936,14 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_VRINTM:
case NEON_2RM_VRINTP:
case NEON_2RM_VRINTZ:
+ case NEON_2RM_VCVTAU:
+ case NEON_2RM_VCVTAS:
+ case NEON_2RM_VCVTNU:
+ case NEON_2RM_VCVTNS:
+ case NEON_2RM_VCVTPU:
+ case NEON_2RM_VCVTPS:
+ case NEON_2RM_VCVTMU:
+ case NEON_2RM_VCVTMS:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -4998,40 +4978,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
}
neon_store_reg(rm, pass, tmp2);
break;
- case NEON_2RM_VCVTAU:
- case NEON_2RM_VCVTAS:
- case NEON_2RM_VCVTNU:
- case NEON_2RM_VCVTNS:
- case NEON_2RM_VCVTPU:
- case NEON_2RM_VCVTPS:
- case NEON_2RM_VCVTMU:
- case NEON_2RM_VCVTMS:
- {
- bool is_signed = !extract32(insn, 7, 1);
- TCGv_ptr fpst = get_fpstatus_ptr(1);
- TCGv_i32 tcg_rmode, tcg_shift;
- int rmode = fp_decode_rm[extract32(insn, 8, 2)];
-
- tcg_shift = tcg_const_i32(0);
- tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode));
- gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
- cpu_env);
-
- if (is_signed) {
- gen_helper_vfp_tosls(tmp, tmp,
- tcg_shift, fpst);
- } else {
- gen_helper_vfp_touls(tmp, tmp,
- tcg_shift, fpst);
- }
-
- gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
- cpu_env);
- tcg_temp_free_i32(tcg_rmode);
- tcg_temp_free_i32(tcg_shift);
- tcg_temp_free_ptr(fpst);
- break;
- }
default:
/* Reserved op values were caught by the
* neon_2rm_sizes[] check earlier.
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 19/21] target/arm: Convert Neon VSWP to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (17 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 18/21] target/arm: Convert Neon 2-reg-misc VCVT " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-20 0:16 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 20/21] target/arm: Convert Neon VTRN " Peter Maydell
` (3 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the Neon VSWP insn to decodetree. Since the new implementation
doesn't have to share a pass-loop with the other 2-reg-misc operations
we can implement the swap with 64-bit accesses rather than 32-bits
(which brings us into line with the pseudocode and is more efficient).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 2 ++
target/arm/translate-neon.inc.c | 41 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 5 +---
3 files changed, 44 insertions(+), 4 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 5507c3e4623..2f64841de52 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -488,6 +488,8 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
VABS_F 1111 001 11 . 11 .. 01 .... 0 1110 . . 0 .... @2misc
VNEG_F 1111 001 11 . 11 .. 01 .... 0 1111 . . 0 .... @2misc
+ VSWP 1111 001 11 . 11 .. 10 .... 0 0000 . . 0 .... @2misc
+
VUZP 1111 001 11 . 11 .. 10 .... 0 0010 . . 0 .... @2misc
VZIP 1111 001 11 . 11 .. 10 .... 0 0011 . . 0 .... @2misc
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 29bc161f36a..01da7fad462 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3927,3 +3927,44 @@ DO_VCVT(VCVTPU, FPROUNDING_POSINF, false)
DO_VCVT(VCVTPS, FPROUNDING_POSINF, true)
DO_VCVT(VCVTMU, FPROUNDING_NEGINF, false)
DO_VCVT(VCVTMS, FPROUNDING_NEGINF, true)
+
+static bool trans_VSWP(DisasContext *s, arg_2misc *a)
+{
+ TCGv_i64 rm, rd;
+ int pass;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if (a->size != 0) {
+ return false;
+ }
+
+ if ((a->vd | a->vm) & a->q) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ rm = tcg_temp_new_i64();
+ rd = tcg_temp_new_i64();
+ for (pass = 0; pass < (a->q ? 2 : 1); pass++) {
+ neon_load_reg64(rm, a->vm + pass);
+ neon_load_reg64(rd, a->vd + pass);
+ neon_store_reg64(rm, a->vd + pass);
+ neon_store_reg64(rd, a->vm + pass);
+ }
+ tcg_temp_free_i64(rm);
+ tcg_temp_free_i64(rd);
+
+ return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index b0181062020..e8cd4a9c61f 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4944,6 +4944,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case NEON_2RM_VCVTPS:
case NEON_2RM_VCVTMU:
case NEON_2RM_VCVTMS:
+ case NEON_2RM_VSWP:
/* handled by decodetree */
return 1;
case NEON_2RM_VTRN:
@@ -4965,10 +4966,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
for (pass = 0; pass < (q ? 4 : 2); pass++) {
tmp = neon_load_reg(rm, pass);
switch (op) {
- case NEON_2RM_VSWP:
- tmp2 = neon_load_reg(rd, pass);
- neon_store_reg(rm, pass, tmp2);
- break;
case NEON_2RM_VTRN:
tmp2 = neon_load_reg(rd, pass);
switch (size) {
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 19/21] target/arm: Convert Neon VSWP to decodetree
2020-06-16 17:08 ` [PATCH 19/21] target/arm: Convert Neon VSWP " Peter Maydell
@ 2020-06-20 0:16 ` Richard Henderson
0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-20 0:16 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon VSWP insn to decodetree. Since the new implementation
> doesn't have to share a pass-loop with the other 2-reg-misc operations
> we can implement the swap with 64-bit accesses rather than 32-bits
> (which brings us into line with the pseudocode and is more efficient).
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/neon-dp.decode | 2 ++
> target/arm/translate-neon.inc.c | 41 +++++++++++++++++++++++++++++++++
> target/arm/translate.c | 5 +---
> 3 files changed, 44 insertions(+), 4 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH 20/21] target/arm: Convert Neon VTRN to decodetree
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (18 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 19/21] target/arm: Convert Neon VSWP " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-20 0:20 ` Richard Henderson
2020-06-16 17:08 ` [PATCH 21/21] target/arm: Move some functions used only in translate-neon.inc.c to that file Peter Maydell
` (2 subsequent siblings)
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the Neon VTRN insn to decodetree. This is the last insn in the
Neon data-processing group, so we can remove all the now-unused old
decoder framework.
It's possible that there's a more efficient implementation of
VTRN, but for this conversion we just copy the existing approach.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 2 +-
target/arm/translate-neon.inc.c | 90 ++++++++
target/arm/translate.c | 363 +-------------------------------
3 files changed, 93 insertions(+), 362 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 2f64841de52..686f9fbf46a 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -489,7 +489,7 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
VNEG_F 1111 001 11 . 11 .. 01 .... 0 1111 . . 0 .... @2misc
VSWP 1111 001 11 . 11 .. 10 .... 0 0000 . . 0 .... @2misc
-
+ VTRN 1111 001 11 . 11 .. 10 .... 0 0001 . . 0 .... @2misc
VUZP 1111 001 11 . 11 .. 10 .... 0 0010 . . 0 .... @2misc
VZIP 1111 001 11 . 11 .. 10 .... 0 0011 . . 0 .... @2misc
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 01da7fad462..8cc7f5db544 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -3968,3 +3968,93 @@ static bool trans_VSWP(DisasContext *s, arg_2misc *a)
return true;
}
+static void gen_neon_trn_u8(TCGv_i32 t0, TCGv_i32 t1)
+{
+ TCGv_i32 rd, tmp;
+
+ rd = tcg_temp_new_i32();
+ tmp = tcg_temp_new_i32();
+
+ tcg_gen_shli_i32(rd, t0, 8);
+ tcg_gen_andi_i32(rd, rd, 0xff00ff00);
+ tcg_gen_andi_i32(tmp, t1, 0x00ff00ff);
+ tcg_gen_or_i32(rd, rd, tmp);
+
+ tcg_gen_shri_i32(t1, t1, 8);
+ tcg_gen_andi_i32(t1, t1, 0x00ff00ff);
+ tcg_gen_andi_i32(tmp, t0, 0xff00ff00);
+ tcg_gen_or_i32(t1, t1, tmp);
+ tcg_gen_mov_i32(t0, rd);
+
+ tcg_temp_free_i32(tmp);
+ tcg_temp_free_i32(rd);
+}
+
+static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
+{
+ TCGv_i32 rd, tmp;
+
+ rd = tcg_temp_new_i32();
+ tmp = tcg_temp_new_i32();
+
+ tcg_gen_shli_i32(rd, t0, 16);
+ tcg_gen_andi_i32(tmp, t1, 0xffff);
+ tcg_gen_or_i32(rd, rd, tmp);
+ tcg_gen_shri_i32(t1, t1, 16);
+ tcg_gen_andi_i32(tmp, t0, 0xffff0000);
+ tcg_gen_or_i32(t1, t1, tmp);
+ tcg_gen_mov_i32(t0, rd);
+
+ tcg_temp_free_i32(tmp);
+ tcg_temp_free_i32(rd);
+}
+
+static bool trans_VTRN(DisasContext *s, arg_2misc *a)
+{
+ TCGv_i32 tmp, tmp2;
+ int pass;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if ((a->vd | a->vm) & a->q) {
+ return false;
+ }
+
+ if (a->size == 3) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ if (a->size == 2) {
+ for (pass = 0; pass < (a->q ? 4 : 2); pass += 2) {
+ tmp = neon_load_reg(a->vm, pass);
+ tmp2 = neon_load_reg(a->vd, pass + 1);
+ neon_store_reg(a->vm, pass, tmp2);
+ neon_store_reg(a->vd, pass + 1, tmp);
+ }
+ } else {
+ for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
+ tmp = neon_load_reg(a->vm, pass);
+ tmp2 = neon_load_reg(a->vd, pass);
+ if (a->size == 0) {
+ gen_neon_trn_u8(tmp, tmp2);
+ } else {
+ gen_neon_trn_u16(tmp, tmp2);
+ }
+ neon_store_reg(a->vm, pass, tmp2);
+ neon_store_reg(a->vd, pass, tmp);
+ }
+ }
+ return true;
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index e8cd4a9c61f..581b0b5cde4 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -2934,183 +2934,6 @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
gen_rfe(s, pc, load_cpu_field(spsr));
}
-static void gen_neon_trn_u8(TCGv_i32 t0, TCGv_i32 t1)
-{
- TCGv_i32 rd, tmp;
-
- rd = tcg_temp_new_i32();
- tmp = tcg_temp_new_i32();
-
- tcg_gen_shli_i32(rd, t0, 8);
- tcg_gen_andi_i32(rd, rd, 0xff00ff00);
- tcg_gen_andi_i32(tmp, t1, 0x00ff00ff);
- tcg_gen_or_i32(rd, rd, tmp);
-
- tcg_gen_shri_i32(t1, t1, 8);
- tcg_gen_andi_i32(t1, t1, 0x00ff00ff);
- tcg_gen_andi_i32(tmp, t0, 0xff00ff00);
- tcg_gen_or_i32(t1, t1, tmp);
- tcg_gen_mov_i32(t0, rd);
-
- tcg_temp_free_i32(tmp);
- tcg_temp_free_i32(rd);
-}
-
-static void gen_neon_trn_u16(TCGv_i32 t0, TCGv_i32 t1)
-{
- TCGv_i32 rd, tmp;
-
- rd = tcg_temp_new_i32();
- tmp = tcg_temp_new_i32();
-
- tcg_gen_shli_i32(rd, t0, 16);
- tcg_gen_andi_i32(tmp, t1, 0xffff);
- tcg_gen_or_i32(rd, rd, tmp);
- tcg_gen_shri_i32(t1, t1, 16);
- tcg_gen_andi_i32(tmp, t0, 0xffff0000);
- tcg_gen_or_i32(t1, t1, tmp);
- tcg_gen_mov_i32(t0, rd);
-
- tcg_temp_free_i32(tmp);
- tcg_temp_free_i32(rd);
-}
-
-/* Symbolic constants for op fields for Neon 2-register miscellaneous.
- * The values correspond to bits [17:16,10:7]; see the ARM ARM DDI0406B
- * table A7-13.
- */
-#define NEON_2RM_VREV64 0
-#define NEON_2RM_VREV32 1
-#define NEON_2RM_VREV16 2
-#define NEON_2RM_VPADDL 4
-#define NEON_2RM_VPADDL_U 5
-#define NEON_2RM_AESE 6 /* Includes AESD */
-#define NEON_2RM_AESMC 7 /* Includes AESIMC */
-#define NEON_2RM_VCLS 8
-#define NEON_2RM_VCLZ 9
-#define NEON_2RM_VCNT 10
-#define NEON_2RM_VMVN 11
-#define NEON_2RM_VPADAL 12
-#define NEON_2RM_VPADAL_U 13
-#define NEON_2RM_VQABS 14
-#define NEON_2RM_VQNEG 15
-#define NEON_2RM_VCGT0 16
-#define NEON_2RM_VCGE0 17
-#define NEON_2RM_VCEQ0 18
-#define NEON_2RM_VCLE0 19
-#define NEON_2RM_VCLT0 20
-#define NEON_2RM_SHA1H 21
-#define NEON_2RM_VABS 22
-#define NEON_2RM_VNEG 23
-#define NEON_2RM_VCGT0_F 24
-#define NEON_2RM_VCGE0_F 25
-#define NEON_2RM_VCEQ0_F 26
-#define NEON_2RM_VCLE0_F 27
-#define NEON_2RM_VCLT0_F 28
-#define NEON_2RM_VABS_F 30
-#define NEON_2RM_VNEG_F 31
-#define NEON_2RM_VSWP 32
-#define NEON_2RM_VTRN 33
-#define NEON_2RM_VUZP 34
-#define NEON_2RM_VZIP 35
-#define NEON_2RM_VMOVN 36 /* Includes VQMOVN, VQMOVUN */
-#define NEON_2RM_VQMOVN 37 /* Includes VQMOVUN */
-#define NEON_2RM_VSHLL 38
-#define NEON_2RM_SHA1SU1 39 /* Includes SHA256SU0 */
-#define NEON_2RM_VRINTN 40
-#define NEON_2RM_VRINTX 41
-#define NEON_2RM_VRINTA 42
-#define NEON_2RM_VRINTZ 43
-#define NEON_2RM_VCVT_F16_F32 44
-#define NEON_2RM_VRINTM 45
-#define NEON_2RM_VCVT_F32_F16 46
-#define NEON_2RM_VRINTP 47
-#define NEON_2RM_VCVTAU 48
-#define NEON_2RM_VCVTAS 49
-#define NEON_2RM_VCVTNU 50
-#define NEON_2RM_VCVTNS 51
-#define NEON_2RM_VCVTPU 52
-#define NEON_2RM_VCVTPS 53
-#define NEON_2RM_VCVTMU 54
-#define NEON_2RM_VCVTMS 55
-#define NEON_2RM_VRECPE 56
-#define NEON_2RM_VRSQRTE 57
-#define NEON_2RM_VRECPE_F 58
-#define NEON_2RM_VRSQRTE_F 59
-#define NEON_2RM_VCVT_FS 60
-#define NEON_2RM_VCVT_FU 61
-#define NEON_2RM_VCVT_SF 62
-#define NEON_2RM_VCVT_UF 63
-
-/* Each entry in this array has bit n set if the insn allows
- * size value n (otherwise it will UNDEF). Since unallocated
- * op values will have no bits set they always UNDEF.
- */
-static const uint8_t neon_2rm_sizes[] = {
- [NEON_2RM_VREV64] = 0x7,
- [NEON_2RM_VREV32] = 0x3,
- [NEON_2RM_VREV16] = 0x1,
- [NEON_2RM_VPADDL] = 0x7,
- [NEON_2RM_VPADDL_U] = 0x7,
- [NEON_2RM_AESE] = 0x1,
- [NEON_2RM_AESMC] = 0x1,
- [NEON_2RM_VCLS] = 0x7,
- [NEON_2RM_VCLZ] = 0x7,
- [NEON_2RM_VCNT] = 0x1,
- [NEON_2RM_VMVN] = 0x1,
- [NEON_2RM_VPADAL] = 0x7,
- [NEON_2RM_VPADAL_U] = 0x7,
- [NEON_2RM_VQABS] = 0x7,
- [NEON_2RM_VQNEG] = 0x7,
- [NEON_2RM_VCGT0] = 0x7,
- [NEON_2RM_VCGE0] = 0x7,
- [NEON_2RM_VCEQ0] = 0x7,
- [NEON_2RM_VCLE0] = 0x7,
- [NEON_2RM_VCLT0] = 0x7,
- [NEON_2RM_SHA1H] = 0x4,
- [NEON_2RM_VABS] = 0x7,
- [NEON_2RM_VNEG] = 0x7,
- [NEON_2RM_VCGT0_F] = 0x4,
- [NEON_2RM_VCGE0_F] = 0x4,
- [NEON_2RM_VCEQ0_F] = 0x4,
- [NEON_2RM_VCLE0_F] = 0x4,
- [NEON_2RM_VCLT0_F] = 0x4,
- [NEON_2RM_VABS_F] = 0x4,
- [NEON_2RM_VNEG_F] = 0x4,
- [NEON_2RM_VSWP] = 0x1,
- [NEON_2RM_VTRN] = 0x7,
- [NEON_2RM_VUZP] = 0x7,
- [NEON_2RM_VZIP] = 0x7,
- [NEON_2RM_VMOVN] = 0x7,
- [NEON_2RM_VQMOVN] = 0x7,
- [NEON_2RM_VSHLL] = 0x7,
- [NEON_2RM_SHA1SU1] = 0x4,
- [NEON_2RM_VRINTN] = 0x4,
- [NEON_2RM_VRINTX] = 0x4,
- [NEON_2RM_VRINTA] = 0x4,
- [NEON_2RM_VRINTZ] = 0x4,
- [NEON_2RM_VCVT_F16_F32] = 0x2,
- [NEON_2RM_VRINTM] = 0x4,
- [NEON_2RM_VCVT_F32_F16] = 0x2,
- [NEON_2RM_VRINTP] = 0x4,
- [NEON_2RM_VCVTAU] = 0x4,
- [NEON_2RM_VCVTAS] = 0x4,
- [NEON_2RM_VCVTNU] = 0x4,
- [NEON_2RM_VCVTNS] = 0x4,
- [NEON_2RM_VCVTPU] = 0x4,
- [NEON_2RM_VCVTPS] = 0x4,
- [NEON_2RM_VCVTMU] = 0x4,
- [NEON_2RM_VCVTMS] = 0x4,
- [NEON_2RM_VRECPE] = 0x4,
- [NEON_2RM_VRSQRTE] = 0x4,
- [NEON_2RM_VRECPE_F] = 0x4,
- [NEON_2RM_VRSQRTE_F] = 0x4,
- [NEON_2RM_VCVT_FS] = 0x4,
- [NEON_2RM_VCVT_FU] = 0x4,
- [NEON_2RM_VCVT_SF] = 0x4,
- [NEON_2RM_VCVT_UF] = 0x4,
-};
-
static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs,
uint32_t opr_sz, uint32_t max_sz,
gen_helper_gvec_3_ptr *fn)
@@ -4822,178 +4645,6 @@ void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
}
-/* Translate a NEON data processing instruction. Return nonzero if the
- instruction is invalid.
- We process data in a mixture of 32-bit and 64-bit chunks.
- Mostly we use 32-bit chunks so we can use normal scalar instructions. */
-
-static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
-{
- int op;
- int q;
- int rd, rm;
- int size;
- int pass;
- int u;
- TCGv_i32 tmp, tmp2;
-
- if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
- return 1;
- }
-
- /* FIXME: this access check should not take precedence over UNDEF
- * for invalid encodings; we will generate incorrect syndrome information
- * for attempts to execute invalid vfp/neon encodings with FP disabled.
- */
- if (s->fp_excp_el) {
- gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
- syn_simd_access_trap(1, 0xe, false), s->fp_excp_el);
- return 0;
- }
-
- if (!s->vfp_enabled)
- return 1;
- q = (insn & (1 << 6)) != 0;
- u = (insn >> 24) & 1;
- VFP_DREG_D(rd, insn);
- VFP_DREG_M(rm, insn);
- size = (insn >> 20) & 3;
-
- if ((insn & (1 << 23)) == 0) {
- /* Three register same length: handled by decodetree */
- return 1;
- } else if (insn & (1 << 4)) {
- /* Two registers and shift or reg and imm: handled by decodetree */
- return 1;
- } else { /* (insn & 0x00800010 == 0x00800000) */
- if (size != 3) {
- /*
- * Three registers of different lengths, or two registers and
- * a scalar: handled by decodetree
- */
- return 1;
- } else { /* size == 3 */
- if (!u) {
- /* Extract: handled by decodetree */
- return 1;
- } else if ((insn & (1 << 11)) == 0) {
- /* Two register misc. */
- op = ((insn >> 12) & 0x30) | ((insn >> 7) & 0xf);
- size = (insn >> 18) & 3;
- /* UNDEF for unknown op values and bad op-size combinations */
- if ((neon_2rm_sizes[op] & (1 << size)) == 0) {
- return 1;
- }
- if (q && ((rm | rd) & 1)) {
- return 1;
- }
- switch (op) {
- case NEON_2RM_VREV64:
- case NEON_2RM_VPADDL: case NEON_2RM_VPADDL_U:
- case NEON_2RM_VPADAL: case NEON_2RM_VPADAL_U:
- case NEON_2RM_VUZP:
- case NEON_2RM_VZIP:
- case NEON_2RM_VMOVN: case NEON_2RM_VQMOVN:
- case NEON_2RM_VSHLL:
- case NEON_2RM_VCVT_F16_F32:
- case NEON_2RM_VCVT_F32_F16:
- case NEON_2RM_VMVN:
- case NEON_2RM_VNEG:
- case NEON_2RM_VABS:
- case NEON_2RM_VCEQ0:
- case NEON_2RM_VCGT0:
- case NEON_2RM_VCLE0:
- case NEON_2RM_VCGE0:
- case NEON_2RM_VCLT0:
- case NEON_2RM_AESE: case NEON_2RM_AESMC:
- case NEON_2RM_SHA1H:
- case NEON_2RM_SHA1SU1:
- case NEON_2RM_VREV32:
- case NEON_2RM_VREV16:
- case NEON_2RM_VCLS:
- case NEON_2RM_VCLZ:
- case NEON_2RM_VCNT:
- case NEON_2RM_VABS_F:
- case NEON_2RM_VNEG_F:
- case NEON_2RM_VRECPE:
- case NEON_2RM_VRSQRTE:
- case NEON_2RM_VQABS:
- case NEON_2RM_VQNEG:
- case NEON_2RM_VRECPE_F:
- case NEON_2RM_VRSQRTE_F:
- case NEON_2RM_VCVT_FS:
- case NEON_2RM_VCVT_FU:
- case NEON_2RM_VCVT_SF:
- case NEON_2RM_VCVT_UF:
- case NEON_2RM_VRINTX:
- case NEON_2RM_VCGT0_F:
- case NEON_2RM_VCGE0_F:
- case NEON_2RM_VCEQ0_F:
- case NEON_2RM_VCLE0_F:
- case NEON_2RM_VCLT0_F:
- case NEON_2RM_VRINTN:
- case NEON_2RM_VRINTA:
- case NEON_2RM_VRINTM:
- case NEON_2RM_VRINTP:
- case NEON_2RM_VRINTZ:
- case NEON_2RM_VCVTAU:
- case NEON_2RM_VCVTAS:
- case NEON_2RM_VCVTNU:
- case NEON_2RM_VCVTNS:
- case NEON_2RM_VCVTPU:
- case NEON_2RM_VCVTPS:
- case NEON_2RM_VCVTMU:
- case NEON_2RM_VCVTMS:
- case NEON_2RM_VSWP:
- /* handled by decodetree */
- return 1;
- case NEON_2RM_VTRN:
- if (size == 2) {
- int n;
- for (n = 0; n < (q ? 4 : 2); n += 2) {
- tmp = neon_load_reg(rm, n);
- tmp2 = neon_load_reg(rd, n + 1);
- neon_store_reg(rm, n, tmp2);
- neon_store_reg(rd, n + 1, tmp);
- }
- } else {
- goto elementwise;
- }
- break;
-
- default:
- elementwise:
- for (pass = 0; pass < (q ? 4 : 2); pass++) {
- tmp = neon_load_reg(rm, pass);
- switch (op) {
- case NEON_2RM_VTRN:
- tmp2 = neon_load_reg(rd, pass);
- switch (size) {
- case 0: gen_neon_trn_u8(tmp, tmp2); break;
- case 1: gen_neon_trn_u16(tmp, tmp2); break;
- default: abort();
- }
- neon_store_reg(rm, pass, tmp2);
- break;
- default:
- /* Reserved op values were caught by the
- * neon_2rm_sizes[] check earlier.
- */
- abort();
- }
- neon_store_reg(rd, pass, tmp);
- }
- break;
- }
- } else {
- /* VTBL, VTBX, VDUP: handled by decodetree */
- return 1;
- }
- }
- }
- return 0;
-}
-
static int disas_coproc_insn(DisasContext *s, uint32_t insn)
{
int cpnum, is64, crn, crm, opc1, opc2, isread, rt, rt2;
@@ -8694,13 +8345,6 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
}
/* fall back to legacy decoder */
- if (((insn >> 25) & 7) == 1) {
- /* NEON Data processing. */
- if (disas_neon_data_insn(s, insn)) {
- goto illegal_op;
- }
- return;
- }
if ((insn & 0x0e000f00) == 0x0c000100) {
if (arm_dc_feature(s, ARM_FEATURE_IWMMXT)) {
/* iWMMXt register transfer. */
@@ -8888,11 +8532,8 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
break;
}
if (((insn >> 24) & 3) == 3) {
- /* Translate into the equivalent ARM encoding. */
- insn = (insn & 0xe2ffffff) | ((insn & (1 << 28)) >> 4) | (1 << 28);
- if (disas_neon_data_insn(s, insn)) {
- goto illegal_op;
- }
+ /* Neon DP, but failed disas_neon_dp() */
+ goto illegal_op;
} else if (((insn >> 8) & 0xe) == 10) {
/* VFP, but failed disas_vfp. */
goto illegal_op;
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 20/21] target/arm: Convert Neon VTRN to decodetree
2020-06-16 17:08 ` [PATCH 20/21] target/arm: Convert Neon VTRN " Peter Maydell
@ 2020-06-20 0:20 ` Richard Henderson
0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-20 0:20 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 6/16/20 10:08 AM, Peter Maydell wrote:
> Convert the Neon VTRN insn to decodetree. This is the last insn in the
> Neon data-processing group, so we can remove all the now-unused old
> decoder framework.
>
> It's possible that there's a more efficient implementation of
> VTRN, but for this conversion we just copy the existing approach.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/neon-dp.decode | 2 +-
> target/arm/translate-neon.inc.c | 90 ++++++++
> target/arm/translate.c | 363 +-------------------------------
> 3 files changed, 93 insertions(+), 362 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH 21/21] target/arm: Move some functions used only in translate-neon.inc.c to that file
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (19 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 20/21] target/arm: Convert Neon VTRN " Peter Maydell
@ 2020-06-16 17:08 ` Peter Maydell
2020-06-20 0:21 ` Richard Henderson
2020-06-16 21:37 ` [PATCH 00/21] target/arm: Finish neon decodetree conversion no-reply
2020-06-16 21:52 ` no-reply
22 siblings, 1 reply; 45+ messages in thread
From: Peter Maydell @ 2020-06-16 17:08 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
The functions neon_element_offset(), neon_load_element(),
neon_load_element64(), neon_store_element() and
neon_store_element64() are used only in the translate-neon.inc.c
file, so move their definitions there.
Since the .inc.c file is #included in translate.c this doesn't make
much difference currently, but it's a more logical place to put the
functions and it might be helpful if we ever decide to try to make
the .inc.c files genuinely separate compilation units.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/translate-neon.inc.c | 101 ++++++++++++++++++++++++++++++++
target/arm/translate.c | 101 --------------------------------
2 files changed, 101 insertions(+), 101 deletions(-)
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 8cc7f5db544..f6cb9215739 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -54,6 +54,107 @@ static inline int rsub_8(DisasContext *s, int x)
#include "decode-neon-ls.inc.c"
#include "decode-neon-shared.inc.c"
+/* Return the offset of a 2**SIZE piece of a NEON register, at index ELE,
+ * where 0 is the least significant end of the register.
+ */
+static inline long
+neon_element_offset(int reg, int element, MemOp size)
+{
+ int element_size = 1 << size;
+ int ofs = element * element_size;
+#ifdef HOST_WORDS_BIGENDIAN
+ /* Calculate the offset assuming fully little-endian,
+ * then XOR to account for the order of the 8-byte units.
+ */
+ if (element_size < 8) {
+ ofs ^= 8 - element_size;
+ }
+#endif
+ return neon_reg_offset(reg, 0) + ofs;
+}
+
+static void neon_load_element(TCGv_i32 var, int reg, int ele, MemOp mop)
+{
+ long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
+
+ switch (mop) {
+ case MO_UB:
+ tcg_gen_ld8u_i32(var, cpu_env, offset);
+ break;
+ case MO_UW:
+ tcg_gen_ld16u_i32(var, cpu_env, offset);
+ break;
+ case MO_UL:
+ tcg_gen_ld_i32(var, cpu_env, offset);
+ break;
+ default:
+ g_assert_not_reached();
+ }
+}
+
+static void neon_load_element64(TCGv_i64 var, int reg, int ele, MemOp mop)
+{
+ long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
+
+ switch (mop) {
+ case MO_UB:
+ tcg_gen_ld8u_i64(var, cpu_env, offset);
+ break;
+ case MO_UW:
+ tcg_gen_ld16u_i64(var, cpu_env, offset);
+ break;
+ case MO_UL:
+ tcg_gen_ld32u_i64(var, cpu_env, offset);
+ break;
+ case MO_Q:
+ tcg_gen_ld_i64(var, cpu_env, offset);
+ break;
+ default:
+ g_assert_not_reached();
+ }
+}
+
+static void neon_store_element(int reg, int ele, MemOp size, TCGv_i32 var)
+{
+ long offset = neon_element_offset(reg, ele, size);
+
+ switch (size) {
+ case MO_8:
+ tcg_gen_st8_i32(var, cpu_env, offset);
+ break;
+ case MO_16:
+ tcg_gen_st16_i32(var, cpu_env, offset);
+ break;
+ case MO_32:
+ tcg_gen_st_i32(var, cpu_env, offset);
+ break;
+ default:
+ g_assert_not_reached();
+ }
+}
+
+static void neon_store_element64(int reg, int ele, MemOp size, TCGv_i64 var)
+{
+ long offset = neon_element_offset(reg, ele, size);
+
+ switch (size) {
+ case MO_8:
+ tcg_gen_st8_i64(var, cpu_env, offset);
+ break;
+ case MO_16:
+ tcg_gen_st16_i64(var, cpu_env, offset);
+ break;
+ case MO_32:
+ tcg_gen_st32_i64(var, cpu_env, offset);
+ break;
+ case MO_64:
+ tcg_gen_st_i64(var, cpu_env, offset);
+ break;
+ default:
+ g_assert_not_reached();
+ }
+}
+
static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a)
{
int opr_sz;
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 581b0b5cde4..408fb7a492f 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1133,25 +1133,6 @@ neon_reg_offset (int reg, int n)
return vfp_reg_offset(0, sreg);
}
-/* Return the offset of a 2**SIZE piece of a NEON register, at index ELE,
- * where 0 is the least significant end of the register.
- */
-static inline long
-neon_element_offset(int reg, int element, MemOp size)
-{
- int element_size = 1 << size;
- int ofs = element * element_size;
-#ifdef HOST_WORDS_BIGENDIAN
- /* Calculate the offset assuming fully little-endian,
- * then XOR to account for the order of the 8-byte units.
- */
- if (element_size < 8) {
- ofs ^= 8 - element_size;
- }
-#endif
- return neon_reg_offset(reg, 0) + ofs;
-}
-
static TCGv_i32 neon_load_reg(int reg, int pass)
{
TCGv_i32 tmp = tcg_temp_new_i32();
@@ -1159,94 +1140,12 @@ static TCGv_i32 neon_load_reg(int reg, int pass)
return tmp;
}
-static void neon_load_element(TCGv_i32 var, int reg, int ele, MemOp mop)
-{
- long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
-
- switch (mop) {
- case MO_UB:
- tcg_gen_ld8u_i32(var, cpu_env, offset);
- break;
- case MO_UW:
- tcg_gen_ld16u_i32(var, cpu_env, offset);
- break;
- case MO_UL:
- tcg_gen_ld_i32(var, cpu_env, offset);
- break;
- default:
- g_assert_not_reached();
- }
-}
-
-static void neon_load_element64(TCGv_i64 var, int reg, int ele, MemOp mop)
-{
- long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
-
- switch (mop) {
- case MO_UB:
- tcg_gen_ld8u_i64(var, cpu_env, offset);
- break;
- case MO_UW:
- tcg_gen_ld16u_i64(var, cpu_env, offset);
- break;
- case MO_UL:
- tcg_gen_ld32u_i64(var, cpu_env, offset);
- break;
- case MO_Q:
- tcg_gen_ld_i64(var, cpu_env, offset);
- break;
- default:
- g_assert_not_reached();
- }
-}
-
static void neon_store_reg(int reg, int pass, TCGv_i32 var)
{
tcg_gen_st_i32(var, cpu_env, neon_reg_offset(reg, pass));
tcg_temp_free_i32(var);
}
-static void neon_store_element(int reg, int ele, MemOp size, TCGv_i32 var)
-{
- long offset = neon_element_offset(reg, ele, size);
-
- switch (size) {
- case MO_8:
- tcg_gen_st8_i32(var, cpu_env, offset);
- break;
- case MO_16:
- tcg_gen_st16_i32(var, cpu_env, offset);
- break;
- case MO_32:
- tcg_gen_st_i32(var, cpu_env, offset);
- break;
- default:
- g_assert_not_reached();
- }
-}
-
-static void neon_store_element64(int reg, int ele, MemOp size, TCGv_i64 var)
-{
- long offset = neon_element_offset(reg, ele, size);
-
- switch (size) {
- case MO_8:
- tcg_gen_st8_i64(var, cpu_env, offset);
- break;
- case MO_16:
- tcg_gen_st16_i64(var, cpu_env, offset);
- break;
- case MO_32:
- tcg_gen_st32_i64(var, cpu_env, offset);
- break;
- case MO_64:
- tcg_gen_st_i64(var, cpu_env, offset);
- break;
- default:
- g_assert_not_reached();
- }
-}
-
static inline void neon_load_reg64(TCGv_i64 var, int reg)
{
tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(1, reg));
--
2.20.1
^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 21/21] target/arm: Move some functions used only in translate-neon.inc.c to that file
2020-06-16 17:08 ` [PATCH 21/21] target/arm: Move some functions used only in translate-neon.inc.c to that file Peter Maydell
@ 2020-06-20 0:21 ` Richard Henderson
0 siblings, 0 replies; 45+ messages in thread
From: Richard Henderson @ 2020-06-20 0:21 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 6/16/20 10:08 AM, Peter Maydell wrote:
> The functions neon_element_offset(), neon_load_element(),
> neon_load_element64(), neon_store_element() and
> neon_store_element64() are used only in the translate-neon.inc.c
> file, so move their definitions there.
>
> Since the .inc.c file is #included in translate.c this doesn't make
> much difference currently, but it's a more logical place to put the
> functions and it might be helpful if we ever decide to try to make
> the .inc.c files genuinely separate compilation units.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/translate-neon.inc.c | 101 ++++++++++++++++++++++++++++++++
> target/arm/translate.c | 101 --------------------------------
> 2 files changed, 101 insertions(+), 101 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH 00/21] target/arm: Finish neon decodetree conversion
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (20 preceding siblings ...)
2020-06-16 17:08 ` [PATCH 21/21] target/arm: Move some functions used only in translate-neon.inc.c to that file Peter Maydell
@ 2020-06-16 21:37 ` no-reply
2020-06-16 21:52 ` no-reply
22 siblings, 0 replies; 45+ messages in thread
From: no-reply @ 2020-06-16 21:37 UTC (permalink / raw)
To: peter.maydell; +Cc: qemu-arm, richard.henderson, qemu-devel
Patchew URL: https://patchew.org/QEMU/20200616170844.13318-1-peter.maydell@linaro.org/
Hi,
This series seems to have some coding style problems. See output below for
more information:
Subject: [PATCH 00/21] target/arm: Finish neon decodetree conversion
Type: series
Message-id: 20200616170844.13318-1-peter.maydell@linaro.org
=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===
Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
f6e3d71 target/arm: Move some functions used only in translate-neon.inc.c to that file
4a2df60 target/arm: Convert Neon VTRN to decodetree
26f74d6 target/arm: Convert Neon VSWP to decodetree
7a0fa02 target/arm: Convert Neon 2-reg-misc VCVT insns to decodetree
73198e7 target/arm: Convert Neon 2-reg-misc VRINT insns to decodetree
0908381 target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree
f6d72da target/arm: Convert simple fp Neon 2-reg-misc insns
3ef1127 target/arm: Convert Neon VQABS, VQNEG to decodetree
624f399 target/arm: Convert remaining simple 2-reg-misc Neon ops
9bb8fa6 target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree
3535d27 target/arm: Make gen_swap_half() take separate src and dest
2eac819 target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs
e30825b target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn
7d1109a target/arm: Convert Neon 2-reg-misc crypto operations to decodetree
9d87342 target/arm: Convert vectorised 2-reg-misc Neon ops to decodetree
605ae75 target/arm: Convert Neon VCVT f16/f32 insns to decodetree
5fb6c16 target/arm: Convert Neon 2-reg-misc VSHLL to decodetree
e2e99ab target/arm: Convert Neon narrowing moves to decodetree
3dde5dd target/arm: Convert VZIP, VUZP to decodetree
3ed7eaf target/arm: Convert Neon 2-reg-misc pairwise ops to decodetree
37f7428 target/arm: Convert Neon 2-reg-misc VREV64 to decodetree
=== OUTPUT BEGIN ===
1/21 Checking commit 37f7428534e5 (target/arm: Convert Neon 2-reg-misc VREV64 to decodetree)
2/21 Checking commit 3ed7eaff8f5f (target/arm: Convert Neon 2-reg-misc pairwise ops to decodetree)
3/21 Checking commit 3dde5ddb764b (target/arm: Convert VZIP, VUZP to decodetree)
4/21 Checking commit e2e99ab61e59 (target/arm: Convert Neon narrowing moves to decodetree)
5/21 Checking commit 5fb6c161af86 (target/arm: Convert Neon 2-reg-misc VSHLL to decodetree)
6/21 Checking commit 605ae75d2431 (target/arm: Convert Neon VCVT f16/f32 insns to decodetree)
7/21 Checking commit 9d87342857c8 (target/arm: Convert vectorised 2-reg-misc Neon ops to decodetree)
8/21 Checking commit 7d1109aae4db (target/arm: Convert Neon 2-reg-misc crypto operations to decodetree)
9/21 Checking commit e30825bbd0a4 (target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn)
10/21 Checking commit 2eac8198e699 (target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs)
11/21 Checking commit 3535d2721010 (target/arm: Make gen_swap_half() take separate src and dest)
ERROR: trailing statements should be on next line
#50: FILE: target/arm/translate.c:4963:
+ case 1: gen_swap_half(tmp, tmp); break;
total: 1 errors, 0 warnings, 43 lines checked
Patch 11/21 has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
12/21 Checking commit 9bb8fa6a3658 (target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree)
13/21 Checking commit 624f399e8573 (target/arm: Convert remaining simple 2-reg-misc Neon ops)
14/21 Checking commit 3ef11275c669 (target/arm: Convert Neon VQABS, VQNEG to decodetree)
15/21 Checking commit f6d72da4c6fc (target/arm: Convert simple fp Neon 2-reg-misc insns)
16/21 Checking commit 090838197cf0 (target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree)
17/21 Checking commit 73198e721bff (target/arm: Convert Neon 2-reg-misc VRINT insns to decodetree)
18/21 Checking commit 7a0fa02240e2 (target/arm: Convert Neon 2-reg-misc VCVT insns to decodetree)
19/21 Checking commit 26f74d6de184 (target/arm: Convert Neon VSWP to decodetree)
20/21 Checking commit 4a2df6081512 (target/arm: Convert Neon VTRN to decodetree)
21/21 Checking commit f6e3d7186b3e (target/arm: Move some functions used only in translate-neon.inc.c to that file)
WARNING: Block comments use a leading /* on a separate line
#28: FILE: target/arm/translate-neon.inc.c:57:
+/* Return the offset of a 2**SIZE piece of a NEON register, at index ELE,
WARNING: Block comments use a leading /* on a separate line
#37: FILE: target/arm/translate-neon.inc.c:66:
+ /* Calculate the offset assuming fully little-endian,
total: 0 errors, 2 warnings, 226 lines checked
Patch 21/21 has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===
Test command exited with code: 1
The full log is available at
http://patchew.org/logs/20200616170844.13318-1-peter.maydell@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH 00/21] target/arm: Finish neon decodetree conversion
2020-06-16 17:08 [PATCH 00/21] target/arm: Finish neon decodetree conversion Peter Maydell
` (21 preceding siblings ...)
2020-06-16 21:37 ` [PATCH 00/21] target/arm: Finish neon decodetree conversion no-reply
@ 2020-06-16 21:52 ` no-reply
22 siblings, 0 replies; 45+ messages in thread
From: no-reply @ 2020-06-16 21:52 UTC (permalink / raw)
To: peter.maydell; +Cc: qemu-arm, richard.henderson, qemu-devel
Patchew URL: https://patchew.org/QEMU/20200616170844.13318-1-peter.maydell@linaro.org/
Hi,
This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.
=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===
CC qga/guest-agent-command-state.o
CC qga/main.o
CC qga/commands-posix.o
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
CC qga/channel-posix.o
CC qga/qapi-generated/qga-qapi-types.o
CC qga/qapi-generated/qga-qapi-visit.o
---
AR libvhost-user.a
GEN docs/interop/qemu-ga-ref.html
GEN docs/interop/qemu-ga-ref.txt
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
GEN docs/interop/qemu-ga-ref.7
LINK qemu-keymap
LINK ivshmem-client
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK ivshmem-server
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK qemu-nbd
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
AS pc-bios/optionrom/multiboot.o
AS pc-bios/optionrom/linuxboot.o
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK qemu-storage-daemon
CC pc-bios/optionrom/linuxboot_dma.o
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
AS pc-bios/optionrom/kvmvapic.o
AS pc-bios/optionrom/pvh.o
CC pc-bios/optionrom/pvh_main.o
---
SIGN pc-bios/optionrom/linuxboot.bin
SIGN pc-bios/optionrom/linuxboot_dma.bin
SIGN pc-bios/optionrom/kvmvapic.bin
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
BUILD pc-bios/optionrom/pvh.raw
SIGN pc-bios/optionrom/pvh.bin
LINK qemu-io
LINK qemu-edid
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK fsdev/virtfs-proxy-helper
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK scsi/qemu-pr-helper
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK qemu-bridge-helper
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK virtiofsd
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK vhost-user-input
LINK qemu-ga
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
GEN x86_64-softmmu/hmp-commands.h
GEN x86_64-softmmu/config-target.h
GEN x86_64-softmmu/hmp-commands-info.h
---
CC x86_64-softmmu/accel/tcg/translator.o
CC x86_64-softmmu/accel/xen/xen-all.o
CC x86_64-softmmu/dump/dump.o
/tmp/qemu-test/src/fpu/softfloat.c:3365:13: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
absZ &= ~ ( ( ( roundBits ^ 0x40 ) == 0 ) & roundNearestEven );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!
/tmp/qemu-test/src/fpu/softfloat.c:3423:18: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
absZ0 &= ~ ( ( (uint64_t) ( absZ1<<1 ) == 0 ) & roundNearestEven );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!
/tmp/qemu-test/src/fpu/softfloat.c:3483:18: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
absZ0 &= ~(((uint64_t)(absZ1<<1) == 0) & roundNearestEven);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!
/tmp/qemu-test/src/fpu/softfloat.c:3606:13: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
zSig &= ~ ( ( ( roundBits ^ 0x40 ) == 0 ) & roundNearestEven );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!
/tmp/qemu-test/src/fpu/softfloat.c:3760:13: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
zSig &= ~ ( ( ( roundBits ^ 0x200 ) == 0 ) & roundNearestEven );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!
/tmp/qemu-test/src/fpu/softfloat.c:3987:21: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
~ ( ( (uint64_t) ( zSig1<<1 ) == 0 ) & roundNearestEven );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!
/tmp/qemu-test/src/fpu/softfloat.c:4003:22: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
zSig0 &= ~ ( ( (uint64_t) ( zSig1<<1 ) == 0 ) & roundNearestEven );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!
/tmp/qemu-test/src/fpu/softfloat.c:4273:18: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation]
zSig1 &= ~ ( ( zSig2 + zSig2 == 0 ) & roundNearestEven );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!
8 errors generated.
make[1]: *** [/tmp/qemu-test/src/rules.mak:69: fpu/softfloat.o] Error 1
make[1]: *** Waiting for unfinished jobs....
/tmp/qemu-test/src/migration/ram.c:919:45: error: implicit conversion from 'unsigned long' to 'double' changes value from 18446744073709551615 to 18446744073709551616 [-Werror,-Wimplicit-int-float-conversion]
xbzrle_counters.encoding_rate = UINT64_MAX;
~ ^~~~~~~~~~
/usr/include/stdint.h:130:23: note: expanded from macro 'UINT64_MAX'
---
18446744073709551615UL
^~~~~~~~~~~~~~~~~~~~~~
1 error generated.
make[1]: *** [/tmp/qemu-test/src/rules.mak:69: migration/ram.o] Error 1
make: *** [Makefile:527: x86_64-softmmu/all] Error 2
Traceback (most recent call last):
File "./tests/docker/docker.py", line 669, in <module>
sys.exit(main())
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=626d1d5e121543ca8c3bbdb19797f602', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=x86_64-softmmu', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-heh_4z5t/src/docker-src.2020-06-16-17.47.36.4664:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-debug']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=626d1d5e121543ca8c3bbdb19797f602
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-heh_4z5t/src'
make: *** [docker-run-test-debug@fedora] Error 2
real 4m29.254s
user 0m7.869s
The full log is available at
http://patchew.org/logs/20200616170844.13318-1-peter.maydell@linaro.org/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
^ permalink raw reply [flat|nested] 45+ messages in thread