From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org, Peter Maydell <peter.maydell@linaro.org>
Subject: [PATCH v2 34/67] target/arm: Use gvec for neon pmax, pmin
Date: Fri, 24 May 2024 16:20:48 -0700 [thread overview]
Message-ID: <20240524232121.284515-35-richard.henderson@linaro.org> (raw)
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/tcg/translate-neon.c | 78 ++-------------------------------
1 file changed, 4 insertions(+), 74 deletions(-)
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
index 6c5a7a98e1..18b048611b 100644
--- a/target/arm/tcg/translate-neon.c
+++ b/target/arm/tcg/translate-neon.c
@@ -831,6 +831,10 @@ DO_3SAME_NO_SZ_3(VABA_S, gen_gvec_saba)
DO_3SAME_NO_SZ_3(VABD_U, gen_gvec_uabd)
DO_3SAME_NO_SZ_3(VABA_U, gen_gvec_uaba)
DO_3SAME_NO_SZ_3(VPADD, gen_gvec_addp)
+DO_3SAME_NO_SZ_3(VPMAX_S, gen_gvec_smaxp)
+DO_3SAME_NO_SZ_3(VPMIN_S, gen_gvec_sminp)
+DO_3SAME_NO_SZ_3(VPMAX_U, gen_gvec_umaxp)
+DO_3SAME_NO_SZ_3(VPMIN_U, gen_gvec_uminp)
#define DO_3SAME_CMP(INSN, COND) \
static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
@@ -1003,80 +1007,6 @@ DO_3SAME_32_ENV(VQSHL_U, qshl_u)
DO_3SAME_32_ENV(VQRSHL_S, qrshl_s)
DO_3SAME_32_ENV(VQRSHL_U, qrshl_u)
-static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn)
-{
- /* Operations handled pairwise 32 bits at a time */
- TCGv_i32 tmp, tmp2, tmp3;
-
- if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
- return false;
- }
-
- /* UNDEF accesses to D16-D31 if they don't exist. */
- if (!dc_isar_feature(aa32_simd_r32, s) &&
- ((a->vd | a->vn | a->vm) & 0x10)) {
- return false;
- }
-
- if (a->size == 3) {
- return false;
- }
-
- if (!vfp_access_check(s)) {
- return true;
- }
-
- assert(a->q == 0); /* enforced by decode patterns */
-
- /*
- * Note that we have to be careful not to clobber the source operands
- * in the "vm == vd" case by storing the result of the first pass too
- * early. Since Q is 0 there are always just two passes, so instead
- * of a complicated loop over each pass we just unroll.
- */
- tmp = tcg_temp_new_i32();
- tmp2 = tcg_temp_new_i32();
- tmp3 = tcg_temp_new_i32();
-
- read_neon_element32(tmp, a->vn, 0, MO_32);
- read_neon_element32(tmp2, a->vn, 1, MO_32);
- fn(tmp, tmp, tmp2);
-
- read_neon_element32(tmp3, a->vm, 0, MO_32);
- read_neon_element32(tmp2, a->vm, 1, MO_32);
- fn(tmp3, tmp3, tmp2);
-
- write_neon_element32(tmp, a->vd, 0, MO_32);
- write_neon_element32(tmp3, a->vd, 1, MO_32);
-
- return true;
-}
-
-#define DO_3SAME_PAIR(INSN, func) \
- static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
- { \
- static NeonGenTwoOpFn * const fns[] = { \
- gen_helper_neon_##func##8, \
- gen_helper_neon_##func##16, \
- gen_helper_neon_##func##32, \
- }; \
- if (a->size > 2) { \
- return false; \
- } \
- return do_3same_pair(s, a, fns[a->size]); \
- }
-
-/* 32-bit pairwise ops end up the same as the elementwise versions. */
-#define gen_helper_neon_pmax_s32 tcg_gen_smax_i32
-#define gen_helper_neon_pmax_u32 tcg_gen_umax_i32
-#define gen_helper_neon_pmin_s32 tcg_gen_smin_i32
-#define gen_helper_neon_pmin_u32 tcg_gen_umin_i32
-
-DO_3SAME_PAIR(VPMAX_S, pmax_s)
-DO_3SAME_PAIR(VPMIN_S, pmin_s)
-DO_3SAME_PAIR(VPMAX_U, pmax_u)
-DO_3SAME_PAIR(VPMIN_U, pmin_u)
-
#define DO_3SAME_VQDMULH(INSN, FUNC) \
WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16); \
WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32); \
--
2.34.1
next prev parent reply other threads:[~2024-05-24 23:30 UTC|newest]
Thread overview: 114+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-24 23:20 [PATCH v2 00/67] target/arm: Convert a64 advsimd to decodetree (part 1) Richard Henderson
2024-05-24 23:20 ` [PATCH v2 01/67] target/arm: Add neoverse-n1 to qemu-arm (DO NOT MERGE) Richard Henderson
2024-05-24 23:20 ` [PATCH v2 02/67] target/arm: Use PLD, PLDW, PLI not NOP for t32 Richard Henderson
2024-05-28 13:13 ` Peter Maydell
2024-05-24 23:20 ` [PATCH v2 03/67] target/arm: Reject incorrect operands to PLD, PLDW, PLI Richard Henderson
2024-05-28 13:18 ` Peter Maydell
2024-05-28 17:36 ` Richard Henderson
2024-05-29 13:32 ` Peter Maydell
2024-05-29 17:39 ` Richard Henderson
2024-05-24 23:20 ` [PATCH v2 04/67] target/arm: Zero-extend writeback for fp16 FCVTZS (scalar, integer) Richard Henderson
2024-05-28 12:48 ` Peter Maydell
2024-05-24 23:20 ` [PATCH v2 05/67] target/arm: Fix decode of FMOV (hp) vs MOVI Richard Henderson
2024-05-28 12:34 ` Peter Maydell
2024-05-24 23:20 ` [PATCH v2 06/67] target/arm: Verify sz=0 for Advanced SIMD scalar pairwise (fp16) Richard Henderson
2024-05-28 12:25 ` Peter Maydell
2024-05-24 23:20 ` [PATCH v2 07/67] target/arm: Split out gengvec.c Richard Henderson
2024-05-24 23:20 ` [PATCH v2 08/67] target/arm: Split out gengvec64.c Richard Henderson
2024-05-24 23:20 ` [PATCH v2 09/67] target/arm: Convert Cryptographic AES to decodetree Richard Henderson
2024-05-24 23:20 ` [PATCH v2 10/67] target/arm: Convert Cryptographic 3-register SHA " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 11/67] target/arm: Convert Cryptographic 2-register " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 12/67] target/arm: Convert Cryptographic 3-register SHA512 " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 13/67] target/arm: Convert Cryptographic 2-register " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 14/67] target/arm: Convert Cryptographic 4-register " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 15/67] target/arm: Convert Cryptographic 3-register, imm2 " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 16/67] target/arm: Convert XAR " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 17/67] target/arm: Convert Advanced SIMD copy " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 18/67] target/arm: Convert FMULX " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 19/67] target/arm: Convert FADD, FSUB, FDIV, FMUL " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 20/67] target/arm: Convert FMAX, FMIN, FMAXNM, FMINNM " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 21/67] target/arm: Introduce vfp_load_reg16 Richard Henderson
2024-05-28 12:22 ` Peter Maydell
2024-05-24 23:20 ` [PATCH v2 22/67] target/arm: Expand vfp neg and abs inline Richard Henderson
2024-05-24 23:20 ` [PATCH v2 23/67] target/arm: Convert FNMUL to decodetree Richard Henderson
2024-05-24 23:20 ` [PATCH v2 24/67] target/arm: Convert FMLA, FMLS " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 25/67] target/arm: Convert FCMEQ, FCMGE, FCMGT, FACGE, FACGT " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 26/67] target/arm: Convert FABD " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 27/67] target/arm: Convert FRECPS, FRSQRTS " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 28/67] target/arm: Convert FADDP " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 29/67] target/arm: Convert FMAXP, FMINP, FMAXNMP, FMINNMP " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 30/67] target/arm: Use gvec for neon faddp, fmaxp, fminp Richard Henderson
2024-05-24 23:20 ` [PATCH v2 31/67] target/arm: Convert ADDP to decodetree Richard Henderson
2024-05-24 23:20 ` [PATCH v2 32/67] target/arm: Use gvec for neon padd Richard Henderson
2024-05-24 23:20 ` [PATCH v2 33/67] target/arm: Convert SMAXP, SMINP, UMAXP, UMINP to decodetree Richard Henderson
2024-05-24 23:20 ` Richard Henderson [this message]
2024-05-24 23:20 ` [PATCH v2 35/67] target/arm: Convert FMLAL, FMLSL " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 36/67] target/arm: Convert disas_simd_3same_logic " Richard Henderson
2024-05-24 23:20 ` [PATCH v2 37/67] target/arm: Improve vector UQADD, UQSUB, SQADD, SQSUB Richard Henderson
2024-05-28 15:24 ` Peter Maydell
2024-05-24 23:20 ` [PATCH v2 38/67] target/arm: Convert SUQADD and USQADD to gvec Richard Henderson
2024-05-28 15:37 ` Peter Maydell
2024-05-28 17:41 ` Richard Henderson
2024-05-24 23:20 ` [PATCH v2 39/67] target/arm: Inline scalar SUQADD and USQADD Richard Henderson
2024-05-28 15:40 ` Peter Maydell
2024-05-24 23:20 ` [PATCH v2 40/67] target/arm: Inline scalar SQADD, UQADD, SQSUB, UQSUB Richard Henderson
2024-05-28 15:42 ` Peter Maydell
2024-05-24 23:20 ` [PATCH v2 41/67] target/arm: Convert SQADD, SQSUB, UQADD, UQSUB to decodetree Richard Henderson
2024-05-28 15:44 ` Peter Maydell
2024-05-24 23:20 ` [PATCH v2 42/67] target/arm: Convert SUQADD, USQADD " Richard Henderson
2024-05-28 15:44 ` Peter Maydell
2024-05-24 23:20 ` [PATCH v2 43/67] target/arm: Convert SSHL, USHL " Richard Henderson
2024-05-28 15:46 ` Peter Maydell
2024-05-24 23:20 ` [PATCH v2 44/67] target/arm: Convert SRSHL and URSHL (register) to gvec Richard Henderson
2024-05-28 15:51 ` Peter Maydell
2024-05-28 17:30 ` Richard Henderson
2024-05-24 23:20 ` [PATCH v2 45/67] target/arm: Convert SRSHL, URSHL to decodetree Richard Henderson
2024-05-28 15:51 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 46/67] target/arm: Convert SQSHL and UQSHL (register) to gvec Richard Henderson
2024-05-28 15:53 ` Peter Maydell
2024-05-28 17:31 ` Richard Henderson
2024-05-24 23:21 ` [PATCH v2 47/67] target/arm: Convert SQSHL, UQSHL to decodetree Richard Henderson
2024-05-28 15:53 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 48/67] target/arm: Convert SQRSHL and UQRSHL (register) to gvec Richard Henderson
2024-05-28 15:54 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 49/67] target/arm: Convert SQRSHL, UQRSHL to decodetree Richard Henderson
2024-05-28 15:55 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 50/67] target/arm: Convert ADD, SUB (vector) " Richard Henderson
2024-05-28 15:56 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 51/67] target/arm: Convert CMGT, CMHI, CMGE, CMHS, CMTST, CMEQ " Richard Henderson
2024-05-28 15:57 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 52/67] target/arm: Use TCG_COND_TSTNE in gen_cmtst_{i32, i64} Richard Henderson
2024-05-28 15:58 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 53/67] target/arm: Use TCG_COND_TSTNE in gen_cmtst_vec Richard Henderson
2024-05-28 15:59 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 54/67] target/arm: Convert SHADD, UHADD to gvec Richard Henderson
2024-05-28 16:01 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 55/67] target/arm: Convert SHADD, UHADD to decodetree Richard Henderson
2024-05-28 16:01 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 56/67] target/arm: Convert SHSUB, UHSUB to gvec Richard Henderson
2024-05-28 16:02 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 57/67] target/arm: Convert SHSUB, UHSUB to decodetree Richard Henderson
2024-05-28 16:02 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 58/67] target/arm: Convert SRHADD, URHADD to gvec Richard Henderson
2024-05-28 16:03 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 59/67] target/arm: Convert SRHADD, URHADD to decodetree Richard Henderson
2024-05-28 16:04 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 60/67] target/arm: Convert SMAX, SMIN, UMAX, UMIN " Richard Henderson
2024-05-28 16:04 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 61/67] target/arm: Convert SABA, SABD, UABA, UABD " Richard Henderson
2024-05-28 16:05 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 62/67] target/arm: Convert MUL, PMUL " Richard Henderson
2024-05-28 16:06 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 63/67] target/arm: Convert MLA, MLS " Richard Henderson
2024-05-28 16:07 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 64/67] target/arm: Tidy SQDMULH, SQRDMULH (vector) Richard Henderson
2024-05-28 16:08 ` Peter Maydell
2024-05-24 23:21 ` [PATCH v2 65/67] target/arm: Convert SQDMULH, SQRDMULH to decodetree Richard Henderson
2024-05-28 16:10 ` Peter Maydell
2024-05-28 17:27 ` Richard Henderson
2024-05-24 23:21 ` [PATCH v2 66/67] target/arm: Convert FMADD, FMSUB, FNMADD, FNMSUB " Richard Henderson
2024-05-28 16:15 ` Peter Maydell
2024-05-28 17:31 ` Richard Henderson
2024-05-24 23:21 ` [PATCH v2 67/67] target/arm: Convert FCSEL " Richard Henderson
2024-05-28 16:16 ` Peter Maydell
2024-05-28 16:20 ` [PATCH v2 00/67] target/arm: Convert a64 advsimd to decodetree (part 1) Peter Maydell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240524232121.284515-35-richard.henderson@linaro.org \
--to=richard.henderson@linaro.org \
--cc=peter.maydell@linaro.org \
--cc=qemu-arm@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).