From: Peter Maydell <peter.maydell@linaro.org>
To: qemu-devel@nongnu.org
Cc: "Peter Crosthwaite" <peter.crosthwaite@xilinx.com>,
	patches@linaro.org, "Michael Matz" <matz@suse.de>,
	"Alexander Graf" <agraf@suse.de>,
	"Claudio Fontana" <claudio.fontana@linaro.org>,
	"Dirk Mueller" <dmueller@suse.de>,
	"Will Newton" <will.newton@linaro.org>,
	"Laurent Desnogues" <laurent.desnogues@gmail.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	kvmarm@lists.cs.columbia.edu,
	"Christoffer Dall" <christoffer.dall@linaro.org>,
	"Richard Henderson" <rth@twiddle.net>
Subject: [Qemu-devel] [PATCH 2/8] target-arm: A64: Implement long vector x indexed insns
Date: Fri,  7 Feb 2014 21:49:17 +0000
Message-ID: <1391809763-11251-3-git-send-email-peter.maydell@linaro.org>
In-Reply-To: <1391809763-11251-1-git-send-email-peter.maydell@linaro.org>

Implement the 'long' operations in the vector x indexed
element category: the widening multiplies and multiply-accumulates
(SMULL, UMULL, SQDMULL, SMLAL, UMLAL, SQDMLAL, SMLSL, UMLSL,
SQDMLSL and their '2' variants), which perform a 16x16->32 or
32x32->64 multiply on each element.
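
As a reference for the semantics, here is a minimal C model of one
representative instruction, SMLAL (by element) in the 16-bit case.
This is an illustrative sketch only, not the TCG implementation, and
the function name is invented:

    #include <stdint.h>

    /* SMLAL Vd.4S, Vn.4H, Vm.H[idx]: sign-extend each 16-bit element
     * of Vn, multiply it by the selected 16-bit element of Vm, and
     * accumulate the 32-bit product into the matching lane of Vd.
     */
    static void smlal_by_elem(int32_t d[4], const int16_t n[4],
                              const int16_t m[8], int idx)
    {
        for (int i = 0; i < 4; i++) {
            d[i] += (int32_t)n[i] * (int32_t)m[idx];
        }
    }

The unsigned, subtracting and saturating-doubling forms vary the sign
extension, the accumulate step and a doubling of the product, but all
follow the same widening pattern.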

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate-a64.c | 144 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 139 insertions(+), 5 deletions(-)
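
Two implementation tricks in the 16-bit path below are worth spelling
out. First, duplicating the selected 16-bit index element into both
halves of a 32-bit value lets a single packed Neon multiply helper
produce both lane products at once. Second, the saturating-doubling
forms (SQDMULL and friends) get their doubling by adding the product
to itself through the saturating-add helper, which also takes care of
the saturation flag. A standalone C sketch of the packed-multiply
idea (the names here are invented for illustration; the patch itself
uses tcg_gen_deposit_i32 and gen_helper_neon_mull_s16/u16):

    #include <stdint.h>

    /* Duplicate a 16-bit element into both halves of a 32-bit word,
     * mirroring tcg_gen_deposit_i32(tcg_idx, tcg_idx, tcg_idx, 16, 16).
     */
    static uint32_t dup16(uint16_t x)
    {
        return ((uint32_t)x << 16) | x;
    }

    /* Packed signed widening multiply: each 16-bit half of 'a' times
     * the matching half of 'b', the two 32-bit products packed into a
     * single 64-bit result.
     */
    static uint64_t mull_s16_pair(uint32_t a, uint32_t b)
    {
        int32_t lo = (int32_t)(int16_t)(a & 0xffff) * (int16_t)(b & 0xffff);
        int32_t hi = (int32_t)(int16_t)(a >> 16) * (int16_t)(b >> 16);
        return ((uint64_t)(uint32_t)hi << 32) | (uint32_t)lo;
    }

With the index duplicated via dup16(), mull_s16_pair(op, idx) yields
the two per-lane products that the accumulate step then folds into
the packed 32-bit lanes of the destination.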

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index b7f1ecf..64b8c86 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -7906,11 +7906,6 @@ static void disas_simd_indexed_vector(DisasContext *s, uint32_t insn)
         }
     }
 
-    if (is_long) {
-        unsupported_encoding(s, insn);
-        return;
-    }
-
     if (is_fp) {
         fpst = get_fpstatus_ptr();
     } else {
@@ -8052,6 +8047,145 @@ static void disas_simd_indexed_vector(DisasContext *s, uint32_t insn)
         }
     } else {
         /* long ops: 16x16->32 or 32x32->64 */
+        TCGv_i64 tcg_res[2];
+        int pass;
+        bool satop = extract32(opcode, 0, 1);
+        TCGMemOp memop = MO_32;
+
+        if (satop || !u) {
+            memop |= MO_SIGN;
+        }
+
+        if (size == 2) {
+            TCGv_i64 tcg_idx = tcg_temp_new_i64();
+
+            read_vec_element(s, tcg_idx, rm, index, memop);
+
+            for (pass = 0; pass < 2; pass++) {
+                TCGv_i64 tcg_op = tcg_temp_new_i64();
+                TCGv_i64 tcg_passres;
+
+                read_vec_element(s, tcg_op, rn, pass + (is_q * 2), memop);
+
+                tcg_res[pass] = tcg_temp_new_i64();
+
+                if (opcode == 0xa || opcode == 0xb) {
+                    /* Non-accumulating ops */
+                    tcg_passres = tcg_res[pass];
+                } else {
+                    tcg_passres = tcg_temp_new_i64();
+                }
+
+                tcg_gen_mul_i64(tcg_passres, tcg_op, tcg_idx);
+                tcg_temp_free_i64(tcg_op);
+
+                if (satop) {
+                    /* saturating, doubling */
+                    gen_helper_neon_addl_saturate_s64(tcg_passres, cpu_env,
+                                                      tcg_passres, tcg_passres);
+                }
+
+                if (opcode == 0xa || opcode == 0xb) {
+                    continue;
+                }
+
+                /* Accumulating op: handle accumulate step */
+                read_vec_element(s, tcg_res[pass], rd, pass, MO_64);
+
+                switch (opcode) {
+                case 0x2: /* SMLAL, SMLAL2, UMLAL, UMLAL2 */
+                    tcg_gen_add_i64(tcg_res[pass], tcg_res[pass], tcg_passres);
+                    break;
+                case 0x6: /* SMLSL, SMLSL2, UMLSL, UMLSL2 */
+                    tcg_gen_sub_i64(tcg_res[pass], tcg_res[pass], tcg_passres);
+                    break;
+                case 0x7: /* SQDMLSL, SQDMLSL2 */
+                    tcg_gen_neg_i64(tcg_passres, tcg_passres);
+                    /* fall through */
+                case 0x3: /* SQDMLAL, SQDMLAL2 */
+                    gen_helper_neon_addl_saturate_s64(tcg_res[pass], cpu_env,
+                                                      tcg_res[pass],
+                                                      tcg_passres);
+                    break;
+                default:
+                    g_assert_not_reached();
+                }
+                tcg_temp_free_i64(tcg_passres);
+            }
+            tcg_temp_free_i64(tcg_idx);
+        } else {
+            TCGv_i32 tcg_idx = tcg_temp_new_i32();
+
+            assert(size == 1);
+            read_vec_element_i32(s, tcg_idx, rm, index, size);
+
+            /* The simplest way to handle the 16x16 indexed ops is to duplicate
+             * the index into both halves of the 32 bit tcg_idx and then use
+             * the usual Neon helpers.
+             */
+            tcg_gen_deposit_i32(tcg_idx, tcg_idx, tcg_idx, 16, 16);
+
+            for (pass = 0; pass < 2; pass++) {
+                TCGv_i32 tcg_op = tcg_temp_new_i32();
+                TCGv_i64 tcg_passres;
+
+                read_vec_element_i32(s, tcg_op, rn, pass + (is_q * 2), MO_32);
+                tcg_res[pass] = tcg_temp_new_i64();
+
+                if (opcode == 0xa || opcode == 0xb) {
+                    /* Non-accumulating ops */
+                    tcg_passres = tcg_res[pass];
+                } else {
+                    tcg_passres = tcg_temp_new_i64();
+                }
+
+                if (memop & MO_SIGN) {
+                    gen_helper_neon_mull_s16(tcg_passres, tcg_op, tcg_idx);
+                } else {
+                    gen_helper_neon_mull_u16(tcg_passres, tcg_op, tcg_idx);
+                }
+                if (satop) {
+                    gen_helper_neon_addl_saturate_s32(tcg_passres, cpu_env,
+                                                      tcg_passres, tcg_passres);
+                }
+                tcg_temp_free_i32(tcg_op);
+
+                if (opcode == 0xa || opcode == 0xb) {
+                    continue;
+                }
+
+                /* Accumulating op: handle accumulate step */
+                read_vec_element(s, tcg_res[pass], rd, pass, MO_64);
+
+                switch (opcode) {
+                case 0x2: /* SMLAL, SMLAL2, UMLAL, UMLAL2 */
+                    gen_helper_neon_addl_u32(tcg_res[pass], tcg_res[pass],
+                                             tcg_passres);
+                    break;
+                case 0x6: /* SMLSL, SMLSL2, UMLSL, UMLSL2 */
+                    gen_helper_neon_subl_u32(tcg_res[pass], tcg_res[pass],
+                                             tcg_passres);
+                    break;
+                case 0x7: /* SQDMLSL, SQDMLSL2 */
+                    gen_helper_neon_negl_u32(tcg_passres, tcg_passres);
+                    /* fall through */
+                case 0x3: /* SQDMLAL, SQDMLAL2 */
+                    gen_helper_neon_addl_saturate_s32(tcg_res[pass], cpu_env,
+                                                      tcg_res[pass],
+                                                      tcg_passres);
+                    break;
+                default:
+                    g_assert_not_reached();
+                }
+                tcg_temp_free_i64(tcg_passres);
+            }
+            tcg_temp_free_i32(tcg_idx);
+        }
+
+        for (pass = 0; pass < 2; pass++) {
+            write_vec_element(s, tcg_res[pass], rd, pass, MO_64);
+            tcg_temp_free_i64(tcg_res[pass]);
+        }
     }
 
     if (!TCGV_IS_UNUSED_PTR(fpst)) {
-- 
1.8.5
