[PATCH 19/20] target/arm: Convert load/store single structure to decodetree

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Peter Maydell <peter.maydell@linaro.org>
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 19/20] target/arm: Convert load/store single structure to decodetree
Date: Fri,  2 Jun 2023 16:52:22 +0100	[thread overview]
Message-ID: <20230602155223.2040685-20-peter.maydell@linaro.org> (raw)
In-Reply-To: <20230602155223.2040685-1-peter.maydell@linaro.org>

Convert the ASIMD load/store single structure insns to decodetree.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
I note that compared to the old decoder this is rather harder
to compare against the pseudocode; the old hand-decode can
follow the pseudocode quite closely with its switch on 'scale',
whereas here quite a lot of magic is happening in the calculation
of 'index'.
---
 target/arm/tcg/a64.decode      |  37 ++++++
 target/arm/tcg/translate-a64.c | 228 ++++++++++++++++-----------------
 2 files changed, 148 insertions(+), 117 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index 12d331b4c2a..48461a0540e 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -494,3 +494,40 @@ LD_mult         0 . 001100 . 1 0 ..... 0110 .. ..... ..... @ldst_mult rpt=3 sele
 LD_mult         0 . 001100 . 1 0 ..... 0111 .. ..... ..... @ldst_mult rpt=1 selem=1
 LD_mult         0 . 001100 . 1 0 ..... 1000 .. ..... ..... @ldst_mult rpt=1 selem=2
 LD_mult         0 . 001100 . 1 0 ..... 1010 .. ..... ..... @ldst_mult rpt=2 selem=1
+
+# Load/store single structure
+
+%ldst_single_selem 13:1 21:1 !function=plus_1
+# The index is made up from bits Q, S and the size; we may then need to scale
+# it down by the size.
+%ldst_single_index q:1 s:1 sz:2
+%ldst_single_index_scaled q:1 s:1 sz:2 scale:3 !function=uimm_scaled_down
+%ldst_single_repl_scale 10:2
+
+# We don't care about S in the trans functions (the decode folds it into
+# the calculation of index), but we have to list it here so that we can
+# handle the S-must-be-0 pattern lines. Similarly we don't care about sz
+# once it has been used to calculate index.
+&ldst_single    rm rn rt sz p q s selem index scale repl
+
+@ldst_single        . q:1 ...... p:1 . . rm:5 ... . .. rn:5 rt:5 \
+                    &ldst_single index=%ldst_single_index_scaled \
+                    selem=%ldst_single_selem repl=0
+@ldst_single_repl   . q:1 ...... p:1 . . rm:5 ... . sz:2 rn:5 rt:5 \
+                    &ldst_single index=%ldst_single_index \
+                    scale=%ldst_single_repl_scale selem=%ldst_single_selem repl=1
+
+
+ST_single       0 . 001101 . 0 . ..... 00 . s:1 sz:2 ..... ..... @ldst_single scale=0
+ST_single       0 . 001101 . 0 . ..... 01 . s:1 00 ..... ..... @ldst_single scale=1 sz=0
+ST_single       0 . 001101 . 0 . ..... 01 . s:1 10 ..... ..... @ldst_single scale=1 sz=2
+ST_single       0 . 001101 . 0 . ..... 10 . s:1 00 ..... ..... @ldst_single scale=2 sz=0
+ST_single       0 . 001101 . 0 . ..... 10 . 0 01 ..... ..... @ldst_single scale=3 sz=1 s=0
+
+LD_single       0 . 001101 . 1 . ..... 00 . s:1 sz:2 ..... ..... @ldst_single scale=0
+LD_single       0 . 001101 . 1 . ..... 01 . s:1 00 ..... ..... @ldst_single scale=1 sz=0
+LD_single       0 . 001101 . 1 . ..... 01 . s:1 10 ..... ..... @ldst_single scale=1 sz=2
+LD_single       0 . 001101 . 1 . ..... 10 . s:1 00 ..... ..... @ldst_single scale=2 sz=0
+LD_single       0 . 001101 . 1 . ..... 10 . 0 01 ..... ..... @ldst_single scale=3 sz=1 s=0
+# Replicating load case
+LD_single_repl  0 . 001101 . 1 . ..... 11 . 0 .. ..... ..... @ldst_single_repl s=0
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index c3b22a74dd5..128c2b8b4b5 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -72,6 +72,17 @@ static int uimm_scaled(DisasContext *s, int x)
     return imm << scale;
 }
 
+/*
+ * For ASIMD load/store single structure: immediate is in bits [31:3],
+ * and should be scaled down by the scale in bits [2:0].
+ */
+static int uimm_scaled_down(DisasContext *s, int x)
+{
+    unsigned imm = x >> 3;
+    unsigned scale = extract32(x, 0, 3);
+    return imm >> scale;
+}
+
 /*
  * Include the generated decoders.
  */
@@ -3405,140 +3416,126 @@ static bool trans_ST_mult(DisasContext *s, arg_ldst_mult *a)
     return true;
 }
 
-/* AdvSIMD load/store single structure
- *
- *  31  30  29           23 22 21 20       16 15 13 12  11  10 9    5 4    0
- * +---+---+---------------+-----+-----------+-----+---+------+------+------+
- * | 0 | Q | 0 0 1 1 0 1 0 | L R | 0 0 0 0 0 | opc | S | size |  Rn  |  Rt  |
- * +---+---+---------------+-----+-----------+-----+---+------+------+------+
- *
- * AdvSIMD load/store single structure (post-indexed)
- *
- *  31  30  29           23 22 21 20       16 15 13 12  11  10 9    5 4    0
- * +---+---+---------------+-----+-----------+-----+---+------+------+------+
- * | 0 | Q | 0 0 1 1 0 1 1 | L R |     Rm    | opc | S | size |  Rn  |  Rt  |
- * +---+---+---------------+-----+-----------+-----+---+------+------+------+
- *
- * Rt: first (or only) SIMD&FP register to be transferred
- * Rn: base address or SP
- * Rm (post-index only): post-index register (when !31) or size dependent #imm
- * index = encoded in Q:S:size dependent on size
- *
- * lane_size = encoded in R, opc
- * transfer width = encoded in opc, S, size
- */
-static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
+static bool trans_ST_single(DisasContext *s, arg_ldst_single *a)
 {
-    int rt = extract32(insn, 0, 5);
-    int rn = extract32(insn, 5, 5);
-    int rm = extract32(insn, 16, 5);
-    int size = extract32(insn, 10, 2);
-    int S = extract32(insn, 12, 1);
-    int opc = extract32(insn, 13, 3);
-    int R = extract32(insn, 21, 1);
-    int is_load = extract32(insn, 22, 1);
-    int is_postidx = extract32(insn, 23, 1);
-    int is_q = extract32(insn, 30, 1);
-
-    int scale = extract32(opc, 1, 2);
-    int selem = (extract32(opc, 0, 1) << 1 | R) + 1;
-    bool replicate = false;
-    int index = is_q << 3 | S << 2 | size;
-    int xs, total;
+    int xs, total, rt;
     TCGv_i64 clean_addr, tcg_rn, tcg_ebytes;
     MemOp mop;
 
-    if (extract32(insn, 31, 1)) {
-        unallocated_encoding(s);
-        return;
+    if (!a->p && a->rm != 0) {
+        return false;
     }
-    if (!is_postidx && rm != 0) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    switch (scale) {
-    case 3:
-        if (!is_load || S) {
-            unallocated_encoding(s);
-            return;
-        }
-        scale = size;
-        replicate = true;
-        break;
-    case 0:
-        break;
-    case 1:
-        if (extract32(size, 0, 1)) {
-            unallocated_encoding(s);
-            return;
-        }
-        index >>= 1;
-        break;
-    case 2:
-        if (extract32(size, 1, 1)) {
-            unallocated_encoding(s);
-            return;
-        }
-        if (!extract32(size, 0, 1)) {
-            index >>= 2;
-        } else {
-            if (S) {
-                unallocated_encoding(s);
-                return;
-            }
-            index >>= 3;
-            scale = 3;
-        }
-        break;
-    default:
-        g_assert_not_reached();
-    }
-
     if (!fp_access_check(s)) {
-        return;
+        return true;
     }
 
-    if (rn == 31) {
+    if (a->rn == 31) {
         gen_check_sp_alignment(s);
     }
 
-    total = selem << scale;
-    tcg_rn = cpu_reg_sp(s, rn);
+    total = a->selem << a->scale;
+    tcg_rn = cpu_reg_sp(s, a->rn);
 
-    clean_addr = gen_mte_checkN(s, tcg_rn, !is_load, is_postidx || rn != 31,
-                                total);
-    mop = finalize_memop(s, scale);
+    clean_addr = gen_mte_checkN(s, tcg_rn, true, a->p || a->rn != 31, total);
+    mop = finalize_memop(s, a->scale);
 
-    tcg_ebytes = tcg_constant_i64(1 << scale);
-    for (xs = 0; xs < selem; xs++) {
-        if (replicate) {
-            /* Load and replicate to all elements */
-            TCGv_i64 tcg_tmp = tcg_temp_new_i64();
-
-            tcg_gen_qemu_ld_i64(tcg_tmp, clean_addr, get_mem_index(s), mop);
-            tcg_gen_gvec_dup_i64(scale, vec_full_reg_offset(s, rt),
-                                 (is_q + 1) * 8, vec_full_reg_size(s),
-                                 tcg_tmp);
-        } else {
-            /* Load/store one element per register */
-            if (is_load) {
-                do_vec_ld(s, rt, index, clean_addr, mop);
-            } else {
-                do_vec_st(s, rt, index, clean_addr, mop);
-            }
-        }
+    tcg_ebytes = tcg_constant_i64(1 << a->scale);
+    for (xs = 0, rt = a->rt; xs < a->selem; xs++, rt = (rt + 1) % 32) {
+        do_vec_st(s, rt, a->index, clean_addr, mop);
         tcg_gen_add_i64(clean_addr, clean_addr, tcg_ebytes);
-        rt = (rt + 1) % 32;
     }
 
-    if (is_postidx) {
-        if (rm == 31) {
+    if (a->p) {
+        if (a->rm == 31) {
             tcg_gen_addi_i64(tcg_rn, tcg_rn, total);
         } else {
-            tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
+            tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, a->rm));
         }
     }
+    return true;
+}
+
+static bool trans_LD_single(DisasContext *s, arg_ldst_single *a)
+{
+    int xs, total, rt;
+    TCGv_i64 clean_addr, tcg_rn, tcg_ebytes;
+    MemOp mop;
+
+    if (!a->p && a->rm != 0) {
+        return false;
+    }
+    if (!fp_access_check(s)) {
+        return true;
+    }
+
+    if (a->rn == 31) {
+        gen_check_sp_alignment(s);
+    }
+
+    total = a->selem << a->scale;
+    tcg_rn = cpu_reg_sp(s, a->rn);
+
+    clean_addr = gen_mte_checkN(s, tcg_rn, false, a->p || a->rn != 31, total);
+    mop = finalize_memop(s, a->scale);
+
+    tcg_ebytes = tcg_constant_i64(1 << a->scale);
+    for (xs = 0, rt = a->rt; xs < a->selem; xs++, rt = (rt + 1) % 32) {
+        do_vec_ld(s, rt, a->index, clean_addr, mop);
+        tcg_gen_add_i64(clean_addr, clean_addr, tcg_ebytes);
+    }
+
+    if (a->p) {
+        if (a->rm == 31) {
+            tcg_gen_addi_i64(tcg_rn, tcg_rn, total);
+        } else {
+            tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, a->rm));
+        }
+    }
+    return true;
+}
+
+static bool trans_LD_single_repl(DisasContext *s, arg_ldst_single *a)
+{
+    int xs, total, rt;
+    TCGv_i64 clean_addr, tcg_rn, tcg_ebytes;
+    MemOp mop;
+
+    if (!a->p && a->rm != 0) {
+        return false;
+    }
+    if (!fp_access_check(s)) {
+        return true;
+    }
+
+    if (a->rn == 31) {
+        gen_check_sp_alignment(s);
+    }
+
+    total = a->selem << a->scale;
+    tcg_rn = cpu_reg_sp(s, a->rn);
+
+    clean_addr = gen_mte_checkN(s, tcg_rn, false, a->p || a->rn != 31, total);
+    mop = finalize_memop(s, a->scale);
+
+    tcg_ebytes = tcg_constant_i64(1 << a->scale);
+    for (xs = 0, rt = a->rt; xs < a->selem; xs++, rt = (rt + 1) % 32) {
+        /* Load and replicate to all elements */
+        TCGv_i64 tcg_tmp = tcg_temp_new_i64();
+
+        tcg_gen_qemu_ld_i64(tcg_tmp, clean_addr, get_mem_index(s), mop);
+        tcg_gen_gvec_dup_i64(a->scale, vec_full_reg_offset(s, rt),
+                             (a->q + 1) * 8, vec_full_reg_size(s), tcg_tmp);
+        tcg_gen_add_i64(clean_addr, clean_addr, tcg_ebytes);
+    }
+
+    if (a->p) {
+        if (a->rm == 31) {
+            tcg_gen_addi_i64(tcg_rn, tcg_rn, total);
+        } else {
+            tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, a->rm));
+        }
+    }
+    return true;
 }
 
 /*
@@ -3747,9 +3744,6 @@ static void disas_ldst_tag(DisasContext *s, uint32_t insn)
 static void disas_ldst(DisasContext *s, uint32_t insn)
 {
     switch (extract32(insn, 24, 6)) {
-    case 0x0d: /* AdvSIMD load/store single structure */
-        disas_ldst_single_struct(s, insn);
-        break;
     case 0x19:
         if (extract32(insn, 21, 1) != 0) {
             disas_ldst_tag(s, insn);
-- 
2.34.1

next prev parent reply	other threads:[~2023-06-02 15:54 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-02 15:52 [PATCH 00/20] target/arm: Convert exception, system, loads and stores to decodetree Peter Maydell
2023-06-02 15:52 ` [PATCH 01/20] target/arm: Fix return value from LDSMIN/LDSMAX 8/16 bit atomics Peter Maydell
2023-06-03  5:35   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 02/20] target/arm: Convert hint instruction space to decodetree Peter Maydell
2023-06-03  5:42   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 03/20] target/arm: Convert barrier insns " Peter Maydell
2023-06-03  5:48   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 04/20] target/arm: Convert CFINV, XAFLAG and AXFLAG " Peter Maydell
2023-06-03  5:55   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 05/20] target/arm: Convert MSR (immediate) " Peter Maydell
2023-06-03  6:01   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 06/20] target/arm: Convert MSR (reg), MRS, SYS, SYSL " Peter Maydell
2023-06-03  6:05   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 07/20] target/arm: Convert exception generation instructions " Peter Maydell
2023-06-03  6:09   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 08/20] target/arm: Convert load/store exclusive and ordered " Peter Maydell
2023-06-03 22:32   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 09/20] target/arm: Convert LDXP, STXP, CASP, CAS " Peter Maydell
2023-06-03 22:44   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 10/20] target/arm: Convert load reg (literal) group " Peter Maydell
2023-06-03 22:49   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 11/20] target/arm: Convert load/store-pair " Peter Maydell
2023-06-03 23:05   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 12/20] target/arm: Convert ld/st reg+imm9 insns " Peter Maydell
2023-06-03 23:14   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 13/20] target/arm: Convert LDR/STR with 12-bit immediate " Peter Maydell
2023-06-02 20:51   ` Philippe Mathieu-Daudé
2023-06-03 16:18     ` Peter Maydell
2023-06-03 23:19   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 14/20] target/arm: Convert LDR/STR reg+reg " Peter Maydell
2023-06-03 23:27   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 15/20] target/arm: Convert atomic memory ops " Peter Maydell
2023-06-03 23:35   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 16/20] target/arm: Convert load (pointer auth) insns " Peter Maydell
2023-06-02 20:56   ` Philippe Mathieu-Daudé
2023-06-03 23:41   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 17/20] target/arm: Convert LDAPR/STLR (imm) " Peter Maydell
2023-06-03 23:55   ` Richard Henderson
2023-06-02 15:52 ` [PATCH 18/20] target/arm: Convert load/store (multiple structures) " Peter Maydell
2023-06-04  0:00   ` Richard Henderson
2023-06-02 15:52 ` Peter Maydell [this message]
2023-06-04  1:27   ` [PATCH 19/20] target/arm: Convert load/store single structure " Richard Henderson
2023-06-02 15:52 ` [PATCH 20/20] target/arm: Convert load/store tags insns " Peter Maydell
2023-06-04  1:36   ` Richard Henderson

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:12d331b4c2 dfblob:48461a0540 dfblob:c3b22a74dd
dfblob:128c2b8b4b )
 OR (
bs:"[PATCH 19/20] target/arm: Convert load/store single structure to decodetree" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230602155223.2040685-20-peter.maydell@linaro.org \
    --to=peter.maydell@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).