qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/6] target/arm: assorted mte fixes
@ 2024-02-06  3:05 Richard Henderson
  2024-02-06  3:05 ` [PATCH v2 1/6] linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3 Richard Henderson
                   ` (7 more replies)
  0 siblings, 8 replies; 20+ messages in thread
From: Richard Henderson @ 2024-02-06  3:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, gustavo.romero

The first patch is unchanged from

Supercedes: <20240131003557.176486-1-richard.henderson@linaro.org>

while the remaining patches replace

Supercedes: <20240205023948.25476-1-richard.henderson@linaro.org>

While digging through Gustavo's test case, wondering why it
should be failing at all, I finally noticed that we weren't
overflowing MTEDESC.SIZEM1, but underflowing (-1).  Oops.

But I did find a few other points by inspection where we
weren't properly handling or supplying MTEDESC.


r~


Richard Henderson (6):
  linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3
  target/arm: Fix nregs computation in do_ld_zpa
  target/arm: Adjust and validate mtedesc sizem1
  target/arm: Split out make_svemte_desc
  target/arm: Handle mte in do_ldrq, do_ldro
  target/arm: Fix SVE/SME gross MTE suppression checks

 linux-user/aarch64/target_prctl.h | 25 +++++-----
 target/arm/internals.h            |  2 +-
 target/arm/tcg/translate-a64.h    |  2 +
 target/arm/tcg/sme_helper.c       |  8 ++--
 target/arm/tcg/sve_helper.c       | 12 ++---
 target/arm/tcg/translate-sme.c    | 15 ++----
 target/arm/tcg/translate-sve.c    | 80 ++++++++++++++++++-------------
 7 files changed, 78 insertions(+), 66 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 1/6] linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3
  2024-02-06  3:05 [PATCH v2 0/6] target/arm: assorted mte fixes Richard Henderson
@ 2024-02-06  3:05 ` Richard Henderson
  2024-02-06 14:23   ` Peter Maydell
  2024-02-06 15:18   ` Gustavo Romero
  2024-02-06  3:05 ` [PATCH v2 2/6] target/arm: Fix nregs computation in do_ld_zpa Richard Henderson
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 20+ messages in thread
From: Richard Henderson @ 2024-02-06  3:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, gustavo.romero

When MTE3 is supported, the kernel maps
  PR_MTE_TCF_ASYNC | PR_MTE_TCF_SYNC
to
  MTE_CTRL_TCF_ASYMM
and from there to
  SCTLR_EL1.TCF0 = 3

There is no error reported for setting ASYNC | SYNC when MTE3 is not
supported; the kernel simply selects the ASYNC behavior of TCF0=2.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/aarch64/target_prctl.h | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/linux-user/aarch64/target_prctl.h b/linux-user/aarch64/target_prctl.h
index 5067e7d731..49bd16aa95 100644
--- a/linux-user/aarch64/target_prctl.h
+++ b/linux-user/aarch64/target_prctl.h
@@ -173,21 +173,22 @@ static abi_long do_prctl_set_tagged_addr_ctrl(CPUArchState *env, abi_long arg2)
     env->tagged_addr_enable = arg2 & PR_TAGGED_ADDR_ENABLE;
 
     if (cpu_isar_feature(aa64_mte, cpu)) {
-        switch (arg2 & PR_MTE_TCF_MASK) {
-        case PR_MTE_TCF_NONE:
-        case PR_MTE_TCF_SYNC:
-        case PR_MTE_TCF_ASYNC:
-            break;
-        default:
-            return -EINVAL;
-        }
-
         /*
          * Write PR_MTE_TCF to SCTLR_EL1[TCF0].
-         * Note that the syscall values are consistent with hw.
+         * Note that SYNC | ASYNC -> ASYMM with FEAT_MTE3,
+         * otherwise mte_update_sctlr_user chooses ASYNC.
          */
-        env->cp15.sctlr_el[1] =
-            deposit64(env->cp15.sctlr_el[1], 38, 2, arg2 >> PR_MTE_TCF_SHIFT);
+        unsigned tcf = 0;
+        if (arg2 & PR_MTE_TCF_ASYNC) {
+            if ((arg2 & PR_MTE_TCF_SYNC) && cpu_isar_feature(aa64_mte3, cpu)) {
+                tcf = 3;
+            } else {
+                tcf = 2;
+            }
+        } else if (arg2 & PR_MTE_TCF_SYNC) {
+            tcf = 1;
+        }
+        env->cp15.sctlr_el[1] = deposit64(env->cp15.sctlr_el[1], 38, 2, tcf);
 
         /*
          * Write PR_MTE_TAG to GCR_EL1[Exclude].
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 2/6] target/arm: Fix nregs computation in do_ld_zpa
  2024-02-06  3:05 [PATCH v2 0/6] target/arm: assorted mte fixes Richard Henderson
  2024-02-06  3:05 ` [PATCH v2 1/6] linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3 Richard Henderson
@ 2024-02-06  3:05 ` Richard Henderson
  2024-02-06 14:46   ` Peter Maydell
  2024-02-06  3:05 ` [PATCH v2 3/6] target/arm: Adjust and validate mtedesc sizem1 Richard Henderson
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Richard Henderson @ 2024-02-06  3:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, gustavo.romero, qemu-stable

The field is encoded as [0-3], which is convenient for
indexing our array of function pointers, but the true
value is [1-4].  Adjust before calling do_mem_zpa.

Add an assert, and move the comment re passing ZT to
the helper back next to the relevant code.

Cc: qemu-stable@nongnu.org
Fixes: 206adacfb8d ("target/arm: Add mte helpers for sve scalar + int loads")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate-sve.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 296e7d1ce2..f50a426c98 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4445,11 +4445,7 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
     TCGv_ptr t_pg;
     int desc = 0;
 
-    /*
-     * For e.g. LD4, there are not enough arguments to pass all 4
-     * registers as pointers, so encode the regno into the data field.
-     * For consistency, do this even for LD1.
-     */
+    assert(mte_n >= 1 && mte_n <= 4);
     if (s->mte_active[0]) {
         int msz = dtype_msz(dtype);
 
@@ -4463,6 +4459,11 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
         addr = clean_data_tbi(s, addr);
     }
 
+    /*
+     * For e.g. LD4, there are not enough arguments to pass all 4
+     * registers as pointers, so encode the regno into the data field.
+     * For consistency, do this even for LD1.
+     */
     desc = simd_desc(vsz, vsz, zt | desc);
     t_pg = tcg_temp_new_ptr();
 
@@ -4600,7 +4601,7 @@ static void do_ld_zpa(DisasContext *s, int zt, int pg,
      * accessible via the instruction encoding.
      */
     assert(fn != NULL);
-    do_mem_zpa(s, zt, pg, addr, dtype, nreg, false, fn);
+    do_mem_zpa(s, zt, pg, addr, dtype, nreg + 1, false, fn);
 }
 
 static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 3/6] target/arm: Adjust and validate mtedesc sizem1
  2024-02-06  3:05 [PATCH v2 0/6] target/arm: assorted mte fixes Richard Henderson
  2024-02-06  3:05 ` [PATCH v2 1/6] linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3 Richard Henderson
  2024-02-06  3:05 ` [PATCH v2 2/6] target/arm: Fix nregs computation in do_ld_zpa Richard Henderson
@ 2024-02-06  3:05 ` Richard Henderson
  2024-02-06 14:49   ` Peter Maydell
  2024-02-06  3:05 ` [PATCH v2 4/6] target/arm: Split out make_svemte_desc Richard Henderson
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Richard Henderson @ 2024-02-06  3:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, gustavo.romero

When we added SVE_MTEDESC_SHIFT, we effectively limited the
maximum size of MTEDESC.  Adjust SIZEM1 to consume the remaining
bits (32 - 10 - 5 - 12 == 5).  Assert that the data to be stored
fits within the field (expecting 8 * 4 - 1 == 31, exact fit).

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/internals.h         | 2 +-
 target/arm/tcg/translate-sve.c | 7 ++++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index fc337fe40e..50bff44549 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1278,7 +1278,7 @@ FIELD(MTEDESC, TBI,   4, 2)
 FIELD(MTEDESC, TCMA,  6, 2)
 FIELD(MTEDESC, WRITE, 8, 1)
 FIELD(MTEDESC, ALIGN, 9, 3)
-FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - 12)  /* size - 1 */
+FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - SVE_MTEDESC_SHIFT - 12)  /* size - 1 */
 
 bool mte_probe(CPUARMState *env, uint32_t desc, uint64_t ptr);
 uint64_t mte_check(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra);
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index f50a426c98..2628ac2840 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4443,17 +4443,18 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
 {
     unsigned vsz = vec_full_reg_size(s);
     TCGv_ptr t_pg;
+    uint32_t sizem1;
     int desc = 0;
 
     assert(mte_n >= 1 && mte_n <= 4);
+    sizem1 = (mte_n << dtype_msz(dtype)) - 1;
+    assert(sizem1 <= R_MTEDESC_SIZEM1_MASK >> R_MTEDESC_SIZEM1_SHIFT);
     if (s->mte_active[0]) {
-        int msz = dtype_msz(dtype);
-
         desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
         desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
         desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
         desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
-        desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (mte_n << msz) - 1);
+        desc = FIELD_DP32(desc, MTEDESC, SIZEM1, sizem1);
         desc <<= SVE_MTEDESC_SHIFT;
     } else {
         addr = clean_data_tbi(s, addr);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 4/6] target/arm: Split out make_svemte_desc
  2024-02-06  3:05 [PATCH v2 0/6] target/arm: assorted mte fixes Richard Henderson
                   ` (2 preceding siblings ...)
  2024-02-06  3:05 ` [PATCH v2 3/6] target/arm: Adjust and validate mtedesc sizem1 Richard Henderson
@ 2024-02-06  3:05 ` Richard Henderson
  2024-02-06 14:52   ` Peter Maydell
  2024-02-06  3:05 ` [PATCH v2 5/6] target/arm: Handle mte in do_ldrq, do_ldro Richard Henderson
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Richard Henderson @ 2024-02-06  3:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, gustavo.romero

Share code that creates mtedesc and embeds within simd_desc.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate-a64.h |  2 ++
 target/arm/tcg/translate-sme.c | 15 +++--------
 target/arm/tcg/translate-sve.c | 47 ++++++++++++++++++----------------
 3 files changed, 31 insertions(+), 33 deletions(-)

diff --git a/target/arm/tcg/translate-a64.h b/target/arm/tcg/translate-a64.h
index 96ba39b37e..7b811b8ac5 100644
--- a/target/arm/tcg/translate-a64.h
+++ b/target/arm/tcg/translate-a64.h
@@ -28,6 +28,8 @@ bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
 bool sve_access_check(DisasContext *s);
 bool sme_enabled_check(DisasContext *s);
 bool sme_enabled_check_with_svcr(DisasContext *s, unsigned);
+uint32_t make_svemte_desc(DisasContext *s, unsigned vsz, uint32_t nregs,
+                          uint32_t msz, bool is_write, uint32_t data);
 
 /* This function corresponds to CheckStreamingSVEEnabled. */
 static inline bool sme_sm_enabled_check(DisasContext *s)
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 8f0dfc884e..46c7fce8b4 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -206,7 +206,7 @@ static bool trans_LDST1(DisasContext *s, arg_LDST1 *a)
 
     TCGv_ptr t_za, t_pg;
     TCGv_i64 addr;
-    int svl, desc = 0;
+    uint32_t desc;
     bool be = s->be_data == MO_BE;
     bool mte = s->mte_active[0];
 
@@ -224,18 +224,11 @@ static bool trans_LDST1(DisasContext *s, arg_LDST1 *a)
     tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), a->esz);
     tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
 
-    if (mte) {
-        desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
-        desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
-        desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
-        desc = FIELD_DP32(desc, MTEDESC, WRITE, a->st);
-        desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (1 << a->esz) - 1);
-        desc <<= SVE_MTEDESC_SHIFT;
-    } else {
+    if (!mte) {
         addr = clean_data_tbi(s, addr);
     }
-    svl = streaming_vec_reg_size(s);
-    desc = simd_desc(svl, svl, desc);
+
+    desc = make_svemte_desc(s, streaming_vec_reg_size(s), 1, a->esz, a->st, 0);
 
     fns[a->esz][be][a->v][mte][a->st](tcg_env, t_za, t_pg, addr,
                                       tcg_constant_i32(desc));
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 2628ac2840..8868aae5ac 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4437,18 +4437,18 @@ static const uint8_t dtype_esz[16] = {
     3, 2, 1, 3
 };
 
-static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
-                       int dtype, uint32_t mte_n, bool is_write,
-                       gen_helper_gvec_mem *fn)
+uint32_t make_svemte_desc(DisasContext *s, unsigned vsz, uint32_t nregs,
+                          uint32_t msz, bool is_write, uint32_t data)
 {
-    unsigned vsz = vec_full_reg_size(s);
-    TCGv_ptr t_pg;
     uint32_t sizem1;
-    int desc = 0;
+    uint32_t desc = 0;
 
-    assert(mte_n >= 1 && mte_n <= 4);
-    sizem1 = (mte_n << dtype_msz(dtype)) - 1;
+    /* Assert all of the data fits, with or without MTE enabled. */
+    assert(nregs >= 1 && nregs <= 4);
+    sizem1 = (nregs << msz) - 1;
     assert(sizem1 <= R_MTEDESC_SIZEM1_MASK >> R_MTEDESC_SIZEM1_SHIFT);
+    assert(data < 1u << SVE_MTEDESC_SHIFT);
+
     if (s->mte_active[0]) {
         desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
         desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
@@ -4456,7 +4456,18 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
         desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
         desc = FIELD_DP32(desc, MTEDESC, SIZEM1, sizem1);
         desc <<= SVE_MTEDESC_SHIFT;
-    } else {
+    }
+    return simd_desc(vsz, vsz, desc | data);
+}
+
+static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
+                       int dtype, uint32_t nregs, bool is_write,
+                       gen_helper_gvec_mem *fn)
+{
+    TCGv_ptr t_pg;
+    uint32_t desc;
+
+    if (!s->mte_active[0]) {
         addr = clean_data_tbi(s, addr);
     }
 
@@ -4465,7 +4476,8 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
      * registers as pointers, so encode the regno into the data field.
      * For consistency, do this even for LD1.
      */
-    desc = simd_desc(vsz, vsz, zt | desc);
+    desc = make_svemte_desc(s, vec_full_reg_size(s), nregs,
+                            dtype_msz(dtype), is_write, zt);
     t_pg = tcg_temp_new_ptr();
 
     tcg_gen_addi_ptr(t_pg, tcg_env, pred_full_reg_offset(s, pg));
@@ -5225,25 +5237,16 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm,
                        int scale, TCGv_i64 scalar, int msz, bool is_write,
                        gen_helper_gvec_mem_scatter *fn)
 {
-    unsigned vsz = vec_full_reg_size(s);
     TCGv_ptr t_zm = tcg_temp_new_ptr();
     TCGv_ptr t_pg = tcg_temp_new_ptr();
     TCGv_ptr t_zt = tcg_temp_new_ptr();
-    int desc = 0;
-
-    if (s->mte_active[0]) {
-        desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
-        desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
-        desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
-        desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
-        desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (1 << msz) - 1);
-        desc <<= SVE_MTEDESC_SHIFT;
-    }
-    desc = simd_desc(vsz, vsz, desc | scale);
+    uint32_t desc;
 
     tcg_gen_addi_ptr(t_pg, tcg_env, pred_full_reg_offset(s, pg));
     tcg_gen_addi_ptr(t_zm, tcg_env, vec_full_reg_offset(s, zm));
     tcg_gen_addi_ptr(t_zt, tcg_env, vec_full_reg_offset(s, zt));
+
+    desc = make_svemte_desc(s, vec_full_reg_size(s), 1, msz, is_write, scale);
     fn(tcg_env, t_zt, t_pg, t_zm, scalar, tcg_constant_i32(desc));
 }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 5/6] target/arm: Handle mte in do_ldrq, do_ldro
  2024-02-06  3:05 [PATCH v2 0/6] target/arm: assorted mte fixes Richard Henderson
                   ` (3 preceding siblings ...)
  2024-02-06  3:05 ` [PATCH v2 4/6] target/arm: Split out make_svemte_desc Richard Henderson
@ 2024-02-06  3:05 ` Richard Henderson
  2024-02-06 14:53   ` Peter Maydell
  2024-02-06  3:05 ` [PATCH v2 6/6] target/arm: Fix SVE/SME gross MTE suppression checks Richard Henderson
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Richard Henderson @ 2024-02-06  3:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, gustavo.romero

These functions "use the standard load helpers", but
fail to clean_data_tbi or populate mtedesc.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate-sve.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 8868aae5ac..fb3bcb47f6 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4861,8 +4861,13 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
     unsigned vsz = vec_full_reg_size(s);
     TCGv_ptr t_pg;
     int poff;
+    uint32_t desc;
 
     /* Load the first quadword using the normal predicated load helpers.  */
+    if (!s->mte_active[0]) {
+        addr = clean_data_tbi(s, addr);
+    }
+
     poff = pred_full_reg_offset(s, pg);
     if (vsz > 16) {
         /*
@@ -4886,7 +4891,8 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
 
     gen_helper_gvec_mem *fn
         = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0];
-    fn(tcg_env, t_pg, addr, tcg_constant_i32(simd_desc(16, 16, zt)));
+    desc = make_svemte_desc(s, 16, 1, dtype_msz(dtype), false, zt);
+    fn(tcg_env, t_pg, addr, tcg_constant_i32(desc));
 
     /* Replicate that first quadword.  */
     if (vsz > 16) {
@@ -4929,6 +4935,7 @@ static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
     unsigned vsz_r32;
     TCGv_ptr t_pg;
     int poff, doff;
+    uint32_t desc;
 
     if (vsz < 32) {
         /*
@@ -4941,6 +4948,9 @@ static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
     }
 
     /* Load the first octaword using the normal predicated load helpers.  */
+    if (!s->mte_active[0]) {
+        addr = clean_data_tbi(s, addr);
+    }
 
     poff = pred_full_reg_offset(s, pg);
     if (vsz > 32) {
@@ -4965,7 +4975,8 @@ static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
 
     gen_helper_gvec_mem *fn
         = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0];
-    fn(tcg_env, t_pg, addr, tcg_constant_i32(simd_desc(32, 32, zt)));
+    desc = make_svemte_desc(s, 32, 1, dtype_msz(dtype), false, zt);
+    fn(tcg_env, t_pg, addr, tcg_constant_i32(desc));
 
     /*
      * Replicate that first octaword.
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 6/6] target/arm: Fix SVE/SME gross MTE suppression checks
  2024-02-06  3:05 [PATCH v2 0/6] target/arm: assorted mte fixes Richard Henderson
                   ` (4 preceding siblings ...)
  2024-02-06  3:05 ` [PATCH v2 5/6] target/arm: Handle mte in do_ldrq, do_ldro Richard Henderson
@ 2024-02-06  3:05 ` Richard Henderson
  2024-02-06 14:54   ` Peter Maydell
  2024-02-06 14:54 ` [PATCH v2 0/6] target/arm: assorted mte fixes Peter Maydell
  2024-02-06 15:17 ` Gustavo Romero
  7 siblings, 1 reply; 20+ messages in thread
From: Richard Henderson @ 2024-02-06  3:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, gustavo.romero

The TBI and TCMA bits are located within mtedesc, not desc.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/sme_helper.c |  8 ++++----
 target/arm/tcg/sve_helper.c | 12 ++++++------
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
index 1ee2690ceb..904bfdac43 100644
--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -573,8 +573,8 @@ void sme_ld1_mte(CPUARMState *env, void *za, uint64_t *vg,
     desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
     /* Perform gross MTE suppression early. */
-    if (!tbi_check(desc, bit55) ||
-        tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+    if (!tbi_check(mtedesc, bit55) ||
+        tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
         mtedesc = 0;
     }
 
@@ -750,8 +750,8 @@ void sme_st1_mte(CPUARMState *env, void *za, uint64_t *vg, target_ulong addr,
     desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
     /* Perform gross MTE suppression early. */
-    if (!tbi_check(desc, bit55) ||
-        tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+    if (!tbi_check(mtedesc, bit55) ||
+        tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
         mtedesc = 0;
     }
 
diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index bce4295d28..6853f58c19 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -5800,8 +5800,8 @@ void sve_ldN_r_mte(CPUARMState *env, uint64_t *vg, target_ulong addr,
     desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
     /* Perform gross MTE suppression early. */
-    if (!tbi_check(desc, bit55) ||
-        tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+    if (!tbi_check(mtedesc, bit55) ||
+        tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
         mtedesc = 0;
     }
 
@@ -6156,8 +6156,8 @@ void sve_ldnfff1_r_mte(CPUARMState *env, void *vg, target_ulong addr,
     desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
     /* Perform gross MTE suppression early. */
-    if (!tbi_check(desc, bit55) ||
-        tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+    if (!tbi_check(mtedesc, bit55) ||
+        tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
         mtedesc = 0;
     }
 
@@ -6410,8 +6410,8 @@ void sve_stN_r_mte(CPUARMState *env, uint64_t *vg, target_ulong addr,
     desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
 
     /* Perform gross MTE suppression early. */
-    if (!tbi_check(desc, bit55) ||
-        tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+    if (!tbi_check(mtedesc, bit55) ||
+        tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
         mtedesc = 0;
     }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/6] linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3
  2024-02-06  3:05 ` [PATCH v2 1/6] linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3 Richard Henderson
@ 2024-02-06 14:23   ` Peter Maydell
  2024-02-07  0:31     ` Richard Henderson
  2024-02-07  2:10     ` Richard Henderson
  2024-02-06 15:18   ` Gustavo Romero
  1 sibling, 2 replies; 20+ messages in thread
From: Peter Maydell @ 2024-02-06 14:23 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, gustavo.romero

On Tue, 6 Feb 2024 at 03:06, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> When MTE3 is supported, the kernel maps
>   PR_MTE_TCF_ASYNC | PR_MTE_TCF_SYNC
> to
>   MTE_CTRL_TCF_ASYMM
> and from there to
>   SCTLR_EL1.TCF0 = 3

This depends on the setting of
/sys/devices/system/cpu/cpu<N>/mte_tcf_preferred :
I think you only get asymm here if the sysadmin has set
mte_tcf_preferred to 'asymm'; the default is 'async'.

https://www.kernel.org/doc/Documentation/arch/arm64/memory-tagging-extension.rst
documents the intended semantics of the prctl, though it does have
one error (the default-order is asymm; async; sync, not async; asymm; sync).

> There is no error reported for setting ASYNC | SYNC when MTE3 is not
> supported; the kernel simply selects the ASYNC behavior of TCF0=2.

For QEMU's implementation, are there any particular
performance differences between sync, async and asymm ?
That might determine what we effectively consider to be the
mte_tcf_preferred setting for our user-mode CPUs.

> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  linux-user/aarch64/target_prctl.h | 25 +++++++++++++------------
>  1 file changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/linux-user/aarch64/target_prctl.h b/linux-user/aarch64/target_prctl.h
> index 5067e7d731..49bd16aa95 100644
> --- a/linux-user/aarch64/target_prctl.h
> +++ b/linux-user/aarch64/target_prctl.h
> @@ -173,21 +173,22 @@ static abi_long do_prctl_set_tagged_addr_ctrl(CPUArchState *env, abi_long arg2)
>      env->tagged_addr_enable = arg2 & PR_TAGGED_ADDR_ENABLE;
>
>      if (cpu_isar_feature(aa64_mte, cpu)) {
> -        switch (arg2 & PR_MTE_TCF_MASK) {
> -        case PR_MTE_TCF_NONE:
> -        case PR_MTE_TCF_SYNC:
> -        case PR_MTE_TCF_ASYNC:
> -            break;
> -        default:
> -            return -EINVAL;
> -        }

We should probably check here and reject unknown bits being
set in arg2, as set_tagged_addr_ctrl() does; but the old
code didn't get that right either.

> -
>          /*
>           * Write PR_MTE_TCF to SCTLR_EL1[TCF0].
> -         * Note that the syscall values are consistent with hw.
> +         * Note that SYNC | ASYNC -> ASYMM with FEAT_MTE3,
> +         * otherwise mte_update_sctlr_user chooses ASYNC.
>           */
> -        env->cp15.sctlr_el[1] =
> -            deposit64(env->cp15.sctlr_el[1], 38, 2, arg2 >> PR_MTE_TCF_SHIFT);
> +        unsigned tcf = 0;
> +        if (arg2 & PR_MTE_TCF_ASYNC) {
> +            if ((arg2 & PR_MTE_TCF_SYNC) && cpu_isar_feature(aa64_mte3, cpu)) {
> +                tcf = 3;
> +            } else {
> +                tcf = 2;
> +            }
> +        } else if (arg2 & PR_MTE_TCF_SYNC) {
> +            tcf = 1;
> +        }
> +        env->cp15.sctlr_el[1] = deposit64(env->cp15.sctlr_el[1], 38, 2, tcf);
>
>          /*
>           * Write PR_MTE_TAG to GCR_EL1[Exclude].
> --
> 2.34.1

thanks
-- PMM


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/6] target/arm: Fix nregs computation in do_ld_zpa
  2024-02-06  3:05 ` [PATCH v2 2/6] target/arm: Fix nregs computation in do_ld_zpa Richard Henderson
@ 2024-02-06 14:46   ` Peter Maydell
  2024-02-07  0:42     ` Richard Henderson
  0 siblings, 1 reply; 20+ messages in thread
From: Peter Maydell @ 2024-02-06 14:46 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, gustavo.romero, qemu-stable

On Tue, 6 Feb 2024 at 03:06, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The field is encoded as [0-3], which is convenient for
> indexing our array of function pointers, but the true
> value is [1-4].  Adjust before calling do_mem_zpa.
>
> Add an assert, and move the comment re passing ZT to
> the helper back next to the relevant code.
>
> Cc: qemu-stable@nongnu.org
> Fixes: 206adacfb8d ("target/arm: Add mte helpers for sve scalar + int loads")
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/tcg/translate-sve.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
> index 296e7d1ce2..f50a426c98 100644
> --- a/target/arm/tcg/translate-sve.c
> +++ b/target/arm/tcg/translate-sve.c
> @@ -4445,11 +4445,7 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
>      TCGv_ptr t_pg;
>      int desc = 0;
>
> -    /*
> -     * For e.g. LD4, there are not enough arguments to pass all 4
> -     * registers as pointers, so encode the regno into the data field.
> -     * For consistency, do this even for LD1.
> -     */
> +    assert(mte_n >= 1 && mte_n <= 4);
>      if (s->mte_active[0]) {
>          int msz = dtype_msz(dtype);
>
> @@ -4463,6 +4459,11 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
>          addr = clean_data_tbi(s, addr);
>      }
>
> +    /*
> +     * For e.g. LD4, there are not enough arguments to pass all 4
> +     * registers as pointers, so encode the regno into the data field.
> +     * For consistency, do this even for LD1.
> +     */
>      desc = simd_desc(vsz, vsz, zt | desc);
>      t_pg = tcg_temp_new_ptr();
>
> @@ -4600,7 +4601,7 @@ static void do_ld_zpa(DisasContext *s, int zt, int pg,
>       * accessible via the instruction encoding.
>       */
>      assert(fn != NULL);
> -    do_mem_zpa(s, zt, pg, addr, dtype, nreg, false, fn);
> +    do_mem_zpa(s, zt, pg, addr, dtype, nreg + 1, false, fn);
>  }
>
>  static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a)

What about do_st_zpa() ? It's not obvious what the 'nreg'
encoding is in the a->nreg field in arg_rprr_store, but
it's definitely confusing that do_st_zpa() calls
do_mem_zpa() passing "nreg" whereas do_ld_zpa() now
passes it "nreg + 1". Can we make it so the handling
in these two functions lines up?

thanks
-- PMM


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 3/6] target/arm: Adjust and validate mtedesc sizem1
  2024-02-06  3:05 ` [PATCH v2 3/6] target/arm: Adjust and validate mtedesc sizem1 Richard Henderson
@ 2024-02-06 14:49   ` Peter Maydell
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Maydell @ 2024-02-06 14:49 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, gustavo.romero

On Tue, 6 Feb 2024 at 03:07, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> When we added SVE_MTEDESC_SHIFT, we effectively limited the
> maximum size of MTEDESC.  Adjust SIZEM1 to consume the remaining
> bits (32 - 10 - 5 - 12 == 5).  Assert that the data to be stored
> fits within the field (expecting 8 * 4 - 1 == 31, exact fit).
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 4/6] target/arm: Split out make_svemte_desc
  2024-02-06  3:05 ` [PATCH v2 4/6] target/arm: Split out make_svemte_desc Richard Henderson
@ 2024-02-06 14:52   ` Peter Maydell
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Maydell @ 2024-02-06 14:52 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, gustavo.romero

On Tue, 6 Feb 2024 at 03:06, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Share code that creates mtedesc and embeds within simd_desc.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/tcg/translate-a64.h |  2 ++
>  target/arm/tcg/translate-sme.c | 15 +++--------
>  target/arm/tcg/translate-sve.c | 47 ++++++++++++++++++----------------
>  3 files changed, 31 insertions(+), 33 deletions(-)
>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 5/6] target/arm: Handle mte in do_ldrq, do_ldro
  2024-02-06  3:05 ` [PATCH v2 5/6] target/arm: Handle mte in do_ldrq, do_ldro Richard Henderson
@ 2024-02-06 14:53   ` Peter Maydell
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Maydell @ 2024-02-06 14:53 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, gustavo.romero

On Tue, 6 Feb 2024 at 03:06, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> These functions "use the standard load helpers", but
> fail to clean_data_tbi or populate mtedesc.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 6/6] target/arm: Fix SVE/SME gross MTE suppression checks
  2024-02-06  3:05 ` [PATCH v2 6/6] target/arm: Fix SVE/SME gross MTE suppression checks Richard Henderson
@ 2024-02-06 14:54   ` Peter Maydell
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Maydell @ 2024-02-06 14:54 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, gustavo.romero

On Tue, 6 Feb 2024 at 03:06, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The TBI and TCMA bits are located within mtedesc, not desc.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/6] target/arm: assorted mte fixes
  2024-02-06  3:05 [PATCH v2 0/6] target/arm: assorted mte fixes Richard Henderson
                   ` (5 preceding siblings ...)
  2024-02-06  3:05 ` [PATCH v2 6/6] target/arm: Fix SVE/SME gross MTE suppression checks Richard Henderson
@ 2024-02-06 14:54 ` Peter Maydell
  2024-02-06 20:10   ` Richard Henderson
  2024-02-06 15:17 ` Gustavo Romero
  7 siblings, 1 reply; 20+ messages in thread
From: Peter Maydell @ 2024-02-06 14:54 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm, gustavo.romero

On Tue, 6 Feb 2024 at 03:07, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The first patch is unchanged from
>
> Supercedes: <20240131003557.176486-1-richard.henderson@linaro.org>
>
> while the remaining patches replace
>
> Supercedes: <20240205023948.25476-1-richard.henderson@linaro.org>
>
> While digging through Gustavo's test case, wondering why it
> should be failing at all, I finally noticed that we weren't
> overflowing MTEDESC.SIZEM1, but underflowing (-1).  Oops.
>
> But I did find a few other points by inspection where we
> weren't properly handling or supplying MTEDESC.
>

Should some or all of this patchset be cc: qemu-stable ?

thanks
-- PMM


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/6] target/arm: assorted mte fixes
  2024-02-06  3:05 [PATCH v2 0/6] target/arm: assorted mte fixes Richard Henderson
                   ` (6 preceding siblings ...)
  2024-02-06 14:54 ` [PATCH v2 0/6] target/arm: assorted mte fixes Peter Maydell
@ 2024-02-06 15:17 ` Gustavo Romero
  7 siblings, 0 replies; 20+ messages in thread
From: Gustavo Romero @ 2024-02-06 15:17 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: qemu-arm

Hi Richard,

On 2/6/24 12:05 AM, Richard Henderson wrote:
> The first patch is unchanged from
> 
> Supercedes: <20240131003557.176486-1-richard.henderson@linaro.org>
> 
> while the remaining patches replace
> 
> Supercedes: <20240205023948.25476-1-richard.henderson@linaro.org>
> 
> While digging through Gustavo's test case, wondering why it
> should be failing at all, I finally noticed that we weren't
> overflowing MTEDESC.SIZEM1, but underflowing (-1).  Oops.
> 
> But I did find a few other points by inspection where we
> weren't properly handling or supplying MTEDESC.
> 
> 
> r~
> 
> 
> Richard Henderson (6):
>    linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3
>    target/arm: Fix nregs computation in do_ld_zpa
>    target/arm: Adjust and validate mtedesc sizem1
>    target/arm: Split out make_svemte_desc
>    target/arm: Handle mte in do_ldrq, do_ldro
>    target/arm: Fix SVE/SME gross MTE suppression checks
> 
>   linux-user/aarch64/target_prctl.h | 25 +++++-----
>   target/arm/internals.h            |  2 +-
>   target/arm/tcg/translate-a64.h    |  2 +
>   target/arm/tcg/sme_helper.c       |  8 ++--
>   target/arm/tcg/sve_helper.c       | 12 ++---
>   target/arm/tcg/translate-sme.c    | 15 ++----
>   target/arm/tcg/translate-sve.c    | 80 ++++++++++++++++++-------------
>   7 files changed, 78 insertions(+), 66 deletions(-)
> 

Tested-by: Gustavo Romero <gustavo.romero@linaro.org>


Thanks!


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/6] linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3
  2024-02-06  3:05 ` [PATCH v2 1/6] linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3 Richard Henderson
  2024-02-06 14:23   ` Peter Maydell
@ 2024-02-06 15:18   ` Gustavo Romero
  1 sibling, 0 replies; 20+ messages in thread
From: Gustavo Romero @ 2024-02-06 15:18 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: qemu-arm

On 2/6/24 12:05 AM, Richard Henderson wrote:
> When MTE3 is supported, the kernel maps
>    PR_MTE_TCF_ASYNC | PR_MTE_TCF_SYNC
> to
>    MTE_CTRL_TCF_ASYMM
> and from there to
>    SCTLR_EL1.TCF0 = 3
> 
> There is no error reported for setting ASYNC | SYNC when MTE3 is not
> supported; the kernel simply selects the ASYNC behavior of TCF0=2.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   linux-user/aarch64/target_prctl.h | 25 +++++++++++++------------
>   1 file changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/linux-user/aarch64/target_prctl.h b/linux-user/aarch64/target_prctl.h
> index 5067e7d731..49bd16aa95 100644
> --- a/linux-user/aarch64/target_prctl.h
> +++ b/linux-user/aarch64/target_prctl.h
> @@ -173,21 +173,22 @@ static abi_long do_prctl_set_tagged_addr_ctrl(CPUArchState *env, abi_long arg2)
>       env->tagged_addr_enable = arg2 & PR_TAGGED_ADDR_ENABLE;
>   
>       if (cpu_isar_feature(aa64_mte, cpu)) {
> -        switch (arg2 & PR_MTE_TCF_MASK) {
> -        case PR_MTE_TCF_NONE:
> -        case PR_MTE_TCF_SYNC:
> -        case PR_MTE_TCF_ASYNC:
> -            break;
> -        default:
> -            return -EINVAL;
> -        }
> -
>           /*
>            * Write PR_MTE_TCF to SCTLR_EL1[TCF0].
> -         * Note that the syscall values are consistent with hw.
> +         * Note that SYNC | ASYNC -> ASYMM with FEAT_MTE3,
> +         * otherwise mte_update_sctlr_user chooses ASYNC.
>            */
> -        env->cp15.sctlr_el[1] =
> -            deposit64(env->cp15.sctlr_el[1], 38, 2, arg2 >> PR_MTE_TCF_SHIFT);
> +        unsigned tcf = 0;
> +        if (arg2 & PR_MTE_TCF_ASYNC) {
> +            if ((arg2 & PR_MTE_TCF_SYNC) && cpu_isar_feature(aa64_mte3, cpu)) {
> +                tcf = 3;
> +            } else {
> +                tcf = 2;
> +            }
> +        } else if (arg2 & PR_MTE_TCF_SYNC) {
> +            tcf = 1;
> +        }
> +        env->cp15.sctlr_el[1] = deposit64(env->cp15.sctlr_el[1], 38, 2, tcf);
>   
>           /*
>            * Write PR_MTE_TAG to GCR_EL1[Exclude].
> 

Reviewed-by: Gustavo Romero <gustavo.romero@linaro.org>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/6] target/arm: assorted mte fixes
  2024-02-06 14:54 ` [PATCH v2 0/6] target/arm: assorted mte fixes Peter Maydell
@ 2024-02-06 20:10   ` Richard Henderson
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Henderson @ 2024-02-06 20:10 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-arm, gustavo.romero

On 2/7/24 00:54, Peter Maydell wrote:
> On Tue, 6 Feb 2024 at 03:07, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> The first patch is unchanged from
>>
>> Supercedes: <20240131003557.176486-1-richard.henderson@linaro.org>
>>
>> while the remaining patches replace
>>
>> Supercedes: <20240205023948.25476-1-richard.henderson@linaro.org>
>>
>> While digging through Gustavo's test case, wondering why it
>> should be failing at all, I finally noticed that we weren't
>> overflowing MTEDESC.SIZEM1, but underflowing (-1).  Oops.
>>
>> But I did find a few other points by inspection where we
>> weren't properly handling or supplying MTEDESC.
>>
> 
> Should some or all of this patchset be cc: qemu-stable ?

All of it, I expect.


r~



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/6] linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3
  2024-02-06 14:23   ` Peter Maydell
@ 2024-02-07  0:31     ` Richard Henderson
  2024-02-07  2:10     ` Richard Henderson
  1 sibling, 0 replies; 20+ messages in thread
From: Richard Henderson @ 2024-02-07  0:31 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-arm, gustavo.romero

On 2/7/24 00:23, Peter Maydell wrote:
> On Tue, 6 Feb 2024 at 03:06, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> When MTE3 is supported, the kernel maps
>>    PR_MTE_TCF_ASYNC | PR_MTE_TCF_SYNC
>> to
>>    MTE_CTRL_TCF_ASYMM
>> and from there to
>>    SCTLR_EL1.TCF0 = 3
> 
> This depends on the setting of
> /sys/devices/system/cpu/cpu<N>/mte_tcf_preferred :
> I think you only get asymm here if the sysadmin has set
> mte_tcf_preferred to 'asymm'; the default is 'async'.

Hmm, I missed that somewhere in the rat's nest.
I suspect this is over-engineered, such that no one will understand how to use it.

> For QEMU's implementation, are there any particular
> performance differences between sync, async and asymm ?

I doubt it.  Getting to the error path at all is the bulk of the work.

I think "performance" in this case would be highly test-case-centric.
Does the test "perform better" with async, which would allow the entire vector operation 
to finish in one go?

I suspect that for debugging purposes, sync is always preferred.
That might be the best setting for qemu.


r~


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/6] target/arm: Fix nregs computation in do_ld_zpa
  2024-02-06 14:46   ` Peter Maydell
@ 2024-02-07  0:42     ` Richard Henderson
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Henderson @ 2024-02-07  0:42 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-arm, gustavo.romero, qemu-stable

On 2/7/24 00:46, Peter Maydell wrote:
>> @@ -4600,7 +4601,7 @@ static void do_ld_zpa(DisasContext *s, int zt, int pg,
>>        * accessible via the instruction encoding.
>>        */
>>       assert(fn != NULL);
>> -    do_mem_zpa(s, zt, pg, addr, dtype, nreg, false, fn);
>> +    do_mem_zpa(s, zt, pg, addr, dtype, nreg + 1, false, fn);
>>   }
>>
>>   static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a)
> 
> What about do_st_zpa() ? It's not obvious what the 'nreg'
> encoding is in the a->nreg field in arg_rprr_store, but
> it's definitely confusing that do_st_zpa() calls
> do_mem_zpa() passing "nreg" whereas do_ld_zpa() now
> passes it "nreg + 1". Can we make it so the handling
> in these two functions lines up?

Yes, I think there may be a bug in store as well.
Comparing the two is complicated by the cut outs for LDFF1, LDNF1, LD1R and PRF.


r~


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/6] linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3
  2024-02-06 14:23   ` Peter Maydell
  2024-02-07  0:31     ` Richard Henderson
@ 2024-02-07  2:10     ` Richard Henderson
  1 sibling, 0 replies; 20+ messages in thread
From: Richard Henderson @ 2024-02-07  2:10 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-arm, gustavo.romero

On 2/7/24 00:23, Peter Maydell wrote:
>> +++ b/linux-user/aarch64/target_prctl.h
>> @@ -173,21 +173,22 @@ static abi_long do_prctl_set_tagged_addr_ctrl(CPUArchState *env, abi_long arg2)
>>       env->tagged_addr_enable = arg2 & PR_TAGGED_ADDR_ENABLE;
>>
>>       if (cpu_isar_feature(aa64_mte, cpu)) {
>> -        switch (arg2 & PR_MTE_TCF_MASK) {
>> -        case PR_MTE_TCF_NONE:
>> -        case PR_MTE_TCF_SYNC:
>> -        case PR_MTE_TCF_ASYNC:
>> -            break;
>> -        default:
>> -            return -EINVAL;
>> -        }
> 
> We should probably check here and reject unknown bits being
> set in arg2, as set_tagged_addr_ctrl() does; but the old
> code didn't get that right either.

This is done higher up in this function:

     if (arg2 & ~valid_mask) {
         return -TARGET_EINVAL;
     }

The rejection of ASYNC | SYNC here was either a bug in my original implementation, or the 
kernel API changed since the initial implementation in June 2020 (not worth digging to 
find out).


r~



^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2024-02-07  2:11 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-06  3:05 [PATCH v2 0/6] target/arm: assorted mte fixes Richard Henderson
2024-02-06  3:05 ` [PATCH v2 1/6] linux-user/aarch64: Extend PR_SET_TAGGED_ADDR_CTRL for FEAT_MTE3 Richard Henderson
2024-02-06 14:23   ` Peter Maydell
2024-02-07  0:31     ` Richard Henderson
2024-02-07  2:10     ` Richard Henderson
2024-02-06 15:18   ` Gustavo Romero
2024-02-06  3:05 ` [PATCH v2 2/6] target/arm: Fix nregs computation in do_ld_zpa Richard Henderson
2024-02-06 14:46   ` Peter Maydell
2024-02-07  0:42     ` Richard Henderson
2024-02-06  3:05 ` [PATCH v2 3/6] target/arm: Adjust and validate mtedesc sizem1 Richard Henderson
2024-02-06 14:49   ` Peter Maydell
2024-02-06  3:05 ` [PATCH v2 4/6] target/arm: Split out make_svemte_desc Richard Henderson
2024-02-06 14:52   ` Peter Maydell
2024-02-06  3:05 ` [PATCH v2 5/6] target/arm: Handle mte in do_ldrq, do_ldro Richard Henderson
2024-02-06 14:53   ` Peter Maydell
2024-02-06  3:05 ` [PATCH v2 6/6] target/arm: Fix SVE/SME gross MTE suppression checks Richard Henderson
2024-02-06 14:54   ` Peter Maydell
2024-02-06 14:54 ` [PATCH v2 0/6] target/arm: assorted mte fixes Peter Maydell
2024-02-06 20:10   ` Richard Henderson
2024-02-06 15:17 ` Gustavo Romero

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).