qemu-devel.nongnu.org archive mirror
* [PATCH v9 00/10] riscv: set vstart_eq_zero on mark_vs_dirty
@ 2024-03-09 20:43 Daniel Henrique Barboza
  2024-03-09 20:43 ` [PATCH v9 01/10] target/riscv/vector_helper.c: set vstart = 0 in GEN_VEXT_VSLIDEUP_VX() Daniel Henrique Barboza
                   ` (9 more replies)
  0 siblings, 10 replies; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-09 20:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, richard.henderson, philmd, Daniel Henrique Barboza

Hi,

This new version has changes suggested by Richard in patches 2 and 6.
Other patches are unchanged.

Series based on alistair/riscv-to-apply.next.

Patches missing review: 2, 3, 4, 5, 6.

Changes from v8:
- patch 2:
  - do an early exit in vext_ldst_stride(), vext_ldst_us() and
    vext_ldst_index() if vstart >= vl
- patch 6:
  - vec_set_vstart_zero() removed
  - set cpu_vstart directly using tcg_gen_movi_tl()
- v8 link: https://lore.kernel.org/qemu-riscv/20240308215402.117405-1-dbarboza@ventanamicro.com/


Daniel Henrique Barboza (9):
  target/riscv/vector_helper.c: set vstart = 0 in GEN_VEXT_VSLIDEUP_VX()
  target/riscv: handle vstart >= vl in vext_set_tail_elems_1s()
  target/riscv/vector_helper.c: do vstart=0 after updating tail
  target/riscv/vector_helper.c: update tail with
    vext_set_tail_elems_1s()
  target/riscv: use vext_set_tail_elems_1s() in vcrypto insns
  trans_rvv.c.inc: set vstart = 0 in int scalar move insns
  target/riscv: remove 'over' brconds from vector trans
  trans_rvv.c.inc: remove redundant mark_vs_dirty() calls
  target/riscv/vector_helper.c: optimize loops in ldst helpers

Ivan Klokov (1):
  target/riscv: Clear vstart_qe_zero flag

 target/riscv/insn_trans/trans_rvbf16.c.inc |  18 +-
 target/riscv/insn_trans/trans_rvv.c.inc    | 198 +++++----------------
 target/riscv/insn_trans/trans_rvvk.c.inc   |  30 +---
 target/riscv/translate.c                   |   6 +
 target/riscv/vcrypto_helper.c              |  63 +++----
 target/riscv/vector_helper.c               | 192 +++++++++-----------
 target/riscv/vector_internals.c            |  29 +++
 target/riscv/vector_internals.h            |   4 +
 8 files changed, 207 insertions(+), 333 deletions(-)

-- 
2.43.2




* [PATCH v9 01/10] target/riscv/vector_helper.c: set vstart = 0 in GEN_VEXT_VSLIDEUP_VX()
  2024-03-09 20:43 [PATCH v9 00/10] riscv: set vstart_eq_zero on mark_vs_dirty Daniel Henrique Barboza
@ 2024-03-09 20:43 ` Daniel Henrique Barboza
  2024-03-09 20:43 ` [PATCH v9 02/10] target/riscv: handle vstart >= vl in vext_set_tail_elems_1s() Daniel Henrique Barboza
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-09 20:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, richard.henderson, philmd, Daniel Henrique Barboza

The helper isn't setting env->vstart = 0 after its execution, as is
expected of every vector instruction that completes successfully.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/vector_helper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index fe56c007d5..ca79571ae2 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4781,6 +4781,7 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
         }                                                                 \
         *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i - offset));          \
     }                                                                     \
+    env->vstart = 0;                                                      \
     /* set tail elements to 1s */                                         \
     vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
 }
-- 
2.43.2




* [PATCH v9 02/10] target/riscv: handle vstart >= vl in vext_set_tail_elems_1s()
  2024-03-09 20:43 [PATCH v9 00/10] riscv: set vstart_eq_zero on mark_vs_dirty Daniel Henrique Barboza
  2024-03-09 20:43 ` [PATCH v9 01/10] target/riscv/vector_helper.c: set vstart = 0 in GEN_VEXT_VSLIDEUP_VX() Daniel Henrique Barboza
@ 2024-03-09 20:43 ` Daniel Henrique Barboza
  2024-03-10  7:37   ` Richard Henderson
  2024-03-09 20:43 ` [PATCH v9 03/10] target/riscv/vector_helper.c: do vstart=0 after updating tail Daniel Henrique Barboza
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-09 20:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, richard.henderson, philmd, Daniel Henrique Barboza

We're going to make changes that will require each helper to be
responsible for 'vstart' management, i.e. we will drop the
'vstart < vl' assumption that helpers rely on today.

To do that we'll need to deal with how we're updating tail elements
first. We can't update them if vstart >= vl, but at this moment we're
not guarding for it.

We have the vext_set_tail_elems_1s() helper to update tail elements.
Change it to accept an 'env' pointer, where we can read both vstart and
vl, and make it a no-op if vstart >= vl. Note that callers will need to
set env->vstart = 0 *after* calling the helper from now on.

The exceptions are three helpers: vext_ldst_stride(), vext_ldst_us() and
vext_ldst_index(). They increment env->vstart during execution and will
end up with env->vstart = vl when updating the tail. For these cases
only, do an early check and exit if vstart >= vl, and set
env->vstart = 0 before updating the tail.

For everyone else we'll do vext_set_tail_elems_1s() and then clear
env->vstart. This is the case for vext_ldff(), which is already using
vext_set_tail_elems_1s(), and will be the case for the rest after the
next patches.

Let's also simplify the API a little by removing the 'nf' argument since
it can be derived from 'desc'.
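
To illustrate the two patterns described above, here is a minimal
standalone sketch. It is not QEMU code: ToyEnv, toy_set_tail_elems_1s(),
toy_ldst() and toy_demo() are hypothetical stand-ins, the element size is
fixed at one byte, and nf is assumed to be 1. It only models the guard in
the tail helper and the early exit in a load-like helper that advances
vstart as it goes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the CPU state: only the fields the guard reads. */
typedef struct {
    uint32_t vstart;
    uint32_t vl;
} ToyEnv;

/*
 * Toy tail update: fill bytes [vl * esz, max_elems * esz) with 0xff.
 * It is a no-op if tail-agnostic is off or if vstart >= vl.
 * Returns the number of tail bytes written.
 */
static uint32_t toy_set_tail_elems_1s(const ToyEnv *env, uint8_t *vd,
                                      int vta, uint32_t esz,
                                      uint32_t max_elems)
{
    if (vta == 0 || env->vstart >= env->vl) {
        return 0;
    }
    assert(env->vl <= max_elems);

    uint32_t begin = env->vl * esz;
    uint32_t end = max_elems * esz;
    memset(vd + begin, 0xff, end - begin);
    return end - begin;
}

/*
 * Toy load-like helper. It increments vstart per element, so vstart == vl
 * when the loop ends. It therefore checks vstart >= vl up front and clears
 * vstart *before* the tail update.
 */
static uint32_t toy_ldst(ToyEnv *env, uint8_t *vd, const uint8_t *src,
                         int vta, uint32_t max_elems)
{
    if (env->vstart >= env->vl) {
        env->vstart = 0;
        return 0;
    }
    for (uint32_t i = env->vstart; i < env->vl; i++, env->vstart++) {
        vd[i] = src[i];
    }
    /* Safe: we already know vstart < vl held on entry. */
    env->vstart = 0;
    return toy_set_tail_elems_1s(env, vd, vta, 1, max_elems);
}

/* Run toy_ldst() on an 8-byte register; return the tail bytes written. */
static uint32_t toy_demo(uint32_t vstart, uint32_t vl)
{
    uint8_t vd[8] = {0};
    const uint8_t src[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    ToyEnv env = { .vstart = vstart, .vl = vl };

    return toy_ldst(&env, vd, src, 1, 8);
}
```

In this model, toy_demo(0, 4) and toy_demo(2, 4) write the four tail
bytes, while toy_demo(4, 4) exits early and leaves the register untouched.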

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
---
 target/riscv/vector_helper.c | 59 ++++++++++++++++++++++++++++++------
 1 file changed, 49 insertions(+), 10 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index ca79571ae2..a3b496b6e9 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -174,19 +174,32 @@ GEN_VEXT_ST_ELEM(ste_h, int16_t, H2, stw)
 GEN_VEXT_ST_ELEM(ste_w, int32_t, H4, stl)
 GEN_VEXT_ST_ELEM(ste_d, int64_t, H8, stq)
 
-static void vext_set_tail_elems_1s(target_ulong vl, void *vd,
-                                   uint32_t desc, uint32_t nf,
-                                   uint32_t esz, uint32_t max_elems)
+/*
+ * This function is sensitive to env->vstart changes since
+ * it'll be a no-op if vstart >= vl. Do not clear env->vstart
+ * before calling it unless you're certain that vstart < vl.
+ */
+static void vext_set_tail_elems_1s(CPURISCVState *env, void *vd,
+                                   uint32_t desc, uint32_t esz,
+                                   uint32_t max_elems)
 {
     uint32_t vta = vext_vta(desc);
+    uint32_t nf = vext_nf(desc);
     int k;
 
-    if (vta == 0) {
+    /*
+     * Section 5.4 of the RVV spec mentions:
+     * "When vstart ≥ vl, there are no body elements, and no
+     *  elements are updated in any destination vector register
+     *  group, including that no tail elements are updated
+     *  with agnostic values."
+     */
+    if (vta == 0 || env->vstart >= env->vl) {
         return;
     }
 
     for (k = 0; k < nf; ++k) {
-        vext_set_elems_1s(vd, vta, (k * max_elems + vl) * esz,
+        vext_set_elems_1s(vd, vta, (k * max_elems + env->vl) * esz,
                           (k * max_elems + max_elems) * esz);
     }
 }
@@ -207,6 +220,11 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base,
     uint32_t esz = 1 << log2_esz;
     uint32_t vma = vext_vma(desc);
 
+    if (env->vstart >= env->vl) {
+        env->vstart = 0;
+        return;
+    }
+
     for (i = env->vstart; i < env->vl; i++, env->vstart++) {
         k = 0;
         while (k < nf) {
@@ -222,9 +240,13 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base,
             k++;
         }
     }
+    /*
+     * Set vstart before tail update - vstart changed during
+     * execution and we already checked that vstart < vl.
+     */
     env->vstart = 0;
 
-    vext_set_tail_elems_1s(env->vl, vd, desc, nf, esz, max_elems);
+    vext_set_tail_elems_1s(env, vd, desc, esz, max_elems);
 }
 
 #define GEN_VEXT_LD_STRIDE(NAME, ETYPE, LOAD_FN)                        \
@@ -272,6 +294,11 @@ vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc,
     uint32_t max_elems = vext_max_elems(desc, log2_esz);
     uint32_t esz = 1 << log2_esz;
 
+    if (env->vstart >= env->vl) {
+        env->vstart = 0;
+        return;
+    }
+
     /* load bytes from guest memory */
     for (i = env->vstart; i < evl; i++, env->vstart++) {
         k = 0;
@@ -281,9 +308,13 @@ vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc,
             k++;
         }
     }
+    /*
+     * Set vstart before tail update - vstart changed during
+     * execution and we already checked that vstart < vl.
+     */
     env->vstart = 0;
 
-    vext_set_tail_elems_1s(evl, vd, desc, nf, esz, max_elems);
+    vext_set_tail_elems_1s(env, vd, desc, esz, max_elems);
 }
 
 /*
@@ -386,6 +417,11 @@ vext_ldst_index(void *vd, void *v0, target_ulong base,
     uint32_t esz = 1 << log2_esz;
     uint32_t vma = vext_vma(desc);
 
+    if (env->vstart >= env->vl) {
+        env->vstart = 0;
+        return;
+    }
+
     /* load bytes from guest memory */
     for (i = env->vstart; i < env->vl; i++, env->vstart++) {
         k = 0;
@@ -402,9 +438,13 @@ vext_ldst_index(void *vd, void *v0, target_ulong base,
             k++;
         }
     }
+    /*
+     * Set vstart before tail update - vstart changed during
+     * execution and we already checked that vstart < vl.
+     */
     env->vstart = 0;
 
-    vext_set_tail_elems_1s(env->vl, vd, desc, nf, esz, max_elems);
+    vext_set_tail_elems_1s(env, vd, desc, esz, max_elems);
 }
 
 #define GEN_VEXT_LD_INDEX(NAME, ETYPE, INDEX_FN, LOAD_FN)                  \
@@ -532,9 +572,8 @@ ProbeSuccess:
             k++;
         }
     }
+    vext_set_tail_elems_1s(env, vd, desc, esz, max_elems);
     env->vstart = 0;
-
-    vext_set_tail_elems_1s(env->vl, vd, desc, nf, esz, max_elems);
 }
 
 #define GEN_VEXT_LDFF(NAME, ETYPE, LOAD_FN)               \
-- 
2.43.2




* [PATCH v9 03/10] target/riscv/vector_helper.c: do vstart=0 after updating tail
  2024-03-09 20:43 [PATCH v9 00/10] riscv: set vstart_eq_zero on mark_vs_dirty Daniel Henrique Barboza
  2024-03-09 20:43 ` [PATCH v9 01/10] target/riscv/vector_helper.c: set vstart = 0 in GEN_VEXT_VSLIDEUP_VX() Daniel Henrique Barboza
  2024-03-09 20:43 ` [PATCH v9 02/10] target/riscv: handle vstart >= vl in vext_set_tail_elems_1s() Daniel Henrique Barboza
@ 2024-03-09 20:43 ` Daniel Henrique Barboza
  2024-03-10  7:38   ` Richard Henderson
  2024-03-09 20:43 ` [PATCH v9 04/10] target/riscv/vector_helper.c: update tail with vext_set_tail_elems_1s() Daniel Henrique Barboza
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-09 20:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, richard.henderson, philmd, Daniel Henrique Barboza

vext_vv_rm_1() and vext_vx_rm_1() are setting vstart = 0 before their
respective callers (vext_vv_rm_2() and vext_vx_rm_2()) update the tail
elements.

This is benign now, but we'll convert the tail updates to use
vext_set_tail_elems_1s(), and this function is sensitive to vstart
changes. Do vstart = 0 after vext_set_elems_1s() now to make the
conversion easier.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
---
 target/riscv/vector_helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index a3b496b6e9..86b990ce03 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -1962,7 +1962,6 @@ vext_vv_rm_1(void *vd, void *v0, void *vs1, void *vs2,
         }
         fn(vd, vs1, vs2, i, env, vxrm);
     }
-    env->vstart = 0;
 }
 
 static inline void
@@ -1997,6 +1996,7 @@ vext_vv_rm_2(void *vd, void *v0, void *vs1, void *vs2,
     }
     /* set tail elements to 1s */
     vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);
+    env->vstart = 0;
 }
 
 /* generate helpers for fixed point instructions with OPIVV format */
@@ -2087,7 +2087,6 @@ vext_vx_rm_1(void *vd, void *v0, target_long s1, void *vs2,
         }
         fn(vd, s1, vs2, i, env, vxrm);
     }
-    env->vstart = 0;
 }
 
 static inline void
@@ -2122,6 +2121,7 @@ vext_vx_rm_2(void *vd, void *v0, target_long s1, void *vs2,
     }
     /* set tail elements to 1s */
     vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);
+    env->vstart = 0;
 }
 
 /* generate helpers for fixed point instructions with OPIVX format */
-- 
2.43.2




* [PATCH v9 04/10] target/riscv/vector_helper.c: update tail with vext_set_tail_elems_1s()
  2024-03-09 20:43 [PATCH v9 00/10] riscv: set vstart_eq_zero on mark_vs_dirty Daniel Henrique Barboza
                   ` (2 preceding siblings ...)
  2024-03-09 20:43 ` [PATCH v9 03/10] target/riscv/vector_helper.c: do vstart=0 after updating tail Daniel Henrique Barboza
@ 2024-03-09 20:43 ` Daniel Henrique Barboza
  2024-03-10  7:41   ` Richard Henderson
  2024-03-11  2:40   ` LIU Zhiwei
  2024-03-09 20:43 ` [PATCH v9 05/10] target/riscv: use vext_set_tail_elems_1s() in vcrypto insns Daniel Henrique Barboza
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-09 20:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, richard.henderson, philmd, Daniel Henrique Barboza

Change all code that updates tail elems to use vext_set_tail_elems_1s()
instead of vext_set_elems_1s().

Setting 'env->vstart=0' needs to be the very last thing a helper does
because env->vstart is being checked by vext_set_tail_elems_1s().

A side effect of this change is that a lot of 'vta' local variables
became unused. The reason is that 'vta' was being fetched to be used with
vext_set_elems_1s() but vext_set_tail_elems_1s() doesn't take it - 'vta'
is retrieved inside the helper using 'desc'.
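
The ordering requirement can be seen in a small standalone model. This
is only a sketch under stated assumptions, not the QEMU helpers: ToyCpu,
toy_tail_1s() and toy_helper() are hypothetical names, elements are one
byte wide, the register is 8 bytes long and tail-agnostic is always on.
The 'clear_first' flag selects the old ordering (clear vstart, then update
the tail) or the new one (update the tail, then clear vstart).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { TOY_VLENB = 8 };   /* hypothetical register length in bytes */

/* Hypothetical stand-in for the CPU state plus one destination register. */
typedef struct {
    uint32_t vstart;
    uint32_t vl;
    uint8_t vd[TOY_VLENB];
} ToyCpu;

/* Toy tail update: no-op when vstart >= vl, else fill the tail with 1s. */
static void toy_tail_1s(ToyCpu *cpu)
{
    if (cpu->vstart >= cpu->vl) {
        return;
    }
    memset(cpu->vd + cpu->vl, 0xff, TOY_VLENB - cpu->vl);
}

/*
 * Toy generic helper. Returns the last byte of vd so the caller can see
 * whether the tail was written (0xff) or left alone (0x00).
 */
static uint8_t toy_helper(uint32_t vstart, uint32_t vl, int clear_first)
{
    ToyCpu cpu = { .vstart = vstart, .vl = vl };

    assert(vl <= TOY_VLENB);

    /* Body: does nothing when vstart >= vl. */
    for (uint32_t i = cpu.vstart; i < cpu.vl; i++) {
        cpu.vd[i] = (uint8_t)(i + 1);
    }

    if (clear_first) {
        /* Old ordering: the guard sees vstart == 0 and cannot fire. */
        cpu.vstart = 0;
        toy_tail_1s(&cpu);
    } else {
        /* New ordering: the guard sees the real vstart. */
        toy_tail_1s(&cpu);
        cpu.vstart = 0;
    }
    return cpu.vd[TOY_VLENB - 1];
}
```

With vstart = 5 and vl = 4, the old ordering still writes the tail, while
the new ordering leaves it untouched; for vstart < vl both orderings
behave the same.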

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
---
 target/riscv/vector_helper.c | 130 ++++++++++++++---------------------
 1 file changed, 52 insertions(+), 78 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 86b990ce03..b174ddeae8 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -913,7 +913,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \
     uint32_t esz = sizeof(ETYPE);                             \
     uint32_t total_elems =                                    \
         vext_get_total_elems(env, desc, esz);                 \
-    uint32_t vta = vext_vta(desc);                            \
     uint32_t i;                                               \
                                                               \
     for (i = env->vstart; i < vl; i++) {                      \
@@ -923,9 +922,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \
                                                               \
         *((ETYPE *)vd + H(i)) = DO_OP(s2, s1, carry);         \
     }                                                         \
-    env->vstart = 0;                                          \
     /* set tail elements to 1s */                             \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);  \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);  \
+    env->vstart = 0;                                          \
 }
 
 GEN_VEXT_VADC_VVM(vadc_vvm_b, uint8_t,  H1, DO_VADC)
@@ -945,7 +944,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,        \
     uint32_t vl = env->vl;                                               \
     uint32_t esz = sizeof(ETYPE);                                        \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);         \
-    uint32_t vta = vext_vta(desc);                                       \
     uint32_t i;                                                          \
                                                                          \
     for (i = env->vstart; i < vl; i++) {                                 \
@@ -954,9 +952,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,        \
                                                                          \
         *((ETYPE *)vd + H(i)) = DO_OP(s2, (ETYPE)(target_long)s1, carry);\
     }                                                                    \
-    env->vstart = 0;                                                     \
     /* set tail elements to 1s */                                        \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);             \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);             \
+    env->vstart = 0;                                                     \
 }
 
 GEN_VEXT_VADC_VXM(vadc_vxm_b, uint8_t,  H1, DO_VADC)
@@ -1113,7 +1111,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,                          \
     uint32_t vl = env->vl;                                                \
     uint32_t esz = sizeof(TS1);                                           \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
-    uint32_t vta = vext_vta(desc);                                        \
     uint32_t vma = vext_vma(desc);                                        \
     uint32_t i;                                                           \
                                                                           \
@@ -1127,9 +1124,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,                          \
         TS2 s2 = *((TS2 *)vs2 + HS2(i));                                  \
         *((TS1 *)vd + HS1(i)) = OP(s2, s1 & MASK);                        \
     }                                                                     \
-    env->vstart = 0;                                                      \
     /* set tail elements to 1s */                                         \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
+    env->vstart = 0;                                                      \
 }
 
 GEN_VEXT_SHIFT_VV(vsll_vv_b, uint8_t,  uint8_t, H1, H1, DO_SLL, 0x7)
@@ -1160,7 +1157,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,      \
     uint32_t esz = sizeof(TD);                              \
     uint32_t total_elems =                                  \
         vext_get_total_elems(env, desc, esz);               \
-    uint32_t vta = vext_vta(desc);                          \
     uint32_t vma = vext_vma(desc);                          \
     uint32_t i;                                             \
                                                             \
@@ -1174,9 +1170,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,      \
         TS2 s2 = *((TS2 *)vs2 + HS2(i));                    \
         *((TD *)vd + HD(i)) = OP(s2, s1 & MASK);            \
     }                                                       \
-    env->vstart = 0;                                        \
     /* set tail elements to 1s */                           \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);\
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);\
+    env->vstart = 0;                                        \
 }
 
 GEN_VEXT_SHIFT_VX(vsll_vx_b, uint8_t, int8_t, H1, H1, DO_SLL, 0x7)
@@ -1835,16 +1831,15 @@ void HELPER(NAME)(void *vd, void *vs1, CPURISCVState *env,           \
     uint32_t vl = env->vl;                                           \
     uint32_t esz = sizeof(ETYPE);                                    \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);     \
-    uint32_t vta = vext_vta(desc);                                   \
     uint32_t i;                                                      \
                                                                      \
     for (i = env->vstart; i < vl; i++) {                             \
         ETYPE s1 = *((ETYPE *)vs1 + H(i));                           \
         *((ETYPE *)vd + H(i)) = s1;                                  \
     }                                                                \
-    env->vstart = 0;                                                 \
     /* set tail elements to 1s */                                    \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);         \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);         \
+    env->vstart = 0;                                                 \
 }
 
 GEN_VEXT_VMV_VV(vmv_v_v_b, int8_t,  H1)
@@ -1859,15 +1854,14 @@ void HELPER(NAME)(void *vd, uint64_t s1, CPURISCVState *env,         \
     uint32_t vl = env->vl;                                           \
     uint32_t esz = sizeof(ETYPE);                                    \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);     \
-    uint32_t vta = vext_vta(desc);                                   \
     uint32_t i;                                                      \
                                                                      \
     for (i = env->vstart; i < vl; i++) {                             \
         *((ETYPE *)vd + H(i)) = (ETYPE)s1;                           \
     }                                                                \
-    env->vstart = 0;                                                 \
     /* set tail elements to 1s */                                    \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);         \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);         \
+    env->vstart = 0;                                                 \
 }
 
 GEN_VEXT_VMV_VX(vmv_v_x_b, int8_t,  H1)
@@ -1882,16 +1876,15 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,          \
     uint32_t vl = env->vl;                                           \
     uint32_t esz = sizeof(ETYPE);                                    \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);     \
-    uint32_t vta = vext_vta(desc);                                   \
     uint32_t i;                                                      \
                                                                      \
     for (i = env->vstart; i < vl; i++) {                             \
         ETYPE *vt = (!vext_elem_mask(v0, i) ? vs2 : vs1);            \
         *((ETYPE *)vd + H(i)) = *(vt + H(i));                        \
     }                                                                \
-    env->vstart = 0;                                                 \
     /* set tail elements to 1s */                                    \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);         \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);         \
+    env->vstart = 0;                                                 \
 }
 
 GEN_VEXT_VMERGE_VV(vmerge_vvm_b, int8_t,  H1)
@@ -1906,7 +1899,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,               \
     uint32_t vl = env->vl;                                           \
     uint32_t esz = sizeof(ETYPE);                                    \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);     \
-    uint32_t vta = vext_vta(desc);                                   \
     uint32_t i;                                                      \
                                                                      \
     for (i = env->vstart; i < vl; i++) {                             \
@@ -1915,9 +1907,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,               \
                    (ETYPE)(target_long)s1);                          \
         *((ETYPE *)vd + H(i)) = d;                                   \
     }                                                                \
-    env->vstart = 0;                                                 \
     /* set tail elements to 1s */                                    \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);         \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);         \
+    env->vstart = 0;                                                 \
 }
 
 GEN_VEXT_VMERGE_VX(vmerge_vxm_b, int8_t,  H1)
@@ -1973,7 +1965,6 @@ vext_vv_rm_2(void *vd, void *v0, void *vs1, void *vs2,
     uint32_t vm = vext_vm(desc);
     uint32_t vl = env->vl;
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);
-    uint32_t vta = vext_vta(desc);
     uint32_t vma = vext_vma(desc);
 
     switch (env->vxrm) {
@@ -1995,7 +1986,7 @@ vext_vv_rm_2(void *vd, void *v0, void *vs1, void *vs2,
         break;
     }
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -2098,7 +2089,6 @@ vext_vx_rm_2(void *vd, void *v0, target_long s1, void *vs2,
     uint32_t vm = vext_vm(desc);
     uint32_t vl = env->vl;
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);
-    uint32_t vta = vext_vta(desc);
     uint32_t vma = vext_vma(desc);
 
     switch (env->vxrm) {
@@ -2120,7 +2110,7 @@ vext_vx_rm_2(void *vd, void *v0, target_long s1, void *vs2,
         break;
     }
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -2872,7 +2862,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,          \
     uint32_t vl = env->vl;                                \
     uint32_t total_elems =                                \
         vext_get_total_elems(env, desc, ESZ);             \
-    uint32_t vta = vext_vta(desc);                        \
     uint32_t vma = vext_vma(desc);                        \
     uint32_t i;                                           \
                                                           \
@@ -2885,10 +2874,10 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,          \
         }                                                 \
         do_##NAME(vd, vs1, vs2, i, env);                  \
     }                                                     \
-    env->vstart = 0;                                      \
     /* set tail elements to 1s */                         \
-    vext_set_elems_1s(vd, vta, vl * ESZ,                  \
-                      total_elems * ESZ);                 \
+    vext_set_tail_elems_1s(env, vd, desc, ESZ,            \
+                           total_elems);                  \
+    env->vstart = 0;                                      \
 }
 
 RVVCALL(OPFVV2, vfadd_vv_h, OP_UUU_H, H2, H2, H2, float16_add)
@@ -2915,7 +2904,6 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1,        \
     uint32_t vl = env->vl;                                \
     uint32_t total_elems =                                \
         vext_get_total_elems(env, desc, ESZ);             \
-    uint32_t vta = vext_vta(desc);                        \
     uint32_t vma = vext_vma(desc);                        \
     uint32_t i;                                           \
                                                           \
@@ -2928,10 +2916,10 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1,        \
         }                                                 \
         do_##NAME(vd, s1, vs2, i, env);                   \
     }                                                     \
-    env->vstart = 0;                                      \
     /* set tail elements to 1s */                         \
-    vext_set_elems_1s(vd, vta, vl * ESZ,                  \
-                      total_elems * ESZ);                 \
+    vext_set_tail_elems_1s(env, vd, desc, ESZ,            \
+                           total_elems);                  \
+    env->vstart = 0;                                      \
 }
 
 RVVCALL(OPFVF2, vfadd_vf_h, OP_UUU_H, H2, H2, float16_add)
@@ -3501,7 +3489,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2,       \
     uint32_t vl = env->vl;                             \
     uint32_t total_elems =                             \
         vext_get_total_elems(env, desc, ESZ);          \
-    uint32_t vta = vext_vta(desc);                     \
     uint32_t vma = vext_vma(desc);                     \
     uint32_t i;                                        \
                                                        \
@@ -3517,9 +3504,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2,       \
         }                                              \
         do_##NAME(vd, vs2, i, env);                    \
     }                                                  \
+    vext_set_tail_elems_1s(env, vd, desc, ESZ,         \
+                           total_elems);               \
     env->vstart = 0;                                   \
-    vext_set_elems_1s(vd, vta, vl * ESZ,               \
-                      total_elems * ESZ);              \
 }
 
 RVVCALL(OPFVV1, vfsqrt_v_h, OP_UU_H, H2, H2, float16_sqrt)
@@ -4256,7 +4243,6 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, void *vs2, \
     uint32_t esz = sizeof(ETYPE);                             \
     uint32_t total_elems =                                    \
         vext_get_total_elems(env, desc, esz);                 \
-    uint32_t vta = vext_vta(desc);                            \
     uint32_t i;                                               \
                                                               \
     for (i = env->vstart; i < vl; i++) {                      \
@@ -4264,9 +4250,9 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, void *vs2, \
         *((ETYPE *)vd + H(i)) =                               \
             (!vm && !vext_elem_mask(v0, i) ? s2 : s1);        \
     }                                                         \
-    env->vstart = 0;                                          \
     /* set tail elements to 1s */                             \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);  \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);  \
+    env->vstart = 0;                                          \
 }
 
 GEN_VFMERGE_VF(vfmerge_vfm_h, int16_t, H2)
@@ -4421,7 +4407,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,          \
     uint32_t vl = env->vl;                                \
     uint32_t esz = sizeof(TD);                            \
     uint32_t vlenb = simd_maxsz(desc);                    \
-    uint32_t vta = vext_vta(desc);                        \
     uint32_t i;                                           \
     TD s1 =  *((TD *)vs1 + HD(0));                        \
                                                           \
@@ -4433,9 +4418,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,          \
         s1 = OP(s1, (TD)s2);                              \
     }                                                     \
     *((TD *)vd + HD(0)) = s1;                             \
-    env->vstart = 0;                                      \
     /* set tail elements to 1s */                         \
-    vext_set_elems_1s(vd, vta, esz, vlenb);               \
+    vext_set_tail_elems_1s(env, vd, desc, esz, vlenb);    \
+    env->vstart = 0;                                      \
 }
 
 /* vd[0] = sum(vs1[0], vs2[*]) */
@@ -4507,7 +4492,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,           \
     uint32_t vl = env->vl;                                 \
     uint32_t esz = sizeof(TD);                             \
     uint32_t vlenb = simd_maxsz(desc);                     \
-    uint32_t vta = vext_vta(desc);                         \
     uint32_t i;                                            \
     TD s1 =  *((TD *)vs1 + HD(0));                         \
                                                            \
@@ -4519,9 +4503,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,           \
         s1 = OP(s1, (TD)s2, &env->fp_status);              \
     }                                                      \
     *((TD *)vd + HD(0)) = s1;                              \
-    env->vstart = 0;                                       \
     /* set tail elements to 1s */                          \
-    vext_set_elems_1s(vd, vta, esz, vlenb);                \
+    vext_set_tail_elems_1s(env, vd, desc, esz, vlenb);     \
+    env->vstart = 0;                                       \
 }
 
 /* Unordered sum */
@@ -4738,7 +4722,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, CPURISCVState *env,      \
     uint32_t vl = env->vl;                                                \
     uint32_t esz = sizeof(ETYPE);                                         \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
-    uint32_t vta = vext_vta(desc);                                        \
     uint32_t vma = vext_vma(desc);                                        \
     uint32_t sum = 0;                                                     \
     int i;                                                                \
@@ -4754,9 +4737,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, CPURISCVState *env,      \
             sum++;                                                        \
         }                                                                 \
     }                                                                     \
-    env->vstart = 0;                                                      \
     /* set tail elements to 1s */                                         \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
+    env->vstart = 0;                                                      \
 }
 
 GEN_VEXT_VIOTA_M(viota_m_b, uint8_t,  H1)
@@ -4772,7 +4755,6 @@ void HELPER(NAME)(void *vd, void *v0, CPURISCVState *env, uint32_t desc)  \
     uint32_t vl = env->vl;                                                \
     uint32_t esz = sizeof(ETYPE);                                         \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
-    uint32_t vta = vext_vta(desc);                                        \
     uint32_t vma = vext_vma(desc);                                        \
     int i;                                                                \
                                                                           \
@@ -4784,9 +4766,9 @@ void HELPER(NAME)(void *vd, void *v0, CPURISCVState *env, uint32_t desc)  \
         }                                                                 \
         *((ETYPE *)vd + H(i)) = i;                                        \
     }                                                                     \
-    env->vstart = 0;                                                      \
     /* set tail elements to 1s */                                         \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
+    env->vstart = 0;                                                      \
 }
 
 GEN_VEXT_VID_V(vid_v_b, uint8_t,  H1)
@@ -4807,7 +4789,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
     uint32_t vl = env->vl;                                                \
     uint32_t esz = sizeof(ETYPE);                                         \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
-    uint32_t vta = vext_vta(desc);                                        \
     uint32_t vma = vext_vma(desc);                                        \
     target_ulong offset = s1, i_min, i;                                   \
                                                                           \
@@ -4820,9 +4801,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
         }                                                                 \
         *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i - offset));          \
     }                                                                     \
-    env->vstart = 0;                                                      \
     /* set tail elements to 1s */                                         \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
+    env->vstart = 0;                                                      \
 }
 
 /* vslideup.vx vd, vs2, rs1, vm # vd[i+rs1] = vs2[i] */
@@ -4840,7 +4821,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
     uint32_t vl = env->vl;                                                \
     uint32_t esz = sizeof(ETYPE);                                         \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
-    uint32_t vta = vext_vta(desc);                                        \
     uint32_t vma = vext_vma(desc);                                        \
     target_ulong i_max, i_min, i;                                         \
                                                                           \
@@ -4861,9 +4841,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
         }                                                                 \
     }                                                                     \
                                                                           \
-    env->vstart = 0;                                                      \
     /* set tail elements to 1s */                                         \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
+    env->vstart = 0;                                                      \
 }
 
 /* vslidedown.vx vd, vs2, rs1, vm # vd[i] = vs2[i+rs1] */
@@ -4882,7 +4862,6 @@ static void vslide1up_##BITWIDTH(void *vd, void *v0, uint64_t s1,           \
     uint32_t vl = env->vl;                                                  \
     uint32_t esz = sizeof(ETYPE);                                           \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);            \
-    uint32_t vta = vext_vta(desc);                                          \
     uint32_t vma = vext_vma(desc);                                          \
     uint32_t i;                                                             \
                                                                             \
@@ -4898,9 +4877,9 @@ static void vslide1up_##BITWIDTH(void *vd, void *v0, uint64_t s1,           \
             *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i - 1));             \
         }                                                                   \
     }                                                                       \
-    env->vstart = 0;                                                        \
     /* set tail elements to 1s */                                           \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);                \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);                \
+    env->vstart = 0;                                                        \
 }
 
 GEN_VEXT_VSLIE1UP(8,  H1)
@@ -4931,7 +4910,6 @@ static void vslide1down_##BITWIDTH(void *vd, void *v0, uint64_t s1,           \
     uint32_t vl = env->vl;                                                    \
     uint32_t esz = sizeof(ETYPE);                                             \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);              \
-    uint32_t vta = vext_vta(desc);                                            \
     uint32_t vma = vext_vma(desc);                                            \
     uint32_t i;                                                               \
                                                                               \
@@ -4947,9 +4925,9 @@ static void vslide1down_##BITWIDTH(void *vd, void *v0, uint64_t s1,           \
             *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i + 1));               \
         }                                                                     \
     }                                                                         \
-    env->vstart = 0;                                                          \
     /* set tail elements to 1s */                                             \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);                  \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);                  \
+    env->vstart = 0;                                                          \
 }
 
 GEN_VEXT_VSLIDE1DOWN(8,  H1)
@@ -5005,7 +4983,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,               \
     uint32_t vl = env->vl;                                                \
     uint32_t esz = sizeof(TS2);                                           \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
-    uint32_t vta = vext_vta(desc);                                        \
     uint32_t vma = vext_vma(desc);                                        \
     uint64_t index;                                                       \
     uint32_t i;                                                           \
@@ -5023,9 +5000,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,               \
             *((TS2 *)vd + HS2(i)) = *((TS2 *)vs2 + HS2(index));           \
         }                                                                 \
     }                                                                     \
-    env->vstart = 0;                                                      \
     /* set tail elements to 1s */                                         \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
+    env->vstart = 0;                                                      \
 }
 
 /* vd[i] = (vs1[i] >= VLMAX) ? 0 : vs2[vs1[i]]; */
@@ -5048,7 +5025,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
     uint32_t vl = env->vl;                                                \
     uint32_t esz = sizeof(ETYPE);                                         \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
-    uint32_t vta = vext_vta(desc);                                        \
     uint32_t vma = vext_vma(desc);                                        \
     uint64_t index = s1;                                                  \
     uint32_t i;                                                           \
@@ -5065,9 +5041,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
             *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(index));           \
         }                                                                 \
     }                                                                     \
-    env->vstart = 0;                                                      \
     /* set tail elements to 1s */                                         \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
+    env->vstart = 0;                                                      \
 }
 
 /* vd[i] = (x[rs1] >= VLMAX) ? 0 : vs2[rs1] */
@@ -5084,7 +5060,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,               \
     uint32_t vl = env->vl;                                                \
     uint32_t esz = sizeof(ETYPE);                                         \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
-    uint32_t vta = vext_vta(desc);                                        \
     uint32_t num = 0, i;                                                  \
                                                                           \
     for (i = env->vstart; i < vl; i++) {                                  \
@@ -5094,9 +5069,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,               \
         *((ETYPE *)vd + H(num)) = *((ETYPE *)vs2 + H(i));                 \
         num++;                                                            \
     }                                                                     \
-    env->vstart = 0;                                                      \
     /* set tail elements to 1s */                                         \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
+    env->vstart = 0;                                                      \
 }
 
 /* Compress into vd elements of vs2 where vs1 is enabled */
@@ -5130,7 +5105,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2,                 \
     uint32_t vm = vext_vm(desc);                                 \
     uint32_t esz = sizeof(ETYPE);                                \
     uint32_t total_elems = vext_get_total_elems(env, desc, esz); \
-    uint32_t vta = vext_vta(desc);                               \
     uint32_t vma = vext_vma(desc);                               \
     uint32_t i;                                                  \
                                                                  \
@@ -5142,9 +5116,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2,                 \
         }                                                        \
         *((ETYPE *)vd + HD(i)) = *((DTYPE *)vs2 + HS1(i));       \
     }                                                            \
-    env->vstart = 0;                                             \
     /* set tail elements to 1s */                                \
-    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);     \
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);     \
+    env->vstart = 0;                                             \
 }
 
 GEN_VEXT_INT_EXT(vzext_vf2_h, uint16_t, uint8_t,  H2, H1)
-- 
2.43.2




* [PATCH v9 05/10] target/riscv: use vext_set_tail_elems_1s() in vcrypto insns
  2024-03-09 20:43 [PATCH v9 00/10] riscv: set vstart_eq_zero on mark_vs_dirty Daniel Henrique Barboza
                   ` (3 preceding siblings ...)
  2024-03-09 20:43 ` [PATCH v9 04/10] target/riscv/vector_helper.c: update tail with vext_set_tail_elems_1s() Daniel Henrique Barboza
@ 2024-03-09 20:43 ` Daniel Henrique Barboza
  2024-03-10  7:42   ` Richard Henderson
  2024-03-09 20:43 ` [PATCH v9 06/10] trans_rvv.c.inc: set vstart = 0 in int scalar move insns Daniel Henrique Barboza
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-09 20:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, richard.henderson, philmd, Daniel Henrique Barboza

Vcrypto insns should also use the same helper the regular vector insns
use to update the tail elements.

Move vext_set_tail_elems_1s() to vector_internals.c and make it public.
Use it in vcrypto_helper.c to set tail elements instead of
vext_set_elems_1s(). Helpers must set env->vstart = 0 only after setting
the tail, because vext_set_tail_elems_1s() is a no-op when vstart >= vl.
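
The ordering requirement can be illustrated with a minimal standalone
sketch. This is not QEMU code: ToyEnv and the toy_* functions are
hypothetical stand-ins, nf is assumed to be 1, and vta is passed in
directly instead of being decoded from desc.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CPURISCVState: only the fields used here. */
typedef struct {
    uint32_t vstart;
    uint32_t vl;
} ToyEnv;

/*
 * Simplified model of vext_set_tail_elems_1s() with nf == 1.
 * It does nothing when vta is 0 or when vstart >= vl.
 */
static void toy_set_tail_elems_1s(const ToyEnv *env, uint8_t *vd,
                                  uint32_t vta, uint32_t esz,
                                  uint32_t max_elems)
{
    assert(env->vl <= max_elems);

    if (vta == 0 || env->vstart >= env->vl) {
        return;
    }
    /* Fill bytes [vl * esz, max_elems * esz) with 1s. */
    memset(vd + env->vl * esz, 0xff, (max_elems - env->vl) * esz);
}

/* Order used by this patch: update the tail, then clear vstart. */
static void toy_helper_tail_first(ToyEnv *env, uint8_t *vd, uint32_t vta,
                                  uint32_t esz, uint32_t max_elems)
{
    toy_set_tail_elems_1s(env, vd, vta, esz, max_elems);
    env->vstart = 0;
}

/* Reversed order: the vstart >= vl check can no longer trigger. */
static void toy_helper_vstart_first(ToyEnv *env, uint8_t *vd, uint32_t vta,
                                    uint32_t esz, uint32_t max_elems)
{
    env->vstart = 0;
    toy_set_tail_elems_1s(env, vd, vta, esz, max_elems);
}
```

With vstart == vl, toy_helper_tail_first() leaves vd untouched, while
toy_helper_vstart_first() overwrites the tail bytes even though no
element should have been updated.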

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
---
 target/riscv/vcrypto_helper.c   | 63 ++++++++++++---------------------
 target/riscv/vector_helper.c    | 30 ----------------
 target/riscv/vector_internals.c | 29 +++++++++++++++
 target/riscv/vector_internals.h |  4 +++
 4 files changed, 56 insertions(+), 70 deletions(-)

diff --git a/target/riscv/vcrypto_helper.c b/target/riscv/vcrypto_helper.c
index e2d719b13b..66d449c274 100644
--- a/target/riscv/vcrypto_helper.c
+++ b/target/riscv/vcrypto_helper.c
@@ -218,9 +218,7 @@ static inline void xor_round_key(AESState *round_state, AESState *round_key)
     void HELPER(NAME)(void *vd, void *vs2, CPURISCVState *env,            \
                       uint32_t desc)                                      \
     {                                                                     \
-        uint32_t vl = env->vl;                                            \
         uint32_t total_elems = vext_get_total_elems(env, desc, 4);        \
-        uint32_t vta = vext_vta(desc);                                    \
                                                                           \
         for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {        \
             AESState round_key;                                           \
@@ -233,18 +231,16 @@ static inline void xor_round_key(AESState *round_state, AESState *round_key)
             *((uint64_t *)vd + H8(i * 2 + 0)) = round_state.d[0];         \
             *((uint64_t *)vd + H8(i * 2 + 1)) = round_state.d[1];         \
         }                                                                 \
-        env->vstart = 0;                                                  \
         /* set tail elements to 1s */                                     \
-        vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4);              \
+        vext_set_tail_elems_1s(env, vd, desc, 4, total_elems);            \
+        env->vstart = 0;                                                  \
     }
 
 #define GEN_ZVKNED_HELPER_VS(NAME, ...)                                   \
     void HELPER(NAME)(void *vd, void *vs2, CPURISCVState *env,            \
                       uint32_t desc)                                      \
     {                                                                     \
-        uint32_t vl = env->vl;                                            \
         uint32_t total_elems = vext_get_total_elems(env, desc, 4);        \
-        uint32_t vta = vext_vta(desc);                                    \
                                                                           \
         for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {        \
             AESState round_key;                                           \
@@ -257,9 +253,9 @@ static inline void xor_round_key(AESState *round_state, AESState *round_key)
             *((uint64_t *)vd + H8(i * 2 + 0)) = round_state.d[0];         \
             *((uint64_t *)vd + H8(i * 2 + 1)) = round_state.d[1];         \
         }                                                                 \
-        env->vstart = 0;                                                  \
         /* set tail elements to 1s */                                     \
-        vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4);              \
+        vext_set_tail_elems_1s(env, vd, desc, 4, total_elems);            \
+        env->vstart = 0;                                                  \
     }
 
 GEN_ZVKNED_HELPER_VV(vaesef_vv, aesenc_SB_SR_AK(&round_state,
@@ -301,9 +297,7 @@ void HELPER(vaeskf1_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
 {
     uint32_t *vd = vd_vptr;
     uint32_t *vs2 = vs2_vptr;
-    uint32_t vl = env->vl;
     uint32_t total_elems = vext_get_total_elems(env, desc, 4);
-    uint32_t vta = vext_vta(desc);
 
     uimm &= 0b1111;
     if (uimm > 10 || uimm == 0) {
@@ -337,9 +331,9 @@ void HELPER(vaeskf1_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
         vd[i * 4 + H4(2)] = rk[6];
         vd[i * 4 + H4(3)] = rk[7];
     }
-    env->vstart = 0;
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4);
+    vext_set_tail_elems_1s(env, vd, desc, 4, total_elems);
+    env->vstart = 0;
 }
 
 void HELPER(vaeskf2_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
@@ -347,9 +341,7 @@ void HELPER(vaeskf2_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
 {
     uint32_t *vd = vd_vptr;
     uint32_t *vs2 = vs2_vptr;
-    uint32_t vl = env->vl;
     uint32_t total_elems = vext_get_total_elems(env, desc, 4);
-    uint32_t vta = vext_vta(desc);
 
     uimm &= 0b1111;
     if (uimm > 14 || uimm < 2) {
@@ -394,9 +386,9 @@ void HELPER(vaeskf2_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
         vd[i * 4 + H4(2)] = rk[10];
         vd[i * 4 + H4(3)] = rk[11];
     }
-    env->vstart = 0;
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4);
+    vext_set_tail_elems_1s(env, vd, desc, 4, total_elems);
+    env->vstart = 0;
 }
 
 static inline uint32_t sig0_sha256(uint32_t x)
@@ -455,7 +447,6 @@ void HELPER(vsha2ms_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
     uint32_t sew = FIELD_EX64(env->vtype, VTYPE, VSEW);
     uint32_t esz = sew == MO_32 ? 4 : 8;
     uint32_t total_elems;
-    uint32_t vta = vext_vta(desc);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         if (sew == MO_32) {
@@ -469,7 +460,7 @@ void HELPER(vsha2ms_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
     }
     /* set tail elements to 1s */
     total_elems = vext_get_total_elems(env, desc, esz);
-    vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -570,7 +561,6 @@ void HELPER(vsha2ch32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 {
     const uint32_t esz = 4;
     uint32_t total_elems;
-    uint32_t vta = vext_vta(desc);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         vsha2c_32(((uint32_t *)vs2) + 4 * i, ((uint32_t *)vd) + 4 * i,
@@ -579,7 +569,7 @@ void HELPER(vsha2ch32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 
     /* set tail elements to 1s */
     total_elems = vext_get_total_elems(env, desc, esz);
-    vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -588,7 +578,6 @@ void HELPER(vsha2ch64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 {
     const uint32_t esz = 8;
     uint32_t total_elems;
-    uint32_t vta = vext_vta(desc);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         vsha2c_64(((uint64_t *)vs2) + 4 * i, ((uint64_t *)vd) + 4 * i,
@@ -597,7 +586,7 @@ void HELPER(vsha2ch64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 
     /* set tail elements to 1s */
     total_elems = vext_get_total_elems(env, desc, esz);
-    vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -606,7 +595,6 @@ void HELPER(vsha2cl32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 {
     const uint32_t esz = 4;
     uint32_t total_elems;
-    uint32_t vta = vext_vta(desc);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         vsha2c_32(((uint32_t *)vs2) + 4 * i, ((uint32_t *)vd) + 4 * i,
@@ -615,7 +603,7 @@ void HELPER(vsha2cl32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 
     /* set tail elements to 1s */
     total_elems = vext_get_total_elems(env, desc, esz);
-    vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -624,7 +612,6 @@ void HELPER(vsha2cl64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 {
     uint32_t esz = 8;
     uint32_t total_elems;
-    uint32_t vta = vext_vta(desc);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
         vsha2c_64(((uint64_t *)vs2) + 4 * i, ((uint64_t *)vd) + 4 * i,
@@ -633,7 +620,7 @@ void HELPER(vsha2cl64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env,
 
     /* set tail elements to 1s */
     total_elems = vext_get_total_elems(env, desc, esz);
-    vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -653,7 +640,6 @@ void HELPER(vsm3me_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr,
 {
     uint32_t esz = memop_size(FIELD_EX64(env->vtype, VTYPE, VSEW));
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);
-    uint32_t vta = vext_vta(desc);
     uint32_t *vd = vd_vptr;
     uint32_t *vs1 = vs1_vptr;
     uint32_t *vs2 = vs2_vptr;
@@ -672,7 +658,7 @@ void HELPER(vsm3me_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr,
             vd[(i * 8) + j] = bswap32(w[H4(j + 16)]);
         }
     }
-    vext_set_elems_1s(vd_vptr, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd_vptr, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -752,7 +738,6 @@ void HELPER(vsm3c_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
 {
     uint32_t esz = memop_size(FIELD_EX64(env->vtype, VTYPE, VSEW));
     uint32_t total_elems = vext_get_total_elems(env, desc, esz);
-    uint32_t vta = vext_vta(desc);
     uint32_t *vd = vd_vptr;
     uint32_t *vs2 = vs2_vptr;
     uint32_t v1[8], v2[8], v3[8];
@@ -767,7 +752,7 @@ void HELPER(vsm3c_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm,
             vd[i * 8 + k] = bswap32(v1[H4(k)]);
         }
     }
-    vext_set_elems_1s(vd_vptr, vta, env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd_vptr, desc, esz, total_elems);
     env->vstart = 0;
 }
 
@@ -777,7 +762,6 @@ void HELPER(vghsh_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr,
     uint64_t *vd = vd_vptr;
     uint64_t *vs1 = vs1_vptr;
     uint64_t *vs2 = vs2_vptr;
-    uint32_t vta = vext_vta(desc);
     uint32_t total_elems = vext_get_total_elems(env, desc, 4);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
@@ -805,7 +789,7 @@ void HELPER(vghsh_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr,
         vd[i * 2 + 1] = brev8(Z[1]);
     }
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vta, env->vl * 4, total_elems * 4);
+    vext_set_tail_elems_1s(env, vd, desc, 4, total_elems);
     env->vstart = 0;
 }
 
@@ -814,7 +798,6 @@ void HELPER(vgmul_vv)(void *vd_vptr, void *vs2_vptr, CPURISCVState *env,
 {
     uint64_t *vd = vd_vptr;
     uint64_t *vs2 = vs2_vptr;
-    uint32_t vta = vext_vta(desc);
     uint32_t total_elems = vext_get_total_elems(env, desc, 4);
 
     for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) {
@@ -839,7 +822,7 @@ void HELPER(vgmul_vv)(void *vd_vptr, void *vs2_vptr, CPURISCVState *env,
         vd[i * 2 + 1] = brev8(Z[1]);
     }
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vta, env->vl * 4, total_elems * 4);
+    vext_set_tail_elems_1s(env, vd, desc, 4, total_elems);
     env->vstart = 0;
 }
 
@@ -881,9 +864,9 @@ void HELPER(vsm4k_vi)(void *vd, void *vs2, uint32_t uimm5, CPURISCVState *env,
         }
     }
 
-    env->vstart = 0;
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vext_vta(desc), env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
+    env->vstart = 0;
 }
 
 static void do_sm4_round(uint32_t *rk, uint32_t *buf)
@@ -930,9 +913,9 @@ void HELPER(vsm4r_vv)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc)
         }
     }
 
-    env->vstart = 0;
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vext_vta(desc), env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
+    env->vstart = 0;
 }
 
 void HELPER(vsm4r_vs)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc)
@@ -964,7 +947,7 @@ void HELPER(vsm4r_vs)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc)
         }
     }
 
-    env->vstart = 0;
     /* set tail elements to 1s */
-    vext_set_elems_1s(vd, vext_vta(desc), env->vl * esz, total_elems * esz);
+    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
+    env->vstart = 0;
 }
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index b174ddeae8..4fe8752eea 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -174,36 +174,6 @@ GEN_VEXT_ST_ELEM(ste_h, int16_t, H2, stw)
 GEN_VEXT_ST_ELEM(ste_w, int32_t, H4, stl)
 GEN_VEXT_ST_ELEM(ste_d, int64_t, H8, stq)
 
-/*
- * This function is sensitive to env->vstart changes since
- * it'll be a no-op if vstart >= vl. Do not clear env->vstart
- * before calling it unless you're certain that vstart < vl.
- */
-static void vext_set_tail_elems_1s(CPURISCVState *env, void *vd,
-                                   uint32_t desc, uint32_t esz,
-                                   uint32_t max_elems)
-{
-    uint32_t vta = vext_vta(desc);
-    uint32_t nf = vext_nf(desc);
-    int k;
-
-    /*
-     * Section 5.4 of the RVV spec mentions:
-     * "When vstart ≥ vl, there are no body elements, and no
-     *  elements are updated in any destination vector register
-     *  group, including that no tail elements are updated
-     *  with agnostic values."
-     */
-    if (vta == 0 || env->vstart >= env->vl) {
-        return;
-    }
-
-    for (k = 0; k < nf; ++k) {
-        vext_set_elems_1s(vd, vta, (k * max_elems + env->vl) * esz,
-                          (k * max_elems + max_elems) * esz);
-    }
-}
-
 /*
  * stride: access vector element from strided memory
  */
diff --git a/target/riscv/vector_internals.c b/target/riscv/vector_internals.c
index 12f5964fbb..bf3e9e2370 100644
--- a/target/riscv/vector_internals.c
+++ b/target/riscv/vector_internals.c
@@ -33,6 +33,35 @@ void vext_set_elems_1s(void *base, uint32_t is_agnostic, uint32_t cnt,
     memset(base + cnt, -1, tot - cnt);
 }
 
+/*
+ * This function is sensitive to env->vstart changes since
+ * it'll be a no-op if vstart >= vl. Do not clear env->vstart
+ * before calling it unless you're certain that vstart < vl.
+ */
+void vext_set_tail_elems_1s(CPURISCVState *env, void *vd, uint32_t desc,
+                            uint32_t esz, uint32_t max_elems)
+{
+    uint32_t vta = vext_vta(desc);
+    uint32_t nf = vext_nf(desc);
+    int k;
+
+    /*
+     * Section 5.4 of the RVV spec mentions:
+     * "When vstart ≥ vl, there are no body elements, and no
+     *  elements are updated in any destination vector register
+     *  group, including that no tail elements are updated
+     *  with agnostic values."
+     */
+    if (vta == 0 || env->vstart >= env->vl) {
+        return;
+    }
+
+    for (k = 0; k < nf; ++k) {
+        vext_set_elems_1s(vd, vta, (k * max_elems + env->vl) * esz,
+                          (k * max_elems + max_elems) * esz);
+    }
+}
+
 void do_vext_vv(void *vd, void *v0, void *vs1, void *vs2,
                 CPURISCVState *env, uint32_t desc,
                 opivv2_fn *fn, uint32_t esz)
diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h
index 842765f6c1..c5a2bc4bf3 100644
--- a/target/riscv/vector_internals.h
+++ b/target/riscv/vector_internals.h
@@ -117,6 +117,10 @@ static inline uint32_t vext_get_total_elems(CPURISCVState *env, uint32_t desc,
 void vext_set_elems_1s(void *base, uint32_t is_agnostic, uint32_t cnt,
                        uint32_t tot);
 
+void vext_set_tail_elems_1s(CPURISCVState *env, void *vd,
+                            uint32_t desc, uint32_t esz,
+                            uint32_t max_elems);
+
 /* expand macro args before macro */
 #define RVVCALL(macro, ...)  macro(__VA_ARGS__)
 
-- 
2.43.2




* [PATCH v9 06/10] trans_rvv.c.inc: set vstart = 0 in int scalar move insns
  2024-03-09 20:43 [PATCH v9 00/10] riscv: set vstart_eq_zero on mark_vs_dirty Daniel Henrique Barboza
                   ` (4 preceding siblings ...)
  2024-03-09 20:43 ` [PATCH v9 05/10] target/riscv: use vext_set_tail_elems_1s() in vcrypto insns Daniel Henrique Barboza
@ 2024-03-09 20:43 ` Daniel Henrique Barboza
  2024-03-10  7:45   ` Richard Henderson
  2024-03-09 20:43 ` [PATCH v9 07/10] target/riscv: remove 'over' brconds from vector trans Daniel Henrique Barboza
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-09 20:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, richard.henderson, philmd, Daniel Henrique Barboza

trans_vmv_x_s, trans_vmv_s_x, trans_vfmv_f_s and trans_vfmv_s_f aren't
setting vstart = 0 after execution. This is usually done by a helper in
vector_helper.c but these functions don't use helpers.

We'll set vstart = 0 after any potential 'over' brconds, which also
mandates a mark_vs_dirty() call.
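
As an illustration only (this is not QEMU code), the ordering used for
the insns that keep their 'over' brcond can be modelled in plain C.
ToyEnv, toy_vmv_s_x() and toy_selfcheck() are hypothetical stand-ins
for CPURISCVState and the ops emitted by trans_vmv_s_x(); the point is
that the reset sits after the label, so it runs on both paths:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the vstart/vl fields of CPURISCVState. */
typedef struct {
    uint32_t vstart;
    uint32_t vl;
} ToyEnv;

/*
 * Toy model of the code emitted for vmv.s.x. Returns 1 if element 0
 * of the destination was written, 0 if the write was skipped.
 */
static int toy_vmv_s_x(ToyEnv *env, uint64_t *vd0, uint64_t src)
{
    int written = 0;

    if (env->vstart >= env->vl) {   /* models the 'over' brcond */
        goto over;
    }
    *vd0 = src;                     /* models vec_element_storei() */
    written = 1;
over:
    env->vstart = 0;                /* models tcg_gen_movi_tl(cpu_vstart, 0) */
    return written;
}

/* vstart ends up 0 whether or not the store was skipped. */
static void toy_selfcheck(void)
{
    uint64_t vd0 = 0;
    ToyEnv env = { .vstart = 3, .vl = 2 };

    assert(toy_vmv_s_x(&env, &vd0, 42) == 0);
    assert(vd0 == 0 && env.vstart == 0);

    env.vl = 4;
    assert(toy_vmv_s_x(&env, &vd0, 42) == 1);
    assert(vd0 == 42 && env.vstart == 0);
}
```

Had the reset been placed before the label in this toy model, the
skipped path would leave vstart untouched, which is the bug described
above.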

Fixes: dedc53cbc9 ("target/riscv: rvv-1.0: integer scalar move instructions")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
---
 target/riscv/insn_trans/trans_rvv.c.inc | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index e42728990e..8c16a9f5b3 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -3373,6 +3373,8 @@ static bool trans_vmv_x_s(DisasContext *s, arg_vmv_x_s *a)
         vec_element_loadi(s, t1, a->rs2, 0, true);
         tcg_gen_trunc_i64_tl(dest, t1);
         gen_set_gpr(s, a->rd, dest);
+        tcg_gen_movi_tl(cpu_vstart, 0);
+        mark_vs_dirty(s);
         return true;
     }
     return false;
@@ -3399,8 +3401,9 @@ static bool trans_vmv_s_x(DisasContext *s, arg_vmv_s_x *a)
         s1 = get_gpr(s, a->rs1, EXT_NONE);
         tcg_gen_ext_tl_i64(t1, s1);
         vec_element_storei(s, a->rd, 0, t1);
-        mark_vs_dirty(s);
         gen_set_label(over);
+        tcg_gen_movi_tl(cpu_vstart, 0);
+        mark_vs_dirty(s);
         return true;
     }
     return false;
@@ -3427,6 +3430,8 @@ static bool trans_vfmv_f_s(DisasContext *s, arg_vfmv_f_s *a)
         }
 
         mark_fs_dirty(s);
+        tcg_gen_movi_tl(cpu_vstart, 0);
+        mark_vs_dirty(s);
         return true;
     }
     return false;
@@ -3452,8 +3457,9 @@ static bool trans_vfmv_s_f(DisasContext *s, arg_vfmv_s_f *a)
         do_nanbox(s, t1, cpu_fpr[a->rs1]);
 
         vec_element_storei(s, a->rd, 0, t1);
-        mark_vs_dirty(s);
         gen_set_label(over);
+        tcg_gen_movi_tl(cpu_vstart, 0);
+        mark_vs_dirty(s);
         return true;
     }
     return false;
-- 
2.43.2




* [PATCH v9 07/10] target/riscv: remove 'over' brconds from vector trans
  2024-03-09 20:43 [PATCH v9 00/10] riscv: set vstart_eq_zero on mark_vs_dirty Daniel Henrique Barboza
                   ` (5 preceding siblings ...)
  2024-03-09 20:43 ` [PATCH v9 06/10] trans_rvv.c.inc: set vstart = 0 in int scalar move insns Daniel Henrique Barboza
@ 2024-03-09 20:43 ` Daniel Henrique Barboza
  2024-03-09 20:43 ` [PATCH v9 08/10] trans_rvv.c.inc: remove redundant mark_vs_dirty() calls Daniel Henrique Barboza
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-09 20:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, richard.henderson, philmd, Daniel Henrique Barboza

Most of the vector translations have the following pattern at the start:

    TCGLabel *over = gen_new_label();
    tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);

And then right at the end:

     gen_set_label(over);
     return true;

This means that if vstart >= vl we won't set vstart = 0 at the end of
the insn, since that is done inside the helper that is being skipped.
This pattern hasn't been a bigger problem because the vstart >= vl
condition is very rare.

Checking the helpers in vector_helper.c, we see that all of them follow
a pattern like this:

    for (i = env->vstart; i < vl; i++) {
        (...)
    }
    env->vstart = 0;

Thus they can handle the vstart >= vl case gracefully, with the benefit
of setting env->vstart = 0 in the process.
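
A minimal sketch of why the loop shape above is safe, assuming a mock
environment rather than the real CPURISCVState (MockEnv, mock_helper()
and mock_helper_selfcheck() are hypothetical names, and the 0xff store
is a placeholder for the per-element operation):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the vstart/vl fields of CPURISCVState. */
typedef struct {
    uint32_t vstart;
    uint32_t vl;
} MockEnv;

/*
 * Same shape as the helpers: walk the body elements, then clear
 * vstart. Returns how many elements were processed.
 */
static uint32_t mock_helper(MockEnv *env, uint8_t *vd)
{
    uint32_t i, done = 0;

    for (i = env->vstart; i < env->vl; i++) {
        vd[i] = 0xff;   /* placeholder for the per-element operation */
        done++;
    }
    env->vstart = 0;    /* runs even when the loop body never does */
    return done;
}

static void mock_helper_selfcheck(void)
{
    uint8_t vd[8] = { 0 };
    MockEnv env = { .vstart = 5, .vl = 4 };

    /* vstart >= vl: no element is touched, vstart is still cleared. */
    assert(mock_helper(&env, vd) == 0);
    assert(vd[4] == 0 && vd[5] == 0 && env.vstart == 0);

    /* vstart < vl: only elements vstart..vl-1 are processed. */
    env.vstart = 2;
    assert(mock_helper(&env, vd) == 2);
    assert(vd[1] == 0 && vd[2] == 0xff && vd[3] == 0xff);
    assert(env.vstart == 0);
}
```

In this sketch the loop also runs zero times when vl is 0, so no
separate check for that case is needed either.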

Remove all 'over' conditionals and let the helper set env->vstart = 0
every time.

Note that not all insns use helpers, and for those cases the 'brcond'
jump is the only way to filter vstart >= vl. This is the case for
trans_vmv_s_x() and trans_vfmv_s_f(). We won't remove the 'brcond'
conditionals from them.

While we're at it, remove the (vl == 0) brconds from trans_rvbf16.c.inc
too since they're unneeded.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/insn_trans/trans_rvbf16.c.inc |  12 ---
 target/riscv/insn_trans/trans_rvv.c.inc    | 108 ---------------------
 target/riscv/insn_trans/trans_rvvk.c.inc   |  18 ----
 3 files changed, 138 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvbf16.c.inc b/target/riscv/insn_trans/trans_rvbf16.c.inc
index 8ee99df3f3..a842e76a6b 100644
--- a/target/riscv/insn_trans/trans_rvbf16.c.inc
+++ b/target/riscv/insn_trans/trans_rvbf16.c.inc
@@ -71,11 +71,8 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, arg_vfncvtbf16_f_f_w *a)
 
     if (opfv_narrow_check(ctx, a) && (ctx->sew == MO_16)) {
         uint32_t data = 0;
-        TCGLabel *over = gen_new_label();
 
         gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);
-        tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
         data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul);
@@ -87,7 +84,6 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, arg_vfncvtbf16_f_f_w *a)
                            ctx->cfg_ptr->vlenb, data,
                            gen_helper_vfncvtbf16_f_f_w);
         mark_vs_dirty(ctx);
-        gen_set_label(over);
         return true;
     }
     return false;
@@ -100,11 +96,8 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, arg_vfwcvtbf16_f_f_v *a)
 
     if (opfv_widen_check(ctx, a) && (ctx->sew == MO_16)) {
         uint32_t data = 0;
-        TCGLabel *over = gen_new_label();
 
         gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);
-        tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
         data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul);
@@ -116,7 +109,6 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, arg_vfwcvtbf16_f_f_v *a)
                            ctx->cfg_ptr->vlenb, data,
                            gen_helper_vfwcvtbf16_f_f_v);
         mark_vs_dirty(ctx);
-        gen_set_label(over);
         return true;
     }
     return false;
@@ -130,11 +122,8 @@ static bool trans_vfwmaccbf16_vv(DisasContext *ctx, arg_vfwmaccbf16_vv *a)
     if (require_rvv(ctx) && vext_check_isa_ill(ctx) && (ctx->sew == MO_16) &&
         vext_check_dss(ctx, a->rd, a->rs1, a->rs2, a->vm)) {
         uint32_t data = 0;
-        TCGLabel *over = gen_new_label();
 
         gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);
-        tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
         data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul);
@@ -147,7 +136,6 @@ static bool trans_vfwmaccbf16_vv(DisasContext *ctx, arg_vfwmaccbf16_vv *a)
                            ctx->cfg_ptr->vlenb, data,
                            gen_helper_vfwmaccbf16_vv);
         mark_vs_dirty(ctx);
-        gen_set_label(over);
         return true;
     }
     return false;
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index 8c16a9f5b3..4c1a064cf6 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -616,9 +616,6 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, uint32_t data,
     TCGv base;
     TCGv_i32 desc;
 
-    TCGLabel *over = gen_new_label();
-    tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
-
     dest = tcg_temp_new_ptr();
     mask = tcg_temp_new_ptr();
     base = get_gpr(s, rs1, EXT_NONE);
@@ -660,7 +657,6 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, uint32_t data,
         tcg_gen_mb(TCG_MO_ALL | TCG_BAR_LDAQ);
     }
 
-    gen_set_label(over);
     return true;
 }
 
@@ -802,9 +798,6 @@ static bool ldst_stride_trans(uint32_t vd, uint32_t rs1, uint32_t rs2,
     TCGv base, stride;
     TCGv_i32 desc;
 
-    TCGLabel *over = gen_new_label();
-    tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
-
     dest = tcg_temp_new_ptr();
     mask = tcg_temp_new_ptr();
     base = get_gpr(s, rs1, EXT_NONE);
@@ -819,7 +812,6 @@ static bool ldst_stride_trans(uint32_t vd, uint32_t rs1, uint32_t rs2,
 
     fn(dest, mask, base, stride, tcg_env, desc);
 
-    gen_set_label(over);
     return true;
 }
 
@@ -906,9 +898,6 @@ static bool ldst_index_trans(uint32_t vd, uint32_t rs1, uint32_t vs2,
     TCGv base;
     TCGv_i32 desc;
 
-    TCGLabel *over = gen_new_label();
-    tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
-
     dest = tcg_temp_new_ptr();
     mask = tcg_temp_new_ptr();
     index = tcg_temp_new_ptr();
@@ -924,7 +913,6 @@ static bool ldst_index_trans(uint32_t vd, uint32_t rs1, uint32_t vs2,
 
     fn(dest, mask, base, index, tcg_env, desc);
 
-    gen_set_label(over);
     return true;
 }
 
@@ -1044,9 +1032,6 @@ static bool ldff_trans(uint32_t vd, uint32_t rs1, uint32_t data,
     TCGv base;
     TCGv_i32 desc;
 
-    TCGLabel *over = gen_new_label();
-    tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
-
     dest = tcg_temp_new_ptr();
     mask = tcg_temp_new_ptr();
     base = get_gpr(s, rs1, EXT_NONE);
@@ -1059,7 +1044,6 @@ static bool ldff_trans(uint32_t vd, uint32_t rs1, uint32_t data,
     fn(dest, mask, base, tcg_env, desc);
 
     mark_vs_dirty(s);
-    gen_set_label(over);
     return true;
 }
 
@@ -1100,10 +1084,6 @@ static bool ldst_whole_trans(uint32_t vd, uint32_t rs1, uint32_t nf,
                              uint32_t width, gen_helper_ldst_whole *fn,
                              DisasContext *s)
 {
-    uint32_t evl = s->cfg_ptr->vlenb * nf / width;
-    TCGLabel *over = gen_new_label();
-    tcg_gen_brcondi_tl(TCG_COND_GEU, cpu_vstart, evl, over);
-
     TCGv_ptr dest;
     TCGv base;
     TCGv_i32 desc;
@@ -1120,8 +1100,6 @@ static bool ldst_whole_trans(uint32_t vd, uint32_t rs1, uint32_t nf,
 
     fn(dest, base, tcg_env, desc);
 
-    gen_set_label(over);
-
     return true;
 }
 
@@ -1195,10 +1173,6 @@ static inline bool
 do_opivv_gvec(DisasContext *s, arg_rmrr *a, GVecGen3Fn *gvec_fn,
               gen_helper_gvec_4_ptr *fn)
 {
-    TCGLabel *over = gen_new_label();
-
-    tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
-
     if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) {
         gvec_fn(s->sew, vreg_ofs(s, a->rd),
                 vreg_ofs(s, a->rs2), vreg_ofs(s, a->rs1),
@@ -1216,7 +1190,6 @@ do_opivv_gvec(DisasContext *s, arg_rmrr *a, GVecGen3Fn *gvec_fn,
                            s->cfg_ptr->vlenb, data, fn);
     }
     mark_vs_dirty(s);
-    gen_set_label(over);
     return true;
 }
 
@@ -1248,9 +1221,6 @@ static bool opivx_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, uint32_t vm,
     TCGv_i32 desc;
     uint32_t data = 0;
 
-    TCGLabel *over = gen_new_label();
-    tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
-
     dest = tcg_temp_new_ptr();
     mask = tcg_temp_new_ptr();
     src2 = tcg_temp_new_ptr();
@@ -1271,7 +1241,6 @@ static bool opivx_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, uint32_t vm,
     fn(dest, mask, src1, src2, tcg_env, desc);
 
     mark_vs_dirty(s);
-    gen_set_label(over);
     return true;
 }
 
@@ -1410,9 +1379,6 @@ static bool opivi_trans(uint32_t vd, uint32_t imm, uint32_t vs2, uint32_t vm,
     TCGv_i32 desc;
     uint32_t data = 0;
 
-    TCGLabel *over = gen_new_label();
-    tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
-
     dest = tcg_temp_new_ptr();
     mask = tcg_temp_new_ptr();
     src2 = tcg_temp_new_ptr();
@@ -1433,7 +1399,6 @@ static bool opivi_trans(uint32_t vd, uint32_t imm, uint32_t vs2, uint32_t vm,
     fn(dest, mask, src1, src2, tcg_env, desc);
 
     mark_vs_dirty(s);
-    gen_set_label(over);
     return true;
 }
 
@@ -1495,8 +1460,6 @@ static bool do_opivv_widen(DisasContext *s, arg_rmrr *a,
 {
     if (checkfn(s, a)) {
         uint32_t data = 0;
-        TCGLabel *over = gen_new_label();
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
@@ -1509,7 +1472,6 @@ static bool do_opivv_widen(DisasContext *s, arg_rmrr *a,
                            s->cfg_ptr->vlenb,
                            data, fn);
         mark_vs_dirty(s);
-        gen_set_label(over);
         return true;
     }
     return false;
@@ -1571,8 +1533,6 @@ static bool do_opiwv_widen(DisasContext *s, arg_rmrr *a,
 {
     if (opiwv_widen_check(s, a)) {
         uint32_t data = 0;
-        TCGLabel *over = gen_new_label();
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
@@ -1584,7 +1544,6 @@ static bool do_opiwv_widen(DisasContext *s, arg_rmrr *a,
                            tcg_env, s->cfg_ptr->vlenb,
                            s->cfg_ptr->vlenb, data, fn);
         mark_vs_dirty(s);
-        gen_set_label(over);
         return true;
     }
     return false;
@@ -1643,8 +1602,6 @@ static bool opivv_trans(uint32_t vd, uint32_t vs1, uint32_t vs2, uint32_t vm,
                         gen_helper_gvec_4_ptr *fn, DisasContext *s)
 {
     uint32_t data = 0;
-    TCGLabel *over = gen_new_label();
-    tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 
     data = FIELD_DP32(data, VDATA, VM, vm);
     data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
@@ -1655,7 +1612,6 @@ static bool opivv_trans(uint32_t vd, uint32_t vs1, uint32_t vs2, uint32_t vm,
                        vreg_ofs(s, vs2), tcg_env, s->cfg_ptr->vlenb,
                        s->cfg_ptr->vlenb, data, fn);
     mark_vs_dirty(s);
-    gen_set_label(over);
     return true;
 }
 
@@ -1834,8 +1790,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a)             \
             gen_helper_##NAME##_h,                                 \
             gen_helper_##NAME##_w,                                 \
         };                                                         \
-        TCGLabel *over = gen_new_label();                          \
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \
                                                                    \
         data = FIELD_DP32(data, VDATA, VM, a->vm);                 \
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);             \
@@ -1848,7 +1802,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a)             \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew]);                           \
         mark_vs_dirty(s);                                          \
-        gen_set_label(over);                                       \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -2045,14 +1998,11 @@ static bool trans_vmv_v_v(DisasContext *s, arg_vmv_v_v *a)
                 gen_helper_vmv_v_v_b, gen_helper_vmv_v_v_h,
                 gen_helper_vmv_v_v_w, gen_helper_vmv_v_v_d,
             };
-            TCGLabel *over = gen_new_label();
-            tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 
             tcg_gen_gvec_2_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs1),
                                tcg_env, s->cfg_ptr->vlenb,
                                s->cfg_ptr->vlenb, data,
                                fns[s->sew]);
-            gen_set_label(over);
         }
         mark_vs_dirty(s);
         return true;
@@ -2068,8 +2018,6 @@ static bool trans_vmv_v_x(DisasContext *s, arg_vmv_v_x *a)
         /* vmv.v.x has rs2 = 0 and vm = 1 */
         vext_check_ss(s, a->rd, 0, 1)) {
         TCGv s1;
-        TCGLabel *over = gen_new_label();
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 
         s1 = get_gpr(s, a->rs1, EXT_SIGN);
 
@@ -2102,7 +2050,6 @@ static bool trans_vmv_v_x(DisasContext *s, arg_vmv_v_x *a)
         }
 
         mark_vs_dirty(s);
-        gen_set_label(over);
         return true;
     }
     return false;
@@ -2129,8 +2076,6 @@ static bool trans_vmv_v_i(DisasContext *s, arg_vmv_v_i *a)
                 gen_helper_vmv_v_x_b, gen_helper_vmv_v_x_h,
                 gen_helper_vmv_v_x_w, gen_helper_vmv_v_x_d,
             };
-            TCGLabel *over = gen_new_label();
-            tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 
             s1 = tcg_constant_i64(simm);
             dest = tcg_temp_new_ptr();
@@ -2140,7 +2085,6 @@ static bool trans_vmv_v_i(DisasContext *s, arg_vmv_v_i *a)
             fns[s->sew](dest, s1, tcg_env, desc);
 
             mark_vs_dirty(s);
-            gen_set_label(over);
         }
         return true;
     }
@@ -2275,9 +2219,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a)             \
             gen_helper_##NAME##_w,                                 \
             gen_helper_##NAME##_d,                                 \
         };                                                         \
-        TCGLabel *over = gen_new_label();                          \
         gen_set_rm(s, RISCV_FRM_DYN);                              \
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \
                                                                    \
         data = FIELD_DP32(data, VDATA, VM, a->vm);                 \
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);             \
@@ -2292,7 +2234,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a)             \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew - 1]);                       \
         mark_vs_dirty(s);                                          \
-        gen_set_label(over);                                       \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -2310,9 +2251,6 @@ static bool opfvf_trans(uint32_t vd, uint32_t rs1, uint32_t vs2,
     TCGv_i32 desc;
     TCGv_i64 t1;
 
-    TCGLabel *over = gen_new_label();
-    tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
-
     dest = tcg_temp_new_ptr();
     mask = tcg_temp_new_ptr();
     src2 = tcg_temp_new_ptr();
@@ -2330,7 +2268,6 @@ static bool opfvf_trans(uint32_t vd, uint32_t rs1, uint32_t vs2,
     fn(dest, mask, t1, src2, tcg_env, desc);
 
     mark_vs_dirty(s);
-    gen_set_label(over);
     return true;
 }
 
@@ -2393,9 +2330,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a)           \
         static gen_helper_gvec_4_ptr * const fns[2] = {          \
             gen_helper_##NAME##_h, gen_helper_##NAME##_w,        \
         };                                                       \
-        TCGLabel *over = gen_new_label();                        \
         gen_set_rm(s, RISCV_FRM_DYN);                            \
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);\
                                                                  \
         data = FIELD_DP32(data, VDATA, VM, a->vm);               \
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);           \
@@ -2408,7 +2343,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a)           \
                            s->cfg_ptr->vlenb, data,              \
                            fns[s->sew - 1]);                     \
         mark_vs_dirty(s);                                        \
-        gen_set_label(over);                                     \
         return true;                                             \
     }                                                            \
     return false;                                                \
@@ -2467,9 +2401,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a)             \
         static gen_helper_gvec_4_ptr * const fns[2] = {            \
             gen_helper_##NAME##_h, gen_helper_##NAME##_w,          \
         };                                                         \
-        TCGLabel *over = gen_new_label();                          \
         gen_set_rm(s, RISCV_FRM_DYN);                              \
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \
                                                                    \
         data = FIELD_DP32(data, VDATA, VM, a->vm);                 \
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);             \
@@ -2482,7 +2414,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a)             \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew - 1]);                       \
         mark_vs_dirty(s);                                          \
-        gen_set_label(over);                                       \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -2584,9 +2515,7 @@ static bool do_opfv(DisasContext *s, arg_rmr *a,
 {
     if (checkfn(s, a)) {
         uint32_t data = 0;
-        TCGLabel *over = gen_new_label();
         gen_set_rm_chkfrm(s, rm);
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
@@ -2597,7 +2526,6 @@ static bool do_opfv(DisasContext *s, arg_rmr *a,
                            s->cfg_ptr->vlenb,
                            s->cfg_ptr->vlenb, data, fn);
         mark_vs_dirty(s);
-        gen_set_label(over);
         return true;
     }
     return false;
@@ -2696,8 +2624,6 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a)
                 gen_helper_vmv_v_x_w,
                 gen_helper_vmv_v_x_d,
             };
-            TCGLabel *over = gen_new_label();
-            tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 
             t1 = tcg_temp_new_i64();
             /* NaN-box f[rs1] */
@@ -2711,7 +2637,6 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a)
             fns[s->sew - 1](dest, t1, tcg_env, desc);
 
             mark_vs_dirty(s);
-            gen_set_label(over);
         }
         return true;
     }
@@ -2773,9 +2698,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
             gen_helper_##HELPER##_h,                               \
             gen_helper_##HELPER##_w,                               \
         };                                                         \
-        TCGLabel *over = gen_new_label();                          \
         gen_set_rm_chkfrm(s, FRM);                                 \
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \
                                                                    \
         data = FIELD_DP32(data, VDATA, VM, a->vm);                 \
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);             \
@@ -2787,7 +2710,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew - 1]);                       \
         mark_vs_dirty(s);                                          \
-        gen_set_label(over);                                       \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -2824,9 +2746,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
             gen_helper_##NAME##_h,                                 \
             gen_helper_##NAME##_w,                                 \
         };                                                         \
-        TCGLabel *over = gen_new_label();                          \
         gen_set_rm(s, RISCV_FRM_DYN);                              \
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \
                                                                    \
         data = FIELD_DP32(data, VDATA, VM, a->vm);                 \
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);             \
@@ -2838,7 +2758,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew]);                           \
         mark_vs_dirty(s);                                          \
-        gen_set_label(over);                                       \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -2891,9 +2810,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
             gen_helper_##HELPER##_h,                               \
             gen_helper_##HELPER##_w,                               \
         };                                                         \
-        TCGLabel *over = gen_new_label();                          \
         gen_set_rm_chkfrm(s, FRM);                                 \
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \
                                                                    \
         data = FIELD_DP32(data, VDATA, VM, a->vm);                 \
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);             \
@@ -2905,7 +2822,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew - 1]);                       \
         mark_vs_dirty(s);                                          \
-        gen_set_label(over);                                       \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -2940,9 +2856,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
             gen_helper_##HELPER##_h,                               \
             gen_helper_##HELPER##_w,                               \
         };                                                         \
-        TCGLabel *over = gen_new_label();                          \
         gen_set_rm_chkfrm(s, FRM);                                 \
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \
                                                                    \
         data = FIELD_DP32(data, VDATA, VM, a->vm);                 \
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);             \
@@ -2954,7 +2868,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew]);                           \
         mark_vs_dirty(s);                                          \
-        gen_set_label(over);                                       \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -3031,8 +2944,6 @@ static bool trans_##NAME(DisasContext *s, arg_r *a)                \
         vext_check_isa_ill(s)) {                                   \
         uint32_t data = 0;                                         \
         gen_helper_gvec_4_ptr *fn = gen_helper_##NAME;             \
-        TCGLabel *over = gen_new_label();                          \
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \
                                                                    \
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);             \
         data =                                                     \
@@ -3043,7 +2954,6 @@ static bool trans_##NAME(DisasContext *s, arg_r *a)                \
                            s->cfg_ptr->vlenb,                      \
                            s->cfg_ptr->vlenb, data, fn);           \
         mark_vs_dirty(s);                                          \
-        gen_set_label(over);                                       \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -3131,8 +3041,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
         s->vstart_eq_zero) {                                       \
         uint32_t data = 0;                                         \
         gen_helper_gvec_3_ptr *fn = gen_helper_##NAME;             \
-        TCGLabel *over = gen_new_label();                          \
-        tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);          \
                                                                    \
         data = FIELD_DP32(data, VDATA, VM, a->vm);                 \
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);             \
@@ -3145,7 +3053,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
                            s->cfg_ptr->vlenb,                      \
                            data, fn);                              \
         mark_vs_dirty(s);                                          \
-        gen_set_label(over);                                       \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -3171,8 +3078,6 @@ static bool trans_viota_m(DisasContext *s, arg_viota_m *a)
         require_align(a->rd, s->lmul) &&
         s->vstart_eq_zero) {
         uint32_t data = 0;
-        TCGLabel *over = gen_new_label();
-        tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
@@ -3187,7 +3092,6 @@ static bool trans_viota_m(DisasContext *s, arg_viota_m *a)
                            s->cfg_ptr->vlenb,
                            s->cfg_ptr->vlenb, data, fns[s->sew]);
         mark_vs_dirty(s);
-        gen_set_label(over);
         return true;
     }
     return false;
@@ -3201,8 +3105,6 @@ static bool trans_vid_v(DisasContext *s, arg_vid_v *a)
         require_align(a->rd, s->lmul) &&
         require_vm(a->vm, a->rd)) {
         uint32_t data = 0;
-        TCGLabel *over = gen_new_label();
-        tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
@@ -3217,7 +3119,6 @@ static bool trans_vid_v(DisasContext *s, arg_vid_v *a)
                            s->cfg_ptr->vlenb,
                            data, fns[s->sew]);
         mark_vs_dirty(s);
-        gen_set_label(over);
         return true;
     }
     return false;
@@ -3630,8 +3531,6 @@ static bool trans_vcompress_vm(DisasContext *s, arg_r *a)
             gen_helper_vcompress_vm_b, gen_helper_vcompress_vm_h,
             gen_helper_vcompress_vm_w, gen_helper_vcompress_vm_d,
         };
-        TCGLabel *over = gen_new_label();
-        tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
 
         data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
         data = FIELD_DP32(data, VDATA, VTA, s->vta);
@@ -3641,7 +3540,6 @@ static bool trans_vcompress_vm(DisasContext *s, arg_r *a)
                            s->cfg_ptr->vlenb, data,
                            fns[s->sew]);
         mark_vs_dirty(s);
-        gen_set_label(over);
         return true;
     }
     return false;
@@ -3664,12 +3562,9 @@ static bool trans_##NAME(DisasContext *s, arg_##NAME * a)               \
                              vreg_ofs(s, a->rs2), maxsz, maxsz);        \
             mark_vs_dirty(s);                                           \
         } else {                                                        \
-            TCGLabel *over = gen_new_label();                           \
-            tcg_gen_brcondi_tl(TCG_COND_GEU, cpu_vstart, maxsz, over);  \
             tcg_gen_gvec_2_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), \
                                tcg_env, maxsz, maxsz, 0, gen_helper_vmvr_v); \
             mark_vs_dirty(s);                                           \
-            gen_set_label(over);                                        \
         }                                                               \
         return true;                                                    \
     }                                                                   \
@@ -3698,8 +3593,6 @@ static bool int_ext_op(DisasContext *s, arg_rmr *a, uint8_t seq)
 {
     uint32_t data = 0;
     gen_helper_gvec_3_ptr *fn;
-    TCGLabel *over = gen_new_label();
-    tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
 
     static gen_helper_gvec_3_ptr * const fns[6][4] = {
         {
@@ -3744,7 +3637,6 @@ static bool int_ext_op(DisasContext *s, arg_rmr *a, uint8_t seq)
                        s->cfg_ptr->vlenb, data, fn);
 
     mark_vs_dirty(s);
-    gen_set_label(over);
     return true;
 }
 
diff --git a/target/riscv/insn_trans/trans_rvvk.c.inc b/target/riscv/insn_trans/trans_rvvk.c.inc
index a5cdd1b67f..6d640e4596 100644
--- a/target/riscv/insn_trans/trans_rvvk.c.inc
+++ b/target/riscv/insn_trans/trans_rvvk.c.inc
@@ -164,8 +164,6 @@ GEN_OPIVX_GVEC_TRANS_CHECK(vandn_vx, andcs, zvkb_vx_check)
                 gen_helper_##NAME##_w,                                     \
                 gen_helper_##NAME##_d,                                     \
             };                                                             \
-            TCGLabel *over = gen_new_label();                              \
-            tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);     \
                                                                            \
             data = FIELD_DP32(data, VDATA, VM, a->vm);                     \
             data = FIELD_DP32(data, VDATA, LMUL, s->lmul);                 \
@@ -177,7 +175,6 @@ GEN_OPIVX_GVEC_TRANS_CHECK(vandn_vx, andcs, zvkb_vx_check)
                                s->cfg_ptr->vlenb, s->cfg_ptr->vlenb,       \
                                data, fns[s->sew]);                         \
             mark_vs_dirty(s);                                              \
-            gen_set_label(over);                                           \
             return true;                                                   \
         }                                                                  \
         return false;                                                      \
@@ -249,14 +246,12 @@ GEN_OPIVI_WIDEN_TRANS(vwsll_vi, IMM_ZX, vwsll_vx, vwsll_vx_check)
             TCGv_ptr rd_v, rs2_v;                                             \
             TCGv_i32 desc, egs;                                               \
             uint32_t data = 0;                                                \
-            TCGLabel *over = gen_new_label();                                 \
                                                                               \
             if (!s->vstart_eq_zero || !s->vl_eq_vlmax) {                      \
                 /* save opcode for unwinding in case we throw an exception */ \
                 decode_save_opc(s);                                           \
                 egs = tcg_constant_i32(EGS);                                  \
                 gen_helper_egs_check(egs, tcg_env);                           \
-                tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);    \
             }                                                                 \
                                                                               \
             data = FIELD_DP32(data, VDATA, VM, a->vm);                        \
@@ -272,7 +267,6 @@ GEN_OPIVI_WIDEN_TRANS(vwsll_vi, IMM_ZX, vwsll_vx, vwsll_vx_check)
             tcg_gen_addi_ptr(rs2_v, tcg_env, vreg_ofs(s, a->rs2));            \
             gen_helper_##NAME(rd_v, rs2_v, tcg_env, desc);                    \
             mark_vs_dirty(s);                                                 \
-            gen_set_label(over);                                              \
             return true;                                                      \
         }                                                                     \
         return false;                                                         \
@@ -325,14 +319,12 @@ GEN_V_UNMASKED_TRANS(vaesem_vs, vaes_check_vs, ZVKNED_EGS)
             TCGv_ptr rd_v, rs2_v;                                             \
             TCGv_i32 uimm_v, desc, egs;                                       \
             uint32_t data = 0;                                                \
-            TCGLabel *over = gen_new_label();                                 \
                                                                               \
             if (!s->vstart_eq_zero || !s->vl_eq_vlmax) {                      \
                 /* save opcode for unwinding in case we throw an exception */ \
                 decode_save_opc(s);                                           \
                 egs = tcg_constant_i32(EGS);                                  \
                 gen_helper_egs_check(egs, tcg_env);                           \
-                tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);    \
             }                                                                 \
                                                                               \
             data = FIELD_DP32(data, VDATA, VM, a->vm);                        \
@@ -350,7 +342,6 @@ GEN_V_UNMASKED_TRANS(vaesem_vs, vaes_check_vs, ZVKNED_EGS)
             tcg_gen_addi_ptr(rs2_v, tcg_env, vreg_ofs(s, a->rs2));            \
             gen_helper_##NAME(rd_v, rs2_v, uimm_v, tcg_env, desc);            \
             mark_vs_dirty(s);                                                 \
-            gen_set_label(over);                                              \
             return true;                                                      \
         }                                                                     \
         return false;                                                         \
@@ -394,7 +385,6 @@ GEN_VI_UNMASKED_TRANS(vaeskf2_vi, vaeskf2_check, ZVKNED_EGS)
     {                                                                         \
         if (CHECK(s, a)) {                                                    \
             uint32_t data = 0;                                                \
-            TCGLabel *over = gen_new_label();                                 \
             TCGv_i32 egs;                                                     \
                                                                               \
             if (!s->vstart_eq_zero || !s->vl_eq_vlmax) {                      \
@@ -402,7 +392,6 @@ GEN_VI_UNMASKED_TRANS(vaeskf2_vi, vaeskf2_check, ZVKNED_EGS)
                 decode_save_opc(s);                                           \
                 egs = tcg_constant_i32(EGS);                                  \
                 gen_helper_egs_check(egs, tcg_env);                           \
-                tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);    \
             }                                                                 \
                                                                               \
             data = FIELD_DP32(data, VDATA, VM, a->vm);                        \
@@ -417,7 +406,6 @@ GEN_VI_UNMASKED_TRANS(vaeskf2_vi, vaeskf2_check, ZVKNED_EGS)
                                data, gen_helper_##NAME);                      \
                                                                               \
             mark_vs_dirty(s);                                                 \
-            gen_set_label(over);                                              \
             return true;                                                      \
         }                                                                     \
         return false;                                                         \
@@ -448,7 +436,6 @@ static bool trans_vsha2cl_vv(DisasContext *s, arg_rmrr *a)
 {
     if (vsha_check(s, a)) {
         uint32_t data = 0;
-        TCGLabel *over = gen_new_label();
         TCGv_i32 egs;
 
         if (!s->vstart_eq_zero || !s->vl_eq_vlmax) {
@@ -456,7 +443,6 @@ static bool trans_vsha2cl_vv(DisasContext *s, arg_rmrr *a)
             decode_save_opc(s);
             egs = tcg_constant_i32(ZVKNH_EGS);
             gen_helper_egs_check(egs, tcg_env);
-            tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
         }
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
@@ -472,7 +458,6 @@ static bool trans_vsha2cl_vv(DisasContext *s, arg_rmrr *a)
                 gen_helper_vsha2cl32_vv : gen_helper_vsha2cl64_vv);
 
         mark_vs_dirty(s);
-        gen_set_label(over);
         return true;
     }
     return false;
@@ -482,7 +467,6 @@ static bool trans_vsha2ch_vv(DisasContext *s, arg_rmrr *a)
 {
     if (vsha_check(s, a)) {
         uint32_t data = 0;
-        TCGLabel *over = gen_new_label();
         TCGv_i32 egs;
 
         if (!s->vstart_eq_zero || !s->vl_eq_vlmax) {
@@ -490,7 +474,6 @@ static bool trans_vsha2ch_vv(DisasContext *s, arg_rmrr *a)
             decode_save_opc(s);
             egs = tcg_constant_i32(ZVKNH_EGS);
             gen_helper_egs_check(egs, tcg_env);
-            tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
         }
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
@@ -506,7 +489,6 @@ static bool trans_vsha2ch_vv(DisasContext *s, arg_rmrr *a)
                 gen_helper_vsha2ch32_vv : gen_helper_vsha2ch64_vv);
 
         mark_vs_dirty(s);
-        gen_set_label(over);
         return true;
     }
     return false;
-- 
2.43.2




* [PATCH v9 08/10] trans_rvv.c.inc: remove redundant mark_vs_dirty() calls
  2024-03-09 20:43 [PATCH v9 00/10] riscv: set vstart_eq_zero on mark_vs_dirty Daniel Henrique Barboza
                   ` (6 preceding siblings ...)
  2024-03-09 20:43 ` [PATCH v9 07/10] target/riscv: remove 'over' brconds from vector trans Daniel Henrique Barboza
@ 2024-03-09 20:43 ` Daniel Henrique Barboza
  2024-03-09 20:43 ` [PATCH v9 09/10] target/riscv: Clear vstart_qe_zero flag Daniel Henrique Barboza
  2024-03-09 20:43 ` [PATCH v9 10/10] target/riscv/vector_helper.c: optimize loops in ldst helpers Daniel Henrique Barboza
  9 siblings, 0 replies; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-09 20:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, richard.henderson, philmd, Daniel Henrique Barboza

trans_vmv_v_i(), trans_vfmv_v_f() and the trans_##NAME macro from
GEN_VMV_WHOLE_TRANS() call mark_vs_dirty() in both branches of their
'if' conditionals.

Call it just once at the end, as other functions already do.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
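Note: the shape of this cleanup can be sketched outside of QEMU. The
snippet below is only an illustration: toy_mark_dirty(),
toy_trans_before() and toy_trans_after() are hypothetical stand-ins, not
QEMU functions, and the branch bodies are left as comments. It shows
that hoisting the call out of both branches still makes exactly one
call on either path.

```c
#include <stdbool.h>

static int toy_dirty_calls;   /* counts calls to the stand-in below */

/* Hypothetical stand-in for mark_vs_dirty(); it only counts calls. */
static void toy_mark_dirty(void)
{
    toy_dirty_calls++;
}

/* Before: the call is duplicated at the end of each branch. */
static int toy_trans_before(bool fast_path)
{
    toy_dirty_calls = 0;
    if (fast_path) {
        /* inline gvec expansion would be emitted here */
        toy_mark_dirty();
    } else {
        /* out-of-line helper call would be emitted here */
        toy_mark_dirty();
    }
    return toy_dirty_calls;
}

/* After: a single call placed after the if/else. */
static int toy_trans_after(bool fast_path)
{
    toy_dirty_calls = 0;
    if (fast_path) {
        /* inline gvec expansion would be emitted here */
    } else {
        /* out-of-line helper call would be emitted here */
    }
    toy_mark_dirty();
    return toy_dirty_calls;
}
```

Both variants return 1 for either value of fast_path, so the change is
not expected to alter behavior.
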
 target/riscv/insn_trans/trans_rvv.c.inc | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index 4c1a064cf6..b0f19dcd85 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -2065,7 +2065,6 @@ static bool trans_vmv_v_i(DisasContext *s, arg_vmv_v_i *a)
         if (s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) {
             tcg_gen_gvec_dup_imm(s->sew, vreg_ofs(s, a->rd),
                                  MAXSZ(s), MAXSZ(s), simm);
-            mark_vs_dirty(s);
         } else {
             TCGv_i32 desc;
             TCGv_i64 s1;
@@ -2083,9 +2082,8 @@ static bool trans_vmv_v_i(DisasContext *s, arg_vmv_v_i *a)
                                               s->cfg_ptr->vlenb, data));
             tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, a->rd));
             fns[s->sew](dest, s1, tcg_env, desc);
-
-            mark_vs_dirty(s);
         }
+        mark_vs_dirty(s);
         return true;
     }
     return false;
@@ -2612,7 +2610,6 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a)
 
             tcg_gen_gvec_dup_i64(s->sew, vreg_ofs(s, a->rd),
                                  MAXSZ(s), MAXSZ(s), t1);
-            mark_vs_dirty(s);
         } else {
             TCGv_ptr dest;
             TCGv_i32 desc;
@@ -2635,9 +2632,8 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a)
             tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, a->rd));
 
             fns[s->sew - 1](dest, t1, tcg_env, desc);
-
-            mark_vs_dirty(s);
         }
+        mark_vs_dirty(s);
         return true;
     }
     return false;
@@ -3560,12 +3556,11 @@ static bool trans_##NAME(DisasContext *s, arg_##NAME * a)               \
         if (s->vstart_eq_zero) {                                        \
             tcg_gen_gvec_mov(s->sew, vreg_ofs(s, a->rd),                \
                              vreg_ofs(s, a->rs2), maxsz, maxsz);        \
-            mark_vs_dirty(s);                                           \
         } else {                                                        \
             tcg_gen_gvec_2_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), \
                                tcg_env, maxsz, maxsz, 0, gen_helper_vmvr_v); \
-            mark_vs_dirty(s);                                           \
         }                                                               \
+        mark_vs_dirty(s);                                               \
         return true;                                                    \
     }                                                                   \
     return false;                                                       \
-- 
2.43.2




* [PATCH v9 09/10] target/riscv: Clear vstart_qe_zero flag
  2024-03-09 20:43 [PATCH v9 00/10] riscv: set vstart_eq_zero on mark_vs_dirty Daniel Henrique Barboza
                   ` (7 preceding siblings ...)
  2024-03-09 20:43 ` [PATCH v9 08/10] trans_rvv.c.inc: remove redundant mark_vs_dirty() calls Daniel Henrique Barboza
@ 2024-03-09 20:43 ` Daniel Henrique Barboza
  2024-03-10  7:47   ` Richard Henderson
  2024-03-09 20:43 ` [PATCH v9 10/10] target/riscv/vector_helper.c: optimize loops in ldst helpers Daniel Henrique Barboza
  9 siblings, 1 reply; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-09 20:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, richard.henderson, philmd, Ivan Klokov,
	Daniel Henrique Barboza

From: Ivan Klokov <ivan.klokov@syntacore.com>

The vstart_eq_zero flag is set at the beginning of the translation
phase from the env->vstart variable. During the execution phase all
functions will set env->vstart = 0 after a successful execution,
but the vstart_eq_zero flag remains the same as at the start of the
block. This will wrongly cause SIGILLs in translations that require
env->vstart = 0 and might be reading vstart_eq_zero = false.

This patch adds a new finalize_rvv_inst() helper, called at the end of
each vector instruction, that both updates vstart_eq_zero and does a
mark_vs_dirty().

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1976
Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
---
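Note: the stale-flag problem described above can be modelled in a few
lines of standalone C. This is a toy sketch, not the QEMU code: ToyCtx
and every toy_* function are hypothetical, and toy_finalize_rvv_inst()
only mirrors what the commit message says the real helper does (update
vstart_eq_zero and mark VS dirty). The second translator stands for any
instruction whose check requires s->vstart_eq_zero, as trans_viota_m()
does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy translation context; NOT the QEMU DisasContext. */
typedef struct ToyCtx {
    bool vstart_eq_zero;   /* translation-time view of env->vstart == 0 */
    bool vs_dirty;
} ToyCtx;

/* Hypothetical analogue of mark_vs_dirty(). */
static void toy_mark_vs_dirty(ToyCtx *s)
{
    s->vs_dirty = true;
}

/*
 * Hypothetical analogue of finalize_rvv_inst(): besides marking VS
 * dirty, record that vstart is 0 once the instruction has completed.
 */
static void toy_finalize_rvv_inst(ToyCtx *s)
{
    toy_mark_vs_dirty(s);
    s->vstart_eq_zero = true;
}

/* A vector insn that is legal for any vstart value. */
static bool toy_trans_any_vstart(ToyCtx *s, bool use_finalize)
{
    if (use_finalize) {
        toy_finalize_rvv_inst(s);
    } else {
        toy_mark_vs_dirty(s);
    }
    return true;
}

/* A vector insn whose check requires vstart == 0. */
static bool toy_trans_needs_vstart_zero(ToyCtx *s, bool use_finalize)
{
    if (!s->vstart_eq_zero) {
        return false;      /* would be reported as an illegal insn */
    }
    if (use_finalize) {
        toy_finalize_rvv_inst(s);
    } else {
        toy_mark_vs_dirty(s);
    }
    return true;
}

/*
 * Translate a two-insn block that is entered with vstart != 0 and
 * report whether the second insn was accepted.
 */
static bool toy_translate_block(bool use_finalize)
{
    ToyCtx s = { .vstart_eq_zero = false, .vs_dirty = false };
    bool first = toy_trans_any_vstart(&s, use_finalize);

    assert(first);
    return toy_trans_needs_vstart_zero(&s, use_finalize);
}
```

In this model toy_translate_block(false) rejects the second
instruction, because the flag still reflects the state at block entry,
while toy_translate_block(true) accepts it.
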
 target/riscv/insn_trans/trans_rvbf16.c.inc |  6 +-
 target/riscv/insn_trans/trans_rvv.c.inc    | 83 ++++++++++++----------
 target/riscv/insn_trans/trans_rvvk.c.inc   | 12 ++--
 target/riscv/translate.c                   |  6 ++
 4 files changed, 59 insertions(+), 48 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvbf16.c.inc b/target/riscv/insn_trans/trans_rvbf16.c.inc
index a842e76a6b..0a9cd1ec31 100644
--- a/target/riscv/insn_trans/trans_rvbf16.c.inc
+++ b/target/riscv/insn_trans/trans_rvbf16.c.inc
@@ -83,7 +83,7 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, arg_vfncvtbf16_f_f_w *a)
                            ctx->cfg_ptr->vlenb,
                            ctx->cfg_ptr->vlenb, data,
                            gen_helper_vfncvtbf16_f_f_w);
-        mark_vs_dirty(ctx);
+        finalize_rvv_inst(ctx);
         return true;
     }
     return false;
@@ -108,7 +108,7 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, arg_vfwcvtbf16_f_f_v *a)
                            ctx->cfg_ptr->vlenb,
                            ctx->cfg_ptr->vlenb, data,
                            gen_helper_vfwcvtbf16_f_f_v);
-        mark_vs_dirty(ctx);
+        finalize_rvv_inst(ctx);
         return true;
     }
     return false;
@@ -135,7 +135,7 @@ static bool trans_vfwmaccbf16_vv(DisasContext *ctx, arg_vfwmaccbf16_vv *a)
                            ctx->cfg_ptr->vlenb,
                            ctx->cfg_ptr->vlenb, data,
                            gen_helper_vfwmaccbf16_vv);
-        mark_vs_dirty(ctx);
+        finalize_rvv_inst(ctx);
         return true;
     }
     return false;
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index b0f19dcd85..b3d467a874 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -167,7 +167,7 @@ static bool do_vsetvl(DisasContext *s, int rd, int rs1, TCGv s2)
 
     gen_helper_vsetvl(dst, tcg_env, s1, s2);
     gen_set_gpr(s, rd, dst);
-    mark_vs_dirty(s);
+    finalize_rvv_inst(s);
 
     gen_update_pc(s, s->cur_insn_len);
     lookup_and_goto_ptr(s);
@@ -187,7 +187,7 @@ static bool do_vsetivli(DisasContext *s, int rd, TCGv s1, TCGv s2)
 
     gen_helper_vsetvl(dst, tcg_env, s1, s2);
     gen_set_gpr(s, rd, dst);
-    mark_vs_dirty(s);
+    finalize_rvv_inst(s);
     gen_update_pc(s, s->cur_insn_len);
     lookup_and_goto_ptr(s);
     s->base.is_jmp = DISAS_NORETURN;
@@ -657,6 +657,7 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, uint32_t data,
         tcg_gen_mb(TCG_MO_ALL | TCG_BAR_LDAQ);
     }
 
+    finalize_rvv_inst(s);
     return true;
 }
 
@@ -812,6 +813,7 @@ static bool ldst_stride_trans(uint32_t vd, uint32_t rs1, uint32_t rs2,
 
     fn(dest, mask, base, stride, tcg_env, desc);
 
+    finalize_rvv_inst(s);
     return true;
 }
 
@@ -913,6 +915,7 @@ static bool ldst_index_trans(uint32_t vd, uint32_t rs1, uint32_t vs2,
 
     fn(dest, mask, base, index, tcg_env, desc);
 
+    finalize_rvv_inst(s);
     return true;
 }
 
@@ -1043,7 +1046,7 @@ static bool ldff_trans(uint32_t vd, uint32_t rs1, uint32_t data,
 
     fn(dest, mask, base, tcg_env, desc);
 
-    mark_vs_dirty(s);
+    finalize_rvv_inst(s);
     return true;
 }
 
@@ -1100,6 +1103,7 @@ static bool ldst_whole_trans(uint32_t vd, uint32_t rs1, uint32_t nf,
 
     fn(dest, base, tcg_env, desc);
 
+    finalize_rvv_inst(s);
     return true;
 }
 
@@ -1189,7 +1193,7 @@ do_opivv_gvec(DisasContext *s, arg_rmrr *a, GVecGen3Fn *gvec_fn,
                            tcg_env, s->cfg_ptr->vlenb,
                            s->cfg_ptr->vlenb, data, fn);
     }
-    mark_vs_dirty(s);
+    finalize_rvv_inst(s);
     return true;
 }
 
@@ -1240,7 +1244,7 @@ static bool opivx_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, uint32_t vm,
 
     fn(dest, mask, src1, src2, tcg_env, desc);
 
-    mark_vs_dirty(s);
+    finalize_rvv_inst(s);
     return true;
 }
 
@@ -1265,7 +1269,7 @@ do_opivx_gvec(DisasContext *s, arg_rmrr *a, GVecGen2sFn *gvec_fn,
         gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2),
                 src1, MAXSZ(s), MAXSZ(s));
 
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return opivx_trans(a->rd, a->rs1, a->rs2, a->vm, fn, s);
@@ -1398,7 +1402,7 @@ static bool opivi_trans(uint32_t vd, uint32_t imm, uint32_t vs2, uint32_t vm,
 
     fn(dest, mask, src1, src2, tcg_env, desc);
 
-    mark_vs_dirty(s);
+    finalize_rvv_inst(s);
     return true;
 }
 
@@ -1412,7 +1416,7 @@ do_opivi_gvec(DisasContext *s, arg_rmrr *a, GVecGen2iFn *gvec_fn,
     if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) {
         gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2),
                 extract_imm(s, a->rs1, imm_mode), MAXSZ(s), MAXSZ(s));
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return opivi_trans(a->rd, a->rs1, a->rs2, a->vm, fn, s, imm_mode);
@@ -1471,7 +1475,7 @@ static bool do_opivv_widen(DisasContext *s, arg_rmrr *a,
                            tcg_env, s->cfg_ptr->vlenb,
                            s->cfg_ptr->vlenb,
                            data, fn);
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -1543,7 +1547,7 @@ static bool do_opiwv_widen(DisasContext *s, arg_rmrr *a,
                            vreg_ofs(s, a->rs2),
                            tcg_env, s->cfg_ptr->vlenb,
                            s->cfg_ptr->vlenb, data, fn);
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -1611,7 +1615,7 @@ static bool opivv_trans(uint32_t vd, uint32_t vs1, uint32_t vs2, uint32_t vm,
     tcg_gen_gvec_4_ptr(vreg_ofs(s, vd), vreg_ofs(s, 0), vreg_ofs(s, vs1),
                        vreg_ofs(s, vs2), tcg_env, s->cfg_ptr->vlenb,
                        s->cfg_ptr->vlenb, data, fn);
-    mark_vs_dirty(s);
+    finalize_rvv_inst(s);
     return true;
 }
 
@@ -1744,7 +1748,7 @@ do_opivx_gvec_shift(DisasContext *s, arg_rmrr *a, GVecGen2sFn32 *gvec_fn,
         gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2),
                 src1, MAXSZ(s), MAXSZ(s));
 
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return opivx_trans(a->rd, a->rs1, a->rs2, a->vm, fn, s);
@@ -1801,7 +1805,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a)             \
                            s->cfg_ptr->vlenb,                      \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew]);                           \
-        mark_vs_dirty(s);                                          \
+        finalize_rvv_inst(s);                                      \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -2004,7 +2008,7 @@ static bool trans_vmv_v_v(DisasContext *s, arg_vmv_v_v *a)
                                s->cfg_ptr->vlenb, data,
                                fns[s->sew]);
         }
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -2049,7 +2053,7 @@ static bool trans_vmv_v_x(DisasContext *s, arg_vmv_v_x *a)
             fns[s->sew](dest, s1_i64, tcg_env, desc);
         }
 
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -2083,7 +2087,7 @@ static bool trans_vmv_v_i(DisasContext *s, arg_vmv_v_i *a)
             tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, a->rd));
             fns[s->sew](dest, s1, tcg_env, desc);
         }
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -2231,7 +2235,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a)             \
                            s->cfg_ptr->vlenb,                      \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew - 1]);                       \
-        mark_vs_dirty(s);                                          \
+        finalize_rvv_inst(s);                                      \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -2265,7 +2269,7 @@ static bool opfvf_trans(uint32_t vd, uint32_t rs1, uint32_t vs2,
 
     fn(dest, mask, t1, src2, tcg_env, desc);
 
-    mark_vs_dirty(s);
+    finalize_rvv_inst(s);
     return true;
 }
 
@@ -2340,7 +2344,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a)           \
                            s->cfg_ptr->vlenb,                    \
                            s->cfg_ptr->vlenb, data,              \
                            fns[s->sew - 1]);                     \
-        mark_vs_dirty(s);                                        \
+        finalize_rvv_inst(s);                                    \
         return true;                                             \
     }                                                            \
     return false;                                                \
@@ -2411,7 +2415,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a)             \
                            s->cfg_ptr->vlenb,                      \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew - 1]);                       \
-        mark_vs_dirty(s);                                          \
+        finalize_rvv_inst(s);                                      \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -2523,7 +2527,7 @@ static bool do_opfv(DisasContext *s, arg_rmr *a,
                            vreg_ofs(s, a->rs2), tcg_env,
                            s->cfg_ptr->vlenb,
                            s->cfg_ptr->vlenb, data, fn);
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -2633,7 +2637,7 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a)
 
             fns[s->sew - 1](dest, t1, tcg_env, desc);
         }
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -2705,7 +2709,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
                            s->cfg_ptr->vlenb,                      \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew - 1]);                       \
-        mark_vs_dirty(s);                                          \
+        finalize_rvv_inst(s);                                      \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -2753,7 +2757,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
                            s->cfg_ptr->vlenb,                      \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew]);                           \
-        mark_vs_dirty(s);                                          \
+        finalize_rvv_inst(s);                                      \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -2817,7 +2821,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
                            s->cfg_ptr->vlenb,                      \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew - 1]);                       \
-        mark_vs_dirty(s);                                          \
+        finalize_rvv_inst(s);                                      \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -2863,7 +2867,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
                            s->cfg_ptr->vlenb,                      \
                            s->cfg_ptr->vlenb, data,                \
                            fns[s->sew]);                           \
-        mark_vs_dirty(s);                                          \
+        finalize_rvv_inst(s);                                      \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -2949,7 +2953,7 @@ static bool trans_##NAME(DisasContext *s, arg_r *a)                \
                            vreg_ofs(s, a->rs2), tcg_env,           \
                            s->cfg_ptr->vlenb,                      \
                            s->cfg_ptr->vlenb, data, fn);           \
-        mark_vs_dirty(s);                                          \
+        finalize_rvv_inst(s);                                      \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -3048,7 +3052,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a)              \
                            tcg_env, s->cfg_ptr->vlenb,             \
                            s->cfg_ptr->vlenb,                      \
                            data, fn);                              \
-        mark_vs_dirty(s);                                          \
+        finalize_rvv_inst(s);                                      \
         return true;                                               \
     }                                                              \
     return false;                                                  \
@@ -3087,7 +3091,7 @@ static bool trans_viota_m(DisasContext *s, arg_viota_m *a)
                            vreg_ofs(s, a->rs2), tcg_env,
                            s->cfg_ptr->vlenb,
                            s->cfg_ptr->vlenb, data, fns[s->sew]);
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -3114,7 +3118,7 @@ static bool trans_vid_v(DisasContext *s, arg_vid_v *a)
                            tcg_env, s->cfg_ptr->vlenb,
                            s->cfg_ptr->vlenb,
                            data, fns[s->sew]);
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -3271,7 +3275,7 @@ static bool trans_vmv_x_s(DisasContext *s, arg_vmv_x_s *a)
         tcg_gen_trunc_i64_tl(dest, t1);
         gen_set_gpr(s, a->rd, dest);
         tcg_gen_movi_tl(cpu_vstart, 0);
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -3300,7 +3304,7 @@ static bool trans_vmv_s_x(DisasContext *s, arg_vmv_s_x *a)
         vec_element_storei(s, a->rd, 0, t1);
         gen_set_label(over);
         tcg_gen_movi_tl(cpu_vstart, 0);
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -3328,7 +3332,7 @@ static bool trans_vfmv_f_s(DisasContext *s, arg_vfmv_f_s *a)
 
         mark_fs_dirty(s);
         tcg_gen_movi_tl(cpu_vstart, 0);
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -3354,9 +3358,10 @@ static bool trans_vfmv_s_f(DisasContext *s, arg_vfmv_s_f *a)
         do_nanbox(s, t1, cpu_fpr[a->rs1]);
 
         vec_element_storei(s, a->rd, 0, t1);
+
         gen_set_label(over);
         tcg_gen_movi_tl(cpu_vstart, 0);
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -3462,7 +3467,7 @@ static bool trans_vrgather_vx(DisasContext *s, arg_rmrr *a)
 
         tcg_gen_gvec_dup_i64(s->sew, vreg_ofs(s, a->rd),
                              MAXSZ(s), MAXSZ(s), dest);
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
     } else {
         static gen_helper_opivx * const fns[4] = {
             gen_helper_vrgather_vx_b, gen_helper_vrgather_vx_h,
@@ -3490,7 +3495,7 @@ static bool trans_vrgather_vi(DisasContext *s, arg_rmrr *a)
                                  endian_ofs(s, a->rs2, a->rs1),
                                  MAXSZ(s), MAXSZ(s));
         }
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
     } else {
         static gen_helper_opivx * const fns[4] = {
             gen_helper_vrgather_vx_b, gen_helper_vrgather_vx_h,
@@ -3535,7 +3540,7 @@ static bool trans_vcompress_vm(DisasContext *s, arg_r *a)
                            tcg_env, s->cfg_ptr->vlenb,
                            s->cfg_ptr->vlenb, data,
                            fns[s->sew]);
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -3560,7 +3565,7 @@ static bool trans_##NAME(DisasContext *s, arg_##NAME * a)               \
             tcg_gen_gvec_2_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), \
                                tcg_env, maxsz, maxsz, 0, gen_helper_vmvr_v); \
         }                                                               \
-        mark_vs_dirty(s);                                               \
+        finalize_rvv_inst(s);                                           \
         return true;                                                    \
     }                                                                   \
     return false;                                                       \
@@ -3631,7 +3636,7 @@ static bool int_ext_op(DisasContext *s, arg_rmr *a, uint8_t seq)
                        s->cfg_ptr->vlenb,
                        s->cfg_ptr->vlenb, data, fn);
 
-    mark_vs_dirty(s);
+    finalize_rvv_inst(s);
     return true;
 }
 
diff --git a/target/riscv/insn_trans/trans_rvvk.c.inc b/target/riscv/insn_trans/trans_rvvk.c.inc
index 6d640e4596..ae1f40174a 100644
--- a/target/riscv/insn_trans/trans_rvvk.c.inc
+++ b/target/riscv/insn_trans/trans_rvvk.c.inc
@@ -174,7 +174,7 @@ GEN_OPIVX_GVEC_TRANS_CHECK(vandn_vx, andcs, zvkb_vx_check)
                                vreg_ofs(s, a->rs2), tcg_env,               \
                                s->cfg_ptr->vlenb, s->cfg_ptr->vlenb,       \
                                data, fns[s->sew]);                         \
-            mark_vs_dirty(s);                                              \
+            finalize_rvv_inst(s);                                          \
             return true;                                                   \
         }                                                                  \
         return false;                                                      \
@@ -266,7 +266,7 @@ GEN_OPIVI_WIDEN_TRANS(vwsll_vi, IMM_ZX, vwsll_vx, vwsll_vx_check)
             tcg_gen_addi_ptr(rd_v, tcg_env, vreg_ofs(s, a->rd));              \
             tcg_gen_addi_ptr(rs2_v, tcg_env, vreg_ofs(s, a->rs2));            \
             gen_helper_##NAME(rd_v, rs2_v, tcg_env, desc);                    \
-            mark_vs_dirty(s);                                                 \
+            finalize_rvv_inst(s);                                             \
             return true;                                                      \
         }                                                                     \
         return false;                                                         \
@@ -341,7 +341,7 @@ GEN_V_UNMASKED_TRANS(vaesem_vs, vaes_check_vs, ZVKNED_EGS)
             tcg_gen_addi_ptr(rd_v, tcg_env, vreg_ofs(s, a->rd));              \
             tcg_gen_addi_ptr(rs2_v, tcg_env, vreg_ofs(s, a->rs2));            \
             gen_helper_##NAME(rd_v, rs2_v, uimm_v, tcg_env, desc);            \
-            mark_vs_dirty(s);                                                 \
+            finalize_rvv_inst(s);                                             \
             return true;                                                      \
         }                                                                     \
         return false;                                                         \
@@ -405,7 +405,7 @@ GEN_VI_UNMASKED_TRANS(vaeskf2_vi, vaeskf2_check, ZVKNED_EGS)
                                s->cfg_ptr->vlenb, s->cfg_ptr->vlenb,          \
                                data, gen_helper_##NAME);                      \
                                                                               \
-            mark_vs_dirty(s);                                                 \
+            finalize_rvv_inst(s);                                             \
             return true;                                                      \
         }                                                                     \
         return false;                                                         \
@@ -457,7 +457,7 @@ static bool trans_vsha2cl_vv(DisasContext *s, arg_rmrr *a)
             s->sew == MO_32 ?
                 gen_helper_vsha2cl32_vv : gen_helper_vsha2cl64_vv);
 
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
@@ -488,7 +488,7 @@ static bool trans_vsha2ch_vv(DisasContext *s, arg_rmrr *a)
             s->sew == MO_32 ?
                 gen_helper_vsha2ch32_vv : gen_helper_vsha2ch64_vv);
 
-        mark_vs_dirty(s);
+        finalize_rvv_inst(s);
         return true;
     }
     return false;
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index ea5d52b2ef..9d57089fcc 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -676,6 +676,12 @@ static void mark_vs_dirty(DisasContext *ctx)
 static inline void mark_vs_dirty(DisasContext *ctx) { }
 #endif
 
+static void finalize_rvv_inst(DisasContext *ctx)
+{
+    mark_vs_dirty(ctx);
+    ctx->vstart_eq_zero = true;
+}
+
 static void gen_set_rm(DisasContext *ctx, int rm)
 {
     if (ctx->frm == rm) {
-- 
2.43.2



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v9 10/10] target/riscv/vector_helper.c: optimize loops in ldst helpers
  2024-03-09 20:43 [PATCH v9 00/10] riscv: set vstart_eq_zero on mark_vs_dirty Daniel Henrique Barboza
                   ` (8 preceding siblings ...)
  2024-03-09 20:43 ` [PATCH v9 09/10] target/riscv: Clear vstart_qe_zero flag Daniel Henrique Barboza
@ 2024-03-09 20:43 ` Daniel Henrique Barboza
  9 siblings, 0 replies; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-09 20:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, richard.henderson, philmd, Daniel Henrique Barboza

Change the for loops in ldst helpers to do a single increment in the
counter, and assign it env->vstart, to avoid re-reading from vstart
every time.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/riscv/vector_helper.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 4fe8752eea..ee57300dc0 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -195,7 +195,7 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base,
         return;
     }
 
-    for (i = env->vstart; i < env->vl; i++, env->vstart++) {
+    for (i = env->vstart; i < env->vl; env->vstart = ++i) {
         k = 0;
         while (k < nf) {
             if (!vm && !vext_elem_mask(v0, i)) {
@@ -270,7 +270,7 @@ vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc,
     }
 
     /* load bytes from guest memory */
-    for (i = env->vstart; i < evl; i++, env->vstart++) {
+    for (i = env->vstart; i < evl; env->vstart = ++i) {
         k = 0;
         while (k < nf) {
             target_ulong addr = base + ((i * nf + k) << log2_esz);
@@ -393,7 +393,7 @@ vext_ldst_index(void *vd, void *v0, target_ulong base,
     }
 
     /* load bytes from guest memory */
-    for (i = env->vstart; i < env->vl; i++, env->vstart++) {
+    for (i = env->vstart; i < env->vl; env->vstart = ++i) {
         k = 0;
         while (k < nf) {
             if (!vm && !vext_elem_mask(v0, i)) {
-- 
2.43.2
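
As a sanity check on the loop rewrite above, here is a minimal standalone
sketch (not QEMU code) that compares the two loop headers. MockEnv,
run_old_loop(), run_new_loop() and check_loops_agree() are hypothetical names
made up for this illustration, and the 'break' only approximates a helper
being interrupted while handling an element. Under those assumptions, both
forms leave vstart pointing at the same element.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the vstart/vl fields of CPURISCVState. */
typedef struct {
    uint32_t vstart;
    uint32_t vl;
} MockEnv;

/* Old form: 'i' and env->vstart are incremented separately. */
static uint32_t run_old_loop(MockEnv *env, uint32_t stop_at)
{
    uint32_t i, visited = 0;

    for (i = env->vstart; i < env->vl; i++, env->vstart++) {
        if (i == stop_at) {
            break;  /* models an interruption while handling element i */
        }
        visited++;
    }
    return visited;
}

/* New form: a single increment of 'i', stored back into env->vstart. */
static uint32_t run_new_loop(MockEnv *env, uint32_t stop_at)
{
    uint32_t i, visited = 0;

    for (i = env->vstart; i < env->vl; env->vstart = ++i) {
        if (i == stop_at) {
            break;  /* models an interruption while handling element i */
        }
        visited++;
    }
    return visited;
}

/* Runs both forms from the same starting state and checks they agree. */
static void check_loops_agree(uint32_t vstart, uint32_t vl, uint32_t stop_at)
{
    MockEnv a = { vstart, vl }, b = { vstart, vl };
    uint32_t n_old = run_old_loop(&a, stop_at);
    uint32_t n_new = run_new_loop(&b, stop_at);

    assert(n_old == n_new);
    assert(a.vstart == b.vstart);
}
```

Calling check_loops_agree(2, 8, 5), for example, runs both loops from
vstart = 2 with an interruption at element 5. In the model the two forms
differ only in how vstart is advanced, not in where it ends up.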




* Re: [PATCH v9 02/10] target/riscv: handle vstart >= vl in vext_set_tail_elems_1s()
  2024-03-09 20:43 ` [PATCH v9 02/10] target/riscv: handle vstart >= vl in vext_set_tail_elems_1s() Daniel Henrique Barboza
@ 2024-03-10  7:37   ` Richard Henderson
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2024-03-10  7:37 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, philmd

On 3/9/24 10:43, Daniel Henrique Barboza wrote:
> We're going to make changes that will require each helper to be
> responsible for the 'vstart' management, i.e. we will relieve the
> 'vstart < vl' assumption that helpers have today.
> 
> To do that we'll need to deal with how we're updating tail elements
> first. We can't update them if vstart >= vl, but at this moment we're
> not guarding for it.
> 
> We have the vext_set_tail_elems_1s() helper to update tail elements.
> Change it to accept an 'env' pointer, where we can read both vstart and
> vl, and make it a no-op if vstart >= vl. Note that callers will need to
> set env->vstart = 0 *after* the helper from now on.
> 
> The exceptions are three helpers: vext_ldst_stride(), vext_ldst_us() and
> vext_ldst_index(). They are incrementing env->vstart during
> execution and will end up with env->vstart = vl when tail updating. For
> these cases only, do an early check and exit if vstart >= vl, and set
> env->vstart = 0 before updating the tail.
> 
> For everyone else we'll do vext_set_tail_elems_1s() and then clear
> env->vstart. This is the case of vext_ldff() that is already using
> set_tail_elems_1s(), and will be the case for the rest after the next
> patches.
> 
> Let's also simplify the API a little by removing the 'nf' argument since
> it can be derived from 'desc'.
> 
> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
> ---
>   target/riscv/vector_helper.c | 59 ++++++++++++++++++++++++++++++------
>   1 file changed, 49 insertions(+), 10 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

> +    uint32_t nf = vext_nf(desc);
>       int k;
>   
> -    if (vta == 0) {
> +    /*
> +     * Section 5.4 of the RVV spec mentions:
> +     * "When vstart ≥ vl, there are no body elements, and no
> +     *  elements are updated in any destination vector register
> +     *  group, including that no tail elements are updated
> +     *  with agnostic values."
> +     */
> +    if (vta == 0 || env->vstart >= env->vl) {
>           return;
>       }
>   
>       for (k = 0; k < nf; ++k) {

Existing issue, and we know nf <= 8, but bad form to mix signs on the comparison.

> -        vext_set_elems_1s(vd, vta, (k * max_elems + vl) * esz,
> +        vext_set_elems_1s(vd, vta, (k * max_elems + env->vl) * esz,

You may wish to hoist vl to a local anyway.
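
For illustration, the two remarks above (an unsigned loop counter and a local
copy of vl) might look roughly like the sketch below. This is a standalone
model, not the actual patch: MockEnv, set_elems_1s() and set_tail_elems_1s()
are hypothetical stand-ins, 'vta' and 'nf' are passed as plain arguments
instead of being decoded from 'desc', and the upper bound passed to the fill
routine is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the vstart/vl fields of CPURISCVState. */
typedef struct {
    uint32_t vstart;
    uint32_t vl;
} MockEnv;

/* Simplified model of a tail fill: set bytes [cnt, tot) to all 1s. */
static void set_elems_1s(void *base, uint32_t is_agnostic,
                         uint32_t cnt, uint32_t tot)
{
    if (is_agnostic == 0 || tot == cnt) {
        return;
    }
    memset((uint8_t *)base + cnt, 0xff, tot - cnt);
}

/*
 * Sketch of the tail update with both review remarks applied.
 * 'vta' and 'nf' are plain arguments here (assumption); the patch
 * under review derives them from 'desc'.
 */
static void set_tail_elems_1s(MockEnv *env, void *vd, uint32_t vta,
                              uint32_t nf, uint32_t esz, uint32_t max_elems)
{
    uint32_t vl = env->vl;  /* hoisted: env->vl is read once */
    uint32_t k;             /* unsigned, same signedness as 'nf' */

    assert(vl <= max_elems);

    /* No tail update when the policy is undisturbed or vstart >= vl. */
    if (vta == 0 || env->vstart >= vl) {
        return;
    }

    for (k = 0; k < nf; ++k) {
        set_elems_1s(vd, vta, (k * max_elems + vl) * esz,
                     (k * max_elems + max_elems) * esz);
    }
}
```

With an 8-byte buffer, esz = 1, nf = 1 and vl = 3, this model leaves bytes
0..2 untouched and sets bytes 3..7 to 0xff. With vstart >= vl it writes
nothing.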


r~



* Re: [PATCH v9 03/10] target/riscv/vector_helper.c: do vstart=0 after updating tail
  2024-03-09 20:43 ` [PATCH v9 03/10] target/riscv/vector_helper.c: do vstart=0 after updating tail Daniel Henrique Barboza
@ 2024-03-10  7:38   ` Richard Henderson
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2024-03-10  7:38 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, philmd

On 3/9/24 10:43, Daniel Henrique Barboza wrote:
> vext_vv_rm_1() and vext_vv_rm_2() are setting vstart = 0 before their
> respective callers (vext_vv_rm_2 and  vext_vx_rm_2) update the tail
> elements.
> 
> This is benign now, but we'll convert the tail updates to use
> vext_set_tail_elems_1s(), and this function is sensitive to vstart
> changes. Do vstart = 0 after vext_set_elems_1s() now to make the
> conversion easier.
> 
> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
> ---
>   target/riscv/vector_helper.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH v9 04/10] target/riscv/vector_helper.c: update tail with vext_set_tail_elems_1s()
  2024-03-09 20:43 ` [PATCH v9 04/10] target/riscv/vector_helper.c: update tail with vext_set_tail_elems_1s() Daniel Henrique Barboza
@ 2024-03-10  7:41   ` Richard Henderson
  2024-03-10  9:50     ` Daniel Henrique Barboza
  2024-03-11  2:40   ` LIU Zhiwei
  1 sibling, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2024-03-10  7:41 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, philmd

On 3/9/24 10:43, Daniel Henrique Barboza wrote:
> Change all code that updates tail elems to use vext_set_tail_elems_1s()
> instead of vext_set_elems_1s().
> 
> Setting 'env->vstart=0' needs to be the very last thing a helper does
> because env->vstart is being checked by vext_set_tail_elems_1s().

I did wonder if it would be worth doing the vstart = 0 in vext_set_tail_elems_1s, allowing 
that to be the very last thing in each helper, and could be tail called.

> 
> A side effect of this change is that a lot of 'vta' local variables got
> unused. The reason is that 'vta' was being fetched to be used with
> vext_set_elems_1s() but vext_set_tail_elems_1s() doesn't use it - 'vta' is
> retrieved inside the helper using 'desc'.
> 
> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



* Re: [PATCH v9 05/10] target/riscv: use vext_set_tail_elems_1s() in vcrypto insns
  2024-03-09 20:43 ` [PATCH v9 05/10] target/riscv: use vext_set_tail_elems_1s() in vcrypto insns Daniel Henrique Barboza
@ 2024-03-10  7:42   ` Richard Henderson
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2024-03-10  7:42 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, philmd

On 3/9/24 10:43, Daniel Henrique Barboza wrote:
> Vcrypto insns should also use the same helper the regular vector insns
> use to update the tail elements.
> 
> Move vext_set_tail_elems_1s() to vector_internals.c and make it public.
> Use it in vcrypto_helper.c to set tail elements instead of
> vext_set_elems_1s(). Helpers must set env->vstart = 0 after setting the
> tail.
> 
> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
> ---
>   target/riscv/vcrypto_helper.c   | 63 ++++++++++++---------------------
>   target/riscv/vector_helper.c    | 30 ----------------
>   target/riscv/vector_internals.c | 29 +++++++++++++++
>   target/riscv/vector_internals.h |  4 +++
>   4 files changed, 56 insertions(+), 70 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH v9 06/10] trans_rvv.c.inc: set vstart = 0 in int scalar move insns
  2024-03-09 20:43 ` [PATCH v9 06/10] trans_rvv.c.inc: set vstart = 0 in int scalar move insns Daniel Henrique Barboza
@ 2024-03-10  7:45   ` Richard Henderson
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2024-03-10  7:45 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, philmd

On 3/9/24 10:43, Daniel Henrique Barboza wrote:
> trans_vmv_x_s, trans_vmv_s_x, trans_vfmv_f_s and trans_vfmv_s_f aren't
> setting vstart = 0 after execution. This is usually done by a helper in
> vector_helper.c but these functions don't use helpers.
> 
> We'll set vstart after any potential 'over' brconds, and that will also
> mandate a mark_vs_dirty() too.
> 
> Fixes: dedc53cbc9 ("target/riscv: rvv-1.0: integer scalar move instructions")
> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
> ---
>   target/riscv/insn_trans/trans_rvv.c.inc | 10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH v9 09/10] target/riscv: Clear vstart_qe_zero flag
  2024-03-09 20:43 ` [PATCH v9 09/10] target/riscv: Clear vstart_qe_zero flag Daniel Henrique Barboza
@ 2024-03-10  7:47   ` Richard Henderson
  2024-03-10 10:17     ` Daniel Henrique Barboza
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2024-03-10  7:47 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, philmd, Ivan Klokov

On 3/9/24 10:43, Daniel Henrique Barboza wrote:
> From: Ivan Klokov <ivan.klokov@syntacore.com>
> 
> The vstart_qe_zero flag is set at the beginning of the translation

Here and subject, s/qe/ne/.


r~



* Re: [PATCH v9 04/10] target/riscv/vector_helper.c: update tail with vext_set_tail_elems_1s()
  2024-03-10  7:41   ` Richard Henderson
@ 2024-03-10  9:50     ` Daniel Henrique Barboza
  0 siblings, 0 replies; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-10  9:50 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, philmd



On 3/10/24 04:41, Richard Henderson wrote:
> On 3/9/24 10:43, Daniel Henrique Barboza wrote:
>> Change all code that updates tail elems to use vext_set_tail_elems_1s()
>> instead of vext_set_elems_1s().
>>
>> Setting 'env->vstart=0' needs to be the very last thing a helper does
>> because env->vstart is being checked by vext_set_tail_elems_1s().
> 
> I did wonder if it would be worth doing the vstart = 0 in vext_set_tail_elems_1s, allowing that to be the very last thing in each helper, and could be tail called.


Some insns don't update tail, e.g. vext_ldst_whole(), and we would need to
clear vstart explicitly in them regardless. Might as well deal with clearing
vstart in every helper to make them consistent.


Thanks,

Daniel

> 
>>
>> A side effect of this change is that a lot of 'vta' local variables got
>> unused. The reason is that 'vta' was being fetched to be used with
>> vext_set_elems_1s() but vext_set_tail_elems_1s() doesn't use it - 'vta' is
>> retrieved inside the helper using 'desc'.
>>
>> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> 
> 
> r~



* Re: [PATCH v9 09/10] target/riscv: Clear vstart_qe_zero flag
  2024-03-10  7:47   ` Richard Henderson
@ 2024-03-10 10:17     ` Daniel Henrique Barboza
  2024-03-10 18:04       ` Richard Henderson
  0 siblings, 1 reply; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-10 10:17 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, philmd, Ivan Klokov



On 3/10/24 04:47, Richard Henderson wrote:
> On 3/9/24 10:43, Daniel Henrique Barboza wrote:
>> From: Ivan Klokov <ivan.klokov@syntacore.com>
>>
>> The vstart_qe_zero flag is set at the beginning of the translation
> 
> Here and subject, s/qe/ne/.

Hmmmm  ... the flag name is correct - vstart_qe_zero. But the patch isn't
clearing it at the end of insns, the patch is setting it.

I'll change the subject to "enable vstart_eq_zero in the end of insns".

And in this first quote I'll change 'set' to 'updated':

"The vstart_qe_zero flag is updated at the beginning of the translation (...)"

Because 'flag is set' can give the impression that we're enabling it. 'flag is
updated' is more in line with what happens: vstart_eq_zero will track the result
of vstart = 0.
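
A rough standalone model of that bookkeeping, for illustration only:
MockDisasContext and the mock_* functions below are hypothetical stand-ins
(the real DisasContext has many more fields), and the only part taken from
the series is the shape of finalize_rvv_inst(), which marks VS dirty and
then records vstart_eq_zero = true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for QEMU's DisasContext. */
typedef struct {
    bool vs_dirty;          /* models the effect of mark_vs_dirty() */
    bool vstart_eq_zero;    /* translation-time knowledge: vstart == 0 */
} MockDisasContext;

/* Stand-in for mark_vs_dirty(): only records that it was called. */
static void mock_mark_vs_dirty(MockDisasContext *ctx)
{
    ctx->vs_dirty = true;
}

/* Mirrors the shape of finalize_rvv_inst() from the series. */
static void mock_finalize_rvv_inst(MockDisasContext *ctx)
{
    mock_mark_vs_dirty(ctx);
    ctx->vstart_eq_zero = true;
}

/* Models translating one vector insn that completes normally. */
static bool mock_trans_vector_insn(MockDisasContext *ctx)
{
    assert(ctx != NULL);
    /* ... code generation for the insn itself would go here ... */
    mock_finalize_rvv_inst(ctx);
    return true;
}
```

In this model a context that starts with vstart_eq_zero = false ends up with
the flag enabled after any instruction that reaches the finalize step, which
is why "set at the end of insns" describes the patch better than "clear".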


Thanks,


Daniel





> 
> 
> r~



* Re: [PATCH v9 09/10] target/riscv: Clear vstart_qe_zero flag
  2024-03-10 10:17     ` Daniel Henrique Barboza
@ 2024-03-10 18:04       ` Richard Henderson
  2024-03-10 18:11         ` Daniel Henrique Barboza
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2024-03-10 18:04 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, philmd, Ivan Klokov

On 3/10/24 00:17, Daniel Henrique Barboza wrote:
> 
> 
> On 3/10/24 04:47, Richard Henderson wrote:
>> On 3/9/24 10:43, Daniel Henrique Barboza wrote:
>>> From: Ivan Klokov <ivan.klokov@syntacore.com>
>>>
>>> The vstart_qe_zero flag is set at the beginning of the translation
>>
>> Here and subject, s/qe/ne/.
> 
> Hmmmm  ... the flag name is correct - vstart_qe_zero.

Gah.  My mistake in pointing out the mistake, which is "qe" not "eq".


r~



* Re: [PATCH v9 09/10] target/riscv: Clear vstart_qe_zero flag
  2024-03-10 18:04       ` Richard Henderson
@ 2024-03-10 18:11         ` Daniel Henrique Barboza
  0 siblings, 0 replies; 22+ messages in thread
From: Daniel Henrique Barboza @ 2024-03-10 18:11 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, zhiwei_liu,
	palmer, philmd, Ivan Klokov



On 3/10/24 15:04, Richard Henderson wrote:
> On 3/10/24 00:17, Daniel Henrique Barboza wrote:
>>
>>
>> On 3/10/24 04:47, Richard Henderson wrote:
>>> On 3/9/24 10:43, Daniel Henrique Barboza wrote:
>>>> From: Ivan Klokov <ivan.klokov@syntacore.com>
>>>>
>>>> The vstart_qe_zero flag is set at the beginning of the translation
>>>
>>> Here and subject, s/qe/ne/.
>>
>> Hmmmm  ... the flag name is correct - vstart_qe_zero.
> 
> Gah.  My mistake in pointing out the mistake, which is "qe" not "eq".

Hehe, the flag is still named 'vstart_qe_zero' in both the subject and the
commit msg of v10, instead of 'vstart_eq_zero' :D

Alistair, can you amend the commit msg of patch 9 when queueing? Thanks,


Daniel




> 
> 
> r~



* Re: [PATCH v9 04/10] target/riscv/vector_helper.c: update tail with vext_set_tail_elems_1s()
  2024-03-09 20:43 ` [PATCH v9 04/10] target/riscv/vector_helper.c: update tail with vext_set_tail_elems_1s() Daniel Henrique Barboza
  2024-03-10  7:41   ` Richard Henderson
@ 2024-03-11  2:40   ` LIU Zhiwei
  1 sibling, 0 replies; 22+ messages in thread
From: LIU Zhiwei @ 2024-03-11  2:40 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel
  Cc: qemu-riscv, alistair.francis, bmeng, liwei1518, palmer,
	richard.henderson, philmd


On 2024/3/10 4:43, Daniel Henrique Barboza wrote:
> Change all code that updates tail elems to use vext_set_tail_elems_1s()
> instead of vext_set_elems_1s().

Hi Daniel,

Note that vext_set_tail_elems_1s() uses the NF field, which is zero for most
vector instructions, so for them it will do nothing.
I think you need to encode the right NF value (1) into desc for them if you
want to do this replacement.
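
To make the concern concrete, here is a small standalone sketch, not QEMU
code, of how many tail bytes a loop of the form 'for (k = 0; k < nf; ++k)'
would write. tail_bytes_written() is a hypothetical helper, and the
per-iteration range is assumed to run from element vl up to max_elems.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Counts the bytes a per-field tail fill would write when it iterates
 * 'nf' times, each time covering elements [vl, max_elems) of 'esz' bytes.
 * Hypothetical helper, for illustration only.
 */
static uint32_t tail_bytes_written(uint32_t nf, uint32_t vl,
                                   uint32_t max_elems, uint32_t esz)
{
    uint32_t k, total = 0;

    assert(vl <= max_elems);

    for (k = 0; k < nf; ++k) {
        total += (max_elems - vl) * esz;
    }
    return total;
}
```

With nf = 0 the loop body never runs, so no tail bytes are written no matter
what vl, max_elems or esz are. With nf = 1 the single iteration covers the
whole tail.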

Thanks,
Zhiwei

>
> Setting 'env->vstart=0' needs to be the very last thing a helper does
> because env->vstart is being checked by vext_set_tail_elems_1s().
>
> A side effect of this change is that a lot of 'vta' local variables got
> unused. The reason is that 'vta' was being fetched to be used with
> vext_set_elems_1s() but vext_set_tail_elems_1s() doesn't use it - 'vta' is
> retrieved inside the helper using 'desc'.
>
> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
> ---
>   target/riscv/vector_helper.c | 130 ++++++++++++++---------------------
>   1 file changed, 52 insertions(+), 78 deletions(-)
>
> diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
> index 86b990ce03..b174ddeae8 100644
> --- a/target/riscv/vector_helper.c
> +++ b/target/riscv/vector_helper.c
> @@ -913,7 +913,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \
>       uint32_t esz = sizeof(ETYPE);                             \
>       uint32_t total_elems =                                    \
>           vext_get_total_elems(env, desc, esz);                 \
> -    uint32_t vta = vext_vta(desc);                            \
>       uint32_t i;                                               \
>                                                                 \
>       for (i = env->vstart; i < vl; i++) {                      \
> @@ -923,9 +922,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \
>                                                                 \
>           *((ETYPE *)vd + H(i)) = DO_OP(s2, s1, carry);         \
>       }                                                         \
> -    env->vstart = 0;                                          \
>       /* set tail elements to 1s */                             \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);  \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);  \
> +    env->vstart = 0;                                          \
>   }
>   
>   GEN_VEXT_VADC_VVM(vadc_vvm_b, uint8_t,  H1, DO_VADC)
> @@ -945,7 +944,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,        \
>       uint32_t vl = env->vl;                                               \
>       uint32_t esz = sizeof(ETYPE);                                        \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);         \
> -    uint32_t vta = vext_vta(desc);                                       \
>       uint32_t i;                                                          \
>                                                                            \
>       for (i = env->vstart; i < vl; i++) {                                 \
> @@ -954,9 +952,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,        \
>                                                                            \
>           *((ETYPE *)vd + H(i)) = DO_OP(s2, (ETYPE)(target_long)s1, carry);\
>       }                                                                    \
> -    env->vstart = 0;                                                     \
>       /* set tail elements to 1s */                                        \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);             \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);             \
> +    env->vstart = 0;                                                     \
>   }
>   
>   GEN_VEXT_VADC_VXM(vadc_vxm_b, uint8_t,  H1, DO_VADC)
> @@ -1113,7 +1111,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,                          \
>       uint32_t vl = env->vl;                                                \
>       uint32_t esz = sizeof(TS1);                                           \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
> -    uint32_t vta = vext_vta(desc);                                        \
>       uint32_t vma = vext_vma(desc);                                        \
>       uint32_t i;                                                           \
>                                                                             \
> @@ -1127,9 +1124,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,                          \
>           TS2 s2 = *((TS2 *)vs2 + HS2(i));                                  \
>           *((TS1 *)vd + HS1(i)) = OP(s2, s1 & MASK);                        \
>       }                                                                     \
> -    env->vstart = 0;                                                      \
>       /* set tail elements to 1s */                                         \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
> +    env->vstart = 0;                                                      \
>   }
>   
>   GEN_VEXT_SHIFT_VV(vsll_vv_b, uint8_t,  uint8_t, H1, H1, DO_SLL, 0x7)
> @@ -1160,7 +1157,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,      \
>       uint32_t esz = sizeof(TD);                              \
>       uint32_t total_elems =                                  \
>           vext_get_total_elems(env, desc, esz);               \
> -    uint32_t vta = vext_vta(desc);                          \
>       uint32_t vma = vext_vma(desc);                          \
>       uint32_t i;                                             \
>                                                               \
> @@ -1174,9 +1170,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,      \
>           TS2 s2 = *((TS2 *)vs2 + HS2(i));                    \
>           *((TD *)vd + HD(i)) = OP(s2, s1 & MASK);            \
>       }                                                       \
> -    env->vstart = 0;                                        \
>       /* set tail elements to 1s */                           \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);\
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);\
> +    env->vstart = 0;                                        \
>   }
>   
>   GEN_VEXT_SHIFT_VX(vsll_vx_b, uint8_t, int8_t, H1, H1, DO_SLL, 0x7)
> @@ -1835,16 +1831,15 @@ void HELPER(NAME)(void *vd, void *vs1, CPURISCVState *env,           \
>       uint32_t vl = env->vl;                                           \
>       uint32_t esz = sizeof(ETYPE);                                    \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);     \
> -    uint32_t vta = vext_vta(desc);                                   \
>       uint32_t i;                                                      \
>                                                                        \
>       for (i = env->vstart; i < vl; i++) {                             \
>           ETYPE s1 = *((ETYPE *)vs1 + H(i));                           \
>           *((ETYPE *)vd + H(i)) = s1;                                  \
>       }                                                                \
> -    env->vstart = 0;                                                 \
>       /* set tail elements to 1s */                                    \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);         \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);         \
> +    env->vstart = 0;                                                 \
>   }
>   
>   GEN_VEXT_VMV_VV(vmv_v_v_b, int8_t,  H1)
> @@ -1859,15 +1854,14 @@ void HELPER(NAME)(void *vd, uint64_t s1, CPURISCVState *env,         \
>       uint32_t vl = env->vl;                                           \
>       uint32_t esz = sizeof(ETYPE);                                    \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);     \
> -    uint32_t vta = vext_vta(desc);                                   \
>       uint32_t i;                                                      \
>                                                                        \
>       for (i = env->vstart; i < vl; i++) {                             \
>           *((ETYPE *)vd + H(i)) = (ETYPE)s1;                           \
>       }                                                                \
> -    env->vstart = 0;                                                 \
>       /* set tail elements to 1s */                                    \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);         \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);         \
> +    env->vstart = 0;                                                 \
>   }
>   
>   GEN_VEXT_VMV_VX(vmv_v_x_b, int8_t,  H1)
> @@ -1882,16 +1876,15 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,          \
>       uint32_t vl = env->vl;                                           \
>       uint32_t esz = sizeof(ETYPE);                                    \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);     \
> -    uint32_t vta = vext_vta(desc);                                   \
>       uint32_t i;                                                      \
>                                                                        \
>       for (i = env->vstart; i < vl; i++) {                             \
>           ETYPE *vt = (!vext_elem_mask(v0, i) ? vs2 : vs1);            \
>           *((ETYPE *)vd + H(i)) = *(vt + H(i));                        \
>       }                                                                \
> -    env->vstart = 0;                                                 \
>       /* set tail elements to 1s */                                    \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);         \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);         \
> +    env->vstart = 0;                                                 \
>   }
>   
>   GEN_VEXT_VMERGE_VV(vmerge_vvm_b, int8_t,  H1)
> @@ -1906,7 +1899,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,               \
>       uint32_t vl = env->vl;                                           \
>       uint32_t esz = sizeof(ETYPE);                                    \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);     \
> -    uint32_t vta = vext_vta(desc);                                   \
>       uint32_t i;                                                      \
>                                                                        \
>       for (i = env->vstart; i < vl; i++) {                             \
> @@ -1915,9 +1907,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,               \
>                      (ETYPE)(target_long)s1);                          \
>           *((ETYPE *)vd + H(i)) = d;                                   \
>       }                                                                \
> -    env->vstart = 0;                                                 \
>       /* set tail elements to 1s */                                    \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);         \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);         \
> +    env->vstart = 0;                                                 \
>   }
>   
>   GEN_VEXT_VMERGE_VX(vmerge_vxm_b, int8_t,  H1)
> @@ -1973,7 +1965,6 @@ vext_vv_rm_2(void *vd, void *v0, void *vs1, void *vs2,
>       uint32_t vm = vext_vm(desc);
>       uint32_t vl = env->vl;
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);
> -    uint32_t vta = vext_vta(desc);
>       uint32_t vma = vext_vma(desc);
>   
>       switch (env->vxrm) {
> @@ -1995,7 +1986,7 @@ vext_vv_rm_2(void *vd, void *v0, void *vs1, void *vs2,
>           break;
>       }
>       /* set tail elements to 1s */
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
>       env->vstart = 0;
>   }
>   
> @@ -2098,7 +2089,6 @@ vext_vx_rm_2(void *vd, void *v0, target_long s1, void *vs2,
>       uint32_t vm = vext_vm(desc);
>       uint32_t vl = env->vl;
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);
> -    uint32_t vta = vext_vta(desc);
>       uint32_t vma = vext_vma(desc);
>   
>       switch (env->vxrm) {
> @@ -2120,7 +2110,7 @@ vext_vx_rm_2(void *vd, void *v0, target_long s1, void *vs2,
>           break;
>       }
>       /* set tail elements to 1s */
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);
>       env->vstart = 0;
>   }
>   
> @@ -2872,7 +2862,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,          \
>       uint32_t vl = env->vl;                                \
>       uint32_t total_elems =                                \
>           vext_get_total_elems(env, desc, ESZ);             \
> -    uint32_t vta = vext_vta(desc);                        \
>       uint32_t vma = vext_vma(desc);                        \
>       uint32_t i;                                           \
>                                                             \
> @@ -2885,10 +2874,10 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,          \
>           }                                                 \
>           do_##NAME(vd, vs1, vs2, i, env);                  \
>       }                                                     \
> -    env->vstart = 0;                                      \
>       /* set tail elements to 1s */                         \
> -    vext_set_elems_1s(vd, vta, vl * ESZ,                  \
> -                      total_elems * ESZ);                 \
> +    vext_set_tail_elems_1s(env, vd, desc, ESZ,            \
> +                           total_elems);                  \
> +    env->vstart = 0;                                      \
>   }
>   
>   RVVCALL(OPFVV2, vfadd_vv_h, OP_UUU_H, H2, H2, H2, float16_add)
> @@ -2915,7 +2904,6 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1,        \
>       uint32_t vl = env->vl;                                \
>       uint32_t total_elems =                                \
>           vext_get_total_elems(env, desc, ESZ);             \
> -    uint32_t vta = vext_vta(desc);                        \
>       uint32_t vma = vext_vma(desc);                        \
>       uint32_t i;                                           \
>                                                             \
> @@ -2928,10 +2916,10 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1,        \
>           }                                                 \
>           do_##NAME(vd, s1, vs2, i, env);                   \
>       }                                                     \
> -    env->vstart = 0;                                      \
>       /* set tail elements to 1s */                         \
> -    vext_set_elems_1s(vd, vta, vl * ESZ,                  \
> -                      total_elems * ESZ);                 \
> +    vext_set_tail_elems_1s(env, vd, desc, ESZ,            \
> +                           total_elems);                  \
> +    env->vstart = 0;                                      \
>   }
>   
>   RVVCALL(OPFVF2, vfadd_vf_h, OP_UUU_H, H2, H2, float16_add)
> @@ -3501,7 +3489,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2,       \
>       uint32_t vl = env->vl;                             \
>       uint32_t total_elems =                             \
>           vext_get_total_elems(env, desc, ESZ);          \
> -    uint32_t vta = vext_vta(desc);                     \
>       uint32_t vma = vext_vma(desc);                     \
>       uint32_t i;                                        \
>                                                          \
> @@ -3517,9 +3504,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2,       \
>           }                                              \
>           do_##NAME(vd, vs2, i, env);                    \
>       }                                                  \
> +    vext_set_tail_elems_1s(env, vd, desc, ESZ,         \
> +                           total_elems);               \
>       env->vstart = 0;                                   \
> -    vext_set_elems_1s(vd, vta, vl * ESZ,               \
> -                      total_elems * ESZ);              \
>   }
>   
>   RVVCALL(OPFVV1, vfsqrt_v_h, OP_UU_H, H2, H2, float16_sqrt)
> @@ -4256,7 +4243,6 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, void *vs2, \
>       uint32_t esz = sizeof(ETYPE);                             \
>       uint32_t total_elems =                                    \
>           vext_get_total_elems(env, desc, esz);                 \
> -    uint32_t vta = vext_vta(desc);                            \
>       uint32_t i;                                               \
>                                                                 \
>       for (i = env->vstart; i < vl; i++) {                      \
> @@ -4264,9 +4250,9 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, void *vs2, \
>           *((ETYPE *)vd + H(i)) =                               \
>               (!vm && !vext_elem_mask(v0, i) ? s2 : s1);        \
>       }                                                         \
> -    env->vstart = 0;                                          \
>       /* set tail elements to 1s */                             \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);  \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);  \
> +    env->vstart = 0;                                          \
>   }
>   
>   GEN_VFMERGE_VF(vfmerge_vfm_h, int16_t, H2)
> @@ -4421,7 +4407,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,          \
>       uint32_t vl = env->vl;                                \
>       uint32_t esz = sizeof(TD);                            \
>       uint32_t vlenb = simd_maxsz(desc);                    \
> -    uint32_t vta = vext_vta(desc);                        \
>       uint32_t i;                                           \
>       TD s1 =  *((TD *)vs1 + HD(0));                        \
>                                                             \
> @@ -4433,9 +4418,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,          \
>           s1 = OP(s1, (TD)s2);                              \
>       }                                                     \
>       *((TD *)vd + HD(0)) = s1;                             \
> -    env->vstart = 0;                                      \
>       /* set tail elements to 1s */                         \
> -    vext_set_elems_1s(vd, vta, esz, vlenb);               \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, vlenb);    \
> +    env->vstart = 0;                                      \
>   }
>   
>   /* vd[0] = sum(vs1[0], vs2[*]) */
> @@ -4507,7 +4492,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,           \
>       uint32_t vl = env->vl;                                 \
>       uint32_t esz = sizeof(TD);                             \
>       uint32_t vlenb = simd_maxsz(desc);                     \
> -    uint32_t vta = vext_vta(desc);                         \
>       uint32_t i;                                            \
>       TD s1 =  *((TD *)vs1 + HD(0));                         \
>                                                              \
> @@ -4519,9 +4503,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,           \
>           s1 = OP(s1, (TD)s2, &env->fp_status);              \
>       }                                                      \
>       *((TD *)vd + HD(0)) = s1;                              \
> -    env->vstart = 0;                                       \
>       /* set tail elements to 1s */                          \
> -    vext_set_elems_1s(vd, vta, esz, vlenb);                \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, vlenb);     \
> +    env->vstart = 0;                                       \
>   }
>   
>   /* Unordered sum */
> @@ -4738,7 +4722,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, CPURISCVState *env,      \
>       uint32_t vl = env->vl;                                                \
>       uint32_t esz = sizeof(ETYPE);                                         \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
> -    uint32_t vta = vext_vta(desc);                                        \
>       uint32_t vma = vext_vma(desc);                                        \
>       uint32_t sum = 0;                                                     \
>       int i;                                                                \
> @@ -4754,9 +4737,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, CPURISCVState *env,      \
>               sum++;                                                        \
>           }                                                                 \
>       }                                                                     \
> -    env->vstart = 0;                                                      \
>       /* set tail elements to 1s */                                         \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
> +    env->vstart = 0;                                                      \
>   }
>   
>   GEN_VEXT_VIOTA_M(viota_m_b, uint8_t,  H1)
> @@ -4772,7 +4755,6 @@ void HELPER(NAME)(void *vd, void *v0, CPURISCVState *env, uint32_t desc)  \
>       uint32_t vl = env->vl;                                                \
>       uint32_t esz = sizeof(ETYPE);                                         \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
> -    uint32_t vta = vext_vta(desc);                                        \
>       uint32_t vma = vext_vma(desc);                                        \
>       int i;                                                                \
>                                                                             \
> @@ -4784,9 +4766,9 @@ void HELPER(NAME)(void *vd, void *v0, CPURISCVState *env, uint32_t desc)  \
>           }                                                                 \
>           *((ETYPE *)vd + H(i)) = i;                                        \
>       }                                                                     \
> -    env->vstart = 0;                                                      \
>       /* set tail elements to 1s */                                         \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
> +    env->vstart = 0;                                                      \
>   }
>   
>   GEN_VEXT_VID_V(vid_v_b, uint8_t,  H1)
> @@ -4807,7 +4789,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
>       uint32_t vl = env->vl;                                                \
>       uint32_t esz = sizeof(ETYPE);                                         \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
> -    uint32_t vta = vext_vta(desc);                                        \
>       uint32_t vma = vext_vma(desc);                                        \
>       target_ulong offset = s1, i_min, i;                                   \
>                                                                             \
> @@ -4820,9 +4801,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
>           }                                                                 \
>           *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i - offset));          \
>       }                                                                     \
> -    env->vstart = 0;                                                      \
>       /* set tail elements to 1s */                                         \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
> +    env->vstart = 0;                                                      \
>   }
>   
>   /* vslideup.vx vd, vs2, rs1, vm # vd[i+rs1] = vs2[i] */
> @@ -4840,7 +4821,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
>       uint32_t vl = env->vl;                                                \
>       uint32_t esz = sizeof(ETYPE);                                         \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
> -    uint32_t vta = vext_vta(desc);                                        \
>       uint32_t vma = vext_vma(desc);                                        \
>       target_ulong i_max, i_min, i;                                         \
>                                                                             \
> @@ -4861,9 +4841,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
>           }                                                                 \
>       }                                                                     \
>                                                                             \
> -    env->vstart = 0;                                                      \
>       /* set tail elements to 1s */                                         \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
> +    env->vstart = 0;                                                      \
>   }
>   
>   /* vslidedown.vx vd, vs2, rs1, vm # vd[i] = vs2[i+rs1] */
> @@ -4882,7 +4862,6 @@ static void vslide1up_##BITWIDTH(void *vd, void *v0, uint64_t s1,           \
>       uint32_t vl = env->vl;                                                  \
>       uint32_t esz = sizeof(ETYPE);                                           \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);            \
> -    uint32_t vta = vext_vta(desc);                                          \
>       uint32_t vma = vext_vma(desc);                                          \
>       uint32_t i;                                                             \
>                                                                               \
> @@ -4898,9 +4877,9 @@ static void vslide1up_##BITWIDTH(void *vd, void *v0, uint64_t s1,           \
>               *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i - 1));             \
>           }                                                                   \
>       }                                                                       \
> -    env->vstart = 0;                                                        \
>       /* set tail elements to 1s */                                           \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);                \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);                \
> +    env->vstart = 0;                                                        \
>   }
>   
>   GEN_VEXT_VSLIE1UP(8,  H1)
> @@ -4931,7 +4910,6 @@ static void vslide1down_##BITWIDTH(void *vd, void *v0, uint64_t s1,           \
>       uint32_t vl = env->vl;                                                    \
>       uint32_t esz = sizeof(ETYPE);                                             \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);              \
> -    uint32_t vta = vext_vta(desc);                                            \
>       uint32_t vma = vext_vma(desc);                                            \
>       uint32_t i;                                                               \
>                                                                                 \
> @@ -4947,9 +4925,9 @@ static void vslide1down_##BITWIDTH(void *vd, void *v0, uint64_t s1,           \
>               *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i + 1));               \
>           }                                                                     \
>       }                                                                         \
> -    env->vstart = 0;                                                          \
>       /* set tail elements to 1s */                                             \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);                  \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);                  \
> +    env->vstart = 0;                                                          \
>   }
>   
>   GEN_VEXT_VSLIDE1DOWN(8,  H1)
> @@ -5005,7 +4983,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,               \
>       uint32_t vl = env->vl;                                                \
>       uint32_t esz = sizeof(TS2);                                           \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
> -    uint32_t vta = vext_vta(desc);                                        \
>       uint32_t vma = vext_vma(desc);                                        \
>       uint64_t index;                                                       \
>       uint32_t i;                                                           \
> @@ -5023,9 +5000,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,               \
>               *((TS2 *)vd + HS2(i)) = *((TS2 *)vs2 + HS2(index));           \
>           }                                                                 \
>       }                                                                     \
> -    env->vstart = 0;                                                      \
>       /* set tail elements to 1s */                                         \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
> +    env->vstart = 0;                                                      \
>   }
>   
>   /* vd[i] = (vs1[i] >= VLMAX) ? 0 : vs2[vs1[i]]; */
> @@ -5048,7 +5025,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
>       uint32_t vl = env->vl;                                                \
>       uint32_t esz = sizeof(ETYPE);                                         \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
> -    uint32_t vta = vext_vta(desc);                                        \
>       uint32_t vma = vext_vma(desc);                                        \
>       uint64_t index = s1;                                                  \
>       uint32_t i;                                                           \
> @@ -5065,9 +5041,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2,         \
>               *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(index));           \
>           }                                                                 \
>       }                                                                     \
> -    env->vstart = 0;                                                      \
>       /* set tail elements to 1s */                                         \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
> +    env->vstart = 0;                                                      \
>   }
>   
>   /* vd[i] = (x[rs1] >= VLMAX) ? 0 : vs2[rs1] */
> @@ -5084,7 +5060,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,               \
>       uint32_t vl = env->vl;                                                \
>       uint32_t esz = sizeof(ETYPE);                                         \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz);          \
> -    uint32_t vta = vext_vta(desc);                                        \
>       uint32_t num = 0, i;                                                  \
>                                                                             \
>       for (i = env->vstart; i < vl; i++) {                                  \
> @@ -5094,9 +5069,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,               \
>           *((ETYPE *)vd + H(num)) = *((ETYPE *)vs2 + H(i));                 \
>           num++;                                                            \
>       }                                                                     \
> -    env->vstart = 0;                                                      \
>       /* set tail elements to 1s */                                         \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);              \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);              \
> +    env->vstart = 0;                                                      \
>   }
>   
>   /* Compress into vd elements of vs2 where vs1 is enabled */
> @@ -5130,7 +5105,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2,                 \
>       uint32_t vm = vext_vm(desc);                                 \
>       uint32_t esz = sizeof(ETYPE);                                \
>       uint32_t total_elems = vext_get_total_elems(env, desc, esz); \
> -    uint32_t vta = vext_vta(desc);                               \
>       uint32_t vma = vext_vma(desc);                               \
>       uint32_t i;                                                  \
>                                                                    \
> @@ -5142,9 +5116,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2,                 \
>           }                                                        \
>           *((ETYPE *)vd + HD(i)) = *((DTYPE *)vs2 + HS1(i));       \
>       }                                                            \
> -    env->vstart = 0;                                             \
>       /* set tail elements to 1s */                                \
> -    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);     \
> +    vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);     \
> +    env->vstart = 0;                                             \
>   }
>   
>   GEN_VEXT_INT_EXT(vzext_vf2_h, uint16_t, uint8_t,  H2, H1)


end of thread [~2024-03-11  2:41 UTC]

Thread overview: 22+ messages
2024-03-09 20:43 [PATCH v9 00/10] riscv: set vstart_eq_zero on mark_vs_dirty Daniel Henrique Barboza
2024-03-09 20:43 ` [PATCH v9 01/10] target/riscv/vector_helper.c: set vstart = 0 in GEN_VEXT_VSLIDEUP_VX() Daniel Henrique Barboza
2024-03-09 20:43 ` [PATCH v9 02/10] target/riscv: handle vstart >= vl in vext_set_tail_elems_1s() Daniel Henrique Barboza
2024-03-10  7:37   ` Richard Henderson
2024-03-09 20:43 ` [PATCH v9 03/10] target/riscv/vector_helper.c: do vstart=0 after updating tail Daniel Henrique Barboza
2024-03-10  7:38   ` Richard Henderson
2024-03-09 20:43 ` [PATCH v9 04/10] target/riscv/vector_helper.c: update tail with vext_set_tail_elems_1s() Daniel Henrique Barboza
2024-03-10  7:41   ` Richard Henderson
2024-03-10  9:50     ` Daniel Henrique Barboza
2024-03-11  2:40   ` LIU Zhiwei
2024-03-09 20:43 ` [PATCH v9 05/10] target/riscv: use vext_set_tail_elems_1s() in vcrypto insns Daniel Henrique Barboza
2024-03-10  7:42   ` Richard Henderson
2024-03-09 20:43 ` [PATCH v9 06/10] trans_rvv.c.inc: set vstart = 0 in int scalar move insns Daniel Henrique Barboza
2024-03-10  7:45   ` Richard Henderson
2024-03-09 20:43 ` [PATCH v9 07/10] target/riscv: remove 'over' brconds from vector trans Daniel Henrique Barboza
2024-03-09 20:43 ` [PATCH v9 08/10] trans_rvv.c.inc: remove redundant mark_vs_dirty() calls Daniel Henrique Barboza
2024-03-09 20:43 ` [PATCH v9 09/10] target/riscv: Clear vstart_qe_zero flag Daniel Henrique Barboza
2024-03-10  7:47   ` Richard Henderson
2024-03-10 10:17     ` Daniel Henrique Barboza
2024-03-10 18:04       ` Richard Henderson
2024-03-10 18:11         ` Daniel Henrique Barboza
2024-03-09 20:43 ` [PATCH v9 10/10] target/riscv/vector_helper.c: optimize loops in ldst helpers Daniel Henrique Barboza