* [PATCH v2 00/12] target/arm: Fix M-profile helper loads/stores alignment checks
@ 2025-07-01 10:31 William Kosasih
2025-07-01 10:31 ` [PATCH v2 01/12] target/arm: Fix VLSTM/VLLDM helper load/store " William Kosasih
` (11 more replies)
0 siblings, 12 replies; 16+ messages in thread
From: William Kosasih @ 2025-07-01 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, Peter Maydell, William Kosasih
Historically, the M-profile helper functions in m_helper.c and mve_helper.c
used the cpu_*_data_ra() routines, which perform no alignment checking, for
guest memory accesses. This meant we had no way to enforce alignment
constraints when executing helper-based loads/stores. With the addition of
the cpu_*_mmu() APIs, we can now combine the current MMU state with the
MO_ALIGN flag to build a MemOpIdx that enforces alignment at the helper level.
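For illustration, the conversion applied throughout the series follows this
before/after shape (a sketch using the long-word case; the byte and halfword
accesses use MO_UB and MO_TEUW respectively):

    /* before: the access itself performs no alignment check */
    data = cpu_ldl_data_ra(env, addr, ra);

    /* after: MO_ALIGN makes a misaligned addr take an alignment fault */
    MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN,
                                 arm_to_core_mmu_idx(arm_mmu_idx(env)));
    data = cpu_ldl_mmu(env, addr, oi, ra);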
This patch series:
- Replaces all calls to cpu_ld*_data_ra(), cpu_st*_data_ra()
in the M-profile helpers (m_helper.c) and the MVE helpers
(mve_helper.c) with their cpu_*_mmu() equivalents.
- Leaves SME and SVE helper code untouched, as those extensions
support unaligned accesses by design.
With this change, all M-profile and MVE helper-based loads and stores
will now correctly honor their alignment requirements.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1154
Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
William Kosasih (12):
target/arm: Fix VLSTM/VLLDM helper load/store alignment checks
target/arm: Fix BLXNS helper store alignment checks
target/arm: Fix function_return helper load alignment checks
target/arm: Fix VLDR helper load alignment checks
target/arm: Fix VSTR helper store alignment checks
target/arm: Fix VLDR_SG helper load alignment checks
target/arm: Fix VSTR_SG helper store alignment checks
target/arm: Fix VLD4 helper load alignment checks
target/arm: Fix VLD2 helper load alignment checks
target/arm: Fix VST4 helper store alignment checks
target/arm: Fix VST2 helper store alignment checks
target/arm: Fix helper macros indentation in mve_helper.c
target/arm/tcg/m_helper.c | 33 +--
target/arm/tcg/mve_helper.c | 408 ++++++++++++++++++++----------------
2 files changed, 254 insertions(+), 187 deletions(-)
--
2.48.1
* [PATCH v2 01/12] target/arm: Fix VLSTM/VLLDM helper load/store alignment checks
2025-07-01 10:31 [PATCH v2 00/12] target/arm: Fix M-profile helper loads/stores alignment checks William Kosasih
@ 2025-07-01 10:31 ` William Kosasih
2025-07-02 2:06 ` Richard Henderson
2025-07-01 10:31 ` [PATCH v2 02/12] target/arm: Fix BLXNS helper store " William Kosasih
` (10 subsequent siblings)
11 siblings, 1 reply; 16+ messages in thread
From: William Kosasih @ 2025-07-01 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, Peter Maydell, William Kosasih
This patch adds alignment checks to the load operations in the VLLDM
instruction and to the store operations in the VLSTM instruction.
The manual alignment checks in both helpers are retained because they
enforce an 8-byte alignment requirement (instead of the 4-byte alignment for
ordinary long loads/stores). References to cpu_*_data_* are still replaced
with cpu_*_mmu(), so that the individual word accesses themselves also
perform the standard alignment checks, in keeping with the ARM pseudocode.
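For context, the retained up-front check enforces the stricter 8-byte
requirement before any access is performed; it looks roughly like this
(a simplified sketch of the existing helper code, not part of this diff):

    if (fptr & 7) {
        /* the frame pointer must be 8-aligned for VLSTM/VLLDM */
        raise_exception_ra(env, EXCP_UNALIGNED, 0, 1, GETPC());
    }

The per-word cpu_*_mmu() accesses below then additionally enforce the
standard 4-byte alignment via MO_ALIGN.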
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1154
Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
---
target/arm/tcg/m_helper.c | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/target/arm/tcg/m_helper.c b/target/arm/tcg/m_helper.c
index 6614719832..251e12edf9 100644
--- a/target/arm/tcg/m_helper.c
+++ b/target/arm/tcg/m_helper.c
@@ -1048,6 +1048,9 @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
bool s = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
bool lspact = env->v7m.fpccr[s] & R_V7M_FPCCR_LSPACT_MASK;
uintptr_t ra = GETPC();
+ ARMMMUIdx mmu_idx = arm_mmu_idx(env);
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN,
+ arm_to_core_mmu_idx(mmu_idx));
assert(env->v7m.secure);
@@ -1073,7 +1076,7 @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
* Note that we do not use v7m_stack_write() here, because the
* accesses should not set the FSR bits for stacking errors if they
* fail. (In pseudocode terms, they are AccType_NORMAL, not AccType_STACK
- * or AccType_LAZYFP). Faults in cpu_stl_data_ra() will throw exceptions
+ * or AccType_LAZYFP). Faults in cpu_stl_mmu() will throw exceptions
* and longjmp out.
*/
if (!(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPEN_MASK)) {
@@ -1089,12 +1092,12 @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
if (i >= 16) {
faddr += 8; /* skip the slot for the FPSCR */
}
- cpu_stl_data_ra(env, faddr, slo, ra);
- cpu_stl_data_ra(env, faddr + 4, shi, ra);
+ cpu_stl_mmu(env, faddr, slo, oi, ra);
+ cpu_stl_mmu(env, faddr + 4, shi, oi, ra);
}
- cpu_stl_data_ra(env, fptr + 0x40, vfp_get_fpscr(env), ra);
+ cpu_stl_mmu(env, fptr + 0x40, vfp_get_fpscr(env), oi, ra);
if (cpu_isar_feature(aa32_mve, cpu)) {
- cpu_stl_data_ra(env, fptr + 0x44, env->v7m.vpr, ra);
+ cpu_stl_mmu(env, fptr + 0x44, env->v7m.vpr, oi, ra);
}
/*
@@ -1121,6 +1124,9 @@ void HELPER(v7m_vlldm)(CPUARMState *env, uint32_t fptr)
{
ARMCPU *cpu = env_archcpu(env);
uintptr_t ra = GETPC();
+ ARMMMUIdx mmu_idx = arm_mmu_idx(env);
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN,
+ arm_to_core_mmu_idx(mmu_idx));
/* fptr is the value of Rn, the frame pointer we load the FP regs from */
assert(env->v7m.secure);
@@ -1155,16 +1161,16 @@ void HELPER(v7m_vlldm)(CPUARMState *env, uint32_t fptr)
faddr += 8; /* skip the slot for the FPSCR and VPR */
}
- slo = cpu_ldl_data_ra(env, faddr, ra);
- shi = cpu_ldl_data_ra(env, faddr + 4, ra);
+ slo = cpu_ldl_mmu(env, faddr, oi, ra);
+ shi = cpu_ldl_mmu(env, faddr + 4, oi, ra);
dn = (uint64_t) shi << 32 | slo;
*aa32_vfp_dreg(env, i / 2) = dn;
}
- fpscr = cpu_ldl_data_ra(env, fptr + 0x40, ra);
+ fpscr = cpu_ldl_mmu(env, fptr + 0x40, oi, ra);
vfp_set_fpscr(env, fpscr);
if (cpu_isar_feature(aa32_mve, cpu)) {
- env->v7m.vpr = cpu_ldl_data_ra(env, fptr + 0x44, ra);
+ env->v7m.vpr = cpu_ldl_mmu(env, fptr + 0x44, oi, ra);
}
}
--
2.48.1
* [PATCH v2 02/12] target/arm: Fix BLXNS helper store alignment checks
2025-07-01 10:31 [PATCH v2 00/12] target/arm: Fix M-profile helper loads/stores alignment checks William Kosasih
2025-07-01 10:31 ` [PATCH v2 01/12] target/arm: Fix VLSTM/VLLDM helper load/store " William Kosasih
@ 2025-07-01 10:31 ` William Kosasih
2025-07-01 10:31 ` [PATCH v2 03/12] target/arm: Fix function_return helper load " William Kosasih
` (9 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: William Kosasih @ 2025-07-01 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, Peter Maydell, William Kosasih
This patch adds alignment checks to the store operations (when stacking the
return PC and PSR) in the BLXNS instruction.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1154
Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
---
target/arm/tcg/m_helper.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/target/arm/tcg/m_helper.c b/target/arm/tcg/m_helper.c
index 251e12edf9..f342d93489 100644
--- a/target/arm/tcg/m_helper.c
+++ b/target/arm/tcg/m_helper.c
@@ -632,8 +632,11 @@ void HELPER(v7m_blxns)(CPUARMState *env, uint32_t dest)
}
/* Note that these stores can throw exceptions on MPU faults */
- cpu_stl_data_ra(env, sp, nextinst, GETPC());
- cpu_stl_data_ra(env, sp + 4, saved_psr, GETPC());
+ ARMMMUIdx mmu_idx = arm_mmu_idx(env);
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN,
+ arm_to_core_mmu_idx(mmu_idx));
+ cpu_stl_mmu(env, sp, nextinst, oi, GETPC());
+ cpu_stl_mmu(env, sp + 4, saved_psr, oi, GETPC());
env->regs[13] = sp;
env->regs[14] = 0xfeffffff;
--
2.48.1
* [PATCH v2 03/12] target/arm: Fix function_return helper load alignment checks
2025-07-01 10:31 [PATCH v2 00/12] target/arm: Fix M-profile helper loads/stores alignment checks William Kosasih
2025-07-01 10:31 ` [PATCH v2 01/12] target/arm: Fix VLSTM/VLLDM helper load/store " William Kosasih
2025-07-01 10:31 ` [PATCH v2 02/12] target/arm: Fix BLXNS helper store " William Kosasih
@ 2025-07-01 10:31 ` William Kosasih
2025-07-01 10:31 ` [PATCH v2 04/12] target/arm: Fix VLDR " William Kosasih
` (8 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: William Kosasih @ 2025-07-01 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, Peter Maydell, William Kosasih
This patch adds alignment checks to the load operations (when unstacking the
return PC and PSR) in the FunctionReturn pseudocode.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1154
Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
---
target/arm/tcg/m_helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/arm/tcg/m_helper.c b/target/arm/tcg/m_helper.c
index f342d93489..28307b5615 100644
--- a/target/arm/tcg/m_helper.c
+++ b/target/arm/tcg/m_helper.c
@@ -1946,7 +1946,7 @@ static bool do_v7m_function_return(ARMCPU *cpu)
* do them as secure, so work out what MMU index that is.
*/
mmu_idx = arm_v7m_mmu_idx_for_secstate(env, true);
- oi = make_memop_idx(MO_LEUL, arm_to_core_mmu_idx(mmu_idx));
+ oi = make_memop_idx(MO_LEUL | MO_ALIGN, arm_to_core_mmu_idx(mmu_idx));
newpc = cpu_ldl_mmu(env, frameptr, oi, 0);
newpsr = cpu_ldl_mmu(env, frameptr + 4, oi, 0);
--
2.48.1
* [PATCH v2 04/12] target/arm: Fix VLDR helper load alignment checks
2025-07-01 10:31 [PATCH v2 00/12] target/arm: Fix M-profile helper loads/stores alignment checks William Kosasih
` (2 preceding siblings ...)
2025-07-01 10:31 ` [PATCH v2 03/12] target/arm: Fix function_return helper load " William Kosasih
@ 2025-07-01 10:31 ` William Kosasih
2025-07-02 2:29 ` Richard Henderson
2025-07-01 10:31 ` [PATCH v2 05/12] target/arm: Fix VSTR helper store " William Kosasih
` (7 subsequent siblings)
11 siblings, 1 reply; 16+ messages in thread
From: William Kosasih @ 2025-07-01 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, Peter Maydell, William Kosasih
This patch adds alignment checks to the load operations in the VLDR
instruction. Since the cpu_*_mmu() loads return the raw memory value, new
MSIZE/MFLAG mapping macros and a SIGN_EXT helper are introduced so that
DO_VLDR can derive the MemOp flag for each load type and re-apply sign
extension for the signed widening loads.
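As a worked example of what the new SIGN_EXT macro is intended to compute
(illustrative arithmetic, not part of the diff): for vldrb_sw, which widens
a signed byte to a 32-bit element, TYPE is int32_t and MSIZE * 8 is 8, so a
loaded byte of 0xfe expands as

    SIGN_EXT(0xfe, int32_t, 8)
        = ((int32_t)0xfe << 24) >> 24
        = (int32_t)0xfe000000 >> 24
        = 0xfffffffe   /* -2, sign-extended */

while for the unsigned variants the shift pair is a no-op.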
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1154
Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
---
target/arm/tcg/mve_helper.c | 41 ++++++++++++++++++++++++++++---------
1 file changed, 31 insertions(+), 10 deletions(-)
diff --git a/target/arm/tcg/mve_helper.c b/target/arm/tcg/mve_helper.c
index 506d1c3475..922cd2371a 100644
--- a/target/arm/tcg/mve_helper.c
+++ b/target/arm/tcg/mve_helper.c
@@ -147,6 +147,22 @@ static void mve_advance_vpt(CPUARMState *env)
env->v7m.vpr = vpr;
}
+/* Mapping of LDTYPE/STTYPE to the number of bytes accessed */
+#define MSIZE_b 1
+#define MSIZE_w 2
+#define MSIZE_l 4
+
+/* Mapping of LDTYPE/STTYPE to MemOp flag */
+#define MFLAG_b MO_UB
+#define MFLAG_w MO_TEUW
+#define MFLAG_l MO_TEUL
+
+#define MSIZE(t) MSIZE_##t
+#define MFLAG(t) MFLAG_##t
+
+#define SIGN_EXT(v, T, B) ( \
+ ((T)(v) << (sizeof(T) * 8 - (B))) >> (sizeof(T) * 8 - (B)))
+
/* For loads, predicated lanes are zeroed instead of keeping their old values */
#define DO_VLDR(OP, MSIZE, LDTYPE, ESIZE, TYPE) \
void HELPER(mve_##OP)(CPUARMState *env, void *vd, uint32_t addr) \
@@ -155,6 +171,8 @@ static void mve_advance_vpt(CPUARMState *env)
uint16_t mask = mve_element_mask(env); \
uint16_t eci_mask = mve_eci_mask(env); \
unsigned b, e; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MFLAG(LDTYPE) | MO_ALIGN, mmu_idx);\
/* \
* R_SXTM allows the dest reg to become UNKNOWN for abandoned \
* beats so we don't care if we update part of the dest and \
@@ -163,7 +181,10 @@ static void mve_advance_vpt(CPUARMState *env)
for (b = 0, e = 0; b < 16; b += ESIZE, e++) { \
if (eci_mask & (1 << b)) { \
d[H##ESIZE(e)] = (mask & (1 << b)) ? \
- cpu_##LDTYPE##_data_ra(env, addr, GETPC()) : 0; \
+ SIGN_EXT(cpu_ld##LDTYPE##_mmu(env, addr, oi, GETPC()),\
+ TYPE, \
+ MSIZE * 8) \
+ : 0; \
} \
addr += MSIZE; \
} \
@@ -185,20 +206,20 @@ static void mve_advance_vpt(CPUARMState *env)
mve_advance_vpt(env); \
}
-DO_VLDR(vldrb, 1, ldub, 1, uint8_t)
-DO_VLDR(vldrh, 2, lduw, 2, uint16_t)
-DO_VLDR(vldrw, 4, ldl, 4, uint32_t)
+DO_VLDR(vldrb, 1, b, 1, uint8_t)
+DO_VLDR(vldrh, 2, w, 2, uint16_t)
+DO_VLDR(vldrw, 4, l, 4, uint32_t)
DO_VSTR(vstrb, 1, stb, 1, uint8_t)
DO_VSTR(vstrh, 2, stw, 2, uint16_t)
DO_VSTR(vstrw, 4, stl, 4, uint32_t)
-DO_VLDR(vldrb_sh, 1, ldsb, 2, int16_t)
-DO_VLDR(vldrb_sw, 1, ldsb, 4, int32_t)
-DO_VLDR(vldrb_uh, 1, ldub, 2, uint16_t)
-DO_VLDR(vldrb_uw, 1, ldub, 4, uint32_t)
-DO_VLDR(vldrh_sw, 2, ldsw, 4, int32_t)
-DO_VLDR(vldrh_uw, 2, lduw, 4, uint32_t)
+DO_VLDR(vldrb_sh, 1, b, 2, int16_t)
+DO_VLDR(vldrb_sw, 1, b, 4, int32_t)
+DO_VLDR(vldrb_uh, 1, b, 2, uint16_t)
+DO_VLDR(vldrb_uw, 1, b, 4, uint32_t)
+DO_VLDR(vldrh_sw, 2, w, 4, int32_t)
+DO_VLDR(vldrh_uw, 2, w, 4, uint32_t)
DO_VSTR(vstrb_h, 1, stb, 2, int16_t)
DO_VSTR(vstrb_w, 1, stb, 4, int32_t)
--
2.48.1
* [PATCH v2 05/12] target/arm: Fix VSTR helper store alignment checks
2025-07-01 10:31 [PATCH v2 00/12] target/arm: Fix M-profile helper loads/stores alignment checks William Kosasih
` (3 preceding siblings ...)
2025-07-01 10:31 ` [PATCH v2 04/12] target/arm: Fix VLDR " William Kosasih
@ 2025-07-01 10:31 ` William Kosasih
2025-07-01 10:31 ` [PATCH v2 06/12] target/arm: Fix VLDR_SG helper load " William Kosasih
` (6 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: William Kosasih @ 2025-07-01 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, Peter Maydell, William Kosasih
This patch adds alignment checks to the store operations in the VSTR
instruction.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1154
Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
---
target/arm/tcg/mve_helper.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/target/arm/tcg/mve_helper.c b/target/arm/tcg/mve_helper.c
index 922cd2371a..a49b8842e3 100644
--- a/target/arm/tcg/mve_helper.c
+++ b/target/arm/tcg/mve_helper.c
@@ -197,9 +197,11 @@ static void mve_advance_vpt(CPUARMState *env)
TYPE *d = vd; \
uint16_t mask = mve_element_mask(env); \
unsigned b, e; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MFLAG(STTYPE) | MO_ALIGN, mmu_idx);\
for (b = 0, e = 0; b < 16; b += ESIZE, e++) { \
if (mask & (1 << b)) { \
- cpu_##STTYPE##_data_ra(env, addr, d[H##ESIZE(e)], GETPC()); \
+ cpu_st##STTYPE##_mmu(env, addr, d[H##ESIZE(e)], oi, GETPC());\
} \
addr += MSIZE; \
} \
@@ -210,9 +212,9 @@ DO_VLDR(vldrb, 1, b, 1, uint8_t)
DO_VLDR(vldrh, 2, w, 2, uint16_t)
DO_VLDR(vldrw, 4, l, 4, uint32_t)
-DO_VSTR(vstrb, 1, stb, 1, uint8_t)
-DO_VSTR(vstrh, 2, stw, 2, uint16_t)
-DO_VSTR(vstrw, 4, stl, 4, uint32_t)
+DO_VSTR(vstrb, 1, b, 1, uint8_t)
+DO_VSTR(vstrh, 2, w, 2, uint16_t)
+DO_VSTR(vstrw, 4, l, 4, uint32_t)
DO_VLDR(vldrb_sh, 1, b, 2, int16_t)
DO_VLDR(vldrb_sw, 1, b, 4, int32_t)
@@ -221,9 +223,9 @@ DO_VLDR(vldrb_uw, 1, b, 4, uint32_t)
DO_VLDR(vldrh_sw, 2, w, 4, int32_t)
DO_VLDR(vldrh_uw, 2, w, 4, uint32_t)
-DO_VSTR(vstrb_h, 1, stb, 2, int16_t)
-DO_VSTR(vstrb_w, 1, stb, 4, int32_t)
-DO_VSTR(vstrh_w, 2, stw, 4, int32_t)
+DO_VSTR(vstrb_h, 1, b, 2, int16_t)
+DO_VSTR(vstrb_w, 1, b, 4, int32_t)
+DO_VSTR(vstrh_w, 2, w, 4, int32_t)
#undef DO_VLDR
#undef DO_VSTR
--
2.48.1
* [PATCH v2 06/12] target/arm: Fix VLDR_SG helper load alignment checks
2025-07-01 10:31 [PATCH v2 00/12] target/arm: Fix M-profile helper loads/stores alignment checks William Kosasih
` (4 preceding siblings ...)
2025-07-01 10:31 ` [PATCH v2 05/12] target/arm: Fix VSTR helper store " William Kosasih
@ 2025-07-01 10:31 ` William Kosasih
2025-07-01 10:31 ` [PATCH v2 07/12] target/arm: Fix VSTR_SG helper store " William Kosasih
` (5 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: William Kosasih @ 2025-07-01 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, Peter Maydell, William Kosasih
This patch adds alignment checks to the load operations in the VLDR_SG
instructions.
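For context (a recap built from the macros already in this file, not new
code): each gather element address is computed from the scalar base and an
offset vector element via ADDRFN, e.g. for word-scaled offsets

    #define ADDR_ADD_OSW(BASE, OFFSET) ((BASE) + ((OFFSET) << 2))

    addr = ADDR_ADD_OSW(base, m[H4(e)]);            /* per-element address */
    d[H4(e)] = cpu_ldl_mmu(env, addr, oi, GETPC()); /* now MO_ALIGN-checked */

so the new MemOpIdx alignment check applies to each per-element address
individually.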
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1154
Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
---
target/arm/tcg/mve_helper.c | 41 ++++++++++++++++++++++---------------
1 file changed, 24 insertions(+), 17 deletions(-)
diff --git a/target/arm/tcg/mve_helper.c b/target/arm/tcg/mve_helper.c
index a49b8842e3..f1e9c87e6a 100644
--- a/target/arm/tcg/mve_helper.c
+++ b/target/arm/tcg/mve_helper.c
@@ -247,13 +247,18 @@ DO_VSTR(vstrh_w, 2, w, 4, int32_t)
uint16_t eci_mask = mve_eci_mask(env); \
unsigned e; \
uint32_t addr; \
- for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE, eci_mask >>= ESIZE) { \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MFLAG(LDTYPE) | MO_ALIGN, mmu_idx);\
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE, eci_mask >>= ESIZE) {\
if (!(eci_mask & 1)) { \
continue; \
} \
addr = ADDRFN(base, m[H##ESIZE(e)]); \
d[H##ESIZE(e)] = (mask & 1) ? \
- cpu_##LDTYPE##_data_ra(env, addr, GETPC()) : 0; \
+ SIGN_EXT(cpu_ld##LDTYPE##_mmu(env, addr, oi, GETPC()), \
+ TYPE, \
+ MSIZE(LDTYPE) * 8) \
+ : 0; \
if (WB) { \
m[H##ESIZE(e)] = addr; \
} \
@@ -305,13 +310,15 @@ DO_VSTR(vstrh_w, 2, w, 4, int32_t)
uint16_t eci_mask = mve_eci_mask(env); \
unsigned e; \
uint32_t addr; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (e = 0; e < 16 / 4; e++, mask >>= 4, eci_mask >>= 4) { \
if (!(eci_mask & 1)) { \
continue; \
} \
addr = ADDRFN(base, m[H4(e & ~1)]); \
addr += 4 * (e & 1); \
- d[H4(e)] = (mask & 1) ? cpu_ldl_data_ra(env, addr, GETPC()) : 0; \
+ d[H4(e)] = (mask & 1) ? cpu_ldl_mmu(env, addr, oi, GETPC()) : 0;\
if (WB && (e & 1)) { \
m[H4(e & ~1)] = addr - 4; \
} \
@@ -350,22 +357,22 @@ DO_VSTR(vstrh_w, 2, w, 4, int32_t)
#define ADDR_ADD_OSW(BASE, OFFSET) ((BASE) + ((OFFSET) << 2))
#define ADDR_ADD_OSD(BASE, OFFSET) ((BASE) + ((OFFSET) << 3))
-DO_VLDR_SG(vldrb_sg_sh, ldsb, 2, int16_t, uint16_t, ADDR_ADD, false)
-DO_VLDR_SG(vldrb_sg_sw, ldsb, 4, int32_t, uint32_t, ADDR_ADD, false)
-DO_VLDR_SG(vldrh_sg_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD, false)
+DO_VLDR_SG(vldrb_sg_sh, b, 2, int16_t, uint16_t, ADDR_ADD, false)
+DO_VLDR_SG(vldrb_sg_sw, b, 4, int32_t, uint32_t, ADDR_ADD, false)
+DO_VLDR_SG(vldrh_sg_sw, w, 4, int32_t, uint32_t, ADDR_ADD, false)
-DO_VLDR_SG(vldrb_sg_ub, ldub, 1, uint8_t, uint8_t, ADDR_ADD, false)
-DO_VLDR_SG(vldrb_sg_uh, ldub, 2, uint16_t, uint16_t, ADDR_ADD, false)
-DO_VLDR_SG(vldrb_sg_uw, ldub, 4, uint32_t, uint32_t, ADDR_ADD, false)
-DO_VLDR_SG(vldrh_sg_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD, false)
-DO_VLDR_SG(vldrh_sg_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD, false)
-DO_VLDR_SG(vldrw_sg_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD, false)
+DO_VLDR_SG(vldrb_sg_ub, b, 1, uint8_t, uint8_t, ADDR_ADD, false)
+DO_VLDR_SG(vldrb_sg_uh, b, 2, uint16_t, uint16_t, ADDR_ADD, false)
+DO_VLDR_SG(vldrb_sg_uw, b, 4, uint32_t, uint32_t, ADDR_ADD, false)
+DO_VLDR_SG(vldrh_sg_uh, w, 2, uint16_t, uint16_t, ADDR_ADD, false)
+DO_VLDR_SG(vldrh_sg_uw, w, 4, uint32_t, uint32_t, ADDR_ADD, false)
+DO_VLDR_SG(vldrw_sg_uw, l, 4, uint32_t, uint32_t, ADDR_ADD, false)
DO_VLDR64_SG(vldrd_sg_ud, ADDR_ADD, false)
-DO_VLDR_SG(vldrh_sg_os_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD_OSH, false)
-DO_VLDR_SG(vldrh_sg_os_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD_OSH, false)
-DO_VLDR_SG(vldrh_sg_os_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD_OSH, false)
-DO_VLDR_SG(vldrw_sg_os_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD_OSW, false)
+DO_VLDR_SG(vldrh_sg_os_sw, w, 4, int32_t, uint32_t, ADDR_ADD_OSH, false)
+DO_VLDR_SG(vldrh_sg_os_uh, w, 2, uint16_t, uint16_t, ADDR_ADD_OSH, false)
+DO_VLDR_SG(vldrh_sg_os_uw, w, 4, uint32_t, uint32_t, ADDR_ADD_OSH, false)
+DO_VLDR_SG(vldrw_sg_os_uw, l, 4, uint32_t, uint32_t, ADDR_ADD_OSW, false)
DO_VLDR64_SG(vldrd_sg_os_ud, ADDR_ADD_OSD, false)
DO_VSTR_SG(vstrb_sg_ub, stb, 1, uint8_t, ADDR_ADD, false)
@@ -381,7 +388,7 @@ DO_VSTR_SG(vstrh_sg_os_uw, stw, 4, uint32_t, ADDR_ADD_OSH, false)
DO_VSTR_SG(vstrw_sg_os_uw, stl, 4, uint32_t, ADDR_ADD_OSW, false)
DO_VSTR64_SG(vstrd_sg_os_ud, ADDR_ADD_OSD, false)
-DO_VLDR_SG(vldrw_sg_wb_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD, true)
+DO_VLDR_SG(vldrw_sg_wb_uw, l, 4, uint32_t, uint32_t, ADDR_ADD, true)
DO_VLDR64_SG(vldrd_sg_wb_ud, ADDR_ADD, true)
DO_VSTR_SG(vstrw_sg_wb_uw, stl, 4, uint32_t, ADDR_ADD, true)
DO_VSTR64_SG(vstrd_sg_wb_ud, ADDR_ADD, true)
--
2.48.1
* [PATCH v2 07/12] target/arm: Fix VSTR_SG helper store alignment checks
2025-07-01 10:31 [PATCH v2 00/12] target/arm: Fix M-profile helper loads/stores alignment checks William Kosasih
` (5 preceding siblings ...)
2025-07-01 10:31 ` [PATCH v2 06/12] target/arm: Fix VLDR_SG helper load " William Kosasih
@ 2025-07-01 10:31 ` William Kosasih
2025-07-01 10:31 ` [PATCH v2 08/12] target/arm: Fix VLD4 helper load " William Kosasih
` (4 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: William Kosasih @ 2025-07-01 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, Peter Maydell, William Kosasih
This patch adds alignment checks to the store operations in the VSTR_SG
instructions.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1154
Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
---
target/arm/tcg/mve_helper.c | 28 ++++++++++++++++------------
1 file changed, 16 insertions(+), 12 deletions(-)
diff --git a/target/arm/tcg/mve_helper.c b/target/arm/tcg/mve_helper.c
index f1e9c87e6a..5b04fa4425 100644
--- a/target/arm/tcg/mve_helper.c
+++ b/target/arm/tcg/mve_helper.c
@@ -277,13 +277,15 @@ DO_VSTR(vstrh_w, 2, w, 4, int32_t)
uint16_t eci_mask = mve_eci_mask(env); \
unsigned e; \
uint32_t addr; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MFLAG(STTYPE) | MO_ALIGN, mmu_idx);\
for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE, eci_mask >>= ESIZE) { \
if (!(eci_mask & 1)) { \
continue; \
} \
addr = ADDRFN(base, m[H##ESIZE(e)]); \
if (mask & 1) { \
- cpu_##STTYPE##_data_ra(env, addr, d[H##ESIZE(e)], GETPC()); \
+ cpu_st##STTYPE##_mmu(env, addr, d[H##ESIZE(e)], oi, GETPC()); \
} \
if (WB) { \
m[H##ESIZE(e)] = addr; \
@@ -336,6 +338,8 @@ DO_VSTR(vstrh_w, 2, w, 4, int32_t)
uint16_t eci_mask = mve_eci_mask(env); \
unsigned e; \
uint32_t addr; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (e = 0; e < 16 / 4; e++, mask >>= 4, eci_mask >>= 4) { \
if (!(eci_mask & 1)) { \
continue; \
@@ -343,7 +347,7 @@ DO_VSTR(vstrh_w, 2, w, 4, int32_t)
addr = ADDRFN(base, m[H4(e & ~1)]); \
addr += 4 * (e & 1); \
if (mask & 1) { \
- cpu_stl_data_ra(env, addr, d[H4(e)], GETPC()); \
+ cpu_stl_mmu(env, addr, d[H4(e)], oi, GETPC()); \
} \
if (WB && (e & 1)) { \
m[H4(e & ~1)] = addr - 4; \
@@ -375,22 +379,22 @@ DO_VLDR_SG(vldrh_sg_os_uw, w, 4, uint32_t, uint32_t, ADDR_ADD_OSH, false)
DO_VLDR_SG(vldrw_sg_os_uw, l, 4, uint32_t, uint32_t, ADDR_ADD_OSW, false)
DO_VLDR64_SG(vldrd_sg_os_ud, ADDR_ADD_OSD, false)
-DO_VSTR_SG(vstrb_sg_ub, stb, 1, uint8_t, ADDR_ADD, false)
-DO_VSTR_SG(vstrb_sg_uh, stb, 2, uint16_t, ADDR_ADD, false)
-DO_VSTR_SG(vstrb_sg_uw, stb, 4, uint32_t, ADDR_ADD, false)
-DO_VSTR_SG(vstrh_sg_uh, stw, 2, uint16_t, ADDR_ADD, false)
-DO_VSTR_SG(vstrh_sg_uw, stw, 4, uint32_t, ADDR_ADD, false)
-DO_VSTR_SG(vstrw_sg_uw, stl, 4, uint32_t, ADDR_ADD, false)
+DO_VSTR_SG(vstrb_sg_ub, b, 1, uint8_t, ADDR_ADD, false)
+DO_VSTR_SG(vstrb_sg_uh, b, 2, uint16_t, ADDR_ADD, false)
+DO_VSTR_SG(vstrb_sg_uw, b, 4, uint32_t, ADDR_ADD, false)
+DO_VSTR_SG(vstrh_sg_uh, w, 2, uint16_t, ADDR_ADD, false)
+DO_VSTR_SG(vstrh_sg_uw, w, 4, uint32_t, ADDR_ADD, false)
+DO_VSTR_SG(vstrw_sg_uw, l, 4, uint32_t, ADDR_ADD, false)
DO_VSTR64_SG(vstrd_sg_ud, ADDR_ADD, false)
-DO_VSTR_SG(vstrh_sg_os_uh, stw, 2, uint16_t, ADDR_ADD_OSH, false)
-DO_VSTR_SG(vstrh_sg_os_uw, stw, 4, uint32_t, ADDR_ADD_OSH, false)
-DO_VSTR_SG(vstrw_sg_os_uw, stl, 4, uint32_t, ADDR_ADD_OSW, false)
+DO_VSTR_SG(vstrh_sg_os_uh, w, 2, uint16_t, ADDR_ADD_OSH, false)
+DO_VSTR_SG(vstrh_sg_os_uw, w, 4, uint32_t, ADDR_ADD_OSH, false)
+DO_VSTR_SG(vstrw_sg_os_uw, l, 4, uint32_t, ADDR_ADD_OSW, false)
DO_VSTR64_SG(vstrd_sg_os_ud, ADDR_ADD_OSD, false)
DO_VLDR_SG(vldrw_sg_wb_uw, l, 4, uint32_t, uint32_t, ADDR_ADD, true)
DO_VLDR64_SG(vldrd_sg_wb_ud, ADDR_ADD, true)
-DO_VSTR_SG(vstrw_sg_wb_uw, stl, 4, uint32_t, ADDR_ADD, true)
+DO_VSTR_SG(vstrw_sg_wb_uw, l, 4, uint32_t, ADDR_ADD, true)
DO_VSTR64_SG(vstrd_sg_wb_ud, ADDR_ADD, true)
/*
--
2.48.1
* [PATCH v2 08/12] target/arm: Fix VLD4 helper load alignment checks
2025-07-01 10:31 [PATCH v2 00/12] target/arm: Fix M-profile helper loads/stores alignment checks William Kosasih
` (6 preceding siblings ...)
2025-07-01 10:31 ` [PATCH v2 07/12] target/arm: Fix VSTR_SG helper store " William Kosasih
@ 2025-07-01 10:31 ` William Kosasih
2025-07-01 10:31 ` [PATCH v2 09/12] target/arm: Fix VLD2 " William Kosasih
` (3 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: William Kosasih @ 2025-07-01 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, Peter Maydell, William Kosasih
This patch adds alignment checks to the load operations in the VLD4
instruction.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1154
Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
---
target/arm/tcg/mve_helper.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/target/arm/tcg/mve_helper.c b/target/arm/tcg/mve_helper.c
index 5b04fa4425..2d2936abde 100644
--- a/target/arm/tcg/mve_helper.c
+++ b/target/arm/tcg/mve_helper.c
@@ -421,13 +421,15 @@ DO_VSTR64_SG(vstrd_sg_wb_ud, ADDR_ADD, true)
uint16_t mask = mve_eci_mask(env); \
static const uint8_t off[4] = { O1, O2, O3, O4 }; \
uint32_t addr, data; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (beat = 0; beat < 4; beat++, mask >>= 4) { \
if ((mask & 1) == 0) { \
/* ECI says skip this beat */ \
continue; \
} \
addr = base + off[beat] * 4; \
- data = cpu_ldl_le_data_ra(env, addr, GETPC()); \
+ data = cpu_ldl_mmu(env, addr, oi, GETPC()); \
for (e = 0; e < 4; e++, data >>= 8) { \
uint8_t *qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + e); \
qd[H1(off[beat])] = data; \
@@ -445,13 +447,15 @@ DO_VSTR64_SG(vstrd_sg_wb_ud, ADDR_ADD, true)
uint32_t addr, data; \
int y; /* y counts 0 2 0 2 */ \
uint16_t *qd; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (beat = 0, y = 0; beat < 4; beat++, mask >>= 4, y ^= 2) { \
if ((mask & 1) == 0) { \
/* ECI says skip this beat */ \
continue; \
} \
addr = base + off[beat] * 8 + (beat & 1) * 4; \
- data = cpu_ldl_le_data_ra(env, addr, GETPC()); \
+ data = cpu_ldl_mmu(env, addr, oi, GETPC()); \
qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y); \
qd[H2(off[beat])] = data; \
data >>= 16; \
@@ -470,13 +474,15 @@ DO_VSTR64_SG(vstrd_sg_wb_ud, ADDR_ADD, true)
uint32_t addr, data; \
uint32_t *qd; \
int y; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (beat = 0; beat < 4; beat++, mask >>= 4) { \
if ((mask & 1) == 0) { \
/* ECI says skip this beat */ \
continue; \
} \
addr = base + off[beat] * 4; \
- data = cpu_ldl_le_data_ra(env, addr, GETPC()); \
+ data = cpu_ldl_mmu(env, addr, oi, GETPC()); \
y = (beat + (O1 & 2)) & 3; \
qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + y); \
qd[H4(off[beat] >> 2)] = data; \
--
2.48.1
* [PATCH v2 09/12] target/arm: Fix VLD2 helper load alignment checks
2025-07-01 10:31 [PATCH v2 00/12] target/arm: Fix M-profile helper loads/stores alignment checks William Kosasih
` (7 preceding siblings ...)
2025-07-01 10:31 ` [PATCH v2 08/12] target/arm: Fix VLD4 helper load " William Kosasih
@ 2025-07-01 10:31 ` William Kosasih
2025-07-01 10:31 ` [PATCH v2 10/12] target/arm: Fix VST4 helper store " William Kosasih
` (2 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: William Kosasih @ 2025-07-01 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, Peter Maydell, William Kosasih
This patch adds alignment checks to the load operations in the VLD2
instruction.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1154
Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
---
target/arm/tcg/mve_helper.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/target/arm/tcg/mve_helper.c b/target/arm/tcg/mve_helper.c
index 2d2936abde..9e8ea04074 100644
--- a/target/arm/tcg/mve_helper.c
+++ b/target/arm/tcg/mve_helper.c
@@ -513,13 +513,15 @@ DO_VLD4W(vld43w, 6, 7, 8, 9)
static const uint8_t off[4] = { O1, O2, O3, O4 }; \
uint32_t addr, data; \
uint8_t *qd; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (beat = 0; beat < 4; beat++, mask >>= 4) { \
if ((mask & 1) == 0) { \
/* ECI says skip this beat */ \
continue; \
} \
addr = base + off[beat] * 2; \
- data = cpu_ldl_le_data_ra(env, addr, GETPC()); \
+ data = cpu_ldl_mmu(env, addr, oi, GETPC()); \
for (e = 0; e < 4; e++, data >>= 8) { \
qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + (e & 1)); \
qd[H1(off[beat] + (e >> 1))] = data; \
@@ -537,13 +539,15 @@ DO_VLD4W(vld43w, 6, 7, 8, 9)
uint32_t addr, data; \
int e; \
uint16_t *qd; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (beat = 0; beat < 4; beat++, mask >>= 4) { \
if ((mask & 1) == 0) { \
/* ECI says skip this beat */ \
continue; \
} \
addr = base + off[beat] * 4; \
- data = cpu_ldl_le_data_ra(env, addr, GETPC()); \
+ data = cpu_ldl_mmu(env, addr, oi, GETPC()); \
for (e = 0; e < 2; e++, data >>= 16) { \
qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + e); \
qd[H2(off[beat])] = data; \
@@ -560,13 +564,15 @@ DO_VLD4W(vld43w, 6, 7, 8, 9)
static const uint8_t off[4] = { O1, O2, O3, O4 }; \
uint32_t addr, data; \
uint32_t *qd; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (beat = 0; beat < 4; beat++, mask >>= 4) { \
if ((mask & 1) == 0) { \
/* ECI says skip this beat */ \
continue; \
} \
addr = base + off[beat]; \
- data = cpu_ldl_le_data_ra(env, addr, GETPC()); \
+ data = cpu_ldl_mmu(env, addr, oi, GETPC()); \
qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + (beat & 1)); \
qd[H4(off[beat] >> 3)] = data; \
} \
--
2.48.1
* [PATCH v2 10/12] target/arm: Fix VST4 helper store alignment checks
2025-07-01 10:31 [PATCH v2 00/12] target/arm: Fix M-profile helper loads/stores alignment checks William Kosasih
` (8 preceding siblings ...)
2025-07-01 10:31 ` [PATCH v2 09/12] target/arm: Fix VLD2 " William Kosasih
@ 2025-07-01 10:31 ` William Kosasih
2025-07-01 10:31 ` [PATCH v2 11/12] target/arm: Fix VST2 " William Kosasih
2025-07-01 10:31 ` [PATCH v2 12/12] target/arm: Fix helper macros indentation in mve_helper.c William Kosasih
11 siblings, 0 replies; 16+ messages in thread
From: William Kosasih @ 2025-07-01 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, Peter Maydell, William Kosasih
This patch adds alignment checks to the store operations in the VST4
instruction.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1154
Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
---
target/arm/tcg/mve_helper.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/target/arm/tcg/mve_helper.c b/target/arm/tcg/mve_helper.c
index 9e8ea04074..f16877ba45 100644
--- a/target/arm/tcg/mve_helper.c
+++ b/target/arm/tcg/mve_helper.c
@@ -595,6 +595,8 @@ DO_VLD2W(vld21w, 8, 12, 16, 20)
uint16_t mask = mve_eci_mask(env); \
static const uint8_t off[4] = { O1, O2, O3, O4 }; \
uint32_t addr, data; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (beat = 0; beat < 4; beat++, mask >>= 4) { \
if ((mask & 1) == 0) { \
/* ECI says skip this beat */ \
@@ -606,7 +608,7 @@ DO_VLD2W(vld21w, 8, 12, 16, 20)
uint8_t *qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + e); \
data = (data << 8) | qd[H1(off[beat])]; \
} \
- cpu_stl_le_data_ra(env, addr, data, GETPC()); \
+ cpu_stl_mmu(env, addr, data, oi, GETPC()); \
} \
}
@@ -620,6 +622,8 @@ DO_VLD2W(vld21w, 8, 12, 16, 20)
uint32_t addr, data; \
int y; /* y counts 0 2 0 2 */ \
uint16_t *qd; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (beat = 0, y = 0; beat < 4; beat++, mask >>= 4, y ^= 2) { \
if ((mask & 1) == 0) { \
/* ECI says skip this beat */ \
@@ -630,7 +634,7 @@ DO_VLD2W(vld21w, 8, 12, 16, 20)
data = qd[H2(off[beat])]; \
qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y + 1); \
data |= qd[H2(off[beat])] << 16; \
- cpu_stl_le_data_ra(env, addr, data, GETPC()); \
+ cpu_stl_mmu(env, addr, data, oi, GETPC()); \
} \
}
@@ -644,6 +648,8 @@ DO_VLD2W(vld21w, 8, 12, 16, 20)
uint32_t addr, data; \
uint32_t *qd; \
int y; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (beat = 0; beat < 4; beat++, mask >>= 4) { \
if ((mask & 1) == 0) { \
/* ECI says skip this beat */ \
@@ -653,7 +659,7 @@ DO_VLD2W(vld21w, 8, 12, 16, 20)
y = (beat + (O1 & 2)) & 3; \
qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + y); \
data = qd[H4(off[beat] >> 2)]; \
- cpu_stl_le_data_ra(env, addr, data, GETPC()); \
+ cpu_stl_mmu(env, addr, data, oi, GETPC()); \
} \
}
--
2.48.1
* [PATCH v2 11/12] target/arm: Fix VST2 helper store alignment checks
2025-07-01 10:31 [PATCH v2 00/12] target/arm: Fix M-profile helper loads/stores alignment checks William Kosasih
` (9 preceding siblings ...)
2025-07-01 10:31 ` [PATCH v2 10/12] target/arm: Fix VST4 helper store " William Kosasih
@ 2025-07-01 10:31 ` William Kosasih
2025-07-01 10:31 ` [PATCH v2 12/12] target/arm: Fix helper macros indentation in mve_helper.c William Kosasih
11 siblings, 0 replies; 16+ messages in thread
From: William Kosasih @ 2025-07-01 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, Peter Maydell, William Kosasih
This patch adds alignment checks to the store operations in the VST2
instruction.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1154
Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
---
target/arm/tcg/mve_helper.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/target/arm/tcg/mve_helper.c b/target/arm/tcg/mve_helper.c
index f16877ba45..6dffd9cb35 100644
--- a/target/arm/tcg/mve_helper.c
+++ b/target/arm/tcg/mve_helper.c
@@ -687,6 +687,8 @@ DO_VST4W(vst43w, 6, 7, 8, 9)
static const uint8_t off[4] = { O1, O2, O3, O4 }; \
uint32_t addr, data; \
uint8_t *qd; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (beat = 0; beat < 4; beat++, mask >>= 4) { \
if ((mask & 1) == 0) { \
/* ECI says skip this beat */ \
@@ -698,7 +700,7 @@ DO_VST4W(vst43w, 6, 7, 8, 9)
qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + (e & 1)); \
data = (data << 8) | qd[H1(off[beat] + (e >> 1))]; \
} \
- cpu_stl_le_data_ra(env, addr, data, GETPC()); \
+ cpu_stl_mmu(env, addr, data, oi, GETPC()); \
} \
}
@@ -712,6 +714,8 @@ DO_VST4W(vst43w, 6, 7, 8, 9)
uint32_t addr, data; \
int e; \
uint16_t *qd; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (beat = 0; beat < 4; beat++, mask >>= 4) { \
if ((mask & 1) == 0) { \
/* ECI says skip this beat */ \
@@ -723,7 +727,7 @@ DO_VST4W(vst43w, 6, 7, 8, 9)
qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + e); \
data = (data << 16) | qd[H2(off[beat])]; \
} \
- cpu_stl_le_data_ra(env, addr, data, GETPC()); \
+ cpu_stl_mmu(env, addr, data, oi, GETPC()); \
} \
}
@@ -736,6 +740,8 @@ DO_VST4W(vst43w, 6, 7, 8, 9)
static const uint8_t off[4] = { O1, O2, O3, O4 }; \
uint32_t addr, data; \
uint32_t *qd; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
for (beat = 0; beat < 4; beat++, mask >>= 4) { \
if ((mask & 1) == 0) { \
/* ECI says skip this beat */ \
@@ -744,7 +750,7 @@ DO_VST4W(vst43w, 6, 7, 8, 9)
addr = base + off[beat]; \
qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + (beat & 1)); \
data = qd[H4(off[beat] >> 3)]; \
- cpu_stl_le_data_ra(env, addr, data, GETPC()); \
+ cpu_stl_mmu(env, addr, data, oi, GETPC()); \
} \
}
--
2.48.1
* [PATCH v2 12/12] target/arm: Fix helper macros indentation in mve_helper.c
2025-07-01 10:31 [PATCH v2 00/12] target/arm: Fix M-profile helper loads/stores alignment checks William Kosasih
` (10 preceding siblings ...)
2025-07-01 10:31 ` [PATCH v2 11/12] target/arm: Fix VST2 " William Kosasih
@ 2025-07-01 10:31 ` William Kosasih
11 siblings, 0 replies; 16+ messages in thread
From: William Kosasih @ 2025-07-01 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, Peter Maydell, William Kosasih
The recent helper load/store alignment fixes caused the line-continuation
backslashes in the affected macro definitions to shift out of alignment.
This patch restores uniform indentation for those trailing backslashes,
making them consistent again.
Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
---
target/arm/tcg/mve_helper.c | 280 ++++++++++++++++++------------------
1 file changed, 140 insertions(+), 140 deletions(-)
diff --git a/target/arm/tcg/mve_helper.c b/target/arm/tcg/mve_helper.c
index 6dffd9cb35..72c07262c0 100644
--- a/target/arm/tcg/mve_helper.c
+++ b/target/arm/tcg/mve_helper.c
@@ -164,48 +164,48 @@ static void mve_advance_vpt(CPUARMState *env)
((T)(v) << (sizeof(T) * 8 - (B))) >> (sizeof(T) * 8 - (B)))
/* For loads, predicated lanes are zeroed instead of keeping their old values */
-#define DO_VLDR(OP, MSIZE, LDTYPE, ESIZE, TYPE) \
- void HELPER(mve_##OP)(CPUARMState *env, void *vd, uint32_t addr) \
- { \
- TYPE *d = vd; \
- uint16_t mask = mve_element_mask(env); \
- uint16_t eci_mask = mve_eci_mask(env); \
- unsigned b, e; \
- int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
- MemOpIdx oi = make_memop_idx(MFLAG(LDTYPE) | MO_ALIGN, mmu_idx);\
- /* \
- * R_SXTM allows the dest reg to become UNKNOWN for abandoned \
- * beats so we don't care if we update part of the dest and \
- * then take an exception. \
- */ \
- for (b = 0, e = 0; b < 16; b += ESIZE, e++) { \
- if (eci_mask & (1 << b)) { \
- d[H##ESIZE(e)] = (mask & (1 << b)) ? \
- SIGN_EXT(cpu_ld##LDTYPE##_mmu(env, addr, oi, GETPC()),\
- TYPE, \
- MSIZE * 8) \
- : 0; \
- } \
- addr += MSIZE; \
- } \
- mve_advance_vpt(env); \
+#define DO_VLDR(OP, MSIZE, LDTYPE, ESIZE, TYPE) \
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, uint32_t addr) \
+ { \
+ TYPE *d = vd; \
+ uint16_t mask = mve_element_mask(env); \
+ uint16_t eci_mask = mve_eci_mask(env); \
+ unsigned b, e; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MFLAG(LDTYPE) | MO_ALIGN, mmu_idx); \
+ /* \
+ * R_SXTM allows the dest reg to become UNKNOWN for abandoned \
+ * beats so we don't care if we update part of the dest and \
+ * then take an exception. \
+ */ \
+ for (b = 0, e = 0; b < 16; b += ESIZE, e++) { \
+ if (eci_mask & (1 << b)) { \
+ d[H##ESIZE(e)] = (mask & (1 << b)) ? \
+ SIGN_EXT(cpu_ld##LDTYPE##_mmu(env, addr, oi, GETPC()), \
+ TYPE, \
+ MSIZE * 8) \
+ : 0; \
+ } \
+ addr += MSIZE; \
+ } \
+ mve_advance_vpt(env); \
}
-#define DO_VSTR(OP, MSIZE, STTYPE, ESIZE, TYPE) \
- void HELPER(mve_##OP)(CPUARMState *env, void *vd, uint32_t addr) \
- { \
- TYPE *d = vd; \
- uint16_t mask = mve_element_mask(env); \
- unsigned b, e; \
- int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
- MemOpIdx oi = make_memop_idx(MFLAG(STTYPE) | MO_ALIGN, mmu_idx);\
- for (b = 0, e = 0; b < 16; b += ESIZE, e++) { \
- if (mask & (1 << b)) { \
- cpu_st##STTYPE##_mmu(env, addr, d[H##ESIZE(e)], oi, GETPC());\
- } \
- addr += MSIZE; \
- } \
- mve_advance_vpt(env); \
+#define DO_VSTR(OP, MSIZE, STTYPE, ESIZE, TYPE) \
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, uint32_t addr) \
+ { \
+ TYPE *d = vd; \
+ uint16_t mask = mve_element_mask(env); \
+ unsigned b, e; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MFLAG(STTYPE) | MO_ALIGN, mmu_idx); \
+ for (b = 0, e = 0; b < 16; b += ESIZE, e++) { \
+ if (mask & (1 << b)) { \
+ cpu_st##STTYPE##_mmu(env, addr, d[H##ESIZE(e)], oi, GETPC()); \
+ } \
+ addr += MSIZE; \
+ } \
+ mve_advance_vpt(env); \
}
DO_VLDR(vldrb, 1, b, 1, uint8_t)
@@ -237,61 +237,61 @@ DO_VSTR(vstrh_w, 2, w, 4, int32_t)
* For loads, predicated lanes are zeroed instead of retaining
* their previous values.
*/
-#define DO_VLDR_SG(OP, LDTYPE, ESIZE, TYPE, OFFTYPE, ADDRFN, WB) \
- void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
- uint32_t base) \
- { \
- TYPE *d = vd; \
- OFFTYPE *m = vm; \
- uint16_t mask = mve_element_mask(env); \
- uint16_t eci_mask = mve_eci_mask(env); \
- unsigned e; \
- uint32_t addr; \
- int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
- MemOpIdx oi = make_memop_idx(MFLAG(LDTYPE) | MO_ALIGN, mmu_idx);\
+#define DO_VLDR_SG(OP, LDTYPE, ESIZE, TYPE, OFFTYPE, ADDRFN, WB) \
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
+ uint32_t base) \
+ { \
+ TYPE *d = vd; \
+ OFFTYPE *m = vm; \
+ uint16_t mask = mve_element_mask(env); \
+ uint16_t eci_mask = mve_eci_mask(env); \
+ unsigned e; \
+ uint32_t addr; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MFLAG(LDTYPE) | MO_ALIGN, mmu_idx); \
for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE, eci_mask >>= ESIZE) {\
- if (!(eci_mask & 1)) { \
- continue; \
- } \
- addr = ADDRFN(base, m[H##ESIZE(e)]); \
- d[H##ESIZE(e)] = (mask & 1) ? \
- SIGN_EXT(cpu_ld##LDTYPE##_mmu(env, addr, oi, GETPC()), \
- TYPE, \
- MSIZE(LDTYPE) * 8) \
- : 0; \
- if (WB) { \
- m[H##ESIZE(e)] = addr; \
- } \
- } \
- mve_advance_vpt(env); \
+ if (!(eci_mask & 1)) { \
+ continue; \
+ } \
+ addr = ADDRFN(base, m[H##ESIZE(e)]); \
+ d[H##ESIZE(e)] = (mask & 1) ? \
+ SIGN_EXT(cpu_ld##LDTYPE##_mmu(env, addr, oi, GETPC()), \
+ TYPE, \
+ MSIZE(LDTYPE) * 8) \
+ : 0; \
+ if (WB) { \
+ m[H##ESIZE(e)] = addr; \
+ } \
+ } \
+ mve_advance_vpt(env); \
}
/* We know here TYPE is unsigned so always the same as the offset type */
-#define DO_VSTR_SG(OP, STTYPE, ESIZE, TYPE, ADDRFN, WB) \
- void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
- uint32_t base) \
- { \
- TYPE *d = vd; \
- TYPE *m = vm; \
- uint16_t mask = mve_element_mask(env); \
- uint16_t eci_mask = mve_eci_mask(env); \
- unsigned e; \
- uint32_t addr; \
- int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
- MemOpIdx oi = make_memop_idx(MFLAG(STTYPE) | MO_ALIGN, mmu_idx);\
- for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE, eci_mask >>= ESIZE) { \
- if (!(eci_mask & 1)) { \
- continue; \
- } \
- addr = ADDRFN(base, m[H##ESIZE(e)]); \
- if (mask & 1) { \
+#define DO_VSTR_SG(OP, STTYPE, ESIZE, TYPE, ADDRFN, WB) \
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
+ uint32_t base) \
+ { \
+ TYPE *d = vd; \
+ TYPE *m = vm; \
+ uint16_t mask = mve_element_mask(env); \
+ uint16_t eci_mask = mve_eci_mask(env); \
+ unsigned e; \
+ uint32_t addr; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MFLAG(STTYPE) | MO_ALIGN, mmu_idx); \
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE, eci_mask >>= ESIZE) {\
+ if (!(eci_mask & 1)) { \
+ continue; \
+ } \
+ addr = ADDRFN(base, m[H##ESIZE(e)]); \
+ if (mask & 1) { \
cpu_st##STTYPE##_mmu(env, addr, d[H##ESIZE(e)], oi, GETPC()); \
- } \
- if (WB) { \
- m[H##ESIZE(e)] = addr; \
- } \
- } \
- mve_advance_vpt(env); \
+ } \
+ if (WB) { \
+ m[H##ESIZE(e)] = addr; \
+ } \
+ } \
+ mve_advance_vpt(env); \
}
/*
@@ -302,58 +302,58 @@ DO_VSTR(vstrh_w, 2, w, 4, int32_t)
* Address writeback happens on the odd beats and updates the address
* stored in the even-beat element.
*/
-#define DO_VLDR64_SG(OP, ADDRFN, WB) \
- void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
- uint32_t base) \
- { \
- uint32_t *d = vd; \
- uint32_t *m = vm; \
- uint16_t mask = mve_element_mask(env); \
- uint16_t eci_mask = mve_eci_mask(env); \
- unsigned e; \
- uint32_t addr; \
- int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
- MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
- for (e = 0; e < 16 / 4; e++, mask >>= 4, eci_mask >>= 4) { \
- if (!(eci_mask & 1)) { \
- continue; \
- } \
- addr = ADDRFN(base, m[H4(e & ~1)]); \
- addr += 4 * (e & 1); \
- d[H4(e)] = (mask & 1) ? cpu_ldl_mmu(env, addr, oi, GETPC()) : 0;\
- if (WB && (e & 1)) { \
- m[H4(e & ~1)] = addr - 4; \
- } \
- } \
- mve_advance_vpt(env); \
+#define DO_VLDR64_SG(OP, ADDRFN, WB) \
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
+ uint32_t base) \
+ { \
+ uint32_t *d = vd; \
+ uint32_t *m = vm; \
+ uint16_t mask = mve_element_mask(env); \
+ uint16_t eci_mask = mve_eci_mask(env); \
+ unsigned e; \
+ uint32_t addr; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
+ for (e = 0; e < 16 / 4; e++, mask >>= 4, eci_mask >>= 4) { \
+ if (!(eci_mask & 1)) { \
+ continue; \
+ } \
+ addr = ADDRFN(base, m[H4(e & ~1)]); \
+ addr += 4 * (e & 1); \
+ d[H4(e)] = (mask & 1) ? cpu_ldl_mmu(env, addr, oi, GETPC()) : 0; \
+ if (WB && (e & 1)) { \
+ m[H4(e & ~1)] = addr - 4; \
+ } \
+ } \
+ mve_advance_vpt(env); \
}
-#define DO_VSTR64_SG(OP, ADDRFN, WB) \
- void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
- uint32_t base) \
- { \
- uint32_t *d = vd; \
- uint32_t *m = vm; \
- uint16_t mask = mve_element_mask(env); \
- uint16_t eci_mask = mve_eci_mask(env); \
- unsigned e; \
- uint32_t addr; \
- int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
- MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
- for (e = 0; e < 16 / 4; e++, mask >>= 4, eci_mask >>= 4) { \
- if (!(eci_mask & 1)) { \
- continue; \
- } \
- addr = ADDRFN(base, m[H4(e & ~1)]); \
- addr += 4 * (e & 1); \
- if (mask & 1) { \
- cpu_stl_mmu(env, addr, d[H4(e)], oi, GETPC()); \
- } \
- if (WB && (e & 1)) { \
- m[H4(e & ~1)] = addr - 4; \
- } \
- } \
- mve_advance_vpt(env); \
+#define DO_VSTR64_SG(OP, ADDRFN, WB) \
+ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \
+ uint32_t base) \
+ { \
+ uint32_t *d = vd; \
+ uint32_t *m = vm; \
+ uint16_t mask = mve_element_mask(env); \
+ uint16_t eci_mask = mve_eci_mask(env); \
+ unsigned e; \
+ uint32_t addr; \
+ int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
+ MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN, mmu_idx); \
+ for (e = 0; e < 16 / 4; e++, mask >>= 4, eci_mask >>= 4) { \
+ if (!(eci_mask & 1)) { \
+ continue; \
+ } \
+ addr = ADDRFN(base, m[H4(e & ~1)]); \
+ addr += 4 * (e & 1); \
+ if (mask & 1) { \
+ cpu_stl_mmu(env, addr, d[H4(e)], oi, GETPC()); \
+ } \
+ if (WB && (e & 1)) { \
+ m[H4(e & ~1)] = addr - 4; \
+ } \
+ } \
+ mve_advance_vpt(env); \
}
#define ADDR_ADD(BASE, OFFSET) ((BASE) + (OFFSET))
--
2.48.1
* Re: [PATCH v2 01/12] target/arm: Fix VLSTM/VLLDM helper load/store alignment checks
2025-07-01 10:31 ` [PATCH v2 01/12] target/arm: Fix VLSTM/VLLDM helper load/store " William Kosasih
@ 2025-07-02 2:06 ` Richard Henderson
0 siblings, 0 replies; 16+ messages in thread
From: Richard Henderson @ 2025-07-02 2:06 UTC (permalink / raw)
To: William Kosasih, qemu-devel; +Cc: qemu-arm, Peter Maydell
On 7/1/25 04:31, William Kosasih wrote:
> This patch adds alignment checks to the load operations in the VLLDM
> instruction and to the store operations in the VLSTM instruction.
>
> The manual alignment checks in both helpers are retained because they
> enforce an 8-byte alignment requirement (instead of the 4-byte alignment for
> ordinary long loads/stores). References to cpu_*_data_* are still replaced
> with cpu_*_mmu(), so that the individual word accesses themselves also
> perform the standard alignment checks, in keeping with the ARM pseudocode.
So... this merely makes this function match the pseudocode; it doesn't actually fix a bug.
This description should be fixed to reflect that.
> diff --git a/target/arm/tcg/m_helper.c b/target/arm/tcg/m_helper.c
> index 6614719832..251e12edf9 100644
> --- a/target/arm/tcg/m_helper.c
> +++ b/target/arm/tcg/m_helper.c
> @@ -1048,6 +1048,9 @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
> bool s = env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_S_MASK;
> bool lspact = env->v7m.fpccr[s] & R_V7M_FPCCR_LSPACT_MASK;
> uintptr_t ra = GETPC();
> + ARMMMUIdx mmu_idx = arm_mmu_idx(env);
> + MemOpIdx oi = make_memop_idx(MO_TEUL | MO_ALIGN,
> + arm_to_core_mmu_idx(mmu_idx));
>
> assert(env->v7m.secure);
>
> @@ -1073,7 +1076,7 @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
> * Note that we do not use v7m_stack_write() here, because the
> * accesses should not set the FSR bits for stacking errors if they
> * fail. (In pseudocode terms, they are AccType_NORMAL, not AccType_STACK
> - * or AccType_LAZYFP). Faults in cpu_stl_data_ra() will throw exceptions
> + * or AccType_LAZYFP). Faults in cpu_stl_mmu() will throw exceptions
> * and longjmp out.
> */
> if (!(env->v7m.fpccr[M_REG_S] & R_V7M_FPCCR_LSPEN_MASK)) {
> @@ -1089,12 +1092,12 @@ void HELPER(v7m_vlstm)(CPUARMState *env, uint32_t fptr)
> if (i >= 16) {
> faddr += 8; /* skip the slot for the FPSCR */
> }
> - cpu_stl_data_ra(env, faddr, slo, ra);
> - cpu_stl_data_ra(env, faddr + 4, shi, ra);
> + cpu_stl_mmu(env, faddr, slo, oi, ra);
> + cpu_stl_mmu(env, faddr + 4, shi, oi, ra);
This is an improvement because the mmu index is resolved once, instead of within every
call to cpu_stl_data_ra.
r~
* Re: [PATCH v2 04/12] target/arm: Fix VLDR helper load alignment checks
2025-07-01 10:31 ` [PATCH v2 04/12] target/arm: Fix VLDR " William Kosasih
@ 2025-07-02 2:29 ` Richard Henderson
2025-07-02 11:16 ` William Kosasih
0 siblings, 1 reply; 16+ messages in thread
From: Richard Henderson @ 2025-07-02 2:29 UTC (permalink / raw)
To: William Kosasih, qemu-devel; +Cc: qemu-arm, Peter Maydell
On 7/1/25 04:31, William Kosasih wrote:
> This patch adds alignment checks to the load operations in the VLDR
> instruction.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1154
> Signed-off-by: William Kosasih <kosasihwilliam4@gmail.com>
> ---
> target/arm/tcg/mve_helper.c | 41 ++++++++++++++++++++++++++++---------
> 1 file changed, 31 insertions(+), 10 deletions(-)
>
> diff --git a/target/arm/tcg/mve_helper.c b/target/arm/tcg/mve_helper.c
> index 506d1c3475..922cd2371a 100644
> --- a/target/arm/tcg/mve_helper.c
> +++ b/target/arm/tcg/mve_helper.c
> @@ -147,6 +147,22 @@ static void mve_advance_vpt(CPUARMState *env)
> env->v7m.vpr = vpr;
> }
>
> +/* Mapping of LDTYPE/STTYPE to the number of bytes accessed */
> +#define MSIZE_b 1
> +#define MSIZE_w 2
> +#define MSIZE_l 4
> +
> +/* Mapping of LDTYPE/STTYPE to MemOp flag */
> +#define MFLAG_b MO_UB
> +#define MFLAG_w MO_TEUW
> +#define MFLAG_l MO_TEUL
> +
> +#define MSIZE(t) MSIZE_##t
AFAICS, this isn't used.
> +#define MFLAG(t) MFLAG_##t
> +
> +#define SIGN_EXT(v, T, B) ( \
> + ((T)(v) << (sizeof(T) * 8 - (B))) >> (sizeof(T) * 8 - (B)))
Don't do this. (1) Not all of these operations require sign extension, (2) it's clearer
to simply cast to an appropriate MTYPE.
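For illustration, the cast-based shape might look roughly like this sketch,
where MTYPE is a hypothetical extra macro parameter naming the type of the
memory access:

    /* cast the raw load result to e.g. int8_t for vldrb_sh; the
     * implicit widening to TYPE then performs the sign extension
     * without any explicit shifting */
    d[H##ESIZE(e)] = (mask & (1 << b)) ?
        (MTYPE)cpu_ld##LDTYPE##_mmu(env, addr, oi, GETPC()) : 0;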
r~
> +
> /* For loads, predicated lanes are zeroed instead of keeping their old values */
> #define DO_VLDR(OP, MSIZE, LDTYPE, ESIZE, TYPE) \
> void HELPER(mve_##OP)(CPUARMState *env, void *vd, uint32_t addr) \
> @@ -155,6 +171,8 @@ static void mve_advance_vpt(CPUARMState *env)
> uint16_t mask = mve_element_mask(env); \
> uint16_t eci_mask = mve_eci_mask(env); \
> unsigned b, e; \
> + int mmu_idx = arm_to_core_mmu_idx(arm_mmu_idx(env)); \
> + MemOpIdx oi = make_memop_idx(MFLAG(LDTYPE) | MO_ALIGN, mmu_idx);\
> /* \
> * R_SXTM allows the dest reg to become UNKNOWN for abandoned \
> * beats so we don't care if we update part of the dest and \
> @@ -163,7 +181,10 @@ static void mve_advance_vpt(CPUARMState *env)
> for (b = 0, e = 0; b < 16; b += ESIZE, e++) { \
> if (eci_mask & (1 << b)) { \
> d[H##ESIZE(e)] = (mask & (1 << b)) ? \
> - cpu_##LDTYPE##_data_ra(env, addr, GETPC()) : 0; \
> + SIGN_EXT(cpu_ld##LDTYPE##_mmu(env, addr, oi, GETPC()),\
> + TYPE, \
> + MSIZE * 8) \
> + : 0; \
> } \
> addr += MSIZE; \
> } \
> @@ -185,20 +206,20 @@ static void mve_advance_vpt(CPUARMState *env)
> mve_advance_vpt(env); \
> }
>
> -DO_VLDR(vldrb, 1, ldub, 1, uint8_t)
> -DO_VLDR(vldrh, 2, lduw, 2, uint16_t)
> -DO_VLDR(vldrw, 4, ldl, 4, uint32_t)
> +DO_VLDR(vldrb, 1, b, 1, uint8_t)
> +DO_VLDR(vldrh, 2, w, 2, uint16_t)
> +DO_VLDR(vldrw, 4, l, 4, uint32_t)
>
> DO_VSTR(vstrb, 1, stb, 1, uint8_t)
> DO_VSTR(vstrh, 2, stw, 2, uint16_t)
> DO_VSTR(vstrw, 4, stl, 4, uint32_t)
>
> -DO_VLDR(vldrb_sh, 1, ldsb, 2, int16_t)
> -DO_VLDR(vldrb_sw, 1, ldsb, 4, int32_t)
> -DO_VLDR(vldrb_uh, 1, ldub, 2, uint16_t)
> -DO_VLDR(vldrb_uw, 1, ldub, 4, uint32_t)
> -DO_VLDR(vldrh_sw, 2, ldsw, 4, int32_t)
> -DO_VLDR(vldrh_uw, 2, lduw, 4, uint32_t)
> +DO_VLDR(vldrb_sh, 1, b, 2, int16_t)
> +DO_VLDR(vldrb_sw, 1, b, 4, int32_t)
> +DO_VLDR(vldrb_uh, 1, b, 2, uint16_t)
> +DO_VLDR(vldrb_uw, 1, b, 4, uint32_t)
> +DO_VLDR(vldrh_sw, 2, w, 4, int32_t)
> +DO_VLDR(vldrh_uw, 2, w, 4, uint32_t)
>
> DO_VSTR(vstrb_h, 1, stb, 2, int16_t)
> DO_VSTR(vstrb_w, 1, stb, 4, int32_t)
* Re: [PATCH v2 04/12] target/arm: Fix VLDR helper load alignment checks
2025-07-02 2:29 ` Richard Henderson
@ 2025-07-02 11:16 ` William Kosasih
0 siblings, 0 replies; 16+ messages in thread
From: William Kosasih @ 2025-07-02 11:16 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, qemu-arm, Peter Maydell
Hi Richard, I really appreciate your comments. Thank you very much.
> Don't do this. (1) Not all of these operations require sign extension,
> (2) it's clearer to simply cast to an appropriate MTYPE.
I'll fix that :-)
> AFAICS, this isn't used.
I wasn't sure whether your preference was to delete the other macros as well
(MFLAG(), MSIZE(), and their underlying definitions), but I assume you wanted
them gone. I've deleted all of them in my next patch series, opting to pass
those values as parameters of the DO_VLDR/DO_VSTR helper macros instead. I
hope that matches your intent -- but please let me know if you'd prefer a
different approach; I'm happy to fix them again. Sorry for the back-and-forth,
and thanks again for your guidance.
Best regards,
William