* [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
I've done quite a lot of reorg since v5, so much so that I've dropped
most of the review tags that had been given.
There are a few major improvements to note:
* The effective ATA bit is in hflags, noting whether access to the
allocation tags is enabled by the OS. This is moderately expensive
to compute, so it is better to hold on to that. This allows trivial
inlining of some of the operations; see the sketch after this list.
* Updates for SVE, and thus
Based-on: <20200311064420.30606-1-richard.henderson@linaro.org>
("target/arm: sve load/store improvements")
* Architecture updates, to the Arm ARM version F.a released this month.
* There are now real patches for the kernel:
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git devel/mte-v2
Booting that and running my user-mode tests showed up a few bugs.
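As an illustration of the inlining mentioned above, this is the pattern
the cached ATA bit enables in the translator, taken from the IRG patch
later in this series:

    if (s->ata) {
        gen_helper_irg(cpu_reg_sp(s, rd), cpu_env,
                       cpu_reg_sp(s, rn), cpu_reg(s, rm));
    } else {
        /* Tag access disabled: IRG reduces to inserting tag zero. */
        gen_address_with_allocation_tag0(cpu_reg_sp(s, rd),
                                         cpu_reg_sp(s, rn));
    }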
r~
Richard Henderson (42):
target/arm: Add isar tests for mte
target/arm: Improve masking of SCR RES0 bits
target/arm: Add support for MTE to SCTLR_ELx
target/arm: Add support for MTE to HCR_EL2 and SCR_EL3
target/arm: Rename DISAS_UPDATE to DISAS_UPDATE_EXIT
target/arm: Add DISAS_UPDATE_NOCHAIN
target/arm: Add MTE system registers
target/arm: Add MTE bits to tb_flags
target/arm: Implement the IRG instruction
target/arm: Implement the ADDG, SUBG instructions
target/arm: Implement the GMI instruction
target/arm: Implement the SUBP instruction
target/arm: Define arm_cpu_do_unaligned_access for user-only
target/arm: Add helper_probe_access
target/arm: Implement LDG, STG, ST2G instructions
target/arm: Implement the STGP instruction
target/arm: Restrict the values of DCZID.BS under TCG
target/arm: Simplify DC_ZVA
target/arm: Implement the LDGM, STGM, STZGM instructions
target/arm: Implement the access tag cache flushes
target/arm: Move regime_el to internals.h
target/arm: Move regime_tcr to internals.h
target/arm: Add gen_mte_check1
target/arm: Add gen_mte_checkN
target/arm: Implement helper_mte_check1
target/arm: Implement helper_mte_checkN
target/arm: Add helper_mte_check_zva
target/arm: Use mte_checkN for sve unpredicated loads
target/arm: Use mte_checkN for sve unpredicated stores
target/arm: Use mte_check1 for sve LD1R
target/arm: Add mte helpers for sve scalar + int loads
target/arm: Add mte helpers for sve scalar + int stores
target/arm: Add mte helpers for sve scalar + int ff/nf loads
target/arm: Handle TBI for sve scalar + int memory ops
target/arm: Add mte helpers for sve scatter/gather memory ops
target/arm: Complete TBI clearing for user-only for SVE
target/arm: Implement data cache set allocation tags
target/arm: Set PSTATE.TCO on exception entry
target/arm: Enable MTE
target/arm: Cache the Tagged bit for a page in MemTxAttrs
target/arm: Create tagged ram when MTE is enabled
target/arm: Add allocation tag storage for system mode
target/arm/cpu.h | 36 +-
target/arm/helper-a64.h | 16 +
target/arm/helper-sve.h | 491 +++++++++++
target/arm/helper.h | 2 +
target/arm/internals.h | 145 ++++
target/arm/translate-a64.h | 5 +
target/arm/translate.h | 23 +-
hw/arm/virt.c | 52 ++
linux-user/aarch64/cpu_loop.c | 7 +
linux-user/arm/cpu_loop.c | 7 +
target/arm/cpu.c | 79 +-
target/arm/cpu64.c | 1 +
target/arm/helper-a64.c | 94 +--
target/arm/helper.c | 350 ++++++--
target/arm/mte_helper.c | 894 ++++++++++++++++++++
target/arm/op_helper.c | 16 +
target/arm/sve_helper.c | 750 ++++++++++++++---
target/arm/tlb_helper.c | 41 +-
target/arm/translate-a64.c | 607 ++++++++++++--
target/arm/translate-sve.c | 1425 ++++++++++++++++++++------------
target/arm/translate-vfp.inc.c | 2 +-
target/arm/translate.c | 16 +-
target/arm/Makefile.objs | 1 +
23 files changed, 4204 insertions(+), 856 deletions(-)
create mode 100644 target/arm/mte_helper.c
--
2.20.1
* [PATCH v6 01/42] target/arm: Add isar tests for mte
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
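For reference, a self-contained sketch of how the two accessors below
carve up the ID_AA64PFR1.MTE field: value 1 means only the EL0-visible
instructions and registers, value 2 means full MTE. The helper name is
illustrative, not QEMU code:

    #include <assert.h>
    #include <stdint.h>

    /* ID_AA64PFR1_EL1.MTE is a 4-bit field at bits [11:8]. */
    static unsigned pfr1_mte(uint64_t pfr1)
    {
        return (pfr1 >> 8) & 0xf;
    }

    int main(void)
    {
        assert(pfr1_mte(0x000) == 0);  /* no MTE at all */
        assert(pfr1_mte(0x100) == 1);  /* aa64_mte_insn_reg only */
        assert(pfr1_mte(0x200) == 2);  /* aa64_mte: full MTE */
        return 0;
    }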
target/arm/cpu.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 4ffd991b6f..a77b0d97ac 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3777,6 +3777,16 @@ static inline bool isar_feature_aa64_bti(const ARMISARegisters *id)
return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, BT) != 0;
}
+static inline bool isar_feature_aa64_mte_insn_reg(const ARMISARegisters *id)
+{
+ return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, MTE) != 0;
+}
+
+static inline bool isar_feature_aa64_mte(const ARMISARegisters *id)
+{
+ return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, MTE) >= 2;
+}
+
static inline bool isar_feature_aa64_pmu_8_1(const ARMISARegisters *id)
{
return FIELD_EX64(id->id_aa64dfr0, ID_AA64DFR0, PMUVER) >= 4 &&
--
2.20.1
* [PATCH v6 02/42] target/arm: Improve masking of SCR RES0 bits
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Protect the reads of the aa64 id registers (the cpu_isar_feature
tests) with a check of ARM_CP_STATE_AA64. Use this as a simpler test
than arm_el_is_aa64, since EL3 cannot change mode.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 8f81ca4f54..d04fc0a140 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -1966,9 +1966,16 @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
uint32_t valid_mask = 0x3fff;
ARMCPU *cpu = env_archcpu(env);
- if (arm_el_is_aa64(env, 3)) {
+ if (ri->state == ARM_CP_STATE_AA64) {
value |= SCR_FW | SCR_AW; /* these two bits are RES1. */
valid_mask &= ~SCR_NET;
+
+ if (cpu_isar_feature(aa64_lor, cpu)) {
+ valid_mask |= SCR_TLOR;
+ }
+ if (cpu_isar_feature(aa64_pauth, cpu)) {
+ valid_mask |= SCR_API | SCR_APK;
+ }
} else {
valid_mask &= ~(SCR_RW | SCR_ST);
}
@@ -1987,12 +1994,6 @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
valid_mask &= ~SCR_SMD;
}
}
- if (cpu_isar_feature(aa64_lor, cpu)) {
- valid_mask |= SCR_TLOR;
- }
- if (cpu_isar_feature(aa64_pauth, cpu)) {
- valid_mask |= SCR_API | SCR_APK;
- }
/* Clear all-context RES0 bits. */
value &= valid_mask;
--
2.20.1
* [PATCH v6 03/42] target/arm: Add support for MTE to SCTLR_ELx
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
This does not attempt to rectify all of the res0 bits, but does
clear the mte bits when not enabled. Since there is no high-part
mapping of SCTLR, aa32 mode cannot write to these bits.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper.c | 23 +++++++++++++++++------
1 file changed, 17 insertions(+), 6 deletions(-)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index d04fc0a140..97272502ce 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4680,6 +4680,22 @@ static void sctlr_write(CPUARMState *env, const ARMCPRegInfo *ri,
{
ARMCPU *cpu = env_archcpu(env);
+ if (arm_feature(env, ARM_FEATURE_PMSA) && !cpu->has_mpu) {
+ /* M bit is RAZ/WI for PMSA with no MPU implemented */
+ value &= ~SCTLR_M;
+ }
+
+ /* ??? Lots of these bits are not implemented. */
+
+ if (ri->state == ARM_CP_STATE_AA64 && !cpu_isar_feature(aa64_mte, cpu)) {
+ if (ri->opc1 == 6) { /* SCTLR_EL3 */
+ value &= ~(SCTLR_ITFSB | SCTLR_TCF | SCTLR_ATA);
+ } else {
+ value &= ~(SCTLR_ITFSB | SCTLR_TCF0 | SCTLR_TCF |
+ SCTLR_ATA0 | SCTLR_ATA);
+ }
+ }
+
if (raw_read(env, ri) == value) {
/* Skip the TLB flush if nothing actually changed; Linux likes
* to do a lot of pointless SCTLR writes.
@@ -4687,13 +4703,8 @@ static void sctlr_write(CPUARMState *env, const ARMCPRegInfo *ri,
return;
}
- if (arm_feature(env, ARM_FEATURE_PMSA) && !cpu->has_mpu) {
- /* M bit is RAZ/WI for PMSA with no MPU implemented */
- value &= ~SCTLR_M;
- }
-
raw_write(env, ri, value);
- /* ??? Lots of these bits are not implemented. */
+
/* This may enable/disable the MMU, so do a TLB flush. */
tlb_flush(CPU(cpu));
--
2.20.1
* [PATCH v6 04/42] target/arm: Add support for MTE to HCR_EL2 and SCR_EL3
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 97272502ce..b9267ddbc2 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -1976,6 +1976,9 @@ static void scr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
if (cpu_isar_feature(aa64_pauth, cpu)) {
valid_mask |= SCR_API | SCR_APK;
}
+ if (cpu_isar_feature(aa64_mte, cpu)) {
+ valid_mask |= SCR_ATA;
+ }
} else {
valid_mask &= ~(SCR_RW | SCR_ST);
}
@@ -5238,6 +5241,9 @@ static void do_hcr_write(CPUARMState *env, uint64_t value, uint64_t valid_mask)
if (cpu_isar_feature(aa64_pauth, cpu)) {
valid_mask |= HCR_API | HCR_APK;
}
+ if (cpu_isar_feature(aa64_mte, cpu)) {
+ valid_mask |= HCR_ATA | HCR_TID5;
+ }
}
/* Clear RES0 bits. */
--
2.20.1
* [PATCH v6 05/42] target/arm: Rename DISAS_UPDATE to DISAS_UPDATE_EXIT
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Emphasize that the is_jmp option exits to the main loop.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/translate.h | 14 ++++++++------
target/arm/translate-a64.c | 8 ++++----
target/arm/translate-vfp.inc.c | 2 +-
target/arm/translate.c | 12 ++++++------
4 files changed, 19 insertions(+), 17 deletions(-)
diff --git a/target/arm/translate.h b/target/arm/translate.h
index d9ea0c99cc..41d6cc8db5 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -148,7 +148,8 @@ static inline void disas_set_insn_syndrome(DisasContext *s, uint32_t syn)
/* is_jmp field values */
#define DISAS_JUMP DISAS_TARGET_0 /* only pc was modified dynamically */
-#define DISAS_UPDATE DISAS_TARGET_1 /* cpu state was modified dynamically */
+/* CPU state was modified dynamically; exit to main loop for interrupts. */
+#define DISAS_UPDATE_EXIT DISAS_TARGET_1
/* These instructions trap after executing, so the A32/T32 decoder must
* defer them until after the conditional execution state has been updated.
* WFI also needs special handling when single-stepping.
@@ -164,11 +165,12 @@ static inline void disas_set_insn_syndrome(DisasContext *s, uint32_t syn)
* custom end-of-TB code)
*/
#define DISAS_BX_EXCRET DISAS_TARGET_8
-/* For instructions which want an immediate exit to the main loop,
- * as opposed to attempting to use lookup_and_goto_ptr. Unlike
- * DISAS_UPDATE this doesn't write the PC on exiting the translation
- * loop so you need to ensure something (gen_a64_set_pc_im or runtime
- * helper) has done so before we reach return from cpu_tb_exec.
+/*
+ * For instructions which want an immediate exit to the main loop, as opposed
+ * to attempting to use lookup_and_goto_ptr. Unlike DISAS_UPDATE_EXIT, this
+ * doesn't write the PC on exiting the translation loop so you need to ensure
+ * something (gen_a64_set_pc_im or runtime helper) has done so before we reach
+ * return from cpu_tb_exec.
*/
#define DISAS_EXIT DISAS_TARGET_9
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 8fffb52203..a793ec9ee1 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1661,7 +1661,7 @@ static void handle_msr_i(DisasContext *s, uint32_t insn,
gen_helper_msr_i_daifclear(cpu_env, t1);
tcg_temp_free_i32(t1);
/* For DAIFClear, exit the cpu loop to re-evaluate pending IRQs. */
- s->base.is_jmp = DISAS_UPDATE;
+ s->base.is_jmp = DISAS_UPDATE_EXIT;
break;
default:
@@ -1840,7 +1840,7 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
if ((tb_cflags(s->base.tb) & CF_USE_ICOUNT) && (ri->type & ARM_CP_IO)) {
/* I/O operations must end the TB here (whether read or write) */
- s->base.is_jmp = DISAS_UPDATE;
+ s->base.is_jmp = DISAS_UPDATE_EXIT;
}
if (!isread && !(ri->type & ARM_CP_SUPPRESS_TB_END)) {
/*
@@ -1855,7 +1855,7 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
* but allow this to be suppressed by the register definition
* (usually only necessary to work around guest bugs).
*/
- s->base.is_jmp = DISAS_UPDATE;
+ s->base.is_jmp = DISAS_UPDATE_EXIT;
}
}
@@ -14456,7 +14456,7 @@ static void aarch64_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
gen_goto_tb(dc, 1, dc->base.pc_next);
break;
default:
- case DISAS_UPDATE:
+ case DISAS_UPDATE_EXIT:
gen_a64_set_pc_im(dc->base.pc_next);
/* fall through */
case DISAS_EXIT:
diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index b087bbd812..454bff6d01 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -2867,6 +2867,6 @@ static bool trans_VLLDM_VLSTM(DisasContext *s, arg_VLLDM_VLSTM *a)
tcg_temp_free_i32(fptr);
/* End the TB, because we have updated FP control bits */
- s->base.is_jmp = DISAS_UPDATE;
+ s->base.is_jmp = DISAS_UPDATE_EXIT;
return true;
}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 6259064ea7..5a21766cce 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -2935,7 +2935,7 @@ static void gen_msr_banked(DisasContext *s, int r, int sysm, int rn)
tcg_temp_free_i32(tcg_tgtmode);
tcg_temp_free_i32(tcg_regno);
tcg_temp_free_i32(tcg_reg);
- s->base.is_jmp = DISAS_UPDATE;
+ s->base.is_jmp = DISAS_UPDATE_EXIT;
}
static void gen_mrs_banked(DisasContext *s, int r, int sysm, int rn)
@@ -2957,7 +2957,7 @@ static void gen_mrs_banked(DisasContext *s, int r, int sysm, int rn)
tcg_temp_free_i32(tcg_tgtmode);
tcg_temp_free_i32(tcg_regno);
store_reg(s, rn, tcg_reg);
- s->base.is_jmp = DISAS_UPDATE;
+ s->base.is_jmp = DISAS_UPDATE_EXIT;
}
/* Store value to PC as for an exception return (ie don't
@@ -7635,7 +7635,7 @@ static void gen_srs(DisasContext *s,
tcg_temp_free_i32(tmp);
}
tcg_temp_free_i32(addr);
- s->base.is_jmp = DISAS_UPDATE;
+ s->base.is_jmp = DISAS_UPDATE_EXIT;
}
/* Generate a label used for skipping this instruction */
@@ -10682,7 +10682,7 @@ static bool trans_SETEND(DisasContext *s, arg_SETEND *a)
}
if (a->E != (s->be_data == MO_BE)) {
gen_helper_setend(cpu_env);
- s->base.is_jmp = DISAS_UPDATE;
+ s->base.is_jmp = DISAS_UPDATE_EXIT;
}
return true;
}
@@ -11419,7 +11419,7 @@ static void arm_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
break;
case DISAS_NEXT:
case DISAS_TOO_MANY:
- case DISAS_UPDATE:
+ case DISAS_UPDATE_EXIT:
gen_set_pc_im(dc, dc->base.pc_next);
/* fall through */
default:
@@ -11446,7 +11446,7 @@ static void arm_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
case DISAS_JUMP:
gen_goto_ptr();
break;
- case DISAS_UPDATE:
+ case DISAS_UPDATE_EXIT:
gen_set_pc_im(dc, dc->base.pc_next);
/* fall through */
default:
--
2.20.1
* [PATCH v6 06/42] target/arm: Add DISAS_UPDATE_NOCHAIN
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Add an option that writes back the PC, like DISAS_UPDATE_EXIT,
but does not exit back to the main loop.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
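The intended usage pattern, as it appears in the TCO patch later in
this series, is for an instruction that changes cached hflags but need
not return to the main loop:

    TCGv_i32 t1 = tcg_const_i32(s->current_el);
    gen_helper_rebuild_hflags_a64(cpu_env, t1);
    tcg_temp_free_i32(t1);
    s->base.is_jmp = DISAS_UPDATE_NOCHAIN;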
target/arm/translate.h | 2 ++
target/arm/translate-a64.c | 3 +++
target/arm/translate.c | 4 ++++
3 files changed, 9 insertions(+)
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 41d6cc8db5..dbbb167174 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -173,6 +173,8 @@ static inline void disas_set_insn_syndrome(DisasContext *s, uint32_t syn)
* return from cpu_tb_exec.
*/
#define DISAS_EXIT DISAS_TARGET_9
+/* CPU state was modified dynamically; no need to exit, but do not chain. */
+#define DISAS_UPDATE_NOCHAIN DISAS_TARGET_10
#ifdef TARGET_AARCH64
void a64_translate_init(void);
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index a793ec9ee1..943d6034de 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -14462,6 +14462,9 @@ static void aarch64_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
case DISAS_EXIT:
tcg_gen_exit_tb(NULL, 0);
break;
+ case DISAS_UPDATE_NOCHAIN:
+ gen_a64_set_pc_im(dc->base.pc_next);
+ /* fall through */
case DISAS_JUMP:
tcg_gen_lookup_and_goto_ptr();
break;
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 5a21766cce..33a26287f7 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -11420,6 +11420,7 @@ static void arm_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
case DISAS_NEXT:
case DISAS_TOO_MANY:
case DISAS_UPDATE_EXIT:
+ case DISAS_UPDATE_NOCHAIN:
gen_set_pc_im(dc, dc->base.pc_next);
/* fall through */
default:
@@ -11443,6 +11444,9 @@ static void arm_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
case DISAS_TOO_MANY:
gen_goto_tb(dc, 1, dc->base.pc_next);
break;
+ case DISAS_UPDATE_NOCHAIN:
+ gen_set_pc_im(dc, dc->base.pc_next);
+ /* fall through */
case DISAS_JUMP:
gen_goto_ptr();
break;
--
2.20.1
* [PATCH v6 07/42] target/arm: Add MTE system registers
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
These are TFSRE0_EL1, TFSR_EL1, TFSR_EL2, TFSR_EL3,
RGSR_EL1, GCR_EL1, GMID_EL1, and PSTATE.TCO.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v3: Add GMID; add access_mte.
v4: Define only TCO at mte_insn_reg.
v6: Define RAZ/WI version of TCO at mte_insn_reg;
honor TID5 for GMID_EL1; fix TFS crn/crm; recalc hflags after TCO.
---
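For context, a self-contained sketch of the MSR TCO immediate
semantics implemented below, where the new value arrives in CRm bit 0
(the function name is illustrative, not QEMU code):

    #include <assert.h>
    #include <stdint.h>

    #define PSTATE_TCO (1u << 25)

    /* MSR TCO, #imm: CRm bit 0 supplies the new PSTATE.TCO value. */
    static uint32_t msr_tco(uint32_t pstate, unsigned crm)
    {
        return (crm & 1) ? (pstate | PSTATE_TCO) : (pstate & ~PSTATE_TCO);
    }

    int main(void)
    {
        uint32_t pstate = 0;
        pstate = msr_tco(pstate, 1);
        assert(pstate & PSTATE_TCO);
        pstate = msr_tco(pstate, 0);
        assert(!(pstate & PSTATE_TCO));
        return 0;
    }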
target/arm/cpu.h | 4 ++
target/arm/internals.h | 6 +++
target/arm/helper.c | 94 ++++++++++++++++++++++++++++++++++++++
target/arm/translate-a64.c | 21 +++++++++
4 files changed, 125 insertions(+)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index a77b0d97ac..25351abd15 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -492,6 +492,9 @@ typedef struct CPUARMState {
uint64_t pmccfiltr_el0; /* Performance Monitor Filter Register */
uint64_t vpidr_el2; /* Virtualization Processor ID Register */
uint64_t vmpidr_el2; /* Virtualization Multiprocessor ID Register */
+ uint64_t tfsr_el[4]; /* tfsre0_el1 is index 0. */
+ uint64_t gcr_el1;
+ uint64_t rgsr_el1;
} cp15;
struct {
@@ -1259,6 +1262,7 @@ void pmu_init(ARMCPU *cpu);
#define PSTATE_SS (1U << 21)
#define PSTATE_PAN (1U << 22)
#define PSTATE_UAO (1U << 23)
+#define PSTATE_TCO (1U << 25)
#define PSTATE_V (1U << 28)
#define PSTATE_C (1U << 29)
#define PSTATE_Z (1U << 30)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index a833e3941d..0591f30526 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1233,4 +1233,10 @@ void arm_log_exception(int idx);
#endif /* !CONFIG_USER_ONLY */
+/*
+ * The log2 of the words in the tag block, for GMID_EL1.BS.
+ * This is the maximum, 256 bytes, which manipulates 64 bits of tags.
+ */
+#define GMID_EL1_BS 6
+
#endif
diff --git a/target/arm/helper.c b/target/arm/helper.c
index b9267ddbc2..b47209be64 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5869,6 +5869,9 @@ static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
{ K(3, 0, 1, 2, 0), K(3, 4, 1, 2, 0), K(3, 5, 1, 2, 0),
"ZCR_EL1", "ZCR_EL2", "ZCR_EL12", isar_feature_aa64_sve },
+ { K(3, 0, 5, 6, 0), K(3, 4, 5, 6, 0), K(3, 5, 5, 6, 0),
+ "TFSR_EL1", "TFSR_EL2", "TFSR_EL12", isar_feature_aa64_mte },
+
/* TODO: ARMv8.2-SPE -- PMSCR_EL2 */
/* TODO: ARMv8.4-Trace -- TRFCR_EL2 */
};
@@ -6840,6 +6843,86 @@ static const ARMCPRegInfo dcpodp_reg[] = {
};
#endif /*CONFIG_USER_ONLY*/
+static CPAccessResult access_aa64_tid5(CPUARMState *env, const ARMCPRegInfo *ri,
+ bool isread)
+{
+ if ((arm_current_el(env) < 2) && (arm_hcr_el2_eff(env) & HCR_TID5)) {
+ return CP_ACCESS_TRAP_EL2;
+ }
+
+ return CP_ACCESS_OK;
+}
+
+static CPAccessResult access_mte(CPUARMState *env, const ARMCPRegInfo *ri,
+ bool isread)
+{
+ int el = arm_current_el(env);
+
+ if (el < 2 &&
+ arm_feature(env, ARM_FEATURE_EL2) &&
+ !(arm_hcr_el2_eff(env) & HCR_ATA)) {
+ return CP_ACCESS_TRAP_EL2;
+ }
+ if (el < 3 &&
+ arm_feature(env, ARM_FEATURE_EL3) &&
+ !(env->cp15.scr_el3 & SCR_ATA)) {
+ return CP_ACCESS_TRAP_EL3;
+ }
+ return CP_ACCESS_OK;
+}
+
+static uint64_t tco_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+ return env->pstate & PSTATE_TCO;
+}
+
+static void tco_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t val)
+{
+ env->pstate = (env->pstate & ~PSTATE_TCO) | (val & PSTATE_TCO);
+}
+
+static const ARMCPRegInfo mte_reginfo[] = {
+ { .name = "TFSRE0_EL1", .state = ARM_CP_STATE_AA64,
+ .opc0 = 3, .opc1 = 0, .crn = 5, .crm = 6, .opc2 = 1,
+ .access = PL1_RW, .accessfn = access_mte,
+ .fieldoffset = offsetof(CPUARMState, cp15.tfsr_el[0]) },
+ { .name = "TFSR_EL1", .state = ARM_CP_STATE_AA64,
+ .opc0 = 3, .opc1 = 0, .crn = 5, .crm = 6, .opc2 = 0,
+ .access = PL1_RW, .accessfn = access_mte,
+ .fieldoffset = offsetof(CPUARMState, cp15.tfsr_el[1]) },
+ { .name = "TFSR_EL2", .state = ARM_CP_STATE_AA64,
+ .opc0 = 3, .opc1 = 4, .crn = 5, .crm = 6, .opc2 = 0,
+ .access = PL2_RW, .accessfn = access_mte,
+ .fieldoffset = offsetof(CPUARMState, cp15.tfsr_el[2]) },
+ { .name = "TFSR_EL3", .state = ARM_CP_STATE_AA64,
+ .opc0 = 3, .opc1 = 6, .crn = 5, .crm = 6, .opc2 = 0,
+ .access = PL3_RW,
+ .fieldoffset = offsetof(CPUARMState, cp15.tfsr_el[3]) },
+ { .name = "RGSR_EL1", .state = ARM_CP_STATE_AA64,
+ .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 5,
+ .access = PL1_RW, .accessfn = access_mte,
+ .fieldoffset = offsetof(CPUARMState, cp15.rgsr_el1) },
+ { .name = "GCR_EL1", .state = ARM_CP_STATE_AA64,
+ .opc0 = 3, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 6,
+ .access = PL1_RW, .accessfn = access_mte,
+ .fieldoffset = offsetof(CPUARMState, cp15.gcr_el1) },
+ { .name = "GMID_EL1", .state = ARM_CP_STATE_AA64,
+ .opc0 = 3, .opc1 = 1, .crn = 0, .crm = 0, .opc2 = 4,
+ .access = PL1_R, .accessfn = access_aa64_tid5,
+ .type = ARM_CP_CONST, .resetvalue = GMID_EL1_BS },
+ { .name = "TCO", .state = ARM_CP_STATE_AA64,
+ .opc0 = 3, .opc1 = 3, .crn = 4, .crm = 2, .opc2 = 7,
+ .type = ARM_CP_NO_RAW,
+ .access = PL0_RW, .readfn = tco_read, .writefn = tco_write },
+ REGINFO_SENTINEL
+};
+
+static const ARMCPRegInfo mte_tco_ro_reginfo[] = {
+ { .name = "TCO", .state = ARM_CP_STATE_AA64,
+ .opc0 = 3, .opc1 = 3, .crn = 4, .crm = 2, .opc2 = 7,
+ .type = ARM_CP_CONST, .access = PL0_RW, },
+ REGINFO_SENTINEL
+};
#endif
static CPAccessResult access_predinv(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -7957,6 +8040,17 @@ void register_cp_regs_for_features(ARMCPU *cpu)
}
}
#endif /*CONFIG_USER_ONLY*/
+
+ /*
+ * If full MTE is enabled, add all of the system registers.
+ * If only "instructions available at EL0" are enabled,
+ * then define only a RAZ/WI version of PSTATE.TCO.
+ */
+ if (cpu_isar_feature(aa64_mte, cpu)) {
+ define_arm_cp_regs(cpu, mte_reginfo);
+ } else if (cpu_isar_feature(aa64_mte_insn_reg, cpu)) {
+ define_arm_cp_regs(cpu, mte_tco_ro_reginfo);
+ }
#endif
if (cpu_isar_feature(any_predinv, cpu)) {
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 943d6034de..528f608408 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1664,6 +1664,27 @@ static void handle_msr_i(DisasContext *s, uint32_t insn,
s->base.is_jmp = DISAS_UPDATE_EXIT;
break;
+ case 0x1c: /* TCO */
+ if (dc_isar_feature(aa64_mte, s)) {
+ /* Full MTE is enabled -- set the TCO bit as directed. */
+ if (crm & 1) {
+ set_pstate_bits(PSTATE_TCO);
+ } else {
+ clear_pstate_bits(PSTATE_TCO);
+ }
+ t1 = tcg_const_i32(s->current_el);
+ gen_helper_rebuild_hflags_a64(cpu_env, t1);
+ tcg_temp_free_i32(t1);
+ /* Many factors, including TCO, go into MTE_ACTIVE. */
+ s->base.is_jmp = DISAS_UPDATE_NOCHAIN;
+ } else if (dc_isar_feature(aa64_mte_insn_reg, s)) {
+ /* Only "instructions accessible at EL0" -- PSTATE.TCO is WI. */
+ s->base.is_jmp = DISAS_NEXT;
+ } else {
+ goto do_unallocated;
+ }
+ break;
+
default:
do_unallocated:
unallocated_encoding(s);
--
2.20.1
* [PATCH v6 08/42] target/arm: Add MTE bits to tb_flags
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Cache the composite ATA setting.
Cache when MTE is fully enabled, i.e. accesses to tags are enabled
and tag checks affect the PE. Do this for both the normal context
and the UNPRIV context.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v3: Remove stub helper_mte_check; moved to a later patch.
v6: Add mte0_active and ata bits; drop reviewed-by.
---
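One detail below is worth a standalone illustration: for translation
regimes with a single TCMA bit, aa64_va_parameter_tcma() replicates
that bit into both positions by multiplying by 3 (the helper name here
is illustrative, not QEMU code):

    #include <assert.h>
    #include <stdint.h>

    /* Replicate TCR bit 30 into a 2-bit field: 0 -> 0b00, 1 -> 0b11. */
    static int replicate_tcma(uint64_t tcr)
    {
        return (int)((tcr >> 30) & 1) * 3;
    }

    int main(void)
    {
        assert(replicate_tcma(0) == 0);
        assert(replicate_tcma(1ull << 30) == 3);
        return 0;
    }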
target/arm/cpu.h | 12 ++++++++----
target/arm/internals.h | 18 +++++++++++++++++
target/arm/translate.h | 5 +++++
target/arm/helper.c | 40 ++++++++++++++++++++++++++++++++++++++
target/arm/translate-a64.c | 4 ++++
5 files changed, 75 insertions(+), 4 deletions(-)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 25351abd15..67164d56a1 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3155,10 +3155,10 @@ typedef ARMCPU ArchCPU;
* | | | TBFLAG_A32 | |
* | | +-----+----------+ TBFLAG_AM32 |
* | TBFLAG_ANY | |TBFLAG_M32| |
- * | | +-+----------+--------------|
- * | | | TBFLAG_A64 |
- * +--------------+---------+---------------------------+
- * 31 20 15 0
+ * | +-----------+----------+--------------|
+ * | | TBFLAG_A64 |
+ * +--------------+-------------------------------------+
+ * 31 20 0
*
* Unless otherwise noted, these bits are cached in env->hflags.
*/
@@ -3225,6 +3225,10 @@ FIELD(TBFLAG_A64, BT, 9, 1)
FIELD(TBFLAG_A64, BTYPE, 10, 2) /* Not cached. */
FIELD(TBFLAG_A64, TBID, 12, 2)
FIELD(TBFLAG_A64, UNPRIV, 14, 1)
+FIELD(TBFLAG_A64, ATA, 15, 1)
+FIELD(TBFLAG_A64, TCMA, 16, 2)
+FIELD(TBFLAG_A64, MTE_ACTIVE, 18, 1)
+FIELD(TBFLAG_A64, MTE0_ACTIVE, 19, 1)
/**
* cpu_mmu_index:
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 0591f30526..45f445cf3e 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1194,6 +1194,24 @@ static inline int exception_target_el(CPUARMState *env)
return target_el;
}
+/* Determine if allocation tags are available. */
+static inline bool allocation_tag_access_enabled(CPUARMState *env, int el,
+ uint64_t sctlr)
+{
+ if (el < 3
+ && arm_feature(env, ARM_FEATURE_EL3)
+ && !(env->cp15.scr_el3 & SCR_ATA)) {
+ return false;
+ }
+ if (el < 2
+ && arm_feature(env, ARM_FEATURE_EL2)
+ && !(arm_hcr_el2_eff(env) & HCR_ATA)) {
+ return false;
+ }
+ sctlr &= (el == 0 ? SCTLR_ATA0 : SCTLR_ATA);
+ return sctlr != 0;
+}
+
#ifndef CONFIG_USER_ONLY
/* Security attributes for an address, as returned by v8m_security_lookup. */
diff --git a/target/arm/translate.h b/target/arm/translate.h
index dbbb167174..e0f5d0be63 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -30,6 +30,7 @@ typedef struct DisasContext {
ARMMMUIdx mmu_idx; /* MMU index to use for normal loads/stores */
uint8_t tbii; /* TBI1|TBI0 for insns */
uint8_t tbid; /* TBI1|TBI0 for data */
+ uint8_t tcma; /* TCMA1|TCMA0 for MTE */
bool ns; /* Use non-secure CPREG bank on access */
int fp_excp_el; /* FP exception EL or 0 if enabled */
int sve_excp_el; /* SVE exception EL or 0 if enabled */
@@ -77,6 +78,10 @@ typedef struct DisasContext {
bool unpriv;
/* True if v8.3-PAuth is active. */
bool pauth_active;
+ /* True if v8.5-MTE access to tags is enabled. */
+ bool ata;
+ /* True if v8.5-MTE tag checks affect the PE; index with is_unpriv. */
+ bool mte_active[2];
/* True with v8.5-BTI and SCTLR_ELx.BT* set. */
bool bt;
/* True if any CP15 access is trapped by HSTR_EL2 */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index b47209be64..01d2fcf2e3 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -10587,6 +10587,16 @@ static int aa64_va_parameter_tbid(uint64_t tcr, ARMMMUIdx mmu_idx)
}
}
+static int aa64_va_parameter_tcma(uint64_t tcr, ARMMMUIdx mmu_idx)
+{
+ if (regime_has_2_ranges(mmu_idx)) {
+ return extract64(tcr, 57, 2);
+ } else {
+ /* Replicate the single TCMA bit so we always have 2 bits. */
+ return extract32(tcr, 30, 1) * 3;
+ }
+}
+
ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
ARMMMUIdx mmu_idx, bool data)
{
@@ -12590,6 +12600,36 @@ static uint32_t rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
}
}
+ if (cpu_isar_feature(aa64_mte, env_archcpu(env))) {
+ /*
+ * Set MTE_ACTIVE if any access may be Checked, and leave clear
+ * if all accesses must be Unchecked:
+ * 1) If no TBI, then there are no tags in the address to check,
+ * 2) If Tag Check Override, then all accesses are Unchecked,
+ * 3) If Tag Check Fail == 0, then Checked accesses have no effect,
+ * 4) If no Allocation Tag Access, then all accesses are Unchecked.
+ */
+ if (allocation_tag_access_enabled(env, el, sctlr)) {
+ flags = FIELD_DP32(flags, TBFLAG_A64, ATA, 1);
+ if (tbid
+ && !(env->pstate & PSTATE_TCO)
+ && (sctlr & (el == 0 ? SCTLR_TCF0 : SCTLR_TCF))) {
+ flags = FIELD_DP32(flags, TBFLAG_A64, MTE_ACTIVE, 1);
+ }
+ }
+ /* And again for unprivileged accesses, if required. */
+ if (FIELD_EX32(flags, TBFLAG_A64, UNPRIV)
+ && tbid
+ && !(env->pstate & PSTATE_TCO)
+ && (sctlr & SCTLR_TCF0)
+ && allocation_tag_access_enabled(env, 0, sctlr)) {
+ flags = FIELD_DP32(flags, TBFLAG_A64, MTE0_ACTIVE, 1);
+ }
+ /* Cache TCMA as well as TBI. */
+ flags = FIELD_DP32(flags, TBFLAG_A64, TCMA,
+ aa64_va_parameter_tcma(tcr, mmu_idx));
+ }
+
return rebuild_hflags_common(env, fp_el, mmu_idx, flags);
}
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 528f608408..5eda6ff975 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -14335,6 +14335,7 @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
dc->mmu_idx = core_to_aa64_mmu_idx(core_mmu_idx);
dc->tbii = FIELD_EX32(tb_flags, TBFLAG_A64, TBII);
dc->tbid = FIELD_EX32(tb_flags, TBFLAG_A64, TBID);
+ dc->tcma = FIELD_EX32(tb_flags, TBFLAG_A64, TCMA);
dc->current_el = arm_mmu_idx_to_el(dc->mmu_idx);
#if !defined(CONFIG_USER_ONLY)
dc->user = (dc->current_el == 0);
@@ -14346,6 +14347,9 @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
dc->bt = FIELD_EX32(tb_flags, TBFLAG_A64, BT);
dc->btype = FIELD_EX32(tb_flags, TBFLAG_A64, BTYPE);
dc->unpriv = FIELD_EX32(tb_flags, TBFLAG_A64, UNPRIV);
+ dc->ata = FIELD_EX32(tb_flags, TBFLAG_A64, ATA);
+ dc->mte_active[0] = FIELD_EX32(tb_flags, TBFLAG_A64, MTE_ACTIVE);
+ dc->mte_active[1] = FIELD_EX32(tb_flags, TBFLAG_A64, MTE0_ACTIVE);
dc->vec_len = 0;
dc->vec_stride = 0;
dc->cp_regs = arm_cpu->cp_regs;
--
2.20.1
* [PATCH v6 09/42] target/arm: Implement the IRG instruction
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Update to 00eac5.
Merge choose_random_nonexcluded_tag into helper_irg since
that pseudo function no longer exists separately.
v6: Remove obsolete logical/physical tag distinction;
implement inline for !ATA.
---
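For reference, a self-contained sketch of the deterministic RandomTag
offset generation used by helper_irg() below, with our IMPDEF choice
of treating RRND==1 as RRND==0 (names and seed value illustrative):

    #include <stdint.h>
    #include <stdio.h>

    /* Four NextRandomTagBit steps of the 16-bit LFSR (taps 5, 3, 2, 0). */
    static int next_offset(uint32_t *seed)
    {
        int offset = 0;
        for (int i = 0; i < 4; i++) {
            int top = ((*seed >> 5) ^ (*seed >> 3) ^
                       (*seed >> 2) ^ *seed) & 1;
            *seed = (top << 15) | (*seed >> 1);
            offset |= top << i;
        }
        return offset;
    }

    int main(void)
    {
        uint32_t seed = 0xace1;   /* arbitrary RGSR_EL1.SEED */
        for (int i = 0; i < 4; i++) {
            printf("offset = %d\n", next_offset(&seed));
        }
        return 0;
    }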
target/arm/helper-a64.h | 2 ++
target/arm/internals.h | 5 +++
target/arm/mte_helper.c | 72 ++++++++++++++++++++++++++++++++++++++
target/arm/translate-a64.c | 18 ++++++++++
target/arm/Makefile.objs | 1 +
5 files changed, 98 insertions(+)
create mode 100644 target/arm/mte_helper.c
diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index 3df7c185aa..587ccbe42f 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -103,3 +103,5 @@ DEF_HELPER_FLAGS_3(autda, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_3(autdb, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_2(xpaci, TCG_CALL_NO_RWG_SE, i64, env, i64)
DEF_HELPER_FLAGS_2(xpacd, TCG_CALL_NO_RWG_SE, i64, env, i64)
+
+DEF_HELPER_FLAGS_3(irg, TCG_CALL_NO_RWG, i64, env, i64, i64)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 45f445cf3e..e4450302a6 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1257,4 +1257,9 @@ void arm_log_exception(int idx);
*/
#define GMID_EL1_BS 6
+static inline uint64_t address_with_allocation_tag(uint64_t ptr, int rtag)
+{
+ return deposit64(ptr, 56, 4, rtag);
+}
+
#endif
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
new file mode 100644
index 0000000000..539a04de84
--- /dev/null
+++ b/target/arm/mte_helper.c
@@ -0,0 +1,72 @@
+/*
+ * ARM v8.5-MemTag Operations
+ *
+ * Copyright (c) 2020 Linaro, Ltd.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "internals.h"
+#include "exec/exec-all.h"
+#include "exec/cpu_ldst.h"
+#include "exec/helper-proto.h"
+
+
+static int choose_nonexcluded_tag(int tag, int offset, uint16_t exclude)
+{
+ if (exclude == 0xffff) {
+ return 0;
+ }
+ if (offset == 0) {
+ while (exclude & (1 << tag)) {
+ tag = (tag + 1) & 15;
+ }
+ } else {
+ do {
+ do {
+ tag = (tag + 1) & 15;
+ } while (exclude & (1 << tag));
+ } while (--offset > 0);
+ }
+ return tag;
+}
+
+uint64_t HELPER(irg)(CPUARMState *env, uint64_t rn, uint64_t rm)
+{
+ int rtag;
+
+ /*
+ * Our IMPDEF choice for GCR_EL1.RRND==1 is to behave as if
+ * GCR_EL1.RRND==0, always producing deterministic results.
+ */
+ uint16_t exclude = extract32(rm | env->cp15.gcr_el1, 0, 16);
+ int start = extract32(env->cp15.rgsr_el1, 0, 4);
+ int seed = extract32(env->cp15.rgsr_el1, 8, 16);
+ int offset, i;
+
+ /* RandomTag */
+ for (i = offset = 0; i < 4; ++i) {
+ /* NextRandomTagBit */
+ int top = (extract32(seed, 5, 1) ^ extract32(seed, 3, 1) ^
+ extract32(seed, 2, 1) ^ extract32(seed, 0, 1));
+ seed = (top << 15) | (seed >> 1);
+ offset |= top << i;
+ }
+ rtag = choose_nonexcluded_tag(start, offset, exclude);
+ env->cp15.rgsr_el1 = rtag | (seed << 8);
+
+ return address_with_allocation_tag(rn, rtag);
+}
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 5eda6ff975..f1e2fe0060 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -243,6 +243,12 @@ static TCGv_i64 clean_data_tbi(DisasContext *s, TCGv_i64 addr)
return clean;
}
+/* Insert a zero tag into src, with the result at dst. */
+static void gen_address_with_allocation_tag0(TCGv_i64 dst, TCGv_i64 src)
+{
+ tcg_gen_andi_i64(dst, src, ~MAKE_64BIT_MASK(56, 4));
+}
+
typedef struct DisasCompare64 {
TCGCond cond;
TCGv_i64 value;
@@ -5329,6 +5335,18 @@ static void disas_data_proc_2src(DisasContext *s, uint32_t insn)
case 3: /* SDIV */
handle_div(s, true, sf, rm, rn, rd);
break;
+ case 4: /* IRG */
+ if (sf == 0 || !dc_isar_feature(aa64_mte_insn_reg, s)) {
+ goto do_unallocated;
+ }
+ if (s->ata) {
+ gen_helper_irg(cpu_reg_sp(s, rd), cpu_env,
+ cpu_reg_sp(s, rn), cpu_reg(s, rm));
+ } else {
+ gen_address_with_allocation_tag0(cpu_reg_sp(s, rd),
+ cpu_reg_sp(s, rn));
+ }
+ break;
case 8: /* LSLV */
handle_shift_reg(s, A64_SHIFT_TYPE_LSL, sf, rm, rn, rd);
break;
diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
index cf26c16f5f..8fd7d086c8 100644
--- a/target/arm/Makefile.objs
+++ b/target/arm/Makefile.objs
@@ -67,3 +67,4 @@ obj-$(CONFIG_SOFTMMU) += psci.o
obj-$(TARGET_AARCH64) += translate-a64.o helper-a64.o
obj-$(TARGET_AARCH64) += translate-sve.o sve_helper.o
obj-$(TARGET_AARCH64) += pauth_helper.o
+obj-$(TARGET_AARCH64) += mte_helper.o
--
2.20.1
* [PATCH v6 10/42] target/arm: Implement the ADDG, SUBG instructions
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Shift offset in translate; use extract32.
v6: Implement inline for !ATA.
---
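To make the immediate split concrete: ADDG/SUBG carry a 6-bit offset
scaled by the 16-byte tag granule, plus a 4-bit tag offset that is
applied separately through choose_nonexcluded_tag(). A self-contained
sketch of just the address arithmetic (function name illustrative):

    #include <assert.h>
    #include <stdint.h>

    #define LOG2_TAG_GRANULE 4

    /* ADDG address part: Xd = Xn + uimm6 * TAG_GRANULE. */
    static uint64_t addg_addr(uint64_t xn, unsigned uimm6)
    {
        return xn + ((uint64_t)uimm6 << LOG2_TAG_GRANULE);
    }

    int main(void)
    {
        assert(addg_addr(0x1000, 0) == 0x1000);
        assert(addg_addr(0x1000, 63) == 0x1000 + 63 * 16);
        return 0;
    }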
target/arm/helper-a64.h | 1 +
target/arm/internals.h | 9 +++++++++
target/arm/mte_helper.c | 10 ++++++++++
target/arm/translate-a64.c | 36 ++++++++++++++++++++++++++++++++++--
4 files changed, 54 insertions(+), 2 deletions(-)
diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index 587ccbe42f..6c116481e8 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -105,3 +105,4 @@ DEF_HELPER_FLAGS_2(xpaci, TCG_CALL_NO_RWG_SE, i64, env, i64)
DEF_HELPER_FLAGS_2(xpacd, TCG_CALL_NO_RWG_SE, i64, env, i64)
DEF_HELPER_FLAGS_3(irg, TCG_CALL_NO_RWG, i64, env, i64, i64)
+DEF_HELPER_FLAGS_4(addsubg, TCG_CALL_NO_RWG_SE, i64, env, i64, s32, i32)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index e4450302a6..3190fc40fc 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1257,6 +1257,15 @@ void arm_log_exception(int idx);
*/
#define GMID_EL1_BS 6
+/* We associate one allocation tag per 16 bytes, the minimum. */
+#define LOG2_TAG_GRANULE 4
+#define TAG_GRANULE (1 << LOG2_TAG_GRANULE)
+
+static inline int allocation_tag_from_addr(uint64_t ptr)
+{
+ return extract64(ptr, 56, 4);
+}
+
static inline uint64_t address_with_allocation_tag(uint64_t ptr, int rtag)
{
return deposit64(ptr, 56, 4, rtag);
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
index 539a04de84..9ab9ed749d 100644
--- a/target/arm/mte_helper.c
+++ b/target/arm/mte_helper.c
@@ -70,3 +70,13 @@ uint64_t HELPER(irg)(CPUARMState *env, uint64_t rn, uint64_t rm)
return address_with_allocation_tag(rn, rtag);
}
+
+uint64_t HELPER(addsubg)(CPUARMState *env, uint64_t ptr,
+ int32_t offset, uint32_t tag_offset)
+{
+ int start_tag = allocation_tag_from_addr(ptr);
+ uint16_t exclude = extract32(env->cp15.gcr_el1, 0, 16);
+ int rtag = choose_nonexcluded_tag(start_tag, tag_offset, exclude);
+
+ return address_with_allocation_tag(ptr + offset, rtag);
+}
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index f1e2fe0060..02f31d5948 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -3807,17 +3807,20 @@ static void disas_pc_rel_adr(DisasContext *s, uint32_t insn)
* sf: 0 -> 32bit, 1 -> 64bit
* op: 0 -> add , 1 -> sub
* S: 1 -> set flags
- * shift: 00 -> LSL imm by 0, 01 -> LSL imm by 12
+ * shift: 00 -> LSL imm by 0,
+ * 01 -> LSL imm by 12
+ * 10 -> ADDG, SUBG
*/
static void disas_add_sub_imm(DisasContext *s, uint32_t insn)
{
int rd = extract32(insn, 0, 5);
int rn = extract32(insn, 5, 5);
- uint64_t imm = extract32(insn, 10, 12);
+ int imm = extract32(insn, 10, 12);
int shift = extract32(insn, 22, 2);
bool setflags = extract32(insn, 29, 1);
bool sub_op = extract32(insn, 30, 1);
bool is_64bit = extract32(insn, 31, 1);
+ bool is_tag = false;
TCGv_i64 tcg_rn = cpu_reg_sp(s, rn);
TCGv_i64 tcg_rd = setflags ? cpu_reg(s, rd) : cpu_reg_sp(s, rd);
@@ -3829,11 +3832,40 @@ static void disas_add_sub_imm(DisasContext *s, uint32_t insn)
case 0x1:
imm <<= 12;
break;
+ case 0x2:
+ /* ADDG, SUBG */
+ if (!is_64bit || setflags || (imm & 0x30) ||
+ !dc_isar_feature(aa64_mte_insn_reg, s)) {
+ goto do_unallocated;
+ }
+ is_tag = true;
+ break;
default:
+ do_unallocated:
unallocated_encoding(s);
return;
}
+ if (is_tag) {
+ imm = (imm >> 6) << LOG2_TAG_GRANULE;
+ if (sub_op) {
+ imm = -imm;
+ }
+
+ if (s->ata) {
+ TCGv_i32 tag_offset = tcg_const_i32(extract32(insn, 10, 4));
+ TCGv_i32 offset = tcg_const_i32(imm);
+
+ gen_helper_addsubg(tcg_rd, cpu_env, tcg_rn, offset, tag_offset);
+ tcg_temp_free_i32(tag_offset);
+ tcg_temp_free_i32(offset);
+ } else {
+ tcg_gen_addi_i64(tcg_rd, tcg_rn, imm);
+ gen_address_with_allocation_tag0(tcg_rd, tcg_rd);
+ }
+ return;
+ }
+
tcg_result = tcg_temp_new_i64();
if (!setflags) {
if (sub_op) {
--
2.20.1
* [PATCH v6 11/42] target/arm: Implement the GMI instruction
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v6: Inline the operation.
---
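The inlined operation computes Xd = Xm | (1 << allocation_tag(Xn)).
A self-contained restatement (function name illustrative):

    #include <assert.h>
    #include <stdint.h>

    /* GMI: set the bit of Xn's allocation tag in the exclude mask Xm. */
    static uint64_t gmi(uint64_t xn, uint64_t xm)
    {
        return xm | (1ull << ((xn >> 56) & 0xf));
    }

    int main(void)
    {
        assert(gmi(0x0a00000000001000ull, 0) == 1ull << 10);
        assert(gmi(0, 0xffff) == 0xffff);   /* tag 0 already excluded */
        return 0;
    }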
target/arm/translate-a64.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 02f31d5948..9adc259186 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5379,6 +5379,21 @@ static void disas_data_proc_2src(DisasContext *s, uint32_t insn)
cpu_reg_sp(s, rn));
}
break;
+ case 5: /* GMI */
+ if (sf == 0 || !dc_isar_feature(aa64_mte_insn_reg, s)) {
+ goto do_unallocated;
+ } else {
+ TCGv_i64 t1 = tcg_const_i64(1);
+ TCGv_i64 t2 = tcg_temp_new_i64();
+
+ tcg_gen_extract_i64(t2, cpu_reg_sp(s, rn), 56, 4);
+ tcg_gen_shl_i64(t1, t1, t2);
+ tcg_gen_or_i64(cpu_reg(s, rd), cpu_reg(s, rm), t1);
+
+ tcg_temp_free_i64(t1);
+ tcg_temp_free_i64(t2);
+ }
+ break;
case 8: /* LSLV */
handle_shift_reg(s, A64_SHIFT_TYPE_LSL, sf, rm, rn, rd);
break;
--
2.20.1
* [PATCH v6 12/42] target/arm: Implement the SUBP instruction
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Fix extraction length.
---
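For reference, a self-contained restatement of the SUBP arithmetic
below: both operands are sign-extended from bit 55 before subtracting,
so the tag byte never affects the difference (function name
illustrative):

    #include <assert.h>
    #include <stdint.h>

    static int64_t subp(uint64_t xn, uint64_t xm)
    {
        int64_t n = (int64_t)(xn << 8) >> 8;   /* sextract64(xn, 0, 56) */
        int64_t m = (int64_t)(xm << 8) >> 8;
        return n - m;
    }

    int main(void)
    {
        /* Different allocation tags; 56-bit addresses 0x10 apart. */
        assert(subp(0x0a00000000001010ull, 0x0500000000001000ull) == 0x10);
        return 0;
    }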
target/arm/translate-a64.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 9adc259186..49c94bc565 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5348,19 +5348,39 @@ static void handle_crc32(DisasContext *s,
*/
static void disas_data_proc_2src(DisasContext *s, uint32_t insn)
{
- unsigned int sf, rm, opcode, rn, rd;
+ unsigned int sf, rm, opcode, rn, rd, setflag;
sf = extract32(insn, 31, 1);
+ setflag = extract32(insn, 29, 1);
rm = extract32(insn, 16, 5);
opcode = extract32(insn, 10, 6);
rn = extract32(insn, 5, 5);
rd = extract32(insn, 0, 5);
- if (extract32(insn, 29, 1)) {
+ if (setflag && opcode != 0) {
unallocated_encoding(s);
return;
}
switch (opcode) {
+ case 0: /* SUBP(S) */
+ if (sf == 0 || !dc_isar_feature(aa64_mte_insn_reg, s)) {
+ goto do_unallocated;
+ } else {
+ TCGv_i64 tcg_n, tcg_m, tcg_d;
+
+ tcg_n = read_cpu_reg_sp(s, rn, true);
+ tcg_m = read_cpu_reg_sp(s, rm, true);
+ tcg_gen_sextract_i64(tcg_n, tcg_n, 0, 56);
+ tcg_gen_sextract_i64(tcg_m, tcg_m, 0, 56);
+ tcg_d = cpu_reg(s, rd);
+
+ if (setflag) {
+ gen_sub_CC(true, tcg_d, tcg_n, tcg_m);
+ } else {
+ tcg_gen_sub_i64(tcg_d, tcg_n, tcg_m);
+ }
+ }
+ break;
case 2: /* UDIV */
handle_div(s, false, sf, rm, rn, rd);
break;
--
2.20.1
* [PATCH v6 13/42] target/arm: Define arm_cpu_do_unaligned_access for user-only
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
We need this to raise unaligned exceptions from user mode.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v6: Use EXCP_UNALIGNED for user-only and update cpu_loop.c.
---
linux-user/aarch64/cpu_loop.c | 7 ++++++
linux-user/arm/cpu_loop.c | 7 ++++++
target/arm/cpu.c | 2 +-
target/arm/tlb_helper.c | 41 ++++++++++++++++++++++-------------
4 files changed, 41 insertions(+), 16 deletions(-)
diff --git a/linux-user/aarch64/cpu_loop.c b/linux-user/aarch64/cpu_loop.c
index bbe9fefca8..3cca637bb9 100644
--- a/linux-user/aarch64/cpu_loop.c
+++ b/linux-user/aarch64/cpu_loop.c
@@ -121,6 +121,13 @@ void cpu_loop(CPUARMState *env)
info._sifields._sigfault._addr = env->exception.vaddress;
queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
break;
+ case EXCP_UNALIGNED:
+ info.si_signo = TARGET_SIGBUS;
+ info.si_errno = 0;
+ info.si_code = TARGET_BUS_ADRALN;
+ info._sifields._sigfault._addr = env->exception.vaddress;
+ queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+ break;
case EXCP_DEBUG:
case EXCP_BKPT:
info.si_signo = TARGET_SIGTRAP;
diff --git a/linux-user/arm/cpu_loop.c b/linux-user/arm/cpu_loop.c
index cf618daa1c..d2ce78ae73 100644
--- a/linux-user/arm/cpu_loop.c
+++ b/linux-user/arm/cpu_loop.c
@@ -395,6 +395,13 @@ void cpu_loop(CPUARMState *env)
queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
}
break;
+ case EXCP_UNALIGNED:
+ info.si_signo = TARGET_SIGBUS;
+ info.si_errno = 0;
+ info.si_code = TARGET_BUS_ADRALN;
+ info._sifields._sigfault._addr = env->exception.vaddress;
+ queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+ break;
case EXCP_DEBUG:
excp_debug:
info.si_signo = TARGET_SIGTRAP;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 3623ecefbd..cb3c3fe8c2 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -2831,8 +2831,8 @@ static void arm_cpu_class_init(ObjectClass *oc, void *data)
cc->tlb_fill = arm_cpu_tlb_fill;
cc->debug_excp_handler = arm_debug_excp_handler;
cc->debug_check_watchpoint = arm_debug_check_watchpoint;
-#if !defined(CONFIG_USER_ONLY)
cc->do_unaligned_access = arm_cpu_do_unaligned_access;
+#if !defined(CONFIG_USER_ONLY)
cc->do_transaction_failed = arm_cpu_do_transaction_failed;
cc->adjust_watchpoint_address = arm_adjust_watchpoint_address;
#endif /* CONFIG_TCG && !CONFIG_USER_ONLY */
diff --git a/target/arm/tlb_helper.c b/target/arm/tlb_helper.c
index e63f8bda29..44d7bcc783 100644
--- a/target/arm/tlb_helper.c
+++ b/target/arm/tlb_helper.c
@@ -107,21 +107,6 @@ static void QEMU_NORETURN arm_deliver_fault(ARMCPU *cpu, vaddr addr,
raise_exception(env, exc, syn, target_el);
}
-/* Raise a data fault alignment exception for the specified virtual address */
-void arm_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
- MMUAccessType access_type,
- int mmu_idx, uintptr_t retaddr)
-{
- ARMCPU *cpu = ARM_CPU(cs);
- ARMMMUFaultInfo fi = {};
-
- /* now we have a real cpu fault */
- cpu_restore_state(cs, retaddr, true);
-
- fi.type = ARMFault_Alignment;
- arm_deliver_fault(cpu, vaddr, access_type, mmu_idx, &fi);
-}
-
/*
* arm_cpu_do_transaction_failed: handle a memory system error response
* (eg "no device/memory present at address") by raising an external abort
@@ -198,3 +183,29 @@ bool arm_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
}
#endif
}
+
+/* Raise a data fault alignment exception for the specified virtual address */
+void arm_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
+ MMUAccessType access_type,
+ int mmu_idx, uintptr_t retaddr)
+{
+ ARMCPU *cpu = ARM_CPU(cs);
+
+#ifdef CONFIG_USER_ONLY
+ cpu->env.exception.vaddress = vaddr;
+ /*
+ * For HW, this is EXCP_DATA_ABORT with a proper syndrome.
+ * Make it easier for ourselves in linux-user/arm/cpu_loop.c.
+ */
+ cs->exception_index = EXCP_UNALIGNED;
+ cpu_loop_exit_restore(cs, retaddr);
+#else
+ ARMMMUFaultInfo fi = {};
+
+ /* now we have a real cpu fault */
+ cpu_restore_state(cs, retaddr, true);
+
+ fi.type = ARMFault_Alignment;
+ arm_deliver_fault(cpu, vaddr, access_type, mmu_idx, &fi);
+#endif
+}
--
2.20.1
* [PATCH v6 14/42] target/arm: Add helper_probe_access
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Raise an exception if the given virtual memory is not accessible.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
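The page-split arithmetic below is compact enough to deserve a
standalone illustration: negating (ptr | PAGE_MASK) yields the number
of bytes from ptr to the end of its page. A self-contained sketch
assuming a 4K page (QEMU's TARGET_PAGE_SIZE varies by target):

    #include <assert.h>
    #include <stdint.h>

    #define PAGE_SIZE 0x1000u
    #define PAGE_MASK (~(PAGE_SIZE - 1))

    static uint32_t bytes_left_in_page(uint32_t ptr)
    {
        return -(ptr | PAGE_MASK);
    }

    int main(void)
    {
        assert(bytes_left_in_page(0x0000) == PAGE_SIZE);
        assert(bytes_left_in_page(0x0ff0) == 0x10);
        assert(bytes_left_in_page(0x1001) == PAGE_SIZE - 1);
        return 0;
    }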
target/arm/helper.h | 2 ++
target/arm/op_helper.c | 16 ++++++++++++++++
target/arm/translate-a64.c | 13 +++++++++++++
3 files changed, 31 insertions(+)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index 72eb9e6a1a..8a616ed6d8 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -95,6 +95,8 @@ DEF_HELPER_FLAGS_1(rebuild_hflags_a32_newel, TCG_CALL_NO_RWG, void, env)
DEF_HELPER_FLAGS_2(rebuild_hflags_a32, TCG_CALL_NO_RWG, void, env, int)
DEF_HELPER_FLAGS_2(rebuild_hflags_a64, TCG_CALL_NO_RWG, void, env, int)
+DEF_HELPER_FLAGS_5(probe_access, TCG_CALL_NO_WG, void, env, tl, i32, i32, i32)
+
DEF_HELPER_1(vfp_get_fpscr, i32, env)
DEF_HELPER_2(vfp_set_fpscr, void, env, i32)
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index eb0de080f1..b1065216b2 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -935,3 +935,19 @@ uint32_t HELPER(ror_cc)(CPUARMState *env, uint32_t x, uint32_t i)
return ((uint32_t)x >> shift) | (x << (32 - shift));
}
}
+
+void HELPER(probe_access)(CPUARMState *env, target_ulong ptr,
+ uint32_t access_type, uint32_t mmu_idx,
+ uint32_t size)
+{
+ uint32_t in_page = -((uint32_t)ptr | TARGET_PAGE_MASK);
+ uintptr_t ra = GETPC();
+
+ if (likely(size <= in_page)) {
+ probe_access(env, ptr, size, access_type, mmu_idx, ra);
+ } else {
+ probe_access(env, ptr, in_page, access_type, mmu_idx, ra);
+ probe_access(env, ptr + in_page, size - in_page,
+ access_type, mmu_idx, ra);
+ }
+}
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 49c94bc565..45a95d8ea0 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -249,6 +249,19 @@ static void gen_address_with_allocation_tag0(TCGv_i64 dst, TCGv_i64 src)
tcg_gen_andi_i64(dst, src, ~MAKE_64BIT_MASK(56, 4));
}
+static void gen_probe_access(DisasContext *s, TCGv_i64 ptr,
+ MMUAccessType acc, int log2_size)
+{
+ TCGv_i32 t_acc = tcg_const_i32(acc);
+ TCGv_i32 t_idx = tcg_const_i32(get_mem_index(s));
+ TCGv_i32 t_size = tcg_const_i32(1 << log2_size);
+
+ gen_helper_probe_access(cpu_env, ptr, t_acc, t_idx, t_size);
+ tcg_temp_free_i32(t_acc);
+ tcg_temp_free_i32(t_idx);
+ tcg_temp_free_i32(t_size);
+}
+
typedef struct DisasCompare64 {
TCGCond cond;
TCGv_i64 value;
--
2.20.1
* [PATCH v6 15/42] target/arm: Implement LDG, STG, ST2G instructions
From: Richard Henderson @ 2020-03-12 19:41 UTC
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Split out allocation_tag_mem. Handle atomicity of stores.
v3: Add X[t] input to these insns; require pre-cleaned addresses.
v5: Fix !32-byte aligned operation of st2g.
v6: Fix op2 extract, stg pre/post-index, stores vs sp, commentary;
use pre-computed ata.
---
target/arm/helper-a64.h | 7 ++
target/arm/mte_helper.c | 194 +++++++++++++++++++++++++++++++++++++
target/arm/translate-a64.c | 158 +++++++++++++++++++++++++++++-
3 files changed, 354 insertions(+), 5 deletions(-)
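As a worked example of the nibble packing described by the
allocation_tag_mem comment below (illustrative addresses only):

    /*
     * With LOG2_TAG_GRANULE == 4, one tag byte covers two 16-byte
     * granules, i.e. 32 bytes of data:
     *   addr 0x1000..0x100f -> tag byte (0x1000 >> 5) == 0x80, bits [3:0]
     *   addr 0x1010..0x101f -> the same tag byte 0x80, bits [7:4]
     * load_tag1 and store_tag1 select the nibble from ptr bit 4.
     */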
diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index 6c116481e8..2fa61b86fa 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -106,3 +106,10 @@ DEF_HELPER_FLAGS_2(xpacd, TCG_CALL_NO_RWG_SE, i64, env, i64)
DEF_HELPER_FLAGS_3(irg, TCG_CALL_NO_RWG, i64, env, i64, i64)
DEF_HELPER_FLAGS_4(addsubg, TCG_CALL_NO_RWG_SE, i64, env, i64, s32, i32)
+DEF_HELPER_FLAGS_3(ldg, TCG_CALL_NO_WG, i64, env, i64, i64)
+DEF_HELPER_FLAGS_3(stg, TCG_CALL_NO_WG, void, env, i64, i64)
+DEF_HELPER_FLAGS_3(stg_parallel, TCG_CALL_NO_WG, void, env, i64, i64)
+DEF_HELPER_FLAGS_2(stg_stub, TCG_CALL_NO_WG, void, env, i64)
+DEF_HELPER_FLAGS_3(st2g, TCG_CALL_NO_WG, void, env, i64, i64)
+DEF_HELPER_FLAGS_3(st2g_parallel, TCG_CALL_NO_WG, void, env, i64, i64)
+DEF_HELPER_FLAGS_2(st2g_stub, TCG_CALL_NO_WG, void, env, i64)
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
index 9ab9ed749d..7ec7930dfc 100644
--- a/target/arm/mte_helper.c
+++ b/target/arm/mte_helper.c
@@ -44,6 +44,40 @@ static int choose_nonexcluded_tag(int tag, int offset, uint16_t exclude)
return tag;
}
+/**
+ * allocation_tag_mem:
+ * @env: the cpu environment
+ * @ptr_mmu_idx: the addressing regime to use for the virtual address
+ * @ptr: the virtual address for which to look up tag memory
+ * @ptr_access: the access to use for the virtual address
+ * @ptr_size: the number of bytes in the normal memory access
+ * @tag_access: the access to use for the tag memory
+ * @tag_size: the number of bytes in the tag memory access
+ * @ra: the return address for exception handling
+ *
+ * Our tag memory is formatted as a sequence of little-endian nibbles.
+ * That is, the byte at (addr >> (LOG2_TAG_GRANULE + 1)) contains two
+ * tags, with the tag at [3:0] for the lower addr and the tag at [7:4]
+ * for the higher addr.
+ *
+ * Here, resolve the physical address from the virtual address, and return
+ * a pointer to the corresponding tag byte. Exit with exception if the
+ * virtual address is not accessible for @ptr_access.
+ *
+ * The @ptr_size and @tag_size values may not have an obvious relation
+ * due to the alignment of @ptr, and the number of tag checks required.
+ *
+ * If there is no tag storage corresponding to @ptr, return NULL.
+ */
+static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
+ uint64_t ptr, MMUAccessType ptr_access,
+ int ptr_size, MMUAccessType tag_access,
+ int tag_size, uintptr_t ra)
+{
+ /* Tag storage not implemented. */
+ return NULL;
+}
+
uint64_t HELPER(irg)(CPUARMState *env, uint64_t rn, uint64_t rm)
{
int rtag;
@@ -80,3 +114,163 @@ uint64_t HELPER(addsubg)(CPUARMState *env, uint64_t ptr,
return address_with_allocation_tag(ptr + offset, rtag);
}
+
+static int load_tag1(uint64_t ptr, uint8_t *mem)
+{
+ int ofs = extract32(ptr, LOG2_TAG_GRANULE, 1) * 4;
+ return extract32(*mem, ofs, 4);
+}
+
+uint64_t HELPER(ldg)(CPUARMState *env, uint64_t ptr, uint64_t xt)
+{
+ int mmu_idx = cpu_mmu_index(env, false);
+ uint8_t *mem;
+ int rtag = 0;
+
+ /* Trap if accessing an invalid page. */
+ mem = allocation_tag_mem(env, mmu_idx, ptr, MMU_DATA_LOAD, 1,
+ MMU_DATA_LOAD, 1, GETPC());
+
+ /* Load if page supports tags. */
+ if (mem) {
+ rtag = load_tag1(ptr, mem);
+ }
+
+ return address_with_allocation_tag(xt, rtag);
+}
+
+static void check_tag_aligned(CPUARMState *env, uint64_t ptr, uintptr_t ra)
+{
+ if (unlikely(!QEMU_IS_ALIGNED(ptr, TAG_GRANULE))) {
+ arm_cpu_do_unaligned_access(env_cpu(env), ptr, MMU_DATA_STORE,
+ cpu_mmu_index(env, false), ra);
+ g_assert_not_reached();
+ }
+}
+
+/* For use in a non-parallel context, store to the given nibble. */
+static void store_tag1(uint64_t ptr, uint8_t *mem, int tag)
+{
+ int ofs = extract32(ptr, LOG2_TAG_GRANULE, 1) * 4;
+ *mem = deposit32(*mem, ofs, 4, tag);
+}
+
+/* For use in a parallel context, atomically store to the given nibble. */
+static void store_tag1_parallel(uint64_t ptr, uint8_t *mem, int tag)
+{
+ int ofs = extract32(ptr, LOG2_TAG_GRANULE, 1) * 4;
+ uint8_t old = atomic_read(mem);
+
+ while (1) {
+ uint8_t new = deposit32(old, ofs, 4, tag);
+ uint8_t cmp = atomic_cmpxchg(mem, old, new);
+ if (likely(cmp == old)) {
+ return;
+ }
+ old = cmp;
+ }
+}
+
+typedef void stg_store1(uint64_t, uint8_t *, int);
+
+static inline void do_stg(CPUARMState *env, uint64_t ptr, uint64_t xt,
+ uintptr_t ra, stg_store1 store1)
+{
+ int mmu_idx = cpu_mmu_index(env, false);
+ uint8_t *mem;
+
+ check_tag_aligned(env, ptr, ra);
+
+ /* Trap if accessing an invalid page. */
+ mem = allocation_tag_mem(env, mmu_idx, ptr, MMU_DATA_STORE, TAG_GRANULE,
+ MMU_DATA_STORE, 1, ra);
+
+ /* Store if page supports tags. */
+ if (mem) {
+ store1(ptr, mem, allocation_tag_from_addr(xt));
+ }
+}
+
+void HELPER(stg)(CPUARMState *env, uint64_t ptr, uint64_t xt)
+{
+ do_stg(env, ptr, xt, GETPC(), store_tag1);
+}
+
+void HELPER(stg_parallel)(CPUARMState *env, uint64_t ptr, uint64_t xt)
+{
+ do_stg(env, ptr, xt, GETPC(), store_tag1_parallel);
+}
+
+void HELPER(stg_stub)(CPUARMState *env, uint64_t ptr)
+{
+ int mmu_idx = cpu_mmu_index(env, false);
+ uintptr_t ra = GETPC();
+
+ check_tag_aligned(env, ptr, ra);
+ probe_write(env, ptr, TAG_GRANULE, mmu_idx, ra);
+}
+
+static inline void do_st2g(CPUARMState *env, uint64_t ptr, uint64_t xt,
+ uintptr_t ra, stg_store1 store1)
+{
+ int mmu_idx = cpu_mmu_index(env, false);
+ int tag = allocation_tag_from_addr(xt);
+ uint8_t *mem1, *mem2;
+
+ check_tag_aligned(env, ptr, ra);
+
+ /*
+ * Trap if accessing invalid page(s).
+ * This takes priority over !allocation_tag_access_enabled.
+ */
+ if (ptr & TAG_GRANULE) {
+ /* Two stores unaligned mod TAG_GRANULE*2 -- modify two bytes. */
+ mem1 = allocation_tag_mem(env, mmu_idx, ptr, MMU_DATA_STORE,
+ TAG_GRANULE, MMU_DATA_STORE, 1, ra);
+ mem2 = allocation_tag_mem(env, mmu_idx, ptr + TAG_GRANULE,
+ MMU_DATA_STORE, TAG_GRANULE,
+ MMU_DATA_STORE, 1, ra);
+
+ /* Store if page(s) support tags. */
+ if (mem1) {
+ store1(TAG_GRANULE, mem1, tag);
+ }
+ if (mem2) {
+ store1(0, mem2, tag);
+ }
+ } else {
+ /* Two stores aligned mod TAG_GRANULE*2 -- modify one byte. */
+ mem1 = allocation_tag_mem(env, mmu_idx, ptr, MMU_DATA_STORE,
+ 2 * TAG_GRANULE, MMU_DATA_STORE, 1, ra);
+ if (mem1) {
+ tag |= tag << 4;
+ atomic_set(mem1, tag);
+ }
+ }
+}
+
+void HELPER(st2g)(CPUARMState *env, uint64_t ptr, uint64_t xt)
+{
+ do_st2g(env, ptr, xt, GETPC(), store_tag1);
+}
+
+void HELPER(st2g_parallel)(CPUARMState *env, uint64_t ptr, uint64_t xt)
+{
+ do_st2g(env, ptr, xt, GETPC(), store_tag1_parallel);
+}
+
+void HELPER(st2g_stub)(CPUARMState *env, uint64_t ptr)
+{
+ int mmu_idx = cpu_mmu_index(env, false);
+ uintptr_t ra = GETPC();
+ int in_page = -(ptr | TARGET_PAGE_MASK);
+
+ check_tag_aligned(env, ptr, ra);
+
+ if (likely(in_page >= 2 * TAG_GRANULE)) {
+ probe_write(env, ptr, 2 * TAG_GRANULE, mmu_idx, ra);
+ } else {
+ probe_write(env, ptr, TAG_GRANULE, mmu_idx, ra);
+ probe_write(env, ptr + TAG_GRANULE, TAG_GRANULE, mmu_idx, ra);
+ }
+}
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 45a95d8ea0..b99973b5b1 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -3743,6 +3743,153 @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
}
}
+/*
+ * Load/Store memory tags
+ *
+ * 31 30 29 24 22 21 12 10 5 0
+ * +-----+-------------+-----+---+------+-----+------+------+
+ * | 1 1 | 0 1 1 0 0 1 | op1 | 1 | imm9 | op2 | Rn | Rt |
+ * +-----+-------------+-----+---+------+-----+------+------+
+ */
+static void disas_ldst_tag(DisasContext *s, uint32_t insn)
+{
+ int rt = extract32(insn, 0, 5);
+ int rn = extract32(insn, 5, 5);
+ uint64_t offset = sextract64(insn, 12, 9) << LOG2_TAG_GRANULE;
+ int op2 = extract32(insn, 10, 2);
+ int op1 = extract32(insn, 22, 2);
+ bool is_load = false, is_pair = false, is_zero = false;
+ int index = 0;
+ TCGv_i64 addr, clean_addr, tcg_rt;
+
+ /* We checked insn bits [29:24,21] in the caller. */
+ if (extract32(insn, 30, 2) != 3) {
+ goto do_unallocated;
+ }
+
+ /*
+ * @index is a tri-state variable which has 3 states:
+ * < 0 : post-index, writeback
+ * = 0 : signed offset
+ * > 0 : pre-index, writeback
+ */
+ switch (op1) {
+ case 0: /* STG */
+ if (op2 != 0) {
+ /* STG */
+ index = op2 - 2;
+ break;
+ }
+ goto do_unallocated;
+ case 1:
+ if (op2 != 0) {
+ /* STZG */
+ is_zero = true;
+ index = op2 - 2;
+ } else {
+ /* LDG */
+ is_load = true;
+ }
+ break;
+ case 2:
+ if (op2 != 0) {
+ /* ST2G */
+ is_pair = true;
+ index = op2 - 2;
+ break;
+ }
+ goto do_unallocated;
+ case 3:
+ if (op2 != 0) {
+ /* STZ2G */
+ is_pair = is_zero = true;
+ index = op2 - 2;
+ break;
+ }
+ goto do_unallocated;
+
+ default:
+ do_unallocated:
+ unallocated_encoding(s);
+ return;
+ }
+
+ if (!dc_isar_feature(aa64_mte_insn_reg, s)) {
+ goto do_unallocated;
+ }
+
+ if (rn == 31) {
+ gen_check_sp_alignment(s);
+ }
+
+ addr = read_cpu_reg_sp(s, rn, true);
+ if (index >= 0) {
+ /* pre-index or signed offset */
+ tcg_gen_addi_i64(addr, addr, offset);
+ }
+
+ if (is_load) {
+ tcg_rt = cpu_reg(s, rt);
+ if (s->ata) {
+ gen_helper_ldg(tcg_rt, cpu_env, addr, tcg_rt);
+ } else {
+ clean_addr = clean_data_tbi(s, addr);
+ gen_probe_access(s, clean_addr, MMU_DATA_LOAD, MO_8);
+ gen_address_with_allocation_tag0(tcg_rt, addr);
+ }
+ } else {
+ tcg_rt = cpu_reg_sp(s, rt);
+ if (!s->ata) {
+ /*
+ * For STG and ST2G, we need to check alignment and probe memory.
+ * TODO: For STZG and STZ2G, we could rely on the stores below,
+ * at least for system mode; user-only won't enforce alignment.
+ */
+ if (is_pair) {
+ gen_helper_st2g_stub(cpu_env, addr);
+ } else {
+ gen_helper_stg_stub(cpu_env, addr);
+ }
+ } else if (tb_cflags(s->base.tb) & CF_PARALLEL) {
+ if (is_pair) {
+ gen_helper_st2g_parallel(cpu_env, addr, tcg_rt);
+ } else {
+ gen_helper_stg_parallel(cpu_env, addr, tcg_rt);
+ }
+ } else {
+ if (is_pair) {
+ gen_helper_st2g(cpu_env, addr, tcg_rt);
+ } else {
+ gen_helper_stg(cpu_env, addr, tcg_rt);
+ }
+ }
+ }
+
+ if (is_zero) {
+ TCGv_i64 clean_addr = clean_data_tbi(s, addr);
+ TCGv_i64 tcg_zero = tcg_const_i64(0);
+ int mem_index = get_mem_index(s);
+ int i, n = (1 + is_pair) << LOG2_TAG_GRANULE;
+
+ tcg_gen_qemu_st_i64(tcg_zero, clean_addr, mem_index,
+ MO_Q | MO_ALIGN_16);
+ for (i = 8; i < n; i += 8) {
+ tcg_gen_addi_i64(clean_addr, clean_addr, 8);
+ tcg_gen_qemu_st_i64(tcg_zero, clean_addr, mem_index, MO_Q);
+ }
+ tcg_temp_free_i64(tcg_zero);
+ }
+
+ if (index != 0) {
+ /* pre-index or post-index */
+ if (index < 0) {
+ /* post-index */
+ tcg_gen_addi_i64(addr, addr, offset);
+ }
+ tcg_gen_mov_i64(cpu_reg_sp(s, rn), addr);
+ }
+}
+
/* Loads and stores */
static void disas_ldst(DisasContext *s, uint32_t insn)
{
@@ -3767,13 +3914,14 @@ static void disas_ldst(DisasContext *s, uint32_t insn)
case 0x0d: /* AdvSIMD load/store single structure */
disas_ldst_single_struct(s, insn);
break;
- case 0x19: /* LDAPR/STLR (unscaled immediate) */
- if (extract32(insn, 10, 2) != 0 ||
- extract32(insn, 21, 1) != 0) {
+ case 0x19:
+ if (extract32(insn, 21, 1) != 0) {
+ disas_ldst_tag(s, insn);
+ } else if (extract32(insn, 10, 2) == 0) {
+ disas_ldst_ldapr_stlr(s, insn);
+ } else {
unallocated_encoding(s);
- break;
}
- disas_ldst_ldapr_stlr(s, insn);
break;
default:
unallocated_encoding(s);
--
2.20.1
^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH v6 16/42] target/arm: Implement the STGP instruction
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (14 preceding siblings ...)
2020-03-12 19:41 ` [PATCH v6 15/42] target/arm: Implement LDG, STG, ST2G instructions Richard Henderson
@ 2020-03-12 19:41 ` Richard Henderson
2020-03-12 19:41 ` [PATCH v6 17/42] target/arm: Restrict the values of DCZID.BS under TCG Richard Henderson
` (26 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:41 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v3: Handle atomicity, require pre-cleaned address.
v6: Fix constant offset shift, non-checked address, use pre-computed ata.
---
target/arm/translate-a64.c | 29 ++++++++++++++++++++++++++---
1 file changed, 26 insertions(+), 3 deletions(-)
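A note on the "constant offset shift" fix above: the STGP immediate
is scaled by the tag granule, not by the 8-byte register size.  A
sketch of the decode arithmetic (illustrative values):

    /*
     * STGP <Xt1>, <Xt2>, [<Xn|SP>, #imm]:
     *   imm7 == 1  ->  byte offset 16 (1 << LOG2_TAG_GRANULE)
     * whereas a 64-bit STP scales the same field by 8 (1 << size);
     * hence "offset <<= (set_tag ? LOG2_TAG_GRANULE : size)" below.
     */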
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index b99973b5b1..048140ddc0 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -2735,7 +2735,7 @@ static void disas_ld_lit(DisasContext *s, uint32_t insn)
* +-----+-------+---+---+-------+---+-------+-------+------+------+
*
* opc: LDP/STP/LDNP/STNP 00 -> 32 bit, 10 -> 64 bit
- * LDPSW 01
+ * LDPSW/STGP 01
* LDP/STP/LDNP/STNP (SIMD) 00 -> 32 bit, 01 -> 64 bit, 10 -> 128 bit
* V: 0 -> GPR, 1 -> Vector
* idx: 00 -> signed offset with non-temporal hint, 01 -> post-index,
@@ -2760,6 +2760,7 @@ static void disas_ldst_pair(DisasContext *s, uint32_t insn)
bool is_signed = false;
bool postindex = false;
bool wback = false;
+ bool set_tag = false;
TCGv_i64 clean_addr, dirty_addr;
@@ -2772,6 +2773,14 @@ static void disas_ldst_pair(DisasContext *s, uint32_t insn)
if (is_vector) {
size = 2 + opc;
+ } else if (opc == 1 && !is_load) {
+ /* STGP */
+ if (!dc_isar_feature(aa64_mte_insn_reg, s) || index == 0) {
+ unallocated_encoding(s);
+ return;
+ }
+ size = 3;
+ set_tag = true;
} else {
size = 2 + extract32(opc, 1, 1);
is_signed = extract32(opc, 0, 1);
@@ -2812,7 +2821,7 @@ static void disas_ldst_pair(DisasContext *s, uint32_t insn)
return;
}
- offset <<= size;
+ offset <<= (set_tag ? LOG2_TAG_GRANULE : size);
if (rn == 31) {
gen_check_sp_alignment(s);
@@ -2822,8 +2831,22 @@ static void disas_ldst_pair(DisasContext *s, uint32_t insn)
if (!postindex) {
tcg_gen_addi_i64(dirty_addr, dirty_addr, offset);
}
- clean_addr = clean_data_tbi(s, dirty_addr);
+ if (set_tag) {
+ if (!s->ata) {
+ /*
+ * TODO: We could rely on the stores below, at least for
+ * system mode, if we arrange to add MO_ALIGN_16.
+ */
+ gen_helper_stg_stub(cpu_env, dirty_addr);
+ } else if (tb_cflags(s->base.tb) & CF_PARALLEL) {
+ gen_helper_stg_parallel(cpu_env, dirty_addr, dirty_addr);
+ } else {
+ gen_helper_stg(cpu_env, dirty_addr, dirty_addr);
+ }
+ }
+
+ clean_addr = clean_data_tbi(s, dirty_addr);
if (is_vector) {
if (is_load) {
do_fp_ld(s, rt, clean_addr, size);
--
2.20.1
^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH v6 17/42] target/arm: Restrict the values of DCZID.BS under TCG
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (15 preceding siblings ...)
2020-03-12 19:41 ` [PATCH v6 16/42] target/arm: Implement the STGP instruction Richard Henderson
@ 2020-03-12 19:41 ` Richard Henderson
2020-03-12 19:41 ` [PATCH v6 18/42] target/arm: Simplify DC_ZVA Richard Henderson
` (25 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:41 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
We can simplify our DC_ZVA if we recognize that the largest block
size that we actually use in system mode is 64 bytes. Let us just
assert that it fits within TARGET_PAGE_SIZE.
For DC_GVA and STZGM, we want to be able to write whole bytes
of tag memory, so assert that the block size is >= 2 * TAG_GRANULE,
i.e. 32 bytes.
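For concreteness, the arithmetic behind these bounds, with the block
length being 4 << BS bytes per the DCZID encoding:

    /*
     * BS = 4  ->  4 << 4 == 64 bytes   (largest used in system mode)
     * BS = 9  ->  4 << 9 == 2KiB       (architectural maximum)
     * MTE needs blocklen >= 2 * TAG_GRANULE == 32, i.e. BS >= 3,
     * so that DC GVA and STZGM always write whole tag bytes.
     */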
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/cpu.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index cb3c3fe8c2..96c20317ad 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1807,6 +1807,30 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
}
#endif
+ if (tcg_enabled()) {
+ int dcz_blocklen = 4 << cpu->dcz_blocksize;
+
+ /*
+ * We only support DCZ blocklen that fits on one page.
+ *
+ * Architecturally this is always true. However TARGET_PAGE_SIZE
+ * is variable and, for compatibility with -machine virt-2.7,
+ * is only 1KiB, as an artifact of legacy ARMv5 subpage support.
+ * But even then, while the largest architectural DCZ blocklen
+ * is 2KiB, no cpu actually uses such a large blocklen.
+ */
+ assert(dcz_blocklen <= TARGET_PAGE_SIZE);
+
+ /*
+ * We only support DCZ blocksize >= 2*TAG_GRANULE, which is to say
+ * both nibbles of each byte storing tag data may be written at once.
+ * Since TAG_GRANULE is 16, this means that blocklen must be >= 32.
+ */
+ if (cpu_isar_feature(aa64_mte, cpu)) {
+ assert(dcz_blocklen >= 2 * TAG_GRANULE);
+ }
+ }
+
qemu_init_vcpu(cs);
cpu_reset(cs);
--
2.20.1
^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH v6 18/42] target/arm: Simplify DC_ZVA
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (16 preceding siblings ...)
2020-03-12 19:41 ` [PATCH v6 17/42] target/arm: Restrict the values of DCZID.BS under TCG Richard Henderson
@ 2020-03-12 19:41 ` Richard Henderson
2020-03-12 19:41 ` [PATCH v6 19/42] target/arm: Implement the LDGM, STGM, STZGM instructions Richard Henderson
` (24 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:41 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Now that we know that the operation is on a single page,
we need not loop over pages while probing.
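The single-page guarantee follows from the assertion added in the
previous patch; in outline:

    /*
     * vaddr = vaddr_in & -blocklen is blocklen-aligned, blocklen is
     * a power of two, and we asserted blocklen <= TARGET_PAGE_SIZE;
     * an aligned power-of-two region no larger than a page cannot
     * cross a page boundary, so one probe_write covers the block.
     */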
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper-a64.c | 94 +++++++++++------------------------------
1 file changed, 25 insertions(+), 69 deletions(-)
diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index bc0649a44a..60a04dc880 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -1119,85 +1119,41 @@ void HELPER(dc_zva)(CPUARMState *env, uint64_t vaddr_in)
* (which matches the usual QEMU behaviour of not implementing either
* alignment faults or any memory attribute handling).
*/
+ int blocklen = 4 << env_archcpu(env)->dcz_blocksize;
+ uint64_t vaddr = vaddr_in & -blocklen;
+ int mmu_idx = cpu_mmu_index(env, false);
+ void *mem;
- ARMCPU *cpu = env_archcpu(env);
- uint64_t blocklen = 4 << cpu->dcz_blocksize;
- uint64_t vaddr = vaddr_in & ~(blocklen - 1);
+ /*
+ * Trapless lookup. In addition to an actual invalid page, this may
+ * return NULL for I/O, watchpoints, clean pages, etc.
+ */
+ mem = tlb_vaddr_to_host(env, vaddr, MMU_DATA_STORE, mmu_idx);
#ifndef CONFIG_USER_ONLY
- {
+ if (unlikely(!mem)) {
+ uintptr_t ra = GETPC();
+
/*
- * Slightly awkwardly, QEMU's TARGET_PAGE_SIZE may be less than
- * the block size so we might have to do more than one TLB lookup.
- * We know that in fact for any v8 CPU the page size is at least 4K
- * and the block size must be 2K or less, but TARGET_PAGE_SIZE is only
- * 1K as an artefact of legacy v5 subpage support being present in the
- * same QEMU executable. So in practice the hostaddr[] array has
- * two entries, given the current setting of TARGET_PAGE_BITS_MIN.
+ * Trap if accessing an invalid page. DC_ZVA requires that we supply
+ * the original pointer for an invalid page. But watchpoints require
+ * that we probe the actual space. So do both.
*/
- int maxidx = DIV_ROUND_UP(blocklen, TARGET_PAGE_SIZE);
- void *hostaddr[DIV_ROUND_UP(2 * KiB, 1 << TARGET_PAGE_BITS_MIN)];
- int try, i;
- unsigned mmu_idx = cpu_mmu_index(env, false);
- TCGMemOpIdx oi = make_memop_idx(MO_UB, mmu_idx);
+ (void) probe_write(env, vaddr_in, 1, mmu_idx, ra);
+ mem = probe_write(env, vaddr, blocklen, mmu_idx, ra);
- assert(maxidx <= ARRAY_SIZE(hostaddr));
-
- for (try = 0; try < 2; try++) {
-
- for (i = 0; i < maxidx; i++) {
- hostaddr[i] = tlb_vaddr_to_host(env,
- vaddr + TARGET_PAGE_SIZE * i,
- 1, mmu_idx);
- if (!hostaddr[i]) {
- break;
- }
- }
- if (i == maxidx) {
- /*
- * If it's all in the TLB it's fair game for just writing to;
- * we know we don't need to update dirty status, etc.
- */
- for (i = 0; i < maxidx - 1; i++) {
- memset(hostaddr[i], 0, TARGET_PAGE_SIZE);
- }
- memset(hostaddr[i], 0, blocklen - (i * TARGET_PAGE_SIZE));
- return;
- }
+ if (unlikely(!mem)) {
/*
- * OK, try a store and see if we can populate the tlb. This
- * might cause an exception if the memory isn't writable,
- * in which case we will longjmp out of here. We must for
- * this purpose use the actual register value passed to us
- * so that we get the fault address right.
+ * The only remaining reason for mem == NULL is I/O.
+ * Just do a series of byte writes as the architecture demands.
*/
- helper_ret_stb_mmu(env, vaddr_in, 0, oi, GETPC());
- /* Now we can populate the other TLB entries, if any */
- for (i = 0; i < maxidx; i++) {
- uint64_t va = vaddr + TARGET_PAGE_SIZE * i;
- if (va != (vaddr_in & TARGET_PAGE_MASK)) {
- helper_ret_stb_mmu(env, va, 0, oi, GETPC());
- }
+ for (int i = 0; i < blocklen; i++) {
+ cpu_stb_mmuidx_ra(env, vaddr + i, 0, mmu_idx, ra);
}
- }
-
- /*
- * Slow path (probably attempt to do this to an I/O device or
- * similar, or clearing of a block of code we have translations
- * cached for). Just do a series of byte writes as the architecture
- * demands. It's not worth trying to use a cpu_physical_memory_map(),
- * memset(), unmap() sequence here because:
- * + we'd need to account for the blocksize being larger than a page
- * + the direct-RAM access case is almost always going to be dealt
- * with in the fastpath code above, so there's no speed benefit
- * + we would have to deal with the map returning NULL because the
- * bounce buffer was in use
- */
- for (i = 0; i < blocklen; i++) {
- helper_ret_stb_mmu(env, vaddr + i, 0, oi, GETPC());
+ return;
}
}
-#else
- memset(g2h(vaddr), 0, blocklen);
#endif
+
+ memset(mem, 0, blocklen);
}
--
2.20.1
^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH v6 19/42] target/arm: Implement the LDGM, STGM, STZGM instructions
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (17 preceding siblings ...)
2020-03-12 19:41 ` [PATCH v6 18/42] target/arm: Simplify DC_ZVA Richard Henderson
@ 2020-03-12 19:41 ` Richard Henderson
2020-03-12 19:41 ` [PATCH v6 20/42] target/arm: Implement the access tag cache flushes Richard Henderson
` (23 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:41 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v3: Require pre-cleaned addresses.
v6: Check full mte enabled. Reorg the helpers.
---
target/arm/helper-a64.h | 3 ++
target/arm/translate.h | 2 +
target/arm/mte_helper.c | 84 ++++++++++++++++++++++++++++++++++++++
target/arm/translate-a64.c | 74 +++++++++++++++++++++++++++++----
4 files changed, 154 insertions(+), 9 deletions(-)
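For orientation, the sizes implied by the constants in this patch
(GMID_EL1.BS is fixed at 6 under TCG):

    /*
     * LDGM_STGM_SIZE == 4 << 6 == 256 bytes of data, which is
     * 256 / TAG_GRANULE == 16 tags; at two tags per byte that is
     * 8 bytes of tag memory, hence the single 64-bit ldq_le_p or
     * stq_le_p of tags in the helpers below.
     */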
diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index 2fa61b86fa..7b628d100e 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -113,3 +113,6 @@ DEF_HELPER_FLAGS_2(stg_stub, TCG_CALL_NO_WG, void, env, i64)
DEF_HELPER_FLAGS_3(st2g, TCG_CALL_NO_WG, void, env, i64, i64)
DEF_HELPER_FLAGS_3(st2g_parallel, TCG_CALL_NO_WG, void, env, i64, i64)
DEF_HELPER_FLAGS_2(st2g_stub, TCG_CALL_NO_WG, void, env, i64)
+DEF_HELPER_FLAGS_2(ldgm, TCG_CALL_NO_WG, i64, env, i64)
+DEF_HELPER_FLAGS_3(stgm, TCG_CALL_NO_WG, void, env, i64, i64)
+DEF_HELPER_FLAGS_3(stzgm_tags, TCG_CALL_NO_WG, void, env, i64, i64)
diff --git a/target/arm/translate.h b/target/arm/translate.h
index e0f5d0be63..5552ee5a94 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -91,6 +91,8 @@ typedef struct DisasContext {
* < 0, set by the current instruction.
*/
int8_t btype;
+ /* A copy of cpu->dcz_blocksize. */
+ uint8_t dcz_blocksize;
/* True if this page is guarded. */
bool guarded_page;
/* Bottom two bits of XScale c15_cpar coprocessor access control reg */
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
index 7ec7930dfc..27d4b4536c 100644
--- a/target/arm/mte_helper.c
+++ b/target/arm/mte_helper.c
@@ -274,3 +274,87 @@ void HELPER(st2g_stub)(CPUARMState *env, uint64_t ptr)
probe_write(env, ptr + TAG_GRANULE, TAG_GRANULE, mmu_idx, ra);
}
}
+
+#define LDGM_STGM_SIZE (4 << GMID_EL1_BS)
+
+uint64_t HELPER(ldgm)(CPUARMState *env, uint64_t ptr)
+{
+ int mmu_idx = cpu_mmu_index(env, false);
+ uintptr_t ra = GETPC();
+ void *tag_mem;
+
+ ptr = QEMU_ALIGN_DOWN(ptr, LDGM_STGM_SIZE);
+
+ /* Trap if accessing an invalid page. */
+ tag_mem = allocation_tag_mem(env, mmu_idx, ptr, MMU_DATA_LOAD,
+ LDGM_STGM_SIZE, MMU_DATA_LOAD,
+ LDGM_STGM_SIZE / (2 * TAG_GRANULE), ra);
+
+ /* The tag is squashed to zero if the page does not support tags. */
+ if (!tag_mem) {
+ return 0;
+ }
+
+ QEMU_BUILD_BUG_ON(GMID_EL1_BS != 6);
+ /*
+ * We are loading 64-bits worth of tags. The ordering of elements
+ * within the word corresponds to a 64-bit little-endian operation.
+ */
+ return ldq_le_p(tag_mem);
+}
+
+void HELPER(stgm)(CPUARMState *env, uint64_t ptr, uint64_t val)
+{
+ int mmu_idx = cpu_mmu_index(env, false);
+ uintptr_t ra = GETPC();
+ void *tag_mem;
+
+ ptr = QEMU_ALIGN_DOWN(ptr, LDGM_STGM_SIZE);
+
+ /* Trap if accessing an invalid page. */
+ tag_mem = allocation_tag_mem(env, mmu_idx, ptr, MMU_DATA_STORE,
+ LDGM_STGM_SIZE, MMU_DATA_LOAD,
+ LDGM_STGM_SIZE / (2 * TAG_GRANULE), ra);
+
+ /*
+ * Tag store only happens if the page supports tags,
+ * and if the OS has enabled access to the tags.
+ */
+ if (!tag_mem) {
+ return;
+ }
+
+ QEMU_BUILD_BUG_ON(GMID_EL1_BS != 6);
+ /*
+ * We are storing 64-bits worth of tags. The ordering of elements
+ * within the word corresponds to a 64-bit little-endian operation.
+ */
+ stq_le_p(tag_mem, val);
+}
+
+void HELPER(stzgm_tags)(CPUARMState *env, uint64_t ptr, uint64_t val)
+{
+ uintptr_t ra = GETPC();
+ int mmu_idx = cpu_mmu_index(env, false);
+ int log2_dcz_bytes, log2_tag_bytes;
+ intptr_t dcz_bytes, tag_bytes;
+ uint8_t *mem;
+
+ /*
+ * In arm_cpu_realizefn, we assert that the dcz block length is at
+ * least 2 * TAG_GRANULE, i.e. 32 bytes (an unreasonably small dcz
+ * anyway), so that we can always access one complete tag byte here.
+ */
+ log2_dcz_bytes = env_archcpu(env)->dcz_blocksize + 2;
+ log2_tag_bytes = log2_dcz_bytes - (LOG2_TAG_GRANULE + 1);
+ dcz_bytes = (intptr_t)1 << log2_dcz_bytes;
+ tag_bytes = (intptr_t)1 << log2_tag_bytes;
+ ptr &= -dcz_bytes;
+
+ mem = allocation_tag_mem(env, mmu_idx, ptr, MMU_DATA_STORE, dcz_bytes,
+ MMU_DATA_STORE, tag_bytes, ra);
+ if (mem) {
+ int tag_pair = (val & 0xf) * 0x11;
+ memset(mem, tag_pair, tag_bytes);
+ }
+}
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 048140ddc0..f010aa2b58 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -3781,7 +3781,7 @@ static void disas_ldst_tag(DisasContext *s, uint32_t insn)
uint64_t offset = sextract64(insn, 12, 9) << LOG2_TAG_GRANULE;
int op2 = extract32(insn, 10, 2);
int op1 = extract32(insn, 22, 2);
- bool is_load = false, is_pair = false, is_zero = false;
+ bool is_load = false, is_pair = false, is_zero = false, is_mult = false;
int index = 0;
TCGv_i64 addr, clean_addr, tcg_rt;
@@ -3797,13 +3797,18 @@ static void disas_ldst_tag(DisasContext *s, uint32_t insn)
* > 0 : pre-index, writeback
*/
switch (op1) {
- case 0: /* STG */
+ case 0:
if (op2 != 0) {
/* STG */
index = op2 - 2;
- break;
+ } else {
+ /* STZGM */
+ if (s->current_el == 0 || offset != 0) {
+ goto do_unallocated;
+ }
+ is_mult = is_zero = true;
}
- goto do_unallocated;
+ break;
case 1:
if (op2 != 0) {
/* STZG */
@@ -3819,17 +3824,27 @@ static void disas_ldst_tag(DisasContext *s, uint32_t insn)
/* ST2G */
is_pair = true;
index = op2 - 2;
- break;
+ } else {
+ /* STGM */
+ if (s->current_el == 0 || offset != 0) {
+ goto do_unallocated;
+ }
+ is_mult = true;
}
- goto do_unallocated;
+ break;
case 3:
if (op2 != 0) {
/* STZ2G */
is_pair = is_zero = true;
index = op2 - 2;
- break;
+ } else {
+ /* LDGM */
+ if (s->current_el == 0 || offset != 0) {
+ goto do_unallocated;
+ }
+ is_mult = is_load = true;
}
- goto do_unallocated;
+ break;
default:
do_unallocated:
@@ -3837,7 +3852,9 @@ static void disas_ldst_tag(DisasContext *s, uint32_t insn)
return;
}
- if (!dc_isar_feature(aa64_mte_insn_reg, s)) {
+ if (is_mult
+ ? !dc_isar_feature(aa64_mte, s)
+ : !dc_isar_feature(aa64_mte_insn_reg, s)) {
goto do_unallocated;
}
@@ -3851,6 +3868,44 @@ static void disas_ldst_tag(DisasContext *s, uint32_t insn)
tcg_gen_addi_i64(addr, addr, offset);
}
+ if (is_mult) {
+ tcg_rt = cpu_reg(s, rt);
+
+ if (is_zero) {
+ int size = 4 << s->dcz_blocksize;
+
+ if (s->ata) {
+ gen_helper_stzgm_tags(cpu_env, addr, tcg_rt);
+ }
+ /*
+ * The non-tags portion of STZGM is mostly like DC_ZVA,
+ * except the alignment happens before the access.
+ */
+ clean_addr = clean_data_tbi(s, addr);
+ tcg_gen_andi_i64(clean_addr, clean_addr, -size);
+ gen_helper_dc_zva(cpu_env, clean_addr);
+ } else if (s->ata) {
+ if (is_load) {
+ gen_helper_ldgm(tcg_rt, cpu_env, addr);
+ } else {
+ gen_helper_stgm(cpu_env, addr, tcg_rt);
+ }
+ } else {
+ MMUAccessType acc = is_load ? MMU_DATA_LOAD : MMU_DATA_STORE;
+ int size = 4 << GMID_EL1_BS;
+
+ clean_addr = clean_data_tbi(s, addr);
+ tcg_gen_andi_i64(clean_addr, clean_addr, -size);
+ gen_probe_access(s, clean_addr, acc, GMID_EL1_BS + 2);
+
+ if (is_load) {
+ /* The result tags are zeros. */
+ tcg_gen_movi_i64(tcg_rt, 0);
+ }
+ }
+ return;
+ }
+
if (is_load) {
tcg_rt = cpu_reg(s, rt);
if (s->ata) {
@@ -14623,6 +14678,7 @@ static void aarch64_tr_init_disas_context(DisasContextBase *dcbase,
dc->vec_stride = 0;
dc->cp_regs = arm_cpu->cp_regs;
dc->features = env->features;
+ dc->dcz_blocksize = arm_cpu->dcz_blocksize;
/* Single step state. The code-generation logic here is:
* SS_ACTIVE == 0:
--
2.20.1
^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH v6 20/42] target/arm: Implement the access tag cache flushes
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (18 preceding siblings ...)
2020-03-12 19:41 ` [PATCH v6 19/42] target/arm: Implement the LDGM, STGM, STZGM instructions Richard Henderson
@ 2020-03-12 19:41 ` Richard Henderson
2020-03-12 19:41 ` [PATCH v6 21/42] target/arm: Move regime_el to internals.h Richard Henderson
` (22 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:41 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Like the regular data cache flushes, these are nops within qemu.
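For orientation, what one of these entries means in practice, with
DC IGVAC as a representative example:

    /*
     * DC IGVAC -- Invalidate Allocation Tags by VA to PoC.  QEMU
     * models no caches, so there is nothing to invalidate; the
     * ARM_CP_NOP entry exists only so that the instruction decodes
     * and the access check (aa64_cacheop_poc_access) is applied.
     */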
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v6: Split out and handle el0 cache ops properly.
---
target/arm/helper.c | 65 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 65 insertions(+)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 01d2fcf2e3..f9daeec1f4 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -6914,6 +6914,32 @@ static const ARMCPRegInfo mte_reginfo[] = {
.opc0 = 3, .opc1 = 3, .crn = 4, .crm = 2, .opc2 = 7,
.type = ARM_CP_NO_RAW,
.access = PL0_RW, .readfn = tco_read, .writefn = tco_write },
+ { .name = "DC_IGVAC", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 3,
+ .type = ARM_CP_NOP, .access = PL1_W,
+ .accessfn = aa64_cacheop_poc_access },
+ { .name = "DC_IGSW", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 4,
+ .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
+ { .name = "DC_IGDVAC", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 5,
+ .type = ARM_CP_NOP, .access = PL1_W,
+ .accessfn = aa64_cacheop_poc_access },
+ { .name = "DC_IGDSW", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 6, .opc2 = 6,
+ .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
+ { .name = "DC_CGSW", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 4,
+ .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
+ { .name = "DC_CGDSW", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 10, .opc2 = 6,
+ .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
+ { .name = "DC_CIGSW", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 4,
+ .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
+ { .name = "DC_CIGDSW", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 0, .crn = 7, .crm = 14, .opc2 = 6,
+ .type = ARM_CP_NOP, .access = PL1_W, .accessfn = access_tsw },
REGINFO_SENTINEL
};
@@ -6923,6 +6949,43 @@ static const ARMCPRegInfo mte_tco_ro_reginfo[] = {
.type = ARM_CP_CONST, .access = PL0_RW, },
REGINFO_SENTINEL
};
+
+static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
+ { .name = "DC_CGVAC", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 3,
+ .type = ARM_CP_NOP, .access = PL0_W,
+ .accessfn = aa64_cacheop_poc_access },
+ { .name = "DC_CGDVAC", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 10, .opc2 = 5,
+ .type = ARM_CP_NOP, .access = PL0_W,
+ .accessfn = aa64_cacheop_poc_access },
+ { .name = "DC_CGVAP", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 3,
+ .type = ARM_CP_NOP, .access = PL0_W,
+ .accessfn = aa64_cacheop_poc_access },
+ { .name = "DC_CGDVAP", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 5,
+ .type = ARM_CP_NOP, .access = PL0_W,
+ .accessfn = aa64_cacheop_poc_access },
+ { .name = "DC_CGVADP", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 3,
+ .type = ARM_CP_NOP, .access = PL0_W,
+ .accessfn = aa64_cacheop_poc_access },
+ { .name = "DC_CGDVADP", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 5,
+ .type = ARM_CP_NOP, .access = PL0_W,
+ .accessfn = aa64_cacheop_poc_access },
+ { .name = "DC_CIGVAC", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 14, .opc2 = 3,
+ .type = ARM_CP_NOP, .access = PL0_W,
+ .accessfn = aa64_cacheop_poc_access },
+ { .name = "DC_CIGDVAC", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 14, .opc2 = 5,
+ .type = ARM_CP_NOP, .access = PL0_W,
+ .accessfn = aa64_cacheop_poc_access },
+ REGINFO_SENTINEL
+};
+
#endif
static CPAccessResult access_predinv(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -8048,8 +8111,10 @@ void register_cp_regs_for_features(ARMCPU *cpu)
*/
if (cpu_isar_feature(aa64_mte, cpu)) {
define_arm_cp_regs(cpu, mte_reginfo);
+ define_arm_cp_regs(cpu, mte_el0_cacheop_reginfo);
} else if (cpu_isar_feature(aa64_mte_insn_reg, cpu)) {
define_arm_cp_regs(cpu, mte_tco_ro_reginfo);
+ define_arm_cp_regs(cpu, mte_el0_cacheop_reginfo);
}
#endif
--
2.20.1
^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH v6 21/42] target/arm: Move regime_el to internals.h
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (19 preceding siblings ...)
2020-03-12 19:41 ` [PATCH v6 20/42] target/arm: Implement the access tag cache flushes Richard Henderson
@ 2020-03-12 19:41 ` Richard Henderson
2020-03-12 19:41 ` [PATCH v6 22/42] target/arm: Move regime_tcr " Richard Henderson
` (21 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:41 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
We will shortly need this in mte_helper.c as well.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/internals.h | 36 ++++++++++++++++++++++++++++++++++++
target/arm/helper.c | 36 ------------------------------------
2 files changed, 36 insertions(+), 36 deletions(-)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 3190fc40fc..f091891312 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -912,6 +912,42 @@ static inline bool regime_is_pan(CPUARMState *env, ARMMMUIdx mmu_idx)
}
}
+/* Return the exception level which controls this address translation regime */
+static inline uint32_t regime_el(CPUARMState *env, ARMMMUIdx mmu_idx)
+{
+ switch (mmu_idx) {
+ case ARMMMUIdx_E20_0:
+ case ARMMMUIdx_E20_2:
+ case ARMMMUIdx_E20_2_PAN:
+ case ARMMMUIdx_Stage2:
+ case ARMMMUIdx_E2:
+ return 2;
+ case ARMMMUIdx_SE3:
+ return 3;
+ case ARMMMUIdx_SE10_0:
+ return arm_el_is_aa64(env, 3) ? 1 : 3;
+ case ARMMMUIdx_SE10_1:
+ case ARMMMUIdx_SE10_1_PAN:
+ case ARMMMUIdx_Stage1_E0:
+ case ARMMMUIdx_Stage1_E1:
+ case ARMMMUIdx_Stage1_E1_PAN:
+ case ARMMMUIdx_E10_0:
+ case ARMMMUIdx_E10_1:
+ case ARMMMUIdx_E10_1_PAN:
+ case ARMMMUIdx_MPrivNegPri:
+ case ARMMMUIdx_MUserNegPri:
+ case ARMMMUIdx_MPriv:
+ case ARMMMUIdx_MUser:
+ case ARMMMUIdx_MSPrivNegPri:
+ case ARMMMUIdx_MSUserNegPri:
+ case ARMMMUIdx_MSPriv:
+ case ARMMMUIdx_MSUser:
+ return 1;
+ default:
+ g_assert_not_reached();
+ }
+}
+
/* Return the FSR value for a debug exception (watchpoint, hardware
* breakpoint or BKPT insn) targeting the specified exception level.
*/
diff --git a/target/arm/helper.c b/target/arm/helper.c
index f9daeec1f4..2a50d4e9a2 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -9753,42 +9753,6 @@ void arm_cpu_do_interrupt(CPUState *cs)
}
#endif /* !CONFIG_USER_ONLY */
-/* Return the exception level which controls this address translation regime */
-static uint32_t regime_el(CPUARMState *env, ARMMMUIdx mmu_idx)
-{
- switch (mmu_idx) {
- case ARMMMUIdx_E20_0:
- case ARMMMUIdx_E20_2:
- case ARMMMUIdx_E20_2_PAN:
- case ARMMMUIdx_Stage2:
- case ARMMMUIdx_E2:
- return 2;
- case ARMMMUIdx_SE3:
- return 3;
- case ARMMMUIdx_SE10_0:
- return arm_el_is_aa64(env, 3) ? 1 : 3;
- case ARMMMUIdx_SE10_1:
- case ARMMMUIdx_SE10_1_PAN:
- case ARMMMUIdx_Stage1_E0:
- case ARMMMUIdx_Stage1_E1:
- case ARMMMUIdx_Stage1_E1_PAN:
- case ARMMMUIdx_E10_0:
- case ARMMMUIdx_E10_1:
- case ARMMMUIdx_E10_1_PAN:
- case ARMMMUIdx_MPrivNegPri:
- case ARMMMUIdx_MUserNegPri:
- case ARMMMUIdx_MPriv:
- case ARMMMUIdx_MUser:
- case ARMMMUIdx_MSPrivNegPri:
- case ARMMMUIdx_MSUserNegPri:
- case ARMMMUIdx_MSPriv:
- case ARMMMUIdx_MSUser:
- return 1;
- default:
- g_assert_not_reached();
- }
-}
-
uint64_t arm_sctlr(CPUARMState *env, int el)
{
/* Only EL0 needs to be adjusted for EL1&0 or EL2&0. */
--
2.20.1
^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH v6 22/42] target/arm: Move regime_tcr to internals.h
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (20 preceding siblings ...)
2020-03-12 19:41 ` [PATCH v6 21/42] target/arm: Move regime_el to internals.h Richard Henderson
@ 2020-03-12 19:41 ` Richard Henderson
2020-03-12 19:42 ` [PATCH v6 23/42] target/arm: Add gen_mte_check1 Richard Henderson
` (20 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:41 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
We will shortly need this in mte_helper.c as well.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/internals.h | 9 +++++++++
target/arm/helper.c | 9 ---------
2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index f091891312..56fb07f2b6 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -948,6 +948,15 @@ static inline uint32_t regime_el(CPUARMState *env, ARMMMUIdx mmu_idx)
}
}
+/* Return the TCR controlling this translation regime */
+static inline TCR *regime_tcr(CPUARMState *env, ARMMMUIdx mmu_idx)
+{
+ if (mmu_idx == ARMMMUIdx_Stage2) {
+ return &env->cp15.vtcr_el2;
+ }
+ return &env->cp15.tcr_el[regime_el(env, mmu_idx)];
+}
+
/* Return the FSR value for a debug exception (watchpoint, hardware
* breakpoint or BKPT insn) targeting the specified exception level.
*/
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 2a50d4e9a2..e4b4366af7 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -9835,15 +9835,6 @@ static inline uint64_t regime_ttbr(CPUARMState *env, ARMMMUIdx mmu_idx,
#endif /* !CONFIG_USER_ONLY */
-/* Return the TCR controlling this translation regime */
-static inline TCR *regime_tcr(CPUARMState *env, ARMMMUIdx mmu_idx)
-{
- if (mmu_idx == ARMMMUIdx_Stage2) {
- return &env->cp15.vtcr_el2;
- }
- return &env->cp15.tcr_el[regime_el(env, mmu_idx)];
-}
-
/* Convert a possible stage1+2 MMU index into the appropriate
* stage 1 MMU index
*/
--
2.20.1
^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH v6 23/42] target/arm: Add gen_mte_check1
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (21 preceding siblings ...)
2020-03-12 19:41 ` [PATCH v6 22/42] target/arm: Move regime_tcr " Richard Henderson
@ 2020-03-12 19:42 ` Richard Henderson
2020-03-12 19:42 ` [PATCH v6 24/42] target/arm: Add gen_mte_checkN Richard Henderson
` (19 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Replace existing uses of clean_data_tbi in translate-a64.c that
perform a single logical memory access. Leave the helper blank
for now to reduce the patch size.
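The 32-bit descriptor packed by gen_mte_check1_mmuidx has this
layout, per the FIELD() definitions added to internals.h below:

    /*
     * MTEDESC: MIDX  [3:0]    mmu index of the access
     *          TBI   [5:4]    top-byte-ignore state
     *          TCMA  [7:6]    tag-check-match-all state
     *          WRITE [8]      the access is a store
     *          ESIZE [13:9]   element size in bytes
     *          TSIZE [23:14]  total size (mte_checkN only)
     */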
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper-a64.h | 1 +
target/arm/internals.h | 8 ++++
target/arm/translate-a64.h | 2 +
target/arm/mte_helper.c | 8 ++++
target/arm/translate-a64.c | 96 ++++++++++++++++++++++++++++----------
5 files changed, 91 insertions(+), 24 deletions(-)
diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index 7b628d100e..2faa49d0a3 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -104,6 +104,7 @@ DEF_HELPER_FLAGS_3(autdb, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_2(xpaci, TCG_CALL_NO_RWG_SE, i64, env, i64)
DEF_HELPER_FLAGS_2(xpacd, TCG_CALL_NO_RWG_SE, i64, env, i64)
+DEF_HELPER_FLAGS_3(mte_check1, TCG_CALL_NO_WG, i64, env, i32, i64)
DEF_HELPER_FLAGS_3(irg, TCG_CALL_NO_RWG, i64, env, i64, i64)
DEF_HELPER_FLAGS_4(addsubg, TCG_CALL_NO_RWG_SE, i64, env, i64, s32, i32)
DEF_HELPER_FLAGS_3(ldg, TCG_CALL_NO_WG, i64, env, i64, i64)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 56fb07f2b6..a993e8ca0a 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1306,6 +1306,14 @@ void arm_log_exception(int idx);
#define LOG2_TAG_GRANULE 4
#define TAG_GRANULE (1 << LOG2_TAG_GRANULE)
+/* Bits within a descriptor passed to the helper_mte_check* functions. */
+FIELD(MTEDESC, MIDX, 0, 4)
+FIELD(MTEDESC, TBI, 4, 2)
+FIELD(MTEDESC, TCMA, 6, 2)
+FIELD(MTEDESC, WRITE, 8, 1)
+FIELD(MTEDESC, ESIZE, 9, 5)
+FIELD(MTEDESC, TSIZE, 14, 10) /* mte_checkN only */
+
static inline int allocation_tag_from_addr(uint64_t ptr)
{
return extract64(ptr, 56, 4);
diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
index 4c2c91ae1b..cc77e6ce8b 100644
--- a/target/arm/translate-a64.h
+++ b/target/arm/translate-a64.h
@@ -40,6 +40,8 @@ TCGv_ptr get_fpstatus_ptr(bool);
bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
unsigned int imms, unsigned int immr);
bool sve_access_check(DisasContext *s);
+TCGv_i64 gen_mte_check1(DisasContext *s, TCGv_i64 addr, bool is_write,
+ bool tag_checked, int log2_size);
/* We should have at some point before trying to access an FP register
* done the necessary access check, so assert that
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
index 27d4b4536c..ec12768dfc 100644
--- a/target/arm/mte_helper.c
+++ b/target/arm/mte_helper.c
@@ -358,3 +358,11 @@ void HELPER(stzgm_tags)(CPUARMState *env, uint64_t ptr, uint64_t val)
memset(mem, tag_pair, tag_bytes);
}
}
+
+/*
+ * Perform an MTE checked access for a single logical or atomic access.
+ */
+uint64_t HELPER(mte_check1)(CPUARMState *env, uint32_t desc, uint64_t ptr)
+{
+ return ptr;
+}
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index f010aa2b58..683a8a3dba 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -221,20 +221,20 @@ static void gen_a64_set_pc(DisasContext *s, TCGv_i64 src)
}
/*
- * Return a "clean" address for ADDR according to TBID.
- * This is always a fresh temporary, as we need to be able to
- * increment this independently of a dirty write-back address.
+ * Handle MTE and/or TBI.
+ *
+ * For TBI, ideally, we would do nothing. Proper behaviour on fault is
+ * for the tag to be present in the FAR_ELx register. But for user-only
+ * mode, we do not have a TLB with which to implement this, so we must
+ * remove the top byte now.
+ *
+ * Always return a fresh temporary that we can increment independently
+ * of the write-back address.
*/
+
static TCGv_i64 clean_data_tbi(DisasContext *s, TCGv_i64 addr)
{
TCGv_i64 clean = new_tmp_a64(s);
- /*
- * In order to get the correct value in the FAR_ELx register,
- * we must present the memory subsystem with the "dirty" address
- * including the TBI. In system mode we can make this work via
- * the TLB, dropping the TBI during translation. But for user-only
- * mode we don't have that option, and must remove the top byte now.
- */
#ifdef CONFIG_USER_ONLY
gen_top_byte_ignore(s, clean, addr, s->tbid);
#else
@@ -262,6 +262,45 @@ static void gen_probe_access(DisasContext *s, TCGv_i64 ptr,
tcg_temp_free_i32(t_size);
}
+/*
+ * For MTE, check a single logical or atomic access. This probes a single
+ * address, the exact one specified. The size and alignment of the access
+ * are not relevant to MTE, per se, but watchpoints do require the size,
+ * and we want to recognize those before making any other changes to state.
+ */
+static TCGv_i64 gen_mte_check1_mmuidx(DisasContext *s, TCGv_i64 addr,
+ bool is_write, bool tag_checked,
+ int log2_size, bool is_unpriv,
+ int core_idx)
+{
+ if (tag_checked && s->mte_active[is_unpriv]) {
+ TCGv_i32 tcg_desc;
+ TCGv_i64 ret;
+ int desc = 0;
+
+ desc = FIELD_DP32(desc, MTEDESC, MIDX, core_idx);
+ desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
+ desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
+ desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
+ desc = FIELD_DP32(desc, MTEDESC, ESIZE, 1 << log2_size);
+ tcg_desc = tcg_const_i32(desc);
+
+ ret = new_tmp_a64(s);
+ gen_helper_mte_check1(ret, cpu_env, tcg_desc, addr);
+ tcg_temp_free_i32(tcg_desc);
+
+ return ret;
+ }
+ return clean_data_tbi(s, addr);
+}
+
+TCGv_i64 gen_mte_check1(DisasContext *s, TCGv_i64 addr, bool is_write,
+ bool tag_checked, int log2_size)
+{
+ return gen_mte_check1_mmuidx(s, addr, is_write, tag_checked, log2_size,
+ false, get_mem_index(s));
+}
+
typedef struct DisasCompare64 {
TCGCond cond;
TCGv_i64 value;
@@ -2412,7 +2451,7 @@ static void gen_compare_and_swap(DisasContext *s, int rs, int rt,
if (rn == 31) {
gen_check_sp_alignment(s);
}
- clean_addr = clean_data_tbi(s, cpu_reg_sp(s, rn));
+ clean_addr = gen_mte_check1(s, cpu_reg_sp(s, rn), true, rn != 31, size);
tcg_gen_atomic_cmpxchg_i64(tcg_rs, clean_addr, tcg_rs, tcg_rt, memidx,
size | MO_ALIGN | s->be_data);
}
@@ -2430,7 +2469,9 @@ static void gen_compare_and_swap_pair(DisasContext *s, int rs, int rt,
if (rn == 31) {
gen_check_sp_alignment(s);
}
- clean_addr = clean_data_tbi(s, cpu_reg_sp(s, rn));
+
+ /* This is a single atomic access, despite the "pair". */
+ clean_addr = gen_mte_check1(s, cpu_reg_sp(s, rn), true, rn != 31, size + 1);
if (size == 2) {
TCGv_i64 cmp = tcg_temp_new_i64();
@@ -2555,7 +2596,7 @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
if (is_lasr) {
tcg_gen_mb(TCG_MO_ALL | TCG_BAR_STRL);
}
- clean_addr = clean_data_tbi(s, cpu_reg_sp(s, rn));
+ clean_addr = gen_mte_check1(s, cpu_reg_sp(s, rn), true, rn != 31, size);
gen_store_exclusive(s, rs, rt, rt2, clean_addr, size, false);
return;
@@ -2564,7 +2605,7 @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
if (rn == 31) {
gen_check_sp_alignment(s);
}
- clean_addr = clean_data_tbi(s, cpu_reg_sp(s, rn));
+ clean_addr = gen_mte_check1(s, cpu_reg_sp(s, rn), false, rn != 31, size);
s->is_ldex = true;
gen_load_exclusive(s, rt, rt2, clean_addr, size, false);
if (is_lasr) {
@@ -2584,7 +2625,7 @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
gen_check_sp_alignment(s);
}
tcg_gen_mb(TCG_MO_ALL | TCG_BAR_STRL);
- clean_addr = clean_data_tbi(s, cpu_reg_sp(s, rn));
+ clean_addr = gen_mte_check1(s, cpu_reg_sp(s, rn), true, rn != 31, size);
do_gpr_st(s, cpu_reg(s, rt), clean_addr, size, true, rt,
disas_ldst_compute_iss_sf(size, false, 0), is_lasr);
return;
@@ -2600,7 +2641,7 @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
if (rn == 31) {
gen_check_sp_alignment(s);
}
- clean_addr = clean_data_tbi(s, cpu_reg_sp(s, rn));
+ clean_addr = gen_mte_check1(s, cpu_reg_sp(s, rn), false, rn != 31, size);
do_gpr_ld(s, cpu_reg(s, rt), clean_addr, size, false, false, true, rt,
disas_ldst_compute_iss_sf(size, false, 0), is_lasr);
tcg_gen_mb(TCG_MO_ALL | TCG_BAR_LDAQ);
@@ -2614,7 +2655,8 @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
if (is_lasr) {
tcg_gen_mb(TCG_MO_ALL | TCG_BAR_STRL);
}
- clean_addr = clean_data_tbi(s, cpu_reg_sp(s, rn));
+ clean_addr = gen_mte_check1(s, cpu_reg_sp(s, rn),
+ true, rn != 31, size);
gen_store_exclusive(s, rs, rt, rt2, clean_addr, size, true);
return;
}
@@ -2632,7 +2674,8 @@ static void disas_ldst_excl(DisasContext *s, uint32_t insn)
if (rn == 31) {
gen_check_sp_alignment(s);
}
- clean_addr = clean_data_tbi(s, cpu_reg_sp(s, rn));
+ clean_addr = gen_mte_check1(s, cpu_reg_sp(s, rn),
+ false, rn != 31, size);
s->is_ldex = true;
gen_load_exclusive(s, rt, rt2, clean_addr, size, true);
if (is_lasr) {
@@ -2926,6 +2969,7 @@ static void disas_ldst_reg_imm9(DisasContext *s, uint32_t insn,
bool iss_valid = !is_vector;
bool post_index;
bool writeback;
+ int memidx;
TCGv_i64 clean_addr, dirty_addr;
@@ -2983,7 +3027,11 @@ static void disas_ldst_reg_imm9(DisasContext *s, uint32_t insn,
if (!post_index) {
tcg_gen_addi_i64(dirty_addr, dirty_addr, imm9);
}
- clean_addr = clean_data_tbi(s, dirty_addr);
+
+ memidx = is_unpriv ? get_a64_user_mem_index(s) : get_mem_index(s);
+ clean_addr = gen_mte_check1_mmuidx(s, dirty_addr, is_store,
+ writeback || rn != 31,
+ size, is_unpriv, memidx);
if (is_vector) {
if (is_store) {
@@ -2993,7 +3041,6 @@ static void disas_ldst_reg_imm9(DisasContext *s, uint32_t insn,
}
} else {
TCGv_i64 tcg_rt = cpu_reg(s, rt);
- int memidx = is_unpriv ? get_a64_user_mem_index(s) : get_mem_index(s);
bool iss_sf = disas_ldst_compute_iss_sf(size, is_signed, opc);
if (is_store) {
@@ -3090,7 +3137,7 @@ static void disas_ldst_reg_roffset(DisasContext *s, uint32_t insn,
ext_and_shift_reg(tcg_rm, tcg_rm, opt, shift ? size : 0);
tcg_gen_add_i64(dirty_addr, dirty_addr, tcg_rm);
- clean_addr = clean_data_tbi(s, dirty_addr);
+ clean_addr = gen_mte_check1(s, dirty_addr, is_store, true, size);
if (is_vector) {
if (is_store) {
@@ -3175,7 +3222,7 @@ static void disas_ldst_reg_unsigned_imm(DisasContext *s, uint32_t insn,
dirty_addr = read_cpu_reg_sp(s, rn, 1);
offset = imm12 << size;
tcg_gen_addi_i64(dirty_addr, dirty_addr, offset);
- clean_addr = clean_data_tbi(s, dirty_addr);
+ clean_addr = gen_mte_check1(s, dirty_addr, is_store, rn != 31, size);
if (is_vector) {
if (is_store) {
@@ -3268,7 +3315,7 @@ static void disas_ldst_atomic(DisasContext *s, uint32_t insn,
if (rn == 31) {
gen_check_sp_alignment(s);
}
- clean_addr = clean_data_tbi(s, cpu_reg_sp(s, rn));
+ clean_addr = gen_mte_check1(s, cpu_reg_sp(s, rn), false, rn != 31, size);
if (o3_opc == 014) {
/*
@@ -3345,7 +3392,8 @@ static void disas_ldst_pac(DisasContext *s, uint32_t insn,
tcg_gen_addi_i64(dirty_addr, dirty_addr, offset);
/* Note that "clean" and "dirty" here refer to TBI not PAC. */
- clean_addr = clean_data_tbi(s, dirty_addr);
+ clean_addr = gen_mte_check1(s, dirty_addr, false,
+ is_wback || rn != 31, size);
tcg_rt = cpu_reg(s, rt);
do_gpr_ld(s, tcg_rt, clean_addr, size, /* is_signed */ false,
--
2.20.1
^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH v6 24/42] target/arm: Add gen_mte_checkN
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (22 preceding siblings ...)
2020-03-12 19:42 ` [PATCH v6 23/42] target/arm: Add gen_mte_check1 Richard Henderson
@ 2020-03-12 19:42 ` Richard Henderson
2020-03-12 19:42 ` [PATCH v6 25/42] target/arm: Implement helper_mte_check1 Richard Henderson
` (18 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Replace existing uses of clean_data_tbi in translate-a64.c that
perform multiple logical memory accesses. Leave the helper blank
for now to reduce the patch size.
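As a usage sketch (hypothetical values, mirroring the translate-a64.c
callers below), a 4-element load of 8-byte registers is checked once
over the whole 32-byte range rather than once per element:

    /* hypothetical call, not part of the patch */
    clean_addr = gen_mte_checkN(s, addr, false /* is_write */,
                                true /* tag_checked */,
                                3 /* log2_esize */, 32 /* total_size */);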
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper-a64.h | 1 +
target/arm/translate-a64.h | 2 ++
target/arm/mte_helper.c | 8 +++++
target/arm/translate-a64.c | 71 +++++++++++++++++++++++++++++---------
4 files changed, 66 insertions(+), 16 deletions(-)
diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index 2faa49d0a3..005af678c7 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -105,6 +105,7 @@ DEF_HELPER_FLAGS_2(xpaci, TCG_CALL_NO_RWG_SE, i64, env, i64)
DEF_HELPER_FLAGS_2(xpacd, TCG_CALL_NO_RWG_SE, i64, env, i64)
DEF_HELPER_FLAGS_3(mte_check1, TCG_CALL_NO_WG, i64, env, i32, i64)
+DEF_HELPER_FLAGS_3(mte_checkN, TCG_CALL_NO_WG, i64, env, i32, i64)
DEF_HELPER_FLAGS_3(irg, TCG_CALL_NO_RWG, i64, env, i64, i64)
DEF_HELPER_FLAGS_4(addsubg, TCG_CALL_NO_RWG_SE, i64, env, i64, s32, i32)
DEF_HELPER_FLAGS_3(ldg, TCG_CALL_NO_WG, i64, env, i64, i64)
diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
index cc77e6ce8b..452b4764d0 100644
--- a/target/arm/translate-a64.h
+++ b/target/arm/translate-a64.h
@@ -42,6 +42,8 @@ bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
bool sve_access_check(DisasContext *s);
TCGv_i64 gen_mte_check1(DisasContext *s, TCGv_i64 addr, bool is_write,
bool tag_checked, int log2_size);
+TCGv_i64 gen_mte_checkN(DisasContext *s, TCGv_i64 addr, bool is_write,
+ bool tag_checked, int log2_esize, int total_size);
/* We should have at some point before trying to access an FP register
* done the necessary access check, so assert that
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
index ec12768dfc..907a12b366 100644
--- a/target/arm/mte_helper.c
+++ b/target/arm/mte_helper.c
@@ -366,3 +366,11 @@ uint64_t HELPER(mte_check1)(CPUARMState *env, uint32_t desc, uint64_t ptr)
{
return ptr;
}
+
+/*
+ * Perform an MTE checked access for multiple logical accesses.
+ */
+uint64_t HELPER(mte_checkN)(CPUARMState *env, uint32_t desc, uint64_t ptr)
+{
+ return ptr;
+}
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 683a8a3dba..c6187ccd60 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -301,6 +301,34 @@ TCGv_i64 gen_mte_check1(DisasContext *s, TCGv_i64 addr, bool is_write,
false, get_mem_index(s));
}
+/*
+ * For MTE, check multiple logical sequential accesses.
+ */
+TCGv_i64 gen_mte_checkN(DisasContext *s, TCGv_i64 addr, bool is_write,
+ bool tag_checked, int log2_esize, int total_size)
+{
+ if (tag_checked && s->mte_active[0] && total_size != (1 << log2_esize)) {
+ TCGv_i32 tcg_desc;
+ TCGv_i64 ret;
+ int desc = 0;
+
+ desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
+ desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
+ desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
+ desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
+ desc = FIELD_DP32(desc, MTEDESC, ESIZE, 1 << log2_esize);
+ desc = FIELD_DP32(desc, MTEDESC, TSIZE, total_size);
+ tcg_desc = tcg_const_i32(desc);
+
+ ret = new_tmp_a64(s);
+ gen_helper_mte_checkN(ret, cpu_env, tcg_desc, addr);
+ tcg_temp_free_i32(tcg_desc);
+
+ return ret;
+ }
+ return gen_mte_check1(s, addr, is_write, tag_checked, log2_esize);
+}
+
typedef struct DisasCompare64 {
TCGCond cond;
TCGv_i64 value;
@@ -2889,7 +2917,10 @@ static void disas_ldst_pair(DisasContext *s, uint32_t insn)
}
}
- clean_addr = clean_data_tbi(s, dirty_addr);
+ clean_addr = gen_mte_checkN(s, dirty_addr, !is_load,
+ (wback || rn != 31) && !set_tag,
+ size, 2 << size);
+
if (is_vector) {
if (is_load) {
do_fp_ld(s, rt, clean_addr, size);
@@ -3555,7 +3586,7 @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
TCGv_i64 clean_addr, tcg_rn, tcg_ebytes;
MemOp endian = s->be_data;
- int ebytes; /* bytes per element */
+ int total; /* total bytes */
int elements; /* elements per vector */
int rpt; /* num iterations */
int selem; /* structure elements */
@@ -3625,19 +3656,26 @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
endian = MO_LE;
}
- /* Consecutive little-endian elements from a single register
+ total = rpt * selem * (is_q ? 16 : 8);
+ tcg_rn = cpu_reg_sp(s, rn);
+
+ /*
+ * Issue the MTE check vs the logical repeat count, before we
+ * promote consecutive little-endian elements below.
+ */
+ clean_addr = gen_mte_checkN(s, tcg_rn, is_store, is_postidx || rn != 31,
+ size, total);
+
+ /*
+ * Consecutive little-endian elements from a single register
* can be promoted to a larger little-endian operation.
*/
if (selem == 1 && endian == MO_LE) {
size = 3;
}
- ebytes = 1 << size;
- elements = (is_q ? 16 : 8) / ebytes;
-
- tcg_rn = cpu_reg_sp(s, rn);
- clean_addr = clean_data_tbi(s, tcg_rn);
- tcg_ebytes = tcg_const_i64(ebytes);
+ elements = (is_q ? 16 : 8) >> size;
+ tcg_ebytes = tcg_const_i64(1 << size);
for (r = 0; r < rpt; r++) {
int e;
for (e = 0; e < elements; e++) {
@@ -3671,7 +3709,7 @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
if (is_postidx) {
if (rm == 31) {
- tcg_gen_addi_i64(tcg_rn, tcg_rn, rpt * elements * selem * ebytes);
+ tcg_gen_addi_i64(tcg_rn, tcg_rn, total);
} else {
tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
}
@@ -3717,7 +3755,7 @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
int selem = (extract32(opc, 0, 1) << 1 | R) + 1;
bool replicate = false;
int index = is_q << 3 | S << 2 | size;
- int ebytes, xs;
+ int xs, total;
TCGv_i64 clean_addr, tcg_rn, tcg_ebytes;
if (extract32(insn, 31, 1)) {
@@ -3771,16 +3809,17 @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
return;
}
- ebytes = 1 << scale;
-
if (rn == 31) {
gen_check_sp_alignment(s);
}
+ total = selem << scale;
tcg_rn = cpu_reg_sp(s, rn);
- clean_addr = clean_data_tbi(s, tcg_rn);
- tcg_ebytes = tcg_const_i64(ebytes);
+ clean_addr = gen_mte_checkN(s, tcg_rn, !is_load, is_postidx || rn != 31,
+ scale, total);
+
+ tcg_ebytes = tcg_const_i64(1 << scale);
for (xs = 0; xs < selem; xs++) {
if (replicate) {
/* Load and replicate to all elements */
@@ -3807,7 +3846,7 @@ static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
if (is_postidx) {
if (rm == 31) {
- tcg_gen_addi_i64(tcg_rn, tcg_rn, selem * ebytes);
+ tcg_gen_addi_i64(tcg_rn, tcg_rn, total);
} else {
tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
}
--
2.20.1
* [PATCH v6 25/42] target/arm: Implement helper_mte_check1
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (23 preceding siblings ...)
2020-03-12 19:42 ` [PATCH v6 24/42] target/arm: Add gen_mte_checkN Richard Henderson
@ 2020-03-12 19:42 ` Richard Henderson
2020-03-12 19:42 ` [PATCH v6 26/42] target/arm: Implement helper_mte_checkN Richard Henderson
` (17 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Fill out the stub that was added earlier.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
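For review convenience: the folded test in tcma_check() can be verified
exhaustively. The harness below is illustration only, not part of the
patch; it checks that ((ptr_tag + bit55) & 0xf) == 0 holds exactly when
ptr<59:55> is all zeros or all ones.

    #include <assert.h>

    int main(void)
    {
        for (int bit55 = 0; bit55 < 2; bit55++) {
            for (int ptr_tag = 0; ptr_tag < 16; ptr_tag++) {
                int folded = ((ptr_tag + bit55) & 0xf) == 0;
                int direct = (bit55 == 0 && ptr_tag == 0x0) ||
                             (bit55 == 1 && ptr_tag == 0xf);
                assert(folded == direct);
            }
        }
        return 0;
    }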
target/arm/internals.h | 47 +++++++++++++++
target/arm/mte_helper.c | 126 +++++++++++++++++++++++++++++++++++++++-
2 files changed, 172 insertions(+), 1 deletion(-)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index a993e8ca0a..8bbaf9b453 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1314,6 +1314,9 @@ FIELD(MTEDESC, WRITE, 8, 1)
FIELD(MTEDESC, ESIZE, 9, 5)
FIELD(MTEDESC, TSIZE, 14, 10) /* mte_checkN only */
+bool mte_probe1(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra);
+uint64_t mte_check1(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra);
+
static inline int allocation_tag_from_addr(uint64_t ptr)
{
return extract64(ptr, 56, 4);
@@ -1324,4 +1327,48 @@ static inline uint64_t address_with_allocation_tag(uint64_t ptr, int rtag)
return deposit64(ptr, 56, 4, rtag);
}
+/* Return true if tbi bits mean that the access is checked. */
+static inline bool tbi_check(uint32_t desc, int bit55)
+{
+ return (desc >> (R_MTEDESC_TBI_SHIFT + bit55)) & 1;
+}
+
+/* Return true if tcma bits mean that the access is unchecked. */
+static inline bool tcma_check(uint32_t desc, int bit55, int ptr_tag)
+{
+ /*
+ * We had extracted bit55 and ptr_tag for other reasons, so fold
+ * (ptr<59:55> == 00000 || ptr<59:55> == 11111) into a single test.
+ */
+ bool match = ((ptr_tag + bit55) & 0xf) == 0;
+ bool tcma = (desc >> (R_MTEDESC_TCMA_SHIFT + bit55)) & 1;
+ return tcma && match;
+}
+
+/*
+ * For TBI, ideally, we would do nothing. Proper behaviour on fault is
+ * for the tag to be present in the FAR_ELx register. But for user-only
+ * mode, we do not have a TLB with which to implement this, so we must
+ * remove the top byte.
+ */
+static inline uint64_t useronly_clean_ptr(uint64_t ptr)
+{
+ /* TBI is known to be enabled. */
+#ifdef CONFIG_USER_ONLY
+ ptr = sextract64(ptr, 0, 56);
+#endif
+ return ptr;
+}
+
+static inline uint64_t useronly_maybe_clean_ptr(uint32_t desc, uint64_t ptr)
+{
+#ifdef CONFIG_USER_ONLY
+ int64_t clean_ptr = sextract64(ptr, 0, 56);
+ if (tbi_check(desc, clean_ptr < 0)) {
+ ptr = clean_ptr;
+ }
+#endif
+ return ptr;
+}
+
#endif
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
index 907a12b366..7a87574b35 100644
--- a/target/arm/mte_helper.c
+++ b/target/arm/mte_helper.c
@@ -359,12 +359,136 @@ void HELPER(stzgm_tags)(CPUARMState *env, uint64_t ptr, uint64_t val)
}
}
+/* Record a tag check failure. */
+static void mte_check_fail(CPUARMState *env, int mmu_idx,
+ uint64_t dirty_ptr, uintptr_t ra)
+{
+ ARMMMUIdx arm_mmu_idx = core_to_aa64_mmu_idx(mmu_idx);
+ int el, reg_el, tcf, select;
+ uint64_t sctlr;
+
+ reg_el = regime_el(env, arm_mmu_idx);
+ sctlr = env->cp15.sctlr_el[reg_el];
+
+ switch (arm_mmu_idx) {
+ case ARMMMUIdx_E10_0:
+ case ARMMMUIdx_E20_0:
+ el = 0;
+ tcf = extract64(sctlr, 38, 2);
+ break;
+ default:
+ el = reg_el;
+ tcf = extract64(sctlr, 40, 2);
+ }
+
+ switch (tcf) {
+ case 1:
+ /*
+ * Tag check fail causes a synchronous exception.
+ *
+ * In restore_state_to_opc, we set the exception syndrome
+ * for the load or store operation. Unwind first so we
+ * may overwrite that with the syndrome for the tag check.
+ */
+ cpu_restore_state(env_cpu(env), ra, true);
+ env->exception.vaddress = dirty_ptr;
+ raise_exception(env, EXCP_DATA_ABORT,
+ syn_data_abort_no_iss(el != 0, 0, 0, 0, 0, 0x11),
+ exception_target_el(env));
+ /* noreturn, but fall through to the assert anyway */
+
+ case 0:
+ /*
+ * Tag check fail does not affect the PE.
+ * We eliminate this case by not setting MTE_ACTIVE
+ * in tb_flags, so that we never make this runtime call.
+ */
+ g_assert_not_reached();
+
+ case 2:
+ /* Tag check fail causes asynchronous flag set. */
+ mmu_idx = arm_mmu_idx_el(env, el);
+ if (regime_has_2_ranges(mmu_idx)) {
+ select = extract64(dirty_ptr, 55, 1);
+ } else {
+ select = 0;
+ }
+ env->cp15.tfsr_el[el] |= 1 << select;
+ break;
+
+ default:
+ /* Case 3: Reserved. */
+ qemu_log_mask(LOG_GUEST_ERROR,
+ "Tag check failure with SCTLR_EL%d.TCF%s "
+ "set to reserved value %d\n",
+ reg_el, el ? "" : "0", tcf);
+ break;
+ }
+}
+
/*
* Perform an MTE checked access for a single logical or atomic access.
*/
+static bool mte_probe1_int(CPUARMState *env, uint32_t desc, uint64_t ptr,
+ uintptr_t ra, int bit55)
+{
+ int mem_tag, mmu_idx, ptr_tag, size;
+ MMUAccessType type;
+ uint8_t *mem;
+
+ ptr_tag = allocation_tag_from_addr(ptr);
+
+ if (tcma_check(desc, bit55, ptr_tag)) {
+ return true;
+ }
+
+ mmu_idx = FIELD_EX32(desc, MTEDESC, MIDX);
+ type = FIELD_EX32(desc, MTEDESC, WRITE) ? MMU_DATA_STORE : MMU_DATA_LOAD;
+ size = FIELD_EX32(desc, MTEDESC, ESIZE);
+
+ mem = allocation_tag_mem(env, mmu_idx, ptr, type, size,
+ MMU_DATA_LOAD, 1, ra);
+ if (!mem) {
+ return true;
+ }
+
+ mem_tag = load_tag1(ptr, mem);
+ return ptr_tag == mem_tag;
+}
+
+/* No-fault version of mte_check1, to be used by SVE for MemSingleNF. */
+bool mte_probe1(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra)
+{
+ int bit55 = extract64(ptr, 55, 1);
+
+ /* If TBI is disabled, the access is unchecked. */
+ if (unlikely(!tbi_check(desc, bit55))) {
+ return true;
+ }
+
+ return mte_probe1_int(env, desc, ptr, ra, bit55);
+}
+
+uint64_t mte_check1(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra)
+{
+ int bit55 = extract64(ptr, 55, 1);
+
+ /* If TBI is disabled, the access is unchecked, and ptr is not dirty. */
+ if (unlikely(!tbi_check(desc, bit55))) {
+ return ptr;
+ }
+
+ if (unlikely(!mte_probe1_int(env, desc, ptr, ra, bit55))) {
+ int mmu_idx = FIELD_EX32(desc, MTEDESC, MIDX);
+ mte_check_fail(env, mmu_idx, ptr, ra);
+ }
+
+ return useronly_clean_ptr(ptr);
+}
+
uint64_t HELPER(mte_check1)(CPUARMState *env, uint32_t desc, uint64_t ptr)
{
- return ptr;
+ return mte_check1(env, desc, ptr, GETPC());
}
/*
--
2.20.1
* [PATCH v6 26/42] target/arm: Implement helper_mte_checkN
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (24 preceding siblings ...)
2020-03-12 19:42 ` [PATCH v6 25/42] target/arm: Implement helper_mte_check1 Richard Henderson
@ 2020-03-12 19:42 ` Richard Henderson
2020-03-12 19:42 ` [PATCH v6 27/42] target/arm: Add helper_mte_check_zva Richard Henderson
` (16 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Fill out the stub that was added earlier.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
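To make the nibble walk in checkN() concrete: tags are packed two per
byte, with the even-numbered granule in the low nibble. The standalone
copy below (same loop as in the patch, illustration only) is given tags
3,3,3,5 and so fails on the fourth granule, returning 3.

    #include <stdint.h>
    #include <stdio.h>

    static int checkN(uint8_t *mem, int odd, int cmp, int count)
    {
        int n = 0, diff;

        cmp *= 0x11;            /* replicate the 4-bit tag into both nibbles */
        diff = *mem++ ^ cmp;

        if (odd) {
            goto start_odd;
        }
        while (1) {
            if (diff & 0x0f) {          /* even tag mismatch */
                break;
            }
            if (++n == count) {
                break;
            }
     start_odd:
            if (diff & 0xf0) {          /* odd tag mismatch */
                break;
            }
            if (++n == count) {
                break;
            }
            diff = *mem++ ^ cmp;
        }
        return n;
    }

    int main(void)
    {
        /* Granule tags 3,3,3,5 pack as bytes 0x33, 0x53. */
        uint8_t tags[2] = { 0x33, 0x53 };
        printf("%d\n", checkN(tags, 0, 3, 4));   /* prints 3 */
        return 0;
    }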
target/arm/internals.h | 1 +
target/arm/mte_helper.c | 162 +++++++++++++++++++++++++++++++++++++++-
2 files changed, 162 insertions(+), 1 deletion(-)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 8bbaf9b453..04f0b619b7 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1316,6 +1316,7 @@ FIELD(MTEDESC, TSIZE, 14, 10) /* mte_checkN only */
bool mte_probe1(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra);
uint64_t mte_check1(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra);
+uint64_t mte_checkN(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra);
static inline int allocation_tag_from_addr(uint64_t ptr)
{
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
index 7a87574b35..94f67b33d1 100644
--- a/target/arm/mte_helper.c
+++ b/target/arm/mte_helper.c
@@ -494,7 +494,167 @@ uint64_t HELPER(mte_check1)(CPUARMState *env, uint32_t desc, uint64_t ptr)
/*
* Perform an MTE checked access for multiple logical accesses.
*/
+
+/**
+ * checkN:
+ * @tag: tag memory to test
+ * @odd: true to begin testing at tags at odd nibble
+ * @cmp: the tag to compare against
+ * @count: number of tags to test
+ *
+ * Return the number of successful tests.
+ * Thus a return value < @count indicates a failure.
+ *
+ * A note about sizes: count is expected to be small.
+ *
+ * The most common use will be LDP/STP of two integer registers,
+ * which means 16 bytes of memory touching at most 2 tags, but
+ * often the access is aligned and thus just 1 tag.
+ *
+ * Using AdvSIMD LD/ST (multiple), one can access 64 bytes of memory,
+ * touching at most 5 tags. SVE LDR/STR (vector) with the default
+ * vector length is also 64 bytes; the maximum architectural length
+ * is 256 bytes touching at most 9 tags.
+ *
+ * The loop below uses 7 logical operations and 1 memory operation
+ * per tag pair. An implementation that loads an aligned word and
+ * uses masking to ignore adjacent tags requires 18 logical operations
+ * and thus does not begin to pay off until 6 tags.
+ * Which, according to the survey above, is unlikely to be common.
+ */
+
+static int checkN(uint8_t *mem, int odd, int cmp, int count)
+{
+ int n = 0, diff;
+
+ /* Replicate the test tag and compare. */
+ cmp *= 0x11;
+ diff = *mem++ ^ cmp;
+
+ if (odd) {
+ goto start_odd;
+ }
+
+ while (1) {
+ /* Test even tag. */
+ if (unlikely(diff & 0x0f)) {
+ break;
+ }
+ if (++n == count) {
+ break;
+ }
+
+ start_odd:
+ /* Test odd tag. */
+ if (unlikely(diff & 0xf0)) {
+ break;
+ }
+ if (++n == count) {
+ break;
+ }
+
+ diff = *mem++ ^ cmp;
+ }
+ return n;
+}
+
+uint64_t mte_checkN(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra)
+{
+ int mmu_idx, ptr_tag, bit55;
+ uint64_t ptr_last, ptr_end, next_page;
+ uint64_t tag_first, tag_end;
+ uint64_t tag_byte_first, tag_byte_end;
+ uint32_t esize, total, tag_count, tag_size, n, c;
+ uint8_t *mem1, *mem2;
+ MMUAccessType type;
+
+ bit55 = extract64(ptr, 55, 1);
+
+ /* If TBI is disabled, the access is unchecked, and ptr is not dirty. */
+ if (unlikely(!tbi_check(desc, bit55))) {
+ return ptr;
+ }
+
+ ptr_tag = allocation_tag_from_addr(ptr);
+
+ if (tcma_check(desc, bit55, ptr_tag)) {
+ goto done;
+ }
+
+ mmu_idx = FIELD_EX32(desc, MTEDESC, MIDX);
+ type = FIELD_EX32(desc, MTEDESC, WRITE) ? MMU_DATA_STORE : MMU_DATA_LOAD;
+ esize = FIELD_EX32(desc, MTEDESC, ESIZE);
+ total = FIELD_EX32(desc, MTEDESC, TSIZE);
+
+ /* Find the addr of the end of the access, and of the last element. */
+ ptr_end = ptr + total;
+ ptr_last = ptr_end - esize;
+
+ /* Round the bounds to the tag granule, and compute the number of tags. */
+ tag_first = QEMU_ALIGN_DOWN(ptr, TAG_GRANULE);
+ tag_end = QEMU_ALIGN_UP(ptr_last, TAG_GRANULE);
+ tag_count = (tag_end - tag_first) / TAG_GRANULE;
+
+ /* Round the bounds to twice the tag granule, and compute the bytes. */
+ tag_byte_first = QEMU_ALIGN_DOWN(ptr, 2 * TAG_GRANULE);
+ tag_byte_end = QEMU_ALIGN_UP(ptr_last, 2 * TAG_GRANULE);
+
+ next_page = TARGET_PAGE_ALIGN(ptr);
+ if (likely(tag_end <= next_page)) {
+ /* Memory access stays on one page. */
+ tag_size = (tag_byte_end - tag_byte_first) / (2 * TAG_GRANULE);
+ mem1 = allocation_tag_mem(env, mmu_idx, ptr, type, total,
+ MMU_DATA_LOAD, tag_size, ra);
+ if (!mem1) {
+ goto done;
+ }
+ /* Perform all of the comparisons. */
+ n = checkN(mem1, ptr & TAG_GRANULE, ptr_tag, tag_count);
+ } else {
+ /* Memory access crosses to next page. */
+ tag_size = (next_page - tag_byte_first) / (2 * TAG_GRANULE);
+ mem1 = allocation_tag_mem(env, mmu_idx, ptr, type, next_page - ptr,
+ MMU_DATA_LOAD, tag_size, ra);
+
+ tag_size = (tag_byte_end - next_page) / (2 * TAG_GRANULE);
+ mem2 = allocation_tag_mem(env, mmu_idx, next_page, type,
+ ptr_end - next_page,
+ MMU_DATA_LOAD, tag_size, ra);
+
+ /*
+ * Perform all of the comparisons.
+ * Note the possible but unlikely case of the operation spanning
+ * two pages that do not both have tagging enabled.
+ */
+ n = c = (next_page - tag_first) / TAG_GRANULE;
+ if (mem1) {
+ n = checkN(mem1, ptr & TAG_GRANULE, ptr_tag, c);
+ }
+ if (n == c) {
+ if (!mem2) {
+ goto done;
+ }
+ n += checkN(mem2, 0, ptr_tag, tag_count - c);
+ }
+ }
+
+ /*
+ * If we failed, we know which granule. Compute the element that
+ * is first in that granule, and signal failure on that element.
+ */
+ if (unlikely(n < tag_count)) {
+ uint64_t fail_ofs;
+
+ fail_ofs = tag_first + n * TAG_GRANULE - ptr;
+ fail_ofs = ROUND_UP(fail_ofs, esize);
+ mte_check_fail(env, mmu_idx, ptr + fail_ofs, ra);
+ }
+
+ done:
+ return useronly_clean_ptr(ptr);
+}
+
uint64_t HELPER(mte_checkN)(CPUARMState *env, uint32_t desc, uint64_t ptr)
{
- return ptr;
+ return mte_checkN(env, desc, ptr, GETPC());
}
--
2.20.1
* [PATCH v6 27/42] target/arm: Add helper_mte_check_zva
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (25 preceding siblings ...)
2020-03-12 19:42 ` [PATCH v6 26/42] target/arm: Implement helper_mte_checkN Richard Henderson
@ 2020-03-12 19:42 ` Richard Henderson
2020-03-12 19:42 ` [PATCH v6 28/42] target/arm: Use mte_checkN for sve unpredicated loads Richard Henderson
` (15 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Use a special helper for DC_ZVA, rather than the more
general mte_checkN. Leave the helper blank for now.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
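The multiply-by-0x11...1 trick used below deserves a note: multiplying
a 4-bit tag by a base-16 repunit copies it into every nibble, so one
wide compare covers a whole block of granules. A quick check,
illustration only:

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        for (uint64_t tag = 0; tag < 16; tag++) {
            uint64_t rep = tag * 0x1111111111111111ull;
            for (int i = 0; i < 16; i++) {
                /* Every nibble of rep equals the original tag. */
                assert(((rep >> (4 * i)) & 0xf) == tag);
            }
        }
        return 0;
    }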
target/arm/helper-a64.h | 1 +
target/arm/mte_helper.c | 106 +++++++++++++++++++++++++++++++++++++
target/arm/translate-a64.c | 16 +++++-
3 files changed, 122 insertions(+), 1 deletion(-)
diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index 005af678c7..5b0b699a50 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -106,6 +106,7 @@ DEF_HELPER_FLAGS_2(xpacd, TCG_CALL_NO_RWG_SE, i64, env, i64)
DEF_HELPER_FLAGS_3(mte_check1, TCG_CALL_NO_WG, i64, env, i32, i64)
DEF_HELPER_FLAGS_3(mte_checkN, TCG_CALL_NO_WG, i64, env, i32, i64)
+DEF_HELPER_FLAGS_3(mte_check_zva, TCG_CALL_NO_WG, i64, env, i32, i64)
DEF_HELPER_FLAGS_3(irg, TCG_CALL_NO_RWG, i64, env, i64, i64)
DEF_HELPER_FLAGS_4(addsubg, TCG_CALL_NO_RWG_SE, i64, env, i64, s32, i32)
DEF_HELPER_FLAGS_3(ldg, TCG_CALL_NO_WG, i64, env, i64, i64)
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
index 94f67b33d1..c51f7f04f4 100644
--- a/target/arm/mte_helper.c
+++ b/target/arm/mte_helper.c
@@ -658,3 +658,109 @@ uint64_t HELPER(mte_checkN)(CPUARMState *env, uint32_t desc, uint64_t ptr)
{
return mte_checkN(env, desc, ptr, GETPC());
}
+
+/*
+ * Perform an MTE checked access for DC_ZVA.
+ */
+uint64_t HELPER(mte_check_zva)(CPUARMState *env, uint32_t desc, uint64_t ptr)
+{
+ uintptr_t ra = GETPC();
+ int log2_dcz_bytes, log2_tag_bytes;
+ int mmu_idx, bit55;
+ intptr_t dcz_bytes, tag_bytes, i;
+ void *mem;
+ uint64_t ptr_tag, mem_tag, align_ptr;
+
+ bit55 = extract64(ptr, 55, 1);
+
+ /* If TBI is disabled, the access is unchecked, and ptr is not dirty. */
+ if (unlikely(!tbi_check(desc, bit55))) {
+ return ptr;
+ }
+
+ ptr_tag = allocation_tag_from_addr(ptr);
+
+ if (tcma_check(desc, bit55, ptr_tag)) {
+ goto done;
+ }
+
+ /*
+ * In arm_cpu_realizefn, we asserted that dcz > LOG2_TAG_GRANULE+1,
+ * i.e. 32 bytes, which is an unreasonably small dcz anyway, to make
+ * sure that we can access one complete tag byte here.
+ */
+ log2_dcz_bytes = env_archcpu(env)->dcz_blocksize + 2;
+ log2_tag_bytes = log2_dcz_bytes - (LOG2_TAG_GRANULE + 1);
+ dcz_bytes = (intptr_t)1 << log2_dcz_bytes;
+ tag_bytes = (intptr_t)1 << log2_tag_bytes;
+ align_ptr = ptr & -dcz_bytes;
+
+ /*
+ * Trap if accessing an invalid page. DC_ZVA requires that we supply
+ * the original pointer for an invalid page. But watchpoints require
+ * that we probe the actual space. So do both.
+ */
+ mmu_idx = FIELD_EX32(desc, MTEDESC, MIDX);
+ (void) probe_write(env, ptr, 1, mmu_idx, ra);
+ mem = allocation_tag_mem(env, mmu_idx, align_ptr, MMU_DATA_STORE,
+ dcz_bytes, MMU_DATA_LOAD, tag_bytes, ra);
+ if (!mem) {
+ goto done;
+ }
+
+ /*
+ * Unlike the reasoning for checkN, DC_ZVA is always aligned, and thus
+ * it is quite easy to perform all of the comparisons at once without
+ * any extra masking.
+ *
+ * The most common zva block size is 64; some of the thunderx cpus use
+ * a block size of 128. For user-only, aarch64_max_initfn will set the
+ * block size to 512. Fill out the other cases for future-proofing.
+ *
+ * In order to be able to find the first miscompare later, we want the
+ * tag bytes to be in little-endian order.
+ */
+ switch (log2_tag_bytes) {
+ case 0: /* zva_blocksize 32 */
+ mem_tag = *(uint8_t *)mem;
+ ptr_tag *= 0x11u;
+ break;
+ case 1: /* zva_blocksize 64 */
+ mem_tag = cpu_to_le16(*(uint16_t *)mem);
+ ptr_tag *= 0x1111u;
+ break;
+ case 2: /* zva_blocksize 128 */
+ mem_tag = cpu_to_le32(*(uint32_t *)mem);
+ ptr_tag *= 0x11111111u;
+ break;
+ case 3: /* zva_blocksize 256 */
+ mem_tag = cpu_to_le64(*(uint64_t *)mem);
+ ptr_tag *= 0x1111111111111111ull;
+ break;
+
+ default: /* zva_blocksize 512, 1024, 2048 */
+ ptr_tag *= 0x1111111111111111ull;
+ i = 0;
+ do {
+ mem_tag = cpu_to_le64(*(uint64_t *)(mem + i));
+ if (unlikely(mem_tag != ptr_tag)) {
+ goto fail;
+ }
+ i += 8;
+ align_ptr += 16 * TAG_GRANULE;
+ } while (i < tag_bytes);
+ goto done;
+ }
+
+ if (likely(mem_tag == ptr_tag)) {
+ goto done;
+ }
+
+ fail:
+ /* Locate the first nibble that differs. */
+ i = ctz64(mem_tag ^ ptr_tag) >> 4;
+ mte_check_fail(env, mmu_idx, align_ptr + i * TAG_GRANULE, ra);
+
+ done:
+ return useronly_clean_ptr(ptr);
+}
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index c6187ccd60..d86c13a32d 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1902,7 +1902,21 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
return;
case ARM_CP_DC_ZVA:
/* Writes clear the aligned block of memory which rt points into. */
- tcg_rt = clean_data_tbi(s, cpu_reg(s, rt));
+ if (s->mte_active[0]) {
+ TCGv_i32 t_desc;
+ int desc = 0;
+
+ desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
+ desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
+ desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
+ t_desc = tcg_const_i32(desc);
+
+ tcg_rt = new_tmp_a64(s);
+ gen_helper_mte_check_zva(tcg_rt, cpu_env, t_desc, cpu_reg(s, rt));
+ tcg_temp_free_i32(t_desc);
+ } else {
+ tcg_rt = clean_data_tbi(s, cpu_reg(s, rt));
+ }
gen_helper_dc_zva(cpu_env, tcg_rt);
return;
default:
--
2.20.1
* [PATCH v6 28/42] target/arm: Use mte_checkN for sve unpredicated loads
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (26 preceding siblings ...)
2020-03-12 19:42 ` [PATCH v6 27/42] target/arm: Add helper_mte_check_zva Richard Henderson
@ 2020-03-12 19:42 ` Richard Henderson
2020-03-12 19:42 ` [PATCH v6 29/42] target/arm: Use mte_checkN for sve unpredicated stores Richard Henderson
` (14 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
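A note on the slow-path tail in sve_ldr(): a predicate transfer may
leave a 2/4/6-byte remainder, which is handled by re-reading the final
8 bytes of the transfer and shifting the tail down, so the helper never
loads past the end. A host-side model of that shift (illustration only,
and it assumes a little-endian host):

    #include <assert.h>
    #include <stdint.h>
    #include <string.h>

    int main(void)
    {
        /* 10-byte transfer: one full quadword plus a 2-byte tail. */
        uint8_t mem[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        int size = 10, i = 8;              /* i = size rounded down to 8 */
        uint64_t val;

        memcpy(&val, mem + size - 8, 8);   /* re-read [size-8, size) */
        val >>= (8 - (size - i)) * 8;      /* drop already-stored bytes */
        assert(val == 0x0a09);             /* bytes 9,10, zero-extended */
        return 0;
    }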
target/arm/helper-sve.h | 2 +
target/arm/sve_helper.c | 74 ++++++++++++++++++++++++++++--
target/arm/translate-sve.c | 93 ++++++++++++++++----------------------
3 files changed, 110 insertions(+), 59 deletions(-)
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 2f47279155..82ea70cf63 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1123,6 +1123,8 @@ DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_ldr, TCG_CALL_NO_WG, void, env, ptr, tl, int)
+
DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 80453953ad..ede72a2989 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3967,10 +3967,6 @@ void HELPER(sve_fcmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
} while (i != 0);
}
-/*
- * Load contiguous data, protected by a governing predicate.
- */
-
/*
* Load elements into @vd + @reg_off, from @host,
* or the reverse for stores.
@@ -4194,6 +4190,76 @@ static bool sve_probe_page(SVEHostPage *info, bool nofault,
return true;
}
+/*
+ * Load contiguous data, unpredicated.
+ *
+ * Note that unpredicated load/store of vector/predicate registers
+ * are defined as a stream of bytes, which equates to little-endian
+ * operations on larger quantities.
+ *
+ * Note any MTE check is already handled.
+ */
+
+void HELPER(sve_ldr)(CPUARMState *env, void *vd, target_ulong addr, int size)
+{
+ int mmu_idx = cpu_mmu_index(env, false);
+ int in_page = -((int)addr | TARGET_PAGE_MASK);
+ uintptr_t ra = GETPC();
+ uint64_t val;
+ int i;
+
+ /* Small loads are expanded inline. */
+ tcg_debug_assert(size > 2 * 8);
+
+ /* Bulk copy the data from memory to the register. */
+ if (likely(size <= in_page)) {
+ void *host = probe_read(env, addr, size, mmu_idx, ra);
+
+ if (unlikely(!host)) {
+ goto mmio;
+ }
+ memcpy(vd, host, size);
+ } else {
+ void *h1 = probe_read(env, addr, in_page, mmu_idx, ra);
+ void *h2 = probe_read(env, addr + in_page, size - in_page, mmu_idx, ra);
+
+ if (unlikely(!h1 || !h2)) {
+ goto mmio;
+ }
+ memcpy(vd, h1, in_page);
+ memcpy(vd + in_page, h2, size - in_page);
+ }
+
+ /* Predicate load length may be any multiple of 2; ensure high bits 0. */
+ if (unlikely(size & 7)) {
+ memset(vd + size, 0, 8 - (size & 7));
+ }
+
+ /*
+ * The memcpy and memset above kept the bytes in memory order.
+ * The in-register format has uint64_t in host order, so for
+ * big-endian host we need to bswap.
+ */
+ for (i = 0; i < size; i += 8) {
+ le64_to_cpus(vd + i);
+ }
+ return;
+
+ mmio:
+ for (i = 0; i + 8 <= size; i += 8) {
+ val = cpu_ldq_data_ra(env, addr + i, ra);
+ val = le_bswap64(val);
+ *(uint64_t *)(vd + i) = val;
+ }
+
+ /*
+ * Predicate load length may be any multiple of 2; re-read the
+ * final 8 bytes so that this load cannot run past @size.
+ */
+ if (unlikely(i != size)) {
+ val = cpu_ldq_data_ra(env, addr + size - 8, ra);
+ val = le_bswap64(val);
+ val >>= (8 - (size - i)) * 8;
+ *(uint64_t *)(vd + i) = val;
+ }
+}
/*
* Analyse contiguous data, protected by a governing predicate.
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 7bd7de80e6..e55f8835bb 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4352,8 +4352,13 @@ static bool trans_UCVTF_dd(DisasContext *s, arg_rpr_esz *a)
*** SVE Memory - 32-bit Gather and Unsized Contiguous Group
*/
-/* Subroutine loading a vector register at VOFS of LEN bytes.
+/*
+ * Subroutine loading a vector register at VOFS of LEN bytes.
* The load should begin at the address Rn + IMM.
+ *
+ * Note that unpredicated load/store of vector/predicate registers
+ * are defined as a stream of bytes, which equates to little-endian
+ * operations on larger quantities.
*/
static void do_ldr(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
@@ -4362,81 +4367,59 @@ static void do_ldr(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
int len_remain = len % 8;
int nparts = len / 8 + ctpop8(len_remain);
int midx = get_mem_index(s);
- TCGv_i64 addr, t0, t1;
+ TCGv_i64 dirty_addr, clean_addr, t0, t1;
+ int i;
- addr = tcg_temp_new_i64();
- t0 = tcg_temp_new_i64();
+ dirty_addr = read_cpu_reg_sp(s, rn, true);
+ tcg_gen_addi_i64(dirty_addr, dirty_addr, imm);
- /* Note that unpredicated load/store of vector/predicate registers
- * are defined as a stream of bytes, which equates to little-endian
- * operations on larger quantities. There is no nice way to force
- * a little-endian load for aarch64_be-linux-user out of line.
- *
- * Attempt to keep code expansion to a minimum by limiting the
- * amount of unrolling done.
- */
- if (nparts <= 4) {
- int i;
+ clean_addr = gen_mte_checkN(s, dirty_addr, false, rn != 31, MO_8, len);
- for (i = 0; i < len_align; i += 8) {
- tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i);
- tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEQ);
- tcg_gen_st_i64(t0, cpu_env, vofs + i);
- }
- } else {
- TCGLabel *loop = gen_new_label();
- TCGv_ptr tp, i = tcg_const_local_ptr(0);
+ /* Limit tcg code expansion by doing large loads out of line. */
+ if (nparts > 4) {
+ TCGv_ptr t_rd = tcg_temp_new_ptr();
+ TCGv_i32 t_len = tcg_const_i32(len);
- gen_set_label(loop);
-
- /* Minimize the number of local temps that must be re-read from
- * the stack each iteration. Instead, re-compute values other
- * than the loop counter.
- */
- tp = tcg_temp_new_ptr();
- tcg_gen_addi_ptr(tp, i, imm);
- tcg_gen_extu_ptr_i64(addr, tp);
- tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn));
-
- tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEQ);
-
- tcg_gen_add_ptr(tp, cpu_env, i);
- tcg_gen_addi_ptr(i, i, 8);
- tcg_gen_st_i64(t0, tp, vofs);
- tcg_temp_free_ptr(tp);
-
- tcg_gen_brcondi_ptr(TCG_COND_LTU, i, len_align, loop);
- tcg_temp_free_ptr(i);
+ tcg_gen_addi_ptr(t_rd, cpu_env, vofs);
+ gen_helper_sve_ldr(cpu_env, t_rd, clean_addr, t_len);
+ tcg_temp_free_ptr(t_rd);
+ tcg_temp_free_i32(t_len);
+ return;
}
- /* Predicate register loads can be any multiple of 2.
- * Note that we still store the entire 64-bit unit into cpu_env.
+ t0 = tcg_temp_new_i64();
+ for (i = 0; i < len_align; i += 8) {
+ tcg_gen_qemu_ld_i64(t0, clean_addr, midx, MO_LEQ);
+ tcg_gen_st_i64(t0, cpu_env, vofs + i);
+ tcg_gen_addi_i64(clean_addr, clean_addr, 8);
+ }
+
+ /*
+ * Predicate register loads can be any multiple of 2.
+ * Note that we still store the entire 64-bit unit into cpu_env
+ * so that the high bits are zeroed.
*/
if (len_remain) {
- tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align);
-
switch (len_remain) {
case 2:
- case 4:
- case 8:
- tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LE | ctz32(len_remain));
+ tcg_gen_qemu_ld_i64(t0, clean_addr, midx, MO_LEUW);
+ break;
+ case 4:
+ tcg_gen_qemu_ld_i64(t0, clean_addr, midx, MO_LEUL);
break;
-
case 6:
t1 = tcg_temp_new_i64();
- tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEUL);
- tcg_gen_addi_i64(addr, addr, 4);
- tcg_gen_qemu_ld_i64(t1, addr, midx, MO_LEUW);
+ tcg_gen_qemu_ld_i64(t0, clean_addr, midx, MO_LEUL);
+ tcg_gen_addi_i64(clean_addr, clean_addr, 4);
+ tcg_gen_qemu_ld_i64(t1, clean_addr, midx, MO_LEUW);
tcg_gen_deposit_i64(t0, t0, t1, 32, 32);
tcg_temp_free_i64(t1);
break;
-
default:
g_assert_not_reached();
}
tcg_gen_st_i64(t0, cpu_env, vofs + len_align);
}
- tcg_temp_free_i64(addr);
tcg_temp_free_i64(t0);
}
--
2.20.1
* [PATCH v6 29/42] target/arm: Use mte_checkN for sve unpredicated stores
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (27 preceding siblings ...)
2020-03-12 19:42 ` [PATCH v6 28/42] target/arm: Use mte_checkN for sve unpredicated loads Richard Henderson
@ 2020-03-12 19:42 ` Richard Henderson
2020-03-12 19:42 ` [PATCH v6 30/42] target/arm: Use mte_check1 for sve LD1R Richard Henderson
` (13 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
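The len_remain == 6 case below is restructured to fall through: store
the low 32 bits, advance, shift, and then share the final 16-bit store
with the len_remain == 2 case. Modelled on the host side (illustration
only; assumes a little-endian host):

    #include <assert.h>
    #include <stdint.h>
    #include <string.h>

    int main(void)
    {
        uint64_t t0 = 0x0000665544332211ull;   /* 6 live bytes */
        uint8_t buf[6];
        uint32_t lo = (uint32_t)t0;
        uint16_t hi;

        memcpy(buf, &lo, 4);        /* the MO_LEUL store */
        t0 >>= 32;                  /* then fall through to ... */
        hi = (uint16_t)t0;
        memcpy(buf + 4, &hi, 2);    /* ... the MO_LEUW store */

        assert(buf[0] == 0x11 && buf[4] == 0x55 && buf[5] == 0x66);
        return 0;
    }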
target/arm/helper-sve.h | 1 +
target/arm/sve_helper.c | 63 ++++++++++++++++++++++++++-
target/arm/translate-sve.c | 88 ++++++++++++++------------------------
3 files changed, 94 insertions(+), 58 deletions(-)
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 82ea70cf63..4e71501838 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1124,6 +1124,7 @@ DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(sve_ldr, TCG_CALL_NO_WG, void, env, ptr, tl, int)
+DEF_HELPER_FLAGS_4(sve_str, TCG_CALL_NO_WG, void, env, ptr, tl, int)
DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index ede72a2989..2396737420 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -4191,7 +4191,7 @@ static bool sve_probe_page(SVEHostPage *info, bool nofault,
}
/*
- * Load contiguous data, unpredicated.
+ * Load/store contiguous data, unpredicated.
*
* Note that unpredicated load/store of vector/predicate registers
* are defined as a stream of bytes, which equates to little-endian
@@ -4261,6 +4261,67 @@ void HELPER(sve_ldr)(CPUARMState *env, void *vd, target_ulong addr, int size)
}
}
+void HELPER(sve_str)(CPUARMState *env, void *vd, target_ulong addr, int size)
+{
+ int mem_idx = cpu_mmu_index(env, false);
+ int in_page = -((int)addr | TARGET_PAGE_MASK);
+ uintptr_t ra = GETPC();
+ uint64_t val;
+ void *host;
+ int i;
+
+ /* Small stores are expanded inline. */
+ tcg_debug_assert(size > 2 * 8);
+
+ if (likely(size <= in_page)) {
+ host = probe_write(env, addr, size, mem_idx, ra);
+ if (likely(host != NULL)) {
+ for (i = 0; i + 8 <= size; i += 8) {
+ stq_le_p(host + i, *(uint64_t *)(vd + i));
+ }
+
+ /* Predicate store length may be any multiple of 2. */
+ if (unlikely(i != size)) {
+ val = *(uint64_t *)(vd + i);
+ if (size & 4) {
+ stl_le_p(host + i, val);
+ i += 4;
+ val >>= 32;
+ }
+ if (size & 2) {
+ stw_le_p(host + i, val);
+ }
+ }
+ return;
+ }
+ } else {
+ (void)probe_write(env, addr, in_page, mem_idx, ra);
+ (void)probe_write(env, addr + in_page, size - in_page, mem_idx, ra);
+ }
+
+ /*
+ * Note there is no endian-specific target store function, so to handle
+ * aarch64_be-linux-user we must bswap so the bytes land little-endian.
+ */
+ for (i = 0; i + 8 <= size; i += 8) {
+ val = *(uint64_t *)(vd + i);
+ cpu_stq_data_ra(env, addr + i, le_bswap64(val), ra);
+ }
+
+ /* Predicate store length may be any multiple of 2. */
+ if (unlikely(i != size)) {
+ val = *(uint64_t *)(vd + i);
+ if (size & 4) {
+ cpu_stl_data_ra(env, addr + i, le_bswap32(val), ra);
+ i += 4;
+ val >>= 32;
+ }
+ if (size & 2) {
+ cpu_stw_data_ra(env, addr + i, le_bswap16(val), ra);
+ }
+ }
+}
+
/*
* Analyse contiguous data, protected by a governing predicate.
*/
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index e55f8835bb..49d2e68564 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4430,78 +4430,52 @@ static void do_str(DisasContext *s, uint32_t vofs, int len, int rn, int imm)
int len_remain = len % 8;
int nparts = len / 8 + ctpop8(len_remain);
int midx = get_mem_index(s);
- TCGv_i64 addr, t0;
+ TCGv_i64 dirty_addr, clean_addr, t0;
+ int i;
+
+ dirty_addr = read_cpu_reg_sp(s, rn, true);
+ tcg_gen_addi_i64(dirty_addr, dirty_addr, imm);
+
+ clean_addr = gen_mte_checkN(s, dirty_addr, true, rn != 31, MO_8, len);
+
+ /* Limit tcg code expansion by doing large stores out of line. */
+ if (nparts > 4) {
+ TCGv_ptr t_rd = tcg_temp_new_ptr();
+ TCGv_i32 t_len = tcg_const_i32(len);
+
+ tcg_gen_addi_ptr(t_rd, cpu_env, vofs);
+ gen_helper_sve_str(cpu_env, t_rd, clean_addr, t_len);
+ tcg_temp_free_ptr(t_rd);
+ tcg_temp_free_i32(t_len);
+ return;
+ }
- addr = tcg_temp_new_i64();
t0 = tcg_temp_new_i64();
-
- /* Note that unpredicated load/store of vector/predicate registers
- * are defined as a stream of bytes, which equates to little-endian
- * operations on larger quantities. There is no nice way to force
- * a little-endian store for aarch64_be-linux-user out of line.
- *
- * Attempt to keep code expansion to a minimum by limiting the
- * amount of unrolling done.
- */
- if (nparts <= 4) {
- int i;
-
- for (i = 0; i < len_align; i += 8) {
- tcg_gen_ld_i64(t0, cpu_env, vofs + i);
- tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i);
- tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ);
- }
- } else {
- TCGLabel *loop = gen_new_label();
- TCGv_ptr t2, i = tcg_const_local_ptr(0);
-
- gen_set_label(loop);
-
- t2 = tcg_temp_new_ptr();
- tcg_gen_add_ptr(t2, cpu_env, i);
- tcg_gen_ld_i64(t0, t2, vofs);
-
- /* Minimize the number of local temps that must be re-read from
- * the stack each iteration. Instead, re-compute values other
- * than the loop counter.
- */
- tcg_gen_addi_ptr(t2, i, imm);
- tcg_gen_extu_ptr_i64(addr, t2);
- tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn));
- tcg_temp_free_ptr(t2);
-
- tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ);
-
- tcg_gen_addi_ptr(i, i, 8);
-
- tcg_gen_brcondi_ptr(TCG_COND_LTU, i, len_align, loop);
- tcg_temp_free_ptr(i);
+ for (i = 0; i < len_align; i += 8) {
+ tcg_gen_ld_i64(t0, cpu_env, vofs + i);
+ tcg_gen_qemu_st_i64(t0, clean_addr, midx, MO_LEQ);
+ tcg_gen_addi_i64(clean_addr, clean_addr, 8);
}
/* Predicate register stores can be any multiple of 2. */
if (len_remain) {
tcg_gen_ld_i64(t0, cpu_env, vofs + len_align);
- tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align);
-
switch (len_remain) {
- case 2:
- case 4:
- case 8:
- tcg_gen_qemu_st_i64(t0, addr, midx, MO_LE | ctz32(len_remain));
- break;
-
case 6:
- tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUL);
- tcg_gen_addi_i64(addr, addr, 4);
+ tcg_gen_qemu_st_i64(t0, clean_addr, midx, MO_LEUL);
+ tcg_gen_addi_i64(clean_addr, clean_addr, 4);
tcg_gen_shri_i64(t0, t0, 32);
- tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUW);
+ /* fall through */
+ case 2:
+ tcg_gen_qemu_st_i64(t0, clean_addr, midx, MO_LEUW);
+ break;
+ case 4:
+ tcg_gen_qemu_st_i64(t0, clean_addr, midx, MO_LEUL);
break;
-
default:
g_assert_not_reached();
}
}
- tcg_temp_free_i64(addr);
tcg_temp_free_i64(t0);
}
--
2.20.1
* [PATCH v6 30/42] target/arm: Use mte_check1 for sve LD1R
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (28 preceding siblings ...)
2020-03-12 19:42 ` [PATCH v6 29/42] target/arm: Use mte_checkN for sve unpredicated stores Richard Henderson
@ 2020-03-12 19:42 ` Richard Henderson
2020-03-12 19:42 ` [PATCH v6 31/42] target/arm: Add mte helpers for sve scalar + int loads Richard Henderson
` (12 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/translate-sve.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 49d2e68564..e5d12edd55 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4850,16 +4850,16 @@ static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a)
/* Load and broadcast element. */
static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a)
{
- if (!sve_access_check(s)) {
- return true;
- }
-
unsigned vsz = vec_full_reg_size(s);
unsigned psz = pred_full_reg_size(s);
unsigned esz = dtype_esz[a->dtype];
unsigned msz = dtype_msz(a->dtype);
TCGLabel *over = gen_new_label();
- TCGv_i64 temp;
+ TCGv_i64 temp, clean_addr;
+
+ if (!sve_access_check(s)) {
+ return true;
+ }
/* If the guarding predicate has no bits set, no load occurs. */
if (psz <= 8) {
@@ -4880,9 +4880,11 @@ static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a)
}
/* Load the data. */
- temp = tcg_temp_new_i64();
- tcg_gen_addi_i64(temp, cpu_reg_sp(s, a->rn), a->imm << msz);
- tcg_gen_qemu_ld_i64(temp, temp, get_mem_index(s),
+ temp = read_cpu_reg_sp(s, a->rn, true);
+ tcg_gen_addi_i64(temp, temp, a->imm << msz);
+ clean_addr = gen_mte_check1(s, temp, false, true, msz);
+
+ tcg_gen_qemu_ld_i64(temp, clean_addr, get_mem_index(s),
s->be_data | dtype_mop[a->dtype]);
/* Broadcast to *all* elements. */
--
2.20.1
* [PATCH v6 31/42] target/arm: Add mte helpers for sve scalar + int loads
2020-03-12 19:41 [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode Richard Henderson
` (29 preceding siblings ...)
2020-03-12 19:42 ` [PATCH v6 30/42] target/arm: Use mte_check1 for sve LD1R Richard Henderson
@ 2020-03-12 19:42 ` Richard Henderson
2020-03-12 19:42 ` [PATCH v6 32/42] target/arm: Add mte helpers for sve scalar + int stores Richard Henderson
` (11 subsequent siblings)
42 siblings, 0 replies; 44+ messages in thread
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Because the elements are sequential, we can eliminate many tests all
at once when the tag hits TCMA, or if the page(s) are not Tagged.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
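For reference, the simd_data field now carries two things: the 5-bit
register number and, above SVE_MTEDESC_SHIFT, the mte descriptor;
sve_ldN_r_mte() then peels them apart again. A sketch of the round trip
(illustration only; the SIMD_DATA_SHIFT value here is a stand-in for
QEMU's, not something this patch defines):

    #include <assert.h>
    #include <stdint.h>

    enum { SIMD_DATA_SHIFT = 10, SVE_MTEDESC_SHIFT = 5 };

    int main(void)
    {
        uint32_t zt = 17;            /* 5-bit register number */
        uint32_t mtedesc = 0x4321;   /* whatever the MTEDESC packing produced */
        uint32_t data = zt | (mtedesc << SVE_MTEDESC_SHIFT);
        uint32_t desc = data << SIMD_DATA_SHIFT;   /* size fields omitted */

        /* Helper side: recover both halves. */
        assert(desc >> (SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT) == mtedesc);
        assert(((desc >> SIMD_DATA_SHIFT) & 0x1f) == zt);
        return 0;
    }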
target/arm/helper-sve.h | 58 ++++++++++
target/arm/internals.h | 6 +
target/arm/sve_helper.c | 218 ++++++++++++++++++++++++++++++-------
target/arm/translate-sve.c | 186 ++++++++++++++++++++++---------
4 files changed, 377 insertions(+), 91 deletions(-)
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 4e71501838..af1c6967a6 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1184,6 +1184,64 @@ DEF_HELPER_FLAGS_4(sve_ld1sds_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld1sdu_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ld1sds_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bb_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2bb_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3bb_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4bb_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1hh_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2hh_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3hh_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4hh_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1hh_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2hh_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3hh_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4hh_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1ss_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2ss_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3ss_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4ss_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1ss_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2ss_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3ss_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4ss_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1dd_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2dd_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3dd_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4dd_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1dd_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2dd_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3dd_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4dd_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1bhu_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bsu_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bdu_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bhs_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bss_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bds_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1hsu_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hdu_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hss_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hds_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1hsu_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hdu_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hss_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hds_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1sdu_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1sds_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1sdu_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1sds_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
DEF_HELPER_FLAGS_4(sve_ldff1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldff1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldff1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 04f0b619b7..94b8f07e93 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1306,6 +1306,12 @@ void arm_log_exception(int idx);
#define LOG2_TAG_GRANULE 4
#define TAG_GRANULE (1 << LOG2_TAG_GRANULE)
+/*
+ * The SVE simd_data field, for memory ops, contains either
+ * rd (5 bits) or a shift count (2 bits).
+ */
+#define SVE_MTEDESC_SHIFT 5
+
/* Bits within a descriptor passed to the helper_mte_check* functions. */
FIELD(MTEDESC, MIDX, 0, 4)
FIELD(MTEDESC, TBI, 4, 2)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 2396737420..0fbf331742 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -4564,15 +4564,89 @@ static void sve_cont_ldst_watchpoints(SVEContLdSt *info, CPUARMState *env,
#endif
}
+typedef uint64_t mte_check_fn(CPUARMState *, uint32_t, uint64_t, uintptr_t);
+
+static inline QEMU_ALWAYS_INLINE
+void sve_cont_ldst_mte_check_int(SVEContLdSt *info, CPUARMState *env,
+ uint64_t *vg, target_ulong addr, int esize,
+ int msize, uint32_t mtedesc, uintptr_t ra,
+ mte_check_fn *check)
+{
+ intptr_t mem_off, reg_off, reg_last;
+
+ /* Process the page only if MemAttr == Tagged. */
+ if (info->page[0].attrs.target_tlb_bit1) {
+ mem_off = info->mem_off_first[0];
+ reg_off = info->reg_off_first[0];
+ reg_last = info->reg_off_split;
+ if (reg_last < 0) {
+ reg_last = info->reg_off_last[0];
+ }
+
+ do {
+ uint64_t pg = vg[reg_off >> 6];
+ do {
+ if ((pg >> (reg_off & 63)) & 1) {
+ check(env, mtedesc, addr, ra);
+ }
+ reg_off += esize;
+ mem_off += msize;
+ } while (reg_off <= reg_last && (reg_off & 63));
+ } while (reg_off <= reg_last);
+ }
+
+ mem_off = info->mem_off_first[1];
+ if (mem_off >= 0 && info->page[1].attrs.target_tlb_bit1) {
+ reg_off = info->reg_off_first[1];
+ reg_last = info->reg_off_last[1];
+
+ do {
+ uint64_t pg = vg[reg_off >> 6];
+ do {
+ if ((pg >> (reg_off & 63)) & 1) {
+ check(env, mtedesc, addr, ra);
+ }
+ reg_off += esize;
+ mem_off += msize;
+ } while (reg_off & 63);
+ } while (reg_off <= reg_last);
+ }
+}
+
+typedef void sve_cont_ldst_mte_check_fn(SVEContLdSt *info, CPUARMState *env,
+ uint64_t *vg, target_ulong addr,
+ int esize, int msize, uint32_t mtedesc,
+ uintptr_t ra);
+
+static void sve_cont_ldst_mte_check1(SVEContLdSt *info, CPUARMState *env,
+ uint64_t *vg, target_ulong addr,
+ int esize, int msize, uint32_t mtedesc,
+ uintptr_t ra)
+{
+ sve_cont_ldst_mte_check_int(info, env, vg, addr, esize, msize,
+ mtedesc, ra, mte_check1);
+}
+
+static void sve_cont_ldst_mte_checkN(SVEContLdSt *info, CPUARMState *env,
+ uint64_t *vg, target_ulong addr,
+ int esize, int msize, uint32_t mtedesc,
+ uintptr_t ra)
+{
+ sve_cont_ldst_mte_check_int(info, env, vg, addr, esize, msize,
+ mtedesc, ra, mte_checkN);
+}
+
+
/*
* Common helper for all contiguous 1,2,3,4-register predicated stores.
*/
static inline QEMU_ALWAYS_INLINE
void sve_ldN_r(CPUARMState *env, uint64_t *vg, const target_ulong addr,
uint32_t desc, const uintptr_t retaddr,
- const int esz, const int msz, const int N,
+ const int esz, const int msz, const int N, uint32_t mtedesc,
sve_ldst1_host_fn *host_fn,
- sve_ldst1_tlb_fn *tlb_fn)
+ sve_ldst1_tlb_fn *tlb_fn,
+ sve_cont_ldst_mte_check_fn *mte_check_fn)
{
const unsigned rd = simd_data(desc);
const intptr_t reg_max = simd_oprsz(desc);
@@ -4597,7 +4671,14 @@ void sve_ldN_r(CPUARMState *env, uint64_t *vg, const target_ulong addr,
sve_cont_ldst_watchpoints(&info, env, vg, addr, 1 << esz, N << msz,
BP_MEM_READ, retaddr);
- /* TODO: MTE check. */
+ /*
+ * Handle mte checks for all active elements.
+ * Since TBI must be set for MTE, !mtedesc => !mte_active.
+ */
+ if (mte_check_fn && mtedesc) {
+ mte_check_fn(&info, env, vg, addr, 1 << esz, N << msz,
+ mtedesc, retaddr);
+ }
flags = info.page[0].flags | info.page[1].flags;
if (unlikely(flags != 0)) {
@@ -4703,26 +4784,67 @@ void sve_ldN_r(CPUARMState *env, uint64_t *vg, const target_ulong addr,
}
}
-#define DO_LD1_1(NAME, ESZ) \
-void HELPER(sve_##NAME##_r)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
-{ \
- sve_ldN_r(env, vg, addr, desc, GETPC(), ESZ, MO_8, 1, \
- sve_##NAME##_host, sve_##NAME##_tlb); \
+static inline QEMU_ALWAYS_INLINE
+void sve_ldN_r_mte(CPUARMState *env, uint64_t *vg, target_ulong addr,
+ uint32_t desc, const uintptr_t ra,
+ const int esz, const int msz, const int N,
+ sve_ldst1_host_fn *host_fn,
+ sve_ldst1_tlb_fn *tlb_fn)
+{
+ uint32_t mtedesc = desc >> (SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
+ int bit55 = extract64(addr, 55, 1);
+
+ /* Remove mtedesc from the normal sve descriptor. */
+ desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
+
+ /* Perform gross MTE suppression early. */
+ if (!tbi_check(desc, bit55) ||
+ tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+ mtedesc = 0;
+ }
+
+ sve_ldN_r(env, vg, addr, desc, ra, esz, msz, N, mtedesc, host_fn, tlb_fn,
+ N == 1 ? sve_cont_ldst_mte_check1 : sve_cont_ldst_mte_checkN);
}
-#define DO_LD1_2(NAME, ESZ, MSZ) \
-void HELPER(sve_##NAME##_le_r)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
-{ \
- sve_ldN_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, 1, \
- sve_##NAME##_le_host, sve_##NAME##_le_tlb); \
-} \
-void HELPER(sve_##NAME##_be_r)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
-{ \
- sve_ldN_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, 1, \
- sve_##NAME##_be_host, sve_##NAME##_be_tlb); \
+#define DO_LD1_1(NAME, ESZ) \
+void HELPER(sve_##NAME##_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldN_r(env, vg, addr, desc, GETPC(), ESZ, MO_8, 1, 0, \
+ sve_##NAME##_host, sve_##NAME##_tlb, NULL); \
+} \
+void HELPER(sve_##NAME##_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldN_r_mte(env, vg, addr, desc, GETPC(), ESZ, MO_8, 1, \
+ sve_##NAME##_host, sve_##NAME##_tlb); \
+}
+
+#define DO_LD1_2(NAME, ESZ, MSZ) \
+void HELPER(sve_##NAME##_le_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldN_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, 1, 0, \
+ sve_##NAME##_le_host, sve_##NAME##_le_tlb, NULL); \
+} \
+void HELPER(sve_##NAME##_be_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldN_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, 1, 0, \
+ sve_##NAME##_be_host, sve_##NAME##_be_tlb, NULL); \
+} \
+void HELPER(sve_##NAME##_le_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldN_r_mte(env, vg, addr, desc, GETPC(), ESZ, MSZ, 1, \
+ sve_##NAME##_le_host, sve_##NAME##_le_tlb); \
+} \
+void HELPER(sve_##NAME##_be_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldN_r_mte(env, vg, addr, desc, GETPC(), ESZ, MSZ, 1, \
+ sve_##NAME##_be_host, sve_##NAME##_be_tlb); \
}
DO_LD1_1(ld1bb, MO_8)
@@ -4748,26 +4870,44 @@ DO_LD1_2(ld1dd, MO_64, MO_64)
#undef DO_LD1_1
#undef DO_LD1_2
-#define DO_LDN_1(N) \
-void HELPER(sve_ld##N##bb_r)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
-{ \
- sve_ldN_r(env, vg, addr, desc, GETPC(), MO_8, MO_8, N, \
- sve_ld1bb_host, sve_ld1bb_tlb); \
+#define DO_LDN_1(N) \
+void HELPER(sve_ld##N##bb_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldN_r(env, vg, addr, desc, GETPC(), MO_8, MO_8, N, 0, \
+ sve_ld1bb_host, sve_ld1bb_tlb, NULL); \
+} \
+void HELPER(sve_ld##N##bb_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldN_r_mte(env, vg, addr, desc, GETPC(), MO_8, MO_8, N, \
+ sve_ld1bb_host, sve_ld1bb_tlb); \
}
-#define DO_LDN_2(N, SUFF, ESZ) \
-void HELPER(sve_ld##N##SUFF##_le_r)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
-{ \
- sve_ldN_r(env, vg, addr, desc, GETPC(), ESZ, ESZ, N, \
- sve_ld1##SUFF##_le_host, sve_ld1##SUFF##_le_tlb); \
-} \
-void HELPER(sve_ld##N##SUFF##_be_r)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
-{ \
- sve_ldN_r(env, vg, addr, desc, GETPC(), ESZ, ESZ, N, \
- sve_ld1##SUFF##_be_host, sve_ld1##SUFF##_be_tlb); \
+#define DO_LDN_2(N, SUFF, ESZ) \
+void HELPER(sve_ld##N##SUFF##_le_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldN_r(env, vg, addr, desc, GETPC(), ESZ, ESZ, N, 0, \
+ sve_ld1##SUFF##_le_host, sve_ld1##SUFF##_le_tlb, NULL); \
+} \
+void HELPER(sve_ld##N##SUFF##_be_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldN_r(env, vg, addr, desc, GETPC(), ESZ, ESZ, N, 0, \
+ sve_ld1##SUFF##_be_host, sve_ld1##SUFF##_be_tlb, NULL); \
+} \
+void HELPER(sve_ld##N##SUFF##_le_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldN_r_mte(env, vg, addr, desc, GETPC(), ESZ, ESZ, N, \
+ sve_ld1##SUFF##_le_host, sve_ld1##SUFF##_le_tlb); \
+} \
+void HELPER(sve_ld##N##SUFF##_be_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldN_r_mte(env, vg, addr, desc, GETPC(), ESZ, ESZ, N, \
+ sve_ld1##SUFF##_be_host, sve_ld1##SUFF##_be_tlb); \
}
DO_LDN_1(2)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index e5d12edd55..e8b9873a4c 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4542,18 +4542,32 @@ static const uint8_t dtype_esz[16] = {
};
static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
- int dtype, gen_helper_gvec_mem *fn)
+ int dtype, uint32_t mte_n, bool is_write,
+ gen_helper_gvec_mem *fn)
{
unsigned vsz = vec_full_reg_size(s);
TCGv_ptr t_pg;
TCGv_i32 t_desc;
- int desc;
+ int desc = 0;
- /* For e.g. LD4, there are not enough arguments to pass all 4
+ /*
+ * For e.g. LD4, there are not enough arguments to pass all 4
* registers as pointers, so encode the regno into the data field.
* For consistency, do this even for LD1.
+ * TODO: mte_n check here while callers are updated.
*/
- desc = simd_desc(vsz, vsz, zt);
+ if (mte_n && s->mte_active[0]) {
+ int msz = dtype_msz(dtype);
+
+ desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
+ desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
+ desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
+ desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
+ desc = FIELD_DP32(desc, MTEDESC, ESIZE, 1 << msz);
+ desc = FIELD_DP32(desc, MTEDESC, TSIZE, mte_n << msz);
+ desc <<= SVE_MTEDESC_SHIFT;
+ }
+ desc = simd_desc(vsz, vsz, zt | desc);
t_desc = tcg_const_i32(desc);
t_pg = tcg_temp_new_ptr();
@@ -4567,64 +4581,132 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
static void do_ld_zpa(DisasContext *s, int zt, int pg,
TCGv_i64 addr, int dtype, int nreg)
{
- static gen_helper_gvec_mem * const fns[2][16][4] = {
- /* Little-endian */
- { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r,
+ static gen_helper_gvec_mem * const fns[2][2][16][4] = {
+ { /* mte inactive, little-endian */
+ { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r,
gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r },
- { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1sds_le_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1hh_le_r, gen_helper_sve_ld2hh_le_r,
- gen_helper_sve_ld3hh_le_r, gen_helper_sve_ld4hh_le_r },
- { gen_helper_sve_ld1hsu_le_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1hdu_le_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1sds_le_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hh_le_r, gen_helper_sve_ld2hh_le_r,
+ gen_helper_sve_ld3hh_le_r, gen_helper_sve_ld4hh_le_r },
+ { gen_helper_sve_ld1hsu_le_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hdu_le_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1hds_le_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1hss_le_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1ss_le_r, gen_helper_sve_ld2ss_le_r,
- gen_helper_sve_ld3ss_le_r, gen_helper_sve_ld4ss_le_r },
- { gen_helper_sve_ld1sdu_le_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hds_le_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hss_le_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1ss_le_r, gen_helper_sve_ld2ss_le_r,
+ gen_helper_sve_ld3ss_le_r, gen_helper_sve_ld4ss_le_r },
+ { gen_helper_sve_ld1sdu_le_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bds_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bss_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1dd_le_r, gen_helper_sve_ld2dd_le_r,
- gen_helper_sve_ld3dd_le_r, gen_helper_sve_ld4dd_le_r } },
+ { gen_helper_sve_ld1bds_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bss_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1dd_le_r, gen_helper_sve_ld2dd_le_r,
+ gen_helper_sve_ld3dd_le_r, gen_helper_sve_ld4dd_le_r } },
- /* Big-endian */
- { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r,
- gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r },
- { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL },
+ /* mte inactive, big-endian */
+ { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r,
+ gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r },
+ { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1sds_be_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1hh_be_r, gen_helper_sve_ld2hh_be_r,
- gen_helper_sve_ld3hh_be_r, gen_helper_sve_ld4hh_be_r },
- { gen_helper_sve_ld1hsu_be_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1hdu_be_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1sds_be_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hh_be_r, gen_helper_sve_ld2hh_be_r,
+ gen_helper_sve_ld3hh_be_r, gen_helper_sve_ld4hh_be_r },
+ { gen_helper_sve_ld1hsu_be_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hdu_be_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1hds_be_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1hss_be_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1ss_be_r, gen_helper_sve_ld2ss_be_r,
- gen_helper_sve_ld3ss_be_r, gen_helper_sve_ld4ss_be_r },
- { gen_helper_sve_ld1sdu_be_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hds_be_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hss_be_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1ss_be_r, gen_helper_sve_ld2ss_be_r,
+ gen_helper_sve_ld3ss_be_r, gen_helper_sve_ld4ss_be_r },
+ { gen_helper_sve_ld1sdu_be_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bds_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bss_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL },
- { gen_helper_sve_ld1dd_be_r, gen_helper_sve_ld2dd_be_r,
- gen_helper_sve_ld3dd_be_r, gen_helper_sve_ld4dd_be_r } }
+ { gen_helper_sve_ld1bds_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bss_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL },
+ { gen_helper_sve_ld1dd_be_r, gen_helper_sve_ld2dd_be_r,
+ gen_helper_sve_ld3dd_be_r, gen_helper_sve_ld4dd_be_r } } },
+
+ { /* mte active, little-endian */
+ { { gen_helper_sve_ld1bb_r_mte,
+ gen_helper_sve_ld2bb_r_mte,
+ gen_helper_sve_ld3bb_r_mte,
+ gen_helper_sve_ld4bb_r_mte },
+ { gen_helper_sve_ld1bhu_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bsu_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bdu_r_mte, NULL, NULL, NULL },
+
+ { gen_helper_sve_ld1sds_le_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hh_le_r_mte,
+ gen_helper_sve_ld2hh_le_r_mte,
+ gen_helper_sve_ld3hh_le_r_mte,
+ gen_helper_sve_ld4hh_le_r_mte },
+ { gen_helper_sve_ld1hsu_le_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hdu_le_r_mte, NULL, NULL, NULL },
+
+ { gen_helper_sve_ld1hds_le_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hss_le_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1ss_le_r_mte,
+ gen_helper_sve_ld2ss_le_r_mte,
+ gen_helper_sve_ld3ss_le_r_mte,
+ gen_helper_sve_ld4ss_le_r_mte },
+ { gen_helper_sve_ld1sdu_le_r_mte, NULL, NULL, NULL },
+
+ { gen_helper_sve_ld1bds_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bss_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bhs_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1dd_le_r_mte,
+ gen_helper_sve_ld2dd_le_r_mte,
+ gen_helper_sve_ld3dd_le_r_mte,
+ gen_helper_sve_ld4dd_le_r_mte } },
+
+ /* mte active, big-endian */
+ { { gen_helper_sve_ld1bb_r_mte,
+ gen_helper_sve_ld2bb_r_mte,
+ gen_helper_sve_ld3bb_r_mte,
+ gen_helper_sve_ld4bb_r_mte },
+ { gen_helper_sve_ld1bhu_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bsu_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bdu_r_mte, NULL, NULL, NULL },
+
+ { gen_helper_sve_ld1sds_be_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hh_be_r_mte,
+ gen_helper_sve_ld2hh_be_r_mte,
+ gen_helper_sve_ld3hh_be_r_mte,
+ gen_helper_sve_ld4hh_be_r_mte },
+ { gen_helper_sve_ld1hsu_be_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hdu_be_r_mte, NULL, NULL, NULL },
+
+ { gen_helper_sve_ld1hds_be_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1hss_be_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1ss_be_r_mte,
+ gen_helper_sve_ld2ss_be_r_mte,
+ gen_helper_sve_ld3ss_be_r_mte,
+ gen_helper_sve_ld4ss_be_r_mte },
+ { gen_helper_sve_ld1sdu_be_r_mte, NULL, NULL, NULL },
+
+ { gen_helper_sve_ld1bds_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bss_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1bhs_r_mte, NULL, NULL, NULL },
+ { gen_helper_sve_ld1dd_be_r_mte,
+ gen_helper_sve_ld2dd_be_r_mte,
+ gen_helper_sve_ld3dd_be_r_mte,
+ gen_helper_sve_ld4dd_be_r_mte } } },
};
- gen_helper_gvec_mem *fn = fns[s->be_data == MO_BE][dtype][nreg];
+ gen_helper_gvec_mem *fn
+ = fns[s->mte_active[0]][s->be_data == MO_BE][dtype][nreg];
- /* While there are holes in the table, they are not
+ /*
+ * While there are holes in the table, they are not
* accessible via the instruction encoding.
*/
assert(fn != NULL);
- do_mem_zpa(s, zt, pg, addr, dtype, fn);
+ do_mem_zpa(s, zt, pg, addr, dtype, nreg, false, fn);
}
static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a)
@@ -4706,7 +4788,7 @@ static bool trans_LDFF1_zprr(DisasContext *s, arg_rprr_load *a)
TCGv_i64 addr = new_tmp_a64(s);
tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype));
tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
- do_mem_zpa(s, a->rd, a->pg, addr, a->dtype,
+ do_mem_zpa(s, a->rd, a->pg, addr, a->dtype, 0, false,
fns[s->be_data == MO_BE][a->dtype]);
}
return true;
@@ -4765,7 +4847,7 @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a)
TCGv_i64 addr = new_tmp_a64(s);
tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), off);
- do_mem_zpa(s, a->rd, a->pg, addr, a->dtype,
+ do_mem_zpa(s, a->rd, a->pg, addr, a->dtype, 0, false,
fns[s->be_data == MO_BE][a->dtype]);
}
return true;
@@ -4967,7 +5049,7 @@ static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
fn = fn_multiple[be][nreg - 1][msz];
}
assert(fn != NULL);
- do_mem_zpa(s, zt, pg, addr, msz_dtype(s, msz), fn);
+ do_mem_zpa(s, zt, pg, addr, msz_dtype(s, msz), 0, true, fn);
}
static bool trans_ST_zprr(DisasContext *s, arg_rprr_store *a)
--
2.20.1
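
The descriptor juggling in do_mem_zpa() above deserves a picture: the
MTE fields ride in the high bits of the ordinary simd_desc() data
field, pre-shifted by SVE_MTEDESC_SHIFT, so the helpers can peel them
back off without disturbing the existing layout. A minimal model of
the round trip (illustrative: the shift values below are placeholders,
not QEMU's real constants, and the plain shift stands in for
simd_desc(), which also packs the vector lengths into the low bits):

    #include <assert.h>
    #include <stdint.h>

    enum { SIMD_DATA_SHIFT = 10, SVE_MTEDESC_SHIFT = 5 }; /* placeholders */

    /* Mirrors the bitops extract32() helper. */
    static uint32_t extract32(uint32_t value, int start, int length)
    {
        return (value >> start) & (~0u >> (32 - length));
    }

    int main(void)
    {
        uint32_t zt = 13;          /* destination register number */
        uint32_t mtedesc = 0x2a5;  /* the packed MTEDESC fields */

        /* translate-sve.c side: pre-shift mtedesc, then or it into
           the data argument of simd_desc() together with zt.  */
        uint32_t data = zt | (mtedesc << SVE_MTEDESC_SHIFT);
        uint32_t desc = data << SIMD_DATA_SHIFT;  /* simd_desc() stand-in */

        /* sve_helper.c side: peel mtedesc off the top, and keep
           everything below it as the normal sve descriptor.  */
        assert(desc >> (SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT) == mtedesc);
        assert(extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT)
               == (zt << SIMD_DATA_SHIFT));
        return 0;
    }

Because the unpack is the first thing the _mte entry points do, the
legacy non-mte helpers keep their existing descriptor layout unchanged.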
* [PATCH v6 32/42] target/arm: Add mte helpers for sve scalar + int stores
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Because the elements are sequential, we can eliminate many tests all
at once: when the tag hits the TCMA match-all case, or when the
page(s) are not Tagged.
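Both suppressions depend only on the pointer or the page, never on the
individual element, which is why one up-front test can cover the whole
access. A minimal standalone model of the pointer half (illustrative:
the real tbi_check()/tcma_check() read the TBI/TCMA bits packed into
mtedesc rather than taking them as arrays):

    #include <stdbool.h>
    #include <stdint.h>

    static int allocation_tag_from_addr(uint64_t addr)
    {
        return (addr >> 56) & 0xf;      /* logical tag, bits [59:56] */
    }

    /* One check for the whole sequential access: if it fires, every
       element is exempt, so mtedesc is zeroed and no per-element tag
       checks run at all.  */
    static bool mte_suppressed(uint64_t addr, const bool tbi[2],
                               const bool tcma[2])
    {
        int bit55 = (addr >> 55) & 1;
        if (!tbi[bit55]) {
            return true;    /* no tag byte in the pointer at all */
        }
        /* Under TCMA, tag 0b0000 (bit55 == 0) or 0b1111 (bit55 == 1)
           is the match-all tag and is never checked.  */
        return tcma[bit55] &&
               allocation_tag_from_addr(addr) == (bit55 ? 0xf : 0);
    }

The page half is handled after the TLB lookup, via the mte_check_fn
hook threaded through sve_stN_r() below, reusing the
sve_cont_ldst_mte_check1/checkN routines introduced for the loads.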
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper-sve.h | 47 +++++++++++
target/arm/sve_helper.c | 95 ++++++++++++++++------
target/arm/translate-sve.c | 162 ++++++++++++++++++++++++-------------
3 files changed, 226 insertions(+), 78 deletions(-)
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index af1c6967a6..e81d06b27c 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1351,6 +1351,53 @@ DEF_HELPER_FLAGS_4(sve_st1hd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_st1sd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_st1sd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1bb_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2bb_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3bb_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4bb_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1hh_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2hh_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3hh_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4hh_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1hh_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2hh_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3hh_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4hh_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1ss_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2ss_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3ss_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4ss_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1ss_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2ss_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3ss_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4ss_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1dd_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2dd_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3dd_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4dd_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1dd_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2dd_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3dd_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4dd_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1bh_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1bs_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1bd_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1hs_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1hd_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1hs_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1hd_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1sd_le_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1sd_be_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
DEF_HELPER_FLAGS_6(sve_ldbsu_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
DEF_HELPER_FLAGS_6(sve_ldhsu_le_zsu, TCG_CALL_NO_WG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 0fbf331742..0b8522ff01 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -5187,11 +5187,12 @@ DO_LDFF1_LDNF1_2(dd, MO_64, MO_64)
*/
static inline QEMU_ALWAYS_INLINE
-void sve_stN_r(CPUARMState *env, uint64_t *vg, target_ulong addr, uint32_t desc,
- const uintptr_t retaddr, const int esz,
- const int msz, const int N,
+void sve_stN_r(CPUARMState *env, uint64_t *vg, target_ulong addr,
+ uint32_t desc, const uintptr_t retaddr,
+ const int esz, const int msz, const int N, uint32_t mtedesc,
sve_ldst1_host_fn *host_fn,
- sve_ldst1_tlb_fn *tlb_fn)
+ sve_ldst1_tlb_fn *tlb_fn,
+ sve_cont_ldst_mte_check_fn *mte_check_fn)
{
const unsigned rd = simd_data(desc);
const intptr_t reg_max = simd_oprsz(desc);
@@ -5213,7 +5214,14 @@ void sve_stN_r(CPUARMState *env, uint64_t *vg, target_ulong addr, uint32_t desc,
sve_cont_ldst_watchpoints(&info, env, vg, addr, 1 << esz, N << msz,
BP_MEM_WRITE, retaddr);
- /* TODO: MTE check. */
+ /*
+ * Handle mte checks for all active elements.
+ * Since TBI must be set for MTE, !mtedesc => !mte_active.
+ */
+ if (mte_check_fn && mtedesc) {
+ mte_check_fn(&info, env, vg, addr, 1 << esz, N << msz,
+ mtedesc, retaddr);
+ }
flags = info.page[0].flags | info.page[1].flags;
if (unlikely(flags != 0)) {
@@ -5307,26 +5315,67 @@ void sve_stN_r(CPUARMState *env, uint64_t *vg, target_ulong addr, uint32_t desc,
}
}
-#define DO_STN_1(N, NAME, ESZ) \
-void HELPER(sve_st##N##NAME##_r)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
-{ \
- sve_stN_r(env, vg, addr, desc, GETPC(), ESZ, MO_8, N, \
- sve_st1##NAME##_host, sve_st1##NAME##_tlb); \
+static inline QEMU_ALWAYS_INLINE
+void sve_stN_r_mte(CPUARMState *env, uint64_t *vg, target_ulong addr,
+ uint32_t desc, const uintptr_t ra,
+ const int esz, const int msz, const int N,
+ sve_ldst1_host_fn *host_fn,
+ sve_ldst1_tlb_fn *tlb_fn)
+{
+ uint32_t mtedesc = desc >> (SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
+ int bit55 = extract64(addr, 55, 1);
+
+ /* Remove mtedesc from the normal sve descriptor. */
+ desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
+
+ /* Perform gross MTE suppression early. */
+ if (!tbi_check(desc, bit55) ||
+ tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+ mtedesc = 0;
+ }
+
+ sve_stN_r(env, vg, addr, desc, ra, esz, msz, N, mtedesc, host_fn, tlb_fn,
+ N == 1 ? sve_cont_ldst_mte_check1 : sve_cont_ldst_mte_checkN);
}
-#define DO_STN_2(N, NAME, ESZ, MSZ) \
-void HELPER(sve_st##N##NAME##_le_r)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
-{ \
- sve_stN_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, N, \
- sve_st1##NAME##_le_host, sve_st1##NAME##_le_tlb); \
-} \
-void HELPER(sve_st##N##NAME##_be_r)(CPUARMState *env, void *vg, \
- target_ulong addr, uint32_t desc) \
-{ \
- sve_stN_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, N, \
- sve_st1##NAME##_be_host, sve_st1##NAME##_be_tlb); \
+#define DO_STN_1(N, NAME, ESZ) \
+void HELPER(sve_st##N##NAME##_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_stN_r(env, vg, addr, desc, GETPC(), ESZ, MO_8, N, 0, \
+ sve_st1##NAME##_host, sve_st1##NAME##_tlb, NULL); \
+} \
+void HELPER(sve_st##N##NAME##_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_stN_r_mte(env, vg, addr, desc, GETPC(), ESZ, MO_8, N, \
+ sve_st1##NAME##_host, sve_st1##NAME##_tlb); \
+}
+
+#define DO_STN_2(N, NAME, ESZ, MSZ) \
+void HELPER(sve_st##N##NAME##_le_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_stN_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, N, 0, \
+ sve_st1##NAME##_le_host, sve_st1##NAME##_le_tlb, NULL); \
+} \
+void HELPER(sve_st##N##NAME##_be_r)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_stN_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, N, 0, \
+ sve_st1##NAME##_be_host, sve_st1##NAME##_be_tlb, NULL); \
+} \
+void HELPER(sve_st##N##NAME##_le_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_stN_r_mte(env, vg, addr, desc, GETPC(), ESZ, MSZ, N, \
+ sve_st1##NAME##_le_host, sve_st1##NAME##_le_tlb); \
+} \
+void HELPER(sve_st##N##NAME##_be_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_stN_r_mte(env, vg, addr, desc, GETPC(), ESZ, MSZ, N, \
+ sve_st1##NAME##_be_host, sve_st1##NAME##_be_tlb); \
}
DO_STN_1(1, bb, MO_8)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index e8b9873a4c..8d2a93e7d9 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4983,73 +4983,125 @@ static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a)
static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
int msz, int esz, int nreg)
{
- static gen_helper_gvec_mem * const fn_single[2][4][4] = {
- { { gen_helper_sve_st1bb_r,
- gen_helper_sve_st1bh_r,
- gen_helper_sve_st1bs_r,
- gen_helper_sve_st1bd_r },
- { NULL,
- gen_helper_sve_st1hh_le_r,
- gen_helper_sve_st1hs_le_r,
- gen_helper_sve_st1hd_le_r },
- { NULL, NULL,
- gen_helper_sve_st1ss_le_r,
- gen_helper_sve_st1sd_le_r },
- { NULL, NULL, NULL,
- gen_helper_sve_st1dd_le_r } },
- { { gen_helper_sve_st1bb_r,
- gen_helper_sve_st1bh_r,
- gen_helper_sve_st1bs_r,
- gen_helper_sve_st1bd_r },
- { NULL,
- gen_helper_sve_st1hh_be_r,
- gen_helper_sve_st1hs_be_r,
- gen_helper_sve_st1hd_be_r },
- { NULL, NULL,
- gen_helper_sve_st1ss_be_r,
- gen_helper_sve_st1sd_be_r },
- { NULL, NULL, NULL,
- gen_helper_sve_st1dd_be_r } },
+ static gen_helper_gvec_mem * const fn_single[2][2][4][4] = {
+ { { { gen_helper_sve_st1bb_r,
+ gen_helper_sve_st1bh_r,
+ gen_helper_sve_st1bs_r,
+ gen_helper_sve_st1bd_r },
+ { NULL,
+ gen_helper_sve_st1hh_le_r,
+ gen_helper_sve_st1hs_le_r,
+ gen_helper_sve_st1hd_le_r },
+ { NULL, NULL,
+ gen_helper_sve_st1ss_le_r,
+ gen_helper_sve_st1sd_le_r },
+ { NULL, NULL, NULL,
+ gen_helper_sve_st1dd_le_r } },
+ { { gen_helper_sve_st1bb_r,
+ gen_helper_sve_st1bh_r,
+ gen_helper_sve_st1bs_r,
+ gen_helper_sve_st1bd_r },
+ { NULL,
+ gen_helper_sve_st1hh_be_r,
+ gen_helper_sve_st1hs_be_r,
+ gen_helper_sve_st1hd_be_r },
+ { NULL, NULL,
+ gen_helper_sve_st1ss_be_r,
+ gen_helper_sve_st1sd_be_r },
+ { NULL, NULL, NULL,
+ gen_helper_sve_st1dd_be_r } } },
+
+ { { { gen_helper_sve_st1bb_r_mte,
+ gen_helper_sve_st1bh_r_mte,
+ gen_helper_sve_st1bs_r_mte,
+ gen_helper_sve_st1bd_r_mte },
+ { NULL,
+ gen_helper_sve_st1hh_le_r_mte,
+ gen_helper_sve_st1hs_le_r_mte,
+ gen_helper_sve_st1hd_le_r_mte },
+ { NULL, NULL,
+ gen_helper_sve_st1ss_le_r_mte,
+ gen_helper_sve_st1sd_le_r_mte },
+ { NULL, NULL, NULL,
+ gen_helper_sve_st1dd_le_r_mte } },
+ { { gen_helper_sve_st1bb_r_mte,
+ gen_helper_sve_st1bh_r_mte,
+ gen_helper_sve_st1bs_r_mte,
+ gen_helper_sve_st1bd_r_mte },
+ { NULL,
+ gen_helper_sve_st1hh_be_r_mte,
+ gen_helper_sve_st1hs_be_r_mte,
+ gen_helper_sve_st1hd_be_r_mte },
+ { NULL, NULL,
+ gen_helper_sve_st1ss_be_r_mte,
+ gen_helper_sve_st1sd_be_r_mte },
+ { NULL, NULL, NULL,
+ gen_helper_sve_st1dd_be_r_mte } } },
};
- static gen_helper_gvec_mem * const fn_multiple[2][3][4] = {
- { { gen_helper_sve_st2bb_r,
- gen_helper_sve_st2hh_le_r,
- gen_helper_sve_st2ss_le_r,
- gen_helper_sve_st2dd_le_r },
- { gen_helper_sve_st3bb_r,
- gen_helper_sve_st3hh_le_r,
- gen_helper_sve_st3ss_le_r,
- gen_helper_sve_st3dd_le_r },
- { gen_helper_sve_st4bb_r,
- gen_helper_sve_st4hh_le_r,
- gen_helper_sve_st4ss_le_r,
- gen_helper_sve_st4dd_le_r } },
- { { gen_helper_sve_st2bb_r,
- gen_helper_sve_st2hh_be_r,
- gen_helper_sve_st2ss_be_r,
- gen_helper_sve_st2dd_be_r },
- { gen_helper_sve_st3bb_r,
- gen_helper_sve_st3hh_be_r,
- gen_helper_sve_st3ss_be_r,
- gen_helper_sve_st3dd_be_r },
- { gen_helper_sve_st4bb_r,
- gen_helper_sve_st4hh_be_r,
- gen_helper_sve_st4ss_be_r,
- gen_helper_sve_st4dd_be_r } },
+ static gen_helper_gvec_mem * const fn_multiple[2][2][3][4] = {
+ { { { gen_helper_sve_st2bb_r,
+ gen_helper_sve_st2hh_le_r,
+ gen_helper_sve_st2ss_le_r,
+ gen_helper_sve_st2dd_le_r },
+ { gen_helper_sve_st3bb_r,
+ gen_helper_sve_st3hh_le_r,
+ gen_helper_sve_st3ss_le_r,
+ gen_helper_sve_st3dd_le_r },
+ { gen_helper_sve_st4bb_r,
+ gen_helper_sve_st4hh_le_r,
+ gen_helper_sve_st4ss_le_r,
+ gen_helper_sve_st4dd_le_r } },
+ { { gen_helper_sve_st2bb_r,
+ gen_helper_sve_st2hh_be_r,
+ gen_helper_sve_st2ss_be_r,
+ gen_helper_sve_st2dd_be_r },
+ { gen_helper_sve_st3bb_r,
+ gen_helper_sve_st3hh_be_r,
+ gen_helper_sve_st3ss_be_r,
+ gen_helper_sve_st3dd_be_r },
+ { gen_helper_sve_st4bb_r,
+ gen_helper_sve_st4hh_be_r,
+ gen_helper_sve_st4ss_be_r,
+ gen_helper_sve_st4dd_be_r } } },
+ { { { gen_helper_sve_st2bb_r_mte,
+ gen_helper_sve_st2hh_le_r_mte,
+ gen_helper_sve_st2ss_le_r_mte,
+ gen_helper_sve_st2dd_le_r_mte },
+ { gen_helper_sve_st3bb_r_mte,
+ gen_helper_sve_st3hh_le_r_mte,
+ gen_helper_sve_st3ss_le_r_mte,
+ gen_helper_sve_st3dd_le_r_mte },
+ { gen_helper_sve_st4bb_r_mte,
+ gen_helper_sve_st4hh_le_r_mte,
+ gen_helper_sve_st4ss_le_r_mte,
+ gen_helper_sve_st4dd_le_r_mte } },
+ { { gen_helper_sve_st2bb_r_mte,
+ gen_helper_sve_st2hh_be_r_mte,
+ gen_helper_sve_st2ss_be_r_mte,
+ gen_helper_sve_st2dd_be_r_mte },
+ { gen_helper_sve_st3bb_r_mte,
+ gen_helper_sve_st3hh_be_r_mte,
+ gen_helper_sve_st3ss_be_r_mte,
+ gen_helper_sve_st3dd_be_r_mte },
+ { gen_helper_sve_st4bb_r_mte,
+ gen_helper_sve_st4hh_be_r_mte,
+ gen_helper_sve_st4ss_be_r_mte,
+ gen_helper_sve_st4dd_be_r_mte } } },
};
gen_helper_gvec_mem *fn;
int be = s->be_data == MO_BE;
if (nreg == 0) {
/* ST1 */
- fn = fn_single[be][msz][esz];
+ fn = fn_single[s->mte_active[0]][be][msz][esz];
+ nreg = 1;
} else {
/* ST2, ST3, ST4 -- msz == esz, enforced by encoding */
assert(msz == esz);
- fn = fn_multiple[be][nreg - 1][msz];
+ fn = fn_multiple[s->mte_active[0]][be][nreg - 1][msz];
}
assert(fn != NULL);
- do_mem_zpa(s, zt, pg, addr, msz_dtype(s, msz), 0, true, fn);
+ do_mem_zpa(s, zt, pg, addr, msz_dtype(s, msz), nreg, true, fn);
}
static bool trans_ST_zprr(DisasContext *s, arg_rprr_store *a)
--
2.20.1
* [PATCH v6 33/42] target/arm: Add mte helpers for sve scalar + int ff/nf loads
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Because the elements are sequential, we can eliminate many tests all
at once: when the tag hits the TCMA match-all case, or when the
page(s) are not Tagged.
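The only new wrinkle relative to the normal loads is the
first-fault/no-fault split: the first active element of LDFF1 may
legitimately trap, while everything after it (and every element of
LDNF1) must not. A standalone model of that policy, assuming
mte_check1() raises on a mismatch and mte_probe1() merely reports one:

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* tag_ok[i] models "the MTE check for element i would pass".
       Returns how many elements the load actually filled in.  */
    static size_t ldff_model(const bool *tag_ok, size_t n)
    {
        /* First-fault: element 0 went through the trapping check
           (mte_check1), so a mismatch never reaches this point --
           the guest takes a tag check fault instead.  */
        assert(n == 0 || tag_ok[0]);

        for (size_t i = 1; i < n; i++) {
            if (!tag_ok[i]) {
                return i;   /* non-trapping probe (mte_probe1) failed:
                               truncate the load, record it in FFR */
            }
        }
        return n;
    }

    int main(void)
    {
        bool ok[4] = { true, true, false, true };
        assert(ldff_model(ok, 4) == 2);  /* stops before element 2 */
        return 0;
    }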
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper-sve.h | 98 ++++++++++++++++
target/arm/sve_helper.c | 98 ++++++++++++++--
target/arm/translate-sve.c | 232 +++++++++++++++++++++++++------------
3 files changed, 342 insertions(+), 86 deletions(-)
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index e81d06b27c..849478fc76 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1273,6 +1273,55 @@ DEF_HELPER_FLAGS_4(sve_ldff1sds_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldff1dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldff1dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bb_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bhu_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bsu_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bdu_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bhs_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bss_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bds_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1hh_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hsu_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hdu_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hss_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hds_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1hh_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hsu_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hdu_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hss_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hds_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1ss_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1sdu_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1sds_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1ss_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1sdu_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1sds_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1dd_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1dd_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+
DEF_HELPER_FLAGS_4(sve_ldnf1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldnf1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldnf1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
@@ -1304,6 +1353,55 @@ DEF_HELPER_FLAGS_4(sve_ldnf1sds_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldnf1dd_le_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_ldnf1dd_be_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bb_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bhu_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bsu_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bdu_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bhs_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bss_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bds_r_mte, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1hh_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hsu_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hdu_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hss_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hds_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1hh_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hsu_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hdu_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hss_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hds_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1ss_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1sdu_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1sds_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1ss_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1sdu_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1sds_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1dd_le_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1dd_be_r_mte, TCG_CALL_NO_WG,
+ void, env, ptr, tl, i32)
+
DEF_HELPER_FLAGS_4(sve_st1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_st2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
DEF_HELPER_FLAGS_4(sve_st3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 0b8522ff01..9e8ffb7062 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -4965,7 +4965,7 @@ static void record_fault(CPUARMState *env, uintptr_t i, uintptr_t oprsz)
*/
static inline QEMU_ALWAYS_INLINE
void sve_ldnfff1_r(CPUARMState *env, void *vg, const target_ulong addr,
- uint32_t desc, const uintptr_t retaddr,
+ uint32_t desc, const uintptr_t retaddr, uint32_t mtedesc,
const int esz, const int msz, const SVEContFault fault,
sve_ldst1_host_fn *host_fn,
sve_ldst1_tlb_fn *tlb_fn)
@@ -4997,13 +4997,22 @@ void sve_ldnfff1_r(CPUARMState *env, void *vg, const target_ulong addr,
mem_off = info.mem_off_first[0];
flags = info.page[0].flags;
+ /* Since TBI must be set for MTE, !mtedesc => !mte_active. */
+ if (info.page[0].attrs.target_tlb_bit1) {
+ mtedesc = 0;
+ }
+
if (fault == FAULT_FIRST) {
+ /* Trapping mte check for the first-fault element. */
+ if (mtedesc) {
+ mte_check1(env, mtedesc, addr + mem_off, retaddr);
+ }
+
/*
* Special handling of the first active element,
* if it crosses a page boundary or is MMIO.
*/
bool is_split = mem_off == info.mem_off_split;
- /* TODO: MTE check. */
if (unlikely(flags != 0) || unlikely(is_split)) {
/*
* Use the slow path for cross-page handling.
@@ -5038,7 +5047,10 @@ void sve_ldnfff1_r(CPUARMState *env, void *vg, const target_ulong addr,
/* Watchpoint hit, see below. */
goto do_fault;
}
- /* TODO: MTE check. */
+ if (mtedesc &&
+ !mte_probe1(env, mtedesc, addr + mem_off, retaddr)) {
+ goto do_fault;
+ }
/*
* Use the slow path for cross-page handling.
* This is RAM, without a watchpoint, and will not trap.
@@ -5081,7 +5093,10 @@ void sve_ldnfff1_r(CPUARMState *env, void *vg, const target_ulong addr,
1 << msz, BP_MEM_READ)) {
goto do_fault;
}
- /* TODO: MTE check. */
+ if (mtedesc &&
+ !mte_probe1(env, mtedesc, addr + mem_off, retaddr)) {
+ goto do_fault;
+ }
host_fn(vd, reg_off, host + mem_off);
}
reg_off += 1 << esz;
@@ -5119,44 +5134,103 @@ void sve_ldnfff1_r(CPUARMState *env, void *vg, const target_ulong addr,
record_fault(env, reg_off, reg_max);
}
-#define DO_LDFF1_LDNF1_1(PART, ESZ) \
+static inline QEMU_ALWAYS_INLINE
+void sve_ldnfff1_r_mte(CPUARMState *env, void *vg, target_ulong addr,
+ uint32_t desc, const uintptr_t retaddr,
+ const int esz, const int msz, const SVEContFault fault,
+ sve_ldst1_host_fn *host_fn,
+ sve_ldst1_tlb_fn *tlb_fn)
+{
+ uint32_t mtedesc = desc >> (SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
+ int bit55 = extract64(addr, 55, 1);
+
+ /* Remove mtedesc from the normal sve descriptor. */
+ desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
+
+ /* Perform gross MTE suppression early. */
+ if (!tbi_check(desc, bit55) ||
+ tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+ mtedesc = 0;
+ }
+
+ sve_ldnfff1_r(env, vg, addr, desc, retaddr, mtedesc,
+ esz, msz, fault, host_fn, tlb_fn);
+}
+
+#define DO_LDFF1_LDNF1_1(PART, ESZ) \
void HELPER(sve_ldff1##PART##_r)(CPUARMState *env, void *vg, \
target_ulong addr, uint32_t desc) \
{ \
- sve_ldnfff1_r(env, vg, addr, desc, GETPC(), ESZ, MO_8, FAULT_FIRST, \
+ sve_ldnfff1_r(env, vg, addr, desc, GETPC(), 0, ESZ, MO_8, FAULT_FIRST, \
sve_ld1##PART##_host, sve_ld1##PART##_tlb); \
} \
void HELPER(sve_ldnf1##PART##_r)(CPUARMState *env, void *vg, \
target_ulong addr, uint32_t desc) \
{ \
- sve_ldnfff1_r(env, vg, addr, desc, GETPC(), ESZ, MO_8, FAULT_NO, \
+ sve_ldnfff1_r(env, vg, addr, desc, GETPC(), 0, ESZ, MO_8, FAULT_NO, \
+ sve_ld1##PART##_host, sve_ld1##PART##_tlb); \
+} \
+void HELPER(sve_ldff1##PART##_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldnfff1_r_mte(env, vg, addr, desc, GETPC(), ESZ, MO_8, FAULT_FIRST, \
+ sve_ld1##PART##_host, sve_ld1##PART##_tlb); \
+} \
+void HELPER(sve_ldnf1##PART##_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldnfff1_r_mte(env, vg, addr, desc, GETPC(), ESZ, MO_8, FAULT_NO, \
sve_ld1##PART##_host, sve_ld1##PART##_tlb); \
}
-#define DO_LDFF1_LDNF1_2(PART, ESZ, MSZ) \
+#define DO_LDFF1_LDNF1_2(PART, ESZ, MSZ) \
void HELPER(sve_ldff1##PART##_le_r)(CPUARMState *env, void *vg, \
target_ulong addr, uint32_t desc) \
{ \
- sve_ldnfff1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, FAULT_FIRST, \
+ sve_ldnfff1_r(env, vg, addr, desc, GETPC(), 0, ESZ, MSZ, FAULT_FIRST, \
sve_ld1##PART##_le_host, sve_ld1##PART##_le_tlb); \
} \
void HELPER(sve_ldnf1##PART##_le_r)(CPUARMState *env, void *vg, \
target_ulong addr, uint32_t desc) \
{ \
- sve_ldnfff1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, FAULT_NO, \
+ sve_ldnfff1_r(env, vg, addr, desc, GETPC(), 0, ESZ, MSZ, FAULT_NO, \
sve_ld1##PART##_le_host, sve_ld1##PART##_le_tlb); \
} \
void HELPER(sve_ldff1##PART##_be_r)(CPUARMState *env, void *vg, \
target_ulong addr, uint32_t desc) \
{ \
- sve_ldnfff1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, FAULT_FIRST, \
+ sve_ldnfff1_r(env, vg, addr, desc, GETPC(), 0, ESZ, MSZ, FAULT_FIRST, \
sve_ld1##PART##_be_host, sve_ld1##PART##_be_tlb); \
} \
void HELPER(sve_ldnf1##PART##_be_r)(CPUARMState *env, void *vg, \
target_ulong addr, uint32_t desc) \
{ \
- sve_ldnfff1_r(env, vg, addr, desc, GETPC(), ESZ, MSZ, FAULT_NO, \
+ sve_ldnfff1_r(env, vg, addr, desc, GETPC(), 0, ESZ, MSZ, FAULT_NO, \
sve_ld1##PART##_be_host, sve_ld1##PART##_be_tlb); \
+} \
+void HELPER(sve_ldff1##PART##_le_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldnfff1_r_mte(env, vg, addr, desc, GETPC(), ESZ, MSZ, FAULT_FIRST, \
+ sve_ld1##PART##_le_host, sve_ld1##PART##_le_tlb); \
+} \
+void HELPER(sve_ldnf1##PART##_le_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldnfff1_r_mte(env, vg, addr, desc, GETPC(), ESZ, MSZ, FAULT_NO, \
+ sve_ld1##PART##_le_host, sve_ld1##PART##_le_tlb); \
+} \
+void HELPER(sve_ldff1##PART##_be_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldnfff1_r_mte(env, vg, addr, desc, GETPC(), ESZ, MSZ, FAULT_FIRST, \
+ sve_ld1##PART##_be_host, sve_ld1##PART##_be_tlb); \
+} \
+void HELPER(sve_ldnf1##PART##_be_r_mte)(CPUARMState *env, void *vg, \
+ target_ulong addr, uint32_t desc) \
+{ \
+ sve_ldnfff1_r_mte(env, vg, addr, desc, GETPC(), ESZ, MSZ, FAULT_NO, \
+ sve_ld1##PART##_be_host, sve_ld1##PART##_be_tlb); \
}
DO_LDFF1_LDNF1_1(bb, MO_8)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 8d2a93e7d9..5b399bb005 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4740,104 +4740,188 @@ static bool trans_LD_zpri(DisasContext *s, arg_rpri_load *a)
static bool trans_LDFF1_zprr(DisasContext *s, arg_rprr_load *a)
{
- static gen_helper_gvec_mem * const fns[2][16] = {
- /* Little-endian */
- { gen_helper_sve_ldff1bb_r,
- gen_helper_sve_ldff1bhu_r,
- gen_helper_sve_ldff1bsu_r,
- gen_helper_sve_ldff1bdu_r,
+ static gen_helper_gvec_mem * const fns[2][2][16] = {
+ { /* mte inactive, little-endian */
+ { gen_helper_sve_ldff1bb_r,
+ gen_helper_sve_ldff1bhu_r,
+ gen_helper_sve_ldff1bsu_r,
+ gen_helper_sve_ldff1bdu_r,
- gen_helper_sve_ldff1sds_le_r,
- gen_helper_sve_ldff1hh_le_r,
- gen_helper_sve_ldff1hsu_le_r,
- gen_helper_sve_ldff1hdu_le_r,
+ gen_helper_sve_ldff1sds_le_r,
+ gen_helper_sve_ldff1hh_le_r,
+ gen_helper_sve_ldff1hsu_le_r,
+ gen_helper_sve_ldff1hdu_le_r,
- gen_helper_sve_ldff1hds_le_r,
- gen_helper_sve_ldff1hss_le_r,
- gen_helper_sve_ldff1ss_le_r,
- gen_helper_sve_ldff1sdu_le_r,
+ gen_helper_sve_ldff1hds_le_r,
+ gen_helper_sve_ldff1hss_le_r,
+ gen_helper_sve_ldff1ss_le_r,
+ gen_helper_sve_ldff1sdu_le_r,
- gen_helper_sve_ldff1bds_r,
- gen_helper_sve_ldff1bss_r,
- gen_helper_sve_ldff1bhs_r,
- gen_helper_sve_ldff1dd_le_r },
+ gen_helper_sve_ldff1bds_r,
+ gen_helper_sve_ldff1bss_r,
+ gen_helper_sve_ldff1bhs_r,
+ gen_helper_sve_ldff1dd_le_r },
- /* Big-endian */
- { gen_helper_sve_ldff1bb_r,
- gen_helper_sve_ldff1bhu_r,
- gen_helper_sve_ldff1bsu_r,
- gen_helper_sve_ldff1bdu_r,
+ /* mte inactive, big-endian */
+ { gen_helper_sve_ldff1bb_r,
+ gen_helper_sve_ldff1bhu_r,
+ gen_helper_sve_ldff1bsu_r,
+ gen_helper_sve_ldff1bdu_r,
- gen_helper_sve_ldff1sds_be_r,
- gen_helper_sve_ldff1hh_be_r,
- gen_helper_sve_ldff1hsu_be_r,
- gen_helper_sve_ldff1hdu_be_r,
+ gen_helper_sve_ldff1sds_be_r,
+ gen_helper_sve_ldff1hh_be_r,
+ gen_helper_sve_ldff1hsu_be_r,
+ gen_helper_sve_ldff1hdu_be_r,
- gen_helper_sve_ldff1hds_be_r,
- gen_helper_sve_ldff1hss_be_r,
- gen_helper_sve_ldff1ss_be_r,
- gen_helper_sve_ldff1sdu_be_r,
+ gen_helper_sve_ldff1hds_be_r,
+ gen_helper_sve_ldff1hss_be_r,
+ gen_helper_sve_ldff1ss_be_r,
+ gen_helper_sve_ldff1sdu_be_r,
- gen_helper_sve_ldff1bds_r,
- gen_helper_sve_ldff1bss_r,
- gen_helper_sve_ldff1bhs_r,
- gen_helper_sve_ldff1dd_be_r },
+ gen_helper_sve_ldff1bds_r,
+ gen_helper_sve_ldff1bss_r,
+ gen_helper_sve_ldff1bhs_r,
+ gen_helper_sve_ldff1dd_be_r } },
+
+ { /* mte active, little-endian */
+ { gen_helper_sve_ldff1bb_r_mte,
+ gen_helper_sve_ldff1bhu_r_mte,
+ gen_helper_sve_ldff1bsu_r_mte,
+ gen_helper_sve_ldff1bdu_r_mte,
+
+ gen_helper_sve_ldff1sds_le_r_mte,
+ gen_helper_sve_ldff1hh_le_r_mte,
+ gen_helper_sve_ldff1hsu_le_r_mte,
+ gen_helper_sve_ldff1hdu_le_r_mte,
+
+ gen_helper_sve_ldff1hds_le_r_mte,
+ gen_helper_sve_ldff1hss_le_r_mte,
+ gen_helper_sve_ldff1ss_le_r_mte,
+ gen_helper_sve_ldff1sdu_le_r_mte,
+
+ gen_helper_sve_ldff1bds_r_mte,
+ gen_helper_sve_ldff1bss_r_mte,
+ gen_helper_sve_ldff1bhs_r_mte,
+ gen_helper_sve_ldff1dd_le_r_mte },
+
+ /* mte active, big-endian */
+ { gen_helper_sve_ldff1bb_r_mte,
+ gen_helper_sve_ldff1bhu_r_mte,
+ gen_helper_sve_ldff1bsu_r_mte,
+ gen_helper_sve_ldff1bdu_r_mte,
+
+ gen_helper_sve_ldff1sds_be_r_mte,
+ gen_helper_sve_ldff1hh_be_r_mte,
+ gen_helper_sve_ldff1hsu_be_r_mte,
+ gen_helper_sve_ldff1hdu_be_r_mte,
+
+ gen_helper_sve_ldff1hds_be_r_mte,
+ gen_helper_sve_ldff1hss_be_r_mte,
+ gen_helper_sve_ldff1ss_be_r_mte,
+ gen_helper_sve_ldff1sdu_be_r_mte,
+
+ gen_helper_sve_ldff1bds_r_mte,
+ gen_helper_sve_ldff1bss_r_mte,
+ gen_helper_sve_ldff1bhs_r_mte,
+ gen_helper_sve_ldff1dd_be_r_mte } },
};
if (sve_access_check(s)) {
TCGv_i64 addr = new_tmp_a64(s);
tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype));
tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
- do_mem_zpa(s, a->rd, a->pg, addr, a->dtype, 0, false,
- fns[s->be_data == MO_BE][a->dtype]);
+ do_mem_zpa(s, a->rd, a->pg, addr, a->dtype, 1, false,
+ fns[s->mte_active[0]][s->be_data == MO_BE][a->dtype]);
}
return true;
}
static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a)
{
- static gen_helper_gvec_mem * const fns[2][16] = {
- /* Little-endian */
- { gen_helper_sve_ldnf1bb_r,
- gen_helper_sve_ldnf1bhu_r,
- gen_helper_sve_ldnf1bsu_r,
- gen_helper_sve_ldnf1bdu_r,
+ static gen_helper_gvec_mem * const fns[2][2][16] = {
+ { /* mte inactive, little-endian */
+ { gen_helper_sve_ldnf1bb_r,
+ gen_helper_sve_ldnf1bhu_r,
+ gen_helper_sve_ldnf1bsu_r,
+ gen_helper_sve_ldnf1bdu_r,
- gen_helper_sve_ldnf1sds_le_r,
- gen_helper_sve_ldnf1hh_le_r,
- gen_helper_sve_ldnf1hsu_le_r,
- gen_helper_sve_ldnf1hdu_le_r,
+ gen_helper_sve_ldnf1sds_le_r,
+ gen_helper_sve_ldnf1hh_le_r,
+ gen_helper_sve_ldnf1hsu_le_r,
+ gen_helper_sve_ldnf1hdu_le_r,
- gen_helper_sve_ldnf1hds_le_r,
- gen_helper_sve_ldnf1hss_le_r,
- gen_helper_sve_ldnf1ss_le_r,
- gen_helper_sve_ldnf1sdu_le_r,
+ gen_helper_sve_ldnf1hds_le_r,
+ gen_helper_sve_ldnf1hss_le_r,
+ gen_helper_sve_ldnf1ss_le_r,
+ gen_helper_sve_ldnf1sdu_le_r,
- gen_helper_sve_ldnf1bds_r,
- gen_helper_sve_ldnf1bss_r,
- gen_helper_sve_ldnf1bhs_r,
- gen_helper_sve_ldnf1dd_le_r },
+ gen_helper_sve_ldnf1bds_r,
+ gen_helper_sve_ldnf1bss_r,
+ gen_helper_sve_ldnf1bhs_r,
+ gen_helper_sve_ldnf1dd_le_r },
- /* Big-endian */
- { gen_helper_sve_ldnf1bb_r,
- gen_helper_sve_ldnf1bhu_r,
- gen_helper_sve_ldnf1bsu_r,
- gen_helper_sve_ldnf1bdu_r,
+ /* mte inactive, big-endian */
+ { gen_helper_sve_ldnf1bb_r,
+ gen_helper_sve_ldnf1bhu_r,
+ gen_helper_sve_ldnf1bsu_r,
+ gen_helper_sve_ldnf1bdu_r,
- gen_helper_sve_ldnf1sds_be_r,
- gen_helper_sve_ldnf1hh_be_r,
- gen_helper_sve_ldnf1hsu_be_r,
- gen_helper_sve_ldnf1hdu_be_r,
+ gen_helper_sve_ldnf1sds_be_r,
+ gen_helper_sve_ldnf1hh_be_r,
+ gen_helper_sve_ldnf1hsu_be_r,
+ gen_helper_sve_ldnf1hdu_be_r,
- gen_helper_sve_ldnf1hds_be_r,
- gen_helper_sve_ldnf1hss_be_r,
- gen_helper_sve_ldnf1ss_be_r,
- gen_helper_sve_ldnf1sdu_be_r,
+ gen_helper_sve_ldnf1hds_be_r,
+ gen_helper_sve_ldnf1hss_be_r,
+ gen_helper_sve_ldnf1ss_be_r,
+ gen_helper_sve_ldnf1sdu_be_r,
- gen_helper_sve_ldnf1bds_r,
- gen_helper_sve_ldnf1bss_r,
- gen_helper_sve_ldnf1bhs_r,
- gen_helper_sve_ldnf1dd_be_r },
+ gen_helper_sve_ldnf1bds_r,
+ gen_helper_sve_ldnf1bss_r,
+ gen_helper_sve_ldnf1bhs_r,
+ gen_helper_sve_ldnf1dd_be_r } },
+
+ { /* mte active, little-endian */
+ { gen_helper_sve_ldnf1bb_r_mte,
+ gen_helper_sve_ldnf1bhu_r_mte,
+ gen_helper_sve_ldnf1bsu_r_mte,
+ gen_helper_sve_ldnf1bdu_r_mte,
+
+ gen_helper_sve_ldnf1sds_le_r_mte,
+ gen_helper_sve_ldnf1hh_le_r_mte,
+ gen_helper_sve_ldnf1hsu_le_r_mte,
+ gen_helper_sve_ldnf1hdu_le_r_mte,
+
+ gen_helper_sve_ldnf1hds_le_r_mte,
+ gen_helper_sve_ldnf1hss_le_r_mte,
+ gen_helper_sve_ldnf1ss_le_r_mte,
+ gen_helper_sve_ldnf1sdu_le_r_mte,
+
+ gen_helper_sve_ldnf1bds_r_mte,
+ gen_helper_sve_ldnf1bss_r_mte,
+ gen_helper_sve_ldnf1bhs_r_mte,
+ gen_helper_sve_ldnf1dd_le_r_mte },
+
+ /* mte active, big-endian */
+ { gen_helper_sve_ldnf1bb_r_mte,
+ gen_helper_sve_ldnf1bhu_r_mte,
+ gen_helper_sve_ldnf1bsu_r_mte,
+ gen_helper_sve_ldnf1bdu_r_mte,
+
+ gen_helper_sve_ldnf1sds_be_r_mte,
+ gen_helper_sve_ldnf1hh_be_r_mte,
+ gen_helper_sve_ldnf1hsu_be_r_mte,
+ gen_helper_sve_ldnf1hdu_be_r_mte,
+
+ gen_helper_sve_ldnf1hds_be_r_mte,
+ gen_helper_sve_ldnf1hss_be_r_mte,
+ gen_helper_sve_ldnf1ss_be_r_mte,
+ gen_helper_sve_ldnf1sdu_be_r_mte,
+
+ gen_helper_sve_ldnf1bds_r_mte,
+ gen_helper_sve_ldnf1bss_r_mte,
+ gen_helper_sve_ldnf1bhs_r_mte,
+ gen_helper_sve_ldnf1dd_be_r_mte } },
};
if (sve_access_check(s)) {
@@ -4847,8 +4931,8 @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a)
TCGv_i64 addr = new_tmp_a64(s);
tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), off);
- do_mem_zpa(s, a->rd, a->pg, addr, a->dtype, 0, false,
- fns[s->be_data == MO_BE][a->dtype]);
+ do_mem_zpa(s, a->rd, a->pg, addr, a->dtype, 1, false,
+ fns[s->mte_active[0]][s->be_data == MO_BE][a->dtype]);
}
return true;
}
--
2.20.1
* [PATCH v6 34/42] target/arm: Handle TBI for sve scalar + int memory ops
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
We still need to handle tbi for user-only when mte is inactive.
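When both TBI bits apply, the cleaning is just a 56-bit sign
extension: bit 55 is replicated over the tag byte. A minimal sketch of
what the user-only clean_data_tbi() path computes (sextract64 mirrors
the bitops helper; the TBID refinement and the system-mode
pass-through are ignored here):

    #include <assert.h>
    #include <stdint.h>

    static int64_t sextract64(uint64_t value, int start, int length)
    {
        return ((int64_t)(value << (64 - length - start))) >> (64 - length);
    }

    static uint64_t clean_tbi(uint64_t addr)
    {
        return (uint64_t)sextract64(addr, 0, 56); /* replicate bit 55 */
    }

    int main(void)
    {
        /* the tag byte is dropped for a low address ... */
        assert(clean_tbi(0x3a00123400005678ull) == 0x0000123400005678ull);
        /* ... and forced to 0xff when bit 55 is set */
        assert(clean_tbi(0xc5ff123400005678ull) == 0xffff123400005678ull);
        return 0;
    }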
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/translate-a64.h | 1 +
target/arm/translate-a64.c | 2 +-
target/arm/translate-sve.c | 6 ++++--
3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
index 452b4764d0..65c0280498 100644
--- a/target/arm/translate-a64.h
+++ b/target/arm/translate-a64.h
@@ -40,6 +40,7 @@ TCGv_ptr get_fpstatus_ptr(bool);
bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
unsigned int imms, unsigned int immr);
bool sve_access_check(DisasContext *s);
+TCGv_i64 clean_data_tbi(DisasContext *s, TCGv_i64 addr);
TCGv_i64 gen_mte_check1(DisasContext *s, TCGv_i64 addr, bool is_write,
bool tag_checked, int log2_size);
TCGv_i64 gen_mte_checkN(DisasContext *s, TCGv_i64 addr, bool is_write,
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index d86c13a32d..1314b200e0 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -232,7 +232,7 @@ static void gen_a64_set_pc(DisasContext *s, TCGv_i64 src)
* of the write-back address.
*/
-static TCGv_i64 clean_data_tbi(DisasContext *s, TCGv_i64 addr)
+TCGv_i64 clean_data_tbi(DisasContext *s, TCGv_i64 addr)
{
TCGv_i64 clean = new_tmp_a64(s);
#ifdef CONFIG_USER_ONLY
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 5b399bb005..ce67b79b02 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4554,9 +4554,8 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
* For e.g. LD4, there are not enough arguments to pass all 4
* registers as pointers, so encode the regno into the data field.
* For consistency, do this even for LD1.
- * TODO: mte_n check here while callers are updated.
*/
- if (mte_n && s->mte_active[0]) {
+ if (s->mte_active[0]) {
int msz = dtype_msz(dtype);
desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
@@ -4566,7 +4565,10 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
desc = FIELD_DP32(desc, MTEDESC, ESIZE, 1 << msz);
desc = FIELD_DP32(desc, MTEDESC, TSIZE, mte_n << msz);
desc <<= SVE_MTEDESC_SHIFT;
+ } else {
+ addr = clean_data_tbi(s, addr);
}
+
desc = simd_desc(vsz, vsz, zt | desc);
t_desc = tcg_const_i32(desc);
t_pg = tcg_temp_new_ptr();
--
2.20.1
* [PATCH v6 35/42] target/arm: Add mte helpers for sve scatter/gather memory ops
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Because the elements are non-sequential, we cannot eliminate many
tests straight away like we can for sequential operations. But
we often have the PTE details handy, so we can test for Tagged.
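Per element, the resulting shape is roughly the following standalone
model, where info.tagged stands for the Tagged attribute this series
caches alongside the TLB entry and check() stands for mte_check1():

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
        bool tagged;   /* PTE memory attribute was Tagged Normal */
    } PageInfo;

    typedef bool lookup_fn(uint64_t addr, PageInfo *info);
    typedef bool check_fn(uint64_t addr);   /* mte_check1()-alike */

    /* Non-sequential elements can land on any page, so the Tagged
       test runs per element instead of being hoisted out of the
       loop -- but an untagged page still skips the tag comparison. */
    static bool gather_elem_ok(uint64_t addr, bool mte_active,
                               lookup_fn *lookup, check_fn *check)
    {
        PageInfo info;
        if (!lookup(addr, &info)) {
            return false;               /* translation fault etc. */
        }
        if (mte_active && info.tagged && !check(addr)) {
            return false;               /* tag check failure */
        }
        return true;
    }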
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper-sve.h | 285 ++++++++++++++++
target/arm/sve_helper.c | 185 +++++++++--
target/arm/translate-sve.c | 650 +++++++++++++++++++++++++------------
3 files changed, 872 insertions(+), 248 deletions(-)
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 849478fc76..ddf9069c7f 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1605,6 +1605,115 @@ DEF_HELPER_FLAGS_6(sve_ldsds_le_zd, TCG_CALL_NO_WG,
DEF_HELPER_FLAGS_6(sve_ldsds_be_zd, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbsu_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhsu_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhsu_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldss_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldss_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbss_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhss_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhss_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbsu_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhsu_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhsu_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldss_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldss_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbss_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhss_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhss_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbdu_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_lddd_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_lddd_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbds_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbdu_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_lddd_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_lddd_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbds_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbdu_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_le_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_be_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_le_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_be_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_lddd_le_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_lddd_be_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbds_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_le_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_be_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_le_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_be_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
DEF_HELPER_FLAGS_6(sve_ldffbsu_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
DEF_HELPER_FLAGS_6(sve_ldffhsu_le_zsu, TCG_CALL_NO_WG,
@@ -1714,6 +1823,115 @@ DEF_HELPER_FLAGS_6(sve_ldffsds_le_zd, TCG_CALL_NO_WG,
DEF_HELPER_FLAGS_6(sve_ldffsds_be_zd, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffbsu_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhsu_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhsu_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffss_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffss_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffbss_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhss_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhss_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldffbsu_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhsu_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhsu_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffss_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffss_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffbss_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhss_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhss_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldffbdu_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhdu_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhdu_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffbds_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhds_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhds_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldffbdu_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhdu_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhdu_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffbds_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhds_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhds_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldffbdu_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhdu_le_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhdu_be_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_le_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_be_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_le_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffdd_be_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffbds_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhds_le_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhds_be_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_le_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_be_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
DEF_HELPER_FLAGS_6(sve_sths_le_zsu, TCG_CALL_NO_WG,
@@ -1781,4 +1999,71 @@ DEF_HELPER_FLAGS_6(sve_stdd_le_zd, TCG_CALL_NO_WG,
DEF_HELPER_FLAGS_6(sve_stdd_be_zd, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stbs_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sths_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sths_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbs_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sths_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sths_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbd_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sthd_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sthd_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_le_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_be_zsu_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbd_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sthd_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sthd_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_le_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_be_zss_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbd_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sthd_le_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sthd_be_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_le_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_be_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_le_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_be_zd_mte, TCG_CALL_NO_WG,
+ void, env, ptr, ptr, ptr, tl, i32)
+
DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 9e8ffb7062..566a619300 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -5518,7 +5518,8 @@ static target_ulong off_zd_d(void *reg, intptr_t reg_ofs)
static inline QEMU_ALWAYS_INLINE
void sve_ld1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
target_ulong base, uint32_t desc, uintptr_t retaddr,
- int esize, int msize, zreg_off_fn *off_fn,
+ uint32_t mtedesc, int esize, int msize,
+ zreg_off_fn *off_fn,
sve_ldst1_host_fn *host_fn,
sve_ldst1_tlb_fn *tlb_fn)
{
@@ -5546,7 +5547,9 @@ void sve_ld1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
cpu_check_watchpoint(env_cpu(env), addr, msize,
info.attrs, BP_MEM_READ, retaddr);
}
- /* TODO: MTE check */
+ if (mtedesc && info.attrs.target_tlb_bit1) {
+ mte_check1(env, mtedesc, addr, retaddr);
+ }
host_fn(&scratch, reg_off, info.host);
} else {
/* Element crosses the page boundary. */
@@ -5557,7 +5560,9 @@ void sve_ld1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
msize, info.attrs,
BP_MEM_READ, retaddr);
}
- /* TODO: MTE check */
+ if (mtedesc && info.attrs.target_tlb_bit1) {
+ mte_check1(env, mtedesc, addr, retaddr);
+ }
tlb_fn(env, &scratch, reg_off, addr, retaddr);
}
}
@@ -5570,20 +5575,53 @@ void sve_ld1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
memcpy(vd, &scratch, reg_max);
}
+static inline QEMU_ALWAYS_INLINE
+void sve_ld1_z_mte(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
+ target_ulong base, uint32_t desc, uintptr_t retaddr,
+ int esize, int msize, zreg_off_fn *off_fn,
+ sve_ldst1_host_fn *host_fn,
+ sve_ldst1_tlb_fn *tlb_fn)
+{
+ uint32_t mtedesc = desc >> (SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
+ /* Remove mtedesc from the normal sve descriptor. */
+ desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
+
+ /*
+ * ??? TODO: For the 32-bit offset extractions, base + ofs cannot
+ * offset base entirely over the address space hole to change the
+ * pointer tag, or change the bit55 selector. So we could here
+ * examine TBI + TCMA like we do for sve_ldN_r_mte().
+ */
+ sve_ld1_z(env, vd, vg, vm, base, desc, retaddr, mtedesc,
+ esize, msize, off_fn, host_fn, tlb_fn);
+}
+
#define DO_LD1_ZPZ_S(MEM, OFS, MSZ) \
void HELPER(sve_ld##MEM##_##OFS)(CPUARMState *env, void *vd, void *vg, \
void *vm, target_ulong base, uint32_t desc) \
{ \
- sve_ld1_z(env, vd, vg, vm, base, desc, GETPC(), 4, 1 << MSZ, \
+ sve_ld1_z(env, vd, vg, vm, base, desc, GETPC(), 0, 4, 1 << MSZ, \
off_##OFS##_s, sve_ld1##MEM##_host, sve_ld1##MEM##_tlb); \
+} \
+void HELPER(sve_ld##MEM##_##OFS##_mte)(CPUARMState *env, void *vd, void *vg, \
+ void *vm, target_ulong base, uint32_t desc) \
+{ \
+ sve_ld1_z_mte(env, vd, vg, vm, base, desc, GETPC(), 4, 1 << MSZ, \
+ off_##OFS##_s, sve_ld1##MEM##_host, sve_ld1##MEM##_tlb); \
}
#define DO_LD1_ZPZ_D(MEM, OFS, MSZ) \
void HELPER(sve_ld##MEM##_##OFS)(CPUARMState *env, void *vd, void *vg, \
void *vm, target_ulong base, uint32_t desc) \
{ \
- sve_ld1_z(env, vd, vg, vm, base, desc, GETPC(), 8, 1 << MSZ, \
+ sve_ld1_z(env, vd, vg, vm, base, desc, GETPC(), 0, 8, 1 << MSZ, \
off_##OFS##_d, sve_ld1##MEM##_host, sve_ld1##MEM##_tlb); \
+} \
+void HELPER(sve_ld##MEM##_##OFS##_mte)(CPUARMState *env, void *vd, void *vg, \
+ void *vm, target_ulong base, uint32_t desc) \
+{ \
+ sve_ld1_z_mte(env, vd, vg, vm, base, desc, GETPC(), 8, 1 << MSZ, \
+ off_##OFS##_d, sve_ld1##MEM##_host, sve_ld1##MEM##_tlb); \
}
DO_LD1_ZPZ_S(bsu, zsu, MO_8)
@@ -5662,7 +5700,8 @@ DO_LD1_ZPZ_D(dd_be, zd, MO_64)
static inline QEMU_ALWAYS_INLINE
void sve_ldff1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
target_ulong base, uint32_t desc, uintptr_t retaddr,
- const int esz, const int msz, zreg_off_fn *off_fn,
+ uint32_t mtedesc, const int esz, const int msz,
+ zreg_off_fn *off_fn,
sve_ldst1_host_fn *host_fn,
sve_ldst1_tlb_fn *tlb_fn)
{
@@ -5687,6 +5726,9 @@ void sve_ldff1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
* Probe the first element, allowing faults.
*/
addr = base + (off_fn(vm, reg_off) << scale);
+ if (mtedesc) {
+ mte_check1(env, mtedesc, addr, retaddr);
+ }
tlb_fn(env, vd, reg_off, addr, retaddr);
/* After any fault, zero the other elements. */
@@ -5719,7 +5761,11 @@ void sve_ldff1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
BP_MEM_READ)) {
goto fault;
}
- /* TODO: MTE check. */
+ if (mtedesc &&
+ info.attrs.target_tlb_bit1 &&
+ !mte_probe1(env, mtedesc, addr, retaddr)) {
+ goto fault;
+ }
host_fn(vd, reg_off, info.host);
}
@@ -5732,20 +5778,58 @@ void sve_ldff1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
record_fault(env, reg_off, reg_max);
}
-#define DO_LDFF1_ZPZ_S(MEM, OFS, MSZ) \
-void HELPER(sve_ldff##MEM##_##OFS)(CPUARMState *env, void *vd, void *vg, \
- void *vm, target_ulong base, uint32_t desc) \
-{ \
- sve_ldff1_z(env, vd, vg, vm, base, desc, GETPC(), MO_32, MSZ, \
- off_##OFS##_s, sve_ld1##MEM##_host, sve_ld1##MEM##_tlb); \
+static inline QEMU_ALWAYS_INLINE
+void sve_ldff1_z_mte(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
+ target_ulong base, uint32_t desc, uintptr_t retaddr,
+ const int esz, const int msz,
+ zreg_off_fn *off_fn,
+ sve_ldst1_host_fn *host_fn,
+ sve_ldst1_tlb_fn *tlb_fn)
+{
+ uint32_t mtedesc = desc >> (SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
+ /* Remove mtedesc from the normal sve descriptor. */
+ desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
+
+ /*
+ * ??? TODO: For the 32-bit offset extractions, base + ofs cannot
+ * offset base entirely over the address space hole to change the
+ * pointer tag, or change the bit55 selector. So we could here
+ * examine TBI + TCMA like we do for sve_ldN_r_mte().
+ */
+ sve_ldff1_z(env, vd, vg, vm, base, desc, retaddr, mtedesc,
+ esz, msz, off_fn, host_fn, tlb_fn);
}
-#define DO_LDFF1_ZPZ_D(MEM, OFS, MSZ) \
-void HELPER(sve_ldff##MEM##_##OFS)(CPUARMState *env, void *vd, void *vg, \
- void *vm, target_ulong base, uint32_t desc) \
-{ \
- sve_ldff1_z(env, vd, vg, vm, base, desc, GETPC(), MO_64, MSZ, \
- off_##OFS##_d, sve_ld1##MEM##_host, sve_ld1##MEM##_tlb); \
+#define DO_LDFF1_ZPZ_S(MEM, OFS, MSZ) \
+void HELPER(sve_ldff##MEM##_##OFS) \
+ (CPUARMState *env, void *vd, void *vg, \
+ void *vm, target_ulong base, uint32_t desc) \
+{ \
+ sve_ldff1_z(env, vd, vg, vm, base, desc, GETPC(), 0, MO_32, MSZ, \
+ off_##OFS##_s, sve_ld1##MEM##_host, sve_ld1##MEM##_tlb); \
+} \
+void HELPER(sve_ldff##MEM##_##OFS##_mte) \
+ (CPUARMState *env, void *vd, void *vg, \
+ void *vm, target_ulong base, uint32_t desc) \
+{ \
+ sve_ldff1_z_mte(env, vd, vg, vm, base, desc, GETPC(), MO_32, MSZ, \
+ off_##OFS##_s, sve_ld1##MEM##_host, sve_ld1##MEM##_tlb); \
+}
+
+#define DO_LDFF1_ZPZ_D(MEM, OFS, MSZ) \
+void HELPER(sve_ldff##MEM##_##OFS) \
+ (CPUARMState *env, void *vd, void *vg, \
+ void *vm, target_ulong base, uint32_t desc) \
+{ \
+ sve_ldff1_z(env, vd, vg, vm, base, desc, GETPC(), 0, MO_64, MSZ, \
+ off_##OFS##_d, sve_ld1##MEM##_host, sve_ld1##MEM##_tlb); \
+} \
+void HELPER(sve_ldff##MEM##_##OFS##_mte) \
+ (CPUARMState *env, void *vd, void *vg, \
+ void *vm, target_ulong base, uint32_t desc) \
+{ \
+ sve_ldff1_z_mte(env, vd, vg, vm, base, desc, GETPC(), MO_64, MSZ, \
+ off_##OFS##_d, sve_ld1##MEM##_host, sve_ld1##MEM##_tlb); \
}
DO_LDFF1_ZPZ_S(bsu, zsu, MO_8)
@@ -5817,7 +5901,8 @@ DO_LDFF1_ZPZ_D(dd_be, zd, MO_64)
static inline QEMU_ALWAYS_INLINE
void sve_st1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
target_ulong base, uint32_t desc, uintptr_t retaddr,
- int esize, int msize, zreg_off_fn *off_fn,
+ uint32_t mtedesc, int esize, int msize,
+ zreg_off_fn *off_fn,
sve_ldst1_host_fn *host_fn,
sve_ldst1_tlb_fn *tlb_fn)
{
@@ -5861,7 +5946,10 @@ void sve_st1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
cpu_check_watchpoint(env_cpu(env), addr, msize,
info.attrs, BP_MEM_WRITE, retaddr);
}
- /* TODO: MTE check. */
+
+ if (mtedesc && info.attrs.target_tlb_bit1) {
+ mte_check1(env, mtedesc, addr, retaddr);
+ }
}
i += 1;
reg_off += esize;
@@ -5891,20 +5979,53 @@ void sve_st1_z(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
} while (reg_off < reg_max);
}
-#define DO_ST1_ZPZ_S(MEM, OFS, MSZ) \
-void HELPER(sve_st##MEM##_##OFS)(CPUARMState *env, void *vd, void *vg, \
- void *vm, target_ulong base, uint32_t desc) \
-{ \
- sve_st1_z(env, vd, vg, vm, base, desc, GETPC(), 4, 1 << MSZ, \
- off_##OFS##_s, sve_st1##MEM##_host, sve_st1##MEM##_tlb); \
+static inline QEMU_ALWAYS_INLINE
+void sve_st1_z_mte(CPUARMState *env, void *vd, uint64_t *vg, void *vm,
+ target_ulong base, uint32_t desc, uintptr_t retaddr,
+ int esize, int msize, zreg_off_fn *off_fn,
+ sve_ldst1_host_fn *host_fn,
+ sve_ldst1_tlb_fn *tlb_fn)
+{
+ uint32_t mtedesc = desc >> (SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
+ /* Remove mtedesc from the normal sve descriptor. */
+ desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
+
+ /*
+ * ??? TODO: For the 32-bit offset extractions, base + ofs cannot
+ * offset base entirely over the address space hole to change the
+ * pointer tag, or change the bit55 selector. So we could here
+ * examine TBI + TCMA like we do for sve_ldN_r_mte().
+ */
+ sve_st1_z(env, vd, vg, vm, base, desc, retaddr, mtedesc,
+ esize, msize, off_fn, host_fn, tlb_fn);
}
-#define DO_ST1_ZPZ_D(MEM, OFS, MSZ) \
-void HELPER(sve_st##MEM##_##OFS)(CPUARMState *env, void *vd, void *vg, \
+#define DO_ST1_ZPZ_S(MEM, OFS, MSZ) \
+void HELPER(sve_st##MEM##_##OFS)(CPUARMState *env, void *vd, void *vg, \
void *vm, target_ulong base, uint32_t desc) \
-{ \
- sve_st1_z(env, vd, vg, vm, base, desc, GETPC(), 8, 1 << MSZ, \
- off_##OFS##_d, sve_st1##MEM##_host, sve_st1##MEM##_tlb); \
+{ \
+ sve_st1_z(env, vd, vg, vm, base, desc, GETPC(), 0, 4, 1 << MSZ, \
+ off_##OFS##_s, sve_st1##MEM##_host, sve_st1##MEM##_tlb); \
+} \
+void HELPER(sve_st##MEM##_##OFS##_mte)(CPUARMState *env, void *vd, void *vg, \
+ void *vm, target_ulong base, uint32_t desc) \
+{ \
+ sve_st1_z_mte(env, vd, vg, vm, base, desc, GETPC(), 4, 1 << MSZ, \
+ off_##OFS##_s, sve_st1##MEM##_host, sve_st1##MEM##_tlb); \
+}
+
+#define DO_ST1_ZPZ_D(MEM, OFS, MSZ) \
+void HELPER(sve_st##MEM##_##OFS)(CPUARMState *env, void *vd, void *vg, \
+ void *vm, target_ulong base, uint32_t desc) \
+{ \
+ sve_st1_z(env, vd, vg, vm, base, desc, GETPC(), 0, 8, 1 << MSZ, \
+ off_##OFS##_d, sve_st1##MEM##_host, sve_st1##MEM##_tlb); \
+} \
+void HELPER(sve_st##MEM##_##OFS##_mte)(CPUARMState *env, void *vd, void *vg, \
+ void *vm, target_ulong base, uint32_t desc) \
+{ \
+ sve_st1_z_mte(env, vd, vg, vm, base, desc, GETPC(), 8, 1 << MSZ, \
+ off_##OFS##_d, sve_st1##MEM##_host, sve_st1##MEM##_tlb); \
}
DO_ST1_ZPZ_S(bs, zsu, MO_8)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index ce67b79b02..d4525b9112 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -5226,7 +5226,7 @@ static bool trans_ST_zpri(DisasContext *s, arg_rpri_store *a)
*/
static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm,
- int scale, TCGv_i64 scalar, int msz,
+ int scale, TCGv_i64 scalar, int msz, bool is_write,
gen_helper_gvec_mem_scatter *fn)
{
unsigned vsz = vec_full_reg_size(s);
@@ -5234,8 +5234,16 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm,
TCGv_ptr t_pg = tcg_temp_new_ptr();
TCGv_ptr t_zt = tcg_temp_new_ptr();
TCGv_i32 t_desc;
- int desc;
+ int desc = 0;
+ if (s->mte_active[0]) {
+ desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
+ desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
+ desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
+ desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
+ desc = FIELD_DP32(desc, MTEDESC, ESIZE, 1 << msz);
+ desc <<= SVE_MTEDESC_SHIFT;
+ }
+ desc = simd_desc(vsz, vsz, desc | scale);
- desc = simd_desc(vsz, vsz, scale);
t_desc = tcg_const_i32(desc);
@@ -5250,176 +5258,339 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm,
tcg_temp_free_i32(t_desc);
}
-/* Indexed by [be][ff][xs][u][msz]. */
-static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][2][2][3] = {
- /* Little-endian */
- { { { { gen_helper_sve_ldbss_zsu,
- gen_helper_sve_ldhss_le_zsu,
- NULL, },
- { gen_helper_sve_ldbsu_zsu,
- gen_helper_sve_ldhsu_le_zsu,
- gen_helper_sve_ldss_le_zsu, } },
- { { gen_helper_sve_ldbss_zss,
- gen_helper_sve_ldhss_le_zss,
- NULL, },
- { gen_helper_sve_ldbsu_zss,
- gen_helper_sve_ldhsu_le_zss,
- gen_helper_sve_ldss_le_zss, } } },
+/* Indexed by [mte][be][ff][xs][u][msz]. */
+static gen_helper_gvec_mem_scatter * const
+gather_load_fn32[2][2][2][2][2][3] = {
+ { /* MTE Inactive */
+ { /* Little-endian */
+ { { { gen_helper_sve_ldbss_zsu,
+ gen_helper_sve_ldhss_le_zsu,
+ NULL, },
+ { gen_helper_sve_ldbsu_zsu,
+ gen_helper_sve_ldhsu_le_zsu,
+ gen_helper_sve_ldss_le_zsu, } },
+ { { gen_helper_sve_ldbss_zss,
+ gen_helper_sve_ldhss_le_zss,
+ NULL, },
+ { gen_helper_sve_ldbsu_zss,
+ gen_helper_sve_ldhsu_le_zss,
+ gen_helper_sve_ldss_le_zss, } } },
- /* First-fault */
- { { { gen_helper_sve_ldffbss_zsu,
- gen_helper_sve_ldffhss_le_zsu,
- NULL, },
- { gen_helper_sve_ldffbsu_zsu,
- gen_helper_sve_ldffhsu_le_zsu,
- gen_helper_sve_ldffss_le_zsu, } },
- { { gen_helper_sve_ldffbss_zss,
- gen_helper_sve_ldffhss_le_zss,
- NULL, },
- { gen_helper_sve_ldffbsu_zss,
- gen_helper_sve_ldffhsu_le_zss,
- gen_helper_sve_ldffss_le_zss, } } } },
+ /* First-fault */
+ { { { gen_helper_sve_ldffbss_zsu,
+ gen_helper_sve_ldffhss_le_zsu,
+ NULL, },
+ { gen_helper_sve_ldffbsu_zsu,
+ gen_helper_sve_ldffhsu_le_zsu,
+ gen_helper_sve_ldffss_le_zsu, } },
+ { { gen_helper_sve_ldffbss_zss,
+ gen_helper_sve_ldffhss_le_zss,
+ NULL, },
+ { gen_helper_sve_ldffbsu_zss,
+ gen_helper_sve_ldffhsu_le_zss,
+ gen_helper_sve_ldffss_le_zss, } } } },
- /* Big-endian */
- { { { { gen_helper_sve_ldbss_zsu,
- gen_helper_sve_ldhss_be_zsu,
- NULL, },
- { gen_helper_sve_ldbsu_zsu,
- gen_helper_sve_ldhsu_be_zsu,
- gen_helper_sve_ldss_be_zsu, } },
- { { gen_helper_sve_ldbss_zss,
- gen_helper_sve_ldhss_be_zss,
- NULL, },
- { gen_helper_sve_ldbsu_zss,
- gen_helper_sve_ldhsu_be_zss,
- gen_helper_sve_ldss_be_zss, } } },
+ { /* Big-endian */
+ { { { gen_helper_sve_ldbss_zsu,
+ gen_helper_sve_ldhss_be_zsu,
+ NULL, },
+ { gen_helper_sve_ldbsu_zsu,
+ gen_helper_sve_ldhsu_be_zsu,
+ gen_helper_sve_ldss_be_zsu, } },
+ { { gen_helper_sve_ldbss_zss,
+ gen_helper_sve_ldhss_be_zss,
+ NULL, },
+ { gen_helper_sve_ldbsu_zss,
+ gen_helper_sve_ldhsu_be_zss,
+ gen_helper_sve_ldss_be_zss, } } },
- /* First-fault */
- { { { gen_helper_sve_ldffbss_zsu,
- gen_helper_sve_ldffhss_be_zsu,
- NULL, },
- { gen_helper_sve_ldffbsu_zsu,
- gen_helper_sve_ldffhsu_be_zsu,
- gen_helper_sve_ldffss_be_zsu, } },
- { { gen_helper_sve_ldffbss_zss,
- gen_helper_sve_ldffhss_be_zss,
- NULL, },
- { gen_helper_sve_ldffbsu_zss,
- gen_helper_sve_ldffhsu_be_zss,
- gen_helper_sve_ldffss_be_zss, } } } },
+ /* First-fault */
+ { { { gen_helper_sve_ldffbss_zsu,
+ gen_helper_sve_ldffhss_be_zsu,
+ NULL, },
+ { gen_helper_sve_ldffbsu_zsu,
+ gen_helper_sve_ldffhsu_be_zsu,
+ gen_helper_sve_ldffss_be_zsu, } },
+ { { gen_helper_sve_ldffbss_zss,
+ gen_helper_sve_ldffhss_be_zss,
+ NULL, },
+ { gen_helper_sve_ldffbsu_zss,
+ gen_helper_sve_ldffhsu_be_zss,
+ gen_helper_sve_ldffss_be_zss, } } } } },
+ { /* MTE Active */
+ { /* Little-endian */
+ { { { gen_helper_sve_ldbss_zsu_mte,
+ gen_helper_sve_ldhss_le_zsu_mte,
+ NULL, },
+ { gen_helper_sve_ldbsu_zsu_mte,
+ gen_helper_sve_ldhsu_le_zsu_mte,
+ gen_helper_sve_ldss_le_zsu_mte, } },
+ { { gen_helper_sve_ldbss_zss_mte,
+ gen_helper_sve_ldhss_le_zss_mte,
+ NULL, },
+ { gen_helper_sve_ldbsu_zss_mte,
+ gen_helper_sve_ldhsu_le_zss_mte,
+ gen_helper_sve_ldss_le_zss_mte, } } },
+
+ /* First-fault */
+ { { { gen_helper_sve_ldffbss_zsu_mte,
+ gen_helper_sve_ldffhss_le_zsu_mte,
+ NULL, },
+ { gen_helper_sve_ldffbsu_zsu_mte,
+ gen_helper_sve_ldffhsu_le_zsu_mte,
+ gen_helper_sve_ldffss_le_zsu_mte, } },
+ { { gen_helper_sve_ldffbss_zss_mte,
+ gen_helper_sve_ldffhss_le_zss_mte,
+ NULL, },
+ { gen_helper_sve_ldffbsu_zss_mte,
+ gen_helper_sve_ldffhsu_le_zss_mte,
+ gen_helper_sve_ldffss_le_zss_mte, } } } },
+
+ { /* Big-endian */
+ { { { gen_helper_sve_ldbss_zsu_mte,
+ gen_helper_sve_ldhss_be_zsu_mte,
+ NULL, },
+ { gen_helper_sve_ldbsu_zsu_mte,
+ gen_helper_sve_ldhsu_be_zsu_mte,
+ gen_helper_sve_ldss_be_zsu_mte, } },
+ { { gen_helper_sve_ldbss_zss_mte,
+ gen_helper_sve_ldhss_be_zss_mte,
+ NULL, },
+ { gen_helper_sve_ldbsu_zss_mte,
+ gen_helper_sve_ldhsu_be_zss_mte,
+ gen_helper_sve_ldss_be_zss_mte, } } },
+
+ /* First-fault */
+ { { { gen_helper_sve_ldffbss_zsu_mte,
+ gen_helper_sve_ldffhss_be_zsu_mte,
+ NULL, },
+ { gen_helper_sve_ldffbsu_zsu_mte,
+ gen_helper_sve_ldffhsu_be_zsu_mte,
+ gen_helper_sve_ldffss_be_zsu_mte, } },
+ { { gen_helper_sve_ldffbss_zss_mte,
+ gen_helper_sve_ldffhss_be_zss_mte,
+ NULL, },
+ { gen_helper_sve_ldffbsu_zss_mte,
+ gen_helper_sve_ldffhsu_be_zss_mte,
+ gen_helper_sve_ldffss_be_zss_mte, } } } } },
};
/* Note that we overload xs=2 to indicate 64-bit offset. */
-static gen_helper_gvec_mem_scatter * const gather_load_fn64[2][2][3][2][4] = {
- /* Little-endian */
- { { { { gen_helper_sve_ldbds_zsu,
- gen_helper_sve_ldhds_le_zsu,
- gen_helper_sve_ldsds_le_zsu,
- NULL, },
- { gen_helper_sve_ldbdu_zsu,
- gen_helper_sve_ldhdu_le_zsu,
- gen_helper_sve_ldsdu_le_zsu,
- gen_helper_sve_lddd_le_zsu, } },
- { { gen_helper_sve_ldbds_zss,
- gen_helper_sve_ldhds_le_zss,
- gen_helper_sve_ldsds_le_zss,
- NULL, },
- { gen_helper_sve_ldbdu_zss,
- gen_helper_sve_ldhdu_le_zss,
- gen_helper_sve_ldsdu_le_zss,
- gen_helper_sve_lddd_le_zss, } },
- { { gen_helper_sve_ldbds_zd,
- gen_helper_sve_ldhds_le_zd,
- gen_helper_sve_ldsds_le_zd,
- NULL, },
- { gen_helper_sve_ldbdu_zd,
- gen_helper_sve_ldhdu_le_zd,
- gen_helper_sve_ldsdu_le_zd,
- gen_helper_sve_lddd_le_zd, } } },
+static gen_helper_gvec_mem_scatter * const
+gather_load_fn64[2][2][2][3][2][4] = {
+ { /* MTE Inactive */
+ { /* Little-endian */
+ { { { gen_helper_sve_ldbds_zsu,
+ gen_helper_sve_ldhds_le_zsu,
+ gen_helper_sve_ldsds_le_zsu,
+ NULL, },
+ { gen_helper_sve_ldbdu_zsu,
+ gen_helper_sve_ldhdu_le_zsu,
+ gen_helper_sve_ldsdu_le_zsu,
+ gen_helper_sve_lddd_le_zsu, } },
+ { { gen_helper_sve_ldbds_zss,
+ gen_helper_sve_ldhds_le_zss,
+ gen_helper_sve_ldsds_le_zss,
+ NULL, },
+ { gen_helper_sve_ldbdu_zss,
+ gen_helper_sve_ldhdu_le_zss,
+ gen_helper_sve_ldsdu_le_zss,
+ gen_helper_sve_lddd_le_zss, } },
+ { { gen_helper_sve_ldbds_zd,
+ gen_helper_sve_ldhds_le_zd,
+ gen_helper_sve_ldsds_le_zd,
+ NULL, },
+ { gen_helper_sve_ldbdu_zd,
+ gen_helper_sve_ldhdu_le_zd,
+ gen_helper_sve_ldsdu_le_zd,
+ gen_helper_sve_lddd_le_zd, } } },
- /* First-fault */
- { { { gen_helper_sve_ldffbds_zsu,
- gen_helper_sve_ldffhds_le_zsu,
- gen_helper_sve_ldffsds_le_zsu,
- NULL, },
- { gen_helper_sve_ldffbdu_zsu,
- gen_helper_sve_ldffhdu_le_zsu,
- gen_helper_sve_ldffsdu_le_zsu,
- gen_helper_sve_ldffdd_le_zsu, } },
- { { gen_helper_sve_ldffbds_zss,
- gen_helper_sve_ldffhds_le_zss,
- gen_helper_sve_ldffsds_le_zss,
- NULL, },
- { gen_helper_sve_ldffbdu_zss,
- gen_helper_sve_ldffhdu_le_zss,
- gen_helper_sve_ldffsdu_le_zss,
- gen_helper_sve_ldffdd_le_zss, } },
- { { gen_helper_sve_ldffbds_zd,
- gen_helper_sve_ldffhds_le_zd,
- gen_helper_sve_ldffsds_le_zd,
- NULL, },
- { gen_helper_sve_ldffbdu_zd,
- gen_helper_sve_ldffhdu_le_zd,
- gen_helper_sve_ldffsdu_le_zd,
- gen_helper_sve_ldffdd_le_zd, } } } },
+ /* First-fault */
+ { { { gen_helper_sve_ldffbds_zsu,
+ gen_helper_sve_ldffhds_le_zsu,
+ gen_helper_sve_ldffsds_le_zsu,
+ NULL, },
+ { gen_helper_sve_ldffbdu_zsu,
+ gen_helper_sve_ldffhdu_le_zsu,
+ gen_helper_sve_ldffsdu_le_zsu,
+ gen_helper_sve_ldffdd_le_zsu, } },
+ { { gen_helper_sve_ldffbds_zss,
+ gen_helper_sve_ldffhds_le_zss,
+ gen_helper_sve_ldffsds_le_zss,
+ NULL, },
+ { gen_helper_sve_ldffbdu_zss,
+ gen_helper_sve_ldffhdu_le_zss,
+ gen_helper_sve_ldffsdu_le_zss,
+ gen_helper_sve_ldffdd_le_zss, } },
+ { { gen_helper_sve_ldffbds_zd,
+ gen_helper_sve_ldffhds_le_zd,
+ gen_helper_sve_ldffsds_le_zd,
+ NULL, },
+ { gen_helper_sve_ldffbdu_zd,
+ gen_helper_sve_ldffhdu_le_zd,
+ gen_helper_sve_ldffsdu_le_zd,
+ gen_helper_sve_ldffdd_le_zd, } } } },
+ { /* Big-endian */
+ { { { gen_helper_sve_ldbds_zsu,
+ gen_helper_sve_ldhds_be_zsu,
+ gen_helper_sve_ldsds_be_zsu,
+ NULL, },
+ { gen_helper_sve_ldbdu_zsu,
+ gen_helper_sve_ldhdu_be_zsu,
+ gen_helper_sve_ldsdu_be_zsu,
+ gen_helper_sve_lddd_be_zsu, } },
+ { { gen_helper_sve_ldbds_zss,
+ gen_helper_sve_ldhds_be_zss,
+ gen_helper_sve_ldsds_be_zss,
+ NULL, },
+ { gen_helper_sve_ldbdu_zss,
+ gen_helper_sve_ldhdu_be_zss,
+ gen_helper_sve_ldsdu_be_zss,
+ gen_helper_sve_lddd_be_zss, } },
+ { { gen_helper_sve_ldbds_zd,
+ gen_helper_sve_ldhds_be_zd,
+ gen_helper_sve_ldsds_be_zd,
+ NULL, },
+ { gen_helper_sve_ldbdu_zd,
+ gen_helper_sve_ldhdu_be_zd,
+ gen_helper_sve_ldsdu_be_zd,
+ gen_helper_sve_lddd_be_zd, } } },
- /* Big-endian */
- { { { { gen_helper_sve_ldbds_zsu,
- gen_helper_sve_ldhds_be_zsu,
- gen_helper_sve_ldsds_be_zsu,
- NULL, },
- { gen_helper_sve_ldbdu_zsu,
- gen_helper_sve_ldhdu_be_zsu,
- gen_helper_sve_ldsdu_be_zsu,
- gen_helper_sve_lddd_be_zsu, } },
- { { gen_helper_sve_ldbds_zss,
- gen_helper_sve_ldhds_be_zss,
- gen_helper_sve_ldsds_be_zss,
- NULL, },
- { gen_helper_sve_ldbdu_zss,
- gen_helper_sve_ldhdu_be_zss,
- gen_helper_sve_ldsdu_be_zss,
- gen_helper_sve_lddd_be_zss, } },
- { { gen_helper_sve_ldbds_zd,
- gen_helper_sve_ldhds_be_zd,
- gen_helper_sve_ldsds_be_zd,
- NULL, },
- { gen_helper_sve_ldbdu_zd,
- gen_helper_sve_ldhdu_be_zd,
- gen_helper_sve_ldsdu_be_zd,
- gen_helper_sve_lddd_be_zd, } } },
+ /* First-fault */
+ { { { gen_helper_sve_ldffbds_zsu,
+ gen_helper_sve_ldffhds_be_zsu,
+ gen_helper_sve_ldffsds_be_zsu,
+ NULL, },
+ { gen_helper_sve_ldffbdu_zsu,
+ gen_helper_sve_ldffhdu_be_zsu,
+ gen_helper_sve_ldffsdu_be_zsu,
+ gen_helper_sve_ldffdd_be_zsu, } },
+ { { gen_helper_sve_ldffbds_zss,
+ gen_helper_sve_ldffhds_be_zss,
+ gen_helper_sve_ldffsds_be_zss,
+ NULL, },
+ { gen_helper_sve_ldffbdu_zss,
+ gen_helper_sve_ldffhdu_be_zss,
+ gen_helper_sve_ldffsdu_be_zss,
+ gen_helper_sve_ldffdd_be_zss, } },
+ { { gen_helper_sve_ldffbds_zd,
+ gen_helper_sve_ldffhds_be_zd,
+ gen_helper_sve_ldffsds_be_zd,
+ NULL, },
+ { gen_helper_sve_ldffbdu_zd,
+ gen_helper_sve_ldffhdu_be_zd,
+ gen_helper_sve_ldffsdu_be_zd,
+ gen_helper_sve_ldffdd_be_zd, } } } } },
+ { /* MTE Active */
+ { /* Little-endian */
+ { { { gen_helper_sve_ldbds_zsu_mte,
+ gen_helper_sve_ldhds_le_zsu_mte,
+ gen_helper_sve_ldsds_le_zsu_mte,
+ NULL, },
+ { gen_helper_sve_ldbdu_zsu_mte,
+ gen_helper_sve_ldhdu_le_zsu_mte,
+ gen_helper_sve_ldsdu_le_zsu_mte,
+ gen_helper_sve_lddd_le_zsu_mte, } },
+ { { gen_helper_sve_ldbds_zss_mte,
+ gen_helper_sve_ldhds_le_zss_mte,
+ gen_helper_sve_ldsds_le_zss_mte,
+ NULL, },
+ { gen_helper_sve_ldbdu_zss_mte,
+ gen_helper_sve_ldhdu_le_zss_mte,
+ gen_helper_sve_ldsdu_le_zss_mte,
+ gen_helper_sve_lddd_le_zss_mte, } },
+ { { gen_helper_sve_ldbds_zd_mte,
+ gen_helper_sve_ldhds_le_zd_mte,
+ gen_helper_sve_ldsds_le_zd_mte,
+ NULL, },
+ { gen_helper_sve_ldbdu_zd_mte,
+ gen_helper_sve_ldhdu_le_zd_mte,
+ gen_helper_sve_ldsdu_le_zd_mte,
+ gen_helper_sve_lddd_le_zd_mte, } } },
- /* First-fault */
- { { { gen_helper_sve_ldffbds_zsu,
- gen_helper_sve_ldffhds_be_zsu,
- gen_helper_sve_ldffsds_be_zsu,
- NULL, },
- { gen_helper_sve_ldffbdu_zsu,
- gen_helper_sve_ldffhdu_be_zsu,
- gen_helper_sve_ldffsdu_be_zsu,
- gen_helper_sve_ldffdd_be_zsu, } },
- { { gen_helper_sve_ldffbds_zss,
- gen_helper_sve_ldffhds_be_zss,
- gen_helper_sve_ldffsds_be_zss,
- NULL, },
- { gen_helper_sve_ldffbdu_zss,
- gen_helper_sve_ldffhdu_be_zss,
- gen_helper_sve_ldffsdu_be_zss,
- gen_helper_sve_ldffdd_be_zss, } },
- { { gen_helper_sve_ldffbds_zd,
- gen_helper_sve_ldffhds_be_zd,
- gen_helper_sve_ldffsds_be_zd,
- NULL, },
- { gen_helper_sve_ldffbdu_zd,
- gen_helper_sve_ldffhdu_be_zd,
- gen_helper_sve_ldffsdu_be_zd,
- gen_helper_sve_ldffdd_be_zd, } } } },
+ /* First-fault */
+ { { { gen_helper_sve_ldffbds_zsu_mte,
+ gen_helper_sve_ldffhds_le_zsu_mte,
+ gen_helper_sve_ldffsds_le_zsu_mte,
+ NULL, },
+ { gen_helper_sve_ldffbdu_zsu_mte,
+ gen_helper_sve_ldffhdu_le_zsu_mte,
+ gen_helper_sve_ldffsdu_le_zsu_mte,
+ gen_helper_sve_ldffdd_le_zsu_mte, } },
+ { { gen_helper_sve_ldffbds_zss_mte,
+ gen_helper_sve_ldffhds_le_zss_mte,
+ gen_helper_sve_ldffsds_le_zss_mte,
+ NULL, },
+ { gen_helper_sve_ldffbdu_zss_mte,
+ gen_helper_sve_ldffhdu_le_zss_mte,
+ gen_helper_sve_ldffsdu_le_zss_mte,
+ gen_helper_sve_ldffdd_le_zss_mte, } },
+ { { gen_helper_sve_ldffbds_zd_mte,
+ gen_helper_sve_ldffhds_le_zd_mte,
+ gen_helper_sve_ldffsds_le_zd_mte,
+ NULL, },
+ { gen_helper_sve_ldffbdu_zd_mte,
+ gen_helper_sve_ldffhdu_le_zd_mte,
+ gen_helper_sve_ldffsdu_le_zd_mte,
+ gen_helper_sve_ldffdd_le_zd_mte, } } } },
+ { /* Big-endian */
+ { { { gen_helper_sve_ldbds_zsu_mte,
+ gen_helper_sve_ldhds_be_zsu_mte,
+ gen_helper_sve_ldsds_be_zsu_mte,
+ NULL, },
+ { gen_helper_sve_ldbdu_zsu_mte,
+ gen_helper_sve_ldhdu_be_zsu_mte,
+ gen_helper_sve_ldsdu_be_zsu_mte,
+ gen_helper_sve_lddd_be_zsu_mte, } },
+ { { gen_helper_sve_ldbds_zss_mte,
+ gen_helper_sve_ldhds_be_zss_mte,
+ gen_helper_sve_ldsds_be_zss_mte,
+ NULL, },
+ { gen_helper_sve_ldbdu_zss_mte,
+ gen_helper_sve_ldhdu_be_zss_mte,
+ gen_helper_sve_ldsdu_be_zss_mte,
+ gen_helper_sve_lddd_be_zss_mte, } },
+ { { gen_helper_sve_ldbds_zd_mte,
+ gen_helper_sve_ldhds_be_zd_mte,
+ gen_helper_sve_ldsds_be_zd_mte,
+ NULL, },
+ { gen_helper_sve_ldbdu_zd_mte,
+ gen_helper_sve_ldhdu_be_zd_mte,
+ gen_helper_sve_ldsdu_be_zd_mte,
+ gen_helper_sve_lddd_be_zd_mte, } } },
+
+ /* First-fault */
+ { { { gen_helper_sve_ldffbds_zsu_mte,
+ gen_helper_sve_ldffhds_be_zsu_mte,
+ gen_helper_sve_ldffsds_be_zsu_mte,
+ NULL, },
+ { gen_helper_sve_ldffbdu_zsu_mte,
+ gen_helper_sve_ldffhdu_be_zsu_mte,
+ gen_helper_sve_ldffsdu_be_zsu_mte,
+ gen_helper_sve_ldffdd_be_zsu_mte, } },
+ { { gen_helper_sve_ldffbds_zss_mte,
+ gen_helper_sve_ldffhds_be_zss_mte,
+ gen_helper_sve_ldffsds_be_zss_mte,
+ NULL, },
+ { gen_helper_sve_ldffbdu_zss_mte,
+ gen_helper_sve_ldffhdu_be_zss_mte,
+ gen_helper_sve_ldffsdu_be_zss_mte,
+ gen_helper_sve_ldffdd_be_zss_mte, } },
+ { { gen_helper_sve_ldffbds_zd_mte,
+ gen_helper_sve_ldffhds_be_zd_mte,
+ gen_helper_sve_ldffsds_be_zd_mte,
+ NULL, },
+ { gen_helper_sve_ldffbdu_zd_mte,
+ gen_helper_sve_ldffhdu_be_zd_mte,
+ gen_helper_sve_ldffsdu_be_zd_mte,
+ gen_helper_sve_ldffdd_be_zd_mte, } } } } },
};
static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a)
{
gen_helper_gvec_mem_scatter *fn = NULL;
- int be = s->be_data == MO_BE;
+ bool be = s->be_data == MO_BE;
+ bool mte = s->mte_active[0];
if (!sve_access_check(s)) {
return true;
@@ -5427,23 +5598,24 @@ static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a)
switch (a->esz) {
case MO_32:
- fn = gather_load_fn32[be][a->ff][a->xs][a->u][a->msz];
+ fn = gather_load_fn32[mte][be][a->ff][a->xs][a->u][a->msz];
break;
case MO_64:
- fn = gather_load_fn64[be][a->ff][a->xs][a->u][a->msz];
+ fn = gather_load_fn64[mte][be][a->ff][a->xs][a->u][a->msz];
break;
}
assert(fn != NULL);
do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz,
- cpu_reg_sp(s, a->rn), a->msz, fn);
+ cpu_reg_sp(s, a->rn), a->msz, false, fn);
return true;
}
static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a)
{
gen_helper_gvec_mem_scatter *fn = NULL;
- int be = s->be_data == MO_BE;
+ bool be = s->be_data == MO_BE;
+ bool mte = s->mte_active[0];
TCGv_i64 imm;
if (a->esz < a->msz || (a->esz == a->msz && !a->u)) {
@@ -5455,10 +5627,10 @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a)
switch (a->esz) {
case MO_32:
- fn = gather_load_fn32[be][a->ff][0][a->u][a->msz];
+ fn = gather_load_fn32[mte][be][a->ff][0][a->u][a->msz];
break;
case MO_64:
- fn = gather_load_fn64[be][a->ff][2][a->u][a->msz];
+ fn = gather_load_fn64[mte][be][a->ff][2][a->u][a->msz];
break;
}
assert(fn != NULL);
@@ -5467,63 +5639,108 @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a)
* by loading the immediate into the scalar parameter.
*/
imm = tcg_const_i64(a->imm << a->msz);
- do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, a->msz, fn);
+ do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, a->msz, false, fn);
tcg_temp_free_i64(imm);
return true;
}
-/* Indexed by [be][xs][msz]. */
-static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][2][3] = {
- /* Little-endian */
- { { gen_helper_sve_stbs_zsu,
- gen_helper_sve_sths_le_zsu,
- gen_helper_sve_stss_le_zsu, },
- { gen_helper_sve_stbs_zss,
- gen_helper_sve_sths_le_zss,
- gen_helper_sve_stss_le_zss, } },
- /* Big-endian */
- { { gen_helper_sve_stbs_zsu,
- gen_helper_sve_sths_be_zsu,
- gen_helper_sve_stss_be_zsu, },
- { gen_helper_sve_stbs_zss,
- gen_helper_sve_sths_be_zss,
- gen_helper_sve_stss_be_zss, } },
+/* Indexed by [mte][be][xs][msz]. */
+static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][2][2][3] = {
+ { /* MTE Inactive */
+ { /* Little-endian */
+ { gen_helper_sve_stbs_zsu,
+ gen_helper_sve_sths_le_zsu,
+ gen_helper_sve_stss_le_zsu, },
+ { gen_helper_sve_stbs_zss,
+ gen_helper_sve_sths_le_zss,
+ gen_helper_sve_stss_le_zss, } },
+ { /* Big-endian */
+ { gen_helper_sve_stbs_zsu,
+ gen_helper_sve_sths_be_zsu,
+ gen_helper_sve_stss_be_zsu, },
+ { gen_helper_sve_stbs_zss,
+ gen_helper_sve_sths_be_zss,
+ gen_helper_sve_stss_be_zss, } } },
+ { /* MTE Active */
+ { /* Little-endian */
+ { gen_helper_sve_stbs_zsu_mte,
+ gen_helper_sve_sths_le_zsu_mte,
+ gen_helper_sve_stss_le_zsu_mte, },
+ { gen_helper_sve_stbs_zss_mte,
+ gen_helper_sve_sths_le_zss_mte,
+ gen_helper_sve_stss_le_zss_mte, } },
+ { /* Big-endian */
+ { gen_helper_sve_stbs_zsu_mte,
+ gen_helper_sve_sths_be_zsu_mte,
+ gen_helper_sve_stss_be_zsu_mte, },
+ { gen_helper_sve_stbs_zss_mte,
+ gen_helper_sve_sths_be_zss_mte,
+ gen_helper_sve_stss_be_zss_mte, } } },
};
/* Note that we overload xs=2 to indicate 64-bit offset. */
-static gen_helper_gvec_mem_scatter * const scatter_store_fn64[2][3][4] = {
- /* Little-endian */
- { { gen_helper_sve_stbd_zsu,
- gen_helper_sve_sthd_le_zsu,
- gen_helper_sve_stsd_le_zsu,
- gen_helper_sve_stdd_le_zsu, },
- { gen_helper_sve_stbd_zss,
- gen_helper_sve_sthd_le_zss,
- gen_helper_sve_stsd_le_zss,
- gen_helper_sve_stdd_le_zss, },
- { gen_helper_sve_stbd_zd,
- gen_helper_sve_sthd_le_zd,
- gen_helper_sve_stsd_le_zd,
- gen_helper_sve_stdd_le_zd, } },
- /* Big-endian */
- { { gen_helper_sve_stbd_zsu,
- gen_helper_sve_sthd_be_zsu,
- gen_helper_sve_stsd_be_zsu,
- gen_helper_sve_stdd_be_zsu, },
- { gen_helper_sve_stbd_zss,
- gen_helper_sve_sthd_be_zss,
- gen_helper_sve_stsd_be_zss,
- gen_helper_sve_stdd_be_zss, },
- { gen_helper_sve_stbd_zd,
- gen_helper_sve_sthd_be_zd,
- gen_helper_sve_stsd_be_zd,
- gen_helper_sve_stdd_be_zd, } },
+static gen_helper_gvec_mem_scatter * const scatter_store_fn64[2][2][3][4] = {
+ { /* MTE Inactive */
+ { /* Little-endian */
+ { gen_helper_sve_stbd_zsu,
+ gen_helper_sve_sthd_le_zsu,
+ gen_helper_sve_stsd_le_zsu,
+ gen_helper_sve_stdd_le_zsu, },
+ { gen_helper_sve_stbd_zss,
+ gen_helper_sve_sthd_le_zss,
+ gen_helper_sve_stsd_le_zss,
+ gen_helper_sve_stdd_le_zss, },
+ { gen_helper_sve_stbd_zd,
+ gen_helper_sve_sthd_le_zd,
+ gen_helper_sve_stsd_le_zd,
+ gen_helper_sve_stdd_le_zd, } },
+ { /* Big-endian */
+ { gen_helper_sve_stbd_zsu,
+ gen_helper_sve_sthd_be_zsu,
+ gen_helper_sve_stsd_be_zsu,
+ gen_helper_sve_stdd_be_zsu, },
+ { gen_helper_sve_stbd_zss,
+ gen_helper_sve_sthd_be_zss,
+ gen_helper_sve_stsd_be_zss,
+ gen_helper_sve_stdd_be_zss, },
+ { gen_helper_sve_stbd_zd,
+ gen_helper_sve_sthd_be_zd,
+ gen_helper_sve_stsd_be_zd,
+ gen_helper_sve_stdd_be_zd, } } },
+ { /* MTE Active */
+ { /* Little-endian */
+ { gen_helper_sve_stbd_zsu_mte,
+ gen_helper_sve_sthd_le_zsu_mte,
+ gen_helper_sve_stsd_le_zsu_mte,
+ gen_helper_sve_stdd_le_zsu_mte, },
+ { gen_helper_sve_stbd_zss_mte,
+ gen_helper_sve_sthd_le_zss_mte,
+ gen_helper_sve_stsd_le_zss_mte,
+ gen_helper_sve_stdd_le_zss_mte, },
+ { gen_helper_sve_stbd_zd_mte,
+ gen_helper_sve_sthd_le_zd_mte,
+ gen_helper_sve_stsd_le_zd_mte,
+ gen_helper_sve_stdd_le_zd_mte, } },
+ { /* Big-endian */
+ { gen_helper_sve_stbd_zsu_mte,
+ gen_helper_sve_sthd_be_zsu_mte,
+ gen_helper_sve_stsd_be_zsu_mte,
+ gen_helper_sve_stdd_be_zsu_mte, },
+ { gen_helper_sve_stbd_zss_mte,
+ gen_helper_sve_sthd_be_zss_mte,
+ gen_helper_sve_stsd_be_zss_mte,
+ gen_helper_sve_stdd_be_zss_mte, },
+ { gen_helper_sve_stbd_zd_mte,
+ gen_helper_sve_sthd_be_zd_mte,
+ gen_helper_sve_stsd_be_zd_mte,
+ gen_helper_sve_stdd_be_zd_mte, } } },
};
static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a)
{
gen_helper_gvec_mem_scatter *fn;
- int be = s->be_data == MO_BE;
+ bool be = s->be_data == MO_BE;
+ bool mte = s->mte_active[0];
if (a->esz < a->msz || (a->msz == 0 && a->scale)) {
return false;
@@ -5533,23 +5750,24 @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a)
}
switch (a->esz) {
case MO_32:
- fn = scatter_store_fn32[be][a->xs][a->msz];
+ fn = scatter_store_fn32[mte][be][a->xs][a->msz];
break;
case MO_64:
- fn = scatter_store_fn64[be][a->xs][a->msz];
+ fn = scatter_store_fn64[mte][be][a->xs][a->msz];
break;
default:
g_assert_not_reached();
}
do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz,
- cpu_reg_sp(s, a->rn), a->msz, fn);
+ cpu_reg_sp(s, a->rn), a->msz, true, fn);
return true;
}
static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a)
{
gen_helper_gvec_mem_scatter *fn = NULL;
- int be = s->be_data == MO_BE;
+ bool be = s->be_data == MO_BE;
+ bool mte = s->mte_active[0];
TCGv_i64 imm;
if (a->esz < a->msz) {
@@ -5561,10 +5779,10 @@ static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a)
switch (a->esz) {
case MO_32:
- fn = scatter_store_fn32[be][0][a->msz];
+ fn = scatter_store_fn32[mte][be][0][a->msz];
break;
case MO_64:
- fn = scatter_store_fn64[be][2][a->msz];
+ fn = scatter_store_fn64[mte][be][2][a->msz];
break;
}
assert(fn != NULL);
@@ -5573,7 +5791,7 @@ static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a)
* by loading the immediate into the scalar parameter.
*/
imm = tcg_const_i64(a->imm << a->msz);
- do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, a->msz, fn);
+ do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, a->msz, true, fn);
tcg_temp_free_i64(imm);
return true;
}
--
2.20.1
* [PATCH v6 36/42] target/arm: Complete TBI clearing for user-only for SVE
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
There are a number of paths by which the top byte (TBI) is still
intact for user-only in the SVE helpers.
Because we currently always set TBI for user-only, we do not
need to pass down the actual TBI setting from above, and we
can remove the top byte in the innermost primitives, so that
none are forgotten. Moreover, this keeps the "dirty" pointer
around at the higher levels, where we need it for any MTE checking.
Since the normal case, especially for user-only, goes through
RAM, this clearing merely adds two insns per page lookup, which
will be completely in the noise.
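For reference, the cleaning helper this relies on (introduced in the
prerequisite load/store series) looks roughly like this; a sketch,
assuming TBI is unconditionally enabled for user-only:

    static inline uint64_t useronly_clean_ptr(uint64_t ptr)
    {
    #ifdef CONFIG_USER_ONLY
        /*
         * Sign-extend from bit 55, folding the top byte away while
         * preserving the bit55 select.
         */
        ptr = sextract64(ptr, 0, 56);
    #endif
        return ptr;
    }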
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/sve_helper.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 566a619300..f0afbd0faf 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3985,7 +3985,7 @@ typedef void sve_ldst1_tlb_fn(CPUARMState *env, void *vd, intptr_t reg_off,
*
* For *_tlb, this uses the cpu_*_data_ra helpers. There are not
* endian-specific versions of these, so we must handle endianness
- * locally.
+ * locally. See sve_probe_page about TBI.
*
* For *_host, this is a trivial application of the <qemu/bswap.h>
* endian-specific access followed by a store into the vector register.
@@ -4009,7 +4009,7 @@ static void sve_##NAME##_host(void *vd, intptr_t reg_off, void *host) \
static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off, \
target_ulong addr, uintptr_t ra) \
{ \
- TYPEM val = BSWAP(TLB(env, addr, ra)); \
+ TYPEM val = BSWAP(TLB(env, useronly_clean_ptr(addr), ra)); \
*(TYPEE *)(vd + H(reg_off)) = val; \
}
@@ -4018,7 +4018,7 @@ static void sve_##NAME##_tlb(CPUARMState *env, void *vd, intptr_t reg_off, \
target_ulong addr, uintptr_t ra) \
{ \
TYPEM val = *(TYPEE *)(vd + H(reg_off)); \
- TLB(env, addr, BSWAP(val), ra); \
+ TLB(env, useronly_clean_ptr(addr), BSWAP(val), ra); \
}
#define DO_LD_PRIM_1(NAME, H, TE, TM) \
@@ -4152,6 +4152,19 @@ static bool sve_probe_page(SVEHostPage *info, bool nofault,
int flags;
addr += mem_off;
+
+ /*
+ * User-only currently always issues with TBI. See the comment
+ * above useronly_clean_ptr. Usually we clean this top byte away
+ * during translation, but we can't do that for e.g. vector + imm
+ * addressing modes.
+ *
+ * We currently always enable TBI for user-only, and do not provide
+ * a way to turn it off. So clean the pointer unconditionally here,
+ * rather than look it up here, or pass it down from above.
+ */
+ addr = useronly_clean_ptr(addr);
+
flags = probe_access_flags(env, addr, access_type, mmu_idx, nofault,
&info->host, retaddr);
info->flags = flags;
--
2.20.1
* [PATCH v6 37/42] target/arm: Implement data cache set allocation tags
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
This is DC GVA and DC GZVA, and the tag check for DC ZVA.
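For reference, the tag layout these operations depend on, as a sketch
(the generated code below shifts the register right by 56 and lets
helper_stzgm_tags reduce that to the 4-bit tag):

    /* The logical allocation tag lives in bits [59:56] of a pointer. */
    static inline int allocation_tag_sketch(uint64_t ptr)
    {
        return extract64(ptr, 56, 4);
    }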
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Use allocation_tag_mem + memset.
v3: Require pre-cleaned addresses.
v6: Move DCZ block size assert to cpu realize.
Perform a tag check for DC ZVA.
---
target/arm/cpu.h | 4 +++-
target/arm/helper.c | 16 ++++++++++++++++
target/arm/translate-a64.c | 39 ++++++++++++++++++++++++++++++++++++++
3 files changed, 58 insertions(+), 1 deletion(-)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 67164d56a1..b78bf2be4a 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2337,7 +2337,9 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
#define ARM_CP_NZCV (ARM_CP_SPECIAL | 0x0300)
#define ARM_CP_CURRENTEL (ARM_CP_SPECIAL | 0x0400)
#define ARM_CP_DC_ZVA (ARM_CP_SPECIAL | 0x0500)
-#define ARM_LAST_SPECIAL ARM_CP_DC_ZVA
+#define ARM_CP_DC_GVA (ARM_CP_SPECIAL | 0x0600)
+#define ARM_CP_DC_GZVA (ARM_CP_SPECIAL | 0x0700)
+#define ARM_LAST_SPECIAL ARM_CP_DC_GZVA
#define ARM_CP_FPU 0x1000
#define ARM_CP_SVE 0x2000
#define ARM_CP_NO_GDB 0x4000
diff --git a/target/arm/helper.c b/target/arm/helper.c
index e4b4366af7..44e7c0d19b 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -6983,6 +6983,22 @@ static const ARMCPRegInfo mte_el0_cacheop_reginfo[] = {
.opc0 = 1, .opc1 = 3, .crn = 7, .crm = 14, .opc2 = 5,
.type = ARM_CP_NOP, .access = PL0_W,
.accessfn = aa64_cacheop_poc_access },
+ { .name = "DC_GVA", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 4, .opc2 = 3,
+ .access = PL0_W, .type = ARM_CP_DC_GVA,
+#ifndef CONFIG_USER_ONLY
+ /* Avoid overhead of an access check that always passes in user-mode */
+ .accessfn = aa64_zva_access,
+#endif
+ },
+ { .name = "DC_GZVA", .state = ARM_CP_STATE_AA64,
+ .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 4, .opc2 = 4,
+ .access = PL0_W, .type = ARM_CP_DC_GZVA,
+#ifndef CONFIG_USER_ONLY
+ /* Avoid overhead of an access check that always passes in user-mode */
+ .accessfn = aa64_zva_access,
+#endif
+ },
REGINFO_SENTINEL
};
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 1314b200e0..f33f174584 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1919,6 +1919,45 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
}
gen_helper_dc_zva(cpu_env, tcg_rt);
return;
+ case ARM_CP_DC_GVA:
+ {
+ TCGv_i64 clean_addr, tag;
+
+ /*
+ * DC_GVA, like DC_ZVA, requires that we supply the original
+ * pointer for an invalid page. Probe that address first.
+ */
+ tcg_rt = cpu_reg(s, rt);
+ clean_addr = clean_data_tbi(s, tcg_rt);
+ gen_probe_access(s, clean_addr, MMU_DATA_STORE, MO_8);
+
+ if (s->ata) {
+ /* Extract the tag from the register to match STZGM. */
+ tag = tcg_temp_new_i64();
+ tcg_gen_shri_i64(tag, tcg_rt, 56);
+ gen_helper_stzgm_tags(cpu_env, clean_addr, tag);
+ tcg_temp_free_i64(tag);
+ }
+ }
+ return;
+ case ARM_CP_DC_GZVA:
+ {
+ TCGv_i64 clean_addr, tag;
+
+ /* For DC_GZVA, we can rely on DC_ZVA for the proper fault. */
+ tcg_rt = cpu_reg(s, rt);
+ clean_addr = clean_data_tbi(s, tcg_rt);
+ gen_helper_dc_zva(cpu_env, clean_addr);
+
+ if (s->ata) {
+ /* Extract the tag from the register to match STZGM. */
+ tag = tcg_temp_new_i64();
+ tcg_gen_shri_i64(tag, tcg_rt, 56);
+ gen_helper_stzgm_tags(cpu_env, clean_addr, tag);
+ tcg_temp_free_i64(tag);
+ }
+ }
+ return;
default:
break;
}
--
2.20.1
* [PATCH v6 38/42] target/arm: Set PSTATE.TCO on exception entry
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
D1.10 specifies that exception handlers begin with tag checks overridden.
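The effect being relied upon, sketched as a predicate (illustrative
only; the real gating is inside the mte_check* helpers):

    static bool tag_checks_active_sketch(CPUARMState *env)
    {
        /*
         * While PSTATE.TCO is set, loads and stores behave as if
         * tag checking were disabled, so the handler can touch
         * arbitrarily tagged memory before clearing TCO again.
         */
        return (env->pstate & PSTATE_TCO) == 0;
    }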
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Only set if MTE feature present.
---
target/arm/helper.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 44e7c0d19b..b38dc74733 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -9664,6 +9664,9 @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
break;
}
}
+ if (cpu_isar_feature(aa64_mte, cpu)) {
+ new_mode |= PSTATE_TCO;
+ }
pstate_write(env, PSTATE_DAIF | new_mode);
env->aarch64 = 1;
--
2.20.1
* [PATCH v6 39/42] target/arm: Enable MTE
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
We now implement all of the components of MTE, without actually
supporting any tagged memory. All MTE instructions will work,
trivially, so we can enable support.
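For context, the field values involved; a sketch of how the enabled
level might be probed (a later patch downgrades this to 1 when no tag
memory is wired up):

    /* 0: no MTE; 1: instructions only; 2: full MTE with checking. */
    static int mte_level_sketch(const ARMCPU *cpu)
    {
        return FIELD_EX64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE);
    }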
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v6: Delay user-only cpu reset bits to the user-only patch set.
---
target/arm/cpu64.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 62d36f9e8d..94e83b8487 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -665,6 +665,7 @@ static void aarch64_max_initfn(Object *obj)
t = cpu->isar.id_aa64pfr1;
t = FIELD_DP64(t, ID_AA64PFR1, BT, 1);
+ t = FIELD_DP64(t, ID_AA64PFR1, MTE, 2);
cpu->isar.id_aa64pfr1 = t;
t = cpu->isar.id_aa64mmfr1;
--
2.20.1
* [PATCH v6 40/42] target/arm: Cache the Tagged bit for a page in MemTxAttrs
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
This "bit" is a particular value of the page's MemAttr.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v6: Test HCR_EL2.{DC,DCT}; test Stage2 attributes.
---
target/arm/helper.c | 43 +++++++++++++++++++++++++++++++++++--------
1 file changed, 35 insertions(+), 8 deletions(-)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index b38dc74733..c272c48467 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -10785,6 +10785,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
uint64_t descaddrmask;
bool aarch64 = arm_el_is_aa64(env, el);
bool guarded = false;
+ uint8_t memattr;
/* TODO:
* This code does not handle the different format TCR for VTCR_EL2.
@@ -11013,17 +11014,32 @@ static bool get_phys_addr_lpae(CPUARMState *env, target_ulong address,
txattrs->target_tlb_bit0 = true;
}
- if (cacheattrs != NULL) {
+ if (mmu_idx == ARMMMUIdx_Stage2) {
+ memattr = convert_stage2_attrs(env, extract32(attrs, 0, 4));
+ } else {
+ /* Index into MAIR registers for cache attributes */
+ uint64_t mair = env->cp15.mair_el[el];
+ memattr = extract64(mair, extract32(attrs, 0, 3) * 8, 8);
+ }
+
+ /* When MTE is enabled, remember Tagged Memory in IOTLB. */
+ if (aarch64 && cpu_isar_feature(aa64_mte, cpu)) {
if (mmu_idx == ARMMMUIdx_Stage2) {
- cacheattrs->attrs = convert_stage2_attrs(env,
- extract32(attrs, 0, 4));
+ /*
+ * Require Normal, I+O Shareable, WB, NT, RA, WA (0xff).
+ * If not, squash stage1 tagged normal setting.
+ */
+ if (memattr != 0xff) {
+ txattrs->target_tlb_bit1 = false;
+ }
} else {
- /* Index into MAIR registers for cache attributes */
- uint8_t attrindx = extract32(attrs, 0, 3);
- uint64_t mair = env->cp15.mair_el[regime_el(env, mmu_idx)];
- assert(attrindx <= 7);
- cacheattrs->attrs = extract64(mair, attrindx * 8, 8);
+ /* Tagged Normal Memory (0xf0). */
+ txattrs->target_tlb_bit1 = (memattr == 0xf0);
}
+ }
+
+ if (cacheattrs != NULL) {
+ cacheattrs->attrs = memattr;
cacheattrs->shareability = extract32(attrs, 6, 2);
}
@@ -11978,6 +11994,17 @@ bool get_phys_addr(CPUARMState *env, target_ulong address,
*phys_ptr = address;
*prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
*page_size = TARGET_PAGE_SIZE;
+
+ /* Stage1 translations are Tagged or Untagged based on HCR_DCT. */
+ if (cpu_isar_feature(aa64_mte, env_archcpu(env))
+ && (mmu_idx == ARMMMUIdx_Stage1_E0 ||
+ mmu_idx == ARMMMUIdx_Stage1_E1 ||
+ mmu_idx == ARMMMUIdx_Stage1_E1_PAN)) {
+ uint64_t hcr = arm_hcr_el2_eff(env);
+ if ((hcr & (HCR_DC | HCR_DCT)) == (HCR_DC | HCR_DCT)) {
+ attrs->target_tlb_bit1 = true;
+ }
+ }
return 0;
}
--
2.20.1
* [PATCH v6 41/42] target/arm: Create tagged ram when MTE is enabled
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v5: Assign cs->num_ases to the final value first.
Downgrade to ID_AA64PFR1.MTE=1 if tag memory is not available.
v6: Add secure tag memory for EL3.
---
target/arm/cpu.h | 6 ++++++
hw/arm/virt.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++
target/arm/cpu.c | 53 +++++++++++++++++++++++++++++++++++++++++++++---
3 files changed, 108 insertions(+), 3 deletions(-)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index b78bf2be4a..b360123b37 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -771,6 +771,10 @@ struct ARMCPU {
/* MemoryRegion to use for secure physical accesses */
MemoryRegion *secure_memory;
+ /* MemoryRegion to use for allocation tag accesses */
+ MemoryRegion *tag_memory;
+ MemoryRegion *secure_tag_memory;
+
/* For v8M, pointer to the IDAU interface provided by board/SoC */
Object *idau;
@@ -2953,6 +2957,8 @@ typedef enum ARMMMUIdxBit {
typedef enum ARMASIdx {
ARMASIdx_NS = 0,
ARMASIdx_S = 1,
+ ARMASIdx_TagNS = 2,
+ ARMASIdx_TagS = 3,
} ARMASIdx;
/* Return the Exception Level targeted by debug exceptions. */
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 32d865a488..63b9d84eb8 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1389,6 +1389,16 @@ static void create_secure_ram(VirtMachineState *vms,
g_free(nodename);
}
+static void create_tag_ram(MemoryRegion *tag_sysmem,
+ hwaddr base, hwaddr size,
+ const char *name)
+{
+ MemoryRegion *tagram = g_new(MemoryRegion, 1);
+
+ memory_region_init_ram(tagram, NULL, name, size / 32, &error_fatal);
+ memory_region_add_subregion(tag_sysmem, base / 32, tagram);
+}
+
static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
{
const VirtMachineState *board = container_of(binfo, VirtMachineState,
@@ -1543,6 +1553,8 @@ static void machvirt_init(MachineState *machine)
const CPUArchIdList *possible_cpus;
MemoryRegion *sysmem = get_system_memory();
MemoryRegion *secure_sysmem = NULL;
+ MemoryRegion *tag_sysmem = NULL;
+ MemoryRegion *secure_tag_sysmem = NULL;
int n, virt_max_cpus;
bool firmware_loaded;
bool aarch64 = true;
@@ -1715,6 +1727,35 @@ static void machvirt_init(MachineState *machine)
"secure-memory", &error_abort);
}
+ /*
+ * The cpu adds the property if and only if MemTag is supported.
+ * If it is, we must allocate the ram to back that up.
+ */
+ if (object_property_find(cpuobj, "tag-memory", NULL)) {
+ if (!tag_sysmem) {
+ tag_sysmem = g_new(MemoryRegion, 1);
+ memory_region_init(tag_sysmem, OBJECT(machine),
+ "tag-memory", UINT64_MAX / 32);
+
+ if (vms->secure) {
+ secure_tag_sysmem = g_new(MemoryRegion, 1);
+ memory_region_init(secure_tag_sysmem, OBJECT(machine),
+ "secure-tag-memory", UINT64_MAX / 32);
+
+ /* As with ram, secure-tag takes precedence over tag. */
+ memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
+ tag_sysmem, -1);
+ }
+ }
+
+ object_property_set_link(cpuobj, OBJECT(tag_sysmem),
+ "tag-memory", &error_abort);
+ if (vms->secure) {
+ object_property_set_link(cpuobj, OBJECT(secure_tag_sysmem),
+ "secure-tag-memory", &error_abort);
+ }
+ }
+
object_property_set_bool(cpuobj, true, "realized", &error_fatal);
object_unref(cpuobj);
}
@@ -1757,6 +1798,17 @@ static void machvirt_init(MachineState *machine)
create_uart(vms, VIRT_SECURE_UART, secure_sysmem, serial_hd(1));
}
+ if (tag_sysmem) {
+ create_tag_ram(tag_sysmem, vms->memmap[VIRT_MEM].base,
+ machine->ram_size, "mach-virt.tag");
+ if (vms->secure) {
+ create_tag_ram(secure_tag_sysmem,
+ vms->memmap[VIRT_SECURE_MEM].base,
+ vms->memmap[VIRT_SECURE_MEM].size,
+ "mach-virt.secure-tag");
+ }
+ }
+
vms->highmem_ecam &= vms->highmem && (!firmware_loaded || aarch64);
create_rtc(vms);
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 96c20317ad..c320b4bc71 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1298,6 +1298,27 @@ void arm_cpu_post_init(Object *obj)
if (arm_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER)) {
qdev_property_add_static(DEVICE(cpu), &arm_cpu_gt_cntfrq_property);
}
+
+#ifndef CONFIG_USER_ONLY
+ if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64) &&
+ cpu_isar_feature(aa64_mte, cpu)) {
+ object_property_add_link(obj, "tag-memory",
+ TYPE_MEMORY_REGION,
+ (Object **)&cpu->tag_memory,
+ qdev_prop_allow_set_link_before_realize,
+ OBJ_PROP_LINK_STRONG,
+ &error_abort);
+
+ if (arm_feature(&cpu->env, ARM_FEATURE_EL3)) {
+ object_property_add_link(obj, "secure-tag-memory",
+ TYPE_MEMORY_REGION,
+ (Object **)&cpu->secure_tag_memory,
+ qdev_prop_allow_set_link_before_realize,
+ OBJ_PROP_LINK_STRONG,
+ &error_abort);
+ }
+ }
+#endif
}
static void arm_cpu_finalizefn(Object *obj)
@@ -1788,17 +1809,43 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
MachineState *ms = MACHINE(qdev_get_machine());
unsigned int smp_cpus = ms->smp.cpus;
- if (cpu->has_el3 || arm_feature(env, ARM_FEATURE_M_SECURITY)) {
+ /*
+ * We must set cs->num_ases to the final value before
+ * the first call to cpu_address_space_init.
+ */
+ if (cpu->tag_memory != NULL) {
+ cs->num_ases = 4;
+ } else if (cpu->has_el3 || arm_feature(env, ARM_FEATURE_M_SECURITY)) {
cs->num_ases = 2;
+ } else {
+ cs->num_ases = 1;
+ }
+ if (cpu->has_el3 || arm_feature(env, ARM_FEATURE_M_SECURITY)) {
if (!cpu->secure_memory) {
cpu->secure_memory = cs->memory;
}
cpu_address_space_init(cs, ARMASIdx_S, "cpu-secure-memory",
cpu->secure_memory);
- } else {
- cs->num_ases = 1;
}
+
+ if (cpu->tag_memory != NULL) {
+ cpu_address_space_init(cs, ARMASIdx_TagNS, "cpu-tag-memory",
+ cpu->tag_memory);
+ if (cpu->has_el3) {
+ cpu_address_space_init(cs, ARMASIdx_TagS, "cpu-tag-memory",
+ cpu->secure_tag_memory);
+ }
+ } else if (cpu_isar_feature(aa64_mte, cpu)) {
+ /*
+ * Since there is no tag memory, we can't meaningfully support MTE
+ * to its fullest. To avoid problems later, when we would come to
+ * use the tag memory, downgrade support to insns only.
+ */
+ cpu->isar.id_aa64pfr1 =
+ FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 1);
+ }
+
cpu_address_space_init(cs, ARMASIdx_NS, "cpu-memory", cs->memory);
/* No core_count specified, default to smp_cpus. */
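For reference, the downgrade above works because ID_AA64PFR1.MTE is a 4-bit
field at bits [11:8]: 0 means MTE is not implemented, 1 means
instruction-only MTE, and 2 adds full tag checking. FIELD_DP64 is QEMU's
field-deposit macro; a hedged sketch of the equivalent operation (the helper
below is illustrative, not the real registerfields.h):

    #include <stdint.h>

    /* Illustrative equivalent of FIELD_DP64(reg, ID_AA64PFR1, MTE, val):
     * deposit val into the 4-bit MTE field at bits [11:8]. */
    static uint64_t deposit_mte(uint64_t id_aa64pfr1, uint64_t val)
    {
        const uint64_t mask = 0xfull << 8;
        return (id_aa64pfr1 & ~mask) | ((val << 8) & mask);
    }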
--
2.20.1
* [PATCH v6 42/42] target/arm: Add allocation tag storage for system mode
From: Richard Henderson @ 2020-03-12 19:42 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, qemu-arm, alex.bennee
Look up the physical address for the given virtual address,
convert that to a tag physical address, and finally return
the host address that backs it.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/mte_helper.c | 128 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 128 insertions(+)
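The middle step of the pipeline described above is a single shift: with one
4-bit tag per 16-byte granule (LOG2_TAG_GRANULE == 4) and two tags per tag
byte, the tag physical address is the data physical address divided by 32.
A sketch matching the expression used in the patch below:

    /* Sketch only; LOG2_TAG_GRANULE is 4 throughout this series. */
    #define LOG2_TAG_GRANULE 4

    static inline uint64_t tag_paddr_for(uint64_t ptr_paddr)
    {
        /* One tag nibble per granule, two nibbles per byte: >> 5. */
        return ptr_paddr >> (LOG2_TAG_GRANULE + 1);
    }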
diff --git a/target/arm/mte_helper.c b/target/arm/mte_helper.c
index c51f7f04f4..47db87a5a1 100644
--- a/target/arm/mte_helper.c
+++ b/target/arm/mte_helper.c
@@ -21,6 +21,7 @@
#include "cpu.h"
#include "internals.h"
#include "exec/exec-all.h"
+#include "exec/ram_addr.h"
#include "exec/cpu_ldst.h"
#include "exec/helper-proto.h"
@@ -74,8 +75,135 @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
int ptr_size, MMUAccessType tag_access,
int tag_size, uintptr_t ra)
{
+#ifdef CONFIG_USER_ONLY
/* Tag storage not implemented. */
return NULL;
+#else
+ uintptr_t index;
+ CPUIOTLBEntry *iotlbentry;
+ int in_page, flags;
+ ram_addr_t ptr_ra;
+ hwaddr ptr_paddr, tag_paddr, xlat;
+ MemoryRegion *mr;
+ ARMASIdx tag_asi;
+ AddressSpace *tag_as;
+ void *host;
+
+ /*
+ * The caller must split calls to this function such that it will
+ * not access *tag* memory beyond the end of the page.
+ */
+ in_page = -(ptr | -(TARGET_PAGE_SIZE >> (LOG2_TAG_GRANULE + 1)));
+ g_assert(tag_size <= in_page);
+
+ /*
+ * Probe the first byte of the virtual address. This raises an
+ * exception for inaccessible pages, and resolves the virtual address
+ * into the softmmu tlb.
+ */
+ flags = probe_access_flags(env, ptr, ptr_access, ptr_mmu_idx,
+ false, &host, ra);
+
+ /*
+ * Find the iotlbentry for ptr. This *must* be present in the TLB
+ * because we just found the mapping.
+ * TODO: Perhaps there should be a cputlb helper that returns a
+ * matching tlb entry + iotlb entry.
+ */
+ index = tlb_index(env, ptr_mmu_idx, ptr);
+# ifdef CONFIG_DEBUG_TCG
+ {
+ CPUTLBEntry *entry = tlb_entry(env, ptr_mmu_idx, ptr);
+ target_ulong comparator = (ptr_access == MMU_DATA_LOAD
+ ? entry->addr_read
+ : tlb_addr_write(entry));
+ g_assert(tlb_hit(comparator, ptr));
+ }
+# endif
+ iotlbentry = &env_tlb(env)->d[ptr_mmu_idx].iotlb[index];
+
+ /* If the virtual page MemAttr != Tagged, access unchecked. */
+ if (!iotlbentry->attrs.target_tlb_bit1) {
+ return NULL;
+ }
+
+ /* If not normal memory, tag storage is not implemented; access is unchecked. */
+ if (unlikely(flags & TLB_MMIO)) {
+ qemu_log_mask(LOG_GUEST_ERROR,
+ "Page @ 0x%" PRIx64 " indicates Tagged Normal memory "
+ "but is Device memory\n", ptr);
+ return NULL;
+ }
+
+ /*
+ * The Normal memory access can extend to the next page. E.g. a single
+ * 8-byte access to the last byte of a page will check only the last
+ * tag on the first page.
+ * Any page access exception has priority over a tag check exception.
+ */
+ in_page = -(ptr | TARGET_PAGE_MASK);
+ if (unlikely(ptr_size > in_page)) {
+ void *ignore;
+ flags |= probe_access_flags(env, ptr + in_page, ptr_access,
+ ptr_mmu_idx, false, &ignore, ra);
+ }
+
+ /* Any debug exception has priority over a tag check exception. */
+ if (unlikely(flags & TLB_WATCHPOINT)) {
+ int wp = ptr_access == MMU_DATA_LOAD ? BP_MEM_READ : BP_MEM_WRITE;
+ cpu_check_watchpoint(env_cpu(env), ptr, ptr_size,
+ iotlbentry->attrs, wp, ra);
+ }
+
+ /*
+ * Find the physical address within the normal mem space.
+ * The memory region lookup must succeed because TLB_MMIO was
+ * not set in the cputlb lookup above.
+ */
+ mr = memory_region_from_host(host, &ptr_ra);
+ tcg_debug_assert(mr != NULL);
+ tcg_debug_assert(memory_region_is_ram(mr));
+ ptr_paddr = ptr_ra;
+ do {
+ ptr_paddr += mr->addr;
+ mr = mr->container;
+ } while (mr);
+
+ /* Convert to the physical address in tag space. */
+ tag_paddr = ptr_paddr >> (LOG2_TAG_GRANULE + 1);
+
+ /* Look up the address in tag space. */
+ tag_asi = iotlbentry->attrs.secure ? ARMASIdx_TagS : ARMASIdx_TagNS;
+ tag_as = cpu_get_address_space(env_cpu(env), tag_asi);
+ mr = address_space_translate(tag_as, tag_paddr, &xlat, NULL,
+ tag_access == MMU_DATA_STORE,
+ iotlbentry->attrs);
+
+ /*
+ * Note that @mr will never be NULL. If there is nothing in the address
+ * space at @tag_paddr, the translation will return the unallocated memory
+ * region. For our purposes, the result must be ram.
+ */
+ if (unlikely(!memory_region_is_ram(mr))) {
+ /* ??? Failure is a board configuration error. */
+ qemu_log_mask(LOG_UNIMP,
+ "Tag Memory @ 0x%" HWADDR_PRIx " not found for "
+ "Normal Memory @ 0x%" HWADDR_PRIx "\n",
+ tag_paddr, ptr_paddr);
+ return NULL;
+ }
+
+ /*
+ * Ensure the tag memory is dirty on write, for migration.
+ * Tag memory can never contain code or display memory (vga).
+ */
+ if (tag_access == MMU_DATA_STORE) {
+ ram_addr_t tag_ra = memory_region_get_ram_addr(mr) + xlat;
+ cpu_physical_memory_set_dirty_flag(tag_ra, DIRTY_MEMORY_MIGRATION);
+ }
+
+ return memory_region_get_ram_ptr(mr) + xlat;
+#endif
}
uint64_t HELPER(irg)(CPUARMState *env, uint64_t rn, uint64_t rm)
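Aside on the two in_page computations above: -(x | -a), for a power-of-two
a, equals a - (x & (a - 1)), the distance from x up to the next multiple of
a. The first use takes a = TARGET_PAGE_SIZE >> (LOG2_TAG_GRANULE + 1) to
bound the tag bytes accessed; the second takes a = TARGET_PAGE_SIZE (via
TARGET_PAGE_MASK) for data bytes. A standalone check, assuming
TARGET_PAGE_SIZE == 4096 and LOG2_TAG_GRANULE == 4:

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        const uint64_t a = 4096 >> (4 + 1);      /* 128 tag bytes/page */
        uint64_t x = 0x1230;                     /* sample address */
        uint64_t in_page = -(x | -a);
        assert(in_page == a - (x & (a - 1)));    /* 128 - 48 == 80 */
        return 0;
    }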
--
2.20.1
* Re: [PATCH v6 00/42] target/arm: Implement ARMv8.5-MemTag, system mode
From: Peter Maydell @ 2020-05-18 14:46 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-arm, Alex Bennée, QEMU Developers
On Thu, 12 Mar 2020 at 19:42, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> [...]
> * Updates for SVE, and thus
>
> Based-on: <20200311064420.30606-1-richard.henderson@linaro.org>
> ("target/arm: sve load/store improvements")
For clarity: given this is now two months old, I'm just going to drop
it from my to-review queue, on the assumption that you'll do a
rebase-and-resend at some point now that the prereq patchset is in master.
thanks
-- PMM