* [PULL 01/25] target/arm: Allow setting the FPCR.EBF bit for FEAT_EBF16
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 02/25] target/arm: Pass env pointer through to sme_bfmopa helper Peter Maydell
` (24 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
FEAT_EBF16 adds one new bit to the FPCR floating point control
register. Allow this bit to be read and written when the ID
registers indicate the presence of the feature.
Note that because this new bit is not in FPSCR_FPCR_MASK, it is
not visible in the AArch32 FPSCR, and FPSCR writes do not affect it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/cpu-features.h | 5 +++++
target/arm/cpu.h | 1 +
target/arm/vfp_helper.c | 8 ++++++--
3 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/target/arm/cpu-features.h b/target/arm/cpu-features.h
index c59ca104fe1..cfb82c23cad 100644
--- a/target/arm/cpu-features.h
+++ b/target/arm/cpu-features.h
@@ -556,6 +556,11 @@ static inline bool isar_feature_aa64_bf16(const ARMISARegisters *id)
return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, BF16) != 0;
}
+static inline bool isar_feature_aa64_ebf16(const ARMISARegisters *id)
+{
+ return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, BF16) > 1;
+}
+
static inline bool isar_feature_aa64_rcpc_8_3(const ARMISARegisters *id)
{
return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, LRCPC) != 0;
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 9a3fd595621..f065756c5c7 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1707,6 +1707,7 @@ void vfp_set_fpscr(CPUARMState *env, uint32_t val);
#define FPCR_OFE (1 << 10) /* Overflow exception trap enable */
#define FPCR_UFE (1 << 11) /* Underflow exception trap enable */
#define FPCR_IXE (1 << 12) /* Inexact exception trap enable */
+#define FPCR_EBF (1 << 13) /* Extended BFloat16 behaviors */
#define FPCR_IDE (1 << 15) /* Input Denormal exception trap enable */
#define FPCR_LEN_MASK (7 << 16) /* LEN, A-profile only */
#define FPCR_FZ16 (1 << 19) /* ARMv8.2+, FP16 flush-to-zero */
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index b3698da8ca7..203d37303bd 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -254,6 +254,10 @@ static void vfp_set_fpcr_masked(CPUARMState *env, uint32_t val, uint32_t mask)
val &= ~FPCR_FZ16;
}
+ if (!cpu_isar_feature(aa64_ebf16, cpu)) {
+ val &= ~FPCR_EBF;
+ }
+
vfp_set_fpcr_to_host(env, val, mask);
if (mask & (FPCR_LEN_MASK | FPCR_STRIDE_MASK)) {
@@ -278,12 +282,12 @@ static void vfp_set_fpcr_masked(CPUARMState *env, uint32_t val, uint32_t mask)
* We don't implement trapped exception handling, so the
* trap enable bits, IDE|IXE|UFE|OFE|DZE|IOE are all RAZ/WI (not RES0!)
*
- * The FPCR bits we keep in vfp.fpcr are AHP, DN, FZ, RMode
+ * The FPCR bits we keep in vfp.fpcr are AHP, DN, FZ, RMode, EBF
* and FZ16. Len, Stride and LTPSIZE we just handled. Store those bits
* there, and zero any of the other FPCR bits and the RES0 and RAZ/WI
* bits.
*/
- val &= FPCR_AHP | FPCR_DN | FPCR_FZ | FPCR_RMODE_MASK | FPCR_FZ16;
+ val &= FPCR_AHP | FPCR_DN | FPCR_FZ | FPCR_RMODE_MASK | FPCR_FZ16 | FPCR_EBF;
env->vfp.fpcr &= ~mask;
env->vfp.fpcr |= val;
}
--
2.34.1
* [PULL 02/25] target/arm: Pass env pointer through to sme_bfmopa helper
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
2024-09-05 13:00 ` [PULL 01/25] target/arm: Allow setting the FPCR.EBF bit for FEAT_EBF16 Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 03/25] target/arm: Pass env pointer through to gvec_bfdot helper Peter Maydell
` (23 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
To implement the FEAT_EBF16 semantics, we are going to need
the CPUARMState env pointer in every helper function which calls
bfdotadd().
Pass the env pointer through from generated code to the sme_bfmopa
helper. (We'll add the code that uses it when we've adjusted
all the helpers to have access to the env pointer.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/tcg/helper-sme.h | 4 ++--
target/arm/tcg/sme_helper.c | 4 ++--
target/arm/tcg/translate-sme.c | 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/target/arm/tcg/helper-sme.h b/target/arm/tcg/helper-sme.h
index d22bf9d21b0..59ecaa15485 100644
--- a/target/arm/tcg/helper-sme.h
+++ b/target/arm/tcg/helper-sme.h
@@ -126,8 +126,8 @@ DEF_HELPER_FLAGS_7(sme_fmopa_s, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_7(sme_fmopa_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
-DEF_HELPER_FLAGS_6(sme_bfmopa, TCG_CALL_NO_RWG,
- void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_7(sme_bfmopa, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, ptr, env, i32)
DEF_HELPER_FLAGS_6(sme_smopa_s, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_6(sme_umopa_s, TCG_CALL_NO_RWG,
diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
index 02106809ce1..289ffabbfbe 100644
--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -1079,8 +1079,8 @@ void HELPER(sme_fmopa_h)(void *vza, void *vzn, void *vzm, void *vpn,
}
}
-void HELPER(sme_bfmopa)(void *vza, void *vzn, void *vzm, void *vpn,
- void *vpm, uint32_t desc)
+void HELPER(sme_bfmopa)(void *vza, void *vzn, void *vzm,
+ void *vpn, void *vpm, CPUARMState *env, uint32_t desc)
{
intptr_t row, col, oprsz = simd_maxsz(desc);
uint32_t neg = simd_data(desc) * 0x80008000u;
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index ae42ddef7b3..3ceb32e8bd9 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -363,7 +363,7 @@ TRANS_FEAT(FMOPA_d, aa64_sme_f64f64, do_outprod_fpst, a,
MO_64, FPST_FPCR, gen_helper_sme_fmopa_d)
/* TODO: FEAT_EBF16 */
-TRANS_FEAT(BFMOPA, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_bfmopa)
+TRANS_FEAT(BFMOPA, aa64_sme, do_outprod_env, a, MO_32, gen_helper_sme_bfmopa)
TRANS_FEAT(SMOPA_s, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_smopa_s)
TRANS_FEAT(UMOPA_s, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_umopa_s)
--
2.34.1
* [PULL 03/25] target/arm: Pass env pointer through to gvec_bfdot helper
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
2024-09-05 13:00 ` [PULL 01/25] target/arm: Allow setting the FPCR.EBF bit for FEAT_EBF16 Peter Maydell
2024-09-05 13:00 ` [PULL 02/25] target/arm: Pass env pointer through to sme_bfmopa helper Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 04/25] target/arm: Pass env pointer through to gvec_bfdot_idx helper Peter Maydell
` (22 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
Pass the env pointer through to the gvec_bfdot helper,
so we can use it to add support for FEAT_EBF16.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper.h | 4 ++--
target/arm/tcg/translate-a64.c | 27 ++++++++++++++++++++++++-
target/arm/tcg/translate-neon.c | 35 +++++++++++++++++++++++++++++++--
target/arm/tcg/translate-sve.c | 15 +++++++++++++-
target/arm/tcg/vec_helper.c | 3 ++-
5 files changed, 77 insertions(+), 7 deletions(-)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index 970d059dec5..4466e796cb0 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -1027,8 +1027,8 @@ DEF_HELPER_FLAGS_5(gvec_ummla_b, TCG_CALL_NO_RWG,
DEF_HELPER_FLAGS_5(gvec_usmmla_b, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
-DEF_HELPER_FLAGS_5(gvec_bfdot, TCG_CALL_NO_RWG,
- void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(gvec_bfdot, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, env, i32)
DEF_HELPER_FLAGS_5(gvec_bfdot_idx, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 4684e7eb6ea..3813c75895b 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -735,6 +735,22 @@ static void gen_gvec_op4_ool(DisasContext *s, bool is_q, int rd, int rn,
is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
}
+/*
+ * Expand a 4-operand operation using an out-of-line helper that takes
+ * a pointer to the CPU env.
+ */
+static void gen_gvec_op4_env(DisasContext *s, bool is_q, int rd, int rn,
+ int rm, int ra, int data,
+ gen_helper_gvec_4_ptr *fn)
+{
+ tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, rd),
+ vec_full_reg_offset(s, rn),
+ vec_full_reg_offset(s, rm),
+ vec_full_reg_offset(s, ra),
+ tcg_env,
+ is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
+}
+
/*
* Expand a 4-operand + fpstatus pointer + simd data value operation using
* an out-of-line helper.
@@ -5608,10 +5624,19 @@ static bool do_dot_vector(DisasContext *s, arg_qrrr_e *a,
return true;
}
+static bool do_dot_vector_env(DisasContext *s, arg_qrrr_e *a,
+ gen_helper_gvec_4_ptr *fn)
+{
+ if (fp_access_check(s)) {
+ gen_gvec_op4_env(s, a->q, a->rd, a->rn, a->rm, a->rd, 0, fn);
+ }
+ return true;
+}
+
TRANS_FEAT(SDOT_v, aa64_dp, do_dot_vector, a, gen_helper_gvec_sdot_b)
TRANS_FEAT(UDOT_v, aa64_dp, do_dot_vector, a, gen_helper_gvec_udot_b)
TRANS_FEAT(USDOT_v, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_usdot_b)
-TRANS_FEAT(BFDOT_v, aa64_bf16, do_dot_vector, a, gen_helper_gvec_bfdot)
+TRANS_FEAT(BFDOT_v, aa64_bf16, do_dot_vector_env, a, gen_helper_gvec_bfdot)
TRANS_FEAT(BFMMLA, aa64_bf16, do_dot_vector, a, gen_helper_gvec_bfmmla)
TRANS_FEAT(SMMLA, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_smmla_b)
TRANS_FEAT(UMMLA, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_ummla_b)
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
index 915c9e56db5..454380f01d7 100644
--- a/target/arm/tcg/translate-neon.c
+++ b/target/arm/tcg/translate-neon.c
@@ -148,6 +148,37 @@ static bool do_neon_ddda(DisasContext *s, int q, int vd, int vn, int vm,
return true;
}
+static bool do_neon_ddda_env(DisasContext *s, int q, int vd, int vn, int vm,
+ int data, gen_helper_gvec_4_ptr *fn_gvec)
+{
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (((vd | vn | vm) & 0x10) && !dc_isar_feature(aa32_simd_r32, s)) {
+ return false;
+ }
+
+ /*
+ * UNDEF accesses to odd registers for each bit of Q.
+ * Q will be 0b111 for all Q-reg instructions, otherwise
+ * when we have mixed Q- and D-reg inputs.
+ */
+ if (((vd & 1) * 4 | (vn & 1) * 2 | (vm & 1)) & q) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ int opr_sz = q ? 16 : 8;
+ tcg_gen_gvec_4_ptr(vfp_reg_offset(1, vd),
+ vfp_reg_offset(1, vn),
+ vfp_reg_offset(1, vm),
+ vfp_reg_offset(1, vd),
+ tcg_env,
+ opr_sz, opr_sz, data, fn_gvec);
+ return true;
+}
+
static bool do_neon_ddda_fpst(DisasContext *s, int q, int vd, int vn, int vm,
int data, ARMFPStatusFlavour fp_flavour,
gen_helper_gvec_4_ptr *fn_gvec_ptr)
@@ -266,8 +297,8 @@ static bool trans_VDOT_b16(DisasContext *s, arg_VDOT_b16 *a)
if (!dc_isar_feature(aa32_bf16, s)) {
return false;
}
- return do_neon_ddda(s, a->q * 7, a->vd, a->vn, a->vm, 0,
- gen_helper_gvec_bfdot);
+ return do_neon_ddda_env(s, a->q * 7, a->vd, a->vn, a->vm, 0,
+ gen_helper_gvec_bfdot);
}
static bool trans_VFML(DisasContext *s, arg_VFML *a)
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index a72c2620960..e1dd6617e8b 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -252,6 +252,19 @@ static bool gen_gvec_fpst_zzzz(DisasContext *s, gen_helper_gvec_4_ptr *fn,
return ret;
}
+static bool gen_gvec_env_zzzz(DisasContext *s, gen_helper_gvec_4_ptr *fn,
+ int rd, int rn, int rm, int ra,
+ int data)
+{
+ return gen_gvec_ptr_zzzz(s, fn, rd, rn, rm, ra, data, tcg_env);
+}
+
+static bool gen_gvec_env_arg_zzzz(DisasContext *s, gen_helper_gvec_4_ptr *fn,
+ arg_rrrr_esz *a, int data)
+{
+ return gen_gvec_env_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, data);
+}
+
/* Invoke an out-of-line helper on 4 Zregs, 1 Preg, plus fpst. */
static bool gen_gvec_fpst_zzzzp(DisasContext *s, gen_helper_gvec_5_ptr *fn,
int rd, int rn, int rm, int ra, int pg,
@@ -7113,7 +7126,7 @@ TRANS_FEAT_NONSTREAMING(USMMLA, aa64_sve_i8mm, gen_gvec_ool_arg_zzzz,
TRANS_FEAT_NONSTREAMING(UMMLA, aa64_sve_i8mm, gen_gvec_ool_arg_zzzz,
gen_helper_gvec_ummla_b, a, 0)
-TRANS_FEAT(BFDOT_zzzz, aa64_sve_bf16, gen_gvec_ool_arg_zzzz,
+TRANS_FEAT(BFDOT_zzzz, aa64_sve_bf16, gen_gvec_env_arg_zzzz,
gen_helper_gvec_bfdot, a, 0)
TRANS_FEAT(BFDOT_zzxz, aa64_sve_bf16, gen_gvec_ool_arg_zzxz,
gen_helper_gvec_bfdot_idx, a)
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 98604d170fd..01b36fdd786 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2814,7 +2814,8 @@ float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2)
return t1;
}
-void HELPER(gvec_bfdot)(void *vd, void *vn, void *vm, void *va, uint32_t desc)
+void HELPER(gvec_bfdot)(void *vd, void *vn, void *vm, void *va,
+ CPUARMState *env, uint32_t desc)
{
intptr_t i, opr_sz = simd_oprsz(desc);
float32 *d = vd, *a = va;
--
2.34.1
* [PULL 04/25] target/arm: Pass env pointer through to gvec_bfdot_idx helper
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (2 preceding siblings ...)
2024-09-05 13:00 ` [PULL 03/25] target/arm: Pass env pointer through to gvec_bfdot helper Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 05/25] target/arm: Pass env pointer through to gvec_bfmmla helper Peter Maydell
` (21 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
Pass the env pointer through to the gvec_bfdot_idx helper,
so we can use it to add support for FEAT_EBF16.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper.h | 4 ++--
target/arm/tcg/translate-a64.c | 11 ++++++++++-
target/arm/tcg/translate-neon.c | 4 ++--
target/arm/tcg/translate-sve.c | 8 +++++++-
target/arm/tcg/vec_helper.c | 2 +-
5 files changed, 22 insertions(+), 7 deletions(-)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index 4466e796cb0..e197b5b1d2c 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -1029,8 +1029,8 @@ DEF_HELPER_FLAGS_5(gvec_usmmla_b, TCG_CALL_NO_RWG,
DEF_HELPER_FLAGS_6(gvec_bfdot, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_FLAGS_5(gvec_bfdot_idx, TCG_CALL_NO_RWG,
- void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(gvec_bfdot_idx, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, env, i32)
DEF_HELPER_FLAGS_5(gvec_bfmmla, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 3813c75895b..c7876513c72 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -6410,13 +6410,22 @@ static bool do_dot_vector_idx(DisasContext *s, arg_qrrx_e *a,
return true;
}
+static bool do_dot_vector_idx_env(DisasContext *s, arg_qrrx_e *a,
+ gen_helper_gvec_4_ptr *fn)
+{
+ if (fp_access_check(s)) {
+ gen_gvec_op4_env(s, a->q, a->rd, a->rn, a->rm, a->rd, a->idx, fn);
+ }
+ return true;
+}
+
TRANS_FEAT(SDOT_vi, aa64_dp, do_dot_vector_idx, a, gen_helper_gvec_sdot_idx_b)
TRANS_FEAT(UDOT_vi, aa64_dp, do_dot_vector_idx, a, gen_helper_gvec_udot_idx_b)
TRANS_FEAT(SUDOT_vi, aa64_i8mm, do_dot_vector_idx, a,
gen_helper_gvec_sudot_idx_b)
TRANS_FEAT(USDOT_vi, aa64_i8mm, do_dot_vector_idx, a,
gen_helper_gvec_usdot_idx_b)
-TRANS_FEAT(BFDOT_vi, aa64_bf16, do_dot_vector_idx, a,
+TRANS_FEAT(BFDOT_vi, aa64_bf16, do_dot_vector_idx_env, a,
gen_helper_gvec_bfdot_idx)
static bool trans_BFMLAL_vi(DisasContext *s, arg_qrrx_e *a)
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
index 454380f01d7..7de157c539c 100644
--- a/target/arm/tcg/translate-neon.c
+++ b/target/arm/tcg/translate-neon.c
@@ -391,8 +391,8 @@ static bool trans_VDOT_b16_scal(DisasContext *s, arg_VDOT_b16_scal *a)
if (!dc_isar_feature(aa32_bf16, s)) {
return false;
}
- return do_neon_ddda(s, a->q * 6, a->vd, a->vn, a->vm, a->index,
- gen_helper_gvec_bfdot_idx);
+ return do_neon_ddda_env(s, a->q * 6, a->vd, a->vn, a->vm, a->index,
+ gen_helper_gvec_bfdot_idx);
}
static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a)
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index e1dd6617e8b..eb77c943c8f 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -265,6 +265,12 @@ static bool gen_gvec_env_arg_zzzz(DisasContext *s, gen_helper_gvec_4_ptr *fn,
return gen_gvec_env_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, data);
}
+static bool gen_gvec_env_arg_zzxz(DisasContext *s, gen_helper_gvec_4_ptr *fn,
+ arg_rrxr_esz *a)
+{
+ return gen_gvec_env_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, a->index);
+}
+
/* Invoke an out-of-line helper on 4 Zregs, 1 Preg, plus fpst. */
static bool gen_gvec_fpst_zzzzp(DisasContext *s, gen_helper_gvec_5_ptr *fn,
int rd, int rn, int rm, int ra, int pg,
@@ -7128,7 +7134,7 @@ TRANS_FEAT_NONSTREAMING(UMMLA, aa64_sve_i8mm, gen_gvec_ool_arg_zzzz,
TRANS_FEAT(BFDOT_zzzz, aa64_sve_bf16, gen_gvec_env_arg_zzzz,
gen_helper_gvec_bfdot, a, 0)
-TRANS_FEAT(BFDOT_zzxz, aa64_sve_bf16, gen_gvec_ool_arg_zzxz,
+TRANS_FEAT(BFDOT_zzxz, aa64_sve_bf16, gen_gvec_env_arg_zzxz,
gen_helper_gvec_bfdot_idx, a)
TRANS_FEAT_NONSTREAMING(BFMMLA, aa64_sve_bf16, gen_gvec_ool_arg_zzzz,
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 01b36fdd786..a2c62a86d84 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2828,7 +2828,7 @@ void HELPER(gvec_bfdot)(void *vd, void *vn, void *vm, void *va,
}
void HELPER(gvec_bfdot_idx)(void *vd, void *vn, void *vm,
- void *va, uint32_t desc)
+ void *va, CPUARMState *env, uint32_t desc)
{
intptr_t i, j, opr_sz = simd_oprsz(desc);
intptr_t index = simd_data(desc);
--
2.34.1
* [PULL 05/25] target/arm: Pass env pointer through to gvec_bfmmla helper
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (3 preceding siblings ...)
2024-09-05 13:00 ` [PULL 04/25] target/arm: Pass env pointer through to gvec_bfdot_idx helper Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 06/25] target/arm: Prepare bfdotadd() callers for FEAT_EBF support Peter Maydell
` (20 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
Pass the env pointer through to the gvec_bfmmla helper,
so we can use it to add support for FEAT_EBF16.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper.h | 4 ++--
target/arm/tcg/translate-a64.c | 2 +-
target/arm/tcg/translate-neon.c | 4 ++--
target/arm/tcg/translate-sve.c | 2 +-
target/arm/tcg/vec_helper.c | 3 ++-
5 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index e197b5b1d2c..b463be38c52 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -1032,8 +1032,8 @@ DEF_HELPER_FLAGS_6(gvec_bfdot, TCG_CALL_NO_RWG,
DEF_HELPER_FLAGS_6(gvec_bfdot_idx, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, env, i32)
-DEF_HELPER_FLAGS_5(gvec_bfmmla, TCG_CALL_NO_RWG,
- void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(gvec_bfmmla, TCG_CALL_NO_RWG,
+ void, ptr, ptr, ptr, ptr, env, i32)
DEF_HELPER_FLAGS_6(gvec_bfmlal, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index c7876513c72..6d5f12e8f55 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -5637,7 +5637,7 @@ TRANS_FEAT(SDOT_v, aa64_dp, do_dot_vector, a, gen_helper_gvec_sdot_b)
TRANS_FEAT(UDOT_v, aa64_dp, do_dot_vector, a, gen_helper_gvec_udot_b)
TRANS_FEAT(USDOT_v, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_usdot_b)
TRANS_FEAT(BFDOT_v, aa64_bf16, do_dot_vector_env, a, gen_helper_gvec_bfdot)
-TRANS_FEAT(BFMMLA, aa64_bf16, do_dot_vector, a, gen_helper_gvec_bfmmla)
+TRANS_FEAT(BFMMLA, aa64_bf16, do_dot_vector_env, a, gen_helper_gvec_bfmmla)
TRANS_FEAT(SMMLA, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_smmla_b)
TRANS_FEAT(UMMLA, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_ummla_b)
TRANS_FEAT(USMMLA, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_usmmla_b)
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
index 7de157c539c..13cd31aad42 100644
--- a/target/arm/tcg/translate-neon.c
+++ b/target/arm/tcg/translate-neon.c
@@ -3730,8 +3730,8 @@ static bool trans_VMMLA_b16(DisasContext *s, arg_VMMLA_b16 *a)
if (!dc_isar_feature(aa32_bf16, s)) {
return false;
}
- return do_neon_ddda(s, 7, a->vd, a->vn, a->vm, 0,
- gen_helper_gvec_bfmmla);
+ return do_neon_ddda_env(s, 7, a->vd, a->vn, a->vm, 0,
+ gen_helper_gvec_bfmmla);
}
static bool trans_VFMA_b16(DisasContext *s, arg_VFMA_b16 *a)
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index eb77c943c8f..9e2536dfe99 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -7137,7 +7137,7 @@ TRANS_FEAT(BFDOT_zzzz, aa64_sve_bf16, gen_gvec_env_arg_zzzz,
TRANS_FEAT(BFDOT_zzxz, aa64_sve_bf16, gen_gvec_env_arg_zzxz,
gen_helper_gvec_bfdot_idx, a)
-TRANS_FEAT_NONSTREAMING(BFMMLA, aa64_sve_bf16, gen_gvec_ool_arg_zzzz,
+TRANS_FEAT_NONSTREAMING(BFMMLA, aa64_sve_bf16, gen_gvec_env_arg_zzzz,
gen_helper_gvec_bfmmla, a, 0)
static bool do_BFMLAL_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel)
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index a2c62a86d84..616ec54bb77 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2847,7 +2847,8 @@ void HELPER(gvec_bfdot_idx)(void *vd, void *vn, void *vm,
clear_tail(d, opr_sz, simd_maxsz(desc));
}
-void HELPER(gvec_bfmmla)(void *vd, void *vn, void *vm, void *va, uint32_t desc)
+void HELPER(gvec_bfmmla)(void *vd, void *vn, void *vm, void *va,
+ CPUARMState *env, uint32_t desc)
{
intptr_t s, opr_sz = simd_oprsz(desc);
float32 *d = vd, *a = va;
--
2.34.1
* [PULL 06/25] target/arm: Prepare bfdotadd() callers for FEAT_EBF support
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (4 preceding siblings ...)
2024-09-05 13:00 ` [PULL 05/25] target/arm: Pass env pointer through to gvec_bfmmla helper Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 07/25] target/arm: Implement FPCR.EBF=1 semantics for bfdotadd() Peter Maydell
` (19 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
We call bfdotadd() from four helper functions. Currently they all
assume the FPCR.EBF=0 semantics. For FPCR.EBF=1 we will need to:
 * call a different routine from bfdotadd(), because we need to do a
   fused multiply-add rather than separate multiply and add steps
 * use a different float_status that honours the FPCR rounding mode
   and denormal-flushing fields
 * pass in an extra float_status that has been set up to perform
   round-to-odd rounding
To prepare for this, refactor all the callsites so that instead of

   for (...) {
       x = bfdotadd(...);
   }

they are:

   float_status fpst, fpst_odd;
   if (is_ebf(env, &fpst, &fpst_odd)) {
       for (...) {
           x = bfdotadd_ebf(..., &fpst, &fpst_odd);
       }
   } else {
       for (...) {
           x = bfdotadd(..., &fpst);
       }
   }

For the moment the is_ebf() function always returns false, sets up
fpst for EBF=0 semantics and never sets up fpst_odd; bfdotadd_ebf()
will assert if called. We'll fill in the handling for EBF=1 in the
next commit.
This change should be a zero-behaviour-change refactor.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/tcg/vec_internal.h | 37 ++++++++-
target/arm/tcg/sme_helper.c | 74 ++++++++++++------
target/arm/tcg/vec_helper.c | 139 +++++++++++++++++++++++++---------
3 files changed, 189 insertions(+), 61 deletions(-)
diff --git a/target/arm/tcg/vec_internal.h b/target/arm/tcg/vec_internal.h
index 3ca1b94ccf9..094f5c169ca 100644
--- a/target/arm/tcg/vec_internal.h
+++ b/target/arm/tcg/vec_internal.h
@@ -223,13 +223,46 @@ int64_t do_sqrdmlah_d(int64_t, int64_t, int64_t, bool, bool);
* bfdotadd:
* @sum: addend
* @e1, @e2: multiplicand vectors
+ * @fpst: floating-point status to use
*
* BFloat16 2-way dot product of @e1 & @e2, accumulating with @sum.
* The @e1 and @e2 operands correspond to the 32-bit source vector
* slots and contain two Bfloat16 values each.
*
- * Corresponds to the ARM pseudocode function BFDotAdd.
+ * Corresponds to the ARM pseudocode function BFDotAdd, specialized
+ * for the FPCR.EBF == 0 case.
*/
-float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2);
+float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2, float_status *fpst);
+/**
+ * bfdotadd_ebf:
+ * @sum: addend
+ * @e1, @e2: multiplicand vectors
+ * @fpst: floating-point status to use
+ * @fpst_odd: floating-point status to use for round-to-odd operations
+ *
+ * BFloat16 2-way dot product of @e1 & @e2, accumulating with @sum.
+ * The @e1 and @e2 operands correspond to the 32-bit source vector
+ * slots and contain two Bfloat16 values each.
+ *
+ * Corresponds to the ARM pseudocode function BFDotAdd, specialized
+ * for the FPCR.EBF == 1 case.
+ */
+float32 bfdotadd_ebf(float32 sum, uint32_t e1, uint32_t e2,
+ float_status *fpst, float_status *fpst_odd);
+
+/**
+ * is_ebf:
+ * @env: CPU state
+ * @statusp: pointer to floating point status to fill in
+ * @oddstatusp: pointer to floating point status to fill in for round-to-odd
+ *
+ * Determine whether a BFDotAdd operation should use FPCR.EBF = 0
+ * or FPCR.EBF = 1 semantics. On return, has initialized *statusp
+ * and *oddstatusp to suitable float_status arguments to use with either
+ * bfdotadd() or bfdotadd_ebf().
+ * Returns true for EBF = 1, false for EBF = 0. (The caller should use this
+ * to decide whether to call bfdotadd() or bfdotadd_ebf().)
+ */
+bool is_ebf(CPUARMState *env, float_status *statusp, float_status *oddstatusp);
#endif /* TARGET_ARM_VEC_INTERNAL_H */
diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
index 289ffabbfbe..8cf12654e56 100644
--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -1085,32 +1085,62 @@ void HELPER(sme_bfmopa)(void *vza, void *vzn, void *vzm,
intptr_t row, col, oprsz = simd_maxsz(desc);
uint32_t neg = simd_data(desc) * 0x80008000u;
uint16_t *pn = vpn, *pm = vpm;
+ float_status fpst, fpst_odd;
- for (row = 0; row < oprsz; ) {
- uint16_t prow = pn[H2(row >> 4)];
- do {
- void *vza_row = vza + tile_vslice_offset(row);
- uint32_t n = *(uint32_t *)(vzn + H1_4(row));
+ if (is_ebf(env, &fpst, &fpst_odd)) {
+ for (row = 0; row < oprsz; ) {
+ uint16_t prow = pn[H2(row >> 4)];
+ do {
+ void *vza_row = vza + tile_vslice_offset(row);
+ uint32_t n = *(uint32_t *)(vzn + H1_4(row));
- n = f16mop_adj_pair(n, prow, neg);
+ n = f16mop_adj_pair(n, prow, neg);
- for (col = 0; col < oprsz; ) {
- uint16_t pcol = pm[H2(col >> 4)];
- do {
- if (prow & pcol & 0b0101) {
- uint32_t *a = vza_row + H1_4(col);
- uint32_t m = *(uint32_t *)(vzm + H1_4(col));
+ for (col = 0; col < oprsz; ) {
+ uint16_t pcol = pm[H2(col >> 4)];
+ do {
+ if (prow & pcol & 0b0101) {
+ uint32_t *a = vza_row + H1_4(col);
+ uint32_t m = *(uint32_t *)(vzm + H1_4(col));
- m = f16mop_adj_pair(m, pcol, 0);
- *a = bfdotadd(*a, n, m);
- }
- col += 4;
- pcol >>= 4;
- } while (col & 15);
- }
- row += 4;
- prow >>= 4;
- } while (row & 15);
+ m = f16mop_adj_pair(m, pcol, 0);
+ *a = bfdotadd_ebf(*a, n, m, &fpst, &fpst_odd);
+ }
+ col += 4;
+ pcol >>= 4;
+ } while (col & 15);
+ }
+ row += 4;
+ prow >>= 4;
+ } while (row & 15);
+ }
+ } else {
+ for (row = 0; row < oprsz; ) {
+ uint16_t prow = pn[H2(row >> 4)];
+ do {
+ void *vza_row = vza + tile_vslice_offset(row);
+ uint32_t n = *(uint32_t *)(vzn + H1_4(row));
+
+ n = f16mop_adj_pair(n, prow, neg);
+
+ for (col = 0; col < oprsz; ) {
+ uint16_t pcol = pm[H2(col >> 4)];
+ do {
+ if (prow & pcol & 0b0101) {
+ uint32_t *a = vza_row + H1_4(col);
+ uint32_t m = *(uint32_t *)(vzm + H1_4(col));
+
+ m = f16mop_adj_pair(m, pcol, 0);
+ *a = bfdotadd(*a, n, m, &fpst);
+ }
+ col += 4;
+ pcol >>= 4;
+ } while (col & 15);
+ }
+ row += 4;
+ prow >>= 4;
+ } while (row & 15);
+ }
}
}
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 616ec54bb77..b0de74b55f1 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2790,39 +2790,58 @@ DO_MMLA_B(gvec_usmmla_b, do_usmmla_b)
* BFloat16 Dot Product
*/
-float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2)
+bool is_ebf(CPUARMState *env, float_status *statusp, float_status *oddstatusp)
{
/* FPCR is ignored for BFDOT and BFMMLA. */
- float_status bf_status = {
+ *statusp = (float_status){
.tininess_before_rounding = float_tininess_before_rounding,
.float_rounding_mode = float_round_to_odd_inf,
.flush_to_zero = true,
.flush_inputs_to_zero = true,
.default_nan_mode = true,
};
+
+ return false;
+}
+
+float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2, float_status *fpst)
+{
float32 t1, t2;
/*
* Extract each BFloat16 from the element pair, and shift
* them such that they become float32.
*/
- t1 = float32_mul(e1 << 16, e2 << 16, &bf_status);
- t2 = float32_mul(e1 & 0xffff0000u, e2 & 0xffff0000u, &bf_status);
- t1 = float32_add(t1, t2, &bf_status);
- t1 = float32_add(sum, t1, &bf_status);
+ t1 = float32_mul(e1 << 16, e2 << 16, fpst);
+ t2 = float32_mul(e1 & 0xffff0000u, e2 & 0xffff0000u, fpst);
+ t1 = float32_add(t1, t2, fpst);
+ t1 = float32_add(sum, t1, fpst);
return t1;
}
+float32 bfdotadd_ebf(float32 sum, uint32_t e1, uint32_t e2,
+ float_status *fpst, float_status *fpst_odd)
+{
+ g_assert_not_reached();
+}
+
void HELPER(gvec_bfdot)(void *vd, void *vn, void *vm, void *va,
CPUARMState *env, uint32_t desc)
{
intptr_t i, opr_sz = simd_oprsz(desc);
float32 *d = vd, *a = va;
uint32_t *n = vn, *m = vm;
+ float_status fpst, fpst_odd;
- for (i = 0; i < opr_sz / 4; ++i) {
- d[i] = bfdotadd(a[i], n[i], m[i]);
+ if (is_ebf(env, &fpst, &fpst_odd)) {
+ for (i = 0; i < opr_sz / 4; ++i) {
+ d[i] = bfdotadd_ebf(a[i], n[i], m[i], &fpst, &fpst_odd);
+ }
+ } else {
+ for (i = 0; i < opr_sz / 4; ++i) {
+ d[i] = bfdotadd(a[i], n[i], m[i], &fpst);
+ }
}
clear_tail(d, opr_sz, simd_maxsz(desc));
}
@@ -2836,12 +2855,23 @@ void HELPER(gvec_bfdot_idx)(void *vd, void *vn, void *vm,
intptr_t eltspersegment = MIN(16 / 4, elements);
float32 *d = vd, *a = va;
uint32_t *n = vn, *m = vm;
+ float_status fpst, fpst_odd;
- for (i = 0; i < elements; i += eltspersegment) {
- uint32_t m_idx = m[i + H4(index)];
+ if (is_ebf(env, &fpst, &fpst_odd)) {
+ for (i = 0; i < elements; i += eltspersegment) {
+ uint32_t m_idx = m[i + H4(index)];
- for (j = i; j < i + eltspersegment; j++) {
- d[j] = bfdotadd(a[j], n[j], m_idx);
+ for (j = i; j < i + eltspersegment; j++) {
+ d[j] = bfdotadd_ebf(a[j], n[j], m_idx, &fpst, &fpst_odd);
+ }
+ }
+ } else {
+ for (i = 0; i < elements; i += eltspersegment) {
+ uint32_t m_idx = m[i + H4(index)];
+
+ for (j = i; j < i + eltspersegment; j++) {
+ d[j] = bfdotadd(a[j], n[j], m_idx, &fpst);
+ }
}
}
clear_tail(d, opr_sz, simd_maxsz(desc));
@@ -2853,37 +2883,72 @@ void HELPER(gvec_bfmmla)(void *vd, void *vn, void *vm, void *va,
intptr_t s, opr_sz = simd_oprsz(desc);
float32 *d = vd, *a = va;
uint32_t *n = vn, *m = vm;
+ float_status fpst, fpst_odd;
- for (s = 0; s < opr_sz / 4; s += 4) {
- float32 sum00, sum01, sum10, sum11;
+ if (is_ebf(env, &fpst, &fpst_odd)) {
+ for (s = 0; s < opr_sz / 4; s += 4) {
+ float32 sum00, sum01, sum10, sum11;
- /*
- * Process the entire segment at once, writing back the
- * results only after we've consumed all of the inputs.
- *
- * Key to indices by column:
- * i j i k j k
- */
- sum00 = a[s + H4(0 + 0)];
- sum00 = bfdotadd(sum00, n[s + H4(0 + 0)], m[s + H4(0 + 0)]);
- sum00 = bfdotadd(sum00, n[s + H4(0 + 1)], m[s + H4(0 + 1)]);
+ /*
+ * Process the entire segment at once, writing back the
+ * results only after we've consumed all of the inputs.
+ *
+ * Key to indices by column:
+ * i j i k j k
+ */
+ sum00 = a[s + H4(0 + 0)];
+ sum00 = bfdotadd_ebf(sum00, n[s + H4(0 + 0)], m[s + H4(0 + 0)], &fpst, &fpst_odd);
+ sum00 = bfdotadd_ebf(sum00, n[s + H4(0 + 1)], m[s + H4(0 + 1)], &fpst, &fpst_odd);
- sum01 = a[s + H4(0 + 1)];
- sum01 = bfdotadd(sum01, n[s + H4(0 + 0)], m[s + H4(2 + 0)]);
- sum01 = bfdotadd(sum01, n[s + H4(0 + 1)], m[s + H4(2 + 1)]);
+ sum01 = a[s + H4(0 + 1)];
+ sum01 = bfdotadd_ebf(sum01, n[s + H4(0 + 0)], m[s + H4(2 + 0)], &fpst, &fpst_odd);
+ sum01 = bfdotadd_ebf(sum01, n[s + H4(0 + 1)], m[s + H4(2 + 1)], &fpst, &fpst_odd);
- sum10 = a[s + H4(2 + 0)];
- sum10 = bfdotadd(sum10, n[s + H4(2 + 0)], m[s + H4(0 + 0)]);
- sum10 = bfdotadd(sum10, n[s + H4(2 + 1)], m[s + H4(0 + 1)]);
+ sum10 = a[s + H4(2 + 0)];
+ sum10 = bfdotadd_ebf(sum10, n[s + H4(2 + 0)], m[s + H4(0 + 0)], &fpst, &fpst_odd);
+ sum10 = bfdotadd_ebf(sum10, n[s + H4(2 + 1)], m[s + H4(0 + 1)], &fpst, &fpst_odd);
- sum11 = a[s + H4(2 + 1)];
- sum11 = bfdotadd(sum11, n[s + H4(2 + 0)], m[s + H4(2 + 0)]);
- sum11 = bfdotadd(sum11, n[s + H4(2 + 1)], m[s + H4(2 + 1)]);
+ sum11 = a[s + H4(2 + 1)];
+ sum11 = bfdotadd_ebf(sum11, n[s + H4(2 + 0)], m[s + H4(2 + 0)], &fpst, &fpst_odd);
+ sum11 = bfdotadd_ebf(sum11, n[s + H4(2 + 1)], m[s + H4(2 + 1)], &fpst, &fpst_odd);
- d[s + H4(0 + 0)] = sum00;
- d[s + H4(0 + 1)] = sum01;
- d[s + H4(2 + 0)] = sum10;
- d[s + H4(2 + 1)] = sum11;
+ d[s + H4(0 + 0)] = sum00;
+ d[s + H4(0 + 1)] = sum01;
+ d[s + H4(2 + 0)] = sum10;
+ d[s + H4(2 + 1)] = sum11;
+ }
+ } else {
+ for (s = 0; s < opr_sz / 4; s += 4) {
+ float32 sum00, sum01, sum10, sum11;
+
+ /*
+ * Process the entire segment at once, writing back the
+ * results only after we've consumed all of the inputs.
+ *
+ * Key to indices by column:
+ * i j i k j k
+ */
+ sum00 = a[s + H4(0 + 0)];
+ sum00 = bfdotadd(sum00, n[s + H4(0 + 0)], m[s + H4(0 + 0)], &fpst);
+ sum00 = bfdotadd(sum00, n[s + H4(0 + 1)], m[s + H4(0 + 1)], &fpst);
+
+ sum01 = a[s + H4(0 + 1)];
+ sum01 = bfdotadd(sum01, n[s + H4(0 + 0)], m[s + H4(2 + 0)], &fpst);
+ sum01 = bfdotadd(sum01, n[s + H4(0 + 1)], m[s + H4(2 + 1)], &fpst);
+
+ sum10 = a[s + H4(2 + 0)];
+ sum10 = bfdotadd(sum10, n[s + H4(2 + 0)], m[s + H4(0 + 0)], &fpst);
+ sum10 = bfdotadd(sum10, n[s + H4(2 + 1)], m[s + H4(0 + 1)], &fpst);
+
+ sum11 = a[s + H4(2 + 1)];
+ sum11 = bfdotadd(sum11, n[s + H4(2 + 0)], m[s + H4(2 + 0)], &fpst);
+ sum11 = bfdotadd(sum11, n[s + H4(2 + 1)], m[s + H4(2 + 1)], &fpst);
+
+ d[s + H4(0 + 0)] = sum00;
+ d[s + H4(0 + 1)] = sum01;
+ d[s + H4(2 + 0)] = sum10;
+ d[s + H4(2 + 1)] = sum11;
+ }
}
clear_tail(d, opr_sz, simd_maxsz(desc));
}
--
2.34.1
* [PULL 07/25] target/arm: Implement FPCR.EBF=1 semantics for bfdotadd()
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (5 preceding siblings ...)
2024-09-05 13:00 ` [PULL 06/25] target/arm: Prepare bfdotadd() callers for FEAT_EBF support Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 08/25] target/arm: Enable FEAT_EBF16 in the "max" CPU Peter Maydell
` (18 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
Implement the FPCR.EBF=1 semantics for bfdotadd() operations:
* is_ebf() sets up fpst and fpst_odd
* bfdotadd_ebf() implements the fused paired-multiply-and-add
operation that we need
The paired-multiply-and-add is similar to f16_dotadd() and
we use the same trick here as in that function, but the inputs
here are bfloat16 rather than float16.
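As a rough illustration of the operand layout and of the EBF=1
arithmetic, the sketch below unpacks the two bfloat16 elements held in
each 32-bit operand and forms the two-way sum of products in host
double precision. It is not the QEMU implementation: the helper names
are invented for this example, it uses host floating point rather than
softfloat, and it ignores NaN handling, denormal flushing and the FPCR
rounding mode.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret 32 raw bits as an IEEE single. */
static float bits_to_float(uint32_t bits)
{
    float f;

    assert(sizeof(f) == sizeof(bits));
    memcpy(&f, &bits, sizeof(f));
    return f;
}

/*
 * A bfloat16 value is the top 16 bits of a float32, so widening it
 * is just a matter of placing it in the high half of the word.
 */
static float bf16_lo(uint32_t pair)
{
    return bits_to_float(pair << 16);
}

static float bf16_hi(uint32_t pair)
{
    return bits_to_float(pair & 0xffff0000u);
}

/*
 * Illustrative two-way dot product with accumulate. Each product of
 * two bfloat16 values is exact in double, but the double addition can
 * itself round before the narrowing to float, so this only
 * approximates the single-rounding behaviour that the patch obtains
 * with a round-to-odd step followed by a fused multiply-add.
 */
static float bfdot_sketch(float sum, uint32_t e1, uint32_t e2)
{
    double p_lo = (double)bf16_lo(e1) * (double)bf16_lo(e2);
    double p_hi = (double)bf16_hi(e1) * (double)bf16_hi(e2);
    float dot = (float)(p_lo + p_hi);

    /* As in the patch, the final accumulation is a separate add. */
    return sum + dot;
}
```

For example, with e1 holding the pair (1.0, 2.0) and e2 holding the
pair (3.0, 4.0), the dot product is 1*3 + 2*4 = 11 before the
accumulator is added.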
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/tcg/vec_helper.c | 57 +++++++++++++++++++++++++++++++++++--
1 file changed, 54 insertions(+), 3 deletions(-)
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index b0de74b55f1..22ddb968817 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2792,7 +2792,20 @@ DO_MMLA_B(gvec_usmmla_b, do_usmmla_b)
bool is_ebf(CPUARMState *env, float_status *statusp, float_status *oddstatusp)
{
- /* FPCR is ignored for BFDOT and BFMMLA. */
+ /*
+ * For BFDOT, BFMMLA, etc, the behaviour depends on FPCR.EBF.
+ * For EBF = 0, we ignore the FPCR bits which determine rounding
+ * mode and denormal-flushing, and we do unfused multiplies and
+ * additions with intermediate rounding of all products and sums.
+ * For EBF = 1, we honour FPCR rounding mode and denormal-flushing bits,
+ * and we perform a fused two-way sum-of-products without intermediate
+ * rounding of the products.
+ * In either case, we don't set fp exception flags.
+ *
+ * EBF is AArch64 only, so even if it's set in the FPCR it has
+ * no effect on AArch32 instructions.
+ */
+ bool ebf = is_a64(env) && env->vfp.fpcr & FPCR_EBF;
*statusp = (float_status){
.tininess_before_rounding = float_tininess_before_rounding,
.float_rounding_mode = float_round_to_odd_inf,
@@ -2801,7 +2814,18 @@ bool is_ebf(CPUARMState *env, float_status *statusp, float_status *oddstatusp)
.default_nan_mode = true,
};
- return false;
+ if (ebf) {
+ float_status *fpst = &env->vfp.fp_status;
+ set_flush_to_zero(get_flush_to_zero(fpst), statusp);
+ set_flush_inputs_to_zero(get_flush_inputs_to_zero(fpst), statusp);
+ set_float_rounding_mode(get_float_rounding_mode(fpst), statusp);
+
+ /* EBF=1 needs to do a step with round-to-odd semantics */
+ *oddstatusp = *statusp;
+ set_float_rounding_mode(float_round_to_odd, oddstatusp);
+ }
+
+ return ebf;
}
float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2, float_status *fpst)
@@ -2823,7 +2847,34 @@ float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2, float_status *fpst)
float32 bfdotadd_ebf(float32 sum, uint32_t e1, uint32_t e2,
float_status *fpst, float_status *fpst_odd)
{
- g_assert_not_reached();
+ /*
+ * Compare f16_dotadd() in sme_helper.c, but here we have
+ * bfloat16 inputs. In particular that means that we do not
+ * want the FPCR.FZ16 flush semantics, so we use the normal
+ * float_status for the input handling here.
+ */
+ float64 e1r = float32_to_float64(e1 << 16, fpst);
+ float64 e1c = float32_to_float64(e1 & 0xffff0000u, fpst);
+ float64 e2r = float32_to_float64(e2 << 16, fpst);
+ float64 e2c = float32_to_float64(e2 & 0xffff0000u, fpst);
+ float64 t64;
+ float32 t32;
+
+ /*
+ * The ARM pseudocode function FPDot performs both multiplies
+ * and the add with a single rounding operation. Emulate this
+ * by performing the first multiply in round-to-odd, then doing
+ * the second multiply as fused multiply-add, and rounding to
+ * float32 all in one step.
+ */
+ t64 = float64_mul(e1r, e2r, fpst_odd);
+ t64 = float64r32_muladd(e1c, e2c, t64, 0, fpst);
+
+ /* This conversion is exact, because we've already rounded. */
+ t32 = float64_to_float32(t64, fpst);
+
+ /* The final accumulation step is not fused. */
+ return float32_add(sum, t32, fpst);
}
void HELPER(gvec_bfdot)(void *vd, void *vn, void *vm, void *va,
--
2.34.1
* [PULL 08/25] target/arm: Enable FEAT_EBF16 in the "max" CPU
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (6 preceding siblings ...)
2024-09-05 13:00 ` [PULL 07/25] target/arm: Implement FPCR.EBF=1 semantics for bfdotadd() Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 09/25] accel/tcg: Remove dead code from rr_cpu_thread_fn() Peter Maydell
` (17 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
Now that we've implemented the required behaviour for FEAT_EBF16, we
can enable it for the "max" CPU type, list it in our documentation,
and delete a TODO comment about it being missing.
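The relationship between the two features and the ID register field
value can be sketched in isolation as follows. This is a standalone
illustration rather than the QEMU FIELD_EX64/FIELD_DP64 macros: the
helper names are invented, and the field position (bits [47:44] of
ID_AA64ISAR1_EL1) is stated here as an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the BF16 field in ID_AA64ISAR1_EL1. */
#define BF16_SHIFT 44
#define BF16_WIDTH 4

/* Hypothetical helper: extract an unsigned bitfield. */
static uint64_t field_get(uint64_t reg, unsigned shift, unsigned width)
{
    assert(width > 0 && width < 64 && shift + width <= 64);
    return (reg >> shift) & ((UINT64_C(1) << width) - 1);
}

/* Hypothetical helper: deposit a value into a bitfield. */
static uint64_t field_set(uint64_t reg, unsigned shift, unsigned width,
                          uint64_t val)
{
    uint64_t mask = ((UINT64_C(1) << width) - 1) << shift;

    return (reg & ~mask) | ((val << shift) & mask);
}

/* Any non-zero field value implies FEAT_BF16. */
static bool has_bf16(uint64_t isar1)
{
    return field_get(isar1, BF16_SHIFT, BF16_WIDTH) != 0;
}

/* A field value of 2 or more additionally implies FEAT_EBF16. */
static bool has_ebf16(uint64_t isar1)
{
    return field_get(isar1, BF16_SHIFT, BF16_WIDTH) > 1;
}
```

Because the checks are "non-zero" and "greater than 1", raising the
field from 1 to 2 keeps FEAT_BF16 advertised while also advertising
FEAT_EBF16.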
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
docs/system/arm/emulation.rst | 1 +
target/arm/tcg/cpu64.c | 4 ++--
target/arm/tcg/translate-sme.c | 1 -
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index 3ab6e726679..35f52a54b1c 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -45,6 +45,7 @@ the following architecture extensions:
- FEAT_DotProd (Advanced SIMD dot product instructions)
- FEAT_DoubleFault (Double Fault Extension)
- FEAT_E0PD (Preventing EL0 access to halves of address maps)
+- FEAT_EBF16 (AArch64 Extended BFloat16 instructions)
- FEAT_ECV (Enhanced Counter Virtualization)
- FEAT_EL0 (Support for execution at EL0)
- FEAT_EL1 (Support for execution at EL1)
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index fe232eb3069..79258a7c928 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -1160,7 +1160,7 @@ void aarch64_max_tcg_initfn(Object *obj)
t = FIELD_DP64(t, ID_AA64ISAR1, FRINTTS, 1); /* FEAT_FRINTTS */
t = FIELD_DP64(t, ID_AA64ISAR1, SB, 1); /* FEAT_SB */
t = FIELD_DP64(t, ID_AA64ISAR1, SPECRES, 1); /* FEAT_SPECRES */
- t = FIELD_DP64(t, ID_AA64ISAR1, BF16, 1); /* FEAT_BF16 */
+ t = FIELD_DP64(t, ID_AA64ISAR1, BF16, 2); /* FEAT_BF16, FEAT_EBF16 */
t = FIELD_DP64(t, ID_AA64ISAR1, DGH, 1); /* FEAT_DGH */
t = FIELD_DP64(t, ID_AA64ISAR1, I8MM, 1); /* FEAT_I8MM */
cpu->isar.id_aa64isar1 = t;
@@ -1244,7 +1244,7 @@ void aarch64_max_tcg_initfn(Object *obj)
t = FIELD_DP64(t, ID_AA64ZFR0, SVEVER, 1);
t = FIELD_DP64(t, ID_AA64ZFR0, AES, 2); /* FEAT_SVE_PMULL128 */
t = FIELD_DP64(t, ID_AA64ZFR0, BITPERM, 1); /* FEAT_SVE_BitPerm */
- t = FIELD_DP64(t, ID_AA64ZFR0, BFLOAT16, 1); /* FEAT_BF16 */
+ t = FIELD_DP64(t, ID_AA64ZFR0, BFLOAT16, 2); /* FEAT_BF16, FEAT_EBF16 */
t = FIELD_DP64(t, ID_AA64ZFR0, SHA3, 1); /* FEAT_SVE_SHA3 */
t = FIELD_DP64(t, ID_AA64ZFR0, SM4, 1); /* FEAT_SVE_SM4 */
t = FIELD_DP64(t, ID_AA64ZFR0, I8MM, 1); /* FEAT_I8MM */
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 3ceb32e8bd9..01ece570164 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -362,7 +362,6 @@ TRANS_FEAT(FMOPA_s, aa64_sme, do_outprod_fpst, a,
TRANS_FEAT(FMOPA_d, aa64_sme_f64f64, do_outprod_fpst, a,
MO_64, FPST_FPCR, gen_helper_sme_fmopa_d)
-/* TODO: FEAT_EBF16 */
TRANS_FEAT(BFMOPA, aa64_sme, do_outprod_env, a, MO_32, gen_helper_sme_bfmopa)
TRANS_FEAT(SMOPA_s, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_smopa_s)
--
2.34.1
* [PULL 09/25] accel/tcg: Remove dead code from rr_cpu_thread_fn()
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (7 preceding siblings ...)
2024-09-05 13:00 ` [PULL 08/25] target/arm: Enable FEAT_EBF16 in the "max" CPU Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 10/25] hw: add compat machines for 9.2 Peter Maydell
` (16 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
The main loop in rr_cpu_thread_fn() can never terminate, so the
code at the end of the function to clean up the RCU subsystem is
dead code. Replace it with g_assert_not_reached().
(This is different from the other cpu_thread_fn for e.g. MTTCG or
for the KVM accelerator -- those can exit, if the vCPU they
are responsible for is unplugged. But the RR cpu thread fn
handles all CPUs in the system in a round-robin way, so even
if one is unplugged it keeps looping.)
Resolves: Coverity CID 1547782
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240815143634.3413679-1-peter.maydell@linaro.org
---
accel/tcg/tcg-accel-ops-rr.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
index c59c77da4b3..8ebadf8e9e1 100644
--- a/accel/tcg/tcg-accel-ops-rr.c
+++ b/accel/tcg/tcg-accel-ops-rr.c
@@ -302,9 +302,7 @@ static void *rr_cpu_thread_fn(void *arg)
rr_deal_with_unplugged_cpus();
}
- rcu_remove_force_rcu_notifier(&force_rcu);
- rcu_unregister_thread();
- return NULL;
+ g_assert_not_reached();
}
void rr_start_vcpu_thread(CPUState *cpu)
--
2.34.1
* [PULL 10/25] hw: add compat machines for 9.2
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (8 preceding siblings ...)
2024-09-05 13:00 ` [PULL 09/25] accel/tcg: Remove dead code from rr_cpu_thread_fn() Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 11/25] hw/arm/smmuv3: Update comment documenting "stage" property Peter Maydell
` (15 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
From: Cornelia Huck <cohuck@redhat.com>
Add 9.2 machine types for arm/i440fx/m68k/q35/s390x/spapr.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240816161350.3706332-2-peter.maydell@linaro.org
Message-id: 20240816103723.2325982-1-cohuck@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
include/hw/boards.h | 3 +++
include/hw/i386/pc.h | 3 +++
hw/arm/virt.c | 11 +++++++++--
hw/core/machine.c | 3 +++
hw/i386/pc.c | 3 +++
hw/i386/pc_piix.c | 15 ++++++++++++---
hw/i386/pc_q35.c | 13 +++++++++++--
hw/m68k/virt.c | 11 +++++++++--
hw/ppc/spapr.c | 17 ++++++++++++++---
hw/s390x/s390-virtio-ccw.c | 14 +++++++++++++-
10 files changed, 80 insertions(+), 13 deletions(-)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 48ff6d8b93f..9a492770cbb 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -732,6 +732,9 @@ struct MachineState {
} \
type_init(machine_initfn##_register_types)
+extern GlobalProperty hw_compat_9_1[];
+extern const size_t hw_compat_9_1_len;
+
extern GlobalProperty hw_compat_9_0[];
extern const size_t hw_compat_9_0_len;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 4e55d7ef6ea..14ee06287da 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -215,6 +215,9 @@ void pc_system_parse_ovmf_flash(uint8_t *flash_ptr, size_t flash_size);
/* sgx.c */
void pc_machine_init_sgx_epc(PCMachineState *pcms);
+extern GlobalProperty pc_compat_9_1[];
+extern const size_t pc_compat_9_1_len;
+
extern GlobalProperty pc_compat_9_0[];
extern const size_t pc_compat_9_0_len;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 687fe0bb8bc..a5d3ad9bf9e 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3301,10 +3301,17 @@ static void machvirt_machine_init(void)
}
type_init(machvirt_machine_init);
-static void virt_machine_9_1_options(MachineClass *mc)
+static void virt_machine_9_2_options(MachineClass *mc)
{
}
-DEFINE_VIRT_MACHINE_AS_LATEST(9, 1)
+DEFINE_VIRT_MACHINE_AS_LATEST(9, 2)
+
+static void virt_machine_9_1_options(MachineClass *mc)
+{
+ virt_machine_9_2_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_9_1, hw_compat_9_1_len);
+}
+DEFINE_VIRT_MACHINE(9, 1)
static void virt_machine_9_0_options(MachineClass *mc)
{
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 27dcda02483..adaba17ebac 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -34,6 +34,9 @@
#include "hw/virtio/virtio-iommu.h"
#include "audio/audio.h"
+GlobalProperty hw_compat_9_1[] = {};
+const size_t hw_compat_9_1_len = G_N_ELEMENTS(hw_compat_9_1);
+
GlobalProperty hw_compat_9_0[] = {
{"arm-cpu", "backcompat-cntfrq", "true" },
{ "scsi-hd", "migrate-emulated-scsi-request", "false" },
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 7779c88a91e..ba0ff511836 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -79,6 +79,9 @@
{ "qemu64-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },\
{ "athlon-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },
+GlobalProperty pc_compat_9_1[] = {};
+const size_t pc_compat_9_1_len = G_N_ELEMENTS(pc_compat_9_1);
+
GlobalProperty pc_compat_9_0[] = {
{ TYPE_X86_CPU, "x-amd-topoext-features-only", "false" },
{ TYPE_X86_CPU, "x-l1-cache-per-thread", "false" },
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 347afa4c370..2bf6865d405 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -474,13 +474,24 @@ static void pc_i440fx_machine_options(MachineClass *m)
"Use a different south bridge than PIIX3");
}
-static void pc_i440fx_machine_9_1_options(MachineClass *m)
+static void pc_i440fx_machine_9_2_options(MachineClass *m)
{
pc_i440fx_machine_options(m);
m->alias = "pc";
m->is_default = true;
}
+DEFINE_I440FX_MACHINE(9, 2);
+
+static void pc_i440fx_machine_9_1_options(MachineClass *m)
+{
+ pc_i440fx_machine_9_2_options(m);
+ m->alias = NULL;
+ m->is_default = false;
+ compat_props_add(m->compat_props, hw_compat_9_1, hw_compat_9_1_len);
+ compat_props_add(m->compat_props, pc_compat_9_1, pc_compat_9_1_len);
+}
+
DEFINE_I440FX_MACHINE(9, 1);
static void pc_i440fx_machine_9_0_options(MachineClass *m)
@@ -488,8 +499,6 @@ static void pc_i440fx_machine_9_0_options(MachineClass *m)
PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
pc_i440fx_machine_9_1_options(m);
- m->alias = NULL;
- m->is_default = false;
m->smbios_memory_device_size = 16 * GiB;
compat_props_add(m->compat_props, hw_compat_9_0, hw_compat_9_0_len);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index f2d8edfa846..8319b6d45ee 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -356,19 +356,28 @@ static void pc_q35_machine_options(MachineClass *m)
pc_q35_compat_defaults, pc_q35_compat_defaults_len);
}
-static void pc_q35_machine_9_1_options(MachineClass *m)
+static void pc_q35_machine_9_2_options(MachineClass *m)
{
pc_q35_machine_options(m);
m->alias = "q35";
}
+DEFINE_Q35_MACHINE(9, 2);
+
+static void pc_q35_machine_9_1_options(MachineClass *m)
+{
+ pc_q35_machine_9_2_options(m);
+ m->alias = NULL;
+ compat_props_add(m->compat_props, hw_compat_9_1, hw_compat_9_1_len);
+ compat_props_add(m->compat_props, pc_compat_9_1, pc_compat_9_1_len);
+}
+
DEFINE_Q35_MACHINE(9, 1);
static void pc_q35_machine_9_0_options(MachineClass *m)
{
PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
pc_q35_machine_9_1_options(m);
- m->alias = NULL;
m->smbios_memory_device_size = 16 * GiB;
compat_props_add(m->compat_props, hw_compat_9_0, hw_compat_9_0_len);
compat_props_add(m->compat_props, pc_compat_9_0, pc_compat_9_0_len);
diff --git a/hw/m68k/virt.c b/hw/m68k/virt.c
index cda199af8fa..ea5c4a5a570 100644
--- a/hw/m68k/virt.c
+++ b/hw/m68k/virt.c
@@ -366,10 +366,17 @@ type_init(virt_machine_register_types)
#define DEFINE_VIRT_MACHINE(major, minor) \
DEFINE_VIRT_MACHINE_IMPL(false, major, minor)
-static void virt_machine_9_1_options(MachineClass *mc)
+static void virt_machine_9_2_options(MachineClass *mc)
{
}
-DEFINE_VIRT_MACHINE_AS_LATEST(9, 1)
+DEFINE_VIRT_MACHINE_AS_LATEST(9, 2)
+
+static void virt_machine_9_1_options(MachineClass *mc)
+{
+ virt_machine_9_2_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_9_1, hw_compat_9_1_len);
+}
+DEFINE_VIRT_MACHINE(9, 1)
static void virt_machine_9_0_options(MachineClass *mc)
{
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 370d7c35d3a..8aa3ce7449b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4838,14 +4838,25 @@ static void spapr_machine_latest_class_options(MachineClass *mc)
DEFINE_SPAPR_MACHINE_IMPL(false, major, minor, _, tag)
/*
- * pseries-9.1
+ * pseries-9.2
*/
-static void spapr_machine_9_1_class_options(MachineClass *mc)
+static void spapr_machine_9_2_class_options(MachineClass *mc)
{
/* Defaults for the latest behaviour inherited from the base class */
}
-DEFINE_SPAPR_MACHINE_AS_LATEST(9, 1);
+DEFINE_SPAPR_MACHINE_AS_LATEST(9, 2);
+
+/*
+ * pseries-9.1
+ */
+static void spapr_machine_9_1_class_options(MachineClass *mc)
+{
+ spapr_machine_9_2_class_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_9_1, hw_compat_9_1_len);
+}
+
+DEFINE_SPAPR_MACHINE(9, 1);
/*
* pseries-9.0
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index c483ff8064d..18240a0fd8b 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -871,14 +871,26 @@ static const TypeInfo ccw_machine_info = {
DEFINE_CCW_MACHINE_IMPL(false, major, minor)
+static void ccw_machine_9_2_instance_options(MachineState *machine)
+{
+}
+
+static void ccw_machine_9_2_class_options(MachineClass *mc)
+{
+}
+DEFINE_CCW_MACHINE_AS_LATEST(9, 2);
+
static void ccw_machine_9_1_instance_options(MachineState *machine)
{
+ ccw_machine_9_2_instance_options(machine);
}
static void ccw_machine_9_1_class_options(MachineClass *mc)
{
+ ccw_machine_9_2_class_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_9_1, hw_compat_9_1_len);
}
-DEFINE_CCW_MACHINE_AS_LATEST(9, 1);
+DEFINE_CCW_MACHINE(9, 1);
static void ccw_machine_9_0_instance_options(MachineState *machine)
{
--
2.34.1
* [PULL 11/25] hw/arm/smmuv3: Update comment documenting "stage" property
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (9 preceding siblings ...)
2024-09-05 13:00 ` [PULL 10/25] hw: add compat machines for 9.2 Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 12/25] hw/arm/virt: Default to two-stage SMMU from virt-9.2 Peter Maydell
` (14 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
When we added support for nested (stage 1 + stage 2) translation
to the SMMU in commit 58377c363291d we forgot to update the
comment that documents the valid values of the "stage" property.
Add the new "nested" value to it.
Fixes: 58377c363291d ("hw/arm/smmuv3: Support and advertise nesting")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240816161350.3706332-3-peter.maydell@linaro.org
---
hw/arm/smmuv3.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 39719763897..4c49b5a885f 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1981,6 +1981,7 @@ static Property smmuv3_properties[] = {
* Stages of translation advertised.
* "1": Stage 1
* "2": Stage 2
+ * "nested": Both stage 1 and stage 2
* Defaults to stage 1
*/
DEFINE_PROP_STRING("stage", SMMUv3State, stage),
--
2.34.1
* [PULL 12/25] hw/arm/virt: Default to two-stage SMMU from virt-9.2
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (10 preceding siblings ...)
2024-09-05 13:00 ` [PULL 11/25] hw/arm/smmuv3: Update comment documenting "stage" property Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 13/25] hw/arm/sbsa-ref: Use two-stage SMMU Peter Maydell
` (13 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
Now that our SMMU model supports enabling both stages of translation
at once, we can enable this in the virt board. This is no change in
behaviour for guests, because if they simply ignore stage 2 and never
configure it then it has no effect. For the usual backwards
compatibility reasons we enable this only for machine types starting
with 9.2.
(Note that the SMMU is disabled by default on the virt board and is
only created if the user passes the 'iommu=smmuv3' machine option.)
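The versioning pattern used here can be sketched on its own. The
following is a simplified model, not the virt board code: BoardClass
and the function names are hypothetical, and returning "1" for older
versions stands in for leaving the "stage" property at its default.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a per-machine-version class structure. */
typedef struct {
    bool no_nested_smmu;
} BoardClass;

static void board_9_2_options(BoardClass *bc)
{
    /* Latest version: nothing to override, the new default applies. */
    (void)bc;
}

static void board_9_1_options(BoardClass *bc)
{
    /* Older versions start from the newer options, then opt out. */
    board_9_2_options(bc);
    bc->no_nested_smmu = true;
}

/* Pick the SMMU stage configuration for a given machine version. */
static const char *smmu_stage_for(const BoardClass *bc)
{
    assert(bc != NULL);
    return bc->no_nested_smmu ? "1" : "nested";
}
```

Phrasing the flag negatively ("no_nested_smmu") means that a
zero-initialised class structure gets the new behaviour, and only the
options functions for old versions have to set anything.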
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240816161350.3706332-4-peter.maydell@linaro.org
---
include/hw/arm/virt.h | 1 +
hw/arm/virt.c | 8 ++++++++
2 files changed, 9 insertions(+)
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index a4d937ed45a..aca4f8061b1 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -134,6 +134,7 @@ struct VirtMachineClass {
bool no_cpu_topology;
bool no_tcg_lpa2;
bool no_ns_el2_virt_timer_irq;
+ bool no_nested_smmu;
};
struct VirtMachineState {
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a5d3ad9bf9e..7934b236516 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1408,6 +1408,7 @@ static void create_pcie_irq_map(const MachineState *ms,
static void create_smmu(const VirtMachineState *vms,
PCIBus *bus)
{
+ VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
char *node;
const char compat[] = "arm,smmu-v3";
int irq = vms->irqmap[VIRT_SMMU];
@@ -1424,6 +1425,9 @@ static void create_smmu(const VirtMachineState *vms,
dev = qdev_new(TYPE_ARM_SMMUV3);
+ if (!vmc->no_nested_smmu) {
+ object_property_set_str(OBJECT(dev), "stage", "nested", &error_fatal);
+ }
object_property_set_link(OBJECT(dev), "primary-bus", OBJECT(bus),
&error_abort);
sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
@@ -3308,8 +3312,12 @@ DEFINE_VIRT_MACHINE_AS_LATEST(9, 2)
static void virt_machine_9_1_options(MachineClass *mc)
{
+ VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
+
virt_machine_9_2_options(mc);
compat_props_add(mc->compat_props, hw_compat_9_1, hw_compat_9_1_len);
+ /* 9.1 and earlier have only a stage-1 SMMU, not a nested s1+2 one */
+ vmc->no_nested_smmu = true;
}
DEFINE_VIRT_MACHINE(9, 1)
--
2.34.1
* [PULL 13/25] hw/arm/sbsa-ref: Use two-stage SMMU
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (11 preceding siblings ...)
2024-09-05 13:00 ` [PULL 12/25] hw/arm/virt: Default to two-stage SMMU from virt-9.2 Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 14/25] hw/misc/xlnx-versal-cfu: destroy fifo in finalize Peter Maydell
` (12 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
Now that our SMMU model supports enabling both stages of translation
at once, we can enable this in the sbsa-ref board. Existing guest
code that only programs stage 1 and doesn't care about stage 2 should
continue to run with the same behaviour, but guests that do want to
do nested SMMU configurations can now do so.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-id: 20240816161350.3706332-5-peter.maydell@linaro.org
---
hw/arm/sbsa-ref.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index ae37a923015..396abe9c1bd 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -621,6 +621,7 @@ static void create_smmu(const SBSAMachineState *sms, PCIBus *bus)
dev = qdev_new(TYPE_ARM_SMMUV3);
+ object_property_set_str(OBJECT(dev), "stage", "nested", &error_abort);
object_property_set_link(OBJECT(dev), "primary-bus", OBJECT(bus),
&error_abort);
sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
--
2.34.1
* [PULL 14/25] hw/misc/xlnx-versal-cfu: destroy fifo in finalize
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (12 preceding siblings ...)
2024-09-05 13:00 ` [PULL 13/25] hw/arm/sbsa-ref: Use two-stage SMMU Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 15/25] hw/misc/xlnx-versal-trng: Free s->prng in finalize, not unrealize Peter Maydell
` (11 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
Since the TYPE_XLNX_VERSAL_CFU_FDRO device creates a FIFO in its
instance_init method, we must destroy the FIFO in instance_finalize
to avoid a memory leak for the QOM introspection
"instantiate-examine-finalize" cycle:
Direct leak of 8192 byte(s) in 1 object(s) allocated from:
#0 0x55ec89eae7ee in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-aarch64+0x294d7ee) (BuildId: 6d508874816cc47d17c8dd775e8f809ae520e8cb)
#1 0x7f697018f738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x55ec8d98d98d in fifo8_create util/fifo8.c:27:18
#3 0x55ec8aa2a624 in fifo32_create /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/include/qemu/fifo32.h:35:5
#4 0x55ec8aa2a33c in cfu_fdro_init hw/misc/xlnx-versal-cfu.c:397:5
#5 0x55ec8ce75da1 in object_init_with_type qom/object.c:420:9
#6 0x55ec8ce5d07b in object_initialize_with_type qom/object.c:562:5
#7 0x55ec8ce5e91d in object_new_with_type qom/object.c:782:5
#8 0x55ec8ce5e9f1 in object_new qom/object.c:797:12
#9 0x55ec8d65c81d in qmp_device_list_properties qom/qom-qmp-cmds.c:144:11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20240822162127.705879-2-peter.maydell@linaro.org
---
hw/misc/xlnx-versal-cfu.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/hw/misc/xlnx-versal-cfu.c b/hw/misc/xlnx-versal-cfu.c
index 6bb82e51c15..2284b407eab 100644
--- a/hw/misc/xlnx-versal-cfu.c
+++ b/hw/misc/xlnx-versal-cfu.c
@@ -397,6 +397,13 @@ static void cfu_fdro_init(Object *obj)
fifo32_create(&s->fdro_data, 8 * KiB / sizeof(uint32_t));
}
+static void cfu_fdro_finalize(Object *obj)
+{
+ XlnxVersalCFUFDRO *s = XLNX_VERSAL_CFU_FDRO(obj);
+
+ fifo32_destroy(&s->fdro_data);
+}
+
static void cfu_fdro_reset_enter(Object *obj, ResetType type)
{
XlnxVersalCFUFDRO *s = XLNX_VERSAL_CFU_FDRO(obj);
@@ -539,6 +546,7 @@ static const TypeInfo cfu_fdro_info = {
.instance_size = sizeof(XlnxVersalCFUFDRO),
.class_init = cfu_fdro_class_init,
.instance_init = cfu_fdro_init,
+ .instance_finalize = cfu_fdro_finalize,
.interfaces = (InterfaceInfo[]) {
{ TYPE_XLNX_CFI_IF },
{ }
--
2.34.1
* [PULL 15/25] hw/misc/xlnx-versal-trng: Free s->prng in finalize, not unrealize
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (13 preceding siblings ...)
2024-09-05 13:00 ` [PULL 14/25] hw/misc/xlnx-versal-cfu: destroy fifo in finalize Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 16/25] hw/nvram/xlnx-bbram: Call register_finalize_block Peter Maydell
` (10 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
The TYPE_XLNX_VERSAL_TRNG device creates s->prng with g_rand_new()
in its init method, but it frees it in its unrealize method. This
results in a leak in the QOM introspection "initialize-inspect-finalize"
lifecycle:
Direct leak of 2500 byte(s) in 1 object(s) allocated from:
#0 0x55ec89eae9d8 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-aarch64+0x294d9d8) (BuildId: 6d508874816cc47d17c8dd775e8f809ae520e8cb)
#1 0x7f697018fc50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x7f6970197738 in g_rand_new_with_seed_array debian/build/deb/../../../glib/grand.c:202:17
#3 0x7f6970197816 in g_rand_new debian/build/deb/../../../glib/grand.c:286:10
#4 0x55ec8aa3656a in trng_init hw/misc/xlnx-versal-trng.c:624:15
#5 0x55ec8ce75da1 in object_init_with_type qom/object.c:420:9
#6 0x55ec8ce5d07b in object_initialize_with_type qom/object.c:562:5
#7 0x55ec8ce5e91d in object_new_with_type qom/object.c:782:5
#8 0x55ec8ce5e9f1 in object_new qom/object.c:797:12
#9 0x55ec8d65c81d in qmp_device_list_properties qom/qom-qmp-cmds.c:144:11
Move the free to finalize so it matches where we are initing
s->prng. Since that's the only thing our unrealize method was
doing, this essentially switches the whole function to be
a finalize implementation.
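For readers following along, here is a minimal sketch of why freeing in unrealize leaks on the introspection path. It uses mock types only: MockDev, run_lifecycle() and the other names are made up for illustration and are not QEMU's QOM API. The assumption modelled is the one stated above, that introspection creates and finalizes the object without ever realizing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device object; not QEMU's real QOM types. */
typedef struct MockDev {
    void *prng;          /* allocated in init, like s->prng */
} MockDev;

static int live_allocs;  /* outstanding allocations made by mock_init() */

void mock_release(MockDev *d)
{
    if (d->prng) {
        free(d->prng);
        d->prng = NULL;
        live_allocs--;
    }
}

void mock_init(MockDev *d)
{
    d->prng = malloc(16);
    assert(d->prng != NULL);
    live_allocs++;
}

/* free_in_finalize == false models the old code, true the new code. */
void mock_unrealize(MockDev *d, bool free_in_finalize)
{
    if (!free_in_finalize) {
        mock_release(d);
    }
}

void mock_finalize(MockDev *d, bool free_in_finalize)
{
    if (free_in_finalize) {
        mock_release(d);
    }
}

/*
 * Run one object through its life and return how many allocations are
 * still outstanding after finalize. Introspection never realizes the
 * object, so realize == false models the introspection lifecycle.
 */
int run_lifecycle(bool realize, bool free_in_finalize)
{
    MockDev d;
    int leaked;

    live_allocs = 0;
    mock_init(&d);
    if (realize) {
        /* the device would be realized here, and unrealized later */
        mock_unrealize(&d, free_in_finalize);
    }
    mock_finalize(&d, free_in_finalize);
    leaked = live_allocs;
    mock_release(&d);    /* tidy up so the sketch itself does not leak */
    return leaked;
}
```

In this model only the combination "never realized, free in unrealize" leaves an allocation behind, which is the case the leak report shows.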
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20240822162127.705879-3-peter.maydell@linaro.org
---
hw/misc/xlnx-versal-trng.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/hw/misc/xlnx-versal-trng.c b/hw/misc/xlnx-versal-trng.c
index 51eb7600414..c0d1dde8708 100644
--- a/hw/misc/xlnx-versal-trng.c
+++ b/hw/misc/xlnx-versal-trng.c
@@ -624,9 +624,9 @@ static void trng_init(Object *obj)
s->prng = g_rand_new();
}
-static void trng_unrealize(DeviceState *dev)
+static void trng_finalize(Object *obj)
{
- XlnxVersalTRng *s = XLNX_VERSAL_TRNG(dev);
+ XlnxVersalTRng *s = XLNX_VERSAL_TRNG(obj);
g_rand_free(s->prng);
s->prng = NULL;
@@ -689,7 +689,6 @@ static void trng_class_init(ObjectClass *klass, void *data)
ResettableClass *rc = RESETTABLE_CLASS(klass);
dc->vmsd = &vmstate_trng;
- dc->unrealize = trng_unrealize;
rc->phases.hold = trng_reset_hold;
/* Clone uint64 property with set allowed after realized */
@@ -706,6 +705,7 @@ static const TypeInfo trng_info = {
.instance_size = sizeof(XlnxVersalTRng),
.class_init = trng_class_init,
.instance_init = trng_init,
+ .instance_finalize = trng_finalize,
};
static void trng_register_types(void)
--
2.34.1
* [PULL 16/25] hw/nvram/xlnx-bbram: Call register_finalize_block
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (14 preceding siblings ...)
2024-09-05 13:00 ` [PULL 15/25] hw/misc/xlnx-versal-trng: Free s->prng in finalize, not unrealize Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 17/25] hw/nvram/xlnx-zynqmp-efuse: " Peter Maydell
` (9 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
The TYPE_XLNX_BBRAM device creates a register block with
register_init_block32() in its instance_init method; we must
therefore destroy it in our instance_finalize method to avoid a leak
in the QOM introspection "init-inspect-finalize" lifecycle:
Direct leak of 304 byte(s) in 1 object(s) allocated from:
#0 0x5641518ca9d8 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-aarch64+0x294d9d8) (BuildId: 4a618cb63d57d5a19ed45cfc262b08da47eaafe5)
#1 0x7ff1aab31c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x564151cffc5d in register_init_block hw/core/register.c:248:34
#3 0x564151d006be in register_init_block32 hw/core/register.c:299:12
#4 0x56415293df75 in bbram_ctrl_init hw/nvram/xlnx-bbram.c:462:9
#5 0x564154891dc1 in object_init_with_type qom/object.c:420:9
#6 0x56415487909b in object_initialize_with_type qom/object.c:562:5
#7 0x56415487a93d in object_new_with_type qom/object.c:782:5
#8 0x56415487aa11 in object_new qom/object.c:797:12
#9 0x56415507883d in qmp_device_list_properties qom/qom-qmp-cmds.c:144:11
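The shape of the fix in this and the following register-block patches can be sketched with mock types: the handle returned by the init call has to be kept in the device state, because a local variable in the init function is gone by the time finalize runs. MockRegArray, MockState and the mock_* functions below are hypothetical and only mimic the pattern; they are not the hw/core/register.c API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a heap-allocated register block. */
typedef struct MockRegArray {
    int nregs;
} MockRegArray;

static int live_blocks;  /* register blocks currently allocated */

MockRegArray *mock_init_block(int nregs)
{
    MockRegArray *a = calloc(1, sizeof(*a));

    assert(a != NULL);
    a->nregs = nregs;
    live_blocks++;
    return a;
}

void mock_finalize_block(MockRegArray *a)
{
    if (a) {
        free(a);
        live_blocks--;
    }
}

/* Device state: the handle lives here so finalize can reach it. */
typedef struct MockState {
    MockRegArray *reg_array;
} MockState;

void mock_state_init(MockState *s)
{
    s->reg_array = mock_init_block(4);
}

void mock_state_finalize(MockState *s)
{
    mock_finalize_block(s->reg_array);
    s->reg_array = NULL;
}

/* Create and destroy a state object; return the blocks still allocated. */
int blocks_left_after_introspection(void)
{
    MockState s;

    live_blocks = 0;
    mock_state_init(&s);
    assert(s.reg_array->nregs == 4);
    mock_state_finalize(&s);
    return live_blocks;
}
```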
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20240822162127.705879-4-peter.maydell@linaro.org
---
include/hw/nvram/xlnx-bbram.h | 1 +
hw/nvram/xlnx-bbram.c | 13 ++++++++++---
2 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/include/hw/nvram/xlnx-bbram.h b/include/hw/nvram/xlnx-bbram.h
index 6fc13f8cc17..bce8e89d905 100644
--- a/include/hw/nvram/xlnx-bbram.h
+++ b/include/hw/nvram/xlnx-bbram.h
@@ -47,6 +47,7 @@ struct XlnxBBRam {
bool bbram8_wo;
bool blk_ro;
+ RegisterInfoArray *reg_array;
uint32_t regs[RMAX_XLNX_BBRAM];
RegisterInfo regs_info[RMAX_XLNX_BBRAM];
};
diff --git a/hw/nvram/xlnx-bbram.c b/hw/nvram/xlnx-bbram.c
index 09575a77d77..1bc58e90ad0 100644
--- a/hw/nvram/xlnx-bbram.c
+++ b/hw/nvram/xlnx-bbram.c
@@ -456,9 +456,8 @@ static void bbram_ctrl_init(Object *obj)
{
XlnxBBRam *s = XLNX_BBRAM(obj);
SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
- RegisterInfoArray *reg_array;
- reg_array =
+ s->reg_array =
register_init_block32(DEVICE(obj), bbram_ctrl_regs_info,
ARRAY_SIZE(bbram_ctrl_regs_info),
s->regs_info, s->regs,
@@ -466,10 +465,17 @@ static void bbram_ctrl_init(Object *obj)
XLNX_BBRAM_ERR_DEBUG,
R_MAX * 4);
- sysbus_init_mmio(sbd, &reg_array->mem);
+ sysbus_init_mmio(sbd, &s->reg_array->mem);
sysbus_init_irq(sbd, &s->irq_bbram);
}
+static void bbram_ctrl_finalize(Object *obj)
+{
+ XlnxBBRam *s = XLNX_BBRAM(obj);
+
+ register_finalize_block(s->reg_array);
+}
+
static void bbram_prop_set_drive(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
@@ -537,6 +543,7 @@ static const TypeInfo bbram_ctrl_info = {
.instance_size = sizeof(XlnxBBRam),
.class_init = bbram_ctrl_class_init,
.instance_init = bbram_ctrl_init,
+ .instance_finalize = bbram_ctrl_finalize,
};
static void bbram_ctrl_register_types(void)
--
2.34.1
* [PULL 17/25] hw/nvram/xlnx-zynqmp-efuse: Call register_finalize_block
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (15 preceding siblings ...)
2024-09-05 13:00 ` [PULL 16/25] hw/nvram/xlnx-bbram: Call register_finalize_block Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 18/25] hw/misc/xlnx-versal-trng: " Peter Maydell
` (8 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
The TYPE_XLNX_ZYNQMP_EFUSE device creates a register block with
register_init_block32() in its instance_init method; we must
therefore destroy it in our instance_finalize method to avoid a leak
in the QOM introspection "init-inspect-finalize" lifecycle:
Direct leak of 304 byte(s) in 1 object(s) allocated from:
#0 0x55f3ff5839d8 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-aarch64+0x294d9d8) (BuildId: 23cf931c66865a71b6cc4da95156d03bc106fa72)
#1 0x7f3f31c6bc50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x55f3ff9b8c5d in register_init_block hw/core/register.c:248:34
#3 0x55f3ff9b96be in register_init_block32 hw/core/register.c:299:12
#4 0x55f4005e5b25 in efuse_ctrl_init hw/nvram/xlnx-versal-efuse-ctrl.c:718:9
#5 0x55f40254afb1 in object_init_with_type qom/object.c:420:9
#6 0x55f40253228b in object_initialize_with_type qom/object.c:562:5
#7 0x55f402533b2d in object_new_with_type qom/object.c:782:5
#8 0x55f402533c01 in object_new qom/object.c:797:12
#9 0x55f402d31a2d in qmp_device_list_properties qom/qom-qmp-cmds.c:144:11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20240822162127.705879-5-peter.maydell@linaro.org
---
include/hw/nvram/xlnx-zynqmp-efuse.h | 1 +
hw/nvram/xlnx-zynqmp-efuse.c | 13 ++++++++++---
2 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/include/hw/nvram/xlnx-zynqmp-efuse.h b/include/hw/nvram/xlnx-zynqmp-efuse.h
index f5beacc2e6a..7fb12df3fbb 100644
--- a/include/hw/nvram/xlnx-zynqmp-efuse.h
+++ b/include/hw/nvram/xlnx-zynqmp-efuse.h
@@ -37,6 +37,7 @@ struct XlnxZynqMPEFuse {
qemu_irq irq;
XlnxEFuse *efuse;
+ RegisterInfoArray *reg_array;
uint32_t regs[XLNX_ZYNQMP_EFUSE_R_MAX];
RegisterInfo regs_info[XLNX_ZYNQMP_EFUSE_R_MAX];
};
diff --git a/hw/nvram/xlnx-zynqmp-efuse.c b/hw/nvram/xlnx-zynqmp-efuse.c
index 2d465f0fc6a..4e2d1b9d1e7 100644
--- a/hw/nvram/xlnx-zynqmp-efuse.c
+++ b/hw/nvram/xlnx-zynqmp-efuse.c
@@ -803,9 +803,8 @@ static void zynqmp_efuse_init(Object *obj)
{
XlnxZynqMPEFuse *s = XLNX_ZYNQMP_EFUSE(obj);
SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
- RegisterInfoArray *reg_array;
- reg_array =
+ s->reg_array =
register_init_block32(DEVICE(obj), zynqmp_efuse_regs_info,
ARRAY_SIZE(zynqmp_efuse_regs_info),
s->regs_info, s->regs,
@@ -813,10 +812,17 @@ static void zynqmp_efuse_init(Object *obj)
ZYNQMP_EFUSE_ERR_DEBUG,
R_MAX * 4);
- sysbus_init_mmio(sbd, &reg_array->mem);
+ sysbus_init_mmio(sbd, &s->reg_array->mem);
sysbus_init_irq(sbd, &s->irq);
}
+static void zynqmp_efuse_finalize(Object *obj)
+{
+ XlnxZynqMPEFuse *s = XLNX_ZYNQMP_EFUSE(obj);
+
+ register_finalize_block(s->reg_array);
+}
+
static const VMStateDescription vmstate_efuse = {
.name = TYPE_XLNX_ZYNQMP_EFUSE,
.version_id = 1,
@@ -853,6 +859,7 @@ static const TypeInfo efuse_info = {
.instance_size = sizeof(XlnxZynqMPEFuse),
.class_init = zynqmp_efuse_class_init,
.instance_init = zynqmp_efuse_init,
+ .instance_finalize = zynqmp_efuse_finalize,
};
static void efuse_register_types(void)
--
2.34.1
* [PULL 18/25] hw/misc/xlnx-versal-trng: Call register_finalize_block
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (16 preceding siblings ...)
2024-09-05 13:00 ` [PULL 17/25] hw/nvram/xlnx-zynqmp-efuse: " Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 19/25] hw/nvram/xlnx-versal-efuse-ctrl: " Peter Maydell
` (7 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
The TYPE_XLNX_VERSAL_TRNG device creates a register block with
register_init_block32() in its instance_init method; we must
therefore destroy it in our instance_finalize method to avoid a leak
in the QOM introspection "init-inspect-finalize" lifecycle:
Direct leak of 304 byte(s) in 1 object(s) allocated from:
#0 0x55842ec799d8 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-aarch64+0x294e9d8) (BuildId: 47496e53f3e779f1c7e9b82cbea07407152b498b)
#1 0x7fe793c75c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x55842f0aec5d in register_init_block hw/core/register.c:248:34
#3 0x55842f0af6be in register_init_block32 hw/core/register.c:299:12
#4 0x55842f801588 in trng_init hw/misc/xlnx-versal-trng.c:614:9
#5 0x558431c411a1 in object_init_with_type qom/object.c:420:9
#6 0x558431c2847b in object_initialize_with_type qom/object.c:562:5
#7 0x558431c29d1d in object_new_with_type qom/object.c:782:5
#8 0x558431c29df1 in object_new qom/object.c:797:12
#9 0x558432427c1d in qmp_device_list_properties qom/qom-qmp-cmds.c:144:11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20240822162127.705879-6-peter.maydell@linaro.org
---
include/hw/misc/xlnx-versal-trng.h | 1 +
hw/misc/xlnx-versal-trng.c | 6 +++---
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/include/hw/misc/xlnx-versal-trng.h b/include/hw/misc/xlnx-versal-trng.h
index 0bcef8a6132..d96f8f9eff3 100644
--- a/include/hw/misc/xlnx-versal-trng.h
+++ b/include/hw/misc/xlnx-versal-trng.h
@@ -50,6 +50,7 @@ typedef struct XlnxVersalTRng {
uint64_t forced_prng_count;
uint64_t tst_seed[2];
+ RegisterInfoArray *reg_array;
uint32_t regs[RMAX_XLNX_VERSAL_TRNG];
RegisterInfo regs_info[RMAX_XLNX_VERSAL_TRNG];
} XlnxVersalTRng;
diff --git a/hw/misc/xlnx-versal-trng.c b/hw/misc/xlnx-versal-trng.c
index c0d1dde8708..86905479b8f 100644
--- a/hw/misc/xlnx-versal-trng.c
+++ b/hw/misc/xlnx-versal-trng.c
@@ -608,9 +608,8 @@ static void trng_init(Object *obj)
{
XlnxVersalTRng *s = XLNX_VERSAL_TRNG(obj);
SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
- RegisterInfoArray *reg_array;
- reg_array =
+ s->reg_array =
register_init_block32(DEVICE(obj), trng_regs_info,
ARRAY_SIZE(trng_regs_info),
s->regs_info, s->regs,
@@ -618,7 +617,7 @@ static void trng_init(Object *obj)
XLNX_VERSAL_TRNG_ERR_DEBUG,
R_MAX * 4);
- sysbus_init_mmio(sbd, &reg_array->mem);
+ sysbus_init_mmio(sbd, &s->reg_array->mem);
sysbus_init_irq(sbd, &s->irq);
s->prng = g_rand_new();
@@ -628,6 +627,7 @@ static void trng_finalize(Object *obj)
{
XlnxVersalTRng *s = XLNX_VERSAL_TRNG(obj);
+ register_finalize_block(s->reg_array);
g_rand_free(s->prng);
s->prng = NULL;
}
--
2.34.1
* [PULL 19/25] hw/nvram/xlnx-versal-efuse-ctrl: Call register_finalize_block
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (17 preceding siblings ...)
2024-09-05 13:00 ` [PULL 18/25] hw/misc/xlnx-versal-trng: " Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 20/25] hw/arm/sbsa-ref: Don't leak string in sbsa_fdt_add_gic_node() Peter Maydell
` (6 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
The TYPE_XLNX_VERSAL_EFUSE_CTRL device creates a register block with
register_init_block32() in its instance_init method; we must
therefore destroy it in our instance_finalize method to avoid a leak
in the QOM introspection "init-inspect-finalize" lifecycle:
Direct leak of 304 byte(s) in 1 object(s) allocated from:
#0 0x55f222b5b9d8 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-aarch64+0x294e9d8) (BuildId: 42043d49e1139e3f3071b1f22fac1e3e7249c9a6)
#1 0x7fbb10669c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x55f222f90c5d in register_init_block hw/core/register.c:248:34
#3 0x55f222f916be in register_init_block32 hw/core/register.c:299:12
#4 0x55f223bbdd15 in efuse_ctrl_init hw/nvram/xlnx-versal-efuse-ctrl.c:718:9
#5 0x55f225b23391 in object_init_with_type qom/object.c:420:9
#6 0x55f225b0a66b in object_initialize_with_type qom/object.c:562:5
#7 0x55f225b0bf0d in object_new_with_type qom/object.c:782:5
#8 0x55f225b0bfe1 in object_new qom/object.c:797:12
#9 0x55f226309e0d in qmp_device_list_properties qom/qom-qmp-cmds.c:144:11
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Francisco Iglesias <francisco.iglesias@amd.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20240822162127.705879-7-peter.maydell@linaro.org
---
include/hw/nvram/xlnx-versal-efuse.h | 1 +
hw/nvram/xlnx-versal-efuse-ctrl.c | 6 +++---
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/include/hw/nvram/xlnx-versal-efuse.h b/include/hw/nvram/xlnx-versal-efuse.h
index 86e2261b9a3..afa4f4f9960 100644
--- a/include/hw/nvram/xlnx-versal-efuse.h
+++ b/include/hw/nvram/xlnx-versal-efuse.h
@@ -44,6 +44,7 @@ struct XlnxVersalEFuseCtrl {
void *extra_pg0_lock_spec; /* Opaque property */
uint32_t extra_pg0_lock_n16;
+ RegisterInfoArray *reg_array;
uint32_t regs[XLNX_VERSAL_EFUSE_CTRL_R_MAX];
RegisterInfo regs_info[XLNX_VERSAL_EFUSE_CTRL_R_MAX];
};
diff --git a/hw/nvram/xlnx-versal-efuse-ctrl.c b/hw/nvram/xlnx-versal-efuse-ctrl.c
index def6fe3302b..8252a5cabe0 100644
--- a/hw/nvram/xlnx-versal-efuse-ctrl.c
+++ b/hw/nvram/xlnx-versal-efuse-ctrl.c
@@ -712,9 +712,8 @@ static void efuse_ctrl_init(Object *obj)
{
XlnxVersalEFuseCtrl *s = XLNX_VERSAL_EFUSE_CTRL(obj);
SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
- RegisterInfoArray *reg_array;
- reg_array =
+ s->reg_array =
register_init_block32(DEVICE(obj), efuse_ctrl_regs_info,
ARRAY_SIZE(efuse_ctrl_regs_info),
s->regs_info, s->regs,
@@ -722,7 +721,7 @@ static void efuse_ctrl_init(Object *obj)
XLNX_VERSAL_EFUSE_CTRL_ERR_DEBUG,
R_MAX * 4);
- sysbus_init_mmio(sbd, &reg_array->mem);
+ sysbus_init_mmio(sbd, &s->reg_array->mem);
sysbus_init_irq(sbd, &s->irq_efuse_imr);
}
@@ -730,6 +729,7 @@ static void efuse_ctrl_finalize(Object *obj)
{
XlnxVersalEFuseCtrl *s = XLNX_VERSAL_EFUSE_CTRL(obj);
+ register_finalize_block(s->reg_array);
g_free(s->extra_pg0_lock_spec);
}
--
2.34.1
* [PULL 20/25] hw/arm/sbsa-ref: Don't leak string in sbsa_fdt_add_gic_node()
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (18 preceding siblings ...)
2024-09-05 13:00 ` [PULL 19/25] hw/nvram/xlnx-versal-efuse-ctrl: " Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 21/25] target/arm: Correct names of VFP VFNMA and VFNMS insns Peter Maydell
` (5 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
In sbsa_fdt_add_gic_node() we g_strdup_printf() two nodename
strings, but only free one.
Since the string is actually entirely constant and we don't
make any use of printf's format-string operations, we can
drop the g_strdup_printf() use entirely.
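A rough illustration of the leak pattern, without glib: one pointer variable is reused for two heap strings and freed once, so the first string is lost. counted_strdup(), counted_free() and use_node() are hypothetical helpers written for this sketch, standing in for g_strdup_printf(), g_free() and the fdt calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int live_strings;   /* heap strings not yet freed */
static int nodes_used;     /* how many node names were consumed */

/* Hypothetical allocation-counting replacement for g_strdup_printf(). */
char *counted_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);

    assert(p != NULL);
    memcpy(p, s, n);
    live_strings++;
    return p;
}

void counted_free(char *p)
{
    if (p) {
        free(p);
        live_strings--;
    }
}

/* Stand-in for the code that consumes a node name. */
void use_node(const char *name)
{
    if (name[0] == '/') {
        nodes_used++;
    }
}

/* Old shape: two allocations through one pointer, one free. */
int leaky_shape(void)
{
    char *nodename, *first;
    int leaked;

    live_strings = 0;
    nodename = counted_strdup("/intc");
    use_node(nodename);
    first = nodename;               /* kept only so the sketch can clean up */
    nodename = counted_strdup("/intc/its");
    use_node(nodename);
    counted_free(nodename);
    leaked = live_strings;          /* the "/intc" string is still live */
    counted_free(first);
    return leaked;
}

/* New shape: constant strings, nothing to allocate or free. */
int constant_shape(void)
{
    const char *intc_nodename = "/intc";
    const char *its_nodename = "/intc/its";

    live_strings = 0;
    use_node(intc_nodename);
    use_node(its_nodename);
    return live_strings;
}
```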
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-id: 20240822162323.706382-1-peter.maydell@linaro.org
---
hw/arm/sbsa-ref.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index 396abe9c1bd..e3195d54497 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -164,23 +164,20 @@ static uint64_t sbsa_ref_cpu_mp_affinity(SBSAMachineState *sms, int idx)
static void sbsa_fdt_add_gic_node(SBSAMachineState *sms)
{
- char *nodename;
+ const char *intc_nodename = "/intc";
+ const char *its_nodename = "/intc/its";
- nodename = g_strdup_printf("/intc");
- qemu_fdt_add_subnode(sms->fdt, nodename);
- qemu_fdt_setprop_sized_cells(sms->fdt, nodename, "reg",
+ qemu_fdt_add_subnode(sms->fdt, intc_nodename);
+ qemu_fdt_setprop_sized_cells(sms->fdt, intc_nodename, "reg",
2, sbsa_ref_memmap[SBSA_GIC_DIST].base,
2, sbsa_ref_memmap[SBSA_GIC_DIST].size,
2, sbsa_ref_memmap[SBSA_GIC_REDIST].base,
2, sbsa_ref_memmap[SBSA_GIC_REDIST].size);
- nodename = g_strdup_printf("/intc/its");
- qemu_fdt_add_subnode(sms->fdt, nodename);
- qemu_fdt_setprop_sized_cells(sms->fdt, nodename, "reg",
+ qemu_fdt_add_subnode(sms->fdt, its_nodename);
+ qemu_fdt_setprop_sized_cells(sms->fdt, its_nodename, "reg",
2, sbsa_ref_memmap[SBSA_GIC_ITS].base,
2, sbsa_ref_memmap[SBSA_GIC_ITS].size);
-
- g_free(nodename);
}
/*
--
2.34.1
* [PULL 21/25] target/arm: Correct names of VFP VFNMA and VFNMS insns
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (19 preceding siblings ...)
2024-09-05 13:00 ` [PULL 20/25] hw/arm/sbsa-ref: Don't leak string in sbsa_fdt_add_gic_node() Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 22/25] hw/arm/xilinx_zynq: Enable Security Extensions Peter Maydell
` (4 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
In vfp.decode we have the names of the VFNMA and VFNMS instructions
the wrong way around. The architecture says that bit 6 is the 'op'
bit, which is 1 for VFNMA and 0 for VFNMS, but we label these two
lines of decode the other way around. This doesn't cause any
user-visible problem because in the handling of these functions in
translate-vfp.c we give VFNMA the behaviour specified for VFNMS and
vice-versa, but it's confusing when reading the code.
Switch the names of the VFP VFNMA and VFNMS instructions in
the decode file and flip the behaviour also.
NB: the instructions VFMA and VFMS *are* decoded with op=0 for
VFMA and op=1 for VFMS; the confusion probably arose because
we assumed VFNMA and VFNMS to be the same way around.
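As a reading aid, the sign conventions after this change can be modelled in a few lines. This is an unfused sketch of the operand negation only: it ignores the single rounding of a real fused multiply-add, and vfm_model() and its parameters are invented for illustration, not taken from translate-vfp.c. It assumes, as the text above and the updated code comment state, that op (bit 6) selects VFMS or VFNMA when set, and VFMA or VFNMS when clear.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Unfused model of the four multiply-accumulate forms:
 *   negated == false: op = 0 is VFMA,  op = 1 is VFMS
 *   negated == true : op = 0 is VFNMS, op = 1 is VFNMA
 *
 *   VFMA  : fd =  fd +  fn * fm
 *   VFMS  : fd =  fd + -fn * fm
 *   VFNMS : fd = -fd +  fn * fm
 *   VFNMA : fd = -fd + -fn * fm
 */
double vfm_model(double fd, double fn, double fm, bool negated, bool op)
{
    bool neg_n = op;        /* in both pairs, op = 1 negates fn */
    bool neg_d = negated;   /* the VFNM* forms negate the accumulator */

    if (neg_n) {
        fn = -fn;
    }
    if (neg_d) {
        fd = -fd;
    }
    return fd + fn * fm;
}
```

With fd = 1, fn = 2, fm = 3 the four forms give 7 (VFMA), -5 (VFMS), 5 (VFNMS) and -7 (VFNMA), which makes it easy to see that swapping the two VFNM* labels and their behaviour together leaves each encoding's result unchanged.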
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2536
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20240830152156.2046590-1-peter.maydell@linaro.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/tcg/vfp.decode | 12 ++++++------
target/arm/tcg/translate-vfp.c | 8 ++++----
2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/target/arm/tcg/vfp.decode b/target/arm/tcg/vfp.decode
index 5405e80197b..2dd87a27089 100644
--- a/target/arm/tcg/vfp.decode
+++ b/target/arm/tcg/vfp.decode
@@ -141,18 +141,18 @@ VDIV_dp ---- 1110 1.00 .... .... 1011 .0.0 .... @vfp_dnm_d
VFMA_hp ---- 1110 1.10 .... .... 1001 .0. 0 .... @vfp_dnm_s
VFMS_hp ---- 1110 1.10 .... .... 1001 .1. 0 .... @vfp_dnm_s
-VFNMA_hp ---- 1110 1.01 .... .... 1001 .0. 0 .... @vfp_dnm_s
-VFNMS_hp ---- 1110 1.01 .... .... 1001 .1. 0 .... @vfp_dnm_s
+VFNMS_hp ---- 1110 1.01 .... .... 1001 .0. 0 .... @vfp_dnm_s
+VFNMA_hp ---- 1110 1.01 .... .... 1001 .1. 0 .... @vfp_dnm_s
VFMA_sp ---- 1110 1.10 .... .... 1010 .0. 0 .... @vfp_dnm_s
VFMS_sp ---- 1110 1.10 .... .... 1010 .1. 0 .... @vfp_dnm_s
-VFNMA_sp ---- 1110 1.01 .... .... 1010 .0. 0 .... @vfp_dnm_s
-VFNMS_sp ---- 1110 1.01 .... .... 1010 .1. 0 .... @vfp_dnm_s
+VFNMS_sp ---- 1110 1.01 .... .... 1010 .0. 0 .... @vfp_dnm_s
+VFNMA_sp ---- 1110 1.01 .... .... 1010 .1. 0 .... @vfp_dnm_s
VFMA_dp ---- 1110 1.10 .... .... 1011 .0.0 .... @vfp_dnm_d
VFMS_dp ---- 1110 1.10 .... .... 1011 .1.0 .... @vfp_dnm_d
-VFNMA_dp ---- 1110 1.01 .... .... 1011 .0.0 .... @vfp_dnm_d
-VFNMS_dp ---- 1110 1.01 .... .... 1011 .1.0 .... @vfp_dnm_d
+VFNMS_dp ---- 1110 1.01 .... .... 1011 .0.0 .... @vfp_dnm_d
+VFNMA_dp ---- 1110 1.01 .... .... 1011 .1.0 .... @vfp_dnm_d
VMOV_imm_hp ---- 1110 1.11 .... .... 1001 0000 .... \
vd=%vd_sp imm=%vmov_imm
diff --git a/target/arm/tcg/translate-vfp.c b/target/arm/tcg/translate-vfp.c
index cd5b8483576..b6fa28a7bf6 100644
--- a/target/arm/tcg/translate-vfp.c
+++ b/target/arm/tcg/translate-vfp.c
@@ -2190,8 +2190,8 @@ static bool do_vfm_sp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d)
static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d)
{
/*
- * VFNMA : fd = muladd(-fd, fn, fm)
- * VFNMS : fd = muladd(-fd, -fn, fm)
+ * VFNMA : fd = muladd(-fd, -fn, fm)
+ * VFNMS : fd = muladd(-fd, fn, fm)
* VFMA : fd = muladd( fd, fn, fm)
* VFMS : fd = muladd( fd, -fn, fm)
*
@@ -2262,8 +2262,8 @@ static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d)
#define MAKE_VFM_TRANS_FNS(PREC) \
MAKE_ONE_VFM_TRANS_FN(VFMA, PREC, false, false) \
MAKE_ONE_VFM_TRANS_FN(VFMS, PREC, true, false) \
- MAKE_ONE_VFM_TRANS_FN(VFNMA, PREC, false, true) \
- MAKE_ONE_VFM_TRANS_FN(VFNMS, PREC, true, true)
+ MAKE_ONE_VFM_TRANS_FN(VFNMS, PREC, false, true) \
+ MAKE_ONE_VFM_TRANS_FN(VFNMA, PREC, true, true)
MAKE_VFM_TRANS_FNS(hp)
MAKE_VFM_TRANS_FNS(sp)
--
2.34.1
* [PULL 22/25] hw/arm/xilinx_zynq: Enable Security Extensions
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (20 preceding siblings ...)
2024-09-05 13:00 ` [PULL 21/25] target/arm: Correct names of VFP VFNMA and VFNMS insns Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 23/25] hw/arm/boot: Report error msg if loading elf/dtb failed Peter Maydell
` (3 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
From: Sebastian Huber <sebastian.huber@embedded-brains.de>
The system supports the Security Extensions (core and GIC). This change is
necessary to run tests which pass on the real hardware.
Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Message-id: 20240828005019.57705-1-sebastian.huber@embedded-brains.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/arm/xilinx_zynq.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index 3c56b9abe1c..37c234f5aba 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -219,14 +219,6 @@ static void zynq_init(MachineState *machine)
for (n = 0; n < smp_cpus; n++) {
Object *cpuobj = object_new(machine->cpu_type);
- /*
- * By default A9 CPUs have EL3 enabled. This board does not currently
- * support EL3 so the CPU EL3 property is disabled before realization.
- */
- if (object_property_find(cpuobj, "has_el3")) {
- object_property_set_bool(cpuobj, "has_el3", false, &error_fatal);
- }
-
object_property_set_int(cpuobj, "midr", ZYNQ_BOARD_MIDR,
&error_fatal);
object_property_set_int(cpuobj, "reset-cbar", MPCORE_PERIPHBASE,
--
2.34.1
* [PULL 23/25] hw/arm/boot: Report error msg if loading elf/dtb failed
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (21 preceding siblings ...)
2024-09-05 13:00 ` [PULL 22/25] hw/arm/xilinx_zynq: Enable Security Extensions Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:00 ` [PULL 24/25] hw/arm/boot: Explain why load_elf_hdr() error is ignored Peter Maydell
` (2 subsequent siblings)
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
From: Changbin Du <changbin.du@huawei.com>
Print errors before exit. Do not exit silently.
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Changbin Du <changbin.du@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240903133940.3447430-1-changbin.du@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/arm/boot.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index d480a7da02c..6c895e05cbc 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -839,6 +839,8 @@ static ssize_t arm_load_elf(struct arm_boot_info *info, uint64_t *pentry,
1, data_swab, as);
if (ret <= 0) {
/* The header loaded but the image didn't */
+ error_report("Couldn't load elf '%s': %s",
+ info->kernel_filename, load_elf_strerror(ret));
exit(1);
}
--
2.34.1
* [PULL 24/25] hw/arm/boot: Explain why load_elf_hdr() error is ignored
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (22 preceding siblings ...)
2024-09-05 13:00 ` [PULL 23/25] hw/arm/boot: Report error msg if loading elf/dtb failed Peter Maydell
@ 2024-09-05 13:00 ` Peter Maydell
2024-09-05 13:01 ` [PULL 25/25] platform-bus: fix refcount leak Peter Maydell
2024-09-06 14:24 ` [PULL 00/25] target-arm queue Peter Maydell
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:00 UTC (permalink / raw)
To: qemu-devel
From: Philippe Mathieu-Daudé <philmd@linaro.org>
If the file is not an ELF file, arm_setup_direct_kernel_boot()
falls back to trying it as a uImage, an AArch64 Image file or,
as a last resort, a bare raw binary. We can therefore discard the
load_elf_hdr() error and silently return.
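The control flow being described is roughly the following. This is a simplified sketch with only two of the formats (ELF and raw binary) and mock loaders; MockFormat, mock_load_elf(), mock_load_raw() and mock_boot() are hypothetical names, not functions from hw/arm/boot.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { FMT_NONE, FMT_ELF, FMT_RAW } MockFormat;

/*
 * Returns the number of bytes "loaded", or -1 if the buffer does not
 * start with the ELF magic. A non-ELF input is not an error here: the
 * function stays quiet and lets the caller try something else.
 */
long mock_load_elf(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (len < sizeof(magic) || memcmp(buf, magic, sizeof(magic)) != 0) {
        return -1;
    }
    return (long)len;
}

/* Last resort: accept any non-empty buffer as a raw binary. */
long mock_load_raw(const unsigned char *buf, size_t len)
{
    (void)buf;
    return (long)len;
}

/* Try the formats in order and report which one accepted the input. */
MockFormat mock_boot(const unsigned char *buf, size_t len)
{
    if (mock_load_elf(buf, len) >= 0) {
        return FMT_ELF;
    }
    if (mock_load_raw(buf, len) > 0) {
        return FMT_RAW;
    }
    return FMT_NONE;
}
```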
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240903144154.17135-1-philmd@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/arm/boot.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 6c895e05cbc..5301d8d318c 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -799,14 +799,18 @@ static ssize_t arm_load_elf(struct arm_boot_info *info, uint64_t *pentry,
} elf_header;
int data_swab = 0;
bool big_endian;
- ssize_t ret = -1;
+ ssize_t ret;
Error *err = NULL;
load_elf_hdr(info->kernel_filename, &elf_header, &elf_is64, &err);
if (err) {
+ /*
+ * If the file is not an ELF file we silently return.
+ * The caller will fall back to try other formats.
+ */
error_free(err);
- return ret;
+ return -1;
}
if (elf_is64) {
--
2.34.1
* [PULL 25/25] platform-bus: fix refcount leak
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (23 preceding siblings ...)
2024-09-05 13:00 ` [PULL 24/25] hw/arm/boot: Explain why load_elf_hdr() error is ignored Peter Maydell
@ 2024-09-05 13:01 ` Peter Maydell
2024-09-06 14:24 ` [PULL 00/25] target-arm queue Peter Maydell
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-05 13:01 UTC (permalink / raw)
To: qemu-devel
From: Gao Shiyuan <gaoshiyuan@baidu.com>
memory_region_find() returns an MR which it is the caller's
responsibility to unref, but platform_bus_map_mmio() was
forgetting to do so, thus leaking the MR.
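The ownership rule can be sketched with a mock reference count: a find call that hands back a new reference obliges the caller to drop it, even when the caller only wanted to know whether something was there. MockRegion, mock_find(), mock_unref() and first_free_slot() are hypothetical and only mirror the shape of the fix, not the memory API itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted region. */
typedef struct MockRegion {
    int refcount;
} MockRegion;

/* Returns a new reference to the region in slot idx, or NULL if empty. */
MockRegion *mock_find(MockRegion *slots[], size_t n, size_t idx)
{
    if (idx >= n || slots[idx] == NULL) {
        return NULL;
    }
    slots[idx]->refcount++;
    return slots[idx];
}

void mock_unref(MockRegion *mr)
{
    mr->refcount--;
}

/*
 * Scan for the first empty slot. Each occupied slot we probe gives us a
 * reference we do not need, so it is dropped straight away.
 */
long first_free_slot(MockRegion *slots[], size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        MockRegion *mr = mock_find(slots, n, i);

        if (!mr) {
            return (long)i;
        }
        mock_unref(mr);
    }
    return -1;
}
```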
Signed-off-by: Gao Shiyuan <gaoshiyuan@baidu.com>
Message-id: 20240829131005.9196-1-gaoshiyuan@baidu.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweaked commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
hw/core/platform-bus.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/hw/core/platform-bus.c b/hw/core/platform-bus.c
index b8487b26b67..dc58bf505aa 100644
--- a/hw/core/platform-bus.c
+++ b/hw/core/platform-bus.c
@@ -145,9 +145,12 @@ static void platform_bus_map_mmio(PlatformBusDevice *pbus, SysBusDevice *sbdev,
* the target device's memory region
*/
for (off = 0; off < pbus->mmio_size; off += alignment) {
- if (!memory_region_find(&pbus->mmio, off, size).mr) {
+ MemoryRegion *mr = memory_region_find(&pbus->mmio, off, size).mr;
+ if (!mr) {
found_region = true;
break;
+ } else {
+ memory_region_unref(mr);
}
}
--
2.34.1
* Re: [PULL 00/25] target-arm queue
2024-09-05 13:00 [PULL 00/25] target-arm queue Peter Maydell
` (24 preceding siblings ...)
2024-09-05 13:01 ` [PULL 25/25] platform-bus: fix refcount leak Peter Maydell
@ 2024-09-06 14:24 ` Peter Maydell
25 siblings, 0 replies; 38+ messages in thread
From: Peter Maydell @ 2024-09-06 14:24 UTC (permalink / raw)
To: qemu-devel
On Thu, 5 Sept 2024 at 14:01, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> First target-arm queue for 9.2. I know I have more stuff in
> my to-review queue after this...
>
> -- PMM
>
> The following changes since commit cab1afb393ea0943b3086188e91d71d594ede6bf:
>
> Merge tag 'hppa-v9.1-fixes-pull-request' of https://github.com/hdeller/qemu-hppa into staging (2024-09-04 13:20:17 +0100)
>
> are available in the Git repository at:
>
> https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20240905
>
> for you to fetch changes up to 99ec7b440a1d6a6ef07450b68687d24d13a25fb5:
>
> platform-bus: fix refcount leak (2024-09-05 13:12:37 +0100)
>
> ----------------------------------------------------------------
> target-arm queue:
> * Implement FEAT_EBF16 emulation
> * accel/tcg: Remove dead code from rr_cpu_thread_fn()
> * hw: add compat machines for 9.2
> * virt: default to two-stage SMMU from virt-9.2
> * sbsa-ref: use two-stage SMMU
> * hw: Various minor memory leak fixes
> * target/arm: Correct names of VFP VFNMA and VFNMS insns
> * hw/arm/xilinx_zynq: Enable Security Extensions
> * hw/arm/boot: Report error msg if loading elf/dtb failed
>
Applied, thanks.
Please update the changelog at https://wiki.qemu.org/ChangeLog/9.2
for any user-visible changes.
-- PMM