* [PATCH 0/4] target/arm: Various minor SME bugfixes
@ 2024-07-22 17:29 Peter Maydell
2024-07-22 17:29 ` [PATCH 1/4] target/arm: Don't assert for 128-bit tile accesses when SVL is 128 Peter Maydell
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Peter Maydell @ 2024-07-22 17:29 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: qemu-stable
This patchset fixes a handful of minor bugs in our SME implementation:
one spotted by Coverity, one raised as a Gitlab issue, and a couple
that I just noticed in passing while I was working with the code.
thanks
-- PMM
Peter Maydell (4):
target/arm: Don't assert for 128-bit tile accesses when SVL is 128
target/arm: Fix UMOPA/UMOPS of 16-bit values
target/arm: Avoid shifts by -1 in tszimm_shr() and tszimm_shl()
target/arm: Ignore SMCR_EL2.LEN and SVCR_EL2.LEN if EL2 is not enabled
target/arm/helper.c | 2 +-
target/arm/tcg/sme_helper.c | 8 ++++----
target/arm/tcg/translate-sme.c | 10 +++++++++-
target/arm/tcg/translate-sve.c | 18 ++++++++++++++++--
4 files changed, 30 insertions(+), 8 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/4] target/arm: Don't assert for 128-bit tile accesses when SVL is 128
2024-07-22 17:29 [PATCH 0/4] target/arm: Various minor SME bugfixes Peter Maydell
@ 2024-07-22 17:29 ` Peter Maydell
2024-07-22 17:29 ` [PATCH 2/4] target/arm: Fix UMOPA/UMOPS of 16-bit values Peter Maydell
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Peter Maydell @ 2024-07-22 17:29 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: qemu-stable
For an instruction which accesses a 128-bit element tile when
the SVL is also 128 (for example MOV z0.Q, p0/M, ZA0H.Q[w0,0]),
we will assert in get_tile_rowcol():
qemu-system-aarch64: ../../tcg/tcg-op.c:926: tcg_gen_deposit_z_i32: Assertion `len > 0' failed.
This happens because we calculate
len = ctz32(streaming_vec_reg_size(s)) - esz;$
but if the SVL and the element size are the same len is 0, and
the deposit operation asserts.
In this case the ZA storage contains exactly one 128 bit
element ZA tile, and the horizontal or vertical slice is just
that tile. This means that regardless of the index value in
the Ws register, we always access that tile. (In pseudocode terms,
we calculate (index + offset) MOD 1, which is 0.)
Special case the len == 0 case to avoid hitting the assertion
in tcg_gen_deposit_z_i32().
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/translate-sme.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 185a8a917b0..a50a419af27 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -49,7 +49,15 @@ static TCGv_ptr get_tile_rowcol(DisasContext *s, int esz, int rs,
/* Prepare a power-of-two modulo via extraction of @len bits. */
len = ctz32(streaming_vec_reg_size(s)) - esz;
- if (vertical) {
+ if (!len) {
+ /*
+ * SVL is 128 and the element size is 128. There is exactly
+ * one 128x128 tile in the ZA storage, and so we calculate
+ * (Rs + imm) MOD 1, which is always 0. We need to special case
+ * this because TCG doesn't allow deposit ops with len 0.
+ */
+ tcg_gen_movi_i32(tmp, 0);
+ } else if (vertical) {
/*
* Compute the byte offset of the index within the tile:
* (index % (svl / size)) * size
--
2.34.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/4] target/arm: Fix UMOPA/UMOPS of 16-bit values
2024-07-22 17:29 [PATCH 0/4] target/arm: Various minor SME bugfixes Peter Maydell
2024-07-22 17:29 ` [PATCH 1/4] target/arm: Don't assert for 128-bit tile accesses when SVL is 128 Peter Maydell
@ 2024-07-22 17:29 ` Peter Maydell
2024-07-22 17:29 ` [PATCH 3/4] target/arm: Avoid shifts by -1 in tszimm_shr() and tszimm_shl() Peter Maydell
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Peter Maydell @ 2024-07-22 17:29 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: qemu-stable
The UMOPA/UMOPS instructions are supposed to multiply unsigned 8 or
16 bit elements and accumulate the products into a 64-bit element.
In the Arm ARM pseudocode, this is done with the usual
infinite-precision signed arithmetic. However our implementation
doesn't quite get it right, because in the DEF_IMOP_64() macro we do:
sum += (NTYPE)(n >> 0) * (MTYPE)(m >> 0);
where NTYPE and MTYPE are uint16_t or int16_t. In the uint16_t case,
the C usual arithmetic conversions mean the values are converted to
"int" type and the multiply is done as a 32-bit multiply. This means
that if the inputs are, for example, 0xffff and 0xffff then the
result is 0xFFFE0001 as an int, which is then promoted to uint64_t
for the accumulation into sum; this promotion incorrectly sign
extends the multiply.
Avoid the incorrect sign extension by casting to int64_t before
the multiply, so we do the multiply as 64-bit signed arithmetic,
which is a type large enough that the multiply can never
overflow into the sign bit.
(The equivalent 8-bit operations in DEF_IMOP_32() are fine, because
the 8-bit multiplies can never overflow into the sign bit of a
32-bit integer.)
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2372
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/sme_helper.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
index 5a6dd76489f..f9001f5213a 100644
--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -1146,10 +1146,10 @@ static uint64_t NAME(uint64_t n, uint64_t m, uint64_t a, uint8_t p, bool neg) \
uint64_t sum = 0; \
/* Apply P to N as a mask, making the inactive elements 0. */ \
n &= expand_pred_h(p); \
- sum += (NTYPE)(n >> 0) * (MTYPE)(m >> 0); \
- sum += (NTYPE)(n >> 16) * (MTYPE)(m >> 16); \
- sum += (NTYPE)(n >> 32) * (MTYPE)(m >> 32); \
- sum += (NTYPE)(n >> 48) * (MTYPE)(m >> 48); \
+ sum += (int64_t)(NTYPE)(n >> 0) * (MTYPE)(m >> 0); \
+ sum += (int64_t)(NTYPE)(n >> 16) * (MTYPE)(m >> 16); \
+ sum += (int64_t)(NTYPE)(n >> 32) * (MTYPE)(m >> 32); \
+ sum += (int64_t)(NTYPE)(n >> 48) * (MTYPE)(m >> 48); \
return neg ? a - sum : a + sum; \
}
--
2.34.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 3/4] target/arm: Avoid shifts by -1 in tszimm_shr() and tszimm_shl()
2024-07-22 17:29 [PATCH 0/4] target/arm: Various minor SME bugfixes Peter Maydell
2024-07-22 17:29 ` [PATCH 1/4] target/arm: Don't assert for 128-bit tile accesses when SVL is 128 Peter Maydell
2024-07-22 17:29 ` [PATCH 2/4] target/arm: Fix UMOPA/UMOPS of 16-bit values Peter Maydell
@ 2024-07-22 17:29 ` Peter Maydell
2024-07-22 17:29 ` [PATCH 4/4] target/arm: Ignore SMCR_EL2.LEN and SVCR_EL2.LEN if EL2 is not enabled Peter Maydell
2024-07-22 22:14 ` [PATCH 0/4] target/arm: Various minor SME bugfixes Richard Henderson
4 siblings, 0 replies; 7+ messages in thread
From: Peter Maydell @ 2024-07-22 17:29 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: qemu-stable
The function tszimm_esz() returns a shift amount, or possibly -1 in
certain cases that correspond to unallocated encodings in the
instruction set. We catch these later in the trans_ functions
(generally with an "a-esz < 0" check), but before we do the
decodetree-generated code will also call tszimm_shr() or tszimm_sl(),
which will use the tszimm_esz() return value as a shift count without
checking that it is not negative, which is undefined behaviour.
Avoid the UB by checking the return value in tszimm_shr() and
tszimm_shl().
Cc: qemu-stable@nongnu.org
Resolves: Coverity CID 1547617, 1547694
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/translate-sve.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 798ab2bfb13..a72c2620960 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -50,13 +50,27 @@ static int tszimm_esz(DisasContext *s, int x)
static int tszimm_shr(DisasContext *s, int x)
{
- return (16 << tszimm_esz(s, x)) - x;
+ /*
+ * We won't use the tszimm_shr() value if tszimm_esz() returns -1 (the
+ * trans function will check for esz < 0), so we can return any
+ * value we like from here in that case as long as we avoid UB.
+ */
+ int esz = tszimm_esz(s, x);
+ if (esz < 0) {
+ return esz;
+ }
+ return (16 << esz) - x;
}
/* See e.g. LSL (immediate, predicated). */
static int tszimm_shl(DisasContext *s, int x)
{
- return x - (8 << tszimm_esz(s, x));
+ /* As with tszimm_shr(), value will be unused if esz < 0 */
+ int esz = tszimm_esz(s, x);
+ if (esz < 0) {
+ return esz;
+ }
+ return x - (8 << esz);
}
/* The SH bit is in bit 8. Extract the low 8 and shift. */
--
2.34.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 4/4] target/arm: Ignore SMCR_EL2.LEN and SVCR_EL2.LEN if EL2 is not enabled
2024-07-22 17:29 [PATCH 0/4] target/arm: Various minor SME bugfixes Peter Maydell
` (2 preceding siblings ...)
2024-07-22 17:29 ` [PATCH 3/4] target/arm: Avoid shifts by -1 in tszimm_shr() and tszimm_shl() Peter Maydell
@ 2024-07-22 17:29 ` Peter Maydell
2024-07-22 20:01 ` Philippe Mathieu-Daudé
2024-07-22 22:14 ` [PATCH 0/4] target/arm: Various minor SME bugfixes Richard Henderson
4 siblings, 1 reply; 7+ messages in thread
From: Peter Maydell @ 2024-07-22 17:29 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: qemu-stable
When determining the current vector length, the SMCR_EL2.LEN and
SVCR_EL2.LEN settings should only be considered if EL2 is enabled
(compare the pseudocode CurrentSVL and CurrentNSVL which call
EL2Enabled()).
We were checking against ARM_FEATURE_EL2 rather than calling
arm_is_el2_enabled(), which meant that we would look at
SMCR_EL2/SVCR_EL2 when in Secure EL1 or Secure EL0 even if Secure EL2
was not enabled.
Use the correct check in sve_vqm1_for_el_sm().
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index ce319572354..8fb4b474e83 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -7232,7 +7232,7 @@ uint32_t sve_vqm1_for_el_sm(CPUARMState *env, int el, bool sm)
if (el <= 1 && !el_is_in_host(env, el)) {
len = MIN(len, 0xf & (uint32_t)cr[1]);
}
- if (el <= 2 && arm_feature(env, ARM_FEATURE_EL2)) {
+ if (el <= 2 && arm_is_el2_enabled(env)) {
len = MIN(len, 0xf & (uint32_t)cr[2]);
}
if (arm_feature(env, ARM_FEATURE_EL3)) {
--
2.34.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 4/4] target/arm: Ignore SMCR_EL2.LEN and SVCR_EL2.LEN if EL2 is not enabled
2024-07-22 17:29 ` [PATCH 4/4] target/arm: Ignore SMCR_EL2.LEN and SVCR_EL2.LEN if EL2 is not enabled Peter Maydell
@ 2024-07-22 20:01 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 7+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-07-22 20:01 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel; +Cc: qemu-stable
On 22/7/24 19:29, Peter Maydell wrote:
> When determining the current vector length, the SMCR_EL2.LEN and
> SVCR_EL2.LEN settings should only be considered if EL2 is enabled
> (compare the pseudocode CurrentSVL and CurrentNSVL which call
> EL2Enabled()).
>
> We were checking against ARM_FEATURE_EL2 rather than calling
> arm_is_el2_enabled(), which meant that we would look at
> SMCR_EL2/SVCR_EL2 when in Secure EL1 or Secure EL0 even if Secure EL2
> was not enabled.
Just curious, how did you notice?
> Use the correct check in sve_vqm1_for_el_sm().
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/helper.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index ce319572354..8fb4b474e83 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -7232,7 +7232,7 @@ uint32_t sve_vqm1_for_el_sm(CPUARMState *env, int el, bool sm)
> if (el <= 1 && !el_is_in_host(env, el)) {
> len = MIN(len, 0xf & (uint32_t)cr[1]);
> }
> - if (el <= 2 && arm_feature(env, ARM_FEATURE_EL2)) {
> + if (el <= 2 && arm_is_el2_enabled(env)) {
> len = MIN(len, 0xf & (uint32_t)cr[2]);
> }
> if (arm_feature(env, ARM_FEATURE_EL3)) {
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 0/4] target/arm: Various minor SME bugfixes
2024-07-22 17:29 [PATCH 0/4] target/arm: Various minor SME bugfixes Peter Maydell
` (3 preceding siblings ...)
2024-07-22 17:29 ` [PATCH 4/4] target/arm: Ignore SMCR_EL2.LEN and SVCR_EL2.LEN if EL2 is not enabled Peter Maydell
@ 2024-07-22 22:14 ` Richard Henderson
4 siblings, 0 replies; 7+ messages in thread
From: Richard Henderson @ 2024-07-22 22:14 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel; +Cc: qemu-stable
On 7/23/24 03:29, Peter Maydell wrote:
>
> Peter Maydell (4):
> target/arm: Don't assert for 128-bit tile accesses when SVL is 128
> target/arm: Fix UMOPA/UMOPS of 16-bit values
> target/arm: Avoid shifts by -1 in tszimm_shr() and tszimm_shl()
> target/arm: Ignore SMCR_EL2.LEN and SVCR_EL2.LEN if EL2 is not enabled
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-07-22 22:15 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-22 17:29 [PATCH 0/4] target/arm: Various minor SME bugfixes Peter Maydell
2024-07-22 17:29 ` [PATCH 1/4] target/arm: Don't assert for 128-bit tile accesses when SVL is 128 Peter Maydell
2024-07-22 17:29 ` [PATCH 2/4] target/arm: Fix UMOPA/UMOPS of 16-bit values Peter Maydell
2024-07-22 17:29 ` [PATCH 3/4] target/arm: Avoid shifts by -1 in tszimm_shr() and tszimm_shl() Peter Maydell
2024-07-22 17:29 ` [PATCH 4/4] target/arm: Ignore SMCR_EL2.LEN and SVCR_EL2.LEN if EL2 is not enabled Peter Maydell
2024-07-22 20:01 ` Philippe Mathieu-Daudé
2024-07-22 22:14 ` [PATCH 0/4] target/arm: Various minor SME bugfixes Richard Henderson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).