* [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes
@ 2018-07-05 19:19 Richard Henderson
2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Richard Henderson @ 2018-07-05 19:19 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, alex.bennee
RISU testing with ARM Fast Models and VQ=16 showed up two bugs,
both affecting the PTRUE instruction.
r~
Richard Henderson (2):
tcg: Restrict check_size_impl to multiples of the line size
target/arm: Fix do_predset for large VL
target/arm/translate-sve.c | 10 ++--------
tcg/tcg-op-gvec.c | 7 +++++--
2 files changed, 7 insertions(+), 10 deletions(-)
--
2.17.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size
2018-07-05 19:19 [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Richard Henderson
@ 2018-07-05 19:19 ` Richard Henderson
2018-07-06 8:08 ` Alex Bennée
2018-07-05 19:19 ` [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL Richard Henderson
2018-07-06 10:44 ` [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Peter Maydell
2 siblings, 1 reply; 6+ messages in thread
From: Richard Henderson @ 2018-07-05 19:19 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, alex.bennee
Normally this is automatic in the size restrictions that are placed
on vector sizes coming from the implementation. However, for the
legitimate size tuple [oprsz=8, maxsz=32], we need to clear the final
24 bytes of the vector register. Without this check, do_dup selects
TCG_TYPE_V128 and clears only 16 bytes.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg-op-gvec.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 22db1590d5..61c25f5784 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -287,8 +287,11 @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
in units of LNSZ. This limits the expansion of inline code. */
static inline bool check_size_impl(uint32_t oprsz, uint32_t lnsz)
{
- uint32_t lnct = oprsz / lnsz;
- return lnct >= 1 && lnct <= MAX_UNROLL;
+ if (oprsz % lnsz == 0) {
+ uint32_t lnct = oprsz / lnsz;
+ return lnct >= 1 && lnct <= MAX_UNROLL;
+ }
+ return false;
}
static void expand_clr(uint32_t dofs, uint32_t maxsz);
--
2.17.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL
2018-07-05 19:19 [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Richard Henderson
2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson
@ 2018-07-05 19:19 ` Richard Henderson
2018-07-06 8:14 ` Alex Bennée
2018-07-06 10:44 ` [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Peter Maydell
2 siblings, 1 reply; 6+ messages in thread
From: Richard Henderson @ 2018-07-05 19:19 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, alex.bennee
Use MAKE_64BIT_MASK instead of open-coding. Remove an odd
vector size check that is unlikely to be more profitable
than 3 64-bit integer stores. Correct the iteration for WORD
to avoid writing too much data.
Fixes RISU tests of PTRUE for VL 256.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/translate-sve.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index d41f1155f9..374051cd20 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -1438,7 +1438,7 @@ static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag)
setsz = numelem << esz;
lastword = word = pred_esz_masks[esz];
if (setsz % 64) {
- lastword &= ~(-1ull << (setsz % 64));
+ lastword &= MAKE_64BIT_MASK(0, setsz % 64);
}
}
@@ -1457,19 +1457,13 @@ static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag)
tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word);
goto done;
}
- if (oprsz * 8 == setsz + 8) {
- tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word);
- tcg_gen_movi_i64(t, 0);
- tcg_gen_st_i64(t, cpu_env, ofs + oprsz - 8);
- goto done;
- }
}
setsz /= 8;
fullsz /= 8;
tcg_gen_movi_i64(t, word);
- for (i = 0; i < setsz; i += 8) {
+ for (i = 0; i < QEMU_ALIGN_DOWN(setsz, 8); i += 8) {
tcg_gen_st_i64(t, cpu_env, ofs + i);
}
if (lastword != word) {
--
2.17.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size
2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson
@ 2018-07-06 8:08 ` Alex Bennée
0 siblings, 0 replies; 6+ messages in thread
From: Alex Bennée @ 2018-07-06 8:08 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, peter.maydell
Richard Henderson <richard.henderson@linaro.org> writes:
> Normally this is automatic in the size restrictions that are placed
> on vector sizes coming from the implementation. However, for the
> legitimate size tuple [oprsz=8, maxsz=32], we need to clear the final
> 24 bytes of the vector register. Without this check, do_dup selects
> TCG_TYPE_V128 and clears only 16 bytes.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
> ---
> tcg/tcg-op-gvec.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
> index 22db1590d5..61c25f5784 100644
> --- a/tcg/tcg-op-gvec.c
> +++ b/tcg/tcg-op-gvec.c
> @@ -287,8 +287,11 @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
> in units of LNSZ. This limits the expansion of inline code. */
> static inline bool check_size_impl(uint32_t oprsz, uint32_t lnsz)
> {
> - uint32_t lnct = oprsz / lnsz;
> - return lnct >= 1 && lnct <= MAX_UNROLL;
> + if (oprsz % lnsz == 0) {
> + uint32_t lnct = oprsz / lnsz;
> + return lnct >= 1 && lnct <= MAX_UNROLL;
> + }
> + return false;
> }
>
> static void expand_clr(uint32_t dofs, uint32_t maxsz);
--
Alex Bennée
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL
2018-07-05 19:19 ` [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL Richard Henderson
@ 2018-07-06 8:14 ` Alex Bennée
0 siblings, 0 replies; 6+ messages in thread
From: Alex Bennée @ 2018-07-06 8:14 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, peter.maydell
Richard Henderson <richard.henderson@linaro.org> writes:
> Use MAKE_64BIT_MASK instead of open-coding. Remove an odd
> vector size check that is unlikely to be more profitable
> than 3 64-bit integer stores. Correct the iteration for WORD
> to avoid writing too much data.
>
> Fixes RISU tests of PTRUE for VL 256.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
> ---
> target/arm/translate-sve.c | 10 ++--------
> 1 file changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
> index d41f1155f9..374051cd20 100644
> --- a/target/arm/translate-sve.c
> +++ b/target/arm/translate-sve.c
> @@ -1438,7 +1438,7 @@ static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag)
> setsz = numelem << esz;
> lastword = word = pred_esz_masks[esz];
> if (setsz % 64) {
> - lastword &= ~(-1ull << (setsz % 64));
> + lastword &= MAKE_64BIT_MASK(0, setsz % 64);
> }
> }
>
> @@ -1457,19 +1457,13 @@ static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag)
> tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word);
> goto done;
> }
> - if (oprsz * 8 == setsz + 8) {
> - tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word);
> - tcg_gen_movi_i64(t, 0);
> - tcg_gen_st_i64(t, cpu_env, ofs + oprsz - 8);
> - goto done;
> - }
> }
>
> setsz /= 8;
> fullsz /= 8;
>
> tcg_gen_movi_i64(t, word);
> - for (i = 0; i < setsz; i += 8) {
> + for (i = 0; i < QEMU_ALIGN_DOWN(setsz, 8); i += 8) {
> tcg_gen_st_i64(t, cpu_env, ofs + i);
> }
> if (lastword != word) {
--
Alex Bennée
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes
2018-07-05 19:19 [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Richard Henderson
2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson
2018-07-05 19:19 ` [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL Richard Henderson
@ 2018-07-06 10:44 ` Peter Maydell
2 siblings, 0 replies; 6+ messages in thread
From: Peter Maydell @ 2018-07-06 10:44 UTC (permalink / raw)
To: Richard Henderson; +Cc: QEMU Developers, Alex Bennée
On 5 July 2018 at 20:19, Richard Henderson <richard.henderson@linaro.org> wrote:
> RISU testing with ARM Fast Models and VQ=16 showed up two bugs,
> both affecting the PTRUE instruction.
>
>
> r~
>
>
> Richard Henderson (2):
> tcg: Restrict check_size_impl to multiples of the line size
> target/arm: Fix do_predset for large VL
>
> target/arm/translate-sve.c | 10 ++--------
> tcg/tcg-op-gvec.c | 7 +++++--
> 2 files changed, 7 insertions(+), 10 deletions(-)
Applied to target-arm.next, thanks.
-- PMM
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2018-07-06 10:44 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-07-05 19:19 [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Richard Henderson
2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson
2018-07-06 8:08 ` Alex Bennée
2018-07-05 19:19 ` [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL Richard Henderson
2018-07-06 8:14 ` Alex Bennée
2018-07-06 10:44 ` [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Peter Maydell
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).