* [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes @ 2018-07-05 19:19 Richard Henderson 2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson ` (2 more replies) 0 siblings, 3 replies; 6+ messages in thread From: Richard Henderson @ 2018-07-05 19:19 UTC (permalink / raw) To: qemu-devel; +Cc: peter.maydell, alex.bennee RISU testing with ARM Fast Models and VQ=16 showed up two bugs, both affecting the PTRUE instruction. r~ Richard Henderson (2): tcg: Restrict check_size_impl to multiples of the line size target/arm: Fix do_predset for large VL target/arm/translate-sve.c | 10 ++-------- tcg/tcg-op-gvec.c | 7 +++++-- 2 files changed, 7 insertions(+), 10 deletions(-) -- 2.17.1 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size 2018-07-05 19:19 [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Richard Henderson @ 2018-07-05 19:19 ` Richard Henderson 2018-07-06 8:08 ` Alex Bennée 2018-07-05 19:19 ` [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL Richard Henderson 2018-07-06 10:44 ` [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Peter Maydell 2 siblings, 1 reply; 6+ messages in thread From: Richard Henderson @ 2018-07-05 19:19 UTC (permalink / raw) To: qemu-devel; +Cc: peter.maydell, alex.bennee Normally this is automatic in the size restrictions that are placed on vector sizes coming from the implementation. However, for the legitimate size tuple [oprsz=8, maxsz=32], we need to clear the final 24 bytes of the vector register. Without this check, do_dup selects TCG_TYPE_V128 and clears only 16 bytes. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- tcg/tcg-op-gvec.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 22db1590d5..61c25f5784 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -287,8 +287,11 @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs, in units of LNSZ. This limits the expansion of inline code. */ static inline bool check_size_impl(uint32_t oprsz, uint32_t lnsz) { - uint32_t lnct = oprsz / lnsz; - return lnct >= 1 && lnct <= MAX_UNROLL; + if (oprsz % lnsz == 0) { + uint32_t lnct = oprsz / lnsz; + return lnct >= 1 && lnct <= MAX_UNROLL; + } + return false; } static void expand_clr(uint32_t dofs, uint32_t maxsz); -- 2.17.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size 2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson @ 2018-07-06 8:08 ` Alex Bennée 0 siblings, 0 replies; 6+ messages in thread From: Alex Bennée @ 2018-07-06 8:08 UTC (permalink / raw) To: Richard Henderson; +Cc: qemu-devel, peter.maydell Richard Henderson <richard.henderson@linaro.org> writes: > Normally this is automatic in the size restrictions that are placed > on vector sizes coming from the implementation. However, for the > legitimate size tuple [oprsz=8, maxsz=32], we need to clear the final > 24 bytes of the vector register. Without this check, do_dup selects > TCG_TYPE_V128 and clears only 16 bytes. > > Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> > --- > tcg/tcg-op-gvec.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c > index 22db1590d5..61c25f5784 100644 > --- a/tcg/tcg-op-gvec.c > +++ b/tcg/tcg-op-gvec.c > @@ -287,8 +287,11 @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs, > in units of LNSZ. This limits the expansion of inline code. */ > static inline bool check_size_impl(uint32_t oprsz, uint32_t lnsz) > { > - uint32_t lnct = oprsz / lnsz; > - return lnct >= 1 && lnct <= MAX_UNROLL; > + if (oprsz % lnsz == 0) { > + uint32_t lnct = oprsz / lnsz; > + return lnct >= 1 && lnct <= MAX_UNROLL; > + } > + return false; > } > > static void expand_clr(uint32_t dofs, uint32_t maxsz); -- Alex Bennée ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL 2018-07-05 19:19 [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Richard Henderson 2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson @ 2018-07-05 19:19 ` Richard Henderson 2018-07-06 8:14 ` Alex Bennée 2018-07-06 10:44 ` [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Peter Maydell 2 siblings, 1 reply; 6+ messages in thread From: Richard Henderson @ 2018-07-05 19:19 UTC (permalink / raw) To: qemu-devel; +Cc: peter.maydell, alex.bennee Use MAKE_64BIT_MASK instead of open-coding. Remove an odd vector size check that is unlikely to be more profitable than 3 64-bit integer stores. Correct the iteration for WORD to avoid writing too much data. Fixes RISU tests of PTRUE for VL 256. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- target/arm/translate-sve.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index d41f1155f9..374051cd20 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -1438,7 +1438,7 @@ static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag) setsz = numelem << esz; lastword = word = pred_esz_masks[esz]; if (setsz % 64) { - lastword &= ~(-1ull << (setsz % 64)); + lastword &= MAKE_64BIT_MASK(0, setsz % 64); } } @@ -1457,19 +1457,13 @@ static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag) tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word); goto done; } - if (oprsz * 8 == setsz + 8) { - tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word); - tcg_gen_movi_i64(t, 0); - tcg_gen_st_i64(t, cpu_env, ofs + oprsz - 8); - goto done; - } } setsz /= 8; fullsz /= 8; tcg_gen_movi_i64(t, word); - for (i = 0; i < setsz; i += 8) { + for (i = 0; i < QEMU_ALIGN_DOWN(setsz, 8); i += 8) { tcg_gen_st_i64(t, cpu_env, ofs + i); } if (lastword != word) { -- 2.17.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL 2018-07-05 19:19 ` [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL Richard Henderson @ 2018-07-06 8:14 ` Alex Bennée 0 siblings, 0 replies; 6+ messages in thread From: Alex Bennée @ 2018-07-06 8:14 UTC (permalink / raw) To: Richard Henderson; +Cc: qemu-devel, peter.maydell Richard Henderson <richard.henderson@linaro.org> writes: > Use MAKE_64BIT_MASK instead of open-coding. Remove an odd > vector size check that is unlikely to be more profitable > than 3 64-bit integer stores. Correct the iteration for WORD > to avoid writing too much data. > > Fixes RISU tests of PTRUE for VL 256. > > Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> > --- > target/arm/translate-sve.c | 10 ++-------- > 1 file changed, 2 insertions(+), 8 deletions(-) > > diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c > index d41f1155f9..374051cd20 100644 > --- a/target/arm/translate-sve.c > +++ b/target/arm/translate-sve.c > @@ -1438,7 +1438,7 @@ static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag) > setsz = numelem << esz; > lastword = word = pred_esz_masks[esz]; > if (setsz % 64) { > - lastword &= ~(-1ull << (setsz % 64)); > + lastword &= MAKE_64BIT_MASK(0, setsz % 64); > } > } > > @@ -1457,19 +1457,13 @@ static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag) > tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word); > goto done; > } > - if (oprsz * 8 == setsz + 8) { > - tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word); > - tcg_gen_movi_i64(t, 0); > - tcg_gen_st_i64(t, cpu_env, ofs + oprsz - 8); > - goto done; > - } > } > > setsz /= 8; > fullsz /= 8; > > tcg_gen_movi_i64(t, word); > - for (i = 0; i < setsz; i += 8) { > + for (i = 0; i < QEMU_ALIGN_DOWN(setsz, 8); i += 8) { > tcg_gen_st_i64(t, cpu_env, ofs + i); > } > if (lastword != word) { -- Alex Bennée ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes 2018-07-05 19:19 [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Richard Henderson 2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson 2018-07-05 19:19 ` [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL Richard Henderson @ 2018-07-06 10:44 ` Peter Maydell 2 siblings, 0 replies; 6+ messages in thread From: Peter Maydell @ 2018-07-06 10:44 UTC (permalink / raw) To: Richard Henderson; +Cc: QEMU Developers, Alex Bennée On 5 July 2018 at 20:19, Richard Henderson <richard.henderson@linaro.org> wrote: > RISU testing with ARM Fast Models and VQ=16 showed up two bugs, > both affecting the PTRUE instruction. > > > r~ > > > Richard Henderson (2): > tcg: Restrict check_size_impl to multiples of the line size > target/arm: Fix do_predset for large VL > > target/arm/translate-sve.c | 10 ++-------- > tcg/tcg-op-gvec.c | 7 +++++-- > 2 files changed, 7 insertions(+), 10 deletions(-) Applied to target-arm.next, thanks. -- PMM ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2018-07-06 10:44 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-07-05 19:19 [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Richard Henderson 2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson 2018-07-06 8:08 ` Alex Bennée 2018-07-05 19:19 ` [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL Richard Henderson 2018-07-06 8:14 ` Alex Bennée 2018-07-06 10:44 ` [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Peter Maydell
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).