[Qemu-devel] [PATCH 0/2] target/arm: SVE fixes

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes
@ 2018-07-05 19:19 Richard Henderson
  2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Richard Henderson @ 2018-07-05 19:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee

RISU testing with ARM Fast Models and VQ=16 showed up two bugs,
both affecting the PTRUE instruction.


r~


Richard Henderson (2):
  tcg: Restrict check_size_impl to multiples of the line size
  target/arm: Fix do_predset for large VL

 target/arm/translate-sve.c | 10 ++--------
 tcg/tcg-op-gvec.c          |  7 +++++--
 2 files changed, 7 insertions(+), 10 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size
  2018-07-05 19:19 [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Richard Henderson
@ 2018-07-05 19:19 ` Richard Henderson
  2018-07-06  8:08   ` Alex Bennée
  2018-07-05 19:19 ` [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL Richard Henderson
  2018-07-06 10:44 ` [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Peter Maydell
  2 siblings, 1 reply; 6+ messages in thread
From: Richard Henderson @ 2018-07-05 19:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee

Normally this is automatic in the size restrictions that are placed
on vector sizes coming from the implementation.  However, for the
legitimate size tuple [oprsz=8, maxsz=32], we need to clear the final
24 bytes of the vector register.  Without this check, do_dup selects
TCG_TYPE_V128 and clears only 16 bytes.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg-op-gvec.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 22db1590d5..61c25f5784 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -287,8 +287,11 @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
    in units of LNSZ.  This limits the expansion of inline code.  */
 static inline bool check_size_impl(uint32_t oprsz, uint32_t lnsz)
 {
-    uint32_t lnct = oprsz / lnsz;
-    return lnct >= 1 && lnct <= MAX_UNROLL;
+    if (oprsz % lnsz == 0) {
+        uint32_t lnct = oprsz / lnsz;
+        return lnct >= 1 && lnct <= MAX_UNROLL;
+    }
+    return false;
 }
 
 static void expand_clr(uint32_t dofs, uint32_t maxsz);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size
  2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson
@ 2018-07-06  8:08   ` Alex Bennée
  0 siblings, 0 replies; 6+ messages in thread
From: Alex Bennée @ 2018-07-06  8:08 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, peter.maydell


Richard Henderson <richard.henderson@linaro.org> writes:

> Normally this is automatic in the size restrictions that are placed
> on vector sizes coming from the implementation.  However, for the
> legitimate size tuple [oprsz=8, maxsz=32], we need to clear the final
> 24 bytes of the vector register.  Without this check, do_dup selects
> TCG_TYPE_V128 and clears only 16 bytes.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  tcg/tcg-op-gvec.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
> index 22db1590d5..61c25f5784 100644
> --- a/tcg/tcg-op-gvec.c
> +++ b/tcg/tcg-op-gvec.c
> @@ -287,8 +287,11 @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
>     in units of LNSZ.  This limits the expansion of inline code.  */
>  static inline bool check_size_impl(uint32_t oprsz, uint32_t lnsz)
>  {
> -    uint32_t lnct = oprsz / lnsz;
> -    return lnct >= 1 && lnct <= MAX_UNROLL;
> +    if (oprsz % lnsz == 0) {
> +        uint32_t lnct = oprsz / lnsz;
> +        return lnct >= 1 && lnct <= MAX_UNROLL;
> +    }
> +    return false;
>  }
>
>  static void expand_clr(uint32_t dofs, uint32_t maxsz);


--
Alex Bennée

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL
  2018-07-05 19:19 [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Richard Henderson
  2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson
@ 2018-07-05 19:19 ` Richard Henderson
  2018-07-06  8:14   ` Alex Bennée
  2018-07-06 10:44 ` [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Peter Maydell
  2 siblings, 1 reply; 6+ messages in thread
From: Richard Henderson @ 2018-07-05 19:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee

Use MAKE_64BIT_MASK instead of open-coding.  Remove an odd
vector size check that is unlikely to be more profitable
than 3 64-bit integer stores.  Correct the iteration for WORD
to avoid writing too much data.

Fixes RISU tests of PTRUE for VL 256.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-sve.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index d41f1155f9..374051cd20 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -1438,7 +1438,7 @@ static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag)
         setsz = numelem << esz;
         lastword = word = pred_esz_masks[esz];
         if (setsz % 64) {
-            lastword &= ~(-1ull << (setsz % 64));
+            lastword &= MAKE_64BIT_MASK(0, setsz % 64);
         }
     }
 
@@ -1457,19 +1457,13 @@ static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag)
             tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word);
             goto done;
         }
-        if (oprsz * 8 == setsz + 8) {
-            tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word);
-            tcg_gen_movi_i64(t, 0);
-            tcg_gen_st_i64(t, cpu_env, ofs + oprsz - 8);
-            goto done;
-        }
     }
 
     setsz /= 8;
     fullsz /= 8;
 
     tcg_gen_movi_i64(t, word);
-    for (i = 0; i < setsz; i += 8) {
+    for (i = 0; i < QEMU_ALIGN_DOWN(setsz, 8); i += 8) {
         tcg_gen_st_i64(t, cpu_env, ofs + i);
     }
     if (lastword != word) {
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL
  2018-07-05 19:19 ` [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL Richard Henderson
@ 2018-07-06  8:14   ` Alex Bennée
  0 siblings, 0 replies; 6+ messages in thread
From: Alex Bennée @ 2018-07-06  8:14 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, peter.maydell


Richard Henderson <richard.henderson@linaro.org> writes:

> Use MAKE_64BIT_MASK instead of open-coding.  Remove an odd
> vector size check that is unlikely to be more profitable
> than 3 64-bit integer stores.  Correct the iteration for WORD
> to avoid writing too much data.
>
> Fixes RISU tests of PTRUE for VL 256.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  target/arm/translate-sve.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
> index d41f1155f9..374051cd20 100644
> --- a/target/arm/translate-sve.c
> +++ b/target/arm/translate-sve.c
> @@ -1438,7 +1438,7 @@ static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag)
>          setsz = numelem << esz;
>          lastword = word = pred_esz_masks[esz];
>          if (setsz % 64) {
> -            lastword &= ~(-1ull << (setsz % 64));
> +            lastword &= MAKE_64BIT_MASK(0, setsz % 64);
>          }
>      }
>
> @@ -1457,19 +1457,13 @@ static bool do_predset(DisasContext *s, int esz, int rd, int pat, bool setflag)
>              tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word);
>              goto done;
>          }
> -        if (oprsz * 8 == setsz + 8) {
> -            tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word);
> -            tcg_gen_movi_i64(t, 0);
> -            tcg_gen_st_i64(t, cpu_env, ofs + oprsz - 8);
> -            goto done;
> -        }
>      }
>
>      setsz /= 8;
>      fullsz /= 8;
>
>      tcg_gen_movi_i64(t, word);
> -    for (i = 0; i < setsz; i += 8) {
> +    for (i = 0; i < QEMU_ALIGN_DOWN(setsz, 8); i += 8) {
>          tcg_gen_st_i64(t, cpu_env, ofs + i);
>      }
>      if (lastword != word) {


--
Alex Bennée

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes
  2018-07-05 19:19 [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Richard Henderson
  2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson
  2018-07-05 19:19 ` [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL Richard Henderson
@ 2018-07-06 10:44 ` Peter Maydell
  2 siblings, 0 replies; 6+ messages in thread
From: Peter Maydell @ 2018-07-06 10:44 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers, Alex Bennée

On 5 July 2018 at 20:19, Richard Henderson <richard.henderson@linaro.org> wrote:
> RISU testing with ARM Fast Models and VQ=16 showed up two bugs,
> both affecting the PTRUE instruction.
>
>
> r~
>
>
> Richard Henderson (2):
>   tcg: Restrict check_size_impl to multiples of the line size
>   target/arm: Fix do_predset for large VL
>
>  target/arm/translate-sve.c | 10 ++--------
>  tcg/tcg-op-gvec.c          |  7 +++++--
>  2 files changed, 7 insertions(+), 10 deletions(-)



Applied to target-arm.next, thanks.

-- PMM

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-07-06 10:44 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-07-05 19:19 [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Richard Henderson
2018-07-05 19:19 ` [Qemu-devel] [PATCH 1/2] tcg: Restrict check_size_impl to multiples of the line size Richard Henderson
2018-07-06  8:08   ` Alex Bennée
2018-07-05 19:19 ` [Qemu-devel] [PATCH 2/2] target/arm: Fix do_predset for large VL Richard Henderson
2018-07-06  8:14   ` Alex Bennée
2018-07-06 10:44 ` [Qemu-devel] [PATCH 0/2] target/arm: SVE fixes Peter Maydell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).