* [PATCH 1/3] target/ppc: Implement Vector Expand Mask
2021-11-10 18:56 [PATCH 0/3] target/ppc: Implement Vector Expand/Extract Mask and Vector Mask matheus.ferst
@ 2021-11-10 18:56 ` matheus.ferst
2021-11-11 9:28 ` Richard Henderson
2021-11-10 18:56 ` [PATCH 2/3] target/ppc: Implement Vector Extract Mask matheus.ferst
2021-11-10 18:56 ` [PATCH 3/3] target/ppc: Implement Vector Mask Move insns matheus.ferst
2 siblings, 1 reply; 7+ messages in thread
From: matheus.ferst @ 2021-11-10 18:56 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.1 instructions:
vexpandbm: Vector Expand Byte Mask
vexpandhm: Vector Expand Halfword Mask
vexpandwm: Vector Expand Word Mask
vexpanddm: Vector Expand Doubleword Mask
vexpandqm: Vector Expand Quadword Mask
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 11 ++++++++++
target/ppc/translate/vmx-impl.c.inc | 34 +++++++++++++++++++++++++++++
2 files changed, 45 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index e135b8aba4..9a28f1d266 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -56,6 +56,9 @@
&VX_uim4 vrt uim vrb
@VX_uim4 ...... vrt:5 . uim:4 vrb:5 ........... &VX_uim4
+&VX_tb vrt vrb
+@VX_tb ...... vrt:5 ..... vrb:5 ........... &VX_tb
+
&X rt ra rb
@X ...... rt:5 ra:5 rb:5 .......... . &X
@@ -408,6 +411,14 @@ VINSWVRX 000100 ..... ..... ..... 00110001111 @VX
VSLDBI 000100 ..... ..... ..... 00 ... 010110 @VN
VSRDBI 000100 ..... ..... ..... 01 ... 010110 @VN
+## Vector Mask Manipulation Instructions
+
+VEXPANDBM 000100 ..... 00000 ..... 11001000010 @VX_tb
+VEXPANDHM 000100 ..... 00001 ..... 11001000010 @VX_tb
+VEXPANDWM 000100 ..... 00010 ..... 11001000010 @VX_tb
+VEXPANDDM 000100 ..... 00011 ..... 11001000010 @VX_tb
+VEXPANDQM 000100 ..... 00100 ..... 11001000010 @VX_tb
+
# VSX Load/Store Instructions
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index b361f73a67..58aca58f0f 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1505,6 +1505,40 @@ static bool trans_VSRDBI(DisasContext *ctx, arg_VN *a)
return true;
}
+static bool do_vexpand(DisasContext *ctx, arg_VX_tb *a, unsigned vece)
+{
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ tcg_gen_gvec_sari(vece, avr_full_offset(a->vrt), avr_full_offset(a->vrb),
+ (8 << vece) - 1, 16, 16);
+
+ return true;
+}
+
+TRANS(VEXPANDBM, do_vexpand, MO_8)
+TRANS(VEXPANDHM, do_vexpand, MO_16)
+TRANS(VEXPANDWM, do_vexpand, MO_32)
+TRANS(VEXPANDDM, do_vexpand, MO_64)
+
+static bool trans_VEXPANDQM(DisasContext *ctx, arg_VX_tb *a)
+{
+ TCGv_i64 tmp;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ tmp = tcg_temp_new_i64();
+
+ get_avr64(tmp, a->vrb, true);
+ tcg_gen_sari_i64(tmp, tmp, 63);
+ set_avr64(a->vrt, tmp, false);
+ set_avr64(a->vrt, tmp, true);
+
+ tcg_temp_free_i64(tmp);
+ return true;
+}
+
#define GEN_VAFORM_PAIRED(name0, name1, opc2) \
static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
{ \
--
2.25.1
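The expand-mask semantics implemented above amount to an arithmetic shift right by (element width - 1), so each element's sign bit fills the whole element. An illustrative Python model of that behavior (a sketch, not part of the patch; the function name is mine):

```python
def vexpand_mask(vec, elem_bits, total_bits=128):
    """Expand the sign bit of each element to all-ones or all-zeros.

    Equivalent to the gvec sari by (elem_bits - 1) used in do_vexpand().
    """
    elem_mask = (1 << elem_bits) - 1
    out = 0
    for i in range(total_bits // elem_bits):
        elem = (vec >> (i * elem_bits)) & elem_mask
        msb = elem >> (elem_bits - 1)
        out |= (elem_mask if msb else 0) << (i * elem_bits)
    return out
```

For example, a 128-bit vector whose low byte is 0x80 expands (with byte elements) to a vector whose low byte is 0xFF and all other bytes zero.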
* Re: [PATCH 1/3] target/ppc: Implement Vector Expand Mask
From: Richard Henderson @ 2021-11-11 9:28 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 11/10/21 7:56 PM, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Implement the following PowerISA v3.1 instructions:
> vexpandbm: Vector Expand Byte Mask
> vexpandhm: Vector Expand Halfword Mask
> vexpandwm: Vector Expand Word Mask
> vexpanddm: Vector Expand Doubleword Mask
> vexpandqm: Vector Expand Quadword Mask
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 11 ++++++++++
> target/ppc/translate/vmx-impl.c.inc | 34 +++++++++++++++++++++++++++++
> 2 files changed, 45 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH 2/3] target/ppc: Implement Vector Extract Mask
From: matheus.ferst @ 2021-11-10 18:56 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.1 instructions:
vextractbm: Vector Extract Byte Mask
vextracthm: Vector Extract Halfword Mask
vextractwm: Vector Extract Word Mask
vextractdm: Vector Extract Doubleword Mask
vextractqm: Vector Extract Quadword Mask
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 6 ++
target/ppc/translate/vmx-impl.c.inc | 85 +++++++++++++++++++++++++++++
2 files changed, 91 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 9a28f1d266..639ac22bf0 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -419,6 +419,12 @@ VEXPANDWM 000100 ..... 00010 ..... 11001000010 @VX_tb
VEXPANDDM 000100 ..... 00011 ..... 11001000010 @VX_tb
VEXPANDQM 000100 ..... 00100 ..... 11001000010 @VX_tb
+VEXTRACTBM 000100 ..... 01000 ..... 11001000010 @VX_tb
+VEXTRACTHM 000100 ..... 01001 ..... 11001000010 @VX_tb
+VEXTRACTWM 000100 ..... 01010 ..... 11001000010 @VX_tb
+VEXTRACTDM 000100 ..... 01011 ..... 11001000010 @VX_tb
+VEXTRACTQM 000100 ..... 01100 ..... 11001000010 @VX_tb
+
# VSX Load/Store Instructions
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 58aca58f0f..c6a30614fb 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1539,6 +1539,91 @@ static bool trans_VEXPANDQM(DisasContext *ctx, arg_VX_tb *a)
return true;
}
+static bool do_vextractm(DisasContext *ctx, arg_VX_tb *a, unsigned vece)
+{
+ const uint64_t elem_length = 8 << vece, elem_num = 15 >> vece;
+ int i = elem_num;
+ uint64_t bit;
+ TCGv_i64 t, b, tmp, zero;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ t = tcg_const_i64(0);
+ b = tcg_temp_new_i64();
+ tmp = tcg_temp_new_i64();
+ zero = tcg_constant_i64(0);
+
+ get_avr64(b, a->vrb, true);
+ for (bit = 1ULL << 63; i > elem_num / 2; i--, bit >>= elem_length) {
+ tcg_gen_shli_i64(t, t, 1);
+ tcg_gen_andi_i64(tmp, b, bit);
+ tcg_gen_setcond_i64(TCG_COND_NE, tmp, tmp, zero);
+ tcg_gen_or_i64(t, t, tmp);
+ }
+
+ get_avr64(b, a->vrb, false);
+ for (bit = 1ULL << 63; i >= 0; i--, bit >>= elem_length) {
+ tcg_gen_shli_i64(t, t, 1);
+ tcg_gen_andi_i64(tmp, b, bit);
+ tcg_gen_setcond_i64(TCG_COND_NE, tmp, tmp, zero);
+ tcg_gen_or_i64(t, t, tmp);
+ }
+
+ tcg_gen_trunc_i64_tl(cpu_gpr[a->vrt], t);
+
+ tcg_temp_free_i64(t);
+ tcg_temp_free_i64(b);
+ tcg_temp_free_i64(tmp);
+
+ return true;
+}
+
+TRANS(VEXTRACTBM, do_vextractm, MO_8)
+TRANS(VEXTRACTHM, do_vextractm, MO_16)
+TRANS(VEXTRACTWM, do_vextractm, MO_32)
+
+static bool trans_VEXTRACTDM(DisasContext *ctx, arg_VX_tb *a)
+{
+ TCGv_i64 t, b;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ t = tcg_temp_new_i64();
+ b = tcg_temp_new_i64();
+
+ get_avr64(b, a->vrb, true);
+ tcg_gen_shri_i64(t, b, 63);
+ tcg_gen_shli_i64(t, t, 1);
+
+ get_avr64(b, a->vrb, false);
+ tcg_gen_shri_i64(b, b, 63);
+ tcg_gen_or_i64(t, t, b);
+
+ tcg_gen_trunc_i64_tl(cpu_gpr[a->vrt], t);
+
+ tcg_temp_free_i64(t);
+ tcg_temp_free_i64(b);
+
+ return true;
+}
+
+static bool trans_VEXTRACTQM(DisasContext *ctx, arg_VX_tb *a)
+{
+ TCGv_i64 tmp;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ tmp = tcg_temp_new_i64();
+
+ get_avr64(tmp, a->vrb, true);
+ tcg_gen_shri_i64(tmp, tmp, 63);
+ tcg_gen_trunc_i64_tl(cpu_gpr[a->vrt], tmp);
+
+ tcg_temp_free_i64(tmp);
+
+ return true;
+}
+
#define GEN_VAFORM_PAIRED(name0, name1, opc2) \
static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
{ \
--
2.25.1
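The extract-mask semantics of this patch are the reverse direction: gather the sign bit of each element into a scalar, with element 0 (the leftmost, highest-order element in PowerISA numbering) supplying the most significant bit of the mask. An illustrative Python model (the names are mine, not from the patch):

```python
def vextract_mask(vec, elem_bits, total_bits=128):
    """Collect the most significant bit of each element into an integer.

    Element 0 (the leftmost element) supplies the most significant
    bit of the result, matching the loop order in do_vextractm().
    """
    n = total_bits // elem_bits
    out = 0
    for i in range(n):  # i == 0 is the highest-order element
        elem_msb = (vec >> ((n - 1 - i) * elem_bits + elem_bits - 1)) & 1
        out = (out << 1) | elem_msb
    return out
```

With byte elements, a set bit 127 (sign bit of the leftmost byte) lands in bit 15 of the 16-bit mask, and a set bit 7 (sign bit of the rightmost byte) lands in bit 0.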
* Re: [PATCH 2/3] target/ppc: Implement Vector Extract Mask
From: Richard Henderson @ 2021-11-11 9:54 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 11/10/21 7:56 PM, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Implement the following PowerISA v3.1 instructions:
> vextractbm: Vector Extract Byte Mask
> vextracthm: Vector Extract Halfword Mask
> vextractwm: Vector Extract Word Mask
> vextractdm: Vector Extract Doubleword Mask
> vextractqm: Vector Extract Quadword Mask
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 6 ++
> target/ppc/translate/vmx-impl.c.inc | 85 +++++++++++++++++++++++++++++
> 2 files changed, 91 insertions(+)
>
> diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
> index 9a28f1d266..639ac22bf0 100644
> --- a/target/ppc/insn32.decode
> +++ b/target/ppc/insn32.decode
> @@ -419,6 +419,12 @@ VEXPANDWM 000100 ..... 00010 ..... 11001000010 @VX_tb
> VEXPANDDM 000100 ..... 00011 ..... 11001000010 @VX_tb
> VEXPANDQM 000100 ..... 00100 ..... 11001000010 @VX_tb
>
> +VEXTRACTBM 000100 ..... 01000 ..... 11001000010 @VX_tb
> +VEXTRACTHM 000100 ..... 01001 ..... 11001000010 @VX_tb
> +VEXTRACTWM 000100 ..... 01010 ..... 11001000010 @VX_tb
> +VEXTRACTDM 000100 ..... 01011 ..... 11001000010 @VX_tb
> +VEXTRACTQM 000100 ..... 01100 ..... 11001000010 @VX_tb
> +
> # VSX Load/Store Instructions
>
> LXV 111101 ..... ..... ............ . 001 @DQ_TSX
> diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
> index 58aca58f0f..c6a30614fb 100644
> --- a/target/ppc/translate/vmx-impl.c.inc
> +++ b/target/ppc/translate/vmx-impl.c.inc
> @@ -1539,6 +1539,91 @@ static bool trans_VEXPANDQM(DisasContext *ctx, arg_VX_tb *a)
> return true;
> }
>
> +static bool do_vextractm(DisasContext *ctx, arg_VX_tb *a, unsigned vece)
> +{
> + const uint64_t elem_length = 8 << vece, elem_num = 15 >> vece;
> + int i = elem_num;
> + uint64_t bit;
> + TCGv_i64 t, b, tmp, zero;
> +
> + REQUIRE_INSNS_FLAGS2(ctx, ISA310);
> + REQUIRE_VECTOR(ctx);
> +
> + t = tcg_const_i64(0);
> + b = tcg_temp_new_i64();
> + tmp = tcg_temp_new_i64();
> + zero = tcg_constant_i64(0);
> +
> + get_avr64(b, a->vrb, true);
> + for (bit = 1ULL << 63; i > elem_num / 2; i--, bit >>= elem_length) {
> + tcg_gen_shli_i64(t, t, 1);
> + tcg_gen_andi_i64(tmp, b, bit);
> + tcg_gen_setcond_i64(TCG_COND_NE, tmp, tmp, zero);
> + tcg_gen_or_i64(t, t, tmp);
> + }
This is over-complicated. Shift b into the correct position, isolate the correct bit, or
it into the result.
    int ele_width = 8 << vece;
    int ele_count_half = 8 >> vece;

    tcg_gen_movi_i64(r, 0);
    for (int w = 0; w < 2; w++) {
        get_avr64(v, a->vrb, w);
        for (int i = 0; i < ele_count_half; ++i) {
            int b_in = (i + 1) * ele_width - 1;
            int b_out = w * ele_count_half + i;
            tcg_gen_shri_i64(t, v, b_in - b_out);
            tcg_gen_andi_i64(t, t, 1ULL << b_out);
            tcg_gen_or_i64(r, r, t);
        }
    }
    tcg_gen_trunc_i64_tl(gpr, r);
> +TRANS(VEXTRACTBM, do_vextractm, MO_8)
> +TRANS(VEXTRACTHM, do_vextractm, MO_16)
> +TRANS(VEXTRACTWM, do_vextractm, MO_32)
> +
> +static bool trans_VEXTRACTDM(DisasContext *ctx, arg_VX_tb *a)
Should be able to use the common routine above as well.
r~
* [PATCH 3/3] target/ppc: Implement Vector Mask Move insns
From: matheus.ferst @ 2021-11-10 18:56 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.1 instructions:
mtvsrbm: Move to VSR Byte Mask
mtvsrhm: Move to VSR Halfword Mask
mtvsrwm: Move to VSR Word Mask
mtvsrdm: Move to VSR Doubleword Mask
mtvsrqm: Move to VSR Quadword Mask
mtvsrbmi: Move to VSR Byte Mask Immediate
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 11 +++
target/ppc/translate/vmx-impl.c.inc | 114 ++++++++++++++++++++++++++++
2 files changed, 125 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 639ac22bf0..f68931f4f3 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -40,6 +40,10 @@
%ds_rtp 22:4 !function=times_2
@DS_rtp ...... ....0 ra:5 .............. .. &D rt=%ds_rtp si=%ds_si
+&DX_b vrt b
+%dx_b 6:10 16:5 0:1
+@DX_b ...... vrt:5 ..... .......... ..... . &DX_b b=%dx_b
+
&DX rt d
%dx_d 6:s10 16:5 0:1
@DX ...... rt:5 ..... .......... ..... . &DX d=%dx_d
@@ -413,6 +417,13 @@ VSRDBI 000100 ..... ..... ..... 01 ... 010110 @VN
## Vector Mask Manipulation Instructions
+MTVSRBM 000100 ..... 10000 ..... 11001000010 @VX_tb
+MTVSRHM 000100 ..... 10001 ..... 11001000010 @VX_tb
+MTVSRWM 000100 ..... 10010 ..... 11001000010 @VX_tb
+MTVSRDM 000100 ..... 10011 ..... 11001000010 @VX_tb
+MTVSRQM 000100 ..... 10100 ..... 11001000010 @VX_tb
+MTVSRBMI 000100 ..... ..... .......... 01010 . @DX_b
+
VEXPANDBM 000100 ..... 00000 ..... 11001000010 @VX_tb
VEXPANDHM 000100 ..... 00001 ..... 11001000010 @VX_tb
VEXPANDWM 000100 ..... 00010 ..... 11001000010 @VX_tb
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index c6a30614fb..9f86133d1d 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1624,6 +1624,120 @@ static bool trans_VEXTRACTQM(DisasContext *ctx, arg_VX_tb *a)
return true;
}
+static bool do_mtvsrm(DisasContext *ctx, arg_VX_tb *a, unsigned vece)
+{
+ const uint64_t elem_length = 8 << vece, highest_bit = 15 >> vece;
+ int i;
+ TCGv_i64 t0, t1, zero, ones;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ t0 = tcg_const_i64(0);
+ t1 = tcg_temp_new_i64();
+ zero = tcg_constant_i64(0);
+ ones = tcg_constant_i64(MAKE_64BIT_MASK(0, elem_length));
+
+ for (i = 1 << highest_bit; i > 1 << (highest_bit / 2); i >>= 1) {
+ tcg_gen_shli_i64(t0, t0, elem_length);
+ tcg_gen_ext_tl_i64(t1, cpu_gpr[a->vrb]);
+ tcg_gen_andi_i64(t1, t1, i);
+ tcg_gen_movcond_i64(TCG_COND_NE, t1, t1, zero, ones, zero);
+ tcg_gen_or_i64(t0, t0, t1);
+ }
+
+ set_avr64(a->vrt, t0, true);
+
+ for (; i > 0; i >>= 1) {
+ tcg_gen_shli_i64(t0, t0, elem_length);
+ tcg_gen_ext_tl_i64(t1, cpu_gpr[a->vrb]);
+ tcg_gen_andi_i64(t1, t1, i);
+ tcg_gen_movcond_i64(TCG_COND_NE, t1, t1, zero, ones, zero);
+ tcg_gen_or_i64(t0, t0, t1);
+ }
+
+ set_avr64(a->vrt, t0, false);
+
+ tcg_temp_free_i64(t0);
+ tcg_temp_free_i64(t1);
+
+ return true;
+}
+
+TRANS(MTVSRBM, do_mtvsrm, MO_8)
+TRANS(MTVSRHM, do_mtvsrm, MO_16)
+TRANS(MTVSRWM, do_mtvsrm, MO_32)
+
+static bool trans_MTVSRDM(DisasContext *ctx, arg_VX_tb *a)
+{
+ TCGv_i64 t0, t1;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ t0 = tcg_temp_new_i64();
+ t1 = tcg_temp_new_i64();
+
+ tcg_gen_ext_tl_i64(t0, cpu_gpr[a->vrb]);
+ tcg_gen_sextract_i64(t1, t0, 1, 1);
+ set_avr64(a->vrt, t1, true);
+ tcg_gen_sextract_i64(t0, t0, 0, 1);
+ set_avr64(a->vrt, t0, false);
+
+ tcg_temp_free_i64(t0);
+ tcg_temp_free_i64(t1);
+
+ return true;
+}
+
+static bool trans_MTVSRQM(DisasContext *ctx, arg_VX_tb *a)
+{
+ TCGv_i64 tmp;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ tmp = tcg_temp_new_i64();
+
+ tcg_gen_ext_tl_i64(tmp, cpu_gpr[a->vrb]);
+ tcg_gen_sextract_i64(tmp, tmp, 0, 1);
+ set_avr64(a->vrt, tmp, false);
+ set_avr64(a->vrt, tmp, true);
+
+ tcg_temp_free_i64(tmp);
+
+ return true;
+}
+
+static bool trans_MTVSRBMI(DisasContext *ctx, arg_DX_b *a)
+{
+ int i;
+ uint64_t hi = 0, lo = 0;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ for (i = 1 << 15; i >= 1 << 8; i >>= 1) {
+ hi <<= 8;
+ if (a->b & i) {
+ hi |= 0xFF;
+ }
+ }
+
+ set_avr64(a->vrt, tcg_constant_i64(hi), true);
+
+ for (; i > 0; i >>= 1) {
+ lo <<= 8;
+ if (a->b & i) {
+ lo |= 0xFF;
+ }
+ }
+
+ set_avr64(a->vrt, tcg_constant_i64(lo), false);
+
+ return true;
+}
+
#define GEN_VAFORM_PAIRED(name0, name1, opc2) \
static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
{ \
--
2.25.1
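The move-to-VSR mask semantics above are the inverse mapping: bit k of the source GPR (or of the 16-bit immediate, for mtvsrbmi) selects all-ones or all-zeros for element k. A Python sketch of that mapping (an illustration, not the patch's code; element 0 is taken here as the least significant element, matching the loop order in do_mtvsrm, where GPR bit 15 fills the most significant byte):

```python
def mtvsr_mask(mask_bits, elem_bits, total_bits=128):
    """Expand low-order GPR mask bits into full element masks.

    Bit k of mask_bits (bit 0 = least significant) controls element k,
    with element 0 the least significant element of the 128-bit result.
    """
    n = total_bits // elem_bits
    elem_mask = (1 << elem_bits) - 1
    out = 0
    for k in range(n):
        if (mask_bits >> k) & 1:
            out |= elem_mask << (k * elem_bits)
    return out
```

So `mtvsr_mask(1 << 15, 8)` sets only the most significant byte, mirroring the first iteration of the high-doubleword loop in trans_MTVSRBMI.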
* Re: [PATCH 3/3] target/ppc: Implement Vector Mask Move insns
From: Richard Henderson @ 2021-11-11 10:43 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 11/10/21 7:56 PM, matheus.ferst@eldorado.org.br wrote:
> +static bool do_mtvsrm(DisasContext *ctx, arg_VX_tb *a, unsigned vece)
> +{
> + const uint64_t elem_length = 8 << vece, highest_bit = 15 >> vece;
> + int i;
> + TCGv_i64 t0, t1, zero, ones;
> +
> + REQUIRE_INSNS_FLAGS2(ctx, ISA310);
> + REQUIRE_VECTOR(ctx);
> +
> + t0 = tcg_const_i64(0);
> + t1 = tcg_temp_new_i64();
> + zero = tcg_constant_i64(0);
> + ones = tcg_constant_i64(MAKE_64BIT_MASK(0, elem_length));
> +
> + for (i = 1 << highest_bit; i > 1 << (highest_bit / 2); i >>= 1) {
> + tcg_gen_shli_i64(t0, t0, elem_length);
> + tcg_gen_ext_tl_i64(t1, cpu_gpr[a->vrb]);
> + tcg_gen_andi_i64(t1, t1, i);
> + tcg_gen_movcond_i64(TCG_COND_NE, t1, t1, zero, ones, zero);
> + tcg_gen_or_i64(t0, t0, t1);
> + }
We can do better than that.
tcg_gen_extu_tl_i64(t0, gpr);
tcg_gen_extract_i64(t1, t0, elem_count_half, elem_count_half);
tcg_gen_extract_i64(t0, t0, 0, elem_count_half);
/*
* Spread the bits into their respective elements.
* E.g. for bytes:
* 00000000000000000000000000000000000000000000000000000000abcdefgh
* << 32-4
* 0000000000000000000000000000abcdefgh0000000000000000000000000000
* |
* 0000000000000000000000000000abcdefgh00000000000000000000abcdefgh
* << 16-2
* 00000000000000abcdefgh00000000000000000000abcdefgh00000000000000
* |
* 00000000000000abcdefgh000000abcdefgh000000abcdefgh000000abcdefgh
* << 8-1
* 0000000abcdefgh000000abcdefgh000000abcdefgh000000abcdefgh0000000
* |
* 0000000abcdefgXbcdefgXbcdefgXbcdefgXbcdefgXbcdefgXbcdefgXbcdefgh
* & dup(1)
* 0000000a0000000b0000000c0000000d0000000e0000000f0000000g0000000h
* * 0xff
* aaaaaaaabbbbbbbbccccccccddddddddeeeeeeeeffffffffgggggggghhhhhhhh
*/
    for (i = elem_count_half / 2, j = 32; i > 0; i >>= 1, j >>= 1) {
        tcg_gen_shli_i64(s0, t0, j - i);
        tcg_gen_shli_i64(s1, t1, j - i);
        tcg_gen_or_i64(t0, t0, s0);
        tcg_gen_or_i64(t1, t1, s1);
    }

    c = dup_const(vece, 1);
    tcg_gen_andi_i64(t0, t0, c);
    tcg_gen_andi_i64(t1, t1, c);

    c = MAKE_64BIT_MASK(0, elem_length);
    tcg_gen_muli_i64(t0, t0, c);
    tcg_gen_muli_i64(t1, t1, c);
set_avr64(a->vrt, t0, false);
set_avr64(a->vrt, t1, true);
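The spreading identity in the comment can be checked independently. A Python model of the same shift/or, mask-with-dup(1), multiply sequence for the byte case (shift amounts 32-4, 16-2, 8-1, i.e. 28, 14, 7), with a naive bit-by-bit reference for comparison (function names are mine):

```python
def spread_mask_bits(bits8):
    """Spread 8 mask bits into 8 bytes: shift/or, mask, multiply.

    Mirrors the worked example: x |= x << 28; x |= x << 14; x |= x << 7;
    then keep one bit per byte and widen it with a multiply by 0xFF.
    """
    M64 = (1 << 64) - 1
    x = bits8 & 0xFF
    for i, j in ((4, 32), (2, 16), (1, 8)):
        x = (x | (x << (j - i))) & M64
    x &= 0x0101010101010101          # dup_const(MO_8, 1)
    return (x * 0xFF) & M64          # widen each kept bit to a full byte

def reference(bits8):
    """Naive reference: byte k is 0xFF iff bit k of bits8 is set."""
    out = 0
    for k in range(8):
        if (bits8 >> k) & 1:
            out |= 0xFF << (8 * k)
    return out
```

Each source bit k needs a total shift of 7k to reach byte position 8k, and 0, 7, 14, ..., 49 are exactly the subset sums of {28, 14, 7}, so the sequence is exact for all 256 inputs.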
r~