[PATCH 00/10] VDIV/VMOD Implementation

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 00/10] VDIV/VMOD Implementation
@ 2022-03-30 20:25 Lucas Mateus Castro(alqotel)
  2022-03-30 20:25 ` [PATCH 01/10] qemu/int128: avoid undefined behavior in int128_lshift Lucas Mateus Castro(alqotel)
                   ` (9 more replies)
  0 siblings, 10 replies; 16+ messages in thread
From: Lucas Mateus Castro(alqotel) @ 2022-03-30 20:25 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: Lucas Mateus Castro (alqotel), danielhb413, richard.henderson,
	clg

From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>

This patch series is an implementation of the vector divide, vector
divide extended and vector modulo instructions from PowerISA 3.1

The first 2 patches are Matheus' patches used here since the divs256 and
divu256 functions use int128_lshift and int128_urshift.

Lucas Mateus Castro (alqotel) (8):
  target/ppc: Implemented vector divide instructions
  target/ppc: Implemented vector divide quadword
  target/ppc: Implemented vector divide extended word
  Implemented unsigned 256-by-128 division
  Implemented signed 256-by-128 division
  target/ppc: Implemented remaining vector divide extended
  target/ppc: Implemented vector module word/doubleword
  target/ppc: Implemented vector module quadword

Matheus Ferst (2):
  qemu/int128: avoid undefined behavior in int128_lshift
  qemu/int128: add int128_urshift

 include/qemu/host-utils.h           |  16 +++
 include/qemu/int128.h               |  53 +++++++-
 target/ppc/helper.h                 |   8 ++
 target/ppc/insn32.decode            |  23 ++++
 target/ppc/int_helper.c             | 109 +++++++++++++++++
 target/ppc/translate/vmx-impl.c.inc | 148 +++++++++++++++++++++++
 tests/unit/test-int128.c            |  32 +++++
 util/host-utils.c                   | 179 ++++++++++++++++++++++++++++
 8 files changed, 566 insertions(+), 2 deletions(-)

-- 
2.31.1



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 01/10] qemu/int128: avoid undefined behavior in int128_lshift
  2022-03-30 20:25 [PATCH 00/10] VDIV/VMOD Implementation Lucas Mateus Castro(alqotel)
@ 2022-03-30 20:25 ` Lucas Mateus Castro(alqotel)
  2022-03-30 21:21   ` Peter Maydell
  2022-03-30 20:25 ` [PATCH 02/10] qemu/int128: add int128_urshift Lucas Mateus Castro(alqotel)
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Lucas Mateus Castro(alqotel) @ 2022-03-30 20:25 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: Peter Maydell, danielhb413, richard.henderson,
	Philippe Mathieu-Daudé, Lucas Mateus Castro, clg,
	Frédéric Pétrot, Matheus Ferst

From: Matheus Ferst <matheus.ferst@eldorado.org.br>

Avoid the left shift of negative values in int128_lshift by casting
a/a.hi to unsigned.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
---
 include/qemu/int128.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/qemu/int128.h b/include/qemu/int128.h
index 2c4064256c..2a19558ac6 100644
--- a/include/qemu/int128.h
+++ b/include/qemu/int128.h
@@ -85,7 +85,7 @@ static inline Int128 int128_rshift(Int128 a, int n)
 
 static inline Int128 int128_lshift(Int128 a, int n)
 {
-    return a << n;
+    return (__uint128_t)a << n;
 }
 
 static inline Int128 int128_add(Int128 a, Int128 b)
@@ -305,7 +305,7 @@ static inline Int128 int128_lshift(Int128 a, int n)
     if (n >= 64) {
         return int128_make128(0, l);
     } else if (n > 0) {
-        return int128_make128(l, (a.hi << n) | (a.lo >> (64 - n)));
+        return int128_make128(l, ((uint64_t)a.hi << n) | (a.lo >> (64 - n)));
     }
     return a;
 }
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 02/10] qemu/int128: add int128_urshift
  2022-03-30 20:25 [PATCH 00/10] VDIV/VMOD Implementation Lucas Mateus Castro(alqotel)
  2022-03-30 20:25 ` [PATCH 01/10] qemu/int128: avoid undefined behavior in int128_lshift Lucas Mateus Castro(alqotel)
@ 2022-03-30 20:25 ` Lucas Mateus Castro(alqotel)
  2022-03-30 20:25 ` [PATCH 03/10] target/ppc: Implemented vector divide instructions Lucas Mateus Castro(alqotel)
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Lucas Mateus Castro(alqotel) @ 2022-03-30 20:25 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: Peter Maydell, danielhb413, richard.henderson,
	Philippe Mathieu-Daudé, Lucas Mateus Castro, clg,
	Matheus Ferst, David Gibson

From: Matheus Ferst <matheus.ferst@eldorado.org.br>

Implement an unsigned right shift for Int128 values and add the same
tests cases of int128_rshift in the unit test.

Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
---
 include/qemu/int128.h    | 19 +++++++++++++++++++
 tests/unit/test-int128.c | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/include/qemu/int128.h b/include/qemu/int128.h
index 2a19558ac6..ca32b0b276 100644
--- a/include/qemu/int128.h
+++ b/include/qemu/int128.h
@@ -83,6 +83,11 @@ static inline Int128 int128_rshift(Int128 a, int n)
     return a >> n;
 }
 
+static inline Int128 int128_urshift(Int128 a, int n)
+{
+    return (__uint128_t)a >> n;
+}
+
 static inline Int128 int128_lshift(Int128 a, int n)
 {
     return (__uint128_t)a << n;
@@ -299,6 +304,20 @@ static inline Int128 int128_rshift(Int128 a, int n)
     }
 }
 
+static inline Int128 int128_urshift(Int128 a, int n)
+{
+    uint64_t h = a.hi;
+    if (!n) {
+        return a;
+    }
+    h = h >> (n & 63);
+    if (n >= 64) {
+        return int128_make64(h);
+    } else {
+        return int128_make128((a.lo >> n) | ((uint64_t)a.hi << (64 - n)), h);
+    }
+}
+
 static inline Int128 int128_lshift(Int128 a, int n)
 {
     uint64_t l = a.lo << (n & 63);
diff --git a/tests/unit/test-int128.c b/tests/unit/test-int128.c
index b86a3c76e6..ae0f552193 100644
--- a/tests/unit/test-int128.c
+++ b/tests/unit/test-int128.c
@@ -206,6 +206,37 @@ static void test_rshift(void)
     test_rshift_one(0xFFFE8000U,  0, 0xFFFFFFFFFFFFFFFEULL, 0x8000000000000000ULL);
 }
 
+static void __attribute__((__noinline__)) ATTRIBUTE_NOCLONE
+test_urshift_one(uint32_t x, int n, uint64_t h, uint64_t l)
+{
+    Int128 a = expand(x);
+    Int128 r = int128_urshift(a, n);
+    g_assert_cmpuint(int128_getlo(r), ==, l);
+    g_assert_cmpuint(int128_gethi(r), ==, h);
+}
+
+static void test_urshift(void)
+{
+    test_urshift_one(0x00010000U, 64, 0x0000000000000000ULL, 0x0000000000000001ULL);
+    test_urshift_one(0x80010000U, 64, 0x0000000000000000ULL, 0x8000000000000001ULL);
+    test_urshift_one(0x7FFE0000U, 64, 0x0000000000000000ULL, 0x7FFFFFFFFFFFFFFEULL);
+    test_urshift_one(0xFFFE0000U, 64, 0x0000000000000000ULL, 0xFFFFFFFFFFFFFFFEULL);
+    test_urshift_one(0x00010000U, 60, 0x0000000000000000ULL, 0x0000000000000010ULL);
+    test_urshift_one(0x80010000U, 60, 0x0000000000000008ULL, 0x0000000000000010ULL);
+    test_urshift_one(0x00018000U, 60, 0x0000000000000000ULL, 0x0000000000000018ULL);
+    test_urshift_one(0x80018000U, 60, 0x0000000000000008ULL, 0x0000000000000018ULL);
+    test_urshift_one(0x7FFE0000U, 60, 0x0000000000000007ULL, 0xFFFFFFFFFFFFFFE0ULL);
+    test_urshift_one(0xFFFE0000U, 60, 0x000000000000000FULL, 0xFFFFFFFFFFFFFFE0ULL);
+    test_urshift_one(0x7FFE8000U, 60, 0x0000000000000007ULL, 0xFFFFFFFFFFFFFFE8ULL);
+    test_urshift_one(0xFFFE8000U, 60, 0x000000000000000FULL, 0xFFFFFFFFFFFFFFE8ULL);
+    test_urshift_one(0x00018000U,  0, 0x0000000000000001ULL, 0x8000000000000000ULL);
+    test_urshift_one(0x80018000U,  0, 0x8000000000000001ULL, 0x8000000000000000ULL);
+    test_urshift_one(0x7FFE0000U,  0, 0x7FFFFFFFFFFFFFFEULL, 0x0000000000000000ULL);
+    test_urshift_one(0xFFFE0000U,  0, 0xFFFFFFFFFFFFFFFEULL, 0x0000000000000000ULL);
+    test_urshift_one(0x7FFE8000U,  0, 0x7FFFFFFFFFFFFFFEULL, 0x8000000000000000ULL);
+    test_urshift_one(0xFFFE8000U,  0, 0xFFFFFFFFFFFFFFFEULL, 0x8000000000000000ULL);
+}
+
 int main(int argc, char **argv)
 {
     g_test_init(&argc, &argv, NULL);
@@ -219,5 +250,6 @@ int main(int argc, char **argv)
     g_test_add_func("/int128/int128_ge", test_ge);
     g_test_add_func("/int128/int128_gt", test_gt);
     g_test_add_func("/int128/int128_rshift", test_rshift);
+    g_test_add_func("/int128/int128_urshift", test_urshift);
     return g_test_run();
 }
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 03/10] target/ppc: Implemented vector divide instructions
  2022-03-30 20:25 [PATCH 00/10] VDIV/VMOD Implementation Lucas Mateus Castro(alqotel)
  2022-03-30 20:25 ` [PATCH 01/10] qemu/int128: avoid undefined behavior in int128_lshift Lucas Mateus Castro(alqotel)
  2022-03-30 20:25 ` [PATCH 02/10] qemu/int128: add int128_urshift Lucas Mateus Castro(alqotel)
@ 2022-03-30 20:25 ` Lucas Mateus Castro(alqotel)
  2022-03-30 21:06   ` Richard Henderson
  2022-03-30 20:25 ` [PATCH 04/10] target/ppc: Implemented vector divide quadword Lucas Mateus Castro(alqotel)
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Lucas Mateus Castro(alqotel) @ 2022-03-30 20:25 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: danielhb413, richard.henderson, Greg Kurz,
	Lucas Mateus Castro (alqotel), clg, David Gibson

From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>

Implement the following PowerISA v3.1 instructions:
vdivsw: Vector Divide Signed Word
vdivuw: Vector Divide Unsigned Word 
vdivsd: Vector Divide Signed Doubleword
vdivud: Vector Divide Unsigned Doubleword

Hardware behavior based on mambo

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
---
 target/ppc/insn32.decode            |  7 +++++
 target/ppc/translate/vmx-impl.c.inc | 49 +++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index ac2d3da9a7..597768558b 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -703,3 +703,10 @@ XVTLSBB         111100 ... -- 00010 ..... 111011011 . - @XX2_bf_xb
 &XL_s           s:uint8_t
 @XL_s           ......-------------- s:1 .......... -   &XL_s
 RFEBB           010011-------------- .   0010010010 -   @XL_s
+
+## Vector Division Instructions
+
+VDIVSW          000100 ..... ..... ..... 00110001011    @VX
+VDIVUW          000100 ..... ..... ..... 00010001011    @VX
+VDIVSD          000100 ..... ..... ..... 00111001011    @VX
+VDIVUD          000100 ..... ..... ..... 00011001011    @VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 6101bca3fd..d96e804abb 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -3236,6 +3236,55 @@ TRANS(VMULHSD, do_vx_mulh, true , do_vx_vmulhd_i64)
 TRANS(VMULHUW, do_vx_mulh, false, do_vx_vmulhw_i64)
 TRANS(VMULHUD, do_vx_mulh, false, do_vx_vmulhd_i64)
 
+#define TRANS_VDIV_VMOD(FLAGS, NAME, VECE, FNI4_FUNC, FNI8_FUNC)        \
+static bool trans_##NAME(DisasContext *ctx, arg_VX *a)                  \
+{                                                                       \
+    static const GVecGen3 op[2] = {                                     \
+        {                                                               \
+            .fni4 = FNI4_FUNC,                                          \
+            .fni8 = FNI8_FUNC,                                          \
+            .vece = MO_32                                               \
+        },                                                              \
+        {                                                               \
+            .fni4 = FNI4_FUNC,                                          \
+            .fni8 = FNI8_FUNC,                                          \
+            .vece = MO_64                                               \
+        },                                                              \
+    };                                                                  \
+                                                                        \
+    REQUIRE_VECTOR(ctx);                                                \
+    REQUIRE_INSNS_FLAGS2(ctx, FLAGS);                                   \
+                                                                        \
+    tcg_gen_gvec_3(avr_full_offset(a->vrt), avr_full_offset(a->vra),    \
+                   avr_full_offset(a->vrb), 16, 16, &op[VECE - MO_32]); \
+                                                                        \
+    return true;                                                        \
+}
+
+#define DIV_VEC(NAME, SZ, DIV)                                          \
+static void do_vx_##NAME(TCGv_##SZ t, TCGv_##SZ a, TCGv_##SZ b)         \
+{                                                                       \
+    TCGv_##SZ zero = tcg_constant_##SZ(0), one = tcg_constant_##SZ(1);  \
+    /*                                                                  \
+     *  If N/0 the instruction used by the backend might deliver        \
+     *  a signal to the process and the hardware returns 0 when         \
+     *  N/0, so if b = 0 return 0/1                                     \
+     */                                                                 \
+    tcg_gen_movcond_##SZ(TCG_COND_EQ, a, b, zero, zero, a);             \
+    tcg_gen_movcond_##SZ(TCG_COND_EQ, b, b, zero, one, b);              \
+    DIV(t, a, b);                                                       \
+}
+
+DIV_VEC(div_i32 , i32, tcg_gen_div_i32)
+DIV_VEC(divu_i32, i32, tcg_gen_divu_i32)
+DIV_VEC(div_i64 , i64, tcg_gen_div_i64)
+DIV_VEC(divu_i64, i64, tcg_gen_divu_i64)
+
+TRANS_VDIV_VMOD(ISA310, VDIVSW, MO_32, do_vx_div_i32 , NULL)
+TRANS_VDIV_VMOD(ISA310, VDIVUW, MO_32, do_vx_divu_i32, NULL)
+TRANS_VDIV_VMOD(ISA310, VDIVSD, MO_64, NULL, do_vx_div_i64)
+TRANS_VDIV_VMOD(ISA310, VDIVUD, MO_64, NULL, do_vx_divu_i64)
+
 #undef GEN_VR_LDX
 #undef GEN_VR_STX
 #undef GEN_VR_LVE
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 04/10] target/ppc: Implemented vector divide quadword
  2022-03-30 20:25 [PATCH 00/10] VDIV/VMOD Implementation Lucas Mateus Castro(alqotel)
                   ` (2 preceding siblings ...)
  2022-03-30 20:25 ` [PATCH 03/10] target/ppc: Implemented vector divide instructions Lucas Mateus Castro(alqotel)
@ 2022-03-30 20:25 ` Lucas Mateus Castro(alqotel)
  2022-03-30 20:25 ` [PATCH 05/10] target/ppc: Implemented vector divide extended word Lucas Mateus Castro(alqotel)
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Lucas Mateus Castro(alqotel) @ 2022-03-30 20:25 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: danielhb413, richard.henderson, Greg Kurz,
	Lucas Mateus Castro (alqotel), clg, David Gibson

From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>

Implement the following PowerISA v3.1 instructions:
vdivsq: Vector Divide Signed Quadword
vdivuq: Vector Divide Unsigned Quadword
Undefined behavior based on mambo.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
---
 target/ppc/helper.h                 |  2 ++
 target/ppc/insn32.decode            |  2 ++
 target/ppc/int_helper.c             | 18 ++++++++++++++++++
 target/ppc/translate/vmx-impl.c.inc |  2 ++
 4 files changed, 24 insertions(+)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 57da11c77e..4cfdf7b3ec 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -171,6 +171,8 @@ DEF_HELPER_FLAGS_3(VMULOSW, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VMULOUB, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VMULOUH, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VMULOUW, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VDIVSQ, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VDIVUQ, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_3(vslo, void, avr, avr, avr)
 DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 597768558b..3a88a0b5bc 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -710,3 +710,5 @@ VDIVSW          000100 ..... ..... ..... 00110001011    @VX
 VDIVUW          000100 ..... ..... ..... 00010001011    @VX
 VDIVSD          000100 ..... ..... ..... 00111001011    @VX
 VDIVUD          000100 ..... ..... ..... 00011001011    @VX
+VDIVSQ          000100 ..... ..... ..... 00100001011    @VX
+VDIVUQ          000100 ..... ..... ..... 00000001011    @VX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 492f34c499..18e5430e00 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1036,6 +1036,24 @@ void helper_XXPERMX(ppc_vsr_t *t, ppc_vsr_t *s0, ppc_vsr_t *s1, ppc_vsr_t *pcv,
     *t = tmp;
 }
 
+void helper_VDIVSQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    if (int128_nz(b->s128)) {
+        t->s128 = int128_divs(a->s128, b->s128);
+    } else {
+        t->s128 = int128_zero(); /* Undefined behavior */
+    }
+}
+
+void helper_VDIVUQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    if (int128_nz(b->s128)) {
+        t->s128 = int128_divu(a->s128, b->s128);
+    } else {
+        t->s128 = int128_zero(); /* Undefined behavior */
+    }
+}
+
 void helper_VPERM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
 {
     ppc_avr_t result;
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index d96e804abb..949e47be1c 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -3284,6 +3284,8 @@ TRANS_VDIV_VMOD(ISA310, VDIVSW, MO_32, do_vx_div_i32 , NULL)
 TRANS_VDIV_VMOD(ISA310, VDIVUW, MO_32, do_vx_divu_i32, NULL)
 TRANS_VDIV_VMOD(ISA310, VDIVSD, MO_64, NULL, do_vx_div_i64)
 TRANS_VDIV_VMOD(ISA310, VDIVUD, MO_64, NULL, do_vx_divu_i64)
+TRANS_FLAGS2(ISA310, VDIVSQ, do_vx_helper, gen_helper_VDIVSQ)
+TRANS_FLAGS2(ISA310, VDIVUQ, do_vx_helper, gen_helper_VDIVUQ)
 
 #undef GEN_VR_LDX
 #undef GEN_VR_STX
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 05/10] target/ppc: Implemented vector divide extended word
  2022-03-30 20:25 [PATCH 00/10] VDIV/VMOD Implementation Lucas Mateus Castro(alqotel)
                   ` (3 preceding siblings ...)
  2022-03-30 20:25 ` [PATCH 04/10] target/ppc: Implemented vector divide quadword Lucas Mateus Castro(alqotel)
@ 2022-03-30 20:25 ` Lucas Mateus Castro(alqotel)
  2022-03-30 21:24   ` Richard Henderson
  2022-03-30 20:25 ` [PATCH 06/10] Implemented unsigned 256-by-128 division Lucas Mateus Castro(alqotel)
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Lucas Mateus Castro(alqotel) @ 2022-03-30 20:25 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: danielhb413, richard.henderson, Greg Kurz,
	Lucas Mateus Castro (alqotel), clg, David Gibson

From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>

Implement the following PowerISA v3.1 instructions:
vdivesw: Vector Divide Extended Signed Word
vdiveuw: Vector Divide Extended Unsigned Word
Undefined behavior based on mambo.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
---
 target/ppc/insn32.decode            |  3 ++
 target/ppc/translate/vmx-impl.c.inc | 65 +++++++++++++++++++++++++++++
 2 files changed, 68 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 3a88a0b5bc..8c115c9c60 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -712,3 +712,6 @@ VDIVSD          000100 ..... ..... ..... 00111001011    @VX
 VDIVUD          000100 ..... ..... ..... 00011001011    @VX
 VDIVSQ          000100 ..... ..... ..... 00100001011    @VX
 VDIVUQ          000100 ..... ..... ..... 00000001011    @VX
+
+VDIVESW         000100 ..... ..... ..... 01110001011    @VX
+VDIVEUW         000100 ..... ..... ..... 01010001011    @VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 949e47be1c..752f3af659 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -3287,6 +3287,71 @@ TRANS_VDIV_VMOD(ISA310, VDIVUD, MO_64, NULL, do_vx_divu_i64)
 TRANS_FLAGS2(ISA310, VDIVSQ, do_vx_helper, gen_helper_VDIVSQ)
 TRANS_FLAGS2(ISA310, VDIVUQ, do_vx_helper, gen_helper_VDIVUQ)
 
+static void do_vx_dives_i32(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i64 res, val1, val2;
+    TCGv_i64 zero = tcg_constant_i64(0);
+    TCGv_i64 one =  tcg_constant_i64(1);
+
+    res = tcg_temp_new_i64();
+    val1 = tcg_temp_new_i64();
+    val2 = tcg_temp_new_i64();
+
+    tcg_gen_ext_i32_i64(val1, a);
+    tcg_gen_ext_i32_i64(val2, b);
+
+    /* return 0 if b = 0, so make b = 1 so the result doesn't fit in 32 bits*/
+    tcg_gen_movcond_i64(TCG_COND_EQ, val2, val2, zero, one, val2);
+
+    /* (a << 32)/b */
+    tcg_gen_shli_i64(val1, val1, 32);
+    tcg_gen_div_i64(res, val1, val2);
+
+    tcg_gen_ext32s_i64(val1, res);
+    /* if result is undefined (quotient doesn't fit in 32 bits) return 0 */
+    tcg_gen_movcond_i64(TCG_COND_EQ, res, res, val1, res, zero);
+    tcg_gen_extrl_i64_i32(t, res);
+
+    tcg_temp_free_i64(res);
+    tcg_temp_free_i64(val1);
+    tcg_temp_free_i64(val2);
+}
+
+static void do_vx_diveu_i32(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i64 val1, val2;
+    TCGv_i32 h, l;
+    TCGv_i32 zero = tcg_constant_i32(0);
+    TCGv_i32 one =  tcg_constant_i32(1);
+
+    val1 = tcg_temp_new_i64();
+    val2 = tcg_temp_new_i64();
+    h = tcg_temp_new_i32();
+    l = tcg_temp_new_i32();
+
+    /* return 0 if b = 0, so make b = 1 so the result doesn't fit in 32 bits*/
+    tcg_gen_movcond_i32(TCG_COND_EQ, b, b, zero, one, b);
+
+    tcg_gen_ext_i32_i64(val1, a);
+    tcg_gen_extu_i32_i64(val2, b);
+
+    /* (a << 32)/b */
+    tcg_gen_shli_i64(val1, val1, 32);
+    tcg_gen_divu_i64(val1, val1, val2);
+
+    tcg_gen_extrh_i64_i32(h, val1);
+    tcg_gen_extrl_i64_i32(l, val1);
+    /* if result is undefined (quotient doesn't fit in 32 bits) return 0 */
+    tcg_gen_movcond_i32(TCG_COND_EQ, t, h, zero, l, zero);
+    tcg_temp_free_i32(h);
+    tcg_temp_free_i32(l);
+    tcg_temp_free_i64(val1);
+    tcg_temp_free_i64(val2);
+}
+
+TRANS_VDIV_VMOD(ISA310, VDIVESW, MO_32, do_vx_dives_i32, NULL)
+TRANS_VDIV_VMOD(ISA310, VDIVEUW, MO_32, do_vx_diveu_i32, NULL)
+
 #undef GEN_VR_LDX
 #undef GEN_VR_STX
 #undef GEN_VR_LVE
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 06/10] Implemented unsigned 256-by-128 division
  2022-03-30 20:25 [PATCH 00/10] VDIV/VMOD Implementation Lucas Mateus Castro(alqotel)
                   ` (4 preceding siblings ...)
  2022-03-30 20:25 ` [PATCH 05/10] target/ppc: Implemented vector divide extended word Lucas Mateus Castro(alqotel)
@ 2022-03-30 20:25 ` Lucas Mateus Castro(alqotel)
  2022-03-30 20:25 ` [PATCH 07/10] Implemented signed " Lucas Mateus Castro(alqotel)
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Lucas Mateus Castro(alqotel) @ 2022-03-30 20:25 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: Peter Maydell, David Hildenbrand, Matheus Ferst, danielhb413,
	richard.henderson, Luis Pires, Philippe Mathieu-Daudé,
	Lucas Mateus Castro (alqotel), clg, Alex Bennée,
	David Gibson

From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>

Based on already existing QEMU implementation, created an unsigned 256
bit by 128 bit division needed to implement the vector divide extended
unsigned instruction from PowerISA3.1

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
---
 include/qemu/host-utils.h |  15 +++++
 include/qemu/int128.h     |  20 ++++++
 util/host-utils.c         | 128 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 163 insertions(+)

diff --git a/include/qemu/host-utils.h b/include/qemu/host-utils.h
index ca979dc6cc..6479403935 100644
--- a/include/qemu/host-utils.h
+++ b/include/qemu/host-utils.h
@@ -32,6 +32,7 @@
 
 #include "qemu/compiler.h"
 #include "qemu/bswap.h"
+#include "qemu/int128.h"
 
 #ifdef CONFIG_INT128
 static inline void mulu64(uint64_t *plow, uint64_t *phigh,
@@ -142,6 +143,19 @@ static inline int clz64(uint64_t val)
     return val ? __builtin_clzll(val) : 64;
 }
 
+/**
+ * clz128 - count leading zeros in a 128-bit value.
+ * @val: The value to search
+ */
+static inline int clz128(Int128 a)
+{
+    if (int128_gethi(a)) {
+        return clz64(int128_gethi(a));
+    } else {
+        return clz64(int128_getlo(a)) + 64;
+    }
+}
+
 /**
  * clo64 - count leading ones in a 64-bit value.
  * @val: The value to search
@@ -849,4 +863,5 @@ static inline uint64_t udiv_qrnnd(uint64_t *r, uint64_t n1,
 #endif
 }
 
+Int128 divu256(Int128 *plow, Int128 *phigh, Int128 divisor);
 #endif
diff --git a/include/qemu/int128.h b/include/qemu/int128.h
index ca32b0b276..b1eb094525 100644
--- a/include/qemu/int128.h
+++ b/include/qemu/int128.h
@@ -128,11 +128,21 @@ static inline bool int128_ge(Int128 a, Int128 b)
     return a >= b;
 }
 
+static inline bool int128_uge(Int128 a, Int128 b)
+{
+    return ((__uint128_t)a) >= ((__uint128_t)b);
+}
+
 static inline bool int128_lt(Int128 a, Int128 b)
 {
     return a < b;
 }
 
+static inline bool int128_ult(Int128 a, Int128 b)
+{
+    return (__uint128_t)a < (__uint128_t)b;
+}
+
 static inline bool int128_le(Int128 a, Int128 b)
 {
     return a <= b;
@@ -373,11 +383,21 @@ static inline bool int128_ge(Int128 a, Int128 b)
     return a.hi > b.hi || (a.hi == b.hi && a.lo >= b.lo);
 }
 
+static inline bool int128_uge(Int128 a, Int128 b)
+{
+    return (uint64_t)a.hi > (uint64_t)b.hi || (a.hi == b.hi && a.lo >= b.lo);
+}
+
 static inline bool int128_lt(Int128 a, Int128 b)
 {
     return !int128_ge(a, b);
 }
 
+static inline bool int128_ult(Int128 a, Int128 b)
+{
+    return !int128_uge(a, b);
+}
+
 static inline bool int128_le(Int128 a, Int128 b)
 {
     return int128_ge(b, a);
diff --git a/util/host-utils.c b/util/host-utils.c
index bcc772b8ec..a495cc820d 100644
--- a/util/host-utils.c
+++ b/util/host-utils.c
@@ -266,3 +266,131 @@ void ulshift(uint64_t *plow, uint64_t *phigh, int32_t shift, bool *overflow)
         *plow = *plow << shift;
     }
 }
+/*
+ * Unsigned 256-by-128 division.
+ * Returns the remainder via r.
+ * Returns lower 128 bit of quotient.
+ * Needs a normalized divisor (most significant bit set to 1).
+ *
+ * Adapted from include/qemu/host-utils.h udiv_qrnnd,
+ * from the GNU Multi Precision Library - longlong.h __udiv_qrnnd
+ * (https://gmplib.org/repo/gmp/file/tip/longlong.h)
+ *
+ * Licensed under the GPLv2/LGPLv3
+ */
+static Int128 udiv256_qrnnd(Int128 *r, Int128 n1, Int128 n0, Int128 d)
+{
+    Int128 d0, d1, q0, q1, r1, r0, m;
+    uint64_t mp0, mp1;
+
+    d0 = int128_make64(int128_getlo(d));
+    d1 = int128_make64(int128_gethi(d));
+
+    r1 = int128_remu(n1, d1);
+    q1 = int128_divu(n1, d1);
+    mp0 = int128_getlo(q1);
+    mp1 = int128_gethi(q1);
+    mulu128(&mp0, &mp1, int128_getlo(d0));
+    m = int128_make128(mp0, mp1);
+    r1 = int128_make128(int128_gethi(n0), int128_getlo(r1));
+    if (int128_ult(r1, m)) {
+        q1 = int128_sub(q1, int128_one());
+        r1 = int128_add(r1, d);
+        if (int128_uge(r1, d)) {
+            if (int128_ult(r1, m)) {
+                q1 = int128_sub(q1, int128_one());
+                r1 = int128_add(r1, d);
+            }
+        }
+    }
+    r1 = int128_sub(r1, m);
+
+    r0 = int128_remu(r1, d1);
+    q0 = int128_divu(r1, d1);
+    mp0 = int128_getlo(q0);
+    mp1 = int128_gethi(q0);
+    mulu128(&mp0, &mp1, int128_getlo(d0));
+    m = int128_make128(mp0, mp1);
+    r0 = int128_make128(int128_getlo(n0), int128_getlo(r0));
+    if (int128_ult(r0, m)) {
+        q0 = int128_sub(q0, int128_one());
+        r0 = int128_add(r0, d);
+        if (int128_uge(r0, d)) {
+            if (int128_ult(r0, m)) {
+                q0 = int128_sub(q0, int128_one());
+                r0 = int128_add(r0, d);
+            }
+        }
+    }
+    r0 = int128_sub(r0, m);
+
+    *r = r0;
+    return int128_or(int128_lshift(q1, 64), q0);
+}
+
+/*
+ * Unsigned 256-by-128 division.
+ * Returns the remainder.
+ * Returns quotient via plow and phigh.
+ * Also returns the remainder via the function return value.
+ */
+Int128 divu256(Int128 *plow, Int128 *phigh, Int128 divisor)
+{
+    Int128 dhi = *phigh;
+    Int128 dlo = *plow;
+    Int128 rem, dhighest;
+    int sh;
+
+    if (!int128_nz(divisor) || !int128_nz(dhi)) {
+        *plow  = int128_divu(dlo, divisor);
+        *phigh = int128_zero();
+        return int128_remu(dlo, divisor);
+    } else {
+        sh = clz128(divisor);
+
+        if (int128_ult(dhi, divisor)) {
+            if (sh != 0) {
+                /* normalize the divisor, shifting the dividend accordingly */
+                divisor = int128_lshift(divisor, sh);
+                dhi = int128_or(int128_lshift(dhi, sh),
+                                int128_urshift(dlo, (128 - sh)));
+                dlo = int128_lshift(dlo, sh);
+            }
+
+            *phigh = int128_zero();
+            *plow = udiv256_qrnnd(&rem, dhi, dlo, divisor);
+        } else {
+            if (sh != 0) {
+                /* normalize the divisor, shifting the dividend accordingly */
+                divisor = int128_lshift(divisor, sh);
+                dhighest = int128_rshift(dhi, (128 - sh));
+                dhi = int128_or(int128_lshift(dhi, sh),
+                                int128_urshift(dlo, (128 - sh)));
+                dlo = int128_lshift(dlo, sh);
+
+                *phigh = udiv256_qrnnd(&dhi, dhighest, dhi, divisor);
+            } else {
+                /**
+                 * dhi >= divisor
+                 * Since the MSB of divisor is set (sh == 0),
+                 * (dhi - divisor) < divisor
+                 *
+                 * Thus, the high part of the quotient is 1, and we can
+                 * calculate the low part with a single call to udiv_qrnnd
+                 * after subtracting divisor from dhi
+                 */
+                dhi = int128_sub(dhi, divisor);
+                *phigh = int128_one();
+            }
+
+            *plow = udiv256_qrnnd(&rem, dhi, dlo, divisor);
+        }
+
+        /*
+         * since the dividend/divisor might have been normalized,
+         * the remainder might also have to be shifted back
+         */
+        rem = int128_urshift(rem, sh);
+        return rem;
+    }
+}
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 07/10] Implemented signed 256-by-128 division
  2022-03-30 20:25 [PATCH 00/10] VDIV/VMOD Implementation Lucas Mateus Castro(alqotel)
                   ` (5 preceding siblings ...)
  2022-03-30 20:25 ` [PATCH 06/10] Implemented unsigned 256-by-128 division Lucas Mateus Castro(alqotel)
@ 2022-03-30 20:25 ` Lucas Mateus Castro(alqotel)
  2022-03-30 21:45   ` Richard Henderson
  2022-03-30 20:25 ` [PATCH 08/10] target/ppc: Implemented remaining vector divide extended Lucas Mateus Castro(alqotel)
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Lucas Mateus Castro(alqotel) @ 2022-03-30 20:25 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: Eduardo Habkost, danielhb413, richard.henderson, Luis Pires,
	Lucas Mateus Castro (alqotel), clg, Alex Bennée,
	David Gibson

From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>

Based on already existing QEMU implementation created a signed
256 bit by 128 bit division needed to implement the vector divide
extended signed quadword instruction from PowerISA 3.1

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
---
 include/qemu/host-utils.h |  1 +
 util/host-utils.c         | 51 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/include/qemu/host-utils.h b/include/qemu/host-utils.h
index 6479403935..e9b963333f 100644
--- a/include/qemu/host-utils.h
+++ b/include/qemu/host-utils.h
@@ -864,4 +864,5 @@ static inline uint64_t udiv_qrnnd(uint64_t *r, uint64_t n1,
 }
 
 Int128 divu256(Int128 *plow, Int128 *phigh, Int128 divisor);
+Int128 divs256(Int128 *plow, Int128 *phigh, Int128 divisor);
 #endif
diff --git a/util/host-utils.c b/util/host-utils.c
index a495cc820d..75322dc836 100644
--- a/util/host-utils.c
+++ b/util/host-utils.c
@@ -394,3 +394,54 @@ Int128 divu256(Int128 *plow, Int128 *phigh, Int128 divisor)
         return rem;
     }
 }
+
+/*
+ * Signed 256-by-128 division.
+ * Returns quotient via plow and phigh.
+ * Also returns the remainder via the function return value.
+ */
+Int128 divs256(Int128 *plow, Int128 *phigh, Int128 divisor)
+{
+    bool neg_quotient = false, neg_remainder = false;
+    Int128 unsig_hi = *phigh, unsig_lo = *plow;
+    Int128 rem;
+
+    if (!int128_nonneg(*phigh)) {
+        neg_quotient = !neg_quotient;
+        neg_remainder = !neg_remainder;
+
+        if (!int128_nz(unsig_lo)) {
+            unsig_hi = int128_neg(unsig_hi);
+        } else {
+            unsig_hi = int128_not(unsig_hi);
+            unsig_lo = int128_neg(unsig_lo);
+        }
+    }
+
+    if (!int128_nonneg(divisor)) {
+        neg_quotient = !neg_quotient;
+
+        divisor = int128_neg(divisor);
+    }
+
+    rem = divu256(&unsig_lo, &unsig_hi, divisor);
+
+    if (neg_quotient) {
+        if (!int128_nz(unsig_lo)) {
+            *phigh = int128_neg(unsig_hi);
+            *plow = int128_zero();
+        } else {
+            *phigh = int128_not(unsig_hi);
+            *plow = int128_neg(unsig_lo);
+        }
+    } else {
+        *phigh = unsig_hi;
+        *plow = unsig_lo;
+    }
+
+    if (neg_remainder) {
+        return int128_neg(rem);
+    } else {
+        return rem;
+    }
+}
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 08/10] target/ppc: Implemented remaining vector divide extended
  2022-03-30 20:25 [PATCH 00/10] VDIV/VMOD Implementation Lucas Mateus Castro(alqotel)
                   ` (6 preceding siblings ...)
  2022-03-30 20:25 ` [PATCH 07/10] Implemented signed " Lucas Mateus Castro(alqotel)
@ 2022-03-30 20:25 ` Lucas Mateus Castro(alqotel)
  2022-03-30 20:25 ` [PATCH 09/10] target/ppc: Implemented vector module word/doubleword Lucas Mateus Castro(alqotel)
  2022-03-30 20:25 ` [PATCH 10/10] target/ppc: Implemented vector module quadword Lucas Mateus Castro(alqotel)
  9 siblings, 0 replies; 16+ messages in thread
From: Lucas Mateus Castro(alqotel) @ 2022-03-30 20:25 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: danielhb413, richard.henderson, Greg Kurz,
	Lucas Mateus Castro (alqotel), clg, David Gibson

From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>

Implement the following PowerISA v3.1 instructions:
vdivesd: Vector Divide Extended Signed Doubleword
vdiveud: Vector Divide Extended Unsigned Doubleword
vdivesq: Vector Divide Extended Signed Quadword
vdiveuq: Vector Divide Extended Unsigned Quadword
Undefined behavior based on mambo.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
---
 include/qemu/int128.h               | 10 ++++
 target/ppc/helper.h                 |  4 ++
 target/ppc/insn32.decode            |  4 ++
 target/ppc/int_helper.c             | 73 +++++++++++++++++++++++++++++
 target/ppc/translate/vmx-impl.c.inc |  4 ++
 5 files changed, 95 insertions(+)

diff --git a/include/qemu/int128.h b/include/qemu/int128.h
index b1eb094525..cbafd5a60f 100644
--- a/include/qemu/int128.h
+++ b/include/qemu/int128.h
@@ -158,6 +158,11 @@ static inline bool int128_nz(Int128 a)
     return a != 0;
 }
 
+static inline Int128 int128_abs(Int128 a)
+{
+    return a < 0 ? -a : a;
+}
+
 static inline Int128 int128_min(Int128 a, Int128 b)
 {
     return a < b ? a : b;
@@ -413,6 +418,11 @@ static inline bool int128_nz(Int128 a)
     return a.lo || a.hi;
 }
 
+static inline Int128 int128_abs(Int128 a)
+{
+    return int128_nonneg(a) ? a : int128_neg(a);
+}
+
 static inline Int128 int128_min(Int128 a, Int128 b)
 {
     return int128_le(a, b) ? a : b;
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 4cfdf7b3ec..67ecff2c9a 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -173,6 +173,10 @@ DEF_HELPER_FLAGS_3(VMULOUH, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VMULOUW, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VDIVSQ, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VDIVUQ, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VDIVESD, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VDIVEUD, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VDIVESQ, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VDIVEUQ, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_3(vslo, void, avr, avr, avr)
 DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 8c115c9c60..3eb920ac76 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -715,3 +715,7 @@ VDIVUQ          000100 ..... ..... ..... 00000001011    @VX
 
 VDIVESW         000100 ..... ..... ..... 01110001011    @VX
 VDIVEUW         000100 ..... ..... ..... 01010001011    @VX
+VDIVESD         000100 ..... ..... ..... 01111001011    @VX
+VDIVEUD         000100 ..... ..... ..... 01011001011    @VX
+VDIVESQ         000100 ..... ..... ..... 01100001011    @VX
+VDIVEUQ         000100 ..... ..... ..... 01000001011    @VX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 18e5430e00..de9bda8132 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1054,6 +1054,79 @@ void helper_VDIVUQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
     }
 }
 
+void helper_VDIVESD(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    int i;
+    int64_t high;
+    uint64_t low;
+    for (i = 0; i < 2; i++) {
+        high = a->s64[i];
+        low = 0;
+        if (unlikely(uabs64(a->s64[i]) >= uabs64(b->s64[i]) || !b->s64[i])) {
+            t->s64[i] = 0; /* Undefined behavior */
+        } else {
+            divs128(&low, &high, b->s64[i]);
+            if (unlikely((low >= INT64_MAX && high != -1) ||
+                         (low < INT64_MAX && high == -1))) {
+                t->s64[i] = 0; /* Undefined behavior */
+            } else {
+                t->s64[i] = low;
+            }
+        }
+    }
+}
+
+void helper_VDIVEUD(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    int i;
+    uint64_t high, low;
+    for (i = 0; i < 2; i++) {
+        high = a->u64[i];
+        low = 0;
+        if (unlikely(high >= b->u64[i] || !b->u64[i])) {
+            t->u64[i] = 0; /* Undefined behavior */
+        } else {
+            divu128(&low, &high, b->u64[i]);
+            t->u64[i] = low;
+        }
+    }
+}
+
+void helper_VDIVESQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    Int128 high, low;
+
+    high = a->s128;
+    low = int128_zero();
+    if (unlikely(!int128_nz(b->s128) ||
+                 int128_uge(int128_abs(high), int128_abs(b->s128)))) {
+        t->s128 = int128_zero(); /* Undefined behavior */
+    } else {
+        divs256(&low, &high, b->s128);
+        if (unlikely(
+                (!int128_nonneg(low) && !int128_eq(high, int128_makes64(-1))) ||
+                (int128_nonneg(low) && int128_eq(high, int128_makes64(-1))))) {
+            t->s128 = int128_zero(); /* Undefined behavior */
+        } else {
+            t->s128 = low;
+        }
+    }
+}
+
+void helper_VDIVEUQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    Int128 dhigh, dlow;
+
+    dhigh = a->s128;
+    dlow = int128_zero();
+    if (unlikely(!int128_nz(b->s128) || int128_uge(a->s128, b->s128))) {
+        t->s128 = int128_zero(); /* Undefined behavior */
+    } else {
+        divu256(&dlow, &dhigh, b->s128);
+        t->s128 = dlow;
+    }
+}
+
 void helper_VPERM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
 {
     ppc_avr_t result;
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 752f3af659..62b2fcd45c 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -3351,6 +3351,10 @@ static void do_vx_diveu_i32(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b)
 
 TRANS_VDIV_VMOD(ISA310, VDIVESW, MO_32, do_vx_dives_i32, NULL)
 TRANS_VDIV_VMOD(ISA310, VDIVEUW, MO_32, do_vx_diveu_i32, NULL)
+TRANS_FLAGS2(ISA310, VDIVESD, do_vx_helper, gen_helper_VDIVESD)
+TRANS_FLAGS2(ISA310, VDIVEUD, do_vx_helper, gen_helper_VDIVEUD)
+TRANS_FLAGS2(ISA310, VDIVESQ, do_vx_helper, gen_helper_VDIVESQ)
+TRANS_FLAGS2(ISA310, VDIVEUQ, do_vx_helper, gen_helper_VDIVEUQ)
 
 #undef GEN_VR_LDX
 #undef GEN_VR_STX
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 09/10] target/ppc: Implemented vector module word/doubleword
  2022-03-30 20:25 [PATCH 00/10] VDIV/VMOD Implementation Lucas Mateus Castro(alqotel)
                   ` (7 preceding siblings ...)
  2022-03-30 20:25 ` [PATCH 08/10] target/ppc: Implemented remaining vector divide extended Lucas Mateus Castro(alqotel)
@ 2022-03-30 20:25 ` Lucas Mateus Castro(alqotel)
  2022-03-30 20:25 ` [PATCH 10/10] target/ppc: Implemented vector module quadword Lucas Mateus Castro(alqotel)
  9 siblings, 0 replies; 16+ messages in thread
From: Lucas Mateus Castro(alqotel) @ 2022-03-30 20:25 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: danielhb413, richard.henderson, Greg Kurz,
	Lucas Mateus Castro (alqotel), clg, David Gibson

From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>

Implement the following PowerISA v3.1 instructions:
vmodsw: Vector Modulo Signed Word
vmoduw: Vector Modulo Unsigned Word
vmodsd: Vector Modulo Signed Doubleword
vmodud: Vector Modulo Unsigned Doubleword
Hardware behavior based on mambo.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
---
 target/ppc/insn32.decode            |  5 +++++
 target/ppc/translate/vmx-impl.c.inc | 26 ++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 3eb920ac76..36b42e41d2 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -719,3 +719,8 @@ VDIVESD         000100 ..... ..... ..... 01111001011    @VX
 VDIVEUD         000100 ..... ..... ..... 01011001011    @VX
 VDIVESQ         000100 ..... ..... ..... 01100001011    @VX
 VDIVEUQ         000100 ..... ..... ..... 01000001011    @VX
+
+VMODSW          000100 ..... ..... ..... 11110001011    @VX
+VMODUW          000100 ..... ..... ..... 11010001011    @VX
+VMODSD          000100 ..... ..... ..... 11111001011    @VX
+VMODUD          000100 ..... ..... ..... 11011001011    @VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 62b2fcd45c..ed01d91b87 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -3349,6 +3349,27 @@ static void do_vx_diveu_i32(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b)
     tcg_temp_free_i64(val2);
 }
 
+#define REM_VEC(NAME, SZ, REM)                                          \
+static void do_vx_##NAME(TCGv_##SZ t, TCGv_##SZ a, TCGv_##SZ b)         \
+{                                                                       \
+    TCGv_##SZ zero = tcg_constant_##SZ(0), one = tcg_constant_##SZ(1);  \
+    /*                                                                  \
+     *  If N%0 the instruction used by the backend might deliver        \
+     *  a signal to the process and the hardware returns 0 when         \
+     *  N%0, so if b = 0 return 0%1                                     \
+     */                                                                 \
+    tcg_gen_movcond_##SZ(TCG_COND_EQ, a, b, zero, zero, a);             \
+    tcg_gen_movcond_##SZ(TCG_COND_EQ, b, b, zero, one, b);              \
+    REM(t, a, b);                                                       \
+}
+
+REM_VEC(rem_i32 , i32, tcg_gen_rem_i32)
+REM_VEC(remu_i32, i32, tcg_gen_remu_i32)
+REM_VEC(rem_i64 , i64, tcg_gen_rem_i64)
+REM_VEC(remu_i64, i64, tcg_gen_remu_i64)
+
+#undef REM_VEC
+
 TRANS_VDIV_VMOD(ISA310, VDIVESW, MO_32, do_vx_dives_i32, NULL)
 TRANS_VDIV_VMOD(ISA310, VDIVEUW, MO_32, do_vx_diveu_i32, NULL)
 TRANS_FLAGS2(ISA310, VDIVESD, do_vx_helper, gen_helper_VDIVESD)
@@ -3356,6 +3377,11 @@ TRANS_FLAGS2(ISA310, VDIVEUD, do_vx_helper, gen_helper_VDIVEUD)
 TRANS_FLAGS2(ISA310, VDIVESQ, do_vx_helper, gen_helper_VDIVESQ)
 TRANS_FLAGS2(ISA310, VDIVEUQ, do_vx_helper, gen_helper_VDIVEUQ)
 
+TRANS_VDIV_VMOD(ISA310, VMODSW, MO_32, do_vx_rem_i32 , NULL)
+TRANS_VDIV_VMOD(ISA310, VMODUW, MO_32, do_vx_remu_i32, NULL)
+TRANS_VDIV_VMOD(ISA310, VMODSD, MO_64, NULL, do_vx_rem_i64)
+TRANS_VDIV_VMOD(ISA310, VMODUD, MO_64, NULL, do_vx_remu_i64)
+
 #undef GEN_VR_LDX
 #undef GEN_VR_STX
 #undef GEN_VR_LVE
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 10/10] target/ppc: Implemented vector module quadword
  2022-03-30 20:25 [PATCH 00/10] VDIV/VMOD Implementation Lucas Mateus Castro(alqotel)
                   ` (8 preceding siblings ...)
  2022-03-30 20:25 ` [PATCH 09/10] target/ppc: Implemented vector module word/doubleword Lucas Mateus Castro(alqotel)
@ 2022-03-30 20:25 ` Lucas Mateus Castro(alqotel)
  9 siblings, 0 replies; 16+ messages in thread
From: Lucas Mateus Castro(alqotel) @ 2022-03-30 20:25 UTC (permalink / raw)
  To: qemu-devel, qemu-ppc
  Cc: danielhb413, richard.henderson, Greg Kurz,
	Lucas Mateus Castro (alqotel), clg, David Gibson

From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>

Implement the following PowerISA v3.1 instructions:
vmodsq: Vector Modulo Signed Quadword
vmoduq: Vector Modulo Unsigned Quadword
Undefined behavior based on mambo.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
---
 target/ppc/helper.h                 |  2 ++
 target/ppc/insn32.decode            |  2 ++
 target/ppc/int_helper.c             | 18 ++++++++++++++++++
 target/ppc/translate/vmx-impl.c.inc |  2 ++
 4 files changed, 24 insertions(+)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 67ecff2c9a..881e03959a 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -177,6 +177,8 @@ DEF_HELPER_FLAGS_3(VDIVESD, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VDIVEUD, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VDIVESQ, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VDIVEUQ, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMODSQ, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMODUQ, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_3(vslo, void, avr, avr, avr)
 DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 36b42e41d2..b53efe1915 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -724,3 +724,5 @@ VMODSW          000100 ..... ..... ..... 11110001011    @VX
 VMODUW          000100 ..... ..... ..... 11010001011    @VX
 VMODSD          000100 ..... ..... ..... 11111001011    @VX
 VMODUD          000100 ..... ..... ..... 11011001011    @VX
+VMODSQ          000100 ..... ..... ..... 11100001011    @VX
+VMODUQ          000100 ..... ..... ..... 11000001011    @VX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index de9bda8132..5e4fcaa357 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1127,6 +1127,24 @@ void helper_VDIVEUQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
     }
 }
 
+void helper_VMODSQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    if (likely(int128_nz(b->s128))) {
+        t->s128 = int128_rems(a->s128, b->s128);
+    } else {
+        t->s128 = int128_zero(); /* Undefined behavior */
+    }
+}
+
+void helper_VMODUQ(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b)
+{
+    if (likely(int128_nz(b->s128))) {
+        t->s128 = int128_remu(a->s128, b->s128);
+    } else {
+        t->s128 = int128_zero(); /* Undefined behavior */
+    }
+}
+
 void helper_VPERM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
 {
     ppc_avr_t result;
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index ed01d91b87..b62a77f531 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -3381,6 +3381,8 @@ TRANS_VDIV_VMOD(ISA310, VMODSW, MO_32, do_vx_rem_i32 , NULL)
 TRANS_VDIV_VMOD(ISA310, VMODUW, MO_32, do_vx_remu_i32, NULL)
 TRANS_VDIV_VMOD(ISA310, VMODSD, MO_64, NULL, do_vx_rem_i64)
 TRANS_VDIV_VMOD(ISA310, VMODUD, MO_64, NULL, do_vx_remu_i64)
+TRANS_FLAGS2(ISA310, VMODSQ, do_vx_helper, gen_helper_VMODSQ)
+TRANS_FLAGS2(ISA310, VMODUQ, do_vx_helper, gen_helper_VMODUQ)
 
 #undef GEN_VR_LDX
 #undef GEN_VR_STX
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 03/10] target/ppc: Implemented vector divide instructions
  2022-03-30 20:25 ` [PATCH 03/10] target/ppc: Implemented vector divide instructions Lucas Mateus Castro(alqotel)
@ 2022-03-30 21:06   ` Richard Henderson
  2022-03-31 18:28     ` Lucas Mateus Martins Araujo e Castro
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Henderson @ 2022-03-30 21:06 UTC (permalink / raw)
  To: Lucas Mateus Castro(alqotel), qemu-devel, qemu-ppc
  Cc: Greg Kurz, danielhb413, clg, David Gibson

On 3/30/22 14:25, Lucas Mateus Castro(alqotel) wrote:
> +#define TRANS_VDIV_VMOD(FLAGS, NAME, VECE, FNI4_FUNC, FNI8_FUNC)        \
> +static bool trans_##NAME(DisasContext *ctx, arg_VX *a)                  \
> +{                                                                       \
> +    static const GVecGen3 op[2] = {                                     \
> +        {                                                               \
> +            .fni4 = FNI4_FUNC,                                          \
> +            .fni8 = FNI8_FUNC,                                          \
> +            .vece = MO_32                                               \
> +        },                                                              \
> +        {                                                               \
> +            .fni4 = FNI4_FUNC,                                          \
> +            .fni8 = FNI8_FUNC,                                          \
> +            .vece = MO_64                                               \
> +        },                                                              \
> +    };                                                                  \

There is zero point in having a two element array here:
(1) VECE is a constant
(2) The unused array element is actively wrong.

> +#define DIV_VEC(NAME, SZ, DIV)                                          \
> +static void do_vx_##NAME(TCGv_##SZ t, TCGv_##SZ a, TCGv_##SZ b)         \
> +{                                                                       \
> +    TCGv_##SZ zero = tcg_constant_##SZ(0), one = tcg_constant_##SZ(1);  \
> +    /*                                                                  \
> +     *  If N/0 the instruction used by the backend might deliver        \
> +     *  a signal to the process and the hardware returns 0 when         \
> +     *  N/0, so if b = 0 return 0/1                                     \
> +     */                                                                 \
> +    tcg_gen_movcond_##SZ(TCG_COND_EQ, a, b, zero, zero, a);             \
> +    tcg_gen_movcond_##SZ(TCG_COND_EQ, b, b, zero, one, b);              \
> +    DIV(t, a, b);                                                       \
> +}

The manual says N/0 = undefined.  I don't think it's important to require 0.

The signed versions still need to check for int_min / -1, which will fault on x86. 
Compare vs gen_op_arith_div{w,d}.


r~


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 01/10] qemu/int128: avoid undefined behavior in int128_lshift
  2022-03-30 20:25 ` [PATCH 01/10] qemu/int128: avoid undefined behavior in int128_lshift Lucas Mateus Castro(alqotel)
@ 2022-03-30 21:21   ` Peter Maydell
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Maydell @ 2022-03-30 21:21 UTC (permalink / raw)
  To: Lucas Mateus Castro(alqotel)
  Cc: danielhb413, richard.henderson, qemu-devel,
	Philippe Mathieu-Daudé, qemu-ppc, clg,
	Frédéric Pétrot, Matheus Ferst

On Wed, 30 Mar 2022 at 21:26, Lucas Mateus Castro(alqotel)
<lucas.araujo@eldorado.org.br> wrote:
>
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Avoid the left shift of negative values in int128_lshift by casting
> a/a.hi to unsigned.

We compile with -fwrapv, so left shift of negative integers
should always be well-defined for us.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 05/10] target/ppc: Implemented vector divide extended word
  2022-03-30 20:25 ` [PATCH 05/10] target/ppc: Implemented vector divide extended word Lucas Mateus Castro(alqotel)
@ 2022-03-30 21:24   ` Richard Henderson
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Henderson @ 2022-03-30 21:24 UTC (permalink / raw)
  To: Lucas Mateus Castro(alqotel), qemu-devel, qemu-ppc
  Cc: Greg Kurz, danielhb413, clg, David Gibson

On 3/30/22 14:25, Lucas Mateus Castro(alqotel) wrote:
> +static void do_vx_dives_i32(TCGv_i32 t, TCGv_i32 a, TCGv_i32 b)
> +{
> +    TCGv_i64 res, val1, val2;
> +    TCGv_i64 zero = tcg_constant_i64(0);
> +    TCGv_i64 one =  tcg_constant_i64(1);
> +
> +    res = tcg_temp_new_i64();
> +    val1 = tcg_temp_new_i64();
> +    val2 = tcg_temp_new_i64();
> +
> +    tcg_gen_ext_i32_i64(val1, a);
> +    tcg_gen_ext_i32_i64(val2, b);
> +
> +    /* return 0 if b = 0, so make b = 1 so the result doesn't fit in 32 bits*/
> +    tcg_gen_movcond_i64(TCG_COND_EQ, val2, val2, zero, one, val2);

Need int_min / -1 check.

> +    /* (a << 32)/b */
> +    tcg_gen_shli_i64(val1, val1, 32);
> +    tcg_gen_div_i64(res, val1, val2);
> +
> +    tcg_gen_ext32s_i64(val1, res);
> +    /* if result is undefined (quotient doesn't fit in 32 bits) return 0 */
> +    tcg_gen_movcond_i64(TCG_COND_EQ, res, res, val1, res, zero);

Again, I don't see the point in producing 0 for undefined.

> +    tcg_gen_ext_i32_i64(val1, a);
> +    tcg_gen_extu_i32_i64(val2, b);

Better with extu for val1, just because a is logically unsigned.


r~


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 07/10] Implemented signed 256-by-128 division
  2022-03-30 20:25 ` [PATCH 07/10] Implemented signed " Lucas Mateus Castro(alqotel)
@ 2022-03-30 21:45   ` Richard Henderson
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Henderson @ 2022-03-30 21:45 UTC (permalink / raw)
  To: Lucas Mateus Castro(alqotel), qemu-devel, qemu-ppc
  Cc: Eduardo Habkost, danielhb413, Luis Pires, clg, Alex Bennée,
	David Gibson

On 3/30/22 14:25, Lucas Mateus Castro(alqotel) wrote:
> From: "Lucas Mateus Castro (alqotel)"<lucas.araujo@eldorado.org.br>
> 
> Based on already existing QEMU implementation created a signed
> 256 bit by 128 bit division needed to implement the vector divide
> extended signed quadword instruction from PowerISA 3.1
> 
> Signed-off-by: Lucas Mateus Castro (alqotel)<lucas.araujo@eldorado.org.br>
> ---
>   include/qemu/host-utils.h |  1 +
>   util/host-utils.c         | 51 +++++++++++++++++++++++++++++++++++++++
>   2 files changed, 52 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 03/10] target/ppc: Implemented vector divide instructions
  2022-03-30 21:06   ` Richard Henderson
@ 2022-03-31 18:28     ` Lucas Mateus Martins Araujo e Castro
  0 siblings, 0 replies; 16+ messages in thread
From: Lucas Mateus Martins Araujo e Castro @ 2022-03-31 18:28 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-ppc
  Cc: Greg Kurz, danielhb413, clg, David Gibson

[-- Attachment #1: Type: text/plain, Size: 3293 bytes --]


On 30/03/2022 18:06, Richard Henderson wrote:
>
> On 3/30/22 14:25, Lucas Mateus Castro(alqotel) wrote:
>> +#define TRANS_VDIV_VMOD(FLAGS, NAME, VECE, FNI4_FUNC, 
>> FNI8_FUNC)        \
>> +static bool trans_##NAME(DisasContext *ctx, arg_VX 
>> *a)                  \
>> +{ \
>> +    static const GVecGen3 op[2] = 
>> {                                     \
>> + { \
>> +            .fni4 = 
>> FNI4_FUNC,                                          \
>> +            .fni8 = 
>> FNI8_FUNC,                                          \
>> +            .vece = 
>> MO_32                                               \
>> + }, \
>> + { \
>> +            .fni4 = 
>> FNI4_FUNC,                                          \
>> +            .fni8 = 
>> FNI8_FUNC,                                          \
>> +            .vece = 
>> MO_64                                               \
>> + }, \
>> + }; \
>
> There is zero point in having a two element array here:
> (1) VECE is a constant
> (2) The unused array element is actively wrong.
Ok, I'll set VECE based on which function is NULL
>
>> +#define DIV_VEC(NAME, SZ, 
>> DIV)                                          \
>> +static void do_vx_##NAME(TCGv_##SZ t, TCGv_##SZ a, TCGv_##SZ 
>> b)         \
>> +{ \
>> +    TCGv_##SZ zero = tcg_constant_##SZ(0), one = 
>> tcg_constant_##SZ(1);  \
>> + /* \
>> +     *  If N/0 the instruction used by the backend might 
>> deliver        \
>> +     *  a signal to the process and the hardware returns 0 
>> when         \
>> +     *  N/0, so if b = 0 return 
>> 0/1                                     \
>> + */ \
>> +    tcg_gen_movcond_##SZ(TCG_COND_EQ, a, b, zero, zero, 
>> a);             \
>> +    tcg_gen_movcond_##SZ(TCG_COND_EQ, b, b, zero, one, 
>> b);              \
>> +    DIV(t, a, 
>> b);                                                       \
>> +}
>
> The manual says N/0 = undefined.  I don't think it's important to 
> require 0.
My idea here was mostly to mimic the hardware behavior, testing on a 
Power9 both divw and divd result in 0 when N/0 and mambo results in 0 in 
vdiv* and vmod* when N/0, but yeah the PowerISA just said that it's 
undefined. I'll just set b = 1 if N/0 or int_min/-1 in v2 then.
>
> The signed versions still need to check for int_min / -1, which will 
> fault on x86.
> Compare vs gen_op_arith_div{w,d}.
My mistake, I'll add this check in v2
>
>
> r~
-- 
Lucas Mateus M. Araujo e Castro
Instituto de Pesquisas ELDORADO 
<https://www.eldorado.org.br/?utm_campaign=assinatura_de_e-mail&utm_medium=email&utm_source=RD+Station>
Departamento Computação Embarcada
Analista de Software Trainee
Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>

[-- Attachment #2: Type: text/html, Size: 6298 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-03-31 18:35 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-03-30 20:25 [PATCH 00/10] VDIV/VMOD Implementation Lucas Mateus Castro(alqotel)
2022-03-30 20:25 ` [PATCH 01/10] qemu/int128: avoid undefined behavior in int128_lshift Lucas Mateus Castro(alqotel)
2022-03-30 21:21   ` Peter Maydell
2022-03-30 20:25 ` [PATCH 02/10] qemu/int128: add int128_urshift Lucas Mateus Castro(alqotel)
2022-03-30 20:25 ` [PATCH 03/10] target/ppc: Implemented vector divide instructions Lucas Mateus Castro(alqotel)
2022-03-30 21:06   ` Richard Henderson
2022-03-31 18:28     ` Lucas Mateus Martins Araujo e Castro
2022-03-30 20:25 ` [PATCH 04/10] target/ppc: Implemented vector divide quadword Lucas Mateus Castro(alqotel)
2022-03-30 20:25 ` [PATCH 05/10] target/ppc: Implemented vector divide extended word Lucas Mateus Castro(alqotel)
2022-03-30 21:24   ` Richard Henderson
2022-03-30 20:25 ` [PATCH 06/10] Implemented unsigned 256-by-128 division Lucas Mateus Castro(alqotel)
2022-03-30 20:25 ` [PATCH 07/10] Implemented signed " Lucas Mateus Castro(alqotel)
2022-03-30 21:45   ` Richard Henderson
2022-03-30 20:25 ` [PATCH 08/10] target/ppc: Implemented remaining vector divide extended Lucas Mateus Castro(alqotel)
2022-03-30 20:25 ` [PATCH 09/10] target/ppc: Implemented vector module word/doubleword Lucas Mateus Castro(alqotel)
2022-03-30 20:25 ` [PATCH 10/10] target/ppc: Implemented vector module quadword Lucas Mateus Castro(alqotel)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).