[Qemu-devel] [PATCH 0/7] POWER9 TCG enablements - BCD functions

qemu-devel.nongnu.org archive mirror

* [Qemu-devel] [PATCH 0/7] POWER9 TCG enablements - BCD functions - final part
@ 2016-12-03  4:59 Jose Ricardo Ziviani
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 1/7] target-ppc: Implement bcd_is_valid function Jose Ricardo Ziviani
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Jose Ricardo Ziviani @ 2016-12-03  4:59 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, nikunj, bharata

This series adds five new instructions for POWER9 (ISA 3.0), left/right shift helpers for unsigned quadwords, and a small helper that checks whether a BCD value is valid.

bcds.: Decimal signed shift
bcdus.: Decimal unsigned shift
bcdsr.: Decimal shift and round
bcdtrunc.: Decimal signed truncate
bcdutrunc.: Decimal unsigned truncate

Jose Ricardo Ziviani (7):
  target-ppc: Implement bcd_is_valid function
  target-ppc: Implement unsigned quadword left/right shift and unit
    tests
  target-ppc: Implement bcds. instruction
  target-ppc: Implement bcdus. instruction
  target-ppc: Implement bcdsr. instruction
  target-ppc: Implement bcdtrunc. instruction
  target-ppc: Implement bcdtrunc. instruction

 include/qemu/host-utils.h           |  29 +++++
 target-ppc/helper.h                 |   5 +
 target-ppc/int_helper.c             | 229 +++++++++++++++++++++++++++++++++++-
 target-ppc/translate/vmx-impl.inc.c |  16 +++
 target-ppc/translate/vmx-ops.inc.c  |  13 +-
 tests/Makefile.include              |   5 +-
 tests/test-shift128.c               |  97 +++++++++++++++
 util/host-utils.c                   |  38 ++++++
 8 files changed, 422 insertions(+), 10 deletions(-)
 create mode 100644 tests/test-shift128.c

-- 
2.7.4


* [Qemu-devel] [PATCH 1/7] target-ppc: Implement bcd_is_valid function
  2016-12-03  4:59 [Qemu-devel] [PATCH 0/7] POWER9 TCG enablements - BCD functions - final part Jose Ricardo Ziviani
@ 2016-12-03  5:00 ` Jose Ricardo Ziviani
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 2/7] target-ppc: Implement unsigned quadword left/right shift and unit tests Jose Ricardo Ziviani
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Jose Ricardo Ziviani @ 2016-12-03  5:00 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, nikunj, bharata

Add a function that checks whether all digits of a given BCD number
are valid. It is factored out because more instructions will need to
reuse the same code.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
---
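The check described above can be sketched outside QEMU as follows. This
is a minimal illustration, not the helper from the patch: bcd128,
bcd128_nibble and bcd128_is_valid are hypothetical names, and the sketch
assumes the usual packed-decimal layout where the least significant
nibble is a sign code (0xA to 0xF) and the other 31 nibbles are digits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for ppc_avr_t: a 128-bit value as two halves. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} bcd128;

/* Return nibble n, where n = 0 is the least significant nibble (the sign). */
static unsigned bcd128_nibble(const bcd128 *v, int n)
{
    uint64_t half;

    assert(n >= 0 && n < 32);
    half = (n < 16) ? v->lo : v->hi;
    return (unsigned)((half >> (4 * (n % 16))) & 0xf);
}

/* True when nibble 0 is a sign code and all 31 digit nibbles are 0..9. */
static bool bcd128_is_valid(const bcd128 *v)
{
    if (bcd128_nibble(v, 0) < 0xA) {
        return false;               /* 0x0..0x9 are not sign codes */
    }
    for (int n = 1; n < 32; n++) {
        if (bcd128_nibble(v, n) > 9) {
            return false;           /* 0xA..0xF are not decimal digits */
        }
    }
    return true;
}
```

With this layout, { .hi = 0, .lo = 0x123C } is the valid value +123,
while 0x1A3C (bad digit) and 0x1234 (no sign code) are rejected.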
 target-ppc/int_helper.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 7030f61..7989b1f 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -2596,6 +2596,24 @@ static void bcd_put_digit(ppc_avr_t *bcd, uint8_t digit, int n)
     }
 }
 
+static bool bcd_is_valid(ppc_avr_t *bcd)
+{
+    int i;
+    int invalid = 0;
+
+    if (bcd_get_sgn(bcd) == 0) {
+        return false;
+    }
+
+    for (i = 1; i < 32; i++) {
+        bcd_get_digit(bcd, i, &invalid);
+        if (unlikely(invalid)) {
+            return false;
+        }
+    }
+    return true;
+}
+
 static int bcd_cmp_zero(ppc_avr_t *bcd)
 {
     if (bcd->u64[HI_IDX] == 0 && (bcd->u64[LO_IDX] >> 4) == 0) {
@@ -3013,18 +3031,13 @@ uint32_t helper_bcdcpsgn(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
 
 uint32_t helper_bcdsetsgn(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
 {
-    int i;
-    int invalid = 0;
     int sgnb = bcd_get_sgn(b);
 
     *r = *b;
     bcd_put_digit(r, bcd_preferred_sgn(sgnb, ps), 0);
 
-    for (i = 1; i < 32; i++) {
-        bcd_get_digit(b, i, &invalid);
-        if (unlikely(invalid)) {
-            return CRF_SO;
-        }
+    if (bcd_is_valid(b) == false) {
+        return CRF_SO;
     }
 
     return bcd_cmp_zero(r);
-- 
2.7.4


* [Qemu-devel] [PATCH 2/7] target-ppc: Implement unsigned quadword left/right shift and unit tests
  2016-12-03  4:59 [Qemu-devel] [PATCH 0/7] POWER9 TCG enablements - BCD functions - final part Jose Ricardo Ziviani
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 1/7] target-ppc: Implement bcd_is_valid function Jose Ricardo Ziviani
@ 2016-12-03  5:00 ` Jose Ricardo Ziviani
  2016-12-04  1:37   ` Richard Henderson
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 3/7] target-ppc: Implement bcds. instruction Jose Ricardo Ziviani
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Jose Ricardo Ziviani @ 2016-12-03  5:00 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, nikunj, bharata

This commit implements unsigned 128-bit right and left shift functions,
plus unit tests for them. They are needed by the BCD shift instructions
added later in this series.

int128.h already has a right shift implementation, but it is for
signed numbers.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
---
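For readers who want to try the idea in isolation, here is a portable
sketch of an unsigned 128-bit shift on two 64-bit halves. It is not the
code from this patch: shr128 and shl128 are hypothetical names, the
shift count is assumed to be in 0..127, and overflow is returned rather
than written through a pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Logical right shift of the 128-bit value hi:lo by 0..127 bits. */
static void shr128(uint64_t *lo, uint64_t *hi, unsigned shift)
{
    assert(shift < 128);
    if (shift == 0) {
        return;
    }
    if (shift >= 64) {
        *lo = *hi >> (shift - 64);
        *hi = 0;
    } else {
        *lo = (*lo >> shift) | (*hi << (64 - shift));
        *hi >>= shift;
    }
}

/* Left shift by 0..127 bits; returns true if any set bit was shifted out. */
static bool shl128(uint64_t *lo, uint64_t *hi, unsigned shift)
{
    uint64_t out_lo = *lo, out_hi = *hi;

    assert(shift < 128);
    if (shift == 0) {
        return false;
    }
    /* The bits that fall off the top are exactly value >> (128 - shift). */
    shr128(&out_lo, &out_hi, 128 - shift);
    if (shift >= 64) {
        *hi = *lo << (shift - 64);
        *lo = 0;
    } else {
        *hi = (*hi << shift) | (*lo >> (64 - shift));
        *lo <<= shift;
    }
    return (out_lo | out_hi) != 0;
}
```

Every 64-bit shift count used above stays in 0..63, which avoids the
undefined behaviour of shifting a 64-bit value by 64 or more.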
 include/qemu/host-utils.h | 29 ++++++++++++++
 tests/Makefile.include    |  5 ++-
 tests/test-shift128.c     | 97 +++++++++++++++++++++++++++++++++++++++++++++++
 util/host-utils.c         | 38 +++++++++++++++++++
 4 files changed, 168 insertions(+), 1 deletion(-)
 create mode 100644 tests/test-shift128.c

diff --git a/include/qemu/host-utils.h b/include/qemu/host-utils.h
index 46187bb..b3e5f72 100644
--- a/include/qemu/host-utils.h
+++ b/include/qemu/host-utils.h
@@ -29,6 +29,33 @@
 #include "qemu/bswap.h"
 
 #ifdef CONFIG_INT128
+static inline void urshift(uint64_t *plow, uint64_t *phigh, uint32_t shift)
+{
+    __uint128_t val = ((__uint128_t)*phigh << 64) | *plow;
+    val >>= (shift & 127);
+    *phigh = val >> 64;
+    *plow = val & 0xffffffffffffffff;
+}
+
+static inline void ulshift(uint64_t *plow, uint64_t *phigh,
+                           uint32_t shift, bool *overflow)
+{
+    __uint128_t val = ((__uint128_t)*phigh << 64) | *plow;
+
+    if (shift == 0) {
+        return;
+    }
+
+    if (shift > 127 || (val >> (128 - (shift & 127))) != 0) {
+        *overflow = true;
+    }
+
+    val <<= (shift & 127);
+
+    *phigh = val >> 64;
+    *plow = val & 0xffffffffffffffff;
+}
+
 static inline void mulu64(uint64_t *plow, uint64_t *phigh,
                           uint64_t a, uint64_t b)
 {
@@ -81,6 +108,8 @@ void muls64(uint64_t *phigh, uint64_t *plow, int64_t a, int64_t b);
 void mulu64(uint64_t *phigh, uint64_t *plow, uint64_t a, uint64_t b);
 int divu128(uint64_t *plow, uint64_t *phigh, uint64_t divisor);
 int divs128(int64_t *plow, int64_t *phigh, int64_t divisor);
+void urshift(uint64_t *plow, uint64_t *phigh, uint32_t shift);
+void ulshift(uint64_t *plow, uint64_t *phigh, uint32_t shift, bool *overflow);
 
 static inline uint64_t muldiv64(uint64_t a, uint32_t b, uint32_t c)
 {
diff --git a/tests/Makefile.include b/tests/Makefile.include
index e98d3b6..89e5e85 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -65,6 +65,8 @@ check-unit-$(CONFIG_POSIX) += tests/test-vmstate$(EXESUF)
 endif
 check-unit-y += tests/test-cutils$(EXESUF)
 gcov-files-test-cutils-y += util/cutils.c
+check-unit-y += tests/test-shift128$(EXESUF)
+gcov-files-test-shift128-y = util/host-utils.c
 check-unit-y += tests/test-mul64$(EXESUF)
 gcov-files-test-mul64-y = util/host-utils.c
 check-unit-y += tests/test-int128$(EXESUF)
@@ -460,7 +462,7 @@ test-obj-y = tests/check-qint.o tests/check-qstring.o tests/check-qdict.o \
 	tests/test-x86-cpuid.o tests/test-mul64.o tests/test-int128.o \
 	tests/test-opts-visitor.o tests/test-qmp-event.o \
 	tests/rcutorture.o tests/test-rcu-list.o \
-	tests/test-qdist.o \
+	tests/test-qdist.o tests/test-shift128.o \
 	tests/test-qht.o tests/qht-bench.o tests/test-qht-par.o \
 	tests/atomic_add-bench.o
 
@@ -568,6 +570,7 @@ tests/test-qmp-commands$(EXESUF): tests/test-qmp-commands.o tests/test-qmp-marsh
 tests/test-visitor-serialization$(EXESUF): tests/test-visitor-serialization.o $(test-qapi-obj-y)
 tests/test-opts-visitor$(EXESUF): tests/test-opts-visitor.o $(test-qapi-obj-y)
 
+tests/test-shift128$(EXESUF): tests/test-shift128.o $(test-util-obj-y)
 tests/test-mul64$(EXESUF): tests/test-mul64.o $(test-util-obj-y)
 tests/test-bitops$(EXESUF): tests/test-bitops.o $(test-util-obj-y)
 tests/test-crypto-hash$(EXESUF): tests/test-crypto-hash.o $(test-crypto-obj-y)
diff --git a/tests/test-shift128.c b/tests/test-shift128.c
new file mode 100644
index 0000000..67615c9
--- /dev/null
+++ b/tests/test-shift128.c
@@ -0,0 +1,97 @@
+/*
+ * Test unsigned left and right shift
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/host-utils.h"
+
+typedef struct {
+    uint64_t low;
+    uint64_t high;
+    uint64_t rlow;
+    uint64_t rhigh;
+    int32_t shift;
+    bool overflow;
+} test_data;
+
+static const test_data test_ltable[] = {
+    { 1223ULL, 0, 1223ULL,   0, 0, false },
+    { 1ULL,    0, 2ULL,   0, 1, false },
+    { 1ULL,    0, 4ULL,   0, 2, false },
+    { 1ULL,    0, 16ULL,  0, 4, false },
+    { 1ULL,    0, 256ULL, 0, 8, false },
+    { 1ULL,    0, 65536ULL, 0, 16, false },
+    { 1ULL,    0, 2147483648ULL, 0, 31, false },
+    { 1ULL,    0, 35184372088832ULL, 0, 45, false },
+    { 1ULL,    0, 1152921504606846976ULL, 0, 60, false },
+    { 1ULL,    0, 0, 1ULL, 64, false },
+    { 1ULL,    0, 0, 65536ULL, 80, false },
+    { 1ULL,    0, 0, 9223372036854775808ULL, 127, false },
+    { 0ULL,    1, 0, 0, 64, true },
+    { 0x8888888888888888ULL, 0x9999999999999999ULL,
+        0x8000000000000000ULL, 0x9888888888888888ULL, 60, true },
+    { 0x8888888888888888ULL, 0x9999999999999999ULL,
+        0, 0x8888888888888888ULL, 64, true },
+    { 0x8ULL, 0, 0, 0x8ULL, 64, false },
+    { 0x8ULL, 0, 0, 0x8000000000000000ULL, 124, false },
+    { 0x1ULL, 0, 0, 0x4000000000000000ULL, 126, false },
+    { 0x1ULL, 0, 0, 0x8000000000000000ULL, 127, false },
+    { 0x1ULL, 0, 0x1ULL, 0, 128, true },
+};
+
+static const test_data test_rtable[] = {
+    { 1223ULL, 0, 1223ULL,   0, 0, false },
+    { 9223372036854775808ULL, 9223372036854775808ULL,
+        2147483648L, 2147483648ULL, 32, false },
+    { 9223372036854775808ULL, 9223372036854775808ULL,
+        9223372036854775808ULL, 0, 64, false },
+    { 9223372036854775808ULL, 9223372036854775808ULL,
+        36028797018963968ULL, 0, 72, false },
+    { 9223372036854775808ULL, 9223372036854775808ULL,
+        1ULL, 0, 127, false },
+    { 9223372036854775808ULL, 0, 4611686018427387904ULL, 0, 1, false },
+    { 9223372036854775808ULL, 0, 2305843009213693952ULL, 0, 2, false },
+    { 9223372036854775808ULL, 0, 36028797018963968ULL, 0, 8, false },
+    { 9223372036854775808ULL, 0, 140737488355328ULL, 0, 16, false },
+    { 9223372036854775808ULL, 0, 2147483648ULL, 0, 32, false },
+    { 9223372036854775808ULL, 0, 1ULL, 0, 63, false },
+    { 9223372036854775808ULL, 0, 0ULL, 0, 64, false },
+};
+
+static void test_lshift(void)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(test_ltable); ++i) {
+        bool overflow = false;
+        test_data tmp = test_ltable[i];
+        ulshift(&tmp.low, &tmp.high, tmp.shift, &overflow);
+        g_assert_cmpuint(tmp.low, ==, tmp.rlow);
+        g_assert_cmpuint(tmp.high, ==, tmp.rhigh);
+        g_assert_cmpuint(overflow, ==, tmp.overflow);
+    }
+}
+
+static void test_rshift(void)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(test_rtable); ++i) {
+        test_data tmp = test_rtable[i];
+        urshift(&tmp.low, &tmp.high, tmp.shift);
+        g_assert_cmpuint(tmp.low, ==, tmp.rlow);
+        g_assert_cmpuint(tmp.high, ==, tmp.rhigh);
+    }
+}
+
+int main(int argc, char **argv)
+{
+    g_test_init(&argc, &argv, NULL);
+    g_test_add_func("/host-utils/test_lshift", test_lshift);
+    g_test_add_func("/host-utils/test_rshift", test_rshift);
+    return g_test_run();
+}
diff --git a/util/host-utils.c b/util/host-utils.c
index b166e57..7a97397 100644
--- a/util/host-utils.c
+++ b/util/host-utils.c
@@ -159,3 +159,41 @@ int divs128(int64_t *plow, int64_t *phigh, int64_t divisor)
     return overflow;
 }
 
+void urshift(uint64_t *plow, uint64_t *phigh, uint32_t shift)
+{
+    uint64_t h = *phigh >> (shift & 63);
+    if (shift == 0) {
+        return;
+    } else if (shift >= 64) {
+        *plow = h;
+        *phigh = 0;
+    } else {
+        *plow = (*plow >> (shift & 63)) | (*phigh << (64 - (shift & 63)));
+        *phigh = h;
+    }
+}
+
+void ulshift(uint64_t *plow, uint64_t *phigh, uint32_t shift, bool *overflow)
+{
+    uint64_t low = *plow;
+    uint64_t high = *phigh;
+
+    *overflow |= (shift > 127);
+    shift &= 127;
+    if (shift == 0) {
+        return;
+    }
+
+    urshift(&low, &high, 128 - shift);
+    if (low > 0 || high > 0) {
+        *overflow = true;
+    }
+
+    if (shift >= 64) {
+        *phigh = *plow << (shift & 63);
+        *plow = 0;
+    } else {
+        *phigh = (*plow >> (64 - (shift & 63))) | (*phigh << (shift & 63));
+        *plow = *plow << shift;
+    }
+}
-- 
2.7.4


* [Qemu-devel] [PATCH 3/7] target-ppc: Implement bcds. instruction
  2016-12-03  4:59 [Qemu-devel] [PATCH 0/7] POWER9 TCG enablements - BCD functions - final part Jose Ricardo Ziviani
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 1/7] target-ppc: Implement bcd_is_valid function Jose Ricardo Ziviani
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 2/7] target-ppc: Implement unsigned quadword left/right shift and unit tests Jose Ricardo Ziviani
@ 2016-12-03  5:00 ` Jose Ricardo Ziviani
  2016-12-05  3:12   ` David Gibson
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 4/7] target-ppc: Implement bcdus. instruction Jose Ricardo Ziviani
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Jose Ricardo Ziviani @ 2016-12-03  5:00 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, nikunj, bharata

bcds.: Decimal shift. Given two registers vra and vrb, this instruction
shifts the value in vrb by the number of digits given in vra (left when
positive, right when negative) and writes it to the result register.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
---
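The decimal shift described above can be modelled on a plain digit
array, which makes the overflow rule easy to see. This is a sketch
under assumptions, not the helper from the patch (which shifts the
packed 128-bit value by 4 bits per digit): sbcd and sbcd_shift are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BCD_DIGITS 31

/* Hypothetical model: digit[0] is the least significant decimal digit. */
typedef struct {
    uint8_t digit[BCD_DIGITS];
    bool negative;                  /* the sign is untouched by a shift */
} sbcd;

/*
 * Shift by n decimal digits: n > 0 shifts left (multiply by 10^n),
 * n < 0 shifts right (drop the |n| lowest digits).
 * Returns true if a nonzero digit was shifted out on the left.
 */
static bool sbcd_shift(sbcd *v, int n)
{
    uint8_t out[BCD_DIGITS] = { 0 };
    bool overflow = false;

    assert(v != NULL);
    /* Clamp so that i + n below cannot overflow an int. */
    if (n > BCD_DIGITS) {
        n = BCD_DIGITS;
    } else if (n < -BCD_DIGITS) {
        n = -BCD_DIGITS;
    }
    for (int i = 0; i < BCD_DIGITS; i++) {
        int dst = i + n;

        if (dst >= BCD_DIGITS) {
            overflow |= (v->digit[i] != 0);   /* lost off the top */
        } else if (dst >= 0) {
            out[dst] = v->digit[i];
        }                                     /* dst < 0: dropped */
    }
    memcpy(v->digit, out, sizeof(out));
    return overflow;
}
```

For example, shifting 123 by 2 gives 12300 without overflow, and
shifting that by -3 gives 12.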
 target-ppc/helper.h                 |  1 +
 target-ppc/int_helper.c             | 38 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.inc.c |  3 +++
 target-ppc/translate/vmx-ops.inc.c  |  3 ++-
 4 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index bc39efb..471a1da 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -392,6 +392,7 @@ DEF_HELPER_3(bcdcfsq, i32, avr, avr, i32)
 DEF_HELPER_3(bcdctsq, i32, avr, avr, i32)
 DEF_HELPER_4(bcdcpsgn, i32, avr, avr, avr, i32)
 DEF_HELPER_3(bcdsetsgn, i32, avr, avr, i32)
+DEF_HELPER_4(bcds, i32, avr, avr, avr, i32)
 
 DEF_HELPER_2(xsadddp, void, env, i32)
 DEF_HELPER_2(xssubdp, void, env, i32)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 7989b1f..b25c020 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -3043,6 +3043,44 @@ uint32_t helper_bcdsetsgn(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
     return bcd_cmp_zero(r);
 }
 
+uint32_t helper_bcds(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
+{
+    int cr;
+    int i = 0;
+    bool ox_flag = false;
+    int sgnb = bcd_get_sgn(b);
+    ppc_avr_t ret = *b;
+    ret.u64[LO_IDX] &= ~0xf;
+
+#if defined(HOST_WORDS_BIGENDIAN)
+    int upper = ARRAY_SIZE(a->s32) - 1;
+#else
+    int upper = 0;
+#endif
+
+    if (bcd_is_valid(b) == false) {
+        return CRF_SO;
+    }
+
+    if (a->s32[upper] > 0) {
+        i = (a->s32[upper] > 31) ? 31 : a->s32[upper];
+        ulshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4, &ox_flag);
+    } else {
+        i = (a->s32[upper] < -31) ? 31 : -a->s32[upper];
+        urshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4);
+    }
+    bcd_put_digit(&ret, bcd_preferred_sgn(sgnb, ps), 0);
+
+    *r = ret;
+
+    cr = bcd_cmp_zero(r);
+    if (unlikely(ox_flag)) {
+        cr |= CRF_SO;
+    }
+
+    return cr;
+}
+
 void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
 {
     int i;
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index e8e527f..84ebb7e 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -1016,6 +1016,7 @@ GEN_BCD2(bcdcfsq)
 GEN_BCD2(bcdctsq)
 GEN_BCD2(bcdsetsgn)
 GEN_BCD(bcdcpsgn);
+GEN_BCD(bcds);
 
 static void gen_xpnd04_1(DisasContext *ctx)
 {
@@ -1090,6 +1091,8 @@ GEN_VXFORM_DUAL(vsubuhs, PPC_ALTIVEC, PPC_NONE, \
                 bcdsub, PPC_NONE, PPC2_ALTIVEC_207)
 GEN_VXFORM_DUAL(vaddshs, PPC_ALTIVEC, PPC_NONE, \
                 bcdcpsgn, PPC_NONE, PPC2_ISA300)
+GEN_VXFORM_DUAL(vsubudm, PPC2_ALTIVEC_207, PPC_NONE, \
+                bcds, PPC_NONE, PPC2_ISA300)
 
 static void gen_vsbox(DisasContext *ctx)
 {
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index 57dce6e..7b4b009 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -62,7 +62,8 @@ GEN_VXFORM_207(vaddudm, 0, 3),
 GEN_VXFORM_DUAL(vsububm, bcdadd, 0, 16, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM_DUAL(vsubuhm, bcdsub, 0, 17, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM(vsubuwm, 0, 18),
-GEN_VXFORM_207(vsubudm, 0, 19),
+GEN_VXFORM_DUAL(vsubudm, bcds, 0, 19, PPC2_ALTIVEC_207, PPC2_ISA300),
+GEN_VXFORM_300(bcds, 0, 27),
 GEN_VXFORM(vmaxub, 1, 0),
 GEN_VXFORM(vmaxuh, 1, 1),
 GEN_VXFORM(vmaxuw, 1, 2),
-- 
2.7.4


* [Qemu-devel] [PATCH 4/7] target-ppc: Implement bcdus. instruction
  2016-12-03  4:59 [Qemu-devel] [PATCH 0/7] POWER9 TCG enablements - BCD functions - final part Jose Ricardo Ziviani
                   ` (2 preceding siblings ...)
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 3/7] target-ppc: Implement bcds. instruction Jose Ricardo Ziviani
@ 2016-12-03  5:00 ` Jose Ricardo Ziviani
  2016-12-05  3:14   ` David Gibson
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 5/7] target-ppc: Implement bcdsr. instruction Jose Ricardo Ziviani
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Jose Ricardo Ziviani @ 2016-12-03  5:00 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, nikunj, bharata

bcdus.: Decimal unsigned shift. This instruction works like bcds. but
operates on unsigned BCDs (no sign in the least significant 4 bits).

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
---
 target-ppc/helper.h                 |  1 +
 target-ppc/int_helper.c             | 43 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.inc.c |  3 +++
 target-ppc/translate/vmx-ops.inc.c  |  2 +-
 4 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 471a1da..386ea67 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -393,6 +393,7 @@ DEF_HELPER_3(bcdctsq, i32, avr, avr, i32)
 DEF_HELPER_4(bcdcpsgn, i32, avr, avr, avr, i32)
 DEF_HELPER_3(bcdsetsgn, i32, avr, avr, i32)
 DEF_HELPER_4(bcds, i32, avr, avr, avr, i32)
+DEF_HELPER_4(bcdus, i32, avr, avr, avr, i32)
 
 DEF_HELPER_2(xsadddp, void, env, i32)
 DEF_HELPER_2(xssubdp, void, env, i32)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index b25c020..4b5eea1 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -3081,6 +3081,49 @@ uint32_t helper_bcds(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
     return cr;
 }
 
+uint32_t helper_bcdus(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
+{
+    int cr;
+    int i;
+    int invalid = 0;
+    bool ox_flag = false;
+    ppc_avr_t ret = *b;
+
+#if defined(HOST_WORDS_BIGENDIAN)
+    int upper = ARRAY_SIZE(a->s32) - 1;
+#else
+    int upper = 0;
+#endif
+
+    for (i = 0; i < 32; i++) {
+        bcd_get_digit(b, i, &invalid);
+
+        if (unlikely(invalid)) {
+            return CRF_SO;
+        }
+    }
+
+    if (a->s32[upper] >= 32) {
+        ox_flag = true;
+        ret.u64[LO_IDX] = ret.u64[HI_IDX] = 0;
+    } else if (a->s32[upper] <= -32) {
+        ret.u64[LO_IDX] = ret.u64[HI_IDX] = 0;
+    } else if (a->s32[upper] > 0) {
+        i = a->s32[upper] & 31;
+        ulshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4, &ox_flag);
+    } else {
+        i = (-a->s32[upper]) & 31;
+        urshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4);
+    }
+    *r = ret;
+    cr = bcd_cmp_zero(r);
+    if (unlikely(ox_flag)) {
+        cr |= CRF_SO;
+    }
+
+    return cr;
+}
+
 void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
 {
     int i;
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index 84ebb7e..fc54881 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -1017,6 +1017,7 @@ GEN_BCD2(bcdctsq)
 GEN_BCD2(bcdsetsgn)
 GEN_BCD(bcdcpsgn);
 GEN_BCD(bcds);
+GEN_BCD(bcdus);
 
 static void gen_xpnd04_1(DisasContext *ctx)
 {
@@ -1093,6 +1094,8 @@ GEN_VXFORM_DUAL(vaddshs, PPC_ALTIVEC, PPC_NONE, \
                 bcdcpsgn, PPC_NONE, PPC2_ISA300)
 GEN_VXFORM_DUAL(vsubudm, PPC2_ALTIVEC_207, PPC_NONE, \
                 bcds, PPC_NONE, PPC2_ISA300)
+GEN_VXFORM_DUAL(vsubuwm, PPC_ALTIVEC, PPC_NONE, \
+                bcdus, PPC_NONE, PPC2_ISA300)
 
 static void gen_vsbox(DisasContext *ctx)
 {
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index 7b4b009..cdd3abe 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -61,7 +61,7 @@ GEN_VXFORM(vadduwm, 0, 2),
 GEN_VXFORM_207(vaddudm, 0, 3),
 GEN_VXFORM_DUAL(vsububm, bcdadd, 0, 16, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM_DUAL(vsubuhm, bcdsub, 0, 17, PPC_ALTIVEC, PPC_NONE),
-GEN_VXFORM(vsubuwm, 0, 18),
+GEN_VXFORM_DUAL(vsubuwm, bcdus, 0, 18, PPC_ALTIVEC, PPC2_ISA300),
 GEN_VXFORM_DUAL(vsubudm, bcds, 0, 19, PPC2_ALTIVEC_207, PPC2_ISA300),
 GEN_VXFORM_300(bcds, 0, 27),
 GEN_VXFORM(vmaxub, 1, 0),
-- 
2.7.4


* [Qemu-devel] [PATCH 5/7] target-ppc: Implement bcdsr. instruction
  2016-12-03  4:59 [Qemu-devel] [PATCH 0/7] POWER9 TCG enablements - BCD functions - final part Jose Ricardo Ziviani
                   ` (3 preceding siblings ...)
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 4/7] target-ppc: Implement bcdus. instruction Jose Ricardo Ziviani
@ 2016-12-03  5:00 ` Jose Ricardo Ziviani
  2016-12-05  3:19   ` David Gibson
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 6/7] target-ppc: Implement bcdtrunc. instruction Jose Ricardo Ziviani
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 7/7] " Jose Ricardo Ziviani
  6 siblings, 1 reply; 19+ messages in thread
From: Jose Ricardo Ziviani @ 2016-12-03  5:00 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, nikunj, bharata

bcdsr.: Decimal shift and round. This instruction works like bcds.;
however, when performing a right shift, 1 is added to the result if
the last digit shifted out was >= 5.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
---
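The rounding rule above can be illustrated on an ordinary integer
magnitude. This is only a sketch of the arithmetic, not the packed-BCD
helper from the patch: dec_shift_right_round and dec_round_examples are
hypothetical names, and the magnitude is held in a uint64_t for brevity.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Drop the n lowest decimal digits of a magnitude and round half up:
 * add 1 when the last (most significant) dropped digit is >= 5.
 */
static uint64_t dec_shift_right_round(uint64_t magnitude, unsigned n)
{
    unsigned last_dropped = 0;

    /* A uint64_t has at most 20 digits; larger counts behave like 21. */
    if (n > 21) {
        n = 21;
    }
    for (unsigned i = 0; i < n; i++) {
        last_dropped = (unsigned)(magnitude % 10);
        magnitude /= 10;
    }
    return magnitude + (last_dropped >= 5 ? 1 : 0);
}

/* A few worked cases, written as assertions. */
static void dec_round_examples(void)
{
    assert(dec_shift_right_round(12345, 2) == 123);  /* dropped "45" */
    assert(dec_shift_right_round(12350, 2) == 124);  /* dropped "50" */
    assert(dec_shift_right_round(96, 1) == 10);      /* carry into tens */
    assert(dec_shift_right_round(7, 2) == 0);        /* dropped "07" */
}
```

Only the last digit shifted out decides the rounding, so 12345 shifted
by two digits stays at 123 even though a lower dropped digit is 5.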
 target-ppc/helper.h                 |  1 +
 target-ppc/int_helper.c             | 45 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.inc.c |  1 +
 target-ppc/translate/vmx-ops.inc.c  |  2 ++
 4 files changed, 49 insertions(+)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 386ea67..d9528eb 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -394,6 +394,7 @@ DEF_HELPER_4(bcdcpsgn, i32, avr, avr, avr, i32)
 DEF_HELPER_3(bcdsetsgn, i32, avr, avr, i32)
 DEF_HELPER_4(bcds, i32, avr, avr, avr, i32)
 DEF_HELPER_4(bcdus, i32, avr, avr, avr, i32)
+DEF_HELPER_4(bcdsr, i32, avr, avr, avr, i32)
 
 DEF_HELPER_2(xsadddp, void, env, i32)
 DEF_HELPER_2(xssubdp, void, env, i32)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 4b5eea1..c9fcb1a 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -3124,6 +3124,51 @@ uint32_t helper_bcdus(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
     return cr;
 }
 
+uint32_t helper_bcdsr(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
+{
+    int cr;
+    int i;
+    int unused = 0;
+    int invalid = 0;
+    bool ox_flag = false;
+    int sgnb = bcd_get_sgn(b);
+    ppc_avr_t ret = *b;
+    ret.u64[LO_IDX] &= ~0xf;
+
+#if defined(HOST_WORDS_BIGENDIAN)
+    ppc_avr_t bcd_one = { .u64 = { 0, 0x10 } };
+    int upper = ARRAY_SIZE(a->s32) - 1;
+#else
+    ppc_avr_t bcd_one = { .u64 = { 0x10, 0 } };
+    int upper = 0;
+#endif
+
+    if (bcd_is_valid(b) == false) {
+        return CRF_SO;
+    }
+
+    if (a->s32[upper] > 0) {
+        i = (a->s32[upper] > 31) ? 31 : a->s32[upper];
+        ulshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4, &ox_flag);
+    } else {
+        i = (a->s32[upper] < -31) ? 31 : -a->s32[upper];
+        urshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4);
+
+        if (bcd_get_digit(&ret, 0, &invalid) >= 5) {
+            bcd_add_mag(&ret, &ret, &bcd_one, &invalid, &unused);
+        }
+    }
+    bcd_put_digit(&ret, bcd_preferred_sgn(sgnb, ps), 0);
+
+    cr = bcd_cmp_zero(&ret);
+    if (unlikely(ox_flag)) {
+        cr |= CRF_SO;
+    }
+    *r = ret;
+
+    return cr;
+}
+
 void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
 {
     int i;
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index fc54881..451abb5 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -1018,6 +1018,7 @@ GEN_BCD2(bcdsetsgn)
 GEN_BCD(bcdcpsgn);
 GEN_BCD(bcds);
 GEN_BCD(bcdus);
+GEN_BCD(bcdsr);
 
 static void gen_xpnd04_1(DisasContext *ctx)
 {
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index cdd3abe..fa9c996 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -132,6 +132,8 @@ GEN_HANDLER_E_2(vprtybd, 0x4, 0x1, 0x18, 9, 0, PPC_NONE, PPC2_ISA300),
 GEN_HANDLER_E_2(vprtybq, 0x4, 0x1, 0x18, 10, 0, PPC_NONE, PPC2_ISA300),
 
 GEN_VXFORM_DUAL(vsubcuw, xpnd04_1, 0, 22, PPC_ALTIVEC, PPC_NONE),
+GEN_VXFORM_300(bcdsr, 0, 23),
+GEN_VXFORM_300(bcdsr, 0, 31),
 GEN_VXFORM_DUAL(vaddubs, vmul10uq, 0, 8, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM_DUAL(vadduhs, vmul10euq, 0, 9, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM(vadduws, 0, 10),
-- 
2.7.4


* [Qemu-devel] [PATCH 6/7] target-ppc: Implement bcdtrunc. instruction
  2016-12-03  4:59 [Qemu-devel] [PATCH 0/7] POWER9 TCG enablements - BCD functions - final part Jose Ricardo Ziviani
                   ` (4 preceding siblings ...)
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 5/7] target-ppc: Implement bcdsr. instruction Jose Ricardo Ziviani
@ 2016-12-03  5:00 ` Jose Ricardo Ziviani
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 7/7] " Jose Ricardo Ziviani
  6 siblings, 0 replies; 19+ messages in thread
From: Jose Ricardo Ziviani @ 2016-12-03  5:00 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, nikunj, bharata

bcdtrunc.: Decimal integer truncate. Given a BCD number in vrb and the
number of digits to truncate in vra, the result register holds vrb
with those digits cleared.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
---
 target-ppc/helper.h                 |  1 +
 target-ppc/int_helper.c             | 43 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.inc.c |  5 +++++
 target-ppc/translate/vmx-ops.inc.c  |  4 ++--
 4 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index d9528eb..49965b0 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -395,6 +395,7 @@ DEF_HELPER_3(bcdsetsgn, i32, avr, avr, i32)
 DEF_HELPER_4(bcds, i32, avr, avr, avr, i32)
 DEF_HELPER_4(bcdus, i32, avr, avr, avr, i32)
 DEF_HELPER_4(bcdsr, i32, avr, avr, avr, i32)
+DEF_HELPER_4(bcdtrunc, i32, avr, avr, avr, i32)
 
 DEF_HELPER_2(xsadddp, void, env, i32)
 DEF_HELPER_2(xssubdp, void, env, i32)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index c9fcb1a..a8fc718 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -3169,6 +3169,49 @@ uint32_t helper_bcdsr(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
     return cr;
 }
 
+uint32_t helper_bcdtrunc(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
+{
+    int i;
+    int cr;
+    int ox_flag = 0;
+    uint8_t digit;
+    uint8_t trunc;
+    int invalid = 0;
+    int sgnb = bcd_get_sgn(b);
+    ppc_avr_t ret = *b;
+
+#if defined(HOST_WORDS_BIGENDIAN)
+    int upper = ARRAY_SIZE(a->u16) - 1;
+#else
+    int upper = 0;
+#endif
+
+    trunc = 32 - (a->u16[upper] & 31);
+    for (i = 1; i < 32; i++) {
+        digit = bcd_get_digit(b, i, &invalid);
+
+        if (unlikely(invalid)) {
+            return CRF_SO;
+        }
+
+        if (i >= trunc) {
+            if (!ox_flag && digit > 0x0) {
+                ox_flag = 1;
+            }
+            bcd_put_digit(&ret, 0, i);
+        }
+    }
+    bcd_put_digit(&ret, bcd_preferred_sgn(sgnb, ps), 0);
+
+    cr = bcd_cmp_zero(&ret);
+    if (unlikely(ox_flag)) {
+        cr |= CRF_SO;
+    }
+    *r = ret;
+
+    return cr;
+}
+
 void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
 {
     int i;
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index 451abb5..1683f42 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -1019,6 +1019,7 @@ GEN_BCD(bcdcpsgn);
 GEN_BCD(bcds);
 GEN_BCD(bcdus);
 GEN_BCD(bcdsr);
+GEN_BCD(bcdtrunc);
 
 static void gen_xpnd04_1(DisasContext *ctx)
 {
@@ -1097,6 +1098,10 @@ GEN_VXFORM_DUAL(vsubudm, PPC2_ALTIVEC_207, PPC_NONE, \
                 bcds, PPC_NONE, PPC2_ISA300)
 GEN_VXFORM_DUAL(vsubuwm, PPC_ALTIVEC, PPC_NONE, \
                 bcdus, PPC_NONE, PPC2_ISA300)
+GEN_VXFORM_DUAL(vsubsbs, PPC_ALTIVEC, PPC_NONE, \
+                bcdtrunc, PPC_NONE, PPC2_ISA300)
+GEN_VXFORM_DUAL(vsubuqm, PPC2_ALTIVEC_207, PPC_NONE, \
+                bcdtrunc, PPC_NONE, PPC2_ISA300)
 
 static void gen_vsbox(DisasContext *ctx)
 {
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index fa9c996..e6167a4 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -143,14 +143,14 @@ GEN_VXFORM(vaddsws, 0, 14),
 GEN_VXFORM_DUAL(vsububs, bcdadd, 0, 24, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM_DUAL(vsubuhs, bcdsub, 0, 25, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM(vsubuws, 0, 26),
-GEN_VXFORM(vsubsbs, 0, 28),
+GEN_VXFORM_DUAL(vsubsbs, bcdtrunc, 0, 28, PPC_NONE, PPC2_ISA300),
 GEN_VXFORM(vsubshs, 0, 29),
 GEN_VXFORM_DUAL(vsubsws, xpnd04_2, 0, 30, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM_207(vadduqm, 0, 4),
 GEN_VXFORM_207(vaddcuq, 0, 5),
 GEN_VXFORM_DUAL(vaddeuqm, vaddecuq, 30, 0xFF, PPC_NONE, PPC2_ALTIVEC_207),
-GEN_VXFORM_207(vsubuqm, 0, 20),
 GEN_VXFORM_207(vsubcuq, 0, 21),
+GEN_VXFORM_DUAL(vsubuqm, bcdtrunc, 0, 20, PPC2_ALTIVEC_207, PPC2_ISA300),
 GEN_VXFORM_DUAL(vsubeuqm, vsubecuq, 31, 0xFF, PPC_NONE, PPC2_ALTIVEC_207),
 GEN_VXFORM(vrlb, 2, 0),
 GEN_VXFORM(vrlh, 2, 1),
-- 
2.7.4


* [Qemu-devel] [PATCH 7/7] target-ppc: Implement bcdtrunc. instruction
  2016-12-03  4:59 [Qemu-devel] [PATCH 0/7] POWER9 TCG enablements - BCD functions - final part Jose Ricardo Ziviani
                   ` (5 preceding siblings ...)
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 6/7] target-ppc: Implement bcdtrunc. instruction Jose Ricardo Ziviani
@ 2016-12-03  5:00 ` Jose Ricardo Ziviani
  6 siblings, 0 replies; 19+ messages in thread
From: Jose Ricardo Ziviani @ 2016-12-03  5:00 UTC (permalink / raw)
  To: qemu-ppc; +Cc: qemu-devel, david, nikunj, bharata

bcdutrunc.: Decimal unsigned truncate. Works like bcdtrunc., but on
unsigned BCD numbers.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
---
 target-ppc/helper.h                 |  1 +
 target-ppc/int_helper.c             | 39 +++++++++++++++++++++++++++++++++++++
 target-ppc/translate/vmx-impl.inc.c |  4 ++++
 target-ppc/translate/vmx-ops.inc.c  |  2 +-
 4 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 49965b0..52a2707 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -396,6 +396,7 @@ DEF_HELPER_4(bcds, i32, avr, avr, avr, i32)
 DEF_HELPER_4(bcdus, i32, avr, avr, avr, i32)
 DEF_HELPER_4(bcdsr, i32, avr, avr, avr, i32)
 DEF_HELPER_4(bcdtrunc, i32, avr, avr, avr, i32)
+DEF_HELPER_4(bcdutrunc, i32, avr, avr, avr, i32)
 
 DEF_HELPER_2(xsadddp, void, env, i32)
 DEF_HELPER_2(xssubdp, void, env, i32)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index a8fc718..ab531e8 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -3212,6 +3212,45 @@ uint32_t helper_bcdtrunc(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
     return cr;
 }
 
+uint32_t helper_bcdutrunc(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
+{
+    int i;
+    uint8_t digit;
+    uint8_t trunc;
+    int ox_flag = 0;
+    int invalid = 0;
+    ppc_avr_t ret = *b;
+
+#if defined(HOST_WORDS_BIGENDIAN)
+    int upper = ARRAY_SIZE(a->u16) - 1;
+#else
+    int upper = 0;
+#endif
+
+    trunc = 32 - (a->u16[upper] % 33);
+    for (i = 0; i < 32; i++) {
+        digit = bcd_get_digit(b, i, &invalid);
+
+        if (unlikely(invalid)) {
+            return CRF_SO;
+        }
+
+        if (i >= trunc) {
+            if (unlikely(!ox_flag && digit > 0x0)) {
+                ox_flag = 1;
+            }
+            bcd_put_digit(&ret, 0, i);
+        }
+    }
+
+    *r = ret;
+    if (r->u64[HI_IDX] == 0 && r->u64[LO_IDX] == 0) {
+        return (ox_flag) ? CRF_SO | CRF_EQ : CRF_EQ;
+    } else {
+        return (ox_flag) ? CRF_SO | CRF_GT : CRF_GT;
+    }
+}
+
 void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
 {
     int i;
diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
index 1683f42..3cb6fc2 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -1020,6 +1020,7 @@ GEN_BCD(bcds);
 GEN_BCD(bcdus);
 GEN_BCD(bcdsr);
 GEN_BCD(bcdtrunc);
+GEN_BCD(bcdutrunc);
 
 static void gen_xpnd04_1(DisasContext *ctx)
 {
@@ -1102,6 +1103,9 @@ GEN_VXFORM_DUAL(vsubsbs, PPC_ALTIVEC, PPC_NONE, \
                 bcdtrunc, PPC_NONE, PPC2_ISA300)
 GEN_VXFORM_DUAL(vsubuqm, PPC2_ALTIVEC_207, PPC_NONE, \
                 bcdtrunc, PPC_NONE, PPC2_ISA300)
+GEN_VXFORM_DUAL(vsubcuq, PPC2_ALTIVEC_207, PPC_NONE, \
+                bcdutrunc, PPC_NONE, PPC2_ISA300)
+
 
 static void gen_vsbox(DisasContext *ctx)
 {
diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
index e6167a4..139f80c 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -149,8 +149,8 @@ GEN_VXFORM_DUAL(vsubsws, xpnd04_2, 0, 30, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM_207(vadduqm, 0, 4),
 GEN_VXFORM_207(vaddcuq, 0, 5),
 GEN_VXFORM_DUAL(vaddeuqm, vaddecuq, 30, 0xFF, PPC_NONE, PPC2_ALTIVEC_207),
-GEN_VXFORM_207(vsubcuq, 0, 21),
 GEN_VXFORM_DUAL(vsubuqm, bcdtrunc, 0, 20, PPC2_ALTIVEC_207, PPC2_ISA300),
+GEN_VXFORM_DUAL(vsubcuq, bcdutrunc, 0, 21, PPC2_ALTIVEC_207, PPC2_ISA300),
 GEN_VXFORM_DUAL(vsubeuqm, vsubecuq, 31, 0xFF, PPC_NONE, PPC2_ALTIVEC_207),
 GEN_VXFORM(vrlb, 2, 0),
 GEN_VXFORM(vrlh, 2, 1),
-- 
2.7.4

* Re: [Qemu-devel] [PATCH 2/7] target-ppc: Implement unsigned quadword left/right shift and unit tests
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 2/7] target-ppc: Implement unsigned quadword left/right shift and unit tests Jose Ricardo Ziviani
@ 2016-12-04  1:37   ` Richard Henderson
  2016-12-05  1:56     ` David Gibson
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2016-12-04  1:37 UTC (permalink / raw)
  To: Jose Ricardo Ziviani, qemu-ppc; +Cc: bharata, qemu-devel, nikunj, david

On 12/02/2016 09:00 PM, Jose Ricardo Ziviani wrote:
> +++ b/include/qemu/host-utils.h
> @@ -29,6 +29,33 @@
>  #include "qemu/bswap.h"
>
>  #ifdef CONFIG_INT128
> +static inline void urshift(uint64_t *plow, uint64_t *phigh, uint32_t shift)
> +{
> +    __uint128_t val = ((__uint128_t)*phigh << 64) | *plow;
> +    val >>= (shift & 127);
> +    *phigh = val >> 64;
> +    *plow = val & 0xffffffffffffffff;
> +}
> +
> +static inline void ulshift(uint64_t *plow, uint64_t *phigh,
> +                           uint32_t shift, bool *overflow)
> +{
> +    __uint128_t val = ((__uint128_t)*phigh << 64) | *plow;
> +
> +    if (shift == 0) {
> +        return;
> +    }
> +
> +    if (shift > 127 || (val >> (128 - (shift & 127))) != 0) {
> +        *overflow = true;
> +    }
> +
> +    val <<= (shift & 127);
> +
> +    *phigh = val >> 64;
> +    *plow = val & 0xffffffffffffffff;
> +}
> +

This belongs in qemu/int128.h, not here.  And certainly not predicated on 
CONFIG_INT128.


r~

* Re: [Qemu-devel] [PATCH 2/7] target-ppc: Implement unsigned quadword left/right shift and unit tests
  2016-12-04  1:37   ` Richard Henderson
@ 2016-12-05  1:56     ` David Gibson
  2016-12-05  9:35       ` [Qemu-devel] [Qemu-ppc] " joserz
  0 siblings, 1 reply; 19+ messages in thread
From: David Gibson @ 2016-12-05  1:56 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jose Ricardo Ziviani, qemu-ppc, bharata, qemu-devel, nikunj

On Sat, Dec 03, 2016 at 05:37:27PM -0800, Richard Henderson wrote:
> On 12/02/2016 09:00 PM, Jose Ricardo Ziviani wrote:
> > +++ b/include/qemu/host-utils.h
> > @@ -29,6 +29,33 @@
> >  #include "qemu/bswap.h"
> > 
> >  #ifdef CONFIG_INT128
> > +static inline void urshift(uint64_t *plow, uint64_t *phigh, uint32_t shift)
> > +{
> > +    __uint128_t val = ((__uint128_t)*phigh << 64) | *plow;
> > +    val >>= (shift & 127);
> > +    *phigh = val >> 64;
> > +    *plow = val & 0xffffffffffffffff;
> > +}
> > +
> > +static inline void ulshift(uint64_t *plow, uint64_t *phigh,
> > +                           uint32_t shift, bool *overflow)
> > +{
> > +    __uint128_t val = ((__uint128_t)*phigh << 64) | *plow;
> > +
> > +    if (shift == 0) {
> > +        return;
> > +    }
> > +
> > +    if (shift > 127 || (val >> (128 - (shift & 127))) != 0) {
> > +        *overflow = true;
> > +    }
> > +
> > +    val <<= (shift & 127);
> > +
> > +    *phigh = val >> 64;
> > +    *plow = val & 0xffffffffffffffff;
> > +}
> > +
> 
> This belongs in qemu/int128.h, not here.  And certainly not predicated on
> CONFIG_INT128.

Is there actually any advantage to the __uint128_t based versions over
the 64-bit versions?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [PATCH 3/7] target-ppc: Implement bcds. instruction
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 3/7] target-ppc: Implement bcds. instruction Jose Ricardo Ziviani
@ 2016-12-05  3:12   ` David Gibson
  2016-12-05  3:15     ` David Gibson
  0 siblings, 1 reply; 19+ messages in thread
From: David Gibson @ 2016-12-05  3:12 UTC (permalink / raw)
  To: Jose Ricardo Ziviani; +Cc: qemu-ppc, qemu-devel, nikunj, bharata

On Sat, Dec 03, 2016 at 03:00:02AM -0200, Jose Ricardo Ziviani wrote:
> bcds.: Decimal shift. Given two registers vra and vrb, this instruction
> shift the vrb value by vra bits into the result register.
> 
> Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  target-ppc/helper.h                 |  1 +
>  target-ppc/int_helper.c             | 38 +++++++++++++++++++++++++++++++++++++
>  target-ppc/translate/vmx-impl.inc.c |  3 +++
>  target-ppc/translate/vmx-ops.inc.c  |  3 ++-
>  4 files changed, 44 insertions(+), 1 deletion(-)
> 
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index bc39efb..471a1da 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -392,6 +392,7 @@ DEF_HELPER_3(bcdcfsq, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdctsq, i32, avr, avr, i32)
>  DEF_HELPER_4(bcdcpsgn, i32, avr, avr, avr, i32)
>  DEF_HELPER_3(bcdsetsgn, i32, avr, avr, i32)
> +DEF_HELPER_4(bcds, i32, avr, avr, avr, i32)
>  
>  DEF_HELPER_2(xsadddp, void, env, i32)
>  DEF_HELPER_2(xssubdp, void, env, i32)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index 7989b1f..b25c020 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -3043,6 +3043,44 @@ uint32_t helper_bcdsetsgn(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
>      return bcd_cmp_zero(r);
>  }
>  
> +uint32_t helper_bcds(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
> +{
> +    int cr;
> +    int i = 0;
> +    bool ox_flag = false;
> +    int sgnb = bcd_get_sgn(b);
> +    ppc_avr_t ret = *b;
> +    ret.u64[LO_IDX] &= ~0xf;
> +
> +#if defined(HOST_WORDS_BIGENDIAN)
> +    int upper = ARRAY_SIZE(a->s32) - 1;
> +#else
> +    int upper = 0;
> +#endif
> +
> +    if (bcd_is_valid(b) == false) {
> +        return CRF_SO;
> +    }
> +
> +    if (a->s32[upper] > 0) {
> +        i = (a->s32[upper] > 31) ? 31 : a->s32[upper];
> +        ulshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4, &ox_flag);
> +    } else {
> +        i = (a->s32[upper] < -31) ? 31 : -a->s32[upper];
> +        urshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4);
> +    }
> +    bcd_put_digit(&ret, bcd_preferred_sgn(sgnb, ps), 0);
> +
> +    *r = ret;
> +
> +    cr = bcd_cmp_zero(r);
> +    if (unlikely(ox_flag)) {
> +        cr |= CRF_SO;
> +    }
> +
> +    return cr;
> +}
> +
>  void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
>  {
>      int i;
> diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
> index e8e527f..84ebb7e 100644
> --- a/target-ppc/translate/vmx-impl.inc.c
> +++ b/target-ppc/translate/vmx-impl.inc.c
> @@ -1016,6 +1016,7 @@ GEN_BCD2(bcdcfsq)
>  GEN_BCD2(bcdctsq)
>  GEN_BCD2(bcdsetsgn)
>  GEN_BCD(bcdcpsgn);
> +GEN_BCD(bcds);
>  
>  static void gen_xpnd04_1(DisasContext *ctx)
>  {
> @@ -1090,6 +1091,8 @@ GEN_VXFORM_DUAL(vsubuhs, PPC_ALTIVEC, PPC_NONE, \
>                  bcdsub, PPC_NONE, PPC2_ALTIVEC_207)
>  GEN_VXFORM_DUAL(vaddshs, PPC_ALTIVEC, PPC_NONE, \
>                  bcdcpsgn, PPC_NONE, PPC2_ISA300)
> +GEN_VXFORM_DUAL(vsubudm, PPC2_ALTIVEC_207, PPC_NONE, \
> +                bcds, PPC_NONE, PPC2_ISA300)
>  
>  static void gen_vsbox(DisasContext *ctx)
>  {
> diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
> index 57dce6e..7b4b009 100644
> --- a/target-ppc/translate/vmx-ops.inc.c
> +++ b/target-ppc/translate/vmx-ops.inc.c
> @@ -62,7 +62,8 @@ GEN_VXFORM_207(vaddudm, 0, 3),
>  GEN_VXFORM_DUAL(vsububm, bcdadd, 0, 16, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM_DUAL(vsubuhm, bcdsub, 0, 17, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM(vsubuwm, 0, 18),
> -GEN_VXFORM_207(vsubudm, 0, 19),
> +GEN_VXFORM_DUAL(vsubudm, bcds, 0, 19, PPC2_ALTIVEC_207, PPC2_ISA300),
> +GEN_VXFORM_300(bcds, 0, 27),
>  GEN_VXFORM(vmaxub, 1, 0),
>  GEN_VXFORM(vmaxuh, 1, 1),
>  GEN_VXFORM(vmaxuw, 1, 2),

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [PATCH 4/7] target-ppc: Implement bcdus. instruction
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 4/7] target-ppc: Implement bcdus. instruction Jose Ricardo Ziviani
@ 2016-12-05  3:14   ` David Gibson
  0 siblings, 0 replies; 19+ messages in thread
From: David Gibson @ 2016-12-05  3:14 UTC (permalink / raw)
  To: Jose Ricardo Ziviani; +Cc: qemu-ppc, qemu-devel, nikunj, bharata

On Sat, Dec 03, 2016 at 03:00:03AM -0200, Jose Ricardo Ziviani wrote:
> bcdus.: Decimal unsigned shift. This instruction works like bcds. but
> considers only unsigned BCDs (no sign in least meaning 4 bits).
> 
> Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
> ---
>  target-ppc/helper.h                 |  1 +
>  target-ppc/int_helper.c             | 43 +++++++++++++++++++++++++++++++++++++
>  target-ppc/translate/vmx-impl.inc.c |  3 +++
>  target-ppc/translate/vmx-ops.inc.c  |  2 +-
>  4 files changed, 48 insertions(+), 1 deletion(-)
> 
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index 471a1da..386ea67 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -393,6 +393,7 @@ DEF_HELPER_3(bcdctsq, i32, avr, avr, i32)
>  DEF_HELPER_4(bcdcpsgn, i32, avr, avr, avr, i32)
>  DEF_HELPER_3(bcdsetsgn, i32, avr, avr, i32)
>  DEF_HELPER_4(bcds, i32, avr, avr, avr, i32)
> +DEF_HELPER_4(bcdus, i32, avr, avr, avr, i32)
>  
>  DEF_HELPER_2(xsadddp, void, env, i32)
>  DEF_HELPER_2(xssubdp, void, env, i32)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index b25c020..4b5eea1 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -3081,6 +3081,49 @@ uint32_t helper_bcds(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
>      return cr;
>  }
>  
> +uint32_t helper_bcdus(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
> +{
> +    int cr;
> +    int i;
> +    int invalid = 0;
> +    bool ox_flag = false;
> +    ppc_avr_t ret = *b;
> +
> +#if defined(HOST_WORDS_BIGENDIAN)
> +    int upper = ARRAY_SIZE(a->s32) - 1;
> +#else
> +    int upper = 0;
> +#endif

Retrieving the shift in terms of s32 elements seems very odd when the
architecture defines the shift argument in terms of byte elements of
the vector.
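
To make the point above concrete, here is a minimal sketch of reading the
shift count as a single signed byte rather than as a 32-bit word. Everything
in it is hypothetical and not taken from the patch: the vec128_be type, the
function names, and in particular the choice of byte element 7 (big-endian
numbering), which is only my reading of the ISA and should be checked against
the ISA document.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector register, stored in
 * architectural (big-endian) byte order: byte[0] is most significant. */
typedef struct {
    uint8_t byte[16];
} vec128_be;

/* ASSUMPTION: the shift count is the signed byte at element 7.
 * Sign-extend by hand so the result does not depend on how the
 * compiler converts out-of-range values to int8_t. */
int bcd_shift_count(const vec128_be *vra)
{
    int v = vra->byte[7];
    return (v >= 128) ? v - 256 : v;
}

/* Clamp to +/-31 digits, mirroring the clamping in the quoted helper. */
int bcd_clamp_shift(int n)
{
    if (n > 31) {
        return 31;
    }
    if (n < -31) {
        return -31;
    }
    return n;
}

/* Usage example: neighbouring bytes must not influence the count. */
int bcd_shift_count_demo(void)
{
    vec128_be vra = { { 0 } };

    vra.byte[6] = 0x56;     /* unrelated data next to the count */
    vra.byte[7] = 0xFD;     /* -3 as a signed byte */
    assert(bcd_shift_count(&vra) == -3);
    return 0;
}
```

With a byte-wide read, whatever sits in the neighbouring bytes cannot change
the sign or magnitude of the shift, which a 32-bit read cannot guarantee.
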

> +
> +    for (i = 0; i < 32; i++) {
> +        bcd_get_digit(b, i, &invalid);
> +
> +        if (unlikely(invalid)) {
> +            return CRF_SO;
> +        }
> +    }
> +
> +    if (a->s32[upper] >= 32) {
> +        ox_flag = true;
> +        ret.u64[LO_IDX] = ret.u64[HI_IDX] = 0;
> +    } else if (a->s32[upper] <= -32) {
> +        ret.u64[LO_IDX] = ret.u64[HI_IDX] = 0;
> +    } else if (a->s32[upper] > 0) {
> +        i = a->s32[upper] & 31;
> +        ulshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4, &ox_flag);
> +    } else {
> +        i = (-a->s32[upper]) & 31;
> +        urshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4);
> +    }
> +
> +    cr = bcd_cmp_zero(r);
> +    if (unlikely(ox_flag)) {
> +        cr |= CRF_SO;
> +    }
> +
> +    return cr;
> +}
> +
>  void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
>  {
>      int i;
> diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
> index 84ebb7e..fc54881 100644
> --- a/target-ppc/translate/vmx-impl.inc.c
> +++ b/target-ppc/translate/vmx-impl.inc.c
> @@ -1017,6 +1017,7 @@ GEN_BCD2(bcdctsq)
>  GEN_BCD2(bcdsetsgn)
>  GEN_BCD(bcdcpsgn);
>  GEN_BCD(bcds);
> +GEN_BCD(bcdus);
>  
>  static void gen_xpnd04_1(DisasContext *ctx)
>  {
> @@ -1093,6 +1094,8 @@ GEN_VXFORM_DUAL(vaddshs, PPC_ALTIVEC, PPC_NONE, \
>                  bcdcpsgn, PPC_NONE, PPC2_ISA300)
>  GEN_VXFORM_DUAL(vsubudm, PPC2_ALTIVEC_207, PPC_NONE, \
>                  bcds, PPC_NONE, PPC2_ISA300)
> +GEN_VXFORM_DUAL(vsubuwm, PPC_ALTIVEC, PPC_NONE, \
> +                bcdus, PPC_NONE, PPC2_ISA300)
>  
>  static void gen_vsbox(DisasContext *ctx)
>  {
> diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
> index 7b4b009..cdd3abe 100644
> --- a/target-ppc/translate/vmx-ops.inc.c
> +++ b/target-ppc/translate/vmx-ops.inc.c
> @@ -61,7 +61,7 @@ GEN_VXFORM(vadduwm, 0, 2),
>  GEN_VXFORM_207(vaddudm, 0, 3),
>  GEN_VXFORM_DUAL(vsububm, bcdadd, 0, 16, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM_DUAL(vsubuhm, bcdsub, 0, 17, PPC_ALTIVEC, PPC_NONE),
> -GEN_VXFORM(vsubuwm, 0, 18),
> +GEN_VXFORM_DUAL(vsubuwm, bcdus, 0, 18, PPC_ALTIVEC, PPC2_ISA300),
>  GEN_VXFORM_DUAL(vsubudm, bcds, 0, 19, PPC2_ALTIVEC_207, PPC2_ISA300),
>  GEN_VXFORM_300(bcds, 0, 27),
>  GEN_VXFORM(vmaxub, 1, 0),

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [PATCH 3/7] target-ppc: Implement bcds. instruction
  2016-12-05  3:12   ` David Gibson
@ 2016-12-05  3:15     ` David Gibson
  0 siblings, 0 replies; 19+ messages in thread
From: David Gibson @ 2016-12-05  3:15 UTC (permalink / raw)
  To: Jose Ricardo Ziviani; +Cc: qemu-ppc, qemu-devel, nikunj, bharata

On Mon, Dec 05, 2016 at 02:12:29PM +1100, David Gibson wrote:
> On Sat, Dec 03, 2016 at 03:00:02AM -0200, Jose Ricardo Ziviani wrote:
> > bcds.: Decimal shift. Given two registers vra and vrb, this instruction
> > shift the vrb value by vra bits into the result register.
> > 
> > Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
> 
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

Sorry, I take that back...

> > ---
> >  target-ppc/helper.h                 |  1 +
> >  target-ppc/int_helper.c             | 38 +++++++++++++++++++++++++++++++++++++
> >  target-ppc/translate/vmx-impl.inc.c |  3 +++
> >  target-ppc/translate/vmx-ops.inc.c  |  3 ++-
> >  4 files changed, 44 insertions(+), 1 deletion(-)
> > 
> > diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> > index bc39efb..471a1da 100644
> > --- a/target-ppc/helper.h
> > +++ b/target-ppc/helper.h
> > @@ -392,6 +392,7 @@ DEF_HELPER_3(bcdcfsq, i32, avr, avr, i32)
> >  DEF_HELPER_3(bcdctsq, i32, avr, avr, i32)
> >  DEF_HELPER_4(bcdcpsgn, i32, avr, avr, avr, i32)
> >  DEF_HELPER_3(bcdsetsgn, i32, avr, avr, i32)
> > +DEF_HELPER_4(bcds, i32, avr, avr, avr, i32)
> >  
> >  DEF_HELPER_2(xsadddp, void, env, i32)
> >  DEF_HELPER_2(xssubdp, void, env, i32)
> > diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> > index 7989b1f..b25c020 100644
> > --- a/target-ppc/int_helper.c
> > +++ b/target-ppc/int_helper.c
> > @@ -3043,6 +3043,44 @@ uint32_t helper_bcdsetsgn(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
> >      return bcd_cmp_zero(r);
> >  }
> >  
> > +uint32_t helper_bcds(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
> > +{
> > +    int cr;
> > +    int i = 0;
> > +    bool ox_flag = false;
> > +    int sgnb = bcd_get_sgn(b);
> > +    ppc_avr_t ret = *b;
> > +    ret.u64[LO_IDX] &= ~0xf;
> > +
> > +#if defined(HOST_WORDS_BIGENDIAN)
> > +    int upper = ARRAY_SIZE(a->s32) - 1;
> > +#else
> > +    int upper = 0;
> > +#endif
> > +
> > +    if (bcd_is_valid(b) == false) {
> > +        return CRF_SO;
> > +    }
> > +
> > +    if (a->s32[upper] > 0) {
> > +        i = (a->s32[upper] > 31) ? 31 : a->s32[upper];

Determining the shift in terms of s32 elements seems dubious when the
arch defines it in terms of byte elements.

> > +        ulshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4, &ox_flag);
> > +    } else {
> > +        i = (a->s32[upper] < -31) ? 31 : -a->s32[upper];
> > +        urshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4);
> > +    }
> > +    bcd_put_digit(&ret, bcd_preferred_sgn(sgnb, ps), 0);
> > +
> > +    *r = ret;
> > +
> > +    cr = bcd_cmp_zero(r);
> > +    if (unlikely(ox_flag)) {
> > +        cr |= CRF_SO;
> > +    }
> > +
> > +    return cr;
> > +}
> > +
> >  void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
> >  {
> >      int i;
> > diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
> > index e8e527f..84ebb7e 100644
> > --- a/target-ppc/translate/vmx-impl.inc.c
> > +++ b/target-ppc/translate/vmx-impl.inc.c
> > @@ -1016,6 +1016,7 @@ GEN_BCD2(bcdcfsq)
> >  GEN_BCD2(bcdctsq)
> >  GEN_BCD2(bcdsetsgn)
> >  GEN_BCD(bcdcpsgn);
> > +GEN_BCD(bcds);
> >  
> >  static void gen_xpnd04_1(DisasContext *ctx)
> >  {
> > @@ -1090,6 +1091,8 @@ GEN_VXFORM_DUAL(vsubuhs, PPC_ALTIVEC, PPC_NONE, \
> >                  bcdsub, PPC_NONE, PPC2_ALTIVEC_207)
> >  GEN_VXFORM_DUAL(vaddshs, PPC_ALTIVEC, PPC_NONE, \
> >                  bcdcpsgn, PPC_NONE, PPC2_ISA300)
> > +GEN_VXFORM_DUAL(vsubudm, PPC2_ALTIVEC_207, PPC_NONE, \
> > +                bcds, PPC_NONE, PPC2_ISA300)
> >  
> >  static void gen_vsbox(DisasContext *ctx)
> >  {
> > diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
> > index 57dce6e..7b4b009 100644
> > --- a/target-ppc/translate/vmx-ops.inc.c
> > +++ b/target-ppc/translate/vmx-ops.inc.c
> > @@ -62,7 +62,8 @@ GEN_VXFORM_207(vaddudm, 0, 3),
> >  GEN_VXFORM_DUAL(vsububm, bcdadd, 0, 16, PPC_ALTIVEC, PPC_NONE),
> >  GEN_VXFORM_DUAL(vsubuhm, bcdsub, 0, 17, PPC_ALTIVEC, PPC_NONE),
> >  GEN_VXFORM(vsubuwm, 0, 18),
> > -GEN_VXFORM_207(vsubudm, 0, 19),
> > +GEN_VXFORM_DUAL(vsubudm, bcds, 0, 19, PPC2_ALTIVEC_207, PPC2_ISA300),
> > +GEN_VXFORM_300(bcds, 0, 27),
> >  GEN_VXFORM(vmaxub, 1, 0),
> >  GEN_VXFORM(vmaxuh, 1, 1),
> >  GEN_VXFORM(vmaxuw, 1, 2),
> 



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [PATCH 5/7] target-ppc: Implement bcdsr. instruction
  2016-12-03  5:00 ` [Qemu-devel] [PATCH 5/7] target-ppc: Implement bcdsr. instruction Jose Ricardo Ziviani
@ 2016-12-05  3:19   ` David Gibson
  2016-12-05  9:45     ` [Qemu-devel] [Qemu-ppc] " joserz
  2016-12-05 18:52     ` joserz
  0 siblings, 2 replies; 19+ messages in thread
From: David Gibson @ 2016-12-05  3:19 UTC (permalink / raw)
  To: Jose Ricardo Ziviani; +Cc: qemu-ppc, qemu-devel, nikunj, bharata

On Sat, Dec 03, 2016 at 03:00:04AM -0200, Jose Ricardo Ziviani wrote:
> bcdsr.: Decimal shift and round. This instruction works like bcds.
> however, when performing right shift, 1 will be added to the
> result if the last digit was >= 5.
> 
> Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
> ---
>  target-ppc/helper.h                 |  1 +
>  target-ppc/int_helper.c             | 45 +++++++++++++++++++++++++++++++++++++
>  target-ppc/translate/vmx-impl.inc.c |  1 +
>  target-ppc/translate/vmx-ops.inc.c  |  2 ++
>  4 files changed, 49 insertions(+)
> 
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index 386ea67..d9528eb 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -394,6 +394,7 @@ DEF_HELPER_4(bcdcpsgn, i32, avr, avr, avr, i32)
>  DEF_HELPER_3(bcdsetsgn, i32, avr, avr, i32)
>  DEF_HELPER_4(bcds, i32, avr, avr, avr, i32)
>  DEF_HELPER_4(bcdus, i32, avr, avr, avr, i32)
> +DEF_HELPER_4(bcdsr, i32, avr, avr, avr, i32)
>  
>  DEF_HELPER_2(xsadddp, void, env, i32)
>  DEF_HELPER_2(xssubdp, void, env, i32)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index 4b5eea1..c9fcb1a 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -3124,6 +3124,51 @@ uint32_t helper_bcdus(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
>      return cr;
>  }
>  
> +uint32_t helper_bcdsr(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
> +{
> +    int cr;
> +    int i;
> +    int unused = 0;
> +    int invalid = 0;
> +    bool ox_flag = false;
> +    int sgnb = bcd_get_sgn(b);
> +    ppc_avr_t ret = *b;
> +    ret.u64[LO_IDX] &= ~0xf;
> +
> +#if defined(HOST_WORDS_BIGENDIAN)
> +    ppc_avr_t bcd_one = { .u64 = { 0, 0x10 } };
> +    int upper = ARRAY_SIZE(a->s32) - 1;

Same comment as previous patches about the shift argument.

> +#else
> +    ppc_avr_t bcd_one = { .u64 = { 0x10, 0 } };
> +    int upper = 0;
> +#endif
> +
> +    if (bcd_is_valid(b) == false) {
> +        return CRF_SO;
> +    }
> +
> +    if (a->s32[upper] > 0) {
> +        i = (a->s32[upper] > 31) ? 31 : a->s32[upper];
> +        ulshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4, &ox_flag);
> +    } else {
> +        i = (a->s32[upper] < -31) ? 31 : -a->s32[upper];
> +        urshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4);
> +
> +        if (bcd_get_digit(&ret, 0, &invalid) >= 5) {

So, the ISA actually says you increment only if the last digit is >
5.  That doesn't seem like correct rounding, so it might be an error
in the ISA document - best check this with the hardware people.
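
A small worked example of why the two thresholds differ, using plain
integers instead of packed BCD. This is only an illustration: the function
name and the threshold parameter are made up here, and nothing in it says
which behaviour the hardware actually implements.

```c
#include <assert.h>
#include <stdint.h>

/* Drop the lowest `n` decimal digits of `value`, then add 1 if the
 * last digit dropped (the most significant one) is >= `threshold`.
 * threshold = 5 models ">= 5" (round half up); threshold = 6 models
 * "> 5".  Hypothetical helper, not part of the patch. */
uint64_t shift_right_round(uint64_t value, unsigned n, unsigned threshold)
{
    uint64_t last_dropped = 0;

    while (n-- > 0) {
        last_dropped = value % 10;
        value /= 10;
    }
    if (last_dropped >= threshold) {
        value += 1;
    }
    return value;
}

/* Usage example: 125 shifted right by one decimal digit. */
int shift_right_round_demo(void)
{
    assert(shift_right_round(125, 1, 5) == 13);  /* ">= 5" rounds up  */
    assert(shift_right_round(125, 1, 6) == 12);  /* "> 5" truncates   */
    return 0;
}
```

The two rules disagree only when the last dropped digit is exactly 5, so a
test vector ending in 5 is enough to tell them apart on real hardware.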

> +            bcd_add_mag(&ret, &ret, &bcd_one, &invalid, &unused);
> +        }
> +    }
> +    bcd_put_digit(&ret, bcd_preferred_sgn(sgnb, ps), 0);
> +
> +    cr = bcd_cmp_zero(&ret);
> +    if (unlikely(ox_flag)) {
> +        cr |= CRF_SO;
> +    }
> +    *r = ret;
> +
> +    return cr;
> +}
> +
>  void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
>  {
>      int i;
> diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
> index fc54881..451abb5 100644
> --- a/target-ppc/translate/vmx-impl.inc.c
> +++ b/target-ppc/translate/vmx-impl.inc.c
> @@ -1018,6 +1018,7 @@ GEN_BCD2(bcdsetsgn)
>  GEN_BCD(bcdcpsgn);
>  GEN_BCD(bcds);
>  GEN_BCD(bcdus);
> +GEN_BCD(bcdsr);
>  
>  static void gen_xpnd04_1(DisasContext *ctx)
>  {
> diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
> index cdd3abe..fa9c996 100644
> --- a/target-ppc/translate/vmx-ops.inc.c
> +++ b/target-ppc/translate/vmx-ops.inc.c
> @@ -132,6 +132,8 @@ GEN_HANDLER_E_2(vprtybd, 0x4, 0x1, 0x18, 9, 0, PPC_NONE, PPC2_ISA300),
>  GEN_HANDLER_E_2(vprtybq, 0x4, 0x1, 0x18, 10, 0, PPC_NONE, PPC2_ISA300),
>  
>  GEN_VXFORM_DUAL(vsubcuw, xpnd04_1, 0, 22, PPC_ALTIVEC, PPC_NONE),
> +GEN_VXFORM_300(bcdsr, 0, 23),
> +GEN_VXFORM_300(bcdsr, 0, 31),
>  GEN_VXFORM_DUAL(vaddubs, vmul10uq, 0, 8, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM_DUAL(vadduhs, vmul10euq, 0, 9, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM(vadduws, 0, 10),

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/7] target-ppc: Implement unsigned quadword left/right shift and unit tests
  2016-12-05  1:56     ` David Gibson
@ 2016-12-05  9:35       ` joserz
  2016-12-05 22:59         ` David Gibson
  0 siblings, 1 reply; 19+ messages in thread
From: joserz @ 2016-12-05  9:35 UTC (permalink / raw)
  To: David Gibson; +Cc: Richard Henderson, qemu-ppc, qemu-devel, bharata

On Mon, Dec 05, 2016 at 12:56:39PM +1100, David Gibson wrote:
> On Sat, Dec 03, 2016 at 05:37:27PM -0800, Richard Henderson wrote:
> > On 12/02/2016 09:00 PM, Jose Ricardo Ziviani wrote:
> > > +++ b/include/qemu/host-utils.h
> > > @@ -29,6 +29,33 @@
> > >  #include "qemu/bswap.h"
> > > 
> > >  #ifdef CONFIG_INT128
> > > +static inline void urshift(uint64_t *plow, uint64_t *phigh, uint32_t shift)
> > > +{
> > > +    __uint128_t val = ((__uint128_t)*phigh << 64) | *plow;
> > > +    val >>= (shift & 127);
> > > +    *phigh = val >> 64;
> > > +    *plow = val & 0xffffffffffffffff;
> > > +}
> > > +
> > > +static inline void ulshift(uint64_t *plow, uint64_t *phigh,
> > > +                           uint32_t shift, bool *overflow)
> > > +{
> > > +    __uint128_t val = ((__uint128_t)*phigh << 64) | *plow;
> > > +
> > > +    if (shift == 0) {
> > > +        return;
> > > +    }
> > > +
> > > +    if (shift > 127 || (val >> (128 - (shift & 127))) != 0) {
> > > +        *overflow = true;
> > > +    }
> > > +
> > > +    val <<= (shift & 127);
> > > +
> > > +    *phigh = val >> 64;
> > > +    *plow = val & 0xffffffffffffffff;
> > > +}
> > > +
> > 
> > This belongs in qemu/int128.h, not here.  And certainly not predicated on
> > CONFIG_INT128.
> 
> Is there actually any advantage to the __uint128_t based versions over
> the 64-bit versions?

Nothing special here. It just looks clearer (to me) to shift all 128
bits at once than 2x64. But I agree we won't lose anything by having just
one function outside CONFIG_INT128.

So, I'll remove these two functions and keep only the other two using
uint64_t types.
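
For reference, a minimal sketch of what the uint64_t-only variants could
look like. It is not the code from the patch: the names urshift64x2 and
ulshift64x2 are hypothetical, and the behaviour (masking the count with 127,
reporting overflow when any set bit is shifted out or the count exceeds 127)
is only modelled on the __uint128_t versions quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Logical right shift of the 128-bit value (*phigh:*plow). */
void urshift64x2(uint64_t *plow, uint64_t *phigh, uint32_t shift)
{
    shift &= 127;
    if (shift == 0) {
        return;
    }
    if (shift >= 64) {
        *plow = *phigh >> (shift - 64);
        *phigh = 0;
    } else {
        *plow = (*plow >> shift) | (*phigh << (64 - shift));
        *phigh >>= shift;
    }
}

/* Left shift of (*phigh:*plow); sets *overflow if set bits are lost.
 * *overflow is never cleared, matching the quoted version. */
void ulshift64x2(uint64_t *plow, uint64_t *phigh,
                 uint32_t shift, bool *overflow)
{
    uint64_t low = *plow;
    uint64_t high = *phigh;

    if (shift == 0) {
        return;
    }
    if (shift > 127) {
        *overflow = true;
    }
    shift &= 127;
    if (shift == 0) {
        return;
    }
    if (shift >= 64) {
        /* All of high is lost, plus the top (shift - 64) bits of low. */
        if (high != 0 || (shift > 64 && (low >> (128 - shift)) != 0)) {
            *overflow = true;
        }
        high = low << (shift - 64);
        low = 0;
    } else {
        if ((high >> (64 - shift)) != 0) {
            *overflow = true;
        }
        high = (high << shift) | (low >> (64 - shift));
        low <<= shift;
    }
    *plow = low;
    *phigh = high;
}

/* Usage example: moving bit 0 across the 64-bit boundary and back. */
int shift64x2_demo(void)
{
    uint64_t lo = 1, hi = 0;
    bool ov = false;

    ulshift64x2(&lo, &hi, 64, &ov);
    assert(lo == 0 && hi == 1 && !ov);
    urshift64x2(&lo, &hi, 64);
    assert(lo == 1 && hi == 0);
    return 0;
}
```

Every individual shift count stays in the range 0..63, so the sketch avoids
the undefined behaviour of shifting a 64-bit value by 64 or more.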

Anyway, I'm a bit confused about int128.h and host-utils.h. I see
functions like divu128 and divs128 that could be in int128.h, since
there is no similar operation there. Is there any rule about it?

Thank you guys for reviewing it!

> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 5/7] target-ppc: Implement bcdsr. instruction
  2016-12-05  3:19   ` David Gibson
@ 2016-12-05  9:45     ` joserz
  2016-12-05 18:52     ` joserz
  1 sibling, 0 replies; 19+ messages in thread
From: joserz @ 2016-12-05  9:45 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, bharata

On Mon, Dec 05, 2016 at 02:19:26PM +1100, David Gibson wrote:
> On Sat, Dec 03, 2016 at 03:00:04AM -0200, Jose Ricardo Ziviani wrote:
> > bcdsr.: Decimal shift and round. This instruction works like bcds.
> > however, when performing right shift, 1 will be added to the
> > result if the last digit was >= 5.
> > 
> > Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
> > ---
> >  target-ppc/helper.h                 |  1 +
> >  target-ppc/int_helper.c             | 45 +++++++++++++++++++++++++++++++++++++
> >  target-ppc/translate/vmx-impl.inc.c |  1 +
> >  target-ppc/translate/vmx-ops.inc.c  |  2 ++
> >  4 files changed, 49 insertions(+)
> > 
> > diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> > index 386ea67..d9528eb 100644
> > --- a/target-ppc/helper.h
> > +++ b/target-ppc/helper.h
> > @@ -394,6 +394,7 @@ DEF_HELPER_4(bcdcpsgn, i32, avr, avr, avr, i32)
> >  DEF_HELPER_3(bcdsetsgn, i32, avr, avr, i32)
> >  DEF_HELPER_4(bcds, i32, avr, avr, avr, i32)
> >  DEF_HELPER_4(bcdus, i32, avr, avr, avr, i32)
> > +DEF_HELPER_4(bcdsr, i32, avr, avr, avr, i32)
> >  
> >  DEF_HELPER_2(xsadddp, void, env, i32)
> >  DEF_HELPER_2(xssubdp, void, env, i32)
> > diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> > index 4b5eea1..c9fcb1a 100644
> > --- a/target-ppc/int_helper.c
> > +++ b/target-ppc/int_helper.c
> > @@ -3124,6 +3124,51 @@ uint32_t helper_bcdus(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
> >      return cr;
> >  }
> >  
> > +uint32_t helper_bcdsr(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
> > +{
> > +    int cr;
> > +    int i;
> > +    int unused = 0;
> > +    int invalid = 0;
> > +    bool ox_flag = false;
> > +    int sgnb = bcd_get_sgn(b);
> > +    ppc_avr_t ret = *b;
> > +    ret.u64[LO_IDX] &= ~0xf;
> > +
> > +#if defined(HOST_WORDS_BIGENDIAN)
> > +    ppc_avr_t bcd_one = { .u64 = { 0, 0x10 } };
> > +    int upper = ARRAY_SIZE(a->s32) - 1;
> 
> Same comment as previous patches about the shift argument.

Thanks, I'll change it here/previous patches as well.

> 
> > +#else
> > +    ppc_avr_t bcd_one = { .u64 = { 0x10, 0 } };
> > +    int upper = 0;
> > +#endif
> > +
> > +    if (bcd_is_valid(b) == false) {
> > +        return CRF_SO;
> > +    }
> > +
> > +    if (a->s32[upper] > 0) {
> > +        i = (a->s32[upper] > 31) ? 31 : a->s32[upper];
> > +        ulshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4, &ox_flag);
> > +    } else {
> > +        i = (a->s32[upper] < -31) ? 31 : -a->s32[upper];
> > +        urshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4);
> > +
> > +        if (bcd_get_digit(&ret, 0, &invalid) >= 5) {
> 
> So, the ISA actually says you increment only if the last digit is >
> 5.  That doesn't seem like correct rounding, so it might be an error
> in the ISA document - best check this with the hardware people.

Good point, I'll look for someone to help here and get back to you as
soon as I have something.

Thanks David

> 
> > +            bcd_add_mag(&ret, &ret, &bcd_one, &invalid, &unused);
> > +        }
> > +    }
> > +    bcd_put_digit(&ret, bcd_preferred_sgn(sgnb, ps), 0);
> > +
> > +    cr = bcd_cmp_zero(&ret);
> > +    if (unlikely(ox_flag)) {
> > +        cr |= CRF_SO;
> > +    }
> > +    *r = ret;
> > +
> > +    return cr;
> > +}
> > +
> >  void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
> >  {
> >      int i;
> > diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
> > index fc54881..451abb5 100644
> > --- a/target-ppc/translate/vmx-impl.inc.c
> > +++ b/target-ppc/translate/vmx-impl.inc.c
> > @@ -1018,6 +1018,7 @@ GEN_BCD2(bcdsetsgn)
> >  GEN_BCD(bcdcpsgn);
> >  GEN_BCD(bcds);
> >  GEN_BCD(bcdus);
> > +GEN_BCD(bcdsr);
> >  
> >  static void gen_xpnd04_1(DisasContext *ctx)
> >  {
> > diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
> > index cdd3abe..fa9c996 100644
> > --- a/target-ppc/translate/vmx-ops.inc.c
> > +++ b/target-ppc/translate/vmx-ops.inc.c
> > @@ -132,6 +132,8 @@ GEN_HANDLER_E_2(vprtybd, 0x4, 0x1, 0x18, 9, 0, PPC_NONE, PPC2_ISA300),
> >  GEN_HANDLER_E_2(vprtybq, 0x4, 0x1, 0x18, 10, 0, PPC_NONE, PPC2_ISA300),
> >  
> >  GEN_VXFORM_DUAL(vsubcuw, xpnd04_1, 0, 22, PPC_ALTIVEC, PPC_NONE),
> > +GEN_VXFORM_300(bcdsr, 0, 23),
> > +GEN_VXFORM_300(bcdsr, 0, 31),
> >  GEN_VXFORM_DUAL(vaddubs, vmul10uq, 0, 8, PPC_ALTIVEC, PPC_NONE),
> >  GEN_VXFORM_DUAL(vadduhs, vmul10euq, 0, 9, PPC_ALTIVEC, PPC_NONE),
> >  GEN_VXFORM(vadduws, 0, 10),
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [Qemu-ppc] [PATCH 5/7] target-ppc: Implement bcdsr. instruction
  2016-12-05  3:19   ` David Gibson
  2016-12-05  9:45     ` [Qemu-devel] [Qemu-ppc] " joserz
@ 2016-12-05 18:52     ` joserz
  2016-12-05 23:01       ` David Gibson
  1 sibling, 1 reply; 19+ messages in thread
From: joserz @ 2016-12-05 18:52 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, bharata

On Mon, Dec 05, 2016 at 02:19:26PM +1100, David Gibson wrote:
> On Sat, Dec 03, 2016 at 03:00:04AM -0200, Jose Ricardo Ziviani wrote:
> > bcdsr.: Decimal shift and round. This instruction works like bcds.
> > however, when performing right shift, 1 will be added to the
> > result if the last digit was >= 5.
> > 
> > Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
> > ---
> >  target-ppc/helper.h                 |  1 +
> >  target-ppc/int_helper.c             | 45 +++++++++++++++++++++++++++++++++++++
> >  target-ppc/translate/vmx-impl.inc.c |  1 +
> >  target-ppc/translate/vmx-ops.inc.c  |  2 ++
> >  4 files changed, 49 insertions(+)
> > 
> > diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> > index 386ea67..d9528eb 100644
> > --- a/target-ppc/helper.h
> > +++ b/target-ppc/helper.h
> > @@ -394,6 +394,7 @@ DEF_HELPER_4(bcdcpsgn, i32, avr, avr, avr, i32)
> >  DEF_HELPER_3(bcdsetsgn, i32, avr, avr, i32)
> >  DEF_HELPER_4(bcds, i32, avr, avr, avr, i32)
> >  DEF_HELPER_4(bcdus, i32, avr, avr, avr, i32)
> > +DEF_HELPER_4(bcdsr, i32, avr, avr, avr, i32)
> >  
> >  DEF_HELPER_2(xsadddp, void, env, i32)
> >  DEF_HELPER_2(xssubdp, void, env, i32)
> > diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> > index 4b5eea1..c9fcb1a 100644
> > --- a/target-ppc/int_helper.c
> > +++ b/target-ppc/int_helper.c
> > @@ -3124,6 +3124,51 @@ uint32_t helper_bcdus(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
> >      return cr;
> >  }
> >  
> > +uint32_t helper_bcdsr(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
> > +{
> > +    int cr;
> > +    int i;
> > +    int unused = 0;
> > +    int invalid = 0;
> > +    bool ox_flag = false;
> > +    int sgnb = bcd_get_sgn(b);
> > +    ppc_avr_t ret = *b;
> > +    ret.u64[LO_IDX] &= ~0xf;
> > +
> > +#if defined(HOST_WORDS_BIGENDIAN)
> > +    ppc_avr_t bcd_one = { .u64 = { 0, 0x10 } };
> > +    int upper = ARRAY_SIZE(a->s32) - 1;
> 
> Same comment as previous patches about the shift argument.
> 
> > +#else
> > +    ppc_avr_t bcd_one = { .u64 = { 0x10, 0 } };
> > +    int upper = 0;
> > +#endif
> > +
> > +    if (bcd_is_valid(b) == false) {
> > +        return CRF_SO;
> > +    }
> > +
> > +    if (a->s32[upper] > 0) {
> > +        i = (a->s32[upper] > 31) ? 31 : a->s32[upper];
> > +        ulshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4, &ox_flag);
> > +    } else {
> > +        i = (a->s32[upper] < -31) ? 31 : -a->s32[upper];
> > +        urshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4);
> > +
> > +        if (bcd_get_digit(&ret, 0, &invalid) >= 5) {
> 
> So, the ISA actually says you increment only if the last digit is >
> 5.  That doesn't seem like correct rounding, so it might be an error
> in the ISA document - best check this with the hardware people.
> 

I just checked with the hardware team here, and they will update this to
"greater than or equal to 5" in the next version (v3.0B).

I was told this operation is used more as arithmetic (divide by power of
10) than logical.
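
To make the "divide by a power of 10, rounding on the last dropped digit"
reading concrete, here is a minimal sketch in C. It is not the QEMU helper:
it assumes a simplified 64-bit packed BCD layout (15 digits in nibbles 1..15,
sign code in nibble 0), and the name bcd64_shift_right_round is hypothetical.
It only illustrates the ">= 5" rule discussed above.

```c
#include <stdint.h>

/*
 * Illustration only (hypothetical helper, simplified 64-bit layout):
 * shift a packed BCD value right by n digits and add 1 to the magnitude
 * when the last digit shifted out is >= 5.
 */
static uint64_t bcd64_shift_right_round(uint64_t v, unsigned n)
{
    uint64_t sign = v & 0xf;   /* sign code lives in nibble 0 */
    uint64_t mag = v >> 4;     /* 15 digits, least significant first */
    unsigned dropped, i;

    if (n == 0) {
        return v;              /* nothing shifted out, nothing to round */
    }
    if (n > 15) {
        return sign;           /* every digit is shifted out */
    }

    mag >>= 4 * (n - 1);       /* drop the first n - 1 digits */
    dropped = mag & 0xf;       /* last digit to be shifted out */
    mag >>= 4;

    if (dropped >= 5) {
        /* Decimal increment: 9 wraps to 0 and carries into the next digit. */
        for (i = 0; i < 15; i++) {
            uint64_t digit = (mag >> (4 * i)) & 0xf;
            if (digit < 9) {
                mag += (uint64_t)1 << (4 * i);
                break;
            }
            mag &= ~((uint64_t)0xf << (4 * i));
        }
    }
    return (mag << 4) | sign;
}
```

With the sign code 0xC, shifting 0x1235C right by one digit gives 0x124C
(1235 / 10 rounds to 124), while 0x1234C gives 0x123C. The rounding is applied
to the magnitude only, so the sign does not affect it.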

> > +            bcd_add_mag(&ret, &ret, &bcd_one, &invalid, &unused);
> > +        }
> > +    }
> > +    bcd_put_digit(&ret, bcd_preferred_sgn(sgnb, ps), 0);
> > +
> > +    cr = bcd_cmp_zero(&ret);
> > +    if (unlikely(ox_flag)) {
> > +        cr |= CRF_SO;
> > +    }
> > +    *r = ret;
> > +
> > +    return cr;
> > +}
> > +
> >  void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
> >  {
> >      int i;
> > diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
> > index fc54881..451abb5 100644
> > --- a/target-ppc/translate/vmx-impl.inc.c
> > +++ b/target-ppc/translate/vmx-impl.inc.c
> > @@ -1018,6 +1018,7 @@ GEN_BCD2(bcdsetsgn)
> >  GEN_BCD(bcdcpsgn);
> >  GEN_BCD(bcds);
> >  GEN_BCD(bcdus);
> > +GEN_BCD(bcdsr);
> >  
> >  static void gen_xpnd04_1(DisasContext *ctx)
> >  {
> > diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
> > index cdd3abe..fa9c996 100644
> > --- a/target-ppc/translate/vmx-ops.inc.c
> > +++ b/target-ppc/translate/vmx-ops.inc.c
> > @@ -132,6 +132,8 @@ GEN_HANDLER_E_2(vprtybd, 0x4, 0x1, 0x18, 9, 0, PPC_NONE, PPC2_ISA300),
> >  GEN_HANDLER_E_2(vprtybq, 0x4, 0x1, 0x18, 10, 0, PPC_NONE, PPC2_ISA300),
> >  
> >  GEN_VXFORM_DUAL(vsubcuw, xpnd04_1, 0, 22, PPC_ALTIVEC, PPC_NONE),
> > +GEN_VXFORM_300(bcdsr, 0, 23),
> > +GEN_VXFORM_300(bcdsr, 0, 31),
> >  GEN_VXFORM_DUAL(vaddubs, vmul10uq, 0, 8, PPC_ALTIVEC, PPC_NONE),
> >  GEN_VXFORM_DUAL(vadduhs, vmul10euq, 0, 9, PPC_ALTIVEC, PPC_NONE),
> >  GEN_VXFORM(vadduws, 0, 10),
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/7] target-ppc: Implement unsigned quadword left/right shift and unit tests
  2016-12-05  9:35       ` [Qemu-devel] [Qemu-ppc] " joserz
@ 2016-12-05 22:59         ` David Gibson
  0 siblings, 0 replies; 19+ messages in thread
From: David Gibson @ 2016-12-05 22:59 UTC (permalink / raw)
  To: joserz; +Cc: Richard Henderson, qemu-ppc, qemu-devel, bharata

[-- Attachment #1: Type: text/plain, Size: 2470 bytes --]

On Mon, Dec 05, 2016 at 07:35:39AM -0200, joserz@linux.vnet.ibm.com wrote:
> On Mon, Dec 05, 2016 at 12:56:39PM +1100, David Gibson wrote:
> > On Sat, Dec 03, 2016 at 05:37:27PM -0800, Richard Henderson wrote:
> > > On 12/02/2016 09:00 PM, Jose Ricardo Ziviani wrote:
> > > > +++ b/include/qemu/host-utils.h
> > > > @@ -29,6 +29,33 @@
> > > >  #include "qemu/bswap.h"
> > > > 
> > > >  #ifdef CONFIG_INT128
> > > > +static inline void urshift(uint64_t *plow, uint64_t *phigh, uint32_t shift)
> > > > +{
> > > > +    __uint128_t val = ((__uint128_t)*phigh << 64) | *plow;
> > > > +    val >>= (shift & 127);
> > > > +    *phigh = val >> 64;
> > > > +    *plow = val & 0xffffffffffffffff;
> > > > +}
> > > > +
> > > > +static inline void ulshift(uint64_t *plow, uint64_t *phigh,
> > > > +                           uint32_t shift, bool *overflow)
> > > > +{
> > > > +    __uint128_t val = ((__uint128_t)*phigh << 64) | *plow;
> > > > +
> > > > +    if (shift == 0) {
> > > > +        return;
> > > > +    }
> > > > +
> > > > +    if (shift > 127 || (val >> (128 - (shift & 127))) != 0) {
> > > > +        *overflow = true;
> > > > +    }
> > > > +
> > > > +    val <<= (shift & 127);
> > > > +
> > > > +    *phigh = val >> 64;
> > > > +    *plow = val & 0xffffffffffffffff;
> > > > +}
> > > > +
> > > 
> > > This belongs in qemu/int128.h, not here.  And certainly not predicated on
> > > CONFIG_INT128.
> > 
> > Is there actually any advantage to the __uint128_t based versions over
> > the 64-bit versions?
> 
> Nothing special here. It just looks more clear (to me) to shift all 128
> bits at once than 2x64. But I agree we lose nothing by having just one
> function outside CONFIG_INT128.

It is clearer, but having two different versions makes things less
clear again.  We need to have the 64-bit only version for compilers
without int128_t support, so...
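
For reference, a 64-bit only version can look roughly like the sketch below.
This is an illustration under assumptions, not the code from the patch: the
function names are hypothetical, and unlike the quoted __uint128_t variant
(which flags overflow for shift > 127) it simply masks the shift count to
0..127.

```c
#include <stdbool.h>
#include <stdint.h>

/* Logical right shift of the 128-bit value (*phigh:*plow); sketch only. */
static void urshift128_sketch(uint64_t *plow, uint64_t *phigh, unsigned shift)
{
    shift &= 127;
    if (shift == 0) {
        return;                 /* avoid shifting a uint64_t by 64 below */
    }
    if (shift >= 64) {
        *plow = *phigh >> (shift - 64);
        *phigh = 0;
    } else {
        *plow = (*plow >> shift) | (*phigh << (64 - shift));
        *phigh >>= shift;
    }
}

/* Left shift; sets *overflow when any non-zero bit is shifted out. */
static void ulshift128_sketch(uint64_t *plow, uint64_t *phigh,
                              unsigned shift, bool *overflow)
{
    uint64_t lost;

    shift &= 127;
    if (shift == 0) {
        return;
    }
    if (shift >= 64) {
        /* The whole high half is lost, plus the top bits of the low half. */
        lost = *phigh | (shift > 64 ? *plow >> (128 - shift) : 0);
        *phigh = *plow << (shift - 64);
        *plow = 0;
    } else {
        lost = *phigh >> (64 - shift);
        *phigh = (*phigh << shift) | (*plow >> (64 - shift));
        *plow <<= shift;
    }
    if (lost != 0) {
        *overflow = true;
    }
}
```

The explicit shift == 0 and shift >= 64 branches are there because shifting a
uint64_t by 64 or more is undefined behaviour in C, which is the main thing
the two-halves version has to work around.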

> So, I'll remove these two functions and keep only the other two using
> uint64_t types.
> 
> Anyway, I'm a bit confused about int128.h and host-utils.h. I see
> functions like divu128 and divs128 that could be in int128.h, since
> there is no similar operation in int128.h. Is there any rule about it?
> 
> Thank you guys for reviewing it!
> 
> > 
> 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [Qemu-devel] [Qemu-ppc] [PATCH 5/7] target-ppc: Implement bcdsr. instruction
  2016-12-05 18:52     ` joserz
@ 2016-12-05 23:01       ` David Gibson
  0 siblings, 0 replies; 19+ messages in thread
From: David Gibson @ 2016-12-05 23:01 UTC (permalink / raw)
  To: joserz; +Cc: qemu-ppc, qemu-devel, bharata

[-- Attachment #1: Type: text/plain, Size: 5418 bytes --]

On Mon, Dec 05, 2016 at 04:52:57PM -0200, joserz@linux.vnet.ibm.com wrote:
> On Mon, Dec 05, 2016 at 02:19:26PM +1100, David Gibson wrote:
> > On Sat, Dec 03, 2016 at 03:00:04AM -0200, Jose Ricardo Ziviani wrote:
> > > bcdsr.: Decimal shift and round. This instruction works like bcds.
> > > however, when performing right shift, 1 will be added to the
> > > result if the last digit was >= 5.
> > > 
> > > Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
> > > ---
> > >  target-ppc/helper.h                 |  1 +
> > >  target-ppc/int_helper.c             | 45 +++++++++++++++++++++++++++++++++++++
> > >  target-ppc/translate/vmx-impl.inc.c |  1 +
> > >  target-ppc/translate/vmx-ops.inc.c  |  2 ++
> > >  4 files changed, 49 insertions(+)
> > > 
> > > diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> > > index 386ea67..d9528eb 100644
> > > --- a/target-ppc/helper.h
> > > +++ b/target-ppc/helper.h
> > > @@ -394,6 +394,7 @@ DEF_HELPER_4(bcdcpsgn, i32, avr, avr, avr, i32)
> > >  DEF_HELPER_3(bcdsetsgn, i32, avr, avr, i32)
> > >  DEF_HELPER_4(bcds, i32, avr, avr, avr, i32)
> > >  DEF_HELPER_4(bcdus, i32, avr, avr, avr, i32)
> > > +DEF_HELPER_4(bcdsr, i32, avr, avr, avr, i32)
> > >  
> > >  DEF_HELPER_2(xsadddp, void, env, i32)
> > >  DEF_HELPER_2(xssubdp, void, env, i32)
> > > diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> > > index 4b5eea1..c9fcb1a 100644
> > > --- a/target-ppc/int_helper.c
> > > +++ b/target-ppc/int_helper.c
> > > @@ -3124,6 +3124,51 @@ uint32_t helper_bcdus(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
> > >      return cr;
> > >  }
> > >  
> > > +uint32_t helper_bcdsr(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t ps)
> > > +{
> > > +    int cr;
> > > +    int i;
> > > +    int unused = 0;
> > > +    int invalid = 0;
> > > +    bool ox_flag = false;
> > > +    int sgnb = bcd_get_sgn(b);
> > > +    ppc_avr_t ret = *b;
> > > +    ret.u64[LO_IDX] &= ~0xf;
> > > +
> > > +#if defined(HOST_WORDS_BIGENDIAN)
> > > +    ppc_avr_t bcd_one = { .u64 = { 0, 0x10 } };
> > > +    int upper = ARRAY_SIZE(a->s32) - 1;
> > 
> > Same comment as previous patches about the shift argument.
> > 
> > > +#else
> > > +    ppc_avr_t bcd_one = { .u64 = { 0x10, 0 } };
> > > +    int upper = 0;
> > > +#endif
> > > +
> > > +    if (bcd_is_valid(b) == false) {
> > > +        return CRF_SO;
> > > +    }
> > > +
> > > +    if (a->s32[upper] > 0) {
> > > +        i = (a->s32[upper] > 31) ? 31 : a->s32[upper];
> > > +        ulshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4, &ox_flag);
> > > +    } else {
> > > +        i = (a->s32[upper] < -31) ? 31 : -a->s32[upper];
> > > +        urshift(&ret.u64[LO_IDX], &ret.u64[HI_IDX], i * 4);
> > > +
> > > +        if (bcd_get_digit(&ret, 0, &invalid) >= 5) {
> > 
> > So, the ISA actually says you increment only if the last digit is >
> > 5.  That doesn't seem like correct rounding, so it might be an error
> > in the ISA document - best check this with the hardware people.
> > 
> 
> I just checked with the hardware team here, and they will update this to
> "greater than or equal to 5" in the next version (v3.0B).
> 
> I was told this operation is used more as arithmetic (divide by power of
> 10) than logical.

Right, that makes sense, but we needed to check.  After all, if the
actual hardware had got this wrong (not just the documentation),
then qemu would need to match that, regardless of how useless the
instruction would be as a result.

> 
> > > +            bcd_add_mag(&ret, &ret, &bcd_one, &invalid, &unused);
> > > +        }
> > > +    }
> > > +    bcd_put_digit(&ret, bcd_preferred_sgn(sgnb, ps), 0);
> > > +
> > > +    cr = bcd_cmp_zero(&ret);
> > > +    if (unlikely(ox_flag)) {
> > > +        cr |= CRF_SO;
> > > +    }
> > > +    *r = ret;
> > > +
> > > +    return cr;
> > > +}
> > > +
> > >  void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
> > >  {
> > >      int i;
> > > diff --git a/target-ppc/translate/vmx-impl.inc.c b/target-ppc/translate/vmx-impl.inc.c
> > > index fc54881..451abb5 100644
> > > --- a/target-ppc/translate/vmx-impl.inc.c
> > > +++ b/target-ppc/translate/vmx-impl.inc.c
> > > @@ -1018,6 +1018,7 @@ GEN_BCD2(bcdsetsgn)
> > >  GEN_BCD(bcdcpsgn);
> > >  GEN_BCD(bcds);
> > >  GEN_BCD(bcdus);
> > > +GEN_BCD(bcdsr);
> > >  
> > >  static void gen_xpnd04_1(DisasContext *ctx)
> > >  {
> > > diff --git a/target-ppc/translate/vmx-ops.inc.c b/target-ppc/translate/vmx-ops.inc.c
> > > index cdd3abe..fa9c996 100644
> > > --- a/target-ppc/translate/vmx-ops.inc.c
> > > +++ b/target-ppc/translate/vmx-ops.inc.c
> > > @@ -132,6 +132,8 @@ GEN_HANDLER_E_2(vprtybd, 0x4, 0x1, 0x18, 9, 0, PPC_NONE, PPC2_ISA300),
> > >  GEN_HANDLER_E_2(vprtybq, 0x4, 0x1, 0x18, 10, 0, PPC_NONE, PPC2_ISA300),
> > >  
> > >  GEN_VXFORM_DUAL(vsubcuw, xpnd04_1, 0, 22, PPC_ALTIVEC, PPC_NONE),
> > > +GEN_VXFORM_300(bcdsr, 0, 23),
> > > +GEN_VXFORM_300(bcdsr, 0, 31),
> > >  GEN_VXFORM_DUAL(vaddubs, vmul10uq, 0, 8, PPC_ALTIVEC, PPC_NONE),
> > >  GEN_VXFORM_DUAL(vadduhs, vmul10euq, 0, 9, PPC_ALTIVEC, PPC_NONE),
> > >  GEN_VXFORM(vadduws, 0, 10),
> > 
> 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


end of thread

Thread overview: 19+ messages
2016-12-03  4:59 [Qemu-devel] [PATCH 0/7] POWER9 TCG enablements - BCD functions - final part Jose Ricardo Ziviani
2016-12-03  5:00 ` [Qemu-devel] [PATCH 1/7] target-ppc: Implement bcd_is_valid function Jose Ricardo Ziviani
2016-12-03  5:00 ` [Qemu-devel] [PATCH 2/7] target-ppc: Implement unsigned quadword left/right shift and unit tests Jose Ricardo Ziviani
2016-12-04  1:37   ` Richard Henderson
2016-12-05  1:56     ` David Gibson
2016-12-05  9:35       ` [Qemu-devel] [Qemu-ppc] " joserz
2016-12-05 22:59         ` David Gibson
2016-12-03  5:00 ` [Qemu-devel] [PATCH 3/7] target-ppc: Implement bcds. instruction Jose Ricardo Ziviani
2016-12-05  3:12   ` David Gibson
2016-12-05  3:15     ` David Gibson
2016-12-03  5:00 ` [Qemu-devel] [PATCH 4/7] target-ppc: Implement bcdus. instruction Jose Ricardo Ziviani
2016-12-05  3:14   ` David Gibson
2016-12-03  5:00 ` [Qemu-devel] [PATCH 5/7] target-ppc: Implement bcdsr. instruction Jose Ricardo Ziviani
2016-12-05  3:19   ` David Gibson
2016-12-05  9:45     ` [Qemu-devel] [Qemu-ppc] " joserz
2016-12-05 18:52     ` joserz
2016-12-05 23:01       ` David Gibson
2016-12-03  5:00 ` [Qemu-devel] [PATCH 6/7] target-ppc: Implement bcdtrunc. instruction Jose Ricardo Ziviani
2016-12-03  5:00 ` [Qemu-devel] [PATCH 7/7] " Jose Ricardo Ziviani
