[PATCH 0/8] SPARC VIS fixes

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/8] SPARC VIS fixes
@ 2023-09-25  5:03 Nick Bowler
  2023-09-25  5:03 ` [PATCH 1/8] target/sparc: Fix VIS fmul8x16 input register Nick Bowler
                   ` (8 more replies)
  0 siblings, 9 replies; 22+ messages in thread
From: Nick Bowler @ 2023-09-25  5:03 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

I noticed that the fmul8x16 instruction did not appear to be emulated
correctly[1].  It would seem that emulation was not using a single-
precision input register like the real hardware does, but rather a
double-precision register, causing it to operate on the wrong data.

Every other VIS instruction which contains one or more single-precision
inputs and a double-precision output has the exact same problem.

A few computational problems are found and fixed by this series too.

All patches can be applied independently, except patch 2 adds some
helpers which are subsequently needed by patches 3, 4 and 5.

Emulation results are tested by manually comparing the output of a small
Linux test program on an UltraSparc II against the output of running the
same binary under qemu-sparc32plus on a ppc64le host system.

[1] https://gitlab.com/qemu-project/qemu/-/issues/1901

Nick Bowler (8):
  target/sparc: Fix VIS fmul8x16 input register.
  target/sparc: Fix VIS fmul8x16au instruction.
  target/sparc: Fix VIS fmul8x16al instruction.
  target/sparc: Fix VIS fmuld8sux16 instruction.
  target/sparc: Fix VIS fmuld8ulx16 instruction.
  target/sparc: Fix VIS fpmerge input registers.
  target/sparc: Fix VIS fexpand input register.
  target/sparc: Fix VIS subtraction instructions.

 target/sparc/helper.h     |  14 ++---
 target/sparc/translate.c  |  42 +++++++++++---
 target/sparc/vis_helper.c | 119 +++++++++++++++++++-------------------
 3 files changed, 101 insertions(+), 74 deletions(-)

-- 
2.41.0

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/8] target/sparc: Fix VIS fmul8x16 input register.
  2023-09-25  5:03 [PATCH 0/8] SPARC VIS fixes Nick Bowler
@ 2023-09-25  5:03 ` Nick Bowler
  2023-09-28 21:29   ` Richard Henderson
  2023-09-25  5:03 ` [PATCH 2/8] target/sparc: Fix VIS fmul8x16au instruction Nick Bowler
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Nick Bowler @ 2023-09-25  5:03 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On a real UltraSparc II CPU, the fmul8x16 instruction reads its first
input from any of the single-precision floating point registers.

But the emulator is reading the input as if the first operand encodes
a double-precision register, which in most cases will not contain the
right data and therefore the output of the emulated instruction is
just garbage.

Signed-off-by: Nick Bowler <nbowler@draconx.ca>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1901
---
 target/sparc/helper.h     | 2 +-
 target/sparc/translate.c  | 6 +++++-
 target/sparc/vis_helper.c | 9 +++++----
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index b8f1e78c75..ace731a22c 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -126,7 +126,7 @@ DEF_HELPER_FLAGS_2(fdtox, TCG_CALL_NO_RWG, s64, env, f64)
 DEF_HELPER_FLAGS_1(fqtox, TCG_CALL_NO_RWG, s64, env)
 
 DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_NO_RWG_SE, i64, i32, i64)
 DEF_HELPER_FLAGS_2(fmul8x16al, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmul8x16au, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 3bf0ab8135..bb65b8daf8 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4750,7 +4750,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x031: /* VIS I fmul8x16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8x16);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_64 = gen_load_fpr_D(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D(dc, rd);
+                    gen_helper_fmul8x16(cpu_dst_64, cpu_src1_32, cpu_src2_64);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x033: /* VIS I fmul8x16au */
                     CHECK_FPU_FEATURE(dc, VIS1);
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 3afdc6975c..d158b39b85 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -94,16 +94,17 @@ uint64_t helper_fpmerge(uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmul8x16(uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8x16(uint32_t src1, uint64_t src2)
 {
-    VIS64 s, d;
+    VIS32 s;
+    VIS64 d;
     uint32_t tmp;
 
-    s.ll = src1;
+    s.l = src1;
     d.ll = src2;
 
 #define PMUL(r)                                                 \
-    tmp = (int32_t)d.VIS_SW64(r) * (int32_t)s.VIS_B64(r);       \
+    tmp = (int32_t)d.VIS_SW64(r) * (int32_t)s.VIS_B32(r);       \
     if ((tmp & 0xff) > 0x7f) {                                  \
         tmp += 0x100;                                           \
     }                                                           \
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/8] target/sparc: Fix VIS fmul8x16au instruction.
  2023-09-25  5:03 [PATCH 0/8] SPARC VIS fixes Nick Bowler
  2023-09-25  5:03 ` [PATCH 1/8] target/sparc: Fix VIS fmul8x16 input register Nick Bowler
@ 2023-09-25  5:03 ` Nick Bowler
  2023-09-28 21:32   ` Richard Henderson
  2023-09-25  5:03 ` [PATCH 3/8] target/sparc: Fix VIS fmul8x16al instruction Nick Bowler
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Nick Bowler @ 2023-09-25  5:03 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On a real UltraSparc II, the fmul8x16au instruction takes two single-
precision input operands and returns a double-precision result.  For
the second operand, bits 31:16 are used, and bits 15:0 are ignored.

However, the emulation is taking two double-precision input operands,
and furthermore it is using bits 15:0 of the second operand (ignoring
bits 31:16).  These are unlikely to contain the correct values.

Even still, the emulator overwrites the second input before all outputs
are calculated, so even if by chance the data loaded in happens to be
correct, the results are just garbage except in trivial cases.

Signed-off-by: Nick Bowler <nbowler@draconx.ca>
---
 target/sparc/helper.h     |  2 +-
 target/sparc/translate.c  | 19 ++++++++++++++++++-
 target/sparc/vis_helper.c | 14 +++++++++-----
 3 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index ace731a22c..76e06b8ea5 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -128,7 +128,7 @@ DEF_HELPER_FLAGS_1(fqtox, TCG_CALL_NO_RWG, s64, env)
 DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_NO_RWG_SE, i64, i32, i64)
 DEF_HELPER_FLAGS_2(fmul8x16al, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-DEF_HELPER_FLAGS_2(fmul8x16au, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8x16au, TCG_CALL_NO_RWG_SE, i64, i32, i32)
 DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmul8ulx16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index bb65b8daf8..ca81b35a25 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -1786,6 +1786,23 @@ static void gen_fop_DFF(DisasContext *dc, int rd, int rs1, int rs2,
     gen_store_fpr_D(dc, rd, dst);
 }
 
+#ifdef TARGET_SPARC64
+static void gen_ne_fop_DFF(DisasContext *dc, int rd, int rs1, int rs2,
+                           void (*gen)(TCGv_i64, TCGv_i32, TCGv_i32))
+{
+    TCGv_i64 dst;
+    TCGv_i32 src1, src2;
+
+    src1 = gen_load_fpr_F(dc, rs1);
+    src2 = gen_load_fpr_F(dc, rs2);
+    dst = gen_dest_fpr_D(dc, rd);
+
+    gen(dst, src1, src2);
+
+    gen_store_fpr_D(dc, rd, dst);
+}
+#endif
+
 static void gen_fop_QDD(DisasContext *dc, int rd, int rs1, int rs2,
                         void (*gen)(TCGv_ptr, TCGv_i64, TCGv_i64))
 {
@@ -4758,7 +4775,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x033: /* VIS I fmul8x16au */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8x16au);
+                    gen_ne_fop_DFF(dc, rd, rs1, rs2, gen_helper_fmul8x16au);
                     break;
                 case 0x035: /* VIS I fmul8x16al */
                     CHECK_FPU_FEATURE(dc, VIS1);
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index d158b39b85..2fc783a054 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -49,6 +49,7 @@ target_ulong helper_array8(target_ulong pixel_addr, target_ulong cubesize)
 #define VIS_L64(n) l[1 - (n)]
 #define VIS_B32(n) b[3 - (n)]
 #define VIS_W32(n) w[1 - (n)]
+#define VIS_SW32(n) sw[1 - (n)]
 #else
 #define VIS_B64(n) b[n]
 #define VIS_W64(n) w[n]
@@ -56,6 +57,7 @@ target_ulong helper_array8(target_ulong pixel_addr, target_ulong cubesize)
 #define VIS_L64(n) l[n]
 #define VIS_B32(n) b[n]
 #define VIS_W32(n) w[n]
+#define VIS_SW32(n) sw[n]
 #endif
 
 typedef union {
@@ -70,6 +72,7 @@ typedef union {
 typedef union {
     uint8_t b[4];
     uint16_t w[2];
+    int16_t sw[2];
     uint32_t l;
     float32 f;
 } VIS32;
@@ -143,16 +146,17 @@ uint64_t helper_fmul8x16al(uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmul8x16au(uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8x16au(uint32_t src1, uint32_t src2)
 {
-    VIS64 s, d;
+    VIS32 s1, s2;
+    VIS64 d;
     uint32_t tmp;
 
-    s.ll = src1;
-    d.ll = src2;
+    s1.l = src1;
+    s2.l = src2;
 
 #define PMUL(r)                                                 \
-    tmp = (int32_t)d.VIS_SW64(0) * (int32_t)s.VIS_B64(r);       \
+    tmp = (int32_t)s2.VIS_SW32(1) * (int32_t)s1.VIS_B64(r);     \
     if ((tmp & 0xff) > 0x7f) {                                  \
         tmp += 0x100;                                           \
     }                                                           \
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/8] target/sparc: Fix VIS fmul8x16al instruction.
  2023-09-25  5:03 [PATCH 0/8] SPARC VIS fixes Nick Bowler
  2023-09-25  5:03 ` [PATCH 1/8] target/sparc: Fix VIS fmul8x16 input register Nick Bowler
  2023-09-25  5:03 ` [PATCH 2/8] target/sparc: Fix VIS fmul8x16au instruction Nick Bowler
@ 2023-09-25  5:03 ` Nick Bowler
  2023-09-28 21:32   ` Richard Henderson
  2023-09-25  5:03 ` [PATCH 4/8] target/sparc: Fix VIS fmuld8sux16 instruction Nick Bowler
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Nick Bowler @ 2023-09-25  5:03 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On a real UltraSparc II, the fmul8x16al instruction takes two single-
precision input operands and returns a double-precision result.  For
the second operand, bits 15:0 are used, and bits 31:16 are ignored.

However, the emulation is taking two double-precision input operands,
and furthermore it is using bits 31:16 of the second operand (ignoring
bits 15:0).  These are unlikely to contain the correct values.

Even still, the emulator overwrites the second input before all outputs
are calculated, so even if by chance the data loaded in happens to be
correct, the results are just garbage except in trivial cases.

Signed-off-by: Nick Bowler <nbowler@draconx.ca>
---
 target/sparc/helper.h     |  2 +-
 target/sparc/translate.c  |  2 +-
 target/sparc/vis_helper.c | 11 ++++++-----
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 76e06b8ea5..25d6178ca5 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -127,7 +127,7 @@ DEF_HELPER_FLAGS_1(fqtox, TCG_CALL_NO_RWG, s64, env)
 
 DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_NO_RWG_SE, i64, i32, i64)
-DEF_HELPER_FLAGS_2(fmul8x16al, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8x16al, TCG_CALL_NO_RWG_SE, i64, i32, i32)
 DEF_HELPER_FLAGS_2(fmul8x16au, TCG_CALL_NO_RWG_SE, i64, i32, i32)
 DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmul8ulx16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index ca81b35a25..dddee9f974 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4779,7 +4779,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x035: /* VIS I fmul8x16al */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmul8x16al);
+                    gen_ne_fop_DFF(dc, rd, rs1, rs2, gen_helper_fmul8x16al);
                     break;
                 case 0x036: /* VIS I fmul8sux16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 2fc783a054..386cfd0706 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -122,16 +122,17 @@ uint64_t helper_fmul8x16(uint32_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmul8x16al(uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8x16al(uint32_t src1, uint32_t src2)
 {
-    VIS64 s, d;
+    VIS32 s1, s2;
+    VIS64 d;
     uint32_t tmp;
 
-    s.ll = src1;
-    d.ll = src2;
+    s1.l = src1;
+    s2.l = src2;
 
 #define PMUL(r)                                                 \
-    tmp = (int32_t)d.VIS_SW64(1) * (int32_t)s.VIS_B64(r);       \
+    tmp = (int32_t)s2.VIS_SW32(0) * (int32_t)s1.VIS_B64(r);     \
     if ((tmp & 0xff) > 0x7f) {                                  \
         tmp += 0x100;                                           \
     }                                                           \
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 4/8] target/sparc: Fix VIS fmuld8sux16 instruction.
  2023-09-25  5:03 [PATCH 0/8] SPARC VIS fixes Nick Bowler
                   ` (2 preceding siblings ...)
  2023-09-25  5:03 ` [PATCH 3/8] target/sparc: Fix VIS fmul8x16al instruction Nick Bowler
@ 2023-09-25  5:03 ` Nick Bowler
  2023-09-28 21:33   ` Richard Henderson
  2023-09-25  5:03 ` [PATCH 5/8] target/sparc: Fix VIS fmuld8ulx16 instruction Nick Bowler
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Nick Bowler @ 2023-09-25  5:03 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On a real UltraSparc II, the fmuld8sux16 instruction takes two single-
precision input operands and returns a double-precision result.

However, the emulation is taking two double-precision input operands,
which are unlikely to contain the correct values.  Even if they did,
the emulator is rounding the output, which the real processor does
not do.  And the real processor shifts both outputs left by 8, which
the emulator does not do.

So the results are wrong except in trivial cases.

Signed-off-by: Nick Bowler <nbowler@draconx.ca>
---
 target/sparc/helper.h     |  2 +-
 target/sparc/translate.c  |  2 +-
 target/sparc/vis_helper.c | 19 ++++++++-----------
 3 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 25d6178ca5..adc1ea6653 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -131,7 +131,7 @@ DEF_HELPER_FLAGS_2(fmul8x16al, TCG_CALL_NO_RWG_SE, i64, i32, i32)
 DEF_HELPER_FLAGS_2(fmul8x16au, TCG_CALL_NO_RWG_SE, i64, i32, i32)
 DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmul8ulx16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_NO_RWG_SE, i64, i32, i32)
 DEF_HELPER_FLAGS_2(fmuld8ulx16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fexpand, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_3(pdist, TCG_CALL_NO_RWG_SE, i64, i64, i64, i64)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index dddee9f974..1017d3bca7 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4791,7 +4791,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x038: /* VIS I fmuld8sux16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmuld8sux16);
+                    gen_ne_fop_DFF(dc, rd, rs1, rs2, gen_helper_fmuld8sux16);
                     break;
                 case 0x039: /* VIS I fmuld8ulx16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 386cfd0706..de5ddad39a 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -220,24 +220,21 @@ uint64_t helper_fmul8ulx16(uint64_t src1, uint64_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmuld8sux16(uint64_t src1, uint64_t src2)
+uint64_t helper_fmuld8sux16(uint32_t src1, uint32_t src2)
 {
-    VIS64 s, d;
+    VIS32 s1, s2;
+    VIS64 d;
     uint32_t tmp;
 
-    s.ll = src1;
-    d.ll = src2;
+    s1.l = src1;
+    s2.l = src2;
 
 #define PMUL(r)                                                         \
-    tmp = (int32_t)d.VIS_SW64(r) * ((int32_t)s.VIS_SW64(r) >> 8);       \
-    if ((tmp & 0xff) > 0x7f) {                                          \
-        tmp += 0x100;                                                   \
-    }                                                                   \
-    d.VIS_L64(r) = tmp;
+    tmp = (int32_t)s2.VIS_SW32(r) * ((int32_t)s1.VIS_SW32(r) >> 8);     \
+    d.VIS_L64(r) = tmp << 8;
 
-    /* Reverse calculation order to handle overlap */
-    PMUL(1);
     PMUL(0);
+    PMUL(1);
 #undef PMUL
 
     return d.ll;
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 5/8] target/sparc: Fix VIS fmuld8ulx16 instruction.
  2023-09-25  5:03 [PATCH 0/8] SPARC VIS fixes Nick Bowler
                   ` (3 preceding siblings ...)
  2023-09-25  5:03 ` [PATCH 4/8] target/sparc: Fix VIS fmuld8sux16 instruction Nick Bowler
@ 2023-09-25  5:03 ` Nick Bowler
  2023-09-28 21:34   ` Richard Henderson
  2023-09-25  5:03 ` [PATCH 6/8] target/sparc: Fix VIS fpmerge input registers Nick Bowler
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Nick Bowler @ 2023-09-25  5:03 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On a real UltraSparc II, the fmuld8ulx16 instruction takes two single-
precision input operands and returns a double-precision result.

However, the emulation is taking two double-precision input operands,
which are unlikely to contain the correct values, so the results are
garbage in most cases.  Even if the inputs happen to be correct, the
emulator is rounding the output, which the real processor does not do.

Signed-off-by: Nick Bowler <nbowler@draconx.ca>
---
 target/sparc/helper.h     |  2 +-
 target/sparc/translate.c  |  2 +-
 target/sparc/vis_helper.c | 17 +++++++----------
 3 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index adc1ea6653..7a588f3068 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -132,7 +132,7 @@ DEF_HELPER_FLAGS_2(fmul8x16au, TCG_CALL_NO_RWG_SE, i64, i32, i32)
 DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmul8ulx16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_NO_RWG_SE, i64, i32, i32)
-DEF_HELPER_FLAGS_2(fmuld8ulx16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmuld8ulx16, TCG_CALL_NO_RWG_SE, i64, i32, i32)
 DEF_HELPER_FLAGS_2(fexpand, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_3(pdist, TCG_CALL_NO_RWG_SE, i64, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fpack16, TCG_CALL_NO_RWG_SE, i32, i64, i64)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 1017d3bca7..cfccd95c3a 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4795,7 +4795,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x039: /* VIS I fmuld8ulx16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fmuld8ulx16);
+                    gen_ne_fop_DFF(dc, rd, rs1, rs2, gen_helper_fmuld8ulx16);
                     break;
                 case 0x03a: /* VIS I fpack32 */
                     CHECK_FPU_FEATURE(dc, VIS1);
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index de5ddad39a..306383ba60 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -240,24 +240,21 @@ uint64_t helper_fmuld8sux16(uint32_t src1, uint32_t src2)
     return d.ll;
 }
 
-uint64_t helper_fmuld8ulx16(uint64_t src1, uint64_t src2)
+uint64_t helper_fmuld8ulx16(uint32_t src1, uint32_t src2)
 {
-    VIS64 s, d;
+    VIS32 s1, s2;
+    VIS64 d;
     uint32_t tmp;
 
-    s.ll = src1;
-    d.ll = src2;
+    s1.l = src1;
+    s2.l = src2;
 
 #define PMUL(r)                                                         \
-    tmp = (int32_t)d.VIS_SW64(r) * ((uint32_t)s.VIS_B64(r * 2));        \
-    if ((tmp & 0xff) > 0x7f) {                                          \
-        tmp += 0x100;                                                   \
-    }                                                                   \
+    tmp = (int32_t)s2.VIS_SW32(r) * ((uint32_t)s1.VIS_B32(r * 2));      \
     d.VIS_L64(r) = tmp;
 
-    /* Reverse calculation order to handle overlap */
-    PMUL(1);
     PMUL(0);
+    PMUL(1);
 #undef PMUL
 
     return d.ll;
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 6/8] target/sparc: Fix VIS fpmerge input registers.
  2023-09-25  5:03 [PATCH 0/8] SPARC VIS fixes Nick Bowler
                   ` (4 preceding siblings ...)
  2023-09-25  5:03 ` [PATCH 5/8] target/sparc: Fix VIS fmuld8ulx16 instruction Nick Bowler
@ 2023-09-25  5:03 ` Nick Bowler
  2023-09-28 21:35   ` Richard Henderson
  2023-09-25  5:03 ` [PATCH 7/8] target/sparc: Fix VIS fexpand input register Nick Bowler
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Nick Bowler @ 2023-09-25  5:03 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On a real UltraSparc II CPU, the fpmerge instruction reads two
single-precision input registers, but the emulator is reading
from double-precision input registers instead.

These are unlikely to contain the correct data so in most instances
the results of the emulation are just garbage in most instances.

Signed-off-by: Nick Bowler <nbowler@draconx.ca>
---
 target/sparc/helper.h     |  2 +-
 target/sparc/translate.c  |  6 +++++-
 target/sparc/vis_helper.c | 26 +++++++++++++-------------
 3 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 7a588f3068..b71688079f 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -125,7 +125,7 @@ DEF_HELPER_FLAGS_2(fstox, TCG_CALL_NO_RWG, s64, env, f32)
 DEF_HELPER_FLAGS_2(fdtox, TCG_CALL_NO_RWG, s64, env, f64)
 DEF_HELPER_FLAGS_1(fqtox, TCG_CALL_NO_RWG, s64, env)
 
-DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_NO_RWG_SE, i64, i32, i32)
 DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_NO_RWG_SE, i64, i32, i64)
 DEF_HELPER_FLAGS_2(fmul8x16al, TCG_CALL_NO_RWG_SE, i64, i32, i32)
 DEF_HELPER_FLAGS_2(fmul8x16au, TCG_CALL_NO_RWG_SE, i64, i32, i32)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index cfccd95c3a..241ac429ca 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4825,7 +4825,11 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x04b: /* VIS I fpmerge */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpmerge);
+                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D(dc, rd);
+                    gen_helper_fpmerge(cpu_dst_64, cpu_src1_32, cpu_src2_32);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x04c: /* VIS II bshuffle */
                     CHECK_FPU_FEATURE(dc, VIS2);
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 306383ba60..029aad3923 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -77,22 +77,22 @@ typedef union {
     float32 f;
 } VIS32;
 
-uint64_t helper_fpmerge(uint64_t src1, uint64_t src2)
+uint64_t helper_fpmerge(uint32_t src1, uint32_t src2)
 {
-    VIS64 s, d;
+    VIS32 s1, s2;
+    VIS64 d;
 
-    s.ll = src1;
-    d.ll = src2;
+    s1.l = src1;
+    s2.l = src2;
 
-    /* Reverse calculation order to handle overlap */
-    d.VIS_B64(7) = s.VIS_B64(3);
-    d.VIS_B64(6) = d.VIS_B64(3);
-    d.VIS_B64(5) = s.VIS_B64(2);
-    d.VIS_B64(4) = d.VIS_B64(2);
-    d.VIS_B64(3) = s.VIS_B64(1);
-    d.VIS_B64(2) = d.VIS_B64(1);
-    d.VIS_B64(1) = s.VIS_B64(0);
-    /* d.VIS_B64(0) = d.VIS_B64(0); */
+    d.VIS_B64(0) = s2.VIS_B32(0);
+    d.VIS_B64(1) = s1.VIS_B32(0);
+    d.VIS_B64(2) = s2.VIS_B32(1);
+    d.VIS_B64(3) = s1.VIS_B32(1);
+    d.VIS_B64(4) = s2.VIS_B32(2);
+    d.VIS_B64(5) = s1.VIS_B32(2);
+    d.VIS_B64(6) = s2.VIS_B32(3);
+    d.VIS_B64(7) = s1.VIS_B32(3);
 
     return d.ll;
 }
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 7/8] target/sparc: Fix VIS fexpand input register.
  2023-09-25  5:03 [PATCH 0/8] SPARC VIS fixes Nick Bowler
                   ` (5 preceding siblings ...)
  2023-09-25  5:03 ` [PATCH 6/8] target/sparc: Fix VIS fpmerge input registers Nick Bowler
@ 2023-09-25  5:03 ` Nick Bowler
  2023-09-28 21:36   ` Richard Henderson
  2023-09-25  5:03 ` [PATCH 8/8] target/sparc: Fix VIS subtraction instructions Nick Bowler
  2023-09-28 20:05 ` [PATCH 0/8] SPARC VIS fixes Mark Cave-Ayland
  8 siblings, 1 reply; 22+ messages in thread
From: Nick Bowler @ 2023-09-25  5:03 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

This instruction is documented to get its input from the second
single-precision input operand; the first operand is ignored.
This is exactly what a real UltraSparc II does.  Meanwhile, the
the emulator uses only the irrelevant first operand, treating
it as a double-precision register, and ignores the second.

This will not normally contain the correct data so the emulated
instruction usually just produces garbage.

Signed-off-by: Nick Bowler <nbowler@draconx.ca>
---
 target/sparc/helper.h     | 2 +-
 target/sparc/translate.c  | 5 ++++-
 target/sparc/vis_helper.c | 5 ++---
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index b71688079f..81d44e7618 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -133,7 +133,7 @@ DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmul8ulx16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_NO_RWG_SE, i64, i32, i32)
 DEF_HELPER_FLAGS_2(fmuld8ulx16, TCG_CALL_NO_RWG_SE, i64, i32, i32)
-DEF_HELPER_FLAGS_2(fexpand, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_1(fexpand, TCG_CALL_NO_RWG_SE, i64, i32)
 DEF_HELPER_FLAGS_3(pdist, TCG_CALL_NO_RWG_SE, i64, i64, i64, i64)
 DEF_HELPER_FLAGS_2(fpack16, TCG_CALL_NO_RWG_SE, i32, i64, i64)
 DEF_HELPER_FLAGS_3(fpack32, TCG_CALL_NO_RWG_SE, i64, i64, i64, i64)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 241ac429ca..4e92c27768 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4837,7 +4837,10 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
                     break;
                 case 0x04d: /* VIS I fexpand */
                     CHECK_FPU_FEATURE(dc, VIS1);
-                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fexpand);
+                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
+                    cpu_dst_64 = gen_dest_fpr_D(dc, rd);
+                    gen_helper_fexpand(cpu_dst_64, cpu_src2_32);
+                    gen_store_fpr_D(dc, rd, cpu_dst_64);
                     break;
                 case 0x050: /* VIS I fpadd16 */
                     CHECK_FPU_FEATURE(dc, VIS1);
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 029aad3923..3903beaf5d 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -260,13 +260,12 @@ uint64_t helper_fmuld8ulx16(uint32_t src1, uint32_t src2)
     return d.ll;
 }
 
-uint64_t helper_fexpand(uint64_t src1, uint64_t src2)
+uint64_t helper_fexpand(uint32_t src2)
 {
     VIS32 s;
     VIS64 d;
 
-    s.l = (uint32_t)src1;
-    d.ll = src2;
+    s.l = src2;
     d.VIS_W64(0) = s.VIS_B32(0) << 4;
     d.VIS_W64(1) = s.VIS_B32(1) << 4;
     d.VIS_W64(2) = s.VIS_B32(2) << 4;
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 8/8] target/sparc: Fix VIS subtraction instructions.
  2023-09-25  5:03 [PATCH 0/8] SPARC VIS fixes Nick Bowler
                   ` (6 preceding siblings ...)
  2023-09-25  5:03 ` [PATCH 7/8] target/sparc: Fix VIS fexpand input register Nick Bowler
@ 2023-09-25  5:03 ` Nick Bowler
  2023-09-28 21:40   ` Richard Henderson
  2023-09-28 20:05 ` [PATCH 0/8] SPARC VIS fixes Mark Cave-Ayland
  8 siblings, 1 reply; 22+ messages in thread
From: Nick Bowler @ 2023-09-25  5:03 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

All of the VIS subtraction instructions are documented to subtract the
second input operand from the first.  This is also consistent with how
the instructions actually work on a real UltraSparc II.

But the emulator is implementing the subtraction in the wrong order,
subtracting the first input from the second, so the results are wrong
in all nontrivial cases.

Signed-off-by: Nick Bowler <nbowler@draconx.ca>
---
 target/sparc/vis_helper.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 3903beaf5d..fa97e09ffa 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -282,10 +282,10 @@ uint64_t helper_fexpand(uint32_t src2)
         s.ll = src1;                                    \
         d.ll = src2;                                    \
                                                         \
-        d.VIS_W64(0) = F(d.VIS_W64(0), s.VIS_W64(0));   \
-        d.VIS_W64(1) = F(d.VIS_W64(1), s.VIS_W64(1));   \
-        d.VIS_W64(2) = F(d.VIS_W64(2), s.VIS_W64(2));   \
-        d.VIS_W64(3) = F(d.VIS_W64(3), s.VIS_W64(3));   \
+        d.VIS_W64(0) = F(s.VIS_W64(0), d.VIS_W64(0));   \
+        d.VIS_W64(1) = F(s.VIS_W64(1), d.VIS_W64(1));   \
+        d.VIS_W64(2) = F(s.VIS_W64(2), d.VIS_W64(2));   \
+        d.VIS_W64(3) = F(s.VIS_W64(3), d.VIS_W64(3));   \
                                                         \
         return d.ll;                                    \
     }                                                   \
@@ -297,8 +297,8 @@ uint64_t helper_fexpand(uint32_t src2)
         s.l = src1;                                     \
         d.l = src2;                                     \
                                                         \
-        d.VIS_W32(0) = F(d.VIS_W32(0), s.VIS_W32(0));   \
-        d.VIS_W32(1) = F(d.VIS_W32(1), s.VIS_W32(1));   \
+        d.VIS_W32(0) = F(s.VIS_W32(0), d.VIS_W32(0));   \
+        d.VIS_W32(1) = F(s.VIS_W32(1), d.VIS_W32(1));   \
                                                         \
         return d.l;                                     \
     }                                                   \
@@ -310,8 +310,8 @@ uint64_t helper_fexpand(uint32_t src2)
         s.ll = src1;                                    \
         d.ll = src2;                                    \
                                                         \
-        d.VIS_L64(0) = F(d.VIS_L64(0), s.VIS_L64(0));   \
-        d.VIS_L64(1) = F(d.VIS_L64(1), s.VIS_L64(1));   \
+        d.VIS_L64(0) = F(s.VIS_L64(0), d.VIS_L64(0));   \
+        d.VIS_L64(1) = F(s.VIS_L64(1), d.VIS_L64(1));   \
                                                         \
         return d.ll;                                    \
     }                                                   \
@@ -323,7 +323,7 @@ uint64_t helper_fexpand(uint32_t src2)
         s.l = src1;                                     \
         d.l = src2;                                     \
                                                         \
-        d.l = F(d.l, s.l);                              \
+        d.l = F(s.l, d.l);                              \
                                                         \
         return d.l;                                     \
     }
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/8] SPARC VIS fixes
  2023-09-25  5:03 [PATCH 0/8] SPARC VIS fixes Nick Bowler
                   ` (7 preceding siblings ...)
  2023-09-25  5:03 ` [PATCH 8/8] target/sparc: Fix VIS subtraction instructions Nick Bowler
@ 2023-09-28 20:05 ` Mark Cave-Ayland
  8 siblings, 0 replies; 22+ messages in thread
From: Mark Cave-Ayland @ 2023-09-28 20:05 UTC (permalink / raw)
  To: Nick Bowler, qemu-devel; +Cc: Artyom Tarasenko

On 25/09/2023 06:03, Nick Bowler wrote:

> I noticed that the fmul8x16 instruction did not appear to be emulated
> correctly[1].  It would seem that emulation was not using a single-
> precision input register like the real hardware does, but rather a
> double-precision register, causing it to operate on the wrong data.
> 
> Every other VIS instruction which contains one or more single-precision
> inputs and a double-precision output has the exact same problem.
> 
> A few computational problems are found and fixed by this series too.
> 
> All patches can be applied independently, except patch 2 adds some
> helpers which are subsequently needed by patches 3, 4 and 5.
> 
> Emulation results are tested by manually comparing the output of a small
> Linux test program on an UltraSparc II against the output of running the
> same binary under qemu-sparc32plus on a ppc64le host system.
> 
> [1] https://gitlab.com/qemu-project/qemu/-/issues/1901
> 
> Nick Bowler (8):
>    target/sparc: Fix VIS fmul8x16 input register.
>    target/sparc: Fix VIS fmul8x16au instruction.
>    target/sparc: Fix VIS fmul8x16al instruction.
>    target/sparc: Fix VIS fmuld8sux16 instruction.
>    target/sparc: Fix VIS fmuld8ulx16 instruction.
>    target/sparc: Fix VIS fpmerge input registers.
>    target/sparc: Fix VIS fexpand input register.
>    target/sparc: Fix VIS subtraction instructions.
> 
>   target/sparc/helper.h     |  14 ++---
>   target/sparc/translate.c  |  42 +++++++++++---
>   target/sparc/vis_helper.c | 119 +++++++++++++++++++-------------------
>   3 files changed, 101 insertions(+), 74 deletions(-)

Thanks for the patches, Nick. I've had a look at the series, and whilst I'm not 
overly familiar with the VIS instructions, your changes and detailed explanations 
look good against a cursory read of the SPARCv9 specification.

Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

I'll wait a few days to see if either Artyom or Richard has any further comments, but 
if not then I'll queue them to my qemu-sparc branch.


ATB,

Mark.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/8] target/sparc: Fix VIS fmul8x16 input register.
  2023-09-25  5:03 ` [PATCH 1/8] target/sparc: Fix VIS fmul8x16 input register Nick Bowler
@ 2023-09-28 21:29   ` Richard Henderson
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2023-09-28 21:29 UTC (permalink / raw)
  To: Nick Bowler, qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On 9/24/23 01:03, Nick Bowler wrote:
> On a real UltraSparc II CPU, the fmul8x16 instruction reads its first
> input from any of the single-precision floating point registers.
> 
> But the emulator is reading the input as if the first operand encodes
> a double-precision register, which in most cases will not contain the
> right data and therefore the output of the emulated instruction is
> just garbage.
> 
> Signed-off-by: Nick Bowler<nbowler@draconx.ca>
> Resolves:https://gitlab.com/qemu-project/qemu/-/issues/1901
> ---
>   target/sparc/helper.h     | 2 +-
>   target/sparc/translate.c  | 6 +++++-
>   target/sparc/vis_helper.c | 9 +++++----
>   3 files changed, 11 insertions(+), 6 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/8] target/sparc: Fix VIS fmul8x16au instruction.
  2023-09-25  5:03 ` [PATCH 2/8] target/sparc: Fix VIS fmul8x16au instruction Nick Bowler
@ 2023-09-28 21:32   ` Richard Henderson
  2023-09-29  0:41     ` Nick Bowler
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2023-09-28 21:32 UTC (permalink / raw)
  To: Nick Bowler, qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On 9/24/23 01:03, Nick Bowler wrote:
> On a real UltraSparc II, the fmul8x16au instruction takes two single-
> precision input operands and returns a double-precision result.  For
> the second operand, bits 31:16 are used, and bits 15:0 are ignored.
> 
> However, the emulation is taking two double-precision input operands,
> and furthermore it is using bits 15:0 of the second operand (ignoring
> bits 31:16).  These are unlikely to contain the correct values.
> 
> Even still, the emulator overwrites the second input before all outputs
> are calculated, so even if by chance the data loaded in happens to be
> correct, the results are just garbage except in trivial cases.
> 
> Signed-off-by: Nick Bowler <nbowler@draconx.ca>
> ---
>   target/sparc/helper.h     |  2 +-
>   target/sparc/translate.c  | 19 ++++++++++++++++++-
>   target/sparc/vis_helper.c | 14 +++++++++-----
>   3 files changed, 28 insertions(+), 7 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

>   #define PMUL(r)                                                 \
> -    tmp = (int32_t)d.VIS_SW64(0) * (int32_t)s.VIS_B64(r);       \
> +    tmp = (int32_t)s2.VIS_SW32(1) * (int32_t)s1.VIS_B64(r);     \
>       if ((tmp & 0xff) > 0x7f) {                                  \
>           tmp += 0x100;                                           \
>       }                                                           \

Belated follow-up suggestion:

-   if ((tmp & 0xff) > 0x7f) {
-       tmp += 0x100;
-   }
+   tmp += 0x80;

7 occurrences throughout vis_helper.c.


r~


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/8] target/sparc: Fix VIS fmul8x16al instruction.
  2023-09-25  5:03 ` [PATCH 3/8] target/sparc: Fix VIS fmul8x16al instruction Nick Bowler
@ 2023-09-28 21:32   ` Richard Henderson
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2023-09-28 21:32 UTC (permalink / raw)
  To: Nick Bowler, qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On 9/24/23 01:03, Nick Bowler wrote:
> On a real UltraSparc II, the fmul8x16al instruction takes two single-
> precision input operands and returns a double-precision result.  For
> the second operand, bits 15:0 are used, and bits 31:16 are ignored.
> 
> However, the emulation is taking two double-precision input operands,
> and furthermore it is using bits 31:16 of the second operand (ignoring
> bits 15:0).  These are unlikely to contain the correct values.
> 
> Even still, the emulator overwrites the second input before all outputs
> are calculated, so even if by chance the data loaded in happens to be
> correct, the results are just garbage except in trivial cases.
> 
> Signed-off-by: Nick Bowler<nbowler@draconx.ca>
> ---
>   target/sparc/helper.h     |  2 +-
>   target/sparc/translate.c  |  2 +-
>   target/sparc/vis_helper.c | 11 ++++++-----
>   3 files changed, 8 insertions(+), 7 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 4/8] target/sparc: Fix VIS fmuld8sux16 instruction.
  2023-09-25  5:03 ` [PATCH 4/8] target/sparc: Fix VIS fmuld8sux16 instruction Nick Bowler
@ 2023-09-28 21:33   ` Richard Henderson
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2023-09-28 21:33 UTC (permalink / raw)
  To: Nick Bowler, qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On 9/24/23 01:03, Nick Bowler wrote:
> On a real UltraSparc II, the fmuld8sux16 instruction takes two single-
> precision input operands and returns a double-precision result.
> 
> However, the emulation is taking two double-precision input operands,
> which are unlikely to contain the correct values.  Even if they did,
> the emulator is rounding the output, which the real processor does
> not do.  And the real processor shifts both outputs left by 8, which
> the emulator does not do.
> 
> So the results are wrong except in trivial cases.
> 
> Signed-off-by: Nick Bowler<nbowler@draconx.ca>
> ---
>   target/sparc/helper.h     |  2 +-
>   target/sparc/translate.c  |  2 +-
>   target/sparc/vis_helper.c | 19 ++++++++-----------
>   3 files changed, 10 insertions(+), 13 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/8] target/sparc: Fix VIS fmuld8ulx16 instruction.
  2023-09-25  5:03 ` [PATCH 5/8] target/sparc: Fix VIS fmuld8ulx16 instruction Nick Bowler
@ 2023-09-28 21:34   ` Richard Henderson
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2023-09-28 21:34 UTC (permalink / raw)
  To: Nick Bowler, qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On 9/24/23 01:03, Nick Bowler wrote:
> On a real UltraSparc II, the fmuld8ulx16 instruction takes two single-
> precision input operands and returns a double-precision result.
> 
> However, the emulation is taking two double-precision input operands,
> which are unlikely to contain the correct values, so the results are
> garbage in most cases.  Even if the inputs happen to be correct, the
> emulator is rounding the output, which the real processor does not do.
> 
> Signed-off-by: Nick Bowler<nbowler@draconx.ca>
> ---
>   target/sparc/helper.h     |  2 +-
>   target/sparc/translate.c  |  2 +-
>   target/sparc/vis_helper.c | 17 +++++++----------
>   3 files changed, 9 insertions(+), 12 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 6/8] target/sparc: Fix VIS fpmerge input registers.
  2023-09-25  5:03 ` [PATCH 6/8] target/sparc: Fix VIS fpmerge input registers Nick Bowler
@ 2023-09-28 21:35   ` Richard Henderson
  2023-09-29  0:32     ` Nick Bowler
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2023-09-28 21:35 UTC (permalink / raw)
  To: Nick Bowler, qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On 9/24/23 01:03, Nick Bowler wrote:
>                   case 0x04b: /* VIS I fpmerge */
>                       CHECK_FPU_FEATURE(dc, VIS1);
> -                    gen_ne_fop_DDD(dc, rd, rs1, rs2, gen_helper_fpmerge);
> +                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
> +                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
> +                    cpu_dst_64 = gen_dest_fpr_D(dc, rd);
> +                    gen_helper_fpmerge(cpu_dst_64, cpu_src1_32, cpu_src2_32);
> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>                       break;

Use gen_ne_fop_DFF.


r~


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 7/8] target/sparc: Fix VIS fexpand input register.
  2023-09-25  5:03 ` [PATCH 7/8] target/sparc: Fix VIS fexpand input register Nick Bowler
@ 2023-09-28 21:36   ` Richard Henderson
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2023-09-28 21:36 UTC (permalink / raw)
  To: Nick Bowler, qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On 9/24/23 01:03, Nick Bowler wrote:
> This instruction is documented to get its input from the second
> single-precision input operand; the first operand is ignored.
> This is exactly what a real UltraSparc II does.  Meanwhile, the
> the emulator uses only the irrelevant first operand, treating
> it as a double-precision register, and ignores the second.
> 
> This will not normally contain the correct data so the emulated
> instruction usually just produces garbage.
> 
> Signed-off-by: Nick Bowler<nbowler@draconx.ca>
> ---
>   target/sparc/helper.h     | 2 +-
>   target/sparc/translate.c  | 5 ++++-
>   target/sparc/vis_helper.c | 5 ++---
>   3 files changed, 7 insertions(+), 5 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 8/8] target/sparc: Fix VIS subtraction instructions.
  2023-09-25  5:03 ` [PATCH 8/8] target/sparc: Fix VIS subtraction instructions Nick Bowler
@ 2023-09-28 21:40   ` Richard Henderson
  2023-09-29  0:31     ` Nick Bowler
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2023-09-28 21:40 UTC (permalink / raw)
  To: Nick Bowler, qemu-devel; +Cc: Mark Cave-Ayland, Artyom Tarasenko

On 9/24/23 01:03, Nick Bowler wrote:
> All of the VIS subtraction instructions are documented to subtract the
> second input operand from the first.  This is also consistent with how
> the instructions actually work on a real UltraSparc II.
> 
> But the emulator is implementing the subtraction in the wrong order,
> subtracting the first input from the second, so the results are wrong
> in all nontrivial cases.
> 
> Signed-off-by: Nick Bowler <nbowler@draconx.ca>
> ---
>   target/sparc/vis_helper.c | 18 +++++++++---------
>   1 file changed, 9 insertions(+), 9 deletions(-)

While this patch works, better to use

void tcg_gen_vec_add16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
void tcg_gen_vec_add16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
void tcg_gen_vec_add32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);

void tcg_gen_vec_sub16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
void tcg_gen_vec_sub16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
void tcg_gen_vec_sub32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);

from "tcg/tcg-op-gvec.h" and remove the sparc helpers.


r~


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 8/8] target/sparc: Fix VIS subtraction instructions.
  2023-09-28 21:40   ` Richard Henderson
@ 2023-09-29  0:31     ` Nick Bowler
  0 siblings, 0 replies; 22+ messages in thread
From: Nick Bowler @ 2023-09-29  0:31 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, Mark Cave-Ayland, Artyom Tarasenko

On 2023-09-28, Richard Henderson <richard.henderson@linaro.org> wrote:
> On 9/24/23 01:03, Nick Bowler wrote:
>> All of the VIS subtraction instructions are documented to subtract the
>> second input operand from the first.  This is also consistent with how
>> the instructions actually work on a real UltraSparc II.
>>
>> But the emulator is implementing the subtraction in the wrong order,
>> subtracting the first input from the second, so the results are wrong
>> in all nontrivial cases.
>>
>> Signed-off-by: Nick Bowler <nbowler@draconx.ca>
>> ---
>>   target/sparc/vis_helper.c | 18 +++++++++---------
>>   1 file changed, 9 insertions(+), 9 deletions(-)
>
> While this patch works, better to use
>
> void tcg_gen_vec_add16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
> void tcg_gen_vec_add16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
> void tcg_gen_vec_add32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
>
> void tcg_gen_vec_sub16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
> void tcg_gen_vec_sub16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
> void tcg_gen_vec_sub32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
>
> from "tcg/tcg-op-gvec.h" and remove the sparc helpers.

OK, I will try to respin this one using these.

Thanks,
  Nick


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 6/8] target/sparc: Fix VIS fpmerge input registers.
  2023-09-28 21:35   ` Richard Henderson
@ 2023-09-29  0:32     ` Nick Bowler
  0 siblings, 0 replies; 22+ messages in thread
From: Nick Bowler @ 2023-09-29  0:32 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, Mark Cave-Ayland, Artyom Tarasenko

On 2023-09-28, Richard Henderson <richard.henderson@linaro.org> wrote:
> On 9/24/23 01:03, Nick Bowler wrote:
>>                   case 0x04b: /* VIS I fpmerge */
>>                       CHECK_FPU_FEATURE(dc, VIS1);
>> -                    gen_ne_fop_DDD(dc, rd, rs1, rs2,
>> gen_helper_fpmerge);
>> +                    cpu_src1_32 = gen_load_fpr_F(dc, rs1);
>> +                    cpu_src2_32 = gen_load_fpr_F(dc, rs2);
>> +                    cpu_dst_64 = gen_dest_fpr_D(dc, rd);
>> +                    gen_helper_fpmerge(cpu_dst_64, cpu_src1_32,
>> cpu_src2_32);
>> +                    gen_store_fpr_D(dc, rd, cpu_dst_64);
>>                       break;
>
> Use gen_ne_fop_DFF.

Good catch, I clearly missed that this can use the new helper, I will
respin this one.

Thanks,
  Nick


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/8] target/sparc: Fix VIS fmul8x16au instruction.
  2023-09-28 21:32   ` Richard Henderson
@ 2023-09-29  0:41     ` Nick Bowler
  2023-09-29  1:04       ` Richard Henderson
  0 siblings, 1 reply; 22+ messages in thread
From: Nick Bowler @ 2023-09-29  0:41 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, Mark Cave-Ayland, Artyom Tarasenko

On 2023-09-28, Richard Henderson <richard.henderson@linaro.org> wrote:
> Belated follow-up suggestion:
>
> -   if ((tmp & 0xff) > 0x7f) {
> -       tmp += 0x100;
> -   }
> +   tmp += 0x80;
>
> 7 occurrences throughout vis_helper.c.

I agree with making this particular change but I think since it doesn't
fix a bug, it should go in a separate patch.

So I will include a patch to do that in series v2 and keep this one
as-is with your Reviewed-by.

Thanks,
  Nick


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/8] target/sparc: Fix VIS fmul8x16au instruction.
  2023-09-29  0:41     ` Nick Bowler
@ 2023-09-29  1:04       ` Richard Henderson
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2023-09-29  1:04 UTC (permalink / raw)
  To: Nick Bowler; +Cc: qemu-devel, Mark Cave-Ayland, Artyom Tarasenko

On 9/28/23 17:41, Nick Bowler wrote:
> On 2023-09-28, Richard Henderson <richard.henderson@linaro.org> wrote:
>> Belated follow-up suggestion:
>>
>> -   if ((tmp & 0xff) > 0x7f) {
>> -       tmp += 0x100;
>> -   }
>> +   tmp += 0x80;
>>
>> 7 occurrences throughout vis_helper.c.
> 
> I agree with making this particular change but I think since it doesn't
> fix a bug, it should go in a separate patch.

Of course.  That's what I meant by followup.


r~


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2023-09-29  1:06 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-25  5:03 [PATCH 0/8] SPARC VIS fixes Nick Bowler
2023-09-25  5:03 ` [PATCH 1/8] target/sparc: Fix VIS fmul8x16 input register Nick Bowler
2023-09-28 21:29   ` Richard Henderson
2023-09-25  5:03 ` [PATCH 2/8] target/sparc: Fix VIS fmul8x16au instruction Nick Bowler
2023-09-28 21:32   ` Richard Henderson
2023-09-29  0:41     ` Nick Bowler
2023-09-29  1:04       ` Richard Henderson
2023-09-25  5:03 ` [PATCH 3/8] target/sparc: Fix VIS fmul8x16al instruction Nick Bowler
2023-09-28 21:32   ` Richard Henderson
2023-09-25  5:03 ` [PATCH 4/8] target/sparc: Fix VIS fmuld8sux16 instruction Nick Bowler
2023-09-28 21:33   ` Richard Henderson
2023-09-25  5:03 ` [PATCH 5/8] target/sparc: Fix VIS fmuld8ulx16 instruction Nick Bowler
2023-09-28 21:34   ` Richard Henderson
2023-09-25  5:03 ` [PATCH 6/8] target/sparc: Fix VIS fpmerge input registers Nick Bowler
2023-09-28 21:35   ` Richard Henderson
2023-09-29  0:32     ` Nick Bowler
2023-09-25  5:03 ` [PATCH 7/8] target/sparc: Fix VIS fexpand input register Nick Bowler
2023-09-28 21:36   ` Richard Henderson
2023-09-25  5:03 ` [PATCH 8/8] target/sparc: Fix VIS subtraction instructions Nick Bowler
2023-09-28 21:40   ` Richard Henderson
2023-09-29  0:31     ` Nick Bowler
2023-09-28 20:05 ` [PATCH 0/8] SPARC VIS fixes Mark Cave-Ayland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).