* [PATCH 00/41] target/sparc: Implement VIS4
@ 2024-03-02 5:15 Richard Henderson
2024-03-02 5:15 ` [PATCH 01/41] linux-user/sparc: Add more hwcap bits for sparc64 Richard Henderson
` (42 more replies)
0 siblings, 43 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
I whipped this up over the Christmas break, but I'm only now
getting around to posting it. I have not attempted to model the
newer CPUs that have these features, but the features can be
enabled manually via -cpu properties.
The first 6 or 7 patches should probably be taken sooner rather
than later, because they fix bugs in the existing VIS[12] code.
I remove cpu_fpr[] so that we can use gvec on the same memory.
r~
Richard Henderson (41):
linux-user/sparc: Add more hwcap bits for sparc64
target/sparc: Fix FEXPAND
target/sparc: Fix FMUL8x16
target/sparc: Fix FMUL8x16A{U,L}
target/sparc: Fix FMULD8*X16
target/sparc: Fix FPMERGE
target/sparc: Split out do_ms16b
target/sparc: Perform DFPREG/QFPREG in decodetree
target/sparc: Remove gen_dest_fpr_D
target/sparc: Remove cpu_fpr[]
target/sparc: Use gvec for VIS1 parallel add/sub
target/sparc: Implement FMAf extension
target/sparc: Add feature bits for VIS 3
target/sparc: Implement ADDXC, ADDXCcc
target/sparc: Implement CMASK instructions
target/sparc: Implement FCHKSM16
target/sparc: Implement FHADD, FHSUB, FNHADD, FNADD
target/sparc: Implement FNMUL
target/sparc: Implement FLCMP
target/sparc: Implement FMEAN16
target/sparc: Implement FPADD64 FPSUB64
target/sparc: Implement FPADDS, FPSUBS
target/sparc: Implement FPCMPEQ8, FPCMPNE8, FPCMPULE8, FPCMPUGT8
target/sparc: Implement FSLL, FSRL, FSRA, FSLAS
target/sparc: Implement LDXEFSR
target/sparc: Implement LZCNT
target/sparc: Implement MOVsTOw, MOVdTOx, MOVwTOs, MOVxTOd
target/sparc: Implement PDISTN
target/sparc: Implement UMULXHI
target/sparc: Implement XMULX
target/sparc: Enable VIS3 feature bit
target/sparc: Implement IMA extension
target/sparc: Add feature bit for VIS4
target/sparc: Implement FALIGNDATAi
target/sparc: Implement 8-bit FPADD, FPADDS, and FPADDUS
target/sparc: Implement VIS4 comparisons
target/sparc: Implement FPMIN, FPMAX
target/sparc: Implement SUBXC, SUBXCcc
target/sparc: Implement MWAIT
target/sparc: Implement monitor asis
target/sparc: Enable VIS4 feature bit
target/sparc/asi.h | 4 +
target/sparc/helper.h | 36 +-
linux-user/elfload.c | 51 +-
target/sparc/cpu.c | 12 +
target/sparc/fop_helper.c | 104 ++++
target/sparc/ldst_helper.c | 4 +
target/sparc/translate.c | 960 +++++++++++++++++++++++++++++----
target/sparc/vis_helper.c | 526 +++++++++++-------
target/sparc/cpu-feature.h.inc | 4 +
target/sparc/insns.decode | 338 +++++++++---
10 files changed, 1626 insertions(+), 413 deletions(-)
--
2.34.1
* [PATCH 01/41] linux-user/sparc: Add more hwcap bits for sparc64
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Supply HWCAP_SPARC_V8PLUS, HWCAP_SPARC_MUL32, HWCAP_SPARC_DIV32,
HWCAP_SPARC_POPC, HWCAP_SPARC_FSMULD, HWCAP_SPARC_VIS, HWCAP_SPARC_VIS2.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
linux-user/elfload.c | 48 +++++++++++++++++++++++++++++++-------------
1 file changed, 34 insertions(+), 14 deletions(-)
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index b8eef893d0..6041270f1c 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -963,24 +963,44 @@ const char *elf_hwcap2_str(uint32_t bit)
#endif /* TARGET_ARM */
#ifdef TARGET_SPARC
-#ifdef TARGET_SPARC64
-#define ELF_HWCAP (HWCAP_SPARC_FLUSH | HWCAP_SPARC_STBAR | HWCAP_SPARC_SWAP \
- | HWCAP_SPARC_MULDIV | HWCAP_SPARC_V9)
-#ifndef TARGET_ABI32
-#define elf_check_arch(x) ( (x) == EM_SPARCV9 || (x) == EM_SPARC32PLUS )
+#ifndef TARGET_SPARC64
+# define ELF_CLASS ELFCLASS32
+# define ELF_ARCH EM_SPARC
+#elif defined(TARGET_ABI32)
+# define ELF_CLASS ELFCLASS32
+# define elf_check_arch(x) ((x) == EM_SPARC32PLUS || (x) == EM_SPARC)
#else
-#define elf_check_arch(x) ( (x) == EM_SPARC32PLUS || (x) == EM_SPARC )
+# define ELF_CLASS ELFCLASS64
+# define ELF_ARCH EM_SPARCV9
#endif
-#define ELF_CLASS ELFCLASS64
-#define ELF_ARCH EM_SPARCV9
-#else
-#define ELF_HWCAP (HWCAP_SPARC_FLUSH | HWCAP_SPARC_STBAR | HWCAP_SPARC_SWAP \
- | HWCAP_SPARC_MULDIV)
-#define ELF_CLASS ELFCLASS32
-#define ELF_ARCH EM_SPARC
-#endif /* TARGET_SPARC64 */
+#include "elf.h"
+
+#define ELF_HWCAP get_elf_hwcap()
+
+static uint32_t get_elf_hwcap(void)
+{
+ /* There are not many sparc32 hwcap bits -- we have all of them. */
+ uint32_t r = HWCAP_SPARC_FLUSH | HWCAP_SPARC_STBAR |
+ HWCAP_SPARC_SWAP | HWCAP_SPARC_MULDIV;
+
+#ifdef TARGET_SPARC64
+ CPUSPARCState *env = cpu_env(thread_cpu);
+ uint32_t features = env->def.features;
+
+ r |= HWCAP_SPARC_V9 | HWCAP_SPARC_V8PLUS;
+ /* 32x32 multiply and divide are efficient. */
+ r |= HWCAP_SPARC_MUL32 | HWCAP_SPARC_DIV32;
+ /* We don't have an internal feature bit for this. */
+ r |= HWCAP_SPARC_POPC;
+ r |= features & CPU_FEATURE_FSMULD ? HWCAP_SPARC_FSMULD : 0;
+ r |= features & CPU_FEATURE_VIS1 ? HWCAP_SPARC_VIS : 0;
+ r |= features & CPU_FEATURE_VIS2 ? HWCAP_SPARC_VIS2 : 0;
+#endif
+
+ return r;
+}
static inline void init_thread(struct target_pt_regs *regs,
struct image_info *infop)
--
2.34.1
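
The hwcap computation in this patch can be sketched outside of QEMU roughly as follows. This is a minimal illustration, not the patch's code: the SKETCH_* bit values and the sketch_elf_hwcap() name are placeholders invented here, and the real HWCAP_SPARC_* and CPU_FEATURE_* constants come from the QEMU headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration only (assumption, not QEMU's). */
enum {
    SKETCH_HWCAP_FLUSH   = 1u << 0,
    SKETCH_HWCAP_STBAR   = 1u << 1,
    SKETCH_HWCAP_SWAP    = 1u << 2,
    SKETCH_HWCAP_MULDIV  = 1u << 3,
    SKETCH_HWCAP_V9      = 1u << 4,
    SKETCH_HWCAP_V8PLUS  = 1u << 5,
    SKETCH_HWCAP_MUL32   = 1u << 6,
    SKETCH_HWCAP_DIV32   = 1u << 7,
    SKETCH_HWCAP_POPC    = 1u << 8,
    SKETCH_HWCAP_FSMULD  = 1u << 9,
    SKETCH_HWCAP_VIS     = 1u << 10,
    SKETCH_HWCAP_VIS2    = 1u << 11,
};

/* Hypothetical stand-ins for the per-cpu feature flags. */
enum {
    SKETCH_FEATURE_FSMULD = 1u << 0,
    SKETCH_FEATURE_VIS1   = 1u << 1,
    SKETCH_FEATURE_VIS2   = 1u << 2,
};

static uint32_t sketch_elf_hwcap(bool is_sparc64, uint32_t features)
{
    /* Bits that every emulated sparc cpu is assumed to provide. */
    uint32_t r = SKETCH_HWCAP_FLUSH | SKETCH_HWCAP_STBAR
               | SKETCH_HWCAP_SWAP | SKETCH_HWCAP_MULDIV;

    if (is_sparc64) {
        /* Unconditional sparc64 bits. */
        r |= SKETCH_HWCAP_V9 | SKETCH_HWCAP_V8PLUS;
        r |= SKETCH_HWCAP_MUL32 | SKETCH_HWCAP_DIV32;
        r |= SKETCH_HWCAP_POPC;
        /* Bits that follow the configured cpu features. */
        r |= (features & SKETCH_FEATURE_FSMULD) ? SKETCH_HWCAP_FSMULD : 0;
        r |= (features & SKETCH_FEATURE_VIS1) ? SKETCH_HWCAP_VIS : 0;
        r |= (features & SKETCH_FEATURE_VIS2) ? SKETCH_HWCAP_VIS2 : 0;
    }

    /* The V9 bit must never leak into a 32-bit configuration. */
    assert(is_sparc64 || !(r & SKETCH_HWCAP_V9));
    return r;
}
```

Computing the value in a function rather than a macro is what lets the feature-dependent bits track -cpu properties at run time.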
* [PATCH 02/41] target/sparc: Fix FEXPAND
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
This is a 2-operand instruction, not 3-operand.
Worse, we took the source from rs1 instead of rs2.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 2 +-
target/sparc/translate.c | 20 +++++++++++++++++++-
target/sparc/vis_helper.c | 6 +++---
target/sparc/insns.decode | 2 +-
4 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index e55fad5b8c..ef21ef49ef 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -99,7 +99,7 @@ DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fmul8ulx16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fmuld8ulx16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-DEF_HELPER_FLAGS_2(fexpand, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_1(fexpand, TCG_CALL_NO_RWG_SE, i64, i32)
DEF_HELPER_FLAGS_3(pdist, TCG_CALL_NO_RWG_SE, i64, i64, i64, i64)
DEF_HELPER_FLAGS_2(fpack16, TCG_CALL_NO_RWG_SE, i32, i64, i64)
DEF_HELPER_FLAGS_3(fpack32, TCG_CALL_NO_RWG_SE, i64, i64, i64, i64)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 692ce0b010..5016664869 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4314,6 +4314,25 @@ TRANS(FSQRTd, ALL, do_env_dd, a, gen_helper_fsqrtd)
TRANS(FxTOd, 64, do_env_dd, a, gen_helper_fxtod)
TRANS(FdTOx, 64, do_env_dd, a, gen_helper_fdtox)
+static bool do_df(DisasContext *dc, arg_r_r *a,
+ void (*func)(TCGv_i64, TCGv_i32))
+{
+ TCGv_i64 dst;
+ TCGv_i32 src;
+
+ if (gen_trap_ifnofpu(dc)) {
+ return true;
+ }
+
+ dst = tcg_temp_new_i64();
+ src = gen_load_fpr_F(dc, a->rs);
+ func(dst, src);
+ gen_store_fpr_D(dc, a->rd, dst);
+ return advance_pc(dc);
+}
+
+TRANS(FEXPAND, VIS1, do_df, a, gen_helper_fexpand)
+
static bool do_env_df(DisasContext *dc, arg_r_r *a,
void (*func)(TCGv_i64, TCGv_env, TCGv_i32))
{
@@ -4545,7 +4564,6 @@ TRANS(FMUL8ULx16, VIS1, do_ddd, a, gen_helper_fmul8ulx16)
TRANS(FMULD8SUx16, VIS1, do_ddd, a, gen_helper_fmuld8sux16)
TRANS(FMULD8ULx16, VIS1, do_ddd, a, gen_helper_fmuld8ulx16)
TRANS(FPMERGE, VIS1, do_ddd, a, gen_helper_fpmerge)
-TRANS(FEXPAND, VIS1, do_ddd, a, gen_helper_fexpand)
TRANS(FPADD16, VIS1, do_ddd, a, tcg_gen_vec_add16_i64)
TRANS(FPADD32, VIS1, do_ddd, a, tcg_gen_vec_add32_i64)
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 7763b16c24..db2e6dd6c1 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -260,13 +260,13 @@ uint64_t helper_fmuld8ulx16(uint64_t src1, uint64_t src2)
return d.ll;
}
-uint64_t helper_fexpand(uint64_t src1, uint64_t src2)
+uint64_t helper_fexpand(uint32_t src2)
{
VIS32 s;
VIS64 d;
- s.l = (uint32_t)src1;
- d.ll = src2;
+ s.l = src2;
+ d.ll = 0;
d.VIS_W64(0) = s.VIS_B32(0) << 4;
d.VIS_W64(1) = s.VIS_B32(1) << 4;
d.VIS_W64(2) = s.VIS_B32(2) << 4;
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 2d26404cb2..e2d8a07dc4 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -352,7 +352,7 @@ FCMPEq 10 000 cc:2 110101 rs1:5 0 0101 0111 rs2:5
FALIGNDATAg 10 ..... 110110 ..... 0 0100 1000 ..... @r_r_r
FPMERGE 10 ..... 110110 ..... 0 0100 1011 ..... @r_r_r
BSHUFFLE 10 ..... 110110 ..... 0 0100 1100 ..... @r_r_r
- FEXPAND 10 ..... 110110 ..... 0 0100 1101 ..... @r_r_r
+ FEXPAND 10 ..... 110110 00000 0 0100 1101 ..... @r_r2
FSRCd 10 ..... 110110 ..... 0 0111 0100 00000 @r_r1 # FSRC1d
FSRCs 10 ..... 110110 ..... 0 0111 0101 00000 @r_r1 # FSRC1s
--
2.34.1
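
As a rough illustration of what the fixed helper computes, the sketch below expands the four bytes of a 32-bit source into four 16-bit lanes, each shifted left by 4, using plain shifts instead of the VIS32/VIS64 unions. The name sketch_fexpand() is made up for this example, and lane 0 is taken to be the least significant one, matching how the VIS_* accessors in the patch appear to be defined.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t sketch_fexpand(uint32_t src)
{
    uint64_t d = 0;

    for (int i = 0; i < 4; i++) {
        /* Take byte i of the source, counting from the least significant. */
        uint64_t byte = (src >> (8 * i)) & 0xff;
        /* Place it in 16-bit lane i as a fixed-point value shifted by 4. */
        d |= (byte << 4) << (16 * i);
    }

    /* Each lane holds an 8-bit value at bits 4..11, so the top and
       bottom nibbles of every lane stay clear. */
    assert((d & 0xf00ff00ff00ff00fULL) == 0);
    return d;
}
```

Only one source operand is involved, which is why the helper signature in the patch shrinks to a single i32 argument.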
* [PATCH 03/41] target/sparc: Fix FMUL8x16
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
This instruction has f32 as source1, which alters the
decoding of the register number, which means we've been
passing the wrong data for odd register numbers.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 2 +-
target/sparc/translate.c | 21 ++++++++++++++++++++-
target/sparc/vis_helper.c | 9 +++++----
3 files changed, 26 insertions(+), 6 deletions(-)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index ef21ef49ef..adc1b87319 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -92,7 +92,7 @@ DEF_HELPER_FLAGS_2(fdtox, TCG_CALL_NO_WG, s64, env, f64)
DEF_HELPER_FLAGS_2(fqtox, TCG_CALL_NO_WG, s64, env, i128)
DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_NO_RWG_SE, i64, i32, i64)
DEF_HELPER_FLAGS_2(fmul8x16al, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fmul8x16au, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 5016664869..5144fe4ed9 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4539,6 +4539,26 @@ TRANS(FSUBs, ALL, do_env_fff, a, gen_helper_fsubs)
TRANS(FMULs, ALL, do_env_fff, a, gen_helper_fmuls)
TRANS(FDIVs, ALL, do_env_fff, a, gen_helper_fdivs)
+static bool do_dfd(DisasContext *dc, arg_r_r_r *a,
+ void (*func)(TCGv_i64, TCGv_i32, TCGv_i64))
+{
+ TCGv_i64 dst, src2;
+ TCGv_i32 src1;
+
+ if (gen_trap_ifnofpu(dc)) {
+ return true;
+ }
+
+ dst = gen_dest_fpr_D(dc, a->rd);
+ src1 = gen_load_fpr_F(dc, a->rs1);
+ src2 = gen_load_fpr_D(dc, a->rs2);
+ func(dst, src1, src2);
+ gen_store_fpr_D(dc, a->rd, dst);
+ return advance_pc(dc);
+}
+
+TRANS(FMUL8x16, VIS1, do_dfd, a, gen_helper_fmul8x16)
+
static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_i64, TCGv_i64))
{
@@ -4556,7 +4576,6 @@ static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
return advance_pc(dc);
}
-TRANS(FMUL8x16, VIS1, do_ddd, a, gen_helper_fmul8x16)
TRANS(FMUL8x16AU, VIS1, do_ddd, a, gen_helper_fmul8x16au)
TRANS(FMUL8x16AL, VIS1, do_ddd, a, gen_helper_fmul8x16al)
TRANS(FMUL8SUx16, VIS1, do_ddd, a, gen_helper_fmul8sux16)
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index db2e6dd6c1..7728ffe9c6 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -94,16 +94,17 @@ uint64_t helper_fpmerge(uint64_t src1, uint64_t src2)
return d.ll;
}
-uint64_t helper_fmul8x16(uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8x16(uint32_t src1, uint64_t src2)
{
- VIS64 s, d;
+ VIS64 d;
+ VIS32 s;
uint32_t tmp;
- s.ll = src1;
+ s.l = src1;
d.ll = src2;
#define PMUL(r) \
- tmp = (int32_t)d.VIS_SW64(r) * (int32_t)s.VIS_B64(r); \
+ tmp = (int32_t)d.VIS_SW64(r) * (int32_t)s.VIS_B32(r); \
if ((tmp & 0xff) > 0x7f) { \
tmp += 0x100; \
} \
--
2.34.1
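
The register-number issue described in the commit message can be sketched as below. This assumes the SPARC V9 convention that, for a double-precision operand, the low bit of the 5-bit register field supplies bit 5 of the register number, while a single-precision field is used as is. The function names are hypothetical and exist only for this example.

```c
#include <assert.h>

/* Decode a 5-bit field as a double-precision register number. */
static unsigned sketch_dfpreg(unsigned field)
{
    assert(field < 32);
    return ((field & 1) << 5) | (field & 0x1e);
}

/* Decode a 5-bit field as a single-precision register number. */
static unsigned sketch_sfpreg(unsigned field)
{
    assert(field < 32);
    return field;
}
```

For an even field value both decodings agree, but a field value of 1 names %f1 as a single and %f32 as a double, which is why treating an f32 operand as f64 only goes wrong for odd register numbers.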
* [PATCH 04/41] target/sparc: Fix FMUL8x16A{U,L}
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
These instructions have f32 inputs, which changes the decode
of the register numbers. While we're fixing things, use a
common helper for both insns, extracting the 16-bit scalar
in tcg beforehand.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 3 +--
target/sparc/translate.c | 38 ++++++++++++++++++++++++++++++----
target/sparc/vis_helper.c | 43 +++++++++------------------------------
3 files changed, 45 insertions(+), 39 deletions(-)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index adc1b87319..9e0b8b463e 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -93,8 +93,7 @@ DEF_HELPER_FLAGS_2(fqtox, TCG_CALL_NO_WG, s64, env, i128)
DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_NO_RWG_SE, i64, i32, i64)
-DEF_HELPER_FLAGS_2(fmul8x16al, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-DEF_HELPER_FLAGS_2(fmul8x16au, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmul8x16a, TCG_CALL_NO_RWG_SE, i64, i32, s32)
DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fmul8ulx16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 5144fe4ed9..598cfcf0ac 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -45,6 +45,7 @@
# define gen_helper_clear_softint(E, S) qemu_build_not_reached()
# define gen_helper_done(E) qemu_build_not_reached()
# define gen_helper_flushw(E) qemu_build_not_reached()
+# define gen_helper_fmul8x16a(D, S1, S2) qemu_build_not_reached()
# define gen_helper_rdccr(D, E) qemu_build_not_reached()
# define gen_helper_rdcwp(D, E) qemu_build_not_reached()
# define gen_helper_restored(E) qemu_build_not_reached()
@@ -72,8 +73,6 @@
# define gen_helper_fexpand ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fmul8sux16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fmul8ulx16 ({ qemu_build_not_reached(); NULL; })
-# define gen_helper_fmul8x16al ({ qemu_build_not_reached(); NULL; })
-# define gen_helper_fmul8x16au ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fmul8x16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fmuld8sux16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fmuld8ulx16 ({ qemu_build_not_reached(); NULL; })
@@ -719,6 +718,18 @@ static void gen_op_bshuffle(TCGv_i64 dst, TCGv_i64 src1, TCGv_i64 src2)
#endif
}
+static void gen_op_fmul8x16al(TCGv_i64 dst, TCGv_i32 src1, TCGv_i32 src2)
+{
+ tcg_gen_ext16s_i32(src2, src2);
+ gen_helper_fmul8x16a(dst, src1, src2);
+}
+
+static void gen_op_fmul8x16au(TCGv_i64 dst, TCGv_i32 src1, TCGv_i32 src2)
+{
+ tcg_gen_sari_i32(src2, src2, 16);
+ gen_helper_fmul8x16a(dst, src1, src2);
+}
+
static void finishing_insn(DisasContext *dc)
{
/*
@@ -4539,6 +4550,27 @@ TRANS(FSUBs, ALL, do_env_fff, a, gen_helper_fsubs)
TRANS(FMULs, ALL, do_env_fff, a, gen_helper_fmuls)
TRANS(FDIVs, ALL, do_env_fff, a, gen_helper_fdivs)
+static bool do_dff(DisasContext *dc, arg_r_r_r *a,
+ void (*func)(TCGv_i64, TCGv_i32, TCGv_i32))
+{
+ TCGv_i64 dst;
+ TCGv_i32 src1, src2;
+
+ if (gen_trap_ifnofpu(dc)) {
+ return true;
+ }
+
+ dst = gen_dest_fpr_D(dc, a->rd);
+ src1 = gen_load_fpr_F(dc, a->rs1);
+ src2 = gen_load_fpr_F(dc, a->rs2);
+ func(dst, src1, src2);
+ gen_store_fpr_D(dc, a->rd, dst);
+ return advance_pc(dc);
+}
+
+TRANS(FMUL8x16AU, VIS1, do_dff, a, gen_op_fmul8x16au)
+TRANS(FMUL8x16AL, VIS1, do_dff, a, gen_op_fmul8x16al)
+
static bool do_dfd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_i32, TCGv_i64))
{
@@ -4576,8 +4608,6 @@ static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
return advance_pc(dc);
}
-TRANS(FMUL8x16AU, VIS1, do_ddd, a, gen_helper_fmul8x16au)
-TRANS(FMUL8x16AL, VIS1, do_ddd, a, gen_helper_fmul8x16al)
TRANS(FMUL8SUx16, VIS1, do_ddd, a, gen_helper_fmul8sux16)
TRANS(FMUL8ULx16, VIS1, do_ddd, a, gen_helper_fmul8ulx16)
TRANS(FMULD8SUx16, VIS1, do_ddd, a, gen_helper_fmuld8sux16)
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 7728ffe9c6..5c7f5536bc 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -119,43 +119,20 @@ uint64_t helper_fmul8x16(uint32_t src1, uint64_t src2)
return d.ll;
}
-uint64_t helper_fmul8x16al(uint64_t src1, uint64_t src2)
+uint64_t helper_fmul8x16a(uint32_t src1, int32_t src2)
{
- VIS64 s, d;
+ VIS32 s;
+ VIS64 d;
uint32_t tmp;
- s.ll = src1;
- d.ll = src2;
+ s.l = src1;
+ d.ll = 0;
-#define PMUL(r) \
- tmp = (int32_t)d.VIS_SW64(1) * (int32_t)s.VIS_B64(r); \
- if ((tmp & 0xff) > 0x7f) { \
- tmp += 0x100; \
- } \
- d.VIS_W64(r) = tmp >> 8;
-
- PMUL(0);
- PMUL(1);
- PMUL(2);
- PMUL(3);
-#undef PMUL
-
- return d.ll;
-}
-
-uint64_t helper_fmul8x16au(uint64_t src1, uint64_t src2)
-{
- VIS64 s, d;
- uint32_t tmp;
-
- s.ll = src1;
- d.ll = src2;
-
-#define PMUL(r) \
- tmp = (int32_t)d.VIS_SW64(0) * (int32_t)s.VIS_B64(r); \
- if ((tmp & 0xff) > 0x7f) { \
- tmp += 0x100; \
- } \
+#define PMUL(r) \
+ tmp = src2 * (int32_t)s.VIS_B64(r); \
+ if ((tmp & 0xff) > 0x7f) { \
+ tmp += 0x100; \
+ } \
d.VIS_W64(r) = tmp >> 8;
PMUL(0);
--
2.34.1
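
The scalar extraction that gen_op_fmul8x16al and gen_op_fmul8x16au perform in tcg can be pictured in plain C as follows. This is only a sketch: sketch_scalar16() is a hypothetical name, and the sign extension is written out by hand so that the example does not depend on implementation-defined conversions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Pick the 16-bit scalar out of a 32-bit rs2 value and sign-extend it.
 * upper == false mirrors the "AL" form (low half, like ext16s);
 * upper == true mirrors the "AU" form (high half, like sari by 16).
 */
static int32_t sketch_scalar16(uint32_t rs2, bool upper)
{
    uint16_t half = upper ? (uint16_t)(rs2 >> 16) : (uint16_t)(rs2 & 0xffff);
    int32_t v = (half & 0x8000) ? (int32_t)half - 0x10000 : (int32_t)half;

    assert(v >= INT16_MIN && v <= INT16_MAX);
    return v;
}
```

Once the scalar has been reduced to a single signed value, one helper can serve both instructions, which is the consolidation the patch makes.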
* [PATCH 05/41] target/sparc: Fix FMULD8*X16
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Not only do these instructions have f32 inputs, they also do not
perform rounding. Since these are relatively simple, implement
them properly inline.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 2 --
target/sparc/translate.c | 48 +++++++++++++++++++++++++++++++++++----
target/sparc/vis_helper.c | 46 -------------------------------------
3 files changed, 44 insertions(+), 52 deletions(-)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 9e0b8b463e..39ea8f9baf 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -96,8 +96,6 @@ DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_NO_RWG_SE, i64, i32, i64)
DEF_HELPER_FLAGS_2(fmul8x16a, TCG_CALL_NO_RWG_SE, i64, i32, s32)
DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fmul8ulx16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-DEF_HELPER_FLAGS_2(fmuld8ulx16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_1(fexpand, TCG_CALL_NO_RWG_SE, i64, i32)
DEF_HELPER_FLAGS_3(pdist, TCG_CALL_NO_RWG_SE, i64, i64, i64, i64)
DEF_HELPER_FLAGS_2(fpack16, TCG_CALL_NO_RWG_SE, i32, i64, i64)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 598cfcf0ac..edb97bc64e 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -74,8 +74,6 @@
# define gen_helper_fmul8sux16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fmul8ulx16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fmul8x16 ({ qemu_build_not_reached(); NULL; })
-# define gen_helper_fmuld8sux16 ({ qemu_build_not_reached(); NULL; })
-# define gen_helper_fmuld8ulx16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fpmerge ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fqtox ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fstox ({ qemu_build_not_reached(); NULL; })
@@ -730,6 +728,48 @@ static void gen_op_fmul8x16au(TCGv_i64 dst, TCGv_i32 src1, TCGv_i32 src2)
gen_helper_fmul8x16a(dst, src1, src2);
}
+static void gen_op_fmuld8ulx16(TCGv_i64 dst, TCGv_i32 src1, TCGv_i32 src2)
+{
+ TCGv_i32 t0 = tcg_temp_new_i32();
+ TCGv_i32 t1 = tcg_temp_new_i32();
+ TCGv_i32 t2 = tcg_temp_new_i32();
+
+ tcg_gen_ext8u_i32(t0, src1);
+ tcg_gen_ext16s_i32(t1, src2);
+ tcg_gen_mul_i32(t0, t0, t1);
+
+ tcg_gen_extract_i32(t1, src1, 16, 8);
+ tcg_gen_sextract_i32(t2, src2, 16, 16);
+ tcg_gen_mul_i32(t1, t1, t2);
+
+ tcg_gen_concat_i32_i64(dst, t0, t1);
+}
+
+static void gen_op_fmuld8sux16(TCGv_i64 dst, TCGv_i32 src1, TCGv_i32 src2)
+{
+ TCGv_i32 t0 = tcg_temp_new_i32();
+ TCGv_i32 t1 = tcg_temp_new_i32();
+ TCGv_i32 t2 = tcg_temp_new_i32();
+
+ /*
+ * The insn description talks about extracting the upper 8 bits
+ * of the signed 16-bit input rs1, performing the multiply, then
+ * shifting left by 8 bits. Instead, zap the lower 8 bits of
+ * the rs1 input, which avoids the need for two shifts.
+ */
+ tcg_gen_ext16s_i32(t0, src1);
+ tcg_gen_andi_i32(t0, t0, ~0xff);
+ tcg_gen_ext16s_i32(t1, src2);
+ tcg_gen_mul_i32(t0, t0, t1);
+
+ tcg_gen_sextract_i32(t1, src1, 16, 16);
+ tcg_gen_andi_i32(t1, t1, ~0xff);
+ tcg_gen_sextract_i32(t2, src2, 16, 16);
+ tcg_gen_mul_i32(t1, t1, t2);
+
+ tcg_gen_concat_i32_i64(dst, t0, t1);
+}
+
static void finishing_insn(DisasContext *dc)
{
/*
@@ -4570,6 +4610,8 @@ static bool do_dff(DisasContext *dc, arg_r_r_r *a,
TRANS(FMUL8x16AU, VIS1, do_dff, a, gen_op_fmul8x16au)
TRANS(FMUL8x16AL, VIS1, do_dff, a, gen_op_fmul8x16al)
+TRANS(FMULD8SUx16, VIS1, do_dff, a, gen_op_fmuld8sux16)
+TRANS(FMULD8ULx16, VIS1, do_dff, a, gen_op_fmuld8ulx16)
static bool do_dfd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_i32, TCGv_i64))
@@ -4610,8 +4652,6 @@ static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
TRANS(FMUL8SUx16, VIS1, do_ddd, a, gen_helper_fmul8sux16)
TRANS(FMUL8ULx16, VIS1, do_ddd, a, gen_helper_fmul8ulx16)
-TRANS(FMULD8SUx16, VIS1, do_ddd, a, gen_helper_fmuld8sux16)
-TRANS(FMULD8ULx16, VIS1, do_ddd, a, gen_helper_fmuld8ulx16)
TRANS(FPMERGE, VIS1, do_ddd, a, gen_helper_fpmerge)
TRANS(FPADD16, VIS1, do_ddd, a, tcg_gen_vec_add16_i64)
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 5c7f5536bc..eb1c4e47e9 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -192,52 +192,6 @@ uint64_t helper_fmul8ulx16(uint64_t src1, uint64_t src2)
return d.ll;
}
-uint64_t helper_fmuld8sux16(uint64_t src1, uint64_t src2)
-{
- VIS64 s, d;
- uint32_t tmp;
-
- s.ll = src1;
- d.ll = src2;
-
-#define PMUL(r) \
- tmp = (int32_t)d.VIS_SW64(r) * ((int32_t)s.VIS_SW64(r) >> 8); \
- if ((tmp & 0xff) > 0x7f) { \
- tmp += 0x100; \
- } \
- d.VIS_L64(r) = tmp;
-
- /* Reverse calculation order to handle overlap */
- PMUL(1);
- PMUL(0);
-#undef PMUL
-
- return d.ll;
-}
-
-uint64_t helper_fmuld8ulx16(uint64_t src1, uint64_t src2)
-{
- VIS64 s, d;
- uint32_t tmp;
-
- s.ll = src1;
- d.ll = src2;
-
-#define PMUL(r) \
- tmp = (int32_t)d.VIS_SW64(r) * ((uint32_t)s.VIS_B64(r * 2)); \
- if ((tmp & 0xff) > 0x7f) { \
- tmp += 0x100; \
- } \
- d.VIS_L64(r) = tmp;
-
- /* Reverse calculation order to handle overlap */
- PMUL(1);
- PMUL(0);
-#undef PMUL
-
- return d.ll;
-}
-
uint64_t helper_fexpand(uint32_t src2)
{
VIS32 s;
--
2.34.1
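
The comment in gen_op_fmuld8sux16 about zapping the low 8 bits can be checked with a small stand-alone sketch. The code below computes one lane both ways: once by taking the signed upper byte, multiplying, and scaling by 256, and once by clearing the low byte of the 16-bit input before multiplying. All names are hypothetical, and this illustrates the arithmetic identity rather than reproducing the tcg code.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend a 16-bit value without implementation-defined casts. */
static int32_t sext16(uint16_t v)
{
    return (v & 0x8000) ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* As the insn description words it: upper 8 bits, multiply, shift left 8. */
static int32_t lane_by_shift(uint16_t a, uint16_t b)
{
    int32_t hi = a >> 8;            /* 0..255 */
    if (hi & 0x80) {
        hi -= 0x100;                /* now the signed upper byte */
    }
    return hi * sext16(b) * 256;    /* multiply by 256 == shift left 8 */
}

/* As the patch does it: clear the low 8 bits, then multiply once. */
static int32_t lane_by_mask(uint16_t a, uint16_t b)
{
    /* Masking before or after the sign extension gives the same value. */
    return sext16((uint16_t)(a & 0xff00)) * sext16(b);
}

/* One lane of the result, with the two formulations cross-checked. */
static int32_t sketch_fmuld8sux16_lane(uint16_t a, uint16_t b)
{
    int32_t r = lane_by_mask(a, b);

    assert(r == lane_by_shift(a, b));
    return r;
}
```

The two forms agree because a 16-bit value with its low byte cleared is exactly 256 times its signed upper byte.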
* [PATCH 06/41] target/sparc: Fix FPMERGE
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
This instruction has f32 inputs, which changes the decode
of the register numbers.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 2 +-
target/sparc/translate.c | 2 +-
target/sparc/vis_helper.c | 27 ++++++++++++++-------------
3 files changed, 16 insertions(+), 15 deletions(-)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 39ea8f9baf..f0576fb748 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -91,7 +91,7 @@ DEF_HELPER_FLAGS_2(fstox, TCG_CALL_NO_WG, s64, env, f32)
DEF_HELPER_FLAGS_2(fdtox, TCG_CALL_NO_WG, s64, env, f64)
DEF_HELPER_FLAGS_2(fqtox, TCG_CALL_NO_WG, s64, env, i128)
-DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_NO_RWG_SE, i64, i32, i32)
DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_NO_RWG_SE, i64, i32, i64)
DEF_HELPER_FLAGS_2(fmul8x16a, TCG_CALL_NO_RWG_SE, i64, i32, s32)
DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index edb97bc64e..6a6c259b06 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4612,6 +4612,7 @@ TRANS(FMUL8x16AU, VIS1, do_dff, a, gen_op_fmul8x16au)
TRANS(FMUL8x16AL, VIS1, do_dff, a, gen_op_fmul8x16al)
TRANS(FMULD8SUx16, VIS1, do_dff, a, gen_op_fmuld8sux16)
TRANS(FMULD8ULx16, VIS1, do_dff, a, gen_op_fmuld8ulx16)
+TRANS(FPMERGE, VIS1, do_dff, a, gen_helper_fpmerge)
static bool do_dfd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_i32, TCGv_i64))
@@ -4652,7 +4653,6 @@ static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
TRANS(FMUL8SUx16, VIS1, do_ddd, a, gen_helper_fmul8sux16)
TRANS(FMUL8ULx16, VIS1, do_ddd, a, gen_helper_fmul8ulx16)
-TRANS(FPMERGE, VIS1, do_ddd, a, gen_helper_fpmerge)
TRANS(FPADD16, VIS1, do_ddd, a, tcg_gen_vec_add16_i64)
TRANS(FPADD32, VIS1, do_ddd, a, tcg_gen_vec_add32_i64)
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index eb1c4e47e9..89eea05ddb 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -74,22 +74,23 @@ typedef union {
float32 f;
} VIS32;
-uint64_t helper_fpmerge(uint64_t src1, uint64_t src2)
+uint64_t helper_fpmerge(uint32_t src1, uint32_t src2)
{
- VIS64 s, d;
+ VIS32 s1, s2;
+ VIS64 d;
- s.ll = src1;
- d.ll = src2;
+ s1.l = src1;
+ s2.l = src2;
+ d.ll = 0;
- /* Reverse calculation order to handle overlap */
- d.VIS_B64(7) = s.VIS_B64(3);
- d.VIS_B64(6) = d.VIS_B64(3);
- d.VIS_B64(5) = s.VIS_B64(2);
- d.VIS_B64(4) = d.VIS_B64(2);
- d.VIS_B64(3) = s.VIS_B64(1);
- d.VIS_B64(2) = d.VIS_B64(1);
- d.VIS_B64(1) = s.VIS_B64(0);
- /* d.VIS_B64(0) = d.VIS_B64(0); */
+ d.VIS_B64(7) = s1.VIS_B32(3);
+ d.VIS_B64(6) = s2.VIS_B32(3);
+ d.VIS_B64(5) = s1.VIS_B32(2);
+ d.VIS_B64(4) = s2.VIS_B32(2);
+ d.VIS_B64(3) = s1.VIS_B32(1);
+ d.VIS_B64(2) = s2.VIS_B32(1);
+ d.VIS_B64(1) = s1.VIS_B32(0);
+ d.VIS_B64(0) = s2.VIS_B32(0);
return d.ll;
}
--
2.34.1
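
The byte interleave performed by the fixed helper can be written with shifts as in the sketch below. The names are hypothetical, and byte 0 is taken to be the least significant byte, matching how the VIS_B32/VIS_B64 accessors appear to index.

```c
#include <assert.h>
#include <stdint.h>

/* Return byte n of v, counting from the least significant byte. */
static uint8_t byte_of(uint32_t v, int n)
{
    assert(n >= 0 && n < 4);
    return (uint8_t)(v >> (8 * n));
}

static uint64_t sketch_fpmerge(uint32_t src1, uint32_t src2)
{
    uint64_t d = 0;

    for (int i = 0; i < 4; i++) {
        /* Even result bytes come from src2, odd result bytes from src1. */
        d |= (uint64_t)byte_of(src2, i) << (16 * i);
        d |= (uint64_t)byte_of(src1, i) << (16 * i + 8);
    }
    return d;
}
```

With two independent 32-bit inputs there is no overlap between source and destination, which is presumably why the patch can drop the old "reverse calculation order" comment.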
* [PATCH 07/41] target/sparc: Split out do_ms16b
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
The unit operation for fmul8x16 and friends is described in the
manual as "MS16b". Split that out for clarity. Improve rounding
with an unconditional addition of 0.5 as a fixed-point integer.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/vis_helper.c | 76 +++++++++++++--------------------------
1 file changed, 24 insertions(+), 52 deletions(-)
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 89eea05ddb..e15c6bb34e 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -44,6 +44,7 @@ target_ulong helper_array8(target_ulong pixel_addr, target_ulong cubesize)
#if HOST_BIG_ENDIAN
#define VIS_B64(n) b[7 - (n)]
+#define VIS_SB64(n) sb[7 - (n)]
#define VIS_W64(n) w[3 - (n)]
#define VIS_SW64(n) sw[3 - (n)]
#define VIS_L64(n) l[1 - (n)]
@@ -51,6 +52,7 @@ target_ulong helper_array8(target_ulong pixel_addr, target_ulong cubesize)
#define VIS_W32(n) w[1 - (n)]
#else
#define VIS_B64(n) b[n]
+#define VIS_SB64(n) sb[n]
#define VIS_W64(n) w[n]
#define VIS_SW64(n) sw[n]
#define VIS_L64(n) l[n]
@@ -60,6 +62,7 @@ target_ulong helper_array8(target_ulong pixel_addr, target_ulong cubesize)
typedef union {
uint8_t b[8];
+ int8_t sb[8];
uint16_t w[4];
int16_t sw[4];
uint32_t l[2];
@@ -95,27 +98,23 @@ uint64_t helper_fpmerge(uint32_t src1, uint32_t src2)
return d.ll;
}
+static inline int do_ms16b(int x, int y)
+{
+ return ((x * y) + 0x80) >> 8;
+}
+
uint64_t helper_fmul8x16(uint32_t src1, uint64_t src2)
{
VIS64 d;
VIS32 s;
- uint32_t tmp;
s.l = src1;
d.ll = src2;
-#define PMUL(r) \
- tmp = (int32_t)d.VIS_SW64(r) * (int32_t)s.VIS_B32(r); \
- if ((tmp & 0xff) > 0x7f) { \
- tmp += 0x100; \
- } \
- d.VIS_W64(r) = tmp >> 8;
-
- PMUL(0);
- PMUL(1);
- PMUL(2);
- PMUL(3);
-#undef PMUL
+ d.VIS_W64(0) = do_ms16b(s.VIS_B32(0), d.VIS_SW64(0));
+ d.VIS_W64(1) = do_ms16b(s.VIS_B32(1), d.VIS_SW64(1));
+ d.VIS_W64(2) = do_ms16b(s.VIS_B32(2), d.VIS_SW64(2));
+ d.VIS_W64(3) = do_ms16b(s.VIS_B32(3), d.VIS_SW64(3));
return d.ll;
}
@@ -124,23 +123,14 @@ uint64_t helper_fmul8x16a(uint32_t src1, int32_t src2)
{
VIS32 s;
VIS64 d;
- uint32_t tmp;
s.l = src1;
d.ll = 0;
-#define PMUL(r) \
- tmp = src2 * (int32_t)s.VIS_B64(r); \
- if ((tmp & 0xff) > 0x7f) { \
- tmp += 0x100; \
- } \
- d.VIS_W64(r) = tmp >> 8;
-
- PMUL(0);
- PMUL(1);
- PMUL(2);
- PMUL(3);
-#undef PMUL
+ d.VIS_W64(0) = do_ms16b(s.VIS_B32(0), src2);
+ d.VIS_W64(1) = do_ms16b(s.VIS_B32(1), src2);
+ d.VIS_W64(2) = do_ms16b(s.VIS_B32(2), src2);
+ d.VIS_W64(3) = do_ms16b(s.VIS_B32(3), src2);
return d.ll;
}
@@ -148,23 +138,14 @@ uint64_t helper_fmul8x16a(uint32_t src1, int32_t src2)
uint64_t helper_fmul8sux16(uint64_t src1, uint64_t src2)
{
VIS64 s, d;
- uint32_t tmp;
s.ll = src1;
d.ll = src2;
-#define PMUL(r) \
- tmp = (int32_t)d.VIS_SW64(r) * ((int32_t)s.VIS_SW64(r) >> 8); \
- if ((tmp & 0xff) > 0x7f) { \
- tmp += 0x100; \
- } \
- d.VIS_W64(r) = tmp >> 8;
-
- PMUL(0);
- PMUL(1);
- PMUL(2);
- PMUL(3);
-#undef PMUL
+ d.VIS_W64(0) = do_ms16b(s.VIS_SB64(1), d.VIS_SW64(0));
+ d.VIS_W64(1) = do_ms16b(s.VIS_SB64(3), d.VIS_SW64(1));
+ d.VIS_W64(2) = do_ms16b(s.VIS_SB64(5), d.VIS_SW64(2));
+ d.VIS_W64(3) = do_ms16b(s.VIS_SB64(7), d.VIS_SW64(3));
return d.ll;
}
@@ -172,23 +153,14 @@ uint64_t helper_fmul8sux16(uint64_t src1, uint64_t src2)
uint64_t helper_fmul8ulx16(uint64_t src1, uint64_t src2)
{
VIS64 s, d;
- uint32_t tmp;
s.ll = src1;
d.ll = src2;
-#define PMUL(r) \
- tmp = (int32_t)d.VIS_SW64(r) * ((uint32_t)s.VIS_B64(r * 2)); \
- if ((tmp & 0xff) > 0x7f) { \
- tmp += 0x100; \
- } \
- d.VIS_W64(r) = tmp >> 8;
-
- PMUL(0);
- PMUL(1);
- PMUL(2);
- PMUL(3);
-#undef PMUL
+ d.VIS_W64(0) = do_ms16b(s.VIS_B64(0), d.VIS_SW64(0));
+ d.VIS_W64(1) = do_ms16b(s.VIS_B64(2), d.VIS_SW64(1));
+ d.VIS_W64(2) = do_ms16b(s.VIS_B64(4), d.VIS_SW64(2));
+ d.VIS_W64(3) = do_ms16b(s.VIS_B64(6), d.VIS_SW64(3));
return d.ll;
}
--
2.34.1
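Note (not part of the patch): since this change is meant as a pure refactoring, a brute-force comparison of the old and new rounding is a cheap sanity check. The sketch below copies do_ms16b from the diff above and reconstructs the removed PMUL rounding as a function. The names old_pmul, count_mismatches and check_do_ms16b are hypothetical and exist only for this illustration. It assumes that right-shifting a negative int is an arithmetic shift, which QEMU already relies on.

```c
#include <assert.h>
#include <stdint.h>

/* New form, as added by this patch. */
static inline int do_ms16b(int x, int y)
{
    return ((x * y) + 0x80) >> 8;
}

/* Old form, reconstructed from the removed PMUL macro (hypothetical name). */
static inline uint16_t old_pmul(int x, int y)
{
    uint32_t tmp = (uint32_t)(y * x);

    if ((tmp & 0xff) > 0x7f) {
        tmp += 0x100;
    }
    return (uint16_t)(tmp >> 8);
}

/*
 * Count operand pairs whose stored 16-bit lane differs.
 * x covers both the unsigned (0..255) and signed (-128..127) 8-bit cases.
 */
long count_mismatches(void)
{
    long bad = 0;

    for (int x = -128; x < 256; x++) {
        for (int y = INT16_MIN; y <= INT16_MAX; y++) {
            if ((uint16_t)do_ms16b(x, y) != old_pmul(x, y)) {
                bad++;
            }
        }
    }
    return bad;
}

/* Self-check: the refactoring should not change any stored result. */
void check_do_ms16b(void)
{
    assert(count_mismatches() == 0);
}
```

Adding 0x80 before the shift carries into bit 8 exactly when the low byte is 0x80 or more, which is the same condition the old macro tested explicitly.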
* [PATCH 08/41] target/sparc: Perform DFPREG/QFPREG in decodetree
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (6 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 07/41] target/sparc: Split out do_ms16b Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-05-10 15:18 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 09/41] target/sparc: Remove gen_dest_fpr_D Richard Henderson
` (34 subsequent siblings)
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Form the proper register decoding from the start, in decodetree.
Because this removes the DFPREG/QFPREG translation from the innermost
gen_load_fpr_* and gen_store_fpr_* routines, the change must be
made for all insns at once.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
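Note (not part of the patch): a minimal sketch of what the double and quad register decoding computes, assuming the sparc64 encoding in which bit 0 of the 5-bit field supplies bit 5 of the register number. The functions sketch_dfpreg and sketch_qfpreg are hypothetical stand-ins for extract_dfpreg and extract_qfpreg; the bit manipulation is my reading of DFPREG/QFPREG and is not copied from this diff.

```c
#include <assert.h>

/* Hypothetical stand-in for extract_dfpreg, assuming the sparc64 encoding. */
int sketch_dfpreg(int field)
{
    /* Bit 0 of the field becomes bit 5; bits 4..1 stay in place. */
    return ((field & 1) << 5) | (field & 0x1e);
}

/* Hypothetical stand-in for extract_qfpreg; bit 1 of the field is dropped. */
int sketch_qfpreg(int field)
{
    return ((field & 1) << 5) | (field & 0x1c);
}

/* A few spot checks of the mapping from field value to register number. */
void check_fpreg_decode(void)
{
    assert(sketch_dfpreg(2) == 2);    /* %f2  */
    assert(sketch_dfpreg(1) == 32);   /* %f32 */
    assert(sketch_dfpreg(31) == 62);  /* %f62 */
    assert(sketch_qfpreg(4) == 4);    /* %f4  */
    assert(sketch_qfpreg(5) == 36);   /* %f36 */
}
```

With the decoding done once per field, the translator helpers can index the register file with the decoded number directly, which is what the translate.c hunks below rely on.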
target/sparc/translate.c | 18 ++--
target/sparc/insns.decode | 220 +++++++++++++++++++++++---------------
2 files changed, 138 insertions(+), 100 deletions(-)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 6a6c259b06..97a5c636d2 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -241,34 +241,30 @@ static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
static TCGv_i64 gen_load_fpr_D(DisasContext *dc, unsigned int src)
{
- src = DFPREG(src);
return cpu_fpr[src / 2];
}
static void gen_store_fpr_D(DisasContext *dc, unsigned int dst, TCGv_i64 v)
{
- dst = DFPREG(dst);
tcg_gen_mov_i64(cpu_fpr[dst / 2], v);
gen_update_fprs_dirty(dc, dst);
}
static TCGv_i64 gen_dest_fpr_D(DisasContext *dc, unsigned int dst)
{
- return cpu_fpr[DFPREG(dst) / 2];
+ return cpu_fpr[dst / 2];
}
static TCGv_i128 gen_load_fpr_Q(DisasContext *dc, unsigned int src)
{
TCGv_i128 ret = tcg_temp_new_i128();
- src = QFPREG(src);
tcg_gen_concat_i64_i128(ret, cpu_fpr[src / 2 + 1], cpu_fpr[src / 2]);
return ret;
}
static void gen_store_fpr_Q(DisasContext *dc, unsigned int dst, TCGv_i128 v)
{
- dst = DFPREG(dst);
tcg_gen_extr_i128_i64(cpu_fpr[dst / 2 + 1], cpu_fpr[dst / 2], v);
gen_update_fprs_dirty(dc, dst);
}
@@ -2002,16 +1998,14 @@ static void gen_fmovd(DisasContext *dc, DisasCompare *cmp, int rd, int rs)
static void gen_fmovq(DisasContext *dc, DisasCompare *cmp, int rd, int rs)
{
#ifdef TARGET_SPARC64
- int qd = QFPREG(rd);
- int qs = QFPREG(rs);
TCGv c2 = tcg_constant_tl(cmp->c2);
- tcg_gen_movcond_i64(cmp->cond, cpu_fpr[qd / 2], cmp->c1, c2,
- cpu_fpr[qs / 2], cpu_fpr[qd / 2]);
- tcg_gen_movcond_i64(cmp->cond, cpu_fpr[qd / 2 + 1], cmp->c1, c2,
- cpu_fpr[qs / 2 + 1], cpu_fpr[qd / 2 + 1]);
+ tcg_gen_movcond_i64(cmp->cond, cpu_fpr[rd / 2], cmp->c1, c2,
+ cpu_fpr[rs / 2], cpu_fpr[rd / 2]);
+ tcg_gen_movcond_i64(cmp->cond, cpu_fpr[rd / 2 + 1], cmp->c1, c2,
+ cpu_fpr[rs / 2 + 1], cpu_fpr[rd / 2 + 1]);
- gen_update_fprs_dirty(dc, qd);
+ gen_update_fprs_dirty(dc, rd);
#else
qemu_build_not_reached();
#endif
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index e2d8a07dc4..2c23868fc3 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -26,6 +26,14 @@ CALL 01 i:s30
## Major Opcode 10 -- integer, floating-point, vis, and system insns.
##
+%dfp_rd 25:5 !function=extract_dfpreg
+%dfp_rs1 14:5 !function=extract_dfpreg
+%dfp_rs2 0:5 !function=extract_dfpreg
+
+%qfp_rd 25:5 !function=extract_qfpreg
+%qfp_rs1 14:5 !function=extract_qfpreg
+%qfp_rs2 0:5 !function=extract_qfpreg
+
&r_r_ri rd rs1 rs2_or_imm imm:bool
@n_r_ri .. ..... ...... rs1:5 imm:1 rs2_or_imm:s13 &r_r_ri rd=0
@r_r_ri .. rd:5 ...... rs1:5 imm:1 rs2_or_imm:s13 &r_r_ri
@@ -37,11 +45,40 @@ CALL 01 i:s30
&r_r_r rd rs1 rs2
@r_r_r .. rd:5 ...... rs1:5 . ........ rs2:5 &r_r_r
+@d_r_r .. ..... ...... rs1:5 . ........ rs2:5 \
+ &r_r_r rd=%dfp_rd
+@r_d_d .. rd:5 ...... ..... . ........ ..... \
+ &r_r_r rs1=%dfp_rs1 rs2=%dfp_rs2
+@d_r_d .. ..... ...... rs1:5 . ........ ..... \
+ &r_r_r rd=%dfp_rd rs2=%dfp_rs2
+@d_d_d .. ..... ...... ..... . ........ ..... \
+ &r_r_r rd=%dfp_rd rs1=%dfp_rs1 rs2=%dfp_rs2
+@q_q_q .. ..... ...... ..... . ........ ..... \
+ &r_r_r rd=%qfp_rd rs1=%qfp_rs1 rs2=%qfp_rs2
+@q_d_d .. ..... ...... ..... . ........ ..... \
+ &r_r_r rd=%qfp_rd rs1=%dfp_rs1 rs2=%dfp_rs2
+
@r_r_r_swap .. rd:5 ...... rs2:5 . ........ rs1:5 &r_r_r
+@d_d_d_swap .. ..... ...... ..... . ........ ..... \
+ &r_r_r rd=%dfp_rd rs1=%dfp_rs2 rs2=%dfp_rs1
&r_r rd rs
@r_r1 .. rd:5 ...... rs:5 . ........ ..... &r_r
@r_r2 .. rd:5 ...... ..... . ........ rs:5 &r_r
+@r_d2 .. rd:5 ...... ..... . ........ ..... &r_r rs=%dfp_rs2
+@r_q2 .. rd:5 ...... ..... . ........ ..... &r_r rs=%qfp_rs2
+@d_r2 .. ..... ...... ..... . ........ rs:5 &r_r rd=%dfp_rd
+@q_r2 .. ..... ...... ..... . ........ rs:5 &r_r rd=%qfp_rd
+@d_d1 .. ..... ...... ..... . ........ ..... \
+ &r_r rd=%dfp_rd rs=%dfp_rs1
+@d_d2 .. ..... ...... ..... . ........ ..... \
+ &r_r rd=%dfp_rd rs=%dfp_rs2
+@d_q2 .. ..... ...... ..... . ........ ..... \
+ &r_r rd=%dfp_rd rs=%qfp_rs2
+@q_q2 .. ..... ...... ..... . ........ ..... \
+ &r_r rd=%qfp_rd rs=%qfp_rs2
+@q_d2 .. ..... ...... ..... . ........ ..... \
+ &r_r rd=%qfp_rd rs=%dfp_rs2
{
[
@@ -241,68 +278,78 @@ DONE 10 00000 111110 00000 0 0000000000000
RETRY 10 00001 111110 00000 0 0000000000000
FMOVs 10 ..... 110100 00000 0 0000 0001 ..... @r_r2
-FMOVd 10 ..... 110100 00000 0 0000 0010 ..... @r_r2
-FMOVq 10 ..... 110100 00000 0 0000 0011 ..... @r_r2
+FMOVd 10 ..... 110100 00000 0 0000 0010 ..... @d_d2
+FMOVq 10 ..... 110100 00000 0 0000 0011 ..... @q_q2
FNEGs 10 ..... 110100 00000 0 0000 0101 ..... @r_r2
-FNEGd 10 ..... 110100 00000 0 0000 0110 ..... @r_r2
-FNEGq 10 ..... 110100 00000 0 0000 0111 ..... @r_r2
+FNEGd 10 ..... 110100 00000 0 0000 0110 ..... @d_d2
+FNEGq 10 ..... 110100 00000 0 0000 0111 ..... @q_q2
FABSs 10 ..... 110100 00000 0 0000 1001 ..... @r_r2
-FABSd 10 ..... 110100 00000 0 0000 1010 ..... @r_r2
-FABSq 10 ..... 110100 00000 0 0000 1011 ..... @r_r2
+FABSd 10 ..... 110100 00000 0 0000 1010 ..... @d_d2
+FABSq 10 ..... 110100 00000 0 0000 1011 ..... @q_q2
FSQRTs 10 ..... 110100 00000 0 0010 1001 ..... @r_r2
-FSQRTd 10 ..... 110100 00000 0 0010 1010 ..... @r_r2
-FSQRTq 10 ..... 110100 00000 0 0010 1011 ..... @r_r2
+FSQRTd 10 ..... 110100 00000 0 0010 1010 ..... @d_d2
+FSQRTq 10 ..... 110100 00000 0 0010 1011 ..... @q_q2
FADDs 10 ..... 110100 ..... 0 0100 0001 ..... @r_r_r
-FADDd 10 ..... 110100 ..... 0 0100 0010 ..... @r_r_r
-FADDq 10 ..... 110100 ..... 0 0100 0011 ..... @r_r_r
+FADDd 10 ..... 110100 ..... 0 0100 0010 ..... @d_d_d
+FADDq 10 ..... 110100 ..... 0 0100 0011 ..... @q_q_q
FSUBs 10 ..... 110100 ..... 0 0100 0101 ..... @r_r_r
-FSUBd 10 ..... 110100 ..... 0 0100 0110 ..... @r_r_r
-FSUBq 10 ..... 110100 ..... 0 0100 0111 ..... @r_r_r
+FSUBd 10 ..... 110100 ..... 0 0100 0110 ..... @d_d_d
+FSUBq 10 ..... 110100 ..... 0 0100 0111 ..... @q_q_q
FMULs 10 ..... 110100 ..... 0 0100 1001 ..... @r_r_r
-FMULd 10 ..... 110100 ..... 0 0100 1010 ..... @r_r_r
-FMULq 10 ..... 110100 ..... 0 0100 1011 ..... @r_r_r
+FMULd 10 ..... 110100 ..... 0 0100 1010 ..... @d_d_d
+FMULq 10 ..... 110100 ..... 0 0100 1011 ..... @q_q_q
FDIVs 10 ..... 110100 ..... 0 0100 1101 ..... @r_r_r
-FDIVd 10 ..... 110100 ..... 0 0100 1110 ..... @r_r_r
-FDIVq 10 ..... 110100 ..... 0 0100 1111 ..... @r_r_r
-FsMULd 10 ..... 110100 ..... 0 0110 1001 ..... @r_r_r
-FdMULq 10 ..... 110100 ..... 0 0110 1110 ..... @r_r_r
+FDIVd 10 ..... 110100 ..... 0 0100 1110 ..... @d_d_d
+FDIVq 10 ..... 110100 ..... 0 0100 1111 ..... @q_q_q
+FsMULd 10 ..... 110100 ..... 0 0110 1001 ..... @d_r_r
+FdMULq 10 ..... 110100 ..... 0 0110 1110 ..... @q_d_d
FsTOx 10 ..... 110100 00000 0 1000 0001 ..... @r_r2
-FdTOx 10 ..... 110100 00000 0 1000 0010 ..... @r_r2
-FqTOx 10 ..... 110100 00000 0 1000 0011 ..... @r_r2
+FdTOx 10 ..... 110100 00000 0 1000 0010 ..... @r_d2
+FqTOx 10 ..... 110100 00000 0 1000 0011 ..... @r_q2
FxTOs 10 ..... 110100 00000 0 1000 0100 ..... @r_r2
-FxTOd 10 ..... 110100 00000 0 1000 1000 ..... @r_r2
-FxTOq 10 ..... 110100 00000 0 1000 1100 ..... @r_r2
+FxTOd 10 ..... 110100 00000 0 1000 1000 ..... @d_r2
+FxTOq 10 ..... 110100 00000 0 1000 1100 ..... @q_r2
FiTOs 10 ..... 110100 00000 0 1100 0100 ..... @r_r2
-FdTOs 10 ..... 110100 00000 0 1100 0110 ..... @r_r2
-FqTOs 10 ..... 110100 00000 0 1100 0111 ..... @r_r2
-FiTOd 10 ..... 110100 00000 0 1100 1000 ..... @r_r2
-FsTOd 10 ..... 110100 00000 0 1100 1001 ..... @r_r2
-FqTOd 10 ..... 110100 00000 0 1100 1011 ..... @r_r2
-FiTOq 10 ..... 110100 00000 0 1100 1100 ..... @r_r2
-FsTOq 10 ..... 110100 00000 0 1100 1101 ..... @r_r2
-FdTOq 10 ..... 110100 00000 0 1100 1110 ..... @r_r2
+FdTOs 10 ..... 110100 00000 0 1100 0110 ..... @r_d2
+FqTOs 10 ..... 110100 00000 0 1100 0111 ..... @r_q2
+FiTOd 10 ..... 110100 00000 0 1100 1000 ..... @d_r2
+FsTOd 10 ..... 110100 00000 0 1100 1001 ..... @d_r2
+FqTOd 10 ..... 110100 00000 0 1100 1011 ..... @d_q2
+FiTOq 10 ..... 110100 00000 0 1100 1100 ..... @q_r2
+FsTOq 10 ..... 110100 00000 0 1100 1101 ..... @q_r2
+FdTOq 10 ..... 110100 00000 0 1100 1110 ..... @q_d2
FsTOi 10 ..... 110100 00000 0 1101 0001 ..... @r_r2
-FdTOi 10 ..... 110100 00000 0 1101 0010 ..... @r_r2
-FqTOi 10 ..... 110100 00000 0 1101 0011 ..... @r_r2
+FdTOi 10 ..... 110100 00000 0 1101 0010 ..... @r_d2
+FqTOi 10 ..... 110100 00000 0 1101 0011 ..... @r_q2
FMOVscc 10 rd:5 110101 0 cond:4 1 cc:1 0 000001 rs2:5
-FMOVdcc 10 rd:5 110101 0 cond:4 1 cc:1 0 000010 rs2:5
-FMOVqcc 10 rd:5 110101 0 cond:4 1 cc:1 0 000011 rs2:5
+FMOVdcc 10 ..... 110101 0 cond:4 1 cc:1 0 000010 ..... \
+ rd=%dfp_rd rs2=%dfp_rs2
+FMOVqcc 10 ..... 110101 0 cond:4 1 cc:1 0 000011 ..... \
+ rd=%qfp_rd rs2=%qfp_rs2
FMOVsfcc 10 rd:5 110101 0 cond:4 0 cc:2 000001 rs2:5
-FMOVdfcc 10 rd:5 110101 0 cond:4 0 cc:2 000010 rs2:5
-FMOVqfcc 10 rd:5 110101 0 cond:4 0 cc:2 000011 rs2:5
+FMOVdfcc 10 ..... 110101 0 cond:4 0 cc:2 000010 ..... \
+ rd=%dfp_rd rs2=%dfp_rs2
+FMOVqfcc 10 ..... 110101 0 cond:4 0 cc:2 000011 ..... \
+ rd=%qfp_rd rs2=%qfp_rs2
FMOVRs 10 rd:5 110101 rs1:5 0 cond:3 00101 rs2:5
-FMOVRd 10 rd:5 110101 rs1:5 0 cond:3 00110 rs2:5
-FMOVRq 10 rd:5 110101 rs1:5 0 cond:3 00111 rs2:5
+FMOVRd 10 ..... 110101 rs1:5 0 cond:3 00110 ..... \
+ rd=%dfp_rd rs2=%dfp_rs2
+FMOVRq 10 ..... 110101 rs1:5 0 cond:3 00111 ..... \
+ rd=%qfp_rd rs2=%qfp_rs2
FCMPs 10 000 cc:2 110101 rs1:5 0 0101 0001 rs2:5
-FCMPd 10 000 cc:2 110101 rs1:5 0 0101 0010 rs2:5
-FCMPq 10 000 cc:2 110101 rs1:5 0 0101 0011 rs2:5
+FCMPd 10 000 cc:2 110101 ..... 0 0101 0010 ..... \
+ rs1=%dfp_rs1 rs2=%dfp_rs2
+FCMPq 10 000 cc:2 110101 ..... 0 0101 0011 ..... \
+ rs1=%qfp_rs1 rs2=%qfp_rs2
FCMPEs 10 000 cc:2 110101 rs1:5 0 0101 0101 rs2:5
-FCMPEd 10 000 cc:2 110101 rs1:5 0 0101 0110 rs2:5
-FCMPEq 10 000 cc:2 110101 rs1:5 0 0101 0111 rs2:5
+FCMPEd 10 000 cc:2 110101 ..... 0 0101 0110 ..... \
+ rs1=%dfp_rs1 rs2=%dfp_rs2
+FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
+ rs1=%qfp_rs1 rs2=%qfp_rs2
{
[
@@ -328,74 +375,74 @@ FCMPEq 10 000 cc:2 110101 rs1:5 0 0101 0111 rs2:5
BMASK 10 ..... 110110 ..... 0 0001 1001 ..... @r_r_r
- FPCMPLE16 10 ..... 110110 ..... 0 0010 0000 ..... @r_r_r
- FPCMPNE16 10 ..... 110110 ..... 0 0010 0010 ..... @r_r_r
- FPCMPGT16 10 ..... 110110 ..... 0 0010 1000 ..... @r_r_r
- FPCMPEQ16 10 ..... 110110 ..... 0 0010 1010 ..... @r_r_r
- FPCMPLE32 10 ..... 110110 ..... 0 0010 0100 ..... @r_r_r
- FPCMPNE32 10 ..... 110110 ..... 0 0010 0110 ..... @r_r_r
- FPCMPGT32 10 ..... 110110 ..... 0 0010 1100 ..... @r_r_r
- FPCMPEQ32 10 ..... 110110 ..... 0 0010 1110 ..... @r_r_r
+ FPCMPLE16 10 ..... 110110 ..... 0 0010 0000 ..... @r_d_d
+ FPCMPNE16 10 ..... 110110 ..... 0 0010 0010 ..... @r_d_d
+ FPCMPGT16 10 ..... 110110 ..... 0 0010 1000 ..... @r_d_d
+ FPCMPEQ16 10 ..... 110110 ..... 0 0010 1010 ..... @r_d_d
+ FPCMPLE32 10 ..... 110110 ..... 0 0010 0100 ..... @r_d_d
+ FPCMPNE32 10 ..... 110110 ..... 0 0010 0110 ..... @r_d_d
+ FPCMPGT32 10 ..... 110110 ..... 0 0010 1100 ..... @r_d_d
+ FPCMPEQ32 10 ..... 110110 ..... 0 0010 1110 ..... @r_d_d
- FMUL8x16 10 ..... 110110 ..... 0 0011 0001 ..... @r_r_r
- FMUL8x16AU 10 ..... 110110 ..... 0 0011 0011 ..... @r_r_r
- FMUL8x16AL 10 ..... 110110 ..... 0 0011 0101 ..... @r_r_r
- FMUL8SUx16 10 ..... 110110 ..... 0 0011 0110 ..... @r_r_r
- FMUL8ULx16 10 ..... 110110 ..... 0 0011 0111 ..... @r_r_r
- FMULD8SUx16 10 ..... 110110 ..... 0 0011 1000 ..... @r_r_r
- FMULD8ULx16 10 ..... 110110 ..... 0 0011 1001 ..... @r_r_r
- FPACK32 10 ..... 110110 ..... 0 0011 1010 ..... @r_r_r
- FPACK16 10 ..... 110110 00000 0 0011 1011 ..... @r_r2
- FPACKFIX 10 ..... 110110 00000 0 0011 1101 ..... @r_r2
- PDIST 10 ..... 110110 ..... 0 0011 1110 ..... @r_r_r
+ FMUL8x16 10 ..... 110110 ..... 0 0011 0001 ..... @d_r_d
+ FMUL8x16AU 10 ..... 110110 ..... 0 0011 0011 ..... @d_r_r
+ FMUL8x16AL 10 ..... 110110 ..... 0 0011 0101 ..... @d_r_r
+ FMUL8SUx16 10 ..... 110110 ..... 0 0011 0110 ..... @d_d_d
+ FMUL8ULx16 10 ..... 110110 ..... 0 0011 0111 ..... @d_d_d
+ FMULD8SUx16 10 ..... 110110 ..... 0 0011 1000 ..... @d_r_r
+ FMULD8ULx16 10 ..... 110110 ..... 0 0011 1001 ..... @d_r_r
+ FPACK32 10 ..... 110110 ..... 0 0011 1010 ..... @d_d_d
+ FPACK16 10 ..... 110110 00000 0 0011 1011 ..... @d_d2
+ FPACKFIX 10 ..... 110110 00000 0 0011 1101 ..... @d_d2
+ PDIST 10 ..... 110110 ..... 0 0011 1110 ..... @d_d_d
- FALIGNDATAg 10 ..... 110110 ..... 0 0100 1000 ..... @r_r_r
- FPMERGE 10 ..... 110110 ..... 0 0100 1011 ..... @r_r_r
- BSHUFFLE 10 ..... 110110 ..... 0 0100 1100 ..... @r_r_r
- FEXPAND 10 ..... 110110 00000 0 0100 1101 ..... @r_r2
+ FALIGNDATAg 10 ..... 110110 ..... 0 0100 1000 ..... @d_d_d
+ FPMERGE 10 ..... 110110 ..... 0 0100 1011 ..... @d_r_r
+ BSHUFFLE 10 ..... 110110 ..... 0 0100 1100 ..... @d_d_d
+ FEXPAND 10 ..... 110110 00000 0 0100 1101 ..... @r_d2
- FSRCd 10 ..... 110110 ..... 0 0111 0100 00000 @r_r1 # FSRC1d
+ FSRCd 10 ..... 110110 ..... 0 0111 0100 00000 @d_d1 # FSRC1d
FSRCs 10 ..... 110110 ..... 0 0111 0101 00000 @r_r1 # FSRC1s
- FSRCd 10 ..... 110110 00000 0 0111 1000 ..... @r_r2 # FSRC2d
+ FSRCd 10 ..... 110110 00000 0 0111 1000 ..... @d_d2 # FSRC2d
FSRCs 10 ..... 110110 00000 0 0111 1001 ..... @r_r2 # FSRC2s
- FNOTd 10 ..... 110110 ..... 0 0110 1010 00000 @r_r1 # FNOT1d
+ FNOTd 10 ..... 110110 ..... 0 0110 1010 00000 @d_d1 # FNOT1d
FNOTs 10 ..... 110110 ..... 0 0110 1011 00000 @r_r1 # FNOT1s
- FNOTd 10 ..... 110110 00000 0 0110 0110 ..... @r_r2 # FNOT2d
+ FNOTd 10 ..... 110110 00000 0 0110 0110 ..... @d_d2 # FNOT2d
FNOTs 10 ..... 110110 00000 0 0110 0111 ..... @r_r2 # FNOT2s
- FPADD16 10 ..... 110110 ..... 0 0101 0000 ..... @r_r_r
+ FPADD16 10 ..... 110110 ..... 0 0101 0000 ..... @d_d_d
FPADD16s 10 ..... 110110 ..... 0 0101 0001 ..... @r_r_r
- FPADD32 10 ..... 110110 ..... 0 0101 0010 ..... @r_r_r
+ FPADD32 10 ..... 110110 ..... 0 0101 0010 ..... @d_d_d
FPADD32s 10 ..... 110110 ..... 0 0101 0011 ..... @r_r_r
- FPSUB16 10 ..... 110110 ..... 0 0101 0100 ..... @r_r_r
+ FPSUB16 10 ..... 110110 ..... 0 0101 0100 ..... @d_d_d
FPSUB16s 10 ..... 110110 ..... 0 0101 0101 ..... @r_r_r
- FPSUB32 10 ..... 110110 ..... 0 0101 0110 ..... @r_r_r
+ FPSUB32 10 ..... 110110 ..... 0 0101 0110 ..... @d_d_d
FPSUB32s 10 ..... 110110 ..... 0 0101 0111 ..... @r_r_r
- FNORd 10 ..... 110110 ..... 0 0110 0010 ..... @r_r_r
+ FNORd 10 ..... 110110 ..... 0 0110 0010 ..... @d_d_d
FNORs 10 ..... 110110 ..... 0 0110 0011 ..... @r_r_r
- FANDNOTd 10 ..... 110110 ..... 0 0110 0100 ..... @r_r_r # FANDNOT2d
+ FANDNOTd 10 ..... 110110 ..... 0 0110 0100 ..... @d_d_d # FANDNOT2d
FANDNOTs 10 ..... 110110 ..... 0 0110 0101 ..... @r_r_r # FANDNOT2s
- FANDNOTd 10 ..... 110110 ..... 0 0110 1000 ..... @r_r_r_swap # ... 1d
+ FANDNOTd 10 ..... 110110 ..... 0 0110 1000 ..... @d_d_d_swap # ... 1d
FANDNOTs 10 ..... 110110 ..... 0 0110 1001 ..... @r_r_r_swap # ... 1s
- FXORd 10 ..... 110110 ..... 0 0110 1100 ..... @r_r_r
+ FXORd 10 ..... 110110 ..... 0 0110 1100 ..... @d_d_d
FXORs 10 ..... 110110 ..... 0 0110 1101 ..... @r_r_r
- FNANDd 10 ..... 110110 ..... 0 0110 1110 ..... @r_r_r
+ FNANDd 10 ..... 110110 ..... 0 0110 1110 ..... @d_d_d
FNANDs 10 ..... 110110 ..... 0 0110 1111 ..... @r_r_r
- FANDd 10 ..... 110110 ..... 0 0111 0000 ..... @r_r_r
+ FANDd 10 ..... 110110 ..... 0 0111 0000 ..... @d_d_d
FANDs 10 ..... 110110 ..... 0 0111 0001 ..... @r_r_r
- FXNORd 10 ..... 110110 ..... 0 0111 0010 ..... @r_r_r
+ FXNORd 10 ..... 110110 ..... 0 0111 0010 ..... @d_d_d
FXNORs 10 ..... 110110 ..... 0 0111 0011 ..... @r_r_r
- FORNOTd 10 ..... 110110 ..... 0 0111 0110 ..... @r_r_r # FORNOT2d
+ FORNOTd 10 ..... 110110 ..... 0 0111 0110 ..... @d_d_d # FORNOT2d
FORNOTs 10 ..... 110110 ..... 0 0111 0111 ..... @r_r_r # FORNOT2s
- FORNOTd 10 ..... 110110 ..... 0 0111 1010 ..... @r_r_r_swap # ... 1d
+ FORNOTd 10 ..... 110110 ..... 0 0111 1010 ..... @d_d_d_swap # ... 1d
FORNOTs 10 ..... 110110 ..... 0 0111 1011 ..... @r_r_r_swap # ... 1s
- FORd 10 ..... 110110 ..... 0 0111 1100 ..... @r_r_r
+ FORd 10 ..... 110110 ..... 0 0111 1100 ..... @d_d_d
FORs 10 ..... 110110 ..... 0 0111 1101 ..... @r_r_r
- FZEROd 10 rd:5 110110 00000 0 0110 0000 00000
+ FZEROd 10 ..... 110110 00000 0 0110 0000 00000 rd=%dfp_rd
FZEROs 10 rd:5 110110 00000 0 0110 0001 00000
- FONEd 10 rd:5 110110 00000 0 0111 1110 00000
+ FONEd 10 ..... 110110 00000 0 0111 1110 00000 rd=%dfp_rd
FONEs 10 rd:5 110110 00000 0 0111 1111 00000
]
NCP 10 ----- 110110 ----- --------- ----- # v8 CPop1
@@ -407,9 +454,6 @@ NCP 10 ----- 110111 ----- --------- ----- # v8 CPop2
## Major Opcode 11 -- load and store instructions
##
-%dfp_rd 25:5 !function=extract_dfpreg
-%qfp_rd 25:5 !function=extract_qfpreg
-
&r_r_ri_asi rd rs1 rs2_or_imm asi imm:bool
@r_r_ri_na .. rd:5 ...... rs1:5 imm:1 rs2_or_imm:s13 &r_r_ri_asi asi=-1
@d_r_ri_na .. ..... ...... rs1:5 imm:1 rs2_or_imm:s13 \
--
2.34.1
* [PATCH 09/41] target/sparc: Remove gen_dest_fpr_D
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (7 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 08/41] target/sparc: Perform DFPREG/QFPREG in decodetree Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-05-10 15:18 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 10/41] target/sparc: Remove cpu_fpr[] Richard Henderson
` (33 subsequent siblings)
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Replace each use with a fresh temporary from tcg_temp_new_i64().
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 27 +++++++++++----------------
1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 97a5c636d2..ddceb25b08 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -250,11 +250,6 @@ static void gen_store_fpr_D(DisasContext *dc, unsigned int dst, TCGv_i64 v)
gen_update_fprs_dirty(dc, dst);
}
-static TCGv_i64 gen_dest_fpr_D(DisasContext *dc, unsigned int dst)
-{
- return cpu_fpr[dst / 2];
-}
-
static TCGv_i128 gen_load_fpr_Q(DisasContext *dc, unsigned int src)
{
TCGv_i128 ret = tcg_temp_new_i128();
@@ -1985,7 +1980,7 @@ static void gen_fmovs(DisasContext *dc, DisasCompare *cmp, int rd, int rs)
static void gen_fmovd(DisasContext *dc, DisasCompare *cmp, int rd, int rs)
{
#ifdef TARGET_SPARC64
- TCGv_i64 dst = gen_dest_fpr_D(dc, rd);
+ TCGv_i64 dst = tcg_temp_new_i64();
tcg_gen_movcond_i64(cmp->cond, dst, cmp->c1, tcg_constant_tl(cmp->c2),
gen_load_fpr_D(dc, rs),
gen_load_fpr_D(dc, rd));
@@ -4326,7 +4321,7 @@ static bool do_dd(DisasContext *dc, arg_r_r *a,
return true;
}
- dst = gen_dest_fpr_D(dc, a->rd);
+ dst = tcg_temp_new_i64();
src = gen_load_fpr_D(dc, a->rs);
func(dst, src);
gen_store_fpr_D(dc, a->rd, dst);
@@ -4348,7 +4343,7 @@ static bool do_env_dd(DisasContext *dc, arg_r_r *a,
return true;
}
- dst = gen_dest_fpr_D(dc, a->rd);
+ dst = tcg_temp_new_i64();
src = gen_load_fpr_D(dc, a->rs);
func(dst, tcg_env, src);
gen_store_fpr_D(dc, a->rd, dst);
@@ -4388,7 +4383,7 @@ static bool do_env_df(DisasContext *dc, arg_r_r *a,
return true;
}
- dst = gen_dest_fpr_D(dc, a->rd);
+ dst = tcg_temp_new_i64();
src = gen_load_fpr_F(dc, a->rs);
func(dst, tcg_env, src);
gen_store_fpr_D(dc, a->rd, dst);
@@ -4479,7 +4474,7 @@ static bool do_env_dq(DisasContext *dc, arg_r_r *a,
}
src = gen_load_fpr_Q(dc, a->rs);
- dst = gen_dest_fpr_D(dc, a->rd);
+ dst = tcg_temp_new_i64();
func(dst, tcg_env, src);
gen_store_fpr_D(dc, a->rd, dst);
return advance_pc(dc);
@@ -4594,7 +4589,7 @@ static bool do_dff(DisasContext *dc, arg_r_r_r *a,
return true;
}
- dst = gen_dest_fpr_D(dc, a->rd);
+ dst = tcg_temp_new_i64();
src1 = gen_load_fpr_F(dc, a->rs1);
src2 = gen_load_fpr_F(dc, a->rs2);
func(dst, src1, src2);
@@ -4618,7 +4613,7 @@ static bool do_dfd(DisasContext *dc, arg_r_r_r *a,
return true;
}
- dst = gen_dest_fpr_D(dc, a->rd);
+ dst = tcg_temp_new_i64();
src1 = gen_load_fpr_F(dc, a->rs1);
src2 = gen_load_fpr_D(dc, a->rs2);
func(dst, src1, src2);
@@ -4637,7 +4632,7 @@ static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
return true;
}
- dst = gen_dest_fpr_D(dc, a->rd);
+ dst = tcg_temp_new_i64();
src1 = gen_load_fpr_D(dc, a->rs1);
src2 = gen_load_fpr_D(dc, a->rs2);
func(dst, src1, src2);
@@ -4702,7 +4697,7 @@ static bool do_env_ddd(DisasContext *dc, arg_r_r_r *a,
return true;
}
- dst = gen_dest_fpr_D(dc, a->rd);
+ dst = tcg_temp_new_i64();
src1 = gen_load_fpr_D(dc, a->rs1);
src2 = gen_load_fpr_D(dc, a->rs2);
func(dst, tcg_env, src1, src2);
@@ -4727,7 +4722,7 @@ static bool trans_FsMULd(DisasContext *dc, arg_r_r_r *a)
return raise_unimpfpop(dc);
}
- dst = gen_dest_fpr_D(dc, a->rd);
+ dst = tcg_temp_new_i64();
src1 = gen_load_fpr_F(dc, a->rs1);
src2 = gen_load_fpr_F(dc, a->rs2);
gen_helper_fsmuld(dst, tcg_env, src1, src2);
@@ -4744,7 +4739,7 @@ static bool do_dddd(DisasContext *dc, arg_r_r_r *a,
return true;
}
- dst = gen_dest_fpr_D(dc, a->rd);
+ dst = tcg_temp_new_i64();
src0 = gen_load_fpr_D(dc, a->rd);
src1 = gen_load_fpr_D(dc, a->rs1);
src2 = gen_load_fpr_D(dc, a->rs2);
--
2.34.1
* [PATCH 10/41] target/sparc: Remove cpu_fpr[]
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (8 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 09/41] target/sparc: Remove gen_dest_fpr_D Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-03-02 5:15 ` [PATCH 11/41] target/sparc: Use gvec for VIS1 parallel add/sub Richard Henderson
` (32 subsequent siblings)
42 siblings, 0 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Use explicit loads and stores to env instead.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
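Note (not part of the patch): a host-side sketch of the offset computation that gen_offset_fpr_F performs, where an even single-precision register is the upper half of its containing double and an odd one is the lower half. SketchDoubleU and SketchState are hypothetical, simplified models of CPU_DoubleU and CPUSPARCState, and the endianness test uses the GCC/Clang __BYTE_ORDER__ macro as an assumption in place of QEMU's HOST_BIG_ENDIAN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified model of CPU_DoubleU: two 32-bit halves overlaying 64 bits. */
typedef union {
    uint64_t ll;
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    struct { uint32_t upper, lower; } l;
#else
    struct { uint32_t lower, upper; } l;
#endif
} SketchDoubleU;

/* Simplified model of the register file inside the CPU state. */
typedef struct {
    SketchDoubleU fpr[32];
} SketchState;

/* Byte offset of single-precision register 'reg' within SketchState. */
size_t sketch_offset_fpr_F(unsigned reg)
{
    size_t off;

    assert(reg < 32);
    off = offsetof(SketchState, fpr) + (reg / 2) * sizeof(SketchDoubleU);
    if (reg & 1) {
        off += offsetof(SketchDoubleU, l.lower);
    } else {
        off += offsetof(SketchDoubleU, l.upper);
    }
    return off;
}

/* Plain 32-bit load at the computed offset, like tcg_gen_ld_i32 on env. */
uint32_t sketch_load_F(const SketchState *env, unsigned reg)
{
    uint32_t v;

    memcpy(&v, (const char *)env + sketch_offset_fpr_F(reg), sizeof(v));
    return v;
}

/* Set the containing double to 'd', then read single register 'reg'. */
uint32_t sketch_read_half(uint64_t d, unsigned reg)
{
    SketchState env;

    memset(&env, 0, sizeof(env));
    env.fpr[reg / 2].ll = d;
    return sketch_load_F(&env, reg);
}
```

For example, with the double holding 0x1122334455667788, reading the even register of the pair yields 0x11223344 and the odd one yields 0x55667788, on either host byte order.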
target/sparc/translate.c | 158 +++++++++++++++++++++------------------
1 file changed, 84 insertions(+), 74 deletions(-)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index ddceb25b08..981d9d9101 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -124,8 +124,7 @@ static TCGv cpu_gsr;
#define cpu_xcc_C ({ qemu_build_not_reached(); NULL; })
#endif
-/* Floating point registers */
-static TCGv_i64 cpu_fpr[TARGET_DPREGS];
+/* Floating point comparison registers */
static TCGv_i32 cpu_fcc[TARGET_FCCREGS];
#define env_field_offsetof(X) offsetof(CPUSPARCState, X)
@@ -218,50 +217,72 @@ static void gen_update_fprs_dirty(DisasContext *dc, int rd)
}
/* floating point registers moves */
+
+static int gen_offset_fpr_F(unsigned int reg)
+{
+ int ret;
+
+ tcg_debug_assert(reg < 32);
+ ret = offsetof(CPUSPARCState, fpr[reg / 2]);
+ if (reg & 1) {
+ ret += offsetof(CPU_DoubleU, l.lower);
+ } else {
+ ret += offsetof(CPU_DoubleU, l.upper);
+ }
+ return ret;
+}
+
static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
{
TCGv_i32 ret = tcg_temp_new_i32();
- if (src & 1) {
- tcg_gen_extrl_i64_i32(ret, cpu_fpr[src / 2]);
- } else {
- tcg_gen_extrh_i64_i32(ret, cpu_fpr[src / 2]);
- }
+ tcg_gen_ld_i32(ret, tcg_env, gen_offset_fpr_F(src));
return ret;
}
static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
{
- TCGv_i64 t = tcg_temp_new_i64();
-
- tcg_gen_extu_i32_i64(t, v);
- tcg_gen_deposit_i64(cpu_fpr[dst / 2], cpu_fpr[dst / 2], t,
- (dst & 1 ? 0 : 32), 32);
+ tcg_gen_st_i32(v, tcg_env, gen_offset_fpr_F(dst));
gen_update_fprs_dirty(dc, dst);
}
+static int gen_offset_fpr_D(unsigned int reg)
+{
+ tcg_debug_assert(reg < 64);
+ tcg_debug_assert(reg % 2 == 0);
+ return offsetof(CPUSPARCState, fpr[reg / 2]);
+}
+
static TCGv_i64 gen_load_fpr_D(DisasContext *dc, unsigned int src)
{
- return cpu_fpr[src / 2];
+ TCGv_i64 ret = tcg_temp_new_i64();
+ tcg_gen_ld_i64(ret, tcg_env, gen_offset_fpr_D(src));
+ return ret;
}
static void gen_store_fpr_D(DisasContext *dc, unsigned int dst, TCGv_i64 v)
{
- tcg_gen_mov_i64(cpu_fpr[dst / 2], v);
+ tcg_gen_st_i64(v, tcg_env, gen_offset_fpr_D(dst));
gen_update_fprs_dirty(dc, dst);
}
static TCGv_i128 gen_load_fpr_Q(DisasContext *dc, unsigned int src)
{
TCGv_i128 ret = tcg_temp_new_i128();
+ TCGv_i64 h = gen_load_fpr_D(dc, src);
+ TCGv_i64 l = gen_load_fpr_D(dc, src + 2);
- tcg_gen_concat_i64_i128(ret, cpu_fpr[src / 2 + 1], cpu_fpr[src / 2]);
+ tcg_gen_concat_i64_i128(ret, l, h);
return ret;
}
static void gen_store_fpr_Q(DisasContext *dc, unsigned int dst, TCGv_i128 v)
{
- tcg_gen_extr_i128_i64(cpu_fpr[dst / 2 + 1], cpu_fpr[dst / 2], v);
- gen_update_fprs_dirty(dc, dst);
+ TCGv_i64 h = tcg_temp_new_i64();
+ TCGv_i64 l = tcg_temp_new_i64();
+
+ tcg_gen_extr_i128_i64(l, h, v);
+ gen_store_fpr_D(dc, dst, h);
+ gen_store_fpr_D(dc, dst + 2, l);
}
/* moves */
@@ -1595,7 +1616,7 @@ static void gen_ldf_asi(DisasContext *dc, DisasASI *da, MemOp orig_size,
MemOp memop = da->memop;
MemOp size = memop & MO_SIZE;
TCGv_i32 d32;
- TCGv_i64 d64;
+ TCGv_i64 d64, l64;
TCGv addr_tmp;
/* TODO: Use 128-bit load/store below. */
@@ -1617,16 +1638,20 @@ static void gen_ldf_asi(DisasContext *dc, DisasASI *da, MemOp orig_size,
break;
case MO_64:
- tcg_gen_qemu_ld_i64(cpu_fpr[rd / 2], addr, da->mem_idx, memop);
+ d64 = tcg_temp_new_i64();
+ tcg_gen_qemu_ld_i64(d64, addr, da->mem_idx, memop);
+ gen_store_fpr_D(dc, rd, d64);
break;
case MO_128:
d64 = tcg_temp_new_i64();
+ l64 = tcg_temp_new_i64();
tcg_gen_qemu_ld_i64(d64, addr, da->mem_idx, memop);
addr_tmp = tcg_temp_new();
tcg_gen_addi_tl(addr_tmp, addr, 8);
- tcg_gen_qemu_ld_i64(cpu_fpr[rd / 2 + 1], addr_tmp, da->mem_idx, memop);
- tcg_gen_mov_i64(cpu_fpr[rd / 2], d64);
+ tcg_gen_qemu_ld_i64(l64, addr_tmp, da->mem_idx, memop);
+ gen_store_fpr_D(dc, rd, d64);
+ gen_store_fpr_D(dc, rd + 2, l64);
break;
default:
g_assert_not_reached();
@@ -1638,9 +1663,11 @@ static void gen_ldf_asi(DisasContext *dc, DisasASI *da, MemOp orig_size,
if (orig_size == MO_64 && (rd & 7) == 0) {
/* The first operation checks required alignment. */
addr_tmp = tcg_temp_new();
+ d64 = tcg_temp_new_i64();
for (int i = 0; ; ++i) {
- tcg_gen_qemu_ld_i64(cpu_fpr[rd / 2 + i], addr, da->mem_idx,
+ tcg_gen_qemu_ld_i64(d64, addr, da->mem_idx,
memop | (i == 0 ? MO_ALIGN_64 : 0));
+ gen_store_fpr_D(dc, rd + 2 * i, d64);
if (i == 7) {
break;
}
@@ -1655,8 +1682,9 @@ static void gen_ldf_asi(DisasContext *dc, DisasASI *da, MemOp orig_size,
case GET_ASI_SHORT:
/* Valid for lddfa only. */
if (orig_size == MO_64) {
- tcg_gen_qemu_ld_i64(cpu_fpr[rd / 2], addr, da->mem_idx,
- memop | MO_ALIGN);
+ d64 = tcg_temp_new_i64();
+ tcg_gen_qemu_ld_i64(d64, addr, da->mem_idx, memop | MO_ALIGN);
+ gen_store_fpr_D(dc, rd, d64);
} else {
gen_exception(dc, TT_ILL_INSN);
}
@@ -1681,17 +1709,19 @@ static void gen_ldf_asi(DisasContext *dc, DisasASI *da, MemOp orig_size,
gen_store_fpr_F(dc, rd, d32);
break;
case MO_64:
- gen_helper_ld_asi(cpu_fpr[rd / 2], tcg_env, addr,
- r_asi, r_mop);
+ d64 = tcg_temp_new_i64();
+ gen_helper_ld_asi(d64, tcg_env, addr, r_asi, r_mop);
+ gen_store_fpr_D(dc, rd, d64);
break;
case MO_128:
d64 = tcg_temp_new_i64();
+ l64 = tcg_temp_new_i64();
gen_helper_ld_asi(d64, tcg_env, addr, r_asi, r_mop);
addr_tmp = tcg_temp_new();
tcg_gen_addi_tl(addr_tmp, addr, 8);
- gen_helper_ld_asi(cpu_fpr[rd / 2 + 1], tcg_env, addr_tmp,
- r_asi, r_mop);
- tcg_gen_mov_i64(cpu_fpr[rd / 2], d64);
+ gen_helper_ld_asi(l64, tcg_env, addr_tmp, r_asi, r_mop);
+ gen_store_fpr_D(dc, rd, d64);
+ gen_store_fpr_D(dc, rd + 2, l64);
break;
default:
g_assert_not_reached();
@@ -1707,6 +1737,7 @@ static void gen_stf_asi(DisasContext *dc, DisasASI *da, MemOp orig_size,
MemOp memop = da->memop;
MemOp size = memop & MO_SIZE;
TCGv_i32 d32;
+ TCGv_i64 d64;
TCGv addr_tmp;
/* TODO: Use 128-bit load/store below. */
@@ -1726,8 +1757,8 @@ static void gen_stf_asi(DisasContext *dc, DisasASI *da, MemOp orig_size,
tcg_gen_qemu_st_i32(d32, addr, da->mem_idx, memop | MO_ALIGN);
break;
case MO_64:
- tcg_gen_qemu_st_i64(cpu_fpr[rd / 2], addr, da->mem_idx,
- memop | MO_ALIGN_4);
+ d64 = gen_load_fpr_D(dc, rd);
+ tcg_gen_qemu_st_i64(d64, addr, da->mem_idx, memop | MO_ALIGN_4);
break;
case MO_128:
/* Only 4-byte alignment required. However, it is legal for the
@@ -1735,11 +1766,12 @@ static void gen_stf_asi(DisasContext *dc, DisasASI *da, MemOp orig_size,
required to fix it up. Requiring 16-byte alignment here avoids
having to probe the second page before performing the first
write. */
- tcg_gen_qemu_st_i64(cpu_fpr[rd / 2], addr, da->mem_idx,
- memop | MO_ALIGN_16);
+ d64 = gen_load_fpr_D(dc, rd);
+ tcg_gen_qemu_st_i64(d64, addr, da->mem_idx, memop | MO_ALIGN_16);
addr_tmp = tcg_temp_new();
tcg_gen_addi_tl(addr_tmp, addr, 8);
- tcg_gen_qemu_st_i64(cpu_fpr[rd / 2 + 1], addr_tmp, da->mem_idx, memop);
+ d64 = gen_load_fpr_D(dc, rd + 2);
+ tcg_gen_qemu_st_i64(d64, addr_tmp, da->mem_idx, memop);
break;
default:
g_assert_not_reached();
@@ -1752,7 +1784,8 @@ static void gen_stf_asi(DisasContext *dc, DisasASI *da, MemOp orig_size,
/* The first operation checks required alignment. */
addr_tmp = tcg_temp_new();
for (int i = 0; ; ++i) {
- tcg_gen_qemu_st_i64(cpu_fpr[rd / 2 + i], addr, da->mem_idx,
+ d64 = gen_load_fpr_D(dc, rd + 2 * i);
+ tcg_gen_qemu_st_i64(d64, addr, da->mem_idx,
memop | (i == 0 ? MO_ALIGN_64 : 0));
if (i == 7) {
break;
@@ -1768,8 +1801,8 @@ static void gen_stf_asi(DisasContext *dc, DisasASI *da, MemOp orig_size,
case GET_ASI_SHORT:
/* Valid for stdfa only. */
if (orig_size == MO_64) {
- tcg_gen_qemu_st_i64(cpu_fpr[rd / 2], addr, da->mem_idx,
- memop | MO_ALIGN);
+ d64 = gen_load_fpr_D(dc, rd);
+ tcg_gen_qemu_st_i64(d64, addr, da->mem_idx, memop | MO_ALIGN);
} else {
gen_exception(dc, TT_ILL_INSN);
}
@@ -1994,13 +2027,17 @@ static void gen_fmovq(DisasContext *dc, DisasCompare *cmp, int rd, int rs)
{
#ifdef TARGET_SPARC64
TCGv c2 = tcg_constant_tl(cmp->c2);
+ TCGv_i64 h = tcg_temp_new_i64();
+ TCGv_i64 l = tcg_temp_new_i64();
- tcg_gen_movcond_i64(cmp->cond, cpu_fpr[rd / 2], cmp->c1, c2,
- cpu_fpr[rs / 2], cpu_fpr[rd / 2]);
- tcg_gen_movcond_i64(cmp->cond, cpu_fpr[rd / 2 + 1], cmp->c1, c2,
- cpu_fpr[rs / 2 + 1], cpu_fpr[rd / 2 + 1]);
-
- gen_update_fprs_dirty(dc, rd);
+ tcg_gen_movcond_i64(cmp->cond, h, cmp->c1, c2,
+ gen_load_fpr_D(dc, rs),
+ gen_load_fpr_D(dc, rd));
+ tcg_gen_movcond_i64(cmp->cond, l, cmp->c1, c2,
+ gen_load_fpr_D(dc, rs + 2),
+ gen_load_fpr_D(dc, rd + 2));
+ gen_store_fpr_D(dc, rd, h);
+ gen_store_fpr_D(dc, rd + 2, l);
#else
qemu_build_not_reached();
#endif
@@ -4192,39 +4229,24 @@ static bool do_stfsr(DisasContext *dc, arg_r_r_ri *a, MemOp mop)
TRANS(STFSR, ALL, do_stfsr, a, MO_TEUL)
TRANS(STXFSR, 64, do_stfsr, a, MO_TEUQ)
-static bool do_fc(DisasContext *dc, int rd, bool c)
+static bool do_fc(DisasContext *dc, int rd, int32_t c)
{
- uint64_t mask;
-
if (gen_trap_ifnofpu(dc)) {
return true;
}
-
- if (rd & 1) {
- mask = MAKE_64BIT_MASK(0, 32);
- } else {
- mask = MAKE_64BIT_MASK(32, 32);
- }
- if (c) {
- tcg_gen_ori_i64(cpu_fpr[rd / 2], cpu_fpr[rd / 2], mask);
- } else {
- tcg_gen_andi_i64(cpu_fpr[rd / 2], cpu_fpr[rd / 2], ~mask);
- }
- gen_update_fprs_dirty(dc, rd);
+ gen_store_fpr_F(dc, rd, tcg_constant_i32(c));
return advance_pc(dc);
}
TRANS(FZEROs, VIS1, do_fc, a->rd, 0)
-TRANS(FONEs, VIS1, do_fc, a->rd, 1)
+TRANS(FONEs, VIS1, do_fc, a->rd, -1)
static bool do_dc(DisasContext *dc, int rd, int64_t c)
{
if (gen_trap_ifnofpu(dc)) {
return true;
}
-
- tcg_gen_movi_i64(cpu_fpr[rd / 2], c);
- gen_update_fprs_dirty(dc, rd);
+ gen_store_fpr_D(dc, rd, tcg_constant_i64(c));
return advance_pc(dc);
}
@@ -5128,12 +5150,6 @@ void sparc_tcg_init(void)
"l0", "l1", "l2", "l3", "l4", "l5", "l6", "l7",
"i0", "i1", "i2", "i3", "i4", "i5", "i6", "i7",
};
- static const char fregnames[32][4] = {
- "f0", "f2", "f4", "f6", "f8", "f10", "f12", "f14",
- "f16", "f18", "f20", "f22", "f24", "f26", "f28", "f30",
- "f32", "f34", "f36", "f38", "f40", "f42", "f44", "f46",
- "f48", "f50", "f52", "f54", "f56", "f58", "f60", "f62",
- };
static const struct { TCGv_i32 *ptr; int off; const char *name; } r32[] = {
#ifdef TARGET_SPARC64
@@ -5190,12 +5206,6 @@ void sparc_tcg_init(void)
(i - 8) * sizeof(target_ulong),
gregnames[i]);
}
-
- for (i = 0; i < TARGET_DPREGS; i++) {
- cpu_fpr[i] = tcg_global_mem_new_i64(tcg_env,
- offsetof(CPUSPARCState, fpr[i]),
- fregnames[i]);
- }
}
void sparc_restore_state_to_opc(CPUState *cs,
--
2.34.1
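
As a rough illustration of the register layout this patch depends on: the removed do_fc code shows that an even single-precision register lives in bits 32..63 of its doubleword slot and an odd one in bits 0..31. The standalone model below is only a sketch under that assumption; model_store_fpr_F and MODEL_DPREGS are hypothetical names, not QEMU code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the FP register file as 32 doubleword slots. */
#define MODEL_DPREGS 32

/* Write single-precision register rd into its half of slot rd / 2. */
static void model_store_fpr_F(uint64_t fpr[MODEL_DPREGS], unsigned rd,
                              uint32_t val)
{
    assert(rd / 2 < MODEL_DPREGS);
    uint64_t *slot = &fpr[rd / 2];

    if (rd & 1) {
        /* Odd register: replace bits 0..31, keep the upper half. */
        *slot = (*slot & 0xffffffff00000000ull) | val;
    } else {
        /* Even register: replace bits 32..63, keep the lower half. */
        *slot = (*slot & 0x00000000ffffffffull) | ((uint64_t)val << 32);
    }
}
```

With this layout, FONEs amounts to storing the 32-bit constant -1 into the selected half, which is why do_fc can now take an int32_t and call a plain store.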
* [PATCH 11/41] target/sparc: Use gvec for VIS1 parallel add/sub
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (9 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 10/41] target/sparc: Remove cpu_fpr[] Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-05-10 15:21 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 12/41] target/sparc: Implement FMAf extension Richard Henderson
` (31 subsequent siblings)
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 981d9d9101..ee3da73551 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4645,6 +4645,20 @@ static bool do_dfd(DisasContext *dc, arg_r_r_r *a,
TRANS(FMUL8x16, VIS1, do_dfd, a, gen_helper_fmul8x16)
+static bool do_gvec_ddd(DisasContext *dc, arg_r_r_r *a, MemOp vece,
+ void (*func)(unsigned, uint32_t, uint32_t,
+ uint32_t, uint32_t, uint32_t))
+{
+ func(vece, gen_offset_fpr_D(a->rd), gen_offset_fpr_D(a->rs1),
+ gen_offset_fpr_D(a->rs2), 8, 8);
+ return advance_pc(dc);
+}
+
+TRANS(FPADD16, VIS1, do_gvec_ddd, a, MO_16, tcg_gen_gvec_add)
+TRANS(FPADD32, VIS1, do_gvec_ddd, a, MO_32, tcg_gen_gvec_add)
+TRANS(FPSUB16, VIS1, do_gvec_ddd, a, MO_16, tcg_gen_gvec_sub)
+TRANS(FPSUB32, VIS1, do_gvec_ddd, a, MO_32, tcg_gen_gvec_sub)
+
static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_i64, TCGv_i64))
{
@@ -4665,10 +4679,6 @@ static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
TRANS(FMUL8SUx16, VIS1, do_ddd, a, gen_helper_fmul8sux16)
TRANS(FMUL8ULx16, VIS1, do_ddd, a, gen_helper_fmul8ulx16)
-TRANS(FPADD16, VIS1, do_ddd, a, tcg_gen_vec_add16_i64)
-TRANS(FPADD32, VIS1, do_ddd, a, tcg_gen_vec_add32_i64)
-TRANS(FPSUB16, VIS1, do_ddd, a, tcg_gen_vec_sub16_i64)
-TRANS(FPSUB32, VIS1, do_ddd, a, tcg_gen_vec_sub32_i64)
TRANS(FNORd, VIS1, do_ddd, a, tcg_gen_nor_i64)
TRANS(FANDNOTd, VIS1, do_ddd, a, tcg_gen_andc_i64)
TRANS(FXORd, VIS1, do_ddd, a, tcg_gen_xor_i64)
--
2.34.1
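
For readers unfamiliar with the semantics being vectorized here: FPADD16 treats a 64-bit register as four independent 16-bit lanes, with no carry crossing a lane boundary. The sketch below models that in plain C, once lane by lane and once with a masking trick. It is an illustration only; the model_* names are hypothetical and this is not how gvec is implemented.

```c
#include <stdint.h>

/* Reference: add each 16-bit lane separately, wrapping modulo 2^16. */
static uint64_t model_fpadd16_lanes(uint64_t a, uint64_t b)
{
    uint64_t r = 0;

    for (unsigned shift = 0; shift < 64; shift += 16) {
        uint16_t lane = (uint16_t)((a >> shift) + (b >> shift));
        r |= (uint64_t)lane << shift;
    }
    return r;
}

/* Same result in one 64-bit add: clear each lane's top bit so no carry
 * can leave the lane, then restore the top bits with xor. */
static uint64_t model_fpadd16_swar(uint64_t a, uint64_t b)
{
    const uint64_t H = 0x8000800080008000ull;
    uint64_t low = (a & ~H) + (b & ~H);

    return low ^ ((a ^ b) & H);
}
```

Both functions should agree for all inputs; the second form shows why a per-lane add is cheap even without host vector support.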
* [PATCH 12/41] target/sparc: Implement FMAf extension
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (10 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 11/41] target/sparc: Use gvec for VIS1 parallel add/sub Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-03-02 5:15 ` [PATCH 13/41] target/sparc: Add feature bits for VIS 3 Richard Henderson
` (30 subsequent siblings)
42 siblings, 0 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Rearrange PDIST so that do_dddd is general purpose and may
be re-used for FMADDd etc.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 2 +
linux-user/elfload.c | 1 +
target/sparc/cpu.c | 3 ++
target/sparc/fop_helper.c | 16 +++++++
target/sparc/translate.c | 84 ++++++++++++++++++++++++++++++++--
target/sparc/cpu-feature.h.inc | 1 +
target/sparc/insns.decode | 23 +++++++++-
7 files changed, 124 insertions(+), 6 deletions(-)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index f0576fb748..63ae398841 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -53,6 +53,7 @@ DEF_HELPER_FLAGS_3(faddd, TCG_CALL_NO_WG, f64, env, f64, f64)
DEF_HELPER_FLAGS_3(fsubd, TCG_CALL_NO_WG, f64, env, f64, f64)
DEF_HELPER_FLAGS_3(fmuld, TCG_CALL_NO_WG, f64, env, f64, f64)
DEF_HELPER_FLAGS_3(fdivd, TCG_CALL_NO_WG, f64, env, f64, f64)
+DEF_HELPER_FLAGS_5(fmaddd, TCG_CALL_NO_WG, f64, env, f64, f64, f64, i32)
DEF_HELPER_FLAGS_3(faddq, TCG_CALL_NO_WG, i128, env, i128, i128)
DEF_HELPER_FLAGS_3(fsubq, TCG_CALL_NO_WG, i128, env, i128, i128)
@@ -63,6 +64,7 @@ DEF_HELPER_FLAGS_3(fadds, TCG_CALL_NO_WG, f32, env, f32, f32)
DEF_HELPER_FLAGS_3(fsubs, TCG_CALL_NO_WG, f32, env, f32, f32)
DEF_HELPER_FLAGS_3(fmuls, TCG_CALL_NO_WG, f32, env, f32, f32)
DEF_HELPER_FLAGS_3(fdivs, TCG_CALL_NO_WG, f32, env, f32, f32)
+DEF_HELPER_FLAGS_5(fmadds, TCG_CALL_NO_WG, f32, env, f32, f32, f32, i32)
DEF_HELPER_FLAGS_3(fsmuld, TCG_CALL_NO_WG, f64, env, f32, f32)
DEF_HELPER_FLAGS_3(fdmulq, TCG_CALL_NO_WG, i128, env, f64, f64)
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 6041270f1c..5ebf2bf789 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -997,6 +997,7 @@ static uint32_t get_elf_hwcap(void)
r |= features & CPU_FEATURE_FSMULD ? HWCAP_SPARC_FSMULD : 0;
r |= features & CPU_FEATURE_VIS1 ? HWCAP_SPARC_VIS : 0;
r |= features & CPU_FEATURE_VIS2 ? HWCAP_SPARC_VIS2 : 0;
+ r |= features & CPU_FEATURE_FMAF ? HWCAP_SPARC_FMAF : 0;
#endif
return r;
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index 313ebc4c11..491e627899 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -551,6 +551,7 @@ static const char * const feature_name[] = {
[CPU_FEATURE_BIT_HYPV] = "hypv",
[CPU_FEATURE_BIT_VIS1] = "vis1",
[CPU_FEATURE_BIT_VIS2] = "vis2",
+ [CPU_FEATURE_BIT_FMAF] = "fmaf",
#else
[CPU_FEATURE_BIT_MUL] = "mul",
[CPU_FEATURE_BIT_DIV] = "div",
@@ -873,6 +874,8 @@ static Property sparc_cpu_properties[] = {
CPU_FEATURE_BIT_VIS1, false),
DEFINE_PROP_BIT("vis2", SPARCCPU, env.def.features,
CPU_FEATURE_BIT_VIS2, false),
+ DEFINE_PROP_BIT("fmaf", SPARCCPU, env.def.features,
+ CPU_FEATURE_BIT_FMAF, false),
#else
DEFINE_PROP_BIT("mul", SPARCCPU, env.def.features,
CPU_FEATURE_BIT_MUL, false),
diff --git a/target/sparc/fop_helper.c b/target/sparc/fop_helper.c
index 1205a599ef..1de44d79c1 100644
--- a/target/sparc/fop_helper.c
+++ b/target/sparc/fop_helper.c
@@ -343,6 +343,22 @@ Int128 helper_fsqrtq(CPUSPARCState *env, Int128 src)
return f128_ret(ret);
}
+float32 helper_fmadds(CPUSPARCState *env, float32 s1,
+ float32 s2, float32 s3, uint32_t op)
+{
+ float32 ret = float32_muladd(s1, s2, s3, op, &env->fp_status);
+ check_ieee_exceptions(env, GETPC());
+ return ret;
+}
+
+float64 helper_fmaddd(CPUSPARCState *env, float64 s1,
+ float64 s2, float64 s3, uint32_t op)
+{
+ float64 ret = float64_muladd(s1, s2, s3, op, &env->fp_status);
+ check_ieee_exceptions(env, GETPC());
+ return ret;
+}
+
static uint32_t finish_fcmp(CPUSPARCState *env, FloatRelation r, uintptr_t ra)
{
check_ieee_exceptions(env, ra);
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index ee3da73551..1178fca9e3 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -29,6 +29,7 @@
#include "exec/helper-gen.h"
#include "exec/translator.h"
#include "exec/log.h"
+#include "fpu/softfloat.h"
#include "asi.h"
#define HELPER_H "helper.h"
@@ -1151,6 +1152,52 @@ static void gen_op_fabsq(TCGv_i128 dst, TCGv_i128 src)
tcg_gen_concat_i64_i128(dst, l, h);
}
+static void gen_op_fmadds(TCGv_i32 d, TCGv_i32 s1, TCGv_i32 s2, TCGv_i32 s3)
+{
+ gen_helper_fmadds(d, tcg_env, s1, s2, s3, tcg_constant_i32(0));
+}
+
+static void gen_op_fmaddd(TCGv_i64 d, TCGv_i64 s1, TCGv_i64 s2, TCGv_i64 s3)
+{
+ gen_helper_fmaddd(d, tcg_env, s1, s2, s3, tcg_constant_i32(0));
+}
+
+static void gen_op_fmsubs(TCGv_i32 d, TCGv_i32 s1, TCGv_i32 s2, TCGv_i32 s3)
+{
+ int op = float_muladd_negate_c;
+ gen_helper_fmadds(d, tcg_env, s1, s2, s3, tcg_constant_i32(op));
+}
+
+static void gen_op_fmsubd(TCGv_i64 d, TCGv_i64 s1, TCGv_i64 s2, TCGv_i64 s3)
+{
+ int op = float_muladd_negate_c;
+ gen_helper_fmaddd(d, tcg_env, s1, s2, s3, tcg_constant_i32(op));
+}
+
+static void gen_op_fnmsubs(TCGv_i32 d, TCGv_i32 s1, TCGv_i32 s2, TCGv_i32 s3)
+{
+ int op = float_muladd_negate_c | float_muladd_negate_result;
+ gen_helper_fmadds(d, tcg_env, s1, s2, s3, tcg_constant_i32(op));
+}
+
+static void gen_op_fnmsubd(TCGv_i64 d, TCGv_i64 s1, TCGv_i64 s2, TCGv_i64 s3)
+{
+ int op = float_muladd_negate_c | float_muladd_negate_result;
+ gen_helper_fmaddd(d, tcg_env, s1, s2, s3, tcg_constant_i32(op));
+}
+
+static void gen_op_fnmadds(TCGv_i32 d, TCGv_i32 s1, TCGv_i32 s2, TCGv_i32 s3)
+{
+ int op = float_muladd_negate_result;
+ gen_helper_fmadds(d, tcg_env, s1, s2, s3, tcg_constant_i32(op));
+}
+
+static void gen_op_fnmaddd(TCGv_i64 d, TCGv_i64 s1, TCGv_i64 s2, TCGv_i64 s3)
+{
+ int op = float_muladd_negate_result;
+ gen_helper_fmaddd(d, tcg_env, s1, s2, s3, tcg_constant_i32(op));
+}
+
static void gen_op_fpexception_im(DisasContext *dc, int ftt)
{
/*
@@ -2093,6 +2140,7 @@ static int extract_qfpreg(DisasContext *dc, int x)
# define avail_MUL(C) true
# define avail_POWERDOWN(C) false
# define avail_64(C) true
+# define avail_FMAF(C) ((C)->def->features & CPU_FEATURE_FMAF)
# define avail_GL(C) ((C)->def->features & CPU_FEATURE_GL)
# define avail_HYPV(C) ((C)->def->features & CPU_FEATURE_HYPV)
# define avail_VIS1(C) ((C)->def->features & CPU_FEATURE_VIS1)
@@ -2105,6 +2153,7 @@ static int extract_qfpreg(DisasContext *dc, int x)
# define avail_MUL(C) ((C)->def->features & CPU_FEATURE_MUL)
# define avail_POWERDOWN(C) ((C)->def->features & CPU_FEATURE_POWERDOWN)
# define avail_64(C) false
+# define avail_FMAF(C) false
# define avail_GL(C) false
# define avail_HYPV(C) false
# define avail_VIS1(C) false
@@ -4762,25 +4811,52 @@ static bool trans_FsMULd(DisasContext *dc, arg_r_r_r *a)
return advance_pc(dc);
}
-static bool do_dddd(DisasContext *dc, arg_r_r_r *a,
+static bool do_ffff(DisasContext *dc, arg_r_r_r_r *a,
+ void (*func)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32))
+{
+ TCGv_i32 dst, src1, src2, src3;
+
+ if (gen_trap_ifnofpu(dc)) {
+ return true;
+ }
+
+ src1 = gen_load_fpr_F(dc, a->rs1);
+ src2 = gen_load_fpr_F(dc, a->rs2);
+ src3 = gen_load_fpr_F(dc, a->rs3);
+ dst = tcg_temp_new_i32();
+ func(dst, src1, src2, src3);
+ gen_store_fpr_F(dc, a->rd, dst);
+ return advance_pc(dc);
+}
+
+TRANS(FMADDs, FMAF, do_ffff, a, gen_op_fmadds)
+TRANS(FMSUBs, FMAF, do_ffff, a, gen_op_fmsubs)
+TRANS(FNMSUBs, FMAF, do_ffff, a, gen_op_fnmsubs)
+TRANS(FNMADDs, FMAF, do_ffff, a, gen_op_fnmadds)
+
+static bool do_dddd(DisasContext *dc, arg_r_r_r_r *a,
void (*func)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64))
{
- TCGv_i64 dst, src0, src1, src2;
+ TCGv_i64 dst, src1, src2, src3;
if (gen_trap_ifnofpu(dc)) {
return true;
}
dst = tcg_temp_new_i64();
- src0 = gen_load_fpr_D(dc, a->rd);
src1 = gen_load_fpr_D(dc, a->rs1);
src2 = gen_load_fpr_D(dc, a->rs2);
- func(dst, src0, src1, src2);
+ src3 = gen_load_fpr_D(dc, a->rs3);
+ func(dst, src1, src2, src3);
gen_store_fpr_D(dc, a->rd, dst);
return advance_pc(dc);
}
TRANS(PDIST, VIS1, do_dddd, a, gen_helper_pdist)
+TRANS(FMADDd, FMAF, do_dddd, a, gen_op_fmaddd)
+TRANS(FMSUBd, FMAF, do_dddd, a, gen_op_fmsubd)
+TRANS(FNMSUBd, FMAF, do_dddd, a, gen_op_fnmsubd)
+TRANS(FNMADDd, FMAF, do_dddd, a, gen_op_fnmaddd)
static bool do_env_qqq(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i128, TCGv_env, TCGv_i128, TCGv_i128))
diff --git a/target/sparc/cpu-feature.h.inc b/target/sparc/cpu-feature.h.inc
index d800f18c4e..a30b9255b2 100644
--- a/target/sparc/cpu-feature.h.inc
+++ b/target/sparc/cpu-feature.h.inc
@@ -12,3 +12,4 @@ FEATURE(ASR17)
FEATURE(CACHE_CTRL)
FEATURE(POWERDOWN)
FEATURE(CASA)
+FEATURE(FMAF)
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 2c23868fc3..6d5fa26e90 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -29,6 +29,7 @@ CALL 01 i:s30
%dfp_rd 25:5 !function=extract_dfpreg
%dfp_rs1 14:5 !function=extract_dfpreg
%dfp_rs2 0:5 !function=extract_dfpreg
+%dfp_rs3 9:5 !function=extract_dfpreg
%qfp_rd 25:5 !function=extract_qfpreg
%qfp_rs1 14:5 !function=extract_qfpreg
@@ -80,6 +81,11 @@ CALL 01 i:s30
@q_d2 .. ..... ...... ..... . ........ ..... \
&r_r rd=%qfp_rd rs=%dfp_rs2
+&r_r_r_r rd rs1 rs2 rs3
+@r_r_r_r .. rd:5 ...... rs1:5 rs3:5 .... rs2:5 &r_r_r_r
+@d_d_d_d .. ..... ...... ..... ..... .... ..... \
+ &r_r_r_r rd=%dfp_rd rs1=%dfp_rs1 rs2=%dfp_rs2 rs3=%dfp_rs3
+
{
[
STBAR 10 00000 101000 01111 0 0000000000000
@@ -394,7 +400,8 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
FPACK32 10 ..... 110110 ..... 0 0011 1010 ..... @d_d_d
FPACK16 10 ..... 110110 00000 0 0011 1011 ..... @d_d2
FPACKFIX 10 ..... 110110 00000 0 0011 1101 ..... @d_d2
- PDIST 10 ..... 110110 ..... 0 0011 1110 ..... @d_d_d
+ PDIST 10 ..... 110110 ..... 0 0011 1110 ..... \
+ &r_r_r_r rd=%dfp_rd rs1=%dfp_rd rs2=%dfp_rs1 rs3=%dfp_rs2
FALIGNDATAg 10 ..... 110110 ..... 0 0100 1000 ..... @d_d_d
FPMERGE 10 ..... 110110 ..... 0 0100 1011 ..... @d_r_r
@@ -448,7 +455,19 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
NCP 10 ----- 110110 ----- --------- ----- # v8 CPop1
}
-NCP 10 ----- 110111 ----- --------- ----- # v8 CPop2
+{
+ [
+ FMADDs 10 ..... 110111 ..... ..... 0001 ..... @r_r_r_r
+ FMADDd 10 ..... 110111 ..... ..... 0010 ..... @d_d_d_d
+ FMSUBs 10 ..... 110111 ..... ..... 0101 ..... @r_r_r_r
+ FMSUBd 10 ..... 110111 ..... ..... 0110 ..... @d_d_d_d
+ FNMSUBs 10 ..... 110111 ..... ..... 1001 ..... @r_r_r_r
+ FNMSUBd 10 ..... 110111 ..... ..... 1010 ..... @d_d_d_d
+ FNMADDs 10 ..... 110111 ..... ..... 1101 ..... @r_r_r_r
+ FNMADDd 10 ..... 110111 ..... ..... 1110 ..... @d_d_d_d
+ ]
+ NCP 10 ----- 110111 ----- --------- ----- # v8 CPop2
+}
##
## Major Opcode 11 -- load and store instructions
--
2.34.1
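
The four operations in this patch differ only in the flags passed to the muladd helper: FMSUB uses float_muladd_negate_c, FNMADD uses float_muladd_negate_result, and FNMSUB uses both. A minimal sketch of those sign conventions follows. Note the assumptions: the MODEL_* flags are hypothetical stand-ins for the softfloat flags, and the arithmetic here is an ordinary multiply followed by an add, so it does not reproduce the single rounding of a real fused operation.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the softfloat muladd flags. */
enum {
    MODEL_NEGATE_C      = 1 << 0,
    MODEL_NEGATE_RESULT = 1 << 1,
};

/* Sign handling only; NOT fused, so rounding may differ from a real FMA. */
static double model_muladd(double a, double b, double c, int flags)
{
    if (flags & MODEL_NEGATE_C) {
        c = -c;
    }
    double r = a * b + c;
    if (flags & MODEL_NEGATE_RESULT) {
        r = -r;
    }
    return r;
}
```

With a = 2, b = 3, c = 4 the four flag combinations give 10 (FMADD), 2 (FMSUB), -2 (FNMSUB) and -10 (FNMADD).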
* [PATCH 13/41] target/sparc: Add feature bits for VIS 3
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (11 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 12/41] target/sparc: Implement FMAf extension Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-05-10 17:05 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 14/41] target/sparc: Implement ADDXC, ADDXCcc Richard Henderson
` (29 subsequent siblings)
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
The manual separates VIS 3 and VIS 3B, even though they are both
present in all extant cpus. For clarity, let the translator
match the manual but otherwise leave them on the same feature bit.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 4 ++++
target/sparc/cpu-feature.h.inc | 1 +
2 files changed, 5 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 1178fca9e3..0ebb9c3aa9 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -2145,6 +2145,8 @@ static int extract_qfpreg(DisasContext *dc, int x)
# define avail_HYPV(C) ((C)->def->features & CPU_FEATURE_HYPV)
# define avail_VIS1(C) ((C)->def->features & CPU_FEATURE_VIS1)
# define avail_VIS2(C) ((C)->def->features & CPU_FEATURE_VIS2)
+# define avail_VIS3(C) ((C)->def->features & CPU_FEATURE_VIS3)
+# define avail_VIS3B(C) avail_VIS3(C)
#else
# define avail_32(C) true
# define avail_ASR17(C) ((C)->def->features & CPU_FEATURE_ASR17)
@@ -2158,6 +2160,8 @@ static int extract_qfpreg(DisasContext *dc, int x)
# define avail_HYPV(C) false
# define avail_VIS1(C) false
# define avail_VIS2(C) false
+# define avail_VIS3(C) false
+# define avail_VIS3B(C) false
#endif
/* Default case for non jump instructions. */
diff --git a/target/sparc/cpu-feature.h.inc b/target/sparc/cpu-feature.h.inc
index a30b9255b2..3913fb4a54 100644
--- a/target/sparc/cpu-feature.h.inc
+++ b/target/sparc/cpu-feature.h.inc
@@ -13,3 +13,4 @@ FEATURE(CACHE_CTRL)
FEATURE(POWERDOWN)
FEATURE(CASA)
FEATURE(FMAF)
+FEATURE(VIS3)
--
2.34.1
* [PATCH 14/41] target/sparc: Implement ADDXC, ADDXCcc
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (12 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 13/41] target/sparc: Add feature bits for VIS 3 Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-05-10 16:16 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 15/41] target/sparc: Implement CMASK instructions Richard Henderson
` (28 subsequent siblings)
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 14 ++++++++++++++
target/sparc/insns.decode | 3 +++
2 files changed, 17 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 0ebb9c3aa9..0b6d92d0a8 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -442,6 +442,17 @@ static void gen_op_addccc(TCGv dst, TCGv src1, TCGv src2)
gen_op_addcc_int(dst, src1, src2, gen_carry32());
}
+static void gen_op_addxc(TCGv dst, TCGv src1, TCGv src2)
+{
+ tcg_gen_add_tl(dst, src1, src2);
+ tcg_gen_add_tl(dst, dst, cpu_cc_C);
+}
+
+static void gen_op_addxccc(TCGv dst, TCGv src1, TCGv src2)
+{
+ gen_op_addcc_int(dst, src1, src2, cpu_cc_C);
+}
+
static void gen_op_subcc_int(TCGv dst, TCGv src1, TCGv src2, TCGv cin)
{
TCGv z = tcg_constant_tl(0);
@@ -3673,6 +3684,9 @@ TRANS(ARRAY8, VIS1, do_rrr, a, gen_helper_array8)
TRANS(ARRAY16, VIS1, do_rrr, a, gen_op_array16)
TRANS(ARRAY32, VIS1, do_rrr, a, gen_op_array32)
+TRANS(ADDXC, VIS3, do_rrr, a, gen_op_addxc)
+TRANS(ADDXCcc, VIS3, do_rrr, a, gen_op_addxccc)
+
static void gen_op_alignaddr(TCGv dst, TCGv s1, TCGv s2)
{
#ifdef TARGET_SPARC64
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 6d5fa26e90..07796b8fe2 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -376,6 +376,9 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
ARRAY16 10 ..... 110110 ..... 0 0001 0010 ..... @r_r_r
ARRAY32 10 ..... 110110 ..... 0 0001 0100 ..... @r_r_r
+ ADDXC 10 ..... 110110 ..... 0 0001 0001 ..... @r_r_r
+ ADDXCcc 10 ..... 110110 ..... 0 0001 0011 ..... @r_r_r
+
ALIGNADDR 10 ..... 110110 ..... 0 0001 1000 ..... @r_r_r
ALIGNADDRL 10 ..... 110110 ..... 0 0001 1010 ..... @r_r_r
--
2.34.1
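
A small standalone model of the arithmetic may help here. It assumes, as gen_op_addxc suggests, that the carry input is a single 0/1 value from the previous 64-bit operation. model_addxc is a hypothetical name, and only the carry flag is modelled; ADDXCcc would also update the other condition codes.

```c
#include <stdbool.h>
#include <stdint.h>

/* dst = s1 + s2 + carry_in over 64 bits; optionally report carry out. */
static uint64_t model_addxc(uint64_t s1, uint64_t s2, bool carry_in,
                            bool *carry_out)
{
    uint64_t t = s1 + s2;
    bool c1 = t < s1;            /* carry out of the first add */
    uint64_t r = t + carry_in;
    bool c2 = r < t;             /* carry out of adding the carry bit */

    if (carry_out) {
        *carry_out = c1 || c2;   /* at most one of the two can be set */
    }
    return r;
}
```

Chaining calls with the carry out of one limb fed into the next gives multi-word addition, which is the usual reason to want a 64-bit add-with-carry.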
* [PATCH 15/41] target/sparc: Implement CMASK instructions
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (13 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 14/41] target/sparc: Implement ADDXC, ADDXCcc Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-03-02 5:15 ` [PATCH 16/41] target/sparc: Implement FCHKSM16 Richard Henderson
` (27 subsequent siblings)
42 siblings, 0 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 3 +++
target/sparc/translate.c | 13 +++++++++++++
target/sparc/vis_helper.c | 38 ++++++++++++++++++++++++++++++++++++++
target/sparc/insns.decode | 4 ++++
4 files changed, 58 insertions(+)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 63ae398841..9cd9a81f03 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -104,6 +104,9 @@ DEF_HELPER_FLAGS_2(fpack16, TCG_CALL_NO_RWG_SE, i32, i64, i64)
DEF_HELPER_FLAGS_3(fpack32, TCG_CALL_NO_RWG_SE, i64, i64, i64, i64)
DEF_HELPER_FLAGS_2(fpackfix, TCG_CALL_NO_RWG_SE, i32, i64, i64)
DEF_HELPER_FLAGS_3(bshuffle, TCG_CALL_NO_RWG_SE, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_2(cmask8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(cmask16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(cmask32, TCG_CALL_NO_RWG_SE, i64, i64, i64)
#define VIS_CMPHELPER(name) \
DEF_HELPER_FLAGS_2(f##name##16, TCG_CALL_NO_RWG_SE, \
i64, i64, i64) \
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 0b6d92d0a8..fd85fd3e97 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -62,6 +62,9 @@
# define gen_helper_write_softint(E, S) qemu_build_not_reached()
# define gen_helper_wrpil(E, S) qemu_build_not_reached()
# define gen_helper_wrpstate(E, S) qemu_build_not_reached()
+# define gen_helper_cmask8 ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_cmask16 ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_cmask32 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpeq16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpeq32 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpgt16 ({ qemu_build_not_reached(); NULL; })
@@ -3729,6 +3732,16 @@ static void gen_op_bmask(TCGv dst, TCGv s1, TCGv s2)
TRANS(BMASK, VIS2, do_rrr, a, gen_op_bmask)
+static bool do_cmask(DisasContext *dc, int rs2, void (*func)(TCGv, TCGv, TCGv))
+{
+ func(cpu_gsr, cpu_gsr, gen_load_gpr(dc, rs2));
+ return true;
+}
+
+TRANS(CMASK8, VIS3, do_cmask, a->rs2, gen_helper_cmask8)
+TRANS(CMASK16, VIS3, do_cmask, a->rs2, gen_helper_cmask16)
+TRANS(CMASK32, VIS3, do_cmask, a->rs2, gen_helper_cmask32)
+
static bool do_shift_r(DisasContext *dc, arg_shiftr *a, bool l, bool u)
{
TCGv dst, src1, src2;
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index e15c6bb34e..0278caa25d 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -334,3 +334,41 @@ uint64_t helper_bshuffle(uint64_t gsr, uint64_t src1, uint64_t src2)
return r.ll;
}
+
+uint64_t helper_cmask8(uint64_t gsr, uint64_t src)
+{
+ uint32_t mask = 0;
+
+ mask |= (src & 0x01 ? 0x00000007 : 0x0000000f);
+ mask |= (src & 0x02 ? 0x00000060 : 0x000000e0);
+ mask |= (src & 0x04 ? 0x00000500 : 0x00000d00);
+ mask |= (src & 0x08 ? 0x00004000 : 0x0000c000);
+ mask |= (src & 0x10 ? 0x00030000 : 0x000b0000);
+ mask |= (src & 0x20 ? 0x00200000 : 0x00a00000);
+ mask |= (src & 0x40 ? 0x01000000 : 0x09000000);
+ mask |= (src & 0x80 ? 0x00000000 : 0x80000000);
+
+ return deposit64(gsr, 32, 32, mask);
+}
+
+uint64_t helper_cmask16(uint64_t gsr, uint64_t src)
+{
+ uint32_t mask = 0;
+
+ mask |= (src & 0x1 ? 0x00000067 : 0x000000ef);
+ mask |= (src & 0x2 ? 0x00004500 : 0x0000cd00);
+ mask |= (src & 0x4 ? 0x00230000 : 0x00ab0000);
+ mask |= (src & 0x8 ? 0x01000000 : 0x89000000);
+
+ return deposit64(gsr, 32, 32, mask);
+}
+
+uint64_t helper_cmask32(uint64_t gsr, uint64_t src)
+{
+ uint32_t mask = 0;
+
+ mask |= (src & 0x1 ? 0x00004567 : 0x0000cdef);
+ mask |= (src & 0x2 ? 0x01230000 : 0x89ab0000);
+
+ return deposit64(gsr, 32, 32, mask);
+}
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 07796b8fe2..8f298ca675 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -384,6 +384,10 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
BMASK 10 ..... 110110 ..... 0 0001 1001 ..... @r_r_r
+ CMASK8 10 00000 110110 00000 0 0001 1011 rs2:5
+ CMASK16 10 00000 110110 00000 0 0001 1101 rs2:5
+ CMASK32 10 00000 110110 00000 0 0001 1111 rs2:5
+
FPCMPLE16 10 ..... 110110 ..... 0 0010 0000 ..... @r_d_d
FPCMPNE16 10 ..... 110110 ..... 0 0010 0010 ..... @r_d_d
FPCMPGT16 10 ..... 110110 ..... 0 0010 1000 ..... @r_d_d
--
2.34.1
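
The constant tables in helper_cmask8 follow a regular pattern: nibble k of the mask (counting from the least significant) is 7 - k when bit k of the source is set and 15 - k otherwise, and the result replaces the upper 32 bits of GSR. The loop below is a sketch of that reading of the table, written as a standalone function with the hypothetical name model_cmask8; it is not the helper from the patch.

```c
#include <stdint.h>

/* Loop form of the cmask8 table: one 4-bit selector per source bit. */
static uint64_t model_cmask8(uint64_t gsr, uint64_t src)
{
    uint32_t mask = 0;

    for (unsigned k = 0; k < 8; k++) {
        uint32_t nibble = ((src >> k) & 1) ? 7 - k : 15 - k;
        mask |= nibble << (4 * k);
    }
    /* Keep the low 32 bits of gsr, place the mask in the high 32 bits. */
    return (gsr & 0xffffffffull) | ((uint64_t)mask << 32);
}
```

All source bits set yields the mask 0x01234567 and all clear yields 0x89abcdef, matching the sums of the constants in the table form.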
* [PATCH 16/41] target/sparc: Implement FCHKSM16
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (14 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 15/41] target/sparc: Implement CMASK instructions Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-03-02 5:15 ` [PATCH 17/41] target/sparc: Implement FHADD, FHSUB, FNHADD, FNADD Richard Henderson
` (26 subsequent siblings)
42 siblings, 0 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 1 +
target/sparc/translate.c | 32 ++++++++++++++++++++++++++++++++
target/sparc/vis_helper.c | 23 +++++++++++++++++++++++
target/sparc/insns.decode | 1 +
4 files changed, 57 insertions(+)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 9cd9a81f03..37b22afd7f 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -107,6 +107,7 @@ DEF_HELPER_FLAGS_3(bshuffle, TCG_CALL_NO_RWG_SE, i64, i64, i64, i64)
DEF_HELPER_FLAGS_2(cmask8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(cmask16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(cmask32, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fchksm16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
#define VIS_CMPHELPER(name) \
DEF_HELPER_FLAGS_2(f##name##16, TCG_CALL_NO_RWG_SE, \
i64, i64, i64) \
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index fd85fd3e97..d6adbf9236 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -797,6 +797,37 @@ static void gen_op_fmuld8sux16(TCGv_i64 dst, TCGv_i32 src1, TCGv_i32 src2)
tcg_gen_concat_i32_i64(dst, t0, t1);
}
+#ifdef TARGET_SPARC64
+static void gen_vec_fchksm16(unsigned vece, TCGv_vec dst,
+ TCGv_vec src1, TCGv_vec src2)
+{
+ TCGv_vec a = tcg_temp_new_vec_matching(dst);
+ TCGv_vec c = tcg_temp_new_vec_matching(dst);
+
+ tcg_gen_add_vec(vece, a, src1, src2);
+ tcg_gen_cmp_vec(TCG_COND_LTU, vece, c, a, src1);
+ /* Vector cmp produces -1 for true, so subtract to add carry. */
+ tcg_gen_sub_vec(vece, dst, a, c);
+}
+
+static void gen_op_fchksm16(unsigned vece, uint32_t dofs, uint32_t aofs,
+ uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_cmp_vec, INDEX_op_add_vec, INDEX_op_sub_vec,
+ };
+ static const GVecGen3 op = {
+ .fni8 = gen_helper_fchksm16,
+ .fniv = gen_vec_fchksm16,
+ .opt_opc = vecop_list,
+ .vece = MO_16,
+ };
+ tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &op);
+}
+#else
+#define gen_op_fchksm16 ({ qemu_build_not_reached(); NULL; })
+#endif
+
static void finishing_insn(DisasContext *dc)
{
/*
@@ -4738,6 +4769,7 @@ TRANS(FPADD16, VIS1, do_gvec_ddd, a, MO_16, tcg_gen_gvec_add)
TRANS(FPADD32, VIS1, do_gvec_ddd, a, MO_32, tcg_gen_gvec_add)
TRANS(FPSUB16, VIS1, do_gvec_ddd, a, MO_16, tcg_gen_gvec_sub)
TRANS(FPSUB32, VIS1, do_gvec_ddd, a, MO_32, tcg_gen_gvec_sub)
+TRANS(FCHKSM16, VIS3, do_gvec_ddd, a, MO_16, gen_op_fchksm16)
static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_i64, TCGv_i64))
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 0278caa25d..c627bb1a1f 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -372,3 +372,26 @@ uint64_t helper_cmask32(uint64_t gsr, uint64_t src)
return deposit64(gsr, 32, 32, mask);
}
+
+static inline uint16_t do_fchksm16(uint16_t src1, uint16_t src2)
+{
+ uint16_t a = src1 + src2;
+ uint16_t c = a < src1;
+ return a + c;
+}
+
+uint64_t helper_fchksm16(uint64_t src1, uint64_t src2)
+{
+ VIS64 r, s1, s2;
+
+ s1.ll = src1;
+ s2.ll = src2;
+ r.ll = 0;
+
+ r.VIS_W64(0) = do_fchksm16(s1.VIS_W64(0), s2.VIS_W64(0));
+ r.VIS_W64(1) = do_fchksm16(s1.VIS_W64(1), s2.VIS_W64(1));
+ r.VIS_W64(2) = do_fchksm16(s1.VIS_W64(2), s2.VIS_W64(2));
+ r.VIS_W64(3) = do_fchksm16(s1.VIS_W64(3), s2.VIS_W64(3));
+
+ return r.ll;
+}
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 8f298ca675..120713a28f 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -410,6 +410,7 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
PDIST 10 ..... 110110 ..... 0 0011 1110 ..... \
&r_r_r_r rd=%dfp_rd rs1=%dfp_rd rs2=%dfp_rs1 rs3=%dfp_rs2
+ FCHKSM16 10 ..... 110110 ..... 0 0100 0100 ..... @d_d_d
FALIGNDATAg 10 ..... 110110 ..... 0 0100 1000 ..... @d_d_d
FPMERGE 10 ..... 110110 ..... 0 0100 1011 ..... @d_r_r
BSHUFFLE 10 ..... 110110 ..... 0 0100 1100 ..... @d_d_d
--
2.34.1
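
The comment in gen_vec_fchksm16 ("subtract to add carry") can be checked with a scalar model. A vector compare yields all-ones for true, so subtracting that value is the same as adding 1 modulo 2^16. The sketch below imitates this per lane; the model_* names are hypothetical and the code is an illustration, not the helper from the patch.

```c
#include <stdint.h>

/* One lane: 16-bit add with the carry out wrapped back in. */
static uint16_t model_chksm16_lane(uint16_t x, uint16_t y)
{
    uint16_t sum = (uint16_t)(x + y);
    /* Imitate a vector compare: all-ones when the add wrapped. */
    uint16_t carry_mask = (sum < x) ? 0xffff : 0;

    /* Subtracting 0xffff equals adding 1 modulo 2^16. */
    return (uint16_t)(sum - carry_mask);
}

/* Apply the lane operation to the four 16-bit lanes of a 64-bit value. */
static uint64_t model_fchksm16(uint64_t a, uint64_t b)
{
    uint64_t r = 0;

    for (unsigned shift = 0; shift < 64; shift += 16) {
        uint16_t lane = model_chksm16_lane((uint16_t)(a >> shift),
                                           (uint16_t)(b >> shift));
        r |= (uint64_t)lane << shift;
    }
    return r;
}
```

For example, the lane 0xffff + 0x0001 wraps to 0 and the carry brings it to 1, as expected for a ones'-complement style checksum.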
* [PATCH 17/41] target/sparc: Implement FHADD, FHSUB, FNHADD, FNADD
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (15 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 16/41] target/sparc: Implement FCHKSM16 Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-03-02 5:15 ` [PATCH 18/41] target/sparc: Implement FNMUL Richard Henderson
` (25 subsequent siblings)
42 siblings, 0 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 70 +++++++++++++++++++++++++++++++++++++++
target/sparc/insns.decode | 8 +++++
2 files changed, 78 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index d6adbf9236..877847b884 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -1243,6 +1243,66 @@ static void gen_op_fnmaddd(TCGv_i64 d, TCGv_i64 s1, TCGv_i64 s2, TCGv_i64 s3)
gen_helper_fmaddd(d, tcg_env, s1, s2, s3, tcg_constant_i32(op));
}
+/* Use muladd to compute ((1 * src1) + src2) / 2 with one rounding. */
+static void gen_op_fhadds(TCGv_i32 d, TCGv_i32 s1, TCGv_i32 s2)
+{
+ TCGv_i32 one = tcg_constant_i32(float32_one);
+ int op = float_muladd_halve_result;
+ gen_helper_fmadds(d, tcg_env, one, s1, s2, tcg_constant_i32(op));
+}
+
+static void gen_op_fhaddd(TCGv_i64 d, TCGv_i64 s1, TCGv_i64 s2)
+{
+ TCGv_i64 one = tcg_constant_i64(float64_one);
+ int op = float_muladd_halve_result;
+ gen_helper_fmaddd(d, tcg_env, one, s1, s2, tcg_constant_i32(op));
+}
+
+/* Use muladd to compute ((1 * src1) - src2) / 2 with one rounding. */
+static void gen_op_fhsubs(TCGv_i32 d, TCGv_i32 s1, TCGv_i32 s2)
+{
+ TCGv_i32 one = tcg_constant_i32(float32_one);
+ int op = float_muladd_negate_c | float_muladd_halve_result;
+ gen_helper_fmadds(d, tcg_env, one, s1, s2, tcg_constant_i32(op));
+}
+
+static void gen_op_fhsubd(TCGv_i64 d, TCGv_i64 s1, TCGv_i64 s2)
+{
+ TCGv_i64 one = tcg_constant_i64(float64_one);
+ int op = float_muladd_negate_c | float_muladd_halve_result;
+ gen_helper_fmaddd(d, tcg_env, one, s1, s2, tcg_constant_i32(op));
+}
+
+/* Use muladd to compute -(((1 * src1) + src2) / 2) with one rounding. */
+static void gen_op_fnhadds(TCGv_i32 d, TCGv_i32 s1, TCGv_i32 s2)
+{
+ TCGv_i32 one = tcg_constant_i32(float32_one);
+ int op = float_muladd_negate_result | float_muladd_halve_result;
+ gen_helper_fmadds(d, tcg_env, one, s1, s2, tcg_constant_i32(op));
+}
+
+static void gen_op_fnhaddd(TCGv_i64 d, TCGv_i64 s1, TCGv_i64 s2)
+{
+ TCGv_i64 one = tcg_constant_i64(float64_one);
+ int op = float_muladd_negate_result | float_muladd_halve_result;
+ gen_helper_fmaddd(d, tcg_env, one, s1, s2, tcg_constant_i32(op));
+}
+
+/* Use muladd to compute -((1 * src1) + src2). */
+static void gen_op_fnadds(TCGv_i32 d, TCGv_i32 s1, TCGv_i32 s2)
+{
+ TCGv_i32 one = tcg_constant_i32(float32_one);
+ int op = float_muladd_negate_result;
+ gen_helper_fmadds(d, tcg_env, one, s1, s2, tcg_constant_i32(op));
+}
+
+static void gen_op_fnaddd(TCGv_i64 d, TCGv_i64 s1, TCGv_i64 s2)
+{
+ TCGv_i64 one = tcg_constant_i64(float64_one);
+ int op = float_muladd_negate_result;
+ gen_helper_fmaddd(d, tcg_env, one, s1, s2, tcg_constant_i32(op));
+}
+
static void gen_op_fpexception_im(DisasContext *dc, int ftt)
{
/*
@@ -4691,6 +4751,11 @@ TRANS(FXNORs, VIS1, do_fff, a, tcg_gen_eqv_i32)
TRANS(FORNOTs, VIS1, do_fff, a, tcg_gen_orc_i32)
TRANS(FORs, VIS1, do_fff, a, tcg_gen_or_i32)
+TRANS(FHADDs, VIS3, do_fff, a, gen_op_fhadds)
+TRANS(FHSUBs, VIS3, do_fff, a, gen_op_fhsubs)
+TRANS(FNHADDs, VIS3, do_fff, a, gen_op_fnhadds)
+TRANS(FNADDs, VIS3, do_fff, a, gen_op_fnadds)
+
static bool do_env_fff(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i32, TCGv_env, TCGv_i32, TCGv_i32))
{
@@ -4804,6 +4869,11 @@ TRANS(FPACK32, VIS1, do_ddd, a, gen_op_fpack32)
TRANS(FALIGNDATAg, VIS1, do_ddd, a, gen_op_faligndata)
TRANS(BSHUFFLE, VIS2, do_ddd, a, gen_op_bshuffle)
+TRANS(FHADDd, VIS3, do_ddd, a, gen_op_fhaddd)
+TRANS(FHSUBd, VIS3, do_ddd, a, gen_op_fhsubd)
+TRANS(FNHADDd, VIS3, do_ddd, a, gen_op_fnhaddd)
+TRANS(FNADDd, VIS3, do_ddd, a, gen_op_fnaddd)
+
static bool do_rdd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv, TCGv_i64, TCGv_i64))
{
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 120713a28f..dc524f5b8f 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -307,8 +307,16 @@ FMULq 10 ..... 110100 ..... 0 0100 1011 ..... @q_q_q
FDIVs 10 ..... 110100 ..... 0 0100 1101 ..... @r_r_r
FDIVd 10 ..... 110100 ..... 0 0100 1110 ..... @d_d_d
FDIVq 10 ..... 110100 ..... 0 0100 1111 ..... @q_q_q
+FNADDs 10 ..... 110100 ..... 0 0101 0001 ..... @r_r_r
+FNADDd 10 ..... 110100 ..... 0 0101 0010 ..... @d_d_d
+FHADDs 10 ..... 110100 ..... 0 0110 0001 ..... @r_r_r
+FHADDd 10 ..... 110100 ..... 0 0110 0010 ..... @d_d_d
+FHSUBs 10 ..... 110100 ..... 0 0110 0101 ..... @r_r_r
+FHSUBd 10 ..... 110100 ..... 0 0110 0110 ..... @d_d_d
FsMULd 10 ..... 110100 ..... 0 0110 1001 ..... @d_r_r
FdMULq 10 ..... 110100 ..... 0 0110 1110 ..... @q_d_d
+FNHADDs 10 ..... 110100 ..... 0 0111 0001 ..... @r_r_r
+FNHADDd 10 ..... 110100 ..... 0 0111 0010 ..... @d_d_d
FsTOx 10 ..... 110100 00000 0 1000 0001 ..... @r_r2
FdTOx 10 ..... 110100 00000 0 1000 0010 ..... @r_d2
FqTOx 10 ..... 110100 00000 0 1000 0011 ..... @r_q2
--
2.34.1
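
The flag combinations used by the gen_op_f{h,nh,n}{add,sub}* functions above can be modelled in plain C. This is only a sketch of the arithmetic: the MODEL_* names and model_* functions are hypothetical, they merely mirror the roles of the float_muladd_* flags in the patch, and the model rounds before halving, so it differs from a single-rounding muladd when the sum overflows or the halved result underflows.

```c
/* Hypothetical flag names for this sketch only. */
enum {
    MODEL_NEGATE_C      = 1 << 0,  /* negate the addend before the add */
    MODEL_NEGATE_RESULT = 1 << 1,  /* negate the final result */
    MODEL_HALVE_RESULT  = 1 << 2   /* divide the sum by two */
};

/* Compute (a * b) + c with the selected adjustments applied. */
double model_muladd(double a, double b, double c, int flags)
{
    double r;

    if (flags & MODEL_NEGATE_C) {
        c = -c;
    }
    r = a * b + c;              /* with a == 1.0 the product is exact */
    if (flags & MODEL_HALVE_RESULT) {
        r *= 0.5;               /* second rounding: model only */
    }
    if (flags & MODEL_NEGATE_RESULT) {
        r = -r;
    }
    return r;
}

/* (s1 + s2) / 2 */
double model_fhadd(double s1, double s2)
{
    return model_muladd(1.0, s1, s2, MODEL_HALVE_RESULT);
}

/* (s1 - s2) / 2 */
double model_fhsub(double s1, double s2)
{
    return model_muladd(1.0, s1, s2, MODEL_NEGATE_C | MODEL_HALVE_RESULT);
}

/* -((s1 + s2) / 2) */
double model_fnhadd(double s1, double s2)
{
    return model_muladd(1.0, s1, s2,
                        MODEL_NEGATE_RESULT | MODEL_HALVE_RESULT);
}

/* -(s1 + s2) */
double model_fnadd(double s1, double s2)
{
    return model_muladd(1.0, s1, s2, MODEL_NEGATE_RESULT);
}
```

With s1 = 3 and s2 = 1 the four models give 2, 1, -2 and -4 respectively.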
* [PATCH 18/41] target/sparc: Implement FNMUL
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Unlike FNADD, we cannot (ab)use muladd for this operation. Because
    -0.0 * +0.0 == -0.0
    -0.0 + +0.0 == +0.0
the addition step would lose the -0.0 product result before negation.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 3 +++
target/sparc/fop_helper.c | 36 ++++++++++++++++++++++++++++++++++++
target/sparc/translate.c | 21 +++++++++++++++++++++
target/sparc/insns.decode | 3 +++
4 files changed, 63 insertions(+)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 37b22afd7f..926b579e97 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -54,6 +54,7 @@ DEF_HELPER_FLAGS_3(fsubd, TCG_CALL_NO_WG, f64, env, f64, f64)
DEF_HELPER_FLAGS_3(fmuld, TCG_CALL_NO_WG, f64, env, f64, f64)
DEF_HELPER_FLAGS_3(fdivd, TCG_CALL_NO_WG, f64, env, f64, f64)
DEF_HELPER_FLAGS_5(fmaddd, TCG_CALL_NO_WG, f64, env, f64, f64, f64, i32)
+DEF_HELPER_FLAGS_3(fnmuld, TCG_CALL_NO_WG, f64, env, f64, f64)
DEF_HELPER_FLAGS_3(faddq, TCG_CALL_NO_WG, i128, env, i128, i128)
DEF_HELPER_FLAGS_3(fsubq, TCG_CALL_NO_WG, i128, env, i128, i128)
@@ -65,8 +66,10 @@ DEF_HELPER_FLAGS_3(fsubs, TCG_CALL_NO_WG, f32, env, f32, f32)
DEF_HELPER_FLAGS_3(fmuls, TCG_CALL_NO_WG, f32, env, f32, f32)
DEF_HELPER_FLAGS_3(fdivs, TCG_CALL_NO_WG, f32, env, f32, f32)
DEF_HELPER_FLAGS_5(fmadds, TCG_CALL_NO_WG, f32, env, f32, f32, f32, i32)
+DEF_HELPER_FLAGS_3(fnmuls, TCG_CALL_NO_WG, f32, env, f32, f32)
DEF_HELPER_FLAGS_3(fsmuld, TCG_CALL_NO_WG, f64, env, f32, f32)
+DEF_HELPER_FLAGS_3(fnsmuld, TCG_CALL_NO_WG, f64, env, f32, f32)
DEF_HELPER_FLAGS_3(fdmulq, TCG_CALL_NO_WG, i128, env, f64, f64)
DEF_HELPER_FLAGS_2(fitod, TCG_CALL_NO_WG, f64, env, s32)
diff --git a/target/sparc/fop_helper.c b/target/sparc/fop_helper.c
index 1de44d79c1..ea9d4ec235 100644
--- a/target/sparc/fop_helper.c
+++ b/target/sparc/fop_helper.c
@@ -359,6 +359,42 @@ float64 helper_fmaddd(CPUSPARCState *env, float64 s1,
return ret;
}
+float32 helper_fnmuls(CPUSPARCState *env, float32 src1, float32 src2)
+{
+ float32 ret = float32_mul(src1, src2, &env->fp_status);
+
+ /* NaN inputs or result do not get a sign change. */
+ if (!(get_float_exception_flags(&env->fp_status) & float_flag_invalid)) {
+ ret = float32_chs(ret);
+ }
+ check_ieee_exceptions(env, GETPC());
+ return ret;
+}
+
+float64 helper_fnmuld(CPUSPARCState *env, float64 src1, float64 src2)
+{
+ float64 ret = float64_mul(src1, src2, &env->fp_status);
+
+ if (!(get_float_exception_flags(&env->fp_status) & float_flag_invalid)) {
+ ret = float64_chs(ret);
+ }
+ check_ieee_exceptions(env, GETPC());
+ return ret;
+}
+
+float64 helper_fnsmuld(CPUSPARCState *env, float32 src1, float32 src2)
+{
+ float64 ret = float64_mul(float32_to_float64(src1, &env->fp_status),
+ float32_to_float64(src2, &env->fp_status),
+ &env->fp_status);
+
+ if (!(get_float_exception_flags(&env->fp_status) & float_flag_invalid)) {
+ ret = float64_chs(ret);
+ }
+ check_ieee_exceptions(env, GETPC());
+ return ret;
+}
+
static uint32_t finish_fcmp(CPUSPARCState *env, FloatRelation r, uintptr_t ra)
{
check_ieee_exceptions(env, ra);
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 877847b884..b3714ada6a 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4776,6 +4776,7 @@ TRANS(FADDs, ALL, do_env_fff, a, gen_helper_fadds)
TRANS(FSUBs, ALL, do_env_fff, a, gen_helper_fsubs)
TRANS(FMULs, ALL, do_env_fff, a, gen_helper_fmuls)
TRANS(FDIVs, ALL, do_env_fff, a, gen_helper_fdivs)
+TRANS(FNMULs, VIS3, do_env_fff, a, gen_helper_fnmuls)
static bool do_dff(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_i32, TCGv_i32))
@@ -4923,6 +4924,7 @@ TRANS(FADDd, ALL, do_env_ddd, a, gen_helper_faddd)
TRANS(FSUBd, ALL, do_env_ddd, a, gen_helper_fsubd)
TRANS(FMULd, ALL, do_env_ddd, a, gen_helper_fmuld)
TRANS(FDIVd, ALL, do_env_ddd, a, gen_helper_fdivd)
+TRANS(FNMULd, VIS3, do_env_ddd, a, gen_helper_fnmuld)
static bool trans_FsMULd(DisasContext *dc, arg_r_r_r *a)
{
@@ -4944,6 +4946,25 @@ static bool trans_FsMULd(DisasContext *dc, arg_r_r_r *a)
return advance_pc(dc);
}
+static bool trans_FNsMULd(DisasContext *dc, arg_r_r_r *a)
+{
+ TCGv_i64 dst;
+ TCGv_i32 src1, src2;
+
+ if (!avail_VIS3(dc)) {
+ return false;
+ }
+ if (gen_trap_ifnofpu(dc)) {
+ return true;
+ }
+ dst = tcg_temp_new_i64();
+ src1 = gen_load_fpr_F(dc, a->rs1);
+ src2 = gen_load_fpr_F(dc, a->rs2);
+ gen_helper_fnsmuld(dst, tcg_env, src1, src2);
+ gen_store_fpr_D(dc, a->rd, dst);
+ return advance_pc(dc);
+}
+
static bool do_ffff(DisasContext *dc, arg_r_r_r_r *a,
void (*func)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32))
{
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index dc524f5b8f..8c0df3004d 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -309,6 +309,8 @@ FDIVd 10 ..... 110100 ..... 0 0100 1110 ..... @d_d_d
FDIVq 10 ..... 110100 ..... 0 0100 1111 ..... @q_q_q
FNADDs 10 ..... 110100 ..... 0 0101 0001 ..... @r_r_r
FNADDd 10 ..... 110100 ..... 0 0101 0010 ..... @d_d_d
+FNMULs 10 ..... 110100 ..... 0 0101 1001 ..... @r_r_r
+FNMULd 10 ..... 110100 ..... 0 0101 1010 ..... @d_d_d
FHADDs 10 ..... 110100 ..... 0 0110 0001 ..... @r_r_r
FHADDd 10 ..... 110100 ..... 0 0110 0010 ..... @d_d_d
FHSUBs 10 ..... 110100 ..... 0 0110 0101 ..... @r_r_r
@@ -317,6 +319,7 @@ FsMULd 10 ..... 110100 ..... 0 0110 1001 ..... @d_r_r
FdMULq 10 ..... 110100 ..... 0 0110 1110 ..... @q_d_d
FNHADDs 10 ..... 110100 ..... 0 0111 0001 ..... @r_r_r
FNHADDd 10 ..... 110100 ..... 0 0111 0010 ..... @d_d_d
+FNsMULd 10 ..... 110100 ..... 0 0111 1001 ..... @d_r_r
FsTOx 10 ..... 110100 00000 0 1000 0001 ..... @r_r2
FdTOx 10 ..... 110100 00000 0 1000 0010 ..... @r_d2
FqTOx 10 ..... 110100 00000 0 1000 0011 ..... @r_q2
--
2.34.1
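
The signed-zero argument in the commit message can be reproduced with ordinary C doubles. The sketch below assumes the default round-to-nearest mode; fnmul_direct, fnmul_via_add and is_negative are hypothetical names for illustration, not QEMU functions.

```c
#include <math.h>   /* signbit */

/* Negate the product directly: the sign of a zero product survives. */
double fnmul_direct(double a, double b)
{
    return -(a * b);
}

/*
 * Route the product through an addition of +0.0, as a muladd with a
 * zero addend would.  A -0.0 product turns into +0.0 here, so the
 * negated result has the wrong sign.
 */
double fnmul_via_add(double a, double b)
{
    return -((a * b) + 0.0);
}

/* Return 1 when x has its sign bit set (this includes -0.0), else 0. */
int is_negative(double x)
{
    return signbit(x) ? 1 : 0;
}
```

For the inputs -0.0 and +0.0, fnmul_direct yields +0.0 while fnmul_via_add yields -0.0; for nonzero products the two agree.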
* [PATCH 19/41] target/sparc: Implement FLCMP
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 2 ++
target/sparc/fop_helper.c | 46 +++++++++++++++++++++++++++++++++++++++
target/sparc/translate.c | 34 +++++++++++++++++++++++++++++
target/sparc/insns.decode | 4 ++++
4 files changed, 86 insertions(+)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 926b579e97..97b3c24fb3 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -47,6 +47,8 @@ DEF_HELPER_FLAGS_3(fcmpd, TCG_CALL_NO_WG, i32, env, f64, f64)
DEF_HELPER_FLAGS_3(fcmped, TCG_CALL_NO_WG, i32, env, f64, f64)
DEF_HELPER_FLAGS_3(fcmpq, TCG_CALL_NO_WG, i32, env, i128, i128)
DEF_HELPER_FLAGS_3(fcmpeq, TCG_CALL_NO_WG, i32, env, i128, i128)
+DEF_HELPER_FLAGS_2(flcmps, TCG_CALL_NO_RWG_SE, i32, f32, f32)
+DEF_HELPER_FLAGS_2(flcmpd, TCG_CALL_NO_RWG_SE, i32, f64, f64)
DEF_HELPER_2(raise_exception, noreturn, env, int)
DEF_HELPER_FLAGS_3(faddd, TCG_CALL_NO_WG, f64, env, f64, f64)
diff --git a/target/sparc/fop_helper.c b/target/sparc/fop_helper.c
index ea9d4ec235..8c07442ad6 100644
--- a/target/sparc/fop_helper.c
+++ b/target/sparc/fop_helper.c
@@ -458,6 +458,52 @@ uint32_t helper_fcmpeq(CPUSPARCState *env, Int128 src1, Int128 src2)
return finish_fcmp(env, r, GETPC());
}
+uint32_t helper_flcmps(float32 src1, float32 src2)
+{
+ /*
+ * FLCMP never raises an exception and modifies no FSR field besides fcc.
+ * Perform the comparison with a dummy fp environment.
+ */
+ float_status discard = { };
+ FloatRelation r = float32_compare_quiet(src1, src2, &discard);
+
+ switch (r) {
+ case float_relation_equal:
+ if (src2 == float32_zero && src1 != float32_zero) {
+ return 1; /* -0.0 < +0.0 */
+ }
+ return 0;
+ case float_relation_less:
+ return 1;
+ case float_relation_greater:
+ return 0;
+ case float_relation_unordered:
+ return float32_is_any_nan(src2) ? 3 : 2;
+ }
+ g_assert_not_reached();
+}
+
+uint32_t helper_flcmpd(float64 src1, float64 src2)
+{
+ float_status discard = { };
+ FloatRelation r = float64_compare_quiet(src1, src2, &discard);
+
+ switch (r) {
+ case float_relation_equal:
+ if (src2 == float64_zero && src1 != float64_zero) {
+ return 1; /* -0.0 < +0.0 */
+ }
+ return 0;
+ case float_relation_less:
+ return 1;
+ case float_relation_greater:
+ return 0;
+ case float_relation_unordered:
+ return float64_is_any_nan(src2) ? 3 : 2;
+ }
+ g_assert_not_reached();
+}
+
target_ulong cpu_get_fsr(CPUSPARCState *env)
{
target_ulong fsr = env->fsr | env->fsr_cexc_ftt;
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index b3714ada6a..6dba0fcca6 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -5199,6 +5199,40 @@ static bool do_fcmpq(DisasContext *dc, arg_FCMPq *a, bool e)
TRANS(FCMPq, ALL, do_fcmpq, a, false)
TRANS(FCMPEq, ALL, do_fcmpq, a, true)
+static bool trans_FLCMPs(DisasContext *dc, arg_FLCMPs *a)
+{
+ TCGv_i32 src1, src2;
+
+ if (!avail_VIS3(dc)) {
+ return false;
+ }
+ if (gen_trap_ifnofpu(dc)) {
+ return true;
+ }
+
+ src1 = gen_load_fpr_F(dc, a->rs1);
+ src2 = gen_load_fpr_F(dc, a->rs2);
+ gen_helper_flcmps(cpu_fcc[a->cc], src1, src2);
+ return advance_pc(dc);
+}
+
+static bool trans_FLCMPd(DisasContext *dc, arg_FLCMPd *a)
+{
+ TCGv_i64 src1, src2;
+
+ if (!avail_VIS3(dc)) {
+ return false;
+ }
+ if (gen_trap_ifnofpu(dc)) {
+ return true;
+ }
+
+ src1 = gen_load_fpr_D(dc, a->rs1);
+ src2 = gen_load_fpr_D(dc, a->rs2);
+ gen_helper_flcmpd(cpu_fcc[a->cc], src1, src2);
+ return advance_pc(dc);
+}
+
static void sparc_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
{
DisasContext *dc = container_of(dcbase, DisasContext, base);
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 8c0df3004d..51a7fb62fb 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -470,6 +470,10 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
FZEROs 10 rd:5 110110 00000 0 0110 0001 00000
FONEd 10 ..... 110110 00000 0 0111 1110 00000 rd=%dfp_rd
FONEs 10 rd:5 110110 00000 0 0111 1111 00000
+
+ FLCMPs 10 000 cc:2 110110 rs1:5 1 0101 0001 rs2:5
+ FLCMPd 10 000 cc:2 110110 ..... 1 0101 0010 ..... \
+ rs1=%dfp_rs1 rs2=%dfp_rs2
]
NCP 10 ----- 110110 ----- --------- ----- # v8 CPop1
}
--
2.34.1
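
The result encoding chosen by helper_flcmps and helper_flcmpd above can be restated with native C floats. This is a sketch that follows the switch statement in the patch; flcmp_model is a hypothetical name, and the mapping is taken from the helper as posted, not from the architecture manual.

```c
#include <math.h>   /* isnan, signbit */

/*
 * Return the 2-bit condition code:
 *   0  src1 >= src2
 *   1  src1 <  src2, where -0.0 is treated as less than +0.0
 *   2  src1 is a NaN and src2 is not
 *   3  src2 is a NaN
 */
unsigned flcmp_model(float src1, float src2)
{
    if (isnan(src2)) {
        return 3;
    }
    if (isnan(src1)) {
        return 2;
    }
    if (src1 < src2) {
        return 1;
    }
    /* Equal operands with different signs can only be -0.0 and +0.0. */
    if (src1 == src2 && signbit(src1) && !signbit(src2)) {
        return 1;
    }
    return 0;
}
```

Note that the zero check is one-sided: flcmp_model(-0.0f, 0.0f) is 1, whereas flcmp_model(0.0f, -0.0f) is 0.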
* [PATCH 20/41] target/sparc: Implement FMEAN16
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 1 +
target/sparc/translate.c | 30 ++++++++++++++++++++++++++++++
target/sparc/vis_helper.c | 21 +++++++++++++++++++++
target/sparc/insns.decode | 1 +
4 files changed, 53 insertions(+)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 97b3c24fb3..8a5191414e 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -113,6 +113,7 @@ DEF_HELPER_FLAGS_2(cmask8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(cmask16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(cmask32, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fchksm16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fmean16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
#define VIS_CMPHELPER(name) \
DEF_HELPER_FLAGS_2(f##name##16, TCG_CALL_NO_RWG_SE, \
i64, i64, i64) \
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 6dba0fcca6..4876d46ebb 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -824,8 +824,37 @@ static void gen_op_fchksm16(unsigned vece, uint32_t dofs, uint32_t aofs,
};
tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &op);
}
+
+static void gen_vec_fmean16(unsigned vece, TCGv_vec dst,
+ TCGv_vec src1, TCGv_vec src2)
+{
+ TCGv_vec t = tcg_temp_new_vec_matching(dst);
+
+ tcg_gen_or_vec(vece, t, src1, src2);
+ tcg_gen_and_vec(vece, t, t, tcg_constant_vec_matching(dst, vece, 1));
+ tcg_gen_sari_vec(vece, src1, src1, 1);
+ tcg_gen_sari_vec(vece, src2, src2, 1);
+ tcg_gen_add_vec(vece, dst, src1, src2);
+ tcg_gen_add_vec(vece, dst, dst, t);
+}
+
+static void gen_op_fmean16(unsigned vece, uint32_t dofs, uint32_t aofs,
+ uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_add_vec, INDEX_op_sari_vec,
+ };
+ static const GVecGen3 op = {
+ .fni8 = gen_helper_fmean16,
+ .fniv = gen_vec_fmean16,
+ .opt_opc = vecop_list,
+ .vece = MO_16,
+ };
+ tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &op);
+}
#else
#define gen_op_fchksm16 ({ qemu_build_not_reached(); NULL; })
+#define gen_op_fmean16 ({ qemu_build_not_reached(); NULL; })
#endif
static void finishing_insn(DisasContext *dc)
@@ -4836,6 +4865,7 @@ TRANS(FPADD32, VIS1, do_gvec_ddd, a, MO_32, tcg_gen_gvec_add)
TRANS(FPSUB16, VIS1, do_gvec_ddd, a, MO_16, tcg_gen_gvec_sub)
TRANS(FPSUB32, VIS1, do_gvec_ddd, a, MO_32, tcg_gen_gvec_sub)
TRANS(FCHKSM16, VIS3, do_gvec_ddd, a, MO_16, gen_op_fchksm16)
+TRANS(FMEAN16, VIS3, do_gvec_ddd, a, MO_16, gen_op_fmean16)
static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_i64, TCGv_i64))
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index c627bb1a1f..93a6239f41 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -395,3 +395,24 @@ uint64_t helper_fchksm16(uint64_t src1, uint64_t src2)
return r.ll;
}
+
+static inline int16_t do_fmean16(int16_t src1, int16_t src2)
+{
+ return (src1 + src2 + 1) / 2;
+}
+
+uint64_t helper_fmean16(uint64_t src1, uint64_t src2)
+{
+ VIS64 r, s1, s2;
+
+ s1.ll = src1;
+ s2.ll = src2;
+ r.ll = 0;
+
+ r.VIS_SW64(0) = do_fmean16(s1.VIS_SW64(0), s2.VIS_SW64(0));
+ r.VIS_SW64(1) = do_fmean16(s1.VIS_SW64(1), s2.VIS_SW64(1));
+ r.VIS_SW64(2) = do_fmean16(s1.VIS_SW64(2), s2.VIS_SW64(2));
+ r.VIS_SW64(3) = do_fmean16(s1.VIS_SW64(3), s2.VIS_SW64(3));
+
+ return r.ll;
+}
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 51a7fb62fb..bc5640aa5f 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -421,6 +421,7 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
PDIST 10 ..... 110110 ..... 0 0011 1110 ..... \
&r_r_r_r rd=%dfp_rd rs1=%dfp_rd rs2=%dfp_rs1 rs3=%dfp_rs2
+ FMEAN16 10 ..... 110110 ..... 0 0100 0000 ..... @d_d_d
FCHKSM16 10 ..... 110110 ..... 0 0100 0100 ..... @d_d_d
FALIGNDATAg 10 ..... 110110 ..... 0 0100 1000 ..... @d_d_d
FPMERGE 10 ..... 110110 ..... 0 0100 1011 ..... @d_r_r
--
2.34.1
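
gen_vec_fmean16 above avoids widening the 16-bit lanes by halving each input first and then adding back the lost rounding bit, (src1 | src2) & 1. The sketch below checks that identity against a widened reference and also shows where a truncating "/ 2", as written in do_fmean16, gives a different answer for negative odd sums. The function names are hypothetical, the code assumes that >> on a negative int is an arithmetic shift (implementation-defined in C, true on mainstream compilers), and which rounding the architecture intends is not stated in this message.

```c
#include <stdint.h>

/* Reference: widen to int, add the rounding bias, floor-shift by one. */
int16_t mean_widened(int16_t a, int16_t b)
{
    return (int16_t)((a + b + 1) >> 1);
}

/* Lane-safe form: the final sum always fits in 16 bits. */
int16_t mean_split(int16_t a, int16_t b)
{
    int16_t t = (int16_t)((a | b) & 1);   /* rounding bit lost by the shifts */

    return (int16_t)((a >> 1) + (b >> 1) + t);
}

/* Same biased sum, but with C division, which truncates toward zero. */
int16_t mean_truncating(int16_t a, int16_t b)
{
    return (int16_t)((a + b + 1) / 2);
}
```

For a = -3 and b = -1 the biased sum is -3: the two shift-based forms return -2, the truncating form returns -1. For non-negative sums all three agree.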
* [PATCH 21/41] target/sparc: Implement FPADD64 FPSUB64
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 3 +++
target/sparc/insns.decode | 2 ++
2 files changed, 5 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 4876d46ebb..9af30d8fa7 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4905,6 +4905,9 @@ TRANS(FHSUBd, VIS3, do_ddd, a, gen_op_fhsubd)
TRANS(FNHADDd, VIS3, do_ddd, a, gen_op_fnhaddd)
TRANS(FNADDd, VIS3, do_ddd, a, gen_op_fnaddd)
+TRANS(FPADD64, VIS3B, do_ddd, a, tcg_gen_add_i64)
+TRANS(FPSUB64, VIS3B, do_ddd, a, tcg_gen_sub_i64)
+
static bool do_rdd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv, TCGv_i64, TCGv_i64))
{
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index bc5640aa5f..c9dab4236d 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -441,10 +441,12 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
FPADD16s 10 ..... 110110 ..... 0 0101 0001 ..... @r_r_r
FPADD32 10 ..... 110110 ..... 0 0101 0010 ..... @d_d_d
FPADD32s 10 ..... 110110 ..... 0 0101 0011 ..... @r_r_r
+ FPADD64 10 ..... 110110 ..... 0 0100 0010 ..... @d_d_d
FPSUB16 10 ..... 110110 ..... 0 0101 0100 ..... @d_d_d
FPSUB16s 10 ..... 110110 ..... 0 0101 0101 ..... @r_r_r
FPSUB32 10 ..... 110110 ..... 0 0101 0110 ..... @d_d_d
FPSUB32s 10 ..... 110110 ..... 0 0101 0111 ..... @r_r_r
+ FPSUB64 10 ..... 110110 ..... 0 0100 0110 ..... @d_d_d
FNORd 10 ..... 110110 ..... 0 0110 0010 ..... @d_d_d
FNORs 10 ..... 110110 ..... 0 0110 0011 ..... @r_r_r
--
2.34.1
* [PATCH 22/41] target/sparc: Implement FPADDS, FPSUBS
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 82 +++++++++++++++++++++++++++++++++++++++
target/sparc/insns.decode | 9 +++++
2 files changed, 91 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 9af30d8fa7..0dc02a3d6e 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -707,6 +707,78 @@ static void gen_op_fpack32(TCGv_i64 dst, TCGv_i64 src1, TCGv_i64 src2)
#endif
}
+static void gen_op_fpadds16s(TCGv_i32 d, TCGv_i32 src1, TCGv_i32 src2)
+{
+ TCGv_i32 t[2];
+
+ for (int i = 0; i < 2; i++) {
+ TCGv_i32 u = tcg_temp_new_i32();
+ TCGv_i32 v = tcg_temp_new_i32();
+
+ tcg_gen_sextract_i32(u, src1, i * 16, 16);
+ tcg_gen_sextract_i32(v, src2, i * 16, 16);
+ tcg_gen_add_i32(u, u, v);
+ tcg_gen_smax_i32(u, u, tcg_constant_i32(INT16_MIN));
+ tcg_gen_smin_i32(u, u, tcg_constant_i32(INT16_MAX));
+ t[i] = u;
+ }
+ tcg_gen_deposit_i32(d, t[0], t[1], 16, 16);
+}
+
+static void gen_op_fpsubs16s(TCGv_i32 d, TCGv_i32 src1, TCGv_i32 src2)
+{
+ TCGv_i32 t[2];
+
+ for (int i = 0; i < 2; i++) {
+ TCGv_i32 u = tcg_temp_new_i32();
+ TCGv_i32 v = tcg_temp_new_i32();
+
+ tcg_gen_sextract_i32(u, src1, i * 16, 16);
+ tcg_gen_sextract_i32(v, src2, i * 16, 16);
+ tcg_gen_sub_i32(u, u, v);
+ tcg_gen_smax_i32(u, u, tcg_constant_i32(INT16_MIN));
+ tcg_gen_smin_i32(u, u, tcg_constant_i32(INT16_MAX));
+ t[i] = u;
+ }
+ tcg_gen_deposit_i32(d, t[0], t[1], 16, 16);
+}
+
+static void gen_op_fpadds32s(TCGv_i32 d, TCGv_i32 src1, TCGv_i32 src2)
+{
+ TCGv_i32 r = tcg_temp_new_i32();
+ TCGv_i32 t = tcg_temp_new_i32();
+ TCGv_i32 v = tcg_temp_new_i32();
+ TCGv_i32 z = tcg_constant_i32(0);
+
+ tcg_gen_add_i32(r, src1, src2);
+ tcg_gen_xor_i32(t, src1, src2);
+ tcg_gen_xor_i32(v, r, src2);
+ tcg_gen_andc_i32(v, v, t);
+
+ tcg_gen_setcond_i32(TCG_COND_GE, t, r, z);
+ tcg_gen_addi_i32(t, t, INT32_MAX);
+
+ tcg_gen_movcond_i32(TCG_COND_LT, d, v, z, t, r);
+}
+
+static void gen_op_fpsubs32s(TCGv_i32 d, TCGv_i32 src1, TCGv_i32 src2)
+{
+ TCGv_i32 r = tcg_temp_new_i32();
+ TCGv_i32 t = tcg_temp_new_i32();
+ TCGv_i32 v = tcg_temp_new_i32();
+ TCGv_i32 z = tcg_constant_i32(0);
+
+ tcg_gen_sub_i32(r, src1, src2);
+ tcg_gen_xor_i32(t, src1, src2);
+ tcg_gen_xor_i32(v, r, src1);
+ tcg_gen_and_i32(v, v, t);
+
+ tcg_gen_setcond_i32(TCG_COND_GE, t, r, z);
+ tcg_gen_addi_i32(t, t, INT32_MAX);
+
+ tcg_gen_movcond_i32(TCG_COND_LT, d, v, z, t, r);
+}
+
static void gen_op_faligndata(TCGv_i64 dst, TCGv_i64 s1, TCGv_i64 s2)
{
#ifdef TARGET_SPARC64
@@ -4785,6 +4857,11 @@ TRANS(FHSUBs, VIS3, do_fff, a, gen_op_fhsubs)
TRANS(FNHADDs, VIS3, do_fff, a, gen_op_fnhadds)
TRANS(FNADDs, VIS3, do_fff, a, gen_op_fnadds)
+TRANS(FPADDS16s, VIS3, do_fff, a, gen_op_fpadds16s)
+TRANS(FPSUBS16s, VIS3, do_fff, a, gen_op_fpsubs16s)
+TRANS(FPADDS32s, VIS3, do_fff, a, gen_op_fpadds32s)
+TRANS(FPSUBS32s, VIS3, do_fff, a, gen_op_fpsubs32s)
+
static bool do_env_fff(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i32, TCGv_env, TCGv_i32, TCGv_i32))
{
@@ -4867,6 +4944,11 @@ TRANS(FPSUB32, VIS1, do_gvec_ddd, a, MO_32, tcg_gen_gvec_sub)
TRANS(FCHKSM16, VIS3, do_gvec_ddd, a, MO_16, gen_op_fchksm16)
TRANS(FMEAN16, VIS3, do_gvec_ddd, a, MO_16, gen_op_fmean16)
+TRANS(FPADDS16, VIS3, do_gvec_ddd, a, MO_16, tcg_gen_gvec_ssadd)
+TRANS(FPADDS32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_ssadd)
+TRANS(FPSUBS16, VIS3, do_gvec_ddd, a, MO_16, tcg_gen_gvec_sssub)
+TRANS(FPSUBS32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_sssub)
+
static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_i64, TCGv_i64))
{
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index c9dab4236d..602b4cc648 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -448,6 +448,15 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
FPSUB32s 10 ..... 110110 ..... 0 0101 0111 ..... @r_r_r
FPSUB64 10 ..... 110110 ..... 0 0100 0110 ..... @d_d_d
+ FPADDS16 10 ..... 110110 ..... 0 0101 1000 ..... @d_d_d
+ FPADDS16s 10 ..... 110110 ..... 0 0101 1001 ..... @r_r_r
+ FPADDS32 10 ..... 110110 ..... 0 0101 1010 ..... @d_d_d
+ FPADDS32s 10 ..... 110110 ..... 0 0101 1011 ..... @r_r_r
+ FPSUBS16 10 ..... 110110 ..... 0 0101 1100 ..... @d_d_d
+ FPSUBS16s 10 ..... 110110 ..... 0 0101 1101 ..... @r_r_r
+ FPSUBS32 10 ..... 110110 ..... 0 0101 1110 ..... @d_d_d
+ FPSUBS32s 10 ..... 110110 ..... 0 0101 1111 ..... @r_r_r
+
FNORd 10 ..... 110110 ..... 0 0110 0010 ..... @d_d_d
FNORs 10 ..... 110110 ..... 0 0110 0011 ..... @r_r_r
FANDNOTd 10 ..... 110110 ..... 0 0110 0100 ..... @d_d_d # FANDNOT2d
--
2.34.1
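
The branch-free 32-bit saturation in gen_op_fpadds32s and gen_op_fpsubs32s relies on the sign bit of an XOR/AND expression to flag overflow, and on (r >= 0) + INT32_MAX to pick the saturation value. A plain C sketch of the same idea follows; sat_add32 and sat_sub32 are hypothetical names. For subtraction the sketch uses the textbook condition, which compares the wrapped result with the minuend: overflow happens when the operands differ in sign and the result's sign differs from the first operand's.

```c
#include <stdint.h>

/* Saturating signed 32-bit addition. */
int32_t sat_add32(int32_t a, int32_t b)
{
    /* Wrapping sum, computed in unsigned to avoid undefined behaviour. */
    int32_t r = (int32_t)((uint32_t)a + (uint32_t)b);
    /* Sign bit set iff a and b agree in sign and r disagrees. */
    int32_t v = (r ^ b) & ~(a ^ b);
    /* r >= 0 after overflow means the true sum was too negative. */
    int32_t sat = (int32_t)((uint32_t)(r >= 0) + (uint32_t)INT32_MAX);

    return v < 0 ? sat : r;
}

/* Saturating signed 32-bit subtraction. */
int32_t sat_sub32(int32_t a, int32_t b)
{
    int32_t r = (int32_t)((uint32_t)a - (uint32_t)b);
    /* Sign bit set iff a and b differ in sign and r differs from a. */
    int32_t v = (r ^ a) & (a ^ b);
    int32_t sat = (int32_t)((uint32_t)(r >= 0) + (uint32_t)INT32_MAX);

    return v < 0 ? sat : r;
}
```

Useful spot checks are 1 - (-1), which must give 2 without saturating, and INT32_MAX - (-1), which must saturate to INT32_MAX.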
* [PATCH 23/41] target/sparc: Implement FPCMPEQ8, FPCMPNE8, FPCMPULE8, FPCMPUGT8
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 4 ++++
target/sparc/translate.c | 9 +++++++++
target/sparc/vis_helper.c | 40 +++++++++++++++++++++++++++++++++++++++
target/sparc/insns.decode | 5 +++++
4 files changed, 58 insertions(+)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 8a5191414e..fb52f31666 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -123,6 +123,10 @@ VIS_CMPHELPER(cmpgt)
VIS_CMPHELPER(cmpeq)
VIS_CMPHELPER(cmple)
VIS_CMPHELPER(cmpne)
+DEF_HELPER_FLAGS_2(fcmpeq8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fcmpne8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fcmpule8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fcmpugt8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
#endif
#undef VIS_HELPER
#undef VIS_CMPHELPER
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 0dc02a3d6e..bc8c314d4c 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -65,14 +65,18 @@
# define gen_helper_cmask8 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_cmask16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_cmask32 ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_fcmpeq8 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpeq16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpeq32 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpgt16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpgt32 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmple16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmple32 ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_fcmpne8 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpne16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpne32 ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_fcmpule8 ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_fcmpugt8 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fdtox ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fexpand ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fmul8sux16 ({ qemu_build_not_reached(); NULL; })
@@ -5018,6 +5022,11 @@ TRANS(FPCMPNE32, VIS1, do_rdd, a, gen_helper_fcmpne32)
TRANS(FPCMPGT32, VIS1, do_rdd, a, gen_helper_fcmpgt32)
TRANS(FPCMPEQ32, VIS1, do_rdd, a, gen_helper_fcmpeq32)
+TRANS(FPCMPEQ8, VIS3B, do_rdd, a, gen_helper_fcmpeq8)
+TRANS(FPCMPNE8, VIS3B, do_rdd, a, gen_helper_fcmpne8)
+TRANS(FPCMPULE8, VIS3B, do_rdd, a, gen_helper_fcmpule8)
+TRANS(FPCMPUGT8, VIS3B, do_rdd, a, gen_helper_fcmpugt8)
+
static bool do_env_ddd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_env, TCGv_i64, TCGv_i64))
{
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 93a6239f41..2d290a440e 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -221,6 +221,46 @@ VIS_CMPHELPER(helper_fcmpeq, FCMPEQ)
VIS_CMPHELPER(helper_fcmple, FCMPLE)
VIS_CMPHELPER(helper_fcmpne, FCMPNE)
+uint64_t helper_fcmpeq8(uint64_t src1, uint64_t src2)
+{
+ uint64_t a = src1 ^ src2;
+ uint64_t m = 0x7f7f7f7f7f7f7f7fULL;
+ uint64_t c = ~(((a & m) + m) | a | m);
+
+ /* a.......b.......c.......d.......e.......f.......g.......h....... */
+ c |= c << 7;
+ /* ab......bc......cd......de......ef......fg......gh......h....... */
+ c |= c << 14;
+ /* abcd....bcde....cdef....defg....efgh....fgh.....gh......h....... */
+ c |= c << 28;
+ /* abcdefghbcdefgh.cdefgh..defgh...efgh....fgh.....gh......h....... */
+ return c >> 56;
+}
+
+uint64_t helper_fcmpne8(uint64_t src1, uint64_t src2)
+{
+ return helper_fcmpeq8(src1, src2) ^ 0xff;
+}
+
+uint64_t helper_fcmpule8(uint64_t src1, uint64_t src2)
+{
+ VIS64 s1, s2;
+ uint64_t r = 0;
+
+ s1.ll = src1;
+ s2.ll = src2;
+
+ for (int i = 0; i < 8; ++i) {
+ r |= (s1.VIS_B64(i) <= s2.VIS_B64(i)) << i;
+ }
+ return r;
+}
+
+uint64_t helper_fcmpugt8(uint64_t src1, uint64_t src2)
+{
+ return helper_fcmpule8(src1, src2) ^ 0xff;
+}
+
uint64_t helper_pdist(uint64_t sum, uint64_t src1, uint64_t src2)
{
int i;
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 602b4cc648..c94007bf95 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -408,6 +408,11 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
FPCMPGT32 10 ..... 110110 ..... 0 0010 1100 ..... @r_d_d
FPCMPEQ32 10 ..... 110110 ..... 0 0010 1110 ..... @r_d_d
+ FPCMPULE8 10 ..... 110110 ..... 1 0010 0000 ..... @r_d_d
+ FPCMPUGT8 10 ..... 110110 ..... 1 0010 1000 ..... @r_d_d
+ FPCMPEQ8 10 ..... 110110 ..... 1 0010 0010 ..... @r_d_d
+ FPCMPNE8 10 ..... 110110 ..... 1 0010 1010 ..... @r_d_d
+
FMUL8x16 10 ..... 110110 ..... 0 0011 0001 ..... @d_r_d
FMUL8x16AU 10 ..... 110110 ..... 0 0011 0011 ..... @d_r_r
FMUL8x16AL 10 ..... 110110 ..... 0 0011 0101 ..... @d_r_r
--
2.34.1
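
helper_fcmpeq8 above computes eight byte comparisons without a loop: it first sets bit 7 of every byte that is equal in both inputs, then gathers those eight bits into the top byte with three shift-and-or steps. The sketch below repeats that sequence next to a straightforward loop so the two can be compared. cmpeq8_swar and cmpeq8_loop are hypothetical names, and the loop numbers bytes from the least significant end, which is this sketch's convention.

```c
#include <stdint.h>

/* Bit i of the result is set when byte i of x equals byte i of y. */
uint64_t cmpeq8_swar(uint64_t x, uint64_t y)
{
    uint64_t a = x ^ y;                       /* zero byte <=> equal byte */
    uint64_t m = 0x7f7f7f7f7f7f7f7fULL;
    /*
     * (a & m) + m sets bit 7 of a byte when its low 7 bits are nonzero
     * and cannot carry into the next byte.  OR-ing in a covers bit 7
     * itself, so after the inversion bit 7 marks the all-zero bytes.
     */
    uint64_t c = ~(((a & m) + m) | a | m);

    c |= c << 7;                              /* pair up neighbouring flags */
    c |= c << 14;                             /* then groups of four */
    c |= c << 28;                             /* then all eight */
    return c >> 56;                           /* flags sit in the top byte */
}

/* Reference implementation, one byte at a time. */
uint64_t cmpeq8_loop(uint64_t x, uint64_t y)
{
    uint64_t r = 0;

    for (int i = 0; i < 8; i++) {
        uint8_t bx = (uint8_t)(x >> (8 * i));
        uint8_t by = (uint8_t)(y >> (8 * i));

        r |= (uint64_t)(bx == by) << i;
    }
    return r;
}
```

Inverting the low eight bits of either result gives the "not equal" mask, which is how helper_fcmpne8 is derived in the patch.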
* [PATCH 24/41] target/sparc: Implement FSLL, FSRL, FSRA, FSLAS
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 2 ++
target/sparc/translate.c | 11 +++++++++++
target/sparc/vis_helper.c | 36 ++++++++++++++++++++++++++++++++++++
target/sparc/insns.decode | 9 +++++++++
4 files changed, 58 insertions(+)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index fb52f31666..331acbe8d0 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -114,6 +114,8 @@ DEF_HELPER_FLAGS_2(cmask16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(cmask32, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fchksm16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fmean16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fslas16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(fslas32, TCG_CALL_NO_RWG_SE, i64, i64, i64)
#define VIS_CMPHELPER(name) \
DEF_HELPER_FLAGS_2(f##name##16, TCG_CALL_NO_RWG_SE, \
i64, i64, i64) \
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index bc8c314d4c..cab177190a 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -84,6 +84,8 @@
# define gen_helper_fmul8x16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fpmerge ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fqtox ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_fslas16 ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_fslas32 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fstox ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fxtod ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fxtoq ({ qemu_build_not_reached(); NULL; })
@@ -4953,6 +4955,13 @@ TRANS(FPADDS32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_ssadd)
TRANS(FPSUBS16, VIS3, do_gvec_ddd, a, MO_16, tcg_gen_gvec_sssub)
TRANS(FPSUBS32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_sssub)
+TRANS(FSLL16, VIS3, do_gvec_ddd, a, MO_16, tcg_gen_gvec_shlv)
+TRANS(FSLL32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_shlv)
+TRANS(FSRL16, VIS3, do_gvec_ddd, a, MO_16, tcg_gen_gvec_shrv)
+TRANS(FSRL32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_shrv)
+TRANS(FSRA16, VIS3, do_gvec_ddd, a, MO_16, tcg_gen_gvec_sarv)
+TRANS(FSRA32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_sarv)
+
static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_i64, TCGv_i64))
{
@@ -4993,6 +5002,8 @@ TRANS(FNADDd, VIS3, do_ddd, a, gen_op_fnaddd)
TRANS(FPADD64, VIS3B, do_ddd, a, tcg_gen_add_i64)
TRANS(FPSUB64, VIS3B, do_ddd, a, tcg_gen_sub_i64)
+TRANS(FSLAS16, VIS3, do_ddd, a, gen_helper_fslas16)
+TRANS(FSLAS32, VIS3, do_ddd, a, gen_helper_fslas32)
static bool do_rdd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv, TCGv_i64, TCGv_i64))
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 2d290a440e..8675ac64b3 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -456,3 +456,39 @@ uint64_t helper_fmean16(uint64_t src1, uint64_t src2)
return r.ll;
}
+
+uint64_t helper_fslas16(uint64_t src1, uint64_t src2)
+{
+ VIS64 r, s1, s2;
+
+ s1.ll = src1;
+ s2.ll = src2;
+ r.ll = 0;
+
+ for (int i = 0; i < 4; ++i) {
+ int t = s1.VIS_SW64(i) << (s2.VIS_W64(i) % 16);
+ t = MIN(t, INT16_MAX);
+ t = MAX(t, INT16_MIN);
+ r.VIS_SW64(i) = t;
+ }
+
+ return r.ll;
+}
+
+uint64_t helper_fslas32(uint64_t src1, uint64_t src2)
+{
+ VIS64 r, s1, s2;
+
+ s1.ll = src1;
+ s2.ll = src2;
+ r.ll = 0;
+
+ for (int i = 0; i < 2; ++i) {
+ int64_t t = (int64_t)(int32_t)s1.VIS_L64(i) << (s2.VIS_L64(i) % 32);
+ t = MIN(t, INT32_MAX);
+ t = MAX(t, INT32_MIN);
+ r.VIS_L64(i) = t;
+ }
+
+ return r.ll;
+}
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index c94007bf95..67591b7df9 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -408,6 +408,15 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
FPCMPGT32 10 ..... 110110 ..... 0 0010 1100 ..... @r_d_d
FPCMPEQ32 10 ..... 110110 ..... 0 0010 1110 ..... @r_d_d
+ FSLL16 10 ..... 110110 ..... 0 0010 0001 ..... @d_d_d
+ FSRL16 10 ..... 110110 ..... 0 0010 0011 ..... @d_d_d
+ FSLAS16 10 ..... 110110 ..... 0 0010 1001 ..... @d_d_d
+ FSRA16 10 ..... 110110 ..... 0 0010 1011 ..... @d_d_d
+ FSLL32 10 ..... 110110 ..... 0 0010 0101 ..... @d_d_d
+ FSRL32 10 ..... 110110 ..... 0 0010 0111 ..... @d_d_d
+ FSLAS32 10 ..... 110110 ..... 0 0010 1101 ..... @d_d_d
+ FSRA32 10 ..... 110110 ..... 0 0010 1111 ..... @d_d_d
+
FPCMPULE8 10 ..... 110110 ..... 1 0010 0000 ..... @r_d_d
FPCMPUGT8 10 ..... 110110 ..... 1 0010 1000 ..... @r_d_d
FPCMPEQ8 10 ..... 110110 ..... 1 0010 0010 ..... @r_d_d
--
2.34.1
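
A standalone sketch of the saturating left shift that helper_fslas16 performs per lane, for anyone who wants to check corner cases outside QEMU. The names slas16_lane and slas16 are hypothetical. The sketch multiplies by a power of two instead of shifting a possibly negative value, so it does not rely on any particular compiler flag; that is a choice made for this note, not something the patch does.

```c
#include <stdint.h>

/* Shift one signed 16-bit lane left by (count mod 16), clamping the
 * result to the int16_t range. */
static inline int16_t slas16_lane(int16_t value, uint16_t count)
{
    /* |value| <= 2^15 and the factor is <= 2^15, so the product fits in int32_t. */
    int32_t t = (int32_t)value * ((int32_t)1 << (count % 16));

    if (t > INT16_MAX) {
        t = INT16_MAX;
    }
    if (t < INT16_MIN) {
        t = INT16_MIN;
    }
    return (int16_t)t;
}

/* Apply the lane operation to four 16-bit lanes packed in 64 bits.
 * Lane 0 is taken to be the least significant 16 bits (an assumption
 * of this sketch). */
static inline uint64_t slas16(uint64_t src1, uint64_t src2)
{
    uint64_t r = 0;

    for (int i = 0; i < 4; ++i) {
        int16_t v = (int16_t)(uint16_t)(src1 >> (i * 16));
        uint16_t n = (uint16_t)(src2 >> (i * 16));
        r |= (uint64_t)(uint16_t)slas16_lane(v, n) << (i * 16);
    }
    return r;
}
```

A shift count of 16 wraps to 0, so the lane is returned unchanged; 0x4000 shifted by 1 overflows and clamps to INT16_MAX.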
* [PATCH 25/41] target/sparc: Implement LDXEFSR
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (23 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 24/41] target/sparc: Implement FSLL, FSRL, FSRA, FSLAS Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-03-02 5:15 ` [PATCH 26/41] target/sparc: Implement LZCNT Richard Henderson
` (17 subsequent siblings)
42 siblings, 0 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 1 +
target/sparc/fop_helper.c | 6 ++++++
target/sparc/translate.c | 11 +++++++++--
target/sparc/insns.decode | 1 +
4 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 331acbe8d0..56daf2ad01 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -37,6 +37,7 @@ DEF_HELPER_FLAGS_4(ld_asi, TCG_CALL_NO_WG, i64, env, tl, int, i32)
DEF_HELPER_FLAGS_5(st_asi, TCG_CALL_NO_WG, void, env, tl, i64, int, i32)
#endif
DEF_HELPER_FLAGS_1(get_fsr, TCG_CALL_NO_WG_SE, tl, env)
+DEF_HELPER_FLAGS_2(set_fsr_nofcc, TCG_CALL_NO_RWG, void, env, i32)
DEF_HELPER_FLAGS_2(set_fsr_nofcc_noftt, TCG_CALL_NO_RWG, void, env, i32)
DEF_HELPER_FLAGS_2(fsqrts, TCG_CALL_NO_WG, f32, env, f32)
DEF_HELPER_FLAGS_2(fsqrtd, TCG_CALL_NO_WG, f64, env, f64)
diff --git a/target/sparc/fop_helper.c b/target/sparc/fop_helper.c
index 8c07442ad6..a483d69ab7 100644
--- a/target/sparc/fop_helper.c
+++ b/target/sparc/fop_helper.c
@@ -570,3 +570,9 @@ void helper_set_fsr_nofcc_noftt(CPUSPARCState *env, uint32_t fsr)
env->fsr_cexc_ftt |= fsr & FSR_CEXC_MASK;
set_fsr_nonsplit(env, fsr);
}
+
+void helper_set_fsr_nofcc(CPUSPARCState *env, uint32_t fsr)
+{
+ env->fsr_cexc_ftt = fsr & (FSR_CEXC_MASK | FSR_FTT_MASK);
+ set_fsr_nonsplit(env, fsr);
+}
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index cab177190a..c26fd04598 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -4454,7 +4454,7 @@ static bool trans_LDFSR(DisasContext *dc, arg_r_r_ri *a)
return advance_pc(dc);
}
-static bool trans_LDXFSR(DisasContext *dc, arg_r_r_ri *a)
+static bool do_ldxfsr(DisasContext *dc, arg_r_r_ri *a, bool entire)
{
#ifdef TARGET_SPARC64
TCGv addr = gen_ldst_addr(dc, a->rs1, a->imm, a->rs2_or_imm);
@@ -4479,13 +4479,20 @@ static bool trans_LDXFSR(DisasContext *dc, arg_r_r_ri *a)
tcg_gen_extract_i32(cpu_fcc[2], hi, FSR_FCC2_SHIFT - 32, 2);
tcg_gen_extract_i32(cpu_fcc[3], hi, FSR_FCC3_SHIFT - 32, 2);
- gen_helper_set_fsr_nofcc_noftt(tcg_env, lo);
+ if (entire) {
+ gen_helper_set_fsr_nofcc(tcg_env, lo);
+ } else {
+ gen_helper_set_fsr_nofcc_noftt(tcg_env, lo);
+ }
return advance_pc(dc);
#else
return false;
#endif
}
+TRANS(LDXFSR, 64, do_ldxfsr, a, false)
+TRANS(LDXEFSR, VIS3B, do_ldxfsr, a, true)
+
static bool do_stfsr(DisasContext *dc, arg_r_r_ri *a, MemOp mop)
{
TCGv addr = gen_ldst_addr(dc, a->rs1, a->imm, a->rs2_or_imm);
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 67591b7df9..353d26b9e6 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -589,6 +589,7 @@ STX 11 ..... 011110 ..... . ............. @r_r_i_asi # STXA
LDF 11 ..... 100000 ..... . ............. @r_r_ri_na
LDFSR 11 00000 100001 ..... . ............. @n_r_ri
LDXFSR 11 00001 100001 ..... . ............. @n_r_ri
+LDXEFSR 11 00011 100001 ..... . ............. @n_r_ri
LDQF 11 ..... 100010 ..... . ............. @q_r_ri_na
LDDF 11 ..... 100011 ..... . ............. @d_r_ri_na
--
2.34.1
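
A small sketch of the difference between the two load paths as far as the cexc/ftt word is concerned. This is a model for discussion, not QEMU code: the function names are hypothetical, the mask values assume the SPARC V9 layout (cexc in bits 4:0, ftt in bits 16:14), and the LDXFSR variant assumes the existing helper keeps the old ftt field, which is only partly visible in the hunk context above.

```c
#include <stdint.h>

/* Assumed field layout; these local names are not the QEMU macros. */
#define SKETCH_CEXC_MASK  0x1fu
#define SKETCH_FTT_MASK   (7u << 14)

/* LDXFSR: ftt is not writable, so keep the old ftt and take cexc
 * from the loaded value. */
static inline uint32_t cexc_ftt_after_ldxfsr(uint32_t old, uint32_t fsr)
{
    return (old & SKETCH_FTT_MASK) | (fsr & SKETCH_CEXC_MASK);
}

/* LDXEFSR: the "entire" form also overwrites ftt from the loaded value. */
static inline uint32_t cexc_ftt_after_ldxefsr(uint32_t old, uint32_t fsr)
{
    (void)old;  /* nothing of the previous cexc/ftt state survives */
    return fsr & (SKETCH_CEXC_MASK | SKETCH_FTT_MASK);
}
```

With an old ftt of 3 and a loaded value carrying ftt 5, the first function keeps 3 and the second yields 5.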
* [PATCH 26/41] target/sparc: Implement LZCNT
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (24 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 25/41] target/sparc: Implement LDXEFSR Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-05-10 17:22 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 27/41] target/sparc: Implement MOVsTOw, MOVdTOx, MOVwTOs, MOVxTOd Richard Henderson
` (16 subsequent siblings)
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 18 ++++++++++++++++++
target/sparc/insns.decode | 1 +
2 files changed, 19 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index c26fd04598..761ae204b9 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -667,6 +667,11 @@ static void gen_op_popc(TCGv dst, TCGv src1, TCGv src2)
tcg_gen_ctpop_tl(dst, src2);
}
+static void gen_op_lzcnt(TCGv dst, TCGv src)
+{
+ tcg_gen_clzi_tl(dst, src, TARGET_LONG_BITS);
+}
+
#ifndef TARGET_SPARC64
static void gen_helper_array8(TCGv dst, TCGv src1, TCGv src2)
{
@@ -3869,6 +3874,19 @@ TRANS(EDGE16LN, VIS2, gen_edge, a, 16, 0, 1)
TRANS(EDGE32N, VIS2, gen_edge, a, 32, 0, 0)
TRANS(EDGE32LN, VIS2, gen_edge, a, 32, 0, 1)
+static bool do_rr(DisasContext *dc, arg_r_r *a,
+ void (*func)(TCGv, TCGv))
+{
+ TCGv dst = gen_dest_gpr(dc, a->rd);
+ TCGv src = gen_load_gpr(dc, a->rs);
+
+ func(dst, src);
+ gen_store_gpr(dc, a->rd, dst);
+ return advance_pc(dc);
+}
+
+TRANS(LZCNT, VIS3, do_rr, a, gen_op_lzcnt)
+
static bool do_rrr(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv, TCGv, TCGv))
{
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 353d26b9e6..f7f532002a 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -389,6 +389,7 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
ADDXC 10 ..... 110110 ..... 0 0001 0001 ..... @r_r_r
ADDXCcc 10 ..... 110110 ..... 0 0001 0011 ..... @r_r_r
+ LZCNT 10 ..... 110110 00000 0 0001 0111 ..... @r_r2
ALIGNADDR 10 ..... 110110 ..... 0 0001 1000 ..... @r_r_r
ALIGNADDRL 10 ..... 110110 ..... 0 0001 1010 ..... @r_r_r
--
2.34.1
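
For reference, the semantics that gen_op_lzcnt requests from tcg_gen_clzi_tl with TARGET_LONG_BITS as the zero-input value can be modelled in plain C as below. The function name lzcnt64 is made up for this note, and the sketch assumes a 64-bit target.

```c
#include <stdint.h>

/* Count leading zero bits of a 64-bit value; an all-zero input
 * yields 64, matching the zero-input value passed in the patch. */
static inline unsigned lzcnt64(uint64_t x)
{
    unsigned n = 0;

    if (x == 0) {
        return 64;
    }
    while ((x >> 63) == 0) {
        x <<= 1;
        ++n;
    }
    return n;
}
```

The explicit zero check matters because compiler builtins for counting leading zeros are commonly undefined for a zero argument.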
* [PATCH 27/41] target/sparc: Implement MOVsTOw, MOVdTOx, MOVwTOs, MOVxTOd
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (25 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 26/41] target/sparc: Implement LZCNT Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-03-02 5:15 ` [PATCH 28/41] target/sparc: Implement PDISTN Richard Henderson
` (15 subsequent siblings)
42 siblings, 0 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 36 ++++++++++++++++++++++++++++++++++++
target/sparc/insns.decode | 6 ++++++
2 files changed, 42 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 761ae204b9..70d87a68cc 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -5393,6 +5393,42 @@ static bool trans_FLCMPd(DisasContext *dc, arg_FLCMPd *a)
return advance_pc(dc);
}
+static bool do_movf2r(DisasContext *dc, arg_r_r *a,
+ int (*offset)(unsigned int),
+ void (*load)(TCGv, TCGv_ptr, tcg_target_long))
+{
+ TCGv dst;
+
+ if (gen_trap_ifnofpu(dc)) {
+ return true;
+ }
+ dst = gen_dest_gpr(dc, a->rd);
+ load(dst, tcg_env, offset(a->rs));
+ gen_store_gpr(dc, a->rd, dst);
+ return advance_pc(dc);
+}
+
+TRANS(MOVsTOsw, VIS3B, do_movf2r, a, gen_offset_fpr_F, tcg_gen_ld32s_tl)
+TRANS(MOVsTOuw, VIS3B, do_movf2r, a, gen_offset_fpr_F, tcg_gen_ld32u_tl)
+TRANS(MOVdTOx, VIS3B, do_movf2r, a, gen_offset_fpr_D, tcg_gen_ld_tl)
+
+static bool do_movr2f(DisasContext *dc, arg_r_r *a,
+ int (*offset)(unsigned int),
+ void (*store)(TCGv, TCGv_ptr, tcg_target_long))
+{
+ TCGv src;
+
+ if (gen_trap_ifnofpu(dc)) {
+ return true;
+ }
+ src = gen_load_gpr(dc, a->rs);
+ store(src, tcg_env, offset(a->rd));
+ return advance_pc(dc);
+}
+
+TRANS(MOVwTOs, VIS3B, do_movr2f, a, gen_offset_fpr_F, tcg_gen_st32_tl)
+TRANS(MOVxTOd, VIS3B, do_movr2f, a, gen_offset_fpr_D, tcg_gen_st_tl)
+
static void sparc_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
{
DisasContext *dc = container_of(dcbase, DisasContext, base);
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index f7f532002a..1189ad4c87 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -498,6 +498,12 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
FONEd 10 ..... 110110 00000 0 0111 1110 00000 rd=%dfp_rd
FONEs 10 rd:5 110110 00000 0 0111 1111 00000
+ MOVsTOuw 10 ..... 110110 00000 1 0001 0001 ..... @r_r2
+ MOVsTOsw 10 ..... 110110 00000 1 0001 0011 ..... @r_r2
+ MOVwTOs 10 ..... 110110 00000 1 0001 1001 ..... @r_r2
+ MOVdTOx 10 ..... 110110 00000 1 0001 0000 ..... @r_d2
+ MOVxTOd 10 ..... 110110 00000 1 0001 1000 ..... @d_r2
+
FLCMPs 10 000 cc:2 110110 rs1:5 1 0101 0001 rs2:5
FLCMPd 10 000 cc:2 110110 ..... 1 0101 0010 ..... \
rs1=%dfp_rs1 rs2=%dfp_rs2
--
2.34.1
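
The only difference between MOVsTOsw and MOVsTOuw in the patch is the load used (tcg_gen_ld32s_tl versus tcg_gen_ld32u_tl), that is, sign extension versus zero extension of the 32-bit register image. A plain C model, with hypothetical function names and assuming a 64-bit destination register:

```c
#include <stdint.h>

/* Sign-extend the 32-bit single-precision register image to 64 bits. */
static inline uint64_t mov_s_to_sw(uint32_t fpr_bits)
{
    return (uint64_t)(int64_t)(int32_t)fpr_bits;
}

/* Zero-extend the same 32-bit image to 64 bits. */
static inline uint64_t mov_s_to_uw(uint32_t fpr_bits)
{
    return (uint64_t)fpr_bits;
}
```

The two results differ only when bit 31 of the source is set.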
* [PATCH 28/41] target/sparc: Implement PDISTN
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (26 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 27/41] target/sparc: Implement MOVsTOw, MOVdTOx, MOVwTOs, MOVxTOd Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-05-10 17:28 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 29/41] target/sparc: Implement UMULXHI Richard Henderson
` (14 subsequent siblings)
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 11 +++++++++++
target/sparc/insns.decode | 1 +
2 files changed, 12 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 70d87a68cc..8241676174 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -826,6 +826,15 @@ static void gen_op_bshuffle(TCGv_i64 dst, TCGv_i64 src1, TCGv_i64 src2)
#endif
}
+static void gen_op_pdistn(TCGv dst, TCGv_i64 src1, TCGv_i64 src2)
+{
+#ifdef TARGET_SPARC64
+ gen_helper_pdist(dst, tcg_constant_i64(0), src1, src2);
+#else
+ g_assert_not_reached();
+#endif
+}
+
static void gen_op_fmul8x16al(TCGv_i64 dst, TCGv_i32 src1, TCGv_i32 src2)
{
tcg_gen_ext16s_i32(src2, src2);
@@ -5063,6 +5072,8 @@ TRANS(FPCMPNE8, VIS3B, do_rdd, a, gen_helper_fcmpne8)
TRANS(FPCMPULE8, VIS3B, do_rdd, a, gen_helper_fcmpule8)
TRANS(FPCMPUGT8, VIS3B, do_rdd, a, gen_helper_fcmpugt8)
+TRANS(PDISTN, VIS3, do_rdd, a, gen_op_pdistn)
+
static bool do_env_ddd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_env, TCGv_i64, TCGv_i64))
{
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 1189ad4c87..e46c5f7dc4 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -435,6 +435,7 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
FPACKFIX 10 ..... 110110 00000 0 0011 1101 ..... @d_d2
PDIST 10 ..... 110110 ..... 0 0011 1110 ..... \
&r_r_r_r rd=%dfp_rd rs1=%dfp_rd rs2=%dfp_rs1 rs3=%dfp_rs2
+ PDISTN 10 ..... 110110 ..... 0 0011 1111 ..... @r_d_d
FMEAN16 10 ..... 110110 ..... 0 0100 0000 ..... @d_d_d
FCHKSM16 10 ..... 110110 ..... 0 0100 0100 ..... @d_d_d
--
2.34.1
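
PDISTN reuses helper_pdist with a constant zero accumulator. As a sketch of what that computes, assuming helper_pdist is a sum of absolute differences over the eight unsigned byte lanes added to the incoming sum; the names below are hypothetical and this is not the QEMU helper itself:

```c
#include <stdint.h>

/* Add the absolute differences of the eight unsigned byte lanes of
 * a and b to the running sum. */
static inline uint64_t pdist_model(uint64_t sum, uint64_t a, uint64_t b)
{
    for (int i = 0; i < 8; ++i) {
        int x = (int)((a >> (i * 8)) & 0xff);
        int y = (int)((b >> (i * 8)) & 0xff);
        sum += (uint64_t)(x > y ? x - y : y - x);
    }
    return sum;
}

/* PDISTN: the same distance, starting from zero instead of rd. */
static inline uint64_t pdistn_model(uint64_t a, uint64_t b)
{
    return pdist_model(0, a, b);
}
```

The result is at most 8 * 255 = 2040, so it always fits comfortably in the integer destination register.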
* [PATCH 29/41] target/sparc: Implement UMULXHI
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (27 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 28/41] target/sparc: Implement PDISTN Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-03-02 5:15 ` [PATCH 30/41] target/sparc: Implement XMULX Richard Henderson
` (13 subsequent siblings)
42 siblings, 0 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 8 ++++++++
target/sparc/insns.decode | 1 +
2 files changed, 9 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 8241676174..2d697d2020 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -590,6 +590,12 @@ static void gen_op_smul(TCGv dst, TCGv src1, TCGv src2)
gen_op_multiply(dst, src1, src2, 1);
}
+static void gen_op_umulxhi(TCGv dst, TCGv src1, TCGv src2)
+{
+ TCGv discard = tcg_temp_new();
+ tcg_gen_mulu2_tl(discard, dst, src1, src2);
+}
+
static void gen_op_sdiv(TCGv dst, TCGv src1, TCGv src2)
{
#ifdef TARGET_SPARC64
@@ -3915,6 +3921,8 @@ TRANS(ARRAY32, VIS1, do_rrr, a, gen_op_array32)
TRANS(ADDXC, VIS3, do_rrr, a, gen_op_addxc)
TRANS(ADDXCcc, VIS3, do_rrr, a, gen_op_addxccc)
+TRANS(UMULXHI, VIS3, do_rrr, a, gen_op_umulxhi)
+
static void gen_op_alignaddr(TCGv dst, TCGv s1, TCGv s2)
{
#ifdef TARGET_SPARC64
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index e46c5f7dc4..0cd1cffe18 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -389,6 +389,7 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
ADDXC 10 ..... 110110 ..... 0 0001 0001 ..... @r_r_r
ADDXCcc 10 ..... 110110 ..... 0 0001 0011 ..... @r_r_r
+ UMULXHI 10 ..... 110110 ..... 0 0001 0110 ..... @r_r_r
LZCNT 10 ..... 110110 00000 0 0001 0111 ..... @r_r2
ALIGNADDR 10 ..... 110110 ..... 0 0001 1000 ..... @r_r_r
--
2.34.1
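
gen_op_umulxhi keeps only the high half of tcg_gen_mulu2_tl. A portable C model of that high half, built from 32-bit partial products so that it needs no 128-bit type; the function name umulxhi_model is made up for this note:

```c
#include <stdint.h>

/* High 64 bits of the unsigned 128-bit product a * b. */
static inline uint64_t umulxhi_model(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* contributes to bits   0..63  */
    uint64_t p1 = a_lo * b_hi;   /* contributes to bits  32..95  */
    uint64_t p2 = a_hi * b_lo;   /* contributes to bits  32..95  */
    uint64_t p3 = a_hi * b_hi;   /* contributes to bits  64..127 */

    /* Sum of the terms that land in bits 32..63; each is below 2^32,
     * so the sum is below 3 * 2^32 and cannot overflow. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    return p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}
```

On compilers that provide a 128-bit integer type the same value can be had with a single widening multiply; the split form is used here only to keep the sketch portable.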
* [PATCH 30/41] target/sparc: Implement XMULX
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (28 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 29/41] target/sparc: Implement UMULXHI Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-03-02 5:15 ` [PATCH 31/41] target/sparc: Enable VIS3 feature bit Richard Henderson
` (12 subsequent siblings)
42 siblings, 0 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 2 ++
target/sparc/translate.c | 4 ++++
target/sparc/vis_helper.c | 11 +++++++++++
target/sparc/insns.decode | 2 ++
4 files changed, 19 insertions(+)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 56daf2ad01..9b642fd74b 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -130,6 +130,8 @@ DEF_HELPER_FLAGS_2(fcmpeq8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fcmpne8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fcmpule8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fcmpugt8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(xmulx, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(xmulxhi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
#endif
#undef VIS_HELPER
#undef VIS_CMPHELPER
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 2d697d2020..f8db98c32f 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -91,6 +91,8 @@
# define gen_helper_fxtoq ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fxtos ({ qemu_build_not_reached(); NULL; })
# define gen_helper_pdist ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_xmulx ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_xmulxhi ({ qemu_build_not_reached(); NULL; })
# define MAXTL_MASK 0
#endif
@@ -5081,6 +5083,8 @@ TRANS(FPCMPULE8, VIS3B, do_rdd, a, gen_helper_fcmpule8)
TRANS(FPCMPUGT8, VIS3B, do_rdd, a, gen_helper_fcmpugt8)
TRANS(PDISTN, VIS3, do_rdd, a, gen_op_pdistn)
+TRANS(XMULX, VIS3, do_rrr, a, gen_helper_xmulx)
+TRANS(XMULXHI, VIS3, do_rrr, a, gen_helper_xmulxhi)
static bool do_env_ddd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_env, TCGv_i64, TCGv_i64))
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 8675ac64b3..387acb3855 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -20,6 +20,7 @@
#include "qemu/osdep.h"
#include "cpu.h"
#include "exec/helper-proto.h"
+#include "crypto/clmul.h"
/* This function uses non-native bit order */
#define GET_FIELD(X, FROM, TO) \
@@ -492,3 +493,13 @@ uint64_t helper_fslas32(uint64_t src1, uint64_t src2)
return r.ll;
}
+
+uint64_t helper_xmulx(uint64_t src1, uint64_t src2)
+{
+ return int128_getlo(clmul_64(src1, src2));
+}
+
+uint64_t helper_xmulxhi(uint64_t src1, uint64_t src2)
+{
+ return int128_gethi(clmul_64(src1, src2));
+}
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 0cd1cffe18..54ba329440 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -391,6 +391,8 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
ADDXCcc 10 ..... 110110 ..... 0 0001 0011 ..... @r_r_r
UMULXHI 10 ..... 110110 ..... 0 0001 0110 ..... @r_r_r
LZCNT 10 ..... 110110 00000 0 0001 0111 ..... @r_r2
+ XMULX 10 ..... 110110 ..... 1 0001 0101 ..... @r_r_r
+ XMULXHI 10 ..... 110110 ..... 1 0001 0110 ..... @r_r_r
ALIGNADDR 10 ..... 110110 ..... 0 0001 1000 ..... @r_r_r
ALIGNADDRL 10 ..... 110110 ..... 0 0001 1010 ..... @r_r_r
--
2.34.1
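
The helpers above take the low and high halves of clmul_64(). For readers unfamiliar with carry-less (XOR) multiplication, here is a bit-serial sketch of the same operation. It is written for clarity rather than speed, the names are hypothetical, and it is not the implementation behind crypto/clmul.h.

```c
#include <stdint.h>

/* Carry-less 64x64 -> 128 multiply: like schoolbook binary
 * multiplication, but partial products are combined with XOR
 * instead of addition, so no carries propagate. */
static inline void clmul64_model(uint64_t a, uint64_t b,
                                 uint64_t *lo, uint64_t *hi)
{
    uint64_t l = 0, h = 0;

    for (int i = 0; i < 64; ++i) {
        if ((b >> i) & 1) {
            l ^= a << i;
            if (i != 0) {
                h ^= a >> (64 - i);   /* bits shifted out of the low half */
            }
        }
    }
    *lo = l;
    *hi = h;
}

static inline uint64_t xmulx_model(uint64_t a, uint64_t b)
{
    uint64_t lo, hi;
    clmul64_model(a, b, &lo, &hi);
    return lo;
}

static inline uint64_t xmulxhi_model(uint64_t a, uint64_t b)
{
    uint64_t lo, hi;
    clmul64_model(a, b, &lo, &hi);
    return hi;
}
```

For example, 3 times 3 is 5 here rather than 9, because (x + 1)^2 = x^2 + 1 when coefficients are taken modulo 2.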
* [PATCH 31/41] target/sparc: Enable VIS3 feature bit
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (29 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 30/41] target/sparc: Implement XMULX Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-03-02 5:15 ` [PATCH 32/41] target/sparc: Implement IMA extension Richard Henderson
` (11 subsequent siblings)
42 siblings, 0 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
linux-user/elfload.c | 1 +
target/sparc/cpu.c | 3 +++
2 files changed, 4 insertions(+)
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 5ebf2bf789..89ce0f3167 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -998,6 +998,7 @@ static uint32_t get_elf_hwcap(void)
r |= features & CPU_FEATURE_VIS1 ? HWCAP_SPARC_VIS : 0;
r |= features & CPU_FEATURE_VIS2 ? HWCAP_SPARC_VIS2 : 0;
r |= features & CPU_FEATURE_FMAF ? HWCAP_SPARC_FMAF : 0;
+ r |= features & CPU_FEATURE_VIS3 ? HWCAP_SPARC_VIS3 : 0;
#endif
return r;
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index 491e627899..07d252a35b 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -552,6 +552,7 @@ static const char * const feature_name[] = {
[CPU_FEATURE_BIT_VIS1] = "vis1",
[CPU_FEATURE_BIT_VIS2] = "vis2",
[CPU_FEATURE_BIT_FMAF] = "fmaf",
+ [CPU_FEATURE_BIT_VIS3] = "vis3",
#else
[CPU_FEATURE_BIT_MUL] = "mul",
[CPU_FEATURE_BIT_DIV] = "div",
@@ -876,6 +877,8 @@ static Property sparc_cpu_properties[] = {
CPU_FEATURE_BIT_VIS2, false),
DEFINE_PROP_BIT("fmaf", SPARCCPU, env.def.features,
CPU_FEATURE_BIT_FMAF, false),
+ DEFINE_PROP_BIT("vis3", SPARCCPU, env.def.features,
+ CPU_FEATURE_BIT_VIS3, false),
#else
DEFINE_PROP_BIT("mul", SPARCCPU, env.def.features,
CPU_FEATURE_BIT_MUL, false),
--
2.34.1
* [PATCH 32/41] target/sparc: Implement IMA extension
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (30 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 31/41] target/sparc: Enable VIS3 feature bit Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-05-10 17:09 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 33/41] target/sparc: Add feature bit for VIS4 Richard Henderson
` (10 subsequent siblings)
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
linux-user/elfload.c | 1 +
target/sparc/cpu.c | 3 +++
target/sparc/translate.c | 24 ++++++++++++++++++++++++
target/sparc/cpu-feature.h.inc | 1 +
target/sparc/insns.decode | 3 +++
5 files changed, 32 insertions(+)
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 89ce0f3167..e4ee4750a2 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -999,6 +999,7 @@ static uint32_t get_elf_hwcap(void)
r |= features & CPU_FEATURE_VIS2 ? HWCAP_SPARC_VIS2 : 0;
r |= features & CPU_FEATURE_FMAF ? HWCAP_SPARC_FMAF : 0;
r |= features & CPU_FEATURE_VIS3 ? HWCAP_SPARC_VIS3 : 0;
+ r |= features & CPU_FEATURE_IMA ? HWCAP_SPARC_IMA : 0;
#endif
return r;
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index 07d252a35b..18dfd90845 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -553,6 +553,7 @@ static const char * const feature_name[] = {
[CPU_FEATURE_BIT_VIS2] = "vis2",
[CPU_FEATURE_BIT_FMAF] = "fmaf",
[CPU_FEATURE_BIT_VIS3] = "vis3",
+ [CPU_FEATURE_BIT_IMA] = "ima",
#else
[CPU_FEATURE_BIT_MUL] = "mul",
[CPU_FEATURE_BIT_DIV] = "div",
@@ -879,6 +880,8 @@ static Property sparc_cpu_properties[] = {
CPU_FEATURE_BIT_FMAF, false),
DEFINE_PROP_BIT("vis3", SPARCCPU, env.def.features,
CPU_FEATURE_BIT_VIS3, false),
+ DEFINE_PROP_BIT("ima", SPARCCPU, env.def.features,
+ CPU_FEATURE_BIT_IMA, false),
#else
DEFINE_PROP_BIT("mul", SPARCCPU, env.def.features,
CPU_FEATURE_BIT_MUL, false),
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index f8db98c32f..56ee3927af 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -598,6 +598,26 @@ static void gen_op_umulxhi(TCGv dst, TCGv src1, TCGv src2)
tcg_gen_mulu2_tl(discard, dst, src1, src2);
}
+static void gen_op_fpmaddx(TCGv_i64 dst, TCGv_i64 src1,
+ TCGv_i64 src2, TCGv_i64 src3)
+{
+ TCGv_i64 t = tcg_temp_new_i64();
+
+ tcg_gen_mul_i64(t, src1, src2);
+ tcg_gen_add_i64(dst, src3, t);
+}
+
+static void gen_op_fpmaddxhi(TCGv_i64 dst, TCGv_i64 src1,
+ TCGv_i64 src2, TCGv_i64 src3)
+{
+ TCGv_i64 l = tcg_temp_new_i64();
+ TCGv_i64 h = tcg_temp_new_i64();
+ TCGv_i64 z = tcg_constant_i64(0);
+
+ tcg_gen_mulu2_i64(l, h, src1, src2);
+ tcg_gen_add2_i64(l, dst, l, h, src3, z);
+}
+
static void gen_op_sdiv(TCGv dst, TCGv src1, TCGv src2)
{
#ifdef TARGET_SPARC64
@@ -2377,6 +2397,7 @@ static int extract_qfpreg(DisasContext *dc, int x)
# define avail_FMAF(C) ((C)->def->features & CPU_FEATURE_FMAF)
# define avail_GL(C) ((C)->def->features & CPU_FEATURE_GL)
# define avail_HYPV(C) ((C)->def->features & CPU_FEATURE_HYPV)
+# define avail_IMA(C) ((C)->def->features & CPU_FEATURE_IMA)
# define avail_VIS1(C) ((C)->def->features & CPU_FEATURE_VIS1)
# define avail_VIS2(C) ((C)->def->features & CPU_FEATURE_VIS2)
# define avail_VIS3(C) ((C)->def->features & CPU_FEATURE_VIS3)
@@ -2392,6 +2413,7 @@ static int extract_qfpreg(DisasContext *dc, int x)
# define avail_FMAF(C) false
# define avail_GL(C) false
# define avail_HYPV(C) false
+# define avail_IMA(C) false
# define avail_VIS1(C) false
# define avail_VIS2(C) false
# define avail_VIS3(C) false
@@ -5194,6 +5216,8 @@ TRANS(FMADDd, FMAF, do_dddd, a, gen_op_fmaddd)
TRANS(FMSUBd, FMAF, do_dddd, a, gen_op_fmsubd)
TRANS(FNMSUBd, FMAF, do_dddd, a, gen_op_fnmsubd)
TRANS(FNMADDd, FMAF, do_dddd, a, gen_op_fnmaddd)
+TRANS(FPMADDX, IMA, do_dddd, a, gen_op_fpmaddx)
+TRANS(FPMADDXHI, IMA, do_dddd, a, gen_op_fpmaddxhi)
static bool do_env_qqq(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i128, TCGv_env, TCGv_i128, TCGv_i128))
diff --git a/target/sparc/cpu-feature.h.inc b/target/sparc/cpu-feature.h.inc
index 3913fb4a54..e2e6de9144 100644
--- a/target/sparc/cpu-feature.h.inc
+++ b/target/sparc/cpu-feature.h.inc
@@ -14,3 +14,4 @@ FEATURE(POWERDOWN)
FEATURE(CASA)
FEATURE(FMAF)
FEATURE(VIS3)
+FEATURE(IMA)
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 54ba329440..56a82123a9 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -525,6 +525,9 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
FNMSUBd 10 ..... 110111 ..... ..... 1010 ..... @d_d_d_d
FNMADDs 10 ..... 110111 ..... ..... 1101 ..... @r_r_r_r
FNMADDd 10 ..... 110111 ..... ..... 1110 ..... @d_d_d_d
+
+ FPMADDX 10 ..... 110111 ..... ..... 0000 ..... @d_d_d_d
+ FPMADDXHI 10 ..... 110111 ..... ..... 0100 ..... @d_d_d_d
]
NCP 10 ----- 110111 ----- --------- ----- # v8 CPop2
}
--
2.34.1
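
A plain C model of the two integer multiply-add operations as the TCG sequences above implement them: FPMADDX is the low 64 bits of src1 * src2 + src3, and FPMADDXHI is the high 64 bits of the same unsigned 128-bit sum. The function names are made up for this note, and the 32-bit split is only there to avoid depending on a 128-bit integer type.

```c
#include <stdint.h>

/* High 64 bits of the unsigned product a * b, via 32-bit partial products. */
static inline uint64_t mulhi_u64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    return p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

/* Low half: ordinary wrapping arithmetic is enough. */
static inline uint64_t fpmaddx_model(uint64_t a, uint64_t b, uint64_t c)
{
    return a * b + c;
}

/* High half: add the carry out of the low-half addition.  This cannot
 * overflow, since (2^64 - 1)^2 + (2^64 - 1) is below 2^128. */
static inline uint64_t fpmaddxhi_model(uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t lo = a * b;
    uint64_t sum = lo + c;

    return mulhi_u64(a, b) + (sum < lo);
}
```

The carry test (sum < lo) plays the role of the zero-extended high word passed to tcg_gen_add2_i64 in gen_op_fpmaddxhi.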
* [PATCH 33/41] target/sparc: Add feature bit for VIS4
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (31 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 32/41] target/sparc: Implement IMA extension Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-05-10 17:05 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 34/41] target/sparc: Implement FALIGNDATAi Richard Henderson
` (9 subsequent siblings)
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 2 ++
target/sparc/cpu-feature.h.inc | 1 +
2 files changed, 3 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 56ee3927af..77b53cbf3b 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -2402,6 +2402,7 @@ static int extract_qfpreg(DisasContext *dc, int x)
# define avail_VIS2(C) ((C)->def->features & CPU_FEATURE_VIS2)
# define avail_VIS3(C) ((C)->def->features & CPU_FEATURE_VIS3)
# define avail_VIS3B(C) avail_VIS3(C)
+# define avail_VIS4(C) ((C)->def->features & CPU_FEATURE_VIS4)
#else
# define avail_32(C) true
# define avail_ASR17(C) ((C)->def->features & CPU_FEATURE_ASR17)
@@ -2418,6 +2419,7 @@ static int extract_qfpreg(DisasContext *dc, int x)
# define avail_VIS2(C) false
# define avail_VIS3(C) false
# define avail_VIS3B(C) false
+# define avail_VIS4(C) false
#endif
/* Default case for non jump instructions. */
diff --git a/target/sparc/cpu-feature.h.inc b/target/sparc/cpu-feature.h.inc
index e2e6de9144..be81005237 100644
--- a/target/sparc/cpu-feature.h.inc
+++ b/target/sparc/cpu-feature.h.inc
@@ -15,3 +15,4 @@ FEATURE(CASA)
FEATURE(FMAF)
FEATURE(VIS3)
FEATURE(IMA)
+FEATURE(VIS4)
--
2.34.1
* [PATCH 34/41] target/sparc: Implement FALIGNDATAi
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (32 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 33/41] target/sparc: Add feature bit for VIS4 Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-03-02 5:15 ` [PATCH 35/41] target/sparc: Implement 8-bit FPADD, FPADDS, and FPADDUS Richard Henderson
` (8 subsequent siblings)
42 siblings, 0 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 33 ++++++++++++++++++++++++++++++---
target/sparc/insns.decode | 1 +
2 files changed, 31 insertions(+), 3 deletions(-)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 77b53cbf3b..8e67d9023d 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -818,7 +818,8 @@ static void gen_op_fpsubs32s(TCGv_i32 d, TCGv_i32 src1, TCGv_i32 src2)
tcg_gen_movcond_i32(TCG_COND_LT, d, v, z, t, r);
}
-static void gen_op_faligndata(TCGv_i64 dst, TCGv_i64 s1, TCGv_i64 s2)
+static void gen_op_faligndata_i(TCGv_i64 dst, TCGv_i64 s1,
+ TCGv_i64 s2, TCGv gsr)
{
#ifdef TARGET_SPARC64
TCGv t1, t2, shift;
@@ -827,7 +828,7 @@ static void gen_op_faligndata(TCGv_i64 dst, TCGv_i64 s1, TCGv_i64 s2)
t2 = tcg_temp_new();
shift = tcg_temp_new();
- tcg_gen_andi_tl(shift, cpu_gsr, 7);
+ tcg_gen_andi_tl(shift, gsr, 7);
tcg_gen_shli_tl(shift, shift, 3);
tcg_gen_shl_tl(t1, s1, shift);
@@ -845,6 +846,11 @@ static void gen_op_faligndata(TCGv_i64 dst, TCGv_i64 s1, TCGv_i64 s2)
#endif
}
+static void gen_op_faligndata_g(TCGv_i64 dst, TCGv_i64 s1, TCGv_i64 s2)
+{
+ gen_op_faligndata_i(dst, s1, s2, cpu_gsr);
+}
+
static void gen_op_bshuffle(TCGv_i64 dst, TCGv_i64 src1, TCGv_i64 src2)
{
#ifdef TARGET_SPARC64
@@ -5060,7 +5066,7 @@ TRANS(FORNOTd, VIS1, do_ddd, a, tcg_gen_orc_i64)
TRANS(FORd, VIS1, do_ddd, a, tcg_gen_or_i64)
TRANS(FPACK32, VIS1, do_ddd, a, gen_op_fpack32)
-TRANS(FALIGNDATAg, VIS1, do_ddd, a, gen_op_faligndata)
+TRANS(FALIGNDATAg, VIS1, do_ddd, a, gen_op_faligndata_g)
TRANS(BSHUFFLE, VIS2, do_ddd, a, gen_op_bshuffle)
TRANS(FHADDd, VIS3, do_ddd, a, gen_op_fhaddd)
@@ -5221,6 +5227,27 @@ TRANS(FNMADDd, FMAF, do_dddd, a, gen_op_fnmaddd)
TRANS(FPMADDX, IMA, do_dddd, a, gen_op_fpmaddx)
TRANS(FPMADDXHI, IMA, do_dddd, a, gen_op_fpmaddxhi)
+static bool trans_FALIGNDATAi(DisasContext *dc, arg_r_r_r *a)
+{
+ TCGv_i64 dst, src1, src2;
+ TCGv src3;
+
+ if (!avail_VIS4(dc)) {
+ return false;
+ }
+ if (gen_trap_ifnofpu(dc)) {
+ return true;
+ }
+
+ dst = tcg_temp_new_i64();
+ src1 = gen_load_fpr_D(dc, a->rd);
+ src2 = gen_load_fpr_D(dc, a->rs2);
+ src3 = gen_load_gpr(dc, a->rs1);
+ gen_op_faligndata_i(dst, src1, src2, src3);
+ gen_store_fpr_D(dc, a->rd, dst);
+ return advance_pc(dc);
+}
+
static bool do_env_qqq(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i128, TCGv_env, TCGv_i128, TCGv_i128))
{
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 56a82123a9..7833437f6c 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -446,6 +446,7 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
FPMERGE 10 ..... 110110 ..... 0 0100 1011 ..... @d_r_r
BSHUFFLE 10 ..... 110110 ..... 0 0100 1100 ..... @d_d_d
FEXPAND 10 ..... 110110 00000 0 0100 1101 ..... @r_d2
+ FALIGNDATAi 10 ..... 110110 ..... 0 0100 1001 ..... @d_r_d
FSRCd 10 ..... 110110 ..... 0 0111 0100 00000 @d_d1 # FSRC1d
FSRCs 10 ..... 110110 ..... 0 0111 0101 00000 @r_r1 # FSRC1s
--
2.34.1
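
As a cross-check for the TCG sequence, the data alignment can be modelled in plain C. This is a sketch under assumptions, not the code from the patch: faligndata_ref is a hypothetical name, and it assumes the usual FALIGNDATA behaviour of selecting 8 bytes out of the 16-byte concatenation s1:s2, starting at the byte offset held in the low three bits of the alignment operand (GSR for FALIGNDATAg, an integer register for FALIGNDATAi).

```c
#include <stdint.h>

/*
 * Reference model: treat s1:s2 as a 16-byte big-endian string and
 * return the 8 bytes starting at byte offset (align & 7).
 */
static inline uint64_t faligndata_ref(uint64_t s1, uint64_t s2, uint64_t align)
{
    unsigned shift = (unsigned)(align & 7) * 8;

    if (shift == 0) {
        /* A 64-bit shift by 64 is undefined in C, so handle offset 0 here. */
        return s1;
    }
    return (s1 << shift) | (s2 >> (64 - shift));
}
```

With s1 = 0x0011223344556677 and s2 = 0x8899aabbccddeeff, an offset of 3 selects bytes 3 through 10 of the concatenation.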
* [PATCH 35/41] target/sparc: Implement 8-bit FPADD, FPADDS, and FPADDUS
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (33 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 34/41] target/sparc: Implement FALIGNDATAi Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-05-10 16:20 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 36/41] target/sparc: Implement VIS4 comparisons Richard Henderson
` (7 subsequent siblings)
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 11 +++++++++++
target/sparc/insns.decode | 9 +++++++++
2 files changed, 20 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 8e67d9023d..cb5d8c27ae 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -5017,17 +5017,28 @@ static bool do_gvec_ddd(DisasContext *dc, arg_r_r_r *a, MemOp vece,
return advance_pc(dc);
}
+TRANS(FPADD8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_add)
TRANS(FPADD16, VIS1, do_gvec_ddd, a, MO_16, tcg_gen_gvec_add)
TRANS(FPADD32, VIS1, do_gvec_ddd, a, MO_32, tcg_gen_gvec_add)
+
+TRANS(FPSUB8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_sub)
TRANS(FPSUB16, VIS1, do_gvec_ddd, a, MO_16, tcg_gen_gvec_sub)
TRANS(FPSUB32, VIS1, do_gvec_ddd, a, MO_32, tcg_gen_gvec_sub)
+
TRANS(FCHKSM16, VIS3, do_gvec_ddd, a, MO_16, gen_op_fchksm16)
TRANS(FMEAN16, VIS3, do_gvec_ddd, a, MO_16, gen_op_fmean16)
+TRANS(FPADDS8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_ssadd)
TRANS(FPADDS16, VIS3, do_gvec_ddd, a, MO_16, tcg_gen_gvec_ssadd)
TRANS(FPADDS32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_ssadd)
+TRANS(FPADDUS8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_usadd)
+TRANS(FPADDUS16, VIS4, do_gvec_ddd, a, MO_16, tcg_gen_gvec_usadd)
+
+TRANS(FPSUBS8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_sssub)
TRANS(FPSUBS16, VIS3, do_gvec_ddd, a, MO_16, tcg_gen_gvec_sssub)
TRANS(FPSUBS32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_sssub)
+TRANS(FPSUBUS8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_ussub)
+TRANS(FPSUBUS16, VIS4, do_gvec_ddd, a, MO_16, tcg_gen_gvec_ussub)
TRANS(FSLL16, VIS3, do_gvec_ddd, a, MO_16, tcg_gen_gvec_shlv)
TRANS(FSLL32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_shlv)
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 7833437f6c..52bacff126 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -509,6 +509,15 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
MOVdTOx 10 ..... 110110 00000 1 0001 0000 ..... @r_d2
MOVxTOd 10 ..... 110110 00000 1 0001 1000 ..... @d_r2
+ FPADD8 10 ..... 110110 ..... 1 0010 0100 ..... @d_d_d
+ FPADDS8 10 ..... 110110 ..... 1 0010 0110 ..... @d_d_d
+ FPADDUS8 10 ..... 110110 ..... 1 0010 0111 ..... @d_d_d
+ FPADDUS16 10 ..... 110110 ..... 1 0010 0011 ..... @d_d_d
+ FPSUB8 10 ..... 110110 ..... 1 0101 0100 ..... @d_d_d
+ FPSUBS8 10 ..... 110110 ..... 1 0101 0110 ..... @d_d_d
+ FPSUBUS8 10 ..... 110110 ..... 1 0101 0111 ..... @d_d_d
+ FPSUBUS16 10 ..... 110110 ..... 1 0101 0011 ..... @d_d_d
+
FLCMPs 10 000 cc:2 110110 rs1:5 1 0101 0001 rs2:5
FLCMPd 10 000 cc:2 110110 ..... 1 0101 0010 ..... \
rs1=%dfp_rs1 rs2=%dfp_rs2
--
2.34.1
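
The difference between the wrapping, signed-saturating and unsigned-saturating variants is easiest to see in a scalar model. The sketch below is not taken from the patch, which delegates to the generic gvec expanders; fpaddus8_ref and fpadds8_ref are hypothetical names, and lane i is taken to be bits 8*i through 8*i+7 of the 64-bit register.

```c
#include <stdint.h>

/* FPADDUS8-style: eight unsigned byte lanes, each sum clamped to 0xff. */
static inline uint64_t fpaddus8_ref(uint64_t a, uint64_t b)
{
    uint64_t r = 0;

    for (int i = 0; i < 64; i += 8) {
        unsigned sum = (unsigned)((a >> i) & 0xff) + (unsigned)((b >> i) & 0xff);

        if (sum > 0xff) {
            sum = 0xff;
        }
        r |= (uint64_t)sum << i;
    }
    return r;
}

/* FPADDS8-style: eight signed byte lanes, each sum clamped to [-128, 127]. */
static inline uint64_t fpadds8_ref(uint64_t a, uint64_t b)
{
    uint64_t r = 0;

    for (int i = 0; i < 64; i += 8) {
        /* Narrowing to int8_t reinterprets the low byte as two's complement. */
        int sum = (int8_t)(a >> i) + (int8_t)(b >> i);

        if (sum > INT8_MAX) {
            sum = INT8_MAX;
        }
        if (sum < INT8_MIN) {
            sum = INT8_MIN;
        }
        r |= (uint64_t)(uint8_t)sum << i;
    }
    return r;
}
```

The same input bytes can therefore give different results: 0x7f + 0x01 stays at 0x7f in the signed form but becomes 0x80 in the unsigned form.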
* [PATCH 36/41] target/sparc: Implement VIS4 comparisons
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (34 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 35/41] target/sparc: Implement 8-bit FPADD, FPADDS, and FPADDUS Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-03-02 5:15 ` [PATCH 37/41] target/sparc: Implement FPMIN, FPMAX Richard Henderson
` (6 subsequent siblings)
42 siblings, 0 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
VIS4 completes the set, adding the missing signed 8-bit comparisons
and the missing unsigned 16-bit and 32-bit comparisons.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/helper.h | 12 +--
target/sparc/translate.c | 12 +++
target/sparc/vis_helper.c | 170 +++++++++++++++++++++++++++++---------
target/sparc/insns.decode | 6 ++
4 files changed, 153 insertions(+), 47 deletions(-)
diff --git a/target/sparc/helper.h b/target/sparc/helper.h
index 9b642fd74b..15ed0a6af3 100644
--- a/target/sparc/helper.h
+++ b/target/sparc/helper.h
@@ -117,19 +117,19 @@ DEF_HELPER_FLAGS_2(fchksm16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fmean16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fslas16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(fslas32, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-#define VIS_CMPHELPER(name) \
+#define VIS_CMPHELPER(name) \
+ DEF_HELPER_FLAGS_2(f##name##8, TCG_CALL_NO_RWG_SE, \
+ i64, i64, i64) \
DEF_HELPER_FLAGS_2(f##name##16, TCG_CALL_NO_RWG_SE, \
- i64, i64, i64) \
+ i64, i64, i64) \
DEF_HELPER_FLAGS_2(f##name##32, TCG_CALL_NO_RWG_SE, \
i64, i64, i64)
VIS_CMPHELPER(cmpgt)
VIS_CMPHELPER(cmpeq)
VIS_CMPHELPER(cmple)
VIS_CMPHELPER(cmpne)
-DEF_HELPER_FLAGS_2(fcmpeq8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-DEF_HELPER_FLAGS_2(fcmpne8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-DEF_HELPER_FLAGS_2(fcmpule8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-DEF_HELPER_FLAGS_2(fcmpugt8, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+VIS_CMPHELPER(cmpugt)
+VIS_CMPHELPER(cmpule)
DEF_HELPER_FLAGS_2(xmulx, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(xmulxhi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
#endif
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index cb5d8c27ae..5f1982cecc 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -68,15 +68,21 @@
# define gen_helper_fcmpeq8 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpeq16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpeq32 ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_fcmpgt8 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpgt16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpgt32 ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_fcmple8 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmple16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmple32 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpne8 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpne16 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpne32 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpule8 ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_fcmpule16 ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_fcmpule32 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fcmpugt8 ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_fcmpugt16 ({ qemu_build_not_reached(); NULL; })
+# define gen_helper_fcmpugt32 ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fdtox ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fexpand ({ qemu_build_not_reached(); NULL; })
# define gen_helper_fmul8sux16 ({ qemu_build_not_reached(); NULL; })
@@ -5112,16 +5118,22 @@ TRANS(FPCMPLE16, VIS1, do_rdd, a, gen_helper_fcmple16)
TRANS(FPCMPNE16, VIS1, do_rdd, a, gen_helper_fcmpne16)
TRANS(FPCMPGT16, VIS1, do_rdd, a, gen_helper_fcmpgt16)
TRANS(FPCMPEQ16, VIS1, do_rdd, a, gen_helper_fcmpeq16)
+TRANS(FPCMPULE16, VIS4, do_rdd, a, gen_helper_fcmpule16)
+TRANS(FPCMPUGT16, VIS4, do_rdd, a, gen_helper_fcmpugt16)
TRANS(FPCMPLE32, VIS1, do_rdd, a, gen_helper_fcmple32)
TRANS(FPCMPNE32, VIS1, do_rdd, a, gen_helper_fcmpne32)
TRANS(FPCMPGT32, VIS1, do_rdd, a, gen_helper_fcmpgt32)
TRANS(FPCMPEQ32, VIS1, do_rdd, a, gen_helper_fcmpeq32)
+TRANS(FPCMPULE32, VIS4, do_rdd, a, gen_helper_fcmpule32)
+TRANS(FPCMPUGT32, VIS4, do_rdd, a, gen_helper_fcmpugt32)
TRANS(FPCMPEQ8, VIS3B, do_rdd, a, gen_helper_fcmpeq8)
TRANS(FPCMPNE8, VIS3B, do_rdd, a, gen_helper_fcmpne8)
TRANS(FPCMPULE8, VIS3B, do_rdd, a, gen_helper_fcmpule8)
TRANS(FPCMPUGT8, VIS3B, do_rdd, a, gen_helper_fcmpugt8)
+TRANS(FPCMPLE8, VIS4, do_rdd, a, gen_helper_fcmple8)
+TRANS(FPCMPGT8, VIS4, do_rdd, a, gen_helper_fcmpgt8)
TRANS(PDISTN, VIS3, do_rdd, a, gen_op_pdistn)
TRANS(XMULX, VIS3, do_rdd, a, gen_helper_xmulx)
diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
index 387acb3855..c05f3e7b30 100644
--- a/target/sparc/vis_helper.c
+++ b/target/sparc/vis_helper.c
@@ -49,6 +49,7 @@ target_ulong helper_array8(target_ulong pixel_addr, target_ulong cubesize)
#define VIS_W64(n) w[3 - (n)]
#define VIS_SW64(n) sw[3 - (n)]
#define VIS_L64(n) l[1 - (n)]
+#define VIS_SL64(n) sl[1 - (n)]
#define VIS_B32(n) b[3 - (n)]
#define VIS_W32(n) w[1 - (n)]
#else
@@ -57,6 +58,7 @@ target_ulong helper_array8(target_ulong pixel_addr, target_ulong cubesize)
#define VIS_W64(n) w[n]
#define VIS_SW64(n) sw[n]
#define VIS_L64(n) l[n]
+#define VIS_SL64(n) sl[n]
#define VIS_B32(n) b[n]
#define VIS_W32(n) w[n]
#endif
@@ -67,6 +69,7 @@ typedef union {
uint16_t w[4];
int16_t sw[4];
uint32_t l[2];
+ int32_t sl[2];
uint64_t ll;
float64 d;
} VIS64;
@@ -181,47 +184,6 @@ uint64_t helper_fexpand(uint32_t src2)
return d.ll;
}
-#define VIS_CMPHELPER(name, F) \
- uint64_t name##16(uint64_t src1, uint64_t src2) \
- { \
- VIS64 s, d; \
- \
- s.ll = src1; \
- d.ll = src2; \
- \
- d.VIS_W64(0) = F(s.VIS_W64(0), d.VIS_W64(0)) ? 1 : 0; \
- d.VIS_W64(0) |= F(s.VIS_W64(1), d.VIS_W64(1)) ? 2 : 0; \
- d.VIS_W64(0) |= F(s.VIS_W64(2), d.VIS_W64(2)) ? 4 : 0; \
- d.VIS_W64(0) |= F(s.VIS_W64(3), d.VIS_W64(3)) ? 8 : 0; \
- d.VIS_W64(1) = d.VIS_W64(2) = d.VIS_W64(3) = 0; \
- \
- return d.ll; \
- } \
- \
- uint64_t name##32(uint64_t src1, uint64_t src2) \
- { \
- VIS64 s, d; \
- \
- s.ll = src1; \
- d.ll = src2; \
- \
- d.VIS_L64(0) = F(s.VIS_L64(0), d.VIS_L64(0)) ? 1 : 0; \
- d.VIS_L64(0) |= F(s.VIS_L64(1), d.VIS_L64(1)) ? 2 : 0; \
- d.VIS_L64(1) = 0; \
- \
- return d.ll; \
- }
-
-#define FCMPGT(a, b) ((a) > (b))
-#define FCMPEQ(a, b) ((a) == (b))
-#define FCMPLE(a, b) ((a) <= (b))
-#define FCMPNE(a, b) ((a) != (b))
-
-VIS_CMPHELPER(helper_fcmpgt, FCMPGT)
-VIS_CMPHELPER(helper_fcmpeq, FCMPEQ)
-VIS_CMPHELPER(helper_fcmple, FCMPLE)
-VIS_CMPHELPER(helper_fcmpne, FCMPNE)
-
uint64_t helper_fcmpeq8(uint64_t src1, uint64_t src2)
{
uint64_t a = src1 ^ src2;
@@ -243,6 +205,25 @@ uint64_t helper_fcmpne8(uint64_t src1, uint64_t src2)
return helper_fcmpeq8(src1, src2) ^ 0xff;
}
+uint64_t helper_fcmple8(uint64_t src1, uint64_t src2)
+{
+ VIS64 s1, s2;
+ uint64_t r = 0;
+
+ s1.ll = src1;
+ s2.ll = src2;
+
+ for (int i = 0; i < 8; ++i) {
+ r |= (s1.VIS_SB64(i) <= s2.VIS_SB64(i)) << i;
+ }
+ return r;
+}
+
+uint64_t helper_fcmpgt8(uint64_t src1, uint64_t src2)
+{
+ return helper_fcmple8(src1, src2) ^ 0xff;
+}
+
uint64_t helper_fcmpule8(uint64_t src1, uint64_t src2)
{
VIS64 s1, s2;
@@ -262,6 +243,113 @@ uint64_t helper_fcmpugt8(uint64_t src1, uint64_t src2)
return helper_fcmpule8(src1, src2) ^ 0xff;
}
+uint64_t helper_fcmpeq16(uint64_t src1, uint64_t src2)
+{
+ uint64_t a = src1 ^ src2;
+ uint64_t m = 0x7fff7fff7fff7fffULL;
+ uint64_t c = ~(((a & m) + m) | a | m);
+
+ /* a...............b...............c...............d............... */
+ c |= c << 15;
+ /* ab..............bc..............cd..............d............... */
+ c |= c << 30;
+ /* abcd............bcd.............cd..............d............... */
+ return c >> 60;
+}
+
+uint64_t helper_fcmpne16(uint64_t src1, uint64_t src2)
+{
+ return helper_fcmpeq16(src1, src2) ^ 0xf;
+}
+
+uint64_t helper_fcmple16(uint64_t src1, uint64_t src2)
+{
+ VIS64 s1, s2;
+ uint64_t r = 0;
+
+ s1.ll = src1;
+ s2.ll = src2;
+
+ for (int i = 0; i < 4; ++i) {
+ r |= (s1.VIS_SW64(i) <= s2.VIS_SW64(i)) << i;
+ }
+ return r;
+}
+
+uint64_t helper_fcmpgt16(uint64_t src1, uint64_t src2)
+{
+ return helper_fcmple16(src1, src2) ^ 0xf;
+}
+
+uint64_t helper_fcmpule16(uint64_t src1, uint64_t src2)
+{
+ VIS64 s1, s2;
+ uint64_t r = 0;
+
+ s1.ll = src1;
+ s2.ll = src2;
+
+ for (int i = 0; i < 4; ++i) {
+ r |= (s1.VIS_W64(i) <= s2.VIS_W64(i)) << i;
+ }
+ return r;
+}
+
+uint64_t helper_fcmpugt16(uint64_t src1, uint64_t src2)
+{
+ return helper_fcmpule16(src1, src2) ^ 0xf;
+}
+
+uint64_t helper_fcmpeq32(uint64_t src1, uint64_t src2)
+{
+ uint64_t a = src1 ^ src2;
+ return ((uint32_t)a == 0) | (a >> 32 ? 0 : 2);
+}
+
+uint64_t helper_fcmpne32(uint64_t src1, uint64_t src2)
+{
+ uint64_t a = src1 ^ src2;
+ return ((uint32_t)a != 0) | (a >> 32 ? 2 : 0);
+}
+
+uint64_t helper_fcmple32(uint64_t src1, uint64_t src2)
+{
+ VIS64 s1, s2;
+ uint64_t r = 0;
+
+ s1.ll = src1;
+ s2.ll = src2;
+
+ for (int i = 0; i < 2; ++i) {
+ r |= (s1.VIS_SL64(i) <= s2.VIS_SL64(i)) << i;
+ }
+ return r;
+}
+
+uint64_t helper_fcmpgt32(uint64_t src1, uint64_t src2)
+{
+ return helper_fcmple32(src1, src2) ^ 3;
+}
+
+uint64_t helper_fcmpule32(uint64_t src1, uint64_t src2)
+{
+ VIS64 s1, s2;
+ uint64_t r = 0;
+
+ s1.ll = src1;
+ s2.ll = src2;
+
+ for (int i = 0; i < 2; ++i) {
+ r |= (s1.VIS_L64(i) <= s2.VIS_L64(i)) << i;
+ }
+ return r;
+}
+
+uint64_t helper_fcmpugt32(uint64_t src1, uint64_t src2)
+{
+ return helper_fcmpule32(src1, src2) ^ 3;
+}
+
uint64_t helper_pdist(uint64_t sum, uint64_t src1, uint64_t src2)
{
int i;
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 52bacff126..3ff8708304 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -425,6 +425,12 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
FPCMPUGT8 10 ..... 110110 ..... 1 0010 1000 ..... @r_d_d
FPCMPEQ8 10 ..... 110110 ..... 1 0010 0010 ..... @r_d_d
FPCMPNE8 10 ..... 110110 ..... 1 0010 1010 ..... @r_d_d
+ FPCMPLE8 10 ..... 110110 ..... 0 0011 0100 ..... @r_d_d
+ FPCMPGT8 10 ..... 110110 ..... 0 0011 1100 ..... @r_d_d
+ FPCMPULE16 10 ..... 110110 ..... 1 0010 1110 ..... @r_d_d
+ FPCMPUGT16 10 ..... 110110 ..... 1 0010 1011 ..... @r_d_d
+ FPCMPULE32 10 ..... 110110 ..... 1 0010 1111 ..... @r_d_d
+ FPCMPUGT32 10 ..... 110110 ..... 1 0010 1100 ..... @r_d_d
FMUL8x16 10 ..... 110110 ..... 0 0011 0001 ..... @d_r_d
FMUL8x16AU 10 ..... 110110 ..... 0 0011 0011 ..... @d_r_r
--
2.34.1
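
The bit manipulation in helper_fcmpeq16 can be checked against a plain per-lane loop. In the sketch below, cmpeq16_swar repeats the arithmetic from the patch, while cmpeq16_loop, le16_signed and le16_unsigned are hypothetical helpers added only for comparison; lane i is taken to be bits 16*i through 16*i+15. The trick works because ((a & m) + m) sets bit 15 of a lane whenever any of its low 15 bits is set, OR-ing in a covers bit 15 itself, and the complement leaves bit 15 set only for lanes that are entirely zero in src1 ^ src2.

```c
#include <stdint.h>

/* Straightforward model: bit i of the result is set when lane i matches. */
static inline uint64_t cmpeq16_loop(uint64_t src1, uint64_t src2)
{
    uint64_t r = 0;

    for (int i = 0; i < 4; i++) {
        uint16_t x = (uint16_t)(src1 >> (16 * i));
        uint16_t y = (uint16_t)(src2 >> (16 * i));

        r |= (uint64_t)(x == y) << i;
    }
    return r;
}

/* Same arithmetic as helper_fcmpeq16 in the patch, copied for comparison. */
static inline uint64_t cmpeq16_swar(uint64_t src1, uint64_t src2)
{
    uint64_t a = src1 ^ src2;
    uint64_t m = 0x7fff7fff7fff7fffULL;
    uint64_t c = ~(((a & m) + m) | a | m);  /* bit 15 of each equal lane */

    c |= c << 15;                           /* gather the four flag bits ... */
    c |= c << 30;                           /* ... into bits 63..60 */
    return c >> 60;
}

/* Signed and unsigned <= differ once the top bit of a lane is set. */
static inline int le16_signed(uint16_t x, uint16_t y)
{
    return (int16_t)x <= (int16_t)y;
}

static inline int le16_unsigned(uint16_t x, uint16_t y)
{
    return x <= y;
}
```

The last two helpers show why the signed and unsigned comparisons need separate instructions: 0xffff is less than 1 as a signed value but greater than 1 as an unsigned one.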
* [PATCH 37/41] target/sparc: Implement FPMIN, FPMAX
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (35 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 36/41] target/sparc: Implement VIS4 comparisons Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-05-10 17:11 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 38/41] target/sparc: Implement SUBXC, SUBXCcc Richard Henderson
` (5 subsequent siblings)
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 14 ++++++++++++++
target/sparc/insns.decode | 14 ++++++++++++++
2 files changed, 28 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 5f1982cecc..8eda190233 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -5053,6 +5053,20 @@ TRANS(FSRL32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_shrv)
TRANS(FSRA16, VIS3, do_gvec_ddd, a, MO_16, tcg_gen_gvec_sarv)
TRANS(FSRA32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_sarv)
+TRANS(FPMIN8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_smin)
+TRANS(FPMIN16, VIS4, do_gvec_ddd, a, MO_16, tcg_gen_gvec_smin)
+TRANS(FPMIN32, VIS4, do_gvec_ddd, a, MO_32, tcg_gen_gvec_smin)
+TRANS(FPMINU8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_umin)
+TRANS(FPMINU16, VIS4, do_gvec_ddd, a, MO_16, tcg_gen_gvec_umin)
+TRANS(FPMINU32, VIS4, do_gvec_ddd, a, MO_32, tcg_gen_gvec_umin)
+
+TRANS(FPMAX8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_smax)
+TRANS(FPMAX16, VIS4, do_gvec_ddd, a, MO_16, tcg_gen_gvec_smax)
+TRANS(FPMAX32, VIS4, do_gvec_ddd, a, MO_32, tcg_gen_gvec_smax)
+TRANS(FPMAXU8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_umax)
+TRANS(FPMAXU16, VIS4, do_gvec_ddd, a, MO_16, tcg_gen_gvec_umax)
+TRANS(FPMAXU32, VIS4, do_gvec_ddd, a, MO_32, tcg_gen_gvec_umax)
+
static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
void (*func)(TCGv_i64, TCGv_i64, TCGv_i64))
{
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 3ff8708304..b7b4bfe92c 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -524,6 +524,20 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
FPSUBUS8 10 ..... 110110 ..... 1 0101 0111 ..... @d_d_d
FPSUBUS16 10 ..... 110110 ..... 1 0101 0011 ..... @d_d_d
+ FPMIN8 10 ..... 110110 ..... 1 0001 1010 ..... @d_d_d
+ FPMIN16 10 ..... 110110 ..... 1 0001 1011 ..... @d_d_d
+ FPMIN32 10 ..... 110110 ..... 1 0001 1100 ..... @d_d_d
+ FPMINU8 10 ..... 110110 ..... 1 0101 1010 ..... @d_d_d
+ FPMINU16 10 ..... 110110 ..... 1 0101 1011 ..... @d_d_d
+ FPMINU32 10 ..... 110110 ..... 1 0101 1100 ..... @d_d_d
+
+ FPMAX8 10 ..... 110110 ..... 1 0001 1101 ..... @d_d_d
+ FPMAX16 10 ..... 110110 ..... 1 0001 1110 ..... @d_d_d
+ FPMAX32 10 ..... 110110 ..... 1 0001 1111 ..... @d_d_d
+ FPMAXU8 10 ..... 110110 ..... 1 0101 1101 ..... @d_d_d
+ FPMAXU16 10 ..... 110110 ..... 1 0101 1110 ..... @d_d_d
+ FPMAXU32 10 ..... 110110 ..... 1 0101 1111 ..... @d_d_d
+
FLCMPs 10 000 cc:2 110110 rs1:5 1 0101 0001 rs2:5
FLCMPd 10 000 cc:2 110110 ..... 1 0101 0010 ..... \
rs1=%dfp_rs1 rs2=%dfp_rs2
--
2.34.1
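
For anyone wanting to write a test against these, a scalar model of the 16-bit minimum is enough to show how the signed and unsigned forms diverge. This is a sketch only; fpmin16_ref and fpminu16_ref are hypothetical names, and the patch itself relies on the generic gvec smin/umin expanders rather than anything like this loop.

```c
#include <stdint.h>

/* FPMIN16-style: four signed 16-bit lanes. */
static inline uint64_t fpmin16_ref(uint64_t a, uint64_t b)
{
    uint64_t r = 0;

    for (int i = 0; i < 64; i += 16) {
        int16_t x = (int16_t)(a >> i);
        int16_t y = (int16_t)(b >> i);

        r |= (uint64_t)(uint16_t)(x < y ? x : y) << i;
    }
    return r;
}

/* FPMINU16-style: four unsigned 16-bit lanes. */
static inline uint64_t fpminu16_ref(uint64_t a, uint64_t b)
{
    uint64_t r = 0;

    for (int i = 0; i < 64; i += 16) {
        uint16_t x = (uint16_t)(a >> i);
        uint16_t y = (uint16_t)(b >> i);

        r |= (uint64_t)(x < y ? x : y) << i;
    }
    return r;
}
```

Lanes with the top bit set are the interesting inputs, since they are negative for the signed form and large for the unsigned form.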
* [PATCH 38/41] target/sparc: Implement SUBXC, SUBXCcc
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (36 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 37/41] target/sparc: Implement FPMIN, FPMAX Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-05-10 16:21 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 39/41] target/sparc: Implement MWAIT Richard Henderson
` (4 subsequent siblings)
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 14 ++++++++++++++
target/sparc/insns.decode | 2 ++
2 files changed, 16 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 8eda190233..4775e39240 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -524,6 +524,17 @@ static void gen_op_subccc(TCGv dst, TCGv src1, TCGv src2)
gen_op_subcc_int(dst, src1, src2, gen_carry32());
}
+static void gen_op_subxc(TCGv dst, TCGv src1, TCGv src2)
+{
+ tcg_gen_sub_tl(dst, src1, src2);
+ tcg_gen_sub_tl(dst, dst, cpu_cc_C);
+}
+
+static void gen_op_subxccc(TCGv dst, TCGv src1, TCGv src2)
+{
+ gen_op_subcc_int(dst, src1, src2, cpu_cc_C);
+}
+
static void gen_op_mulscc(TCGv dst, TCGv src1, TCGv src2)
{
TCGv zero = tcg_constant_tl(0);
@@ -3959,6 +3970,9 @@ TRANS(ARRAY32, VIS1, do_rrr, a, gen_op_array32)
TRANS(ADDXC, VIS3, do_rrr, a, gen_op_addxc)
TRANS(ADDXCcc, VIS3, do_rrr, a, gen_op_addxccc)
+TRANS(SUBXC, VIS4, do_rrr, a, gen_op_subxc)
+TRANS(SUBXCcc, VIS4, do_rrr, a, gen_op_subxccc)
+
TRANS(UMULXHI, VIS3, do_rrr, a, gen_op_umulxhi)
static void gen_op_alignaddr(TCGv dst, TCGv s1, TCGv s2)
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index b7b4bfe92c..1f9e07e526 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -447,6 +447,8 @@ FCMPEq 10 000 cc:2 110101 ..... 0 0101 0111 ..... \
PDISTN 10 ..... 110110 ..... 0 0011 1111 ..... @r_d_d
FMEAN16 10 ..... 110110 ..... 0 0100 0000 ..... @d_d_d
+ SUBXC 10 ..... 110110 ..... 0 0100 0001 ..... @r_r_r
+ SUBXCcc 10 ..... 110110 ..... 0 0100 0011 ..... @r_r_r
FCHKSM16 10 ..... 110110 ..... 0 0100 0100 ..... @d_d_d
FALIGNDATAg 10 ..... 110110 ..... 0 0100 1000 ..... @d_d_d
FPMERGE 10 ..... 110110 ..... 0 0100 1011 ..... @d_r_r
--
2.34.1
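
The arithmetic is simple enough to model directly. The sketch below assumes that the carry consumed by SUBXC is a single 0/1 value (the 64-bit carry, as cpu_cc_C is used in the diff) and models only that carry, not the other condition codes; subxc_ref is a hypothetical name.

```c
#include <stdint.h>

/*
 * Reference model: dst = src1 - src2 - carry_in.
 * If carry_out is non-NULL it receives the borrow out of bit 63,
 * which is what the cc-setting form would leave in the carry flag.
 */
static inline uint64_t subxc_ref(uint64_t src1, uint64_t src2,
                                 unsigned carry_in, unsigned *carry_out)
{
    carry_in &= 1;
    if (carry_out) {
        /* Borrow occurs when src1 < src2 + carry_in, computed without overflow. */
        *carry_out = (src1 < src2) || (src1 == src2 && carry_in);
    }
    return src1 - src2 - carry_in;
}
```

Chaining two calls, low word first, gives a 128-bit subtraction, which is the typical use of an extended subtract.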
* [PATCH 39/41] target/sparc: Implement MWAIT
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (37 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 38/41] target/sparc: Implement SUBXC, SUBXCcc Richard Henderson
@ 2024-03-02 5:15 ` Richard Henderson
2024-03-02 5:16 ` [PATCH 40/41] target/sparc: Implement monitor asis Richard Henderson
` (3 subsequent siblings)
42 siblings, 0 replies; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:15 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/translate.c | 11 +++++++++++
target/sparc/insns.decode | 1 +
2 files changed, 12 insertions(+)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 4775e39240..5694420a93 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -3316,6 +3316,17 @@ static void do_wrpowerdown(DisasContext *dc, TCGv src)
TRANS(WRPOWERDOWN, POWERDOWN, do_wr_special, a, supervisor(dc), do_wrpowerdown)
+static void do_wrmwait(DisasContext *dc, TCGv src)
+{
+ /*
+ * TODO: This is a stub version of mwait, which merely recognizes
+ * interrupts immediately and does not wait.
+ */
+ dc->base.is_jmp = DISAS_EXIT;
+}
+
+TRANS(WRMWAIT, VIS4, do_wr_special, a, true, do_wrmwait)
+
static void do_wrpsr(DisasContext *dc, TCGv src)
{
gen_helper_wrpsr(tcg_env, src);
diff --git a/target/sparc/insns.decode b/target/sparc/insns.decode
index 1f9e07e526..2927116031 100644
--- a/target/sparc/insns.decode
+++ b/target/sparc/insns.decode
@@ -124,6 +124,7 @@ CALL 01 i:s30
WRTICK_CMPR 10 10111 110000 ..... . ............. @n_r_ri
WRSTICK 10 11000 110000 ..... . ............. @n_r_ri
WRSTICK_CMPR 10 11001 110000 ..... . ............. @n_r_ri
+ WRMWAIT 10 11100 110000 ..... . ............. @n_r_ri
]
# Before v8, rs1==0 was WRY, and the rest executed as nop.
[
--
2.34.1
* [PATCH 40/41] target/sparc: Implement monitor asis
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (38 preceding siblings ...)
2024-03-02 5:15 ` [PATCH 39/41] target/sparc: Implement MWAIT Richard Henderson
@ 2024-03-02 5:16 ` Richard Henderson
2024-05-10 17:04 ` Philippe Mathieu-Daudé
2024-03-02 5:16 ` [PATCH 41/41] target/sparc: Enable VIS4 feature bit Richard Henderson
` (2 subsequent siblings)
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:16 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Ignore the "monitor" portion of these ASIs and treat them the same
as their base ASIs.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/asi.h | 4 ++++
target/sparc/ldst_helper.c | 4 ++++
target/sparc/translate.c | 8 ++++++++
3 files changed, 16 insertions(+)
diff --git a/target/sparc/asi.h b/target/sparc/asi.h
index a66829674b..14ffaa3842 100644
--- a/target/sparc/asi.h
+++ b/target/sparc/asi.h
@@ -144,6 +144,8 @@
* ASIs, "(4V)" designates SUN4V specific ASIs. "(NG4)" designates SPARC-T4
* and later ASIs.
*/
+#define ASI_MON_AIUP 0x12 /* (VIS4) Primary, user, monitor */
+#define ASI_MON_AIUS 0x13 /* (VIS4) Secondary, user, monitor */
#define ASI_REAL 0x14 /* Real address, cacheable */
#define ASI_PHYS_USE_EC 0x14 /* PADDR, E-cacheable */
#define ASI_REAL_IO 0x15 /* Real address, non-cacheable */
@@ -257,6 +259,8 @@
#define ASI_UDBL_CONTROL_R 0x7f /* External UDB control regs rd low*/
#define ASI_INTR_R 0x7f /* IRQ vector dispatch read */
#define ASI_INTR_DATAN_R 0x7f /* (III) In irq vector data reg N */
+#define ASI_MON_P 0x84 /* (VIS4) Primary, monitor */
+#define ASI_MON_S 0x85 /* (VIS4) Secondary, monitor */
#define ASI_PIC 0xb0 /* (NG4) PIC registers */
#define ASI_PST8_P 0xc0 /* Primary, 8 8-bit, partial */
#define ASI_PST8_S 0xc1 /* Secondary, 8 8-bit, partial */
diff --git a/target/sparc/ldst_helper.c b/target/sparc/ldst_helper.c
index 1ecd58e8ff..82cf4ba074 100644
--- a/target/sparc/ldst_helper.c
+++ b/target/sparc/ldst_helper.c
@@ -1371,6 +1371,10 @@ uint64_t helper_ld_asi(CPUSPARCState *env, target_ulong addr,
case ASI_TWINX_PL: /* Primary, twinx, LE */
case ASI_TWINX_S: /* Secondary, twinx */
case ASI_TWINX_SL: /* Secondary, twinx, LE */
+ case ASI_MON_P:
+ case ASI_MON_S:
+ case ASI_MON_AIUP:
+ case ASI_MON_AIUS:
/* These are always handled inline. */
g_assert_not_reached();
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 5694420a93..15c9d5b59a 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -1622,6 +1622,7 @@ static DisasASI resolve_asi(DisasContext *dc, int asi, MemOp memop)
case ASI_BLK_AIUP_L_4V:
case ASI_BLK_AIUP:
case ASI_BLK_AIUPL:
+ case ASI_MON_AIUP:
mem_idx = MMU_USER_IDX;
break;
case ASI_AIUS: /* As if user secondary */
@@ -1632,6 +1633,7 @@ static DisasASI resolve_asi(DisasContext *dc, int asi, MemOp memop)
case ASI_BLK_AIUS_L_4V:
case ASI_BLK_AIUS:
case ASI_BLK_AIUSL:
+ case ASI_MON_AIUS:
mem_idx = MMU_USER_SECONDARY_IDX;
break;
case ASI_S: /* Secondary */
@@ -1645,6 +1647,7 @@ static DisasASI resolve_asi(DisasContext *dc, int asi, MemOp memop)
case ASI_FL8_SL:
case ASI_FL16_S:
case ASI_FL16_SL:
+ case ASI_MON_S:
if (mem_idx == MMU_USER_IDX) {
mem_idx = MMU_USER_SECONDARY_IDX;
} else if (mem_idx == MMU_KERNEL_IDX) {
@@ -1662,6 +1665,7 @@ static DisasASI resolve_asi(DisasContext *dc, int asi, MemOp memop)
case ASI_FL8_PL:
case ASI_FL16_P:
case ASI_FL16_PL:
+ case ASI_MON_P:
break;
}
switch (asi) {
@@ -1679,6 +1683,10 @@ static DisasASI resolve_asi(DisasContext *dc, int asi, MemOp memop)
case ASI_SL:
case ASI_P:
case ASI_PL:
+ case ASI_MON_P:
+ case ASI_MON_S:
+ case ASI_MON_AIUP:
+ case ASI_MON_AIUS:
type = GET_ASI_DIRECT;
break;
case ASI_TWINX_REAL:
--
2.34.1
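
The effect of the added case labels can be summarised as a mapping from each monitor ASI to the base ASI it now behaves like. The sketch below is illustrative only and is not how the patch is structured: demo_monitor_base_asi and the DEMO_* names are hypothetical, the monitor values are the ones added to asi.h above, and the base values (0x10, 0x11, 0x80, 0x81) are the standard SPARC V9 numbers for AIUP, AIUS, P and S, stated here as an assumption since they do not appear in this diff.

```c
#include <stdint.h>

/* Monitor ASIs, with the values added to asi.h by the patch. */
#define DEMO_ASI_MON_AIUP 0x12
#define DEMO_ASI_MON_AIUS 0x13
#define DEMO_ASI_MON_P    0x84
#define DEMO_ASI_MON_S    0x85

/* Base ASIs; values assumed from SPARC V9, not shown in this diff. */
#define DEMO_ASI_AIUP 0x10
#define DEMO_ASI_AIUS 0x11
#define DEMO_ASI_P    0x80
#define DEMO_ASI_S    0x81

/* Return the base ASI a monitor ASI is treated as; others pass through. */
static inline int demo_monitor_base_asi(int asi)
{
    switch (asi) {
    case DEMO_ASI_MON_AIUP:
        return DEMO_ASI_AIUP;
    case DEMO_ASI_MON_AIUS:
        return DEMO_ASI_AIUS;
    case DEMO_ASI_MON_P:
        return DEMO_ASI_P;
    case DEMO_ASI_MON_S:
        return DEMO_ASI_S;
    default:
        return asi;
    }
}
```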
* [PATCH 41/41] target/sparc: Enable VIS4 feature bit
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (39 preceding siblings ...)
2024-03-02 5:16 ` [PATCH 40/41] target/sparc: Implement monitor asis Richard Henderson
@ 2024-03-02 5:16 ` Richard Henderson
2024-05-10 17:16 ` Philippe Mathieu-Daudé
2024-03-05 10:20 ` [PATCH 00/41] target/sparc: Implement VIS4 Mark Cave-Ayland
2024-04-29 20:52 ` Mark Cave-Ayland
42 siblings, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-03-02 5:16 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, atar4qemu
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/sparc/cpu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index 18dfd90845..1ffac3dd8a 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -554,6 +554,7 @@ static const char * const feature_name[] = {
[CPU_FEATURE_BIT_FMAF] = "fmaf",
[CPU_FEATURE_BIT_VIS3] = "vis3",
[CPU_FEATURE_BIT_IMA] = "ima",
+ [CPU_FEATURE_BIT_VIS4] = "vis4",
#else
[CPU_FEATURE_BIT_MUL] = "mul",
[CPU_FEATURE_BIT_DIV] = "div",
@@ -882,6 +883,8 @@ static Property sparc_cpu_properties[] = {
CPU_FEATURE_BIT_VIS3, false),
DEFINE_PROP_BIT("ima", SPARCCPU, env.def.features,
CPU_FEATURE_BIT_IMA, false),
+ DEFINE_PROP_BIT("vis4", SPARCCPU, env.def.features,
+ CPU_FEATURE_BIT_VIS4, false),
#else
DEFINE_PROP_BIT("mul", SPARCCPU, env.def.features,
CPU_FEATURE_BIT_MUL, false),
--
2.34.1
* Re: [PATCH 00/41] target/sparc: Implement VIS4
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (40 preceding siblings ...)
2024-03-02 5:16 ` [PATCH 41/41] target/sparc: Enable VIS4 feature bit Richard Henderson
@ 2024-03-05 10:20 ` Mark Cave-Ayland
2024-04-29 20:52 ` Mark Cave-Ayland
42 siblings, 0 replies; 65+ messages in thread
From: Mark Cave-Ayland @ 2024-03-05 10:20 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: atar4qemu
On 02/03/2024 05:15, Richard Henderson wrote:
> I whipped this up over the Christmas break, but I'm just now
> getting around to posting. I have not attempted to model the
> newer cpus that have these features, but it is possible to
> enable the features manually via -cpu properties.
>
> Possibly the first 6 or 7 patches should be taken sooner than
> later because they fix bugs in existing VIS[12] code.
>
> I remove cpu_fpr[], so that we can use gvec on the same memory.
>
> r~
Nice! Since all of my SPARC bits and pieces are done with QEMU sun4m/sun4u, I don't
really have much that can test the newer VIS instructions. Do you have a
particular suite that you've been using for testing, or future plans for particular CPUs?
Generally I'm okay for this to be merged if there is a way for the new VIS
instructions to be easily tested, given that they're not currently enabled by default.
> Richard Henderson (41):
> linux-user/sparc: Add more hwcap bits for sparc64
> target/sparc: Fix FEXPAND
> target/sparc: Fix FMUL8x16
> target/sparc: Fix FMUL8x16A{U,L}
> target/sparc: Fix FMULD8*X16
> target/sparc: Fix FPMERGE
> target/sparc: Split out do_ms16b
> target/sparc: Perform DFPREG/QFPREG in decodetree
> target/sparc: Remove gen_dest_fpr_D
> target/sparc: Remove cpu_fpr[]
> target/sparc: Use gvec for VIS1 parallel add/sub
> target/sparc: Implement FMAf extension
> target/sparc: Add feature bits for VIS 3
> target/sparc: Implement ADDXC, ADDXCcc
> target/sparc: Implement CMASK instructions
> target/sparc: Implement FCHKSM16
> target/sparc: Implement FHADD, FHSUB, FNHADD, FNADD
> target/sparc: Implement FNMUL
> target/sparc: Implement FLCMP
> target/sparc: Implement FMEAN16
> target/sparc: Implement FPADD64 FPSUB64
> target/sparc: Implement FPADDS, FPSUBS
> target/sparc: Implement FPCMPEQ8, FPCMPNE8, FPCMPULE8, FPCMPUGT8
> target/sparc: Implement FSLL, FSRL, FSRA, FSLAS
> target/sparc: Implement LDXEFSR
> target/sparc: Implement LZCNT
> target/sparc: Implement MOVsTOw, MOVdTOx, MOVwTOs, MOVxTOd
> target/sparc: Implement PDISTN
> target/sparc: Implement UMULXHI
> target/sparc: Implement XMULX
> target/sparc: Enable VIS3 feature bit
> target/sparc: Implement IMA extension
> target/sparc: Add feature bit for VIS4
> target/sparc: Implement FALIGNDATAi
> target/sparc: Implement 8-bit FPADD, FPADDS, and FPADDUS
> target/sparc: Implement VIS4 comparisons
> target/sparc: Implement FPMIN, FPMAX
> target/sparc: Implement SUBXC, SUBXCcc
> target/sparc: Implement MWAIT
> target/sparc: Implement monitor asis
> target/sparc: Enable VIS4 feature bit
>
> target/sparc/asi.h | 4 +
> target/sparc/helper.h | 36 +-
> linux-user/elfload.c | 51 +-
> target/sparc/cpu.c | 12 +
> target/sparc/fop_helper.c | 104 ++++
> target/sparc/ldst_helper.c | 4 +
> target/sparc/translate.c | 960 +++++++++++++++++++++++++++++----
> target/sparc/vis_helper.c | 526 +++++++++++-------
> target/sparc/cpu-feature.h.inc | 4 +
> target/sparc/insns.decode | 338 +++++++++---
> 10 files changed, 1626 insertions(+), 413 deletions(-)
ATB,
Mark.
* Re: [PATCH 00/41] target/sparc: Implement VIS4
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
` (41 preceding siblings ...)
2024-03-05 10:20 ` [PATCH 00/41] target/sparc: Implement VIS4 Mark Cave-Ayland
@ 2024-04-29 20:52 ` Mark Cave-Ayland
2024-04-29 21:02 ` Richard Henderson
42 siblings, 1 reply; 65+ messages in thread
From: Mark Cave-Ayland @ 2024-04-29 20:52 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: atar4qemu
On 02/03/2024 05:15, Richard Henderson wrote:
> I whipped this up over the Christmas break, but I'm just now
> getting around to posting. I have not attempted to model the
> newer cpus that have these features, but it is possible to
> enable the features manually via -cpu properties.
>
> Possibly the first 6 or 7 patches should be taken sooner than
> later because they fix bugs in existing VIS[12] code.
>
> I remove cpu_fpr[], so that we can use gvec on the same memory.
>
>
> r~
>
>
> Richard Henderson (41):
> linux-user/sparc: Add more hwcap bits for sparc64
> target/sparc: Fix FEXPAND
> target/sparc: Fix FMUL8x16
> target/sparc: Fix FMUL8x16A{U,L}
> target/sparc: Fix FMULD8*X16
> target/sparc: Fix FPMERGE
> target/sparc: Split out do_ms16b
> target/sparc: Perform DFPREG/QFPREG in decodetree
> target/sparc: Remove gen_dest_fpr_D
> target/sparc: Remove cpu_fpr[]
> target/sparc: Use gvec for VIS1 parallel add/sub
> target/sparc: Implement FMAf extension
> target/sparc: Add feature bits for VIS 3
> target/sparc: Implement ADDXC, ADDXCcc
> target/sparc: Implement CMASK instructions
> target/sparc: Implement FCHKSM16
> target/sparc: Implement FHADD, FHSUB, FNHADD, FNADD
> target/sparc: Implement FNMUL
> target/sparc: Implement FLCMP
> target/sparc: Implement FMEAN16
> target/sparc: Implement FPADD64 FPSUB64
> target/sparc: Implement FPADDS, FPSUBS
> target/sparc: Implement FPCMPEQ8, FPCMPNE8, FPCMPULE8, FPCMPUGT8
> target/sparc: Implement FSLL, FSRL, FSRA, FSLAS
> target/sparc: Implement LDXEFSR
> target/sparc: Implement LZCNT
> target/sparc: Implement MOVsTOw, MOVdTOx, MOVwTOs, MOVxTOd
> target/sparc: Implement PDISTN
> target/sparc: Implement UMULXHI
> target/sparc: Implement XMULX
> target/sparc: Enable VIS3 feature bit
> target/sparc: Implement IMA extension
> target/sparc: Add feature bit for VIS4
> target/sparc: Implement FALIGNDATAi
> target/sparc: Implement 8-bit FPADD, FPADDS, and FPADDUS
> target/sparc: Implement VIS4 comparisons
> target/sparc: Implement FPMIN, FPMAX
> target/sparc: Implement SUBXC, SUBXCcc
> target/sparc: Implement MWAIT
> target/sparc: Implement monitor asis
> target/sparc: Enable VIS4 feature bit
>
> target/sparc/asi.h | 4 +
> target/sparc/helper.h | 36 +-
> linux-user/elfload.c | 51 +-
> target/sparc/cpu.c | 12 +
> target/sparc/fop_helper.c | 104 ++++
> target/sparc/ldst_helper.c | 4 +
> target/sparc/translate.c | 960 +++++++++++++++++++++++++++++----
> target/sparc/vis_helper.c | 526 +++++++++++-------
> target/sparc/cpu-feature.h.inc | 4 +
> target/sparc/insns.decode | 338 +++++++++---
> 10 files changed, 1626 insertions(+), 413 deletions(-)
I've applied the first 6 patches to my qemu-sparc branch, since I believe these are
the up-to-date version of the patches posted for
https://gitlab.com/qemu-project/qemu/-/issues/1901.
No objections here about the remainder of the series, other than that I don't have an
easy/obvious way to test the new instructions...
ATB,
Mark.
* Re: [PATCH 00/41] target/sparc: Implement VIS4
2024-04-29 20:52 ` Mark Cave-Ayland
@ 2024-04-29 21:02 ` Richard Henderson
2024-04-29 21:10 ` Mark Cave-Ayland
2024-05-15 15:30 ` Richard Henderson
0 siblings, 2 replies; 65+ messages in thread
From: Richard Henderson @ 2024-04-29 21:02 UTC (permalink / raw)
To: Mark Cave-Ayland, qemu-devel; +Cc: atar4qemu
On 4/29/24 13:52, Mark Cave-Ayland wrote:
> No objections here about the remainder of the series, other than that I don't have an
> easy/obvious way to test the new instructions...
I was thinking about adding support to RISU, but the gcc compile farm sparc machines have
been down for ages, so no way to generate the reference traces.
r~
* Re: [PATCH 00/41] target/sparc: Implement VIS4
2024-04-29 21:02 ` Richard Henderson
@ 2024-04-29 21:10 ` Mark Cave-Ayland
2024-05-15 15:30 ` Richard Henderson
1 sibling, 0 replies; 65+ messages in thread
From: Mark Cave-Ayland @ 2024-04-29 21:10 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: atar4qemu
On 29/04/2024 22:02, Richard Henderson wrote:
> On 4/29/24 13:52, Mark Cave-Ayland wrote:
>> No objections here about the remainder of the series, other than that I don't have
>> an easy/obvious way to test the new instructions...
>
> I was thinking about adding support to RISU, but the gcc compile farm sparc machines
> have been down for ages, so no way to generate the reference traces.
Ah that's frustrating. I've just pinged the debian-sparc folk to see if anyone there
can offer any insight or an alternative.
ATB,
Mark.
* Re: [PATCH 04/41] target/sparc: Fix FMUL8x16A{U,L}
2024-03-02 5:15 ` [PATCH 04/41] target/sparc: Fix FMUL8x16A{U,L} Richard Henderson
@ 2024-04-30 8:07 ` Mark Cave-Ayland
0 siblings, 0 replies; 65+ messages in thread
From: Mark Cave-Ayland @ 2024-04-30 8:07 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: atar4qemu
On 02/03/2024 05:15, Richard Henderson wrote:
> These instructions have f32 inputs, which changes the decode
> of the register numbers. While we're fixing things, use a
> common helper for both insns, extracting the 16-bit scalar
> in tcg beforehand.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/helper.h | 3 +--
> target/sparc/translate.c | 38 ++++++++++++++++++++++++++++++----
> target/sparc/vis_helper.c | 43 +++++++++------------------------------
> 3 files changed, 45 insertions(+), 39 deletions(-)
>
> diff --git a/target/sparc/helper.h b/target/sparc/helper.h
> index adc1b87319..9e0b8b463e 100644
> --- a/target/sparc/helper.h
> +++ b/target/sparc/helper.h
> @@ -93,8 +93,7 @@ DEF_HELPER_FLAGS_2(fqtox, TCG_CALL_NO_WG, s64, env, i128)
>
> DEF_HELPER_FLAGS_2(fpmerge, TCG_CALL_NO_RWG_SE, i64, i64, i64)
> DEF_HELPER_FLAGS_2(fmul8x16, TCG_CALL_NO_RWG_SE, i64, i32, i64)
> -DEF_HELPER_FLAGS_2(fmul8x16al, TCG_CALL_NO_RWG_SE, i64, i64, i64)
> -DEF_HELPER_FLAGS_2(fmul8x16au, TCG_CALL_NO_RWG_SE, i64, i64, i64)
> +DEF_HELPER_FLAGS_2(fmul8x16a, TCG_CALL_NO_RWG_SE, i64, i32, s32)
> DEF_HELPER_FLAGS_2(fmul8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
> DEF_HELPER_FLAGS_2(fmul8ulx16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
> DEF_HELPER_FLAGS_2(fmuld8sux16, TCG_CALL_NO_RWG_SE, i64, i64, i64)
> diff --git a/target/sparc/translate.c b/target/sparc/translate.c
> index 5144fe4ed9..598cfcf0ac 100644
> --- a/target/sparc/translate.c
> +++ b/target/sparc/translate.c
> @@ -45,6 +45,7 @@
> # define gen_helper_clear_softint(E, S) qemu_build_not_reached()
> # define gen_helper_done(E) qemu_build_not_reached()
> # define gen_helper_flushw(E) qemu_build_not_reached()
> +# define gen_helper_fmul8x16a(D, S1, S2) qemu_build_not_reached()
> # define gen_helper_rdccr(D, E) qemu_build_not_reached()
> # define gen_helper_rdcwp(D, E) qemu_build_not_reached()
> # define gen_helper_restored(E) qemu_build_not_reached()
> @@ -72,8 +73,6 @@
> # define gen_helper_fexpand ({ qemu_build_not_reached(); NULL; })
> # define gen_helper_fmul8sux16 ({ qemu_build_not_reached(); NULL; })
> # define gen_helper_fmul8ulx16 ({ qemu_build_not_reached(); NULL; })
> -# define gen_helper_fmul8x16al ({ qemu_build_not_reached(); NULL; })
> -# define gen_helper_fmul8x16au ({ qemu_build_not_reached(); NULL; })
> # define gen_helper_fmul8x16 ({ qemu_build_not_reached(); NULL; })
> # define gen_helper_fmuld8sux16 ({ qemu_build_not_reached(); NULL; })
> # define gen_helper_fmuld8ulx16 ({ qemu_build_not_reached(); NULL; })
> @@ -719,6 +718,18 @@ static void gen_op_bshuffle(TCGv_i64 dst, TCGv_i64 src1, TCGv_i64 src2)
> #endif
> }
>
> +static void gen_op_fmul8x16al(TCGv_i64 dst, TCGv_i32 src1, TCGv_i32 src2)
> +{
> + tcg_gen_ext16s_i32(src2, src2);
> + gen_helper_fmul8x16a(dst, src1, src2);
> +}
> +
> +static void gen_op_fmul8x16au(TCGv_i64 dst, TCGv_i32 src1, TCGv_i32 src2)
> +{
> + tcg_gen_sari_i32(src2, src2, 16);
> + gen_helper_fmul8x16a(dst, src1, src2);
> +}
> +
> static void finishing_insn(DisasContext *dc)
> {
> /*
> @@ -4539,6 +4550,27 @@ TRANS(FSUBs, ALL, do_env_fff, a, gen_helper_fsubs)
> TRANS(FMULs, ALL, do_env_fff, a, gen_helper_fmuls)
> TRANS(FDIVs, ALL, do_env_fff, a, gen_helper_fdivs)
>
> +static bool do_dff(DisasContext *dc, arg_r_r_r *a,
> + void (*func)(TCGv_i64, TCGv_i32, TCGv_i32))
> +{
> + TCGv_i64 dst;
> + TCGv_i32 src1, src2;
> +
> + if (gen_trap_ifnofpu(dc)) {
> + return true;
> + }
> +
> + dst = gen_dest_fpr_D(dc, a->rd);
> + src1 = gen_load_fpr_F(dc, a->rs1);
> + src2 = gen_load_fpr_F(dc, a->rs2);
> + func(dst, src1, src2);
> + gen_store_fpr_D(dc, a->rd, dst);
> + return advance_pc(dc);
> +}
> +
> +TRANS(FMUL8x16AU, VIS1, do_dff, a, gen_op_fmul8x16au)
> +TRANS(FMUL8x16AL, VIS1, do_dff, a, gen_op_fmul8x16al)
> +
> static bool do_dfd(DisasContext *dc, arg_r_r_r *a,
> void (*func)(TCGv_i64, TCGv_i32, TCGv_i64))
> {
> @@ -4576,8 +4608,6 @@ static bool do_ddd(DisasContext *dc, arg_r_r_r *a,
> return advance_pc(dc);
> }
>
> -TRANS(FMUL8x16AU, VIS1, do_ddd, a, gen_helper_fmul8x16au)
> -TRANS(FMUL8x16AL, VIS1, do_ddd, a, gen_helper_fmul8x16al)
> TRANS(FMUL8SUx16, VIS1, do_ddd, a, gen_helper_fmul8sux16)
> TRANS(FMUL8ULx16, VIS1, do_ddd, a, gen_helper_fmul8ulx16)
> TRANS(FMULD8SUx16, VIS1, do_ddd, a, gen_helper_fmuld8sux16)
> diff --git a/target/sparc/vis_helper.c b/target/sparc/vis_helper.c
> index 7728ffe9c6..5c7f5536bc 100644
> --- a/target/sparc/vis_helper.c
> +++ b/target/sparc/vis_helper.c
> @@ -119,43 +119,20 @@ uint64_t helper_fmul8x16(uint32_t src1, uint64_t src2)
> return d.ll;
> }
>
> -uint64_t helper_fmul8x16al(uint64_t src1, uint64_t src2)
> +uint64_t helper_fmul8x16a(uint32_t src1, int32_t src2)
> {
> - VIS64 s, d;
> + VIS32 s;
> + VIS64 d;
> uint32_t tmp;
>
> - s.ll = src1;
> - d.ll = src2;
> + s.l = src1;
> + d.ll = 0;
>
> -#define PMUL(r) \
> - tmp = (int32_t)d.VIS_SW64(1) * (int32_t)s.VIS_B64(r); \
> - if ((tmp & 0xff) > 0x7f) { \
> - tmp += 0x100; \
> - } \
> - d.VIS_W64(r) = tmp >> 8;
> -
> - PMUL(0);
> - PMUL(1);
> - PMUL(2);
> - PMUL(3);
> -#undef PMUL
> -
> - return d.ll;
> -}
> -
> -uint64_t helper_fmul8x16au(uint64_t src1, uint64_t src2)
> -{
> - VIS64 s, d;
> - uint32_t tmp;
> -
> - s.ll = src1;
> - d.ll = src2;
> -
> -#define PMUL(r) \
> - tmp = (int32_t)d.VIS_SW64(0) * (int32_t)s.VIS_B64(r); \
> - if ((tmp & 0xff) > 0x7f) { \
> - tmp += 0x100; \
> - } \
> +#define PMUL(r) \
> + tmp = src2 * (int32_t)s.VIS_B64(r); \
> + if ((tmp & 0xff) > 0x7f) { \
> + tmp += 0x100; \
> + } \
> d.VIS_W64(r) = tmp >> 8;
>
> PMUL(0);
Hi Richard,
This patch is showing a couple of issues after a run through the GitLab pipeline:
1) From checkpatch (https://gitlab.com/mcayland/qemu/-/jobs/6743594359#L44):
ERROR: Macros with multiple statements should be enclosed in a do - while loop
total: 2 errors, 0 warnings, 130 lines checked
2) From the s390x runners (https://gitlab.com/mcayland/qemu/-/jobs/6743594301#L4792):
../target/sparc/vis_helper.c: In function ‘helper_fmul8x16a’:
../target/sparc/vis_helper.c:46:21: error: array subscript 7 is above array bounds of
‘uint8_t[4]’ {aka ‘unsigned char[4]’} [-Werror=array-bounds]
46 | #define VIS_B64(n) b[7 - (n)]
| ^
../target/sparc/vis_helper.c:133:29: note: in expansion of macro ‘VIS_B64’
133 | tmp = src2 * (int32_t)s.VIS_B64(r); \
| ^~~~~~~
../target/sparc/vis_helper.c:139:5: note: in expansion of macro ‘PMUL’
139 | PMUL(0);
| ^~~~
../target/sparc/vis_helper.c:71:13: note: while referencing ‘b’
71 | uint8_t b[4];
| ^
../target/sparc/vis_helper.c:46:21: error: array subscript 6 is above array bounds of
‘uint8_t[4]’ {aka ‘unsigned char[4]’} [-Werror=array-bounds]
46 | #define VIS_B64(n) b[7 - (n)]
| ^
../target/sparc/vis_helper.c:133:29: note: in expansion of macro ‘VIS_B64’
133 | tmp = src2 * (int32_t)s.VIS_B64(r); \
| ^~~~~~~
../target/sparc/vis_helper.c:140:5: note: in expansion of macro ‘PMUL’
140 | PMUL(1);
| ^~~~
../target/sparc/vis_helper.c:71:13: note: while referencing ‘b’
71 | uint8_t b[4];
| ^
../target/sparc/vis_helper.c:46:21: error: array subscript 5 is above array bounds of
‘uint8_t[4]’ {aka ‘unsigned char[4]’} [-Werror=array-bounds]
46 | #define VIS_B64(n) b[7 - (n)]
| ^
../target/sparc/vis_helper.c:133:29: note: in expansion of macro ‘VIS_B64’
133 | tmp = src2 * (int32_t)s.VIS_B64(r); \
| ^~~~~~~
../target/sparc/vis_helper.c:141:5: note: in expansion of macro ‘PMUL’
141 | PMUL(2);
| ^~~~
../target/sparc/vis_helper.c:71:13: note: while referencing ‘b’
71 | uint8_t b[4];
| ^
../target/sparc/vis_helper.c:46:21: error: array subscript 4 is above array bounds of
‘uint8_t[4]’ {aka ‘unsigned char[4]’} [-Werror=array-bounds]
46 | #define VIS_B64(n) b[7 - (n)]
| ^
../target/sparc/vis_helper.c:133:29: note: in expansion of macro ‘VIS_B64’
133 | tmp = src2 * (int32_t)s.VIS_B64(r); \
| ^~~~~~~
../target/sparc/vis_helper.c:142:5: note: in expansion of macro ‘PMUL’
142 | PMUL(3);
| ^~~~
../target/sparc/vis_helper.c:71:13: note: while referencing ‘b’
71 | uint8_t b[4];
| ^
cc1: all warnings being treated as errors
[4028/5573] Compiling C object libqemu-sparc64-softmmu.fa.p/trace_control-target.c.o
[4029/5573] Compiling C object libqemu-sparc64-softmmu.fa.p/target_sparc_translate.c.o
ninja: build stopped: subcommand failed.
make: *** [Makefile:167: run-ninja] Error 1
ATB,
Mark.
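
For illustration: the array-bounds errors in the log come from PMUL still using the 64-bit byte accessor (VIS_B64 expands to b[7 - (n)] on this host) on `s`, which is now a 32-bit union with only four bytes. The following is a rough sketch of the same arithmetic written with shifts, which avoids host-endian accessors altogether. It is not the fix that was applied to the series; the function names are hypothetical, and lane 0 is assumed to be the least significant byte.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; a sketch of the arithmetic, not QEMU's helper. */

/* Lower 16-bit scalar of a 32-bit register, sign-extended (FMUL8x16AL). */
static int32_t scalar_lower16(uint32_t reg)
{
    return (int16_t)(reg & 0xffff);
}

/* Upper 16-bit scalar of a 32-bit register, sign-extended (FMUL8x16AU). */
static int32_t scalar_upper16(uint32_t reg)
{
    return (int16_t)(reg >> 16);
}

/*
 * Multiply each unsigned byte of src1 by the signed 16-bit scalar src2,
 * round as the PMUL macro does, and pack four 16-bit results into 64 bits.
 */
static uint64_t fmul8x16a_sketch(uint32_t src1, int32_t src2)
{
    uint64_t ret = 0;

    /* The caller is expected to have sign-extended a 16-bit scalar. */
    assert(src2 >= INT16_MIN && src2 <= INT16_MAX);

    for (int lane = 0; lane < 4; lane++) {
        int32_t byte = (src1 >> (lane * 8)) & 0xff;
        uint32_t tmp = (uint32_t)(src2 * byte);

        if ((tmp & 0xff) > 0x7f) {
            tmp += 0x100;
        }
        ret |= (uint64_t)((tmp >> 8) & 0xffff) << (lane * 16);
    }
    return ret;
}
```

Writing the loop as a function rather than a multi-statement macro would also avoid the checkpatch complaint, though that is a side effect of the sketch rather than a recommendation for the patch.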
* Re: [PATCH 08/41] target/sparc: Perform DFPREG/QFPREG in decodetree
2024-03-02 5:15 ` [PATCH 08/41] target/sparc: Perform DFPREG/QFPREG in decodetree Richard Henderson
@ 2024-05-10 15:18 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 15:18 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:15, Richard Henderson wrote:
> Form the proper register decoding from the start.
>
> Because we're removing the translation from the inner-most
> gen_load_fpr_* and gen_store_fpr_* routines, this must be
> done for all insns at once.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/translate.c | 18 ++--
> target/sparc/insns.decode | 220 +++++++++++++++++++++++---------------
> 2 files changed, 138 insertions(+), 100 deletions(-)
>
> diff --git a/target/sparc/translate.c b/target/sparc/translate.c
> index 6a6c259b06..97a5c636d2 100644
> --- a/target/sparc/translate.c
> +++ b/target/sparc/translate.c
> @@ -241,34 +241,30 @@ static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
>
> static TCGv_i64 gen_load_fpr_D(DisasContext *dc, unsigned int src)
> {
> - src = DFPREG(src);
> return cpu_fpr[src / 2];
> }
Optionally squash removal of the macros:
-- >8 --
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 0efc561d4c..f59d08e9e4 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -193,10 +193,2 @@ typedef struct DisasContext {
-#ifdef TARGET_SPARC64
-#define DFPREG(r) (((r & 1) << 5) | (r & 0x1e))
-#define QFPREG(r) (((r & 1) << 5) | (r & 0x1c))
-#else
-#define DFPREG(r) (r & 0x1e)
-#define QFPREG(r) (r & 0x1c)
-#endif
-
#define UA2005_HTRAP_MASK 0xff
@@ -2083,3 +2075,7 @@ static int extract_dfpreg(DisasContext *dc, int x)
{
- return DFPREG(x);
+ int r = x & 0x1e;
+#ifdef TARGET_SPARC64
+ r |= (x & 1) << 5;
+#endif
+ return r;
}
@@ -2088,3 +2084,7 @@ static int extract_qfpreg(DisasContext *dc, int x)
{
- return QFPREG(x);
+ int r = x & 0x1c;
+#ifdef TARGET_SPARC64
+ r |= (x & 1) << 5;
+#endif
+ return r;
}
---
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
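
For reference, the decode performed by the macros being removed can be sketched as plain functions. This mirrors the TARGET_SPARC64 definitions of DFPREG and QFPREG quoted in the diff above: bit 0 of the 5-bit field supplies bit 5 of the register number, and the remaining bits are masked to an even (double) or multiple-of-four (quad) register. The function names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical names; these mirror the quoted TARGET_SPARC64 macros. */

/* 5-bit instruction field to double-precision register number. */
static unsigned dfpreg_v9(unsigned field)
{
    return ((field & 1) << 5) | (field & 0x1e);
}

/* 5-bit instruction field to quad-precision register number. */
static unsigned qfpreg_v9(unsigned field)
{
    return ((field & 1) << 5) | (field & 0x1c);
}
```

With this encoding a field value of 1 names %f32 and a field value of 0x1f names %f62, which is how the upper half of the double-precision register file is reached on sparc64.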
* Re: [PATCH 09/41] target/sparc: Remove gen_dest_fpr_D
2024-03-02 5:15 ` [PATCH 09/41] target/sparc: Remove gen_dest_fpr_D Richard Henderson
@ 2024-05-10 15:18 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 15:18 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:15, Richard Henderson wrote:
> Replace with tcg_temp_new_i64.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/translate.c | 27 +++++++++++----------------
> 1 file changed, 11 insertions(+), 16 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 11/41] target/sparc: Use gvec for VIS1 parallel add/sub
2024-03-02 5:15 ` [PATCH 11/41] target/sparc: Use gvec for VIS1 parallel add/sub Richard Henderson
@ 2024-05-10 15:21 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 15:21 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:15, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/translate.c | 18 ++++++++++++++----
> 1 file changed, 14 insertions(+), 4 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 14/41] target/sparc: Implement ADDXC, ADDXCcc
2024-03-02 5:15 ` [PATCH 14/41] target/sparc: Implement ADDXC, ADDXCcc Richard Henderson
@ 2024-05-10 16:16 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 16:16 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:15, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/translate.c | 14 ++++++++++++++
> target/sparc/insns.decode | 3 +++
> 2 files changed, 17 insertions(+)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 21/41] target/sparc: Implement FPADD64 FPSUB64
2024-03-02 5:15 ` [PATCH 21/41] target/sparc: Implement FPADD64 FPSUB64 Richard Henderson
@ 2024-05-10 16:19 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 16:19 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:15, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/translate.c | 3 +++
> target/sparc/insns.decode | 2 ++
> 2 files changed, 5 insertions(+)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 35/41] target/sparc: Implement 8-bit FPADD, FPADDS, and FPADDUS
2024-03-02 5:15 ` [PATCH 35/41] target/sparc: Implement 8-bit FPADD, FPADDS, and FPADDUS Richard Henderson
@ 2024-05-10 16:20 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 16:20 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:15, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/translate.c | 11 +++++++++++
> target/sparc/insns.decode | 9 +++++++++
> 2 files changed, 20 insertions(+)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 38/41] target/sparc: Implement SUBXC, SUBXCcc
2024-03-02 5:15 ` [PATCH 38/41] target/sparc: Implement SUBXC, SUBXCcc Richard Henderson
@ 2024-05-10 16:21 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 16:21 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:15, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/translate.c | 14 ++++++++++++++
> target/sparc/insns.decode | 2 ++
> 2 files changed, 16 insertions(+)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 40/41] target/sparc: Implement monitor asis
2024-03-02 5:16 ` [PATCH 40/41] target/sparc: Implement monitor asis Richard Henderson
@ 2024-05-10 17:04 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 17:04 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:16, Richard Henderson wrote:
> Ignore the "monitor" portion and treat them the same
> as their base asis.
s/asis/ASIs/
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/asi.h | 4 ++++
> target/sparc/ldst_helper.c | 4 ++++
> target/sparc/translate.c | 8 ++++++++
> 3 files changed, 16 insertions(+)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 33/41] target/sparc: Add feature bit for VIS4
2024-03-02 5:15 ` [PATCH 33/41] target/sparc: Add feature bit for VIS4 Richard Henderson
@ 2024-05-10 17:05 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 17:05 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:15, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/translate.c | 2 ++
> target/sparc/cpu-feature.h.inc | 1 +
> 2 files changed, 3 insertions(+)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 13/41] target/sparc: Add feature bits for VIS 3
2024-03-02 5:15 ` [PATCH 13/41] target/sparc: Add feature bits for VIS 3 Richard Henderson
@ 2024-05-10 17:05 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 17:05 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:15, Richard Henderson wrote:
> The manual separates VIS 3 and VIS 3B, even though they are both
> present in all extant cpus. For clarity, let the translator
> match the manual but otherwise leave them on the same feature bit.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/translate.c | 4 ++++
> target/sparc/cpu-feature.h.inc | 1 +
> 2 files changed, 5 insertions(+)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 32/41] target/sparc: Implement IMA extension
2024-03-02 5:15 ` [PATCH 32/41] target/sparc: Implement IMA extension Richard Henderson
@ 2024-05-10 17:09 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 17:09 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:15, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> linux-user/elfload.c | 1 +
> target/sparc/cpu.c | 3 +++
> target/sparc/translate.c | 24 ++++++++++++++++++++++++
> target/sparc/cpu-feature.h.inc | 1 +
> target/sparc/insns.decode | 3 +++
> 5 files changed, 32 insertions(+)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 37/41] target/sparc: Implement FPMIN, FPMAX
2024-03-02 5:15 ` [PATCH 37/41] target/sparc: Implement FPMIN, FPMAX Richard Henderson
@ 2024-05-10 17:11 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 17:11 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:15, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/translate.c | 14 ++++++++++++++
> target/sparc/insns.decode | 14 ++++++++++++++
> 2 files changed, 28 insertions(+)
>
> diff --git a/target/sparc/translate.c b/target/sparc/translate.c
> index 5f1982cecc..8eda190233 100644
> --- a/target/sparc/translate.c
> +++ b/target/sparc/translate.c
> @@ -5053,6 +5053,20 @@ TRANS(FSRL32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_shrv)
> TRANS(FSRA16, VIS3, do_gvec_ddd, a, MO_16, tcg_gen_gvec_sarv)
> TRANS(FSRA32, VIS3, do_gvec_ddd, a, MO_32, tcg_gen_gvec_sarv)
>
> +TRANS(FPMIN8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_smin)
> +TRANS(FPMIN16, VIS4, do_gvec_ddd, a, MO_16, tcg_gen_gvec_smin)
> +TRANS(FPMIN32, VIS4, do_gvec_ddd, a, MO_32, tcg_gen_gvec_smin)
> +TRANS(FPMINU8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_umin)
> +TRANS(FPMINU16, VIS4, do_gvec_ddd, a, MO_16, tcg_gen_gvec_umin)
> +TRANS(FPMINU32, VIS4, do_gvec_ddd, a, MO_32, tcg_gen_gvec_umin)
> +
> +TRANS(FPMAX8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_smax)
> +TRANS(FPMAX16, VIS4, do_gvec_ddd, a, MO_16, tcg_gen_gvec_smax)
> +TRANS(FPMAX32, VIS4, do_gvec_ddd, a, MO_32, tcg_gen_gvec_smax)
> +TRANS(FPMAXU8, VIS4, do_gvec_ddd, a, MO_8, tcg_gen_gvec_umax)
> +TRANS(FPMAXU16, VIS4, do_gvec_ddd, a, MO_16, tcg_gen_gvec_umax)
> +TRANS(FPMAXU32, VIS4, do_gvec_ddd, a, MO_32, tcg_gen_gvec_umax)
Easy peasy :P
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
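
As a reading aid: each TRANS line above pairs an element size (MO_8, MO_16, MO_32) with a signed or unsigned gvec expander, so the operation is applied independently to every element of the 64-bit register. A minimal scalar sketch of the 8-bit case, assuming nothing beyond per-lane min semantics, shows where the signed and unsigned variants differ. The function names are hypothetical and this is not how TCG expands the operation.

```c
#include <stdint.h>

/* Hypothetical scalar models of per-lane minimum on eight 8-bit lanes. */

/* Signed variant: each lane is compared as int8_t. */
static uint64_t fpmin8_sketch(uint64_t a, uint64_t b)
{
    uint64_t r = 0;

    for (int i = 0; i < 8; i++) {
        int8_t x = (int8_t)(uint8_t)(a >> (i * 8));
        int8_t y = (int8_t)(uint8_t)(b >> (i * 8));

        r |= (uint64_t)(uint8_t)(x < y ? x : y) << (i * 8);
    }
    return r;
}

/* Unsigned variant: each lane is compared as uint8_t. */
static uint64_t fpminu8_sketch(uint64_t a, uint64_t b)
{
    uint64_t r = 0;

    for (int i = 0; i < 8; i++) {
        uint8_t x = (uint8_t)(a >> (i * 8));
        uint8_t y = (uint8_t)(b >> (i * 8));

        r |= (uint64_t)(x < y ? x : y) << (i * 8);
    }
    return r;
}
```

A lane holding 0x80 compared against 0x7f yields 0x80 in the signed model (-128 is smaller) and 0x7f in the unsigned one.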
* Re: [PATCH 41/41] target/sparc: Enable VIS4 feature bit
2024-03-02 5:16 ` [PATCH 41/41] target/sparc: Enable VIS4 feature bit Richard Henderson
@ 2024-05-10 17:16 ` Philippe Mathieu-Daudé
2024-05-10 17:31 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 17:16 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:16, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/cpu.c | 3 +++
> 1 file changed, 3 insertions(+)
> @@ -882,6 +883,8 @@ static Property sparc_cpu_properties[] = {
> CPU_FEATURE_BIT_VIS3, false),
> DEFINE_PROP_BIT("ima", SPARCCPU, env.def.features,
> CPU_FEATURE_BIT_IMA, false),
> + DEFINE_PROP_BIT("vis4", SPARCCPU, env.def.features,
> + CPU_FEATURE_BIT_VIS4, false),
I don't see any current CPU with this bit enabled. Nitpicking,
maybe use "Allow enabling VIS4 feature" as subject? (I suppose
you tried using -cpu foo,vis4=on).
Could we add the M7 to sparc_defs[]?
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 26/41] target/sparc: Implement LZCNT
2024-03-02 5:15 ` [PATCH 26/41] target/sparc: Implement LZCNT Richard Henderson
@ 2024-05-10 17:22 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 17:22 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:15, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/translate.c | 18 ++++++++++++++++++
> target/sparc/insns.decode | 1 +
> 2 files changed, 19 insertions(+)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* Re: [PATCH 28/41] target/sparc: Implement PDISTN
2024-03-02 5:15 ` [PATCH 28/41] target/sparc: Implement PDISTN Richard Henderson
@ 2024-05-10 17:28 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 17:28 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 2/3/24 06:15, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/sparc/translate.c | 11 +++++++++++
> target/sparc/insns.decode | 1 +
> 2 files changed, 12 insertions(+)
> +static void gen_op_pdistn(TCGv dst, TCGv_i64 src1, TCGv_i64 src2)
> +{
> +#ifdef TARGET_SPARC64
> + gen_helper_pdist(dst, tcg_constant_i64(0), src1, src2);
I note pdist[n] could benefit from gvec.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
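
To make the quoted hunk easier to follow: gen_op_pdistn reuses the existing pdist helper with a constant-zero accumulator. A scalar sketch of that relationship, assuming the usual sum-of-absolute-differences definition of PDIST over eight unsigned bytes, might look like the following. The function names are hypothetical and this is not QEMU's helper code.

```c
#include <stdint.h>

/*
 * Hypothetical model: add the sum of absolute differences of the eight
 * unsigned bytes of a and b to an incoming accumulator.
 */
static uint64_t pdist_sketch(uint64_t acc, uint64_t a, uint64_t b)
{
    for (int i = 0; i < 8; i++) {
        uint8_t x = (uint8_t)(a >> (i * 8));
        uint8_t y = (uint8_t)(b >> (i * 8));

        acc += x > y ? x - y : y - x;
    }
    return acc;
}

/* The non-accumulating form is the same operation starting from zero. */
static uint64_t pdistn_sketch(uint64_t a, uint64_t b)
{
    return pdist_sketch(0, a, b);
}
```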
* Re: [PATCH 41/41] target/sparc: Enable VIS4 feature bit
2024-05-10 17:16 ` Philippe Mathieu-Daudé
@ 2024-05-10 17:31 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 65+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-10 17:31 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: mark.cave-ayland, atar4qemu
On 10/5/24 19:16, Philippe Mathieu-Daudé wrote:
> On 2/3/24 06:16, Richard Henderson wrote:
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>> target/sparc/cpu.c | 3 +++
>> 1 file changed, 3 insertions(+)
>
>
>> @@ -882,6 +883,8 @@ static Property sparc_cpu_properties[] = {
>> CPU_FEATURE_BIT_VIS3, false),
>> DEFINE_PROP_BIT("ima", SPARCCPU, env.def.features,
>> CPU_FEATURE_BIT_IMA, false),
>> + DEFINE_PROP_BIT("vis4", SPARCCPU, env.def.features,
>> + CPU_FEATURE_BIT_VIS4, false),
>
> I don't see any current CPU with this bit enabled. Nitpicking,
> maybe use "Allow enabling VIS4 feature" as subject? (I suppose
> you tried using -cpu foo,vis4=on).
Doh, this is what you mentioned in the cover letter...
> Could we add the M7 to sparc_defs[]?
>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>
* Re: [PATCH 00/41] target/sparc: Implement VIS4
2024-04-29 21:02 ` Richard Henderson
2024-04-29 21:10 ` Mark Cave-Ayland
@ 2024-05-15 15:30 ` Richard Henderson
2024-05-16 20:36 ` Mark Cave-Ayland
1 sibling, 1 reply; 65+ messages in thread
From: Richard Henderson @ 2024-05-15 15:30 UTC (permalink / raw)
To: Mark Cave-Ayland, qemu-devel; +Cc: atar4qemu
On 4/29/24 23:02, Richard Henderson wrote:
> On 4/29/24 13:52, Mark Cave-Ayland wrote:
>> No objections here about the remainder of the series, other than that I don't have an
>> easy/obvious way to test the new instructions...
>
> I was thinking about adding support to RISU, but the gcc compile farm sparc machines have
> been down for ages, so no way to generate the reference traces.
Update: I have successfully ported RISU to Sparc64, Solaris and Linux. There is a
limitation in that I cannot find how to extract %gsr from the signal frame, which is
unfortunate, but I can work around that for now.
I have added descriptions of VIS1 instructions to RISU, and it turns out we have failures
relative to a Sparc M8. I have not yet analyzed these failures, but it proves the effort
was not wasted. :-)
I'll clean up these patches and post them here when I next get some downtime.
r~
* Re: [PATCH 00/41] target/sparc: Implement VIS4
2024-05-15 15:30 ` Richard Henderson
@ 2024-05-16 20:36 ` Mark Cave-Ayland
0 siblings, 0 replies; 65+ messages in thread
From: Mark Cave-Ayland @ 2024-05-16 20:36 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: atar4qemu
On 15/05/2024 16:30, Richard Henderson wrote:
> On 4/29/24 23:02, Richard Henderson wrote:
>> On 4/29/24 13:52, Mark Cave-Ayland wrote:
>>> No objections here about the remainder of the series, other than that I don't have
>>> an easy/obvious way to test the new instructions...
>>
>> I was thinking about adding support to RISU, but the gcc compile farm sparc
>> machines have been down for ages, so no way to generate the reference traces.
>
> Update: I have successfully ported RISU to Sparc64, Solaris and Linux. There is a
> limitation in that I cannot find how to extract %gsr from the signal frame, which is
> unfortunate, but I can work around that for now.
>
> I have added descriptions of VIS1 instructions to RISU, and it turns out we have
> failures relative to a Sparc M8. I have not yet analyzed these failures, but it
> proves the effort was not wasted. :-)
>
> I'll clean up these patches and post them here when I next get some downtime.
>
> r~
That's great news, thanks for the update. I've had confirmation that there is work
underway to repair the SPARC hardware hosting Linux for the gcc buildfarm, so
hopefully it will be back in service soon.
ATB,
Mark.
end of thread, other threads:[~2024-05-16 20:37 UTC | newest]
Thread overview: 65+ messages
2024-03-02 5:15 [PATCH 00/41] target/sparc: Implement VIS4 Richard Henderson
2024-03-02 5:15 ` [PATCH 01/41] linux-user/sparc: Add more hwcap bits for sparc64 Richard Henderson
2024-03-02 5:15 ` [PATCH 02/41] target/sparc: Fix FEXPAND Richard Henderson
2024-03-02 5:15 ` [PATCH 03/41] target/sparc: Fix FMUL8x16 Richard Henderson
2024-03-02 5:15 ` [PATCH 04/41] target/sparc: Fix FMUL8x16A{U,L} Richard Henderson
2024-04-30 8:07 ` Mark Cave-Ayland
2024-03-02 5:15 ` [PATCH 05/41] target/sparc: Fix FMULD8*X16 Richard Henderson
2024-03-02 5:15 ` [PATCH 06/41] target/sparc: Fix FPMERGE Richard Henderson
2024-03-02 5:15 ` [PATCH 07/41] target/sparc: Split out do_ms16b Richard Henderson
2024-03-02 5:15 ` [PATCH 08/41] target/sparc: Perform DFPREG/QFPREG in decodetree Richard Henderson
2024-05-10 15:18 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 09/41] target/sparc: Remove gen_dest_fpr_D Richard Henderson
2024-05-10 15:18 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 10/41] target/sparc: Remove cpu_fpr[] Richard Henderson
2024-03-02 5:15 ` [PATCH 11/41] target/sparc: Use gvec for VIS1 parallel add/sub Richard Henderson
2024-05-10 15:21 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 12/41] target/sparc: Implement FMAf extension Richard Henderson
2024-03-02 5:15 ` [PATCH 13/41] target/sparc: Add feature bits for VIS 3 Richard Henderson
2024-05-10 17:05 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 14/41] target/sparc: Implement ADDXC, ADDXCcc Richard Henderson
2024-05-10 16:16 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 15/41] target/sparc: Implement CMASK instructions Richard Henderson
2024-03-02 5:15 ` [PATCH 16/41] target/sparc: Implement FCHKSM16 Richard Henderson
2024-03-02 5:15 ` [PATCH 17/41] target/sparc: Implement FHADD, FHSUB, FNHADD, FNADD Richard Henderson
2024-03-02 5:15 ` [PATCH 18/41] target/sparc: Implement FNMUL Richard Henderson
2024-03-02 5:15 ` [PATCH 19/41] target/sparc: Implement FLCMP Richard Henderson
2024-03-02 5:15 ` [PATCH 20/41] target/sparc: Implement FMEAN16 Richard Henderson
2024-03-02 5:15 ` [PATCH 21/41] target/sparc: Implement FPADD64 FPSUB64 Richard Henderson
2024-05-10 16:19 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 22/41] target/sparc: Implement FPADDS, FPSUBS Richard Henderson
2024-03-02 5:15 ` [PATCH 23/41] target/sparc: Implement FPCMPEQ8, FPCMPNE8, FPCMPULE8, FPCMPUGT8 Richard Henderson
2024-03-02 5:15 ` [PATCH 24/41] target/sparc: Implement FSLL, FSRL, FSRA, FSLAS Richard Henderson
2024-03-02 5:15 ` [PATCH 25/41] target/sparc: Implement LDXEFSR Richard Henderson
2024-03-02 5:15 ` [PATCH 26/41] target/sparc: Implement LZCNT Richard Henderson
2024-05-10 17:22 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 27/41] target/sparc: Implement MOVsTOw, MOVdTOx, MOVwTOs, MOVxTOd Richard Henderson
2024-03-02 5:15 ` [PATCH 28/41] target/sparc: Implement PDISTN Richard Henderson
2024-05-10 17:28 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 29/41] target/sparc: Implement UMULXHI Richard Henderson
2024-03-02 5:15 ` [PATCH 30/41] target/sparc: Implement XMULX Richard Henderson
2024-03-02 5:15 ` [PATCH 31/41] target/sparc: Enable VIS3 feature bit Richard Henderson
2024-03-02 5:15 ` [PATCH 32/41] target/sparc: Implement IMA extension Richard Henderson
2024-05-10 17:09 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 33/41] target/sparc: Add feature bit for VIS4 Richard Henderson
2024-05-10 17:05 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 34/41] target/sparc: Implement FALIGNDATAi Richard Henderson
2024-03-02 5:15 ` [PATCH 35/41] target/sparc: Implement 8-bit FPADD, FPADDS, and FPADDUS Richard Henderson
2024-05-10 16:20 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 36/41] target/sparc: Implement VIS4 comparisons Richard Henderson
2024-03-02 5:15 ` [PATCH 37/41] target/sparc: Implement FPMIN, FPMAX Richard Henderson
2024-05-10 17:11 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 38/41] target/sparc: Implement SUBXC, SUBXCcc Richard Henderson
2024-05-10 16:21 ` Philippe Mathieu-Daudé
2024-03-02 5:15 ` [PATCH 39/41] target/sparc: Implement MWAIT Richard Henderson
2024-03-02 5:16 ` [PATCH 40/41] target/sparc: Implement monitor asis Richard Henderson
2024-05-10 17:04 ` Philippe Mathieu-Daudé
2024-03-02 5:16 ` [PATCH 41/41] target/sparc: Enable VIS4 feature bit Richard Henderson
2024-05-10 17:16 ` Philippe Mathieu-Daudé
2024-05-10 17:31 ` Philippe Mathieu-Daudé
2024-03-05 10:20 ` [PATCH 00/41] target/sparc: Implement VIS4 Mark Cave-Ayland
2024-04-29 20:52 ` Mark Cave-Ayland
2024-04-29 21:02 ` Richard Henderson
2024-04-29 21:10 ` Mark Cave-Ayland
2024-05-15 15:30 ` Richard Henderson
2024-05-16 20:36 ` Mark Cave-Ayland