qemu-riscv.nongnu.org archive mirror
* [RFC PATCH 0/3] Support RISC-V Zvqdotq vector dot-product extension
@ 2025-09-01 13:38 Max Chou
  2025-09-01 13:38 ` [RFC PATCH 1/3] target/riscv: Add Zvqdotq cfg property Max Chou
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Max Chou @ 2025-09-01 13:38 UTC
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei, Max Chou

This patch series introduces support for the Zvqdotq extension.

The Zvqdotq extension's ISA specification is not yet ratified, so this
patch series is based on the latest draft (v0.0.2) and treats the
Zvqdotq extension as an experimental extension.

The draft of the Zvqdotq ISA specification:
https://github.com/riscv/riscv-dot-product

Max Chou (3):
  target/riscv: Add Zvqdotq cfg property
  target/riscv: rvv: Add Zvqdotq support
  target/riscv: Expose Zvqdotq extension as a cpu property

 target/riscv/cpu.c                            |  2 +
 target/riscv/cpu_cfg_fields.h.inc             |  1 +
 target/riscv/helper.h                         | 10 +++
 target/riscv/insn32.decode                    |  9 +++
 target/riscv/insn_trans/trans_rvzvqdotq.c.inc | 61 ++++++++++++++++++
 target/riscv/tcg/tcg-cpu.c                    |  5 ++
 target/riscv/translate.c                      |  1 +
 target/riscv/vector_helper.c                  | 63 +++++++++++++++++++
 8 files changed, 152 insertions(+)
 create mode 100644 target/riscv/insn_trans/trans_rvzvqdotq.c.inc

-- 
2.43.0




* [RFC PATCH 1/3] target/riscv: Add Zvqdotq cfg property
  2025-09-01 13:38 [RFC PATCH 0/3] Support RISC-V Zvqdotq vector dot-product extension Max Chou
@ 2025-09-01 13:38 ` Max Chou
  2025-09-01 13:38 ` [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support Max Chou
  2025-09-01 13:38 ` [RFC PATCH 3/3] target/riscv: Expose Zvqdotq extension as a cpu property Max Chou
  2 siblings, 0 replies; 8+ messages in thread
From: Max Chou @ 2025-09-01 13:38 UTC
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei, Max Chou

The Zvqdotq extension is the vector dot-product extension of RISC-V.

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/cpu.c                | 1 +
 target/riscv/cpu_cfg_fields.h.inc | 1 +
 target/riscv/tcg/tcg-cpu.c        | 5 +++++
 3 files changed, 7 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index d055ddf4623..95edd02e683 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -187,6 +187,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
     ISA_EXT_DATA_ENTRY(zvksg, PRIV_VERSION_1_12_0, ext_zvksg),
     ISA_EXT_DATA_ENTRY(zvksh, PRIV_VERSION_1_12_0, ext_zvksh),
     ISA_EXT_DATA_ENTRY(zvkt, PRIV_VERSION_1_12_0, ext_zvkt),
+    ISA_EXT_DATA_ENTRY(zvqdotq, PRIV_VERSION_1_12_0, ext_zvqdotq),
     ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx),
     ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin),
     ISA_EXT_DATA_ENTRY(sdtrig, PRIV_VERSION_1_12_0, debug),
diff --git a/target/riscv/cpu_cfg_fields.h.inc b/target/riscv/cpu_cfg_fields.h.inc
index e2d116f0dfb..5da59c22d68 100644
--- a/target/riscv/cpu_cfg_fields.h.inc
+++ b/target/riscv/cpu_cfg_fields.h.inc
@@ -100,6 +100,7 @@ BOOL_FIELD(ext_zvfbfmin)
 BOOL_FIELD(ext_zvfbfwma)
 BOOL_FIELD(ext_zvfh)
 BOOL_FIELD(ext_zvfhmin)
+BOOL_FIELD(ext_zvqdotq)
 BOOL_FIELD(ext_smaia)
 BOOL_FIELD(ext_ssaia)
 BOOL_FIELD(ext_smctr)
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 78fb2791847..7015370ab00 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -767,6 +767,11 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
         return;
     }
 
+    if (cpu->cfg.ext_zvqdotq && !cpu->cfg.ext_zve32x) {
+        error_setg(errp, "Zvqdotq extension requires V or Zve* extensions");
+        return;
+    }
+
     if ((cpu->cfg.ext_zvbc || cpu->cfg.ext_zvknhb) && !cpu->cfg.ext_zve64x) {
         error_setg(
             errp,
-- 
2.43.0




* [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support
  2025-09-01 13:38 [RFC PATCH 0/3] Support RISC-V Zvqdotq vector dot-product extension Max Chou
  2025-09-01 13:38 ` [RFC PATCH 1/3] target/riscv: Add Zvqdotq cfg property Max Chou
@ 2025-09-01 13:38 ` Max Chou
  2025-09-02 13:38   ` Richard Henderson
  2025-09-01 13:38 ` [RFC PATCH 3/3] target/riscv: Expose Zvqdotq extension as a cpu property Max Chou
  2 siblings, 1 reply; 8+ messages in thread
From: Max Chou @ 2025-09-01 13:38 UTC
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei, Max Chou

Add support for the instructions of the vector dot-product extension (Zvqdotq):
- vqdot.[vv,vx]
- vqdotu.[vv,vx]
- vqdotsu.[vv,vx]
- vqdotus.vx
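
For readers new to the extension, the per-element operation behind these instructions can be sketched in plain C. This is only an illustrative model, not the QEMU helper: the name qdot_model and its parameters are hypothetical, and the operand signedness noted in the comments is read off the QOP_* type lists in this patch. The sum is kept in an unsigned 32-bit value so that wrap-around is well defined without relying on compiler flags.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of one 32-bit destination element.
 * s1 and s2 each pack four 8-bit lanes; acc is the current vd element.
 * Signedness per variant, as implied by the QOP_* lists in the patch:
 *   vqdot   : s1 signed,   s2 signed
 *   vqdotu  : s1 unsigned, s2 unsigned
 *   vqdotsu : s1 unsigned, s2 signed
 *   vqdotus : s1 signed,   s2 unsigned
 */
static uint32_t qdot_model(uint32_t acc, uint32_t s1, bool s1_signed,
                           uint32_t s2, bool s2_signed)
{
    for (int i = 0; i < 4; i++) {
        uint8_t b1 = (uint8_t)(s1 >> (8 * i));
        uint8_t b2 = (uint8_t)(s2 >> (8 * i));
        /*
         * The uint8_t -> int8_t cast is implementation-defined in ISO C;
         * GCC and Clang reinterpret the bits as two's complement.
         */
        int32_t x = s1_signed ? (int8_t)b1 : b1;
        int32_t y = s2_signed ? (int8_t)b2 : b2;
        /* Each product fits in 32 bits; the unsigned add wraps mod 2^32. */
        acc += (uint32_t)(x * y);
    }
    return acc;
}
```

For example, with acc = 100, s1 lanes all 0xff and s2 lanes 1, 2, 3, 4, the signed interpretation accumulates -10, while the unsigned interpretation accumulates 2550.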

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/helper.h                         | 10 +++
 target/riscv/insn32.decode                    |  9 +++
 target/riscv/insn_trans/trans_rvzvqdotq.c.inc | 61 ++++++++++++++++++
 target/riscv/translate.c                      |  1 +
 target/riscv/vector_helper.c                  | 63 +++++++++++++++++++
 5 files changed, 144 insertions(+)
 create mode 100644 target/riscv/insn_trans/trans_rvzvqdotq.c.inc

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index f712b1c368e..80274f1dad6 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1284,3 +1284,13 @@ DEF_HELPER_4(vgmul_vv, void, ptr, ptr, env, i32)
 DEF_HELPER_5(vsm4k_vi, void, ptr, ptr, i32, env, i32)
 DEF_HELPER_4(vsm4r_vv, void, ptr, ptr, env, i32)
 DEF_HELPER_4(vsm4r_vs, void, ptr, ptr, env, i32)
+
+/* Vector dot-product functions */
+DEF_HELPER_6(vqdot_vv, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vqdotu_vv, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vqdotsu_vv, void, ptr, ptr, ptr, ptr, env, i32)
+
+DEF_HELPER_6(vqdot_vx, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vqdotu_vx, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vqdotsu_vx, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vqdotus_vx, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index cd23b1f3a9b..50a61566670 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -1066,3 +1066,12 @@ amominu_h  11000 . . ..... ..... 001 ..... 0101111 @atom_st
 amomaxu_h  11100 . . ..... ..... 001 ..... 0101111 @atom_st
 amocas_b    00101 . . ..... ..... 000 ..... 0101111 @atom_st
 amocas_h    00101 . . ..... ..... 001 ..... 0101111 @atom_st
+
+# *** Zvqdotq Vector Dot-Product Extension ***
+vqdot_vv    101100 . ..... ..... 010 ..... 1010111 @r_vm
+vqdot_vx    101100 . ..... ..... 110 ..... 1010111 @r_vm
+vqdotu_vv   101000 . ..... ..... 010 ..... 1010111 @r_vm
+vqdotu_vx   101000 . ..... ..... 110 ..... 1010111 @r_vm
+vqdotsu_vv  101010 . ..... ..... 010 ..... 1010111 @r_vm
+vqdotsu_vx  101010 . ..... ..... 110 ..... 1010111 @r_vm
+vqdotus_vx  101110 . ..... ..... 110 ..... 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvzvqdotq.c.inc b/target/riscv/insn_trans/trans_rvzvqdotq.c.inc
new file mode 100644
index 00000000000..b203c826a2e
--- /dev/null
+++ b/target/riscv/insn_trans/trans_rvzvqdotq.c.inc
@@ -0,0 +1,61 @@
+/*
+ * RISC-V translation routines for the Zvqdotq vector dot-product extension
+ *
+ * Copyright (c) 2025 Max Chou, max.chou@sifive.com
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+static bool vext_zvqdotq_base_check(DisasContext *s)
+{
+    return s->cfg_ptr->ext_zvqdotq && s->sew == MO_32;
+}
+
+static bool vext_vqdotq_opivv_check(DisasContext *s, arg_rmrr *a)
+{
+    return vext_zvqdotq_base_check(s) && opivv_check(s, a);
+}
+
+#define GEN_VQDOTQ_OPIVV_TRANS(NAME, CHECK)              \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a)   \
+{                                                        \
+    if (CHECK(s, a)) {                                   \
+        return opivv_trans(a->rd, a->rs1, a->rs2, a->vm, \
+                           gen_helper_##NAME, s);        \
+    }                                                    \
+    return false;                                        \
+}
+
+GEN_VQDOTQ_OPIVV_TRANS(vqdot_vv, vext_vqdotq_opivv_check)
+GEN_VQDOTQ_OPIVV_TRANS(vqdotu_vv, vext_vqdotq_opivv_check)
+GEN_VQDOTQ_OPIVV_TRANS(vqdotsu_vv, vext_vqdotq_opivv_check)
+
+static bool vext_vqdotq_opivx_check(DisasContext *s, arg_rmrr *a)
+{
+    return vext_zvqdotq_base_check(s) && opivx_check(s, a);
+}
+
+#define GEN_VQDOTQ_OPIVX_TRANS(NAME, CHECK)              \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a)   \
+{                                                        \
+    if (CHECK(s, a)) {                                   \
+        return opivx_trans(a->rd, a->rs1, a->rs2, a->vm, \
+                           gen_helper_##NAME, s);        \
+    }                                                    \
+    return false;                                        \
+}
+
+GEN_VQDOTQ_OPIVX_TRANS(vqdot_vx, vext_vqdotq_opivx_check)
+GEN_VQDOTQ_OPIVX_TRANS(vqdotu_vx, vext_vqdotq_opivx_check)
+GEN_VQDOTQ_OPIVX_TRANS(vqdotsu_vx, vext_vqdotq_opivx_check)
+GEN_VQDOTQ_OPIVX_TRANS(vqdotus_vx, vext_vqdotq_opivx_check)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 9ddef2d6e2a..6f43ed1ffdb 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -1190,6 +1190,7 @@ static uint32_t opcode_at(DisasContextBase *dcbase, target_ulong pc)
 #include "insn_trans/trans_rvzfh.c.inc"
 #include "insn_trans/trans_rvk.c.inc"
 #include "insn_trans/trans_rvvk.c.inc"
+#include "insn_trans/trans_rvzvqdotq.c.inc"
 #include "insn_trans/trans_privileged.c.inc"
 #include "insn_trans/trans_svinval.c.inc"
 #include "insn_trans/trans_rvbf16.c.inc"
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 7c67d67a13f..961b62add3f 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -921,6 +921,10 @@ GEN_VEXT_ST_WHOLE(vs8r_v, int8_t, ste_b_tlb, ste_b_host)
 #define WOP_SSU_B int16_t, int8_t, uint8_t, int16_t, uint16_t
 #define WOP_SSU_H int32_t, int16_t, uint16_t, int32_t, uint32_t
 #define WOP_SSU_W int64_t, int32_t, uint32_t, int64_t, uint64_t
+#define QOP_SSS_B int32_t, int8_t, int8_t, int32_t, int32_t
+#define QOP_SUS_B int32_t, uint8_t, int8_t, uint32_t, int32_t
+#define QOP_SSU_B int32_t, int8_t, uint8_t, int32_t, uint32_t
+#define QOP_UUU_B uint32_t, uint8_t, uint8_t, uint32_t, uint32_t
 #define NOP_SSS_B int8_t, int8_t, int16_t, int8_t, int16_t
 #define NOP_SSS_H int16_t, int16_t, int32_t, int16_t, int32_t
 #define NOP_SSS_W int32_t, int32_t, int64_t, int32_t, int64_t
@@ -5473,3 +5477,62 @@ GEN_VEXT_INT_EXT(vsext_vf2_d, int64_t, int32_t, H8, H4)
 GEN_VEXT_INT_EXT(vsext_vf4_w, int32_t, int8_t,  H4, H1)
 GEN_VEXT_INT_EXT(vsext_vf4_d, int64_t, int16_t, H8, H2)
 GEN_VEXT_INT_EXT(vsext_vf8_d, int64_t, int8_t,  H8, H1)
+
+
+/* Vector dot-product instructions. */
+
+#define OPMVV_VQDOTQ(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2)          \
+static void do_##NAME(void *vd, void *vs1, void *vs2, int i)            \
+{                                                                       \
+    int idx;                                                            \
+    T1 r1;                                                              \
+    T2 r2;                                                              \
+    TX1 *r1_buf = (TX1 *)vs1 + HD(i);                                   \
+    TX2 *r2_buf = (TX2 *)vs2 + HD(i);                                   \
+    TD acc = *((TD *)vd + HD(i));                                       \
+    int64_t partial_sum = 0;                                            \
+                                                                        \
+    for (idx = 0; idx < 4; ++idx) {                                     \
+        r1 = *((T1 *)r1_buf + HS1(idx));                                \
+        r2 = *((T2 *)r2_buf + HS2(idx));                                \
+        partial_sum += (r1 * r2);                                       \
+    }                                                                   \
+    *((TD *)vd + HD(i)) = (acc + partial_sum) & MAKE_64BIT_MASK(0, 32); \
+}
+
+RVVCALL(OPMVV_VQDOTQ, vqdot_vv, QOP_SSS_B, H4, H1, H1)
+RVVCALL(OPMVV_VQDOTQ, vqdotu_vv, QOP_UUU_B, H4, H1, H1)
+RVVCALL(OPMVV_VQDOTQ, vqdotsu_vv, QOP_SUS_B, H4, H1, H1)
+
+GEN_VEXT_VV(vqdot_vv, 4)
+GEN_VEXT_VV(vqdotu_vv, 4)
+GEN_VEXT_VV(vqdotsu_vv, 4)
+
+#define OPMVX_VQDOTQ(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2)          \
+static void do_##NAME(void *vd, target_long s1, void *vs2, int i)       \
+{                                                                       \
+    int idx;                                                            \
+    T1 r1;                                                              \
+    T2 r2;                                                              \
+    TX1 *r1_buf = (TX1 *)&s1;                                           \
+    TX2 *r2_buf = (TX2 *)vs2 + HD(i);                                   \
+    TD acc = *((TD *)vd + HD(i));                                       \
+    int64_t partial_sum = 0;                                            \
+                                                                        \
+    for (idx = 0; idx < 4; ++idx) {                                     \
+        r1 = *((T1 *)r1_buf + HS1(idx));                                \
+        r2 = *((T2 *)r2_buf + HS2(idx));                                \
+        partial_sum += (r1 * r2);                                       \
+    }                                                                   \
+    *((TD *)vd + HD(i)) = (acc + partial_sum) & MAKE_64BIT_MASK(0, 32); \
+}
+
+RVVCALL(OPMVX_VQDOTQ, vqdot_vx, QOP_SSS_B, H4, H1, H1)
+RVVCALL(OPMVX_VQDOTQ, vqdotu_vx, QOP_UUU_B, H4, H1, H1)
+RVVCALL(OPMVX_VQDOTQ, vqdotsu_vx, QOP_SUS_B, H4, H1, H1)
+RVVCALL(OPMVX_VQDOTQ, vqdotus_vx, QOP_SSU_B, H4, H1, H1)
+
+GEN_VEXT_VX(vqdot_vx, 4)
+GEN_VEXT_VX(vqdotu_vx, 4)
+GEN_VEXT_VX(vqdotsu_vx, 4)
+GEN_VEXT_VX(vqdotus_vx, 4)
-- 
2.43.0




* [RFC PATCH 3/3] target/riscv: Expose Zvqdotq extension as a cpu property
  2025-09-01 13:38 [RFC PATCH 0/3] Support RISC-V Zvqdotq vector dot-product extension Max Chou
  2025-09-01 13:38 ` [RFC PATCH 1/3] target/riscv: Add Zvqdotq cfg property Max Chou
  2025-09-01 13:38 ` [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support Max Chou
@ 2025-09-01 13:38 ` Max Chou
  2 siblings, 0 replies; 8+ messages in thread
From: Max Chou @ 2025-09-01 13:38 UTC
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei, Max Chou

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/cpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 95edd02e683..ed486113ba1 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1373,6 +1373,7 @@ const RISCVCPUMultiExtConfig riscv_cpu_vendor_exts[] = {
 /* These are experimental so mark with 'x-' */
 const RISCVCPUMultiExtConfig riscv_cpu_experimental_exts[] = {
     MULTI_EXT_CFG_BOOL("x-svukte", ext_svukte, false),
+    MULTI_EXT_CFG_BOOL("x-zvqdotq", ext_zvqdotq, false),
 
     { },
 };
-- 
2.43.0




* Re: [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support
  2025-09-01 13:38 ` [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support Max Chou
@ 2025-09-02 13:38   ` Richard Henderson
  2025-09-03  0:26     ` Max Chou
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Henderson @ 2025-09-02 13:38 UTC
  To: Max Chou, qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On 9/1/25 23:38, Max Chou wrote:
> +#define OPMVV_VQDOTQ(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2)          \
> +static void do_##NAME(void *vd, void *vs1, void *vs2, int i)            \
> +{                                                                       \
> +    int idx;                                                            \
> +    T1 r1;                                                              \
> +    T2 r2;                                                              \
> +    TX1 *r1_buf = (TX1 *)vs1 + HD(i);                                   \
> +    TX2 *r2_buf = (TX2 *)vs2 + HD(i);                                   \
> +    TD acc = *((TD *)vd + HD(i));                                       \
> +    int64_t partial_sum = 0;                                            \

I think it's clear partial_sum should be the 32-bit type TD.
Indeed, I'm not sure why you don't just have

	TD acc = ((TD *)vd)[HD(i)];

> +                                                                        \
> +    for (idx = 0; idx < 4; ++idx) {                                     \
> +        r1 = *((T1 *)r1_buf + HS1(idx));                                \
> +        r2 = *((T2 *)r2_buf + HS2(idx));                                \
> +        partial_sum += (r1 * r2);                                       \

	acc += r1 * r2;

> +    }                                                                   \
> +    *((TD *)vd + HD(i)) = (acc + partial_sum) & MAKE_64BIT_MASK(0, 32); \

	((TD *)vd)[HD(i)] = acc;

because that final mask is bogus.


r~

> +}
> +
> +RVVCALL(OPMVV_VQDOTQ, vqdot_vv, QOP_SSS_B, H4, H1, H1)
> +RVVCALL(OPMVV_VQDOTQ, vqdotu_vv, QOP_UUU_B, H4, H1, H1)
> +RVVCALL(OPMVV_VQDOTQ, vqdotsu_vv, QOP_SUS_B, H4, H1, H1)




* Re: [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support
  2025-09-02 13:38   ` Richard Henderson
@ 2025-09-03  0:26     ` Max Chou
  2025-09-03  3:43       ` Richard Henderson
  0 siblings, 1 reply; 8+ messages in thread
From: Max Chou @ 2025-09-03  0:26 UTC
  To: Richard Henderson
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei

On Tue, Sep 2, 2025 at 22:38 Richard Henderson <richard.henderson@linaro.org>
wrote:

> On 9/1/25 23:38, Max Chou wrote:
> > +#define OPMVV_VQDOTQ(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2)          \
> > +static void do_##NAME(void *vd, void *vs1, void *vs2, int i)            \
> > +{                                                                       \
> > +    int idx;                                                            \
> > +    T1 r1;                                                              \
> > +    T2 r2;                                                              \
> > +    TX1 *r1_buf = (TX1 *)vs1 + HD(i);                                   \
> > +    TX2 *r2_buf = (TX2 *)vs2 + HD(i);                                   \
> > +    TD acc = *((TD *)vd + HD(i));                                       \
> > +    int64_t partial_sum = 0;                                            \
>
> I think it's clear partial_sum should be the 32-bit type TD.
> Indeed, I'm not sure why you don't just have
>
>         TD acc = ((TD *)vd)[HD(i)];

Thanks for the suggestion. I’ll update this part in version 2.


>
> > +                                                                        \
> > +    for (idx = 0; idx < 4; ++idx) {                                     \
> > +        r1 = *((T1 *)r1_buf + HS1(idx));                                \
> > +        r2 = *((T2 *)r2_buf + HS2(idx));                                \
> > +        partial_sum += (r1 * r2);                                       \
>
>         acc += r1 * r2;
>
> > +    }                                                                   \
> > +    *((TD *)vd + HD(i)) = (acc + partial_sum) & MAKE_64BIT_MASK(0, 32); \
>
>         ((TD *)vd)[HD(i)] = acc;
>
> because that final mask is bogus.
>

The partial_sum and the final mask are there to ensure the behavior
described in section 3 of the Zvqdotq ISA spec:

“Finally, the four products are accumulated into the corresponding element
of vd, wrapping around signed overflow.”

Thanks,
Max.


* Re: [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support
  2025-09-03  0:26     ` Max Chou
@ 2025-09-03  3:43       ` Richard Henderson
  2025-09-03 12:43         ` Max Chou
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Henderson @ 2025-09-03  3:43 UTC
  To: Max Chou
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei

On 9/3/25 02:26, Max Chou wrote:
> The partial_sum and the final mask are created to ensure the behavior described in the 
> Zvqdotq isa spec section 3 as follows:
> 
> “Finally, the four products are accumulated into the corresponding element of vd, wrapping 
> around signed overflow.”

This is accomplished by the -fwrapv argument to the compiler, with which all of qemu is 
compiled.
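
As a hedged illustration of the point above (a standalone sketch, not code from the patch): a 64-bit partial sum masked to 32 bits and a plain 32-bit wrapping accumulation give the same stored value. The function names are hypothetical. Since a standalone program cannot assume -fwrapv, the direct variant uses uint32_t arithmetic, which wraps modulo 2^32 in ISO C; under -fwrapv, the signed form in the review suggestion behaves the same way.

```c
#include <stdint.h>

/* Patch-style: widen to 64 bits, then mask down to 32 bits. */
static int32_t acc_masked(int32_t acc, int32_t p0, int32_t p1,
                          int32_t p2, int32_t p3)
{
    int64_t partial_sum = 0;

    partial_sum += p0;
    partial_sum += p1;
    partial_sum += p2;
    partial_sum += p3;
    /*
     * Converting back to int32_t is implementation-defined in ISO C;
     * GCC and Clang truncate modulo 2^32.
     */
    return (int32_t)(uint32_t)((acc + partial_sum) & 0xffffffffULL);
}

/* Review-style: accumulate directly in 32 bits, wrapping on overflow. */
static int32_t acc_direct(int32_t acc, int32_t p0, int32_t p1,
                          int32_t p2, int32_t p3)
{
    uint32_t sum = (uint32_t)acc;

    sum += (uint32_t)p0;
    sum += (uint32_t)p1;
    sum += (uint32_t)p2;
    sum += (uint32_t)p3;
    return (int32_t)sum;
}
```

With acc = INT32_MAX and four products of 127 * 127 = 16129, both functions return the same wrapped negative value, so the wider accumulator and the mask add nothing to the stored result.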


r~



* Re: [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support
  2025-09-03  3:43       ` Richard Henderson
@ 2025-09-03 12:43         ` Max Chou
  0 siblings, 0 replies; 8+ messages in thread
From: Max Chou @ 2025-09-03 12:43 UTC
  To: Richard Henderson
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei

On Wed, Sep 3, 2025 at 12:43 PM Richard Henderson
<richard.henderson@linaro.org> wrote:

> On 9/3/25 02:26, Max Chou wrote:
> > The partial_sum and the final mask are created to ensure the behavior
> > described in the Zvqdotq isa spec section 3 as follows:
> >
> > “Finally, the four products are accumulated into the corresponding
> > element of vd, wrapping around signed overflow.”
>
> This is accomplished by the -fwrapv argument to the compiler, with which
> all of qemu is compiled.
>
Thanks for this information. It’s helpful.
I’ll update version 2 to include this part.

Thanks,
Max
