* [RFC PATCH 0/3] Support RISC-V Zvqdotq vector dot-product extension
@ 2025-09-01 13:38 Max Chou
2025-09-01 13:38 ` [RFC PATCH 1/3] target/riscv: Add Zvqdotq cfg property Max Chou
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Max Chou @ 2025-09-01 13:38 UTC (permalink / raw)
To: qemu-devel, qemu-riscv
Cc: Palmer Dabbelt, Alistair Francis, Weiwei Li,
Daniel Henrique Barboza, Liu Zhiwei, Max Chou
This patch series introduces support for the Zvqdotq extension.
The Zvqdotq extension's ISA specification is not yet ratified, so this
patch series is based on the latest draft (v0.0.2) and treats the
Zvqdotq extension as an experimental extension.
The draft of the Zvqdotq ISA specification:
https://github.com/riscv/riscv-dot-product
Max Chou (3):
target/riscv: Add Zvqdotq cfg property
target/riscv: rvv: Add Zvqdotq support
target/riscv: Expose Zvqdotq extension as a cpu property
target/riscv/cpu.c | 2 +
target/riscv/cpu_cfg_fields.h.inc | 1 +
target/riscv/helper.h | 10 +++
target/riscv/insn32.decode | 9 +++
target/riscv/insn_trans/trans_rvzvqdotq.c.inc | 61 ++++++++++++++++++
target/riscv/tcg/tcg-cpu.c | 5 ++
target/riscv/translate.c | 1 +
target/riscv/vector_helper.c | 63 +++++++++++++++++++
8 files changed, 152 insertions(+)
create mode 100644 target/riscv/insn_trans/trans_rvzvqdotq.c.inc
--
2.43.0
* [RFC PATCH 1/3] target/riscv: Add Zvqdotq cfg property
2025-09-01 13:38 [RFC PATCH 0/3] Support RISC-V Zvqdotq vector dot-product extension Max Chou
@ 2025-09-01 13:38 ` Max Chou
2025-09-01 13:38 ` [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support Max Chou
2025-09-01 13:38 ` [RFC PATCH 3/3] target/riscv: Expose Zvqdotq extension as a cpu property Max Chou
2 siblings, 0 replies; 8+ messages in thread
From: Max Chou @ 2025-09-01 13:38 UTC (permalink / raw)
To: qemu-devel, qemu-riscv
Cc: Palmer Dabbelt, Alistair Francis, Weiwei Li,
Daniel Henrique Barboza, Liu Zhiwei, Max Chou
The Zvqdotq extension is the vector dot-product extension of RISC-V.
Signed-off-by: Max Chou <max.chou@sifive.com>
---
target/riscv/cpu.c | 1 +
target/riscv/cpu_cfg_fields.h.inc | 1 +
target/riscv/tcg/tcg-cpu.c | 5 +++++
3 files changed, 7 insertions(+)
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index d055ddf4623..95edd02e683 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -187,6 +187,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
ISA_EXT_DATA_ENTRY(zvksg, PRIV_VERSION_1_12_0, ext_zvksg),
ISA_EXT_DATA_ENTRY(zvksh, PRIV_VERSION_1_12_0, ext_zvksh),
ISA_EXT_DATA_ENTRY(zvkt, PRIV_VERSION_1_12_0, ext_zvkt),
+ ISA_EXT_DATA_ENTRY(zvqdotq, PRIV_VERSION_1_12_0, ext_zvqdotq),
ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx),
ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin),
ISA_EXT_DATA_ENTRY(sdtrig, PRIV_VERSION_1_12_0, debug),
diff --git a/target/riscv/cpu_cfg_fields.h.inc b/target/riscv/cpu_cfg_fields.h.inc
index e2d116f0dfb..5da59c22d68 100644
--- a/target/riscv/cpu_cfg_fields.h.inc
+++ b/target/riscv/cpu_cfg_fields.h.inc
@@ -100,6 +100,7 @@ BOOL_FIELD(ext_zvfbfmin)
BOOL_FIELD(ext_zvfbfwma)
BOOL_FIELD(ext_zvfh)
BOOL_FIELD(ext_zvfhmin)
+BOOL_FIELD(ext_zvqdotq)
BOOL_FIELD(ext_smaia)
BOOL_FIELD(ext_ssaia)
BOOL_FIELD(ext_smctr)
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 78fb2791847..7015370ab00 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -767,6 +767,11 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
return;
}
+ if (cpu->cfg.ext_zvqdotq && !cpu->cfg.ext_zve32x) {
+ error_setg(errp, "Zvqdotq extension requires V or Zve* extensions");
+ return;
+ }
+
if ((cpu->cfg.ext_zvbc || cpu->cfg.ext_zvknhb) && !cpu->cfg.ext_zve64x) {
error_setg(
errp,
--
2.43.0
* [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support
2025-09-01 13:38 [RFC PATCH 0/3] Support RISC-V Zvqdotq vector dot-product extension Max Chou
2025-09-01 13:38 ` [RFC PATCH 1/3] target/riscv: Add Zvqdotq cfg property Max Chou
@ 2025-09-01 13:38 ` Max Chou
2025-09-02 13:38 ` Richard Henderson
2025-09-01 13:38 ` [RFC PATCH 3/3] target/riscv: Expose Zvqdotq extension as a cpu property Max Chou
2 siblings, 1 reply; 8+ messages in thread
From: Max Chou @ 2025-09-01 13:38 UTC (permalink / raw)
To: qemu-devel, qemu-riscv
Cc: Palmer Dabbelt, Alistair Francis, Weiwei Li,
Daniel Henrique Barboza, Liu Zhiwei, Max Chou
Add support for the instructions of the vector dot-product extension (Zvqdotq):
- vqdot.[vv,vx]
- vqdotu.[vv,vx]
- vqdotsu.[vv,vx]
- vqdotus.vx
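For readers new to the extension, here is a minimal sketch of what one 32-bit destination element computes in each variant. It assumes the operand signedness implied by the QOP_* type lists added in this patch (T1 applies to vs1/rs1, T2 to vs2). The names byte_of, dot4_acc and the per-variant wrappers are hypothetical and are not part of the patch; the sketch ignores masking, vl and host-endian handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte idx (0 = least significant) of a 32-bit word, widened as
 * signed or unsigned. Hypothetical helper for illustration only. */
static int32_t byte_of(uint32_t word, int idx, bool is_signed)
{
    int32_t b = (int32_t)((word >> (8 * idx)) & 0xff);
    if (is_signed && b >= 128) {
        b -= 256;
    }
    return b;
}

/* acc plus the sum of four 8-bit products, modulo 2^32. Unsigned
 * arithmetic is used so the wraparound is well defined in plain C. */
static uint32_t dot4_acc(uint32_t acc, uint32_t vs2, bool vs2_signed,
                         uint32_t src1, bool src1_signed)
{
    for (int idx = 0; idx < 4; idx++) {
        int32_t prod = byte_of(vs2, idx, vs2_signed) *
                       byte_of(src1, idx, src1_signed);
        acc += (uint32_t)prod;
    }
    return acc;
}

/* Signedness per variant, as read from the QOP_* macros in the patch. */
static uint32_t vqdot(uint32_t acc, uint32_t vs2, uint32_t src1)
{
    return dot4_acc(acc, vs2, true, src1, true);
}

static uint32_t vqdotu(uint32_t acc, uint32_t vs2, uint32_t src1)
{
    return dot4_acc(acc, vs2, false, src1, false);
}

static uint32_t vqdotsu(uint32_t acc, uint32_t vs2, uint32_t src1)
{
    return dot4_acc(acc, vs2, true, src1, false);
}

static uint32_t vqdotus(uint32_t acc, uint32_t vs2, uint32_t src1)
{
    return dot4_acc(acc, vs2, false, src1, true);
}

/* Small usage check: the byte 0xFF is -1 when signed, 255 when unsigned. */
static void check_variants(void)
{
    assert(vqdot(0, 0x00000002u, 0x000000FFu) == 0xFFFFFFFEu);
    assert(vqdotu(0, 0x00000002u, 0x000000FFu) == 510u);
}
```

The only difference between the four variants is how each source byte is widened before the multiply; the accumulate step is identical.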
Signed-off-by: Max Chou <max.chou@sifive.com>
---
target/riscv/helper.h | 10 +++
target/riscv/insn32.decode | 9 +++
target/riscv/insn_trans/trans_rvzvqdotq.c.inc | 61 ++++++++++++++++++
target/riscv/translate.c | 1 +
target/riscv/vector_helper.c | 63 +++++++++++++++++++
5 files changed, 144 insertions(+)
create mode 100644 target/riscv/insn_trans/trans_rvzvqdotq.c.inc
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index f712b1c368e..80274f1dad6 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1284,3 +1284,13 @@ DEF_HELPER_4(vgmul_vv, void, ptr, ptr, env, i32)
DEF_HELPER_5(vsm4k_vi, void, ptr, ptr, i32, env, i32)
DEF_HELPER_4(vsm4r_vv, void, ptr, ptr, env, i32)
DEF_HELPER_4(vsm4r_vs, void, ptr, ptr, env, i32)
+
+/* Vector dot-product functions */
+DEF_HELPER_6(vqdot_vv, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vqdotu_vv, void, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_6(vqdotsu_vv, void, ptr, ptr, ptr, ptr, env, i32)
+
+DEF_HELPER_6(vqdot_vx, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vqdotu_vx, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vqdotsu_vx, void, ptr, ptr, tl, ptr, env, i32)
+DEF_HELPER_6(vqdotus_vx, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index cd23b1f3a9b..50a61566670 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -1066,3 +1066,12 @@ amominu_h 11000 . . ..... ..... 001 ..... 0101111 @atom_st
amomaxu_h 11100 . . ..... ..... 001 ..... 0101111 @atom_st
amocas_b 00101 . . ..... ..... 000 ..... 0101111 @atom_st
amocas_h 00101 . . ..... ..... 001 ..... 0101111 @atom_st
+
+# *** Zvqdotq Vector Dot-Product Extension ***
+vqdot_vv 101100 . ..... ..... 010 ..... 1010111 @r_vm
+vqdot_vx 101100 . ..... ..... 110 ..... 1010111 @r_vm
+vqdotu_vv 101000 . ..... ..... 010 ..... 1010111 @r_vm
+vqdotu_vx 101000 . ..... ..... 110 ..... 1010111 @r_vm
+vqdotsu_vv 101010 . ..... ..... 010 ..... 1010111 @r_vm
+vqdotsu_vx 101010 . ..... ..... 110 ..... 1010111 @r_vm
+vqdotus_vx 101110 . ..... ..... 110 ..... 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvzvqdotq.c.inc b/target/riscv/insn_trans/trans_rvzvqdotq.c.inc
new file mode 100644
index 00000000000..b203c826a2e
--- /dev/null
+++ b/target/riscv/insn_trans/trans_rvzvqdotq.c.inc
@@ -0,0 +1,61 @@
+/*
+ * RISC-V translation routines for the Zvqdotq vector dot-product extension
+ *
+ * Copyright (c) 2025 Max Chou, max.chou@sifive.com
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+static bool vext_zvqdotq_base_check(DisasContext *s)
+{
+ return s->cfg_ptr->ext_zvqdotq && s->sew == MO_32;
+}
+
+static bool vext_vqdotq_opivv_check(DisasContext *s, arg_rmrr *a)
+{
+ return vext_zvqdotq_base_check(s) && opivv_check(s, a);
+}
+
+#define GEN_VQDOTQ_OPIVV_TRANS(NAME, CHECK) \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
+{ \
+ if (CHECK(s, a)) { \
+ return opivv_trans(a->rd, a->rs1, a->rs2, a->vm, \
+ gen_helper_##NAME, s); \
+ } \
+ return false; \
+}
+
+GEN_VQDOTQ_OPIVV_TRANS(vqdot_vv, vext_vqdotq_opivv_check)
+GEN_VQDOTQ_OPIVV_TRANS(vqdotu_vv, vext_vqdotq_opivv_check)
+GEN_VQDOTQ_OPIVV_TRANS(vqdotsu_vv, vext_vqdotq_opivv_check)
+
+static bool vext_vqdotq_opivx_check(DisasContext *s, arg_rmrr *a)
+{
+ return vext_zvqdotq_base_check(s) && opivx_check(s, a);
+}
+
+#define GEN_VQDOTQ_OPIVX_TRANS(NAME, CHECK) \
+static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \
+{ \
+ if (CHECK(s, a)) { \
+ return opivx_trans(a->rd, a->rs1, a->rs2, a->vm, \
+ gen_helper_##NAME, s); \
+ } \
+ return false; \
+}
+
+GEN_VQDOTQ_OPIVX_TRANS(vqdot_vx, vext_vqdotq_opivx_check)
+GEN_VQDOTQ_OPIVX_TRANS(vqdotu_vx, vext_vqdotq_opivx_check)
+GEN_VQDOTQ_OPIVX_TRANS(vqdotsu_vx, vext_vqdotq_opivx_check)
+GEN_VQDOTQ_OPIVX_TRANS(vqdotus_vx, vext_vqdotq_opivx_check)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 9ddef2d6e2a..6f43ed1ffdb 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -1190,6 +1190,7 @@ static uint32_t opcode_at(DisasContextBase *dcbase, target_ulong pc)
#include "insn_trans/trans_rvzfh.c.inc"
#include "insn_trans/trans_rvk.c.inc"
#include "insn_trans/trans_rvvk.c.inc"
+#include "insn_trans/trans_rvzvqdotq.c.inc"
#include "insn_trans/trans_privileged.c.inc"
#include "insn_trans/trans_svinval.c.inc"
#include "insn_trans/trans_rvbf16.c.inc"
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 7c67d67a13f..961b62add3f 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -921,6 +921,10 @@ GEN_VEXT_ST_WHOLE(vs8r_v, int8_t, ste_b_tlb, ste_b_host)
#define WOP_SSU_B int16_t, int8_t, uint8_t, int16_t, uint16_t
#define WOP_SSU_H int32_t, int16_t, uint16_t, int32_t, uint32_t
#define WOP_SSU_W int64_t, int32_t, uint32_t, int64_t, uint64_t
+#define QOP_SSS_B int32_t, int8_t, int8_t, int32_t, int32_t
+#define QOP_SUS_B int32_t, uint8_t, int8_t, uint32_t, int32_t
+#define QOP_SSU_B int32_t, int8_t, uint8_t, int32_t, uint32_t
+#define QOP_UUU_B uint32_t, uint8_t, uint8_t, uint32_t, uint32_t
#define NOP_SSS_B int8_t, int8_t, int16_t, int8_t, int16_t
#define NOP_SSS_H int16_t, int16_t, int32_t, int16_t, int32_t
#define NOP_SSS_W int32_t, int32_t, int64_t, int32_t, int64_t
@@ -5473,3 +5477,62 @@ GEN_VEXT_INT_EXT(vsext_vf2_d, int64_t, int32_t, H8, H4)
GEN_VEXT_INT_EXT(vsext_vf4_w, int32_t, int8_t, H4, H1)
GEN_VEXT_INT_EXT(vsext_vf4_d, int64_t, int16_t, H8, H2)
GEN_VEXT_INT_EXT(vsext_vf8_d, int64_t, int8_t, H8, H1)
+
+
+/* Vector dot-product instructions. */
+
+#define OPMVV_VQDOTQ(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2) \
+static void do_##NAME(void *vd, void *vs1, void *vs2, int i) \
+{ \
+ int idx; \
+ T1 r1; \
+ T2 r2; \
+ TX1 *r1_buf = (TX1 *)vs1 + HD(i); \
+ TX2 *r2_buf = (TX2 *)vs2 + HD(i); \
+ TD acc = *((TD *)vd + HD(i)); \
+ int64_t partial_sum = 0; \
+ \
+ for (idx = 0; idx < 4; ++idx) { \
+ r1 = *((T1 *)r1_buf + HS1(idx)); \
+ r2 = *((T2 *)r2_buf + HS2(idx)); \
+ partial_sum += (r1 * r2); \
+ } \
+ *((TD *)vd + HD(i)) = (acc + partial_sum) & MAKE_64BIT_MASK(0, 32); \
+}
+
+RVVCALL(OPMVV_VQDOTQ, vqdot_vv, QOP_SSS_B, H4, H1, H1)
+RVVCALL(OPMVV_VQDOTQ, vqdotu_vv, QOP_UUU_B, H4, H1, H1)
+RVVCALL(OPMVV_VQDOTQ, vqdotsu_vv, QOP_SUS_B, H4, H1, H1)
+
+GEN_VEXT_VV(vqdot_vv, 4)
+GEN_VEXT_VV(vqdotu_vv, 4)
+GEN_VEXT_VV(vqdotsu_vv, 4)
+
+#define OPMVX_VQDOTQ(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2) \
+static void do_##NAME(void *vd, target_long s1, void *vs2, int i) \
+{ \
+ int idx; \
+ T1 r1; \
+ T2 r2; \
+ TX1 *r1_buf = (TX1 *)&s1; \
+ TX2 *r2_buf = (TX2 *)vs2 + HD(i); \
+ TD acc = *((TD *)vd + HD(i)); \
+ int64_t partial_sum = 0; \
+ \
+ for (idx = 0; idx < 4; ++idx) { \
+ r1 = *((T1 *)r1_buf + HS1(idx)); \
+ r2 = *((T2 *)r2_buf + HS2(idx)); \
+ partial_sum += (r1 * r2); \
+ } \
+ *((TD *)vd + HD(i)) = (acc + partial_sum) & MAKE_64BIT_MASK(0, 32); \
+}
+
+RVVCALL(OPMVX_VQDOTQ, vqdot_vx, QOP_SSS_B, H4, H1, H1)
+RVVCALL(OPMVX_VQDOTQ, vqdotu_vx, QOP_UUU_B, H4, H1, H1)
+RVVCALL(OPMVX_VQDOTQ, vqdotsu_vx, QOP_SUS_B, H4, H1, H1)
+RVVCALL(OPMVX_VQDOTQ, vqdotus_vx, QOP_SSU_B, H4, H1, H1)
+
+GEN_VEXT_VX(vqdot_vx, 4)
+GEN_VEXT_VX(vqdotu_vx, 4)
+GEN_VEXT_VX(vqdotsu_vx, 4)
+GEN_VEXT_VX(vqdotus_vx, 4)
--
2.43.0
* [RFC PATCH 3/3] target/riscv: Expose Zvqdotq extension as a cpu property
2025-09-01 13:38 [RFC PATCH 0/3] Support RISC-V Zvqdotq vector dot-product extension Max Chou
2025-09-01 13:38 ` [RFC PATCH 1/3] target/riscv: Add Zvqdotq cfg property Max Chou
2025-09-01 13:38 ` [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support Max Chou
@ 2025-09-01 13:38 ` Max Chou
2 siblings, 0 replies; 8+ messages in thread
From: Max Chou @ 2025-09-01 13:38 UTC (permalink / raw)
To: qemu-devel, qemu-riscv
Cc: Palmer Dabbelt, Alistair Francis, Weiwei Li,
Daniel Henrique Barboza, Liu Zhiwei, Max Chou
Signed-off-by: Max Chou <max.chou@sifive.com>
---
target/riscv/cpu.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 95edd02e683..ed486113ba1 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1373,6 +1373,7 @@ const RISCVCPUMultiExtConfig riscv_cpu_vendor_exts[] = {
/* These are experimental so mark with 'x-' */
const RISCVCPUMultiExtConfig riscv_cpu_experimental_exts[] = {
MULTI_EXT_CFG_BOOL("x-svukte", ext_svukte, false),
+ MULTI_EXT_CFG_BOOL("x-zvqdotq", ext_zvqdotq, false),
{ },
};
--
2.43.0
* Re: [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support
2025-09-01 13:38 ` [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support Max Chou
@ 2025-09-02 13:38 ` Richard Henderson
2025-09-03 0:26 ` Max Chou
0 siblings, 1 reply; 8+ messages in thread
From: Richard Henderson @ 2025-09-02 13:38 UTC (permalink / raw)
To: Max Chou, qemu-devel, qemu-riscv
Cc: Palmer Dabbelt, Alistair Francis, Weiwei Li,
Daniel Henrique Barboza, Liu Zhiwei
On 9/1/25 23:38, Max Chou wrote:
> +#define OPMVV_VQDOTQ(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2) \
> +static void do_##NAME(void *vd, void *vs1, void *vs2, int i) \
> +{ \
> + int idx; \
> + T1 r1; \
> + T2 r2; \
> + TX1 *r1_buf = (TX1 *)vs1 + HD(i); \
> + TX2 *r2_buf = (TX2 *)vs2 + HD(i); \
> + TD acc = *((TD *)vd + HD(i)); \
> + int64_t partial_sum = 0; \
I think it's clear partial_sum should be the 32-bit type TD.
Indeed, I'm not sure why you don't just have
TD acc = ((TD *)vd)[HD(i)];
> + \
> + for (idx = 0; idx < 4; ++idx) { \
> + r1 = *((T1 *)r1_buf + HS1(idx)); \
> + r2 = *((T2 *)r2_buf + HS2(idx)); \
> + partial_sum += (r1 * r2); \
acc += r1 * r2;
> + } \
> + *((TD *)vd + HD(i)) = (acc + partial_sum) & MAKE_64BIT_MASK(0, 32); \
((TD *)vd)[HD(i)] = acc;
because that final mask is bogus.
r~
> +}
> +
> +RVVCALL(OPMVV_VQDOTQ, vqdot_vv, QOP_SSS_B, H4, H1, H1)
> +RVVCALL(OPMVV_VQDOTQ, vqdotu_vv, QOP_UUU_B, H4, H1, H1)
> +RVVCALL(OPMVV_VQDOTQ, vqdotsu_vv, QOP_SUS_B, H4, H1, H1)
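Put together, the review comments above suggest a helper shaped roughly like the sketch below: accumulate straight into the 32-bit destination type and store it back without a mask. This is an illustration only, not the v2 patch. GEN_DOT4 and the generated function names are hypothetical, and the sketch indexes plain byte arrays instead of using the H1/H4 host-endian macros from vector_helper.c. The signed instance relies on wrapping signed arithmetic (-fwrapv) if the accumulator overflows.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for OPMVV_VQDOTQ with the reviewer's changes:
 * no 64-bit partial sum and no final mask. */
#define GEN_DOT4(NAME, TD, T1, T2)                                  \
static void NAME(TD *vd, const T1 *vs1, const T2 *vs2, int i)       \
{                                                                   \
    TD acc = vd[i];                                                 \
                                                                    \
    for (int idx = 0; idx < 4; idx++) {                             \
        /* Each 8-bit operand promotes to int before the multiply. */ \
        acc += vs1[4 * i + idx] * vs2[4 * i + idx];                 \
    }                                                               \
    vd[i] = acc;                                                    \
}

GEN_DOT4(dot4_sss, int32_t, int8_t, int8_t)     /* signed x signed */
GEN_DOT4(dot4_uuu, uint32_t, uint8_t, uint8_t)  /* unsigned x unsigned */

/* Small usage check on values that do not overflow. */
static void check_dot4(void)
{
    int32_t vd[1] = { 100 };
    const int8_t a[4] = { 1, 2, 3, 4 };
    const int8_t b[4] = { 1, 1, 1, 1 };
    uint32_t ud[1] = { 0 };
    const uint8_t ua[4] = { 255, 255, 255, 255 };

    dot4_sss(vd, a, b, 0);
    assert(vd[0] == 110);

    dot4_uuu(ud, ua, ua, 0);
    assert(ud[0] == 260100u);
}
```

Four 8-bit products sum to at most 4 * 255 * 255 = 260100 in magnitude, so only the final addition to the accumulator can wrap.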
* Re: [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support
2025-09-02 13:38 ` Richard Henderson
@ 2025-09-03 0:26 ` Max Chou
2025-09-03 3:43 ` Richard Henderson
0 siblings, 1 reply; 8+ messages in thread
From: Max Chou @ 2025-09-03 0:26 UTC (permalink / raw)
To: Richard Henderson
Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei
On Tue, Sep 2, 2025 at 22:38 Richard Henderson <richard.henderson@linaro.org> wrote:
> On 9/1/25 23:38, Max Chou wrote:
> > +#define OPMVV_VQDOTQ(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2) \
> > +static void do_##NAME(void *vd, void *vs1, void *vs2, int i) \
> > +{ \
> > + int idx; \
> > + T1 r1; \
> > + T2 r2; \
> > + TX1 *r1_buf = (TX1 *)vs1 + HD(i); \
> > + TX2 *r2_buf = (TX2 *)vs2 + HD(i); \
> > + TD acc = *((TD *)vd + HD(i)); \
> > + int64_t partial_sum = 0; \
>
> I think it's clear partial_sum should be the 32-bit type TD.
> Indeed, I'm not sure why you don't just have
>
> TD acc = ((TD *)vd)[HD(i)];
Thanks for the suggestion. I’ll update this part in version 2.
>
> > + \
> > + for (idx = 0; idx < 4; ++idx) { \
> > + r1 = *((T1 *)r1_buf + HS1(idx)); \
> > + r2 = *((T2 *)r2_buf + HS2(idx)); \
> > + partial_sum += (r1 * r2); \
>
> acc += r1 * r2;
>
> > + } \
> > + *((TD *)vd + HD(i)) = (acc + partial_sum) & MAKE_64BIT_MASK(0, 32); \
>
> ((TD *)vd)[HD(i)] = acc;
>
> because that final mask is bogus.
>
The partial_sum and the final mask are created to ensure the behavior
described in the Zvqdotq isa spec section 3 as follows:
“Finally, the four products are accumulated into the corresponding element
of vd, wrapping around signed overflow.”
Thanks,
Max.
* Re: [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support
2025-09-03 0:26 ` Max Chou
@ 2025-09-03 3:43 ` Richard Henderson
2025-09-03 12:43 ` Max Chou
0 siblings, 1 reply; 8+ messages in thread
From: Richard Henderson @ 2025-09-03 3:43 UTC (permalink / raw)
To: Max Chou
Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei
On 9/3/25 02:26, Max Chou wrote:
> The partial_sum and the final mask are created to ensure the behavior described in the
> Zvqdotq isa spec section 3 as follows:
>
> “Finally, the four products are accumulated into the corresponding element of vd, wrapping
> around signed overflow.”
This is accomplished by the -fwrapv argument to the compiler, with which all of qemu is
compiled.
r~
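To illustrate the point, the sketch below compares the RFC's approach (64-bit sum, then keep the low 32 bits) with plain 32-bit accumulation that wraps. Both function names are hypothetical. The narrow version uses uint32_t so that the wraparound is defined by the C standard even without -fwrapv; with -fwrapv, accumulating in int32_t yields the same bit pattern.

```c
#include <assert.h>
#include <stdint.h>

/* As in the RFC patch: widen to 64 bits, add, then mask to 32 bits. */
static uint32_t acc_wide_then_mask(int32_t acc, const int32_t prod[4])
{
    int64_t partial_sum = 0;

    for (int i = 0; i < 4; i++) {
        partial_sum += prod[i];
    }
    return (uint32_t)(((int64_t)acc + partial_sum) & 0xffffffffLL);
}

/* As suggested in review: accumulate directly in 32 bits and let the
 * result wrap modulo 2^32. */
static uint32_t acc_narrow_wrapping(int32_t acc, const int32_t prod[4])
{
    uint32_t r = (uint32_t)acc;

    for (int i = 0; i < 4; i++) {
        r += (uint32_t)prod[i];
    }
    return r;
}

/* Small usage check at the positive overflow boundary. */
static void check_wrap_equivalence(void)
{
    const int32_t p[4] = { 65025, 65025, 65025, 65025 };

    assert(acc_wide_then_mask(INT32_MAX, p) ==
           acc_narrow_wrapping(INT32_MAX, p));
}
```

Because addition modulo 2^32 gives the same low 32 bits whether or not the intermediate sum is held in a wider type, the 64-bit temporary and the mask add nothing to the result.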
* Re: [RFC PATCH 2/3] target/riscv: rvv: Add Zvqdotq support
2025-09-03 3:43 ` Richard Henderson
@ 2025-09-03 12:43 ` Max Chou
0 siblings, 0 replies; 8+ messages in thread
From: Max Chou @ 2025-09-03 12:43 UTC (permalink / raw)
To: Richard Henderson
Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei
On Wed, Sep 3, 2025 at 12:43 PM Richard Henderson <richard.henderson@linaro.org> wrote:
> On 9/3/25 02:26, Max Chou wrote:
> > The partial_sum and the final mask are created to ensure the behavior described in the
> > Zvqdotq isa spec section 3 as follows:
> >
> > “Finally, the four products are accumulated into the corresponding element of vd, wrapping
> > around signed overflow.”
>
> This is accomplished by the -fwrapv argument to the compiler, with which all of qemu is
> compiled.
>
Thanks for this information. It’s helpful.
I’ll include this change in version 2.
Thanks,
Max