* [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16
@ 2026-06-25  1:51 Richard Henderson
  2026-06-25  1:51 ` [PATCH 01/10] target/arm: Enable FADD/FSUB (half-precision) with FEAT_SME_F8F16 Richard Henderson
                   ` (11 more replies)
  0 siblings, 12 replies; 25+ messages in thread
From: Richard Henderson @ 2026-06-25  1:51 UTC
  To: qemu-devel; +Cc: qemu-arm

Another minor feature working toward SME2.2.

r~

Richard Henderson (10):
  target/arm: Enable FADD/FSUB (half-precision) with FEAT_SME_F8F16
  target/arm: Rename SME FMLAL/FMLSL patterns
  target/arm: Implement FMLAL (multiple, multiple and single, FP8 to
    FP16)
  target/arm: Implement FMLAL (multiple and indexed, FP8 to FP16)
  target/arm: Implement FDOT (multiple, multiple and single, FP8 to
    FP16)
  target/arm: Implement FDOT (multiple and indexed, FP8 to FP16)
  target/arm: Implement FMOPA (widening, 2-way, FP8 to FP16)
  target/arm: Rename FVDOT pattern
  target/arm: Implement FVDOT (FP8 to FP16)
  target/arm: Enable FEAT_SME_F8F16 for -cpu max

 target/arm/cpu-features.h        | 11 +++++
 target/arm/tcg/helper-fp8-defs.h |  2 +
 linux-user/aarch64/elfload.c     |  1 +
 target/arm/tcg/cpu64.c           |  1 +
 target/arm/tcg/fp8_helper.c      | 58 +++++++++++++++++++++++++
 target/arm/tcg/translate-sme.c   | 72 +++++++++++++++++++++++---------
 docs/system/arm/emulation.rst    |  1 +
 target/arm/tcg/sme.decode        | 66 +++++++++++++++++++++--------
 8 files changed, 176 insertions(+), 36 deletions(-)

-- 
2.43.0




* [PATCH 01/10] target/arm: Enable FADD/FSUB (half-precision) with FEAT_SME_F8F16
  2026-06-25  1:51 [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Richard Henderson
@ 2026-06-25  1:51 ` Richard Henderson
  2026-06-26  9:03   ` Peter Maydell
  2026-06-25  1:51 ` [PATCH 02/10] target/arm: Rename SME FMLAL/FMLSL patterns Richard Henderson
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2026-06-25  1:51 UTC
  To: qemu-devel; +Cc: qemu-arm

These two instructions can be enabled with either
FEAT_SME_F8F16 or FEAT_SME_F16F16.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
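Note: the either/or gating described in the commit message can be
sketched in isolation as follows. This is a minimal standalone sketch,
not the QEMU code: the FakeIdRegs struct and its two boolean fields are
hypothetical stand-ins for the F16F16 and F8F16 fields of ID_AA64SMFR0.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the ID register state; not a QEMU type. */
typedef struct {
    bool f16f16;   /* models ID_AA64SMFR0.F16F16 */
    bool f8f16;    /* models ID_AA64SMFR0.F8F16 */
} FakeIdRegs;

static bool has_f16f16(const FakeIdRegs *id)
{
    return id->f16f16;
}

static bool has_f8f16(const FakeIdRegs *id)
{
    return id->f8f16;
}

/* True when at least one of the two features is present. */
static bool has_f16f16_or_f8f16(const FakeIdRegs *id)
{
    return has_f16f16(id) || has_f8f16(id);
}
```

With this shape, a CPU model that implements only one of the two
features still passes the check, which is what FADD/FSUB
(half-precision) need.
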
 target/arm/cpu-features.h      | 11 +++++++++++
 target/arm/tcg/translate-sme.c |  4 ++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu-features.h b/target/arm/cpu-features.h
index 45213f71ab..cd78353609 100644
--- a/target/arm/cpu-features.h
+++ b/target/arm/cpu-features.h
@@ -1605,6 +1605,11 @@ static inline bool isar_feature_aa64_sme_f8f32(const ARMISARegisters *id)
     return FIELD_EX64_IDREG(id, ID_AA64SMFR0, F8F32);
 }
 
+static inline bool isar_feature_aa64_sme_f8f16(const ARMISARegisters *id)
+{
+    return FIELD_EX64_IDREG(id, ID_AA64SMFR0, F8F16);
+}
+
 static inline bool isar_feature_aa64_sme_f16f16(const ARMISARegisters *id)
 {
     return FIELD_EX64_IDREG(id, ID_AA64SMFR0, F16F16);
@@ -1776,6 +1781,12 @@ isar_feature_aa64_sme2_or_sve2_lut(const ARMISARegisters *id)
     return isar_feature_aa64_sme2_or_sve2(id) && isar_feature_aa64_lut(id);
 }
 
+static inline bool
+isar_feature_aa64_sme_f16f16_or_f8f16(const ARMISARegisters *id)
+{
+    return isar_feature_aa64_sme_f16f16(id) || isar_feature_aa64_sme_f8f16(id);
+}
+
 /*
  * Feature tests for "does this exist in either 32-bit or 64-bit?"
  */
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index a79b0a9b80..b8dde80c20 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -1335,9 +1335,9 @@ static bool do_faddsub(DisasContext *s, arg_az_n *a, ARMFPStatusFlavour fpst,
     return true;
 }
 
-TRANS_FEAT(FADD_nn_h, aa64_sme_f16f16, do_faddsub, a,
+TRANS_FEAT(FADD_nn_h, aa64_sme_f16f16_or_f8f16, do_faddsub, a,
            FPST_ZA_F16, gen_helper_gvec_fadd_h)
-TRANS_FEAT(FSUB_nn_h, aa64_sme_f16f16, do_faddsub, a,
+TRANS_FEAT(FSUB_nn_h, aa64_sme_f16f16_or_f8f16, do_faddsub, a,
            FPST_ZA_F16, gen_helper_gvec_fsub_h)
 
 TRANS_FEAT(FADD_nn_s, aa64_sme2, do_faddsub, a,
-- 
2.43.0




* [PATCH 02/10] target/arm: Rename SME FMLAL/FMLSL patterns
  2026-06-25  1:51 [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Richard Henderson
  2026-06-25  1:51 ` [PATCH 01/10] target/arm: Enable FADD/FSUB (half-precision) with FEAT_SME_F8F16 Richard Henderson
@ 2026-06-25  1:51 ` Richard Henderson
  2026-06-25 10:17   ` Peter Maydell
  2026-06-25  1:51 ` [PATCH 03/10] target/arm: Implement FMLAL (multiple, multiple and single, FP8 to FP16) Richard Henderson
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2026-06-25  1:51 UTC
  To: qemu-devel; +Cc: qemu-arm

Rename patterns to include _sh suffix, so that we can
distinguish insns of the same name from FEAT_SME_F8F16.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate-sme.c | 12 ++++++------
 target/arm/tcg/sme.decode      | 32 ++++++++++++++++----------------
 2 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index b8dde80c20..7aa270d3ec 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -1107,10 +1107,10 @@ static bool do_fmlal(DisasContext *s, arg_azz_n *a, bool sub, bool multi)
                          multi, FPST_ENV, gen_helper_sve2_fmlal_zzzw_s);
 }
 
-TRANS_FEAT(FMLAL_n1, aa64_sme2, do_fmlal, a, false, false)
-TRANS_FEAT(FMLSL_n1, aa64_sme2, do_fmlal, a, true, false)
-TRANS_FEAT(FMLAL_nn, aa64_sme2, do_fmlal, a, false, true)
-TRANS_FEAT(FMLSL_nn, aa64_sme2, do_fmlal, a, true, true)
+TRANS_FEAT(FMLAL_n1_sh, aa64_sme2, do_fmlal, a, false, false)
+TRANS_FEAT(FMLSL_n1_sh, aa64_sme2, do_fmlal, a, true, false)
+TRANS_FEAT(FMLAL_nn_sh, aa64_sme2, do_fmlal, a, false, true)
+TRANS_FEAT(FMLSL_nn_sh, aa64_sme2, do_fmlal, a, true, true)
 
 static bool do_fmlall_fp8(DisasContext *s, arg_azz_n *a, bool multi)
 {
@@ -1128,8 +1128,8 @@ static bool do_fmlal_nx(DisasContext *s, arg_azx_n *a, bool sub)
                          false, FPST_ENV, gen_helper_sve2_fmlal_zzxw_s);
 }
 
-TRANS_FEAT(FMLAL_nx, aa64_sme2, do_fmlal_nx, a, false)
-TRANS_FEAT(FMLSL_nx, aa64_sme2, do_fmlal_nx, a, true)
+TRANS_FEAT(FMLAL_nx_sh, aa64_sme2, do_fmlal_nx, a, false)
+TRANS_FEAT(FMLSL_nx_sh, aa64_sme2, do_fmlal_nx, a, true)
 
 static bool do_bfmlal(DisasContext *s, arg_azz_n *a, bool sub, bool multi)
 {
diff --git a/target/arm/tcg/sme.decode b/target/arm/tcg/sme.decode
index 1de5f341ef..90ee161461 100644
--- a/target/arm/tcg/sme.decode
+++ b/target/arm/tcg/sme.decode
@@ -324,13 +324,13 @@ SUB_azz_n1_d    11000001 0111 .... 0 .. 110 ..... 11 ...    @azz_nx1_o3 n=4
 @azz_nx1_o2x2   ........ ... . zm:4 . .. ... zn:5 ... ..    \
                 &azz_n off=%off2_x2 rv=%mova_rv
 
-FMLAL_n1        11000001 001 0 .... 0 .. 011 ..... 00 ...   @azz_nx1_o3x2 n=1
-FMLAL_n1        11000001 001 0 .... 0 .. 010 ..... 000 ..   @azz_nx1_o2x2 n=2
-FMLAL_n1        11000001 001 1 .... 0 .. 010 ..... 000 ..   @azz_nx1_o2x2 n=4
+FMLAL_n1_sh     11000001 001 0 .... 0 .. 011 ..... 00 ...   @azz_nx1_o3x2 n=1
+FMLAL_n1_sh     11000001 001 0 .... 0 .. 010 ..... 000 ..   @azz_nx1_o2x2 n=2
+FMLAL_n1_sh     11000001 001 1 .... 0 .. 010 ..... 000 ..   @azz_nx1_o2x2 n=4
 
-FMLSL_n1        11000001 001 0 .... 0 .. 011 ..... 01 ...   @azz_nx1_o3x2 n=1
-FMLSL_n1        11000001 001 0 .... 0 .. 010 ..... 010 ..   @azz_nx1_o2x2 n=2
-FMLSL_n1        11000001 001 1 .... 0 .. 010 ..... 010 ..   @azz_nx1_o2x2 n=4
+FMLSL_n1_sh     11000001 001 0 .... 0 .. 011 ..... 01 ...   @azz_nx1_o3x2 n=1
+FMLSL_n1_sh     11000001 001 0 .... 0 .. 010 ..... 010 ..   @azz_nx1_o2x2 n=2
+FMLSL_n1_sh     11000001 001 1 .... 0 .. 010 ..... 010 ..   @azz_nx1_o2x2 n=4
 
 BFMLAL_n1       11000001 001 0 .... 0 .. 011 ..... 10 ...   @azz_nx1_o3x2 n=1
 BFMLAL_n1       11000001 001 0 .... 0 .. 010 ..... 100 ..   @azz_nx1_o2x2 n=2
@@ -477,11 +477,11 @@ SUB_azz_nn_d    11000001 111 ...01 0 .. 110 ...00 11 ...    @azz_4x4_o3
 @azz_4x4_o2x2   ........ ... ..... . .. ... ..... ... ..    \
                 &azz_n n=4 rv=%mova_rv zn=%zn_ax4 zm=%zm_ax4 off=%off2_x2
 
-FMLAL_nn        11000001 101 ....0 0 .. 010 ....0 000 ..    @azz_2x2_o2x2
-FMLAL_nn        11000001 101 ...01 0 .. 010 ...00 000 ..    @azz_4x4_o2x2
+FMLAL_nn_sh     11000001 101 ....0 0 .. 010 ....0 000 ..    @azz_2x2_o2x2
+FMLAL_nn_sh     11000001 101 ...01 0 .. 010 ...00 000 ..    @azz_4x4_o2x2
 
-FMLSL_nn        11000001 101 ....0 0 .. 010 ....0 010 ..    @azz_2x2_o2x2
-FMLSL_nn        11000001 101 ...01 0 .. 010 ...00 010 ..    @azz_4x4_o2x2
+FMLSL_nn_sh     11000001 101 ....0 0 .. 010 ....0 010 ..    @azz_2x2_o2x2
+FMLSL_nn_sh     11000001 101 ...01 0 .. 010 ...00 010 ..    @azz_4x4_o2x2
 
 BFMLAL_nn       11000001 101 ....0 0 .. 010 ....0 100 ..    @azz_2x2_o2x2
 BFMLAL_nn       11000001 101 ...01 0 .. 010 ...00 100 ..    @azz_4x4_o2x2
@@ -617,13 +617,13 @@ BFSUB_nn        11000001 111 00101 0 .. 111 ...00 01 ...    @az_4x4_o3
 @azx_4x1_o2x2   ........ .... zm:4 . .. . .. ..... .. ...   \
                 &azx_n n=4 rv=%mova_rv off=%off2_x2 zn=%zn_ax4 idx=%idx2_10_2
 
-FMLAL_nx        11000001 1000 .... . .. 1 .. ..... 00 ...   @azx_1x1_o3x2
-FMLAL_nx        11000001 1001 .... 0 .. 1 .. ....0 00 ...   @azx_2x1_o2x2
-FMLAL_nx        11000001 1001 .... 1 .. 1 .. ...00 00 ...   @azx_4x1_o2x2
+FMLAL_nx_sh     11000001 1000 .... . .. 1 .. ..... 00 ...   @azx_1x1_o3x2
+FMLAL_nx_sh     11000001 1001 .... 0 .. 1 .. ....0 00 ...   @azx_2x1_o2x2
+FMLAL_nx_sh     11000001 1001 .... 1 .. 1 .. ...00 00 ...   @azx_4x1_o2x2
 
-FMLSL_nx        11000001 1000 .... . .. 1 .. ..... 01 ...   @azx_1x1_o3x2
-FMLSL_nx        11000001 1001 .... 0 .. 1 .. ....0 01 ...   @azx_2x1_o2x2
-FMLSL_nx        11000001 1001 .... 1 .. 1 .. ...00 01 ...   @azx_4x1_o2x2
+FMLSL_nx_sh     11000001 1000 .... . .. 1 .. ..... 01 ...   @azx_1x1_o3x2
+FMLSL_nx_sh     11000001 1001 .... 0 .. 1 .. ....0 01 ...   @azx_2x1_o2x2
+FMLSL_nx_sh     11000001 1001 .... 1 .. 1 .. ...00 01 ...   @azx_4x1_o2x2
 
 BFMLAL_nx       11000001 1000 .... . .. 1 .. ..... 10 ...   @azx_1x1_o3x2
 BFMLAL_nx       11000001 1001 .... 0 .. 1 .. ....0 10 ...   @azx_2x1_o2x2
-- 
2.43.0




* [PATCH 03/10] target/arm: Implement FMLAL (multiple, multiple and single, FP8 to FP16)
  2026-06-25  1:51 [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Richard Henderson
  2026-06-25  1:51 ` [PATCH 01/10] target/arm: Enable FADD/FSUB (half-precision) with FEAT_SME_F8F16 Richard Henderson
  2026-06-25  1:51 ` [PATCH 02/10] target/arm: Rename SME FMLAL/FMLSL patterns Richard Henderson
@ 2026-06-25  1:51 ` Richard Henderson
  2026-06-26  9:11   ` Peter Maydell
  2026-06-25  1:51 ` [PATCH 04/10] target/arm: Implement FMLAL (multiple and indexed, " Richard Henderson
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2026-06-25  1:51 UTC
  To: qemu-devel; +Cc: qemu-arm

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate-sme.c | 9 +++++++++
 target/arm/tcg/sme.decode      | 7 +++++++
 2 files changed, 16 insertions(+)

diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 7aa270d3ec..7cb7b71e74 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -1121,6 +1121,15 @@ static bool do_fmlall_fp8(DisasContext *s, arg_azz_n *a, bool multi)
 TRANS_FEAT(FMLALL_n1_b, aa64_sme_f8f32, do_fmlall_fp8, a, false)
 TRANS_FEAT(FMLALL_nn_b, aa64_sme_f8f32, do_fmlall_fp8, a, true)
 
+static bool do_fmlal_fp8(DisasContext *s, arg_azz_n *a, bool multi)
+{
+    return do_azz_acc_fp8(s, a->n, 2, a->rv, a->off, a->zn, a->zm,
+                          0, 0, multi, gen_helper_gvec_fmla_hb);
+}
+
+TRANS_FEAT(FMLAL_n1_hb, aa64_sme_f8f16, do_fmlal_fp8, a, false)
+TRANS_FEAT(FMLAL_nn_hb, aa64_sme_f8f16, do_fmlal_fp8, a, true)
+
 static bool do_fmlal_nx(DisasContext *s, arg_azx_n *a, bool sub)
 {
     return do_azz_acc_fp(s, a->n, 2, a->rv, a->off, a->zn, a->zm,
diff --git a/target/arm/tcg/sme.decode b/target/arm/tcg/sme.decode
index 90ee161461..b735f3de82 100644
--- a/target/arm/tcg/sme.decode
+++ b/target/arm/tcg/sme.decode
@@ -452,6 +452,10 @@ FMLALL_n1_b     11000001 001 1 .... 0 .. 000 ..... 0001 .   @azz_nx1_o1x4 n=4
 FDOT_n1_sb      11000001 001 0 .... 0 .. 100 ..... 11 ...   @azz_nx1_o3 n=2
 FDOT_n1_sb      11000001 001 1 .... 0 .. 100 ..... 11 ...   @azz_nx1_o3 n=4
 
+FMLAL_n1_hb     11000001 001 1 .... 0 .. 011 ..... 00 ...   @azz_nx1_o3x2 n=1
+FMLAL_n1_hb     11000001 001 0 .... 0 .. 010 ..... 001 ..   @azz_nx1_o2x2 n=2
+FMLAL_n1_hb     11000001 001 1 .... 0 .. 010 ..... 001 ..   @azz_nx1_o2x2 n=4
+
 ### SME2 Multi-vector Multiple Array Vectors
 
 %zn_ax2         6:4 !function=times_2
@@ -578,6 +582,9 @@ FMLALL_nn_b     11000001 101 ...01 0 .. 000 ...01 0000 .    @azz_4x4_o1x4
 FDOT_nn_sb      11000001 101 ....0 0 .. 100 ....1 10 ...    @azz_2x2_o3
 FDOT_nn_sb      11000001 101 ...01 0 .. 100 ...01 10 ...    @azz_4x4_o3
 
+FMLAL_nn_hb     11000001 101 ....0 0 .. 010 ....1 000 ..    @azz_2x2_o2x2
+FMLAL_nn_hb     11000001 101 ...01 0 .. 010 ...01 000 ..    @azz_4x4_o2x2
+
 &az_n           n off rv zm
 @az_2x2_o3      ........ ... ..... . .. ... ..... .. off:3  \
                 &az_n n=2 rv=%mova_rv zm=%zn_ax2
-- 
2.43.0




* [PATCH 04/10] target/arm: Implement FMLAL (multiple and indexed, FP8 to FP16)
  2026-06-25  1:51 [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Richard Henderson
                   ` (2 preceding siblings ...)
  2026-06-25  1:51 ` [PATCH 03/10] target/arm: Implement FMLAL (multiple, multiple and single, FP8 to FP16) Richard Henderson
@ 2026-06-25  1:51 ` Richard Henderson
  2026-06-26  9:12   ` Peter Maydell
  2026-06-25  1:51 ` [PATCH 05/10] target/arm: Implement FDOT (multiple, multiple and single, " Richard Henderson
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2026-06-25  1:51 UTC
  To: qemu-devel; +Cc: qemu-arm

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
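Note on the %idx4_15_10_3 field added below: a decodetree field that
lists several pos:len pieces concatenates them, with the first piece
most significant. A minimal standalone sketch of the resulting
extraction follows. The names extract_bits and idx4_15_10_3 are
hypothetical helpers written for this note, standing in for what the
generated decoder computes.

```c
#include <stdint.h>

/* Extract 'length' bits of 'value' starting at bit 'start' (length < 32). */
static uint32_t extract_bits(uint32_t value, int start, int length)
{
    return (value >> start) & ((1u << length) - 1);
}

/* Model of "%idx4_15_10_3  15:1 10:2 3:1": a 4-bit index built from
 * insn bit 15 (most significant), bits 11:10, and bit 3 (least). */
static uint32_t idx4_15_10_3(uint32_t insn)
{
    uint32_t hi  = extract_bits(insn, 15, 1);
    uint32_t mid = extract_bits(insn, 10, 2);
    uint32_t lo  = extract_bits(insn, 3, 1);

    return (hi << 3) | (mid << 1) | lo;
}
```

The same reading applies to %idx4_10_2, where bits 11:10 form the high
half of the index and bits 3:2 the low half.
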
 target/arm/tcg/translate-sme.c |  4 ++++
 target/arm/tcg/sme.decode      | 13 +++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 7cb7b71e74..98f6eb0b70 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -1140,6 +1140,10 @@ static bool do_fmlal_nx(DisasContext *s, arg_azx_n *a, bool sub)
 TRANS_FEAT(FMLAL_nx_sh, aa64_sme2, do_fmlal_nx, a, false)
 TRANS_FEAT(FMLSL_nx_sh, aa64_sme2, do_fmlal_nx, a, true)
 
+TRANS_FEAT(FMLAL_nx_hb, aa64_sme_f8f16, do_azz_acc_fp8,
+           a->n, 2, a->rv, a->off, a->zn, a->zm,
+           a->idx << 2, 0, false, gen_helper_gvec_fmla_idx_hb)
+
 static bool do_bfmlal(DisasContext *s, arg_azz_n *a, bool sub, bool multi)
 {
     return do_azz_acc_fp(s, a->n, 2, a->rv, a->off, a->zn, a->zm,
diff --git a/target/arm/tcg/sme.decode b/target/arm/tcg/sme.decode
index b735f3de82..c6e22c4999 100644
--- a/target/arm/tcg/sme.decode
+++ b/target/arm/tcg/sme.decode
@@ -791,6 +791,19 @@ FMLALL_nx_b     11000001 0001 .... 1 .. 0.. ...10 00.. .    @azx_4x1_i4_o1
 FDOT_nx_b       11000001 0101 .... 0 .. 0.. ....1 11 ...    @azx_2x1_i2_o3
 FDOT_nx_b       11000001 0101 .... 1 .. 0.. ...00 01 ...    @azx_4x1_i2_o3
 
+%idx4_15_10_3   15:1 10:2 3:1
+%idx4_10_2      10:2 2:2
+@azx_1x1_i4_o3x2 ........ .... zm:4 . .. . .. zn:5 .. ...    \
+                 &azx_n n=1 rv=%mova_rv off=%off3_x2 idx=%idx4_15_10_3
+@azx_2x2_i4_o3x2 ........ .... zm:4 . .. . .. .... .. .. ..  \
+                 &azx_n n=2 rv=%mova_rv zn=%zn_ax2 off=%off2_x2 idx=%idx4_10_2
+@azx_4x4_i4_o3x2 ........ .... zm:4 . .. . .. ... ... .. ..  \
+                 &azx_n n=4 rv=%mova_rv zn=%zn_ax4 off=%off2_x2 idx=%idx4_10_2
+
+FMLAL_nx_hb     11000001 1100 .... . .. 0.. ..... 0. ...    @azx_1x1_i4_o3x2
+FMLAL_nx_hb     11000001 1001 .... 0 .. 1.. ....1 1.. ..    @azx_2x2_i4_o3x2
+FMLAL_nx_hb     11000001 1001 .... 1 .. 1.. ...01 0.. ..    @azx_4x4_i4_o3x2
+
 %idx2_10_3      10:1 3:1
 @azx_4x2_i2_o3  ........ .... zm:4 . .. ... .... ... off:3 \
                 &azx_n n=4 rv=%mova_rv zn=%zn_ax2 idx=%idx2_10_3
-- 
2.43.0




* [PATCH 05/10] target/arm: Implement FDOT (multiple, multiple and single, FP8 to FP16)
  2026-06-25  1:51 [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Richard Henderson
                   ` (3 preceding siblings ...)
  2026-06-25  1:51 ` [PATCH 04/10] target/arm: Implement FMLAL (multiple and indexed, " Richard Henderson
@ 2026-06-25  1:51 ` Richard Henderson
  2026-06-25  1:51 ` [PATCH 06/10] target/arm: Implement FDOT (multiple and indexed, " Richard Henderson
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2026-06-25  1:51 UTC
  To: qemu-devel; +Cc: qemu-arm

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate-sme.c | 9 +++++++++
 target/arm/tcg/sme.decode      | 6 ++++++
 2 files changed, 15 insertions(+)

diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 98f6eb0b70..3174b10c30 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -1196,6 +1196,15 @@ static bool do_fdot_fp8(DisasContext *s, arg_azz_n *a, bool multi)
 TRANS_FEAT(FDOT_n1_sb, aa64_sme_f8f32, do_fdot_fp8, a, false)
 TRANS_FEAT(FDOT_nn_sb, aa64_sme_f8f32, do_fdot_fp8, a, true)
 
+static bool do_fdot_hb(DisasContext *s, arg_azz_n *a, bool multi)
+{
+    return do_azz_acc_fp8(s, a->n, 1, a->rv, a->off, a->zn, a->zm,
+                          0, 0, multi, gen_helper_gvec_fdot_hb);
+}
+
+TRANS_FEAT(FDOT_n1_hb, aa64_sme_f8f16, do_fdot_hb, a, false)
+TRANS_FEAT(FDOT_nn_hb, aa64_sme_f8f16, do_fdot_hb, a, true)
+
 static bool do_fdot_nx(DisasContext *s, arg_azx_n *a)
 {
     return do_azz_acc_fp(s, a->n, 1, a->rv, a->off, a->zn, a->zm,
diff --git a/target/arm/tcg/sme.decode b/target/arm/tcg/sme.decode
index c6e22c4999..fbf5f3720d 100644
--- a/target/arm/tcg/sme.decode
+++ b/target/arm/tcg/sme.decode
@@ -456,6 +456,9 @@ FMLAL_n1_hb     11000001 001 1 .... 0 .. 011 ..... 00 ...   @azz_nx1_o3x2 n=1
 FMLAL_n1_hb     11000001 001 0 .... 0 .. 010 ..... 001 ..   @azz_nx1_o2x2 n=2
 FMLAL_n1_hb     11000001 001 1 .... 0 .. 010 ..... 001 ..   @azz_nx1_o2x2 n=4
 
+FDOT_n1_hb      11000001 001 0 .... 0 .. 100 ..... 01 ...   @azz_nx1_o3 n=2
+FDOT_n1_hb      11000001 001 1 .... 0 .. 100 ..... 01 ...   @azz_nx1_o3 n=4
+
 ### SME2 Multi-vector Multiple Array Vectors
 
 %zn_ax2         6:4 !function=times_2
@@ -585,6 +588,9 @@ FDOT_nn_sb      11000001 101 ...01 0 .. 100 ...01 10 ...    @azz_4x4_o3
 FMLAL_nn_hb     11000001 101 ....0 0 .. 010 ....1 000 ..    @azz_2x2_o2x2
 FMLAL_nn_hb     11000001 101 ...01 0 .. 010 ...01 000 ..    @azz_4x4_o2x2
 
+FDOT_nn_hb      11000001 101 ....0 0 .. 100 ....1 00 ...    @azz_2x2_o3
+FDOT_nn_hb      11000001 101 ...01 0 .. 100 ...01 00 ...    @azz_4x4_o3
+
 &az_n           n off rv zm
 @az_2x2_o3      ........ ... ..... . .. ... ..... .. off:3  \
                 &az_n n=2 rv=%mova_rv zm=%zn_ax2
-- 
2.43.0




* [PATCH 06/10] target/arm: Implement FDOT (multiple and indexed, FP8 to FP16)
  2026-06-25  1:51 [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Richard Henderson
                   ` (4 preceding siblings ...)
  2026-06-25  1:51 ` [PATCH 05/10] target/arm: Implement FDOT (multiple, multiple and single, " Richard Henderson
@ 2026-06-25  1:51 ` Richard Henderson
  2026-06-26  9:16   ` Peter Maydell
  2026-06-25  1:51 ` [PATCH 07/10] target/arm: Implement FMOPA (widening, 2-way, " Richard Henderson
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2026-06-25  1:51 UTC
  To: qemu-devel; +Cc: qemu-arm

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate-sme.c | 4 ++++
 target/arm/tcg/sme.decode      | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 3174b10c30..197274d00e 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -1178,6 +1178,10 @@ TRANS_FEAT(FDOT_nx_b, aa64_sme_f8f32, do_azz_acc_fp8,
            a->n, 1, a->rv, a->off, a->zn, a->zm,
            a->idx, 0, false, gen_helper_gvec_fdot_idx_sb)
 
+TRANS_FEAT(FDOT_nx_hb, aa64_sme_f8f16, do_azz_acc_fp8,
+           a->n, 1, a->rv, a->off, a->zn, a->zm,
+           a->idx, 0, false, gen_helper_gvec_fdot_idx_hb)
+
 static bool do_fdot(DisasContext *s, arg_azz_n *a, bool multi)
 {
     return do_azz_acc_fp(s, a->n, 1, a->rv, a->off, a->zn, a->zm, 1, 0,
diff --git a/target/arm/tcg/sme.decode b/target/arm/tcg/sme.decode
index fbf5f3720d..1dd3b7c8b2 100644
--- a/target/arm/tcg/sme.decode
+++ b/target/arm/tcg/sme.decode
@@ -810,6 +810,9 @@ FMLAL_nx_hb     11000001 1100 .... . .. 0.. ..... 0. ...    @azx_1x1_i4_o3x2
 FMLAL_nx_hb     11000001 1001 .... 0 .. 1.. ....1 1.. ..    @azx_2x2_i4_o3x2
 FMLAL_nx_hb     11000001 1001 .... 1 .. 1.. ...01 0.. ..    @azx_4x4_i4_o3x2
 
+FDOT_nx_hb      11000001 1101 .... 0 .. 0.. ....1 0. ...    @azx_2x1_i3_o3
+FDOT_nx_hb      11000001 0001 .... 1 .. 1.. ...10 0. ...    @azx_4x1_i3_o3
+
 %idx2_10_3      10:1 3:1
 @azx_4x2_i2_o3  ........ .... zm:4 . .. ... .... ... off:3 \
                 &azx_n n=4 rv=%mova_rv zn=%zn_ax2 idx=%idx2_10_3
-- 
2.43.0




* [PATCH 07/10] target/arm: Implement FMOPA (widening, 2-way, FP8 to FP16)
  2026-06-25  1:51 [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Richard Henderson
                   ` (5 preceding siblings ...)
  2026-06-25  1:51 ` [PATCH 06/10] target/arm: Implement FDOT (multiple and indexed, " Richard Henderson
@ 2026-06-25  1:51 ` Richard Henderson
  2026-06-26  9:22   ` Peter Maydell
  2026-06-25  1:51 ` [PATCH 08/10] target/arm: Rename FVDOT pattern Richard Henderson
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2026-06-25  1:51 UTC
  To: qemu-devel; +Cc: qemu-arm

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
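Note on the predicate handling in the new helper: as written, the loops
consume two predicate bits per 16-bit step, one bit per FP8 byte, and
mask off inactive bytes before the dot product. A minimal standalone
sketch of that masking follows. The names expand_pred2 and mask_pairs
are hypothetical, written for this note; expand_pred2 is a two-bit
simplification and is not QEMU's expand_pred_b.

```c
#include <stdint.h>

/* Expand 2 predicate bits into a 16-bit byte mask:
 * bit 0 selects the low byte, bit 1 selects the high byte. */
static uint16_t expand_pred2(unsigned pbits)
{
    uint16_t mask = 0;

    if (pbits & 1) {
        mask |= 0x00ff;
    }
    if (pbits & 2) {
        mask |= 0xff00;
    }
    return mask;
}

/* Walk up to 8 FP8 pairs governed by one 16-bit predicate word.
 * Inactive bytes are zeroed in place; the return value is the number
 * of pairs with at least one active byte. */
static int mask_pairs(uint16_t *pairs, int npairs, uint16_t pred)
{
    int active = 0;

    for (int i = 0; i < npairs; i++) {
        unsigned pbits = pred & 3;

        if (pbits) {
            active++;
        }
        pairs[i] &= expand_pred2(pbits);
        pred >>= 2;   /* advance to the next pair's two bits */
    }
    return active;
}
```

The helper in this patch additionally combines the row and column
predicates before deciding whether to update a ZA element; the sketch
only shows the per-byte masking step.
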
 target/arm/tcg/helper-fp8-defs.h |  1 +
 target/arm/tcg/fp8_helper.c      | 35 ++++++++++++++++++++++++++++++++
 target/arm/tcg/translate-sme.c   | 24 +++++++++++++---------
 target/arm/tcg/sme.decode        |  1 +
 4 files changed, 51 insertions(+), 10 deletions(-)

diff --git a/target/arm/tcg/helper-fp8-defs.h b/target/arm/tcg/helper-fp8-defs.h
index ef1375fea7..05bf8dbdc2 100644
--- a/target/arm/tcg/helper-fp8-defs.h
+++ b/target/arm/tcg/helper-fp8-defs.h
@@ -40,5 +40,6 @@ DEF_HELPER_FLAGS_5(gvec_fmmla_sb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32
 DEF_HELPER_FLAGS_5(gvec_fmmla_hb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
 
 DEF_HELPER_FLAGS_7(sme_fmopa_sb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_7(sme_fmopa_hb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, env, i32)
 
 DEF_HELPER_FLAGS_5(sme_fvdot_idx_sb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
diff --git a/target/arm/tcg/fp8_helper.c b/target/arm/tcg/fp8_helper.c
index 3c2d959099..5606f0fd8e 100644
--- a/target/arm/tcg/fp8_helper.c
+++ b/target/arm/tcg/fp8_helper.c
@@ -893,6 +893,41 @@ void HELPER(sme_fmopa_sb)(void *vza, void *vzn, void *vzm, void *vpn,
     }
 }
 
+void HELPER(sme_fmopa_hb)(void *vza, void *vzn, void *vzm, void *vpn,
+                          void *vpm, CPUARMState *env, uint32_t desc)
+{
+    FP8MulContext ctx = fp8_mul_start(env, 0xf);
+    intptr_t oprsz = simd_maxsz(desc);
+    uint16_t *pn = vpn, *pm = vpm;
+
+    for (intptr_t row = 0; row < oprsz; ) {
+        uint16_t prow = pn[H2(row >> 4)];
+        do {
+            void *vza_row = vza + tile_vslice_offset(row);
+            uint16_t n = *(uint16_t *)(vzn + H1_2(row));
+
+            n &= expand_pred_b(prow & 3);
+
+            for (intptr_t col = 0; col < oprsz; ) {
+                uint16_t pcol = pm[H2(col >> 4)];
+                do {
+                    if (prow & pcol & 0x3) {
+                        uint16_t *a = vza_row + H1_2(col);
+                        uint16_t m = *(uint16_t *)(vzm + H1_2(col));
+
+                        m &= expand_pred_b(pcol & 3);
+                        *a = f8dotadd_h(n, m, 2, *a, &ctx);
+                    }
+                    col += 2;
+                    pcol >>= 2;
+                } while (col & 15);
+            }
+            row += 2;
+            prow >>= 2;
+        } while (row & 15);
+    }
+}
+
 void HELPER(sme_fvdot_idx_sb)(void *vd, void *vn, void *vm,
                               CPUARMState *env, uint32_t desc)
 {
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 197274d00e..7eeac28480 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -616,25 +616,29 @@ TRANS_FEAT(BFMOPA_w, aa64_sme, do_outprod_env, a, MO_32,
            : !s->fpcr_ah ? gen_helper_sme_bfmops_w
            : gen_helper_sme_ah_bfmops_w)
 
-static bool trans_FMOPA_sb(DisasContext *s, arg_op *a)
+static bool do_outprod_fp8(DisasContext *s, arg_op *a, MemOp esz,
+                           gen_helper_gvec_5_ptr *fn)
 {
-    if (!dc_isar_feature(aa64_sme_f8f32, s)) {
-        return false;
-    }
     if (fpmr_access_check(s) && sme_smza_enabled_check(s)) {
         int svl = streaming_vec_reg_size(s);
         uint32_t desc = simd_desc(svl, svl, 0);
 
-        gen_helper_sme_fmopa_sb(get_tile(s, MO_32, a->zad),
-                                vec_full_reg_ptr(s, a->zn),
-                                vec_full_reg_ptr(s, a->zm),
-                                pred_full_reg_ptr(s, a->pn),
-                                pred_full_reg_ptr(s, a->pm),
-                                tcg_env, tcg_constant_i32(desc));
+        TCGv_ptr za = get_tile(s, esz, a->zad);
+        TCGv_ptr zn = vec_full_reg_ptr(s, a->zn);
+        TCGv_ptr zm = vec_full_reg_ptr(s, a->zm);
+        TCGv_ptr pn = pred_full_reg_ptr(s, a->pn);
+        TCGv_ptr pm = pred_full_reg_ptr(s, a->pm);
+
+        fn(za, zn, zm, pn, pm, tcg_env, tcg_constant_i32(desc));
     }
     return true;
 }
 
+TRANS_FEAT(FMOPA_sb, aa64_sme_f8f32, do_outprod_fp8,
+           a, MO_32, gen_helper_sme_fmopa_sb)
+TRANS_FEAT(FMOPA_hb, aa64_sme_f8f16, do_outprod_fp8,
+           a, MO_16, gen_helper_sme_fmopa_hb)
+
 TRANS_FEAT(SMOPA_s, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_smopa_s)
 TRANS_FEAT(UMOPA_s, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_umopa_s)
 TRANS_FEAT(SUMOPA_s, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_sumopa_s)
diff --git a/target/arm/tcg/sme.decode b/target/arm/tcg/sme.decode
index 1dd3b7c8b2..755d5f00d0 100644
--- a/target/arm/tcg/sme.decode
+++ b/target/arm/tcg/sme.decode
@@ -199,6 +199,7 @@ BFMOPA_w        10000001 100 ..... ... ... ..... . 00 ..        @op_32
 FMOPA_w_h       10000001 101 ..... ... ... ..... . 00 ..        @op_32
 
 FMOPA_sb        10000000 101 zm:5 pm:3 pn:3 zn:5 0 00 zad:2     &op sub=0
+FMOPA_hb        10000000 101 zm:5 pm:3 pn:3 zn:5 0100 zad:1     &op sub=0
 
 SMOPA_s         1010000 0 10 0 ..... ... ... ..... . 00 ..      @op_32
 SUMOPA_s        1010000 0 10 1 ..... ... ... ..... . 00 ..      @op_32
-- 
2.43.0




* [PATCH 08/10] target/arm: Rename FVDOT pattern
  2026-06-25  1:51 [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Richard Henderson
                   ` (6 preceding siblings ...)
  2026-06-25  1:51 ` [PATCH 07/10] target/arm: Implement FMOPA (widening, 2-way, " Richard Henderson
@ 2026-06-25  1:51 ` Richard Henderson
  2026-06-25 10:19   ` Peter Maydell
  2026-06-25  1:51 ` [PATCH 09/10] target/arm: Implement FVDOT (FP8 to FP16) Richard Henderson
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2026-06-25  1:51 UTC
  To: qemu-devel; +Cc: qemu-arm

Rename to FVDOT_sh so that we can introduce an insn
of the same name from FEAT_SME_F8F16.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate-sme.c | 2 +-
 target/arm/tcg/sme.decode      | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 7eeac28480..267a6b0d9b 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -1260,7 +1260,7 @@ static bool do_vdot(DisasContext *s, arg_azx_n *a, gen_helper_gvec_4_ptr *fn)
     return true;
 }
 
-TRANS_FEAT(FVDOT, aa64_sme, do_vdot, a, gen_helper_sme2_fvdot_idx_h)
+TRANS_FEAT(FVDOT_sh, aa64_sme, do_vdot, a, gen_helper_sme2_fvdot_idx_h)
 TRANS_FEAT(BFVDOT, aa64_sme, do_vdot, a, gen_helper_sme2_bfvdot_idx)
 
 static bool do_fvdot_sb(DisasContext *s, arg_azx_n *a, bool top)
diff --git a/target/arm/tcg/sme.decode b/target/arm/tcg/sme.decode
index 755d5f00d0..160cf130d4 100644
--- a/target/arm/tcg/sme.decode
+++ b/target/arm/tcg/sme.decode
@@ -662,7 +662,7 @@ FDOT_nx         11000001 0101 .... 1 .. 1 .. ...00 01 ...   @azx_4x1_i2_o3
 BFDOT_nx        11000001 0101 .... 0 .. 1 .. ....0 11 ...   @azx_2x1_i2_o3
 BFDOT_nx        11000001 0101 .... 1 .. 1 .. ...00 11 ...   @azx_4x1_i2_o3
 
-FVDOT           11000001 0101 .... 0 .. 0 .. ....0 01 ...   @azx_2x1_i2_o3
+FVDOT_sh        11000001 0101 .... 0 .. 0 .. ....0 01 ...   @azx_2x1_i2_o3
 BFVDOT          11000001 0101 .... 0 .. 0 .. ....0 11 ...   @azx_2x1_i2_o3
 
 SDOT_nx_2h      11000001 0101 .... 0 .. 1 .. ....0 00 ...   @azx_2x1_i2_o3
-- 
2.43.0




* [PATCH 09/10] target/arm: Implement FVDOT (FP8 to FP16)
  2026-06-25  1:51 [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Richard Henderson
                   ` (7 preceding siblings ...)
  2026-06-25  1:51 ` [PATCH 08/10] target/arm: Rename FVDOT pattern Richard Henderson
@ 2026-06-25  1:51 ` Richard Henderson
  2026-06-26  9:27   ` Peter Maydell
  2026-06-25  1:51 ` [PATCH 10/10] target/arm: Enable FEAT_SME_F8F16 for -cpu max Richard Henderson
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2026-06-25  1:51 UTC
  To: qemu-devel; +Cc: qemu-arm

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/helper-fp8-defs.h |  1 +
 target/arm/tcg/fp8_helper.c      | 23 +++++++++++++++++++++++
 target/arm/tcg/translate-sme.c   |  4 ++++
 target/arm/tcg/sme.decode        |  2 ++
 4 files changed, 30 insertions(+)

diff --git a/target/arm/tcg/helper-fp8-defs.h b/target/arm/tcg/helper-fp8-defs.h
index 05bf8dbdc2..126dcadf77 100644
--- a/target/arm/tcg/helper-fp8-defs.h
+++ b/target/arm/tcg/helper-fp8-defs.h
@@ -43,3 +43,4 @@ DEF_HELPER_FLAGS_7(sme_fmopa_sb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr,
 DEF_HELPER_FLAGS_7(sme_fmopa_hb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, env, i32)
 
 DEF_HELPER_FLAGS_5(sme_fvdot_idx_sb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_FLAGS_5(sme_fvdot_idx_hb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32)
diff --git a/target/arm/tcg/fp8_helper.c b/target/arm/tcg/fp8_helper.c
index 5606f0fd8e..e1dcc2b70f 100644
--- a/target/arm/tcg/fp8_helper.c
+++ b/target/arm/tcg/fp8_helper.c
@@ -950,3 +950,26 @@ void HELPER(sme_fvdot_idx_sb)(void *vd, void *vn, void *vm,
         } while (++i & 3);
     } while (i < elements);
 }
+
+void HELPER(sme_fvdot_idx_hb)(void *vd, void *vn, void *vm,
+                              CPUARMState *env, uint32_t desc)
+{
+    FP8MulContext ctx = fp8_mul_start(env, 0xf);
+    intptr_t oprsz = simd_maxsz(desc);
+    intptr_t elements = oprsz / sizeof(float32);
+    int idx_n = extract32(desc, SIMD_DATA_SHIFT, 1);
+    int idx_m = extract32(desc, SIMD_DATA_SHIFT + 1, 3);
+    float16 *d = vd;
+    uint8_t *n0 = vn;
+    uint8_t *n1 = vn + sizeof(ARMVectorReg);
+    uint16_t *m = vm;
+    intptr_t i = 0;
+
+    do {
+        uint16_t mm = m[H2(2 * i + idx_m)];
+        do {
+            uint16_t nn = n0[H1(4 * i + idx_n)] | (n1[H1(4 * i + idx_n)] << 8);
+            d[H2(i)] = f8dotadd_h(nn, mm, 2, d[H2(i)], &ctx);
+        } while (++i & 7);
+    } while (i < elements);
+}
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 267a6b0d9b..ff5554eefb 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -1273,6 +1273,10 @@ static bool do_fvdot_sb(DisasContext *s, arg_azx_n *a, bool top)
 TRANS_FEAT(FVDOTB_sb, aa64_sme_f8f32, do_fvdot_sb, a, false)
 TRANS_FEAT(FVDOTT_sb, aa64_sme_f8f32, do_fvdot_sb, a, true)
 
+TRANS_FEAT(FVDOT_hb, aa64_sme_f8f16, do_azz_acc_fp8,
+           a->n, 2, a->rv, a->off, a->zn, a->zm,
+           (a->idx << 1), 0, false, gen_helper_sme_fvdot_idx_hb)
+
 static bool do_fmla(DisasContext *s, arg_azz_n *a, bool multi,
                     ARMFPStatusFlavour fpst, gen_helper_gvec_3_ptr *fn)
 {
diff --git a/target/arm/tcg/sme.decode b/target/arm/tcg/sme.decode
index 160cf130d4..3a65e1ad4b 100644
--- a/target/arm/tcg/sme.decode
+++ b/target/arm/tcg/sme.decode
@@ -821,6 +821,8 @@ FDOT_nx_hb      11000001 0001 .... 1 .. 1.. ...10 0. ...    @azx_4x1_i3_o3
 FVDOTB_sb       11000001 1101 .... 0 .. 01. ....0 0. ...    @azx_4x2_i2_o3
 FVDOTT_sb       11000001 1101 .... 0 .. 01. ....0 1. ...    @azx_4x2_i2_o3
 
+FVDOT_hb        11000001 1101 .... 0 .. 1.. ....1 0. ...    @azx_2x1_i3_o3
+
 ### SME2 Add / Sub array accumulators
 
 ADD_aaz_s       11000001 101 000000 .. 111 ....0 10 ...     @az_2x2_o3
-- 
2.43.0




* [PATCH 10/10] target/arm: Enable FEAT_SME_F8F16 for -cpu max
  2026-06-25  1:51 [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Richard Henderson
                   ` (8 preceding siblings ...)
  2026-06-25  1:51 ` [PATCH 09/10] target/arm: Implement FVDOT (FP8 to FP16) Richard Henderson
@ 2026-06-25  1:51 ` Richard Henderson
  2026-06-25 10:18   ` Peter Maydell
  2026-06-26  9:33 ` [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Peter Maydell
  2026-06-26 10:07 ` Alex Bennée
  11 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2026-06-25  1:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/aarch64/elfload.c  | 1 +
 target/arm/tcg/cpu64.c        | 1 +
 docs/system/arm/emulation.rst | 1 +
 3 files changed, 3 insertions(+)

diff --git a/linux-user/aarch64/elfload.c b/linux-user/aarch64/elfload.c
index 20b9838520..42aeb29306 100644
--- a/linux-user/aarch64/elfload.c
+++ b/linux-user/aarch64/elfload.c
@@ -236,6 +236,7 @@ abi_ulong get_elf_hwcap2(CPUState *cs)
     GET_FEATURE_ID(aa64_ssve_f8dp4, ARM_HWCAP2_A64_SME_SF8DP4);
     GET_FEATURE_ID(aa64_ssve_f8dp2, ARM_HWCAP2_A64_SME_SF8DP2);
     GET_FEATURE_ID(aa64_sme_f8f32, ARM_HWCAP2_A64_SME_F8F32);
+    GET_FEATURE_ID(aa64_sme_f8f16, ARM_HWCAP2_A64_SME_F8F16);
 
     return hwcaps;
 }
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index e9eda8fbf1..42bff58e76 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -1393,6 +1393,7 @@ void aarch64_max_tcg_initfn(Object *obj)
     t = FIELD_DP64(t, ID_AA64SMFR0, F16F32, 1);   /* FEAT_SME */
     t = FIELD_DP64(t, ID_AA64SMFR0, I8I32, 0xf);  /* FEAT_SME */
     t = FIELD_DP64(t, ID_AA64SMFR0, F8F32, 1);    /* FEAT_SME_F8F32 */
+    t = FIELD_DP64(t, ID_AA64SMFR0, F8F16, 1);    /* FEAT_SME_F8F16 */
     t = FIELD_DP64(t, ID_AA64SMFR0, F16F16, 1);   /* FEAT_SME_F16F16 */
     t = FIELD_DP64(t, ID_AA64SMFR0, B16B16, 1);   /* FEAT_SME_B16B16 */
     t = FIELD_DP64(t, ID_AA64SMFR0, I16I32, 5);   /* FEAT_SME2 */
diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index a3a1607ff9..7b85ec6146 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -169,6 +169,7 @@ the following architecture extensions:
 - FEAT_SME_FA64 (Full A64 instruction set in Streaming SVE mode)
 - FEAT_SME_F16F16 (Non-widening half-precision FP16 arithmetic for SME2)
 - FEAT_SME_F64F64 (Double-precision floating-point outer product instructions)
+- FEAT_SME_F8F16 (SME2 ZA-targeting FP8 multiply-accumulate, dot product, and outer product to half-precision instructions)
 - FEAT_SME_F8F32 (SME2 ZA-targeting FP8 multiply-accumulate, dot product, and outer product to single-precision instructions)
 - FEAT_SME_I16I64 (16-bit to 64-bit integer widening outer product instructions)
 - FEAT_SME_LUTv2 (Lookup table instructions with 4-bit indices and 8-bit elements)
-- 
2.43.0




* Re: [PATCH 02/10] target/arm: Rename SME FMLAL/FMLSL patterns
  2026-06-25  1:51 ` [PATCH 02/10] target/arm: Rename SME FMLAL/FMLSL patterns Richard Henderson
@ 2026-06-25 10:17   ` Peter Maydell
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Maydell @ 2026-06-25 10:17 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm

On Thu, 25 Jun 2026 at 02:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Rename patterns to include _sh suffix, so that we can
> distinguish insns of the same name from FEAT_SME_F8F16.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 10/10] target/arm: Enable FEAT_SME_F8F16 for -cpu max
  2026-06-25  1:51 ` [PATCH 10/10] target/arm: Enable FEAT_SME_F8F16 for -cpu max Richard Henderson
@ 2026-06-25 10:18   ` Peter Maydell
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Maydell @ 2026-06-25 10:18 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm

On Thu, 25 Jun 2026 at 02:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  linux-user/aarch64/elfload.c  | 1 +
>  target/arm/tcg/cpu64.c        | 1 +
>  docs/system/arm/emulation.rst | 1 +
>  3 files changed, 3 insertions(+)
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 08/10] target/arm: Rename FVDOT pattern
  2026-06-25  1:51 ` [PATCH 08/10] target/arm: Rename FVDOT pattern Richard Henderson
@ 2026-06-25 10:19   ` Peter Maydell
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Maydell @ 2026-06-25 10:19 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm

On Thu, 25 Jun 2026 at 02:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Rename to FVDOT_sh so that we can introduce an insn
> of the same name from FEAT_SME_F8F16.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 01/10] target/arm: Enable FADD/FSUB (half-precision) with FEAT_SME_F8F16
  2026-06-25  1:51 ` [PATCH 01/10] target/arm: Enable FADD/FSUB (half-precision) with FEAT_SME_F8F16 Richard Henderson
@ 2026-06-26  9:03   ` Peter Maydell
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Maydell @ 2026-06-26  9:03 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm

On Thu, 25 Jun 2026 at 02:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> These two instructions can be enabled with either
> FEAT_SME_F8F16 or FEAT_SME_F16F16.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/cpu-features.h      | 11 +++++++++++
>  target/arm/tcg/translate-sme.c |  4 ++--
>  2 files changed, 13 insertions(+), 2 deletions(-)

> +static inline bool
> +isar_feature_aa64_sme_f16f16_or_f8f16(const ARMISARegisters *id)
> +{
> +    return isar_feature_aa64_sme_f16f16(id) && isar_feature_aa64_sme_f8f16(id);

Should be ||, not &&...

Otherwise

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM
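
Note on the review comment above: the commit message says FADD/FSUB
are enabled by either FEAT_SME_F8F16 or FEAT_SME_F16F16, so the
combined predicate needs a logical OR. The following is a minimal
sketch of the difference, not the QEMU code. FeatureBits and the
has_* helpers are hypothetical stand-ins for ARMISARegisters and the
isar_feature_aa64_* functions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for QEMU's ARMISARegisters. */
typedef struct {
    bool sme_f16f16;   /* models FEAT_SME_F16F16 being present */
    bool sme_f8f16;    /* models FEAT_SME_F8F16 being present */
} FeatureBits;

static inline bool has_f16f16(const FeatureBits *id)
{
    return id->sme_f16f16;
}

static inline bool has_f8f16(const FeatureBits *id)
{
    return id->sme_f8f16;
}

/* As posted: true only when both features are present. */
static inline bool has_f16f16_and_f8f16(const FeatureBits *id)
{
    return has_f16f16(id) && has_f8f16(id);
}

/* As suggested in review: true when either feature is present. */
static inline bool has_f16f16_or_f8f16(const FeatureBits *id)
{
    return has_f16f16(id) || has_f8f16(id);
}
```

With only the F8F16 bit set, the && form returns false and the
instructions would stay disabled; the || form returns true.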



* Re: [PATCH 03/10] target/arm: Implement FMLAL (multiple, multiple and single, FP8 to FP16)
  2026-06-25  1:51 ` [PATCH 03/10] target/arm: Implement FMLAL (multiple, multiple and single, FP8 to FP16) Richard Henderson
@ 2026-06-26  9:11   ` Peter Maydell
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Maydell @ 2026-06-26  9:11 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm

On Thu, 25 Jun 2026 at 02:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/tcg/translate-sme.c | 9 +++++++++
>  target/arm/tcg/sme.decode      | 7 +++++++
>  2 files changed, 16 insertions(+)
>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 04/10] target/arm: Implement FMLAL (multiple and indexed,  FP8 to FP16)
  2026-06-25  1:51 ` [PATCH 04/10] target/arm: Implement FMLAL (multiple and indexed, " Richard Henderson
@ 2026-06-26  9:12   ` Peter Maydell
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Maydell @ 2026-06-26  9:12 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm

On Thu, 25 Jun 2026 at 02:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/tcg/translate-sme.c |  4 ++++
>  target/arm/tcg/sme.decode      | 13 +++++++++++++
>  2 files changed, 17 insertions(+)
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 06/10] target/arm: Implement DOT (multiple and indexed, FP8 to FP16)
  2026-06-25  1:51 ` [PATCH 06/10] target/arm: Implement DOT (multiple and indexed, " Richard Henderson
@ 2026-06-26  9:16   ` Peter Maydell
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Maydell @ 2026-06-26  9:16 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm

On Thu, 25 Jun 2026 at 02:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Patch subject should be FDOT, not DOT. Otherwise

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 07/10] target/arm: Implement FMOPA (widening, 2-way, FP8 to FP16)
  2026-06-25  1:51 ` [PATCH 07/10] target/arm: Implement FMOPA (widening, 2-way, " Richard Henderson
@ 2026-06-26  9:22   ` Peter Maydell
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Maydell @ 2026-06-26  9:22 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm

On Thu, 25 Jun 2026 at 02:54, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 09/10] target/arm: Implement FVDOT (FP8 to FP16)
  2026-06-25  1:51 ` [PATCH 09/10] target/arm: Implement FVDOT (FP8 to FP16) Richard Henderson
@ 2026-06-26  9:27   ` Peter Maydell
  2026-06-26 15:33     ` Richard Henderson
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Maydell @ 2026-06-26  9:27 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm

On Thu, 25 Jun 2026 at 02:52, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---


> +void HELPER(sme_fvdot_idx_hb)(void *vd, void *vn, void *vm,
> +                              CPUARMState *env, uint32_t desc)
> +{
> +    FP8MulContext ctx = fp8_mul_start(env, 0xf);
> +    intptr_t oprsz = simd_maxsz(desc);
> +    intptr_t elements = oprsz / sizeof(float32);

Shouldn't this be sizeof(float16) since the output elements
are halfprec ?

> +    int idx_n = extract32(desc, SIMD_DATA_SHIFT, 1);
> +    int idx_m = extract32(desc, SIMD_DATA_SHIFT + 1, 3);
> +    float16 *d = vd;
> +    uint8_t *n0 = vn;
> +    uint8_t *n1 = vn + sizeof(ARMVectorReg);
> +    uint16_t *m = vm;
> +    intptr_t i = 0;
> +
> +    do {
> +        uint16_t mm = m[H2(2 * i + idx_m)];
> +        do {
> +            uint16_t nn = n0[H1(4 * i + idx_n)] | (n1[H1(4 * i + idx_n)] << 8);
> +            d[H2(i)] = f8dotadd_h(nn, mm, 2, d[H2(i)], &ctx);
> +        } while (++i & 7);
> +    } while (i < elements);
> +}

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM



* Re: [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16
  2026-06-25  1:51 [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Richard Henderson
                   ` (9 preceding siblings ...)
  2026-06-25  1:51 ` [PATCH 10/10] target/arm: Enable FEAT_SME_F8F16 for -cpu max Richard Henderson
@ 2026-06-26  9:33 ` Peter Maydell
  2026-06-26 10:07 ` Alex Bennée
  11 siblings, 0 replies; 25+ messages in thread
From: Peter Maydell @ 2026-06-26  9:33 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm

On Thu, 25 Jun 2026 at 02:52, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Another minor feature working toward SME2.2.
>
> r~
>
> Richard Henderson (10):
>   target/arm: Enable FADD/FSUB (half-precision) with FEAT_SME_F8F16
>   target/arm: Rename SME FMLAL/FMLSL patterns
>   target/arm: Implement FMLAL (multiple, multiple and single, FP8 to
>     FP16)
>   target/arm: Implement FMLAL (multiple and indexed, FP8 to FP16)
>   target/arm: Implement FDOT (multiple, multiple and single, FP8 to
>     FP16)
>   target/arm: Implement DOT (multiple and indexed, FP8 to FP16)
>   target/arm: Implement FMOPA (widening, 2-way, FP8 to FP16)
>   target/arm: Rename FVDOT pattern
>   target/arm: Implement FVDOT (FP8 to FP16)
>   target/arm: Enable FEAT_SME_F8F16 for -cpu max

If you agree with my suggested tweaks for patches 1, 6, 9,
I can take this into target-arm.next and adjust it there.
The only one that isn't totally obvious is the patch 9 one.

-- PMM



* Re: [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16
  2026-06-25  1:51 [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Richard Henderson
                   ` (10 preceding siblings ...)
  2026-06-26  9:33 ` [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16 Peter Maydell
@ 2026-06-26 10:07 ` Alex Bennée
  2026-06-26 10:16   ` Peter Maydell
  11 siblings, 1 reply; 25+ messages in thread
From: Alex Bennée @ 2026-06-26 10:07 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, qemu-arm

Richard Henderson <richard.henderson@linaro.org> writes:

> Another minor feature working toward SME2.2.

This conflicts heavily with master - did a bunch of stuff get merged
that broke it?

>
> r~
>
> Richard Henderson (10):
>   target/arm: Enable FADD/FSUB (half-precision) with FEAT_SME_F8F16
>   target/arm: Rename SME FMLAL/FMLSL patterns
>   target/arm: Implement FMLAL (multiple, multiple and single, FP8 to
>     FP16)
>   target/arm: Implement FMLAL (multiple and indexed, FP8 to FP16)
>   target/arm: Implement FDOT (multiple, multiple and single, FP8 to
>     FP16)
>   target/arm: Implement DOT (multiple and indexed, FP8 to FP16)
>   target/arm: Implement FMOPA (widening, 2-way, FP8 to FP16)
>   target/arm: Rename FVDOT pattern
>   target/arm: Implement FVDOT (FP8 to FP16)
>   target/arm: Enable FEAT_SME_F8F16 for -cpu max
>
>  target/arm/cpu-features.h        | 11 +++++
>  target/arm/tcg/helper-fp8-defs.h |  2 +
>  linux-user/aarch64/elfload.c     |  1 +
>  target/arm/tcg/cpu64.c           |  1 +
>  target/arm/tcg/fp8_helper.c      | 58 +++++++++++++++++++++++++
>  target/arm/tcg/translate-sme.c   | 72 +++++++++++++++++++++++---------
>  docs/system/arm/emulation.rst    |  1 +
>  target/arm/tcg/sme.decode        | 66 +++++++++++++++++++++--------
>  8 files changed, 176 insertions(+), 36 deletions(-)

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



* Re: [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16
  2026-06-26 10:07 ` Alex Bennée
@ 2026-06-26 10:16   ` Peter Maydell
  2026-06-26 11:54     ` Peter Maydell
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Maydell @ 2026-06-26 10:16 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Richard Henderson, qemu-devel, qemu-arm

On Fri, 26 Jun 2026 at 11:07, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Richard Henderson <richard.henderson@linaro.org> writes:
>
> > Another minor feature working toward SME2.2.
>
> This conflicts heavily with master - did a bunch of stuff get merged
> that broke it?

It'll be based on the F8F32 series that's not yet upstream
(it's in my target-arm.next queue).

-- PMM



* Re: [PATCH 00/10] target/arm: Implement FEAT_SME_F8F16
  2026-06-26 10:16   ` Peter Maydell
@ 2026-06-26 11:54     ` Peter Maydell
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Maydell @ 2026-06-26 11:54 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Richard Henderson, qemu-devel, qemu-arm

On Fri, 26 Jun 2026 at 11:16, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Fri, 26 Jun 2026 at 11:07, Alex Bennée <alex.bennee@linaro.org> wrote:
> >
> > Richard Henderson <richard.henderson@linaro.org> writes:
> >
> > > Another minor feature working toward SME2.2.
> >
> > This conflicts heavily with master - did a bunch of stuff get merged
> > that broke it?
>
> It'll be based on the F8F32 series that's not yet upstream
> (it's in my target-arm.next queue).


https://gitlab.com/pm215/qemu/-/commits/target-arm.next

including this series.

-- PMM



* Re: [PATCH 09/10] target/arm: Implement FVDOT (FP8 to FP16)
  2026-06-26  9:27   ` Peter Maydell
@ 2026-06-26 15:33     ` Richard Henderson
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2026-06-26 15:33 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-arm

On 6/26/26 02:27, Peter Maydell wrote:
> On Thu, 25 Jun 2026 at 02:52, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
> 
> 
>> +void HELPER(sme_fvdot_idx_hb)(void *vd, void *vn, void *vm,
>> +                              CPUARMState *env, uint32_t desc)
>> +{
>> +    FP8MulContext ctx = fp8_mul_start(env, 0xf);
>> +    intptr_t oprsz = simd_maxsz(desc);
>> +    intptr_t elements = oprsz / sizeof(float32);
> 
> Shouldn't this be sizeof(float16) since the output elements
> are halfprec ?

Oops, yes.  Thanks.


r~
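
Note on the element count agreed above: a minimal sketch of why the
divisor matters, assuming oprsz is the operand size in bytes. It is
not the QEMU helper. f16_bits and f32_bits are hypothetical local
aliases chosen only to have the same widths as 16-bit and 32-bit
floating-point elements.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical aliases with the storage widths of FP16 and FP32. */
typedef uint16_t f16_bits;
typedef uint32_t f32_bits;

/* Number of half-precision output elements in oprsz bytes. */
static size_t count_f16_elements(size_t oprsz)
{
    return oprsz / sizeof(f16_bits);
}

/* Number of single-precision elements in oprsz bytes. */
static size_t count_f32_elements(size_t oprsz)
{
    return oprsz / sizeof(f32_bits);
}
```

For a 32-byte operand the half-precision count is 16 but the
single-precision count is 8, so a loop bounded by the latter would
stop after half of the FP16 elements.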



