[PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support

* [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support
@ 2026-02-04  5:17 Max Chou
  2026-02-04  5:17 ` [PATCH v3 01/19] target/riscv: rvv: Fix NOP_UU_B vs2 width Max Chou
                   ` (19 more replies)
  0 siblings, 20 replies; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou

This patchset adds support for the OCP (Open Compute Project) 8-bit and
4-bit floating-point formats, along with the RISC-V Zvfofp8min and
Zvfofp4min vector extensions that provide conversion operations for
these formats.

OCP Floating-Point Formats
* The OCP FP8 specification defines two 8-bit floating-point formats:
  - E4M3: 4-bit exponent, 3-bit mantissa
    * No infinity representation; only 0x7f and 0xff are NaN
  - E5M2: 5-bit exponent, 2-bit mantissa
    * IEEE-like format with infinity representation
    * Multiple NaN encodings supported
* The OCP FP4 specification defines the E2M1 format:
  - E2M1: 2-bit exponent, 1-bit mantissa
    * No NaN representation
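
To make the encodings above concrete, here is a minimal decoding sketch. It
is not code from this series: the function names are hypothetical and the
layouts (bias 7 for E4M3, bias 15 for E5M2, bias 1 for E2M1) are assumed
from the OCP format descriptions. Scaling uses plain multiplications so
that no libm call is needed.

```c
#include <math.h>   /* NAN and INFINITY macros only */
#include <stdint.h>

/* Multiply m by 2^e in exact power-of-two steps. */
static float scale_pow2(float m, int e)
{
    for (; e > 0; e--) {
        m *= 2.0f;
    }
    for (; e < 0; e++) {
        m *= 0.5f;
    }
    return m;
}

/* E4M3: 1 sign, 4 exponent (bias 7), 3 mantissa bits. No infinity. */
static float ofp8_e4m3_to_float(uint8_t v)
{
    int exp = (v >> 3) & 0xf;
    int frac = v & 0x7;
    float mag;

    if ((v & 0x7f) == 0x7f) {
        return NAN;                     /* only 0x7f and 0xff are NaN */
    }
    mag = exp ? scale_pow2(8 + frac, exp - 10)  /* (1 + frac/8) * 2^(exp-7) */
              : scale_pow2(frac, -9);           /* subnormal: (frac/8) * 2^-6 */
    return (v & 0x80) ? -mag : mag;
}

/* E5M2: 1 sign, 5 exponent (bias 15), 2 mantissa bits. IEEE-like. */
static float ofp8_e5m2_to_float(uint8_t v)
{
    int exp = (v >> 2) & 0x1f;
    int frac = v & 0x3;
    float mag;

    if (exp == 0x1f) {
        mag = frac ? NAN : INFINITY;    /* several NaN encodings, one infinity */
    } else {
        mag = exp ? scale_pow2(4 + frac, exp - 17)  /* (1 + frac/4) * 2^(exp-15) */
                  : scale_pow2(frac, -16);          /* subnormal: (frac/4) * 2^-14 */
    }
    return (v & 0x80) ? -mag : mag;
}

/* E2M1: 1 sign, 2 exponent (bias 1), 1 mantissa bit. No NaN, no infinity. */
static float ofp4_e2m1_to_float(uint8_t v)
{
    int exp = (v >> 1) & 0x3;
    int frac = v & 0x1;
    float mag = exp ? scale_pow2(2 + frac, exp - 2)  /* (1 + frac/2) * 2^(exp-1) */
                    : scale_pow2(frac, -1);          /* subnormal: 0 or 0.5 */

    return (v & 0x8) ? -mag : mag;
}
```

Under these assumptions the largest finite E4M3 value is 0x7e (448), the
largest finite E5M2 value is 0x7b (57344), and E2M1 covers only the
magnitudes 0, 0.5, 1, 1.5, 2, 3, 4 and 6.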

RISC-V ISA Extensions
* Zvfofp8min (Version 0.2.1):
  The Zvfofp8min extension provides minimal vector conversion support
  for OFP8 formats. It requires the Zve32f extension and leverages the
  altfmt field in the VTYPE CSR (introduced by Zvfbfa) to select between
  E4M3 (altfmt=0) and E5M2 (altfmt=1) formats.
  - Canonical NaN for both E4M3 and E5M2 is 0x7f
  - All NaNs are treated as quiet NaNs
  Instructions added/extended:
  - vfwcvtbf16.f.f.v: OFP8 to BF16 widening conversion
  - vfncvtbf16.f.f.w: BF16 to OFP8 narrowing conversion
  - vfncvtbf16.sat.f.f.w: BF16 to OFP8 with saturation (new)
  - vfncvt.f.f.q: FP32 to OFP8 quad-narrowing conversion (new)
  - vfncvt.sat.f.f.q: FP32 to OFP8 with saturation (new)

* Zvfofp4min (Version 0.1):
  The Zvfofp4min extension provides minimal vector conversion support
  for the OFP4 E2M1 format. It requires the Zve32f extension.
  Instructions added:
  - vfext.vf2: OFP4 E2M1 to OFP8 E4M3 widening conversion
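
Because every E2M1 value is exactly representable in E4M3, the widening
step can be illustrated as a pure bit remapping. The sketch below is not
the helper added by this series; the function name is hypothetical, and it
only models the value mapping for a single 4-bit element (how elements are
packed in a vector register is not covered here).

```c
#include <stdint.h>

/* Map one E2M1 nibble (s ee m, bias 1) to an E4M3 byte (s eeee mmm, bias 7). */
static uint8_t e2m1_to_e4m3(uint8_t nibble)
{
    uint8_t sign = (nibble & 0x8) << 4;   /* move the sign to bit 7 */
    unsigned exp = (nibble >> 1) & 0x3;
    unsigned frac = nibble & 0x1;

    if (exp == 0) {
        /* Zero stays zero; the only subnormal, 0.5, becomes the normal 2^-1. */
        return sign | (frac ? (6u << 3) : 0);
    }
    /* Rebias the exponent (1 -> 7) and left-align the mantissa bit. */
    return sign | ((exp + 6) << 3) | (frac << 2);
}
```

For example, the E2M1 encoding 0x7 (6.0) maps to the E4M3 encoding 0x4c,
and 0x1 (0.5) maps to 0x30.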

Modifications
* Softfloat library:
  - Refactored IEEE format NaN classification to share code (new in v2)
  - New float8_e4m3 and float8_e5m2 types with NaN checking functions
  - New float4_e2m1 type for OFP4 support
  - Conversion functions: bfloat16/float32 <-> float8_e4m3/float8_e5m2
  - Conversion function: float4_e2m1 -> float8_e4m3
  - Implementation uses capability-based FloatFmt flags for format behavior
* RISC-V target:
  - CPU configuration properties for Zvfofp8min and Zvfofp4min
  - Extension implied rules (Zvfofp8min requires Zve32f and Zvfbfa)
  - Vector helper functions for OFP8/OFP4 conversion instructions
  - Disassembler support for new instructions

Changes in v3:
- Add floatN_nan_is_snan to simplify the quiet/signaling NaN checking flow
  in patches 2 & 3
- Add patch 4 to fix pseudo-NaN handling in FPATAN/FYL2XP1/FYL2X helpers

Changes in v2:
- Merged v1 patch 2 & 3 to v2 patch 3, v1 patch 4 & 5 to v2 patch 4
- Added new v2 patch 2 to refactor the IEEE format NaN classification
  functions (float16, bfloat16, float32, float64) to use internal helper
  functions, reducing code duplication and improving maintainability.
  The OCP FP8 NaN classification functions follow the same pattern.
- Refactored softfloat implementation to use capability-based FloatFmt
  flags (no_infinity, limited_nan, overflow_raises_invalid, normal_frac_max)
  instead of monolithic flags
- Removed ocp_fp8e5m2_no_signal_nan and ocp_fp8_same_canonical_nan flags
  from float_status; now using local float_status with no_signaling_nans
  and default_nan_pattern for RISC-V Zvfofp8min instructions
- Rebased on latest riscv-to-apply.next with zvfbfa v3 patchset

References
* OCP FP8 specification:
  https://www.opencompute.org/documents/ocp-8-bit-floating-point-specification-ofp8-revision-1-0-2023-12-01-pdf-1
* Zvfofp8min specification (v0.2.1 commit e1e20a7):
  https://github.com/aswaterman/riscv-misc/blob/main/isa/zvfofp8min.adoc
* Zvfofp4min specification (v0.1 commit e1e20a7):
  https://github.com/aswaterman/riscv-misc/blob/main/isa/zvfofp4min.adoc

PS: This series depends on the Zvfbfa extension patchset which introduces:
  - The altfmt field in VTYPE CSR
  - BF16 vector operations infrastructure
  - vfwcvtbf16.f.f.v and vfncvtbf16.f.f.w base instructions

v2: <20260127063723.442734-1-max.chou@sifive.com>
v1: <20260108151650.16329-1-max.chou@sifive.com>

Based-on: <20260127014227.406653-1-max.chou@sifive.com>

Max Chou (19):
  target/riscv: rvv: Fix NOP_UU_B vs2 width
  fpu/softfloat: Refactor IEEE format NaN classification to share code
  fpu/softfloat: Refactor floatx80 format NaN classification to share
    code
  target/i386: Fix pseudo-NaN handling in FPATAN/FYL2XP1/FYL2X helpers
  fpu/softfloat: Support OCP(Open Compute Project) OFP8 data type
  fpu/softfloat: Support OCP(Open Compute Project) OFP4 data type
  target/riscv: Add cfg properity for Zvfofp8min extension
  target/riscv: Add implied rules for Zvfofp8min extension
  target/riscv: rvv: Make vfwcvtbf16.f.f.v support OFP8 to BF16
    conversion for Zvfofp8min extension
  target/riscv: rvv: Make vfncvtbf16.f.f.w support BF16 to OFP8
    conversion for Zvfofp8min extension
  target/riscv: rvv: Add vfncvtbf16.sat.f.f.w instruction for Zvfofp8min
    extension
  target/riscv: rvv: Add vfncvt.f.f.q and vfncvt.sat.f.f.q instructions
    for Zvfofp8min extension
  target/riscv: Expose Zvfofp8min properity
  disas/riscv: Add support of Zvfofp8min extension
  target/riscv: Add cfg properity for Zvfofp4min extension
  target/riscv: Add implied rules for Zvfofp4min extension
  target/riscv: rvv: Add vfext.vf2 instruction for Zvfofp4min extension
  target/riscv: Expose Zvfofp4min properity
  disas/riscv: Add support of Zvfofp4min extension

 disas/riscv.c                              |  12 +
 fpu/softfloat-parts.c.inc                  | 159 +++++++++---
 fpu/softfloat-specialize.c.inc             | 287 +++++++++++----------
 fpu/softfloat.c                            | 220 +++++++++++++++-
 include/fpu/softfloat-types.h              |  17 ++
 include/fpu/softfloat.h                    | 128 ++++++++-
 target/i386/tcg/fpu_helper.c               |  30 +--
 target/riscv/cpu.c                         |  32 ++-
 target/riscv/cpu_cfg_fields.h.inc          |   2 +
 target/riscv/helper.h                      |  15 ++
 target/riscv/insn32.decode                 |   8 +
 target/riscv/insn_trans/trans_rvbf16.c.inc |  32 ++-
 target/riscv/insn_trans/trans_rvofp4.c.inc |  43 +++
 target/riscv/insn_trans/trans_rvofp8.c.inc | 105 ++++++++
 target/riscv/insn_trans/trans_rvv.c.inc    |  39 +++
 target/riscv/tcg/tcg-cpu.c                 |  10 +
 target/riscv/translate.c                   |   2 +
 target/riscv/vector_helper.c               | 135 +++++++++-
 18 files changed, 1072 insertions(+), 204 deletions(-)
 create mode 100644 target/riscv/insn_trans/trans_rvofp4.c.inc
 create mode 100644 target/riscv/insn_trans/trans_rvofp8.c.inc

-- 
2.52.0




* [PATCH v3 01/19] target/riscv: rvv: Fix NOP_UU_B vs2 width
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04  5:17 ` [PATCH v3 02/19] fpu/softfloat: Refactor IEEE format NaN classification to share code Max Chou
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou, Alistair Francis

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/vector_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 5aea553814..ec0ea4c143 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -4972,7 +4972,7 @@ GEN_VEXT_V_ENV(vfwcvtbf16_f_f_v, 4)
 
 /* Narrowing Floating-Point/Integer Type-Convert Instructions */
 /* (TD, T2, TX2) */
-#define NOP_UU_B uint8_t,  uint16_t, uint32_t
+#define NOP_UU_B uint8_t,  uint16_t, uint16_t
 #define NOP_UU_H uint16_t, uint32_t, uint32_t
 #define NOP_UU_W uint32_t, uint64_t, uint64_t
 /* vfncvt.xu.f.v vd, vs2, vm # Convert float to unsigned integer. */
-- 
2.52.0




* [PATCH v3 02/19] fpu/softfloat: Refactor IEEE format NaN classification to share code
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
  2026-02-04  5:17 ` [PATCH v3 01/19] target/riscv: rvv: Fix NOP_UU_B vs2 width Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-05  3:29   ` Richard Henderson
  2026-02-04  5:17 ` [PATCH v3 03/19] fpu/softfloat: Refactor floatx80 " Max Chou
                   ` (17 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou

The floatN_is_[quiet|signaling]_nan functions for the following formats
(float16, bfloat16, float32, float64, float128) contain duplicated
logic that should be shared.
This commit introduces the
[float16|bfloat16|float32|float64|float128]_nan_is_snan helpers, which
determine whether a NaN is signaling.
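
The shape of the refactoring can be sketched outside of QEMU as follows.
This is a simplified float32-only model, not the code in the diff below:
struct nan_cfg and the f32_* names are hypothetical stand-ins for
float_status and the softfloat functions, and the two fields mirror the
no_signaling_nans() and snan_bit_is_one() queries.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two float_status queries used here. */
struct nan_cfg {
    bool no_signaling_nans;
    bool snan_bit_is_one;
};

/* Exponent all ones and a nonzero fraction. */
static bool f32_is_any_nan(uint32_t a)
{
    return (a & 0x7fffffffu) > 0x7f800000u;
}

/* Precondition: a is a NaN. Only the fraction MSB needs to be inspected. */
static bool f32_nan_is_snan(uint32_t a, const struct nan_cfg *cfg)
{
    if (cfg->no_signaling_nans) {
        return false;
    }
    bool frac_msb_is_one = (a >> 22) & 1;
    return frac_msb_is_one == cfg->snan_bit_is_one;
}

/* Both public predicates reduce to "is a NaN" plus the shared helper. */
static bool f32_is_quiet_nan(uint32_t a, const struct nan_cfg *cfg)
{
    return f32_is_any_nan(a) && !f32_nan_is_snan(a, cfg);
}

static bool f32_is_signaling_nan(uint32_t a, const struct nan_cfg *cfg)
{
    return f32_is_any_nan(a) && f32_nan_is_snan(a, cfg);
}
```

With the IEEE 754-2008 convention (snan_bit_is_one false), 0x7fc00000 is
quiet and 0x7fa00000 is signaling; flipping snan_bit_is_one swaps the two,
and setting no_signaling_nans makes every NaN quiet.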

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Max Chou <max.chou@sifive.com>
---
 fpu/softfloat-specialize.c.inc | 176 ++++++++++++++-------------------
 1 file changed, 72 insertions(+), 104 deletions(-)

diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index ba4fa08b7b..7d2515c1fa 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -227,42 +227,26 @@ floatx80 floatx80_default_inf(bool zSign, float_status *status)
 }
 
 /*----------------------------------------------------------------------------
-| Returns 1 if the half-precision floating-point value `a' is a quiet
-| NaN; otherwise returns 0.
+| Determine if a float16 NaN is signaling NaN.
 *----------------------------------------------------------------------------*/
 
-bool float16_is_quiet_nan(float16 a_, float_status *status)
+static bool float16_nan_is_snan(float16 a, float_status *status)
 {
     if (no_signaling_nans(status)) {
-        return float16_is_any_nan(a_);
-    } else {
-        uint16_t a = float16_val(a_);
-        if (snan_bit_is_one(status)) {
-            return (((a >> 9) & 0x3F) == 0x3E) && (a & 0x1FF);
-        } else {
-
-            return ((a >> 9) & 0x3F) == 0x3F;
-        }
+        return false;
     }
+    bool frac_msb_is_one = (a >> 9) & 1;
+    return frac_msb_is_one == snan_bit_is_one(status);
 }
 
 /*----------------------------------------------------------------------------
-| Returns 1 if the bfloat16 value `a' is a quiet
+| Returns 1 if the half-precision floating-point value `a' is a quiet
 | NaN; otherwise returns 0.
 *----------------------------------------------------------------------------*/
 
-bool bfloat16_is_quiet_nan(bfloat16 a_, float_status *status)
+bool float16_is_quiet_nan(float16 a_, float_status *status)
 {
-    if (no_signaling_nans(status)) {
-        return bfloat16_is_any_nan(a_);
-    } else {
-        uint16_t a = a_;
-        if (snan_bit_is_one(status)) {
-            return (((a >> 6) & 0x1FF) == 0x1FE) && (a & 0x3F);
-        } else {
-            return ((a >> 6) & 0x1FF) == 0x1FF;
-        }
-    }
+    return float16_is_any_nan(a_) && !float16_nan_is_snan(a_, status);
 }
 
 /*----------------------------------------------------------------------------
@@ -271,36 +255,52 @@ bool bfloat16_is_quiet_nan(bfloat16 a_, float_status *status)
 *----------------------------------------------------------------------------*/
 
 bool float16_is_signaling_nan(float16 a_, float_status *status)
+{
+    return float16_is_any_nan(a_) && float16_nan_is_snan(a_, status);
+}
+
+/*----------------------------------------------------------------------------
+| Determine if a bfloat16 NaN is signaling NaN.
+*----------------------------------------------------------------------------*/
+
+static bool bfloat16_nan_is_snan(bfloat16 a, float_status *status)
 {
     if (no_signaling_nans(status)) {
-        return 0;
-    } else {
-        uint16_t a = float16_val(a_);
-        if (snan_bit_is_one(status)) {
-            return ((a >> 9) & 0x3F) == 0x3F;
-        } else {
-            return (((a >> 9) & 0x3F) == 0x3E) && (a & 0x1FF);
-        }
+        return false;
     }
+    bool frac_msb_is_one = (a >> 6) & 1;
+    return frac_msb_is_one == snan_bit_is_one(status);
 }
 
 /*----------------------------------------------------------------------------
-| Returns 1 if the bfloat16 value `a' is a signaling
-| NaN; otherwise returns 0.
+| Returns 1 if the bfloat16 value `a' is a quiet NaN; otherwise returns 0.
+*----------------------------------------------------------------------------*/
+
+bool bfloat16_is_quiet_nan(bfloat16 a_, float_status *status)
+{
+    return bfloat16_is_any_nan(a_) && !bfloat16_nan_is_snan(a_, status);
+}
+
+/*----------------------------------------------------------------------------
+| Returns 1 if the bfloat16 value `a' is a signaling NaN; otherwise returns 0.
 *----------------------------------------------------------------------------*/
 
 bool bfloat16_is_signaling_nan(bfloat16 a_, float_status *status)
+{
+    return bfloat16_is_any_nan(a_) && bfloat16_nan_is_snan(a_, status);
+}
+
+/*----------------------------------------------------------------------------
+| Determine if a float32 NaN is signaling NaN.
+*----------------------------------------------------------------------------*/
+
+static bool float32_nan_is_snan(float32 a, float_status *status)
 {
     if (no_signaling_nans(status)) {
-        return 0;
-    } else {
-        uint16_t a = a_;
-        if (snan_bit_is_one(status)) {
-            return ((a >> 6) & 0x1FF) == 0x1FF;
-        } else {
-            return (((a >> 6) & 0x1FF) == 0x1FE) && (a & 0x3F);
-        }
+        return false;
     }
+    bool frac_msb_is_one = (a >> 22) & 1;
+    return frac_msb_is_one == snan_bit_is_one(status);
 }
 
 /*----------------------------------------------------------------------------
@@ -310,16 +310,7 @@ bool bfloat16_is_signaling_nan(bfloat16 a_, float_status *status)
 
 bool float32_is_quiet_nan(float32 a_, float_status *status)
 {
-    if (no_signaling_nans(status)) {
-        return float32_is_any_nan(a_);
-    } else {
-        uint32_t a = float32_val(a_);
-        if (snan_bit_is_one(status)) {
-            return (((a >> 22) & 0x1FF) == 0x1FE) && (a & 0x003FFFFF);
-        } else {
-            return ((uint32_t)(a << 1) >= 0xFF800000);
-        }
-    }
+    return float32_is_any_nan(a_) && !float32_nan_is_snan(a_, status);
 }
 
 /*----------------------------------------------------------------------------
@@ -328,17 +319,21 @@ bool float32_is_quiet_nan(float32 a_, float_status *status)
 *----------------------------------------------------------------------------*/
 
 bool float32_is_signaling_nan(float32 a_, float_status *status)
+{
+    return float32_is_any_nan(a_) && float32_nan_is_snan(a_, status);
+}
+
+/*----------------------------------------------------------------------------
+| Determine if a float64 NaN is signaling NaN.
+*----------------------------------------------------------------------------*/
+
+static bool float64_nan_is_snan(float64 a, float_status *status)
 {
     if (no_signaling_nans(status)) {
-        return 0;
-    } else {
-        uint32_t a = float32_val(a_);
-        if (snan_bit_is_one(status)) {
-            return ((uint32_t)(a << 1) >= 0xFF800000);
-        } else {
-            return (((a >> 22) & 0x1FF) == 0x1FE) && (a & 0x003FFFFF);
-        }
+        return false;
     }
+    bool frac_msb_is_one = (a >> 51) & 1;
+    return frac_msb_is_one == snan_bit_is_one(status);
 }
 
 /*----------------------------------------------------------------------------
@@ -348,17 +343,7 @@ bool float32_is_signaling_nan(float32 a_, float_status *status)
 
 bool float64_is_quiet_nan(float64 a_, float_status *status)
 {
-    if (no_signaling_nans(status)) {
-        return float64_is_any_nan(a_);
-    } else {
-        uint64_t a = float64_val(a_);
-        if (snan_bit_is_one(status)) {
-            return (((a >> 51) & 0xFFF) == 0xFFE)
-                && (a & 0x0007FFFFFFFFFFFFULL);
-        } else {
-            return ((a << 1) >= 0xFFF0000000000000ULL);
-        }
-    }
+    return float64_is_any_nan(a_) && !float64_nan_is_snan(a_, status);
 }
 
 /*----------------------------------------------------------------------------
@@ -368,17 +353,7 @@ bool float64_is_quiet_nan(float64 a_, float_status *status)
 
 bool float64_is_signaling_nan(float64 a_, float_status *status)
 {
-    if (no_signaling_nans(status)) {
-        return 0;
-    } else {
-        uint64_t a = float64_val(a_);
-        if (snan_bit_is_one(status)) {
-            return ((a << 1) >= 0xFFF0000000000000ULL);
-        } else {
-            return (((a >> 51) & 0xFFF) == 0xFFE)
-                && (a & UINT64_C(0x0007FFFFFFFFFFFF));
-        }
-    }
+    return float64_is_any_nan(a_) && float64_nan_is_snan(a_, status);
 }
 
 /*----------------------------------------------------------------------------
@@ -444,6 +419,19 @@ floatx80 floatx80_silence_nan(floatx80 a, float_status *status)
     return a;
 }
 
+/*----------------------------------------------------------------------------
+| Determine if a float128 NaN is signaling NaN.
+*----------------------------------------------------------------------------*/
+
+static bool float128_nan_is_snan(float128 a, float_status *status)
+{
+    if (no_signaling_nans(status)) {
+        return false;
+    }
+    bool frac_msb_is_one = (a.high >> 47) & 1;
+    return frac_msb_is_one == snan_bit_is_one(status);
+}
+
 /*----------------------------------------------------------------------------
 | Returns 1 if the quadruple-precision floating-point value `a' is a quiet
 | NaN; otherwise returns 0.
@@ -451,17 +439,7 @@ floatx80 floatx80_silence_nan(floatx80 a, float_status *status)
 
 bool float128_is_quiet_nan(float128 a, float_status *status)
 {
-    if (no_signaling_nans(status)) {
-        return float128_is_any_nan(a);
-    } else {
-        if (snan_bit_is_one(status)) {
-            return (((a.high >> 47) & 0xFFFF) == 0xFFFE)
-                && (a.low || (a.high & 0x00007FFFFFFFFFFFULL));
-        } else {
-            return ((a.high << 1) >= 0xFFFF000000000000ULL)
-                && (a.low || (a.high & 0x0000FFFFFFFFFFFFULL));
-        }
-    }
+    return float128_is_any_nan(a) && !float128_nan_is_snan(a, status);
 }
 
 /*----------------------------------------------------------------------------
@@ -471,15 +449,5 @@ bool float128_is_quiet_nan(float128 a, float_status *status)
 
 bool float128_is_signaling_nan(float128 a, float_status *status)
 {
-    if (no_signaling_nans(status)) {
-        return 0;
-    } else {
-        if (snan_bit_is_one(status)) {
-            return ((a.high << 1) >= 0xFFFF000000000000ULL)
-                && (a.low || (a.high & 0x0000FFFFFFFFFFFFULL));
-        } else {
-            return (((a.high >> 47) & 0xFFFF) == 0xFFFE)
-                && (a.low || (a.high & UINT64_C(0x00007FFFFFFFFFFF)));
-        }
-    }
+    return float128_is_any_nan(a) && float128_nan_is_snan(a, status);
 }
-- 
2.52.0




* Re: [PATCH v3 02/19] fpu/softfloat: Refactor IEEE format NaN classification to share code
  2026-02-04  5:17 ` [PATCH v3 02/19] fpu/softfloat: Refactor IEEE format NaN classification to share code Max Chou
@ 2026-02-05  3:29   ` Richard Henderson
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2026-02-05  3:29 UTC (permalink / raw)
  To: Max Chou, qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On 2/4/26 15:17, Max Chou wrote:
> The floatN_is_[quiet|signaling]_nan functions for following formats
> (float16, bfloat16, float32, float64, float128) contain duplicated
> logic that should be shared.
> This commit introduces
> [float16|bfloat16|float32|float64|float128]_nan_is_snan that determine
> if a NaN is signaling.
> 
> Suggested-by: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Max Chou <max.chou@sifive.com>
> ---
>   fpu/softfloat-specialize.c.inc | 176 ++++++++++++++-------------------
>   1 file changed, 72 insertions(+), 104 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* [PATCH v3 03/19] fpu/softfloat: Refactor floatx80 format NaN classification to share code
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
  2026-02-04  5:17 ` [PATCH v3 01/19] target/riscv: rvv: Fix NOP_UU_B vs2 width Max Chou
  2026-02-04  5:17 ` [PATCH v3 02/19] fpu/softfloat: Refactor IEEE format NaN classification to share code Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-05  3:31   ` Richard Henderson
  2026-02-04  5:17 ` [PATCH v3 04/19] target/i386: Fix pseudo-NaN handling in FPATAN/FYL2XP1/FYL2X helpers Max Chou
                   ` (16 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou

The floatx80_is_[quiet|signaling]_nan functions contain duplicated
logic that should be shared.
This commit introduces the floatx80_nan_is_snan helper function, which
determines whether a NaN is signaling, and changes the return type of
floatx80_is_[signaling|quiet]_nan to bool.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Max Chou <max.chou@sifive.com>
---
 fpu/softfloat-specialize.c.inc | 55 +++++++++++++---------------------
 include/fpu/softfloat.h        |  4 +--
 2 files changed, 22 insertions(+), 37 deletions(-)

diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index 7d2515c1fa..9ed968c79b 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -357,53 +357,38 @@ bool float64_is_signaling_nan(float64 a_, float_status *status)
 }
 
 /*----------------------------------------------------------------------------
-| Returns 1 if the extended double-precision floating-point value `a' is a
-| quiet NaN; otherwise returns 0. This slightly differs from the same
-| function for other types as floatx80 has an explicit bit.
+| Determine if a floatx80 NaN is signaling NaN.
+| The MSB of frac differs from the same function for other types as floatx80
+| has an explicit bit.
 *----------------------------------------------------------------------------*/
 
-int floatx80_is_quiet_nan(floatx80 a, float_status *status)
+static bool floatx80_nan_is_snan(floatx80 a, float_status *status)
 {
     if (no_signaling_nans(status)) {
-        return floatx80_is_any_nan(a);
-    } else {
-        if (snan_bit_is_one(status)) {
-            uint64_t aLow;
-
-            aLow = a.low & ~0x4000000000000000ULL;
-            return ((a.high & 0x7FFF) == 0x7FFF)
-                && (aLow << 1)
-                && (a.low == aLow);
-        } else {
-            return ((a.high & 0x7FFF) == 0x7FFF)
-                && (UINT64_C(0x8000000000000000) <= ((uint64_t)(a.low << 1)));
-        }
+        return false;
     }
+    bool frac_msb_is_one = (a.low >> 62) & 1;
+    return frac_msb_is_one == snan_bit_is_one(status);
 }
 
 /*----------------------------------------------------------------------------
 | Returns 1 if the extended double-precision floating-point value `a' is a
-| signaling NaN; otherwise returns 0. This slightly differs from the same
-| function for other types as floatx80 has an explicit bit.
+| quiet NaN; otherwise returns 0.
 *----------------------------------------------------------------------------*/
 
-int floatx80_is_signaling_nan(floatx80 a, float_status *status)
+bool floatx80_is_quiet_nan(floatx80 a, float_status *status)
 {
-    if (no_signaling_nans(status)) {
-        return 0;
-    } else {
-        if (snan_bit_is_one(status)) {
-            return ((a.high & 0x7FFF) == 0x7FFF)
-                && ((a.low << 1) >= 0x8000000000000000ULL);
-        } else {
-            uint64_t aLow;
-
-            aLow = a.low & ~UINT64_C(0x4000000000000000);
-            return ((a.high & 0x7FFF) == 0x7FFF)
-                && (uint64_t)(aLow << 1)
-                && (a.low == aLow);
-        }
-    }
+    return floatx80_is_any_nan(a) && !floatx80_nan_is_snan(a, status);
+}
+
+/*----------------------------------------------------------------------------
+| Returns 1 if the extended double-precision floating-point value `a' is a
+| signaling NaN; otherwise returns 0.
+*----------------------------------------------------------------------------*/
+
+bool floatx80_is_signaling_nan(floatx80 a, float_status *status)
+{
+    return floatx80_is_any_nan(a) && floatx80_nan_is_snan(a, status);
 }
 
 /*----------------------------------------------------------------------------
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index c18ab2cb60..ac6a392375 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -978,8 +978,8 @@ floatx80 floatx80_rem(floatx80, floatx80, float_status *status);
 floatx80 floatx80_sqrt(floatx80, float_status *status);
 FloatRelation floatx80_compare(floatx80, floatx80, float_status *status);
 FloatRelation floatx80_compare_quiet(floatx80, floatx80, float_status *status);
-int floatx80_is_quiet_nan(floatx80, float_status *status);
-int floatx80_is_signaling_nan(floatx80, float_status *status);
+bool floatx80_is_quiet_nan(floatx80, float_status *status);
+bool floatx80_is_signaling_nan(floatx80, float_status *status);
 floatx80 floatx80_silence_nan(floatx80, float_status *status);
 floatx80 floatx80_scalbn(floatx80, int, float_status *status);
 
-- 
2.52.0




* Re: [PATCH v3 03/19] fpu/softfloat: Refactor floatx80 format NaN classification to share code
  2026-02-04  5:17 ` [PATCH v3 03/19] fpu/softfloat: Refactor floatx80 " Max Chou
@ 2026-02-05  3:31   ` Richard Henderson
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2026-02-05  3:31 UTC (permalink / raw)
  To: Max Chou, qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On 2/4/26 15:17, Max Chou wrote:
> The floatx80_is_[quiet|signaling]_nan functions contain duplicated
> logic that should be shared.
> This commit introduces floatx80_nan_is_snan helper function that
> determine if a NaN is signaling and change the return type of
> floatx80_is_[signaling|quiet]_nan to bool.
> 
> Suggested-by: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Max Chou <max.chou@sifive.com>
> ---

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* [PATCH v3 04/19] target/i386: Fix pseudo-NaN handling in FPATAN/FYL2XP1/FYL2X helpers
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (2 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 03/19] fpu/softfloat: Refactor floatx80 " Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-05  3:34   ` Richard Henderson
  2026-02-04  5:17 ` [PATCH v3 05/19] fpu/softfloat: Support OCP(Open Compute Project) OFP8 data type Max Chou
                   ` (15 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou

According to Intel's x87 FPU specification (Table 8-10, Vol. 1), arithmetic
operations on operands in unsupported formats (including pseudo-NaNs) must
return the QNaN floating-point indefinite value.

The helper functions for FPATAN, FYL2XP1, and FYL2X incorrectly check for
signaling NaN before checking for invalid encodings. This causes pseudo-NaNs
to be treated as valid signaling NaNs and silenced, rather than being
rejected as unsupported formats.

Reorder the checks to test floatx80_invalid_encoding before
floatx80_is_signaling_nan, matching the correct behavior already
implemented in helper_fscale.
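
The effect of the reordering can be shown with a small standalone model.
This is a sketch only: struct x80, the x80_* predicates and the classify_*
functions are hypothetical simplifications (one operand, no float_status
argument, x86 quiet-bit convention assumed), not the helpers in
fpu_helper.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for floatx80: sign/exponent word plus significand. */
struct x80 {
    uint16_t high;
    uint64_t low;
};

enum result { RESULT_DEFAULT_NAN, RESULT_SILENCED_NAN, RESULT_OTHER };

/* Nonzero exponent with the explicit integer bit clear (e.g. pseudo-NaN). */
static bool x80_invalid_encoding(struct x80 a)
{
    return (a.low >> 63) == 0 && (a.high & 0x7fff) != 0;
}

/* Looks only at exponent and fraction, so a pseudo-NaN also matches. */
static bool x80_is_signaling_nan(struct x80 a)
{
    bool any_nan = (a.high & 0x7fff) == 0x7fff && (uint64_t)(a.low << 1) != 0;

    return any_nan && ((a.low >> 62) & 1) == 0;
}

/* Old order: the signaling-NaN test shadows the invalid-encoding test. */
static enum result classify_old(struct x80 a)
{
    if (x80_is_signaling_nan(a)) {
        return RESULT_SILENCED_NAN;
    } else if (x80_invalid_encoding(a)) {
        return RESULT_DEFAULT_NAN;
    }
    return RESULT_OTHER;
}

/* New order: unsupported encodings are rejected first. */
static enum result classify_new(struct x80 a)
{
    if (x80_invalid_encoding(a)) {
        return RESULT_DEFAULT_NAN;
    } else if (x80_is_signaling_nan(a)) {
        return RESULT_SILENCED_NAN;
    }
    return RESULT_OTHER;
}
```

In this model a pseudo-NaN such as { 0x7fff, 0x0000000000000001 } is
silenced by classify_old but yields the default NaN with classify_new,
while a genuine signaling NaN such as { 0x7fff, 0x8000000000000001 } is
silenced by both.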

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/i386/tcg/fpu_helper.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index b3b23823fd..37c83ded38 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -1377,16 +1377,16 @@ void helper_fpatan(CPUX86State *env)
     int32_t arg1_exp = extractFloatx80Exp(ST1);
     bool arg1_sign = extractFloatx80Sign(ST1);
 
-    if (floatx80_is_signaling_nan(ST0, &env->fp_status)) {
+    if (floatx80_invalid_encoding(ST0, &env->fp_status) ||
+        floatx80_invalid_encoding(ST1, &env->fp_status)) {
+        float_raise(float_flag_invalid, &env->fp_status);
+        ST1 = floatx80_default_nan(&env->fp_status);
+    } else if (floatx80_is_signaling_nan(ST0, &env->fp_status)) {
         float_raise(float_flag_invalid, &env->fp_status);
         ST1 = floatx80_silence_nan(ST0, &env->fp_status);
     } else if (floatx80_is_signaling_nan(ST1, &env->fp_status)) {
         float_raise(float_flag_invalid, &env->fp_status);
         ST1 = floatx80_silence_nan(ST1, &env->fp_status);
-    } else if (floatx80_invalid_encoding(ST0, &env->fp_status) ||
-               floatx80_invalid_encoding(ST1, &env->fp_status)) {
-        float_raise(float_flag_invalid, &env->fp_status);
-        ST1 = floatx80_default_nan(&env->fp_status);
     } else if (floatx80_is_any_nan(ST0)) {
         ST1 = ST0;
     } else if (floatx80_is_any_nan(ST1)) {
@@ -2061,16 +2061,16 @@ void helper_fyl2xp1(CPUX86State *env)
     int32_t arg1_exp = extractFloatx80Exp(ST1);
     bool arg1_sign = extractFloatx80Sign(ST1);
 
-    if (floatx80_is_signaling_nan(ST0, &env->fp_status)) {
+    if (floatx80_invalid_encoding(ST0, &env->fp_status) ||
+        floatx80_invalid_encoding(ST1, &env->fp_status)) {
+        float_raise(float_flag_invalid, &env->fp_status);
+        ST1 = floatx80_default_nan(&env->fp_status);
+    } else if (floatx80_is_signaling_nan(ST0, &env->fp_status)) {
         float_raise(float_flag_invalid, &env->fp_status);
         ST1 = floatx80_silence_nan(ST0, &env->fp_status);
     } else if (floatx80_is_signaling_nan(ST1, &env->fp_status)) {
         float_raise(float_flag_invalid, &env->fp_status);
         ST1 = floatx80_silence_nan(ST1, &env->fp_status);
-    } else if (floatx80_invalid_encoding(ST0, &env->fp_status) ||
-               floatx80_invalid_encoding(ST1, &env->fp_status)) {
-        float_raise(float_flag_invalid, &env->fp_status);
-        ST1 = floatx80_default_nan(&env->fp_status);
     } else if (floatx80_is_any_nan(ST0)) {
         ST1 = ST0;
     } else if (floatx80_is_any_nan(ST1)) {
@@ -2159,16 +2159,16 @@ void helper_fyl2x(CPUX86State *env)
     int32_t arg1_exp = extractFloatx80Exp(ST1);
     bool arg1_sign = extractFloatx80Sign(ST1);
 
-    if (floatx80_is_signaling_nan(ST0, &env->fp_status)) {
+    if (floatx80_invalid_encoding(ST0, &env->fp_status) ||
+        floatx80_invalid_encoding(ST1, &env->fp_status)) {
+        float_raise(float_flag_invalid, &env->fp_status);
+        ST1 = floatx80_default_nan(&env->fp_status);
+    } else if (floatx80_is_signaling_nan(ST0, &env->fp_status)) {
         float_raise(float_flag_invalid, &env->fp_status);
         ST1 = floatx80_silence_nan(ST0, &env->fp_status);
     } else if (floatx80_is_signaling_nan(ST1, &env->fp_status)) {
         float_raise(float_flag_invalid, &env->fp_status);
         ST1 = floatx80_silence_nan(ST1, &env->fp_status);
-    } else if (floatx80_invalid_encoding(ST0, &env->fp_status) ||
-               floatx80_invalid_encoding(ST1, &env->fp_status)) {
-        float_raise(float_flag_invalid, &env->fp_status);
-        ST1 = floatx80_default_nan(&env->fp_status);
     } else if (floatx80_is_any_nan(ST0)) {
         ST1 = ST0;
     } else if (floatx80_is_any_nan(ST1)) {
-- 
2.52.0




* Re: [PATCH v3 04/19] target/i386: Fix pseudo-NaN handling in FPATAN/FYL2XP1/FYL2X helpers
  2026-02-04  5:17 ` [PATCH v3 04/19] target/i386: Fix pseudo-NaN handling in FPATAN/FYL2XP1/FYL2X helpers Max Chou
@ 2026-02-05  3:34   ` Richard Henderson
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Henderson @ 2026-02-05  3:34 UTC (permalink / raw)
  To: Max Chou, qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On 2/4/26 15:17, Max Chou wrote:
> According to Intel's x87 FPU specification (Table 8-10, Vol. 1), arithmetic
> operations on operands in unsupported formats (including pseudo-NaNs) must
> return the QNaN floating-point indefinite value.
> 
> The helper functions for FPATAN, FYL2XP1, and FYL2X incorrectly check for
> signaling NaN before checking for invalid encodings. This causes pseudo-NaNs
> to be treated as valid signaling NaNs and silenced, rather than being
> rejected as unsupported formats.
> 
> Reorder the checks to test floatx80_invalid_encoding before
> floatx80_is_signaling_nan, matching the correct behavior already
> implemented in helper_fscale.
> 
> Signed-off-by: Max Chou <max.chou@sifive.com>
> ---
>   target/i386/tcg/fpu_helper.c | 30 +++++++++++++++---------------
>   1 file changed, 15 insertions(+), 15 deletions(-)

Good catch.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



* [PATCH v3 05/19] fpu/softfloat: Support OCP(Open Compute Project) OFP8 data type
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (3 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 04/19] target/i386: Fix pseudo-NaN handling in FPATAN/FYL2XP1/FYL2X helpers Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-05  4:36   ` Richard Henderson
  2026-02-05 13:21   ` Chao Liu
  2026-02-04  5:17 ` [PATCH v3 06/19] fpu/softfloat: Support OCP(Open Compute Project) OFP4 " Max Chou
                   ` (14 subsequent siblings)
  19 siblings, 2 replies; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou

This commit adds the implementation-defined behavior flags and the basic
operation support for the OCP float8 data types (E4M3 and E5M2).

According to the OFP8 spec, the result of converting an infinity from a
wider format depends on the saturation mode.

Signed-off-by: Max Chou <max.chou@sifive.com>
---
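A minimal sketch of the infinity rule described above, not code from this
series. It assumes the two conversion modes of the OFP8 spec (saturating
and non-saturating) and shows which 8-bit encoding an infinity from a wider
format would map to in each. The helper name ofp8_from_inf and the format
enum are hypothetical, and returning the positive NaN 0x7f for E4M3 is an
assumption made here for simplicity (the patch takes the NaN sign from
default_nan_pattern).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical format selector, for illustration only. */
enum ofp8_format { OFP8_E4M3, OFP8_E5M2 };

/*
 * Return the OFP8 encoding that an infinity of the given sign converts to.
 *   E5M2: infinity is 0x7c, max normal is 0x7b.
 *   E4M3: no infinity; max normal is 0x7e, NaN is 0x7f.
 */
static uint8_t ofp8_from_inf(enum ofp8_format fmt, bool sign, bool saturate)
{
    uint8_t s = sign ? 0x80 : 0x00;

    if (fmt == OFP8_E5M2) {
        /* Saturating: clamp to max normal; otherwise keep infinity. */
        return (uint8_t)(s | (saturate ? 0x7b : 0x7c));
    }

    /* E4M3, saturating: clamp to max normal, keeping the sign. */
    if (saturate) {
        return (uint8_t)(s | 0x7e);
    }

    /* E4M3, non-saturating: no infinity exists, so the result is NaN. */
    return 0x7f;
}
```

For example, ofp8_from_inf(OFP8_E5M2, true, true) yields 0xfb (negative max
normal), while ofp8_from_inf(OFP8_E4M3, false, false) yields 0x7f (NaN).
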
 fpu/softfloat-parts.c.inc      | 159 +++++++++++++++++++++------
 fpu/softfloat-specialize.c.inc |  62 +++++++++++
 fpu/softfloat.c                | 191 +++++++++++++++++++++++++++++++--
 include/fpu/softfloat-types.h  |  12 +++
 include/fpu/softfloat.h        |  81 ++++++++++++++
 5 files changed, 467 insertions(+), 38 deletions(-)

diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index 5e0438fc0b..eee7daae4d 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -227,11 +227,28 @@ static void partsN(canonicalize)(FloatPartsN *p, float_status *status,
             p->exp = fmt->frac_shift - fmt->exp_bias
                    - shift + !has_pseudo_denormals;
         }
-    } else if (likely(p->exp < fmt->exp_max) || fmt->arm_althp) {
+    } else if (likely(p->exp < fmt->exp_max)) {
         p->cls = float_class_normal;
         p->exp -= fmt->exp_bias;
         frac_shl(p, fmt->frac_shift);
         p->frac_hi |= DECOMPOSED_IMPLICIT_BIT;
+    } else if (fmt->limited_nan) {
+        /*
+         * Formats with limited NaN encodings (E4M3, E2M1, ARM Alt HP).
+         */
+        frac_shl(p, fmt->frac_shift);
+        p->frac_hi |= DECOMPOSED_IMPLICIT_BIT;
+        if (fmt->normal_frac_max == NORMAL_FRAC_MAX_ALL ||
+            p->frac_hi <= fmt->normal_frac_max) {
+            p->cls = float_class_normal;
+            p->exp -= fmt->exp_bias;
+        } else {
+            if (parts_is_snan_frac(p->frac_hi, status)) {
+                p->cls = float_class_snan;
+            } else {
+                p->cls = float_class_qnan;
+            }
+        }
     } else if (likely(frac_eqz(p))) {
         p->cls = float_class_inf;
     } else {
@@ -241,14 +258,39 @@ static void partsN(canonicalize)(FloatPartsN *p, float_status *status,
     }
 }
 
+/*
+ * Set FloatPartsN to the maximum normal value for the given format.
+ * - IEEE formats (!no_infinity): exp = exp_max - 1, frac = all ones
+ * - Limited NaN formats (E4M3): exp = exp_max, frac = normal_frac_max
+ * - No NaN/Inf formats (E2M1, ARM AHP): exp = exp_max, frac = all ones
+ */
+static void partsN(set_max_normal)(FloatPartsN *p, const FloatFmt *fmt)
+{
+    if (!fmt->no_infinity) {
+        p->exp = fmt->exp_max - 1;
+        frac_allones(p);
+    } else if (fmt->normal_frac_max != NORMAL_FRAC_MAX_ALL) {
+        p->exp = fmt->exp_max;
+        frac_clear(p);
+        p->frac_hi = fmt->normal_frac_max;
+    } else {
+        p->exp = fmt->exp_max;
+        frac_allones(p);
+    }
+}
+
 /*
  * Round and uncanonicalize a floating-point number by parts. There
  * are FRAC_SHIFT bits that may require rounding at the bottom of the
  * fraction; these bits will be removed. The exponent will be biased
  * by EXP_BIAS and must be bounded by [EXP_MAX-1, 0].
+ *
+ * The saturate parameter controls saturation behavior for formats that
+ * support it (OCP FP8 E4M3/E5M2). When true, overflow produces max normal
+ * instead of infinity (E5M2) or NaN (E4M3).
  */
 static void partsN(uncanon_normal)(FloatPartsN *p, float_status *s,
-                                   const FloatFmt *fmt)
+                                   const FloatFmt *fmt, bool saturate)
 {
     const int exp_max = fmt->exp_max;
     const int frac_shift = fmt->frac_shift;
@@ -256,8 +298,8 @@ static void partsN(uncanon_normal)(FloatPartsN *p, float_status *s,
     const uint64_t frac_lsb = round_mask + 1;
     const uint64_t frac_lsbm1 = round_mask ^ (round_mask >> 1);
     const uint64_t roundeven_mask = round_mask | frac_lsb;
+    bool overflow_norm = saturate;
     uint64_t inc;
-    bool overflow_norm = false;
     int exp, flags = 0;
 
     switch (s->float_rounding_mode) {
@@ -313,30 +355,64 @@ static void partsN(uncanon_normal)(FloatPartsN *p, float_status *s,
             }
             p->frac_lo &= ~round_mask;
         }
+        p->exp = exp;
 
-        if (fmt->arm_althp) {
-            /* ARM Alt HP eschews Inf and NaN for a wider exponent.  */
-            if (unlikely(exp > exp_max)) {
-                /* Overflow.  Return the maximum normal.  */
-                flags = float_flag_invalid;
-                exp = exp_max;
-                frac_allones(p);
-                p->frac_lo &= ~round_mask;
+        /*
+         * Unified overflow handling based on format capabilities.
+         * 1. Format has infinity -> overflow to infinity (or saturate)
+         * 2. Format has NaN but no infinity -> overflow to NaN (or saturate)
+         * 3. Format has neither -> always saturate
+         */
+        if (!fmt->no_infinity) {
+            if (unlikely(exp >= exp_max)) {
+                flags |= float_flag_overflow;
+                if (s->rebias_overflow) {
+                    exp -= fmt->exp_re_bias;
+                } else if (overflow_norm) {
+                    flags |= float_flag_inexact;
+                    parts_set_max_normal(p, fmt);
+                    exp = p->exp;
+                    p->frac_lo &= ~round_mask;
+                } else {
+                    flags |= float_flag_inexact;
+                    p->cls = float_class_inf;
+                    exp = exp_max;
+                    frac_clear(p);
+                }
             }
-        } else if (unlikely(exp >= exp_max)) {
-            flags |= float_flag_overflow;
-            if (s->rebias_overflow) {
-                exp -= fmt->exp_re_bias;
-            } else if (overflow_norm) {
+        } else if (fmt_has_nan_encoding(fmt)) {
+            bool is_overflow = (exp > exp_max) ||
+                               (exp == exp_max &&
+                                p->frac_hi > fmt->normal_frac_max);
+
+            if (unlikely(is_overflow)) {
+                flags |= float_flag_overflow;
                 flags |= float_flag_inexact;
-                exp = exp_max - 1;
-                frac_allones(p);
+
+                if (overflow_norm) {
+                    parts_set_max_normal(p, fmt);
+                    exp = p->exp;
+                } else {
+                    uint8_t dnan = s->default_nan_pattern;
+                    p->cls = float_class_qnan;
+                    p->sign = dnan >> 7;
+                    exp = exp_max;
+                    frac_allones(p);
+                }
+            }
+        } else {
+            if (unlikely(exp > exp_max)) {
+                if (fmt->overflow_raises_invalid) {
+                    /* ARM Alt HP: raise Invalid, not Overflow */
+                    flags = float_flag_invalid;
+                } else {
+                    flags |= float_flag_overflow;
+                    flags |= float_flag_inexact;
+                }
+
+                parts_set_max_normal(p, fmt);
+                exp = p->exp;
                 p->frac_lo &= ~round_mask;
-            } else {
-                flags |= float_flag_inexact;
-                p->cls = float_class_inf;
-                exp = exp_max;
-                frac_clear(p);
             }
         }
         frac_shr(p, frac_shift);
@@ -422,11 +498,11 @@ static void partsN(uncanon_normal)(FloatPartsN *p, float_status *s,
     float_raise(flags, s);
 }
 
-static void partsN(uncanon)(FloatPartsN *p, float_status *s,
-                            const FloatFmt *fmt)
+static void partsN(uncanon_sat)(FloatPartsN *p, float_status *s,
+                                const FloatFmt *fmt, bool saturate)
 {
     if (likely(is_anynorm(p->cls))) {
-        parts_uncanon_normal(p, s, fmt);
+        parts_uncanon_normal(p, s, fmt, saturate);
     } else {
         switch (p->cls) {
         case float_class_zero:
@@ -434,13 +510,30 @@ static void partsN(uncanon)(FloatPartsN *p, float_status *s,
             frac_clear(p);
             return;
         case float_class_inf:
-            g_assert(!fmt->arm_althp);
-            p->exp = fmt->exp_max;
-            frac_clear(p);
+            /*
+             * Unified infinity handling using format capabilities.
+             * Formats with no_infinity must convert infinity to something else
+             */
+            if (!fmt->no_infinity) {
+                p->exp = fmt->exp_max;
+                frac_clear(p);
+            } else if (fmt_has_nan_encoding(fmt)) {
+                if (saturate) {
+                    parts_set_max_normal(p, fmt);
+                } else {
+                    uint8_t dnan = s->default_nan_pattern;
+                    p->cls = float_class_qnan;
+                    p->sign = dnan >> 7;
+                    p->exp = fmt->exp_max;
+                    frac_allones(p);
+                }
+            } else {
+                parts_set_max_normal(p, fmt);
+            }
             return;
         case float_class_qnan:
         case float_class_snan:
-            g_assert(!fmt->arm_althp);
+            g_assert(fmt_has_nan_encoding(fmt));
             p->exp = fmt->exp_max;
             frac_shr(p, fmt->frac_shift);
             return;
@@ -451,6 +544,12 @@ static void partsN(uncanon)(FloatPartsN *p, float_status *s,
     }
 }
 
+static void partsN(uncanon)(FloatPartsN *p, float_status *s,
+                            const FloatFmt *fmt)
+{
+    partsN(uncanon_sat)(p, s, fmt, false);
+}
+
 /*
  * Returns the result of adding or subtracting the values of the
  * floating-point values `a' and `b'. The operation is performed
diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index 9ed968c79b..40c574283f 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -226,6 +226,68 @@ floatx80 floatx80_default_inf(bool zSign, float_status *status)
     return packFloatx80(zSign, 0x7fff, z ? 0 : (1ULL << 63));
 }
 
+/*----------------------------------------------------------------------------
+| Determine if an OCP FP8 E4M3 NaN is signaling NaN.
+| E4M3 has only one NaN encoding, so classification is policy-based.
+*----------------------------------------------------------------------------*/
+
+static bool float8_e4m3_nan_is_snan(float8_e4m3 a, float_status *status)
+{
+    if (no_signaling_nans(status)) {
+        return false;
+    }
+    return snan_bit_is_one(status);
+}
+
+/*----------------------------------------------------------------------------
+| Returns 1 if the OCP FP8 E4M3 value `a' is a quiet NaN; otherwise returns 0.
+*----------------------------------------------------------------------------*/
+
+bool float8_e4m3_is_quiet_nan(float8_e4m3 a_, float_status *status)
+{
+    return float8_e4m3_is_any_nan(a_) && !float8_e4m3_nan_is_snan(a_, status);
+}
+
+/*----------------------------------------------------------------------------
+| Returns 1 if the OCP FP8 E4M3 value `a' is a signaling NaN; otherwise 0.
+*----------------------------------------------------------------------------*/
+
+bool float8_e4m3_is_signaling_nan(float8_e4m3 a_, float_status *status)
+{
+    return float8_e4m3_is_any_nan(a_) && float8_e4m3_nan_is_snan(a_, status);
+}
+
+/*----------------------------------------------------------------------------
+| Determine if an OCP FP8 E5M2 NaN is signaling NaN.
+*----------------------------------------------------------------------------*/
+
+static bool float8_e5m2_nan_is_snan(float8_e5m2 a, float_status *status)
+{
+    if (no_signaling_nans(status)) {
+        return false;
+    }
+    bool frac_msb_is_one = (a >> 1) & 1;
+    return frac_msb_is_one == snan_bit_is_one(status);
+}
+
+/*----------------------------------------------------------------------------
+| Returns 1 if the OCP FP8 E5M2 value `a' is a quiet NaN; otherwise returns 0.
+*----------------------------------------------------------------------------*/
+
+bool float8_e5m2_is_quiet_nan(float8_e5m2 a_, float_status *status)
+{
+    return float8_e5m2_is_any_nan(a_) && !float8_e5m2_nan_is_snan(a_, status);
+}
+
+/*----------------------------------------------------------------------------
+| Returns 1 if the OCP FP8 E5M2 value `a' is a signaling NaN; otherwise 0.
+*----------------------------------------------------------------------------*/
+
+bool float8_e5m2_is_signaling_nan(float8_e5m2 a_, float_status *status)
+{
+    return float8_e5m2_is_any_nan(a_) && float8_e5m2_nan_is_snan(a_, status);
+}
+
 /*----------------------------------------------------------------------------
 | Determine if a float16 NaN is signaling NaN.
 *----------------------------------------------------------------------------*/
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 8094358c2e..533f96dcda 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -522,6 +522,13 @@ typedef struct {
 #define DECOMPOSED_BINARY_POINT    63
 #define DECOMPOSED_IMPLICIT_BIT    (1ull << DECOMPOSED_BINARY_POINT)
 
+/*
+ * Sentinel value for normal_frac_max indicating "all fraction values at
+ * exp_max are normal" (i.e., the format has no NaN encoding at exp_max).
+ * Used by E2M1 and ARM Alternative Half Precision formats.
+ */
+#define NORMAL_FRAC_MAX_ALL        0
+
 /* Structure holding all of the relevant parameters for a format.
  *   exp_size: the size of the exponent field
  *   exp_bias: the offset applied to the exponent field
@@ -542,11 +549,39 @@ typedef struct {
     int exp_max;
     int frac_size;
     int frac_shift;
-    bool arm_althp;
     bool has_explicit_bit;
     uint64_t round_mask;
+    /*
+     * Format capability flags:
+     * no_infinity: Format has no infinity encoding. When true, exp=exp_max
+     *   with frac=0 is NOT infinity - it's either NaN or max normal.
+     *
+     * limited_nan: Format has limited or no NaN patterns. When combined
+     *   with normal_frac_max, determines NaN encoding capability:
+     *   - limited_nan=false: Standard IEEE NaN (exp=exp_max, frac!=0)
+     *   - limited_nan=true && normal_frac_max!=0: Limited NaN (E4M3)
+     *   - limited_nan=true && normal_frac_max==0: No NaN encoding (AHP, E2M1)
+     *
+     * overflow_raises_invalid: Raise Invalid (not Overflow) exception.
+     *   ARM Alt HP uses this to signal overflow as an invalid operation.
+     *
+     * normal_frac_max: For formats with limited_nan, the maximum fraction
+     *   value (after normalization shift, including implicit bit) that is
+     *   still considered normal at exp=exp_max.
+     *   Use NORMAL_FRAC_MAX_ALL (0) to indicate all frac values at exp_max
+     *   are normal (E2M1, ARM Alt HP), which also implies no NaN encoding.
+     */
+    bool no_infinity;
+    bool limited_nan;
+    bool overflow_raises_invalid;
+    uint64_t normal_frac_max;
 } FloatFmt;
 
+static inline bool fmt_has_nan_encoding(const FloatFmt *fmt)
+{
+    return !fmt->limited_nan || fmt->normal_frac_max != NORMAL_FRAC_MAX_ALL;
+}
+
 /* Expand fields based on the size of exponent and fraction */
 #define FLOAT_PARAMS_(E)                                \
     .exp_size       = E,                                \
@@ -560,13 +595,27 @@ typedef struct {
     .frac_shift     = (-F - 1) & 63,                    \
     .round_mask     = (1ull << ((-F - 1) & 63)) - 1
 
+static const FloatFmt float8_e4m3_params = {
+    FLOAT_PARAMS(4, 3),
+    .no_infinity = true,
+    .limited_nan = true,
+    .normal_frac_max = 0xE000000000000000ULL,
+};
+
+static const FloatFmt float8_e5m2_params = {
+    FLOAT_PARAMS(5, 2),
+};
+
 static const FloatFmt float16_params = {
     FLOAT_PARAMS(5, 10)
 };
 
 static const FloatFmt float16_params_ahp = {
     FLOAT_PARAMS(5, 10),
-    .arm_althp = true
+    .no_infinity = true,
+    .limited_nan = true,
+    .overflow_raises_invalid = true,
+    .normal_frac_max = NORMAL_FRAC_MAX_ALL,
 };
 
 static const FloatFmt bfloat16_params = {
@@ -614,6 +663,16 @@ static void unpack_raw64(FloatParts64 *r, const FloatFmt *fmt, uint64_t raw)
     };
 }
 
+static void QEMU_FLATTEN float8_e4m3_unpack_raw(FloatParts64 *p, float8_e4m3 f)
+{
+    unpack_raw64(p, &float8_e4m3_params, f);
+}
+
+static void QEMU_FLATTEN float8_e5m2_unpack_raw(FloatParts64 *p, float8_e5m2 f)
+{
+    unpack_raw64(p, &float8_e5m2_params, f);
+}
+
 static void QEMU_FLATTEN float16_unpack_raw(FloatParts64 *p, float16 f)
 {
     unpack_raw64(p, &float16_params, f);
@@ -671,6 +730,16 @@ static uint64_t pack_raw64(const FloatParts64 *p, const FloatFmt *fmt)
     return ret;
 }
 
+static float8_e4m3 QEMU_FLATTEN float8_e4m3_pack_raw(const FloatParts64 *p)
+{
+    return make_float8_e4m3(pack_raw64(p, &float8_e4m3_params));
+}
+
+static float8_e5m2 QEMU_FLATTEN float8_e5m2_pack_raw(const FloatParts64 *p)
+{
+    return make_float8_e5m2(pack_raw64(p, &float8_e5m2_params));
+}
+
 static float16 QEMU_FLATTEN float16_pack_raw(const FloatParts64 *p)
 {
     return make_float16(pack_raw64(p, &float16_params));
@@ -758,12 +827,26 @@ static void parts128_canonicalize(FloatParts128 *p, float_status *status,
     PARTS_GENERIC_64_128(canonicalize, A)(A, S, F)
 
 static void parts64_uncanon_normal(FloatParts64 *p, float_status *status,
-                                   const FloatFmt *fmt);
+                                   const FloatFmt *fmt, bool saturate);
 static void parts128_uncanon_normal(FloatParts128 *p, float_status *status,
-                                    const FloatFmt *fmt);
+                                    const FloatFmt *fmt, bool saturate);
+
+#define parts_uncanon_normal(A, S, F, SAT) \
+    PARTS_GENERIC_64_128(uncanon_normal, A)(A, S, F, SAT)
 
-#define parts_uncanon_normal(A, S, F) \
-    PARTS_GENERIC_64_128(uncanon_normal, A)(A, S, F)
+static void parts64_uncanon_sat(FloatParts64 *p, float_status *status,
+                                const FloatFmt *fmt, bool saturate);
+static void parts128_uncanon_sat(FloatParts128 *p, float_status *status,
+                                 const FloatFmt *fmt, bool saturate);
+
+#define parts_uncanon_sat(A, S, F, SAT) \
+    PARTS_GENERIC_64_128(uncanon_sat, A)(A, S, F, SAT)
+
+static void parts64_set_max_normal(FloatParts64 *p, const FloatFmt *fmt);
+static void parts128_set_max_normal(FloatParts128 *p, const FloatFmt *fmt);
+
+#define parts_set_max_normal(P, F) \
+    PARTS_GENERIC_64_128(set_max_normal, P)(P, F)
 
 static void parts64_uncanon(FloatParts64 *p, float_status *status,
                             const FloatFmt *fmt);
@@ -1662,6 +1745,20 @@ static const uint16_t rsqrt_tab[128] = {
  * Pack/unpack routines with a specific FloatFmt.
  */
 
+static void float8_e4m3_unpack_canonical(FloatParts64 *p, float8_e4m3 f,
+                                         float_status *s)
+{
+    float8_e4m3_unpack_raw(p, f);
+    parts_canonicalize(p, s, &float8_e4m3_params);
+}
+
+static void float8_e5m2_unpack_canonical(FloatParts64 *p, float8_e5m2 f,
+                                         float_status *s)
+{
+    float8_e5m2_unpack_raw(p, f);
+    parts_canonicalize(p, s, &float8_e5m2_params);
+}
+
 static void float16a_unpack_canonical(FloatParts64 *p, float16 f,
                                       float_status *s, const FloatFmt *params)
 {
@@ -1682,6 +1779,24 @@ static void bfloat16_unpack_canonical(FloatParts64 *p, bfloat16 f,
     parts_canonicalize(p, s, &bfloat16_params);
 }
 
+static float8_e4m3 float8_e4m3_round_pack_canonical(FloatParts64 *p,
+                                                    float_status *status,
+                                                    const FloatFmt *params,
+                                                    const bool saturate)
+{
+    parts_uncanon_sat(p, status, params, saturate);
+    return float8_e4m3_pack_raw(p);
+}
+
+static float8_e5m2 float8_e5m2_round_pack_canonical(FloatParts64 *p,
+                                                    float_status *status,
+                                                    const FloatFmt *params,
+                                                    const bool saturate)
+{
+    parts_uncanon_sat(p, status, params, saturate);
+    return float8_e5m2_pack_raw(p);
+}
+
 static float16 float16a_round_pack_canonical(FloatParts64 *p,
                                              float_status *s,
                                              const FloatFmt *params)
@@ -1838,7 +1953,7 @@ static floatx80 floatx80_round_pack_canonical(FloatParts128 *p,
     case float_class_normal:
     case float_class_denormal:
         if (s->floatx80_rounding_precision == floatx80_precision_x) {
-            parts_uncanon_normal(p, s, fmt);
+            parts_uncanon_normal(p, s, fmt, false);
             frac = p->frac_hi;
             exp = p->exp;
         } else {
@@ -1847,7 +1962,7 @@ static floatx80 floatx80_round_pack_canonical(FloatParts128 *p,
             p64.sign = p->sign;
             p64.exp = p->exp;
             frac_truncjam(&p64, p);
-            parts_uncanon_normal(&p64, s, fmt);
+            parts_uncanon_normal(&p64, s, fmt, false);
             frac = p64.frac;
             exp = p64.exp;
         }
@@ -2823,6 +2938,66 @@ static void parts_float_to_float_widen(FloatParts128 *a, FloatParts64 *b,
     }
 }
 
+bfloat16 float8_e4m3_to_bfloat16(float8_e4m3 a, float_status *s)
+{
+    FloatParts64 p;
+
+    float8_e4m3_unpack_canonical(&p, a, s);
+    parts_float_to_float(&p, s);
+
+    return bfloat16_round_pack_canonical(&p, s);
+}
+
+bfloat16 float8_e5m2_to_bfloat16(float8_e5m2 a, float_status *s)
+{
+    FloatParts64 p;
+
+    float8_e5m2_unpack_canonical(&p, a, s);
+    parts_float_to_float(&p, s);
+
+    return bfloat16_round_pack_canonical(&p, s);
+}
+
+float8_e4m3 bfloat16_to_float8_e4m3(bfloat16 a, bool saturate, float_status *s)
+{
+    FloatParts64 p;
+
+    bfloat16_unpack_canonical(&p, a, s);
+    parts_float_to_float(&p, s);
+    return float8_e4m3_round_pack_canonical(&p, s, &float8_e4m3_params,
+                                            saturate);
+}
+
+float8_e5m2 bfloat16_to_float8_e5m2(bfloat16 a, bool saturate, float_status *s)
+{
+    FloatParts64 p;
+
+    bfloat16_unpack_canonical(&p, a, s);
+    parts_float_to_float(&p, s);
+    return float8_e5m2_round_pack_canonical(&p, s, &float8_e5m2_params,
+                                            saturate);
+}
+
+float8_e4m3 float32_to_float8_e4m3(float32 a, bool saturate, float_status *s)
+{
+    FloatParts64 p;
+
+    float32_unpack_canonical(&p, a, s);
+    parts_float_to_float(&p, s);
+    return float8_e4m3_round_pack_canonical(&p, s, &float8_e4m3_params,
+                                            saturate);
+}
+
+float8_e5m2 float32_to_float8_e5m2(float32 a, bool saturate, float_status *s)
+{
+    FloatParts64 p;
+
+    float32_unpack_canonical(&p, a, s);
+    parts_float_to_float(&p, s);
+    return float8_e5m2_round_pack_canonical(&p, s, &float8_e5m2_params,
+                                            saturate);
+}
+
 float32 float16_to_float32(float16 a, bool ieee, float_status *s)
 {
     const FloatFmt *fmt16 = ieee ? &float16_params : &float16_params_ahp;
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
index 8f82fdfc97..b781bf10b7 100644
--- a/include/fpu/softfloat-types.h
+++ b/include/fpu/softfloat-types.h
@@ -119,6 +119,18 @@ typedef struct {
  */
 typedef uint16_t bfloat16;
 
+/*
+ * Software OCP(Open Compute Project) floating point types
+ */
+typedef uint8_t float8_e4m3;
+typedef uint8_t float8_e5m2;
+#define float8_e4m3_val(x) (x)
+#define float8_e5m2_val(x) (x)
+#define make_float8_e4m3(x) (x)
+#define make_float8_e5m2(x) (x)
+#define const_float8_e4m3(x) (x)
+#define const_float8_e5m2(x) (x)
+
 /*
  * Software IEC/IEEE floating-point underflow tininess-detection mode.
  */
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index ac6a392375..7abbf92b7e 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -189,6 +189,87 @@ float128 int128_to_float128(Int128, float_status *status);
 float128 uint64_to_float128(uint64_t, float_status *status);
 float128 uint128_to_float128(Int128, float_status *status);
 
+/*----------------------------------------------------------------------------
+| Software OCP conversion routines.
+*----------------------------------------------------------------------------*/
+
+bfloat16 float8_e4m3_to_bfloat16(float8_e4m3, float_status *status);
+bfloat16 float8_e5m2_to_bfloat16(float8_e5m2, float_status *status);
+float8_e4m3 bfloat16_to_float8_e4m3(bfloat16, bool saturate, float_status *status);
+float8_e5m2 bfloat16_to_float8_e5m2(bfloat16, bool saturate, float_status *status);
+float8_e4m3 float32_to_float8_e4m3(float32, bool saturate, float_status *status);
+float8_e5m2 float32_to_float8_e5m2(float32, bool saturate, float_status *status);
+
+/*----------------------------------------------------------------------------
+| Software OCP operations.
+*----------------------------------------------------------------------------*/
+
+bool float8_e4m3_is_quiet_nan(float8_e4m3, float_status *status);
+bool float8_e4m3_is_signaling_nan(float8_e4m3, float_status *status);
+bool float8_e5m2_is_quiet_nan(float8_e5m2, float_status *status);
+bool float8_e5m2_is_signaling_nan(float8_e5m2, float_status *status);
+
+static inline bool float8_e4m3_is_any_nan(float8_e4m3 a)
+{
+    return ((float8_e4m3_val(a) & ~0x80) == 0x7f);
+}
+
+static inline bool float8_e5m2_is_any_nan(float8_e5m2 a)
+{
+    return ((float8_e5m2_val(a) & ~0x80) > 0x7c);
+}
+
+static inline bool float8_e4m3_is_neg(float8_e4m3 a)
+{
+    return float8_e4m3_val(a) >> 7;
+}
+
+static inline bool float8_e5m2_is_neg(float8_e5m2 a)
+{
+    return float8_e5m2_val(a) >> 7;
+}
+
+static inline bool float8_e4m3_is_infinity(float8_e4m3 a)
+{
+    return false;
+}
+
+static inline bool float8_e5m2_is_infinity(float8_e5m2 a)
+{
+    return (float8_e5m2_val(a) & 0x7f) == 0x7c;
+}
+
+static inline bool float8_e4m3_is_zero(float8_e4m3 a)
+{
+    return (float8_e4m3_val(a) & 0x7f) == 0;
+}
+
+static inline bool float8_e5m2_is_zero(float8_e5m2 a)
+{
+    return (float8_e5m2_val(a) & 0x7f) == 0;
+}
+
+static inline bool float8_e4m3_is_zero_or_denormal(float8_e4m3 a)
+{
+    return (float8_e4m3_val(a) & 0x78) == 0;
+}
+
+static inline bool float8_e5m2_is_zero_or_denormal(float8_e5m2 a)
+{
+    return (float8_e5m2_val(a) & 0x7c) == 0;
+}
+
+static inline bool float8_e4m3_is_normal(float8_e4m3 a)
+{
+    uint8_t em = float8_e4m3_val(a) & 0x7f;
+    return em >= 0x8 && em <= 0x7e;
+}
+
+static inline bool float8_e5m2_is_normal(float8_e5m2 a)
+{
+    return (((float8_e5m2_val(a) >> 2) + 1) & 0x1f) >= 2;
+}
+
 /*----------------------------------------------------------------------------
 | Software half-precision conversion routines.
 *----------------------------------------------------------------------------*/
-- 
2.52.0




* Re: [PATCH v3 05/19] fpu/softfloat: Support OCP(Open Compute Project) OFP8 data type
  2026-02-04  5:17 ` [PATCH v3 05/19] fpu/softfloat: Support OCP(Open Compute Project) OFP8 data type Max Chou
@ 2026-02-05  4:36   ` Richard Henderson
  2026-02-05 16:37     ` Max Chou
  2026-02-05 13:21   ` Chao Liu
  1 sibling, 1 reply; 36+ messages in thread
From: Richard Henderson @ 2026-02-05  4:36 UTC (permalink / raw)
  To: Max Chou, qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On 2/4/26 15:17, Max Chou wrote:
> This commit provides the implementation defined behavior flags and the basic
> operation support for the OCP float8 data types(E4M3 & E5M2).

I'd really like to see this split into parts.  Beginning with

> @@ -542,11 +549,39 @@ typedef struct {
>       int exp_max;
>       int frac_size;
>       int frac_shift;
> -    bool arm_althp;
>       bool has_explicit_bit;
>       uint64_t round_mask;
> +    /*
> +     * Format capability flags:
> +     * no_infinity: Format has no infinity encoding. When true, exp=exp_max
> +     *   with frac=0 is NOT infinity - it's either NaN or max normal.
> +     *
> +     * limited_nan: Format has limited or no NaN patterns. When combined
> +     *   with normal_frac_max, determines NaN encoding capability:
> +     *   - limited_nan=false: Standard IEEE NaN (exp=exp_max, frac!=0)
> +     *   - limited_nan=true && normal_frac_max!=0: Limited NaN (E4M3)
> +     *   - limited_nan=true && normal_frac_max==0: No NaN encoding (AHP, E2M1)
> +     *
> +     * overflow_raises_invalid: Raise Invalid (not Overflow) exception.
> +     *   ARM Alt HP uses this to signal overflow as an invalid operation.
> +     *
> +     * normal_frac_max: For formats with limited_nan, the maximum fraction
> +     *   value (after normalization shift, including implicit bit) that is
> +     *   still considered normal at exp=exp_max.
> +     *   Use NORMAL_FRAC_MAX_ALL (0) to indicate all frac values at exp_max
> +     *   are normal (E2M1, ARM Alt HP), which also implies no NaN encoding.
> +     */
> +    bool no_infinity;
> +    bool limited_nan;
> +    bool overflow_raises_invalid;
> +    uint64_t normal_frac_max;
>   } FloatFmt;

... this.  I wanted to say something about this vs previous revisions, but I hadn't had 
anything coherent to say besides "meh".

In particular, I think separating "no_infinity" and "limited_nan" leads to confusing 
checks, such as the one in parts_canonicalize where you test "limited_nan" in a context 
that is really testing for overflow to infinity.

Further, normal_frac_max is defined oddly, such that you have to test it twice, once vs 
frac_hi and once vs NORMAL_FRAC_MAX_ALL.  Since this is used for exactly one format, this 
is perhaps trying to be overly general.

I think better might be:

     typedef enum {
         /* exp==max, frac==0 ? infinity : nan; this is ieee standard. */
         float_maxexp_ieee,
         /* exp==max is a normal number; no infinity or nan representation. */
         float_maxexp_normal,
         /* exp==max, frac==max ? nan : normal; no infinity. */
         float_maxexp_e4m3,
     } FloatFmtMaxExp;

We can stage in this behaviour without also including either FP8 format.
Just changing Arm althp in a separate patch is large enough.
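
To make the three cases concrete, here is a minimal standalone sketch of how
an encoding whose exponent field is at its maximum might be classified under
such an enum. Only FloatFmtMaxExp and its three values come from the proposal
above; MaxExpClass, classify_max_exp() and the frac_max parameter are
hypothetical names used for illustration and are not QEMU code.

```c
#include <stdint.h>

typedef enum {
    float_maxexp_ieee,    /* exp==max, frac==0 ? infinity : nan */
    float_maxexp_normal,  /* exp==max is always a normal number */
    float_maxexp_e4m3,    /* exp==max, frac==max ? nan : normal */
} FloatFmtMaxExp;

/* Hypothetical result type for this sketch. */
typedef enum {
    MAXEXP_CLASS_NORMAL,
    MAXEXP_CLASS_INF,
    MAXEXP_CLASS_NAN,
} MaxExpClass;

/*
 * Classify a value whose exponent field equals exp_max.
 * frac is the raw fraction field; frac_max is the all-ones fraction
 * for the format (for example 7 for a 3-bit fraction).
 */
static MaxExpClass classify_max_exp(FloatFmtMaxExp mode,
                                    uint64_t frac, uint64_t frac_max)
{
    switch (mode) {
    case float_maxexp_ieee:
        return frac == 0 ? MAXEXP_CLASS_INF : MAXEXP_CLASS_NAN;
    case float_maxexp_normal:
        return MAXEXP_CLASS_NORMAL;
    case float_maxexp_e4m3:
        return frac == frac_max ? MAXEXP_CLASS_NAN : MAXEXP_CLASS_NORMAL;
    }
    return MAXEXP_CLASS_NORMAL;
}
```

With a single switch like this, each format answers the "what lives at
exp==max" question in one place instead of combining two booleans and a
fraction limit.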


r~



* Re: [PATCH v3 05/19] fpu/softfloat: Support OCP(Open Compute Project) OFP8 data type
  2026-02-05  4:36   ` Richard Henderson
@ 2026-02-05 16:37     ` Max Chou
  0 siblings, 0 replies; 36+ messages in thread
From: Max Chou @ 2026-02-05 16:37 UTC (permalink / raw)
  To: Richard Henderson
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Aurelien Jarno, Peter Maydell, Alex Bennée, Paolo Bonzini,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei

On 2026-02-05 14:36, Richard Henderson wrote:
> In particular, I think separating "no_infinity" and "limited_nan" leads to
> confusing checks, such as the one in parts_canonicalize where you test
> "limited_nan" in a context that is really testing for overflow to infinity.
> 
> Further, normal_frac_max is defined oddly, such that you have to test it
> twice, once vs frac_hi and once vs NORMAL_FRAC_MAX_ALL.  Since this is used
> for exactly one format, this is perhaps trying to be overly general.
> 
> I think better might be:
> 
>     typedef enum {
>         /* exp==max, frac==0 ? infinity : nan; this is ieee standard. */
>         float_maxexp_ieee,
>         /* exp==max is a normal number; no infinity or nan representation. */
>         float_maxexp_normal,
>         /* exp==max, frac==max ? nan : normal; no infinity. */
>         float_maxexp_e4m3,
>     } FloatFmtMaxExp;
> 
> We can stage in this behaviour without also including either FP8 format.
> Just changing Arm althp in a separate patch is large enough.
> 
> 
> r~

Hi Richard,

Thank you for the suggestions and for the v4 of the softfloat part.
I agree that the original patch should be split up and that the solution you
suggested is better.
I'll separate the RISC-V ISA part into another v4 series based on the
softfloat v4.
I'll also test the softfloat v4 you provided and fix some
saturation issues on top of it.

Thanks a lot,
rnax



* Re: [PATCH v3 05/19] fpu/softfloat: Support OCP(Open Compute Project) OFP8 data type
  2026-02-04  5:17 ` [PATCH v3 05/19] fpu/softfloat: Support OCP(Open Compute Project) OFP8 data type Max Chou
  2026-02-05  4:36   ` Richard Henderson
@ 2026-02-05 13:21   ` Chao Liu
  2026-02-05 16:48     ` Max Chou
  1 sibling, 1 reply; 36+ messages in thread
From: Chao Liu @ 2026-02-05 13:21 UTC (permalink / raw)
  To: Max Chou, Richard Henderson
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Aurelien Jarno, Peter Maydell, Alex Bennée, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

Hi Max,

I've been testing the OCP FP8 implementation by writing
a simple test suite in tests/fp/ that covers various boundary cases for E4M3,
E5M2, E2M1, and BFloat16 formats. During testing, I found some issues in the
float_class_inf handling in partsN(uncanon_sat).

On Wed, Feb 04, 2026 at 01:17:41PM +0800, Max Chou wrote:
> This commit provides the implementation defined behavior flags and the basic
> operation support for the OCP float8 data types(E4M3 & E5M2).
> 
> According to the definition in OFP8 spec, the conversion from a wider
> format infinity depends on the saturation mode defined in the spec.
> 
> Signed-off-by: Max Chou <max.chou@sifive.com>
> ---
>  fpu/softfloat-parts.c.inc      | 159 +++++++++++++++++++++------
>  fpu/softfloat-specialize.c.inc |  62 +++++++++++
>  fpu/softfloat.c                | 191 +++++++++++++++++++++++++++++++--
>  include/fpu/softfloat-types.h  |  12 +++
>  include/fpu/softfloat.h        |  81 ++++++++++++++
>  5 files changed, 467 insertions(+), 38 deletions(-)
> 
> diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
> index 5e0438fc0b..eee7daae4d 100644
> --- a/fpu/softfloat-parts.c.inc
> +++ b/fpu/softfloat-parts.c.inc

[...]

> -static void partsN(uncanon)(FloatPartsN *p, float_status *s,
> -                            const FloatFmt *fmt)
> +static void partsN(uncanon_sat)(FloatPartsN *p, float_status *s,
> +                                const FloatFmt *fmt, bool saturate)
>  {
>      if (likely(is_anynorm(p->cls))) {
> -        parts_uncanon_normal(p, s, fmt);
> +        parts_uncanon_normal(p, s, fmt, saturate);
>      } else {
>          switch (p->cls) {
>          case float_class_zero:
> @@ -434,13 +510,30 @@ static void partsN(uncanon)(FloatPartsN *p, float_status *s,
>              frac_clear(p);
>              return;
>          case float_class_inf:
> -            g_assert(!fmt->arm_althp);
> -            p->exp = fmt->exp_max;
> -            frac_clear(p);
> +            /*
> +             * Unified infinity handling using format capabilities.
> +             * Formats with no_infinity must convert infinity to something else
> +             */
> +            if (!fmt->no_infinity) {
> +                p->exp = fmt->exp_max;
> +                frac_clear(p);
The saturate flag is not checked here. For IEEE-like formats such as
E5M2 that have infinity encoding, when saturate=true, the result should be
the maximum normal value, not infinity.

Per OCP FP8 specification Section 4.2 "Saturation", when saturation mode is
enabled, infinity should be converted to the maximum finite value even for
formats that support infinity representation.

My case:
  bfloat16_to_float8_e5m2(BF16_INF_POS, true, &status)
  Expected: 0x7b (max normal)
  Actual:   0x7c (infinity)

Suggested fix:
            if (!fmt->no_infinity && !saturate) {
                p->exp = fmt->exp_max;
                frac_clear(p);
            } else if (!fmt->no_infinity && saturate) {
                /* Saturate infinity to max normal for IEEE-like formats */
                p->exp = fmt->exp_max - 1;
                frac_allones(p);
                frac_shr(p, fmt->frac_shift);
            } else if ...
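
For reference, the expected encodings can be derived directly from the bit
layout. The following is a minimal standalone sketch, assuming an 8-bit
IEEE-like format (sign + exp_bits + man_bits == 8) in which the all-ones
exponent with a zero fraction is infinity, as in E5M2. The function name
ieee_like_fp8_pack_inf() is hypothetical and is not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return the encoding produced for an infinite input.
 * Without saturation this is the infinity encoding itself; with
 * saturation it is the largest finite value, which for an IEEE-like
 * layout is the encoding immediately below infinity.
 */
static uint8_t ieee_like_fp8_pack_inf(int exp_bits, int man_bits,
                                      bool sign, bool saturate)
{
    uint8_t inf = (uint8_t)(((1u << exp_bits) - 1u) << man_bits);
    uint8_t mag = saturate ? (uint8_t)(inf - 1u) : inf;

    return (uint8_t)((sign ? 0x80u : 0u) | mag);
}
```

For E5M2 (exp_bits=5, man_bits=2) this gives 0x7c without saturation and
0x7b with saturation, matching the values in the test case above. It does
not apply to E4M3, where exp_max with frac=0 is a normal number.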

> +            } else if (fmt_has_nan_encoding(fmt)) {
> +                if (saturate) {
> +                    parts_set_max_normal(p, fmt);
Missing frac_shr() call after parts_set_max_normal().

The parts_set_max_normal() function sets frac_hi to the normalized
fraction value (with implicit bit at position 63). Before packing into
the final format, the fraction must be shifted right by frac_shift to
position it correctly.

Compare with the float_class_qnan/snan case below which correctly calls
frac_shr(p, fmt->frac_shift) before returning.

My case:
  bfloat16_to_float8_e4m3(BF16_INF_POS, true, &status)
  Expected: 0x7e (max normal, exp=15, frac=6)
  Actual:   0x78 (exp=15, frac=0 - incorrect due to missing shift)

Suggested fix:
                if (saturate) {
                    parts_set_max_normal(p, fmt);
                    frac_shr(p, fmt->frac_shift);
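
The effect of the missing shift can be reproduced with a simplified model.
This is only a sketch under stated assumptions: the fraction is held in one
64-bit word with the implicit bit at position 63, frac_shift for E4M3 is 60
(from (-F - 1) & 63 with F = 3), and packing keeps just the low three
fraction bits. The names e4m3_pack_raw() and e4m3_max_normal() are
hypothetical and do not exist in QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    E4M3_FRAC_SIZE  = 3,
    E4M3_FRAC_SHIFT = 60,   /* (-3 - 1) & 63 */
};

/* Pack sign, biased exponent and the low 3 bits of frac into one byte. */
static uint8_t e4m3_pack_raw(bool sign, unsigned exp, uint64_t frac)
{
    return (uint8_t)((sign ? 0x80u : 0u) |
                     ((exp & 0xfu) << E4M3_FRAC_SIZE) |
                     (unsigned)(frac & 0x7u));
}

/*
 * Build the E4M3 max normal (exp=15, frac=0b110) from a normalized
 * fraction. apply_shift selects whether the fraction is moved down
 * into packing position before it is packed.
 */
static uint8_t e4m3_max_normal(bool sign, bool apply_shift)
{
    /* Implicit bit at 63, fraction bits 0b110 at positions 62..60. */
    uint64_t frac = (1ull << 63) | (6ull << E4M3_FRAC_SHIFT);

    if (apply_shift) {
        frac >>= E4M3_FRAC_SHIFT;
    }
    return e4m3_pack_raw(sign, 15, frac);
}
```

In this model e4m3_max_normal(false, true) yields 0x7e, while
e4m3_max_normal(false, false) yields 0x78 because the low three bits of the
unshifted fraction are all zero.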

> +                } else {
> +                    uint8_t dnan = s->default_nan_pattern;
> +                    p->cls = float_class_qnan;
> +                    p->sign = dnan >> 7;
> +                    p->exp = fmt->exp_max;
> +                    frac_allones(p);
Same issue - missing frac_shr() call after frac_allones().

> +                }
> +            } else {
> +                parts_set_max_normal(p, fmt);
Same issue - missing frac_shr() call after parts_set_max_normal().

PS: This path is taken for formats without NaN encoding (like E2M1).

> +            }
>              return;
>          case float_class_qnan:
>          case float_class_snan:
> -            g_assert(!fmt->arm_althp);
> +            g_assert(fmt_has_nan_encoding(fmt));
>              p->exp = fmt->exp_max;
>              frac_shr(p, fmt->frac_shift); /* <-- This is correct */
>              return;
The qnan/snan case correctly calls frac_shr(), which is good, but the inf case above does not.

---

I've prepared a fix patch and a test suite (tests/fp/fp-test-ocp.c) with 97
test cases covering:

- Classification functions for E4M3, E5M2, E2M1, BFloat16
- Format conversions with and without saturation
- Rounding mode handling
- Canonical NaN generation per Zvfofp8min specification


git repo:
https://github.com/zevorn/qemu/tree/riscv-zvfofp8min-zvfofp4min-v3

command:
cd $QEMU_SRC_PATH/build && ninja tests/fp/fp-test-ocp
./pyvenv/bin/meson test --suite softfloat-ocp -v


With the fix applied, all saturation tests pass:
  PASS: BF16 +inf -> E4M3 max normal (with saturation), got 0x7e
  PASS: BF16 +inf -> E5M2 max normal (with saturation), got 0x7b
  PASS: F32 +inf -> E4M3 max normal (with saturation), got 0x7e
  PASS: F32 +inf -> E5M2 max normal (with saturation), got 0x7b

---

@Richard: I noticed that the current tests/fp/ directory doesn't have test
coverage for BFloat16, OCP FP8 (E4M3/E5M2), or FP4 (E2M1) formats.

The existing fp-test relies on Berkeley TestFloat which doesn't support
these newer formats. Would it be useful if I clean up and submit my test
suite (fp-test-ocp.c) as a separate patch to provide basic test coverage
for these OCP floating-point formats? It could help catch similar issues
in future softfloat changes.
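
As an illustration of the kind of coverage that is cheap for 8-bit formats,
the sketch below checks the two is_normal helpers from this patch against a
field-by-field reference for all 256 encodings. It is a standalone
approximation, not an excerpt from fp-test-ocp.c: the helpers are copied here
to take a raw byte, and the *_ref functions and
count_is_normal_mismatches() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Raw-byte copies of the helpers under test. */
static bool e4m3_is_normal(uint8_t v)
{
    uint8_t em = v & 0x7f;
    return em >= 0x8 && em <= 0x7e;
}

static bool e5m2_is_normal(uint8_t v)
{
    return (((v >> 2) + 1) & 0x1f) >= 2;
}

/* Reference: E4M3 is normal unless exp==0 or the value is NaN (S.1111.111). */
static bool e4m3_is_normal_ref(uint8_t v)
{
    unsigned exp = (v >> 3) & 0xfu;
    unsigned frac = v & 0x7u;

    return exp != 0 && !(exp == 15 && frac == 7);
}

/* Reference: E5M2 is normal unless exp==0 or exp==31 (infinity or NaN). */
static bool e5m2_is_normal_ref(uint8_t v)
{
    unsigned exp = (v >> 2) & 0x1fu;

    return exp != 0 && exp != 31;
}

/* Count encodings where a helper disagrees with its reference. */
static int count_is_normal_mismatches(void)
{
    int mismatches = 0;

    for (unsigned v = 0; v < 256; v++) {
        mismatches += e4m3_is_normal((uint8_t)v) != e4m3_is_normal_ref((uint8_t)v);
        mismatches += e5m2_is_normal((uint8_t)v) != e5m2_is_normal_ref((uint8_t)v);
    }
    return mismatches;
}
```

A return value of zero means the bit tricks agree with the straightforward
definitions for every possible input.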

Thanks,
Chao



* Re: [PATCH v3 05/19] fpu/softfloat: Support OCP(Open Compute Project) OFP8 data type
  2026-02-05 13:21   ` Chao Liu
@ 2026-02-05 16:48     ` Max Chou
  0 siblings, 0 replies; 36+ messages in thread
From: Max Chou @ 2026-02-05 16:48 UTC (permalink / raw)
  To: Chao Liu
  Cc: Richard Henderson, qemu-devel, qemu-riscv, Palmer Dabbelt,
	Alistair Francis, Aurelien Jarno, Peter Maydell, Alex Bennée,
	Paolo Bonzini, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On 2026-02-05 21:21, Chao Liu wrote:
> Hi Max,
> 
> I've been testing the OCP FP8 implementation by writing
> a simple test suite in tests/fp/ that covers various boundary cases for E4M3,
> E5M2, E2M1, and BFloat16 formats. During testing, I found some issues in the
> float_class_inf handling in partsN(uncanon_sat).
>

Hi Liu,

Oops, it looks like my random test cases have been missing Inf inputs
since v2.
Thanks for pointing out this issue.
Richard has provided a v4 for the softfloat part, which is better and
clearer than v3. I intend to address this issue on top of that one.
I'll also separate the RISC-V ISA part into another v4 based on the
softfloat v4.

Thanks,
rnax



* [PATCH v3 06/19] fpu/softfloat: Support OCP(Open Compute Project) OFP4 data type
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (4 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 05/19] fpu/softfloat: Support OCP(Open Compute Project) OFP8 data type Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04  5:17 ` [PATCH v3 07/19] target/riscv: Add cfg properity for Zvfofp8min extension Max Chou
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou

This commit provides basic operation support for the OCP float4 data
type (E2M1).
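
Every E2M1 value is exactly representable in E4M3, so the widening
conversion needs no rounding. The following standalone sketch shows the
mapping at the bit level, assuming the usual biases (1 for E2M1, 7 for E4M3)
and a sign bit at bit 3 of the nibble. It is an illustration only; the
function name e2m1_to_e4m3() is hypothetical, and the patch itself goes
through the generic softfloat unpack and round-pack path.

```c
#include <stdint.h>

/* Convert a 4-bit E2M1 encoding (low nibble of a) to an E4M3 byte. */
static uint8_t e2m1_to_e4m3(uint8_t a)
{
    uint8_t sign = (uint8_t)((a & 0x8u) << 4);  /* bit 3 -> bit 7 */
    unsigned exp = (a >> 1) & 0x3u;
    unsigned man = a & 0x1u;

    if (exp == 0) {
        /* Zero, or the single subnormal 0.5 = 2^-1 (E4M3 exponent 6). */
        return man ? (uint8_t)(sign | (6u << 3)) : sign;
    }
    /* Rebias the exponent by 7 - 1 and left-align the 1-bit fraction. */
    return (uint8_t)(sign | ((exp + 6u) << 3) | (man << 2));
}
```

For example, the largest E2M1 value 6.0 (0x7) maps to 0x4c, and 1.0 (0x2)
maps to 0x38.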

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 fpu/softfloat.c               | 29 +++++++++++++++++++++++
 include/fpu/softfloat-types.h |  5 ++++
 include/fpu/softfloat.h       | 43 +++++++++++++++++++++++++++++++++++
 3 files changed, 77 insertions(+)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 533f96dcda..96845e86df 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -595,6 +595,13 @@ static inline bool fmt_has_nan_encoding(const FloatFmt *fmt)
     .frac_shift     = (-F - 1) & 63,                    \
     .round_mask     = (1ull << ((-F - 1) & 63)) - 1
 
+static const FloatFmt float4_e2m1_params = {
+    FLOAT_PARAMS(2, 1),
+    .no_infinity = true,
+    .limited_nan = true,
+    .normal_frac_max = NORMAL_FRAC_MAX_ALL,
+};
+
 static const FloatFmt float8_e4m3_params = {
     FLOAT_PARAMS(4, 3),
     .no_infinity = true,
@@ -663,6 +670,11 @@ static void unpack_raw64(FloatParts64 *r, const FloatFmt *fmt, uint64_t raw)
     };
 }
 
+static void QEMU_FLATTEN float4_e2m1_unpack_raw(FloatParts64 *p, float4_e2m1 f)
+{
+    unpack_raw64(p, &float4_e2m1_params, f);
+}
+
 static void QEMU_FLATTEN float8_e4m3_unpack_raw(FloatParts64 *p, float8_e4m3 f)
 {
     unpack_raw64(p, &float8_e4m3_params, f);
@@ -1745,6 +1757,13 @@ static const uint16_t rsqrt_tab[128] = {
  * Pack/unpack routines with a specific FloatFmt.
  */
 
+static void float4_e2m1_unpack_canonical(FloatParts64 *p, float4_e2m1 f,
+                                       float_status *s)
+{
+    float4_e2m1_unpack_raw(p, f);
+    parts_canonicalize(p, s, &float4_e2m1_params);
+}
+
 static void float8_e4m3_unpack_canonical(FloatParts64 *p, float8_e4m3 f,
                                          float_status *s)
 {
@@ -2938,6 +2957,16 @@ static void parts_float_to_float_widen(FloatParts128 *a, FloatParts64 *b,
     }
 }
 
+float8_e4m3 float4_e2m1_to_float8_e4m3(float4_e2m1 a, float_status *s)
+{
+    FloatParts64 p;
+
+    float4_e2m1_unpack_canonical(&p, a, s);
+    parts_float_to_float(&p, s);
+
+    return float8_e4m3_round_pack_canonical(&p, s, &float8_e4m3_params, false);
+}
+
 bfloat16 float8_e4m3_to_bfloat16(float8_e4m3 a, float_status *s)
 {
     FloatParts64 p;
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
index b781bf10b7..f8cadffff4 100644
--- a/include/fpu/softfloat-types.h
+++ b/include/fpu/softfloat-types.h
@@ -131,6 +131,11 @@ typedef uint8_t float8_e5m2;
 #define const_float8_e4m3(x) (x)
 #define const_float8_e5m2(x) (x)
 
+typedef uint8_t float4_e2m1;
+#define float4_e2m1_val(x) ((x) & 0xf)
+#define make_float4_e2m1(x) ((x) & 0xf)
+#define const_float4_e2m1(x) ((x) & 0xf)
+
 /*
  * Software IEC/IEEE floating-point underflow tininess-detection mode.
  */
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 7abbf92b7e..888efed288 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -200,6 +200,8 @@ float8_e5m2 bfloat16_to_float8_e5m2(bfloat16, bool saturate, float_status *statu
 float8_e4m3 float32_to_float8_e4m3(float32, bool saturate, float_status *status);
 float8_e5m2 float32_to_float8_e5m2(float32, bool saturate, float_status *status);
 
+float8_e4m3 float4_e2m1_to_float8_e4m3(float4_e2m1, float_status *status);
+
 /*----------------------------------------------------------------------------
 | Software OCP operations.
 *----------------------------------------------------------------------------*/
@@ -270,6 +272,47 @@ static inline bool float8_e5m2_is_normal(float8_e5m2 a)
     return (((float8_e5m2_val(a) >> 2) + 1) & 0x1f) >= 2;
 }
 
+static inline bool float4_e2m1_is_quiet_nan(float4_e2m1 a, float_status *status)
+{
+    return false;
+}
+
+static inline bool float4_e2m1_is_signaling_nan(float4_e2m1 a, float_status *status)
+{
+    return false;
+}
+
+static inline bool float4_e2m1_is_any_nan(float4_e2m1 a)
+{
+    return false;
+}
+
+static inline bool float4_e2m1_is_neg(float4_e2m1 a)
+{
+    return float4_e2m1_val(a) >> 3;
+}
+
+static inline bool float4_e2m1_is_infinity(float4_e2m1 a)
+{
+    return false;
+}
+
+static inline bool float4_e2m1_is_zero(float4_e2m1 a)
+{
+    return (float4_e2m1_val(a) & 0x7) == 0;
+}
+
+static inline bool float4_e2m1_is_zero_or_denormal(float4_e2m1 a)
+{
+    return (float4_e2m1_val(a) & 0x6) == 0;
+}
+
+static inline bool float4_e2m1_is_normal(float4_e2m1 a)
+{
+    uint8_t em = float4_e2m1_val(a) & 0x7;
+    return em >= 0x2 && em <= 0x7;
+}
+
 /*----------------------------------------------------------------------------
 | Software half-precision conversion routines.
 *----------------------------------------------------------------------------*/
-- 
2.52.0




* [PATCH v3 07/19] target/riscv: Add cfg properity for Zvfofp8min extension
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (5 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 06/19] fpu/softfloat: Support OCP(Open Compute Project) OFP4 " Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04 16:29   ` Chao Liu
  2026-02-04  5:17 ` [PATCH v3 08/19] target/riscv: Add implied rules " Max Chou
                   ` (12 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou, Alistair Francis

According to the ISA spec of Zvfofp8min extension,

"The Zvfofp8min extension requires on the Zve32f extension."

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/cpu.c                | 1 +
 target/riscv/cpu_cfg_fields.h.inc | 1 +
 target/riscv/tcg/tcg-cpu.c        | 5 +++++
 target/riscv/vector_helper.c      | 3 ++-
 4 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index bd17f61d7b..d6ce51ef5e 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -194,6 +194,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
     ISA_EXT_DATA_ENTRY(zvfbfwma, PRIV_VERSION_1_12_0, ext_zvfbfwma),
     ISA_EXT_DATA_ENTRY(zvfh, PRIV_VERSION_1_12_0, ext_zvfh),
     ISA_EXT_DATA_ENTRY(zvfhmin, PRIV_VERSION_1_12_0, ext_zvfhmin),
+    ISA_EXT_DATA_ENTRY(zvfofp8min, PRIV_VERSION_1_12_0, ext_zvfofp8min),
     ISA_EXT_DATA_ENTRY(zvkb, PRIV_VERSION_1_12_0, ext_zvkb),
     ISA_EXT_DATA_ENTRY(zvkg, PRIV_VERSION_1_12_0, ext_zvkg),
     ISA_EXT_DATA_ENTRY(zvkn, PRIV_VERSION_1_12_0, ext_zvkn),
diff --git a/target/riscv/cpu_cfg_fields.h.inc b/target/riscv/cpu_cfg_fields.h.inc
index 3696f02ee0..59302894af 100644
--- a/target/riscv/cpu_cfg_fields.h.inc
+++ b/target/riscv/cpu_cfg_fields.h.inc
@@ -104,6 +104,7 @@ BOOL_FIELD(ext_zvfbfmin)
 BOOL_FIELD(ext_zvfbfwma)
 BOOL_FIELD(ext_zvfh)
 BOOL_FIELD(ext_zvfhmin)
+BOOL_FIELD(ext_zvfofp8min)
 BOOL_FIELD(ext_smaia)
 BOOL_FIELD(ext_ssaia)
 BOOL_FIELD(ext_smctr)
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 378b298886..ba89436f13 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -710,6 +710,11 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
         return;
     }
 
+    if (cpu->cfg.ext_zvfofp8min && !cpu->cfg.ext_zve32f) {
+        error_setg(errp, "Zvfofp8min extension depends on Zve32f extension");
+        return;
+    }
+
     if (cpu->cfg.ext_zvfh && !cpu->cfg.ext_zfhmin) {
         error_setg(errp, "Zvfh extensions requires Zfhmin extension");
         return;
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index ec0ea4c143..ee5a1e595b 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -38,7 +38,8 @@ static target_ulong vtype_reserved(CPURISCVState *env, target_ulong vtype)
     int xlen = riscv_cpu_xlen(env);
     target_ulong reserved = 0;
 
-    if (riscv_cpu_cfg(env)->ext_zvfbfa) {
+    if (riscv_cpu_cfg(env)->ext_zvfbfa ||
+        riscv_cpu_cfg(env)->ext_zvfofp8min) {
         reserved = vtype & MAKE_64BIT_MASK(R_VTYPE_RESERVED_SHIFT,
                                            xlen - 1 - R_VTYPE_RESERVED_SHIFT);
     } else {
-- 
2.52.0




* Re: [PATCH v3 07/19] target/riscv: Add cfg properity for Zvfofp8min extension
  2026-02-04  5:17 ` [PATCH v3 07/19] target/riscv: Add cfg properity for Zvfofp8min extension Max Chou
@ 2026-02-04 16:29   ` Chao Liu
  2026-02-05  7:33     ` Max Chou
  0 siblings, 1 reply; 36+ messages in thread
From: Chao Liu @ 2026-02-04 16:29 UTC (permalink / raw)
  To: Max Chou
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Aurelien Jarno, Peter Maydell, Alex Bennée, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On Wed, Feb 04, 2026 at 01:17:43PM +0800, Max Chou wrote:
> According to the ISA spec of Zvfofp8min extension,
> 
> "The Zvfofp8min extension requires on the Zve32f extension."
> 
Typo in subject: "properity" should be "property".

The same typo appears in patches 13, 15, and 18.

Otherwise, the patch looks good.

Reviewed-by: Chao Liu <chao.liu.zevorn@gmail.com>

Thanks,
Chao
> Signed-off-by: Max Chou <max.chou@sifive.com>
> ---
>  target/riscv/cpu.c                | 1 +
>  target/riscv/cpu_cfg_fields.h.inc | 1 +
>  target/riscv/tcg/tcg-cpu.c        | 5 +++++
>  target/riscv/vector_helper.c      | 3 ++-
>  4 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index bd17f61d7b..d6ce51ef5e 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -194,6 +194,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
>      ISA_EXT_DATA_ENTRY(zvfbfwma, PRIV_VERSION_1_12_0, ext_zvfbfwma),
>      ISA_EXT_DATA_ENTRY(zvfh, PRIV_VERSION_1_12_0, ext_zvfh),
>      ISA_EXT_DATA_ENTRY(zvfhmin, PRIV_VERSION_1_12_0, ext_zvfhmin),
> +    ISA_EXT_DATA_ENTRY(zvfofp8min, PRIV_VERSION_1_12_0, ext_zvfofp8min),
>      ISA_EXT_DATA_ENTRY(zvkb, PRIV_VERSION_1_12_0, ext_zvkb),
>      ISA_EXT_DATA_ENTRY(zvkg, PRIV_VERSION_1_12_0, ext_zvkg),
>      ISA_EXT_DATA_ENTRY(zvkn, PRIV_VERSION_1_12_0, ext_zvkn),
> diff --git a/target/riscv/cpu_cfg_fields.h.inc b/target/riscv/cpu_cfg_fields.h.inc
> index 3696f02ee0..59302894af 100644
> --- a/target/riscv/cpu_cfg_fields.h.inc
> +++ b/target/riscv/cpu_cfg_fields.h.inc
> @@ -104,6 +104,7 @@ BOOL_FIELD(ext_zvfbfmin)
>  BOOL_FIELD(ext_zvfbfwma)
>  BOOL_FIELD(ext_zvfh)
>  BOOL_FIELD(ext_zvfhmin)
> +BOOL_FIELD(ext_zvfofp8min)
>  BOOL_FIELD(ext_smaia)
>  BOOL_FIELD(ext_ssaia)
>  BOOL_FIELD(ext_smctr)
> diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
> index 378b298886..ba89436f13 100644
> --- a/target/riscv/tcg/tcg-cpu.c
> +++ b/target/riscv/tcg/tcg-cpu.c
> @@ -710,6 +710,11 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
>          return;
>      }
>  
> +    if (cpu->cfg.ext_zvfofp8min && !cpu->cfg.ext_zve32f) {
> +        error_setg(errp, "Zvfofp8min extension depends on Zve32f extension");
> +        return;
> +    }
> +
>      if (cpu->cfg.ext_zvfh && !cpu->cfg.ext_zfhmin) {
>          error_setg(errp, "Zvfh extensions requires Zfhmin extension");
>          return;
> diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
> index ec0ea4c143..ee5a1e595b 100644
> --- a/target/riscv/vector_helper.c
> +++ b/target/riscv/vector_helper.c
> @@ -38,7 +38,8 @@ static target_ulong vtype_reserved(CPURISCVState *env, target_ulong vtype)
>      int xlen = riscv_cpu_xlen(env);
>      target_ulong reserved = 0;
>  
> -    if (riscv_cpu_cfg(env)->ext_zvfbfa) {
> +    if (riscv_cpu_cfg(env)->ext_zvfbfa ||
> +        riscv_cpu_cfg(env)->ext_zvfofp8min) {
>          reserved = vtype & MAKE_64BIT_MASK(R_VTYPE_RESERVED_SHIFT,
>                                             xlen - 1 - R_VTYPE_RESERVED_SHIFT);
>      } else {
> -- 
> 2.52.0
> 
> 



* Re: [PATCH v3 07/19] target/riscv: Add cfg properity for Zvfofp8min extension
  2026-02-04 16:29   ` Chao Liu
@ 2026-02-05  7:33     ` Max Chou
  0 siblings, 0 replies; 36+ messages in thread
From: Max Chou @ 2026-02-05  7:33 UTC (permalink / raw)
  To: Chao Liu
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Aurelien Jarno, Peter Maydell, Alex Bennée, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On 2026-02-05 00:29, Chao Liu wrote:
> On Wed, Feb 04, 2026 at 01:17:43PM +0800, Max Chou wrote:
> > According to the ISA spec of Zvfofp8min extension,
> > 
> > "The Zvfofp8min extension requires on the Zve32f extension."
> > 
> Typo in subject: "properity" should be "property".
> 
> The same typo appears in patches 13, 15, and 18.
> 
> Otherwise, the patch looks good.
> 

Thanks for pointing out the typo.
I'll fix it in the next version.

rnax

> Reviewed-by: Chao Liu <chao.liu.zevorn@gmail.com>
> 
> Thanks,
> Chao



* [PATCH v3 08/19] target/riscv: Add implied rules for Zvfofp8min extension
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (6 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 07/19] target/riscv: Add cfg properity for Zvfofp8min extension Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04  5:17 ` [PATCH v3 09/19] target/riscv: rvv: Make vfwcvtbf16.f.f.v support OFP8 to BF16 conversion " Max Chou
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou, Alistair Francis

Add implied rules to enable the implied extensions of Zvfofp8min
extension recursively.

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/cpu.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index d6ce51ef5e..36fddce5bf 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -2507,6 +2507,15 @@ static RISCVCPUImpliedExtsRule ZVFHMIN_IMPLIED = {
     },
 };
 
+static RISCVCPUImpliedExtsRule ZVFOFP8MIN_IMPLIED = {
+    .ext = CPU_CFG_OFFSET(ext_zvfofp8min),
+    .implied_multi_exts = {
+        CPU_CFG_OFFSET(ext_zve32f),
+
+        RISCV_IMPLIED_EXTS_RULE_END
+    },
+};
+
 static RISCVCPUImpliedExtsRule ZVKN_IMPLIED = {
     .ext = CPU_CFG_OFFSET(ext_zvkn),
     .implied_multi_exts = {
@@ -2635,8 +2644,8 @@ RISCVCPUImpliedExtsRule *riscv_multi_ext_implied_rules[] = {
     &ZKS_IMPLIED, &ZVBB_IMPLIED, &ZVE32F_IMPLIED,
     &ZVE32X_IMPLIED, &ZVE64D_IMPLIED, &ZVE64F_IMPLIED,
     &ZVE64X_IMPLIED, &ZVFBFMIN_IMPLIED, &ZVFBFWMA_IMPLIED,
-    &ZVFH_IMPLIED, &ZVFHMIN_IMPLIED, &ZVKN_IMPLIED,
-    &ZVKNC_IMPLIED, &ZVKNG_IMPLIED, &ZVKNHB_IMPLIED,
+    &ZVFH_IMPLIED, &ZVFHMIN_IMPLIED, &ZVFOFP8MIN_IMPLIED,
+    &ZVKN_IMPLIED, &ZVKNC_IMPLIED, &ZVKNG_IMPLIED, &ZVKNHB_IMPLIED,
     &ZVKS_IMPLIED,  &ZVKSC_IMPLIED, &ZVKSG_IMPLIED, &SSCFG_IMPLIED,
     &SUPM_IMPLIED, &SSPM_IMPLIED, &SMCTR_IMPLIED, &SSCTR_IMPLIED,
     NULL
-- 
2.52.0




* [PATCH v3 09/19] target/riscv: rvv: Make vfwcvtbf16.f.f.v support OFP8 to BF16 conversion for Zvfofp8min extension
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (7 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 08/19] target/riscv: Add implied rules " Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04 16:34   ` Chao Liu
  2026-02-04 16:54   ` Chao Liu
  2026-02-04  5:17 ` [PATCH v3 10/19] target/riscv: rvv: Make vfncvtbf16.f.f.w support BF16 to OFP8 " Max Chou
                   ` (10 subsequent siblings)
  19 siblings, 2 replies; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou, Alistair Francis

According to the Zvfofp8min extension, the vfwcvtbf16.f.f.v instruction
supports OFP8 to BF16 conversion when SEW is 8.
The VTYPE.altfmt field selects the OFP8 format:
* altfmt = 0: OFP8.e4m3 to BF16
* altfmt = 1: OFP8.e5m2 to BF16
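
As a rough model of what the widening conversion computes per element, the
standalone sketch below maps one OFP8 byte to BF16, with altfmt choosing the
source format as described above. It is not the helper added by this patch:
ofp8_to_bf16() is a hypothetical name, and returning 0x7fc0 for every NaN
input (ignoring the sign) is an assumption about the canonical BF16 NaN. All
finite OFP8 values, including subnormals, are exactly representable as BF16
normals, so no rounding is involved.

```c
#include <stdbool.h>
#include <stdint.h>

/* altfmt = false: E4M3 (bias 7); altfmt = true: E5M2 (bias 15). */
static uint16_t ofp8_to_bf16(uint8_t v, bool altfmt)
{
    int exp_bits = altfmt ? 5 : 4;
    int man_bits = 7 - exp_bits;
    int bias = (1 << (exp_bits - 1)) - 1;
    uint16_t sign = (uint16_t)((v & 0x80u) << 8);
    int exp = (v >> man_bits) & ((1 << exp_bits) - 1);
    int man = v & ((1 << man_bits) - 1);
    int e;

    if (altfmt && exp == 31) {
        /* E5M2: frac == 0 is infinity, anything else is NaN. */
        return man == 0 ? (uint16_t)(sign | 0x7f80u) : 0x7fc0u;
    }
    if (!altfmt && (v & 0x7fu) == 0x7fu) {
        /* E4M3: S.1111.111 is the only NaN pattern. */
        return 0x7fc0u;
    }
    if (exp == 0) {
        if (man == 0) {
            return sign;                    /* signed zero */
        }
        /* Subnormal: normalize until the implicit bit appears. */
        e = 1 - bias;
        while (!(man & (1 << man_bits))) {
            man <<= 1;
            e--;
        }
        man &= (1 << man_bits) - 1;
    } else {
        e = exp - bias;
    }
    /* BF16: 8-bit exponent with bias 127, 7-bit fraction. */
    return (uint16_t)(sign | ((unsigned)(e + 127) << 7) |
                      ((unsigned)man << (7 - man_bits)));
}
```

For instance, 1.0 is 0x38 in E4M3 and 0x3c in E5M2, and both map to the BF16
encoding 0x3f80.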

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/helper.h                      | 12 +++
 target/riscv/insn_trans/trans_rvbf16.c.inc | 16 +++-
 target/riscv/vector_helper.c               | 97 ++++++++++++++++++++++
 3 files changed, 121 insertions(+), 4 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index eb0a488ba8..356c24d9fb 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1247,6 +1247,18 @@ DEF_HELPER_5(vfwcvtbf16_f_f_v, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vfwmaccbf16_vv, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vfwmaccbf16_vf, void, ptr, ptr, i64, ptr, env, i32)
 
+/* OFP8 functions */
+DEF_HELPER_5(vfwcvtbf16_f_f_v_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfwcvtbf16_f_f_v_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvtbf16_f_f_w_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvtbf16_f_f_w_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvtbf16_sat_f_f_w_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvtbf16_sat_f_f_w_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_f_f_q_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_f_f_q_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_sat_f_f_q_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
+DEF_HELPER_5(vfncvt_sat_f_f_q_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
+
 /* Vector crypto functions */
 DEF_HELPER_6(vclmul_vv, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vclmul_vx, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn_trans/trans_rvbf16.c.inc b/target/riscv/insn_trans/trans_rvbf16.c.inc
index 6cfda03d2e..9aafd4d2ef 100644
--- a/target/riscv/insn_trans/trans_rvbf16.c.inc
+++ b/target/riscv/insn_trans/trans_rvbf16.c.inc
@@ -92,11 +92,20 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, arg_vfncvtbf16_f_f_w *a)
 static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, arg_vfwcvtbf16_f_f_v *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZVFBFMIN(ctx);
 
-    if (opfv_widen_check(ctx, a) && (ctx->sew == MO_16)) {
+    if (opfv_widen_check(ctx, a) &&
+        ((ctx->sew == MO_16 && ctx->cfg_ptr->ext_zvfbfmin) ||
+         (ctx->sew == MO_8 && ctx->cfg_ptr->ext_zvfofp8min))) {
+        gen_helper_gvec_3_ptr *fn;
         uint32_t data = 0;
 
+        if (ctx->sew == MO_16) {
+            fn = gen_helper_vfwcvtbf16_f_f_v;
+        } else {
+            fn = ctx->altfmt ? gen_helper_vfwcvtbf16_f_f_v_ofp8e5m2 :
+                               gen_helper_vfwcvtbf16_f_f_v_ofp8e4m3;
+        }
+
         gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
@@ -106,8 +115,7 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, arg_vfwcvtbf16_f_f_v *a)
         tcg_gen_gvec_3_ptr(vreg_ofs(ctx, a->rd), vreg_ofs(ctx, 0),
                            vreg_ofs(ctx, a->rs2), tcg_env,
                            ctx->cfg_ptr->vlenb,
-                           ctx->cfg_ptr->vlenb, data,
-                           gen_helper_vfwcvtbf16_f_f_v);
+                           ctx->cfg_ptr->vlenb, data, fn);
         finalize_rvv_inst(ctx);
         return true;
     }
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index ee5a1e595b..418212973d 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -5024,6 +5024,103 @@ GEN_VEXT_V_ENV(vfncvt_f_f_w_w, 4)
 RVVCALL(OPFVV1, vfncvtbf16_f_f_w, NOP_UU_H, H2, H4, float32_to_bfloat16)
 GEN_VEXT_V_ENV(vfncvtbf16_f_f_w, 2)
 
+/*
+ * OCP FP8 Narrowing Conversions (BF16/F32 -> FP8)
+ * 1. Initialize a local float_status with RISC-V specific NaN handling
+ * 2. Call the softfloat conversion function with saturation parameter
+ * 3. Merge exception flags back to the original status
+ */
+#define GEN_OCP_FP8_NARROW(NAME, CONVERT_FN, SATURATE, IN_TYPE)  \
+static uint8_t NAME(IN_TYPE a, float_status *s)                  \
+{                                                                \
+    float_status local = *s;                                     \
+    local.default_nan_pattern = 0x70;                            \
+    local.default_nan_mode = true;                               \
+    uint8_t result = CONVERT_FN(a, SATURATE, &local);            \
+    s->float_exception_flags |= local.float_exception_flags;     \
+    return result;                                               \
+}
+
+/* BF16 -> E4M3/E5M2 conversions */
+GEN_OCP_FP8_NARROW(vfncvt_bf16_to_e4m3, bfloat16_to_float8_e4m3, false,
+                   uint16_t)
+GEN_OCP_FP8_NARROW(vfncvt_bf16_to_e5m2, bfloat16_to_float8_e5m2, false,
+                   uint16_t)
+GEN_OCP_FP8_NARROW(vfncvt_bf16_to_e4m3_sat, bfloat16_to_float8_e4m3, true,
+                   uint16_t)
+GEN_OCP_FP8_NARROW(vfncvt_bf16_to_e5m2_sat, bfloat16_to_float8_e5m2, true,
+                   uint16_t)
+
+/* F32 -> E4M3/E5M2 conversions */
+GEN_OCP_FP8_NARROW(vfncvt_f32_to_e4m3, float32_to_float8_e4m3, false, uint32_t)
+GEN_OCP_FP8_NARROW(vfncvt_f32_to_e5m2, float32_to_float8_e5m2, false, uint32_t)
+GEN_OCP_FP8_NARROW(vfncvt_f32_to_e4m3_sat, float32_to_float8_e4m3, true,
+                   uint32_t)
+GEN_OCP_FP8_NARROW(vfncvt_f32_to_e5m2_sat, float32_to_float8_e5m2, true,
+                   uint32_t)
+
+/*
+ * OCP FP8 Widening Conversions (FP8 -> BF16)
+ * According to Zvfofp8min isa specification: "No rounding occurs, and no
+ * floating-point exception flags are set."
+ * 1. Initialize a local float_status with no_signaling_nans=true
+ * 2. Call the softfloat conversion function
+ * 3. Intentionally DISCARD exception flags (not merged back)
+ */
+#define GEN_OCP_FP8_WIDEN(NAME, CONVERT_FN)      \
+static uint16_t NAME(uint8_t a, float_status *s) \
+{                                                \
+    float_status local = *s;                     \
+    local.no_signaling_nans = true;              \
+    return CONVERT_FN(a, &local);                \
+}
+
+GEN_OCP_FP8_WIDEN(vfwcvt_e4m3_to_bf16, float8_e4m3_to_bfloat16)
+GEN_OCP_FP8_WIDEN(vfwcvt_e5m2_to_bf16, float8_e5m2_to_bfloat16)
+
+/* vfwcvtbf16.f.f.v vd, vs2, vm # Convert OFP8 to BF16. */
+RVVCALL(OPFVV1, vfwcvtbf16_f_f_v_ofp8e4m3, WOP_UU_B, H2, H1,
+        vfwcvt_e4m3_to_bf16)
+RVVCALL(OPFVV1, vfwcvtbf16_f_f_v_ofp8e5m2, WOP_UU_B, H2, H1,
+        vfwcvt_e5m2_to_bf16)
+GEN_VEXT_V_ENV(vfwcvtbf16_f_f_v_ofp8e4m3, 2)
+GEN_VEXT_V_ENV(vfwcvtbf16_f_f_v_ofp8e5m2, 2)
+
+/* vfncvtbf16.f.f.w vd, vs2, vm # Convert BF16 to OFP8 without saturation. */
+RVVCALL(OPFVV1, vfncvtbf16_f_f_w_ofp8e4m3, NOP_UU_B, H1, H2,
+        vfncvt_bf16_to_e4m3)
+RVVCALL(OPFVV1, vfncvtbf16_f_f_w_ofp8e5m2, NOP_UU_B, H1, H2,
+        vfncvt_bf16_to_e5m2)
+GEN_VEXT_V_ENV(vfncvtbf16_f_f_w_ofp8e4m3, 1)
+GEN_VEXT_V_ENV(vfncvtbf16_f_f_w_ofp8e5m2, 1)
+
+/* vfncvtbf16.sat.f.f.w vd, vs2, vm # Convert BF16 to OFP8 with saturation. */
+RVVCALL(OPFVV1, vfncvtbf16_sat_f_f_w_ofp8e4m3, NOP_UU_B, H1, H2,
+        vfncvt_bf16_to_e4m3_sat)
+RVVCALL(OPFVV1, vfncvtbf16_sat_f_f_w_ofp8e5m2, NOP_UU_B, H1, H2,
+        vfncvt_bf16_to_e5m2_sat)
+GEN_VEXT_V_ENV(vfncvtbf16_sat_f_f_w_ofp8e4m3, 1)
+GEN_VEXT_V_ENV(vfncvtbf16_sat_f_f_w_ofp8e5m2, 1)
+
+/* Quad-width narrowing type for FP32 to OFP8 */
+#define QOP_UU_B uint8_t, uint32_t, uint32_t
+
+/* vfncvt.f.f.q vd, vs2, vm # Convert FP32 to OFP8. */
+RVVCALL(OPFVV1, vfncvt_f_f_q_ofp8e4m3, QOP_UU_B, H1, H4,
+        vfncvt_f32_to_e4m3)
+RVVCALL(OPFVV1, vfncvt_f_f_q_ofp8e5m2, QOP_UU_B, H1, H4,
+        vfncvt_f32_to_e5m2)
+GEN_VEXT_V_ENV(vfncvt_f_f_q_ofp8e4m3, 1)
+GEN_VEXT_V_ENV(vfncvt_f_f_q_ofp8e5m2, 1)
+
+/* vfncvt.sat.f.f.q vd, vs2, vm # Convert FP32 to OFP8 with saturation. */
+RVVCALL(OPFVV1, vfncvt_sat_f_f_q_ofp8e4m3, QOP_UU_B, H1, H4,
+        vfncvt_f32_to_e4m3_sat)
+RVVCALL(OPFVV1, vfncvt_sat_f_f_q_ofp8e5m2, QOP_UU_B, H1, H4,
+        vfncvt_f32_to_e5m2_sat)
+GEN_VEXT_V_ENV(vfncvt_sat_f_f_q_ofp8e4m3, 1)
+GEN_VEXT_V_ENV(vfncvt_sat_f_f_q_ofp8e5m2, 1)
+
 /*
  * Vector Reduction Operations
  */
-- 
2.52.0
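
The helper selection in trans_vfwcvtbf16_f_f_v() above can be sketched outside of QEMU as a small pure function. This is only an illustrative model: `enum sew`, `struct cfg`, `enum helper` and `pick_widen_helper()` are hypothetical names invented for the sketch, not QEMU types or functions. It shows the two rules from the commit message: SEW=16 keeps the existing BF16 to FP32 path under Zvfbfmin, while SEW=8 under Zvfofp8min picks E4M3 or E5M2 from altfmt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the translator state; not QEMU types. */
enum sew { SEW_8, SEW_16 };

struct cfg {
    bool ext_zvfbfmin;
    bool ext_zvfofp8min;
};

enum helper {
    HELPER_NONE,            /* encoding rejected, translator returns false */
    HELPER_BF16_TO_F32,     /* existing Zvfbfmin behaviour, SEW=16 */
    HELPER_E4M3_TO_BF16,    /* Zvfofp8min, SEW=8, altfmt=0 */
    HELPER_E5M2_TO_BF16     /* Zvfofp8min, SEW=8, altfmt=1 */
};

/* Mirror of the condition and the fn selection in the patch. */
static enum helper pick_widen_helper(enum sew sew, bool altfmt,
                                     const struct cfg *cfg)
{
    assert(cfg != NULL);

    if (sew == SEW_16 && cfg->ext_zvfbfmin) {
        return HELPER_BF16_TO_F32;
    }
    if (sew == SEW_8 && cfg->ext_zvfofp8min) {
        return altfmt ? HELPER_E5M2_TO_BF16 : HELPER_E4M3_TO_BF16;
    }
    return HELPER_NONE;
}
```

Note that altfmt is ignored on the SEW=16 path in this model, which matches the patch: only the SEW=8 branch reads ctx->altfmt.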




* Re: [PATCH v3 09/19] target/riscv: rvv: Make vfwcvtbf16.f.f.v support OFP8 to BF16 conversion for Zvfofp8min extension
  2026-02-04  5:17 ` [PATCH v3 09/19] target/riscv: rvv: Make vfwcvtbf16.f.f.v support OFP8 to BF16 conversion " Max Chou
@ 2026-02-04 16:34   ` Chao Liu
  2026-02-04 16:54   ` Chao Liu
  1 sibling, 0 replies; 36+ messages in thread
From: Chao Liu @ 2026-02-04 16:34 UTC (permalink / raw)
  To: Max Chou
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Aurelien Jarno, Peter Maydell, Alex Bennée, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On Wed, Feb 04, 2026 at 01:17:45PM +0800, Max Chou wrote:
> According to the Zvfofp8min extension, the vfwcvtbf16.f.f.v instruction
> supports OFP8 to BF16 conversion when SEW is 8.
> And the VTYPE.altfmt field is used to select the OFP8 format.
> * altfmt = 0: OFP8.e4m3 to BF16
> * altfmt = 1: OFP8.e5m2 to BF16
> 
> Signed-off-by: Max Chou <max.chou@sifive.com>
> ---
>  target/riscv/helper.h                      | 12 +++
>  target/riscv/insn_trans/trans_rvbf16.c.inc | 16 +++-
>  target/riscv/vector_helper.c               | 97 ++++++++++++++++++++++
>  3 files changed, 121 insertions(+), 4 deletions(-)
> 
> diff --git a/target/riscv/helper.h b/target/riscv/helper.h
> index eb0a488ba8..356c24d9fb 100644
> --- a/target/riscv/helper.h
> +++ b/target/riscv/helper.h
> @@ -1247,6 +1247,18 @@ DEF_HELPER_5(vfwcvtbf16_f_f_v, void, ptr, ptr, ptr, env, i32)
>  DEF_HELPER_6(vfwmaccbf16_vv, void, ptr, ptr, ptr, ptr, env, i32)
>  DEF_HELPER_6(vfwmaccbf16_vf, void, ptr, ptr, i64, ptr, env, i32)
>  
> +/* OFP8 functions */
> +DEF_HELPER_5(vfwcvtbf16_f_f_v_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfwcvtbf16_f_f_v_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvtbf16_f_f_w_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvtbf16_f_f_w_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvtbf16_sat_f_f_w_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvtbf16_sat_f_f_w_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvt_f_f_q_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvt_f_f_q_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvt_sat_f_f_q_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvt_sat_f_f_q_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
> +
>  /* Vector crypto functions */
>  DEF_HELPER_6(vclmul_vv, void, ptr, ptr, ptr, ptr, env, i32)
>  DEF_HELPER_6(vclmul_vx, void, ptr, ptr, tl, ptr, env, i32)
> diff --git a/target/riscv/insn_trans/trans_rvbf16.c.inc b/target/riscv/insn_trans/trans_rvbf16.c.inc
> index 6cfda03d2e..9aafd4d2ef 100644
> --- a/target/riscv/insn_trans/trans_rvbf16.c.inc
> +++ b/target/riscv/insn_trans/trans_rvbf16.c.inc
> @@ -92,11 +92,20 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, arg_vfncvtbf16_f_f_w *a)
>  static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, arg_vfwcvtbf16_f_f_v *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZVFBFMIN(ctx);
>  
> -    if (opfv_widen_check(ctx, a) && (ctx->sew == MO_16)) {
> +    if (opfv_widen_check(ctx, a) &&
> +        ((ctx->sew == MO_16 && ctx->cfg_ptr->ext_zvfbfmin) ||
> +         (ctx->sew == MO_8 && ctx->cfg_ptr->ext_zvfofp8min))) {
> +        gen_helper_gvec_3_ptr *fn;
>          uint32_t data = 0;
>  
> +        if (ctx->sew == MO_16) {
> +            fn = gen_helper_vfwcvtbf16_f_f_v;
> +        } else {
> +            fn = ctx->altfmt ? gen_helper_vfwcvtbf16_f_f_v_ofp8e5m2 :
> +                               gen_helper_vfwcvtbf16_f_f_v_ofp8e4m3;
> +        }
> +
>          gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);
>  
>          data = FIELD_DP32(data, VDATA, VM, a->vm);
> @@ -106,8 +115,7 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, arg_vfwcvtbf16_f_f_v *a)
>          tcg_gen_gvec_3_ptr(vreg_ofs(ctx, a->rd), vreg_ofs(ctx, 0),
>                             vreg_ofs(ctx, a->rs2), tcg_env,
>                             ctx->cfg_ptr->vlenb,
> -                           ctx->cfg_ptr->vlenb, data,
> -                           gen_helper_vfwcvtbf16_f_f_v);
> +                           ctx->cfg_ptr->vlenb, data, fn);
>          finalize_rvv_inst(ctx);
>          return true;
>      }
> diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
> index ee5a1e595b..418212973d 100644
> --- a/target/riscv/vector_helper.c
> +++ b/target/riscv/vector_helper.c
> @@ -5024,6 +5024,103 @@ GEN_VEXT_V_ENV(vfncvt_f_f_w_w, 4)
>  RVVCALL(OPFVV1, vfncvtbf16_f_f_w, NOP_UU_H, H2, H4, float32_to_bfloat16)
>  GEN_VEXT_V_ENV(vfncvtbf16_f_f_w, 2)
>  
> +/*
> + * OCP FP8 Narrowing Conversions (BF16/F32 -> FP8)
> + * 1. Initialize a local float_status with RISC-V specific NaN handling
> + * 2. Call the softfloat conversion function with saturation parameter
> + * 3. Merge exception flags back to the original status
> + */
> +#define GEN_OCP_FP8_NARROW(NAME, CONVERT_FN, SATURATE, IN_TYPE)  \
> +static uint8_t NAME(IN_TYPE a, float_status *s)                  \

checkpatch reports:
ERROR: spaces required around that '*' (ctx:WxV)

Please fix the spacing around the '*' in the macro definition.

Thanks,
Chao

> +{                                                                \
> +    float_status local = *s;                                     \
> +    local.default_nan_pattern = 0x70;                            \
> +    local.default_nan_mode = true;                               \
> +    uint8_t result = CONVERT_FN(a, SATURATE, &local);            \
> +    s->float_exception_flags |= local.float_exception_flags;     \
> +    return result;                                               \
> +}
> +
> +/* BF16 -> E4M3/E5M2 conversions */
> +GEN_OCP_FP8_NARROW(vfncvt_bf16_to_e4m3, bfloat16_to_float8_e4m3, false,
> +                   uint16_t)
> +GEN_OCP_FP8_NARROW(vfncvt_bf16_to_e5m2, bfloat16_to_float8_e5m2, false,
> +                   uint16_t)
> +GEN_OCP_FP8_NARROW(vfncvt_bf16_to_e4m3_sat, bfloat16_to_float8_e4m3, true,
> +                   uint16_t)
> +GEN_OCP_FP8_NARROW(vfncvt_bf16_to_e5m2_sat, bfloat16_to_float8_e5m2, true,
> +                   uint16_t)
> +
> +/* F32 -> E4M3/E5M2 conversions */
> +GEN_OCP_FP8_NARROW(vfncvt_f32_to_e4m3, float32_to_float8_e4m3, false, uint32_t)
> +GEN_OCP_FP8_NARROW(vfncvt_f32_to_e5m2, float32_to_float8_e5m2, false, uint32_t)
> +GEN_OCP_FP8_NARROW(vfncvt_f32_to_e4m3_sat, float32_to_float8_e4m3, true,
> +                   uint32_t)
> +GEN_OCP_FP8_NARROW(vfncvt_f32_to_e5m2_sat, float32_to_float8_e5m2, true,
> +                   uint32_t)
> +
> +/*
> + * OCP FP8 Widening Conversions (FP8 -> BF16)
> + * According to Zvfofp8min isa specification: "No rounding occurs, and no
> + * floating-point exception flags are set."
> + * 1. Initialize a local float_status with no_signaling_nans=true
> + * 2. Call the softfloat conversion function
> + * 3. Intentionally DISCARD exception flags (not merged back)
> + */
> +#define GEN_OCP_FP8_WIDEN(NAME, CONVERT_FN)      \
> +static uint16_t NAME(uint8_t a, float_status *s) \
> +{                                                \
> +    float_status local = *s;                     \
> +    local.no_signaling_nans = true;              \
> +    return CONVERT_FN(a, &local);                \
> +}
> +
> +GEN_OCP_FP8_WIDEN(vfwcvt_e4m3_to_bf16, float8_e4m3_to_bfloat16)
> +GEN_OCP_FP8_WIDEN(vfwcvt_e5m2_to_bf16, float8_e5m2_to_bfloat16)
> +
> +/* vfwcvtbf16.f.f.v vd, vs2, vm # Convert OFP8 to BF16. */
> +RVVCALL(OPFVV1, vfwcvtbf16_f_f_v_ofp8e4m3, WOP_UU_B, H2, H1,
> +        vfwcvt_e4m3_to_bf16)
> +RVVCALL(OPFVV1, vfwcvtbf16_f_f_v_ofp8e5m2, WOP_UU_B, H2, H1,
> +        vfwcvt_e5m2_to_bf16)
> +GEN_VEXT_V_ENV(vfwcvtbf16_f_f_v_ofp8e4m3, 2)
> +GEN_VEXT_V_ENV(vfwcvtbf16_f_f_v_ofp8e5m2, 2)
> +
> +/* vfncvtbf16.f.f.w vd, vs2, vm # Convert BF16 to OFP8 without saturation. */
> +RVVCALL(OPFVV1, vfncvtbf16_f_f_w_ofp8e4m3, NOP_UU_B, H1, H2,
> +        vfncvt_bf16_to_e4m3)
> +RVVCALL(OPFVV1, vfncvtbf16_f_f_w_ofp8e5m2, NOP_UU_B, H1, H2,
> +        vfncvt_bf16_to_e5m2)
> +GEN_VEXT_V_ENV(vfncvtbf16_f_f_w_ofp8e4m3, 1)
> +GEN_VEXT_V_ENV(vfncvtbf16_f_f_w_ofp8e5m2, 1)
> +
> +/* vfncvtbf16.sat.f.f.w vd, vs2, vm # Convert BF16 to OFP8 with saturation. */
> +RVVCALL(OPFVV1, vfncvtbf16_sat_f_f_w_ofp8e4m3, NOP_UU_B, H1, H2,
> +        vfncvt_bf16_to_e4m3_sat)
> +RVVCALL(OPFVV1, vfncvtbf16_sat_f_f_w_ofp8e5m2, NOP_UU_B, H1, H2,
> +        vfncvt_bf16_to_e5m2_sat)
> +GEN_VEXT_V_ENV(vfncvtbf16_sat_f_f_w_ofp8e4m3, 1)
> +GEN_VEXT_V_ENV(vfncvtbf16_sat_f_f_w_ofp8e5m2, 1)
> +
> +/* Quad-width narrowing type for FP32 to OFP8 */
> +#define QOP_UU_B uint8_t, uint32_t, uint32_t
> +
> +/* vfncvt.f.f.q vd, vs2, vm # Convert FP32 to OFP8. */
> +RVVCALL(OPFVV1, vfncvt_f_f_q_ofp8e4m3, QOP_UU_B, H1, H4,
> +        vfncvt_f32_to_e4m3)
> +RVVCALL(OPFVV1, vfncvt_f_f_q_ofp8e5m2, QOP_UU_B, H1, H4,
> +        vfncvt_f32_to_e5m2)
> +GEN_VEXT_V_ENV(vfncvt_f_f_q_ofp8e4m3, 1)
> +GEN_VEXT_V_ENV(vfncvt_f_f_q_ofp8e5m2, 1)
> +
> +/* vfncvt.sat.f.f.q vd, vs2, vm # Convert FP32 to OFP8 with saturation. */
> +RVVCALL(OPFVV1, vfncvt_sat_f_f_q_ofp8e4m3, QOP_UU_B, H1, H4,
> +        vfncvt_f32_to_e4m3_sat)
> +RVVCALL(OPFVV1, vfncvt_sat_f_f_q_ofp8e5m2, QOP_UU_B, H1, H4,
> +        vfncvt_f32_to_e5m2_sat)
> +GEN_VEXT_V_ENV(vfncvt_sat_f_f_q_ofp8e4m3, 1)
> +GEN_VEXT_V_ENV(vfncvt_sat_f_f_q_ofp8e5m2, 1)
> +
>  /*
>   * Vector Reduction Operations
>   */
> -- 
> 2.52.0
> 
> 



* Re: [PATCH v3 09/19] target/riscv: rvv: Make vfwcvtbf16.f.f.v support OFP8 to BF16 conversion for Zvfofp8min extension
  2026-02-04  5:17 ` [PATCH v3 09/19] target/riscv: rvv: Make vfwcvtbf16.f.f.v support OFP8 to BF16 conversion " Max Chou
  2026-02-04 16:34   ` Chao Liu
@ 2026-02-04 16:54   ` Chao Liu
  1 sibling, 0 replies; 36+ messages in thread
From: Chao Liu @ 2026-02-04 16:54 UTC (permalink / raw)
  To: Max Chou
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Aurelien Jarno, Peter Maydell, Alex Bennée, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On Wed, Feb 04, 2026 at 01:17:45PM +0800, Max Chou wrote:
> According to the Zvfofp8min extension, the vfwcvtbf16.f.f.v instruction
> supports OFP8 to BF16 conversion when SEW is 8.
> And the VTYPE.altfmt field is used to select the OFP8 format.
> * altfmt = 0: OFP8.e4m3 to BF16
> * altfmt = 1: OFP8.e5m2 to BF16
> 
> Signed-off-by: Max Chou <max.chou@sifive.com>
> ---
>  target/riscv/helper.h                      | 12 +++
>  target/riscv/insn_trans/trans_rvbf16.c.inc | 16 +++-
>  target/riscv/vector_helper.c               | 97 ++++++++++++++++++++++
>  3 files changed, 121 insertions(+), 4 deletions(-)
> 
> diff --git a/target/riscv/helper.h b/target/riscv/helper.h
> index eb0a488ba8..356c24d9fb 100644
> --- a/target/riscv/helper.h
> +++ b/target/riscv/helper.h
> @@ -1247,6 +1247,18 @@ DEF_HELPER_5(vfwcvtbf16_f_f_v, void, ptr, ptr, ptr, env, i32)
>  DEF_HELPER_6(vfwmaccbf16_vv, void, ptr, ptr, ptr, ptr, env, i32)
>  DEF_HELPER_6(vfwmaccbf16_vf, void, ptr, ptr, i64, ptr, env, i32)
>  
> +/* OFP8 functions */
> +DEF_HELPER_5(vfwcvtbf16_f_f_v_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfwcvtbf16_f_f_v_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvtbf16_f_f_w_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvtbf16_f_f_w_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvtbf16_sat_f_f_w_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvtbf16_sat_f_f_w_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvt_f_f_q_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvt_f_f_q_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvt_sat_f_f_q_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
> +DEF_HELPER_5(vfncvt_sat_f_f_q_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
> +
>  /* Vector crypto functions */
>  DEF_HELPER_6(vclmul_vv, void, ptr, ptr, ptr, ptr, env, i32)
>  DEF_HELPER_6(vclmul_vx, void, ptr, ptr, tl, ptr, env, i32)
> diff --git a/target/riscv/insn_trans/trans_rvbf16.c.inc b/target/riscv/insn_trans/trans_rvbf16.c.inc
> index 6cfda03d2e..9aafd4d2ef 100644
> --- a/target/riscv/insn_trans/trans_rvbf16.c.inc
> +++ b/target/riscv/insn_trans/trans_rvbf16.c.inc
> @@ -92,11 +92,20 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, arg_vfncvtbf16_f_f_w *a)
>  static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, arg_vfwcvtbf16_f_f_v *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZVFBFMIN(ctx);
>  
> -    if (opfv_widen_check(ctx, a) && (ctx->sew == MO_16)) {
> +    if (opfv_widen_check(ctx, a) &&
> +        ((ctx->sew == MO_16 && ctx->cfg_ptr->ext_zvfbfmin) ||
> +         (ctx->sew == MO_8 && ctx->cfg_ptr->ext_zvfofp8min))) {
> +        gen_helper_gvec_3_ptr *fn;
>          uint32_t data = 0;
>  
> +        if (ctx->sew == MO_16) {
> +            fn = gen_helper_vfwcvtbf16_f_f_v;
> +        } else {
> +            fn = ctx->altfmt ? gen_helper_vfwcvtbf16_f_f_v_ofp8e5m2 :
> +                               gen_helper_vfwcvtbf16_f_f_v_ofp8e4m3;
> +        }
> +
>          gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);
>  
>          data = FIELD_DP32(data, VDATA, VM, a->vm);
> @@ -106,8 +115,7 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, arg_vfwcvtbf16_f_f_v *a)
>          tcg_gen_gvec_3_ptr(vreg_ofs(ctx, a->rd), vreg_ofs(ctx, 0),
>                             vreg_ofs(ctx, a->rs2), tcg_env,
>                             ctx->cfg_ptr->vlenb,
> -                           ctx->cfg_ptr->vlenb, data,
> -                           gen_helper_vfwcvtbf16_f_f_v);
> +                           ctx->cfg_ptr->vlenb, data, fn);
>          finalize_rvv_inst(ctx);
>          return true;
>      }
> diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
> index ee5a1e595b..418212973d 100644
> --- a/target/riscv/vector_helper.c
> +++ b/target/riscv/vector_helper.c
> @@ -5024,6 +5024,103 @@ GEN_VEXT_V_ENV(vfncvt_f_f_w_w, 4)
>  RVVCALL(OPFVV1, vfncvtbf16_f_f_w, NOP_UU_H, H2, H4, float32_to_bfloat16)
>  GEN_VEXT_V_ENV(vfncvtbf16_f_f_w, 2)
>  
> +/*
> + * OCP FP8 Narrowing Conversions (BF16/F32 -> FP8)
> + * 1. Initialize a local float_status with RISC-V specific NaN handling
> + * 2. Call the softfloat conversion function with saturation parameter
> + * 3. Merge exception flags back to the original status
> + */
> +#define GEN_OCP_FP8_NARROW(NAME, CONVERT_FN, SATURATE, IN_TYPE)  \
> +static uint8_t NAME(IN_TYPE a, float_status *s)                  \
> +{                                                                \
> +    float_status local = *s;                                     \
> +    local.default_nan_pattern = 0x70;                            \

I suggest adding a comment in GEN_OCP_FP8_NARROW() to explain the
choice of default_nan_pattern:

    * 0x70 produces canonical NaN 0x7f for both E4M3 and E5M2 per
    * Zvfofp8min spec

Thanks,
Chao

> +    local.default_nan_mode = true;                               \
> +    uint8_t result = CONVERT_FN(a, SATURATE, &local);            \
> +    s->float_exception_flags |= local.float_exception_flags;     \
> +    return result;                                               \
> +}
> +
> +/* BF16 -> E4M3/E5M2 conversions */
> +GEN_OCP_FP8_NARROW(vfncvt_bf16_to_e4m3, bfloat16_to_float8_e4m3, false,
> +                   uint16_t)
> +GEN_OCP_FP8_NARROW(vfncvt_bf16_to_e5m2, bfloat16_to_float8_e5m2, false,
> +                   uint16_t)
> +GEN_OCP_FP8_NARROW(vfncvt_bf16_to_e4m3_sat, bfloat16_to_float8_e4m3, true,
> +                   uint16_t)
> +GEN_OCP_FP8_NARROW(vfncvt_bf16_to_e5m2_sat, bfloat16_to_float8_e5m2, true,
> +                   uint16_t)
> +
> +/* F32 -> E4M3/E5M2 conversions */
> +GEN_OCP_FP8_NARROW(vfncvt_f32_to_e4m3, float32_to_float8_e4m3, false, uint32_t)
> +GEN_OCP_FP8_NARROW(vfncvt_f32_to_e5m2, float32_to_float8_e5m2, false, uint32_t)
> +GEN_OCP_FP8_NARROW(vfncvt_f32_to_e4m3_sat, float32_to_float8_e4m3, true,
> +                   uint32_t)
> +GEN_OCP_FP8_NARROW(vfncvt_f32_to_e5m2_sat, float32_to_float8_e5m2, true,
> +                   uint32_t)
> +
> +/*
> + * OCP FP8 Widening Conversions (FP8 -> BF16)
> + * According to Zvfofp8min isa specification: "No rounding occurs, and no
> + * floating-point exception flags are set."
> + * 1. Initialize a local float_status with no_signaling_nans=true
> + * 2. Call the softfloat conversion function
> + * 3. Intentionally DISCARD exception flags (not merged back)
> + */
> +#define GEN_OCP_FP8_WIDEN(NAME, CONVERT_FN)      \
> +static uint16_t NAME(uint8_t a, float_status *s) \
> +{                                                \
> +    float_status local = *s;                     \
> +    local.no_signaling_nans = true;              \
> +    return CONVERT_FN(a, &local);                \
> +}
> +
> +GEN_OCP_FP8_WIDEN(vfwcvt_e4m3_to_bf16, float8_e4m3_to_bfloat16)
> +GEN_OCP_FP8_WIDEN(vfwcvt_e5m2_to_bf16, float8_e5m2_to_bfloat16)
> +
> +/* vfwcvtbf16.f.f.v vd, vs2, vm # Convert OFP8 to BF16. */
> +RVVCALL(OPFVV1, vfwcvtbf16_f_f_v_ofp8e4m3, WOP_UU_B, H2, H1,
> +        vfwcvt_e4m3_to_bf16)
> +RVVCALL(OPFVV1, vfwcvtbf16_f_f_v_ofp8e5m2, WOP_UU_B, H2, H1,
> +        vfwcvt_e5m2_to_bf16)
> +GEN_VEXT_V_ENV(vfwcvtbf16_f_f_v_ofp8e4m3, 2)
> +GEN_VEXT_V_ENV(vfwcvtbf16_f_f_v_ofp8e5m2, 2)
> +
> +/* vfncvtbf16.f.f.w vd, vs2, vm # Convert BF16 to OFP8 without saturation. */
> +RVVCALL(OPFVV1, vfncvtbf16_f_f_w_ofp8e4m3, NOP_UU_B, H1, H2,
> +        vfncvt_bf16_to_e4m3)
> +RVVCALL(OPFVV1, vfncvtbf16_f_f_w_ofp8e5m2, NOP_UU_B, H1, H2,
> +        vfncvt_bf16_to_e5m2)
> +GEN_VEXT_V_ENV(vfncvtbf16_f_f_w_ofp8e4m3, 1)
> +GEN_VEXT_V_ENV(vfncvtbf16_f_f_w_ofp8e5m2, 1)
> +
> +/* vfncvtbf16.sat.f.f.w vd, vs2, vm # Convert BF16 to OFP8 with saturation. */
> +RVVCALL(OPFVV1, vfncvtbf16_sat_f_f_w_ofp8e4m3, NOP_UU_B, H1, H2,
> +        vfncvt_bf16_to_e4m3_sat)
> +RVVCALL(OPFVV1, vfncvtbf16_sat_f_f_w_ofp8e5m2, NOP_UU_B, H1, H2,
> +        vfncvt_bf16_to_e5m2_sat)
> +GEN_VEXT_V_ENV(vfncvtbf16_sat_f_f_w_ofp8e4m3, 1)
> +GEN_VEXT_V_ENV(vfncvtbf16_sat_f_f_w_ofp8e5m2, 1)
> +
> +/* Quad-width narrowing type for FP32 to OFP8 */
> +#define QOP_UU_B uint8_t, uint32_t, uint32_t
> +
> +/* vfncvt.f.f.q vd, vs2, vm # Convert FP32 to OFP8. */
> +RVVCALL(OPFVV1, vfncvt_f_f_q_ofp8e4m3, QOP_UU_B, H1, H4,
> +        vfncvt_f32_to_e4m3)
> +RVVCALL(OPFVV1, vfncvt_f_f_q_ofp8e5m2, QOP_UU_B, H1, H4,
> +        vfncvt_f32_to_e5m2)
> +GEN_VEXT_V_ENV(vfncvt_f_f_q_ofp8e4m3, 1)
> +GEN_VEXT_V_ENV(vfncvt_f_f_q_ofp8e5m2, 1)
> +
> +/* vfncvt.sat.f.f.q vd, vs2, vm # Convert FP32 to OFP8 with saturation. */
> +RVVCALL(OPFVV1, vfncvt_sat_f_f_q_ofp8e4m3, QOP_UU_B, H1, H4,
> +        vfncvt_f32_to_e4m3_sat)
> +RVVCALL(OPFVV1, vfncvt_sat_f_f_q_ofp8e5m2, QOP_UU_B, H1, H4,
> +        vfncvt_f32_to_e5m2_sat)
> +GEN_VEXT_V_ENV(vfncvt_sat_f_f_q_ofp8e4m3, 1)
> +GEN_VEXT_V_ENV(vfncvt_sat_f_f_q_ofp8e5m2, 1)
> +
>  /*
>   * Vector Reduction Operations
>   */
> -- 
> 2.52.0
> 
> 
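
To illustrate why a single pattern of 0x70 can give 0x7f for both formats, here is a minimal standalone sketch. It assumes the usual softfloat reading of default_nan_pattern (bit 7 is the sign, bits 6..0 are the high bits of the fraction) and that the exponent of a NaN is all ones; how the float8 conversion code in this series actually consumes the pattern is not shown in this patch, so treat `fp8_default_nan()` as a hypothetical model rather than the real implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed model: bit 7 of pattern is the sign, bits 6..0 are the high
 * fraction bits. An 8-bit format with frac_bits of mantissa keeps only
 * the top frac_bits of that 7-bit field.
 */
static uint8_t fp8_default_nan(uint8_t pattern, unsigned exp_bits,
                               unsigned frac_bits)
{
    /* Sign + exponent + fraction must fill exactly 8 bits. */
    assert(exp_bits + frac_bits == 7 && frac_bits >= 1);

    uint8_t sign = pattern >> 7;
    uint8_t exp_all_ones = (uint8_t)((1u << exp_bits) - 1);
    uint8_t frac = (uint8_t)((pattern & 0x7f) >> (7 - frac_bits));

    return (uint8_t)((sign << 7) | (exp_all_ones << frac_bits) | frac);
}
```

Under these assumptions, E4M3 (4 exponent bits, 3 fraction bits) keeps the top three fraction bits 111 and E5M2 (5 exponent bits, 2 fraction bits) keeps the top two bits 11, so both encode as 0 followed by seven ones.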



* [PATCH v3 10/19] target/riscv: rvv: Make vfncvtbf16.f.f.w support BF16 to OFP8 conversion for Zvfofp8min extension
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (8 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 09/19] target/riscv: rvv: Make vfwcvtbf16.f.f.v support OFP8 to BF16 conversion " Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04  5:17 ` [PATCH v3 11/19] target/riscv: rvv: Add vfncvtbf16.sat.f.f.w instruction " Max Chou
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou, Alistair Francis

According to the Zvfofp8min extension, the vfncvtbf16.f.f.w instruction
supports BF16 to OFP8 conversion without saturation when SEW is 8.
And the VTYPE.altfmt field is used to select the OFP8 format.
* altfmt = 0: BF16 to OFP8.e4m3
* altfmt = 1: BF16 to OFP8.e5m2

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/insn_trans/trans_rvbf16.c.inc | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvbf16.c.inc b/target/riscv/insn_trans/trans_rvbf16.c.inc
index 9aafd4d2ef..16f4403909 100644
--- a/target/riscv/insn_trans/trans_rvbf16.c.inc
+++ b/target/riscv/insn_trans/trans_rvbf16.c.inc
@@ -67,11 +67,20 @@ static bool trans_fcvt_s_bf16(DisasContext *ctx, arg_fcvt_s_bf16 *a)
 static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, arg_vfncvtbf16_f_f_w *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZVFBFMIN(ctx);
 
-    if (opfv_narrow_check(ctx, a) && (ctx->sew == MO_16)) {
+    if (opfv_narrow_check(ctx, a) &&
+        ((ctx->sew == MO_16 && ctx->cfg_ptr->ext_zvfbfmin) ||
+         (ctx->sew == MO_8 && ctx->cfg_ptr->ext_zvfofp8min))) {
+        gen_helper_gvec_3_ptr *fn;
         uint32_t data = 0;
 
+        if (ctx->sew == MO_16) {
+            fn = gen_helper_vfncvtbf16_f_f_w;
+        } else {
+            fn = ctx->altfmt ? gen_helper_vfncvtbf16_f_f_w_ofp8e5m2 :
+                               gen_helper_vfncvtbf16_f_f_w_ofp8e4m3;
+        }
+
         gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);
 
         data = FIELD_DP32(data, VDATA, VM, a->vm);
@@ -81,8 +90,7 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, arg_vfncvtbf16_f_f_w *a)
         tcg_gen_gvec_3_ptr(vreg_ofs(ctx, a->rd), vreg_ofs(ctx, 0),
                            vreg_ofs(ctx, a->rs2), tcg_env,
                            ctx->cfg_ptr->vlenb,
-                           ctx->cfg_ptr->vlenb, data,
-                           gen_helper_vfncvtbf16_f_f_w);
+                           ctx->cfg_ptr->vlenb, data, fn);
         finalize_rvv_inst(ctx);
         return true;
     }
-- 
2.52.0




* [PATCH v3 11/19] target/riscv: rvv: Add vfncvtbf16.sat.f.f.w instruction for Zvfofp8min extension
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (9 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 10/19] target/riscv: rvv: Make vfncvtbf16.f.f.w support BF16 to OFP8 " Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04  5:17 ` [PATCH v3 12/19] target/riscv: rvv: Add vfncvt.f.f.q and vfncvt.sat.f.f.q instructions " Max Chou
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou, Alistair Francis

The vfncvtbf16.sat.f.f.w instruction converts a vector of 16-bit
floating-point numbers to a vector of 8-bit floating-point numbers with
saturation.
The VTYPE.altfmt field is used to select the format of the 8-bit floating-point
numbers.
* altfmt = 0: BF16 to OFP8.e4m3
* altfmt = 1: BF16 to OFP8.e5m2

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/insn32.decode                 |  3 ++
 target/riscv/insn_trans/trans_rvofp8.c.inc | 42 ++++++++++++++++++++++
 target/riscv/translate.c                   |  1 +
 3 files changed, 46 insertions(+)
 create mode 100644 target/riscv/insn_trans/trans_rvofp8.c.inc

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 6e35c4b1e6..49201c0c20 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -973,6 +973,9 @@ vfwcvtbf16_f_f_v  010010 . ..... 01101 001 ..... 1010111 @r2_vm
 vfwmaccbf16_vv    111011 . ..... ..... 001 ..... 1010111 @r_vm
 vfwmaccbf16_vf    111011 . ..... ..... 101 ..... 1010111 @r_vm
 
+# *** Zvfofp8min Extension ***
+vfncvtbf16_sat_f_f_w  010010 . ..... 11111 001 ..... 1010111 @r2_vm
+
 # *** Zvbc vector crypto extension ***
 vclmul_vv   001100 . ..... ..... 010 ..... 1010111 @r_vm
 vclmul_vx   001100 . ..... ..... 110 ..... 1010111 @r_vm
diff --git a/target/riscv/insn_trans/trans_rvofp8.c.inc b/target/riscv/insn_trans/trans_rvofp8.c.inc
new file mode 100644
index 0000000000..d28f92e050
--- /dev/null
+++ b/target/riscv/insn_trans/trans_rvofp8.c.inc
@@ -0,0 +1,42 @@
+/*
+ * RISC-V translation routines for the OFP8 Standard Extensions.
+ *
+ * Copyright (C) 2025 SiFive, Inc.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#define REQUIRE_ZVFOFP8MIN(ctx) do {        \
+    if (!ctx->cfg_ptr->ext_zvfofp8min) {    \
+        return false;                       \
+    }                                       \
+} while (0)
+
+
+static bool trans_vfncvtbf16_sat_f_f_w(DisasContext *ctx, arg_rmr *a)
+{
+    REQUIRE_FPU;
+    REQUIRE_ZVFOFP8MIN(ctx);
+
+    if (opfv_narrow_check(ctx, a) && ctx->sew == MO_8) {
+        gen_helper_gvec_3_ptr *fn;
+        uint32_t data = 0;
+
+        fn = ctx->altfmt ? gen_helper_vfncvtbf16_sat_f_f_w_ofp8e5m2 :
+                           gen_helper_vfncvtbf16_sat_f_f_w_ofp8e4m3;
+
+        gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);
+
+        data = FIELD_DP32(data, VDATA, VM, a->vm);
+        data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul);
+        data = FIELD_DP32(data, VDATA, VTA, ctx->vta);
+        data = FIELD_DP32(data, VDATA, VMA, ctx->vma);
+        tcg_gen_gvec_3_ptr(vreg_ofs(ctx, a->rd), vreg_ofs(ctx, 0),
+                           vreg_ofs(ctx, a->rs2), tcg_env,
+                           ctx->cfg_ptr->vlenb,
+                           ctx->cfg_ptr->vlenb, data, fn);
+        finalize_rvv_inst(ctx);
+        return true;
+    }
+    return false;
+}
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index a1c4b325e5..137022d7fb 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -1219,6 +1219,7 @@ static uint32_t opcode_at(DisasContextBase *dcbase, target_ulong pc)
 #include "insn_trans/trans_privileged.c.inc"
 #include "insn_trans/trans_svinval.c.inc"
 #include "insn_trans/trans_rvbf16.c.inc"
+#include "insn_trans/trans_rvofp8.c.inc"
 #include "decode-xthead.c.inc"
 #include "decode-xmips.c.inc"
 #include "insn_trans/trans_xthead.c.inc"
-- 
2.52.0
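
Note on the translator above: it packs the mask, LMUL, and tail/mask-agnostic settings into a single 32-bit descriptor with FIELD_DP32 before calling the helper. The following is a minimal standalone sketch of that bit-field deposit pattern. The names deposit_field and pack_vdata are hypothetical, and the bit positions are assumptions chosen for illustration; the authoritative VDATA layout is whatever QEMU's own FIELD() definitions declare.

```c
#include <stdint.h>

/* Deposit the low `length` bits of `field` into `value` at bit `start`.
 * Assumes 1 <= length <= 32 and start + length <= 32. */
static uint32_t deposit_field(uint32_t value, int start, int length,
                              uint32_t field)
{
    uint32_t mask = (~0u >> (32 - length)) << start;

    return (value & ~mask) | ((field << start) & mask);
}

/* Illustrative layout only (assumed, not taken from the patch). */
enum {
    VM_SHIFT   = 0, VM_LEN   = 1,
    LMUL_SHIFT = 1, LMUL_LEN = 3,   /* signed log2(LMUL), truncated to 3 bits */
    VTA_SHIFT  = 4, VTA_LEN  = 1,
    VMA_SHIFT  = 6, VMA_LEN  = 1,
};

static uint32_t pack_vdata(int vm, int lmul, int vta, int vma)
{
    uint32_t data = 0;

    data = deposit_field(data, VM_SHIFT, VM_LEN, (uint32_t)vm);
    data = deposit_field(data, LMUL_SHIFT, LMUL_LEN, (uint32_t)lmul);
    data = deposit_field(data, VTA_SHIFT, VTA_LEN, (uint32_t)vta);
    data = deposit_field(data, VMA_SHIFT, VMA_LEN, (uint32_t)vma);
    return data;
}
```

A negative log2(LMUL) such as -1 survives the packing because only its low three bits are kept; the consumer is expected to sign-extend the field when reading it back.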




* [PATCH v3 12/19] target/riscv: rvv: Add vfncvt.f.f.q and vfncvt.sat.f.f.q instructions for Zvfofp8min extension
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (10 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 11/19] target/riscv: rvv: Add vfncvtbf16.sat.f.f.w instruction " Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04  5:17 ` [PATCH v3 13/19] target/riscv: Expose Zvfofp8min properity Max Chou
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou, Alistair Francis

The vfncvt.f.f.q and vfncvt.sat.f.f.q instructions convert a vector of
FP32 elements to a vector of OFP8 elements. The vfncvt.sat.f.f.q
instruction additionally saturates the result.
The VTYPE.altfmt field selects the OFP8 format:
* altfmt = 0: FP32 to OFP8.e4m3
* altfmt = 1: FP32 to OFP8.e5m2
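
To make the two target formats concrete, here is a minimal sketch of a decoder that turns one OFP8 byte back into a float, with a flag standing in for altfmt. It assumes the usual OCP exponent biases (7 for E4M3, 15 for E5M2) and the NaN/infinity rules summarised in the cover letter. The names ofp8_to_float and scale_pow2 are hypothetical and are not part of this series.

```c
#include <math.h>     /* INFINITY, NAN */
#include <stdbool.h>
#include <stdint.h>

/* Multiply by 2^e with repeated exact scaling (no libm call needed). */
static float scale_pow2(float x, int e)
{
    for (; e > 0; e--) {
        x *= 2.0f;
    }
    for (; e < 0; e++) {
        x *= 0.5f;
    }
    return x;
}

/* altfmt == false: E4M3 (bias 7); altfmt == true: E5M2 (bias 15). */
static float ofp8_to_float(uint8_t v, bool altfmt)
{
    int mant_bits = altfmt ? 2 : 3;
    int exp_bits = altfmt ? 5 : 4;
    int bias = altfmt ? 15 : 7;
    int sign = v >> 7;
    int exp = (v >> mant_bits) & ((1 << exp_bits) - 1);
    int mant = v & ((1 << mant_bits) - 1);
    float mag;

    if (altfmt && exp == 31) {
        /* E5M2 is IEEE-like: all-ones exponent is infinity or NaN. */
        mag = mant ? NAN : INFINITY;
    } else if (!altfmt && (v & 0x7f) == 0x7f) {
        /* E4M3 has no infinity; only 0x7f and 0xff are NaN. */
        mag = NAN;
    } else if (exp == 0) {
        /* Zero or subnormal: no implicit leading one. */
        mag = scale_pow2((float)mant, 1 - bias - mant_bits);
    } else {
        /* Normal number with an implicit leading one. */
        mag = scale_pow2((float)(mant + (1 << mant_bits)),
                         exp - bias - mant_bits);
    }
    return sign ? -mag : mag;
}
```

Under these assumptions the largest finite magnitudes are 448 for E4M3 (0x7e) and 57344 for E5M2 (0x7b), which are the values a saturating narrowing conversion would be expected to clamp to.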

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/insn32.decode                 |  2 +
 target/riscv/insn_trans/trans_rvofp8.c.inc | 63 ++++++++++++++++++++++
 target/riscv/insn_trans/trans_rvv.c.inc    | 39 ++++++++++++++
 3 files changed, 104 insertions(+)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 49201c0c20..f2b413c7d4 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -974,6 +974,8 @@ vfwmaccbf16_vv    111011 . ..... ..... 001 ..... 1010111 @r_vm
 vfwmaccbf16_vf    111011 . ..... ..... 101 ..... 1010111 @r_vm
 
 # *** Zvfofp8min Extension ***
+vfncvt_f_f_q          010010 . ..... 11001 001 ..... 1010111 @r2_vm
+vfncvt_sat_f_f_q      010010 . ..... 11011 001 ..... 1010111 @r2_vm
 vfncvtbf16_sat_f_f_w  010010 . ..... 11111 001 ..... 1010111 @r2_vm
 
 # *** Zvbc vector crypto extension ***
diff --git a/target/riscv/insn_trans/trans_rvofp8.c.inc b/target/riscv/insn_trans/trans_rvofp8.c.inc
index d28f92e050..619ee4d773 100644
--- a/target/riscv/insn_trans/trans_rvofp8.c.inc
+++ b/target/riscv/insn_trans/trans_rvofp8.c.inc
@@ -12,6 +12,13 @@
     }                                       \
 } while (0)
 
+static bool zvfofp8min_narrow_quad_check(DisasContext *s, arg_rmr *a)
+{
+    return require_rvv(s) &&
+           vext_check_isa_ill(s) &&
+           vext_check_sq(s, a->rd, a->rs2, a->vm) &&
+           (s->sew == MO_8);
+}
 
 static bool trans_vfncvtbf16_sat_f_f_w(DisasContext *ctx, arg_rmr *a)
 {
@@ -40,3 +47,59 @@ static bool trans_vfncvtbf16_sat_f_f_w(DisasContext *ctx, arg_rmr *a)
     }
     return false;
 }
+
+static bool trans_vfncvt_f_f_q(DisasContext *ctx, arg_rmr *a)
+{
+    REQUIRE_FPU;
+    REQUIRE_ZVFOFP8MIN(ctx);
+
+    if (zvfofp8min_narrow_quad_check(ctx, a)) {
+        gen_helper_gvec_3_ptr *fn;
+        uint32_t data = 0;
+
+        fn = ctx->altfmt ? gen_helper_vfncvt_f_f_q_ofp8e5m2 :
+                           gen_helper_vfncvt_f_f_q_ofp8e4m3;
+
+        gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);
+
+        data = FIELD_DP32(data, VDATA, VM, a->vm);
+        data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul);
+        data = FIELD_DP32(data, VDATA, VTA, ctx->vta);
+        data = FIELD_DP32(data, VDATA, VMA, ctx->vma);
+        tcg_gen_gvec_3_ptr(vreg_ofs(ctx, a->rd), vreg_ofs(ctx, 0),
+                           vreg_ofs(ctx, a->rs2), tcg_env,
+                           ctx->cfg_ptr->vlenb,
+                           ctx->cfg_ptr->vlenb, data, fn);
+        finalize_rvv_inst(ctx);
+        return true;
+    }
+    return false;
+}
+
+static bool trans_vfncvt_sat_f_f_q(DisasContext *ctx, arg_rmr *a)
+{
+    REQUIRE_FPU;
+    REQUIRE_ZVFOFP8MIN(ctx);
+
+    if (zvfofp8min_narrow_quad_check(ctx, a)) {
+        gen_helper_gvec_3_ptr *fn;
+        uint32_t data = 0;
+
+        fn = ctx->altfmt ? gen_helper_vfncvt_sat_f_f_q_ofp8e5m2 :
+                           gen_helper_vfncvt_sat_f_f_q_ofp8e4m3;
+
+        gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);
+
+        data = FIELD_DP32(data, VDATA, VM, a->vm);
+        data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul);
+        data = FIELD_DP32(data, VDATA, VTA, ctx->vta);
+        data = FIELD_DP32(data, VDATA, VMA, ctx->vma);
+        tcg_gen_gvec_3_ptr(vreg_ofs(ctx, a->rd), vreg_ofs(ctx, 0),
+                           vreg_ofs(ctx, a->rs2), tcg_env,
+                           ctx->cfg_ptr->vlenb,
+                           ctx->cfg_ptr->vlenb, data, fn);
+        finalize_rvv_inst(ctx);
+        return true;
+    }
+    return false;
+}
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index bcd45b0aa3..9053b9fb57 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -621,6 +621,45 @@ static bool vext_check_sds(DisasContext *s, int vd, int vs1, int vs2, int vm)
            require_align(vs1, s->lmul);
 }
 
+/*
+ * Common check function for vector narrowing instructions
+ * of single-width result (SEW) and quad-width source (4*SEW).
+ *
+ * Rules to be checked here:
+ *   1. The largest vector register group used by an instruction
+ *      can not be greater than 8 vector registers
+ *      (Section 31.5.2)
+ *   2. Quad-width SEW cannot be greater than ELEN.
+ *      (Section 31.2)
+ *   3. Source vector register number is a multiple of 4 * LMUL.
+ *      (Section 31.3.4.2)
+ *   4. Destination vector register number is a multiple of LMUL.
+ *      (Section 31.3.4.2)
+ *   5. Destination vector register group for a masked vector
+ *      instruction cannot overlap the source mask register (v0).
+ *      (Section 31.5.3)
+ * risc-v unprivileged spec
+ */
+static bool vext_quad_narrow_check_common(DisasContext *s, int vd, int vs2,
+                                          int vm)
+{
+    return (s->lmul <= 1) &&
+           (s->sew < MO_32) &&
+           ((s->sew + 2) <= (s->cfg_ptr->elen >> 4)) &&
+           require_align(vs2, s->lmul + 2) &&
+           require_align(vd, s->lmul) &&
+           require_vm(vm, vd);
+}
+
+static bool vext_check_sq(DisasContext *s, int vd, int vs, int vm)
+{
+    bool ret = vext_quad_narrow_check_common(s, vd, vs, vm);
+    if (vd != vs) {
+        ret &= require_noover(vd, s->lmul, vs, s->lmul + 2);
+    }
+    return ret;
+}
+
 /*
  * Check function for vector reduction instructions.
  *
-- 
2.52.0
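
Note on the quad-narrowing check: the five rules listed in the comment of vext_quad_narrow_check_common, plus the overlap test in vext_check_sq, can be sketched as a standalone predicate. This is a simplified illustration under stated assumptions, not the QEMU implementation: sew_log2 is log2 of the element size in bytes (0 for 8-bit), lmul_log2 is the signed log2 of LMUL, and the names quad_narrow_ok, aligned, group_size, and overlaps are hypothetical.

```c
#include <stdbool.h>

/* A register group of 2^emul_log2 registers must start on a multiple
 * of its size; fractional groups occupy one register. */
static bool aligned(int reg, int emul_log2)
{
    return emul_log2 <= 0 || (reg & ((1 << emul_log2) - 1)) == 0;
}

static int group_size(int emul_log2)
{
    return emul_log2 <= 0 ? 1 : 1 << emul_log2;
}

static bool overlaps(int a, int a_n, int b, int b_n)
{
    return a < b + b_n && b < a + a_n;
}

static bool quad_narrow_ok(int sew_log2, int lmul_log2, int elen_bits,
                           int vd, int vs2, bool unmasked)
{
    int src_emul = lmul_log2 + 2;            /* source EMUL = 4 * LMUL */

    if (src_emul > 3) {
        return false;                        /* rule 1: at most 8 registers */
    }
    if ((32 << sew_log2) > elen_bits) {
        return false;                        /* rule 2: 4 * SEW <= ELEN */
    }
    if (!aligned(vs2, src_emul)) {
        return false;                        /* rule 3: source alignment */
    }
    if (!aligned(vd, lmul_log2)) {
        return false;                        /* rule 4: destination alignment */
    }
    if (!unmasked && vd == 0) {
        return false;                        /* rule 5: masked op cannot write v0 */
    }
    /* Destination may coincide with the start of the source group,
     * otherwise the two groups must not overlap. */
    if (vd != vs2 &&
        overlaps(vd, group_size(lmul_log2), vs2, group_size(src_emul))) {
        return false;
    }
    return true;
}
```

For example, with SEW=8, LMUL=1 and ELEN=32, a source group starting at v8 spans v8..v11, so vd=v4 or vd=v8 is accepted while vd=v9 is rejected.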




* [PATCH v3 13/19] target/riscv: Expose Zvfofp8min properity
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (11 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 12/19] target/riscv: rvv: Add vfncvt.f.f.q and vfncvt.sat.f.f.q instructions " Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04  5:17 ` [PATCH v3 14/19] disas/riscv: Add support of Zvfofp8min extension Max Chou
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou, Alistair Francis

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/cpu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 36fddce5bf..c8cb3d021d 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1381,6 +1381,9 @@ const RISCVCPUMultiExtConfig riscv_cpu_experimental_exts[] = {
     MULTI_EXT_CFG_BOOL("x-svukte", ext_svukte, false),
     MULTI_EXT_CFG_BOOL("x-zvfbfa", ext_zvfbfa, false),
 
+    /* Zvfofp8min extension for OFP8 conversion */
+    MULTI_EXT_CFG_BOOL("x-zvfofp8min", ext_zvfofp8min, false),
+
     { },
 };
 
-- 
2.52.0




* [PATCH v3 14/19] disas/riscv: Add support of Zvfofp8min extension
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (12 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 13/19] target/riscv: Expose Zvfofp8min properity Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04 16:35   ` Chao Liu
  2026-02-04  5:17 ` [PATCH v3 15/19] target/riscv: Add cfg properity for Zvfofp4min extension Max Chou
                   ` (5 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou

This patch adds support to disassemble Zvfofp8min instructions.

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 disas/riscv.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/disas/riscv.c b/disas/riscv.c
index 85cd2a9c2a..daffe9917f 100644
--- a/disas/riscv.c
+++ b/disas/riscv.c
@@ -984,6 +984,9 @@ typedef enum {
     rv_op_ssamoswap_d = 953,
     rv_op_c_sspush = 954,
     rv_op_c_sspopchk = 955,
+    rv_op_vfncvtbf16_sat_f_f_w = 956,
+    rv_op_vfncvt_f_f_q = 957,
+    rv_op_vfncvt_sat_f_f_q = 958,
 } rv_op;
 
 /* register names */
@@ -2254,6 +2257,9 @@ const rv_opcode_data rvi_opcode_data[] = {
       rv_op_sspush, 0 },
     { "c.sspopchk", rv_codec_cmop_ss, rv_fmt_rs1, NULL, rv_op_sspopchk,
       rv_op_sspopchk, 0 },
+    { "vfncvtbf16.sat.f.f.w", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
+    { "vfncvt.f.f.q", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
+    { "vfncvt.sat.f.f.q", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
 };
 
 /* CSR names */
@@ -3630,7 +3636,10 @@ static void decode_inst_opcode(rv_decode *dec, rv_isa isa)
                     case 21: op = rv_op_vfncvt_rod_f_f_w; break;
                     case 22: op = rv_op_vfncvt_rtz_xu_f_w; break;
                     case 23: op = rv_op_vfncvt_rtz_x_f_w; break;
+                    case 25: op = rv_op_vfncvt_f_f_q; break;
+                    case 27: op = rv_op_vfncvt_sat_f_f_q; break;
                     case 29: op = rv_op_vfncvtbf16_f_f_w; break;
+                    case 31: op = rv_op_vfncvtbf16_sat_f_f_w; break;
                     }
                     break;
                 case 19:
-- 
2.52.0
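
Note on the decoding above: the new cases in decode_inst_opcode select on the 5-bit vs1 field (instruction bits 19:15), matching the 11001, 11011, and 11111 patterns added to insn32.decode under funct6 010010, funct3 001, and opcode 1010111. A minimal standalone sketch of that field extraction follows. The enum, encode_vfunary0, and decode_narrow_op are hypothetical names used only for illustration.

```c
#include <stdint.h>

typedef enum {
    OP_UNKNOWN,
    OP_VFNCVT_F_F_Q,
    OP_VFNCVT_SAT_F_F_Q,
    OP_VFNCVTBF16_F_F_W,
    OP_VFNCVTBF16_SAT_F_F_W,
} narrow_op;

/* Build an instruction word: funct6=010010, funct3=001, opcode=1010111. */
static uint32_t encode_vfunary0(unsigned vs1, unsigned vs2, unsigned vd,
                                unsigned vm)
{
    return (0x12u << 26) | ((vm & 1u) << 25) | ((vs2 & 31u) << 20) |
           ((vs1 & 31u) << 15) | (1u << 12) | ((vd & 31u) << 7) | 0x57u;
}

static narrow_op decode_narrow_op(uint32_t insn)
{
    if ((insn & 0x7f) != 0x57 ||          /* vector opcode */
        ((insn >> 12) & 7) != 1 ||        /* funct3 = 001 */
        (insn >> 26) != 0x12) {           /* funct6 = 010010 */
        return OP_UNKNOWN;
    }
    switch ((insn >> 15) & 0x1f) {        /* vs1 field selects the operation */
    case 25:
        return OP_VFNCVT_F_F_Q;
    case 27:
        return OP_VFNCVT_SAT_F_F_Q;
    case 29:
        return OP_VFNCVTBF16_F_F_W;
    case 31:
        return OP_VFNCVTBF16_SAT_F_F_W;
    default:
        return OP_UNKNOWN;
    }
}
```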




* Re: [PATCH v3 14/19] disas/riscv: Add support of Zvfofp8min extension
  2026-02-04  5:17 ` [PATCH v3 14/19] disas/riscv: Add support of Zvfofp8min extension Max Chou
@ 2026-02-04 16:35   ` Chao Liu
  0 siblings, 0 replies; 36+ messages in thread
From: Chao Liu @ 2026-02-04 16:35 UTC (permalink / raw)
  To: Max Chou
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Aurelien Jarno, Peter Maydell, Alex Bennée, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On Wed, Feb 04, 2026 at 01:17:50PM +0800, Max Chou wrote:
> This patch adds support to disassemble Zvfofp8min instructions.
> 
> Signed-off-by: Max Chou <max.chou@sifive.com>
> ---
>  disas/riscv.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/disas/riscv.c b/disas/riscv.c
> index 85cd2a9c2a..daffe9917f 100644
> --- a/disas/riscv.c
> +++ b/disas/riscv.c
> @@ -984,6 +984,9 @@ typedef enum {
>      rv_op_ssamoswap_d = 953,
>      rv_op_c_sspush = 954,
>      rv_op_c_sspopchk = 955,
> +    rv_op_vfncvtbf16_sat_f_f_w = 956,
> +    rv_op_vfncvt_f_f_q = 957,
> +    rv_op_vfncvt_sat_f_f_q = 958,
>  } rv_op;
>  
>  /* register names */
> @@ -2254,6 +2257,9 @@ const rv_opcode_data rvi_opcode_data[] = {
>        rv_op_sspush, 0 },
>      { "c.sspopchk", rv_codec_cmop_ss, rv_fmt_rs1, NULL, rv_op_sspopchk,
>        rv_op_sspopchk, 0 },
> +    { "vfncvtbf16.sat.f.f.w", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
> +    { "vfncvt.f.f.q", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
> +    { "vfncvt.sat.f.f.q", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
>  };
>  
>  /* CSR names */
> @@ -3630,7 +3636,10 @@ static void decode_inst_opcode(rv_decode *dec, rv_isa isa)
>                      case 21: op = rv_op_vfncvt_rod_f_f_w; break;
>                      case 22: op = rv_op_vfncvt_rtz_xu_f_w; break;
>                      case 23: op = rv_op_vfncvt_rtz_x_f_w; break;
> +                    case 25: op = rv_op_vfncvt_f_f_q; break;
> +                    case 27: op = rv_op_vfncvt_sat_f_f_q; break;
>                      case 29: op = rv_op_vfncvtbf16_f_f_w; break;
> +                    case 31: op = rv_op_vfncvtbf16_sat_f_f_w; break;
checkpatch reports:
ERROR: trailing statements should be on next line

The QEMU coding style requires that the statement after 'case' should be
on a new line. Please reformat as:

    case 25:
        op = rv_op_vfncvt_f_f_q;
        break;
    case 27:
        op = rv_op_vfncvt_sat_f_f_q;
        break;
    case 31:
        op = rv_op_vfncvtbf16_sat_f_f_w;
        break;

Thanks,
Chao

>                      }
>                      break;
>                  case 19:
> -- 
> 2.52.0
> 
> 



* [PATCH v3 15/19] target/riscv: Add cfg properity for Zvfofp4min extension
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (13 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 14/19] disas/riscv: Add support of Zvfofp8min extension Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04  5:17 ` [PATCH v3 16/19] target/riscv: Add implied rules " Max Chou
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou, Alistair Francis

According to the ISA spec of Zvfofp4min extension,
"The Zvfofp4min extension requires on the Zve32f extension."

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/cpu.c                | 1 +
 target/riscv/cpu_cfg_fields.h.inc | 1 +
 target/riscv/tcg/tcg-cpu.c        | 5 +++++
 3 files changed, 7 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index c8cb3d021d..7823508615 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -194,6 +194,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
     ISA_EXT_DATA_ENTRY(zvfbfwma, PRIV_VERSION_1_12_0, ext_zvfbfwma),
     ISA_EXT_DATA_ENTRY(zvfh, PRIV_VERSION_1_12_0, ext_zvfh),
     ISA_EXT_DATA_ENTRY(zvfhmin, PRIV_VERSION_1_12_0, ext_zvfhmin),
+    ISA_EXT_DATA_ENTRY(zvfofp4min, PRIV_VERSION_1_12_0, ext_zvfofp4min),
     ISA_EXT_DATA_ENTRY(zvfofp8min, PRIV_VERSION_1_12_0, ext_zvfofp8min),
     ISA_EXT_DATA_ENTRY(zvkb, PRIV_VERSION_1_12_0, ext_zvkb),
     ISA_EXT_DATA_ENTRY(zvkg, PRIV_VERSION_1_12_0, ext_zvkg),
diff --git a/target/riscv/cpu_cfg_fields.h.inc b/target/riscv/cpu_cfg_fields.h.inc
index 59302894af..353a932c36 100644
--- a/target/riscv/cpu_cfg_fields.h.inc
+++ b/target/riscv/cpu_cfg_fields.h.inc
@@ -104,6 +104,7 @@ BOOL_FIELD(ext_zvfbfmin)
 BOOL_FIELD(ext_zvfbfwma)
 BOOL_FIELD(ext_zvfh)
 BOOL_FIELD(ext_zvfhmin)
+BOOL_FIELD(ext_zvfofp4min)
 BOOL_FIELD(ext_zvfofp8min)
 BOOL_FIELD(ext_smaia)
 BOOL_FIELD(ext_ssaia)
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index ba89436f13..b1097e55a3 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -715,6 +715,11 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
         return;
     }
 
+    if (cpu->cfg.ext_zvfofp4min && !cpu->cfg.ext_zve32f) {
+        error_setg(errp, "Zvfofp4min extension depends on Zve32f extension");
+        return;
+    }
+
     if (cpu->cfg.ext_zvfh && !cpu->cfg.ext_zfhmin) {
         error_setg(errp, "Zvfh extensions requires Zfhmin extension");
         return;
-- 
2.52.0




* [PATCH v3 16/19] target/riscv: Add implied rules for Zvfofp4min extension
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (14 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 15/19] target/riscv: Add cfg properity for Zvfofp4min extension Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04  5:17 ` [PATCH v3 17/19] target/riscv: rvv: Add vfext.vf2 instruction " Max Chou
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou, Alistair Francis

Add implied rules to enable the implied extensions of Zvfofp4min
extension recursively.

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/cpu.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 7823508615..87827441d3 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -2520,6 +2520,15 @@ static RISCVCPUImpliedExtsRule ZVFOFP8MIN_IMPLIED = {
     },
 };
 
+static RISCVCPUImpliedExtsRule ZVFOFP4MIN_IMPLIED = {
+    .ext = CPU_CFG_OFFSET(ext_zvfofp4min),
+    .implied_multi_exts = {
+        CPU_CFG_OFFSET(ext_zve32f),
+
+        RISCV_IMPLIED_EXTS_RULE_END
+    },
+};
+
 static RISCVCPUImpliedExtsRule ZVKN_IMPLIED = {
     .ext = CPU_CFG_OFFSET(ext_zvkn),
     .implied_multi_exts = {
@@ -2648,9 +2657,10 @@ RISCVCPUImpliedExtsRule *riscv_multi_ext_implied_rules[] = {
     &ZKS_IMPLIED, &ZVBB_IMPLIED, &ZVE32F_IMPLIED,
     &ZVE32X_IMPLIED, &ZVE64D_IMPLIED, &ZVE64F_IMPLIED,
     &ZVE64X_IMPLIED, &ZVFBFMIN_IMPLIED, &ZVFBFWMA_IMPLIED,
-    &ZVFH_IMPLIED, &ZVFHMIN_IMPLIED, &ZVFOFP8MIN_IMPLIED,
-    &ZVKN_IMPLIED, &ZVKNC_IMPLIED, &ZVKNG_IMPLIED, &ZVKNHB_IMPLIED,
-    &ZVKS_IMPLIED,  &ZVKSC_IMPLIED, &ZVKSG_IMPLIED, &SSCFG_IMPLIED,
+    &ZVFH_IMPLIED, &ZVFHMIN_IMPLIED, &ZVFOFP4MIN_IMPLIED,
+    &ZVFOFP8MIN_IMPLIED, &ZVKN_IMPLIED, &ZVKNC_IMPLIED,
+    &ZVKNG_IMPLIED, &ZVKNHB_IMPLIED, &ZVKS_IMPLIED,
+    &ZVKSC_IMPLIED, &ZVKSG_IMPLIED, &SSCFG_IMPLIED,
     &SUPM_IMPLIED, &SSPM_IMPLIED, &SMCTR_IMPLIED, &SSCTR_IMPLIED,
     NULL
 };
-- 
2.52.0
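
Note on implied rules: the idea of enabling implied extensions recursively can be sketched as a small table walk. This is an illustration under assumptions, not QEMU's data structures: the extension enum, the rules table, and enable_with_implied are hypothetical, and the second rule (Zve32f implying F and Zve32x) is included only to show the recursion.

```c
#include <stdbool.h>
#include <stddef.h>

enum {
    EXT_END = -1,          /* terminator for an implied list */
    EXT_F,
    EXT_ZVE32X,
    EXT_ZVE32F,
    EXT_ZVFOFP4MIN,
    EXT_COUNT,
};

struct implied_rule {
    int ext;               /* extension that triggers the rule */
    int implied[4];        /* EXT_END-terminated list of implied extensions */
};

static const struct implied_rule rules[] = {
    { EXT_ZVFOFP4MIN, { EXT_ZVE32F, EXT_END } },
    { EXT_ZVE32F,     { EXT_F, EXT_ZVE32X, EXT_END } },  /* illustrative */
};

/* Enable `ext` and, transitively, everything its rules imply. The
 * recursion terminates because each call enables a new extension. */
static void enable_with_implied(bool *enabled, int ext)
{
    enabled[ext] = true;
    for (size_t r = 0; r < sizeof(rules) / sizeof(rules[0]); r++) {
        if (rules[r].ext != ext) {
            continue;
        }
        for (size_t k = 0; rules[r].implied[k] != EXT_END; k++) {
            int dep = rules[r].implied[k];

            if (!enabled[dep]) {
                enable_with_implied(enabled, dep);
            }
        }
    }
}
```

With a rule like this in place, enabling Zvfofp4min alone also turns on Zve32f, so a dependency check of the form "Zvfofp4min set but Zve32f clear" would not fire for that configuration.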




* [PATCH v3 17/19] target/riscv: rvv: Add vfext.vf2 instruction for Zvfofp4min extension
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (15 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 16/19] target/riscv: Add implied rules " Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-05  4:06   ` Chao Liu
  2026-02-04  5:17 ` [PATCH v3 18/19] target/riscv: Expose Zvfofp4min properity Max Chou
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou, Alistair Francis

The vfext.vf2 instruction converts a vector of OCP FP4 E2M1
floating-point numbers to a vector of OCP FP8 E4M3 floating-point
numbers.

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/helper.h                      |  3 ++
 target/riscv/insn32.decode                 |  3 ++
 target/riscv/insn_trans/trans_rvofp4.c.inc | 43 ++++++++++++++++++++++
 target/riscv/translate.c                   |  1 +
 target/riscv/vector_helper.c               | 33 +++++++++++++++++
 5 files changed, 83 insertions(+)
 create mode 100644 target/riscv/insn_trans/trans_rvofp4.c.inc

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 356c24d9fb..162303fb6c 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1259,6 +1259,9 @@ DEF_HELPER_5(vfncvt_f_f_q_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfncvt_sat_f_f_q_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
 DEF_HELPER_5(vfncvt_sat_f_f_q_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
 
+/* OFP4 function */
+DEF_HELPER_5(vfext_vf2, void, ptr, ptr, ptr, env, i32)
+
 /* Vector crypto functions */
 DEF_HELPER_6(vclmul_vv, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vclmul_vx, void, ptr, ptr, tl, ptr, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index f2b413c7d4..c58223ebd8 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -754,6 +754,9 @@ vsext_vf2       010010 . ..... 00111 010 ..... 1010111 @r2_vm
 vsext_vf4       010010 . ..... 00101 010 ..... 1010111 @r2_vm
 vsext_vf8       010010 . ..... 00011 010 ..... 1010111 @r2_vm
 
+# Zvfofp4min Extension
+vfext_vf2       010010 . ..... 10110 010 ..... 1010111 @r2_vm
+
 vsetvli         0 ........... ..... 111 ..... 1010111  @r2_zimm11
 vsetivli        11 .......... ..... 111 ..... 1010111  @r2_zimm10
 vsetvl          1000000 ..... ..... 111 ..... 1010111  @r
diff --git a/target/riscv/insn_trans/trans_rvofp4.c.inc b/target/riscv/insn_trans/trans_rvofp4.c.inc
new file mode 100644
index 0000000000..0fb5d7d534
--- /dev/null
+++ b/target/riscv/insn_trans/trans_rvofp4.c.inc
@@ -0,0 +1,43 @@
+/*
+ * RISC-V translation routines for the OFP4 Standard Extensions.
+ *
+ * Copyright (C) 2025 SiFive, Inc.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+static bool vext_zvfofp4min_check(DisasContext *s, arg_rmr *a)
+{
+    return s->cfg_ptr->ext_zvfofp4min &&
+           (s->sew == MO_8) &&
+           vext_check_altfmt(s, -1) &&
+           (s->lmul >= -2) &&
+           require_rvv(s) &&
+           vext_check_isa_ill(s) &&
+           (a->rd != a->rs2) &&
+           require_align(a->rd, s->lmul) &&
+           require_align(a->rs2, s->lmul - 1) &&
+           require_vm(a->vm, a->rd) &&
+           require_noover(a->rd, s->lmul, a->rs2, s->lmul - 1);
+}
+
+static bool trans_vfext_vf2(DisasContext *s, arg_rmr *a)
+{
+    if (vext_zvfofp4min_check(s, a)) {
+        uint32_t data = 0;
+
+        data = FIELD_DP32(data, VDATA, VM, a->vm);
+        data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
+        data = FIELD_DP32(data, VDATA, VTA, s->vta);
+        data = FIELD_DP32(data, VDATA, VMA, s->vma);
+        tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),
+                           vreg_ofs(s, a->rs2), tcg_env,
+                           s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data,
+                           gen_helper_vfext_vf2);
+        tcg_gen_movi_tl(cpu_vstart, 0);
+        finalize_rvv_inst(s);
+
+        return true;
+    }
+    return false;
+}
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 137022d7fb..bf403785b5 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -1220,6 +1220,7 @@ static uint32_t opcode_at(DisasContextBase *dcbase, target_ulong pc)
 #include "insn_trans/trans_svinval.c.inc"
 #include "insn_trans/trans_rvbf16.c.inc"
 #include "insn_trans/trans_rvofp8.c.inc"
+#include "insn_trans/trans_rvofp4.c.inc"
 #include "decode-xthead.c.inc"
 #include "decode-xmips.c.inc"
 #include "insn_trans/trans_xthead.c.inc"
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 418212973d..a87728f130 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -5121,6 +5121,7 @@ RVVCALL(OPFVV1, vfncvt_sat_f_f_q_ofp8e5m2, QOP_UU_B, H1, H4,
 GEN_VEXT_V_ENV(vfncvt_sat_f_f_q_ofp8e4m3, 1)
 GEN_VEXT_V_ENV(vfncvt_sat_f_f_q_ofp8e5m2, 1)
 
+/* Zvfofp4min: vfext.vf2 - OFP4 E2M1 to OFP8 E4M3 conversion */
 /*
  * Vector Reduction Operations
  */
@@ -5920,3 +5921,35 @@ GEN_VEXT_INT_EXT(vsext_vf2_d, int64_t, int32_t, H8, H4)
 GEN_VEXT_INT_EXT(vsext_vf4_w, int32_t, int8_t,  H4, H1)
 GEN_VEXT_INT_EXT(vsext_vf4_d, int64_t, int16_t, H8, H2)
 GEN_VEXT_INT_EXT(vsext_vf8_d, int64_t, int8_t,  H8, H1)
+
+
+void HELPER(vfext_vf2)(void *vd, void *v0, void *vs2, CPURISCVState *env,
+                       uint32_t desc)
+{
+    float_status fp_status = env->fp_status;
+    uint32_t vl = env->vl;
+    uint32_t vm = vext_vm(desc);
+    uint32_t esz = sizeof(uint8_t);
+    uint32_t total_elems = vext_get_total_elems(env, desc, esz);
+    uint32_t vta = vext_vta(desc);
+    uint32_t vma = vext_vma(desc);
+    uint32_t i;
+
+    VSTART_CHECK_EARLY_EXIT(env, vl);
+
+    for (i = env->vstart; i < vl; ++i) {
+        if (!vm && !vext_elem_mask(v0, i)) {
+            /* set masked-off elements to 1s */
+            vext_set_elems_1s(vd, vma, i * esz, (i + 1) * esz);
+            continue;
+        }
+
+        uint8_t input = *((uint8_t *)vs2 + H1((i % 2 ? i - 1 : i) / 2));
+        input = (i % 2) ? ((input >> 4) & 0xf) : (input & 0xf);
+        *((uint8_t *)vd + H1(i)) = float4_e2m1_to_float8_e4m3(input,
+                                                              &fp_status);
+    }
+    env->vstart = 0;
+    /* set tail elements to 1s */
+    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);
+}
-- 
2.52.0
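
Note on the helper above: it reads two 4-bit E2M1 elements per source byte (even element in the low nibble, odd element in the high nibble) and widens each to an E4M3 byte. The sketch below shows the same unpacking with a direct bit-level conversion. It assumes an E2M1 bias of 1 and an E4M3 bias of 7, ignores masking, tail handling, and host-endian index fixups, and uses the hypothetical names e2m1_to_e4m3 and fp4_unpack_to_e4m3; the series itself calls float4_e2m1_to_float8_e4m3 instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Widen one E2M1 nibble (s eem) to an E4M3 byte (s eeee mmm).
 * Every E2M1 value is exactly representable in E4M3. */
static uint8_t e2m1_to_e4m3(uint8_t nibble)
{
    uint8_t sign = (uint8_t)((nibble & 0x8) << 4);
    uint8_t exp = (nibble >> 1) & 0x3;
    uint8_t mant = nibble & 0x1;

    if (exp == 0) {
        /* Zero, or the single subnormal 0.5 = 2^-1 (E4M3 exponent field 6). */
        return (uint8_t)(sign | (mant ? 0x30 : 0x00));
    }
    /* Rebias the exponent (e - 1 + 7) and left-align the mantissa bit. */
    return (uint8_t)(sign | ((exp + 6) << 3) | (mant << 2));
}

/* Unpack n E2M1 elements from packed bytes into n E4M3 bytes. */
static void fp4_unpack_to_e4m3(const uint8_t *src, uint8_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t byte = src[i / 2];
        uint8_t nibble = (i & 1) ? (uint8_t)(byte >> 4) : (byte & 0xf);

        dst[i] = e2m1_to_e4m3(nibble);
    }
}
```

For an unsigned index, i / 2 already selects the same byte for both elements of a pair, so the source index expression in the helper could be written that way as well.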




* Re: [PATCH v3 17/19] target/riscv: rvv: Add vfext.vf2 instruction for Zvfofp4min extension
  2026-02-04  5:17 ` [PATCH v3 17/19] target/riscv: rvv: Add vfext.vf2 instruction " Max Chou
@ 2026-02-05  4:06   ` Chao Liu
  0 siblings, 0 replies; 36+ messages in thread
From: Chao Liu @ 2026-02-05  4:06 UTC (permalink / raw)
  To: Max Chou
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Aurelien Jarno, Peter Maydell, Alex Bennée, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On Wed, Feb 04, 2026 at 01:17:53PM +0800, Max Chou wrote:
> The vfext.vf2 instruction converts a vector of OCP FP4 E2M1
> floating-point numbers to a vector of OFP FP8 E4M3 floating-points
> numbers.
> 
> Signed-off-by: Max Chou <max.chou@sifive.com>
> ---
>  target/riscv/helper.h                      |  3 ++
>  target/riscv/insn32.decode                 |  3 ++
>  target/riscv/insn_trans/trans_rvofp4.c.inc | 43 ++++++++++++++++++++++
>  target/riscv/translate.c                   |  1 +
>  target/riscv/vector_helper.c               | 33 +++++++++++++++++
>  5 files changed, 83 insertions(+)
>  create mode 100644 target/riscv/insn_trans/trans_rvofp4.c.inc
> 
> diff --git a/target/riscv/helper.h b/target/riscv/helper.h
> index 356c24d9fb..162303fb6c 100644
> --- a/target/riscv/helper.h
> +++ b/target/riscv/helper.h
> @@ -1259,6 +1259,9 @@ DEF_HELPER_5(vfncvt_f_f_q_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
>  DEF_HELPER_5(vfncvt_sat_f_f_q_ofp8e4m3, void, ptr, ptr, ptr, env, i32)
>  DEF_HELPER_5(vfncvt_sat_f_f_q_ofp8e5m2, void, ptr, ptr, ptr, env, i32)
>  
> +/* OFP4 function */
> +DEF_HELPER_5(vfext_vf2, void, ptr, ptr, ptr, env, i32)
> +
>  /* Vector crypto functions */
>  DEF_HELPER_6(vclmul_vv, void, ptr, ptr, ptr, ptr, env, i32)
>  DEF_HELPER_6(vclmul_vx, void, ptr, ptr, tl, ptr, env, i32)
> diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
> index f2b413c7d4..c58223ebd8 100644
> --- a/target/riscv/insn32.decode
> +++ b/target/riscv/insn32.decode
> @@ -754,6 +754,9 @@ vsext_vf2       010010 . ..... 00111 010 ..... 1010111 @r2_vm
>  vsext_vf4       010010 . ..... 00101 010 ..... 1010111 @r2_vm
>  vsext_vf8       010010 . ..... 00011 010 ..... 1010111 @r2_vm
>  
> +# Zvfofp4min Extension
> +vfext_vf2       010010 . ..... 10110 010 ..... 1010111 @r2_vm
> +
>  vsetvli         0 ........... ..... 111 ..... 1010111  @r2_zimm11
>  vsetivli        11 .......... ..... 111 ..... 1010111  @r2_zimm10
>  vsetvl          1000000 ..... ..... 111 ..... 1010111  @r
> diff --git a/target/riscv/insn_trans/trans_rvofp4.c.inc b/target/riscv/insn_trans/trans_rvofp4.c.inc
> new file mode 100644
> index 0000000000..0fb5d7d534
> --- /dev/null
> +++ b/target/riscv/insn_trans/trans_rvofp4.c.inc
> @@ -0,0 +1,43 @@
> +/*
> + * RISC-V translation routines for the OFP4 Standard Extensions.
> + *
> + * Copyright (C) 2025 SiFive, Inc.
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +static bool vext_zvfofp4min_check(DisasContext *s, arg_rmr *a)
> +{
> +    return s->cfg_ptr->ext_zvfofp4min &&
> +           (s->sew == MO_8) &&
> +           vext_check_altfmt(s, -1) &&
> +           (s->lmul >= -2) &&
> +           require_rvv(s) &&
> +           vext_check_isa_ill(s) &&
> +           (a->rd != a->rs2) &&
> +           require_align(a->rd, s->lmul) &&
> +           require_align(a->rs2, s->lmul - 1) &&
> +           require_vm(a->vm, a->rd) &&
> +           require_noover(a->rd, s->lmul, a->rs2, s->lmul - 1);
> +}
> +
> +static bool trans_vfext_vf2(DisasContext *s, arg_rmr *a)
> +{
> +    if (vext_zvfofp4min_check(s, a)) {
> +        uint32_t data = 0;
> +
> +        data = FIELD_DP32(data, VDATA, VM, a->vm);
> +        data = FIELD_DP32(data, VDATA, LMUL, s->lmul);
> +        data = FIELD_DP32(data, VDATA, VTA, s->vta);
> +        data = FIELD_DP32(data, VDATA, VMA, s->vma);
> +        tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),
> +                           vreg_ofs(s, a->rs2), tcg_env,
> +                           s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data,
> +                           gen_helper_vfext_vf2);
> +        tcg_gen_movi_tl(cpu_vstart, 0);
> +        finalize_rvv_inst(s);
> +
> +        return true;
> +    }
> +    return false;
> +}
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index 137022d7fb..bf403785b5 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -1220,6 +1220,7 @@ static uint32_t opcode_at(DisasContextBase *dcbase, target_ulong pc)
>  #include "insn_trans/trans_svinval.c.inc"
>  #include "insn_trans/trans_rvbf16.c.inc"
>  #include "insn_trans/trans_rvofp8.c.inc"
> +#include "insn_trans/trans_rvofp4.c.inc"
>  #include "decode-xthead.c.inc"
>  #include "decode-xmips.c.inc"
>  #include "insn_trans/trans_xthead.c.inc"
> diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
> index 418212973d..a87728f130 100644
> --- a/target/riscv/vector_helper.c
> +++ b/target/riscv/vector_helper.c
> @@ -5121,6 +5121,7 @@ RVVCALL(OPFVV1, vfncvt_sat_f_f_q_ofp8e5m2, QOP_UU_B, H1, H4,
>  GEN_VEXT_V_ENV(vfncvt_sat_f_f_q_ofp8e4m3, 1)
>  GEN_VEXT_V_ENV(vfncvt_sat_f_f_q_ofp8e5m2, 1)
>  
> +/* Zvfofp4min: vfext.vf2 - OFP4 E2M1 to OFP8 E4M3 conversion */
>  /*
>   * Vector Reduction Operations
>   */
> @@ -5920,3 +5921,35 @@ GEN_VEXT_INT_EXT(vsext_vf2_d, int64_t, int32_t, H8, H4)
>  GEN_VEXT_INT_EXT(vsext_vf4_w, int32_t, int8_t,  H4, H1)
>  GEN_VEXT_INT_EXT(vsext_vf4_d, int64_t, int16_t, H8, H2)
>  GEN_VEXT_INT_EXT(vsext_vf8_d, int64_t, int8_t,  H8, H1)
> +
> +
> +void HELPER(vfext_vf2)(void *vd, void *v0, void *vs2, CPURISCVState *env,
> +                       uint32_t desc)
> +{
> +    float_status fp_status = env->fp_status;
> +    uint32_t vl = env->vl;
> +    uint32_t vm = vext_vm(desc);
> +    uint32_t esz = sizeof(uint8_t);
> +    uint32_t total_elems = vext_get_total_elems(env, desc, esz);
> +    uint32_t vta = vext_vta(desc);
> +    uint32_t vma = vext_vma(desc);
> +    uint32_t i;
> +
> +    VSTART_CHECK_EARLY_EXIT(env, vl);
> +
> +    for (i = env->vstart; i < vl; ++i) {
> +        if (!vm && !vext_elem_mask(v0, i)) {
> +            /* set masked-off elements to 1s */
> +            vext_set_elems_1s(vd, vma, i * esz, (i + 1) * esz);
> +            continue;
> +        }
> +
> +        uint8_t input = *((uint8_t *)vs2 + H1((i % 2 ? i - 1 : i) / 2));
> +        input = (i % 2) ? ((input >> 4) & 0xf) : (input & 0xf);
> +        *((uint8_t *)vd + H1(i)) = float4_e2m1_to_float8_e4m3(input,
> +                                                              &fp_status);
> +    }
> +    env->vstart = 0;
> +    /* set tail elements to 1s */
> +    vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);
> +}
> -- 
> 2.52.0
> 
> 

Reviewed-by: Chao Liu <chao.liu.zevorn@gmail.com>

Thanks,
Chao
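
For readers following the element indexing in the vfext.vf2 helper above:
two OFP4 E2M1 elements share one source byte, with the even element in the
low nibble and the odd element in the high nibble. The sketch below shows
that unpacking plus a bit-level E2M1 to E4M3 widening. It is only an
illustration, not the softfloat float4_e2m1_to_float8_e4m3() code from this
series. The names e2m1_get and e2m1_to_e4m3 are hypothetical, the H1()
host-endian adjustment is left out, and the E4M3 bias of 7 and E2M1 bias
of 1 are taken from the OCP format definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fetch 4-bit element i from a packed byte array. */
static uint8_t e2m1_get(const uint8_t *packed, size_t i)
{
    uint8_t byte = packed[i / 2];

    /* Odd elements live in the high nibble, even elements in the low one. */
    return (i % 2) ? (byte >> 4) & 0xf : byte & 0xf;
}

/* Hypothetical helper: widen one E2M1 nibble to an E4M3 byte. */
static uint8_t e2m1_to_e4m3(uint8_t n)
{
    assert(n < 16);

    uint8_t sign = (n >> 3) & 1;
    uint8_t exp = (n >> 1) & 3;     /* E2M1 exponent, bias 1 */
    uint8_t man = n & 1;
    uint8_t out = (uint8_t)(sign << 7);

    if (exp == 0) {
        /* Zero, or the only subnormal 0.5 = 2^-1: E4M3 biased exponent 6. */
        if (man) {
            out |= 6 << 3;
        }
    } else {
        /* Rebias (exp - 1) + 7 = exp + 6; mantissa bit becomes the MSB. */
        out |= (uint8_t)(((exp + 6) << 3) | (man << 2));
    }
    return out;
}
```

Every E2M1 value is exactly representable in E4M3, so this sketch needs no
rounding and raises no exception flags.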



* [PATCH v3 18/19] target/riscv: Expose Zvfofp4min properity
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (16 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 17/19] target/riscv: rvv: Add vfext.vf2 instruction " Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04  5:17 ` [PATCH v3 19/19] disas/riscv: Add support of Zvfofp4min extension Max Chou
  2026-02-04 16:59 ` [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Chao Liu
  19 siblings, 0 replies; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou, Alistair Francis

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 target/riscv/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 87827441d3..d2f25cd14b 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1384,6 +1384,8 @@ const RISCVCPUMultiExtConfig riscv_cpu_experimental_exts[] = {
 
     /* Zvfofp8min extension for OFP8 conversion */
     MULTI_EXT_CFG_BOOL("x-zvfofp8min", ext_zvfofp8min, false),
+    /* Zvfofp4min extension for OFP4 conversion */
+    MULTI_EXT_CFG_BOOL("x-zvfofp4min", ext_zvfofp4min, false),
 
     { },
 };
-- 
2.52.0




* [PATCH v3 19/19] disas/riscv: Add support of Zvfofp4min extension
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (17 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 18/19] target/riscv: Expose Zvfofp4min properity Max Chou
@ 2026-02-04  5:17 ` Max Chou
  2026-02-04 16:43   ` Chao Liu
  2026-02-04 16:59 ` [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Chao Liu
  19 siblings, 1 reply; 36+ messages in thread
From: Max Chou @ 2026-02-04  5:17 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Aurelien Jarno, Peter Maydell,
	Alex Bennée, Paolo Bonzini, Richard Henderson,
	Eduardo Habkost, Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
	Max Chou

This patch adds support to disassemble Zvfofp4min instructions.

Signed-off-by: Max Chou <max.chou@sifive.com>
---
 disas/riscv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/disas/riscv.c b/disas/riscv.c
index daffe9917f..9abf86f2d7 100644
--- a/disas/riscv.c
+++ b/disas/riscv.c
@@ -987,6 +987,7 @@ typedef enum {
     rv_op_vfncvtbf16_sat_f_f_w = 956,
     rv_op_vfncvt_f_f_q = 957,
     rv_op_vfncvt_sat_f_f_q = 958,
+    rv_op_vfext_vf2 = 959,
 } rv_op;
 
 /* register names */
@@ -2260,6 +2261,7 @@ const rv_opcode_data rvi_opcode_data[] = {
     { "vfncvtbf16.sat.f.f.w", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
     { "vfncvt.f.f.q", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
     { "vfncvt.sat.f.f.q", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
+    { "vfext.vf2", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
 };
 
 /* CSR names */
@@ -3715,6 +3717,7 @@ static void decode_inst_opcode(rv_decode *dec, rv_isa isa)
                     case 12: op = rv_op_vclz_v; break;
                     case 13: op = rv_op_vctz_v; break;
                     case 14: op = rv_op_vcpop_v; break;
+                    case 22: op = rv_op_vfext_vf2; break;
                     }
                     break;
                 case 20:
-- 
2.52.0
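
The three hunks above follow the usual pattern for teaching the
disassembler a new instruction: one enum value, one name-table row at the
same index, and one decode case. A reduced sketch of that pattern follows.
All mini_* names are hypothetical and this is not the disas/riscv.c code.
It assumes the switch selects on the 5-bit vs1 field (instruction bits
19:15), which is where the RVV unary encodings carry their sub-opcode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reduced opcode enum; the real rv_op has far more entries. */
typedef enum {
    MINI_OP_ILLEGAL,
    MINI_OP_VCLZ_V,
    MINI_OP_VCTZ_V,
    MINI_OP_VCPOP_V,
    MINI_OP_VFEXT_VF2,
} mini_op;

/* Name table indexed by the enum value, so both must stay in sync. */
static const char *const mini_op_names[] = {
    [MINI_OP_ILLEGAL]   = "illegal",
    [MINI_OP_VCLZ_V]    = "vclz.v",
    [MINI_OP_VCTZ_V]    = "vctz.v",
    [MINI_OP_VCPOP_V]   = "vcpop.v",
    [MINI_OP_VFEXT_VF2] = "vfext.vf2",
};

/* Select the unary operation from the vs1 field (bits 19:15). */
static mini_op mini_decode_unary(uint32_t inst)
{
    switch ((inst >> 15) & 0x1f) {
    case 12: return MINI_OP_VCLZ_V;
    case 13: return MINI_OP_VCTZ_V;
    case 14: return MINI_OP_VCPOP_V;
    case 22: return MINI_OP_VFEXT_VF2;
    default: return MINI_OP_ILLEGAL;
    }
}

/* Look up the mnemonic, checking that the table covers the enum value. */
static const char *mini_op_name(mini_op op)
{
    assert((size_t)op < sizeof(mini_op_names) / sizeof(mini_op_names[0]));
    return mini_op_names[op];
}
```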




* Re: [PATCH v3 19/19] disas/riscv: Add support of Zvfofp4min extension
  2026-02-04  5:17 ` [PATCH v3 19/19] disas/riscv: Add support of Zvfofp4min extension Max Chou
@ 2026-02-04 16:43   ` Chao Liu
  2026-02-04 16:46     ` Chao Liu
  0 siblings, 1 reply; 36+ messages in thread
From: Chao Liu @ 2026-02-04 16:43 UTC (permalink / raw)
  To: Max Chou
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Aurelien Jarno, Peter Maydell, Alex Bennée, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On Wed, Feb 04, 2026 at 01:17:55PM +0800, Max Chou wrote:
> This patch adds support to disassemble Zvfofp4min instructions.
> 
> Signed-off-by: Max Chou <max.chou@sifive.com>
> ---
>  disas/riscv.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/disas/riscv.c b/disas/riscv.c
> index daffe9917f..9abf86f2d7 100644
> --- a/disas/riscv.c
> +++ b/disas/riscv.c
> @@ -987,6 +987,7 @@ typedef enum {
>      rv_op_vfncvtbf16_sat_f_f_w = 956,
>      rv_op_vfncvt_f_f_q = 957,
>      rv_op_vfncvt_sat_f_f_q = 958,
> +    rv_op_vfext_vf2 = 959,
>  } rv_op;
>  
>  /* register names */
> @@ -2260,6 +2261,7 @@ const rv_opcode_data rvi_opcode_data[] = {
>      { "vfncvtbf16.sat.f.f.w", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
>      { "vfncvt.f.f.q", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
>      { "vfncvt.sat.f.f.q", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
> +    { "vfext.vf2", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
>  };
>  
>  /* CSR names */
> @@ -3715,6 +3717,7 @@ static void decode_inst_opcode(rv_decode *dec, rv_isa isa)
>                      case 12: op = rv_op_vclz_v; break;
>                      case 13: op = rv_op_vctz_v; break;
>                      case 14: op = rv_op_vcpop_v; break;
> +                    case 22: op = rv_op_vfext_vf2; break;
checkpatch reports:
ERROR: trailing statements should be on next line

But this "trailing statements should be on next line" error can be ignored
(Patch 9 too). The existing disas/riscv.c file consistently uses the
single-line format:
       case X: op = rv_op_xxx; break;

Maintaining consistency with the existing file style takes precedence here.

Otherwise, LGTM.

Reviewed-by: Chao Liu <chao.liu.zevorn@gmail.com>

Thanks,
Chao
>                      }
>                      break;
>                  case 20:
> -- 
> 2.52.0
> 
> 



* Re: [PATCH v3 19/19] disas/riscv: Add support of Zvfofp4min extension
  2026-02-04 16:43   ` Chao Liu
@ 2026-02-04 16:46     ` Chao Liu
  0 siblings, 0 replies; 36+ messages in thread
From: Chao Liu @ 2026-02-04 16:46 UTC (permalink / raw)
  To: Max Chou
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Aurelien Jarno, Peter Maydell, Alex Bennée, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

On Thu, Feb 05, 2026 at 12:44:02AM +0800, Chao Liu wrote:
> On Wed, Feb 04, 2026 at 01:17:55PM +0800, Max Chou wrote:
> > This patch adds support to disassemble Zvfofp4min instructions.
> > 
> > Signed-off-by: Max Chou <max.chou@sifive.com>
> > ---
> >  disas/riscv.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/disas/riscv.c b/disas/riscv.c
> > index daffe9917f..9abf86f2d7 100644
> > --- a/disas/riscv.c
> > +++ b/disas/riscv.c
> > @@ -987,6 +987,7 @@ typedef enum {
> >      rv_op_vfncvtbf16_sat_f_f_w = 956,
> >      rv_op_vfncvt_f_f_q = 957,
> >      rv_op_vfncvt_sat_f_f_q = 958,
> > +    rv_op_vfext_vf2 = 959,
> >  } rv_op;
> >  
> >  /* register names */
> > @@ -2260,6 +2261,7 @@ const rv_opcode_data rvi_opcode_data[] = {
> >      { "vfncvtbf16.sat.f.f.w", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
> >      { "vfncvt.f.f.q", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
> >      { "vfncvt.sat.f.f.q", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
> > +    { "vfext.vf2", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, 0, 0, 0 },
> >  };
> >  
> >  /* CSR names */
> > @@ -3715,6 +3717,7 @@ static void decode_inst_opcode(rv_decode *dec, rv_isa isa)
> >                      case 12: op = rv_op_vclz_v; break;
> >                      case 13: op = rv_op_vctz_v; break;
> >                      case 14: op = rv_op_vcpop_v; break;
> > +                    case 22: op = rv_op_vfext_vf2; break;
> checkpatch reports:
> ERROR: trailing statements should be on next line
> 
> But this "trailing statements should be on next line" error can be ignored (Patch 9 too). The existing disas/riscv.c file
Correction: This is similar to the case in Patch 14. We can keep the current code style.

Thanks,
Chao
> consistently uses the single-line format:
>        case X: op = rv_op_xxx; break;
> 
> Maintaining consistency with the existing file style takes precedence here.
> 
> Otherwise, LGTM.
> 
> Reviewed-by: Chao Liu <chao.liu.zevorn@gmail.com>
> 
> Thanks,
> Chao
> >                      }
> >                      break;
> >                  case 20:
> > -- 
> > 2.52.0
> > 
> > 



* Re: [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support
  2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
                   ` (18 preceding siblings ...)
  2026-02-04  5:17 ` [PATCH v3 19/19] disas/riscv: Add support of Zvfofp4min extension Max Chou
@ 2026-02-04 16:59 ` Chao Liu
  19 siblings, 0 replies; 36+ messages in thread
From: Chao Liu @ 2026-02-04 16:59 UTC (permalink / raw)
  To: Max Chou
  Cc: qemu-devel, qemu-riscv, Palmer Dabbelt, Alistair Francis,
	Aurelien Jarno, Peter Maydell, Alex Bennée, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, Weiwei Li,
	Daniel Henrique Barboza, Liu Zhiwei

Hi Max,

On Wed, Feb 04, 2026 at 01:17:36PM +0800, Max Chou wrote:
> This patchset adds support for the OCP (Open Compute Project) 8-bit and
> 4-bit floating-point formats, along with the RISC-V Zvfofp8min and
> Zvfofp4min vector extensions that provide conversion operations for
> these formats.
> 
> OCP Floating-Point Formats
> * The OCP FP8 specification defines two 8-bit floating-point formats:
>   - E4M3: 4-bit exponent, 3-bit mantissa
>     * No infinity representation; only 0x7f and 0xff are NaN
>   - E5M2: 5-bit exponent, 2-bit mantissa
>     * IEEE-like format with infinity representation
>     * Multiple NaN encodings supported
> * The OCP FP4 specification defines the E2M1 format:
>   - E2M1: 2-bit exponent, 1-bit mantissa
>     * No NaN representation
> 
> RISC-V ISA Extensions
> * Zvfofp8min (Version 0.2.1):
>   The Zvfofp8min extension provides minimal vector conversion support
>   for OFP8 formats. It requires the Zve32f extension and leverages the
>   altfmt field in the VTYPE CSR (introduced by Zvfbfa) to select between
>   E4M3 (altfmt=0) and E5M2 (altfmt=1) formats.
>   - Canonical NaN for both E4M3 and E5M2 is 0x7f
>   - All NaNs are treated as quiet NaNs
>   Instructions added/extended:
>   - vfwcvtbf16.f.f.v: OFP8 to BF16 widening conversion
>   - vfncvtbf16.f.f.w: BF16 to OFP8 narrowing conversion
>   - vfncvtbf16.sat.f.f.w: BF16 to OFP8 with saturation (new)
>   - vfncvt.f.f.q: FP32 to OFP8 quad-narrowing conversion (new)
>   - vfncvt.sat.f.f.q: FP32 to OFP8 with saturation (new)
> 
> * Zvfofp4min (Version 0.1):
>   The Zvfofp4min extension provides minimal vector conversion support
>   for the OFP4 E2M1 format. It requires the Zve32f extension.
>   Instructions added:
>   - vfext.vf2: OFP4 E2M1 to OFP8 E4M3 widening conversion
> 
> Modifications
> * Softfloat library:
>   - Refactored IEEE format NaN classification to share code (new in v2)
>   - New float8_e4m3 and float8_e5m2 types with NaN checking functions
>   - New float4_e2m1 type for OFP4 support
>   - Conversion functions: bfloat16/float32 <-> float8_e4m3/float8_e5m2
>   - Conversion function: float4_e2m1 -> float8_e4m3
>   - Implementation uses capability-based FloatFmt flags for format behavior
> * RISC-V target:
>   - CPU configuration properties for Zvfofp8min and Zvfofp4min
>   - Extension implied rules (Zvfofp8min requires Zve32f and Zvfbfa)
>   - Vector helper functions for OFP8/OFP4 conversion instructions
>   - Disassembler support for new instructions
> 
> Changes in v3:
> - Add floatN_nan_is_snan to simplify the quiet/signaling NaN checking flow
>   in patch 2 & 3
> - Add patch 4 to fix pseudo-NaN handling in FPATAN/FYL2XP1/FYL2X helpers
> 
Thanks for the v3 series.

I have also noted a few minor issues and replied to the relevant patch
emails individually.

Everything else looks good to me.

Reviewed-by: Chao Liu <chao.liu.zevorn@gmail.com>

Thanks,
Chao

> Changes in v2:
> - Merged v1 patch 2 & 3 to v2 patch 3, v1 patch 4 & 5 to v2 patch 4
> - Added new v2 patch 2 to refactor the IEEE format NaN classification
>   functions (float16, bfloat16, float32, float64) to use internal helper
>   functions, reducing code duplication and improving maintainability.
>   The OCP FP8 NaN classification functions follow the same pattern.
> - Refactored softfloat implementation to use capability-based FloatFmt
>   flags (no_infinity, limited_nan, overflow_raises_invalid, normal_frac_max)
>   instead of monolithic flags
> - Removed ocp_fp8e5m2_no_signal_nan and ocp_fp8_same_canonical_nan flags
>   from float_status; now using local float_status with no_signaling_nans
>   and default_nan_pattern for RISC-V Zvfofp8min instructions
> - Rebased on latest riscv-to-apply.next with zvfbfa v3 patchset
> 
> References
> * OCP FP8 specification:
>   https://www.opencompute.org/documents/ocp-8-bit-floating-point-specification-ofp8-revision-1-0-2023-12-01-pdf-1
> * Zvfofp8min specification (v0.2.1 commit e1e20a7):
>   https://github.com/aswaterman/riscv-misc/blob/main/isa/zvfofp8min.adoc
> * Zvfofp4min specification (v0.1 commit e1e20a7):
>   https://github.com/aswaterman/riscv-misc/blob/main/isa/zvfofp4min.adoc
> 
> PS: This series depends on the Zvfbfa extension patchset which introduces:
>   - The altfmt field in VTYPE CSR
>   - BF16 vector operations infrastructure
>   - vfwcvtbf16.f.f.v and vfncvtbf16.f.f.w base instructions
> 
> v2: <20260127063723.442734-1-max.chou@sifive.com>
> v1: <20260108151650.16329-1-max.chou@sifive.com>
> 
> Based-on: <20260127014227.406653-1-max.chou@sifive.com>
> 
> Max Chou (19):
>   target/riscv: rvv: Fix NOP_UU_B vs2 width
>   fpu/softfloat: Refactor IEEE format NaN classification to share code
>   fpu/softfloat: Refactor floatx80 format NaN classification to share
>     code
>   target/i386: Fix pseudo-NaN handling in FPATAN/FYL2XP1/FYL2X helpers
>   fpu/softfloat: Support OCP(Open Compute Project) OFP8 data type
>   fpu/softfloat: Support OCP(Open Compute Project) OFP4 data type
>   target/riscv: Add cfg properity for Zvfofp8min extension
>   target/riscv: Add implied rules for Zvfofp8min extension
>   target/riscv: rvv: Make vfwcvtbf16.f.f.v support OFP8 to BF16
>     conversion for Zvfofp8min extension
>   target/riscv: rvv: Make vfncvtbf16.f.f.w support BF16 to OFP8
>     conversion for Zvfofp8min extension
>   target/riscv: rvv: Add vfncvtbf16.sat.f.f.w instruction for Zvfofp8min
>     extension
>   target/riscv: rvv: Add vfncvt.f.f.q and vfncvt.sat.f.f.q instructions
>     for Zvfofp8min extension
>   target/riscv: Expose Zvfofp8min properity
>   disas/riscv: Add support of Zvfofp8min extension
>   target/riscv: Add cfg properity for Zvfofp4min extension
>   target/riscv: Add implied rules for Zvfofp4min extension
>   target/riscv: rvv: Add vfext.vf2 instruction for Zvfofp4min extension
>   target/riscv: Expose Zvfofp4min properity
>   disas/riscv: Add support of Zvfofp4min extension
> 
>  disas/riscv.c                              |  12 +
>  fpu/softfloat-parts.c.inc                  | 159 +++++++++---
>  fpu/softfloat-specialize.c.inc             | 287 +++++++++++----------
>  fpu/softfloat.c                            | 220 +++++++++++++++-
>  include/fpu/softfloat-types.h              |  17 ++
>  include/fpu/softfloat.h                    | 128 ++++++++-
>  target/i386/tcg/fpu_helper.c               |  30 +--
>  target/riscv/cpu.c                         |  32 ++-
>  target/riscv/cpu_cfg_fields.h.inc          |   2 +
>  target/riscv/helper.h                      |  15 ++
>  target/riscv/insn32.decode                 |   8 +
>  target/riscv/insn_trans/trans_rvbf16.c.inc |  32 ++-
>  target/riscv/insn_trans/trans_rvofp4.c.inc |  43 +++
>  target/riscv/insn_trans/trans_rvofp8.c.inc | 105 ++++++++
>  target/riscv/insn_trans/trans_rvv.c.inc    |  39 +++
>  target/riscv/tcg/tcg-cpu.c                 |  10 +
>  target/riscv/translate.c                   |   2 +
>  target/riscv/vector_helper.c               | 135 +++++++++-
>  18 files changed, 1072 insertions(+), 204 deletions(-)
>  create mode 100644 target/riscv/insn_trans/trans_rvofp4.c.inc
>  create mode 100644 target/riscv/insn_trans/trans_rvofp8.c.inc
> 
> -- 
> 2.52.0
> 
> 
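
As a side note on the format summary in the cover letter: the NaN and
infinity rules it lists for the two OFP8 formats can be checked with plain
bit tests. The sketch below is only an illustration under those stated
rules, not the softfloat classification code from this series. All function
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* E4M3 has no infinity; only 0x7f and 0xff encode NaN. */
static bool e4m3_is_nan(uint8_t v)
{
    return (v & 0x7f) == 0x7f;
}

/* E5M2 is IEEE-like: exponent all ones with a zero mantissa is infinity. */
static bool e5m2_is_inf(uint8_t v)
{
    return (v & 0x7f) == 0x7c;
}

/* E5M2: exponent all ones with a nonzero mantissa is NaN. */
static bool e5m2_is_nan(uint8_t v)
{
    return (v & 0x7f) > 0x7c;
}

/* Small usage example exercising the boundary encodings. */
static void ofp8_selfcheck(void)
{
    assert(e4m3_is_nan(0x7f) && e4m3_is_nan(0xff));
    assert(!e4m3_is_nan(0x7e));     /* largest finite E4M3 magnitude */
    assert(e5m2_is_inf(0x7c) && e5m2_is_inf(0xfc));
    assert(e5m2_is_nan(0x7d) && !e5m2_is_nan(0x7c));
}
```

E2M1 needs no such check because it has no NaN or infinity encodings.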


Thread overview: 36+ messages
2026-02-04  5:17 [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Max Chou
2026-02-04  5:17 ` [PATCH v3 01/19] target/riscv: rvv: Fix NOP_UU_B vs2 width Max Chou
2026-02-04  5:17 ` [PATCH v3 02/19] fpu/softfloat: Refactor IEEE format NaN classification to share code Max Chou
2026-02-05  3:29   ` Richard Henderson
2026-02-04  5:17 ` [PATCH v3 03/19] fpu/softfloat: Refactor floatx80 " Max Chou
2026-02-05  3:31   ` Richard Henderson
2026-02-04  5:17 ` [PATCH v3 04/19] target/i386: Fix pseudo-NaN handling in FPATAN/FYL2XP1/FYL2X helpers Max Chou
2026-02-05  3:34   ` Richard Henderson
2026-02-04  5:17 ` [PATCH v3 05/19] fpu/softfloat: Support OCP(Open Compute Project) OFP8 data type Max Chou
2026-02-05  4:36   ` Richard Henderson
2026-02-05 16:37     ` Max Chou
2026-02-05 13:21   ` Chao Liu
2026-02-05 16:48     ` Max Chou
2026-02-04  5:17 ` [PATCH v3 06/19] fpu/softfloat: Support OCP(Open Compute Project) OFP4 " Max Chou
2026-02-04  5:17 ` [PATCH v3 07/19] target/riscv: Add cfg properity for Zvfofp8min extension Max Chou
2026-02-04 16:29   ` Chao Liu
2026-02-05  7:33     ` Max Chou
2026-02-04  5:17 ` [PATCH v3 08/19] target/riscv: Add implied rules " Max Chou
2026-02-04  5:17 ` [PATCH v3 09/19] target/riscv: rvv: Make vfwcvtbf16.f.f.v support OFP8 to BF16 conversion " Max Chou
2026-02-04 16:34   ` Chao Liu
2026-02-04 16:54   ` Chao Liu
2026-02-04  5:17 ` [PATCH v3 10/19] target/riscv: rvv: Make vfncvtbf16.f.f.w support BF16 to OFP8 " Max Chou
2026-02-04  5:17 ` [PATCH v3 11/19] target/riscv: rvv: Add vfncvtbf16.sat.f.f.w instruction " Max Chou
2026-02-04  5:17 ` [PATCH v3 12/19] target/riscv: rvv: Add vfncvt.f.f.q and vfncvt.sat.f.f.q instructions " Max Chou
2026-02-04  5:17 ` [PATCH v3 13/19] target/riscv: Expose Zvfofp8min properity Max Chou
2026-02-04  5:17 ` [PATCH v3 14/19] disas/riscv: Add support of Zvfofp8min extension Max Chou
2026-02-04 16:35   ` Chao Liu
2026-02-04  5:17 ` [PATCH v3 15/19] target/riscv: Add cfg properity for Zvfofp4min extension Max Chou
2026-02-04  5:17 ` [PATCH v3 16/19] target/riscv: Add implied rules " Max Chou
2026-02-04  5:17 ` [PATCH v3 17/19] target/riscv: rvv: Add vfext.vf2 instruction " Max Chou
2026-02-05  4:06   ` Chao Liu
2026-02-04  5:17 ` [PATCH v3 18/19] target/riscv: Expose Zvfofp4min properity Max Chou
2026-02-04  5:17 ` [PATCH v3 19/19] disas/riscv: Add support of Zvfofp4min extension Max Chou
2026-02-04 16:43   ` Chao Liu
2026-02-04 16:46     ` Chao Liu
2026-02-04 16:59 ` [PATCH v3 00/19] Add OCP FP8/FP4 and RISC-V Zvfofp8min/Zvfofp4min extension support Chao Liu
