From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org, peter.maydell@linaro.org, qemu-stable@nongnu.org
Subject: [PATCH v3 08/10] target/arm: Fix f16_dotadd vs nan selection
Date: Wed, 2 Jul 2025 06:22:11 -0600 [thread overview]
Message-ID: <20250702122213.758588-9-richard.henderson@linaro.org> (raw)
In-Reply-To: <20250702122213.758588-1-richard.henderson@linaro.org>
Implement FPProcessNaNs4 within f16_dotadd, rather than
simply letting NaNs propagate through the function.
Cc: qemu-stable@nongnu.org
Fixes: 3916841ac75 ("target/arm: Implement FMOPA, FMOPS (widening)")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/tcg/sme_helper.c | 62 +++++++++++++++++++++++++++----------
1 file changed, 46 insertions(+), 16 deletions(-)
diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
index de0c6e54d4..8f33387e4b 100644
--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -1005,25 +1005,55 @@ static float32 f16_dotadd(float32 sum, uint32_t e1, uint32_t e2,
* - we have pre-set-up copy of s_std which is set to round-to-odd,
* for the multiply (see below)
*/
- float64 e1r = float16_to_float64(e1 & 0xffff, true, s_f16);
- float64 e1c = float16_to_float64(e1 >> 16, true, s_f16);
- float64 e2r = float16_to_float64(e2 & 0xffff, true, s_f16);
- float64 e2c = float16_to_float64(e2 >> 16, true, s_f16);
- float64 t64;
+ float16 h1r = e1 & 0xffff;
+ float16 h1c = e1 >> 16;
+ float16 h2r = e2 & 0xffff;
+ float16 h2c = e2 >> 16;
float32 t32;
- /*
- * The ARM pseudocode function FPDot performs both multiplies
- * and the add with a single rounding operation. Emulate this
- * by performing the first multiply in round-to-odd, then doing
- * the second multiply as fused multiply-add, and rounding to
- * float32 all in one step.
- */
- t64 = float64_mul(e1r, e2r, s_odd);
- t64 = float64r32_muladd(e1c, e2c, t64, 0, s_std);
+ /* C.f. FPProcessNaNs4 */
+ if (float16_is_any_nan(h1r) || float16_is_any_nan(h1c) ||
+ float16_is_any_nan(h2r) || float16_is_any_nan(h2c)) {
+ float16 t16;
- /* This conversion is exact, because we've already rounded. */
- t32 = float64_to_float32(t64, s_std);
+ if (float16_is_signaling_nan(h1r, s_f16)) {
+ t16 = h1r;
+ } else if (float16_is_signaling_nan(h1c, s_f16)) {
+ t16 = h1c;
+ } else if (float16_is_signaling_nan(h2r, s_f16)) {
+ t16 = h2r;
+ } else if (float16_is_signaling_nan(h2c, s_f16)) {
+ t16 = h2c;
+ } else if (float16_is_any_nan(h1r)) {
+ t16 = h1r;
+ } else if (float16_is_any_nan(h1c)) {
+ t16 = h1c;
+ } else if (float16_is_any_nan(h2r)) {
+ t16 = h2r;
+ } else {
+ t16 = h2c;
+ }
+ t32 = float16_to_float32(t16, true, s_f16);
+ } else {
+ float64 e1r = float16_to_float64(h1r, true, s_f16);
+ float64 e1c = float16_to_float64(h1c, true, s_f16);
+ float64 e2r = float16_to_float64(h2r, true, s_f16);
+ float64 e2c = float16_to_float64(h2c, true, s_f16);
+ float64 t64;
+
+ /*
+ * The ARM pseudocode function FPDot performs both multiplies
+ * and the add with a single rounding operation. Emulate this
+ * by performing the first multiply in round-to-odd, then doing
+ * the second multiply as fused multiply-add, and rounding to
+ * float32 all in one step.
+ */
+ t64 = float64_mul(e1r, e2r, s_odd);
+ t64 = float64r32_muladd(e1c, e2c, t64, 0, s_std);
+
+ /* This conversion is exact, because we've already rounded. */
+ t32 = float64_to_float32(t64, s_std);
+ }
/* The final accumulation step is not fused. */
return float32_add(sum, t32, s_std);
--
2.43.0
next prev parent reply other threads:[~2025-07-02 12:23 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-02 12:22 [PATCH v3 00/10] target/arm: SME1/SVE2 fixes Richard Henderson
2025-07-02 12:22 ` [PATCH v3 01/10] target/arm: Fix SME vs AdvSIMD exception priority Richard Henderson
2025-07-02 12:22 ` [PATCH v3 02/10] target/arm: Fix sve_access_check for SME Richard Henderson
2025-07-02 12:22 ` [PATCH v3 03/10] target/arm: Fix 128-bit element ZIP, UZP, TRN Richard Henderson
2025-07-03 9:08 ` Peter Maydell
2025-07-02 12:22 ` [PATCH v3 04/10] target/arm: Replace @rda_rn_rm_e0 in sve.decode Richard Henderson
2025-07-02 14:11 ` Richard Henderson
2025-07-03 9:14 ` Peter Maydell
2025-07-02 12:22 ` [PATCH v3 05/10] target/arm: Fix FMMLA (64-bit element) for 128-bit VL Richard Henderson
2025-07-03 9:08 ` Peter Maydell
2025-07-02 12:22 ` [PATCH v3 06/10] target/arm: Disable FEAT_F64MM if maximum SVE vector size too small Richard Henderson
2025-07-02 18:15 ` Richard Henderson
2025-07-03 9:10 ` Peter Maydell
2025-07-02 12:22 ` [PATCH v3 07/10] target/arm: Fix PSEL size operands to tcg_gen_gvec_ands Richard Henderson
2025-07-03 9:09 ` Peter Maydell
2025-07-02 12:22 ` Richard Henderson [this message]
2025-07-03 9:12 ` [PATCH v3 08/10] target/arm: Fix f16_dotadd vs nan selection Peter Maydell
2025-07-02 12:22 ` [PATCH v3 09/10] target/arm: Fix bfdotadd_ebf " Richard Henderson
2025-07-03 9:13 ` Peter Maydell
2025-07-02 12:22 ` [PATCH v3 10/10] target/arm: Remove CPUARMState.vfp.scratch Richard Henderson
2025-07-02 13:47 ` Alex Bennée
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250702122213.758588-9-richard.henderson@linaro.org \
--to=richard.henderson@linaro.org \
--cc=peter.maydell@linaro.org \
--cc=qemu-arm@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=qemu-stable@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).