From: Paolo Bonzini <pbonzini@redhat.com>
To: qemu-devel@nongnu.org
Cc: Peter Maydell <peter.maydell@linaro.org>,
Richard Henderson <richard.henderson@linaro.org>
Subject: [PULL 6/7] tests/tcg/x86_64/fma: Test some x86 fused-multiply-add cases
Date: Fri, 7 Feb 2025 11:28:00 +0100 [thread overview]
Message-ID: <20250207102802.2445596-7-pbonzini@redhat.com> (raw)
In-Reply-To: <20250207102802.2445596-1-pbonzini@redhat.com>
From: Peter Maydell <peter.maydell@linaro.org>
Add a test case which tests some corner case behaviour of
fused-multiply-add on x86:
* 0 * Inf + SNaN should raise Invalid
* 0 * Inf + QNaN shouldh not raise Invalid
* tininess should be detected after rounding
There is also one currently-disabled test case:
* flush-to-zero should be done after rounding
This is disabled because QEMU's emulation currently does this
incorrectly (and so would fail the test). The test case is kept in
but disabled, as the justification for why the test running harness
has support for testing both with and without FTZ set.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20250116112536.4117889-3-peter.maydell@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
tests/tcg/x86_64/fma.c | 109 +++++++++++++++++++++++++++++++
tests/tcg/x86_64/Makefile.target | 1 +
2 files changed, 110 insertions(+)
create mode 100644 tests/tcg/x86_64/fma.c
diff --git a/tests/tcg/x86_64/fma.c b/tests/tcg/x86_64/fma.c
new file mode 100644
index 00000000000..09c622ebc00
--- /dev/null
+++ b/tests/tcg/x86_64/fma.c
@@ -0,0 +1,109 @@
+/*
+ * Test some fused multiply add corner cases.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <inttypes.h>
+
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
+/*
+ * Perform one "n * m + a" operation using the vfmadd insn and return
+ * the result; on return *mxcsr_p is set to the bottom 6 bits of MXCSR
+ * (the Flag bits). If ftz is true then we set MXCSR.FTZ while doing
+ * the operation.
+ * We print the operation and its results to stdout.
+ */
+static uint64_t do_fmadd(uint64_t n, uint64_t m, uint64_t a,
+ bool ftz, uint32_t *mxcsr_p)
+{
+ uint64_t r;
+ uint32_t mxcsr = 0;
+ uint32_t ftz_bit = ftz ? (1 << 15) : 0;
+ uint32_t saved_mxcsr = 0;
+
+ asm volatile("stmxcsr %[saved_mxcsr]\n"
+ "stmxcsr %[mxcsr]\n"
+ "andl $0xffff7fc0, %[mxcsr]\n"
+ "orl %[ftz_bit], %[mxcsr]\n"
+ "ldmxcsr %[mxcsr]\n"
+ "movq %[a], %%xmm0\n"
+ "movq %[m], %%xmm1\n"
+ "movq %[n], %%xmm2\n"
+ /* xmm0 = xmm0 + xmm2 * xmm1 */
+ "vfmadd231sd %%xmm1, %%xmm2, %%xmm0\n"
+ "movq %%xmm0, %[r]\n"
+ "stmxcsr %[mxcsr]\n"
+ "ldmxcsr %[saved_mxcsr]\n"
+ : [r] "=r" (r), [mxcsr] "=m" (mxcsr),
+ [saved_mxcsr] "=m" (saved_mxcsr)
+ : [n] "r" (n), [m] "r" (m), [a] "r" (a),
+ [ftz_bit] "r" (ftz_bit)
+ : "xmm0", "xmm1", "xmm2");
+ *mxcsr_p = mxcsr & 0x3f;
+ printf("vfmadd132sd 0x%" PRIx64 " 0x%" PRIx64 " 0x%" PRIx64
+ " = 0x%" PRIx64 " MXCSR flags 0x%" PRIx32 "\n",
+ n, m, a, r, *mxcsr_p);
+ return r;
+}
+
+typedef struct testdata {
+ /* Input n, m, a */
+ uint64_t n;
+ uint64_t m;
+ uint64_t a;
+ bool ftz;
+ /* Expected result */
+ uint64_t expected_r;
+ /* Expected low 6 bits of MXCSR (the Flag bits) */
+ uint32_t expected_mxcsr;
+} testdata;
+
+static testdata tests[] = {
+ { 0, 0x7ff0000000000000, 0x7ff000000000aaaa, false, /* 0 * Inf + SNaN */
+ 0x7ff800000000aaaa, 1 }, /* Should be QNaN and does raise Invalid */
+ { 0, 0x7ff0000000000000, 0x7ff800000000aaaa, false, /* 0 * Inf + QNaN */
+ 0x7ff800000000aaaa, 0 }, /* Should be QNaN and does *not* raise Invalid */
+ /*
+ * These inputs give a result which is tiny before rounding but which
+ * becomes non-tiny after rounding. x86 is a "detect tininess after
+ * rounding" architecture, so it should give a non-denormal result and
+ * not set the Underflow flag (only the Precision flag for an inexact
+ * result).
+ */
+ { 0x3fdfffffffffffff, 0x001fffffffffffff, 0x801fffffffffffff, false,
+ 0x8010000000000000, 0x20 },
+ /*
+ * Flushing of denormal outputs to zero should also happen after
+ * rounding, so setting FTZ should not affect the result or the flags.
+ * QEMU currently does not emulate this correctly because we do the
+ * flush-to-zero check before rounding, so we incorrectly produce a
+ * zero result and set Underflow as well as Precision.
+ */
+#ifdef ENABLE_FAILING_TESTS
+ { 0x3fdfffffffffffff, 0x001fffffffffffff, 0x801fffffffffffff, true,
+ 0x8010000000000000, 0x20 }, /* Enabling FTZ shouldn't change flags */
+#endif
+};
+
+int main(void)
+{
+ bool passed = true;
+ for (int i = 0; i < ARRAY_SIZE(tests); i++) {
+ uint32_t mxcsr;
+ uint64_t r = do_fmadd(tests[i].n, tests[i].m, tests[i].a,
+ tests[i].ftz, &mxcsr);
+ if (r != tests[i].expected_r) {
+ printf("expected result 0x%" PRIx64 "\n", tests[i].expected_r);
+ passed = false;
+ }
+ if (mxcsr != tests[i].expected_mxcsr) {
+ printf("expected MXCSR flags 0x%x\n", tests[i].expected_mxcsr);
+ passed = false;
+ }
+ }
+ return passed ? 0 : 1;
+}
diff --git a/tests/tcg/x86_64/Makefile.target b/tests/tcg/x86_64/Makefile.target
index d6dff559c7d..be20fc64e88 100644
--- a/tests/tcg/x86_64/Makefile.target
+++ b/tests/tcg/x86_64/Makefile.target
@@ -18,6 +18,7 @@ X86_64_TESTS += adox
X86_64_TESTS += test-1648
X86_64_TESTS += test-2175
X86_64_TESTS += cross-modifying-code
+X86_64_TESTS += fma
TESTS=$(MULTIARCH_TESTS) $(X86_64_TESTS) test-x86_64
else
TESTS=$(MULTIARCH_TESTS)
--
2.48.1
next prev parent reply other threads:[~2025-02-07 10:29 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-07 10:27 [PULL 0/7] Rust, TCG, x86 patches for 2025-02-07 Paolo Bonzini
2025-02-07 10:27 ` [PULL 1/7] rust: remove unnecessary Cargo.toml metadata Paolo Bonzini
2025-02-07 10:27 ` [PULL 2/7] rust: include rust_version in Cargo.toml Paolo Bonzini
2025-02-07 10:27 ` [PULL 3/7] rust: add docs Paolo Bonzini
2025-02-07 10:27 ` [PULL 4/7] rust: add clippy configuration file Paolo Bonzini
2025-02-07 10:27 ` [PULL 5/7] target/i386: Do not raise Invalid for 0 * Inf + QNaN Paolo Bonzini
2025-02-07 11:53 ` Michael Tokarev
2025-02-07 13:43 ` Peter Maydell
2025-02-07 10:28 ` Paolo Bonzini [this message]
2025-02-07 10:28 ` [PULL 7/7] tcg/optimize: optimize TSTNE using smask and zmask Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250207102802.2445596-7-pbonzini@redhat.com \
--to=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=richard.henderson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).