From: "Alex Bennée" <alex.bennee@linaro.org>
To: peter.maydell@linaro.org
Cc: qemu-devel@nongnu.org, "Emilio G. Cota" <cota@braap.org>,
"Alex Bennée" <alex.bennee@linaro.org>,
"Aurelien Jarno" <aurelien@aurel32.net>
Subject: [Qemu-devel] [PULL 15/15] hardfloat: implement float32/64 comparison
Date: Fri, 14 Dec 2018 13:54:52 +0000 [thread overview]
Message-ID: <20181214135452.25936-16-alex.bennee@linaro.org> (raw)
In-Reply-To: <20181214135452.25936-1-alex.bennee@linaro.org>
From: "Emilio G. Cota" <cota@braap.org>
Performance results for fp-bench:
Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
cmp-single: 110.98 MFlops
cmp-double: 107.12 MFlops
- after:
cmp-single: 506.28 MFlops
cmp-double: 524.77 MFlops
Note that flattening both eq and eq_signaling versions
would give us extra performance (695v506, 615v524 Mflops
for single/double, respectively) but this would emit two
essentially identical functions for each eq/signaling pair,
which is a waste.
Aggregate performance improvement for the last few patches:
[ all charts in png: https://imgur.com/a/4yV8p ]
1. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
qemu-aarch64 NBench score; higher is better
Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
16 +-+-----------+-------------+----===-------+---===-------+-----------+-+
14 +-+..........................@@@&&.=.......@@@&&.=...................+-+
12 +-+..........................@.@.&.=.......@.@.&.=.....+befor=== +-+
10 +-+..........................@.@.&.=.......@.@.&.=.....+ad@@&& = +-+
8 +-+.......................$$$%.@.&.=.......@.@.&.=.....+ @@u& = +-+
6 +-+............@@@&&=+***##.$%.@.&.=***##$$%+@.&.=..###$$%%@i& = +-+
4 +-+.......###$%%.@.&=.*.*.#.$%.@.&.=*.*.#.$%.@.&.=+**.#+$ +@m& = +-+
2 +-+.....***.#$.%.@.&=.*.*.#.$%.@.&.=*.*.#.$%.@.&.=.**.#+$+sqr& = +-+
0 +-+-----***##$%%@@&&=-***##$$%@@&&==***##$$%@@&&==-**##$$%+cmp==-----+-+
FOURIER NEURAL NELU DECOMPOSITION gmean
qemu-aarch64 SPEC06fp (test set) speedup over QEMU 4c2c1015905
Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
error bars: 95% confidence interval
4.5 +-+---+-----+----+-----+-----+-&---+-----+----+-----+-----+-----+----+-----+-----+-----+-----+----+-----+---+-+
4 +-+..........................+@@+...........................................................................+-+
3.5 +-+..............%%@&.........@@..............%%@&............................................+++dsub +-+
2.5 +-+....&&+.......%%@&.......+%%@..+%%&+..@@&+.%%@&....................................+%%&+.+%@&++%%@& +-+
2 +-+..+%%&..+%@&+.%%@&...+++..%%@...%%&.+$$@&..%%@&..%%@&.......+%%&+.%%@&+......+%%@&.+%%&++$$@&++d%@& %%@&+-+
1.5 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&*+f%@&**$%@&+-+
0.5 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&+sqr@&**$%@&+-+
0 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&*+cmp&**$%@&+-+
410.bw416.gam433.434.z435.436.cac437.lesli444.447.de450.so453454.ca459.GemsF465.tont470.lb4482.sphinxgeomean
2. Host: ARM Aarch64 A57 @ 2.4GHz
qemu-aarch64 NBench score; higher is better
Host: Applied Micro X-Gene, Aarch64 A57 @ 2.4 GHz
5 +-+-----------+-------------+-------------+-------------+-----------+-+
4.5 +-+........................................@@@&==...................+-+
3 4 +-+..........................@@@&==........@.@&.=.....+before +-+
3 +-+..........................@.@&.=........@.@&.=.....+ad@@@&== +-+
2.5 +-+.....................##$$%%.@&.=........@.@&.=.....+ @m@& = +-+
2 +-+............@@@&==.***#.$.%.@&.=.***#$$%%.@&.=.***#$$%%d@& = +-+
1.5 +-+.....***#$$%%.@&.=.*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#+$ +f@& = +-+
0.5 +-+.....*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#+$+sqr& = +-+
0 +-+-----***#$$%%@@&==-***#$$%%@@&==-***#$$%%@@&==-***#$$%+cmp==-----+-+
FOURIER NEURAL NLU DECOMPOSITION gmean
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index fbd66fd8dc..59eac97d10 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2903,28 +2903,109 @@ static int compare_floats(FloatParts a, FloatParts b, bool is_quiet,
}
}
-#define COMPARE(sz) \
-int float ## sz ## _compare(float ## sz a, float ## sz b, \
- float_status *s) \
+#define COMPARE(name, attr, sz) \
+static int attr \
+name(float ## sz a, float ## sz b, bool is_quiet, float_status *s) \
{ \
FloatParts pa = float ## sz ## _unpack_canonical(a, s); \
FloatParts pb = float ## sz ## _unpack_canonical(b, s); \
- return compare_floats(pa, pb, false, s); \
-} \
-int float ## sz ## _compare_quiet(float ## sz a, float ## sz b, \
- float_status *s) \
-{ \
- FloatParts pa = float ## sz ## _unpack_canonical(a, s); \
- FloatParts pb = float ## sz ## _unpack_canonical(b, s); \
- return compare_floats(pa, pb, true, s); \
+ return compare_floats(pa, pb, is_quiet, s); \
}
-COMPARE(16)
-COMPARE(32)
-COMPARE(64)
+COMPARE(soft_f16_compare, QEMU_FLATTEN, 16)
+COMPARE(soft_f32_compare, QEMU_SOFTFLOAT_ATTR, 32)
+COMPARE(soft_f64_compare, QEMU_SOFTFLOAT_ATTR, 64)
#undef COMPARE
+int float16_compare(float16 a, float16 b, float_status *s)
+{
+ return soft_f16_compare(a, b, false, s);
+}
+
+int float16_compare_quiet(float16 a, float16 b, float_status *s)
+{
+ return soft_f16_compare(a, b, true, s);
+}
+
+static int QEMU_FLATTEN
+f32_compare(float32 xa, float32 xb, bool is_quiet, float_status *s)
+{
+ union_float32 ua, ub;
+
+ ua.s = xa;
+ ub.s = xb;
+
+ if (QEMU_NO_HARDFLOAT) {
+ goto soft;
+ }
+
+ float32_input_flush2(&ua.s, &ub.s, s);
+ if (isgreaterequal(ua.h, ub.h)) {
+ if (isgreater(ua.h, ub.h)) {
+ return float_relation_greater;
+ }
+ return float_relation_equal;
+ }
+ if (likely(isless(ua.h, ub.h))) {
+ return float_relation_less;
+ }
+ /* The only condition remaining is unordered.
+ * Fall through to set flags.
+ */
+ soft:
+ return soft_f32_compare(ua.s, ub.s, is_quiet, s);
+}
+
+int float32_compare(float32 a, float32 b, float_status *s)
+{
+ return f32_compare(a, b, false, s);
+}
+
+int float32_compare_quiet(float32 a, float32 b, float_status *s)
+{
+ return f32_compare(a, b, true, s);
+}
+
+static int QEMU_FLATTEN
+f64_compare(float64 xa, float64 xb, bool is_quiet, float_status *s)
+{
+ union_float64 ua, ub;
+
+ ua.s = xa;
+ ub.s = xb;
+
+ if (QEMU_NO_HARDFLOAT) {
+ goto soft;
+ }
+
+ float64_input_flush2(&ua.s, &ub.s, s);
+ if (isgreaterequal(ua.h, ub.h)) {
+ if (isgreater(ua.h, ub.h)) {
+ return float_relation_greater;
+ }
+ return float_relation_equal;
+ }
+ if (likely(isless(ua.h, ub.h))) {
+ return float_relation_less;
+ }
+ /* The only condition remaining is unordered.
+ * Fall through to set flags.
+ */
+ soft:
+ return soft_f64_compare(ua.s, ub.s, is_quiet, s);
+}
+
+int float64_compare(float64 a, float64 b, float_status *s)
+{
+ return f64_compare(a, b, false, s);
+}
+
+int float64_compare_quiet(float64 a, float64 b, float_status *s)
+{
+ return f64_compare(a, b, true, s);
+}
+
/* Multiply A by 2 raised to the power N. */
static FloatParts scalbn_decomposed(FloatParts a, int n, float_status *s)
{
--
2.17.1
next prev parent reply other threads:[~2018-12-14 13:55 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-14 13:54 [Qemu-devel] [PULL 00/15] Hardfloat + softfloat maintainers update and gitdm Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 01/15] contrib: add a basic gitdm config Alex Bennée
2018-12-14 20:55 ` Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 02/15] MAINTAINERS: update status of FPU emulation Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 03/15] fp-test: pick TARGET_ARM to get its specialization Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 04/15] softfloat: add float{32, 64}_is_{de, }normal Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 05/15] target/tricore: use float32_is_denormal Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 06/15] softfloat: rename canonicalize to sf_canonicalize Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 07/15] softfloat: add float{32, 64}_is_zero_or_normal Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 08/15] tests/fp: add fp-bench Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 09/15] fpu: introduce hardfloat Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 10/15] hardfloat: implement float32/64 addition and subtraction Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 11/15] hardfloat: implement float32/64 multiplication Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 12/15] hardfloat: implement float32/64 division Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 13/15] hardfloat: implement float32/64 fused multiply-add Alex Bennée
2018-12-14 13:54 ` [Qemu-devel] [PULL 14/15] hardfloat: implement float32/64 square root Alex Bennée
2018-12-14 13:54 ` Alex Bennée [this message]
2018-12-16 21:51 ` [Qemu-devel] [PULL 00/15] Hardfloat + softfloat maintainers update and gitdm Peter Maydell
2018-12-17 9:29 ` Alex Bennée
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181214135452.25936-16-alex.bennee@linaro.org \
--to=alex.bennee@linaro.org \
--cc=aurelien@aurel32.net \
--cc=cota@braap.org \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).