qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH v3 00/15] fp-test + hardfloat
From: Emilio G. Cota @ 2018-04-04 23:11 UTC (permalink / raw)
  To: qemu-devel
  Cc: Aurelien Jarno, Peter Maydell, Alex Bennée, Laurent Vivier,
	Richard Henderson, Paolo Bonzini, Mark Cave-Ayland,
	Bastian Koppelmann

v2: https://lists.gnu.org/archive/html/qemu-devel/2018-03/msg06805.html

Changes since v2:

- Add R-b tags

- Add a patch to rename our canonicalize to sf_canonicalize,
  to avoid clashing with glibc's.

- Add a patch to define float{32,64}_is_zero_or_normal

- Simplify the float{32,64}_input_flushX macros -- now the
  macros are more verbose but the full function names are greppable.

- Move tests/fp-test to tests/fp, since now both fp-bench and fp-test
  are under tests/fp.
  + Use tests/fp/fp-test.h for helpers common to both fp-bench and fp-test.

- Complete rewrite of fp-bench:
  + We can now directly call the softfloat functions, thereby
    making the benchmark more sensitive to changes to those functions.
  + We can still use the native ops with "-t host".
  + The rewrite also has less macro trickery; we rely instead on
    constant propagation by the compiler.
  + Alex: dropped your R-b since this changed a lot. I think you'll
    like this version better though!

- Define a generic function to generate the hardfloat implementation
  for ops with 2 inputs; add, sub, mul and div depend on it.
  Instead of using macros, rely on the constant propagation done
  by the compiler. [Alex: I dropped your R-b for the addsub
  patch because it changed a lot]
  + I kept macros for other ops, because I think the subsequent
    code duplication savings are worth the pain.

- Add #define's to select whether to use fpclassify etc. or
  float32_is_zero etc.
  + Benchmark perf differences on x86_64, aarch64 and IBM Power8 hosts.
  + For 32-bit we don't use fpclassify etc. on any architecture,
    so I was tempted to get rid of this option to save some code.
    However, it might pay off on hosts I have not tested, so I
    decided to keep it.

- Add a #define to select whether to use isinf() or floatX_is_infinity().
  Turns out this makes a big difference for power64.

- Remove float32_to_float64 support in hardfloat, since nbench or
  SPEC actually showed a small yet measurable slowdown with it,
  despite fp-bench showing a significant speedup for this operation.

- Do not flatten soft-fp functions; these are now slow paths.
  This shrinks the size of the softfloat object below its original
  size (see last patch's log).

- Add a #define to disable hardfloat for some targets. Some targets
  (PPC at least; there might be others) clear the FP flags before
  calling softfloat. This precludes hardfloat, since it relies on the
  inexact flag already being set. In the long run we should fix these
  targets though.
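
To make the generic two-input approach above concrete, here is a minimal
sketch. It is not the code from this series: fp_status, FLAG_INEXACT,
USE_FPCLASSIFY, f32_gen2 and the soft_add stand-in are all hypothetical
names made up for illustration. It assumes the fast path is only taken
when the inexact flag is already set and both inputs are zero or normal,
and it falls back to the slow path whenever the host result might need
overflow or underflow flags raised.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for QEMU's float_status and exception flags. */
enum { FLAG_INEXACT = 1 };
typedef struct { unsigned flags; } fp_status;

/* Hypothetical switch: 1 = classify with libm macros, 0 = inspect the bits. */
#ifndef USE_FPCLASSIFY
#define USE_FPCLASSIFY 0
#endif

static inline bool f32_is_zero_or_normal(float f)
{
#if USE_FPCLASSIFY
    int c = fpclassify(f);
    return c == FP_ZERO || c == FP_NORMAL;
#else
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    uint32_t exp = (u >> 23) & 0xff;
    if (exp == 0) {
        return (u & 0x7fffff) == 0;   /* zero passes, denormals do not */
    }
    return exp != 0xff;               /* reject infinities and NaNs */
#endif
}

typedef float (*hard_op)(float, float);
typedef float (*soft_op)(float, float, fp_status *);

/*
 * Generic two-input op. Callers pass constant function pointers, so an
 * optimizing compiler can inline both calls; no per-op macro is needed.
 */
static inline float f32_gen2(float a, float b, fp_status *s,
                             hard_op hard, soft_op soft)
{
    assert(s != NULL);
    if (!(s->flags & FLAG_INEXACT) ||
        !f32_is_zero_or_normal(a) || !f32_is_zero_or_normal(b)) {
        return soft(a, b, s);
    }
    float r = hard(a, b);
    /*
     * Conservative check: infinities, NaNs and anything at or below
     * FLT_MIN in magnitude (including zero) go to the slow path.
     */
    if (isinf(r) || !(fabsf(r) > FLT_MIN)) {
        return soft(a, b, s);
    }
    return r;
}

/* Stand-in slow path: counts calls; it does not model softfloat's flags. */
static int soft_calls;

static float soft_add(float a, float b, fp_status *s)
{
    (void)s;
    soft_calls++;
    return a + b;
}

static float hard_add(float a, float b)
{
    return a + b;
}

static float f32_add(float a, float b, fp_status *s)
{
    return f32_gen2(a, b, s, hard_add, soft_add);
}
```

With a zeroed fp_status, f32_add() takes the slow path; once
FLAG_INEXACT is set in s.flags, the same call with normal inputs is
served by hard_add(). A target that clears the flags before every call
would therefore never reach the fast path in this sketch.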

Note: fp-bench can run _very_ slowly (~0.5 IPC) for -o fma on some x86_64
hosts. I have not pinned down what's going on, but from the few hosts
I have access to, it seems that machines that have been patched for
Spectre/Meltdown are susceptible to this slowdown.
Fortunately though:
1) when fma is run in QEMU (and not under a microbenchmark such as
   fp-bench), fma performance is still very good (much better than with
   soft-fp).
2) Compiling with -march=native gets rid of the problem.
I've reproduced this with both gcc 5.4.0 and gcc 7.1.0. The *very* same
fp-bench binary that performs very well for FMA on two machines (one
AMD, one Intel, neither patched against Meltdown/Spectre) performs
below soft-fp on another three machines (all Intel, all patched).

Note: there are some checkpatch errors, but they are false positives.

Perf numbers for fp-bench are in each commit log; numbers for several
benchmarks are in the last patch's commit log.

You can fetch this series from:
  https://github.com/cota/qemu/tree/hardfloat-v3

Thanks,

		Emilio

---
 configure                   |    2 +
 fpu/softfloat.c             |  945 ++++++++++++++++++++++++++++++--
 include/fpu/softfloat.h     |   30 +
 target/tricore/fpu_helper.c |    9 +-
 tests/Makefile.include      |    3 +
 tests/fp/.gitignore         |    4 +
 tests/fp/Makefile           |   36 ++
 tests/fp/fp-bench.c         |  528 ++++++++++++++++++
 tests/fp/fp-test.c          | 1183 ++++++++++++++++++++++++++++++++++++++++
 tests/fp/muladd.fptest      |   51 ++
 10 files changed, 2737 insertions(+), 54 deletions(-)
 create mode 100644 tests/fp/.gitignore
 create mode 100644 tests/fp/Makefile
 create mode 100644 tests/fp/fp-bench.c
 create mode 100644 tests/fp/fp-test.c
 create mode 100644 tests/fp/muladd.fptest


Thread overview: 25+ messages
2018-04-04 23:11 [Qemu-devel] [PATCH v3 00/15] fp-test + hardfloat Emilio G. Cota
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 01/15] tests: add fp-test, a floating point test suite Emilio G. Cota
2018-04-11  1:20   ` Alex Bennée
2018-04-11  1:39     ` Alex Bennée
2018-04-11 21:36     ` Emilio G. Cota
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 02/15] softfloat: fix {min,max}nummag for same-abs-value inputs Emilio G. Cota
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 03/15] fp-test: add muladd variants Emilio G. Cota
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 04/15] softfloat: add float{32,64}_is_{de,}normal Emilio G. Cota
2018-04-06 12:01   ` Bastian Koppelmann
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 05/15] target/tricore: use float32_is_denormal Emilio G. Cota
2018-04-06 12:01   ` Bastian Koppelmann
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 06/15] tests/fp: add fp-bench, a collection of simple floating point microbenchmarks Emilio G. Cota
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 07/15] softfloat: rename canonicalize to sf_canonicalize Emilio G. Cota
2018-04-06 12:02   ` Bastian Koppelmann
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 08/15] softfloat: add float{32,64}_is_zero_or_normal Emilio G. Cota
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 09/15] fpu: introduce hardfloat Emilio G. Cota
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 10/15] hardfloat: support float32/64 addition and subtraction Emilio G. Cota
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 11/15] hardfloat: support float32/64 multiplication Emilio G. Cota
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 12/15] hardfloat: support float32/64 division Emilio G. Cota
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 13/15] hardfloat: support float32/64 fused multiply-add Emilio G. Cota
2018-04-04 23:16   ` Emilio G. Cota
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 14/15] hardfloat: support float32/64 square root Emilio G. Cota
2018-04-04 23:17   ` Emilio G. Cota
2018-04-04 23:11 ` [Qemu-devel] [PATCH v3 15/15] hardfloat: support float32/64 comparison Emilio G. Cota
2018-04-04 23:31 ` [Qemu-devel] [PATCH v3 00/15] fp-test + hardfloat no-reply
