From: "Alex Bennée" <alex.bennee@linaro.org>
To: "Emilio G. Cota" <cota@braap.org>
Cc: qemu-devel@nongnu.org, Aurelien Jarno <aurelien@aurel32.net>,
Peter Maydell <peter.maydell@linaro.org>,
Laurent Vivier <laurent@vivier.eu>,
Richard Henderson <richard.henderson@linaro.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Subject: Re: [Qemu-devel] [PATCH v1 07/14] fpu: introduce hostfloat
Date: Tue, 27 Mar 2018 12:49:48 +0100 [thread overview]
Message-ID: <87po3p91r7.fsf@linaro.org> (raw)
In-Reply-To: <1521663109-32262-8-git-send-email-cota@braap.org>
Emilio G. Cota <cota@braap.org> writes:
> The appended paves the way for leveraging the host FPU for a subset
> of guest FP operations. For most guest workloads (e.g. FP flags
> aren't ever cleared, inexact occurs often and rounding is set to the
> default [to nearest]) this will yield sizable performance speedups.
>
> The approach followed here avoids checking the FP exception flags register.
> See the comment at the top of hostfloat.c for details.
>
> This assumes that QEMU is running on an IEEE754-compliant FPU and
> that the rounding is set to the default (to nearest). The
> implementation-dependent specifics of the FPU should not matter; things
> like tininess detection and snan representation are still dealt with in
> soft-fp. However, this approach will break on most hosts if we compile
> QEMU with flags such as -ffast-math. We control the flags so this should
> be easy to enforce though.
The thing I would avoid is generating is any x87 instructions as we can
get weird effects if the compiler ever decides to stash a signalling NaN
in an x87 register.
Anyway perhaps -fno-fast-math should be explicit when building fpu/* code?
>
> The licensing in softfloat.h is complicated at best, so to keep things
> simple I'm adding this as a separate, GPL'ed file.
I don't think we need to worry about this. It's fine to add GPL only
stuff to softfloat.c and since the re-factoring (or before really) we
"own" this code and are unlikely to upstream anything.
My preference would be to include this all in softfloat.c unless there
is a very good reason not to.
>
> This patch just adds some boilerplate code; subsequent patches add
> operations, one per commit to ease bisection.
>
> Signed-off-by: Emilio G. Cota <cota@braap.org>
> ---
> Makefile.target | 2 +-
> include/fpu/hostfloat.h | 14 +++++++
> include/fpu/softfloat.h | 1 +
> fpu/hostfloat.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++
> target/m68k/Makefile.objs | 2 +-
> tests/fp-test/Makefile | 2 +-
> 6 files changed, 114 insertions(+), 3 deletions(-)
> create mode 100644 include/fpu/hostfloat.h
> create mode 100644 fpu/hostfloat.c
>
> diff --git a/Makefile.target b/Makefile.target
> index 6549481..efcdfb9 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -97,7 +97,7 @@ obj-$(CONFIG_TCG) += tcg/tcg.o tcg/tcg-op.o tcg/tcg-op-vec.o tcg/tcg-op-gvec.o
> obj-$(CONFIG_TCG) += tcg/tcg-common.o tcg/optimize.o
> obj-$(CONFIG_TCG_INTERPRETER) += tcg/tci.o
> obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
> -obj-y += fpu/softfloat.o
> +obj-y += fpu/softfloat.o fpu/hostfloat.o
> obj-y += target/$(TARGET_BASE_ARCH)/
> obj-y += disas.o
> obj-$(call notempty,$(TARGET_XML_FILES)) += gdbstub-xml.o
> diff --git a/include/fpu/hostfloat.h b/include/fpu/hostfloat.h
> new file mode 100644
> index 0000000..b01291b
> --- /dev/null
> +++ b/include/fpu/hostfloat.h
> @@ -0,0 +1,14 @@
> +/*
> + * Copyright (C) 2018, Emilio G. Cota <cota@braap.org>
> + *
> + * License: GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +#ifndef HOSTFLOAT_H
> +#define HOSTFLOAT_H
> +
> +#ifndef SOFTFLOAT_H
> +#error fpu/hostfloat.h must only be included from softfloat.h
> +#endif
> +
> +#endif /* HOSTFLOAT_H */
> diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
> index 8fb44a8..8963b68 100644
> --- a/include/fpu/softfloat.h
> +++ b/include/fpu/softfloat.h
> @@ -95,6 +95,7 @@ enum {
> };
>
> #include "fpu/softfloat-types.h"
> +#include "fpu/hostfloat.h"
>
> static inline void set_float_detect_tininess(int val, float_status *status)
> {
> diff --git a/fpu/hostfloat.c b/fpu/hostfloat.c
> new file mode 100644
> index 0000000..cab0341
> --- /dev/null
> +++ b/fpu/hostfloat.c
> @@ -0,0 +1,96 @@
> +/*
> + * hostfloat.c - FP primitives that use the host's FPU whenever possible.
> + *
> + * Copyright (C) 2018, Emilio G. Cota <cota@braap.org>
> + *
> + * License: GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + * Fast emulation of guest FP instructions is challenging for two reasons.
> + * First, FP instruction semantics are similar but not identical, particularly
> + * when handling NaNs. Second, emulating at reasonable speed the guest FP
> + * exception flags is not trivial: reading the host's flags register with a
> + * feclearexcept & fetestexcept pair is slow [slightly slower than soft-fp],
> + * and trapping on every FP exception is not fast nor pleasant to work with.
> + *
> + * This module leverages the host FPU for a subset of the operations. To
> + * do this it follows the main idea presented in this paper:
> + *
> + * Guo, Yu-Chuan, et al. "Translating the ARM Neon and VFP instructions in a
> + * binary translator." Software: Practice and Experience 46.12 (2016):1591-1615.
> + *
> + * The idea is thus to leverage the host FPU to (1) compute FP operations
> + * and (2) identify whether FP exceptions occurred while avoiding
> + * expensive exception flag register accesses.
> + *
> + * An important optimization shown in the paper is that given that exception
> + * flags are rarely cleared by the guest, we can avoid recomputing some flags.
> + * This is particularly useful for the inexact flag, which is very frequently
> + * raised in floating-point workloads.
> + *
> + * We optimize the code further by deferring to soft-fp whenever FP
> + * exception detection might get hairy. Fortunately this is not common.
> + */
> +#include <math.h>
> +
> +#include "qemu/osdep.h"
> +#include "fpu/softfloat.h"
> +
> +#define GEN_TYPE_CONV(name, to_t, from_t) \
> + static inline to_t name(from_t a) \
> + { \
> + to_t r = *(to_t *)&a; \
> + return r; \
> + }
> +
> +GEN_TYPE_CONV(float32_to_float, float, float32)
> +GEN_TYPE_CONV(float64_to_double, double, float64)
> +GEN_TYPE_CONV(float_to_float32, float32, float)
> +GEN_TYPE_CONV(double_to_float64, float64, double)
> +#undef GEN_TYPE_CONV
> +
> +#define GEN_INPUT_FLUSH(soft_t) \
> + static inline __attribute__((always_inline)) void \
> + soft_t ## _input_flush__nocheck(soft_t *a, float_status *s) \
> + { \
> + if (unlikely(soft_t ## _is_denormal(*a))) { \
> + *a = soft_t ## _set_sign(soft_t ## _zero, \
> + soft_t ## _is_neg(*a)); \
> + s->float_exception_flags |= float_flag_input_denormal; \
> + } \
> + } \
> + \
> + static inline __attribute__((always_inline)) void \
> + soft_t ## _input_flush1(soft_t *a, float_status *s) \
> + { \
> + if (likely(!s->flush_inputs_to_zero)) { \
> + return; \
> + } \
> + soft_t ## _input_flush__nocheck(a, s); \
> + } \
> + \
> + static inline __attribute__((always_inline)) void \
> + soft_t ## _input_flush2(soft_t *a, soft_t *b, float_status *s) \
> + { \
> + if (likely(!s->flush_inputs_to_zero)) { \
> + return; \
> + } \
> + soft_t ## _input_flush__nocheck(a, s); \
> + soft_t ## _input_flush__nocheck(b, s); \
> + } \
> + \
> + static inline __attribute__((always_inline)) void \
> + soft_t ## _input_flush3(soft_t *a, soft_t *b, soft_t *c, \
> + float_status *s) \
> + { \
> + if (likely(!s->flush_inputs_to_zero)) { \
> + return; \
> + } \
> + soft_t ## _input_flush__nocheck(a, s); \
> + soft_t ## _input_flush__nocheck(b, s); \
> + soft_t ## _input_flush__nocheck(c, s); \
> + }
> +
> +GEN_INPUT_FLUSH(float32)
> +GEN_INPUT_FLUSH(float64)
Having spent time getting rid of a bunch of macro expansions I'm wary of
adding more in. However for these I guess it's kind of marginal.
> +#undef GEN_INPUT_FLUSH
> diff --git a/target/m68k/Makefile.objs b/target/m68k/Makefile.objs
> index ac61948..2868b11 100644
> --- a/target/m68k/Makefile.objs
> +++ b/target/m68k/Makefile.objs
> @@ -1,5 +1,5 @@
> obj-y += m68k-semi.o
> obj-y += translate.o op_helper.o helper.o cpu.o
> -obj-y += fpu_helper.o softfloat.o
> +obj-y += fpu_helper.o softfloat.o hostfloat.o
> obj-y += gdbstub.o
> obj-$(CONFIG_SOFTMMU) += monitor.o
> diff --git a/tests/fp-test/Makefile b/tests/fp-test/Makefile
> index 703434f..187cfcc 100644
> --- a/tests/fp-test/Makefile
> +++ b/tests/fp-test/Makefile
> @@ -28,7 +28,7 @@ ibm:
> $(WHITELIST_FILES):
> wget -nv -O $@ http://www.cs.columbia.edu/~cota/qemu/fpbench-$@
>
> -fp-test$(EXESUF): fp-test.o softfloat.o
> +fp-test$(EXESUF): fp-test.o softfloat.o hostfloat.o
>
> clean:
> rm -f *.o *.d $(OBJS)
--
Alex Bennée
next prev parent reply other threads:[~2018-03-27 11:49 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-03-21 20:11 [Qemu-devel] [PATCH v1 00/14] fp-test + hostfloat Emilio G. Cota
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 01/14] tests: add fp-bench, a collection of simple floating-point microbenchmarks Emilio G. Cota
2018-03-27 8:45 ` Alex Bennée
2018-03-27 17:21 ` Emilio G. Cota
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 02/14] tests: add fp-test, a floating point test suite Emilio G. Cota
2018-03-27 10:13 ` Alex Bennée
2018-03-27 18:00 ` Emilio G. Cota
2018-03-28 9:51 ` Alex Bennée
2018-03-28 15:36 ` Emilio G. Cota
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 03/14] softfloat: fix {min, max}nummag for same-abs-value inputs Emilio G. Cota
2018-03-27 10:15 ` Alex Bennée
2018-03-27 10:15 ` Alex Bennée
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 04/14] fp-test: add muladd variants Emilio G. Cota
2018-03-27 11:33 ` Alex Bennée
2018-03-27 18:03 ` Emilio G. Cota
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 05/14] softfloat: add float32_is_normal and float64_is_normal Emilio G. Cota
2018-03-27 11:34 ` Alex Bennée
2018-03-27 18:05 ` Emilio G. Cota
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 06/14] softfloat: add float32_is_denormal and float64_is_denormal Emilio G. Cota
2018-03-27 11:35 ` Alex Bennée
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 07/14] fpu: introduce hostfloat Emilio G. Cota
2018-03-21 20:41 ` Laurent Vivier
2018-03-21 21:45 ` Emilio G. Cota
2018-03-27 11:49 ` Alex Bennée [this message]
2018-03-27 18:16 ` Emilio G. Cota
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 08/14] hostfloat: support float32/64 addition and subtraction Emilio G. Cota
2018-03-22 5:05 ` Richard Henderson
2018-03-22 5:57 ` Emilio G. Cota
2018-03-22 6:41 ` Richard Henderson
2018-03-22 15:08 ` Emilio G. Cota
2018-03-22 15:12 ` Laurent Vivier
2018-03-22 19:57 ` Emilio G. Cota
2018-03-27 11:41 ` Alex Bennée
2018-03-27 18:08 ` Emilio G. Cota
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 09/14] hostfloat: support float32/64 multiplication Emilio G. Cota
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 10/14] hostfloat: support float32/64 division Emilio G. Cota
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 11/14] hostfloat: support float32/64 fused multiply-add Emilio G. Cota
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 12/14] hostfloat: support float32/64 square root Emilio G. Cota
2018-03-22 1:29 ` Alex Bennée
2018-03-22 4:02 ` Emilio G. Cota
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 13/14] hostfloat: support float32/64 comparison Emilio G. Cota
2018-03-21 20:11 ` [Qemu-devel] [PATCH v1 14/14] hostfloat: support float32_to_float64 Emilio G. Cota
2018-03-21 20:36 ` [Qemu-devel] [PATCH v1 00/14] fp-test + hostfloat no-reply
2018-03-22 5:02 ` no-reply
2018-03-22 8:56 ` Alex Bennée
2018-03-22 15:28 ` Emilio G. Cota
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87po3p91r7.fsf@linaro.org \
--to=alex.bennee@linaro.org \
--cc=aurelien@aurel32.net \
--cc=cota@braap.org \
--cc=laurent@vivier.eu \
--cc=mark.cave-ayland@ilande.co.uk \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=richard.henderson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).