Re: [Qemu-devel] [PATCH v1 08/14] hostfloat: support float32/64 addition and subtraction

From: "Alex Bennée" <alex.bennee@linaro.org>
To: "Emilio G. Cota" <cota@braap.org>
Cc: Richard Henderson <richard.henderson@linaro.org>,
	qemu-devel@nongnu.org, Aurelien Jarno <aurelien@aurel32.net>,
	Peter Maydell <peter.maydell@linaro.org>,
	Laurent Vivier <laurent@vivier.eu>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Subject: Re: [Qemu-devel] [PATCH v1 08/14] hostfloat: support float32/64 addition and subtraction
Date: Tue, 27 Mar 2018 12:41:18 +0100
Message-ID: <87r2o5925d.fsf@linaro.org>
In-Reply-To: <20180322195721.GA22594@flamenco>


Emilio G. Cota <cota@braap.org> writes:

> On Thu, Mar 22, 2018 at 14:41:05 +0800, Richard Henderson wrote:
> (snip)
>> Another thought re all of the soft_is_normal || soft_is_zero checks that you're
>> performing.  I think it would be nice if we could work with
>> float*_unpack_canonical so that we don't have to duplicate work.  E.g.
>>
>> /* Return true for float_class_normal or float_class_zero.  */
>> static inline bool is_finite(FloatClass c) { return c <= float_class_zero; }
>>
>> float32 float32_add(float32 a, float32 b, float_status *s)
>> {
>>   FloatClass a_cls = float32_classify(a);
>>   FloatClass b_cls = float32_classify(b);
>
> Just looked at this. It can be done, although it comes at the
> price of some performance for fp-bench -o add:
> 180 Mflops vs. 196 Mflops, i.e. an 8% slowdown. That is with
> adequate inlining etc., otherwise perf is worse.
>
> I'm not convinced that we can gain much in simplicity to
> justify the perf impact. Yes, we'd simplify canonicalize(),
> but we'd probably need a float_class_denormal[*], which
> would complicate everything else.
>
> I think it makes sense to keep some inlines that work on
> the float32/64's directly.
>
>>   if (is_finite(a_cls) && is_finite(b_cls) && ...) {
>>       /* do hardfp thing */
>>   }
>
> [*] Taking 0, denormals and normals would be OK from correctness,
> but we really don't want to compute ops with denormal inputs on
> the host; it is very likely that the output will also be denormal,
> and we'll end up deferring to soft-fp anyway to avoid
> computing whether the underflow exception has occurred,
> which is expensive.
>
>>   pa = float32_unpack(a, a_cls, s);
>>   pb = float32_unpack(b, b_cls, s);
>>   pr = addsub_floats(pa, pb, s, false);
>>   return float32_round_pack(pr, s);
>> }
>
> It pays off to have two separate functions (add & sub) for the
> slow path. With soft_f32_add/sub factored out:
>
> $ taskset -c 0 x86_64-linux-user/qemu-x86_64 tests/fp-bench -o add
> 197.53 MFlops
>
> With the above four lines (pa...return) as an else branch:
> 169.16 MFlops
>
> BTW flattening makes things worse (150.63 MFlops).

That's disappointing. Did you look at the generated code? Given the
way we abuse __flatten__ to effectively make a compile-time template,
you would hope the compiler could hoist the relevant classify bits
ahead of the hard-float branch and do the rest later if needed.
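
To make the ordering concrete, here is a minimal sketch (not the code
from the patch) of classifying both inputs once, ahead of the
hard-float branch, and only falling through to the slow path when
needed. All demo_* names and the DemoClass enum are made up for
illustration; demo_soft_add is only a stand-in for the real soft-float
path, and the sketch ignores rounding modes and exception flags
entirely.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical classes, ordered so one comparison covers normal and zero. */
typedef enum {
    CLS_NORMAL = 0,
    CLS_ZERO,
    CLS_DENORMAL,
    CLS_INF,
    CLS_NAN,
} DemoClass;

/* Classify an IEEE 754 binary32 value from its exponent and fraction bits. */
static inline DemoClass demo_classify(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));

    uint32_t exp = (u >> 23) & 0xff;
    uint32_t frac = u & 0x7fffff;

    if (exp == 0) {
        return frac ? CLS_DENORMAL : CLS_ZERO;
    }
    if (exp == 0xff) {
        return frac ? CLS_NAN : CLS_INF;
    }
    return CLS_NORMAL;
}

/* True for normal numbers and zeroes only. */
static inline bool demo_is_zero_or_normal(DemoClass c)
{
    return c <= CLS_ZERO;
}

/* Counts slow-path calls so the chosen path can be observed. */
static int demo_slow_calls;

/* Stand-in for the soft-float slow path (assumption: not real soft-fp). */
static float demo_soft_add(float a, float b)
{
    demo_slow_calls++;
    return a + b;
}

float demo_add(float a, float b)
{
    /* Classify once, before deciding which path to take. */
    DemoClass ca = demo_classify(a);
    DemoClass cb = demo_classify(b);

    if (demo_is_zero_or_normal(ca) && demo_is_zero_or_normal(cb)) {
        float r = a + b;    /* host FPU does the work */

        /* Keep the host result only if it is itself zero or normal. */
        if (demo_is_zero_or_normal(demo_classify(r))) {
            return r;
        }
    }
    /* Denormal, inf or NaN involved: defer to the slow path. */
    return demo_soft_add(a, b);
}
```

demo_slow_calls exists only so the sketch can show which path was
taken: adding 1.5f and 2.25f stays on the fast path, whereas a denormal
input goes to the slow one.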

Everything was inline or in softfloat.c for this test, right?
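
For reference, the "compile-time template" pattern I mean looks roughly
like the sketch below: a generic inline worker takes a flag that is
constant at each call site, and flattened wrappers let the compiler
specialise it. The tmpl_* names are hypothetical, the arithmetic is a
trivial placeholder for the real unpack/addsub/round-pack sequence, and
it assumes a compiler that understands __attribute__((flatten)), as
GCC and Clang do.

```c
#include <stdbool.h>

/*
 * Generic worker. 'subtract' is a constant at every call site below,
 * so once the worker is inlined the branch can be folded away.
 */
static inline float tmpl_addsub(float a, float b, bool subtract)
{
    if (subtract) {
        b = -b;
    }
    return a + b;
}

/* flatten asks the compiler to inline every call made in the body. */
__attribute__((flatten))
float tmpl_f32_add(float a, float b)
{
    return tmpl_addsub(a, b, false);
}

__attribute__((flatten))
float tmpl_f32_sub(float a, float b)
{
    return tmpl_addsub(a, b, true);
}
```

This only works if the worker is visible in the same translation unit
as the wrappers, which is why it matters where everything lived for
the measurement.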

>
> Note that fp-bench only tests normal numbers. But I think it's fair
> to assume that this is the path we want to speed up.
>
> Thanks,
>
> 		E.


--
Alex Bennée
