Re: [Qemu-devel] [PATCH 12/14] VSX Stage 4: Add Scalar SP Fused Multiply-Adds

From: Richard Henderson <rth@twiddle.net>
To: Tom Musta <tommusta@gmail.com>, qemu-devel@nongnu.org
Cc: qemu-ppc@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 12/14] VSX Stage 4: Add Scalar SP Fused Multiply-Adds
Date: Fri, 08 Nov 2013 10:13:06 +1000
Message-ID: <527C2C92.7060006@twiddle.net>
In-Reply-To: <527C2297.7040407@twiddle.net>

On 11/08/2013 09:30 AM, Richard Henderson wrote:
> On 11/08/2013 09:28 AM, Richard Henderson wrote:
>> On 11/07/2013 06:31 AM, Tom Musta wrote:
>>>          }                                                                     \
>>> +                                                                              \
>>> +        if (r2sp) {                                                           \
>>> +            float32 tmp32 = float64_to_float32(xt_out.fld[i],                 \
>>> +                                               &env->fp_status);              \
>>> +            xt_out.fld[i] = float32_to_float64(tmp32, &env->fp_status);       \
>>> +        }                                                                     \
>>> +                                                                              \
>>
>> You can't get correct results for a single-precision fma from a
>> double-precision fma and merely rounding the results.
>>
>> See e.g. glibc's sysdeps/ieee754/dbl-64/s_fmaf.c.
> 
> Blah, nevermind.  That would be using separate add+mul in double-precision, not
> using a double-precision fma primitive.

Hmph.  I was right the first time.  See

> http://www.exploringbinary.com/double-rounding-errors-in-floating-point-conversions/

for example inputs that suffer from double-rounding.

Each of the examples there needs an infinite-precision value containing
55 bits.  Such a value is easy to produce with fma.

Two 23-bit inputs can create a product with 46 significant bits.  One can
append 23 more significant bits by choosing an exponent for the addend that
does not overlap the product.  Thus one can create (almost) every intermediate
result with up to 69 consecutive bits (the exception being products without
factors that can be represented in 23 bits).
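The bit counting above can be checked with a few lines of integer arithmetic. This is only a sketch: `bit_width` and `fma_exact_width` are hypothetical helper names, and significands are modelled as plain unsigned integers. Note that if the implicit leading bit is counted, a single-precision significand is 24 bits wide, which makes the bound larger still; either way it exceeds the 53 bits a double can hold.

```c
#include <stdint.h>

/* Number of significant bits in v (0 for v == 0). */
int bit_width(uint64_t v)
{
    int n = 0;
    while (v != 0) {
        n++;
        v >>= 1;
    }
    return n;
}

/* Widest exact a*b + c, in bits, when a, b and c each have sig_bits
 * significant bits and the addend is placed directly below the product
 * so that the two do not overlap.  Assumes sig_bits <= 32 so that the
 * product fits in 64 bits. */
int fma_exact_width(int sig_bits)
{
    uint64_t max_sig = ((uint64_t)1 << sig_bits) - 1;
    int product_bits = bit_width(max_sig * max_sig);
    return product_bits + sig_bits;
}
```

`fma_exact_width(23)` evaluates to 69, matching the figure in the text, and `fma_exact_width(24)` to 72.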

I'm too lazy to decompose the examples therein to actual single-precision
inputs, but I'm certain it can be done.
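For what it is worth, one set of single-precision inputs that appears to trigger the problem is a = b = 1 + 2^-12 and c = 2^-60 (not taken from the linked article; chosen here for illustration). The sketch below models the significand as a scaled integer rather than calling the real fma, so the helper names (`bit_width`, `rne`, `exact_fma_scaled`, `round_once`, `round_twice`) are assumptions of this example and exponent range is ignored.

```c
#include <stdint.h>

/* Number of significant bits in v (0 for v == 0). */
int bit_width(uint64_t v)
{
    int n = 0;
    while (v != 0) {
        n++;
        v >>= 1;
    }
    return n;
}

/* Round v to at most p significant bits, round-to-nearest, ties-to-even.
 * Only the significand is modelled; v is assumed small enough that a
 * carry out of the top bit cannot overflow 64 bits. */
uint64_t rne(uint64_t v, int p)
{
    int drop = bit_width(v) - p;       /* low bits to discard */
    if (drop <= 0) {
        return v;                      /* already fits in p bits */
    }
    uint64_t half = (uint64_t)1 << (drop - 1);
    uint64_t rem = v & (((uint64_t)1 << drop) - 1);
    uint64_t q = v >> drop;
    if (rem > half || (rem == half && (q & 1))) {
        q++;
    }
    return q << drop;
}

/* Exact a*b + c for a = b = 1 + 2^-12 and c = 2^-60, scaled by 2^60. */
uint64_t exact_fma_scaled(void)
{
    uint64_t a = ((uint64_t)1 << 12) + 1;   /* a (and b), scaled by 2^12 */
    uint64_t product = a * a;               /* exact, scaled by 2^24 */
    return (product << 36) + 1;             /* rescale to 2^60, add c */
}

/* One rounding straight to a 24-bit single-precision significand. */
uint64_t round_once(uint64_t v)
{
    return rne(v, 24);
}

/* Round to a 53-bit double significand first, then to 24 bits. */
uint64_t round_twice(uint64_t v)
{
    return rne(rne(v, 53), 24);
}
```

With `v = exact_fma_scaled()`, `round_once(v)` is 2^60 + 2^49 + 2^37 (that is, 1 + 2^-11 + 2^-23), while `round_twice(v)` is 2^60 + 2^49 (1 + 2^-11). The first rounding to 53 bits discards the 2^-60 term and leaves an exact tie, which ties-to-even then resolves downward. In hex-float notation the inputs would be 0x1.001p0f and 0x1p-60f, with 0x1.002002p0f as the correctly rounded result and 0x1.002p0f as the double-rounded one.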

Thus you *do* need the round-to-zero plus inexact solution that glibc uses.
(Or to perform the fma in 128-bits and round once, but I think that would be
way more intrusive wrt the rest of the code, and more expensive than necessary.)
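The round-to-zero plus inexact idea (often called round-to-odd) can be illustrated on the same integer model of a significand. This is a sketch of the technique only, not the glibc or QEMU code: the intermediate result is truncated to 53 bits, and if any discarded bit was set, the lowest kept bit is forced to 1 so that the final rounding to 24 bits still sees that the value was not an exact tie. All function names here are hypothetical.

```c
#include <stdint.h>

/* Number of significant bits in v (0 for v == 0). */
int bit_width(uint64_t v)
{
    int n = 0;
    while (v != 0) {
        n++;
        v >>= 1;
    }
    return n;
}

/* Round v to at most p significant bits, round-to-nearest, ties-to-even.
 * v is assumed small enough that a carry cannot overflow 64 bits. */
uint64_t rne(uint64_t v, int p)
{
    int drop = bit_width(v) - p;
    if (drop <= 0) {
        return v;
    }
    uint64_t half = (uint64_t)1 << (drop - 1);
    uint64_t rem = v & (((uint64_t)1 << drop) - 1);
    uint64_t q = v >> drop;
    if (rem > half || (rem == half && (q & 1))) {
        q++;
    }
    return q << drop;
}

/* Truncate v to p significant bits (round toward zero) and, if the
 * truncation was inexact, set the lowest kept bit. */
uint64_t round_to_odd(uint64_t v, int p)
{
    int drop = bit_width(v) - p;
    if (drop <= 0) {
        return v;
    }
    uint64_t inexact = (v & (((uint64_t)1 << drop) - 1)) != 0;
    uint64_t q = (v >> drop) | inexact;
    return q << drop;
}

/* 53-bit intermediate with round-to-odd, then a normal rounding to 24. */
uint64_t round_via_odd(uint64_t v)
{
    return rne(round_to_odd(v, 53), 24);
}
```

For the value 2^60 + 2^49 + 2^36 + 1 (the exact result of (1 + 2^-12)^2 + 2^-60, scaled by 2^60), `round_via_odd` returns 2^60 + 2^49 + 2^37, the same as a single `rne(v, 24)`. In a real implementation the same effect would presumably come from switching the FPU to round-toward-zero, performing the double-precision operation, and ORing the inexact flag into the low mantissa bit before converting to single precision.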

r~

Thread overview: 25+ messages
2013-11-06 20:31 [Qemu-devel] [PATCH 00/14] VSX Stage 4 Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 01/14] VSX Stage 4: Add VSX 2.07 Flag Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 02/14] VSX Stage 4: Refactor lxsdx Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 03/14] VSX Stage 4: Add lxsiwax, lxsiwzx and lxsspx Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 04/14] VSX Stage 4: Refactor stxsdx Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 05/14] VSX Stage 4: Add stxsiwx and stxsspx Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 06/14] VSX Stage 4: Add xsaddsp and xssubsp Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 07/14] VSX Stage 4: Add xsmulsp Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 08/14] VSX Stage 4: Add xsdivsp Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 09/14] VSX Stage 4: Add xsresp Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 10/14] VSX Stage 4: Add xssqrtsp Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 11/14] VSX Stage 4: add xsrsqrtesp Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 12/14] VSX Stage 4: Add Scalar SP Fused Multiply-Adds Tom Musta
2013-11-07 23:28   ` Richard Henderson
2013-11-07 23:30     ` Richard Henderson
2013-11-08  0:13       ` Richard Henderson [this message]
2013-11-13 20:49         ` Tom Musta
2013-11-13 23:14           ` Richard Henderson
2013-11-14 20:58             ` Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 13/14] VSX Stage 4: Add xscvsxdsp and xscvuxdsp Tom Musta
2013-11-06 20:31 ` [Qemu-devel] [PATCH 14/14] VSX Stage 4: Add xxleqv, xxlnand and xxlorc Tom Musta
2013-11-08  0:23 ` [Qemu-devel] [PATCH 00/14] VSX Stage 4 Richard Henderson
2013-11-08 14:53   ` Tom Musta
2013-11-13 14:35   ` Tom Musta
2013-11-08 15:55 ` Andreas Färber