From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:33791)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <rth7680@gmail.com>) id 1Vivg1-00077y-Ge
	for qemu-devel@nongnu.org; Tue, 19 Nov 2013 19:29:58 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <rth7680@gmail.com>) id 1Vivfv-0004lm-9E
	for qemu-devel@nongnu.org; Tue, 19 Nov 2013 19:29:53 -0500
Sender: Richard Henderson <rth7680@gmail.com>
Message-ID: <528C0274.5020208@twiddle.net>
Date: Wed, 20 Nov 2013 10:29:40 +1000
From: Richard Henderson <rth@twiddle.net>
MIME-Version: 1.0
References: <1384868432-2427-1-git-send-email-tommusta@gmail.com>
	<1384868432-2427-13-git-send-email-tommusta@gmail.com>
In-Reply-To: <1384868432-2427-13-git-send-email-tommusta@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [V2 PATCH 12/14] target-ppc: VSX Stage 4: Add
 Scalar SP Fused Multiply-Adds
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Tom Musta <tommusta@gmail.com>, qemu-devel@nongnu.org
Cc: qemu-ppc@nongnu.org

On 11/19/2013 11:40 PM, Tom Musta wrote:
> +    /* NOTE: in order to get accurate results, we must first round back */    \
> +    /*       to single precision and use the fused multiply add routine */    \
> +    /*       for 32-bit floats.                                         */    \
> +    float_status tstat = env->fp_status;                                      \
> +    float32 a32 = float64_to_float32(xa.f64[0], &tstat);                      \
> +    float32 b32 = float64_to_float32(b->f64[0], &tstat);                      \
> +    float32 c32 = float64_to_float32(c->f64[0], &tstat);                      \
> +                                                                              \
> +    set_float_exception_flags(0, &tstat);                                     \
> +    float32 t32 = float32_muladd(a32, b32, c32, maddflgs, &tstat);            \

While this will produce correct results for the "normal" use case of correctly
rounded single-precision inputs, the spec says:

# Except for xsresp or xsrsqrtesp, any double-precision value can
# be used in single-precision scalar arithmetic operations when
# OE=0 and UE=0.

Thus a more correct implementation would operate on the full double-precision
inputs and still produce a correctly rounded single-precision result.  I pointed
you at the glibc implementation to show how that can be done using round-to-zero
plus examining the inexact bit.

    float_status tstat = env->fp_status;

    set_float_exception_flags(0, &tstat);
    if (tstat.float_rounding_mode == float_round_nearest_even) {
        /* Avoid double rounding errors by rounding the intermediate
           result to odd.  See
           http://hal.inria.fr/docs/00/08/04/27/PDF/odd-rounding.pdf */
        set_float_rounding_mode(float_round_to_zero, &tstat);
        res = float64_muladd(...);
        res |= (get_float_exception_flags(&tstat) & float_flag_inexact) != 0;
    } else {
        res = float64_muladd(...);
    }
    res = helper_frsp(env, res);

    /* Finally, merge the exception flags accumulated in tstat back into
       env->fp_status. */
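
To illustrate why or-ing the inexact bit into the truncated intermediate avoids
the double rounding, here is a small stand-alone integer model.  It is only a
sketch and not QEMU code: the names (round_to_odd, narrow_direct, WIDE_DROP and
so on) are made up for the example, values are plain unsigned integers, the
"wide" format drops 4 low bits and the "narrow" format drops 3 more.  The trick
needs the wide format to keep at least two more bits than the narrow one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model parameters: bits discarded at each rounding step. */
enum { WIDE_DROP = 4, NARROW_DROP = 3 };

/* Round v / 2^k to the nearest integer, ties to even.  Needs 1 <= k < 64. */
static uint64_t round_nearest_even(uint64_t v, unsigned k)
{
    uint64_t q = v >> k;
    uint64_t rem = v & ((UINT64_C(1) << k) - 1);
    uint64_t half = UINT64_C(1) << (k - 1);

    if (rem > half || (rem == half && (q & 1))) {
        q++;
    }
    return q;
}

/* Truncate v / 2^k (round to zero), then set the low bit if any discarded
   bit was nonzero.  This mirrors "res |= inexact" in the pseudocode above. */
static uint64_t round_to_odd(uint64_t v, unsigned k)
{
    uint64_t q = v >> k;
    uint64_t inexact = (v & ((UINT64_C(1) << k) - 1)) != 0;

    return q | inexact;
}

/* Reference: a single rounding straight to the narrow format. */
static uint64_t narrow_direct(uint64_t x)
{
    return round_nearest_even(x, WIDE_DROP + NARROW_DROP);
}

/* Naive: round to nearest in the wide format, then again to narrow. */
static uint64_t narrow_double_rounded(uint64_t x)
{
    return round_nearest_even(round_nearest_even(x, WIDE_DROP), NARROW_DROP);
}

/* Round to odd in the wide format, then to nearest in the narrow one. */
static uint64_t narrow_via_odd(uint64_t x)
{
    return round_nearest_even(round_to_odd(x, WIDE_DROP), NARROW_DROP);
}

/* Check every x below limit: the round-to-odd path must match the single
   rounding.  Returns how many inputs the naive double rounding gets wrong. */
static unsigned count_double_rounding_errors(uint64_t limit)
{
    unsigned errors = 0;

    for (uint64_t x = 0; x < limit; x++) {
        assert(narrow_via_odd(x) == narrow_direct(x));
        if (narrow_double_rounded(x) != narrow_direct(x)) {
            errors++;
        }
    }
    return errors;
}
```

For x = 65 the exact quotient 65/128 is just above one half, so a single
rounding gives 1.  The naive path first rounds 65/16 down to 4, which then looks
like an exact tie (4/8) and rounds to even, giving 0.  The round-to-odd path
turns the truncated 4 into 5, which is no longer a tie and rounds to 1.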


r~