From: Richard Henderson <rth@twiddle.net>
To: Tom Musta <tommusta@gmail.com>, qemu-devel@nongnu.org
Cc: qemu-ppc@nongnu.org
Subject: Re: [Qemu-devel] [V3 PATCH 12/14] target-ppc: VSX Stage 4: Add Scalar SP Fused Multiply-Adds
Date: Wed, 04 Dec 2013 13:23:22 +1300
Message-ID: <529E75FA.1000501@twiddle.net>
In-Reply-To: <1386086305-8163-13-git-send-email-tommusta@gmail.com>

On 12/04/2013 04:58 AM, Tom Musta wrote:
> This patch adds the Single Precision VSX Scalar Fused Multiply-Add
> instructions: xsmaddasp, xsmaddmsp, xsmsubasp, xsmsubmsp, xsnmaddasp,
> xsnmaddmsp, xsnmsubasp, xsnmsubmsp.
> 
> The existing VSX_MADD() macro is modified to support rounding of the
> intermediate double precision result to single precision.
> 
> V2: Re-implemented per feedback from Richard Henderson.  In order to
> avoid double rounding and incorrect results, the operands must be
> converted to true single precision values and use the single precision
> fused multiply/add routine.
> 
> V3: Re-implemented per feedback from Richard Henderson.  The implementation
> now uses a round-to-odd algorithm to address subtle double rounding errors.
> 
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
>  target-ppc/fpu_helper.c |   84 ++++++++++++++++++++++++++++++----------------
>  target-ppc/helper.h     |    8 ++++
>  target-ppc/translate.c  |   16 +++++++++
>  3 files changed, 79 insertions(+), 29 deletions(-)
> 
> diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
> index 8825db2..077d057 100644
> --- a/target-ppc/fpu_helper.c
> +++ b/target-ppc/fpu_helper.c
> @@ -2192,7 +2192,7 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, f32, -126, 23)
>   *   afrm  - A form (1=A, 0=M)
>   *   sfprf - set FPRF
>   */
> -#define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf)                    \
> +#define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf, r2sp)              \
>  void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
>  {                                                                             \
>      ppc_vsr_t xt_in, xa, xb, xt_out;                                          \
> @@ -2218,8 +2218,18 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
>      for (i = 0; i < nels; i++) {                                              \
>          float_status tstat = env->fp_status;                                  \
>          set_float_exception_flags(0, &tstat);                                 \
> -        xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],          \
> -                                     maddflgs, &tstat);                       \
> +        if (r2sp && (tstat.float_rounding_mode == float_round_nearest_even)) {\
> +            /* Avoid double rounding errors by rounding the intermediate */   \
> +            /* result to odd.                                            */   \
> +            set_float_rounding_mode(float_round_to_zero, &tstat);             \
> +            xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],      \
> +                                       maddflgs, &tstat);                     \
> +            xt_out.fld[i] |= (get_float_exception_flags(&tstat) &             \
> +                              float_flag_inexact) != 0;                       \
> +        } else {                                                              \
> +            xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],      \
> +                                        maddflgs, &tstat);                    \
> +        }                                                                     \
>          env->fp_status.float_exception_flags |= tstat.float_exception_flags;  \
>                                                                                \
>          if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {     \
> @@ -2242,6 +2252,13 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
>                  fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, sfprf);     \
>              }                                                                 \
>          }                                                                     \
> +                                                                              \
> +        if (r2sp) {                                                           \
> +            float32 tmp32 = float64_to_float32(xt_out.fld[i],                 \
> +                                               &env->fp_status);              \
> +            xt_out.fld[i] = float32_to_float64(tmp32, &env->fp_status);       \
> +        }                                                                     \
> +                                                                              \

Use helper_frsp here.

Otherwise,

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~
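
For readers following the round-to-odd discussion, here is a minimal
standalone C sketch of why the V3 approach avoids double rounding. It is
not QEMU code: narrow_round_to_odd() and narrow_naive() are hypothetical
names, and the truncated intermediate and its inexact flag are passed in
by hand rather than obtained from float64_muladd(). It also assumes IEEE
754 binary64 doubles and the default round-to-nearest-even mode for the
final cast.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of QEMU).
 *
 * 'truncated' is a double-precision intermediate that was computed with
 * round-toward-zero; 'inexact' says whether that truncation discarded any
 * nonzero bits.  ORing the inexact flag into the least significant
 * mantissa bit is "round to odd": the low bit acts as a sticky bit, so
 * the later narrowing to float can no longer mistake the value for an
 * exact tie.
 */
static float narrow_round_to_odd(double truncated, int inexact)
{
    uint64_t bits;

    assert(sizeof(bits) == sizeof(truncated));
    memcpy(&bits, &truncated, sizeof(bits));
    bits |= (uint64_t)(inexact != 0);
    memcpy(&truncated, &bits, sizeof(bits));

    /* The only nearest-even rounding step happens here. */
    return (float)truncated;
}

/*
 * Hypothetical helper for comparison: the intermediate was already
 * rounded to nearest-even in double precision, and is rounded a second
 * time when narrowed to float.
 */
static float narrow_naive(double nearest)
{
    return (float)nearest;
}
```

Take an exact result of 1 + 2^-24 + 2^-60.  Rounding it to double with
nearest-even gives 0x1.000001p0, which is an exact tie in single
precision, so narrow_naive(0x1.000001p0) returns 1.0f.  Truncating gives
the same double but with the inexact flag set; ORing in the low bit
moves the value just above the tie, and
narrow_round_to_odd(0x1.000001p0, 1) returns 0x1.000002p0f, which is the
correctly rounded single-precision result.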
