From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:41576)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Vo0GA-0000iI-Vn for qemu-devel@nongnu.org;
	Tue, 03 Dec 2013 19:24:17 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1Vo0G4-00040D-Lg for qemu-devel@nongnu.org;
	Tue, 03 Dec 2013 19:24:10 -0500
Sender: Richard Henderson
Message-ID: <529E75FA.1000501@twiddle.net>
Date: Wed, 04 Dec 2013 13:23:22 +1300
From: Richard Henderson
MIME-Version: 1.0
References: <1386086305-8163-1-git-send-email-tommusta@gmail.com>
	<1386086305-8163-13-git-send-email-tommusta@gmail.com>
In-Reply-To: <1386086305-8163-13-git-send-email-tommusta@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [V3 PATCH 12/14] target-ppc: VSX Stage 4: Add Scalar SP Fused Multiply-Adds
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Tom Musta , qemu-devel@nongnu.org
Cc: qemu-ppc@nongnu.org

On 12/04/2013 04:58 AM, Tom Musta wrote:
> This patch adds the Single Precision VSX Scalar Fused Multiply-Add
> instructions: xsmaddasp, xsmaddmsp, xsmsubasp, xsmsubmsp, xsnmaddasp,
> xsnmaddmsp, xsnmsubasp, xsnmsubmsp.
>
> The existing VSX_MADD() macro is modified to support rounding of the
> intermediate double-precision result to single precision.
>
> V2: Re-implemented per feedback from Richard Henderson.  In order to
> avoid double rounding and incorrect results, the operands must be
> converted to true single-precision values and the single-precision
> fused multiply/add routine must be used.
>
> V3: Re-implemented per feedback from Richard Henderson.  The
> implementation now uses a round-to-odd algorithm to address subtle
> double rounding errors.
>
> Signed-off-by: Tom Musta
> ---
>  target-ppc/fpu_helper.c |   84 ++++++++++++++++++++++++++++++----------------
>  target-ppc/helper.h     |    8 ++++
>  target-ppc/translate.c  |   16 +++++++++
>  3 files changed, 79 insertions(+), 29 deletions(-)
>
> diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
> index 8825db2..077d057 100644
> --- a/target-ppc/fpu_helper.c
> +++ b/target-ppc/fpu_helper.c
> @@ -2192,7 +2192,7 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, f32, -126, 23)
>   *   afrm  - A form (1=A, 0=M)
>   *   sfprf - set FPRF
>   */
> -#define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf)                    \
> +#define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf, r2sp)              \
>  void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
>  {                                                                             \
>      ppc_vsr_t xt_in, xa, xb, xt_out;                                          \
> @@ -2218,8 +2218,18 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
>      for (i = 0; i < nels; i++) {                                              \
>          float_status tstat = env->fp_status;                                  \
>          set_float_exception_flags(0, &tstat);                                 \
> -        xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],          \
> -                                    maddflgs, &tstat);                        \
> +        if (r2sp && (tstat.float_rounding_mode == float_round_nearest_even)) {\
> +            /* Avoid double rounding errors by rounding the intermediate */   \
> +            /* result to odd. */                                              \
> +            set_float_rounding_mode(float_round_to_zero, &tstat);             \
> +            xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],      \
> +                                        maddflgs, &tstat);                    \
> +            xt_out.fld[i] |= (get_float_exception_flags(&tstat) &             \
> +                              float_flag_inexact) != 0;                       \
> +        } else {                                                              \
> +            xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],      \
> +                                        maddflgs, &tstat);                    \
> +        }                                                                     \
>          env->fp_status.float_exception_flags |= tstat.float_exception_flags;  \
>                                                                                \
>          if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {     \
> @@ -2242,6 +2252,13 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
>                  fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, sfprf);     \
>              }                                                                 \
>          }                                                                     \
> +                                                                              \
> +        if (r2sp) {                                                           \
> +            float32 tmp32 = float64_to_float32(xt_out.fld[i],                 \
> +                                               &env->fp_status);              \
> +            xt_out.fld[i] = float32_to_float64(tmp32, &env->fp_status);       \
> +        }                                                                     \
> +                                                                              \

helper_frsp

Otherwise,

Reviewed-by: Richard Henderson

r~