From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37133) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VZkCY-0003HK-Fo for qemu-devel@nongnu.org; Fri, 25 Oct 2013 12:25:38 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VZkCQ-0000Tn-0z for qemu-devel@nongnu.org; Fri, 25 Oct 2013 12:25:30 -0400
Message-ID: <526A9B6A.7080001@gmail.com>
Date: Fri, 25 Oct 2013 11:25:14 -0500
From: Tom Musta
MIME-Version: 1.0
References: <526947CA.4020504@gmail.com> <526949E1.3010405@gmail.com> <5269852E.9000601@twiddle.net>
In-Reply-To: <5269852E.9000601@twiddle.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 13/19] Add VSX ISA2.06 Multiply Add Instructions
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Richard Henderson , QEMU Developers
Cc: "qemu-ppc@nongnu.org"

On 10/24/2013 3:38 PM, Richard Henderson wrote:
> On 10/24/2013 09:25 AM, Tom Musta wrote:
>> \
>> +        ft0 = tp##_to_##btp(xa.fld[i], &env->fp_status); \
>> +        ft1 = tp##_to_##btp(m->fld[i], &env->fp_status); \
>> +        ft0 = btp##_mul(ft0, ft1, &env->fp_status); \
>> +        if (unlikely(btp##_is_infinity(ft0) && \
>> +                     tp##_is_infinity(s->fld[i]) && \
>> +                     btp##_is_neg(ft0) cmp tp##_is_neg(s->fld[i]))) { \
>> +            xt.fld[i] = float64_to_##tp( \
>> +                fload_invalid_op_excp(env, \
>> +                                      POWERPC_EXCP_FP_VXISI, \
>> +                                      sfprf), \
>> +                &env->fp_status); \
>> +        } else { \
>> +            ft1 = tp##_to_##btp(s->fld[i], &env->fp_status); \
>> +            ft0 = btp##_##sum(ft0, ft1, &env->fp_status); \
>> +            xt.fld[i] = btp##_to_##tp(ft0, &env->fp_status); \
>> +        } \
>> +        if (neg && likely(!tp##_is_any_nan(xt.fld[i]))) { \
>> +            xt.fld[i] = tp##_chs(xt.fld[i]); \
>> +        }
>
> You want to be using tp##muladd instead of widening to 128 bits.

I tried recoding xsmaddadp using float64_muladd.
The problem that I hit is the boundary case where the intermediate product and the summand are infinities of the opposite sign. This is the case handled by the first "if" in the code snippet above. PowerPC has a dedicated FPSCR bit for this type of condition (VXISI) as well as a general invalid operation bit (VX). As far as I can tell, the softfloat code only has the equivalent of the VX bit. Thus the implementation that I proposed is a more accurate representation of the Power ISA.

The VSX code was modeled after the existing fmadd FPU instruction. I suspect the author of that code wrote it this way for similar reasons.

I am inclined to keep my proposed implementation, which is consistent with the existing PowerPC code. Thoughts?
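
To make the boundary case concrete, here is a rough standalone sketch of the check for the add form (a*b + c). It is not QEMU or softfloat code: it uses host doubles and <math.h>, and the name madd_is_vxisi is hypothetical. The idea is that a predicate like this could run on the operands before a fused call such as float64_muladd, so that the VXISI bit can still be set separately from the general VX bit.

```c
#include <math.h>
#include <stdbool.h>

/*
 * Hypothetical helper (an assumption for illustration, not an existing
 * QEMU or softfloat function). Returns true when a*b + c is the
 * "infinity minus infinity" case, i.e. the exact product a*b is an
 * infinity and c is an infinity of the opposite sign.
 */
static bool madd_is_vxisi(double a, double b, double c)
{
    /* A NaN operand is a different condition; c must be an infinity. */
    if (isnan(a) || isnan(b) || !isinf(c)) {
        return false;
    }
    /* The exact product is infinite only if a or b is infinite. */
    if (!isinf(a) && !isinf(b)) {
        return false;
    }
    /* inf * 0 is an invalid multiply, not an inf - inf case. */
    if (a == 0.0 || b == 0.0) {
        return false;
    }
    /* The sign of the product is the XOR of the operand signs. */
    bool prod_neg = (signbit(a) != 0) != (signbit(b) != 0);
    bool c_neg = signbit(c) != 0;
    return prod_neg != c_neg;
}
```

For the subtract forms the sign test would be reversed, which is what the "cmp" macro argument does in the snippet above.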