From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([209.51.188.92]:48276)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1guIRc-0006Th-Hj
	for qemu-devel@nongnu.org; Thu, 14 Feb 2019 09:56:59 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1guIRL-0006TX-2K
	for qemu-devel@nongnu.org; Thu, 14 Feb 2019 09:56:46 -0500
Received: from mail-pf1-x444.google.com ([2607:f8b0:4864:20::444]:40462)
	by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16)
	(Exim 4.71) (envelope-from ) id 1guIRC-0006Hm-Fj
	for qemu-devel@nongnu.org; Thu, 14 Feb 2019 09:56:32 -0500
Received: by mail-pf1-x444.google.com with SMTP id h1so3217106pfo.7
	for ; Thu, 14 Feb 2019 06:56:26 -0800 (PST)
References: <20190214034345.24722-1-richard.henderson@linaro.org>
	<20190214034345.24722-2-richard.henderson@linaro.org>
From: Richard Henderson
Date: Thu, 14 Feb 2019 06:56:22 -0800
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 1/4] target/arm: Add helpers for FMLAL and FMLSL
To: Laurent Desnogues
Cc: "qemu-devel@nongnu.org", Peter Maydell, qemu-arm

On 2/14/19 1:16 AM, Laurent Desnogues wrote:
> Hello,
>
> On Thu, Feb 14, 2019 at 5:00 AM Richard Henderson wrote:
>>
>> Note that float16_to_float32 rightly squashes SNaN to QNaN.
>> But of course pickNaNMulAdd, for ARM, selects SNaNs first.
>> So we have to preserve SNaN long enough for the correct NaN
>> to be selected. Thus float16_to_float32_by_bits.
>>
>> Signed-off-by: Richard Henderson
>> ---
>>  target/arm/helper.h     |   9 +++
>>  target/arm/vec_helper.c | 154 ++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 163 insertions(+)
>>
>> diff --git a/target/arm/helper.h b/target/arm/helper.h
>> index 53a38188c6..0302e13604 100644
>> --- a/target/arm/helper.h
>> +++ b/target/arm/helper.h
>> @@ -653,6 +653,15 @@ DEF_HELPER_FLAGS_6(gvec_fmla_idx_s, TCG_CALL_NO_RWG,
>>  DEF_HELPER_FLAGS_6(gvec_fmla_idx_d, TCG_CALL_NO_RWG,
>>                     void, ptr, ptr, ptr, ptr, ptr, i32)
>>
>> +DEF_HELPER_FLAGS_5(gvec_fmlal_h, TCG_CALL_NO_RWG,
>> +                   void, ptr, ptr, ptr, ptr, i32)
>> +DEF_HELPER_FLAGS_5(gvec_fmlsl_h, TCG_CALL_NO_RWG,
>> +                   void, ptr, ptr, ptr, ptr, i32)
>> +DEF_HELPER_FLAGS_5(gvec_fmlal_idx_h, TCG_CALL_NO_RWG,
>> +                   void, ptr, ptr, ptr, ptr, i32)
>> +DEF_HELPER_FLAGS_5(gvec_fmlsl_idx_h, TCG_CALL_NO_RWG,
>> +                   void, ptr, ptr, ptr, ptr, i32)
>> +
>>  #ifdef TARGET_AARCH64
>>  #include "helper-a64.h"
>>  #include "helper-sve.h"
>> diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
>> index 37f338732e..0c3b3de961 100644
>> --- a/target/arm/vec_helper.c
>> +++ b/target/arm/vec_helper.c
>> @@ -766,3 +766,157 @@ DO_FMLA_IDX(gvec_fmla_idx_s, float32, H4)
>>  DO_FMLA_IDX(gvec_fmla_idx_d, float64, )
>>
>>  #undef DO_FMLA_IDX
>> +
>> +/*
>> + * Convert float16 to float32, raising no exceptions and
>> + * preserving exceptional values, including SNaN.
>> + * This is effectively an unpack+repack operation.
>> + */
>> +static float32 float16_to_float32_by_bits(uint32_t f16)
>> +{
>> +    const int f16_bias = 15;
>> +    const int f32_bias = 127;
>> +    uint32_t sign = extract32(f16, 15, 1);
>> +    uint32_t exp = extract32(f16, 10, 5);
>> +    uint32_t frac = extract32(f16, 0, 10);
>> +
>> +    if (exp == 0x1f) {
>> +        /* Inf or NaN */
>> +        exp = 0xff;
>> +    } else if (exp == 0) {
>> +        /* Zero or denormal. */
>> +        if (frac != 0) {
>> +            /*
>> +             * Denormal; these are all normal float32.
>> +             * Shift the fraction so that the msb is at bit 11,
>> +             * then remove bit 11 as the implicit bit of the
>> +             * normalized float32. Note that we still go through
>> +             * the shift for normal numbers below, to put the
>> +             * float32 fraction at the right place.
>> +             */
>> +            int shift = clz32(frac) - 21;
>> +            frac = (frac << shift) & 0x3ff;
>> +            exp = f32_bias - f16_bias - shift + 1;
>
> If FZ16 is set, this should flush to zero.

Ho, hum, yes it should.

> This means you will have to use both fp_status (for the muladd) and
> fp_status_f16 (for this function) and so you should pass cpu_env to
> the helpers rather than the fp_status.

It's not quite as simple as that, because aa32 mode would pass
standard_fp_status. I'll figure something out...


r~
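
For reference, a minimal standalone sketch of the unpack+repack conversion
with the flush-to-zero case folded in. This is not the code from the patch
or from any later revision: the function name f16_bits_to_f32_bits and the
flush_inputs parameter are hypothetical, standing in for however the FZ16
state ends up reaching the helper, and plain shifts replace extract32 and
clz32 so that it builds outside the QEMU tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch only: widen IEEE half-precision bits to single-precision bits
 * without raising exceptions and without quieting SNaNs.
 * flush_inputs is a hypothetical stand-in for the FZ16 control.
 */
static uint32_t f16_bits_to_f32_bits(uint16_t f16, bool flush_inputs)
{
    const int f16_bias = 15;
    const int f32_bias = 127;
    uint32_t sign = (f16 >> 15) & 1;
    uint32_t exp = (f16 >> 10) & 0x1f;
    uint32_t frac = f16 & 0x3ff;

    if (exp == 0x1f) {
        /* Inf or NaN: keep the fraction bits, so an SNaN stays an SNaN. */
        exp = 0xff;
    } else if (exp == 0) {
        if (frac == 0 || flush_inputs) {
            /* Zero, or a denormal flushed to a zero of the same sign. */
            frac = 0;
        } else {
            /* Denormal: shift the msb up to bit 10, then drop it. */
            int shift = 0;
            while (!(frac & 0x400)) {
                frac <<= 1;
                shift++;
            }
            frac &= 0x3ff;
            exp = f32_bias - f16_bias - shift + 1;
        }
    } else {
        /* Normal number: only the exponent bias changes. */
        exp += f32_bias - f16_bias;
    }

    /* The 10-bit fraction sits at the top of the 23-bit float32 fraction. */
    return (sign << 31) | (exp << 23) | (frac << 13);
}
```

With flush_inputs false this should give the same bit patterns as the
quoted function for every input; with it true, denormal inputs become a
signed zero before the multiply-add ever sees them.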