From: Alex Bennée
To: Peter Maydell
Cc: Richard Henderson, Laurent Vivier, bharata@linux.vnet.ibm.com,
    Andrew Dutcher, QEMU Developers, Aurelien Jarno
Subject: Re: [Qemu-devel] [PATCH v2 11/20] fpu/softfloat: re-factor add/sub
Date: Tue, 23 Jan 2018 20:05:53 +0000
Message-ID: <87inbs7332.fsf@linaro.org>
References: <20180109122252.17670-1-alex.bennee@linaro.org>
 <20180109122252.17670-12-alex.bennee@linaro.org>

Peter Maydell writes:

>
>> +                                     float_status *status)
>> +{
>> +    if (part.exp == parm->exp_max) {
>> +        if (part.frac == 0) {
>> +            part.cls = float_class_inf;
>> +        } else {
>> +#ifdef NO_SIGNALING_NANS
>
> The old code didn't seem to need to ifdef this; why's the new
> code different? (at some point we'll want to make this a runtime
> setting so we can support one binary handling CPUs with it both
> set and unset, but that is a far future thing we can ignore for now)

It does, but it's hidden behind propagateFloatXXNaN, which in turn
calls floatXX_is_quiet/signalling_nan, and those are what the #ifdefs
alter (rough sketch further down).

>
>> +            part.cls = float_class_qnan;
>> +#else
>> +            int64_t msb = part.frac << (parm->frac_shift + 2);
>> +            if ((msb < 0) == status->snan_bit_is_one) {
>> +                part.cls = float_class_snan;
>> +            } else {
>> +                part.cls = float_class_qnan;
>> +            }
>> +#endif
>> +        }
>> +    } else if (part.exp == 0) {
>> +        if (likely(part.frac == 0)) {
>> +            part.cls = float_class_zero;
>> +        } else if (status->flush_inputs_to_zero) {
>> +            float_raise(float_flag_input_denormal, status);
>> +            part.cls = float_class_zero;
>> +            part.frac = 0;
>> +        } else {
>> +            int shift = clz64(part.frac) - 1;
>> +            part.cls = float_class_normal;
>
> This is really confusing. This is a *denormal*, but we're setting
> the classification to "normal"? (It's particularly confusing in
> the code that uses the decomposed numbers, because it looks like
> "if (a.cls == float_class_normal...)" is handling the normal-number
> case and denormals are going to be in a later if branch, but actually
> it's dealing with both.)

The code deals with canonicalized numbers, so unless we explicitly
flush denormals to zero they can be treated like any other value for
the rest of the code. What would you prefer? A comment in FloatClass?
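Something like this, perhaps (a sketch only - the class names are the
ones the patch already uses, but the comment wording and the
abbreviated enum around them are just illustrative):

    /*
     * Classification of a decomposed (canonicalized) float.
     *
     * Note that float_class_normal also covers values that arrived as
     * denormals: by the time the canonicalize step has run, their
     * fraction has been re-normalised and the exponent adjusted, so
     * the arithmetic code never needs a separate denormal case
     * (unless flush_inputs_to_zero has already turned them into
     * zeros).
     */
    typedef enum {
        float_class_zero,
        float_class_normal,   /* includes re-normalised denormal inputs */
        float_class_inf,
        float_class_qnan,
        float_class_snan,
        /* ... */
    } FloatClass;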
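And to expand on the NO_SIGNALING_NANS answer above, this is roughly
the kind of logic that sits behind the is_quiet/is_signalling helpers
today - an illustrative float16-only sketch, not the exact code in
softfloat-specialize.h (snan_bit_is_one is passed as a plain bool here
to keep the example self-contained):

    #include <stdbool.h>
    #include <stdint.h>

    /* A float16 is some kind of NaN when the exponent is all ones and
     * the fraction is non-zero (ignoring the sign bit). */
    static bool f16_is_any_nan(uint16_t a)
    {
        return (a & 0x7fff) > 0x7c00;
    }

    /* Which NaNs are signalling depends on snan_bit_is_one, and the
     * question disappears entirely when NO_SIGNALING_NANS is defined. */
    static bool f16_is_signaling_nan(uint16_t a, bool snan_bit_is_one)
    {
    #ifdef NO_SIGNALING_NANS
        return false;                /* target has no signalling NaNs */
    #else
        bool frac_msb = a & 0x0200;  /* bit 9: MSB of the 10-bit fraction */
        return f16_is_any_nan(a) && (frac_msb == snan_bit_is_one);
    #endif
    }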
>> +
>> +static float16 float16_round_pack_canonical(decomposed_parts p, float_status *s)
>> +{
>> +    switch (p.cls) {
>> +    case float_class_dnan:
>> +        return float16_default_nan(s);
>> +    case float_class_msnan:
>> +        return float16_maybe_silence_nan(float16_pack_raw(p), s);
>
> I think you will find that doing the silencing of the NaNs like this
> isn't quite the right approach. Specifically, for Arm targets we
> currently have a bug in float-to-float conversion from a wider
> format to a narrower one when the input is a signaling NaN that we
> want to silence, and its non-zero mantissa bits are all at the
> less-significant end of the mantissa such that they don't fit into
> the narrower format. If you pack the float into a float16 first and
> then call maybe_silence_nan() on it you've lost the info about those
> low bits which the silence function needs to know to return the
> right answer. What you want to do instead is pass the silence_nan
> function the decomposed value.

So this is an inherited bug from softfloat-specialize.h? I guess we
need a common decomposed specialisation we can use for all the sizes
(rough sketch further down).

>
> (The effect of this bug is that we return a default NaN, with the
> sign bit clear, but the Arm FPConvertNaN pseudocode says that we
> should effectively get the default NaN but with the same sign bit
> as the input SNaN.)
>
> Given that this is a bug currently in the version we have, we don't
> necessarily need to fix it now, but I thought I'd mention it since
> the redesign has almost but not quite managed to deliver the right
> information to the silencing code to allow us to fix it soon :-)

So a comment for now? Currently all the information for the decomposed
form is kept internal to softfloat.c - I'm not sure we want to expose
the internals to a wider audience? Especially as these inline helpers
in specialize.h are also used by target helpers.

>> +
>> +
>> +/*
>> + * Returns the result of adding the absolute values of the
>> + * floating-point values `a' and `b'. If `subtract' is set, the sum is
>> + * negated before being returned. `subtract' is ignored if the result
>> + * is a NaN. The addition is performed according to the IEC/IEEE
>> + * Standard for Binary Floating-Point Arithmetic.
>> + */
>
> This comment doesn't seem to match what the code is doing, because it
> says it adds the absolute values of 'a' and 'b', but the code looks at
> a_sign and b_sign to decide whether it's doing an addition or
> subtraction rather than ignoring the signs (as you would for absolute
> arithmetic).
>
> Put another way, this comment has been copied from the old
> addFloat64Sigs() and not updated to account for the way the new
> function includes handling of subFloat64Sigs().
>
>> +
>> +static decomposed_parts add_decomposed(decomposed_parts a, decomposed_parts b,
>> +                                       bool subtract, float_status *s)
>> +{
>> +    bool a_sign = a.sign;
>> +    bool b_sign = b.sign ^ subtract;
>> +
>> +    if (a_sign != b_sign) {
>> +        /* Subtraction */
>> +
>> +        if (a.cls == float_class_normal && b.cls == float_class_normal) {
>> +            int a_exp = a.exp;
>> +            int b_exp = b.exp;
>> +            uint64_t a_frac = a.frac;
>> +            uint64_t b_frac = b.frac;
>
> Do we really have to use locals here rather than just using a.frac,
> b.frac etc in place? If we trust the compiler enough to throw
> structs in and out of functions and let everything inline, it
> ought to be able to handle a uint64_t in a struct local variable.

Fixed.
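Going back to the NaN-silencing question above, the rough shape I had
in mind for a decomposed-level specialisation is below. This is only a
sketch: parts_silence_nan and parts_default_nan don't exist yet, and
the quiet-bit constant assumes the fraction is kept left-aligned under
the implicit bit, which may not match the final layout.

    static decomposed_parts parts_silence_nan(decomposed_parts p,
                                              float_status *s)
    {
        if (s->snan_bit_is_one) {
            /* Legacy-MIPS style: quiet NaNs have the top fraction bit
             * clear, so quieting would mean clearing a bit and possibly
             * zeroing the fraction - hand back the default NaN instead
             * (as the packed helpers do today, I believe). */
            p = parts_default_nan(s);                /* hypothetical helper */
        } else {
            /* IEEE style: set the quiet bit, keep sign and payload. */
            p.frac |= DECOMPOSED_IMPLICIT_BIT >> 1;  /* assumed layout */
            p.cls = float_class_qnan;
        }
        return p;
    }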
>> +        if (a.cls >= float_class_qnan
>> +            ||
>> +            b.cls >= float_class_qnan) {
>
> We should have helper functions for "is some kind of NaN" rather than
> baking the assumption about order of the enum values directly
> into every function. (Also "float_is_any_nan(a)" is easier to read.)

Something like:

    if (is_nan(a.cls) || is_nan(b.cls))

>> +float64 float64_sub(float64 a, float64 b, float_status *status)
>> +{
>> +    decomposed_parts pa = float64_unpack_canonical(a, status);
>> +    decomposed_parts pb = float64_unpack_canonical(b, status);
>> +    decomposed_parts pr = add_decomposed(pa, pb, true, status);
>> +
>> +    return float64_round_pack_canonical(pr, status);
>> +}
>
> This part is a pretty good advert for the benefits of the refactoring...
>
> I'm not particularly worried about the performance of softfloat,
> but out of curiosity have you benchmarked the old vs new?

Not yet, but I can run some numbers with my vector kernel benchmark.

>
> thanks
> -- PMM

--
Alex Bennée