[Qemu-devel] [V4 PATCH 14/22] softfloat: Factor out RoundAndPackFloat16 and NormalizeFloat16Subnormal


From: Tom Musta <tommusta@gmail.com>
To: qemu-devel@nongnu.org
Cc: Tom Musta <tommusta@gmail.com>,
	qemu-ppc@nongnu.org, Peter Maydell <peter.maydell@linaro.org>
Subject: [Qemu-devel] [V4 PATCH 14/22] softfloat: Factor out RoundAndPackFloat16 and NormalizeFloat16Subnormal
Date: Tue,  7 Jan 2014 10:06:02 -0600
Message-ID: <1389110770-5199-15-git-send-email-tommusta@gmail.com>
In-Reply-To: <1389110770-5199-1-git-send-email-tommusta@gmail.com>

From: Peter Maydell <peter.maydell@linaro.org>

In preparation for adding conversions between float16 and float64,
factor out code currently done inline in the float16<=>float32
conversion functions into functions RoundAndPackFloat16 and
NormalizeFloat16Subnormal along the lines of the existing versions
for the other float types.

Note that we change the handling of zExp from the inline code
to match the API of the other RoundAndPackFloat functions; however
we leave the positioning of the binary point between bits 22 and 23
rather than shifting it up to the high end of the word.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1389013881-15726-14-git-send-email-peter.maydell@linaro.org>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
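
The exponent convention described in the commit message can be checked
with a small standalone sketch. It is not part of the patch: the helper
names below are hypothetical, and clz32() is only a portable stand-in
for softfloat's countLeadingZeros32(). It assumes the caller rebases the
float32 biased exponent by 0x71 (as float32_to_float16 does after this
change) where the old inline code used 0x7f.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none exist in fpu/softfloat.c. */

/* Exponent used by the old inline code: the unbiased float32 exponent. */
int old_inline_exp(int biased32)
{
    return biased32 - 0x7f;
}

/* Exponent handed to roundAndPackFloat16 after this patch. */
int new_zexp(int biased32)
{
    return biased32 - 0x71;
}

/* Portable stand-in for countLeadingZeros32(). */
int clz32(uint32_t x)
{
    int n = 0;

    if (x == 0) {
        return 32;
    }
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Same arithmetic as normalizeFloat16Subnormal: shift the 10-bit
 * fraction of a float16 subnormal until bit 10 is set, and report
 * the exponent that compensates for the shift.
 */
void normalize_f16_subnormal(uint32_t sig, int *exp_out, uint32_t *sig_out)
{
    int shift;

    assert(sig != 0 && sig < 0x400);   /* must be a non-zero fraction */
    shift = clz32(sig) - 21;
    *sig_out = sig << shift;
    *exp_out = 1 - shift;
}
```

For 1.0f (float32 biased exponent 0x7f) new_zexp() gives 14, which is
14 more than the old unbiased value of 0 and one less than the float16
biased exponent 15. For the smallest subnormal fraction (sig == 1),
normalize_f16_subnormal() yields a significand of 0x400 and an exponent
of -9.
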
 fpu/softfloat.c |  209 +++++++++++++++++++++++++++++++++----------------------
 1 files changed, 125 insertions(+), 84 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index f95c964..2cefd81 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3100,6 +3100,127 @@ static float16 packFloat16(flag zSign, int_fast16_t zExp, uint16_t zSig)
         (((uint32_t)zSign) << 15) + (((uint32_t)zExp) << 10) + zSig);
 }
 
+/*----------------------------------------------------------------------------
+| Takes an abstract floating-point value having sign `zSign', exponent `zExp',
+| and significand `zSig', and returns the proper half-precision floating-
+| point value corresponding to the abstract input.  Ordinarily, the abstract
+| value is simply rounded and packed into the half-precision format, with
+| the inexact exception raised if the abstract input cannot be represented
+| exactly.  However, if the abstract value is too large, the overflow and
+| inexact exceptions are raised and an infinity or maximal finite value is
+| returned.  If the abstract value is too small, the input value is rounded to
+| a subnormal number, and the underflow and inexact exceptions are raised if
+| the abstract input cannot be represented exactly as a subnormal half-
+| precision floating-point number.
+| The `ieee' flag indicates whether to use IEEE standard half precision, or
+| ARM-style "alternative representation", which omits the NaN and Inf
+| encodings in order to raise the maximum representable exponent by one.
+|     The input significand `zSig' has its binary point between bits 22
+| and 23, which is 13 bits to the left of the usual location.  This shifted
+| significand must be normalized or smaller.  If `zSig' is not normalized,
+| `zExp' must be 0; in that case, the result returned is a subnormal number,
+| and it must not require rounding.  In the usual case that `zSig' is
+| normalized, `zExp' must be 1 less than the ``true'' floating-point exponent.
+| Note the slightly odd position of the binary point in zSig compared with the
+| other roundAndPackFloat functions. This should probably be fixed if we
+| need to implement more float16 routines than just conversion.
+| The handling of underflow and overflow follows the IEC/IEEE Standard for
+| Binary Floating-Point Arithmetic.
+*----------------------------------------------------------------------------*/
+
+static float16 roundAndPackFloat16(flag zSign, int_fast16_t zExp,
+                                   uint32_t zSig, flag ieee STATUS_PARAM)
+{
+    int maxexp = ieee ? 29 : 30;
+    uint32_t mask;
+    uint32_t increment;
+    int8 roundingMode;
+    bool rounding_bumps_exp;
+    bool is_tiny = false;
+
+    /* Calculate the mask of bits of the mantissa which are not
+     * representable in half-precision and will be lost.
+     */
+    if (zExp < 1) {
+        /* Will be denormal in halfprec */
+        mask = 0x00ffffff;
+        if (zExp >= -11) {
+            mask >>= 11 + zExp;
+        }
+    } else {
+        /* Normal number in halfprec */
+        mask = 0x00001fff;
+    }
+
+    roundingMode = STATUS(float_rounding_mode);
+    switch (roundingMode) {
+    case float_round_nearest_even:
+        increment = (mask + 1) >> 1;
+        if ((zSig & mask) == increment) {
+            increment = zSig & (increment << 1);
+        }
+        break;
+    case float_round_up:
+        increment = zSign ? 0 : mask;
+        break;
+    case float_round_down:
+        increment = zSign ? mask : 0;
+        break;
+    default: /* round_to_zero */
+        increment = 0;
+        break;
+    }
+
+    rounding_bumps_exp = (zSig + increment >= 0x01000000);
+
+    if (zExp > maxexp || (zExp == maxexp && rounding_bumps_exp)) {
+        if (ieee) {
+            float_raise(float_flag_overflow | float_flag_inexact STATUS_VAR);
+            return packFloat16(zSign, 0x1f, 0);
+        } else {
+            float_raise(float_flag_invalid STATUS_VAR);
+            return packFloat16(zSign, 0x1f, 0x3ff);
+        }
+    }
+
+    if (zExp < 0) {
+        /* Note that flush-to-zero does not affect half-precision results */
+        is_tiny =
+            (STATUS(float_detect_tininess) == float_tininess_before_rounding)
+            || (zExp < -1)
+            || (!rounding_bumps_exp);
+    }
+    if (zSig & mask) {
+        float_raise(float_flag_inexact STATUS_VAR);
+        if (is_tiny) {
+            float_raise(float_flag_underflow STATUS_VAR);
+        }
+    }
+
+    zSig += increment;
+    if (rounding_bumps_exp) {
+        zSig >>= 1;
+        zExp++;
+    }
+
+    if (zExp < -10) {
+        return packFloat16(zSign, 0, 0);
+    }
+    if (zExp < 0) {
+        zSig >>= -zExp;
+        zExp = 0;
+    }
+    return packFloat16(zSign, zExp, zSig >> 13);
+}
+
+static void normalizeFloat16Subnormal(uint32_t aSig, int_fast16_t *zExpPtr,
+                                      uint32_t *zSigPtr)
+{
+    int8_t shiftCount = countLeadingZeros32(aSig) - 21;
+    *zSigPtr = aSig << shiftCount;
+    *zExpPtr = 1 - shiftCount;
+}
+
 /* Half precision floats come in two formats: standard IEEE and "ARM" format.
    The latter gains extra exponent range by omitting the NaN/Inf encodings.  */
 
@@ -3120,15 +3241,12 @@ float32 float16_to_float32(float16 a, flag ieee STATUS_PARAM)
         return packFloat32(aSign, 0xff, 0);
     }
     if (aExp == 0) {
-        int8 shiftCount;
-
         if (aSig == 0) {
             return packFloat32(aSign, 0, 0);
         }
 
-        shiftCount = countLeadingZeros32( aSig ) - 21;
-        aSig = aSig << shiftCount;
-        aExp = -shiftCount;
+        normalizeFloat16Subnormal(aSig, &aExp, &aSig);
+        aExp--;
     }
     return packFloat32( aSign, aExp + 0x70, aSig << 13);
 }
@@ -3138,12 +3256,6 @@ float16 float32_to_float16(float32 a, flag ieee STATUS_PARAM)
     flag aSign;
     int_fast16_t aExp;
     uint32_t aSig;
-    uint32_t mask;
-    uint32_t increment;
-    int8 roundingMode;
-    int maxexp = ieee ? 15 : 16;
-    bool rounding_bumps_exp;
-    bool is_tiny = false;
 
     a = float32_squash_input_denormal(a STATUS_VAR);
 
@@ -3178,80 +3290,9 @@ float16 float32_to_float16(float32 a, flag ieee STATUS_PARAM)
      * codepath.
      */
     aSig |= 0x00800000;
-    aExp -= 0x7f;
-    /* Calculate the mask of bits of the mantissa which are not
-     * representable in half-precision and will be lost.
-     */
-    if (aExp < -14) {
-        /* Will be denormal in halfprec */
-        mask = 0x00ffffff;
-        if (aExp >= -24) {
-            mask >>= 25 + aExp;
-        }
-    } else {
-        /* Normal number in halfprec */
-        mask = 0x00001fff;
-    }
+    aExp -= 0x71;
 
-    roundingMode = STATUS(float_rounding_mode);
-    switch (roundingMode) {
-    case float_round_nearest_even:
-        increment = (mask + 1) >> 1;
-        if ((aSig & mask) == increment) {
-            increment = aSig & (increment << 1);
-        }
-        break;
-    case float_round_up:
-        increment = aSign ? 0 : mask;
-        break;
-    case float_round_down:
-        increment = aSign ? mask : 0;
-        break;
-    default: /* round_to_zero */
-        increment = 0;
-        break;
-    }
-
-    rounding_bumps_exp = (aSig + increment >= 0x01000000);
-
-    if (aExp > maxexp || (aExp == maxexp && rounding_bumps_exp)) {
-        if (ieee) {
-            float_raise(float_flag_overflow | float_flag_inexact STATUS_VAR);
-            return packFloat16(aSign, 0x1f, 0);
-        } else {
-            float_raise(float_flag_invalid STATUS_VAR);
-            return packFloat16(aSign, 0x1f, 0x3ff);
-        }
-    }
-
-    if (aExp < -14) {
-        /* Note that flush-to-zero does not affect half-precision results */
-        is_tiny =
-            (STATUS(float_detect_tininess) == float_tininess_before_rounding)
-            || (aExp < -15)
-            || (!rounding_bumps_exp);
-    }
-    if (aSig & mask) {
-        float_raise(float_flag_inexact STATUS_VAR);
-        if (is_tiny) {
-            float_raise(float_flag_underflow STATUS_VAR);
-        }
-    }
-
-    aSig += increment;
-    if (rounding_bumps_exp) {
-        aSig >>= 1;
-        aExp++;
-    }
-
-    if (aExp < -24) {
-        return packFloat16(aSign, 0, 0);
-    }
-    if (aExp < -14) {
-        aSig >>= -14 - aExp;
-        aExp = -14;
-    }
-    return packFloat16(aSign, aExp + 14, aSig >> 13);
+    return roundAndPackFloat16(aSign, aExp, aSig, ieee STATUS_VAR);
 }
 
 /*----------------------------------------------------------------------------
-- 
1.7.1
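
The mask and round-to-nearest-even increment computed at the top of
roundAndPackFloat16 can be tried in isolation with the sketch below.
The function names are hypothetical; the arithmetic is copied from the
patch, and zSig is assumed to have its binary point between bits 22 and
23 as the function comment requires.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names; the arithmetic mirrors the patch. */

/* Bits of zSig that cannot be represented in half precision. */
uint32_t f16_lost_bits_mask(int zExp)
{
    uint32_t mask;

    if (zExp < 1) {
        /* Subnormal result: more low bits are lost as zExp falls. */
        mask = 0x00ffffff;
        if (zExp >= -11) {
            mask >>= 11 + zExp;
        }
    } else {
        /* Normal result: the low 13 bits do not fit. */
        mask = 0x00001fff;
    }
    return mask;
}

/* Increment used for the float_round_nearest_even case. */
uint32_t f16_nearest_even_increment(uint32_t zSig, uint32_t mask)
{
    uint32_t increment;

    assert((mask & (mask + 1)) == 0);   /* mask must be 2^k - 1 */
    increment = (mask + 1) >> 1;        /* half of the last kept bit */
    if ((zSig & mask) == increment) {
        /* Exact tie: round up only if the last kept bit is set. */
        increment = zSig & (increment << 1);
    }
    return increment;
}
```

With the normal-number mask 0x1fff, a tie whose last kept bit is clear
(zSig == 0x00801000) gets an increment of 0, while a tie whose last kept
bit is set (zSig == 0x00803000) gets 0x2000 rather than 0x1000. Both
values carry into the same kept bits once the low 13 bits are shifted
out, so the result is the same.
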

