* [Qemu-devel] [PATCH 0/4] target/arm: Fix int64_to_float16 double-rounding
@ 2018-08-14 0:26 Richard Henderson
From: Richard Henderson @ 2018-08-14 0:26 UTC
To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, alex.bennee
In 88808a022c0, I tried to fix an overflow problem that affected float16
scaling by converting first to float64 and then rounding after that.
However, Laurent reported that -0x3ff40000000001 converted to float16
resulted in 0xfbfe instead of the expected 0xfbff. This is caused by
the inexact conversion to float64.
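
To illustrate the double rounding, here is a minimal standalone sketch. It is
not QEMU code: bit_width() and round_to_bits() are hypothetical helpers that
model only round-to-nearest-even on the magnitude of an integer, keeping 53
significant bits for float64 and 11 for float16.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of significant bits in v (0 for v == 0). */
static int bit_width(uint64_t v)
{
    int n = 0;
    while (v) {
        n++;
        v >>= 1;
    }
    return n;
}

/*
 * Hypothetical helper: round v to at most `bits` significant bits using
 * round-to-nearest, ties-to-even.  Only the significand is modelled;
 * exponent range, sign and exception flags are ignored.
 */
static uint64_t round_to_bits(uint64_t v, int bits)
{
    assert(bits > 0 && bits < 64);

    int width = bit_width(v);
    if (width <= bits) {
        return v;                       /* already exact */
    }

    int drop = width - bits;            /* low bits to discard */
    uint64_t half = 1ULL << (drop - 1);
    uint64_t rem = v & ((1ULL << drop) - 1);
    uint64_t kept = v >> drop;

    if (rem > half || (rem == half && (kept & 1))) {
        kept++;                         /* round up, or break tie to even */
    }
    /* A carry out of bit 63 is outside the scope of this sketch. */
    assert(((kept << drop) >> drop) == kept);
    return kept << drop;
}
```

With mag = 0x3ff40000000001 (54 significant bits), round_to_bits(mag, 53)
drops the trailing 1 on a tie and yields 0x3ff40000000000. Rounding that to 11
bits hits a second tie and keeps the even significand 0x7fe, whereas
round_to_bits(mag, 11) sees a remainder just above one half and rounds up to
0x7ff. The low ten bits, 0x3fe and 0x3ff, match the fraction fields of 0xfbfe
and 0xfbff.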
Rather than build more logic into target/arm to compensate, just add
a function that takes a scaling parameter so that the whole thing is
done all at once with only one rounding.
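
As a rough picture of what "one rounding" means, the sketch below converts an
int64 times 2^scale straight to float16 bits. It is an illustration under
stated assumptions, not the softfloat implementation: int64_to_half_scaled()
is a hypothetical name, only round-to-nearest-even is modelled, no exception
flags are raised, and results below the normal float16 range are out of scope.
The scale of -38 used in the note after the code is inferred from the reported
outputs; the report quoted above does not state it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: convert a * 2^scale to IEEE half-precision bits
 * with a single round-to-nearest-even step.  Handles zero, normal
 * results and overflow to infinity; subnormal results trip the assert.
 */
static uint16_t int64_to_half_scaled(int64_t a, int scale)
{
    uint16_t sign = a < 0 ? 0x8000 : 0;
    uint64_t f = a < 0 ? -(uint64_t)a : (uint64_t)a;   /* magnitude */

    if (f == 0) {
        return 0;                                      /* +0 */
    }

    /* Clamp the scale so the exponent arithmetic cannot overflow. */
    if (scale < -0x10000) {
        scale = -0x10000;
    } else if (scale > 0x10000) {
        scale = 0x10000;
    }

    int width = 0;
    for (uint64_t t = f; t; t >>= 1) {
        width++;
    }

    int e = width - 1 + scale;          /* unbiased exponent of leading bit */
    uint64_t sig;                       /* 11-bit significand, implicit 1 */

    if (width > 11) {
        int drop = width - 11;
        uint64_t half = 1ULL << (drop - 1);
        uint64_t rem = f & ((1ULL << drop) - 1);

        sig = f >> drop;
        if (rem > half || (rem == half && (sig & 1))) {
            sig++;                      /* the only rounding step */
        }
        if (sig == 0x800) {             /* carry: renormalise */
            sig = 0x400;
            e++;
        }
    } else {
        sig = f << (11 - width);        /* exact, no rounding needed */
    }

    if (e > 15) {
        return sign | 0x7c00;           /* overflow to infinity */
    }
    assert(e >= -14);                   /* subnormals not modelled */

    return (uint16_t)(sign | ((unsigned)(e + 15) << 10) | (sig & 0x3ff));
}
```

Under these assumptions, int64_to_half_scaled(-0x3ff40000000001, -38) returns
0xfbff, the value expected in the report, because the trailing 1 is still
visible when the single rounding decision is made.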
I don't have a failing test case for the float-to-int paths, but it
seemed best to apply the same solution.
r~
Richard Henderson (4):
softfloat: Add scaling int-to-float routines
softfloat: Add scaling float-to-int routines
target/arm: Use the int-to-float-scale softfloat routines
target/arm: Use the float-to-int-scale softfloat routines
include/fpu/softfloat.h | 169 ++++++++----
fpu/softfloat.c | 579 +++++++++++++++++++++++++++++++---------
target/arm/helper.c | 130 ++++-----
3 files changed, 628 insertions(+), 250 deletions(-)
--
2.17.1
* [Qemu-devel] [PATCH 1/4] softfloat: Add scaling int-to-float routines
From: Richard Henderson @ 2018-08-14 0:26 UTC
To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/fpu/softfloat.h | 56 ++++++++----
fpu/softfloat.c | 188 +++++++++++++++++++++++++++++-----------
2 files changed, 179 insertions(+), 65 deletions(-)
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 69f4dbc4db..038e375e71 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -190,22 +190,54 @@ enum {
/*----------------------------------------------------------------------------
| Software IEC/IEEE integer-to-floating-point conversion routines.
*----------------------------------------------------------------------------*/
+
+float16 int16_to_float16_scalbn(int16_t a, int, float_status *status);
+float16 int32_to_float16_scalbn(int32_t a, int, float_status *status);
+float16 int64_to_float16_scalbn(int64_t a, int, float_status *status);
+float16 uint16_to_float16_scalbn(uint16_t a, int, float_status *status);
+float16 uint32_to_float16_scalbn(uint32_t a, int, float_status *status);
+float16 uint64_to_float16_scalbn(uint64_t a, int, float_status *status);
+
+float16 int16_to_float16(int16_t a, float_status *status);
+float16 int32_to_float16(int32_t a, float_status *status);
+float16 int64_to_float16(int64_t a, float_status *status);
+float16 uint16_to_float16(uint16_t a, float_status *status);
+float16 uint32_to_float16(uint32_t a, float_status *status);
+float16 uint64_to_float16(uint64_t a, float_status *status);
+
+float32 int16_to_float32_scalbn(int16_t, int, float_status *status);
+float32 int32_to_float32_scalbn(int32_t, int, float_status *status);
+float32 int64_to_float32_scalbn(int64_t, int, float_status *status);
+float32 uint16_to_float32_scalbn(uint16_t, int, float_status *status);
+float32 uint32_to_float32_scalbn(uint32_t, int, float_status *status);
+float32 uint64_to_float32_scalbn(uint64_t, int, float_status *status);
+
float32 int16_to_float32(int16_t, float_status *status);
float32 int32_to_float32(int32_t, float_status *status);
-float64 int16_to_float64(int16_t, float_status *status);
-float64 int32_to_float64(int32_t, float_status *status);
+float32 int64_to_float32(int64_t, float_status *status);
float32 uint16_to_float32(uint16_t, float_status *status);
float32 uint32_to_float32(uint32_t, float_status *status);
+float32 uint64_to_float32(uint64_t, float_status *status);
+
+float64 int16_to_float64_scalbn(int16_t, int, float_status *status);
+float64 int32_to_float64_scalbn(int32_t, int, float_status *status);
+float64 int64_to_float64_scalbn(int64_t, int, float_status *status);
+float64 uint16_to_float64_scalbn(uint16_t, int, float_status *status);
+float64 uint32_to_float64_scalbn(uint32_t, int, float_status *status);
+float64 uint64_to_float64_scalbn(uint64_t, int, float_status *status);
+
+float64 int16_to_float64(int16_t, float_status *status);
+float64 int32_to_float64(int32_t, float_status *status);
+float64 int64_to_float64(int64_t, float_status *status);
float64 uint16_to_float64(uint16_t, float_status *status);
float64 uint32_to_float64(uint32_t, float_status *status);
-floatx80 int32_to_floatx80(int32_t, float_status *status);
-float128 int32_to_float128(int32_t, float_status *status);
-float32 int64_to_float32(int64_t, float_status *status);
-float64 int64_to_float64(int64_t, float_status *status);
-floatx80 int64_to_floatx80(int64_t, float_status *status);
-float128 int64_to_float128(int64_t, float_status *status);
-float32 uint64_to_float32(uint64_t, float_status *status);
float64 uint64_to_float64(uint64_t, float_status *status);
+
+floatx80 int32_to_floatx80(int32_t, float_status *status);
+floatx80 int64_to_floatx80(int64_t, float_status *status);
+
+float128 int32_to_float128(int32_t, float_status *status);
+float128 int64_to_float128(int64_t, float_status *status);
float128 uint64_to_float128(uint64_t, float_status *status);
/*----------------------------------------------------------------------------
@@ -227,12 +259,6 @@ int64_t float16_to_int64(float16, float_status *status);
uint64_t float16_to_uint64(float16 a, float_status *status);
int64_t float16_to_int64_round_to_zero(float16, float_status *status);
uint64_t float16_to_uint64_round_to_zero(float16 a, float_status *status);
-float16 int16_to_float16(int16_t a, float_status *status);
-float16 int32_to_float16(int32_t a, float_status *status);
-float16 int64_to_float16(int64_t a, float_status *status);
-float16 uint16_to_float16(uint16_t a, float_status *status);
-float16 uint32_to_float16(uint32_t a, float_status *status);
-float16 uint64_to_float16(uint64_t a, float_status *status);
/*----------------------------------------------------------------------------
| Software half-precision operations.
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 7d63cffdeb..12f373cbad 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1603,81 +1603,122 @@ FLOAT_TO_UINT(64, 64)
* to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
*/
-static FloatParts int_to_float(int64_t a, float_status *status)
+static FloatParts int_to_float(int64_t a, int scale, float_status *status)
{
- FloatParts r = {};
+ FloatParts r = { .sign = false };
+
if (a == 0) {
r.cls = float_class_zero;
- r.sign = false;
- } else if (a == (1ULL << 63)) {
- r.cls = float_class_normal;
- r.sign = true;
- r.frac = DECOMPOSED_IMPLICIT_BIT;
- r.exp = 63;
} else {
- uint64_t f;
- if (a < 0) {
- f = -a;
- r.sign = true;
- } else {
- f = a;
- r.sign = false;
- }
- int shift = clz64(f) - 1;
+ uint64_t f = a;
+ int shift;
+
r.cls = float_class_normal;
- r.exp = (DECOMPOSED_BINARY_POINT - shift);
- r.frac = f << shift;
+ if (a < 0) {
+ f = -f;
+ r.sign = true;
+ }
+ shift = clz64(f) - 1;
+ scale = MIN(MAX(scale, -0x10000), 0x10000);
+
+ r.exp = DECOMPOSED_BINARY_POINT - shift + scale;
+ r.frac = (shift < 0 ? DECOMPOSED_IMPLICIT_BIT : f << shift);
}
return r;
}
+float16 int64_to_float16_scalbn(int64_t a, int scale, float_status *status)
+{
+ FloatParts pa = int_to_float(a, scale, status);
+ return float16_round_pack_canonical(pa, status);
+}
+
+float16 int32_to_float16_scalbn(int32_t a, int scale, float_status *status)
+{
+ return int64_to_float16_scalbn(a, scale, status);
+}
+
+float16 int16_to_float16_scalbn(int16_t a, int scale, float_status *status)
+{
+ return int64_to_float16_scalbn(a, scale, status);
+}
+
float16 int64_to_float16(int64_t a, float_status *status)
{
- FloatParts pa = int_to_float(a, status);
- return float16_round_pack_canonical(pa, status);
+ return int64_to_float16_scalbn(a, 0, status);
}
float16 int32_to_float16(int32_t a, float_status *status)
{
- return int64_to_float16(a, status);
+ return int64_to_float16_scalbn(a, 0, status);
}
float16 int16_to_float16(int16_t a, float_status *status)
{
- return int64_to_float16(a, status);
+ return int64_to_float16_scalbn(a, 0, status);
+}
+
+float32 int64_to_float32_scalbn(int64_t a, int scale, float_status *status)
+{
+ FloatParts pa = int_to_float(a, scale, status);
+ return float32_round_pack_canonical(pa, status);
+}
+
+float32 int32_to_float32_scalbn(int32_t a, int scale, float_status *status)
+{
+ return int64_to_float32_scalbn(a, scale, status);
+}
+
+float32 int16_to_float32_scalbn(int16_t a, int scale, float_status *status)
+{
+ return int64_to_float32_scalbn(a, scale, status);
}
float32 int64_to_float32(int64_t a, float_status *status)
{
- FloatParts pa = int_to_float(a, status);
- return float32_round_pack_canonical(pa, status);
+ return int64_to_float32_scalbn(a, 0, status);
}
float32 int32_to_float32(int32_t a, float_status *status)
{
- return int64_to_float32(a, status);
+ return int64_to_float32_scalbn(a, 0, status);
}
float32 int16_to_float32(int16_t a, float_status *status)
{
- return int64_to_float32(a, status);
+ return int64_to_float32_scalbn(a, 0, status);
+}
+
+float64 int64_to_float64_scalbn(int64_t a, int scale, float_status *status)
+{
+ FloatParts pa = int_to_float(a, scale, status);
+ return float64_round_pack_canonical(pa, status);
+}
+
+float64 int32_to_float64_scalbn(int32_t a, int scale, float_status *status)
+{
+ return int64_to_float64_scalbn(a, scale, status);
+}
+
+float64 int16_to_float64_scalbn(int16_t a, int scale, float_status *status)
+{
+ return int64_to_float64_scalbn(a, scale, status);
}
float64 int64_to_float64(int64_t a, float_status *status)
{
- FloatParts pa = int_to_float(a, status);
- return float64_round_pack_canonical(pa, status);
+ return int64_to_float64_scalbn(a, 0, status);
}
float64 int32_to_float64(int32_t a, float_status *status)
{
- return int64_to_float64(a, status);
+ return int64_to_float64_scalbn(a, 0, status);
}
float64 int16_to_float64(int16_t a, float_status *status)
{
- return int64_to_float64(a, status);
+ return int64_to_float64_scalbn(a, 0, status);
}
@@ -1689,73 +1730,120 @@ float64 int16_to_float64(int16_t a, float_status *status)
* IEC/IEEE Standard for Binary Floating-Point Arithmetic.
*/
-static FloatParts uint_to_float(uint64_t a, float_status *status)
+static FloatParts uint_to_float(uint64_t a, int scale, float_status *status)
{
- FloatParts r = { .sign = false};
+ FloatParts r = { .sign = false };
if (a == 0) {
r.cls = float_class_zero;
} else {
- int spare_bits = clz64(a) - 1;
+ scale = MIN(MAX(scale, -0x10000), 0x10000);
r.cls = float_class_normal;
- r.exp = DECOMPOSED_BINARY_POINT - spare_bits;
- if (spare_bits < 0) {
- shift64RightJamming(a, -spare_bits, &a);
+ if ((int64_t)a < 0) {
+ r.exp = DECOMPOSED_BINARY_POINT + 1 + scale;
+ shift64RightJamming(a, 1, &a);
r.frac = a;
} else {
- r.frac = a << spare_bits;
+ int shift = clz64(a) - 1;
+ r.exp = DECOMPOSED_BINARY_POINT - shift + scale;
+ r.frac = a << shift;
}
}
return r;
}
+float16 uint64_to_float16_scalbn(uint64_t a, int scale, float_status *status)
+{
+ FloatParts pa = uint_to_float(a, scale, status);
+ return float16_round_pack_canonical(pa, status);
+}
+
+float16 uint32_to_float16_scalbn(uint32_t a, int scale, float_status *status)
+{
+ return uint64_to_float16_scalbn(a, scale, status);
+}
+
+float16 uint16_to_float16_scalbn(uint16_t a, int scale, float_status *status)
+{
+ return uint64_to_float16_scalbn(a, scale, status);
+}
+
float16 uint64_to_float16(uint64_t a, float_status *status)
{
- FloatParts pa = uint_to_float(a, status);
- return float16_round_pack_canonical(pa, status);
+ return uint64_to_float16_scalbn(a, 0, status);
}
float16 uint32_to_float16(uint32_t a, float_status *status)
{
- return uint64_to_float16(a, status);
+ return uint64_to_float16_scalbn(a, 0, status);
}
float16 uint16_to_float16(uint16_t a, float_status *status)
{
- return uint64_to_float16(a, status);
+ return uint64_to_float16_scalbn(a, 0, status);
+}
+
+float32 uint64_to_float32_scalbn(uint64_t a, int scale, float_status *status)
+{
+ FloatParts pa = uint_to_float(a, scale, status);
+ return float32_round_pack_canonical(pa, status);
+}
+
+float32 uint32_to_float32_scalbn(uint32_t a, int scale, float_status *status)
+{
+ return uint64_to_float32_scalbn(a, scale, status);
+}
+
+float32 uint16_to_float32_scalbn(uint16_t a, int scale, float_status *status)
+{
+ return uint64_to_float32_scalbn(a, scale, status);
}
float32 uint64_to_float32(uint64_t a, float_status *status)
{
- FloatParts pa = uint_to_float(a, status);
- return float32_round_pack_canonical(pa, status);
+ return uint64_to_float32_scalbn(a, 0, status);
}
float32 uint32_to_float32(uint32_t a, float_status *status)
{
- return uint64_to_float32(a, status);
+ return uint64_to_float32_scalbn(a, 0, status);
}
float32 uint16_to_float32(uint16_t a, float_status *status)
{
- return uint64_to_float32(a, status);
+ return uint64_to_float32_scalbn(a, 0, status);
+}
+
+float64 uint64_to_float64_scalbn(uint64_t a, int scale, float_status *status)
+{
+ FloatParts pa = uint_to_float(a, scale, status);
+ return float64_round_pack_canonical(pa, status);
+}
+
+float64 uint32_to_float64_scalbn(uint32_t a, int scale, float_status *status)
+{
+ return uint64_to_float64_scalbn(a, scale, status);
+}
+
+float64 uint16_to_float64_scalbn(uint16_t a, int scale, float_status *status)
+{
+ return uint64_to_float64_scalbn(a, scale, status);
}
float64 uint64_to_float64(uint64_t a, float_status *status)
{
- FloatParts pa = uint_to_float(a, status);
- return float64_round_pack_canonical(pa, status);
+ return uint64_to_float64_scalbn(a, 0, status);
}
float64 uint32_to_float64(uint32_t a, float_status *status)
{
- return uint64_to_float64(a, status);
+ return uint64_to_float64_scalbn(a, 0, status);
}
float64 uint16_to_float64(uint16_t a, float_status *status)
{
- return uint64_to_float64(a, status);
+ return uint64_to_float64_scalbn(a, 0, status);
}
/* Float Min/Max */
--
2.17.1
* [Qemu-devel] [PATCH 2/4] softfloat: Add scaling float-to-int routines
From: Richard Henderson @ 2018-08-14 0:26 UTC
To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/fpu/softfloat.h | 85 ++++++---
fpu/softfloat.c | 391 ++++++++++++++++++++++++++++++++--------
2 files changed, 379 insertions(+), 97 deletions(-)
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 038e375e71..cc1b58b029 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -243,21 +243,34 @@ float128 uint64_to_float128(uint64_t, float_status *status);
/*----------------------------------------------------------------------------
| Software half-precision conversion routines.
*----------------------------------------------------------------------------*/
+
float16 float32_to_float16(float32, bool ieee, float_status *status);
float32 float16_to_float32(float16, bool ieee, float_status *status);
float16 float64_to_float16(float64 a, bool ieee, float_status *status);
float64 float16_to_float64(float16 a, bool ieee, float_status *status);
+
+int16_t float16_to_int16_scalbn(float16, int, int, float_status *status);
+int32_t float16_to_int32_scalbn(float16, int, int, float_status *status);
+int64_t float16_to_int64_scalbn(float16, int, int, float_status *status);
+
int16_t float16_to_int16(float16, float_status *status);
-uint16_t float16_to_uint16(float16 a, float_status *status);
-int16_t float16_to_int16_round_to_zero(float16, float_status *status);
-uint16_t float16_to_uint16_round_to_zero(float16 a, float_status *status);
int32_t float16_to_int32(float16, float_status *status);
-uint32_t float16_to_uint32(float16 a, float_status *status);
-int32_t float16_to_int32_round_to_zero(float16, float_status *status);
-uint32_t float16_to_uint32_round_to_zero(float16 a, float_status *status);
int64_t float16_to_int64(float16, float_status *status);
-uint64_t float16_to_uint64(float16 a, float_status *status);
+
+int16_t float16_to_int16_round_to_zero(float16, float_status *status);
+int32_t float16_to_int32_round_to_zero(float16, float_status *status);
int64_t float16_to_int64_round_to_zero(float16, float_status *status);
+
+uint16_t float16_to_uint16_scalbn(float16 a, int, int, float_status *status);
+uint32_t float16_to_uint32_scalbn(float16 a, int, int, float_status *status);
+uint64_t float16_to_uint64_scalbn(float16 a, int, int, float_status *status);
+
+uint16_t float16_to_uint16(float16 a, float_status *status);
+uint32_t float16_to_uint32(float16 a, float_status *status);
+uint64_t float16_to_uint64(float16 a, float_status *status);
+
+uint16_t float16_to_uint16_round_to_zero(float16 a, float_status *status);
+uint32_t float16_to_uint32_round_to_zero(float16 a, float_status *status);
uint64_t float16_to_uint64_round_to_zero(float16 a, float_status *status);
/*----------------------------------------------------------------------------
@@ -347,18 +360,31 @@ float16 float16_default_nan(float_status *status);
/*----------------------------------------------------------------------------
| Software IEC/IEEE single-precision conversion routines.
*----------------------------------------------------------------------------*/
+
+int16_t float32_to_int16_scalbn(float32, int, int, float_status *status);
+int32_t float32_to_int32_scalbn(float32, int, int, float_status *status);
+int64_t float32_to_int64_scalbn(float32, int, int, float_status *status);
+
int16_t float32_to_int16(float32, float_status *status);
-uint16_t float32_to_uint16(float32, float_status *status);
-int16_t float32_to_int16_round_to_zero(float32, float_status *status);
-uint16_t float32_to_uint16_round_to_zero(float32, float_status *status);
int32_t float32_to_int32(float32, float_status *status);
-int32_t float32_to_int32_round_to_zero(float32, float_status *status);
-uint32_t float32_to_uint32(float32, float_status *status);
-uint32_t float32_to_uint32_round_to_zero(float32, float_status *status);
int64_t float32_to_int64(float32, float_status *status);
-uint64_t float32_to_uint64(float32, float_status *status);
-uint64_t float32_to_uint64_round_to_zero(float32, float_status *status);
+
+int16_t float32_to_int16_round_to_zero(float32, float_status *status);
+int32_t float32_to_int32_round_to_zero(float32, float_status *status);
int64_t float32_to_int64_round_to_zero(float32, float_status *status);
+
+uint16_t float32_to_uint16_scalbn(float32, int, int, float_status *status);
+uint32_t float32_to_uint32_scalbn(float32, int, int, float_status *status);
+uint64_t float32_to_uint64_scalbn(float32, int, int, float_status *status);
+
+uint16_t float32_to_uint16(float32, float_status *status);
+uint32_t float32_to_uint32(float32, float_status *status);
+uint64_t float32_to_uint64(float32, float_status *status);
+
+uint16_t float32_to_uint16_round_to_zero(float32, float_status *status);
+uint32_t float32_to_uint32_round_to_zero(float32, float_status *status);
+uint64_t float32_to_uint64_round_to_zero(float32, float_status *status);
+
float64 float32_to_float64(float32, float_status *status);
floatx80 float32_to_floatx80(float32, float_status *status);
float128 float32_to_float128(float32, float_status *status);
@@ -476,18 +502,31 @@ float32 float32_default_nan(float_status *status);
/*----------------------------------------------------------------------------
| Software IEC/IEEE double-precision conversion routines.
*----------------------------------------------------------------------------*/
+
+int16_t float64_to_int16_scalbn(float64, int, int, float_status *status);
+int32_t float64_to_int32_scalbn(float64, int, int, float_status *status);
+int64_t float64_to_int64_scalbn(float64, int, int, float_status *status);
+
int16_t float64_to_int16(float64, float_status *status);
-uint16_t float64_to_uint16(float64, float_status *status);
-int16_t float64_to_int16_round_to_zero(float64, float_status *status);
-uint16_t float64_to_uint16_round_to_zero(float64, float_status *status);
int32_t float64_to_int32(float64, float_status *status);
-int32_t float64_to_int32_round_to_zero(float64, float_status *status);
-uint32_t float64_to_uint32(float64, float_status *status);
-uint32_t float64_to_uint32_round_to_zero(float64, float_status *status);
int64_t float64_to_int64(float64, float_status *status);
+
+int16_t float64_to_int16_round_to_zero(float64, float_status *status);
+int32_t float64_to_int32_round_to_zero(float64, float_status *status);
int64_t float64_to_int64_round_to_zero(float64, float_status *status);
-uint64_t float64_to_uint64(float64 a, float_status *status);
-uint64_t float64_to_uint64_round_to_zero(float64 a, float_status *status);
+
+uint16_t float64_to_uint16_scalbn(float64, int, int, float_status *status);
+uint32_t float64_to_uint32_scalbn(float64, int, int, float_status *status);
+uint64_t float64_to_uint64_scalbn(float64, int, int, float_status *status);
+
+uint16_t float64_to_uint16(float64, float_status *status);
+uint32_t float64_to_uint32(float64, float_status *status);
+uint64_t float64_to_uint64(float64, float_status *status);
+
+uint16_t float64_to_uint16_round_to_zero(float64, float_status *status);
+uint32_t float64_to_uint32_round_to_zero(float64, float_status *status);
+uint64_t float64_to_uint64_round_to_zero(float64, float_status *status);
+
float32 float64_to_float32(float64, float_status *status);
floatx80 float64_to_floatx80(float64, float_status *status);
float128 float64_to_float128(float64, float_status *status);
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 12f373cbad..59ca356d0e 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1293,19 +1293,23 @@ float32 float64_to_float32(float64 a, float_status *s)
* Arithmetic.
*/
-static FloatParts round_to_int(FloatParts a, int rounding_mode, float_status *s)
+static FloatParts round_to_int(FloatParts a, int rmode,
+ int scale, float_status *s)
{
- if (is_nan(a.cls)) {
- return return_nan(a, s);
- }
-
switch (a.cls) {
+ case float_class_qnan:
+ case float_class_snan:
+ return return_nan(a, s);
+
case float_class_zero:
case float_class_inf:
- case float_class_qnan:
/* already "integral" */
break;
+
case float_class_normal:
+ scale = MIN(MAX(scale, -0x10000), 0x10000);
+ a.exp += scale;
+
if (a.exp >= DECOMPOSED_BINARY_POINT) {
/* already integral */
break;
@@ -1314,7 +1318,7 @@ static FloatParts round_to_int(FloatParts a, int rounding_mode, float_status *s)
bool one;
/* all fractional */
s->float_exception_flags |= float_flag_inexact;
- switch (rounding_mode) {
+ switch (rmode) {
case float_round_nearest_even:
one = a.exp == -1 && a.frac > DECOMPOSED_IMPLICIT_BIT;
break;
@@ -1347,7 +1351,7 @@ static FloatParts round_to_int(FloatParts a, int rounding_mode, float_status *s)
uint64_t rnd_mask = rnd_even_mask >> 1;
uint64_t inc;
- switch (rounding_mode) {
+ switch (rmode) {
case float_round_nearest_even:
inc = ((a.frac & rnd_even_mask) != frac_lsbm1 ? frac_lsbm1 : 0);
break;
@@ -1387,28 +1391,28 @@ static FloatParts round_to_int(FloatParts a, int rounding_mode, float_status *s)
float16 float16_round_to_int(float16 a, float_status *s)
{
FloatParts pa = float16_unpack_canonical(a, s);
- FloatParts pr = round_to_int(pa, s->float_rounding_mode, s);
+ FloatParts pr = round_to_int(pa, s->float_rounding_mode, 0, s);
return float16_round_pack_canonical(pr, s);
}
float32 float32_round_to_int(float32 a, float_status *s)
{
FloatParts pa = float32_unpack_canonical(a, s);
- FloatParts pr = round_to_int(pa, s->float_rounding_mode, s);
+ FloatParts pr = round_to_int(pa, s->float_rounding_mode, 0, s);
return float32_round_pack_canonical(pr, s);
}
float64 float64_round_to_int(float64 a, float_status *s)
{
FloatParts pa = float64_unpack_canonical(a, s);
- FloatParts pr = round_to_int(pa, s->float_rounding_mode, s);
+ FloatParts pr = round_to_int(pa, s->float_rounding_mode, 0, s);
return float64_round_pack_canonical(pr, s);
}
float64 float64_trunc_to_int(float64 a, float_status *s)
{
FloatParts pa = float64_unpack_canonical(a, s);
- FloatParts pr = round_to_int(pa, float_round_to_zero, s);
+ FloatParts pr = round_to_int(pa, float_round_to_zero, 0, s);
return float64_round_pack_canonical(pr, s);
}
@@ -1423,13 +1427,13 @@ float64 float64_trunc_to_int(float64 a, float_status *s)
* is returned.
*/
-static int64_t round_to_int_and_pack(FloatParts in, int rmode,
+static int64_t round_to_int_and_pack(FloatParts in, int rmode, int scale,
int64_t min, int64_t max,
float_status *s)
{
uint64_t r;
int orig_flags = get_float_exception_flags(s);
- FloatParts p = round_to_int(in, rmode, s);
+ FloatParts p = round_to_int(in, rmode, scale, s);
switch (p.cls) {
case float_class_snan:
@@ -1469,38 +1473,158 @@ static int64_t round_to_int_and_pack(FloatParts in, int rmode,
}
}
-#define FLOAT_TO_INT(fsz, isz) \
-int ## isz ## _t float ## fsz ## _to_int ## isz(float ## fsz a, \
- float_status *s) \
-{ \
- FloatParts p = float ## fsz ## _unpack_canonical(a, s); \
- return round_to_int_and_pack(p, s->float_rounding_mode, \
- INT ## isz ## _MIN, INT ## isz ## _MAX,\
- s); \
-} \
- \
-int ## isz ## _t float ## fsz ## _to_int ## isz ## _round_to_zero \
- (float ## fsz a, float_status *s) \
-{ \
- FloatParts p = float ## fsz ## _unpack_canonical(a, s); \
- return round_to_int_and_pack(p, float_round_to_zero, \
- INT ## isz ## _MIN, INT ## isz ## _MAX,\
- s); \
+int16_t float16_to_int16_scalbn(float16 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_int_and_pack(float16_unpack_canonical(a, s),
+ rmode, scale, INT16_MIN, INT16_MAX, s);
}
-FLOAT_TO_INT(16, 16)
-FLOAT_TO_INT(16, 32)
-FLOAT_TO_INT(16, 64)
+int32_t float16_to_int32_scalbn(float16 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_int_and_pack(float16_unpack_canonical(a, s),
+ rmode, scale, INT32_MIN, INT32_MAX, s);
+}
-FLOAT_TO_INT(32, 16)
-FLOAT_TO_INT(32, 32)
-FLOAT_TO_INT(32, 64)
+int64_t float16_to_int64_scalbn(float16 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_int_and_pack(float16_unpack_canonical(a, s),
+ rmode, scale, INT64_MIN, INT64_MAX, s);
+}
-FLOAT_TO_INT(64, 16)
-FLOAT_TO_INT(64, 32)
-FLOAT_TO_INT(64, 64)
+int16_t float32_to_int16_scalbn(float32 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_int_and_pack(float32_unpack_canonical(a, s),
+ rmode, scale, INT16_MIN, INT16_MAX, s);
+}
-#undef FLOAT_TO_INT
+int32_t float32_to_int32_scalbn(float32 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_int_and_pack(float32_unpack_canonical(a, s),
+ rmode, scale, INT32_MIN, INT32_MAX, s);
+}
+
+int64_t float32_to_int64_scalbn(float32 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_int_and_pack(float32_unpack_canonical(a, s),
+ rmode, scale, INT64_MIN, INT64_MAX, s);
+}
+
+int16_t float64_to_int16_scalbn(float64 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_int_and_pack(float64_unpack_canonical(a, s),
+ rmode, scale, INT16_MIN, INT16_MAX, s);
+}
+
+int32_t float64_to_int32_scalbn(float64 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_int_and_pack(float64_unpack_canonical(a, s),
+ rmode, scale, INT32_MIN, INT32_MAX, s);
+}
+
+int64_t float64_to_int64_scalbn(float64 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_int_and_pack(float64_unpack_canonical(a, s),
+ rmode, scale, INT64_MIN, INT64_MAX, s);
+}
+
+int16_t float16_to_int16(float16 a, float_status *s)
+{
+ return float16_to_int16_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+int32_t float16_to_int32(float16 a, float_status *s)
+{
+ return float16_to_int32_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+int64_t float16_to_int64(float16 a, float_status *s)
+{
+ return float16_to_int64_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+int16_t float32_to_int16(float32 a, float_status *s)
+{
+ return float32_to_int16_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+int32_t float32_to_int32(float32 a, float_status *s)
+{
+ return float32_to_int32_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+int64_t float32_to_int64(float32 a, float_status *s)
+{
+ return float32_to_int64_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+int16_t float64_to_int16(float64 a, float_status *s)
+{
+ return float64_to_int16_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+int32_t float64_to_int32(float64 a, float_status *s)
+{
+ return float64_to_int32_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+int64_t float64_to_int64(float64 a, float_status *s)
+{
+ return float64_to_int64_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+int16_t float16_to_int16_round_to_zero(float16 a, float_status *s)
+{
+ return float16_to_int16_scalbn(a, float_round_to_zero, 0, s);
+}
+
+int32_t float16_to_int32_round_to_zero(float16 a, float_status *s)
+{
+ return float16_to_int32_scalbn(a, float_round_to_zero, 0, s);
+}
+
+int64_t float16_to_int64_round_to_zero(float16 a, float_status *s)
+{
+ return float16_to_int64_scalbn(a, float_round_to_zero, 0, s);
+}
+
+int16_t float32_to_int16_round_to_zero(float32 a, float_status *s)
+{
+ return float32_to_int16_scalbn(a, float_round_to_zero, 0, s);
+}
+
+int32_t float32_to_int32_round_to_zero(float32 a, float_status *s)
+{
+ return float32_to_int32_scalbn(a, float_round_to_zero, 0, s);
+}
+
+int64_t float32_to_int64_round_to_zero(float32 a, float_status *s)
+{
+ return float32_to_int64_scalbn(a, float_round_to_zero, 0, s);
+}
+
+int16_t float64_to_int16_round_to_zero(float64 a, float_status *s)
+{
+ return float64_to_int16_scalbn(a, float_round_to_zero, 0, s);
+}
+
+int32_t float64_to_int32_round_to_zero(float64 a, float_status *s)
+{
+ return float64_to_int32_scalbn(a, float_round_to_zero, 0, s);
+}
+
+int64_t float64_to_int64_round_to_zero(float64 a, float_status *s)
+{
+ return float64_to_int64_scalbn(a, float_round_to_zero, 0, s);
+}
/*
* Returns the result of converting the floating-point value `a' to
@@ -1515,11 +1639,12 @@ FLOAT_TO_INT(64, 64)
* flag.
*/
-static uint64_t round_to_uint_and_pack(FloatParts in, int rmode, uint64_t max,
- float_status *s)
+static uint64_t round_to_uint_and_pack(FloatParts in, int rmode, int scale,
+ uint64_t max, float_status *s)
{
int orig_flags = get_float_exception_flags(s);
- FloatParts p = round_to_int(in, rmode, s);
+ FloatParts p = round_to_int(in, rmode, scale, s);
+ uint64_t r;
switch (p.cls) {
case float_class_snan:
@@ -1532,8 +1657,6 @@ static uint64_t round_to_uint_and_pack(FloatParts in, int rmode, uint64_t max,
case float_class_zero:
return 0;
case float_class_normal:
- {
- uint64_t r;
if (p.sign) {
s->float_exception_flags = orig_flags | float_flag_invalid;
return 0;
@@ -1555,45 +1678,165 @@ static uint64_t round_to_uint_and_pack(FloatParts in, int rmode, uint64_t max,
if (r > max) {
s->float_exception_flags = orig_flags | float_flag_invalid;
return max;
- } else {
- return r;
}
- }
+ return r;
default:
g_assert_not_reached();
}
}
-#define FLOAT_TO_UINT(fsz, isz) \
-uint ## isz ## _t float ## fsz ## _to_uint ## isz(float ## fsz a, \
- float_status *s) \
-{ \
- FloatParts p = float ## fsz ## _unpack_canonical(a, s); \
- return round_to_uint_and_pack(p, s->float_rounding_mode, \
- UINT ## isz ## _MAX, s); \
-} \
- \
-uint ## isz ## _t float ## fsz ## _to_uint ## isz ## _round_to_zero \
- (float ## fsz a, float_status *s) \
-{ \
- FloatParts p = float ## fsz ## _unpack_canonical(a, s); \
- return round_to_uint_and_pack(p, float_round_to_zero, \
- UINT ## isz ## _MAX, s); \
+uint16_t float16_to_uint16_scalbn(float16 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_uint_and_pack(float16_unpack_canonical(a, s),
+ rmode, scale, UINT16_MAX, s);
}
-FLOAT_TO_UINT(16, 16)
-FLOAT_TO_UINT(16, 32)
-FLOAT_TO_UINT(16, 64)
+uint32_t float16_to_uint32_scalbn(float16 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_uint_and_pack(float16_unpack_canonical(a, s),
+ rmode, scale, UINT32_MAX, s);
+}
-FLOAT_TO_UINT(32, 16)
-FLOAT_TO_UINT(32, 32)
-FLOAT_TO_UINT(32, 64)
+uint64_t float16_to_uint64_scalbn(float16 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_uint_and_pack(float16_unpack_canonical(a, s),
+ rmode, scale, UINT64_MAX, s);
+}
-FLOAT_TO_UINT(64, 16)
-FLOAT_TO_UINT(64, 32)
-FLOAT_TO_UINT(64, 64)
+uint16_t float32_to_uint16_scalbn(float32 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_uint_and_pack(float32_unpack_canonical(a, s),
+ rmode, scale, UINT16_MAX, s);
+}
-#undef FLOAT_TO_UINT
+uint32_t float32_to_uint32_scalbn(float32 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_uint_and_pack(float32_unpack_canonical(a, s),
+ rmode, scale, UINT32_MAX, s);
+}
+
+uint64_t float32_to_uint64_scalbn(float32 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_uint_and_pack(float32_unpack_canonical(a, s),
+ rmode, scale, UINT64_MAX, s);
+}
+
+uint16_t float64_to_uint16_scalbn(float64 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_uint_and_pack(float64_unpack_canonical(a, s),
+ rmode, scale, UINT16_MAX, s);
+}
+
+uint32_t float64_to_uint32_scalbn(float64 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_uint_and_pack(float64_unpack_canonical(a, s),
+ rmode, scale, UINT32_MAX, s);
+}
+
+uint64_t float64_to_uint64_scalbn(float64 a, int rmode, int scale,
+ float_status *s)
+{
+ return round_to_uint_and_pack(float64_unpack_canonical(a, s),
+ rmode, scale, UINT64_MAX, s);
+}
+
+uint16_t float16_to_uint16(float16 a, float_status *s)
+{
+ return float16_to_uint16_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+uint32_t float16_to_uint32(float16 a, float_status *s)
+{
+ return float16_to_uint32_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+uint64_t float16_to_uint64(float16 a, float_status *s)
+{
+ return float16_to_uint64_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+uint16_t float32_to_uint16(float32 a, float_status *s)
+{
+ return float32_to_uint16_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+uint32_t float32_to_uint32(float32 a, float_status *s)
+{
+ return float32_to_uint32_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+uint64_t float32_to_uint64(float32 a, float_status *s)
+{
+ return float32_to_uint64_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+uint16_t float64_to_uint16(float64 a, float_status *s)
+{
+ return float64_to_uint16_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+uint32_t float64_to_uint32(float64 a, float_status *s)
+{
+ return float64_to_uint32_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+uint64_t float64_to_uint64(float64 a, float_status *s)
+{
+ return float64_to_uint64_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+uint16_t float16_to_uint16_round_to_zero(float16 a, float_status *s)
+{
+ return float16_to_uint16_scalbn(a, float_round_to_zero, 0, s);
+}
+
+uint32_t float16_to_uint32_round_to_zero(float16 a, float_status *s)
+{
+ return float16_to_uint32_scalbn(a, float_round_to_zero, 0, s);
+}
+
+uint64_t float16_to_uint64_round_to_zero(float16 a, float_status *s)
+{
+ return float16_to_uint64_scalbn(a, float_round_to_zero, 0, s);
+}
+
+uint16_t float32_to_uint16_round_to_zero(float32 a, float_status *s)
+{
+ return float32_to_uint16_scalbn(a, float_round_to_zero, 0, s);
+}
+
+uint32_t float32_to_uint32_round_to_zero(float32 a, float_status *s)
+{
+ return float32_to_uint32_scalbn(a, float_round_to_zero, 0, s);
+}
+
+uint64_t float32_to_uint64_round_to_zero(float32 a, float_status *s)
+{
+ return float32_to_uint64_scalbn(a, float_round_to_zero, 0, s);
+}
+
+uint16_t float64_to_uint16_round_to_zero(float64 a, float_status *s)
+{
+ return float64_to_uint16_scalbn(a, float_round_to_zero, 0, s);
+}
+
+uint32_t float64_to_uint32_round_to_zero(float64 a, float_status *s)
+{
+ return float64_to_uint32_scalbn(a, float_round_to_zero, 0, s);
+}
+
+uint64_t float64_to_uint64_round_to_zero(float64 a, float_status *s)
+{
+ return float64_to_uint64_scalbn(a, float_round_to_zero, 0, s);
+}
/*
* Integer to float conversions
--
2.17.1
* [Qemu-devel] [PATCH 3/4] target/arm: Use the int-to-float-scale softfloat routines
From: Richard Henderson @ 2018-08-14 0:26 UTC (permalink / raw)
To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper.c | 29 +++++------------------------
1 file changed, 5 insertions(+), 24 deletions(-)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 61454a77ec..38439a2ee8 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11550,12 +11550,7 @@ float32 VFP_HELPER(fcvts, d)(float64 x, CPUARMState *env)
#define VFP_CONV_FIX_FLOAT(name, p, fsz, isz, itype) \
float##fsz HELPER(vfp_##name##to##p)(uint##isz##_t x, uint32_t shift, \
void *fpstp) \
-{ \
- float_status *fpst = fpstp; \
- float##fsz tmp; \
- tmp = itype##_to_##float##fsz(x, fpst); \
- return float##fsz##_scalbn(tmp, -(int)shift, fpst); \
-}
+{ return itype##_to_##float##fsz##_scalbn(x, -shift, fpstp); }
/* Notice that we want only input-denormal exception flags from the
* scalbn operation: the other possible flags (overflow+inexact if
@@ -11608,38 +11603,24 @@ VFP_CONV_FIX_A64(uq, s, 32, 64, uint64)
#undef VFP_CONV_FLOAT_FIX_ROUND
#undef VFP_CONV_FIX_A64
-/* Conversion to/from f16 can overflow to infinity before/after scaling.
- * Therefore we convert to f64, scale, and then convert f64 to f16; or
- * vice versa for conversion to integer.
- *
- * For 16- and 32-bit integers, the conversion to f64 never rounds.
- * For 64-bit integers, any integer that would cause rounding will also
- * overflow to f16 infinity, so there is no double rounding problem.
- */
-
-static float16 do_postscale_fp16(float64 f, int shift, float_status *fpst)
-{
- return float64_to_float16(float64_scalbn(f, -shift, fpst), true, fpst);
-}
-
uint32_t HELPER(vfp_sltoh)(uint32_t x, uint32_t shift, void *fpst)
{
- return do_postscale_fp16(int32_to_float64(x, fpst), shift, fpst);
+ return int32_to_float16_scalbn(x, -shift, fpst);
}
uint32_t HELPER(vfp_ultoh)(uint32_t x, uint32_t shift, void *fpst)
{
- return do_postscale_fp16(uint32_to_float64(x, fpst), shift, fpst);
+ return uint32_to_float16_scalbn(x, -shift, fpst);
}
uint32_t HELPER(vfp_sqtoh)(uint64_t x, uint32_t shift, void *fpst)
{
- return do_postscale_fp16(int64_to_float64(x, fpst), shift, fpst);
+ return int64_to_float16_scalbn(x, -shift, fpst);
}
uint32_t HELPER(vfp_uqtoh)(uint64_t x, uint32_t shift, void *fpst)
{
- return do_postscale_fp16(uint64_to_float64(x, fpst), shift, fpst);
+ return uint64_to_float16_scalbn(x, -shift, fpst);
}
static float64 do_prescale_fp16(float16 f, int shift, float_status *fpst)
--
2.17.1
* [Qemu-devel] [PATCH 4/4] target/arm: Use the float-to-int-scale softfloat routines
From: Richard Henderson @ 2018-08-14 0:26 UTC (permalink / raw)
To: qemu-devel; +Cc: laurent.desnogues, peter.maydell, alex.bennee
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/helper.c | 101 ++++++++++++++++++++++----------------------
1 file changed, 51 insertions(+), 50 deletions(-)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 38439a2ee8..e4a7d97805 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11552,38 +11552,28 @@ float##fsz HELPER(vfp_##name##to##p)(uint##isz##_t x, uint32_t shift, \
void *fpstp) \
{ return itype##_to_##float##fsz##_scalbn(x, -shift, fpstp); }
-/* Notice that we want only input-denormal exception flags from the
- * scalbn operation: the other possible flags (overflow+inexact if
- * we overflow to infinity, output-denormal) aren't correct for the
- * complete scale-and-convert operation.
- */
-#define VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, isz, itype, round) \
-uint##isz##_t HELPER(vfp_to##name##p##round)(float##fsz x, \
- uint32_t shift, \
- void *fpstp) \
-{ \
- float_status *fpst = fpstp; \
- int old_exc_flags = get_float_exception_flags(fpst); \
- float##fsz tmp; \
- if (float##fsz##_is_any_nan(x)) { \
- float_raise(float_flag_invalid, fpst); \
- return 0; \
- } \
- tmp = float##fsz##_scalbn(x, shift, fpst); \
- old_exc_flags |= get_float_exception_flags(fpst) \
- & float_flag_input_denormal; \
- set_float_exception_flags(old_exc_flags, fpst); \
- return float##fsz##_to_##itype##round(tmp, fpst); \
+#define VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, isz, itype, ROUND, suff) \
+uint##isz##_t HELPER(vfp_to##name##p##suff)(float##fsz x, uint32_t shift, \
+ void *fpst) \
+{ \
+ if (unlikely(float##fsz##_is_any_nan(x))) { \
+ float_raise(float_flag_invalid, fpst); \
+ return 0; \
+ } \
+ return float##fsz##_to_##itype##_scalbn(x, ROUND, shift, fpst); \
}
#define VFP_CONV_FIX(name, p, fsz, isz, itype) \
VFP_CONV_FIX_FLOAT(name, p, fsz, isz, itype) \
-VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, isz, itype, _round_to_zero) \
-VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, isz, itype, )
+VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, isz, itype, \
+ float_round_to_zero, _round_to_zero) \
+VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, isz, itype, \
+ get_float_rounding_mode(fpst), )
#define VFP_CONV_FIX_A64(name, p, fsz, isz, itype) \
VFP_CONV_FIX_FLOAT(name, p, fsz, isz, itype) \
-VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, isz, itype, )
+VFP_CONV_FLOAT_FIX_ROUND(name, p, fsz, isz, itype, \
+ get_float_rounding_mode(fpst), )
VFP_CONV_FIX(sh, d, 64, 64, int16)
VFP_CONV_FIX(sl, d, 64, 64, int32)
@@ -11623,53 +11613,64 @@ uint32_t HELPER(vfp_uqtoh)(uint64_t x, uint32_t shift, void *fpst)
return uint64_to_float16_scalbn(x, -shift, fpst);
}
-static float64 do_prescale_fp16(float16 f, int shift, float_status *fpst)
-{
- if (unlikely(float16_is_any_nan(f))) {
- float_raise(float_flag_invalid, fpst);
- return 0;
- } else {
- int old_exc_flags = get_float_exception_flags(fpst);
- float64 ret;
-
- ret = float16_to_float64(f, true, fpst);
- ret = float64_scalbn(ret, shift, fpst);
- old_exc_flags |= get_float_exception_flags(fpst)
- & float_flag_input_denormal;
- set_float_exception_flags(old_exc_flags, fpst);
-
- return ret;
- }
-}
-
uint32_t HELPER(vfp_toshh)(uint32_t x, uint32_t shift, void *fpst)
{
- return float64_to_int16(do_prescale_fp16(x, shift, fpst), fpst);
+ if (unlikely(float16_is_any_nan(x))) {
+ float_raise(float_flag_invalid, fpst);
+ return 0;
+ }
+ return float16_to_int16_scalbn(x, get_float_rounding_mode(fpst),
+ shift, fpst);
}
uint32_t HELPER(vfp_touhh)(uint32_t x, uint32_t shift, void *fpst)
{
- return float64_to_uint16(do_prescale_fp16(x, shift, fpst), fpst);
+ if (unlikely(float16_is_any_nan(x))) {
+ float_raise(float_flag_invalid, fpst);
+ return 0;
+ }
+ return float16_to_uint16_scalbn(x, get_float_rounding_mode(fpst),
+ shift, fpst);
}
uint32_t HELPER(vfp_toslh)(uint32_t x, uint32_t shift, void *fpst)
{
- return float64_to_int32(do_prescale_fp16(x, shift, fpst), fpst);
+ if (unlikely(float16_is_any_nan(x))) {
+ float_raise(float_flag_invalid, fpst);
+ return 0;
+ }
+ return float16_to_int32_scalbn(x, get_float_rounding_mode(fpst),
+ shift, fpst);
}
uint32_t HELPER(vfp_toulh)(uint32_t x, uint32_t shift, void *fpst)
{
- return float64_to_uint32(do_prescale_fp16(x, shift, fpst), fpst);
+ if (unlikely(float16_is_any_nan(x))) {
+ float_raise(float_flag_invalid, fpst);
+ return 0;
+ }
+ return float16_to_uint32_scalbn(x, get_float_rounding_mode(fpst),
+ shift, fpst);
}
uint64_t HELPER(vfp_tosqh)(uint32_t x, uint32_t shift, void *fpst)
{
- return float64_to_int64(do_prescale_fp16(x, shift, fpst), fpst);
+ if (unlikely(float16_is_any_nan(x))) {
+ float_raise(float_flag_invalid, fpst);
+ return 0;
+ }
+ return float16_to_int64_scalbn(x, get_float_rounding_mode(fpst),
+ shift, fpst);
}
uint64_t HELPER(vfp_touqh)(uint32_t x, uint32_t shift, void *fpst)
{
- return float64_to_uint64(do_prescale_fp16(x, shift, fpst), fpst);
+ if (unlikely(float16_is_any_nan(x))) {
+ float_raise(float_flag_invalid, fpst);
+ return 0;
+ }
+ return float16_to_uint64_scalbn(x, get_float_rounding_mode(fpst),
+ shift, fpst);
}
/* Set the current fp rounding mode and return the old one.
--
2.17.1
* Re: [Qemu-devel] [PATCH 0/4] target/arm: Fix int64_to_float16 double-rounding
From: Alex Bennée @ 2018-08-14 8:32 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent.desnogues, peter.maydell
Richard Henderson <richard.henderson@linaro.org> writes:
> In 88808a022c0, I tried to fix an overflow problem that affected float16
> scaling by converting first to float64 and then rounding after that.
>
> However, Laurent reported that -0x3ff40000000001 converted to float16
> resulted in 0xfbfe instead of the expected 0xfbff. This is caused by
> the inexact conversion to float64.
>
> Rather than build more logic into target/arm to compensate, just add
> a function that takes a scaling parameter so that the whole thing is
> done all at once with only one rounding.
>
> I don't have a failing test case for the float-to-int paths, but it
> seemed best to apply the same solution.
Can't we add the constants to the fcvt test case?
>
>
> r~
>
>
> Richard Henderson (4):
> softfloat: Add scaling int-to-float routines
> softfloat: Add scaling float-to-int routines
> target/arm: Use the int-to-float-scale softfloat routines
> target/arm: Use the float-to-int-scale softfloat routines
>
> include/fpu/softfloat.h | 169 ++++++++----
> fpu/softfloat.c | 579 +++++++++++++++++++++++++++++++---------
> target/arm/helper.c | 130 ++++-----
> 3 files changed, 628 insertions(+), 250 deletions(-)
--
Alex Bennée
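
The double rounding described in the cover letter quoted above can be
reproduced with plain integer arithmetic. The following is a minimal sketch,
not softfloat code: it models only the significand rounding (float64 keeps 53
significant bits, float16 keeps 11), ignores the sign and the scaling shift,
and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Magnitude of the failing input reported in the cover letter. */
#define DEMO_INPUT UINT64_C(0x3ff40000000001)

/* Number of significant bits in v (0 for v == 0). */
static int bit_width(uint64_t v)
{
    int n = 0;
    while (v != 0) {
        n++;
        v >>= 1;
    }
    return n;
}

/*
 * Round v to `bits` significant bits, ties to even, and return the
 * rounded value with the discarded low bits zeroed.  Illustrative
 * helper only; assumes the round-up carry does not overflow 64 bits.
 */
static uint64_t round_nearest_even(uint64_t v, int bits)
{
    assert(bits >= 1 && bits <= 64);
    int drop = bit_width(v) - bits;
    if (drop <= 0) {
        return v;               /* already fits, nothing to round */
    }
    uint64_t half = UINT64_C(1) << (drop - 1);
    uint64_t rem = v & ((UINT64_C(1) << drop) - 1);
    uint64_t kept = v >> drop;
    if (rem > half || (rem == half && (kept & 1))) {
        kept++;
    }
    return kept << drop;
}

/* Two roundings: first to float64's 53 bits, then to float16's 11. */
static uint64_t two_step(uint64_t v)
{
    return round_nearest_even(round_nearest_even(v, 53), 11);
}

/* One rounding straight to float16's 11 bits. */
static uint64_t one_step(uint64_t v)
{
    return round_nearest_even(v, 11);
}
```

For DEMO_INPUT, the first rounding in two_step() drops the trailing 1 (an exact
tie, resolved to even), which turns the remainder seen by the second rounding
into another exact tie; the 11-bit significand therefore stays at 0x7fe.
one_step() sees a remainder just above one half and rounds up to 0x7ff. Those
are the significands behind the 0xfbfe and 0xfbff encodings mentioned above.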
* Re: [Qemu-devel] [PATCH 0/4] target/arm: Fix int64_to_float16 double-rounding
From: Richard Henderson @ 2018-08-14 14:47 UTC (permalink / raw)
To: Alex Bennée; +Cc: qemu-devel, laurent.desnogues, peter.maydell
On 08/14/2018 01:32 AM, Alex Bennée wrote:
> Can't we add the constants to the fcvt test case?
No, they're all half-to-integer. This is integer-to-half.
We could write another one, I suppose, but it's not just
an add-one-line kind of thing.
r~
* Re: [Qemu-devel] [PATCH 0/4] target/arm: Fix int64_to_float16 double-rounding
From: Alex Bennée @ 2018-08-14 15:38 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, laurent.desnogues, peter.maydell
Richard Henderson <richard.henderson@linaro.org> writes:
> On 08/14/2018 01:32 AM, Alex Bennée wrote:
>> Can't we add the constants to the fcvt test case?
>
> No, they're all half-to-integer. This is integer-to-half.
I'll add the int-to-float conversions, the whole thing could do with a
bit of a re-factor anyway.
>
> We could write another one, I suppose, but it's not just
> an add-one-line kind of thing.
>
>
> r~
--
Alex Bennée
* Re: [Qemu-devel] [PATCH 0/4] target/arm: Fix int64_to_float16 double-rounding
From: no-reply @ 2018-08-16 1:00 UTC (permalink / raw)
To: richard.henderson
Cc: famz, qemu-devel, laurent.desnogues, peter.maydell, alex.bennee
Hi,
This series seems to have some coding style problems. See output below for
more information:
Type: series
Message-id: 20180814002653.12828-1-richard.henderson@linaro.org
Subject: [Qemu-devel] [PATCH 0/4] target/arm: Fix int64_to_float16 double-rounding
=== TEST SCRIPT BEGIN ===
#!/bin/bash
BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done
exit $failed
=== TEST SCRIPT END ===
Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
709fbe603d target/arm: Use the float-to-int-scale softfloat routines
b158c8d737 target/arm: Use the int-to-float-scale softfloat routines
5f86798067 softfloat: Add scaling float-to-int routines
8ec3fc49ea softfloat: Add scaling int-to-float routines
=== OUTPUT BEGIN ===
Checking PATCH 1/4: softfloat: Add scaling int-to-float routines...
Checking PATCH 2/4: softfloat: Add scaling float-to-int routines...
Checking PATCH 3/4: target/arm: Use the int-to-float-scale softfloat routines...
Checking PATCH 4/4: target/arm: Use the float-to-int-scale softfloat routines...
ERROR: space prohibited before that close parenthesis ')'
#57: FILE: target/arm/helper.c:11531:
+ get_float_rounding_mode(fpst), )
ERROR: space prohibited before that close parenthesis ')'
#63: FILE: target/arm/helper.c:11536:
+ get_float_rounding_mode(fpst), )
total: 2 errors, 0 warnings, 142 lines checked
Your patch has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===
Test command exited with code: 1
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
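
Both checkpatch errors above point at the empty trailing macro argument in
VFP_CONV_FLOAT_FIX_ROUND(..., get_float_rounding_mode(fpst), ). As a minimal
sketch of why that construct is valid C (the macro and function names below are
hypothetical, not taken from QEMU): an empty argument pasted with ## adds
nothing to the identifier, so the same macro can emit both a suffixed and an
unsuffixed function.

```c
#include <assert.h>

/*
 * Hypothetical macro in the style of VFP_CONV_FLOAT_FIX_ROUND:
 * `suff` is pasted onto the function name and may be left empty.
 */
#define DEFINE_SCALER(factor, suff) \
static int demo_scale##suff(int x) { return (factor) * x; }

DEFINE_SCALER(2, _twice)    /* defines demo_scale_twice() */
DEFINE_SCALER(1, )          /* empty suffix: defines demo_scale() */
```

The space before the closing parenthesis in the second invocation is what a
style checker keyed on whitespace would flag, even though the empty argument is
intentional.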
* Re: [Qemu-devel] [PATCH 0/4] target/arm: Fix int64_to_float16 double-rounding
From: Peter Maydell @ 2018-08-20 17:15 UTC (permalink / raw)
To: Richard Henderson; +Cc: QEMU Developers, Laurent Desnogues, Alex Bennée
On 14 August 2018 at 01:26, Richard Henderson
<richard.henderson@linaro.org> wrote:
> In 88808a022c0, I tried to fix an overflow problem that affected float16
> scaling by converting first to float64 and then rounding after that.
>
> However, Laurent reported that -0x3ff40000000001 converted to float16
> resulted in 0xfbfe instead of the expected 0xfbff. This is caused by
> the inexact conversion to float64.
>
> Rather than build more logic into target/arm to compensate, just add
> a function that takes a scaling parameter so that the whole thing is
> done all at once with only one rounding.
>
> I don't have a failing test case for the float-to-int paths, but it
> seemed best to apply the same solution.
>
>
> r~
>
>
> Richard Henderson (4):
> softfloat: Add scaling int-to-float routines
> softfloat: Add scaling float-to-int routines
> target/arm: Use the int-to-float-scale softfloat routines
> target/arm: Use the float-to-int-scale softfloat routines
series
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
and applied to target-arm.next.
thanks
-- PMM
* Re: [Qemu-devel] [PATCH 0/4] target/arm: Fix int64_to_float16 double-rounding
From: no-reply @ 2018-08-20 19:35 UTC (permalink / raw)
To: richard.henderson
Cc: famz, qemu-devel, laurent.desnogues, peter.maydell, alex.bennee
Hi,
This series seems to have some coding style problems. See output below for
more information:
Type: series
Message-id: 20180814002653.12828-1-richard.henderson@linaro.org
Subject: [Qemu-devel] [PATCH 0/4] target/arm: Fix int64_to_float16 double-rounding
=== TEST SCRIPT BEGIN ===
#!/bin/bash
BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done
exit $failed
=== TEST SCRIPT END ===
Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
776a29ed02 target/arm: Use the float-to-int-scale softfloat routines
c08c4abc59 target/arm: Use the int-to-float-scale softfloat routines
71c42653c5 softfloat: Add scaling float-to-int routines
040490a28a softfloat: Add scaling int-to-float routines
=== OUTPUT BEGIN ===
Checking PATCH 1/4: softfloat: Add scaling int-to-float routines...
Checking PATCH 2/4: softfloat: Add scaling float-to-int routines...
Checking PATCH 3/4: target/arm: Use the int-to-float-scale softfloat routines...
Checking PATCH 4/4: target/arm: Use the float-to-int-scale softfloat routines...
ERROR: space prohibited before that close parenthesis ')'
#58: FILE: target/arm/helper.c:11585:
+ get_float_rounding_mode(fpst), )
ERROR: space prohibited before that close parenthesis ')'
#64: FILE: target/arm/helper.c:11590:
+ get_float_rounding_mode(fpst), )
total: 2 errors, 0 warnings, 142 lines checked
Your patch has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===
Test command exited with code: 1
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com