[Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting

* [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting
@ 2011-05-06 12:48 Peter Maydell
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 1/7] target-arm: Don't set FP exceptions in recip, recip_sqrt estimate fns Peter Maydell
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Peter Maydell @ 2011-05-06 12:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

This patch series fixes a number of minor bugs in the ARM target where
we were not correctly setting the cumulative exception flags in the
FPSCR. It includes adding a new flag to softfloat indicating when a
denormal result has been flushed to zero (as discussed previously on
the list).

Tested with the usual random instruction sequence testing (covering
all the neon and vfp data processing instructions which can set FPSCR
exception flags). These patches fix all the FPSCR flags bugs I found,
with the exception of those in the VCVT float-int and float32-float16
conversion routines, which are a bit trickier to fix because they are
bugs in softfloat rather than merely in the arm helper functions.

Peter Maydell (7):
  target-arm: Don't set FP exceptions in recip, recip_sqrt estimate fns
  target-arm: Signal InputDenormal for VRECPE, VRSQRTE, VRECPS, VRSQRTS
  target-arm: Signal InvalidOp for Neon GE and GT compares of QNaN
  target-arm: Refactor int-float conversions
  target-arm: Add separate Neon float-int conversion helpers
  softfloat: Add new flag for when denormal result is flushed to zero
  target-arm: Signal Underflow when denormal flushed to zero on output

 fpu/softfloat.c          |   41 ++++++++++--
 fpu/softfloat.h          |    3 +-
 target-arm/helper.c      |  151 +++++++---------------------------------------
 target-arm/helper.h      |   70 ++++++++++++---------
 target-arm/neon_helper.c |   40 ++++++-------
 target-arm/op_helper.c   |   74 ++++++++++++++++++++++
 target-arm/translate.c   |   92 ++++++++++++++++------------
 7 files changed, 243 insertions(+), 228 deletions(-)
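
For readers less familiar with the FPSCR: its cumulative exception flags
are sticky bits that each operation ORs in. Below is a minimal sketch of
folding softfloat-style flags into those bits. The SKETCH_* names are
hypothetical (the real name of the new flush-to-zero flag is not given in
this cover letter), and the bit positions assumed are IOC=0, DZC=1, OFC=2,
UFC=3, IXC=4, IDC=7. It is an illustration, not the code from the series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical softfloat-style flag bits, for illustration only. */
enum {
    SKETCH_FLAG_INVALID         = 1 << 0,
    SKETCH_FLAG_DIVBYZERO       = 1 << 1,
    SKETCH_FLAG_OVERFLOW        = 1 << 2,
    SKETCH_FLAG_UNDERFLOW       = 1 << 3,
    SKETCH_FLAG_INEXACT         = 1 << 4,
    SKETCH_FLAG_INPUT_DENORMAL  = 1 << 5,
    SKETCH_FLAG_OUTPUT_DENORMAL = 1 << 6  /* result was flushed to zero */
};

/* Assumed positions of the cumulative exception bits in the FPSCR. */
enum {
    SKETCH_FPSCR_IOC = 1 << 0,  /* Invalid Operation */
    SKETCH_FPSCR_DZC = 1 << 1,  /* Division by Zero */
    SKETCH_FPSCR_OFC = 1 << 2,  /* Overflow */
    SKETCH_FPSCR_UFC = 1 << 3,  /* Underflow */
    SKETCH_FPSCR_IXC = 1 << 4,  /* Inexact */
    SKETCH_FPSCR_IDC = 1 << 7   /* Input Denormal */
};

/* ORs the flags from one operation into the sticky FPSCR bits.
 * A flushed-to-zero result is reported as Underflow. */
uint32_t sketch_fpscr_accumulate(uint32_t fpscr, unsigned flags)
{
    assert((flags & ~0x7fu) == 0);  /* only the seven bits above exist */
    if (flags & SKETCH_FLAG_INVALID) {
        fpscr |= SKETCH_FPSCR_IOC;
    }
    if (flags & SKETCH_FLAG_DIVBYZERO) {
        fpscr |= SKETCH_FPSCR_DZC;
    }
    if (flags & SKETCH_FLAG_OVERFLOW) {
        fpscr |= SKETCH_FPSCR_OFC;
    }
    if (flags & (SKETCH_FLAG_UNDERFLOW | SKETCH_FLAG_OUTPUT_DENORMAL)) {
        fpscr |= SKETCH_FPSCR_UFC;
    }
    if (flags & SKETCH_FLAG_INEXACT) {
        fpscr |= SKETCH_FPSCR_IXC;
    }
    if (flags & SKETCH_FLAG_INPUT_DENORMAL) {
        fpscr |= SKETCH_FPSCR_IDC;
    }
    return fpscr;
}
```

Because the bits are only ever ORed in, a flag raised by one instruction
stays set until the guest clears it by writing the FPSCR.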

* [Qemu-devel] [PATCH 1/7] target-arm: Don't set FP exceptions in recip, recip_sqrt estimate fns
  2011-05-06 12:48 [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting Peter Maydell
@ 2011-05-06 12:48 ` Peter Maydell
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 2/7] target-arm: Signal InputDenormal for VRECPE, VRSQRTE, VRECPS, VRSQRTS Peter Maydell
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2011-05-06 12:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

The functions which do the core estimation algorithms for the VRSQRTE
and VRECPE instructions should not set floating point exception flags,
so use a local fp status for doing these calculations.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/helper.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 62ae72e..5ff6a9b 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -2749,7 +2749,11 @@ float32 HELPER(rsqrts_f32)(float32 a, float32 b, CPUState *env)
  */
 static float64 recip_estimate(float64 a, CPUState *env)
 {
-    float_status *s = &env->vfp.standard_fp_status;
+    /* These calculations mustn't set any fp exception flags,
+     * so we use a local copy of the fp_status.
+     */
+    float_status dummy_status = env->vfp.standard_fp_status;
+    float_status *s = &dummy_status;
     /* q = (int)(a * 512.0) */
     float64 q = float64_mul(float64_512, a, s);
     int64_t q_int = float64_to_int64_round_to_zero(q, s);
@@ -2812,7 +2816,11 @@ float32 HELPER(recpe_f32)(float32 a, CPUState *env)
  */
 static float64 recip_sqrt_estimate(float64 a, CPUState *env)
 {
-    float_status *s = &env->vfp.standard_fp_status;
+    /* These calculations mustn't set any fp exception flags,
+     * so we use a local copy of the fp_status.
+     */
+    float_status dummy_status = env->vfp.standard_fp_status;
+    float_status *s = &dummy_status;
     float64 q;
     int64_t q_int;
 
-- 
1.7.1
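
The local-copy technique in this patch can be shown in isolation. The
sketch below uses a hypothetical sketch_status type and a toy operation in
place of softfloat's float_status and float64 routines; it only shows that
work done through a scratch copy leaves the caller's flags alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for softfloat's float_status; only the
 * field needed for this illustration is modelled. */
typedef struct {
    uint8_t exception_flags;  /* sticky, cumulative flag bits */
} sketch_status;

enum { SKETCH_FLAG_INEXACT = 1 };

/* Toy operation: halves x and records Inexact when a bit is lost. */
uint32_t sketch_halve(uint32_t x, sketch_status *s)
{
    assert(s != NULL);
    if (x & 1) {
        s->exception_flags |= SKETCH_FLAG_INEXACT;
    }
    return x >> 1;
}

/* Runs the intermediate step on a local copy of the status, so any
 * flags it raises are dropped when the copy goes out of scope. */
uint32_t sketch_estimate(uint32_t x, const sketch_status *real)
{
    sketch_status scratch = *real;  /* keeps the modes, isolates the flags */
    return sketch_halve(x, &scratch);
}
```

Copying the whole structure rather than zero-initialising a fresh one means
the scratch status inherits whatever rounding and flush-to-zero settings the
real one has, which is why the patch initialises dummy_status from
standard_fp_status.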

* [Qemu-devel] [PATCH 2/7] target-arm: Signal InputDenormal for VRECPE, VRSQRTE, VRECPS, VRSQRTS
  2011-05-06 12:48 [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting Peter Maydell
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 1/7] target-arm: Don't set FP exceptions in recip, recip_sqrt estimate fns Peter Maydell
@ 2011-05-06 12:48 ` Peter Maydell
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 3/7] target-arm: Signal InvalidOp for Neon GE and GT compares of QNaN Peter Maydell
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2011-05-06 12:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

The helpers for VRECPE.F32, VRSQRTE.F32, VRECPS and VRSQRTS handle denormals
as special cases, so we must set the InputDenormal exception flag ourselves.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/helper.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 5ff6a9b..f072527 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -2720,6 +2720,9 @@ float32 HELPER(recps_f32)(float32 a, float32 b, CPUState *env)
     float_status *s = &env->vfp.standard_fp_status;
     if ((float32_is_infinity(a) && float32_is_zero_or_denormal(b)) ||
         (float32_is_infinity(b) && float32_is_zero_or_denormal(a))) {
+        if (!(float32_is_zero(a) || float32_is_zero(b))) {
+            float_raise(float_flag_input_denormal, s);
+        }
         return float32_two;
     }
     return float32_sub(float32_two, float32_mul(a, b, s), s);
@@ -2731,6 +2734,9 @@ float32 HELPER(rsqrts_f32)(float32 a, float32 b, CPUState *env)
     float32 product;
     if ((float32_is_infinity(a) && float32_is_zero_or_denormal(b)) ||
         (float32_is_infinity(b) && float32_is_zero_or_denormal(a))) {
+        if (!(float32_is_zero(a) || float32_is_zero(b))) {
+            float_raise(float_flag_input_denormal, s);
+        }
         return float32_one_point_five;
     }
     product = float32_mul(a, b, s);
@@ -2791,6 +2797,9 @@ float32 HELPER(recpe_f32)(float32 a, CPUState *env)
     } else if (float32_is_infinity(a)) {
         return float32_set_sign(float32_zero, float32_is_neg(a));
     } else if (float32_is_zero_or_denormal(a)) {
+        if (!float32_is_zero(a)) {
+            float_raise(float_flag_input_denormal, s);
+        }
         float_raise(float_flag_divbyzero, s);
         return float32_set_sign(float32_infinity, float32_is_neg(a));
     } else if (a_exp >= 253) {
@@ -2882,6 +2891,9 @@ float32 HELPER(rsqrte_f32)(float32 a, CPUState *env)
         }
         return float32_default_nan;
     } else if (float32_is_zero_or_denormal(a)) {
+        if (!float32_is_zero(a)) {
+            float_raise(float_flag_input_denormal, s);
+        }
         float_raise(float_flag_divbyzero, s);
         return float32_set_sign(float32_infinity, float32_is_neg(a));
     } else if (float32_is_neg(a)) {
-- 
1.7.1
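
The zero-versus-denormal distinction the patch relies on can be sketched on
raw IEEE 754 single-precision bit patterns. The function and flag names
below are hypothetical; the sketch only models which flags the
zero-or-denormal branch of a reciprocal estimate would raise.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SKETCH_FLAG_INPUT_DENORMAL = 1,
    SKETCH_FLAG_DIVBYZERO      = 2
};

/* Exponent field all zeroes: the value is +/-0 or a denormal. */
int sketch_is_zero_or_denormal(uint32_t bits)
{
    return (bits & 0x7f800000u) == 0;
}

/* Everything except the sign bit is zero: the value is +/-0. */
int sketch_is_zero(uint32_t bits)
{
    return (bits & 0x7fffffffu) == 0;
}

/* Flags for the special case where the input is treated as zero.
 * A true zero raises only DivideByZero; a denormal that is flushed
 * to zero additionally raises InputDenormal. */
unsigned sketch_recpe_special_flags(uint32_t bits)
{
    unsigned flags = SKETCH_FLAG_DIVBYZERO;
    assert(sketch_is_zero_or_denormal(bits));
    if (!sketch_is_zero(bits)) {
        flags |= SKETCH_FLAG_INPUT_DENORMAL;
    }
    return flags;
}
```

For example, the bit pattern 0x00000001 (the smallest positive denormal)
yields both flags, while 0x80000000 (negative zero) yields DivideByZero
alone.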

* [Qemu-devel] [PATCH 3/7] target-arm: Signal InvalidOp for Neon GE and GT compares of QNaN
  2011-05-06 12:48 [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting Peter Maydell
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 1/7] target-arm: Don't set FP exceptions in recip, recip_sqrt estimate fns Peter Maydell
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 2/7] target-arm: Signal InputDenormal for VRECPE, VRSQRTE, VRECPS, VRSQRTS Peter Maydell
@ 2011-05-06 12:48 ` Peter Maydell
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 4/7] target-arm: Refactor int-float conversions Peter Maydell
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2011-05-06 12:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

If the input to a Neon float comparison is a quiet NaN, the ARM ARM
specifies that we should raise InvalidOp if the comparison is GE or GT
but not for EQ. (Signaling NaNs raise InvalidOp regardless). This means
only EQ should use the _quiet version of the comparison function.

We implement this by cleaning up the comparison helpers to call the
appropriate versions of the softfloat simple comparison functions
(float32_le and friends) rather than the generic float32_compare functions.
This makes them simple enough that they are clearer open-coded rather
than macroised.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/neon_helper.c |   40 ++++++++++++++++++----------------------
 1 files changed, 18 insertions(+), 22 deletions(-)

diff --git a/target-arm/neon_helper.c b/target-arm/neon_helper.c
index f5b173a..9165519 100644
--- a/target-arm/neon_helper.c
+++ b/target-arm/neon_helper.c
@@ -1802,41 +1802,37 @@ uint32_t HELPER(neon_mul_f32)(uint32_t a, uint32_t b)
     return float32_val(float32_mul(make_float32(a), make_float32(b), NFS));
 }
 
-/* Floating point comparisons produce an integer result.  */
-#define NEON_VOP_FCMP(name, ok) \
-uint32_t HELPER(neon_##name)(uint32_t a, uint32_t b) \
-{ \
-    switch (float32_compare_quiet(make_float32(a), make_float32(b), NFS)) { \
-    ok return ~0; \
-    default: return 0; \
-    } \
+/* Floating point comparisons produce an integer result.
+ * Note that EQ doesn't signal InvalidOp for QNaNs but GE and GT do.
+ * Softfloat routines return 0/1, which we convert to the 0/-1 Neon requires.
+ */
+uint32_t HELPER(neon_ceq_f32)(uint32_t a, uint32_t b)
+{
+    return -float32_eq_quiet(make_float32(a), make_float32(b), NFS);
+}
+
+uint32_t HELPER(neon_cge_f32)(uint32_t a, uint32_t b)
+{
+    return -float32_le(make_float32(b), make_float32(a), NFS);
 }
 
-NEON_VOP_FCMP(ceq_f32, case float_relation_equal:)
-NEON_VOP_FCMP(cge_f32, case float_relation_equal: case float_relation_greater:)
-NEON_VOP_FCMP(cgt_f32, case float_relation_greater:)
+uint32_t HELPER(neon_cgt_f32)(uint32_t a, uint32_t b)
+{
+    return -float32_lt(make_float32(b), make_float32(a), NFS);
+}
 
 uint32_t HELPER(neon_acge_f32)(uint32_t a, uint32_t b)
 {
     float32 f0 = float32_abs(make_float32(a));
     float32 f1 = float32_abs(make_float32(b));
-    switch (float32_compare_quiet(f0, f1, NFS)) {
-    case float_relation_equal:
-    case float_relation_greater:
-        return ~0;
-    default:
-        return 0;
-    }
+    return -float32_le(f1, f0, NFS);
 }
 
 uint32_t HELPER(neon_acgt_f32)(uint32_t a, uint32_t b)
 {
     float32 f0 = float32_abs(make_float32(a));
     float32 f1 = float32_abs(make_float32(b));
-    if (float32_compare_quiet(f0, f1, NFS) == float_relation_greater) {
-        return ~0;
-    }
-    return 0;
+    return -float32_lt(f1, f0, NFS);
 }
 
 #define ELEM(V, N, SIZE) (((V) >> ((N) * (SIZE))) & ((1ull << (SIZE)) - 1))
-- 
1.7.1
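
Two ideas from this patch can be sketched outside QEMU: the quiet versus
signalling split, and turning a 0/1 truth value into the 0/all-ones mask
Neon expects by negating it. The sketch below uses native C floats and
hypothetical sketch_* helpers rather than softfloat, and it does not model
signalling NaNs, so treat it as an illustration only.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

enum { SKETCH_FLAG_INVALID = 1 };

/* Quiet equality: a NaN operand compares unequal and raises nothing. */
static int sketch_eq_quiet(float a, float b, unsigned *flags)
{
    (void)flags;
    return !isnan(a) && !isnan(b) && a == b;
}

/* Signalling less-or-equal: any NaN operand raises Invalid. */
static int sketch_le(float a, float b, unsigned *flags)
{
    if (isnan(a) || isnan(b)) {
        *flags |= SKETCH_FLAG_INVALID;
        return 0;
    }
    return a <= b;
}

/* Unsigned negation maps 0 -> 0 and 1 -> 0xffffffff. */
uint32_t sketch_ceq(float a, float b, unsigned *flags)
{
    assert(flags != NULL);
    return -(uint32_t)sketch_eq_quiet(a, b, flags);
}

/* a >= b is evaluated as b <= a, the same operand swap the patch uses. */
uint32_t sketch_cge(float a, float b, unsigned *flags)
{
    assert(flags != NULL);
    return -(uint32_t)sketch_le(b, a, flags);
}
```

With a NaN as either operand, sketch_ceq returns 0 and leaves the flags
untouched, whereas sketch_cge returns 0 and sets the Invalid bit.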

* [Qemu-devel] [PATCH 4/7] target-arm: Refactor int-float conversions
  2011-05-06 12:48 [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting Peter Maydell
                   ` (2 preceding siblings ...)
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 3/7] target-arm: Signal InvalidOp for Neon GE and GT compares of QNaN Peter Maydell
@ 2011-05-06 12:48 ` Peter Maydell
  2011-05-06 14:09   ` Paul Brook
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 5/7] target-arm: Add separate Neon float-int conversion helpers Peter Maydell
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Peter Maydell @ 2011-05-06 12:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

The Neon versions of int-float conversions need their own helper routines
because they must use the "standard FPSCR" rather than the default one.
Refactor the helper functions to make it easy to add the neon versions.
While we're touching the code, move the helpers to op_helper.c so that
we can use the global env variable rather than passing it as a parameter.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/helper.c    |  125 ------------------------------------------------
 target-arm/helper.h    |   60 +++++++++++-----------
 target-arm/op_helper.c |   62 ++++++++++++++++++++++++
 target-arm/translate.c |   63 +++++++++++++-----------
 4 files changed, 127 insertions(+), 183 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index f072527..de00468 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -2526,100 +2526,6 @@ DO_VFP_cmp(s, float32)
 DO_VFP_cmp(d, float64)
 #undef DO_VFP_cmp
 
-/* Integer to float conversion.  */
-float32 VFP_HELPER(uito, s)(uint32_t x, CPUState *env)
-{
-    return uint32_to_float32(x, &env->vfp.fp_status);
-}
-
-float64 VFP_HELPER(uito, d)(uint32_t x, CPUState *env)
-{
-    return uint32_to_float64(x, &env->vfp.fp_status);
-}
-
-float32 VFP_HELPER(sito, s)(uint32_t x, CPUState *env)
-{
-    return int32_to_float32(x, &env->vfp.fp_status);
-}
-
-float64 VFP_HELPER(sito, d)(uint32_t x, CPUState *env)
-{
-    return int32_to_float64(x, &env->vfp.fp_status);
-}
-
-/* Float to integer conversion.  */
-uint32_t VFP_HELPER(toui, s)(float32 x, CPUState *env)
-{
-    if (float32_is_any_nan(x)) {
-        float_raise(float_flag_invalid, &env->vfp.fp_status);
-        return 0;
-    }
-    return float32_to_uint32(x, &env->vfp.fp_status);
-}
-
-uint32_t VFP_HELPER(toui, d)(float64 x, CPUState *env)
-{
-    if (float64_is_any_nan(x)) {
-        float_raise(float_flag_invalid, &env->vfp.fp_status);
-        return 0;
-    }
-    return float64_to_uint32(x, &env->vfp.fp_status);
-}
-
-uint32_t VFP_HELPER(tosi, s)(float32 x, CPUState *env)
-{
-    if (float32_is_any_nan(x)) {
-        float_raise(float_flag_invalid, &env->vfp.fp_status);
-        return 0;
-    }
-    return float32_to_int32(x, &env->vfp.fp_status);
-}
-
-uint32_t VFP_HELPER(tosi, d)(float64 x, CPUState *env)
-{
-    if (float64_is_any_nan(x)) {
-        float_raise(float_flag_invalid, &env->vfp.fp_status);
-        return 0;
-    }
-    return float64_to_int32(x, &env->vfp.fp_status);
-}
-
-uint32_t VFP_HELPER(touiz, s)(float32 x, CPUState *env)
-{
-    if (float32_is_any_nan(x)) {
-        float_raise(float_flag_invalid, &env->vfp.fp_status);
-        return 0;
-    }
-    return float32_to_uint32_round_to_zero(x, &env->vfp.fp_status);
-}
-
-uint32_t VFP_HELPER(touiz, d)(float64 x, CPUState *env)
-{
-    if (float64_is_any_nan(x)) {
-        float_raise(float_flag_invalid, &env->vfp.fp_status);
-        return 0;
-    }
-    return float64_to_uint32_round_to_zero(x, &env->vfp.fp_status);
-}
-
-uint32_t VFP_HELPER(tosiz, s)(float32 x, CPUState *env)
-{
-    if (float32_is_any_nan(x)) {
-        float_raise(float_flag_invalid, &env->vfp.fp_status);
-        return 0;
-    }
-    return float32_to_int32_round_to_zero(x, &env->vfp.fp_status);
-}
-
-uint32_t VFP_HELPER(tosiz, d)(float64 x, CPUState *env)
-{
-    if (float64_is_any_nan(x)) {
-        float_raise(float_flag_invalid, &env->vfp.fp_status);
-        return 0;
-    }
-    return float64_to_int32_round_to_zero(x, &env->vfp.fp_status);
-}
-
 /* floating point conversion */
 float64 VFP_HELPER(fcvtd, s)(float32 x, CPUState *env)
 {
@@ -2639,37 +2545,6 @@ float32 VFP_HELPER(fcvts, d)(float64 x, CPUState *env)
     return float32_maybe_silence_nan(r);
 }
 
-/* VFP3 fixed point conversion.  */
-#define VFP_CONV_FIX(name, p, fsz, itype, sign) \
-float##fsz VFP_HELPER(name##to, p)(uint##fsz##_t  x, uint32_t shift, \
-                                   CPUState *env) \
-{ \
-    float##fsz tmp; \
-    tmp = sign##int32_to_##float##fsz ((itype##_t)x, &env->vfp.fp_status); \
-    return float##fsz##_scalbn(tmp, -(int)shift, &env->vfp.fp_status); \
-} \
-uint##fsz##_t VFP_HELPER(to##name, p)(float##fsz x, uint32_t shift, \
-                                      CPUState *env) \
-{ \
-    float##fsz tmp; \
-    if (float##fsz##_is_any_nan(x)) { \
-        float_raise(float_flag_invalid, &env->vfp.fp_status); \
-        return 0; \
-    } \
-    tmp = float##fsz##_scalbn(x, shift, &env->vfp.fp_status); \
-    return float##fsz##_to_##itype##_round_to_zero(tmp, &env->vfp.fp_status); \
-}
-
-VFP_CONV_FIX(sh, d, 64, int16, )
-VFP_CONV_FIX(sl, d, 64, int32, )
-VFP_CONV_FIX(uh, d, 64, uint16, u)
-VFP_CONV_FIX(ul, d, 64, uint32, u)
-VFP_CONV_FIX(sh, s, 32, int16, )
-VFP_CONV_FIX(sl, s, 32, int32, )
-VFP_CONV_FIX(uh, s, 32, uint16, u)
-VFP_CONV_FIX(ul, s, 32, uint32, u)
-#undef VFP_CONV_FIX
-
 /* Half precision conversions.  */
 static float32 do_fcvt_f16_to_f32(uint32_t a, CPUState *env, float_status *s)
 {
diff --git a/target-arm/helper.h b/target-arm/helper.h
index ae701e8..2c54d5e 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -96,36 +96,36 @@ DEF_HELPER_3(vfp_cmped, void, f64, f64, env)
 DEF_HELPER_2(vfp_fcvtds, f64, f32, env)
 DEF_HELPER_2(vfp_fcvtsd, f32, f64, env)
 
-DEF_HELPER_2(vfp_uitos, f32, i32, env)
-DEF_HELPER_2(vfp_uitod, f64, i32, env)
-DEF_HELPER_2(vfp_sitos, f32, i32, env)
-DEF_HELPER_2(vfp_sitod, f64, i32, env)
-
-DEF_HELPER_2(vfp_touis, i32, f32, env)
-DEF_HELPER_2(vfp_touid, i32, f64, env)
-DEF_HELPER_2(vfp_touizs, i32, f32, env)
-DEF_HELPER_2(vfp_touizd, i32, f64, env)
-DEF_HELPER_2(vfp_tosis, i32, f32, env)
-DEF_HELPER_2(vfp_tosid, i32, f64, env)
-DEF_HELPER_2(vfp_tosizs, i32, f32, env)
-DEF_HELPER_2(vfp_tosizd, i32, f64, env)
-
-DEF_HELPER_3(vfp_toshs, i32, f32, i32, env)
-DEF_HELPER_3(vfp_tosls, i32, f32, i32, env)
-DEF_HELPER_3(vfp_touhs, i32, f32, i32, env)
-DEF_HELPER_3(vfp_touls, i32, f32, i32, env)
-DEF_HELPER_3(vfp_toshd, i64, f64, i32, env)
-DEF_HELPER_3(vfp_tosld, i64, f64, i32, env)
-DEF_HELPER_3(vfp_touhd, i64, f64, i32, env)
-DEF_HELPER_3(vfp_tould, i64, f64, i32, env)
-DEF_HELPER_3(vfp_shtos, f32, i32, i32, env)
-DEF_HELPER_3(vfp_sltos, f32, i32, i32, env)
-DEF_HELPER_3(vfp_uhtos, f32, i32, i32, env)
-DEF_HELPER_3(vfp_ultos, f32, i32, i32, env)
-DEF_HELPER_3(vfp_shtod, f64, i64, i32, env)
-DEF_HELPER_3(vfp_sltod, f64, i64, i32, env)
-DEF_HELPER_3(vfp_uhtod, f64, i64, i32, env)
-DEF_HELPER_3(vfp_ultod, f64, i64, i32, env)
+DEF_HELPER_1(vfp_uitos, f32, i32)
+DEF_HELPER_1(vfp_uitod, f64, i32)
+DEF_HELPER_1(vfp_sitos, f32, i32)
+DEF_HELPER_1(vfp_sitod, f64, i32)
+
+DEF_HELPER_1(vfp_touis, i32, f32)
+DEF_HELPER_1(vfp_touid, i32, f64)
+DEF_HELPER_1(vfp_touizs, i32, f32)
+DEF_HELPER_1(vfp_touizd, i32, f64)
+DEF_HELPER_1(vfp_tosis, i32, f32)
+DEF_HELPER_1(vfp_tosid, i32, f64)
+DEF_HELPER_1(vfp_tosizs, i32, f32)
+DEF_HELPER_1(vfp_tosizd, i32, f64)
+
+DEF_HELPER_2(vfp_toshs, i32, f32, i32)
+DEF_HELPER_2(vfp_tosls, i32, f32, i32)
+DEF_HELPER_2(vfp_touhs, i32, f32, i32)
+DEF_HELPER_2(vfp_touls, i32, f32, i32)
+DEF_HELPER_2(vfp_toshd, i64, f64, i32)
+DEF_HELPER_2(vfp_tosld, i64, f64, i32)
+DEF_HELPER_2(vfp_touhd, i64, f64, i32)
+DEF_HELPER_2(vfp_tould, i64, f64, i32)
+DEF_HELPER_2(vfp_shtos, f32, i32, i32)
+DEF_HELPER_2(vfp_sltos, f32, i32, i32)
+DEF_HELPER_2(vfp_uhtos, f32, i32, i32)
+DEF_HELPER_2(vfp_ultos, f32, i32, i32)
+DEF_HELPER_2(vfp_shtod, f64, i64, i32)
+DEF_HELPER_2(vfp_sltod, f64, i64, i32)
+DEF_HELPER_2(vfp_uhtod, f64, i64, i32)
+DEF_HELPER_2(vfp_ultod, f64, i64, i32)
 
 DEF_HELPER_2(vfp_fcvt_f16_to_f32, f32, i32, env)
 DEF_HELPER_2(vfp_fcvt_f32_to_f16, i32, f32, env)
diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index 8334fbc..1afea43 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -424,3 +424,65 @@ uint32_t HELPER(ror_cc)(uint32_t x, uint32_t i)
         return ((uint32_t)x >> shift) | (x << (32 - shift));
     }
 }
+
+/* Integer to float and float to integer conversions */
+
+#define CONV_ITOF(name, fsz, sign, fpst) \
+float##fsz HELPER(name)(uint32_t x) \
+{ \
+    return sign##int32_to_##float##fsz(x, fpst); \
+}
+
+#define CONV_FTOI(name, fsz, sign, fpst, round) \
+uint32_t HELPER(name)(float##fsz x) \
+{ \
+    if (float##fsz##_is_any_nan(x)) { \
+        float_raise(float_flag_invalid, fpst); \
+        return 0; \
+    } \
+    return float##fsz##_to_##sign##int32##round(x, fpst); \
+}
+
+#define VFP_CONVS(name, p, fsz, sign) \
+CONV_ITOF(vfp_##name##to##p, fsz, sign, &env->vfp.fp_status) \
+CONV_FTOI(vfp_##to##name##p, fsz, sign, &env->vfp.fp_status, ) \
+CONV_FTOI(vfp_##to##name##z##p, fsz, sign, &env->vfp.fp_status, _round_to_zero)
+
+VFP_CONVS(si, s, 32, )
+VFP_CONVS(si, d, 64, )
+VFP_CONVS(ui, s, 32, u)
+VFP_CONVS(ui, d, 64, u)
+
+#undef CONV_ITOF
+#undef CONV_FTOI
+#undef VFP_CONVS
+
+/* VFP3 fixed point conversion.  */
+#define VFP_CONV_FIX(pfx, name, p, fsz, itype, sign, status) \
+float##fsz HELPER(pfx##name##to##p)(uint##fsz##_t  x, uint32_t shift) \
+{ \
+    float##fsz tmp; \
+    tmp = sign##int32_to_##float##fsz((itype##_t)x, status); \
+    return float##fsz##_scalbn(tmp, -(int)shift, status); \
+} \
+uint##fsz##_t HELPER(pfx##to##name##p)(float##fsz x, uint32_t shift) \
+{ \
+    float##fsz tmp; \
+    if (float##fsz##_is_any_nan(x)) { \
+        float_raise(float_flag_invalid, status); \
+        return 0; \
+    } \
+    tmp = float##fsz##_scalbn(x, shift, status); \
+    return float##fsz##_to_##itype##_round_to_zero(tmp, status); \
+}
+
+VFP_CONV_FIX(vfp_, sh, d, 64, int16, , &env->vfp.fp_status)
+VFP_CONV_FIX(vfp_, sl, d, 64, int32, , &env->vfp.fp_status)
+VFP_CONV_FIX(vfp_, uh, d, 64, uint16, u, &env->vfp.fp_status)
+VFP_CONV_FIX(vfp_, ul, d, 64, uint32, u, &env->vfp.fp_status)
+VFP_CONV_FIX(vfp_, sh, s, 32, int16, , &env->vfp.fp_status)
+VFP_CONV_FIX(vfp_, sl, s, 32, int32, , &env->vfp.fp_status)
+VFP_CONV_FIX(vfp_, uh, s, 32, uint16, u, &env->vfp.fp_status)
+VFP_CONV_FIX(vfp_, ul, s, 32, uint32, u, &env->vfp.fp_status)
+
+#undef VFP_CONV_FIX
diff --git a/target-arm/translate.c b/target-arm/translate.c
index a1af436..195cf30 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -959,60 +959,67 @@ static inline void gen_vfp_F1_ld0(int dp)
 
 static inline void gen_vfp_uito(int dp)
 {
-    if (dp)
-        gen_helper_vfp_uitod(cpu_F0d, cpu_F0s, cpu_env);
-    else
-        gen_helper_vfp_uitos(cpu_F0s, cpu_F0s, cpu_env);
+    if (dp) {
+        gen_helper_vfp_uitod(cpu_F0d, cpu_F0s);
+    } else {
+        gen_helper_vfp_uitos(cpu_F0s, cpu_F0s);
+    }
 }
 
 static inline void gen_vfp_sito(int dp)
 {
-    if (dp)
-        gen_helper_vfp_sitod(cpu_F0d, cpu_F0s, cpu_env);
-    else
-        gen_helper_vfp_sitos(cpu_F0s, cpu_F0s, cpu_env);
+    if (dp) {
+        gen_helper_vfp_sitod(cpu_F0d, cpu_F0s);
+    } else {
+        gen_helper_vfp_sitos(cpu_F0s, cpu_F0s);
+    }
 }
 
 static inline void gen_vfp_toui(int dp)
 {
-    if (dp)
-        gen_helper_vfp_touid(cpu_F0s, cpu_F0d, cpu_env);
-    else
-        gen_helper_vfp_touis(cpu_F0s, cpu_F0s, cpu_env);
+    if (dp) {
+        gen_helper_vfp_touid(cpu_F0s, cpu_F0d);
+    } else {
+        gen_helper_vfp_touis(cpu_F0s, cpu_F0s);
+    }
 }
 
 static inline void gen_vfp_touiz(int dp)
 {
-    if (dp)
-        gen_helper_vfp_touizd(cpu_F0s, cpu_F0d, cpu_env);
-    else
-        gen_helper_vfp_touizs(cpu_F0s, cpu_F0s, cpu_env);
+    if (dp) {
+        gen_helper_vfp_touizd(cpu_F0s, cpu_F0d);
+    } else {
+        gen_helper_vfp_touizs(cpu_F0s, cpu_F0s);
+    }
 }
 
 static inline void gen_vfp_tosi(int dp)
 {
-    if (dp)
-        gen_helper_vfp_tosid(cpu_F0s, cpu_F0d, cpu_env);
-    else
-        gen_helper_vfp_tosis(cpu_F0s, cpu_F0s, cpu_env);
+    if (dp) {
+        gen_helper_vfp_tosid(cpu_F0s, cpu_F0d);
+    } else {
+        gen_helper_vfp_tosis(cpu_F0s, cpu_F0s);
+    }
 }
 
 static inline void gen_vfp_tosiz(int dp)
 {
-    if (dp)
-        gen_helper_vfp_tosizd(cpu_F0s, cpu_F0d, cpu_env);
-    else
-        gen_helper_vfp_tosizs(cpu_F0s, cpu_F0s, cpu_env);
+    if (dp) {
+        gen_helper_vfp_tosizd(cpu_F0s, cpu_F0d);
+    } else {
+        gen_helper_vfp_tosizs(cpu_F0s, cpu_F0s);
+    }
 }
 
 #define VFP_GEN_FIX(name) \
 static inline void gen_vfp_##name(int dp, int shift) \
 { \
     TCGv tmp_shift = tcg_const_i32(shift); \
-    if (dp) \
-        gen_helper_vfp_##name##d(cpu_F0d, cpu_F0d, tmp_shift, cpu_env);\
-    else \
-        gen_helper_vfp_##name##s(cpu_F0s, cpu_F0s, tmp_shift, cpu_env);\
+    if (dp) { \
+        gen_helper_vfp_##name##d(cpu_F0d, cpu_F0d, tmp_shift); \
+    } else { \
+        gen_helper_vfp_##name##s(cpu_F0s, cpu_F0s, tmp_shift); \
+    } \
     tcg_temp_free_i32(tmp_shift); \
 }
 VFP_GEN_FIX(tosh)
-- 
1.7.1

* Re: [Qemu-devel] [PATCH 4/7] target-arm: Refactor int-float conversions
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 4/7] target-arm: Refactor int-float conversions Peter Maydell
@ 2011-05-06 14:09   ` Paul Brook
  2011-05-06 14:42     ` Peter Maydell
  2011-05-06 15:30     ` Blue Swirl
  0 siblings, 2 replies; 15+ messages in thread
From: Paul Brook @ 2011-05-06 14:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, patches

> The Neon versions of int-float conversions need their own helper routines
> because they must use the "standard FPSCR" rather than the default one.
> Refactor the helper functions to make it easy to add the neon versions.
> While we're touching the code, move the helpers to op_helper.c so that
> we can use the global env variable rather than passing it as a parameter.

IMO this is going in the wrong direction.  We should be aiming for fewer
implicit accesses to cpu state, not more.

Maybe better would be to explicitly pass a pointer to the fp status. That way you
don't even need separate VFP and NEON variants of these routines.

Paul
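
The alternative suggested above, passing the fp status pointer explicitly,
might look roughly like the sketch below. All names are hypothetical, the
types are simplified stand-ins for QEMU's, and the conversion is a toy
float-to-uint32 routine, so this shows the shape of the interface rather
than a proposed implementation.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned exception_flags;
} sketch_status;

/* Simplified model of the two status blocks a CPU state would hold. */
typedef struct {
    sketch_status fp_status;           /* follows the guest's FPSCR */
    sketch_status standard_fp_status;  /* "standard FPSCR" for Neon */
} sketch_vfp;

enum { SKETCH_FLAG_INVALID = 1 };

/* One routine serves both VFP and Neon callers: the caller decides
 * which status block receives the exception flags. */
uint32_t sketch_to_uint32(float x, sketch_status *fpst)
{
    assert(fpst != NULL);
    if (isnan(x) || x <= -1.0f || x >= 4294967296.0f) {
        fpst->exception_flags |= SKETCH_FLAG_INVALID;
        /* NaN and negative inputs give 0, large inputs saturate. */
        return (x >= 4294967296.0f) ? UINT32_MAX : 0;
    }
    if (x < 0.0f) {
        return 0;  /* values in (-1, 0) truncate to zero */
    }
    return (uint32_t)x;  /* truncates toward zero */
}
```

A VFP caller would pass &vfp.fp_status and a Neon caller
&vfp.standard_fp_status, so no global env access and no per-unit variant of
the helper is needed.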

* Re: [Qemu-devel] [PATCH 4/7] target-arm: Refactor int-float conversions
  2011-05-06 14:09   ` Paul Brook
@ 2011-05-06 14:42     ` Peter Maydell
  2011-05-06 15:30     ` Blue Swirl
  1 sibling, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2011-05-06 14:42 UTC (permalink / raw)
  To: Paul Brook; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno, patches

On 6 May 2011 15:09, Paul Brook <paul@codesourcery.com> wrote:
>> The Neon versions of int-float conversions need their own helper routines
>> because they must use the "standard FPSCR" rather than the default one.
>> Refactor the helper functions to make it easy to add the neon versions.
>> While we're touching the code, move the helpers to op_helper.c so that
>> we can use the global env variable rather than passing it as a parameter.
>
> IMO this is going in the wrong direction.  We should be aiming for fewer
> implicit accesses to cpu state, not more.

I don't have a very strong feeling about this personally; I've just been
going in the direction suggested by past discussions, e.g.
http://lists.gnu.org/archive/html/qemu-devel/2011-04/msg00183.html

> Maybe better would be to explicitly pass a pointer to the fp status. That way you
> don't even need separate VFP and NEON variants of these routines.

If you were otherwise going to pass in a CPUState pointer then just passing
the pointer to the fp_status is probably better, yes.

-- PMM

* Re: [Qemu-devel] [PATCH 4/7] target-arm: Refactor int-float conversions
  2011-05-06 14:09   ` Paul Brook
  2011-05-06 14:42     ` Peter Maydell
@ 2011-05-06 15:30     ` Blue Swirl
  2011-05-06 16:38       ` Paul Brook
  1 sibling, 1 reply; 15+ messages in thread
From: Blue Swirl @ 2011-05-06 15:30 UTC (permalink / raw)
  To: Paul Brook; +Cc: Peter Maydell, qemu-devel, patches

On Fri, May 6, 2011 at 5:09 PM, Paul Brook <paul@codesourcery.com> wrote:
>> The Neon versions of int-float conversions need their own helper routines
>> because they must use the "standard FPSCR" rather than the default one.
>> Refactor the helper functions to make it easy to add the neon versions.
>> While we're touching the code, move the helpers to op_helper.c so that
>> we can use the global env variable rather than passing it as a parameter.
>
> IMO this is going in the wrong direction.  We should be aiming for fewer
> implicit accesses to cpu state, not more.

Performance-wise, the global env variable is faster and the register is
always available. Do you mean that we should aim to get rid of special
status of global env, so that if no op uses it, it could be discarded
to free a register?

> Maybe better would be to explicitly pass a pointer to the fp status. That way you
> don't even need separate VFP and NEON variants of these routines.

It would be nice to have generic float functions callable directly as
TCG helper.

* Re: [Qemu-devel] [PATCH 4/7] target-arm: Refactor int-float conversions
  2011-05-06 15:30     ` Blue Swirl
@ 2011-05-06 16:38       ` Paul Brook
  2011-05-08 10:32         ` Blue Swirl
  0 siblings, 1 reply; 15+ messages in thread
From: Paul Brook @ 2011-05-06 16:38 UTC (permalink / raw)
  To: Blue Swirl; +Cc: Peter Maydell, qemu-devel, patches

> On Fri, May 6, 2011 at 5:09 PM, Paul Brook <paul@codesourcery.com> wrote:
> >> The Neon versions of int-float conversions need their own helper
> >> routines because they must use the "standard FPSCR" rather than the
> >> default one. Refactor the helper functions to make it easy to add the
> >> neon versions. While we're touching the code, move the helpers to
> >> op_helper.c so that we can use the global env variable rather than
> >> passing it as a parameter.
> > 
> > IMO this is going in the wrong direction.  We should be aiming for fewer
> > implicit accesses to cpu state, not more.
> 
> Performance-wise, the global env variable is faster and the register is
> always available. 

Not entirely true.  Reserving the global env variable has significant cost, 
especially on hosts with limited register sets (e.g. x86).  It's also a rather
fragile hack.  There's a fairly long history of nasty hacks and things that
just don't work in this context.  For example you can't reliably include 
stdio.h from these files, and they often break if you turn optimization off, 
which makes debugging much harder than it should be.

> Do you mean that we should aim to get rid of special
> status of global env, so that if no op uses it, it could be discarded
> to free a register?

That's already true for most of qemu.  IMO more interesting is being able to 
tell TCG that helpers don't mess with cpu state in opaque ways.  In theory 
it's already possible to do that manually. However that tends to be somewhat 
fragile, relying on careful maintenance and code auditing, with mistakes
triggering subtle hard-to-debug failures.  Much better to avoid the implicit 
global interface and make all helper arguments explicit.

> > Maybe better would be to explicitly pass a pointer to the fp status. That
> > way you don't even need separate VFP and NEON variants of these
> > routines.
> 
> It would be nice to have generic float functions callable directly as
> TCG helper.

Possibly.  I'd have to look quite a bit closer to determine whether exposing 
the generic float functions is useful in practice, or if you still end up 
needing wrappers in most cases for most targets.  Adding "native" floating 
point support to the TCG interface is also a possibility.  In practice this 
might end up as wrappers around helper functions, but might provide a nicer 
programming interface.

Paul

* Re: [Qemu-devel] [PATCH 4/7] target-arm: Refactor int-float conversions
  2011-05-06 16:38       ` Paul Brook
@ 2011-05-08 10:32         ` Blue Swirl
  2011-05-14 22:38           ` Aurelien Jarno
  0 siblings, 1 reply; 15+ messages in thread
From: Blue Swirl @ 2011-05-08 10:32 UTC (permalink / raw)
  To: Paul Brook; +Cc: Peter Maydell, qemu-devel, patches

On Fri, May 6, 2011 at 7:38 PM, Paul Brook <paul@codesourcery.com> wrote:
>> On Fri, May 6, 2011 at 5:09 PM, Paul Brook <paul@codesourcery.com> wrote:
>> >> The Neon versions of int-float conversions need their own helper
>> >> routines because they must use the "standard FPSCR" rather than the
>> >> default one. Refactor the helper functions to make it easy to add the
>> >> neon versions. While we're touching the code, move the helpers to
>> >> op_helper.c so that we can use the global env variable rather than
>> >> passing it as a parameter.
>> >
>> > IMO this is going in the wrong direction.  We should be aiming for less
>> > implicit accesses to cpu state, not more.
>>
>> Performance wise global env variable is faster and the register is
>> always available.
>
> Not entirely true.  Reserving the global env variable has significant cost,
> especially on hosts with limited register sets (e.g. x86).  It's also a rather
> fragile hack.  There's a fairly long history of nasty hacks and things that
> just don't work in this context.  For example you can't reliably include
> stdio.h from these files, and they often break if you turn optimization off,
> which makes debugging much harder than it should be.

Even if we don't reserve the register, in many cases a corresponding
pointer to CPUState will be needed. But there will still be the
advantage that this temporary pointer can be discarded while the
globally reserved register is reserved forever.

>> Do you mean that we should aim to get rid of special
>> status of global env, so that if no op uses it, it could be discarded
>> to free a register?
>
> That's already true for most of qemu.  IMO more interesting is being able to
> tell TCG that helpers don't mess with cpu state in opaque ways.  In theory
> it's already possible to do that manually. However that tends to be somewhat
> fragile, relying on careful maintenance and code auditing, with mistakes
> triggering subtle hard-to-debug failures.  Much better to avoid the implicit
> global interface and make all helper arguments explicit.

OK. This will be a major refactoring, but given the expected
performance increase, it should be done.

>> > Maybe better would be to explicitly pass a pointer to the fp status. That
>> > way you don't even need separate VFP and NEON variants of these
>> > routines.
>>
>> It would be nice to have generic float functions callable directly as
>> TCG helper.
>
> Possibly.  I'd have to look quite a bit closer to determine whether exposing
> the generic float functions is useful in practice, or if you still end up
> needing wrappers in most cases for most targets.  Adding "native" floating
> point support to the TCG interface is also a possibility.  In practice this
> might end up as wrappers around helper functions, but might provide a nicer
> programming interface.
>
> Paul
>

* Re: [Qemu-devel] [PATCH 4/7] target-arm: Refactor int-float conversions
  2011-05-08 10:32         ` Blue Swirl
@ 2011-05-14 22:38           ` Aurelien Jarno
  0 siblings, 0 replies; 15+ messages in thread
From: Aurelien Jarno @ 2011-05-14 22:38 UTC (permalink / raw)
  To: Blue Swirl; +Cc: Peter Maydell, patches, Paul Brook, qemu-devel

On Sun, May 08, 2011 at 01:32:34PM +0300, Blue Swirl wrote:
> On Fri, May 6, 2011 at 7:38 PM, Paul Brook <paul@codesourcery.com> wrote:
> >> On Fri, May 6, 2011 at 5:09 PM, Paul Brook <paul@codesourcery.com> wrote:
> >> >> The Neon versions of int-float conversions need their own helper
> >> >> routines because they must use the "standard FPSCR" rather than the
> >> >> default one. Refactor the helper functions to make it easy to add the
> >> >> neon versions. While we're touching the code, move the helpers to
> >> >> op_helper.c so that we can use the global env variable rather than
> >> >> passing it as a parameter.
> >> >
> >> > IMO this is going in the wrong direction.  We should be aiming for less
> >> > implicit accesses to cpu state, not more.
> >>
> >> Performance wise global env variable is faster and the register is
> >> always available.
> >
> > Not entirely true.  Reserving the global env variable has significant cost,
> > especially on hosts with limited register sets (e.g. x86).  It's also a rather
> > fragile hack.  There's a fairly long history of nasty hacks and things that
> > just don't work in this context.  For example you can't reliably include
> > stdio.h from these files, and they often break if you turn optimization off,
> > which makes debugging much harder than it should be.
> 
> Even if we don't reserve the register, in many cases a corresponding
> pointer to CPUState will be needed. But there will still be the
> advantage that this temporary pointer can be discarded while the
> globally reserved register is reserved forever.
> 
> >> Do you mean that we should aim to get rid of special
> >> status of global env, so that if no op uses it, it could be discarded
> >> to free a register?
> >
> > That's already true for most of qemu.  IMO more interesting is being able to
> > tell TCG that helpers don't mess with cpu state in opaque ways.  In theory
> > it's already possible to do that manually. However that tends to be somewhat
> > fragile, relying on careful maintenance and code auditing, with mistakes
> > triggering subtle hard-to-debug failures.  Much better to avoid the implicit
> > global interface and make all helper arguments explicit.
> 
> OK. This will be a major refactoring, but given the expected
> performance increase, it should be done.
> 

We might want to do it from the cleanliness point of view, but I really
doubt we should expect a performance increase from this (actually I think
it will be the contrary).

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* [Qemu-devel] [PATCH 5/7] target-arm: Add separate Neon float-int conversion helpers
  2011-05-06 12:48 [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting Peter Maydell
                   ` (3 preceding siblings ...)
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 4/7] target-arm: Refactor int-float conversions Peter Maydell
@ 2011-05-06 12:48 ` Peter Maydell
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 6/7] softfloat: Add new flag for when denormal result is flushed to zero Peter Maydell
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2011-05-06 12:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

Add the Neon-specific float-int conversion helper functions which
use the standard FPSCR value rather than the VFP FPSCR.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/helper.h    |   10 ++++++++++
 target-arm/op_helper.c |   12 ++++++++++++
 target-arm/translate.c |   29 +++++++++++++++++------------
 3 files changed, 39 insertions(+), 12 deletions(-)

diff --git a/target-arm/helper.h b/target-arm/helper.h
index 2c54d5e..1b4005a 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -127,6 +127,16 @@ DEF_HELPER_2(vfp_sltod, f64, i64, i32)
 DEF_HELPER_2(vfp_uhtod, f64, i64, i32)
 DEF_HELPER_2(vfp_ultod, f64, i64, i32)
 
+DEF_HELPER_1(neon_sitos, f32, i32)
+DEF_HELPER_1(neon_uitos, f32, i32)
+DEF_HELPER_1(neon_tosizs, i32, f32)
+DEF_HELPER_1(neon_touizs, i32, f32)
+
+DEF_HELPER_2(neon_ultos, f32, i32, i32);
+DEF_HELPER_2(neon_sltos, f32, i32, i32);
+DEF_HELPER_2(neon_touls, i32, f32, i32);
+DEF_HELPER_2(neon_tosls, i32, f32, i32);
+
 DEF_HELPER_2(vfp_fcvt_f16_to_f32, f32, i32, env)
 DEF_HELPER_2(vfp_fcvt_f32_to_f16, i32, f32, env)
 DEF_HELPER_2(neon_fcvt_f16_to_f32, f32, i32, env)
diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index 1afea43..3998d9c 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -448,14 +448,23 @@ CONV_ITOF(vfp_##name##to##p, fsz, sign, &env->vfp.fp_status) \
 CONV_FTOI(vfp_##to##name##p, fsz, sign, &env->vfp.fp_status, ) \
 CONV_FTOI(vfp_##to##name##z##p, fsz, sign, &env->vfp.fp_status, _round_to_zero)
 
+#define NEON_CONVS(name, p, fsz, sign) \
+CONV_ITOF(neon_##name##to##p, fsz, sign, &env->vfp.standard_fp_status) \
+CONV_FTOI(neon_##to##name##z##p, fsz, sign, &env->vfp.standard_fp_status, \
+          _round_to_zero)
+
 VFP_CONVS(si, s, 32, )
 VFP_CONVS(si, d, 64, )
 VFP_CONVS(ui, s, 32, u)
 VFP_CONVS(ui, d, 64, u)
 
+NEON_CONVS(si, s, 32, )
+NEON_CONVS(ui, s, 32, u)
+
 #undef CONV_ITOF
 #undef CONV_FTOI
 #undef VFP_CONVS
+#undef NEON_CONVS
 
 /* VFP3 fixed point conversion.  */
 #define VFP_CONV_FIX(pfx, name, p, fsz, itype, sign, status) \
@@ -485,4 +494,7 @@ VFP_CONV_FIX(vfp_, sl, s, 32, int32, , &env->vfp.fp_status)
 VFP_CONV_FIX(vfp_, uh, s, 32, uint16, u, &env->vfp.fp_status)
 VFP_CONV_FIX(vfp_, ul, s, 32, uint32, u, &env->vfp.fp_status)
 
+VFP_CONV_FIX(neon_, sl, s, 32, int32, , &env->vfp.standard_fp_status)
+VFP_CONV_FIX(neon_, ul, s, 32, uint32, u, &env->vfp.standard_fp_status)
+
 #undef VFP_CONV_FIX
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 195cf30..10592a5 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -5220,6 +5220,7 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                 }
             } else if (op >= 14) {
                 /* VCVT fixed-point.  */
+                TCGv tmp_shift;
                 if (!(insn & (1 << 21)) || (q && ((rd | rm) & 1))) {
                     return 1;
                 }
@@ -5227,21 +5228,25 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                  * hence this 32-shift where the ARM ARM has 64-imm6.
                  */
                 shift = 32 - shift;
+                tmp_shift = tcg_const_i32(shift);
                 for (pass = 0; pass < (q ? 4 : 2); pass++) {
                     tcg_gen_ld_f32(cpu_F0s, cpu_env, neon_reg_offset(rm, pass));
                     if (!(op & 1)) {
-                        if (u)
-                            gen_vfp_ulto(0, shift);
-                        else
-                            gen_vfp_slto(0, shift);
+                        if (u) {
+                            gen_helper_neon_ultos(cpu_F0s, cpu_F0s, tmp_shift);
+                        } else {
+                            gen_helper_neon_sltos(cpu_F0s, cpu_F0s, tmp_shift);
+                        }
                     } else {
-                        if (u)
-                            gen_vfp_toul(0, shift);
-                        else
-                            gen_vfp_tosl(0, shift);
+                        if (u) {
+                            gen_helper_neon_touls(cpu_F0s, cpu_F0s, tmp_shift);
+                        } else {
+                            gen_helper_neon_tosls(cpu_F0s, cpu_F0s, tmp_shift);
+                        }
                     }
                     tcg_gen_st_f32(cpu_F0s, cpu_env, neon_reg_offset(rd, pass));
                 }
+                tcg_temp_free_i32(tmp_shift);
             } else {
                 return 1;
             }
@@ -6051,16 +6056,16 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                             gen_helper_rsqrte_f32(cpu_F0s, cpu_F0s, cpu_env);
                             break;
                         case NEON_2RM_VCVT_FS: /* VCVT.F32.S32 */
-                            gen_vfp_sito(0);
+                            gen_helper_neon_sitos(cpu_F0s, cpu_F0s);
                             break;
                         case NEON_2RM_VCVT_FU: /* VCVT.F32.U32 */
-                            gen_vfp_uito(0);
+                            gen_helper_neon_uitos(cpu_F0s, cpu_F0s);
                             break;
                         case NEON_2RM_VCVT_SF: /* VCVT.S32.F32 */
-                            gen_vfp_tosiz(0);
+                            gen_helper_neon_tosizs(cpu_F0s, cpu_F0s);
                             break;
                         case NEON_2RM_VCVT_UF: /* VCVT.U32.F32 */
-                            gen_vfp_touiz(0);
+                            gen_helper_neon_touizs(cpu_F0s, cpu_F0s);
                             break;
                         default:
                             /* Reserved op values were caught by the
-- 
1.7.1
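
For readers following the fixed-point VCVT hunk above, the "32 - shift"
arithmetic and the integer-to-float direction can be sketched as follows.
This is not code from the patch: the function names are hypothetical, the
host's float type stands in for softfloat, and exception flags are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Number of fraction bits as the ARM ARM states it, from the full imm6
 * field (whose top bit must be 1 for this encoding). */
static int fracbits_from_imm6(int imm6)
{
    return 64 - imm6;
}

/* The same quantity once the must-be-1 top bit has been masked off,
 * which is why the translator computes 32 - shift. */
static int fracbits_from_masked(int low5)
{
    return 32 - low5;
}

/* Signed fixed-point to float: the integer is read as x / 2^fracbits.
 * fracbits is assumed to be in the range 1..32. */
static float fixed_to_float_sketch(int32_t x, int fracbits)
{
    return (float)x / (float)(1ULL << fracbits);
}
```

For example, fixed_to_float_sketch(3, 1) treats the input as 3/2 and
yields 1.5.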

* [Qemu-devel] [PATCH 6/7] softfloat: Add new flag for when denormal result is flushed to zero
  2011-05-06 12:48 [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting Peter Maydell
                   ` (4 preceding siblings ...)
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 5/7] target-arm: Add separate Neon float-int conversion helpers Peter Maydell
@ 2011-05-06 12:48 ` Peter Maydell
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 7/7] target-arm: Signal Underflow when denormal flushed to zero on output Peter Maydell
  2011-05-17 18:19 ` [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting Peter Maydell
  7 siblings, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2011-05-06 12:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

Add a new float_flag_output_denormal which is set when the result
of a floating point operation would be denormal but is flushed to
zero because we are in flush_to_zero mode. This is necessary because
some architectures signal this condition as an underflow and others
signal it as an inexact result.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 fpu/softfloat.c |   41 ++++++++++++++++++++++++++++++++++-------
 fpu/softfloat.h |    3 ++-
 2 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index baba1dc..e3cd8a7 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -341,7 +341,10 @@ static float32 roundAndPackFloat32( flag zSign, int16 zExp, uint32_t zSig STATUS
             return packFloat32( zSign, 0xFF, - ( roundIncrement == 0 ));
         }
         if ( zExp < 0 ) {
-            if ( STATUS(flush_to_zero) ) return packFloat32( zSign, 0, 0 );
+            if (STATUS(flush_to_zero)) {
+                float_raise(float_flag_output_denormal STATUS_VAR);
+                return packFloat32(zSign, 0, 0);
+            }
             isTiny =
                    ( STATUS(float_detect_tininess) == float_tininess_before_rounding )
                 || ( zExp < -1 )
@@ -520,7 +523,10 @@ static float64 roundAndPackFloat64( flag zSign, int16 zExp, uint64_t zSig STATUS
             return packFloat64( zSign, 0x7FF, - ( roundIncrement == 0 ));
         }
         if ( zExp < 0 ) {
-            if ( STATUS(flush_to_zero) ) return packFloat64( zSign, 0, 0 );
+            if (STATUS(flush_to_zero)) {
+                float_raise(float_flag_output_denormal STATUS_VAR);
+                return packFloat64(zSign, 0, 0);
+            }
             isTiny =
                    ( STATUS(float_detect_tininess) == float_tininess_before_rounding )
                 || ( zExp < -1 )
@@ -699,7 +705,10 @@ static floatx80
             goto overflow;
         }
         if ( zExp <= 0 ) {
-            if ( STATUS(flush_to_zero) ) return packFloatx80( zSign, 0, 0 );
+            if (STATUS(flush_to_zero)) {
+                float_raise(float_flag_output_denormal STATUS_VAR);
+                return packFloatx80(zSign, 0, 0);
+            }
             isTiny =
                    ( STATUS(float_detect_tininess) == float_tininess_before_rounding )
                 || ( zExp < 0 )
@@ -1030,7 +1039,10 @@ static float128
             return packFloat128( zSign, 0x7FFF, 0, 0 );
         }
         if ( zExp < 0 ) {
-            if ( STATUS(flush_to_zero) ) return packFloat128( zSign, 0, 0, 0 );
+            if (STATUS(flush_to_zero)) {
+                float_raise(float_flag_output_denormal STATUS_VAR);
+                return packFloat128(zSign, 0, 0, 0);
+            }
             isTiny =
                    ( STATUS(float_detect_tininess) == float_tininess_before_rounding )
                 || ( zExp < -1 )
@@ -1761,7 +1773,12 @@ static float32 addFloat32Sigs( float32 a, float32 b, flag zSign STATUS_PARAM)
             return a;
         }
         if ( aExp == 0 ) {
-            if ( STATUS(flush_to_zero) ) return packFloat32( zSign, 0, 0 );
+            if (STATUS(flush_to_zero)) {
+                if (aSig | bSig) {
+                    float_raise(float_flag_output_denormal STATUS_VAR);
+                }
+                return packFloat32(zSign, 0, 0);
+            }
             return packFloat32( zSign, 0, ( aSig + bSig )>>6 );
         }
         zSig = 0x40000000 + aSig + bSig;
@@ -3120,7 +3137,12 @@ static float64 addFloat64Sigs( float64 a, float64 b, flag zSign STATUS_PARAM )
             return a;
         }
         if ( aExp == 0 ) {
-            if ( STATUS(flush_to_zero) ) return packFloat64( zSign, 0, 0 );
+            if (STATUS(flush_to_zero)) {
+                if (aSig | bSig) {
+                    float_raise(float_flag_output_denormal STATUS_VAR);
+                }
+                return packFloat64(zSign, 0, 0);
+            }
             return packFloat64( zSign, 0, ( aSig + bSig )>>9 );
         }
         zSig = LIT64( 0x4000000000000000 ) + aSig + bSig;
@@ -5282,7 +5304,12 @@ static float128 addFloat128Sigs( float128 a, float128 b, flag zSign STATUS_PARAM
         }
         add128( aSig0, aSig1, bSig0, bSig1, &zSig0, &zSig1 );
         if ( aExp == 0 ) {
-            if ( STATUS(flush_to_zero) ) return packFloat128( zSign, 0, 0, 0 );
+            if (STATUS(flush_to_zero)) {
+                if (zSig0 | zSig1) {
+                    float_raise(float_flag_output_denormal STATUS_VAR);
+                }
+                return packFloat128(zSign, 0, 0, 0);
+            }
             return packFloat128( zSign, 0, zSig0, zSig1 );
         }
         zSig2 = 0;
diff --git a/fpu/softfloat.h b/fpu/softfloat.h
index 5eff085..58c9b7b 100644
--- a/fpu/softfloat.h
+++ b/fpu/softfloat.h
@@ -193,7 +193,8 @@ enum {
     float_flag_overflow  =  8,
     float_flag_underflow = 16,
     float_flag_inexact   = 32,
-    float_flag_input_denormal = 64
+    float_flag_input_denormal = 64,
+    float_flag_output_denormal = 128
 };
 
 typedef struct float_status {
-- 
1.7.1
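
The behaviour this patch adds can be modelled in isolation roughly as below.
This is a simplified sketch, not softfloat code: it inspects an
already-packed float32 bit pattern rather than hooking into the rounding
routines, and the struct and function names are hypothetical. Only the flag
value 128 is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FLAG_OUTPUT_DENORMAL 128 /* value used by the patch */

/* Hypothetical cut-down status word. */
typedef struct FloatStatusSketch {
    int flush_to_zero;
    int exception_flags;
} FloatStatusSketch;

/* If the result is a denormal and flush-to-zero is enabled, record that an
 * output denormal was flushed and return a zero of the same sign. A true
 * zero result is passed through without raising the flag. */
static uint32_t flush_output_sketch(uint32_t bits, FloatStatusSketch *s)
{
    uint32_t exp = (bits >> 23) & 0xff;
    uint32_t frac = bits & 0x7fffff;

    if (s->flush_to_zero && exp == 0 && frac != 0) {
        s->exception_flags |= SKETCH_FLAG_OUTPUT_DENORMAL;
        return bits & 0x80000000u;
    }
    return bits;
}
```

Keeping this as a separate flag leaves it to each target to decide whether
to report the event as underflow, inexact, or something else.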

* [Qemu-devel] [PATCH 7/7] target-arm: Signal Underflow when denormal flushed to zero on output
  2011-05-06 12:48 [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting Peter Maydell
                   ` (5 preceding siblings ...)
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 6/7] softfloat: Add new flag for when denormal result is flushed to zero Peter Maydell
@ 2011-05-06 12:48 ` Peter Maydell
  2011-05-17 18:19 ` [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting Peter Maydell
  7 siblings, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2011-05-06 12:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

On ARM the architecture mandates that when an output denormal is flushed to
zero we must set the FPSCR UFC (underflow) bit, so map softfloat's
float_flag_output_denormal accordingly.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/helper.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index de00468..149fc82 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -2355,7 +2355,7 @@ static inline int vfp_exceptbits_from_host(int host_bits)
         target_bits |= 2;
     if (host_bits & float_flag_overflow)
         target_bits |= 4;
-    if (host_bits & float_flag_underflow)
+    if (host_bits & (float_flag_underflow | float_flag_output_denormal))
         target_bits |= 8;
     if (host_bits & float_flag_inexact)
         target_bits |= 0x10;
-- 
1.7.1
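
The effect of the one-line change can be checked with a cut-down version of
the mapping. This sketch covers only the three flags visible in the hunk
above (the real function handles more), and the enum and function names are
hypothetical; the numeric values are the ones shown in this series.

```c
#include <assert.h>

/* Flag values as they appear in fpu/softfloat.h in this series. */
enum {
    SK_FLAG_OVERFLOW        = 8,
    SK_FLAG_UNDERFLOW       = 16,
    SK_FLAG_INEXACT         = 32,
    SK_FLAG_OUTPUT_DENORMAL = 128
};

/* Partial host-to-FPSCR mapping: a flushed output denormal is reported
 * through the same cumulative bit (8) as an ordinary underflow. */
static int exceptbits_sketch(int host_bits)
{
    int target_bits = 0;

    if (host_bits & SK_FLAG_OVERFLOW) {
        target_bits |= 4;
    }
    if (host_bits & (SK_FLAG_UNDERFLOW | SK_FLAG_OUTPUT_DENORMAL)) {
        target_bits |= 8;
    }
    if (host_bits & SK_FLAG_INEXACT) {
        target_bits |= 0x10;
    }
    return target_bits;
}
```

With this mapping, exceptbits_sketch(SK_FLAG_OUTPUT_DENORMAL) and
exceptbits_sketch(SK_FLAG_UNDERFLOW) both return 8.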

* Re: [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting
  2011-05-06 12:48 [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting Peter Maydell
                   ` (6 preceding siblings ...)
  2011-05-06 12:48 ` [Qemu-devel] [PATCH 7/7] target-arm: Signal Underflow when denormal flushed to zero on output Peter Maydell
@ 2011-05-17 18:19 ` Peter Maydell
  7 siblings, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2011-05-17 18:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

On 6 May 2011 13:48, Peter Maydell <peter.maydell@linaro.org> wrote:
> This patch series fixes a number of minor bugs in the ARM target where
> we were not correctly setting the cumulative exception flags in the
> FPSCR. It includes adding a new flag to softfloat indicating when a
> denormal result has been flushed to zero (as discussed previously on
> the list.)

> Peter Maydell (7):
>  target-arm: Don't set FP exceptions in recip, recip_sqrt estimate fns
>  target-arm: Signal InputDenormal for VRECPE, VRSQRTE, VRECPS, VRSQRTS
>  target-arm: Signal InvalidOp for Neon GE and GT compares of QNaN
>  target-arm: Refactor int-float conversions
>  target-arm: Add separate Neon float-int conversion helpers
>  softfloat: Add new flag for when denormal result is flushed to zero
>  target-arm: Signal Underflow when denormal flushed to zero on output

I'm redoing patches 4 and 5 based on review comments; does anybody
have any comments on 1, 2, 3, 6 and 7? (If not, they could all be
applied now, I guess -- 6 and 7 don't depend on 4 and 5. Otherwise
I'll put them all into my resend as-is.)

thanks
-- PMM

Thread overview: 15+ messages
2011-05-06 12:48 [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting Peter Maydell
2011-05-06 12:48 ` [Qemu-devel] [PATCH 1/7] target-arm: Don't set FP exceptions in recip, recip_sqrt estimate fns Peter Maydell
2011-05-06 12:48 ` [Qemu-devel] [PATCH 2/7] target-arm: Signal InputDenormal for VRECPE, VRSQRTE, VRECPS, VRSQRTS Peter Maydell
2011-05-06 12:48 ` [Qemu-devel] [PATCH 3/7] target-arm: Signal InvalidOp for Neon GE and GT compares of QNaN Peter Maydell
2011-05-06 12:48 ` [Qemu-devel] [PATCH 4/7] target-arm: Refactor int-float conversions Peter Maydell
2011-05-06 14:09   ` Paul Brook
2011-05-06 14:42     ` Peter Maydell
2011-05-06 15:30     ` Blue Swirl
2011-05-06 16:38       ` Paul Brook
2011-05-08 10:32         ` Blue Swirl
2011-05-14 22:38           ` Aurelien Jarno
2011-05-06 12:48 ` [Qemu-devel] [PATCH 5/7] target-arm: Add separate Neon float-int conversion helpers Peter Maydell
2011-05-06 12:48 ` [Qemu-devel] [PATCH 6/7] softfloat: Add new flag for when denormal result is flushed to zero Peter Maydell
2011-05-06 12:48 ` [Qemu-devel] [PATCH 7/7] target-arm: Signal Underflow when denormal flushed to zero on output Peter Maydell
2011-05-17 18:19 ` [Qemu-devel] [PATCH 0/7] target-arm: Fix bugs in fp exception flag setting Peter Maydell
