* [Qemu-devel] [PATCH v2 01/14] target/arm: Implement vector shifted SCVF/UCVF for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 16:03   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 02/14] target/arm: Implement vector shifted FCVT " Richard Henderson
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable
While we have some of the scalar paths for *CVF for fp16,
we failed to decode the fp16 version of these instructions.
Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Use parens with (x << y) >> z.
---
 target/arm/translate-a64.c | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index bff4e13bf6..68ca445691 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -7165,13 +7165,26 @@ static void handle_simd_shift_intfp_conv(DisasContext *s, bool is_scalar,
                                          int immh, int immb, int opcode,
                                          int rn, int rd)
 {
-    bool is_double = extract32(immh, 3, 1);
-    int size = is_double ? MO_64 : MO_32;
-    int elements;
+    int size, elements, fracbits;
     int immhb = immh << 3 | immb;
-    int fracbits = (is_double ? 128 : 64) - immhb;
 
-    if (!extract32(immh, 2, 2)) {
+    if (immh & 8) {
+        size = MO_64;
+        if (!is_scalar && !is_q) {
+            unallocated_encoding(s);
+            return;
+        }
+    } else if (immh & 4) {
+        size = MO_32;
+    } else if (immh & 2) {
+        size = MO_16;
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            unallocated_encoding(s);
+            return;
+        }
+    } else {
+        /* immh == 0 would be a failure of the decode logic */
+        g_assert(immh == 1);
         unallocated_encoding(s);
         return;
     }
@@ -7179,20 +7192,14 @@ static void handle_simd_shift_intfp_conv(DisasContext *s, bool is_scalar,
     if (is_scalar) {
         elements = 1;
     } else {
-        elements = is_double ? 2 : is_q ? 4 : 2;
-        if (is_double && !is_q) {
-            unallocated_encoding(s);
-            return;
-        }
+        elements = (8 << is_q) >> size;
     }
+    fracbits = (16 << size) - immhb;
 
     if (!fp_access_check(s)) {
         return;
     }
 
-    /* immh == 0 would be a failure of the decode logic */
-    g_assert(immh);
-
     handle_simd_intfp_conv(s, rd, rn, elements, !is_u, fracbits, size);
 }
 
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* [Qemu-devel] [PATCH v2 02/14] target/arm: Implement vector shifted FCVT for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 01/14] target/arm: Implement vector shifted SCVF/UCVF for fp16 Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 16:04   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 03/14] target/arm: Fix float16 to/from int16 Richard Henderson
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable
While we have some of the scalar paths for FCVT for fp16,
we failed to decode the fp16 version of these instructions.
Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Use parens with (x << y) >> z.
---
 target/arm/translate-a64.c | 65 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 46 insertions(+), 19 deletions(-)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 68ca445691..a64673575a 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -7208,19 +7208,28 @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
                                          bool is_q, bool is_u,
                                          int immh, int immb, int rn, int rd)
 {
-    bool is_double = extract32(immh, 3, 1);
     int immhb = immh << 3 | immb;
-    int fracbits = (is_double ? 128 : 64) - immhb;
-    int pass;
+    int pass, size, fracbits;
     TCGv_ptr tcg_fpstatus;
     TCGv_i32 tcg_rmode, tcg_shift;
 
-    if (!extract32(immh, 2, 2)) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!is_scalar && !is_q && is_double) {
+    if (immh & 0x8) {
+        size = MO_64;
+        if (!is_scalar && !is_q) {
+            unallocated_encoding(s);
+            return;
+        }
+    } else if (immh & 0x4) {
+        size = MO_32;
+    } else if (immh & 0x2) {
+        size = MO_16;
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            unallocated_encoding(s);
+            return;
+        }
+    } else {
+        /* Should have split out AdvSIMD modified immediate earlier.  */
+        assert(immh == 1);
         unallocated_encoding(s);
         return;
     }
@@ -7232,11 +7241,12 @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
     assert(!(is_scalar && is_q));
 
     tcg_rmode = tcg_const_i32(arm_rmode_to_sf(FPROUNDING_ZERO));
-    tcg_fpstatus = get_fpstatus_ptr(false);
+    tcg_fpstatus = get_fpstatus_ptr(size == MO_16);
     gen_helper_set_rmode(tcg_rmode, tcg_rmode, tcg_fpstatus);
+    fracbits = (16 << size) - immhb;
     tcg_shift = tcg_const_i32(fracbits);
 
-    if (is_double) {
+    if (size == MO_64) {
         int maxpass = is_scalar ? 1 : 2;
 
         for (pass = 0; pass < maxpass; pass++) {
@@ -7253,20 +7263,37 @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
         }
         clear_vec_high(s, is_q, rd);
     } else {
-        int maxpass = is_scalar ? 1 : is_q ? 4 : 2;
+        void (*fn)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
+        int maxpass = is_scalar ? 1 : ((8 << is_q) >> size);
+
+        switch (size) {
+        case MO_16:
+            if (is_u) {
+                fn = gen_helper_vfp_toulh;
+            } else {
+                fn = gen_helper_vfp_toslh;
+            }
+            break;
+        case MO_32:
+            if (is_u) {
+                fn = gen_helper_vfp_touls;
+            } else {
+                fn = gen_helper_vfp_tosls;
+            }
+            break;
+        default:
+            g_assert_not_reached();
+        }
+
         for (pass = 0; pass < maxpass; pass++) {
             TCGv_i32 tcg_op = tcg_temp_new_i32();
 
-            read_vec_element_i32(s, tcg_op, rn, pass, MO_32);
-            if (is_u) {
-                gen_helper_vfp_touls(tcg_op, tcg_op, tcg_shift, tcg_fpstatus);
-            } else {
-                gen_helper_vfp_tosls(tcg_op, tcg_op, tcg_shift, tcg_fpstatus);
-            }
+            read_vec_element_i32(s, tcg_op, rn, pass, size);
+            fn(tcg_op, tcg_op, tcg_shift, tcg_fpstatus);
             if (is_scalar) {
                 write_fp_sreg(s, rd, tcg_op);
             } else {
-                write_vec_element_i32(s, tcg_op, rd, pass, MO_32);
+                write_vec_element_i32(s, tcg_op, rd, pass, size);
             }
             tcg_temp_free_i32(tcg_op);
         }
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* [Qemu-devel] [PATCH v2 03/14] target/arm: Fix float16 to/from int16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 01/14] target/arm: Implement vector shifted SCVF/UCVF for fp16 Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 02/14] target/arm: Implement vector shifted FCVT " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 04/14] target/arm: Clear SVE high bits for FMOV Richard Henderson
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable
The instruction "ucvtf v0.4h, v04h, #2", with input 0x8000u,
overflows the intermediate float16 to infinity before we have a
chance to scale the output.  Use float64 as the intermediate type
so that no input argument (uint32_t in this case) can overflow
or round before scaling.  Given the declared argument, the signed
int32_t function has the same problem.
When converting from float16 to integer, using u/int32_t instead
of u/int16_t means that the bounding is incorrect.
Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.h        |  4 ++--
 target/arm/helper.c        | 53 ++++++++++++++++++++++++++++++++++++++++++++--
 target/arm/translate-a64.c |  4 ++--
 3 files changed, 55 insertions(+), 6 deletions(-)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index 34e8cc8904..1969b37f2d 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -149,8 +149,8 @@ DEF_HELPER_3(vfp_toshd_round_to_zero, i64, f64, i32, ptr)
 DEF_HELPER_3(vfp_tosld_round_to_zero, i64, f64, i32, ptr)
 DEF_HELPER_3(vfp_touhd_round_to_zero, i64, f64, i32, ptr)
 DEF_HELPER_3(vfp_tould_round_to_zero, i64, f64, i32, ptr)
-DEF_HELPER_3(vfp_toulh, i32, f16, i32, ptr)
-DEF_HELPER_3(vfp_toslh, i32, f16, i32, ptr)
+DEF_HELPER_3(vfp_touhh, i32, f16, i32, ptr)
+DEF_HELPER_3(vfp_toshh, i32, f16, i32, ptr)
 DEF_HELPER_3(vfp_toshs, i32, f32, i32, ptr)
 DEF_HELPER_3(vfp_tosls, i32, f32, i32, ptr)
 DEF_HELPER_3(vfp_tosqs, i64, f32, i32, ptr)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 52a88e0297..ba04cddda0 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11420,11 +11420,60 @@ VFP_CONV_FIX_A64(sq, s, 32, 64, int64)
 VFP_CONV_FIX(uh, s, 32, 32, uint16)
 VFP_CONV_FIX(ul, s, 32, 32, uint32)
 VFP_CONV_FIX_A64(uq, s, 32, 64, uint64)
-VFP_CONV_FIX_A64(sl, h, 16, 32, int32)
-VFP_CONV_FIX_A64(ul, h, 16, 32, uint32)
+
 #undef VFP_CONV_FIX
 #undef VFP_CONV_FIX_FLOAT
 #undef VFP_CONV_FLOAT_FIX_ROUND
+#undef VFP_CONV_FIX_A64
+
+/* Conversion to/from f16 can overflow to infinity before/after scaling.
+ * Therefore we convert to f64 (which does not round), scale,
+ * and then convert f64 to f16 (which may round).
+ */
+
+static float16 do_postscale_fp16(float64 f, int shift, float_status *fpst)
+{
+    return float64_to_float16(float64_scalbn(f, -shift, fpst), true, fpst);
+}
+
+float16 HELPER(vfp_sltoh)(uint32_t x, uint32_t shift, void *fpst)
+{
+    return do_postscale_fp16(int32_to_float64(x, fpst), shift, fpst);
+}
+
+float16 HELPER(vfp_ultoh)(uint32_t x, uint32_t shift, void *fpst)
+{
+    return do_postscale_fp16(uint32_to_float64(x, fpst), shift, fpst);
+}
+
+static float64 do_prescale_fp16(float16 f, int shift, float_status *fpst)
+{
+    if (unlikely(float16_is_any_nan(f))) {
+        float_raise(float_flag_invalid, fpst);
+        return 0;
+    } else {
+        int old_exc_flags = get_float_exception_flags(fpst);
+        float64 ret;
+
+        ret = float16_to_float64(f, true, fpst);
+        ret = float64_scalbn(ret, shift, fpst);
+        old_exc_flags |= get_float_exception_flags(fpst)
+            & float_flag_input_denormal;
+        set_float_exception_flags(old_exc_flags, fpst);
+
+        return ret;
+    }
+}
+
+uint32_t HELPER(vfp_toshh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_int16(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
+uint32_t HELPER(vfp_touhh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_uint16(do_prescale_fp16(x, shift, fpst), fpst);
+}
 
 /* Set the current fp rounding mode and return the old one.
  * The argument is a softfloat float_round_ value.
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index a64673575a..7021a31b89 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -7269,9 +7269,9 @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar,
         switch (size) {
         case MO_16:
             if (is_u) {
-                fn = gen_helper_vfp_toulh;
+                fn = gen_helper_vfp_touhh;
             } else {
-                fn = gen_helper_vfp_toslh;
+                fn = gen_helper_vfp_toshh;
             }
             break;
         case MO_32:
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* [Qemu-devel] [PATCH v2 04/14] target/arm: Clear SVE high bits for FMOV
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (2 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 03/14] target/arm: Fix float16 to/from int16 Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 05/14] target/arm: Implement FMOV (general) for fp16 Richard Henderson
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable
Use write_fp_dreg and clear_vec_high to zero the bits
that need zeroing for these cases.
Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 7021a31b89..c64c3ed99d 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5444,31 +5444,24 @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
 
     if (itof) {
         TCGv_i64 tcg_rn = cpu_reg(s, rn);
+        TCGv_i64 tmp;
 
         switch (type) {
         case 0:
-        {
             /* 32 bit */
-            TCGv_i64 tmp = tcg_temp_new_i64();
+            tmp = tcg_temp_new_i64();
             tcg_gen_ext32u_i64(tmp, tcg_rn);
-            tcg_gen_st_i64(tmp, cpu_env, fp_reg_offset(s, rd, MO_64));
-            tcg_gen_movi_i64(tmp, 0);
-            tcg_gen_st_i64(tmp, cpu_env, fp_reg_hi_offset(s, rd));
+            write_fp_dreg(s, rd, tmp);
             tcg_temp_free_i64(tmp);
             break;
-        }
         case 1:
-        {
             /* 64 bit */
-            TCGv_i64 tmp = tcg_const_i64(0);
-            tcg_gen_st_i64(tcg_rn, cpu_env, fp_reg_offset(s, rd, MO_64));
-            tcg_gen_st_i64(tmp, cpu_env, fp_reg_hi_offset(s, rd));
-            tcg_temp_free_i64(tmp);
+            write_fp_dreg(s, rd, tcg_rn);
             break;
-        }
         case 2:
             /* 64 bit to top half. */
             tcg_gen_st_i64(tcg_rn, cpu_env, fp_reg_hi_offset(s, rd));
+            clear_vec_high(s, true, rd);
             break;
         }
     } else {
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* [Qemu-devel] [PATCH v2 05/14] target/arm: Implement FMOV (general) for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (3 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 04/14] target/arm: Clear SVE high bits for FMOV Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 16:21   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 06/14] target/arm: Implement FCVT (scalar, integer) " Richard Henderson
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable
Adding the fp16 moves to/from general registers.
Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index c64c3ed99d..247a4f0cce 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5463,6 +5463,15 @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
             tcg_gen_st_i64(tcg_rn, cpu_env, fp_reg_hi_offset(s, rd));
             clear_vec_high(s, true, rd);
             break;
+        case 3:
+            /* 16 bit */
+            tmp = tcg_temp_new_i64();
+            tcg_gen_ext16u_i64(tmp, tcg_rn);
+            write_fp_dreg(s, rd, tmp);
+            tcg_temp_free_i64(tmp);
+            break;
+        default:
+            g_assert_not_reached();
         }
     } else {
         TCGv_i64 tcg_rd = cpu_reg(s, rd);
@@ -5480,6 +5489,12 @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
             /* 64 bits from top half */
             tcg_gen_ld_i64(tcg_rd, cpu_env, fp_reg_hi_offset(s, rn));
             break;
+        case 3:
+            /* 16 bit */
+            tcg_gen_ld16u_i64(tcg_rd, cpu_env, fp_reg_offset(s, rn, MO_16));
+            break;
+        default:
+            g_assert_not_reached();
         }
     }
 }
@@ -5519,10 +5534,15 @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
         case 0xa: /* 64 bit */
         case 0xd: /* 64 bit to top half of quad */
             break;
+        case 0x6: /* 16-bit */
+            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+                break;
+            }
+            /* fallthru */
         default:
             /* all other sf/type/rmode combinations are invalid */
             unallocated_encoding(s);
-            break;
+            return;
         }
 
         if (!fp_access_check(s)) {
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* Re: [Qemu-devel] [PATCH v2 05/14] target/arm: Implement FMOV (general) for fp16
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 05/14] target/arm: Implement FMOV (general) for fp16 Richard Henderson
@ 2018-05-10 16:21   ` Peter Maydell
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2018-05-10 16:21 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers, qemu-arm, qemu-stable
On 2 May 2018 at 23:15, Richard Henderson <richard.henderson@linaro.org> wrote:
> Adding the fp16 moves to/from general registers.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/translate-a64.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
> index c64c3ed99d..247a4f0cce 100644
> --- a/target/arm/translate-a64.c
> +++ b/target/arm/translate-a64.c
> @@ -5463,6 +5463,15 @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
>              tcg_gen_st_i64(tcg_rn, cpu_env, fp_reg_hi_offset(s, rd));
>              clear_vec_high(s, true, rd);
>              break;
> +        case 3:
> +            /* 16 bit */
> +            tmp = tcg_temp_new_i64();
> +            tcg_gen_ext16u_i64(tmp, tcg_rn);
> +            write_fp_dreg(s, rd, tmp);
> +            tcg_temp_free_i64(tmp);
> +            break;
> +        default:
> +            g_assert_not_reached();
>          }
>      } else {
>          TCGv_i64 tcg_rd = cpu_reg(s, rd);
> @@ -5480,6 +5489,12 @@ static void handle_fmov(DisasContext *s, int rd, int rn, int type, bool itof)
>              /* 64 bits from top half */
>              tcg_gen_ld_i64(tcg_rd, cpu_env, fp_reg_hi_offset(s, rn));
>              break;
> +        case 3:
> +            /* 16 bit */
> +            tcg_gen_ld16u_i64(tcg_rd, cpu_env, fp_reg_offset(s, rn, MO_16));
> +            break;
> +        default:
> +            g_assert_not_reached();
>          }
>      }
>  }
> @@ -5519,10 +5534,15 @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
>          case 0xa: /* 64 bit */
>          case 0xd: /* 64 bit to top half of quad */
>              break;
> +        case 0x6: /* 16-bit */
> +            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
> +                break;
> +            }
> +            /* fallthru */
This is a switch on (sf << 3 | type << 1 | rmode), so this catches
the halfprec <-> 32 bit cases, but it misses the halfprec <-> 64 bit
ops, which have sf == 1. So I think you need a 'case 0xe' as well.
>          default:
>              /* all other sf/type/rmode combinations are invalid */
>              unallocated_encoding(s);
> -            break;
> +            return;
This is a distinct bugfix, I think (though the only effect would
be generation of unnecessary dead code after the undef).
>          }
>
>          if (!fp_access_check(s)) {
> --
> 2.14.3
thanks
-- PMM
^ permalink raw reply	[flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v2 06/14] target/arm: Implement FCVT (scalar, integer) for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (4 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 05/14] target/arm: Implement FMOV (general) for fp16 Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 07/14] target/arm: Implement FCVT (scalar, fixed-point) " Richard Henderson
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable
Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.h        |  6 +++
 target/arm/helper.c        | 38 +++++++++++++++++-
 target/arm/translate-a64.c | 96 ++++++++++++++++++++++++++++++++++++++--------
 3 files changed, 122 insertions(+), 18 deletions(-)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index 1969b37f2d..ce89968b2d 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -151,6 +151,10 @@ DEF_HELPER_3(vfp_touhd_round_to_zero, i64, f64, i32, ptr)
 DEF_HELPER_3(vfp_tould_round_to_zero, i64, f64, i32, ptr)
 DEF_HELPER_3(vfp_touhh, i32, f16, i32, ptr)
 DEF_HELPER_3(vfp_toshh, i32, f16, i32, ptr)
+DEF_HELPER_3(vfp_toulh, i32, f16, i32, ptr)
+DEF_HELPER_3(vfp_toslh, i32, f16, i32, ptr)
+DEF_HELPER_3(vfp_touqh, i64, f16, i32, ptr)
+DEF_HELPER_3(vfp_tosqh, i64, f16, i32, ptr)
 DEF_HELPER_3(vfp_toshs, i32, f32, i32, ptr)
 DEF_HELPER_3(vfp_tosls, i32, f32, i32, ptr)
 DEF_HELPER_3(vfp_tosqs, i64, f32, i32, ptr)
@@ -177,6 +181,8 @@ DEF_HELPER_3(vfp_ultod, f64, i64, i32, ptr)
 DEF_HELPER_3(vfp_uqtod, f64, i64, i32, ptr)
 DEF_HELPER_3(vfp_sltoh, f16, i32, i32, ptr)
 DEF_HELPER_3(vfp_ultoh, f16, i32, i32, ptr)
+DEF_HELPER_3(vfp_sqtoh, f16, i64, i32, ptr)
+DEF_HELPER_3(vfp_uqtoh, f16, i64, i32, ptr)
 
 DEF_HELPER_FLAGS_2(set_rmode, TCG_CALL_NO_RWG, i32, i32, ptr)
 DEF_HELPER_FLAGS_2(set_neon_rmode, TCG_CALL_NO_RWG, i32, i32, env)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index ba04cddda0..1fbf5fdec9 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11427,8 +11427,12 @@ VFP_CONV_FIX_A64(uq, s, 32, 64, uint64)
 #undef VFP_CONV_FIX_A64
 
 /* Conversion to/from f16 can overflow to infinity before/after scaling.
- * Therefore we convert to f64 (which does not round), scale,
- * and then convert f64 to f16 (which may round).
+ * Therefore we convert to f64, scale, and then convert f64 to f16; or
+ * vice versa for conversion to integer.
+ *
+ * For 16- and 32-bit integers, the conversion to f64 never rounds.
+ * For 64-bit integers, any integer that would cause rounding will also
+ * overflow to f16 infinity, so there is no double rounding problem.
  */
 
 static float16 do_postscale_fp16(float64 f, int shift, float_status *fpst)
@@ -11446,6 +11450,16 @@ float16 HELPER(vfp_ultoh)(uint32_t x, uint32_t shift, void *fpst)
     return do_postscale_fp16(uint32_to_float64(x, fpst), shift, fpst);
 }
 
+float16 HELPER(vfp_sqtoh)(uint64_t x, uint32_t shift, void *fpst)
+{
+    return do_postscale_fp16(int64_to_float64(x, fpst), shift, fpst);
+}
+
+float16 HELPER(vfp_uqtoh)(uint64_t x, uint32_t shift, void *fpst)
+{
+    return do_postscale_fp16(uint64_to_float64(x, fpst), shift, fpst);
+}
+
 static float64 do_prescale_fp16(float16 f, int shift, float_status *fpst)
 {
     if (unlikely(float16_is_any_nan(f))) {
@@ -11475,6 +11489,26 @@ uint32_t HELPER(vfp_touhh)(float16 x, uint32_t shift, void *fpst)
     return float64_to_uint16(do_prescale_fp16(x, shift, fpst), fpst);
 }
 
+uint32_t HELPER(vfp_toslh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_int32(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
+uint32_t HELPER(vfp_toulh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_uint32(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
+uint64_t HELPER(vfp_tosqh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_int64(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
+uint64_t HELPER(vfp_touqh)(float16 x, uint32_t shift, void *fpst)
+{
+    return float64_to_uint64(do_prescale_fp16(x, shift, fpst), fpst);
+}
+
 /* Set the current fp rounding mode and return the old one.
  * The argument is a softfloat float_round_ value.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 247a4f0cce..d794744aec 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5274,11 +5274,11 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
                            bool itof, int rmode, int scale, int sf, int type)
 {
     bool is_signed = !(opcode & 1);
-    bool is_double = type;
     TCGv_ptr tcg_fpstatus;
-    TCGv_i32 tcg_shift;
+    TCGv_i32 tcg_shift, tcg_single;
+    TCGv_i64 tcg_double;
 
-    tcg_fpstatus = get_fpstatus_ptr(false);
+    tcg_fpstatus = get_fpstatus_ptr(type == 3);
 
     tcg_shift = tcg_const_i32(64 - scale);
 
@@ -5296,8 +5296,9 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
             tcg_int = tcg_extend;
         }
 
-        if (is_double) {
-            TCGv_i64 tcg_double = tcg_temp_new_i64();
+        switch (type) {
+        case 1: /* float64 */
+            tcg_double = tcg_temp_new_i64();
             if (is_signed) {
                 gen_helper_vfp_sqtod(tcg_double, tcg_int,
                                      tcg_shift, tcg_fpstatus);
@@ -5307,8 +5308,10 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
             }
             write_fp_dreg(s, rd, tcg_double);
             tcg_temp_free_i64(tcg_double);
-        } else {
-            TCGv_i32 tcg_single = tcg_temp_new_i32();
+            break;
+
+        case 0: /* float32 */
+            tcg_single = tcg_temp_new_i32();
             if (is_signed) {
                 gen_helper_vfp_sqtos(tcg_single, tcg_int,
                                      tcg_shift, tcg_fpstatus);
@@ -5318,6 +5321,23 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
             }
             write_fp_sreg(s, rd, tcg_single);
             tcg_temp_free_i32(tcg_single);
+            break;
+
+        case 3: /* float16 */
+            tcg_single = tcg_temp_new_i32();
+            if (is_signed) {
+                gen_helper_vfp_sqtoh(tcg_single, tcg_int,
+                                     tcg_shift, tcg_fpstatus);
+            } else {
+                gen_helper_vfp_uqtoh(tcg_single, tcg_int,
+                                     tcg_shift, tcg_fpstatus);
+            }
+            write_fp_sreg(s, rd, tcg_single);
+            tcg_temp_free_i32(tcg_single);
+            break;
+
+        default:
+            g_assert_not_reached();
         }
     } else {
         TCGv_i64 tcg_int = cpu_reg(s, rd);
@@ -5334,8 +5354,9 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
 
         gen_helper_set_rmode(tcg_rmode, tcg_rmode, tcg_fpstatus);
 
-        if (is_double) {
-            TCGv_i64 tcg_double = read_fp_dreg(s, rn);
+        switch (type) {
+        case 1: /* float64 */
+            tcg_double = read_fp_dreg(s, rn);
             if (is_signed) {
                 if (!sf) {
                     gen_helper_vfp_tosld(tcg_int, tcg_double,
@@ -5353,9 +5374,14 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
                                          tcg_shift, tcg_fpstatus);
                 }
             }
+            if (!sf) {
+                tcg_gen_ext32u_i64(tcg_int, tcg_int);
+            }
             tcg_temp_free_i64(tcg_double);
-        } else {
-            TCGv_i32 tcg_single = read_fp_sreg(s, rn);
+            break;
+
+        case 0: /* float32 */
+            tcg_single = read_fp_sreg(s, rn);
             if (sf) {
                 if (is_signed) {
                     gen_helper_vfp_tosqs(tcg_int, tcg_single,
@@ -5377,14 +5403,39 @@ static void handle_fpfpcvt(DisasContext *s, int rd, int rn, int opcode,
                 tcg_temp_free_i32(tcg_dest);
             }
             tcg_temp_free_i32(tcg_single);
+            break;
+
+        case 3: /* float16 */
+            tcg_single = read_fp_sreg(s, rn);
+            if (sf) {
+                if (is_signed) {
+                    gen_helper_vfp_tosqh(tcg_int, tcg_single,
+                                         tcg_shift, tcg_fpstatus);
+                } else {
+                    gen_helper_vfp_touqh(tcg_int, tcg_single,
+                                         tcg_shift, tcg_fpstatus);
+                }
+            } else {
+                TCGv_i32 tcg_dest = tcg_temp_new_i32();
+                if (is_signed) {
+                    gen_helper_vfp_toslh(tcg_dest, tcg_single,
+                                         tcg_shift, tcg_fpstatus);
+                } else {
+                    gen_helper_vfp_toulh(tcg_dest, tcg_single,
+                                         tcg_shift, tcg_fpstatus);
+                }
+                tcg_gen_extu_i32_i64(tcg_int, tcg_dest);
+                tcg_temp_free_i32(tcg_dest);
+            }
+            tcg_temp_free_i32(tcg_single);
+            break;
+
+        default:
+            g_assert_not_reached();
         }
 
         gen_helper_set_rmode(tcg_rmode, tcg_rmode, tcg_fpstatus);
         tcg_temp_free_i32(tcg_rmode);
-
-        if (!sf) {
-            tcg_gen_ext32u_i64(tcg_int, tcg_int);
-        }
     }
 
     tcg_temp_free_ptr(tcg_fpstatus);
@@ -5553,7 +5604,20 @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn)
         /* actual FP conversions */
         bool itof = extract32(opcode, 1, 1);
 
-        if (type > 1 || (rmode != 0 && opcode > 1)) {
+        if (rmode != 0 && opcode > 1) {
+            unallocated_encoding(s);
+            return;
+        }
+        switch (type) {
+        case 0: /* float32 */
+        case 1: /* float64 */
+            break;
+        case 3: /* float16 */
+            if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+                break;
+            }
+            /* fallthru */
+        default:
             unallocated_encoding(s);
             return;
         }
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* [Qemu-devel] [PATCH v2 07/14] target/arm: Implement FCVT (scalar, fixed-point) for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (5 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 06/14] target/arm: Implement FCVT (scalar, integer) " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 08/14] target/arm: Introduce and use read_fp_hreg Richard Henderson
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable
Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index d794744aec..e19d97e8f1 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5460,8 +5460,7 @@ static void disas_fp_fixed_conv(DisasContext *s, uint32_t insn)
     bool sf = extract32(insn, 31, 1);
     bool itof;
 
-    if (sbit || (type > 1)
-        || (!sf && scale < 32)) {
+    if (sbit || (!sf && scale < 32)) {
         unallocated_encoding(s);
         return;
     }
@@ -5480,6 +5479,20 @@ static void disas_fp_fixed_conv(DisasContext *s, uint32_t insn)
         return;
     }
 
+    switch (type) {
+    case 0: /* float32 */
+    case 1: /* float64 */
+        break;
+    case 3: /* float16 */
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
+        unallocated_encoding(s);
+        return;
+    }
+
     if (!fp_access_check(s)) {
         return;
     }
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* [Qemu-devel] [PATCH v2 08/14] target/arm: Introduce and use read_fp_hreg
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (6 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 07/14] target/arm: Implement FCVT (scalar, fixed-point) " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 16:30   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 09/14] target/arm: Implement FP data-processing (2 source) for fp16 Richard Henderson
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable
Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index e19d97e8f1..8c63d5e743 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -614,6 +614,14 @@ static TCGv_i32 read_fp_sreg(DisasContext *s, int reg)
     return v;
 }
 
+static TCGv_i32 read_fp_hreg(DisasContext *s, int reg)
+{
+    TCGv_i32 v = tcg_temp_new_i32();
+
+    tcg_gen_ld16u_i32(v, cpu_env, fp_reg_offset(s, reg, MO_16));
+    return v;
+}
+
 /* Clear the bits above an N-bit vector, for N = (is_q ? 128 : 64).
  * If SVE is not enabled, then there are only 128 bits in the vector.
  */
@@ -4644,11 +4652,9 @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
 static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn)
 {
     TCGv_ptr fpst = NULL;
-    TCGv_i32 tcg_op = tcg_temp_new_i32();
+    TCGv_i32 tcg_op = read_fp_hreg(s, rn);
     TCGv_i32 tcg_res = tcg_temp_new_i32();
 
-    read_vec_element_i32(s, tcg_op, rn, 0, MO_16);
-
     switch (opcode) {
     case 0x0: /* FMOV */
         tcg_gen_mov_i32(tcg_res, tcg_op);
@@ -7543,13 +7549,10 @@ static void disas_simd_scalar_three_reg_diff(DisasContext *s, uint32_t insn)
         tcg_temp_free_i64(tcg_op2);
         tcg_temp_free_i64(tcg_res);
     } else {
-        TCGv_i32 tcg_op1 = tcg_temp_new_i32();
-        TCGv_i32 tcg_op2 = tcg_temp_new_i32();
+        TCGv_i32 tcg_op1 = read_fp_hreg(s, rn);
+        TCGv_i32 tcg_op2 = read_fp_hreg(s, rm);
         TCGv_i64 tcg_res = tcg_temp_new_i64();
 
-        read_vec_element_i32(s, tcg_op1, rn, 0, MO_16);
-        read_vec_element_i32(s, tcg_op2, rm, 0, MO_16);
-
         gen_helper_neon_mull_s16(tcg_res, tcg_op1, tcg_op2);
         gen_helper_neon_addl_saturate_s32(tcg_res, cpu_env, tcg_res, tcg_res);
 
@@ -8090,13 +8093,10 @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
 
     fpst = get_fpstatus_ptr(true);
 
-    tcg_op1 = tcg_temp_new_i32();
-    tcg_op2 = tcg_temp_new_i32();
+    tcg_op1 = read_fp_hreg(s, rn);
+    tcg_op2 = read_fp_hreg(s, rm);
     tcg_res = tcg_temp_new_i32();
 
-    read_vec_element_i32(s, tcg_op1, rn, 0, MO_16);
-    read_vec_element_i32(s, tcg_op2, rm, 0, MO_16);
-
     switch (fpopcode) {
     case 0x03: /* FMULX */
         gen_helper_advsimd_mulxh(tcg_res, tcg_op1, tcg_op2, fpst);
@@ -12015,11 +12015,9 @@ static void disas_simd_two_reg_misc_fp16(DisasContext *s, uint32_t insn)
     }
 
     if (is_scalar) {
-        TCGv_i32 tcg_op = tcg_temp_new_i32();
+        TCGv_i32 tcg_op = read_fp_hreg(s, rn);
         TCGv_i32 tcg_res = tcg_temp_new_i32();
 
-        read_vec_element_i32(s, tcg_op, rn, 0, MO_16);
-
         switch (fpop) {
         case 0x1a: /* FCVTNS */
         case 0x1b: /* FCVTMS */
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* [Qemu-devel] [PATCH v2 09/14] target/arm: Implement FP data-processing (2 source) for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (7 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 08/14] target/arm: Introduce and use read_fp_hreg Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 10/14] target/arm: Implement FP data-processing (3 " Richard Henderson
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable
We missed all of the scalar fp16 binary operations.
Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 8c63d5e743..d7a49ef2e4 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5062,6 +5062,61 @@ static void handle_fp_2src_double(DisasContext *s, int opcode,
     tcg_temp_free_i64(tcg_res);
 }
 
+/* Floating-point data-processing (2 source) - half precision */
+static void handle_fp_2src_half(DisasContext *s, int opcode,
+                                int rd, int rn, int rm)
+{
+    TCGv_i32 tcg_op1;
+    TCGv_i32 tcg_op2;
+    TCGv_i32 tcg_res;
+    TCGv_ptr fpst;
+
+    tcg_res = tcg_temp_new_i32();
+    fpst = get_fpstatus_ptr(true);
+    tcg_op1 = read_fp_hreg(s, rn);
+    tcg_op2 = read_fp_hreg(s, rm);
+
+    switch (opcode) {
+    case 0x0: /* FMUL */
+        gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x1: /* FDIV */
+        gen_helper_advsimd_divh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x2: /* FADD */
+        gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x3: /* FSUB */
+        gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x4: /* FMAX */
+        gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x5: /* FMIN */
+        gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x6: /* FMAXNM */
+        gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x7: /* FMINNM */
+        gen_helper_advsimd_minnumh(tcg_res, tcg_op1, tcg_op2, fpst);
+        break;
+    case 0x8: /* FNMUL */
+        gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst);
+        tcg_gen_xori_i32(tcg_res, tcg_res, 0x8000);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    write_fp_sreg(s, rd, tcg_res);
+
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i32(tcg_op1);
+    tcg_temp_free_i32(tcg_op2);
+    tcg_temp_free_i32(tcg_res);
+}
+
 /* Floating point data-processing (2 source)
  *   31  30  29 28       24 23  22  21 20  16 15    12 11 10 9    5 4    0
  * +---+---+---+-----------+------+---+------+--------+-----+------+------+
@@ -5094,6 +5149,16 @@ static void disas_fp_2src(DisasContext *s, uint32_t insn)
         }
         handle_fp_2src_double(s, opcode, rd, rn, rm);
         break;
+    case 3:
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            unallocated_encoding(s);
+            return;
+        }
+        if (!fp_access_check(s)) {
+            return;
+        }
+        handle_fp_2src_half(s, opcode, rd, rn, rm);
+        break;
     default:
         unallocated_encoding(s);
     }
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* [Qemu-devel] [PATCH v2 10/14] target/arm: Implement FP data-processing (3 source) for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (8 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 09/14] target/arm: Implement FP data-processing (2 source) for fp16 Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 11/14] target/arm: Implement FCMP " Richard Henderson
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, qemu-stable
We missed all of the scalar fp16 fma operations.
Cc: qemu-stable@nongnu.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index d7a49ef2e4..9322b0d896 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5240,6 +5240,44 @@ static void handle_fp_3src_double(DisasContext *s, bool o0, bool o1,
     tcg_temp_free_i64(tcg_res);
 }
 
+/* Floating-point data-processing (3 source) - half precision */
+static void handle_fp_3src_half(DisasContext *s, bool o0, bool o1,
+                                int rd, int rn, int rm, int ra)
+{
+    TCGv_i32 tcg_op1, tcg_op2, tcg_op3;
+    TCGv_i32 tcg_res = tcg_temp_new_i32();
+    TCGv_ptr fpst = get_fpstatus_ptr(true);
+
+    tcg_op1 = read_fp_hreg(s, rn);
+    tcg_op2 = read_fp_hreg(s, rm);
+    tcg_op3 = read_fp_hreg(s, ra);
+
+    /* These are fused multiply-add, and must be done as one
+     * floating point operation with no rounding between the
+     * multiplication and addition steps.
+     * NB that doing the negations here as separate steps is
+     * correct : an input NaN should come out with its sign bit
+     * flipped if it is a negated-input.
+     */
+    if (o1 == true) {
+        tcg_gen_xori_i32(tcg_op3, tcg_op3, 0x8000);
+    }
+
+    if (o0 != o1) {
+        tcg_gen_xori_i32(tcg_op1, tcg_op1, 0x8000);
+    }
+
+    gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_op3, fpst);
+
+    write_fp_sreg(s, rd, tcg_res);
+
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i32(tcg_op1);
+    tcg_temp_free_i32(tcg_op2);
+    tcg_temp_free_i32(tcg_op3);
+    tcg_temp_free_i32(tcg_res);
+}
+
 /* Floating point data-processing (3 source)
  *   31  30  29 28       24 23  22  21  20  16  15  14  10 9    5 4    0
  * +---+---+---+-----------+------+----+------+----+------+------+------+
@@ -5269,6 +5307,16 @@ static void disas_fp_3src(DisasContext *s, uint32_t insn)
         }
         handle_fp_3src_double(s, o0, o1, rd, rn, rm, ra);
         break;
+    case 3:
+        if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            unallocated_encoding(s);
+            return;
+        }
+        if (!fp_access_check(s)) {
+            return;
+        }
+        handle_fp_3src_half(s, o0, o1, rd, rn, rm, ra);
+        break;
     default:
         unallocated_encoding(s);
     }
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* [Qemu-devel] [PATCH v2 11/14] target/arm: Implement FCMP for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (9 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 10/14] target/arm: Implement FP data-processing (3 " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 16:59   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 12/14] target/arm: Implement FCSEL " Richard Henderson
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, Alex Bennée, qemu-stable
From: Alex Bennée <alex.bennee@linaro.org>
These where missed out from the rest of the half-precision work.
Cc: qemu-stable@nongnu.org
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[rth: Diagnose lack of FP16 before fp_access_check]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper-a64.h    |  2 ++
 target/arm/helper-a64.c    | 10 ++++++
 target/arm/translate-a64.c | 88 +++++++++++++++++++++++++++++++++++++---------
 3 files changed, 83 insertions(+), 17 deletions(-)
diff --git a/target/arm/helper-a64.h b/target/arm/helper-a64.h
index ef4ddfe9d8..5c0b9bd799 100644
--- a/target/arm/helper-a64.h
+++ b/target/arm/helper-a64.h
@@ -19,6 +19,8 @@
 DEF_HELPER_FLAGS_2(udiv64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(sdiv64, TCG_CALL_NO_RWG_SE, s64, s64, s64)
 DEF_HELPER_FLAGS_1(rbit64, TCG_CALL_NO_RWG_SE, i64, i64)
+DEF_HELPER_3(vfp_cmph_a64, i64, f16, f16, ptr)
+DEF_HELPER_3(vfp_cmpeh_a64, i64, f16, f16, ptr)
 DEF_HELPER_3(vfp_cmps_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpes_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpd_a64, i64, f64, f64, ptr)
diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index afb25ad20c..35df07adb9 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -85,6 +85,16 @@ static inline uint32_t float_rel_to_flags(int res)
     return flags;
 }
 
+uint64_t HELPER(vfp_cmph_a64)(float16 x, float16 y, void *fp_status)
+{
+    return float_rel_to_flags(float16_compare_quiet(x, y, fp_status));
+}
+
+uint64_t HELPER(vfp_cmpeh_a64)(float16 x, float16 y, void *fp_status)
+{
+    return float_rel_to_flags(float16_compare(x, y, fp_status));
+}
+
 uint64_t HELPER(vfp_cmps_a64)(float32 x, float32 y, void *fp_status)
 {
     return float_rel_to_flags(float32_compare_quiet(x, y, fp_status));
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 9322b0d896..f0aca20771 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -4475,14 +4475,14 @@ static void disas_data_proc_reg(DisasContext *s, uint32_t insn)
     }
 }
 
-static void handle_fp_compare(DisasContext *s, bool is_double,
+static void handle_fp_compare(DisasContext *s, int size,
                               unsigned int rn, unsigned int rm,
                               bool cmp_with_zero, bool signal_all_nans)
 {
     TCGv_i64 tcg_flags = tcg_temp_new_i64();
-    TCGv_ptr fpst = get_fpstatus_ptr(false);
+    TCGv_ptr fpst = get_fpstatus_ptr(size == MO_16);
 
-    if (is_double) {
+    if (size == MO_64) {
         TCGv_i64 tcg_vn, tcg_vm;
 
         tcg_vn = read_fp_dreg(s, rn);
@@ -4499,19 +4499,35 @@ static void handle_fp_compare(DisasContext *s, bool is_double,
         tcg_temp_free_i64(tcg_vn);
         tcg_temp_free_i64(tcg_vm);
     } else {
-        TCGv_i32 tcg_vn, tcg_vm;
+        TCGv_i32 tcg_vn = tcg_temp_new_i32();
+        TCGv_i32 tcg_vm = tcg_temp_new_i32();
 
-        tcg_vn = read_fp_sreg(s, rn);
+        read_vec_element_i32(s, tcg_vn, rn, 0, size);
         if (cmp_with_zero) {
-            tcg_vm = tcg_const_i32(0);
+            tcg_gen_movi_i32(tcg_vm, 0);
         } else {
-            tcg_vm = read_fp_sreg(s, rm);
+            read_vec_element_i32(s, tcg_vm, rm, 0, size);
         }
-        if (signal_all_nans) {
-            gen_helper_vfp_cmpes_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
-        } else {
-            gen_helper_vfp_cmps_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+
+        switch (size) {
+        case MO_32:
+            if (signal_all_nans) {
+                gen_helper_vfp_cmpes_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+            } else {
+                gen_helper_vfp_cmps_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+            }
+            break;
+        case MO_16:
+            if (signal_all_nans) {
+                gen_helper_vfp_cmpeh_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+            } else {
+                gen_helper_vfp_cmph_a64(tcg_flags, tcg_vn, tcg_vm, fpst);
+            }
+            break;
+        default:
+            g_assert_not_reached();
         }
+
         tcg_temp_free_i32(tcg_vn);
         tcg_temp_free_i32(tcg_vm);
     }
@@ -4532,16 +4548,35 @@ static void handle_fp_compare(DisasContext *s, bool is_double,
 static void disas_fp_compare(DisasContext *s, uint32_t insn)
 {
     unsigned int mos, type, rm, op, rn, opc, op2r;
+    int size;
 
     mos = extract32(insn, 29, 3);
-    type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
+    type = extract32(insn, 22, 2);
     rm = extract32(insn, 16, 5);
     op = extract32(insn, 14, 2);
     rn = extract32(insn, 5, 5);
     opc = extract32(insn, 3, 2);
     op2r = extract32(insn, 0, 3);
 
-    if (mos || op || op2r || type > 1) {
+    if (mos || op || op2r) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch (type) {
+    case 0:
+        size = MO_32;
+        break;
+    case 1:
+        size = MO_64;
+        break;
+    case 3:
+        size = MO_16;
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
         unallocated_encoding(s);
         return;
     }
@@ -4550,7 +4585,7 @@ static void disas_fp_compare(DisasContext *s, uint32_t insn)
         return;
     }
 
-    handle_fp_compare(s, type, rn, rm, opc & 1, opc & 2);
+    handle_fp_compare(s, size, rn, rm, opc & 1, opc & 2);
 }
 
 /* Floating point conditional compare
@@ -4564,16 +4599,35 @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
     unsigned int mos, type, rm, cond, rn, op, nzcv;
     TCGv_i64 tcg_flags;
     TCGLabel *label_continue = NULL;
+    int size;
 
     mos = extract32(insn, 29, 3);
-    type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
+    type = extract32(insn, 22, 2);
     rm = extract32(insn, 16, 5);
     cond = extract32(insn, 12, 4);
     rn = extract32(insn, 5, 5);
     op = extract32(insn, 4, 1);
     nzcv = extract32(insn, 0, 4);
 
-    if (mos || type > 1) {
+    if (mos) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch (type) {
+    case 0:
+        size = MO_32;
+        break;
+    case 1:
+        size = MO_64;
+        break;
+    case 3:
+        size = MO_16;
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
         unallocated_encoding(s);
         return;
     }
@@ -4594,7 +4648,7 @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn)
         gen_set_label(label_match);
     }
 
-    handle_fp_compare(s, type, rn, rm, false, op);
+    handle_fp_compare(s, size, rn, rm, false, op);
 
     if (cond < 0x0e) {
         gen_set_label(label_continue);
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* Re: [Qemu-devel] [PATCH v2 11/14] target/arm: Implement FCMP for fp16
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 11/14] target/arm: Implement FCMP " Richard Henderson
@ 2018-05-10 16:59   ` Peter Maydell
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2018-05-10 16:59 UTC (permalink / raw)
  To: Richard Henderson
  Cc: QEMU Developers, qemu-arm, Alex Bennée, qemu-stable
On 2 May 2018 at 23:15, Richard Henderson <richard.henderson@linaro.org> wrote:
> From: Alex Bennée <alex.bennee@linaro.org>
>
> These where missed out from the rest of the half-precision work.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> [rth: Diagnose lack of FP16 before fp_access_check]
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/helper-a64.h    |  2 ++
>  target/arm/helper-a64.c    | 10 ++++++
>  target/arm/translate-a64.c | 88 +++++++++++++++++++++++++++++++++++++---------
>  3 files changed, 83 insertions(+), 17 deletions(-)
>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM
^ permalink raw reply	[flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v2 12/14] target/arm: Implement FCSEL for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (10 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 11/14] target/arm: Implement FCMP " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 17:01   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 13/14] target/arm: Implement FMOV (immediate) " Richard Henderson
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, Alex Bennée, qemu-stable
From: Alex Bennée <alex.bennee@linaro.org>
These were missed out from the rest of the half-precision work.
Cc: qemu-stable@nongnu.org
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[rth: Fix erroneous check vs type]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index f0aca20771..1ea5185f14 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -4666,15 +4666,34 @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
     unsigned int mos, type, rm, cond, rn, rd;
     TCGv_i64 t_true, t_false, t_zero;
     DisasCompare64 c;
+    TCGMemOp sz;
 
     mos = extract32(insn, 29, 3);
-    type = extract32(insn, 22, 2); /* 0 = single, 1 = double */
+    type = extract32(insn, 22, 2);
     rm = extract32(insn, 16, 5);
     cond = extract32(insn, 12, 4);
     rn = extract32(insn, 5, 5);
     rd = extract32(insn, 0, 5);
 
-    if (mos || type > 1) {
+    if (mos) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch(type) {
+    case 0:
+        sz = MO_32;
+        break;
+    case 1:
+        sz = MO_64;
+        break;
+    case 3:
+        sz = MO_16;
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
         unallocated_encoding(s);
         return;
     }
@@ -4683,11 +4702,11 @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
         return;
     }
 
-    /* Zero extend sreg inputs to 64 bits now.  */
+    /* Zero extend sreg & hreg inputs to 64 bits now.  */
     t_true = tcg_temp_new_i64();
     t_false = tcg_temp_new_i64();
-    read_vec_element(s, t_true, rn, 0, type ? MO_64 : MO_32);
-    read_vec_element(s, t_false, rm, 0, type ? MO_64 : MO_32);
+    read_vec_element(s, t_true, rn, 0, sz);
+    read_vec_element(s, t_false, rm, 0, sz);
 
     a64_test_cc(&c, cond);
     t_zero = tcg_const_i64(0);
@@ -4696,7 +4715,7 @@ static void disas_fp_csel(DisasContext *s, uint32_t insn)
     tcg_temp_free_i64(t_false);
     a64_free_cc(&c);
 
-    /* Note that sregs write back zeros to the high bits,
+    /* Note that sregs & hregs write back zeros to the high bits,
        and we've already done the zero-extension.  */
     write_fp_dreg(s, rd, t_true);
     tcg_temp_free_i64(t_true);
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* [Qemu-devel] [PATCH v2 13/14] target/arm: Implement FMOV (immediate) for fp16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (11 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 12/14] target/arm: Implement FCSEL " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-10 17:02   ` Peter Maydell
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 14/14] target/arm: Fix sqrt_f16 exception raising Richard Henderson
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, Alex Bennée, qemu-stable
From: Alex Bennée <alex.bennee@linaro.org>
All the hard work is already done by vfp_expand_imm, we just need to
make sure we pick up the correct size.
Cc: qemu-stable@nongnu.org
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[rth: Merge unallocated_encoding check with TCGMemOp conversion.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 1ea5185f14..b73d6d96cd 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -5437,11 +5437,25 @@ static void disas_fp_imm(DisasContext *s, uint32_t insn)
 {
     int rd = extract32(insn, 0, 5);
     int imm8 = extract32(insn, 13, 8);
-    int is_double = extract32(insn, 22, 2);
+    int type = extract32(insn, 22, 2);
     uint64_t imm;
     TCGv_i64 tcg_res;
+    TCGMemOp sz;
 
-    if (is_double > 1) {
+    switch (type) {
+    case 0:
+        sz = MO_32;
+        break;
+    case 1:
+        sz = MO_64;
+        break;
+    case 3:
+        sz = MO_16;
+        if (arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+            break;
+        }
+        /* fallthru */
+    default:
         unallocated_encoding(s);
         return;
     }
@@ -5450,7 +5464,7 @@ static void disas_fp_imm(DisasContext *s, uint32_t insn)
         return;
     }
 
-    imm = vfp_expand_imm(MO_32 + is_double, imm8);
+    imm = vfp_expand_imm(sz, imm8);
 
     tcg_res = tcg_const_i64(imm);
     write_fp_dreg(s, rd, tcg_res);
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* Re: [Qemu-devel] [PATCH v2 13/14] target/arm: Implement FMOV (immediate) for fp16
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 13/14] target/arm: Implement FMOV (immediate) " Richard Henderson
@ 2018-05-10 17:02   ` Peter Maydell
  0 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2018-05-10 17:02 UTC (permalink / raw)
  To: Richard Henderson
  Cc: QEMU Developers, qemu-arm, Alex Bennée, qemu-stable
On 2 May 2018 at 23:15, Richard Henderson <richard.henderson@linaro.org> wrote:
> From: Alex Bennée <alex.bennee@linaro.org>
>
> All the hard work is already done by vfp_expand_imm, we just need to
> make sure we pick up the correct size.
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> [rth: Merge unallocated_encoding check with TCGMemOp conversion.]
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM
^ permalink raw reply	[flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH v2 14/14] target/arm: Fix sqrt_f16 exception raising
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (12 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 13/14] target/arm: Implement FMOV (immediate) " Richard Henderson
@ 2018-05-02 22:15 ` Richard Henderson
  2018-05-02 22:33 ` [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 no-reply
  2018-05-10 17:09 ` Peter Maydell
  15 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2018-05-02 22:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-arm, peter.maydell, Alex Bennée, qemu-stable
From: Alex Bennée <alex.bennee@linaro.org>
We are meant to explicitly pass fpst, not cpu_env.
Cc: qemu-stable@nongnu.org
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index b73d6d96cd..07e196c19c 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -4739,7 +4739,8 @@ static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn)
         tcg_gen_xori_i32(tcg_res, tcg_op, 0x8000);
         break;
     case 0x3: /* FSQRT */
-        gen_helper_sqrt_f16(tcg_res, tcg_op, cpu_env);
+        fpst = get_fpstatus_ptr(true);
+        gen_helper_sqrt_f16(tcg_res, tcg_op, fpst);
         break;
     case 0x8: /* FRINTN */
     case 0x9: /* FRINTP */
-- 
2.14.3
^ permalink raw reply related	[flat|nested] 24+ messages in thread* Re: [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (13 preceding siblings ...)
  2018-05-02 22:15 ` [Qemu-devel] [PATCH v2 14/14] target/arm: Fix sqrt_f16 exception raising Richard Henderson
@ 2018-05-02 22:33 ` no-reply
  2018-05-10 17:09 ` Peter Maydell
  15 siblings, 0 replies; 24+ messages in thread
From: no-reply @ 2018-05-02 22:33 UTC (permalink / raw)
  To: richard.henderson; +Cc: famz, qemu-devel, peter.maydell, qemu-arm
Hi,
This series seems to have some coding style problems. See output below for
more information:
Type: series
Message-id: 20180502221552.3873-1-richard.henderson@linaro.org
Subject: [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16
=== TEST SCRIPT BEGIN ===
#!/bin/bash
BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done
exit $failed
=== TEST SCRIPT END ===
Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]               patchew/20180502221552.3873-1-richard.henderson@linaro.org -> patchew/20180502221552.3873-1-richard.henderson@linaro.org
Switched to a new branch 'test'
0af6837b48 target/arm: Fix sqrt_f16 exception raising
c740b594ef target/arm: Implement FMOV (immediate) for fp16
9869941c58 target/arm: Implement FCSEL for fp16
2f9af143e5 target/arm: Implement FCMP for fp16
d7fd874757 target/arm: Implement FP data-processing (3 source) for fp16
bd693b19f4 target/arm: Implement FP data-processing (2 source) for fp16
a7c1ff7f31 target/arm: Introduce and use read_fp_hreg
5c2dfce3b3 target/arm: Implement FCVT (scalar, fixed-point) for fp16
ece533faad target/arm: Implement FCVT (scalar, integer) for fp16
1d81531581 target/arm: Implement FMOV (general) for fp16
d3e47fd86b target/arm: Clear SVE high bits for FMOV
b8530d66b3 target/arm: Fix float16 to/from int16
96451f8fef target/arm: Implement vector shifted FCVT for fp16
f3d1b8db5d target/arm: Implement vector shifted SCVF/UCVF for fp16
=== OUTPUT BEGIN ===
Checking PATCH 1/14: target/arm: Implement vector shifted SCVF/UCVF for fp16...
Checking PATCH 2/14: target/arm: Implement vector shifted FCVT for fp16...
Checking PATCH 3/14: target/arm: Fix float16 to/from int16...
ERROR: spaces required around that '*' (ctx:WxV)
#45: FILE: target/arm/helper.c:11434:
+static float16 do_postscale_fp16(float64 f, int shift, float_status *fpst)
                                                                     ^
total: 1 errors, 0 warnings, 83 lines checked
Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 4/14: target/arm: Clear SVE high bits for FMOV...
Checking PATCH 5/14: target/arm: Implement FMOV (general) for fp16...
Checking PATCH 6/14: target/arm: Implement FCVT (scalar, integer) for fp16...
Checking PATCH 7/14: target/arm: Implement FCVT (scalar, fixed-point) for fp16...
Checking PATCH 8/14: target/arm: Introduce and use read_fp_hreg...
Checking PATCH 9/14: target/arm: Implement FP data-processing (2 source) for fp16...
Checking PATCH 10/14: target/arm: Implement FP data-processing (3 source) for fp16...
Checking PATCH 11/14: target/arm: Implement FCMP for fp16...
Checking PATCH 12/14: target/arm: Implement FCSEL for fp16...
ERROR: space required before the open parenthesis '('
#41: FILE: target/arm/translate-a64.c:4683:
+    switch(type) {
total: 1 errors, 0 warnings, 58 lines checked
Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 13/14: target/arm: Implement FMOV (immediate) for fp16...
Checking PATCH 14/14: target/arm: Fix sqrt_f16 exception raising...
=== OUTPUT END ===
Test command exited with code: 1
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
^ permalink raw reply	[flat|nested] 24+ messages in thread* Re: [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16
  2018-05-02 22:15 [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 Richard Henderson
                   ` (14 preceding siblings ...)
  2018-05-02 22:33 ` [Qemu-devel] [PATCH v2 00/14] target/arm: Fixups for ARM_FEATURE_V8_FP16 no-reply
@ 2018-05-10 17:09 ` Peter Maydell
  15 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2018-05-10 17:09 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers, qemu-arm
On 2 May 2018 at 23:15, Richard Henderson <richard.henderson@linaro.org> wrote:
> When running the gcc testsuite with current aarch64-linux-user,
> the testsuite detects the presence of the fp16 extension and
> enables lots of extra tests for builtins.
>
> Quite a few of these new tests fail because we missed implementing
> some instructions.  We really should go back and verify that nothing
> else is missing from this (rather large) extension.
>
> In addition, it tests some edge conditions on data that show flaws
> in the way we were performing integer<->fp conversion; particularly
> with respect to scaled conversion.
>
> Changes since v1:
>   * Rebased vs master instead of tgt-arm-sve-9.
>   * Alex did some additional digging through the ARM xhtml
>     and came up with some additional missing instructions.
>   * Everything cc'd to qemu-stable.
I'm going to take patches 1..4 into target-arm.next, just to
reduce the size of this a bit. Patch 5 is the FMOV one which
is missing some decode. Otherwise I've reviewed all the ones
which still needed review. Don't forget to fix the patch 12
patchew warning:
ERROR: space required before the open parenthesis '('
#41: FILE: target/arm/translate-a64.c:4683:
+    switch(type) {
thanks
-- PMM
^ permalink raw reply	[flat|nested] 24+ messages in thread