From: Peter Maydell <peter.maydell@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PULL 53/68] target/arm: Implement increased precision FRECPE
Date: Tue, 11 Feb 2025 16:25:39 +0000
Message-ID: <20250211162554.4135349-54-peter.maydell@linaro.org>
In-Reply-To: <20250211162554.4135349-1-peter.maydell@linaro.org>

Implement the increased precision variation of FRECPE. In the
pseudocode this corresponds to the handling of the
"increasedprecision" boolean in the FPRecipEstimate() and
RecipEstimate() functions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/vfp_helper.c | 54 +++++++++++++++++++++++++++++++++++------
1 file changed, 46 insertions(+), 8 deletions(-)
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index b97417e5a1a..2df97e128f2 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -733,6 +733,33 @@ static int recip_estimate(int input)
return r;
}
+/*
+ * Increased precision version:
+ * input is a 13 bit fixed point number
+ * input range 2048 .. 4095 for a number from 0.5 <= x < 1.0.
+ * result range 4096 .. 8191 for a number from 1.0 to 2.0
+ */
+static int recip_estimate_incprec(int input)
+{
+ int a, b, r;
+ assert(2048 <= input && input < 4096);
+ a = (input * 2) + 1;
+ /*
+ * The pseudocode expresses this as an operation on infinite
+ * precision reals where it calculates 2^25 / a and then looks
+ * at the error between that and the rounded-down-to-integer
+ * value to see if it should instead round up. We instead
+ * follow the same approach as the pseudocode for the 8-bit
+ * precision version, and calculate (2 * (2^25 / a)) as an
+ * integer so we can do the "add one and halve" to round it.
+ * So the 1 << 26 here is correct.
+ */
+ b = (1 << 26) / a;
+ r = (b + 1) >> 1;
+ assert(4096 <= r && r < 8192);
+ return r;
+}
+
/*
* Common wrapper to call recip_estimate
*
@@ -742,7 +769,8 @@ static int recip_estimate(int input)
* callee.
*/
-static uint64_t call_recip_estimate(int *exp, int exp_off, uint64_t frac)
+static uint64_t call_recip_estimate(int *exp, int exp_off, uint64_t frac,
+ bool increasedprecision)
{
uint32_t scaled, estimate;
uint64_t result_frac;
@@ -758,12 +786,22 @@ static uint64_t call_recip_estimate(int *exp, int exp_off, uint64_t frac)
}
}
- /* scaled = UInt('1':fraction<51:44>) */
- scaled = deposit32(1 << 8, 0, 8, extract64(frac, 44, 8));
- estimate = recip_estimate(scaled);
+ if (increasedprecision) {
+ /* scaled = UInt('1':fraction<51:41>) */
+ scaled = deposit32(1 << 11, 0, 11, extract64(frac, 41, 11));
+ estimate = recip_estimate_incprec(scaled);
+ } else {
+ /* scaled = UInt('1':fraction<51:44>) */
+ scaled = deposit32(1 << 8, 0, 8, extract64(frac, 44, 8));
+ estimate = recip_estimate(scaled);
+ }
result_exp = exp_off - *exp;
- result_frac = deposit64(0, 44, 8, estimate);
+ if (increasedprecision) {
+ result_frac = deposit64(0, 40, 12, estimate);
+ } else {
+ result_frac = deposit64(0, 44, 8, estimate);
+ }
if (result_exp == 0) {
result_frac = deposit64(result_frac >> 1, 51, 1, 1);
} else if (result_exp == -1) {
@@ -832,7 +870,7 @@ uint32_t HELPER(recpe_f16)(uint32_t input, float_status *fpst)
}
f64_frac = call_recip_estimate(&f16_exp, 29,
- ((uint64_t) f16_frac) << (52 - 10));
+ ((uint64_t) f16_frac) << (52 - 10), false);
/* result = sign : result_exp<4:0> : fraction<51:42> */
f16_val = deposit32(0, 15, 1, f16_sign);
@@ -885,7 +923,7 @@ static float32 do_recpe_f32(float32 input, float_status *fpst, bool rpres)
}
f64_frac = call_recip_estimate(&f32_exp, 253,
- ((uint64_t) f32_frac) << (52 - 23));
+ ((uint64_t) f32_frac) << (52 - 23), rpres);
/* result = sign : result_exp<7:0> : fraction<51:29> */
f32_val = deposit32(0, 31, 1, f32_sign);
@@ -943,7 +981,7 @@ float64 HELPER(recpe_f64)(float64 input, float_status *fpst)
return float64_set_sign(float64_zero, float64_is_neg(f64));
}
- f64_frac = call_recip_estimate(&f64_exp, 2045, f64_frac);
+ f64_frac = call_recip_estimate(&f64_exp, 2045, f64_frac, false);
/* result = sign : result_exp<10:0> : fraction<51:0>; */
f64_val = deposit64(0, 63, 1, f64_sign);
--
2.34.1
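
The rounding argument in the recip_estimate_incprec() comment can be
checked with a minimal standalone sketch, which is not part of the patch.
It assumes the pseudocode's intent is a plain round-to-nearest of
2^25 / a. The names recip_estimate_incprec_sketch(),
recip_estimate_reference() and count_mismatches() are hypothetical and
exist only for this illustration.

```c
#include <assert.h>

/* Same integer arithmetic as recip_estimate_incprec() in the patch. */
static int recip_estimate_incprec_sketch(int input)
{
    int a, b, r;

    assert(2048 <= input && input < 4096);
    a = (input * 2) + 1;
    b = (1 << 26) / a;   /* floor(2 * 2^25 / a) */
    r = (b + 1) >> 1;    /* "add one and halve" to round */
    return r;
}

/*
 * Hypothetical reference (an assumption, not the Arm pseudocode text):
 * compute floor(2^25 / a), then round up when the fractional part
 * rem / a is at least one half.
 */
static int recip_estimate_reference(int input)
{
    int a = (input * 2) + 1;
    int q = (1 << 25) / a;
    int rem = (1 << 25) % a;

    return q + ((2 * rem >= a) ? 1 : 0);
}

/* Count inputs in the valid range where the two methods disagree. */
static int count_mismatches(void)
{
    int input, mismatches = 0;

    for (input = 2048; input < 4096; input++) {
        if (recip_estimate_incprec_sketch(input) !=
            recip_estimate_reference(input)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

Because a is always odd, 2 * rem can never equal a, so there is no tie
case and the choice of ">=" over ">" in the reference makes no difference.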