[PATCH v2 00/69] target/arm: FEAT_AFP and FEAT_RPRES

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Peter Maydell <peter.maydell@linaro.org>
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH v2 00/69] target/arm: FEAT_AFP and FEAT_RPRES
Date: Sat,  1 Feb 2025 16:39:03 +0000	[thread overview]
Message-ID: <20250201164012.1660228-1-peter.maydell@linaro.org> (raw)

This patchset implements emulation of the Arm FEAT_AFP and FEAT_RPRES
extensions, which are floating-point related. (Summary of what
these are exactly is at the bottom of the cover letter.)

If you'd rather have these patches as a git branch:
 https://git.linaro.org/people/pmaydell/qemu-arm.git  feat-afp
with human readable web view at:
 https://git.linaro.org/people/peter.maydell/qemu-arm.git/log/?h=feat-afp
  
Changes between v1 and v2:
 * first part of the series has been upstreamed
 * I've left the first two x86 patches in here, just to avoid having
   to use a Based-on: tag. They've both been taken by Paolo already,
   they just haven't landed upstream yet.
 * the tail-end patches fixing x86 denormal support are not posted
   here (indeed I didn't mean to send them in v1!); I'll send those
   separately once the underlying softfloat patches are upstream
 * the renaming of the FPST_ constants (already upstream) is
   carried through into these patches
 * name changes in the "allow flushing of output denormals to be
   after rounding" patch:
   now set_float_ftz_detection(), get_float_ftz_detection(),
   float_ftz_after_rounding and float_ftz_before_rounding
 * moved select_fpst to translate-a64.h and renamed to select_ah_fpst
 * use vec_full_reg_offset() in the write_fp_*reg_merging fns
 * drop no-longer-nedeed float*_input_flush2() calls in the
   float*_hs_compare() fns in "implement float_flag_input_denormal_used"
 * adopted RTH's patchset, by a mix of merging in fixes to my
   patches and adding his (partly on the end, and partly sorted
   into the series at appropriate places). I updated commit messages
   in a few places (notably standardising them onto "Handle X for
   <some insn>" rather than "for <some QEMU function>")

Patches that still need review:
04 fpu: Implement float_flag_input_denormal_used
05 fpu: allow flushing of output denormals to be after rounding
06 target/arm: Define FPCR AH, FIZ, NEP bits

RTH: I kept your r-by tags on the patches where I squashed
in your fixes from your followup series (mostly this is the
changes to use the muladd flags). If you want to re-review
to check that I did the squashing right, those are patches:

37 target/arm: Handle FPCR.AH in negation steps in SVE FCADD
38 target/arm: Handle FPCR.AH in negation steps in FCADD
41 target/arm: Handle FPCR.AH in negation step in FMLS (indexed)
42 target/arm: Handle FPCR.AH in negation in FMLS (vector)
43 target/arm: Handle FPCR.AH in negation step in SVE FMLS (vector)
44 target/arm: Handle FPCR.AH in SVE FTSSEL
45 target/arm: Handle FPCR.AH in SVE FTMAD

Summary of what FEAT_AFP/FEAT_RPRES are, from v1 cover letter:

FEAT_AFP defines three new control bits in the FPCR, whose
operations are basically independent of each other:
 * FPCR.AH: "alternate floating point mode"; this changes floating
   point behaviour in a variety of ways, including:
   - the sign of a default NaN is 1, not 0
   - if FPCR.FZ is also 1, denormals detected after rounding
   with an unbounded exponent has been applied are flushed to zero
   - FPCR.FZ does not cause denormalized inputs to be flushed to zero
   - miscellaneous other corner-case behaviour changes
 * FPCR.FIZ: flush denormalized numbers to zero on input for
   most instructions
 * FPCR.NEP: makes scalar SIMD operations merge the result with
   higher vector elements in one of the source registers, instead
   of zeroing the higher elements of the destination

FEAT_RPRES makes single-precision FRECPE and FRSQRTE use a 12-bit
mantissa precision instead of 8-bit when FPCR.AH is set.

thanks
-- PMM

Peter Maydell (50):
  target/i386: Do not raise Invalid for 0 * Inf + QNaN
  tests/tcg/x86_64/fma: Test some x86 fused-multiply-add cases
  fpu: Add float_class_denormal
  fpu: Implement float_flag_input_denormal_used
  fpu: allow flushing of output denormals to be after rounding
  target/arm: Define FPCR AH, FIZ, NEP bits
  target/arm: Implement FPCR.FIZ handling
  target/arm: Adjust FP behaviour for FPCR.AH = 1
  target/arm: Adjust exception flag handling for AH = 1
  target/arm: Add FPCR.AH to tbflags
  target/arm: Set up float_status to use for FPCR.AH=1 behaviour
  target/arm: Use FPST_FPCR_AH for FRECPE, FRECPS, FRECPX, FRSQRTE,
    FRSQRTS
  target/arm: Use FPST_FPCR_AH for BFCVT* insns
  target/arm: Use FPST_FPCR_AH for BFMLAL*, BFMLSL* insns
  target/arm: Add FPCR.NEP to TBFLAGS
  target/arm: Define and use new write_fp_*reg_merging() functions
  target/arm: Handle FPCR.NEP for 3-input scalar operations
  target/arm: Handle FPCR.NEP for BFCVT scalar
  target/arm: Handle FPCR.NEP for 1-input scalar operations
  target/arm: Handle FPCR.NEP in do_cvtf_scalar()
  target/arm: Handle FPCR.NEP for scalar FABS and FNEG
  target/arm: Handle FPCR.NEP for FCVTXN (scalar)
  target/arm: Handle FPCR.NEP for NEP for FMUL, FMULX scalar by element
  target/arm: Implement FPCR.AH semantics for scalar FMIN/FMAX
  target/arm: Implement FPCR.AH semantics for vector FMIN/FMAX
  target/arm: Implement FPCR.AH semantics for FMAXV and FMINV
  target/arm: Implement FPCR.AH semantics for FMINP and FMAXP
  target/arm: Implement FPCR.AH semantics for SVE FMAXV and FMINV
  target/arm: Implement FPCR.AH semantics for SVE FMIN/FMAX immediate
  target/arm: Implement FPCR.AH semantics for SVE FMIN/FMAX vector
  target/arm: Implement FPCR.AH handling of negation of NaN
  target/arm: Implement FPCR.AH handling for scalar FABS and FABD
  target/arm: Handle FPCR.AH in vector FABD
  target/arm: Handle FPCR.AH in SVE FNEG
  target/arm: Handle FPCR.AH in SVE FABS
  target/arm: Handle FPCR.AH in SVE FABD
  target/arm: Handle FPCR.AH in negation steps in SVE FCADD
  target/arm: Handle FPCR.AH in negation steps in FCADD
  target/arm: Handle FPCR.AH in FRECPS and FRSQRTS scalar insns
  target/arm: Handle FPCR.AH in FRECPS and FRSQRTS vector insns
  target/arm: Handle FPCR.AH in negation step in FMLS (indexed)
  target/arm: Handle FPCR.AH in negation in FMLS (vector)
  target/arm: Handle FPCR.AH in negation step in SVE FMLS (vector)
  target/arm: Handle FPCR.AH in SVE FTSSEL
  target/arm: Handle FPCR.AH in SVE FTMAD
  target/arm: Enable FEAT_AFP for '-cpu max'
  target/arm: Plumb FEAT_RPRES frecpe and frsqrte through to new helper
  target/arm: Implement increased precision FRECPE
  target/arm: Implement increased precision FRSQRTE
  target/arm: Enable FEAT_RPRES for -cpu max

Richard Henderson (19):
  target/arm: Handle FPCR.AH in vector FCMLA
  target/arm: Handle FPCR.AH in FCMLA by index
  target/arm: Handle FPCR.AH in SVE FCMLA
  target/arm: Handle FPCR.AH in FMLSL (by element and vector)
  target/arm: Handle FPCR.AH in SVE FMLSL (indexed)
  target/arm: Handle FPCR.AH in SVE FMLSLB, FMLSLT (vectors)
  target/arm: Introduce CPUARMState.vfp.fp_status[]
  target/arm: Remove standard_fp_status_f16
  target/arm: Remove standard_fp_status
  target/arm: Remove ah_fp_status_f16
  target/arm: Remove ah_fp_status
  target/arm: Remove fp_status_f16_a64
  target/arm: Remove fp_status_f16_a32
  target/arm: Remove fp_status_a64
  target/arm: Remove fp_status_a32
  target/arm: Simplify fp_status indexing in mve_helper.c
  target/arm: Simplify DO_VFP_cmp in vfp_helper.c
  target/arm: Read fz16 from env->vfp.fpcr
  target/arm: Sink fp_status and fpcr access into do_fmlal*

 docs/system/arm/emulation.rst    |   2 +
 include/fpu/softfloat-helpers.h  |  11 +
 include/fpu/softfloat-types.h    |  41 +-
 target/arm/cpu-features.h        |  10 +
 target/arm/cpu.h                 |  97 ++--
 target/arm/helper.h              |  26 +
 target/arm/internals.h           |   6 +
 target/arm/tcg/helper-a64.h      |  13 +
 target/arm/tcg/helper-sve.h      | 120 +++++
 target/arm/tcg/translate-a64.h   |  13 +
 target/arm/tcg/translate.h       |  54 +--
 target/arm/tcg/vec_internal.h    |  35 ++
 target/mips/fpu_helper.h         |   6 +
 fpu/softfloat.c                  |  66 ++-
 target/alpha/cpu.c               |   7 +
 target/arm/cpu.c                 |  46 +-
 target/arm/helper.c              |   2 +-
 target/arm/tcg/cpu64.c           |   2 +
 target/arm/tcg/helper-a64.c      | 151 +++---
 target/arm/tcg/hflags.c          |  13 +
 target/arm/tcg/mve_helper.c      |  44 +-
 target/arm/tcg/sme_helper.c      |   4 +-
 target/arm/tcg/sve_helper.c      | 367 +++++++++++----
 target/arm/tcg/translate-a64.c   | 782 +++++++++++++++++++++++++------
 target/arm/tcg/translate-sve.c   | 193 ++++++--
 target/arm/tcg/vec_helper.c      | 387 ++++++++++-----
 target/arm/vfp_helper.c          | 372 ++++++++++++---
 target/hppa/fpu_helper.c         |  11 +
 target/i386/tcg/fpu_helper.c     |  13 +-
 target/mips/msa.c                |   9 +
 target/ppc/cpu_init.c            |   3 +
 target/rx/cpu.c                  |   8 +
 target/sh4/cpu.c                 |   8 +
 target/tricore/helper.c          |   1 +
 tests/fp/fp-bench.c              |   1 +
 tests/tcg/x86_64/fma.c           | 109 +++++
 fpu/softfloat-parts.c.inc        | 132 +++++-
 tests/tcg/x86_64/Makefile.target |   1 +
 38 files changed, 2452 insertions(+), 714 deletions(-)
 create mode 100644 tests/tcg/x86_64/fma.c

-- 
2.34.1

next             reply	other threads:[~2025-02-01 16:40 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-01 16:39 Peter Maydell [this message]
2025-02-01 16:39 ` [PATCH v2 01/69] target/i386: Do not raise Invalid for 0 * Inf + QNaN Peter Maydell
2025-02-01 16:39 ` [PATCH v2 02/69] tests/tcg/x86_64/fma: Test some x86 fused-multiply-add cases Peter Maydell
2025-02-01 16:39 ` [PATCH v2 03/69] fpu: Add float_class_denormal Peter Maydell
2025-02-01 16:39 ` [PATCH v2 04/69] fpu: Implement float_flag_input_denormal_used Peter Maydell
2025-02-02 16:45   ` Richard Henderson
2025-02-01 16:39 ` [PATCH v2 05/69] fpu: allow flushing of output denormals to be after rounding Peter Maydell
2025-02-02 16:50   ` Richard Henderson
2025-02-01 16:39 ` [PATCH v2 06/69] target/arm: Define FPCR AH, FIZ, NEP bits Peter Maydell
2025-02-02 16:51   ` Richard Henderson
2025-02-01 16:39 ` [PATCH v2 07/69] target/arm: Implement FPCR.FIZ handling Peter Maydell
2025-02-01 16:39 ` [PATCH v2 08/69] target/arm: Adjust FP behaviour for FPCR.AH = 1 Peter Maydell
2025-02-11 13:17   ` Peter Maydell
2025-02-01 16:39 ` [PATCH v2 09/69] target/arm: Adjust exception flag handling for AH " Peter Maydell
2025-02-01 16:39 ` [PATCH v2 10/69] target/arm: Add FPCR.AH to tbflags Peter Maydell
2025-02-01 16:39 ` [PATCH v2 11/69] target/arm: Set up float_status to use for FPCR.AH=1 behaviour Peter Maydell
2025-02-01 16:39 ` [PATCH v2 12/69] target/arm: Use FPST_FPCR_AH for FRECPE, FRECPS, FRECPX, FRSQRTE, FRSQRTS Peter Maydell
2025-02-01 16:39 ` [PATCH v2 13/69] target/arm: Use FPST_FPCR_AH for BFCVT* insns Peter Maydell
2025-02-01 16:39 ` [PATCH v2 14/69] target/arm: Use FPST_FPCR_AH for BFMLAL*, BFMLSL* insns Peter Maydell
2025-02-01 16:39 ` [PATCH v2 15/69] target/arm: Add FPCR.NEP to TBFLAGS Peter Maydell
2025-02-01 16:39 ` [PATCH v2 16/69] target/arm: Define and use new write_fp_*reg_merging() functions Peter Maydell
2025-02-01 16:39 ` [PATCH v2 17/69] target/arm: Handle FPCR.NEP for 3-input scalar operations Peter Maydell
2025-02-01 16:39 ` [PATCH v2 18/69] target/arm: Handle FPCR.NEP for BFCVT scalar Peter Maydell
2025-02-01 16:39 ` [PATCH v2 19/69] target/arm: Handle FPCR.NEP for 1-input scalar operations Peter Maydell
2025-02-01 16:39 ` [PATCH v2 20/69] target/arm: Handle FPCR.NEP in do_cvtf_scalar() Peter Maydell
2025-02-01 16:39 ` [PATCH v2 21/69] target/arm: Handle FPCR.NEP for scalar FABS and FNEG Peter Maydell
2025-02-01 16:39 ` [PATCH v2 22/69] target/arm: Handle FPCR.NEP for FCVTXN (scalar) Peter Maydell
2025-02-01 16:39 ` [PATCH v2 23/69] target/arm: Handle FPCR.NEP for NEP for FMUL, FMULX scalar by element Peter Maydell
2025-02-01 16:39 ` [PATCH v2 24/69] target/arm: Implement FPCR.AH semantics for scalar FMIN/FMAX Peter Maydell
2025-02-01 16:39 ` [PATCH v2 25/69] target/arm: Implement FPCR.AH semantics for vector FMIN/FMAX Peter Maydell
2025-02-01 16:39 ` [PATCH v2 26/69] target/arm: Implement FPCR.AH semantics for FMAXV and FMINV Peter Maydell
2025-02-01 16:39 ` [PATCH v2 27/69] target/arm: Implement FPCR.AH semantics for FMINP and FMAXP Peter Maydell
2025-02-01 16:39 ` [PATCH v2 28/69] target/arm: Implement FPCR.AH semantics for SVE FMAXV and FMINV Peter Maydell
2025-02-01 16:39 ` [PATCH v2 29/69] target/arm: Implement FPCR.AH semantics for SVE FMIN/FMAX immediate Peter Maydell
2025-02-01 16:39 ` [PATCH v2 30/69] target/arm: Implement FPCR.AH semantics for SVE FMIN/FMAX vector Peter Maydell
2025-02-01 16:39 ` [PATCH v2 31/69] target/arm: Implement FPCR.AH handling of negation of NaN Peter Maydell
2025-02-01 16:39 ` [PATCH v2 32/69] target/arm: Implement FPCR.AH handling for scalar FABS and FABD Peter Maydell
2025-02-01 16:39 ` [PATCH v2 33/69] target/arm: Handle FPCR.AH in vector FABD Peter Maydell
2025-02-01 16:39 ` [PATCH v2 34/69] target/arm: Handle FPCR.AH in SVE FNEG Peter Maydell
2025-02-01 16:39 ` [PATCH v2 35/69] target/arm: Handle FPCR.AH in SVE FABS Peter Maydell
2025-02-01 16:39 ` [PATCH v2 36/69] target/arm: Handle FPCR.AH in SVE FABD Peter Maydell
2025-02-01 16:39 ` [PATCH v2 37/69] target/arm: Handle FPCR.AH in negation steps in SVE FCADD Peter Maydell
2025-02-02 17:17   ` Richard Henderson
2025-02-01 16:39 ` [PATCH v2 38/69] target/arm: Handle FPCR.AH in negation steps in FCADD Peter Maydell
2025-02-01 16:39 ` [PATCH v2 39/69] target/arm: Handle FPCR.AH in FRECPS and FRSQRTS scalar insns Peter Maydell
2025-02-01 16:39 ` [PATCH v2 40/69] target/arm: Handle FPCR.AH in FRECPS and FRSQRTS vector insns Peter Maydell
2025-02-01 16:39 ` [PATCH v2 41/69] target/arm: Handle FPCR.AH in negation step in FMLS (indexed) Peter Maydell
2025-02-01 16:39 ` [PATCH v2 42/69] target/arm: Handle FPCR.AH in negation in FMLS (vector) Peter Maydell
2025-02-01 16:39 ` [PATCH v2 43/69] target/arm: Handle FPCR.AH in negation step in SVE " Peter Maydell
2025-02-01 16:39 ` [PATCH v2 44/69] target/arm: Handle FPCR.AH in SVE FTSSEL Peter Maydell
2025-02-01 16:39 ` [PATCH v2 45/69] target/arm: Handle FPCR.AH in SVE FTMAD Peter Maydell
2025-02-01 16:39 ` [PATCH v2 46/69] target/arm: Handle FPCR.AH in vector FCMLA Peter Maydell
2025-02-01 16:39 ` [PATCH v2 47/69] target/arm: Handle FPCR.AH in FCMLA by index Peter Maydell
2025-02-01 16:39 ` [PATCH v2 48/69] target/arm: Handle FPCR.AH in SVE FCMLA Peter Maydell
2025-02-01 16:39 ` [PATCH v2 49/69] target/arm: Handle FPCR.AH in FMLSL (by element and vector) Peter Maydell
2025-02-01 16:39 ` [PATCH v2 50/69] target/arm: Handle FPCR.AH in SVE FMLSL (indexed) Peter Maydell
2025-02-01 16:39 ` [PATCH v2 51/69] target/arm: Handle FPCR.AH in SVE FMLSLB, FMLSLT (vectors) Peter Maydell
2025-02-01 16:39 ` [PATCH v2 52/69] target/arm: Enable FEAT_AFP for '-cpu max' Peter Maydell
2025-02-01 16:39 ` [PATCH v2 53/69] target/arm: Plumb FEAT_RPRES frecpe and frsqrte through to new helper Peter Maydell
2025-02-01 16:39 ` [PATCH v2 54/69] target/arm: Implement increased precision FRECPE Peter Maydell
2025-02-01 16:39 ` [PATCH v2 55/69] target/arm: Implement increased precision FRSQRTE Peter Maydell
2025-02-01 16:39 ` [PATCH v2 56/69] target/arm: Enable FEAT_RPRES for -cpu max Peter Maydell
2025-02-01 16:40 ` [PATCH v2 57/69] target/arm: Introduce CPUARMState.vfp.fp_status[] Peter Maydell
2025-02-01 16:40 ` [PATCH v2 58/69] target/arm: Remove standard_fp_status_f16 Peter Maydell
2025-02-01 16:40 ` [PATCH v2 59/69] target/arm: Remove standard_fp_status Peter Maydell
2025-02-01 16:40 ` [PATCH v2 60/69] target/arm: Remove ah_fp_status_f16 Peter Maydell
2025-02-01 16:40 ` [PATCH v2 61/69] target/arm: Remove ah_fp_status Peter Maydell
2025-02-01 16:40 ` [PATCH v2 62/69] target/arm: Remove fp_status_f16_a64 Peter Maydell
2025-02-01 16:40 ` [PATCH v2 63/69] target/arm: Remove fp_status_f16_a32 Peter Maydell
2025-02-01 16:40 ` [PATCH v2 64/69] target/arm: Remove fp_status_a64 Peter Maydell
2025-02-01 16:40 ` [PATCH v2 65/69] target/arm: Remove fp_status_a32 Peter Maydell
2025-02-01 16:40 ` [PATCH v2 66/69] target/arm: Simplify fp_status indexing in mve_helper.c Peter Maydell
2025-02-01 16:40 ` [PATCH v2 67/69] target/arm: Simplify DO_VFP_cmp in vfp_helper.c Peter Maydell
2025-02-01 16:40 ` [PATCH v2 68/69] target/arm: Read fz16 from env->vfp.fpcr Peter Maydell
2025-02-01 16:40 ` [PATCH v2 69/69] target/arm: Sink fp_status and fpcr access into do_fmlal* Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250201164012.1660228-1-peter.maydell@linaro.org \
    --to=peter.maydell@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).