* [Qemu-devel] [RFC PATCH 00/30] v8.2 half-precision support (work-in-progress)
@ 2017-10-13 16:24 Alex Bennée
From: Alex Bennée @ 2017-10-13 16:24 UTC
  To: richard.henderson; +Cc: peter.maydell, qemu-devel, qemu-arm, Alex Bennée

Hi,

This is the current state of ARM v8.2 half-precision operations. There
are two halves to this effort: expanding our copy of softfloat to
include the requisite operations, and then plumbing the appropriate
helpers and TCG generation code into the ARM front end.

I'm posting this today as I wanted to get feedback before too many
assumptions were baked into what is already a large patch series,
which will likely be giant by the time it is finished.

SoftFloat
=========

Previously I had pondered whether switching to the newer SoftFloat3
would be worthwhile. While the upstream project is certainly open to
accepting patches, it would be a slow process given the changes we've
made over the years. As a result I've decided to stick with expanding
our current code.

Most of the helpers have been done fairly mechanically by copying the
float32 equivalent, filing off the 32's, replacing with 16's and
adjusting the constants appropriately (min/max exp, NaNs etc). I've
done this in conjunction with reading the SoftFloat3 code as a sanity
check although in places the design is a little different.
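
As a rough illustration of the constants that change when going from 32
to 16 bits, here is a minimal standalone sketch of the IEEE 754 binary16
layout (1 sign bit, 5 exponent bits, 10 fraction bits, bias 15). The
macro and function names are hypothetical and chosen for this example
only; they are not the helpers added by the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IEEE 754 binary16 layout: 1 sign bit, 5 exponent bits, 10 fraction bits. */
#define F16_FRAC_BITS 10
#define F16_EXP_MASK  0x1f   /* all-ones exponent marks Inf/NaN */
#define F16_FRAC_MASK 0x3ff
#define F16_EXP_BIAS  15     /* float32 uses 127 here */

/* Hypothetical field extractors, named for this example only. */
static inline bool f16_sign(uint16_t a)
{
    return (a >> 15) != 0;
}

static inline int f16_exp(uint16_t a)
{
    return (a >> F16_FRAC_BITS) & F16_EXP_MASK;
}

static inline uint16_t f16_frac(uint16_t a)
{
    return a & F16_FRAC_MASK;
}

/* Reassemble a value from its three fields. */
static inline uint16_t f16_pack(bool sign, int exp, uint16_t frac)
{
    assert(exp >= 0 && exp <= F16_EXP_MASK && frac <= F16_FRAC_MASK);
    return (uint16_t)(((unsigned)sign << 15) |
                      ((unsigned)exp << F16_FRAC_BITS) | frac);
}

/* NaN: maximum exponent with a non-zero fraction. */
static inline bool f16_is_any_nan(uint16_t a)
{
    return f16_exp(a) == F16_EXP_MASK && f16_frac(a) != 0;
}

/* Zero or denormal: biased exponent field is zero. */
static inline bool f16_is_zero_or_denormal(uint16_t a)
{
    return f16_exp(a) == 0;
}
```

For example, 1.0 is encoded as 0x3c00: sign 0, biased exponent
F16_EXP_BIAS, fraction 0.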

Some bits of the softfloat code were a bit magical to me so I've added
additional comments and re-written the flow to be a bit more obvious.
Currently there are a whole bunch of checkpatch issues to fix. Now
that we "own" this copy of softfloat, I intend for the new code to
follow our own internal coding standards.

The tests/test-softfloat is slightly hacked up. I do want to build one
for each configured target as softfloat varies depending on the target
parameters but I couldn't quite get it to work with:

  @@ -156,10 +158,11 @@ check-unit-y += tests/ptimer-test$(EXESUF)
   gcov-files-ptimer-test-y = hw/core/ptimer.c
   check-unit-y += tests/test-qapi-util$(EXESUF)
   gcov-files-test-qapi-util-y = qapi/qapi-util.c
  -check-unit-y += tests/test-softfloat$(EXESUF)
  -gcov-files-test-softfloat-y = fpu/softfloat.c
  -check-unit-y += tests/test-softfloat-aarch64$(EXESUF)
  -gcov-files-test-softfloat-aarch64-y = fpu/softfloat.c
  +
  +# We built a softfloat test for each variant of softfloat we have
  +$(foreach TARGET,$(TARGETS),\
  +	$(eval check-unit-y += tests/test-softfloat-$(TARGET)$(EXESUF)) \
  +	$(eval gcov-files-test-softfloat-$(TARGET)-y = fpu/softfloat.c))

   check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh

  @@ -608,8 +611,9 @@ tests/test-qht-par$(EXESUF): tests/test-qht-par.o tests/qht-bench$(EXESUF) $(tes
   tests/qht-bench$(EXESUF): tests/qht-bench.o $(test-util-obj-y)
   tests/test-bufferiszero$(EXESUF): tests/test-bufferiszero.o $(test-util-obj-y)
   tests/atomic_add-bench$(EXESUF): tests/atomic_add-bench.o $(test-util-obj-y)
  -tests/test-softfloat$(EXESUF): tests/test-softfloat.o $(BUILD_DIR)/aarch64-softmmu/fpu/softfloat.o
  -tests/test-softfloat-aarch64$(EXESUF): tests/test-softfloat.o $(BUILD_DIR)/aarch64-softmmu/fpu/softfloat.o
  +# There is a softfloat test target for each system type/softfloat build
  +$(foreach TARGET,$(TARGETS),\
  +	$(eval tests/test-softfloat-$(TARGET)$(EXESUF): tests/test-softfloat.o $(BUILD_DIR)/$(TARGET)-softmmu/fpu/softfloat.o))

I think it would be nice to add some softfloat unit tests, so any
pointers are welcome.

ARM Vector Code
===============

The code follows the existing "decompose into chunks and call helpers"
methodology. I was originally going to base this directly on top of
Richard's TCGvec support but realised this would introduce a
dependency that might complicate further development. I have ended up
copying and pasting a bunch of the loop code; see lines like:

  for (pass = 0; pass < (is_q ? 8 : 4); pass++) {
  ...
  .. stuff per element ..
  ...
  }

I wonder if more of that could be factored away into a common iterator
which could be more easily converted when the vector code is ready.
For now following the existing conventions hopefully makes it easier
to review.
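
To make the common-iterator idea concrete, here is a minimal sketch of
its shape only. It runs over plain arrays at run time, whereas the real
translator emits TCG ops at translate time, so treat it as an
illustration rather than code from the series. The names
for_each_f16_element, f16_elem_fn and f16_copysign_elem are hypothetical.
The sketch assumes 8 half-precision lanes in a 128-bit register and 4 in
a 64-bit one, and zeroes the unused upper lanes as AArch64 does for
64-bit vector writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define F16_LANES_Q 8   /* 128-bit vector: 8 half-precision lanes */
#define F16_LANES_D 4   /* 64-bit vector: 4 half-precision lanes */

/* Hypothetical per-element operation on raw binary16 bit patterns. */
typedef uint16_t (*f16_elem_fn)(uint16_t a, uint16_t b);

/*
 * Hypothetical common iterator: apply fn to each active lane and clear
 * the lanes that are not written when is_q is false.
 */
static void for_each_f16_element(bool is_q, const uint16_t *rn,
                                 const uint16_t *rm, uint16_t *rd,
                                 f16_elem_fn fn)
{
    int lanes = is_q ? F16_LANES_Q : F16_LANES_D;
    int pass;

    assert(fn != NULL);
    for (pass = 0; pass < lanes; pass++) {
        rd[pass] = fn(rn[pass], rm[pass]);
    }
    for (; pass < F16_LANES_Q; pass++) {
        rd[pass] = 0;
    }
}

/* Example element op: magnitude of a combined with the sign bit of b. */
static uint16_t f16_copysign_elem(uint16_t a, uint16_t b)
{
    return (uint16_t)((a & 0x7fff) | (b & 0x8000));
}
```

With something like this, each instruction would only supply its
per-element function, and the loop body would live in one place that
could later be swapped for the vector infrastructure.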


Alex Bennée (30):
  linux-user/main: support dfilter
  arm: introduce ARM_V8_FP16 feature bit
  include/exec/helper-head.h: support f16 in helper calls
  target/arm/cpu.h: update comment for half-precision values
  softfloat: implement propagateFloat16NaN
  fpu/softfloat: implement float16_squash_input_denormal
  fpu/softfloat: implement float16_abs helper
  softfloat: add half-precision expansions for MINMAX fns
  softfloat: propagate signalling NaNs in MINMAX
  softfloat: improve comments on ARM NaN propagation
  target/arm: implement half-precision F(MIN|MAX)(V|NMV)
  target/arm/translate-a64.c: handle_3same_64 comment fix
  target/arm/translate-a64.c: AdvSIMD scalar 3 Same FP16 initial decode
  softfloat: 16 bit helpers for shr, clz and rounding and packing
  softfloat: half-precision add/sub/mul/div support
  target/arm/translate-a64.c: add FP16 FADD/FMUL/FDIV to AdvSIMD 3 Same
    (!sub)
  target/arm/translate-a64.c: add FP16 FMULX
  target/arm/translate-a64.c: add AdvSIMD scalar two-reg misc skeleton
  Fix mask for AdvancedSIMD 2 reg misc
  softfloat: half-precision compare functions
  target/arm/translate-a64: add FP16 2-reg misc compare (zero)
  target/arm/translate-a64.c: add FP16 FAGCT to AdvSIMD 3 Same
  softfloat: add float16_rem and float16_muladd (!CHECK)
  disas_simd_indexed: support half-precision operations
  softfloat: float16_round_to_int
  tests/test-softfloat: add a simple test framework
  target/arm/translate-a64.c: add FP16 FRINTP to 2 reg misc
  softfloat: float16_to_int16 conversion
  tests/test-softfloat: add f16_to_int16 conversion test
  target/arm/translate-a64.c: add FP16 FCVTPS to 2 reg misc

 fpu/softfloat-macros.h     |   39 ++
 fpu/softfloat-specialize.h |  105 ++++-
 fpu/softfloat.c            | 1121 +++++++++++++++++++++++++++++++++++++++++++-
 include/exec/helper-head.h |    3 +
 include/fpu/softfloat.h    |   37 ++
 linux-user/main.c          |    7 +
 target/arm/cpu.h           |    2 +
 target/arm/cpu64.c         |    1 +
 target/arm/helper-a64.c    |  122 +++++
 target/arm/helper-a64.h    |   17 +
 target/arm/translate-a64.c |  420 ++++++++++++++---
 tests/Makefile.include     |    8 +-
 tests/test-softfloat.c     |   84 ++++
 13 files changed, 1883 insertions(+), 83 deletions(-)
 create mode 100644 tests/test-softfloat.c

-- 
2.14.1


Thread overview: 59+ messages
2017-10-13 16:24 [Qemu-devel] [RFC PATCH 00/30] v8.2 half-precision support (work-in-progress) Alex Bennée
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 01/30] linux-user/main: support dfilter Alex Bennée
2017-10-13 20:36   ` Richard Henderson
2017-10-14  9:58   ` Laurent Vivier
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 02/30] arm: introduce ARM_V8_FP16 feature bit Alex Bennée
2017-10-13 20:44   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 03/30] include/exec/helper-head.h: support f16 in helper calls Alex Bennée
2017-10-13 20:44   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 04/30] target/arm/cpu.h: update comment for half-precision values Alex Bennée
2017-10-13 20:44   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 05/30] softfloat: implement propagateFloat16NaN Alex Bennée
2017-10-13 20:49   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 06/30] fpu/softfloat: implement float16_squash_input_denormal Alex Bennée
2017-10-13 20:51   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 07/30] fpu/softfloat: implement float16_abs helper Alex Bennée
2017-10-13 20:51   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 08/30] softfloat: add half-precision expansions for MINMAX fns Alex Bennée
2017-10-13 20:52   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 09/30] softfloat: propagate signalling NaNs in MINMAX Alex Bennée
2017-10-15 16:13   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 10/30] softfloat: improve comments on ARM NaN propagation Alex Bennée
2017-10-15 16:14   ` Richard Henderson
2017-10-15 16:54   ` Peter Maydell
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 11/30] target/arm: implement half-precision F(MIN|MAX)(V|NMV) Alex Bennée
2017-10-16 20:10   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 12/30] target/arm/translate-a64.c: handle_3same_64 comment fix Alex Bennée
2017-10-15 16:28   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 13/30] target/arm/translate-a64.c: AdvSIMD scalar 3 Same FP16 initial decode Alex Bennée
2017-10-16 20:16   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 14/30] softfloat: 16 bit helpers for shr, clz and rounding and packing Alex Bennée
2017-10-15 18:02   ` Richard Henderson
2017-10-16  8:20     ` Alex Bennée
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 15/30] softfloat: half-precision add/sub/mul/div support Alex Bennée
2017-10-16 22:01   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 16/30] target/arm/translate-a64.c: add FP16 FADD/FMUL/FDIV to AdvSIMD 3 Same (!sub) Alex Bennée
2017-10-16 22:08   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 17/30] target/arm/translate-a64.c: add FP16 FMULX Alex Bennée
2017-10-16 22:24   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 18/30] target/arm/translate-a64.c: add AdvSIMD scalar two-reg misc skeleton Alex Bennée
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 19/30] Fix mask for AdvancedSIMD 2 reg misc Alex Bennée
2017-10-16 23:47   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 20/30] softfloat: half-precision compare functions Alex Bennée
2017-10-17  0:06   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 21/30] target/arm/translate-a64: add FP16 2-reg misc compare (zero) Alex Bennée
2017-10-17  0:36   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 22/30] target/arm/translate-a64.c: add FP16 FAGCT to AdvSIMD 3 Same Alex Bennée
2017-10-17  0:39   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 23/30] softfloat: add float16_rem and float16_muladd (!CHECK) Alex Bennée
2017-10-17  2:17   ` Richard Henderson
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 24/30] disas_simd_indexed: support half-precision operations Alex Bennée
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 25/30] softfloat: float16_round_to_int Alex Bennée
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 26/30] tests/test-softfloat: add a simple test framework Alex Bennée
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 27/30] target/arm/translate-a64.c: add FP16 FRINTP to 2 reg misc Alex Bennée
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 28/30] softfloat: float16_to_int16 conversion Alex Bennée
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 29/30] tests/test-softfloat: add f16_to_int16 conversion test Alex Bennée
2017-10-13 16:24 ` [Qemu-devel] [RFC PATCH 30/30] target/arm/translate-a64.c: add FP16 FCVTPS to 2 reg misc Alex Bennée
2017-10-13 16:58 ` [Qemu-devel] [RFC PATCH 00/30] v8.2 half-precision support (work-in-progress) no-reply
2017-10-13 16:59 ` no-reply
2017-10-17  2:34 ` Richard Henderson
