From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 v3 00/45] tcg: Support for Int128 with helpers
Date: Fri, 11 Nov 2022 17:40:16 +1000
Message-ID: <20221111074101.2069454-1-richard.henderson@linaro.org>
This is working toward improving atomicity within TCG, especially
with respect to Arm FEAT_LSE2, which guarantees that any operation
that does not cross a 16-byte boundary is treated atomically.
(Incidentally, I've also stumbled across language in the Intel SDM
that shows the feature is required there too, and the guarantee is
even bigger -- anything that does not cross a cache line boundary.
Given that we've only ever got 16-byte atomic operations on other
hosts, there's no chance of supporting full cache line generically.
But we could turn on the same "within 16" atomicity that will be
used for FEAT_LSE2.)
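As an illustration only (this is not code from the series), the "within 16" condition can be sketched as a check that an access does not cross an aligned 16-byte boundary. The function name access_within_16 and the limit of 16 bytes per access are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of QEMU: return true if the access
 * [addr, addr + size) lies entirely within one aligned 16-byte block.
 */
static bool access_within_16(uint64_t addr, size_t size)
{
    /* This sketch only considers accesses of 1 to 16 bytes. */
    assert(size >= 1 && size <= 16);

    /* Offset of the first byte inside its 16-byte block. */
    unsigned first = addr & 15;

    /* The access fits if its last byte is still inside that block. */
    return first + size <= 16;
}
```

Under this check an 8-byte access at offset 8 of a block qualifies, while the same access at offset 12 does not, because its last four bytes fall into the next block.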
That goal is somewhat down the road. This patch set contains two
items: paired register allocation and TCGv_i128 usage with helpers.
The next step will be putting these two together to provide atomic
128-bit load/store operations within TCG, via e.g. AArch64 LDP,
Power7 LDQ, or S390x LPQ -- all of which require allocating a pair
of registers. (Intel will require that I go through AVX, which is
a bit of a complication, but I'll figure that out.) And then of
course separately via the helpers used by the slow path.
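To make the register-pair requirement concrete, here is a minimal sketch, not taken from QEMU's allocator, of picking an even/odd pair of adjacent free registers from a bitmask. That is the shape of constraint a host such as s390x imposes on its 128-bit operations. The name find_free_even_odd_pair and the 32-register bitmask representation are assumptions for this example; the allocator in tcg/tcg.c also has to deal with preferences, spills and constraints that are ignored here.

```c
#include <stdint.h>

/*
 * Hypothetical sketch, not QEMU's allocator: given a bitmask of free
 * registers (bit N set means register N is free), return the lowest
 * even register R such that both R and R + 1 are free, or -1 if no
 * such pair exists.
 */
static int find_free_even_odd_pair(uint32_t free_mask)
{
    /* Bit R is set in 'pairs' when both R and R + 1 are free. */
    uint32_t pairs = free_mask & (free_mask >> 1);

    /* Keep only candidates whose first register is even. */
    pairs &= 0x55555555u;
    if (pairs == 0) {
        return -1;
    }

    /* Find the index of the lowest set bit. */
    int r = 0;
    while (!(pairs & 1)) {
        pairs >>= 1;
        r++;
    }
    return r;
}
```

With registers 1 and 2 free (mask 0x6) the function returns -1, since that adjacent pair starts on an odd register; with registers 2 and 3 free (mask 0xc) it returns 2.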
Patches for target/ to use and test this to follow.
Changes for v3:
* Testing showed that trying to make things "easier" for the
register allocator on 32-bit hosts by keeping TCGv_i128 as
a blob was a mistake. Now split to 4 parts, similar to
how we treat TCGv_i64 as 2 parts.
* Fallout from the above is that we now have to support more
than 14 call arguments, which meant expanding TCGOp.
TCGOp is now allocated with a variable size, using only the
number of operands required. This could in fact result in less
memory usage on average, but I haven't collected any numbers.
* Implement (non-atomic) load/store on TCGv_i128, which gives
some of the helpers required for...
* Implement tcg_gen_atomic_cmpxchg_i128, which will eliminate
the primary source of 128-bit hacks/ifdefs in target/.
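Two of the ideas in the list above can be sketched in plain C. This is an illustration under stated assumptions, not code from the series: the U128 type and the functions u128_split4, u128_eq and u128_cmpxchg_nonatomic are hypothetical names. The first function shows a 128-bit value viewed as four 32-bit parts, least significant first, as a 32-bit host would have to handle it. The last shows the semantics of a non-atomic compare-and-swap built from a plain load and store.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 128-bit value held as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/* Split a 128-bit value into four 32-bit parts, least significant first. */
static void u128_split4(U128 v, uint32_t parts[4])
{
    parts[0] = (uint32_t)v.lo;
    parts[1] = (uint32_t)(v.lo >> 32);
    parts[2] = (uint32_t)v.hi;
    parts[3] = (uint32_t)(v.hi >> 32);
}

/* Compare both halves for equality. */
static bool u128_eq(U128 a, U128 b)
{
    return a.lo == b.lo && a.hi == b.hi;
}

/*
 * Non-atomic compare-and-swap: store newv if *mem equals cmpv.
 * Returns the value that was in memory before the operation.
 * Nothing here prevents another thread from modifying *mem
 * between the load and the store.
 */
static U128 u128_cmpxchg_nonatomic(U128 *mem, U128 cmpv, U128 newv)
{
    U128 old = *mem;

    if (u128_eq(old, cmpv)) {
        *mem = newv;
    }
    return old;
}
```

An atomic variant has the same result but must perform the load, the comparison and the store as one indivisible step, which is why it depends on host support or on a helper.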
Changes for v2:
* Fixes and r-b (philmd).
* Include i386 atomic16 patch, which avoids minor conflicts later.
* Split a few larger patches.
* Bug fixes for TCI.
r~
Richard Henderson (45):
meson: Move CONFIG_TCG_INTERPRETER to config_host
tcg: Tidy tcg_reg_alloc_op
tcg: Introduce paired register allocation
tcg/s390x: Use register pair allocation for div and mulu2
tcg/arm: Use register pair allocation for qemu_{ld,st}_i64
tcg: Remove TCG_TARGET_STACK_GROWSUP
accel/tcg: Set cflags_next_tb in cpu_common_initfn
target/sparc: Avoid TCGV_{LOW,HIGH}
tcg: Move TCG_{LOW,HIGH} to tcg-internal.h
tcg: Add temp_subindex to TCGTemp
tcg: Simplify calls to temp_sync vs mem_coherent
tcg: Allocate TCGTemp pairs in host memory order
tcg: Move TCG_TYPE_COUNT outside enum
tcg: Introduce tcg_type_size
tcg: Introduce TCGCallReturnKind and TCGCallArgumentKind
tcg: Replace TCG_TARGET_CALL_ALIGN_ARGS with TCG_TARGET_CALL_ARG_I64
tcg: Replace TCG_TARGET_EXTEND_ARGS with TCG_TARGET_CALL_ARG_I32
tcg: Use TCG_CALL_ARG_EVEN for TCI special case
accel/tcg/plugin: Don't search for the function pointer index
accel/tcg/plugin: Avoid duplicate copy in copy_call
accel/tcg/plugin: Use copy_op in append_{udata,mem}_cb
tci: MAX_OPC_PARAM_IARGS is no longer used
tcg: Vary the allocation size for TCGOp
tcg: Use output_pref wrapper function
tcg: Reorg function calls
tcg: Move ffi_cif pointer into TCGHelperInfo
tcg/aarch64: Merge tcg_out_callr into tcg_out_call
tcg: Add TCGHelperInfo argument to tcg_out_call
tcg: Define TCG_TYPE_I128 and related helper macros
tcg: Handle dh_typecode_i128 with TCG_CALL_{RET,ARG}_NORMAL
tcg: Allocate objects contiguously in temp_allocate_frame
tcg: Introduce tcg_out_addi_ptr
tcg: Add TCG_CALL_{RET,ARG}_BY_REF
tcg: Introduce tcg_target_call_oarg_reg
tcg: Add TCG_CALL_RET_BY_VEC
include/qemu/int128: Use Int128 structure for TCI
tcg/i386: Add TCG_TARGET_CALL_{RET,ARG}_I128
tcg/tci: Fix big-endian return register ordering
tcg/tci: Add TCG_TARGET_CALL_{RET,ARG}_I128
tcg: Add TCG_TARGET_CALL_{RET,ARG}_I128
tcg: Add temp allocation for TCGv_i128
tcg: Add basic data movement for TCGv_i128
tcg: Add guest load/store primitives for TCGv_i128
tcg: Add tcg_gen_{non}atomic_cmpxchg_i128
tcg: Split out tcg_gen_nonatomic_cmpxchg_i{32,64}
accel/tcg/tcg-runtime.h | 11 +
include/exec/cpu_ldst.h | 10 +
include/exec/helper-head.h | 9 +-
include/qemu/atomic128.h | 29 +-
include/qemu/int128.h | 25 +-
include/tcg/tcg-op.h | 50 +-
include/tcg/tcg.h | 145 ++-
tcg/aarch64/tcg-target.h | 6 +-
tcg/arm/tcg-target-con-set.h | 7 +-
tcg/arm/tcg-target-con-str.h | 2 +
tcg/arm/tcg-target.h | 6 +-
tcg/i386/tcg-target.h | 12 +
tcg/loongarch64/tcg-target.h | 5 +-
tcg/mips/tcg-target.h | 6 +-
tcg/riscv/tcg-target.h | 10 +-
tcg/s390x/tcg-target-con-set.h | 4 +-
tcg/s390x/tcg-target-con-str.h | 8 +-
tcg/s390x/tcg-target.h | 5 +-
tcg/sparc64/tcg-target.h | 5 +-
tcg/tcg-internal.h | 75 +-
tcg/tci/tcg-target.h | 10 +
accel/tcg/cputlb.c | 112 ++
accel/tcg/plugin-gen.c | 54 +-
accel/tcg/user-exec.c | 66 ++
hw/core/cpu-common.c | 1 +
target/sparc/translate.c | 21 +-
tcg/optimize.c | 10 +-
tcg/tcg-op-vec.c | 10 +-
tcg/tcg-op.c | 442 ++++++--
tcg/tcg.c | 1679 +++++++++++++++++++++---------
tcg/tci.c | 66 +-
util/int128.c | 42 +
accel/tcg/atomic_common.c.inc | 45 +
meson.build | 4 +-
tcg/aarch64/tcg-target.c.inc | 36 +-
tcg/arm/tcg-target.c.inc | 68 +-
tcg/i386/tcg-target.c.inc | 57 +-
tcg/loongarch64/tcg-target.c.inc | 24 +-
tcg/mips/tcg-target.c.inc | 20 +-
tcg/ppc/tcg-target.c.inc | 56 +-
tcg/riscv/tcg-target.c.inc | 24 +-
tcg/s390x/tcg-target.c.inc | 71 +-
tcg/sparc64/tcg-target.c.inc | 22 +-
tcg/tci/tcg-target.c.inc | 36 +-
44 files changed, 2506 insertions(+), 900 deletions(-)
--
2.34.1
Thread overview: 64+ messages
2022-11-11 7:40 Richard Henderson [this message]
2022-11-11 7:40 ` [PATCH for-8.0 v3 01/45] meson: Move CONFIG_TCG_INTERPRETER to config_host Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 02/45] tcg: Tidy tcg_reg_alloc_op Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 03/45] tcg: Introduce paired register allocation Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 04/45] tcg/s390x: Use register pair allocation for div and mulu2 Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 05/45] tcg/arm: Use register pair allocation for qemu_{ld,st}_i64 Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 06/45] tcg: Remove TCG_TARGET_STACK_GROWSUP Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 07/45] accel/tcg: Set cflags_next_tb in cpu_common_initfn Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 08/45] target/sparc: Avoid TCGV_{LOW,HIGH} Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 09/45] tcg: Move TCG_{LOW,HIGH} to tcg-internal.h Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 10/45] tcg: Add temp_subindex to TCGTemp Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 11/45] tcg: Simplify calls to temp_sync vs mem_coherent Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 12/45] tcg: Allocate TCGTemp pairs in host memory order Richard Henderson
2022-11-22 11:25 ` Philippe Mathieu-Daudé
2022-11-11 7:40 ` [PATCH for-8.0 v3 13/45] tcg: Move TCG_TYPE_COUNT outside enum Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 14/45] tcg: Introduce tcg_type_size Richard Henderson
2022-11-22 11:30 ` Philippe Mathieu-Daudé
2022-11-22 16:54 ` Richard Henderson
2022-11-22 18:14 ` Philippe Mathieu-Daudé
2022-11-22 18:15 ` Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 15/45] tcg: Introduce TCGCallReturnKind and TCGCallArgumentKind Richard Henderson
2022-11-22 11:33 ` Philippe Mathieu-Daudé
2022-11-11 7:40 ` [PATCH for-8.0 v3 16/45] tcg: Replace TCG_TARGET_CALL_ALIGN_ARGS with TCG_TARGET_CALL_ARG_I64 Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 17/45] tcg: Replace TCG_TARGET_EXTEND_ARGS with TCG_TARGET_CALL_ARG_I32 Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 18/45] tcg: Use TCG_CALL_ARG_EVEN for TCI special case Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 19/45] accel/tcg/plugin: Don't search for the function pointer index Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 20/45] accel/tcg/plugin: Avoid duplicate copy in copy_call Richard Henderson
2022-11-22 15:21 ` Philippe Mathieu-Daudé
2022-11-11 7:40 ` [PATCH for-8.0 v3 21/45] accel/tcg/plugin: Use copy_op in append_{udata,mem}_cb Richard Henderson
2022-11-22 15:22 ` Philippe Mathieu-Daudé
2022-11-11 7:40 ` [PATCH for-8.0 v3 22/45] tci: MAX_OPC_PARAM_IARGS is no longer used Richard Henderson
2022-11-22 15:25 ` Philippe Mathieu-Daudé
2022-11-11 7:40 ` [PATCH for-8.0 v3 23/45] tcg: Vary the allocation size for TCGOp Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 24/45] tcg: Use output_pref wrapper function Richard Henderson
2022-11-22 15:28 ` Philippe Mathieu-Daudé
2022-11-11 7:40 ` [PATCH for-8.0 v3 25/45] tcg: Reorg function calls Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 26/45] tcg: Move ffi_cif pointer into TCGHelperInfo Richard Henderson
2022-11-22 18:08 ` [PATCH 0/3] tcg: Move ffi_cif pointer into TCGHelperInfo (splitted) Philippe Mathieu-Daudé
2022-11-22 18:08 ` [PATCH 1/3] tcg: Convert typecode_to_ffi from array to function Philippe Mathieu-Daudé
2022-11-22 18:08 ` [PATCH 2/3] tcg: Factor init_ffi_layouts() out of tcg_context_init() Philippe Mathieu-Daudé
2022-11-22 18:08 ` [PATCH 3/3] tcg: Move ffi_cif pointer into TCGHelperInfo Philippe Mathieu-Daudé
2022-11-23 16:22 ` Philippe Mathieu-Daudé
2022-11-11 7:40 ` [PATCH for-8.0 v3 27/45] tcg/aarch64: Merge tcg_out_callr into tcg_out_call Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 28/45] tcg: Add TCGHelperInfo argument to tcg_out_call Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 29/45] tcg: Define TCG_TYPE_I128 and related helper macros Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 30/45] tcg: Handle dh_typecode_i128 with TCG_CALL_{RET,ARG}_NORMAL Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 31/45] tcg: Allocate objects contiguously in temp_allocate_frame Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 32/45] tcg: Introduce tcg_out_addi_ptr Richard Henderson
2022-11-22 9:45 ` Daniel Henrique Barboza
2022-11-11 7:40 ` [PATCH for-8.0 v3 33/45] tcg: Add TCG_CALL_{RET,ARG}_BY_REF Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 34/45] tcg: Introduce tcg_target_call_oarg_reg Richard Henderson
2022-11-22 9:41 ` Daniel Henrique Barboza
2022-11-11 7:40 ` [PATCH for-8.0 v3 35/45] tcg: Add TCG_CALL_RET_BY_VEC Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 36/45] include/qemu/int128: Use Int128 structure for TCI Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 37/45] tcg/i386: Add TCG_TARGET_CALL_{RET,ARG}_I128 Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 38/45] tcg/tci: Fix big-endian return register ordering Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 39/45] tcg/tci: Add TCG_TARGET_CALL_{RET,ARG}_I128 Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 40/45] tcg: " Richard Henderson
2022-11-22 9:47 ` Daniel Henrique Barboza
2022-11-11 7:40 ` [PATCH for-8.0 v3 41/45] tcg: Add temp allocation for TCGv_i128 Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 42/45] tcg: Add basic data movement " Richard Henderson
2022-11-11 7:40 ` [PATCH for-8.0 v3 43/45] tcg: Add guest load/store primitives " Richard Henderson
2022-11-11 7:41 ` [PATCH for-8.0 v3 44/45] tcg: Add tcg_gen_{non}atomic_cmpxchg_i128 Richard Henderson
2022-11-11 7:41 ` [PATCH for-8.0 v3 45/45] tcg: Split out tcg_gen_nonatomic_cmpxchg_i{32,64} Richard Henderson