* [PATCH v4 00/54] tcg: Simplify calls to load/store helpers
From: Richard Henderson @ 2023-05-03 6:56 UTC
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
v1: https://lore.kernel.org/qemu-devel/20230408024314.3357414-1-richard.henderson@linaro.org/
v2: https://lore.kernel.org/qemu-devel/20230411010512.5375-1-richard.henderson@linaro.org/
v3: https://lore.kernel.org/qemu-devel/20230424054105.1579315-1-richard.henderson@linaro.org/
There are several changes to the load/store helpers coming, and making
sure that those changes are properly reflected across all of the
backends has proven harrowing.
I have gone back and restarted by hoisting the code out of the backends
and into tcg.c. We already have all of the parameters for the host
function call ABI for "normal" helpers; we simply need to apply that to
the load/store slow paths.
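To give a concrete feel for where this lands: each backend grows a
prepare_host_addr() that performs the TLB lookup (or the user-only
alignment check) and returns a TCGLabelQemuLdst describing any slow
path required, so a backend's fast path shrinks to roughly

    ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
    tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, get_memop(oi));

(taken from the i386 conversion below), while the slow-path argument
marshalling itself moves behind tcg_out_{ld,st}_helper_args in tcg.c
("tcg: Add routines for calling slow-path helpers").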
No major changes for v4. A few patches upstreamed, and one new one
based on Phil's review.
r~
Richard Henderson (54):
tcg/i386: Rationalize args to tcg_out_qemu_{ld,st}
tcg/i386: Generalize multi-part load overlap test
tcg/i386: Introduce HostAddress
tcg/i386: Drop r0+r1 local variables from tcg_out_tlb_load
tcg/i386: Introduce tcg_out_testi
tcg/i386: Introduce prepare_host_addr
tcg/i386: Use indexed addressing for softmmu fast path
tcg/aarch64: Rationalize args to tcg_out_qemu_{ld,st}
tcg/aarch64: Introduce HostAddress
tcg/aarch64: Introduce prepare_host_addr
tcg/arm: Rationalize args to tcg_out_qemu_{ld,st}
tcg/arm: Introduce HostAddress
tcg/arm: Introduce prepare_host_addr
tcg/loongarch64: Rationalize args to tcg_out_qemu_{ld,st}
tcg/loongarch64: Introduce HostAddress
tcg/loongarch64: Introduce prepare_host_addr
tcg/mips: Rationalize args to tcg_out_qemu_{ld,st}
tcg/mips: Introduce prepare_host_addr
tcg/ppc: Rationalize args to tcg_out_qemu_{ld,st}
tcg/ppc: Introduce HostAddress
tcg/ppc: Introduce prepare_host_addr
tcg/riscv: Require TCG_TARGET_REG_BITS == 64
tcg/riscv: Rationalize args to tcg_out_qemu_{ld,st}
tcg/riscv: Introduce prepare_host_addr
tcg/s390x: Pass TCGType to tcg_out_qemu_{ld,st}
tcg/s390x: Introduce HostAddress
tcg/s390x: Introduce prepare_host_addr
tcg/sparc64: Drop is_64 test from tcg_out_qemu_ld data return
tcg/sparc64: Pass TCGType to tcg_out_qemu_{ld,st}
tcg: Move TCGLabelQemuLdst to tcg.c
tcg: Replace REG_P with arg_loc_reg_p
tcg: Introduce arg_slot_stk_ofs
tcg: Widen helper_*_st[bw]_mmu val arguments
tcg: Add routines for calling slow-path helpers
tcg/i386: Convert tcg_out_qemu_ld_slow_path
tcg/i386: Convert tcg_out_qemu_st_slow_path
tcg/aarch64: Convert tcg_out_qemu_{ld,st}_slow_path
tcg/arm: Convert tcg_out_qemu_{ld,st}_slow_path
tcg/loongarch64: Convert tcg_out_qemu_{ld,st}_slow_path
tcg/mips: Convert tcg_out_qemu_{ld,st}_slow_path
tcg/ppc: Convert tcg_out_qemu_{ld,st}_slow_path
tcg/riscv: Convert tcg_out_qemu_{ld,st}_slow_path
tcg/s390x: Convert tcg_out_qemu_{ld,st}_slow_path
tcg/loongarch64: Simplify constraints on qemu_ld/st
tcg/mips: Remove MO_BSWAP handling
tcg/mips: Reorg tlb load within prepare_host_addr
tcg/mips: Simplify constraints on qemu_ld/st
tcg/ppc: Reorg tcg_out_tlb_read
tcg/ppc: Adjust constraints on qemu_ld/st
tcg/ppc: Remove unused constraints A, B, C, D
tcg/ppc: Remove unused constraint J
tcg/riscv: Simplify constraints on qemu_ld/st
tcg/s390x: Use ALGFR in constructing softmmu host address
tcg/s390x: Simplify constraints on qemu_ld/st
include/tcg/tcg-ldst.h | 10 +-
tcg/loongarch64/tcg-target-con-set.h | 2 -
tcg/loongarch64/tcg-target-con-str.h | 1 -
tcg/mips/tcg-target-con-set.h | 13 +-
tcg/mips/tcg-target-con-str.h | 2 -
tcg/mips/tcg-target.h | 4 +-
tcg/ppc/tcg-target-con-set.h | 11 +-
tcg/ppc/tcg-target-con-str.h | 7 -
tcg/riscv/tcg-target-con-set.h | 10 -
tcg/riscv/tcg-target-con-str.h | 1 -
tcg/riscv/tcg-target.h | 22 +-
tcg/s390x/tcg-target-con-set.h | 2 -
tcg/s390x/tcg-target-con-str.h | 1 -
tcg/tcg-internal.h | 4 -
accel/tcg/cputlb.c | 6 +-
tcg/tcg.c | 514 ++++++++++++++-
tcg/aarch64/tcg-target.c.inc | 363 +++++------
tcg/arm/tcg-target.c.inc | 718 ++++++++------------
tcg/i386/tcg-target.c.inc | 700 +++++++++-----------
tcg/loongarch64/tcg-target.c.inc | 372 ++++-------
tcg/mips/tcg-target.c.inc | 942 ++++++++-------------------
tcg/ppc/tcg-target.c.inc | 640 ++++++++----------
tcg/riscv/tcg-target.c.inc | 534 +++++----------
tcg/s390x/tcg-target.c.inc | 393 +++++------
tcg/sparc64/tcg-target.c.inc | 8 +-
tcg/tcg-ldst.c.inc | 14 -
26 files changed, 2340 insertions(+), 2954 deletions(-)
--
2.34.1
* [PATCH v4 01/54] tcg/i386: Rationalize args to tcg_out_qemu_{ld,st}
From: Richard Henderson @ 2023-05-03 6:56 UTC
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Interpret the variable argument placement in the caller. Pass data_type
instead of is64 -- there are several places where we already convert back
from bool to type. Clean things up by using type throughout.
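The new signature (as in the hunk below) is:

    static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
                                TCGReg addrlo, TCGReg addrhi,
                                MemOpIdx oi, TCGType data_type);

and the caller now decodes the variable argument layout, e.g. for a
64-bit host:

    case INDEX_op_qemu_ld_i64:
        tcg_out_qemu_ld(s, a0, -1, a1, -1, a2, TCG_TYPE_I64);
        break;

with -1 standing in for the unused high halves.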
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/i386/tcg-target.c.inc | 111 +++++++++++++++++---------------------
1 file changed, 50 insertions(+), 61 deletions(-)
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index caf91a3151..cfa2349b03 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1884,8 +1884,8 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
* Record the context of a call to the out of line helper code for the slow path
* for a load or store, so that we can later generate the correct helper code
*/
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64,
- MemOpIdx oi,
+static void add_qemu_ldst_label(TCGContext *s, bool is_ld,
+ TCGType type, MemOpIdx oi,
TCGReg datalo, TCGReg datahi,
TCGReg addrlo, TCGReg addrhi,
tcg_insn_unit *raddr,
@@ -1895,7 +1895,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64,
label->is_ld = is_ld;
label->oi = oi;
- label->type = is_64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
+ label->type = type;
label->datalo_reg = datalo;
label->datahi_reg = datahi;
label->addrlo_reg = addrlo;
@@ -2152,11 +2152,10 @@ static inline int setup_guest_base_seg(void)
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
TCGReg base, int index, intptr_t ofs,
- int seg, bool is64, MemOp memop)
+ int seg, TCGType type, MemOp memop)
{
- TCGType type = is64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
bool use_movbe = false;
- int rexw = is64 * P_REXW;
+ int rexw = (type == TCG_TYPE_I32 ? 0 : P_REXW);
int movop = OPC_MOVL_GvEv;
/* Do big-endian loads with movbe. */
@@ -2246,50 +2245,34 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
}
}
-/* XXX: qemu_ld and qemu_st could be modified to clobber only EDX and
- EAX. It will be useful once fixed registers globals are less
- common. */
-static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
+static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
+ TCGReg addrlo, TCGReg addrhi,
+ MemOpIdx oi, TCGType data_type)
{
- TCGReg datalo, datahi, addrlo;
- TCGReg addrhi __attribute__((unused));
- MemOpIdx oi;
- MemOp opc;
+ MemOp opc = get_memop(oi);
+
#if defined(CONFIG_SOFTMMU)
- int mem_index;
tcg_insn_unit *label_ptr[2];
-#else
- unsigned a_bits;
-#endif
- datalo = *args++;
- datahi = (TCG_TARGET_REG_BITS == 32 && is64 ? *args++ : 0);
- addrlo = *args++;
- addrhi = (TARGET_LONG_BITS > TCG_TARGET_REG_BITS ? *args++ : 0);
- oi = *args++;
- opc = get_memop(oi);
-
-#if defined(CONFIG_SOFTMMU)
- mem_index = get_mmuidx(oi);
-
- tcg_out_tlb_load(s, addrlo, addrhi, mem_index, opc,
+ tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc,
label_ptr, offsetof(CPUTLBEntry, addr_read));
/* TLB Hit. */
- tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, is64, opc);
+ tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1,
+ -1, 0, 0, data_type, opc);
/* Record the current context of a load into ldst label */
- add_qemu_ldst_label(s, true, is64, oi, datalo, datahi, addrlo, addrhi,
- s->code_ptr, label_ptr);
+ add_qemu_ldst_label(s, true, data_type, oi, datalo, datahi,
+ addrlo, addrhi, s->code_ptr, label_ptr);
#else
- a_bits = get_alignment_bits(opc);
+ unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
}
tcg_out_qemu_ld_direct(s, datalo, datahi, addrlo, x86_guest_base_index,
x86_guest_base_offset, x86_guest_base_seg,
- is64, opc);
+ data_type, opc);
#endif
}
@@ -2345,40 +2328,26 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
}
}
-static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
+static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
+ TCGReg addrlo, TCGReg addrhi,
+ MemOpIdx oi, TCGType data_type)
{
- TCGReg datalo, datahi, addrlo;
- TCGReg addrhi __attribute__((unused));
- MemOpIdx oi;
- MemOp opc;
+ MemOp opc = get_memop(oi);
+
#if defined(CONFIG_SOFTMMU)
- int mem_index;
tcg_insn_unit *label_ptr[2];
-#else
- unsigned a_bits;
-#endif
- datalo = *args++;
- datahi = (TCG_TARGET_REG_BITS == 32 && is64 ? *args++ : 0);
- addrlo = *args++;
- addrhi = (TARGET_LONG_BITS > TCG_TARGET_REG_BITS ? *args++ : 0);
- oi = *args++;
- opc = get_memop(oi);
-
-#if defined(CONFIG_SOFTMMU)
- mem_index = get_mmuidx(oi);
-
- tcg_out_tlb_load(s, addrlo, addrhi, mem_index, opc,
+ tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc,
label_ptr, offsetof(CPUTLBEntry, addr_write));
/* TLB Hit. */
tcg_out_qemu_st_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, opc);
/* Record the current context of a store into ldst label */
- add_qemu_ldst_label(s, false, is64, oi, datalo, datahi, addrlo, addrhi,
- s->code_ptr, label_ptr);
+ add_qemu_ldst_label(s, false, data_type, oi, datalo, datahi,
+ addrlo, addrhi, s->code_ptr, label_ptr);
#else
- a_bits = get_alignment_bits(opc);
+ unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
}
@@ -2673,17 +2642,37 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
break;
case INDEX_op_qemu_ld_i32:
- tcg_out_qemu_ld(s, args, 0);
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
+ tcg_out_qemu_ld(s, a0, -1, a1, -1, a2, TCG_TYPE_I32);
+ } else {
+ tcg_out_qemu_ld(s, a0, -1, a1, a2, args[3], TCG_TYPE_I32);
+ }
break;
case INDEX_op_qemu_ld_i64:
- tcg_out_qemu_ld(s, args, 1);
+ if (TCG_TARGET_REG_BITS == 64) {
+ tcg_out_qemu_ld(s, a0, -1, a1, -1, a2, TCG_TYPE_I64);
+ } else if (TARGET_LONG_BITS == 32) {
+ tcg_out_qemu_ld(s, a0, a1, a2, -1, args[3], TCG_TYPE_I64);
+ } else {
+ tcg_out_qemu_ld(s, a0, a1, a2, args[3], args[4], TCG_TYPE_I64);
+ }
break;
case INDEX_op_qemu_st_i32:
case INDEX_op_qemu_st8_i32:
- tcg_out_qemu_st(s, args, 0);
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
+ tcg_out_qemu_st(s, a0, -1, a1, -1, a2, TCG_TYPE_I32);
+ } else {
+ tcg_out_qemu_st(s, a0, -1, a1, a2, args[3], TCG_TYPE_I32);
+ }
break;
case INDEX_op_qemu_st_i64:
- tcg_out_qemu_st(s, args, 1);
+ if (TCG_TARGET_REG_BITS == 64) {
+ tcg_out_qemu_st(s, a0, -1, a1, -1, a2, TCG_TYPE_I64);
+ } else if (TARGET_LONG_BITS == 32) {
+ tcg_out_qemu_st(s, a0, a1, a2, -1, args[3], TCG_TYPE_I64);
+ } else {
+ tcg_out_qemu_st(s, a0, a1, a2, args[3], args[4], TCG_TYPE_I64);
+ }
break;
OP_32_64(mulu2):
--
2.34.1
* [PATCH v4 02/54] tcg/i386: Generalize multi-part load overlap test
From: Richard Henderson @ 2023-05-03 6:56 UTC
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Test for both base and index; use datahi as a temporary, overwritten
by the final load. Always perform the loads in ascending order, so
that any (user-only) fault sees the correct address.
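The overlapping case now reads, roughly (the emitted x86 sketched as
assembly; register names are symbolic):

    lea  ofs(base,index), datahi   # materialize the address; datahi is free
    mov  0(datahi), datalo         # low word first
    mov  4(datahi), datahi         # final load overwrites the temp

so a (user-only) fault is reported for the start of the access.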
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/i386/tcg-target.c.inc | 31 +++++++++++++++----------------
1 file changed, 15 insertions(+), 16 deletions(-)
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index cfa2349b03..173f3c3172 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -2221,23 +2221,22 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
if (TCG_TARGET_REG_BITS == 64) {
tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo,
base, index, 0, ofs);
+ break;
+ }
+ if (use_movbe) {
+ TCGReg t = datalo;
+ datalo = datahi;
+ datahi = t;
+ }
+ if (base == datalo || index == datalo) {
+ tcg_out_modrm_sib_offset(s, OPC_LEA, datahi, base, index, 0, ofs);
+ tcg_out_modrm_offset(s, movop + seg, datalo, datahi, 0);
+ tcg_out_modrm_offset(s, movop + seg, datahi, datahi, 4);
} else {
- if (use_movbe) {
- TCGReg t = datalo;
- datalo = datahi;
- datahi = t;
- }
- if (base != datalo) {
- tcg_out_modrm_sib_offset(s, movop + seg, datalo,
- base, index, 0, ofs);
- tcg_out_modrm_sib_offset(s, movop + seg, datahi,
- base, index, 0, ofs + 4);
- } else {
- tcg_out_modrm_sib_offset(s, movop + seg, datahi,
- base, index, 0, ofs + 4);
- tcg_out_modrm_sib_offset(s, movop + seg, datalo,
- base, index, 0, ofs);
- }
+ tcg_out_modrm_sib_offset(s, movop + seg, datalo,
+ base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, movop + seg, datahi,
+ base, index, 0, ofs + 4);
}
break;
default:
--
2.34.1
* [PATCH v4 03/54] tcg/i386: Introduce HostAddress
From: Richard Henderson @ 2023-05-03 6:56 UTC
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Collect the 4 potential parts of the host address into a struct.
Reorg tcg_out_qemu_{ld,st}_direct to use it.
Reorg guest_base handling to use it.
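With the struct, the softmmu fast path becomes, e.g. (matching the
hunk below):

    HostAddress h = {
        .base = TCG_REG_L1,
        .index = -1,
        .ofs = 0,
        .seg = 0,
    };
    tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, opc);

and the user-only path copies x86_guest_base and overrides .base with
the address register.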
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/i386/tcg-target.c.inc | 165 +++++++++++++++++++++-----------------
1 file changed, 90 insertions(+), 75 deletions(-)
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 173f3c3172..909eecd4a3 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1751,6 +1751,13 @@ static void tcg_out_nopn(TCGContext *s, int n)
tcg_out8(s, 0x90);
}
+typedef struct {
+ TCGReg base;
+ int index;
+ int ofs;
+ int seg;
+} HostAddress;
+
#if defined(CONFIG_SOFTMMU)
/* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
* int mmu_idx, uintptr_t ra)
@@ -2113,17 +2120,13 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
return tcg_out_fail_alignment(s, l);
}
-#if TCG_TARGET_REG_BITS == 32
-# define x86_guest_base_seg 0
-# define x86_guest_base_index -1
-# define x86_guest_base_offset guest_base
-#else
-static int x86_guest_base_seg;
-static int x86_guest_base_index = -1;
-static int32_t x86_guest_base_offset;
-# if defined(__x86_64__) && defined(__linux__)
-# include <asm/prctl.h>
-# include <sys/prctl.h>
+static HostAddress x86_guest_base = {
+ .index = -1
+};
+
+#if defined(__x86_64__) && defined(__linux__)
+# include <asm/prctl.h>
+# include <sys/prctl.h>
int arch_prctl(int code, unsigned long addr);
static inline int setup_guest_base_seg(void)
{
@@ -2132,8 +2135,9 @@ static inline int setup_guest_base_seg(void)
}
return 0;
}
-# elif defined (__FreeBSD__) || defined (__FreeBSD_kernel__)
-# include <machine/sysarch.h>
+#elif defined(__x86_64__) && \
+ (defined (__FreeBSD__) || defined (__FreeBSD_kernel__))
+# include <machine/sysarch.h>
static inline int setup_guest_base_seg(void)
{
if (sysarch(AMD64_SET_GSBASE, &guest_base) == 0) {
@@ -2141,18 +2145,16 @@ static inline int setup_guest_base_seg(void)
}
return 0;
}
-# else
+#else
static inline int setup_guest_base_seg(void)
{
return 0;
}
-# endif
-#endif
+#endif /* setup_guest_base_seg */
#endif /* SOFTMMU */
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
- TCGReg base, int index, intptr_t ofs,
- int seg, TCGType type, MemOp memop)
+ HostAddress h, TCGType type, MemOp memop)
{
bool use_movbe = false;
int rexw = (type == TCG_TYPE_I32 ? 0 : P_REXW);
@@ -2167,60 +2169,61 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
switch (memop & MO_SSIZE) {
case MO_UB:
- tcg_out_modrm_sib_offset(s, OPC_MOVZBL + seg, datalo,
- base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, OPC_MOVZBL + h.seg, datalo,
+ h.base, h.index, 0, h.ofs);
break;
case MO_SB:
- tcg_out_modrm_sib_offset(s, OPC_MOVSBL + rexw + seg, datalo,
- base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, OPC_MOVSBL + rexw + h.seg, datalo,
+ h.base, h.index, 0, h.ofs);
break;
case MO_UW:
if (use_movbe) {
/* There is no extending movbe; only low 16-bits are modified. */
- if (datalo != base && datalo != index) {
+ if (datalo != h.base && datalo != h.index) {
/* XOR breaks dependency chains. */
tgen_arithr(s, ARITH_XOR, datalo, datalo);
- tcg_out_modrm_sib_offset(s, OPC_MOVBE_GyMy + P_DATA16 + seg,
- datalo, base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, OPC_MOVBE_GyMy + P_DATA16 + h.seg,
+ datalo, h.base, h.index, 0, h.ofs);
} else {
- tcg_out_modrm_sib_offset(s, OPC_MOVBE_GyMy + P_DATA16 + seg,
- datalo, base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, OPC_MOVBE_GyMy + P_DATA16 + h.seg,
+ datalo, h.base, h.index, 0, h.ofs);
tcg_out_ext16u(s, datalo, datalo);
}
} else {
- tcg_out_modrm_sib_offset(s, OPC_MOVZWL + seg, datalo,
- base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, OPC_MOVZWL + h.seg, datalo,
+ h.base, h.index, 0, h.ofs);
}
break;
case MO_SW:
if (use_movbe) {
- tcg_out_modrm_sib_offset(s, OPC_MOVBE_GyMy + P_DATA16 + seg,
- datalo, base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, OPC_MOVBE_GyMy + P_DATA16 + h.seg,
+ datalo, h.base, h.index, 0, h.ofs);
tcg_out_ext16s(s, type, datalo, datalo);
} else {
- tcg_out_modrm_sib_offset(s, OPC_MOVSWL + rexw + seg,
- datalo, base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, OPC_MOVSWL + rexw + h.seg,
+ datalo, h.base, h.index, 0, h.ofs);
}
break;
case MO_UL:
- tcg_out_modrm_sib_offset(s, movop + seg, datalo, base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, movop + h.seg, datalo,
+ h.base, h.index, 0, h.ofs);
break;
#if TCG_TARGET_REG_BITS == 64
case MO_SL:
if (use_movbe) {
- tcg_out_modrm_sib_offset(s, OPC_MOVBE_GyMy + seg, datalo,
- base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, OPC_MOVBE_GyMy + h.seg, datalo,
+ h.base, h.index, 0, h.ofs);
tcg_out_ext32s(s, datalo, datalo);
} else {
- tcg_out_modrm_sib_offset(s, OPC_MOVSLQ + seg, datalo,
- base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, OPC_MOVSLQ + h.seg, datalo,
+ h.base, h.index, 0, h.ofs);
}
break;
#endif
case MO_UQ:
if (TCG_TARGET_REG_BITS == 64) {
- tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo,
- base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, movop + P_REXW + h.seg, datalo,
+ h.base, h.index, 0, h.ofs);
break;
}
if (use_movbe) {
@@ -2228,15 +2231,16 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
datalo = datahi;
datahi = t;
}
- if (base == datalo || index == datalo) {
- tcg_out_modrm_sib_offset(s, OPC_LEA, datahi, base, index, 0, ofs);
- tcg_out_modrm_offset(s, movop + seg, datalo, datahi, 0);
- tcg_out_modrm_offset(s, movop + seg, datahi, datahi, 4);
+ if (h.base == datalo || h.index == datalo) {
+ tcg_out_modrm_sib_offset(s, OPC_LEA, datahi,
+ h.base, h.index, 0, h.ofs);
+ tcg_out_modrm_offset(s, movop + h.seg, datalo, datahi, 0);
+ tcg_out_modrm_offset(s, movop + h.seg, datahi, datahi, 4);
} else {
- tcg_out_modrm_sib_offset(s, movop + seg, datalo,
- base, index, 0, ofs);
- tcg_out_modrm_sib_offset(s, movop + seg, datahi,
- base, index, 0, ofs + 4);
+ tcg_out_modrm_sib_offset(s, movop + h.seg, datalo,
+ h.base, h.index, 0, h.ofs);
+ tcg_out_modrm_sib_offset(s, movop + h.seg, datahi,
+ h.base, h.index, 0, h.ofs + 4);
}
break;
default:
@@ -2249,6 +2253,7 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
+ HostAddress h;
#if defined(CONFIG_SOFTMMU)
tcg_insn_unit *label_ptr[2];
@@ -2257,8 +2262,11 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
label_ptr, offsetof(CPUTLBEntry, addr_read));
/* TLB Hit. */
- tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1,
- -1, 0, 0, data_type, opc);
+ h.base = TCG_REG_L1;
+ h.index = -1;
+ h.ofs = 0;
+ h.seg = 0;
+ tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, opc);
/* Record the current context of a load into ldst label */
add_qemu_ldst_label(s, true, data_type, oi, datalo, datahi,
@@ -2269,15 +2277,14 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
}
- tcg_out_qemu_ld_direct(s, datalo, datahi, addrlo, x86_guest_base_index,
- x86_guest_base_offset, x86_guest_base_seg,
- data_type, opc);
+ h = x86_guest_base;
+ h.base = addrlo;
+ tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, opc);
#endif
}
static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
- TCGReg base, int index, intptr_t ofs,
- int seg, MemOp memop)
+ HostAddress h, MemOp memop)
{
bool use_movbe = false;
int movop = OPC_MOVL_EvGv;
@@ -2296,30 +2303,31 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
case MO_8:
/* This is handled with constraints on INDEX_op_qemu_st8_i32. */
tcg_debug_assert(TCG_TARGET_REG_BITS == 64 || datalo < 4);
- tcg_out_modrm_sib_offset(s, OPC_MOVB_EvGv + P_REXB_R + seg,
- datalo, base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, OPC_MOVB_EvGv + P_REXB_R + h.seg,
+ datalo, h.base, h.index, 0, h.ofs);
break;
case MO_16:
- tcg_out_modrm_sib_offset(s, movop + P_DATA16 + seg, datalo,
- base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, movop + P_DATA16 + h.seg, datalo,
+ h.base, h.index, 0, h.ofs);
break;
case MO_32:
- tcg_out_modrm_sib_offset(s, movop + seg, datalo, base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, movop + h.seg, datalo,
+ h.base, h.index, 0, h.ofs);
break;
case MO_64:
if (TCG_TARGET_REG_BITS == 64) {
- tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo,
- base, index, 0, ofs);
+ tcg_out_modrm_sib_offset(s, movop + P_REXW + h.seg, datalo,
+ h.base, h.index, 0, h.ofs);
} else {
if (use_movbe) {
TCGReg t = datalo;
datalo = datahi;
datahi = t;
}
- tcg_out_modrm_sib_offset(s, movop + seg, datalo,
- base, index, 0, ofs);
- tcg_out_modrm_sib_offset(s, movop + seg, datahi,
- base, index, 0, ofs + 4);
+ tcg_out_modrm_sib_offset(s, movop + h.seg, datalo,
+ h.base, h.index, 0, h.ofs);
+ tcg_out_modrm_sib_offset(s, movop + h.seg, datahi,
+ h.base, h.index, 0, h.ofs + 4);
}
break;
default:
@@ -2332,6 +2340,7 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
+ HostAddress h;
#if defined(CONFIG_SOFTMMU)
tcg_insn_unit *label_ptr[2];
@@ -2340,7 +2349,11 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
label_ptr, offsetof(CPUTLBEntry, addr_write));
/* TLB Hit. */
- tcg_out_qemu_st_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, opc);
+ h.base = TCG_REG_L1;
+ h.index = -1;
+ h.ofs = 0;
+ h.seg = 0;
+ tcg_out_qemu_st_direct(s, datalo, datahi, h, opc);
/* Record the current context of a store into ldst label */
add_qemu_ldst_label(s, false, data_type, oi, datalo, datahi,
@@ -2351,8 +2364,10 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
}
- tcg_out_qemu_st_direct(s, datalo, datahi, addrlo, x86_guest_base_index,
- x86_guest_base_offset, x86_guest_base_seg, opc);
+ h = x86_guest_base;
+ h.base = addrlo;
+
+ tcg_out_qemu_st_direct(s, datalo, datahi, h, opc);
#endif
}
@@ -4058,18 +4073,18 @@ static void tcg_target_qemu_prologue(TCGContext *s)
(ARRAY_SIZE(tcg_target_callee_save_regs) + 2) * 4
+ stack_addend);
#else
-# if !defined(CONFIG_SOFTMMU) && TCG_TARGET_REG_BITS == 64
+# if !defined(CONFIG_SOFTMMU)
if (guest_base) {
int seg = setup_guest_base_seg();
if (seg != 0) {
- x86_guest_base_seg = seg;
+ x86_guest_base.seg = seg;
} else if (guest_base == (int32_t)guest_base) {
- x86_guest_base_offset = guest_base;
+ x86_guest_base.ofs = guest_base;
} else {
/* Choose R12 because, as a base, it requires a SIB byte. */
- x86_guest_base_index = TCG_REG_R12;
- tcg_out_movi(s, TCG_TYPE_PTR, x86_guest_base_index, guest_base);
- tcg_regset_set_reg(s->reserved_regs, x86_guest_base_index);
+ x86_guest_base.index = TCG_REG_R12;
+ tcg_out_movi(s, TCG_TYPE_PTR, x86_guest_base.index, guest_base);
+ tcg_regset_set_reg(s->reserved_regs, x86_guest_base.index);
}
}
# endif
--
2.34.1
* [PATCH v4 04/54] tcg/i386: Drop r0+r1 local variables from tcg_out_tlb_load
From: Richard Henderson @ 2023-05-03 6:56 UTC
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Use TCG_REG_L[01] constants directly.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/i386/tcg-target.c.inc | 32 ++++++++++++++++----------------
1 file changed, 16 insertions(+), 16 deletions(-)
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 909eecd4a3..78160f453b 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1810,8 +1810,6 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
int mem_index, MemOp opc,
tcg_insn_unit **label_ptr, int which)
{
- const TCGReg r0 = TCG_REG_L0;
- const TCGReg r1 = TCG_REG_L1;
TCGType ttype = TCG_TYPE_I32;
TCGType tlbtype = TCG_TYPE_I32;
int trexw = 0, hrexw = 0, tlbrexw = 0;
@@ -1835,15 +1833,15 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
}
}
- tcg_out_mov(s, tlbtype, r0, addrlo);
- tcg_out_shifti(s, SHIFT_SHR + tlbrexw, r0,
+ tcg_out_mov(s, tlbtype, TCG_REG_L0, addrlo);
+ tcg_out_shifti(s, SHIFT_SHR + tlbrexw, TCG_REG_L0,
TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
- tcg_out_modrm_offset(s, OPC_AND_GvEv + trexw, r0, TCG_AREG0,
+ tcg_out_modrm_offset(s, OPC_AND_GvEv + trexw, TCG_REG_L0, TCG_AREG0,
TLB_MASK_TABLE_OFS(mem_index) +
offsetof(CPUTLBDescFast, mask));
- tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, r0, TCG_AREG0,
+ tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, TCG_REG_L0, TCG_AREG0,
TLB_MASK_TABLE_OFS(mem_index) +
offsetof(CPUTLBDescFast, table));
@@ -1851,19 +1849,21 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
copy the address and mask. For lesser alignments, check that we don't
cross pages for the complete access. */
if (a_bits >= s_bits) {
- tcg_out_mov(s, ttype, r1, addrlo);
+ tcg_out_mov(s, ttype, TCG_REG_L1, addrlo);
} else {
- tcg_out_modrm_offset(s, OPC_LEA + trexw, r1, addrlo, s_mask - a_mask);
+ tcg_out_modrm_offset(s, OPC_LEA + trexw, TCG_REG_L1,
+ addrlo, s_mask - a_mask);
}
tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
- tgen_arithi(s, ARITH_AND + trexw, r1, tlb_mask, 0);
+ tgen_arithi(s, ARITH_AND + trexw, TCG_REG_L1, tlb_mask, 0);
- /* cmp 0(r0), r1 */
- tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw, r1, r0, which);
+ /* cmp 0(TCG_REG_L0), TCG_REG_L1 */
+ tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw,
+ TCG_REG_L1, TCG_REG_L0, which);
/* Prepare for both the fast path add of the tlb addend, and the slow
path function argument setup. */
- tcg_out_mov(s, ttype, r1, addrlo);
+ tcg_out_mov(s, ttype, TCG_REG_L1, addrlo);
/* jne slow_path */
tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
@@ -1871,8 +1871,8 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
s->code_ptr += 4;
if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
- /* cmp 4(r0), addrhi */
- tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, r0, which + 4);
+ /* cmp 4(TCG_REG_L0), addrhi */
+ tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, TCG_REG_L0, which + 4);
/* jne slow_path */
tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
@@ -1882,8 +1882,8 @@ static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
/* TLB Hit. */
- /* add addend(r0), r1 */
- tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, r1, r0,
+ /* add addend(TCG_REG_L0), TCG_REG_L1 */
+ tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, TCG_REG_L1, TCG_REG_L0,
offsetof(CPUTLBEntry, addend));
}
--
2.34.1
* [PATCH v4 05/54] tcg/i386: Introduce tcg_out_testi
From: Richard Henderson @ 2023-05-03 6:56 UTC
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Split out a helper for choosing testb vs testl.
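With the helper, the user-only alignment check reduces to (as in the
hunk below):

    tcg_out_testi(s, addrlo, a_mask);
    /* jne slow_path */
    tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);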
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/i386/tcg-target.c.inc | 30 ++++++++++++++++++------------
1 file changed, 18 insertions(+), 12 deletions(-)
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 78160f453b..aae698121a 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1751,6 +1751,23 @@ static void tcg_out_nopn(TCGContext *s, int n)
tcg_out8(s, 0x90);
}
+/* Test register R vs immediate bits I, setting Z flag for EQ/NE. */
+static void __attribute__((unused))
+tcg_out_testi(TCGContext *s, TCGReg r, uint32_t i)
+{
+ /*
+ * This is used for testing alignment, so we can usually use testb.
+ * For i686, we have to use testl for %esi/%edi.
+ */
+ if (i <= 0xff && (TCG_TARGET_REG_BITS == 64 || r < 4)) {
+ tcg_out_modrm(s, OPC_GRP3_Eb | P_REXB_RM, EXT3_TESTi, r);
+ tcg_out8(s, i);
+ } else {
+ tcg_out_modrm(s, OPC_GRP3_Ev, EXT3_TESTi, r);
+ tcg_out32(s, i);
+ }
+}
+
typedef struct {
TCGReg base;
int index;
@@ -2051,18 +2068,7 @@ static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
unsigned a_mask = (1 << a_bits) - 1;
TCGLabelQemuLdst *label;
- /*
- * We are expecting a_bits to max out at 7, so we can usually use testb.
- * For i686, we have to use testl for %esi/%edi.
- */
- if (a_mask <= 0xff && (TCG_TARGET_REG_BITS == 64 || addrlo < 4)) {
- tcg_out_modrm(s, OPC_GRP3_Eb | P_REXB_RM, EXT3_TESTi, addrlo);
- tcg_out8(s, a_mask);
- } else {
- tcg_out_modrm(s, OPC_GRP3_Ev, EXT3_TESTi, addrlo);
- tcg_out32(s, a_mask);
- }
-
+ tcg_out_testi(s, addrlo, a_mask);
/* jne slow_path */
tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
--
2.34.1
* [PATCH v4 06/54] tcg/i386: Introduce prepare_host_addr
From: Richard Henderson @ 2023-05-03 6:56 UTC
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Merge tcg_out_tlb_load, add_qemu_ldst_label,
tcg_out_test_alignment, and some code that lived in both
tcg_out_qemu_ld and tcg_out_qemu_st into one function
that returns HostAddress and TCGLabelQemuLdst structures.
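The contract is spelled out in the new block comment: fill in *h with
the host address for the fast path, and return a TCGLabelQemuLdst only
when a slow path is required (NULL in the user-only case with no
alignment check). The callers then only complete the label:

    ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
    tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, get_memop(oi));
    if (ldst) {
        ldst->type = data_type;
        ldst->datalo_reg = datalo;
        ldst->datahi_reg = datahi;
        ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
    }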
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/i386/tcg-target.c.inc | 344 ++++++++++++++++----------------------
1 file changed, 143 insertions(+), 201 deletions(-)
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index aae698121a..237b154194 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1802,135 +1802,6 @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
[MO_BEUQ] = helper_be_stq_mmu,
};
-/* Perform the TLB load and compare.
-
- Inputs:
- ADDRLO and ADDRHI contain the low and high part of the address.
-
- MEM_INDEX and S_BITS are the memory context and log2 size of the load.
-
- WHICH is the offset into the CPUTLBEntry structure of the slot to read.
- This should be offsetof addr_read or addr_write.
-
- Outputs:
- LABEL_PTRS is filled with 1 (32-bit addresses) or 2 (64-bit addresses)
- positions of the displacements of forward jumps to the TLB miss case.
-
- Second argument register is loaded with the low part of the address.
- In the TLB hit case, it has been adjusted as indicated by the TLB
- and so is a host address. In the TLB miss case, it continues to
- hold a guest address.
-
- First argument register is clobbered. */
-
-static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
- int mem_index, MemOp opc,
- tcg_insn_unit **label_ptr, int which)
-{
- TCGType ttype = TCG_TYPE_I32;
- TCGType tlbtype = TCG_TYPE_I32;
- int trexw = 0, hrexw = 0, tlbrexw = 0;
- unsigned a_bits = get_alignment_bits(opc);
- unsigned s_bits = opc & MO_SIZE;
- unsigned a_mask = (1 << a_bits) - 1;
- unsigned s_mask = (1 << s_bits) - 1;
- target_ulong tlb_mask;
-
- if (TCG_TARGET_REG_BITS == 64) {
- if (TARGET_LONG_BITS == 64) {
- ttype = TCG_TYPE_I64;
- trexw = P_REXW;
- }
- if (TCG_TYPE_PTR == TCG_TYPE_I64) {
- hrexw = P_REXW;
- if (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32) {
- tlbtype = TCG_TYPE_I64;
- tlbrexw = P_REXW;
- }
- }
- }
-
- tcg_out_mov(s, tlbtype, TCG_REG_L0, addrlo);
- tcg_out_shifti(s, SHIFT_SHR + tlbrexw, TCG_REG_L0,
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
-
- tcg_out_modrm_offset(s, OPC_AND_GvEv + trexw, TCG_REG_L0, TCG_AREG0,
- TLB_MASK_TABLE_OFS(mem_index) +
- offsetof(CPUTLBDescFast, mask));
-
- tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, TCG_REG_L0, TCG_AREG0,
- TLB_MASK_TABLE_OFS(mem_index) +
- offsetof(CPUTLBDescFast, table));
-
- /* If the required alignment is at least as large as the access, simply
- copy the address and mask. For lesser alignments, check that we don't
- cross pages for the complete access. */
- if (a_bits >= s_bits) {
- tcg_out_mov(s, ttype, TCG_REG_L1, addrlo);
- } else {
- tcg_out_modrm_offset(s, OPC_LEA + trexw, TCG_REG_L1,
- addrlo, s_mask - a_mask);
- }
- tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
- tgen_arithi(s, ARITH_AND + trexw, TCG_REG_L1, tlb_mask, 0);
-
- /* cmp 0(TCG_REG_L0), TCG_REG_L1 */
- tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw,
- TCG_REG_L1, TCG_REG_L0, which);
-
- /* Prepare for both the fast path add of the tlb addend, and the slow
- path function argument setup. */
- tcg_out_mov(s, ttype, TCG_REG_L1, addrlo);
-
- /* jne slow_path */
- tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
- label_ptr[0] = s->code_ptr;
- s->code_ptr += 4;
-
- if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
- /* cmp 4(TCG_REG_L0), addrhi */
- tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, TCG_REG_L0, which + 4);
-
- /* jne slow_path */
- tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
- label_ptr[1] = s->code_ptr;
- s->code_ptr += 4;
- }
-
- /* TLB Hit. */
-
- /* add addend(TCG_REG_L0), TCG_REG_L1 */
- tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, TCG_REG_L1, TCG_REG_L0,
- offsetof(CPUTLBEntry, addend));
-}
-
-/*
- * Record the context of a call to the out of line helper code for the slow path
- * for a load or store, so that we can later generate the correct helper code
- */
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld,
- TCGType type, MemOpIdx oi,
- TCGReg datalo, TCGReg datahi,
- TCGReg addrlo, TCGReg addrhi,
- tcg_insn_unit *raddr,
- tcg_insn_unit **label_ptr)
-{
- TCGLabelQemuLdst *label = new_ldst_label(s);
-
- label->is_ld = is_ld;
- label->oi = oi;
- label->type = type;
- label->datalo_reg = datalo;
- label->datahi_reg = datahi;
- label->addrlo_reg = addrlo;
- label->addrhi_reg = addrhi;
- label->raddr = tcg_splitwx_to_rx(raddr);
- label->label_ptr[0] = label_ptr[0];
- if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
- label->label_ptr[1] = label_ptr[1];
- }
-}
-
/*
* Generate code for the slow path for a load at the end of block
*/
@@ -2061,27 +1932,6 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
return true;
}
#else
-
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
- TCGReg addrhi, unsigned a_bits)
-{
- unsigned a_mask = (1 << a_bits) - 1;
- TCGLabelQemuLdst *label;
-
- tcg_out_testi(s, addrlo, a_mask);
- /* jne slow_path */
- tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
-
- label = new_ldst_label(s);
- label->is_ld = is_ld;
- label->addrlo_reg = addrlo;
- label->addrhi_reg = addrhi;
- label->raddr = tcg_splitwx_to_rx(s->code_ptr + 4);
- label->label_ptr[0] = s->code_ptr;
-
- s->code_ptr += 4;
-}
-
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
{
/* resolve label address */
@@ -2159,6 +2009,133 @@ static inline int setup_guest_base_seg(void)
#endif /* setup_guest_base_seg */
#endif /* SOFTMMU */
+/*
+ * For softmmu, perform the TLB load and compare.
+ * For useronly, perform any required alignment tests.
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
+ * is required and fill in @h with the host address for the fast path.
+ */
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
+ TCGReg addrlo, TCGReg addrhi,
+ MemOpIdx oi, bool is_ld)
+{
+ TCGLabelQemuLdst *ldst = NULL;
+ MemOp opc = get_memop(oi);
+ unsigned a_bits = get_alignment_bits(opc);
+ unsigned a_mask = (1 << a_bits) - 1;
+
+#ifdef CONFIG_SOFTMMU
+ int cmp_ofs = is_ld ? offsetof(CPUTLBEntry, addr_read)
+ : offsetof(CPUTLBEntry, addr_write);
+ TCGType ttype = TCG_TYPE_I32;
+ TCGType tlbtype = TCG_TYPE_I32;
+ int trexw = 0, hrexw = 0, tlbrexw = 0;
+ unsigned mem_index = get_mmuidx(oi);
+ unsigned s_bits = opc & MO_SIZE;
+ unsigned s_mask = (1 << s_bits) - 1;
+ target_ulong tlb_mask;
+
+ ldst = new_ldst_label(s);
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addrlo;
+ ldst->addrhi_reg = addrhi;
+
+ if (TCG_TARGET_REG_BITS == 64) {
+ if (TARGET_LONG_BITS == 64) {
+ ttype = TCG_TYPE_I64;
+ trexw = P_REXW;
+ }
+ if (TCG_TYPE_PTR == TCG_TYPE_I64) {
+ hrexw = P_REXW;
+ if (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32) {
+ tlbtype = TCG_TYPE_I64;
+ tlbrexw = P_REXW;
+ }
+ }
+ }
+
+ tcg_out_mov(s, tlbtype, TCG_REG_L0, addrlo);
+ tcg_out_shifti(s, SHIFT_SHR + tlbrexw, TCG_REG_L0,
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+
+ tcg_out_modrm_offset(s, OPC_AND_GvEv + trexw, TCG_REG_L0, TCG_AREG0,
+ TLB_MASK_TABLE_OFS(mem_index) +
+ offsetof(CPUTLBDescFast, mask));
+
+ tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, TCG_REG_L0, TCG_AREG0,
+ TLB_MASK_TABLE_OFS(mem_index) +
+ offsetof(CPUTLBDescFast, table));
+
+ /* If the required alignment is at least as large as the access, simply
+ copy the address and mask. For lesser alignments, check that we don't
+ cross pages for the complete access. */
+ if (a_bits >= s_bits) {
+ tcg_out_mov(s, ttype, TCG_REG_L1, addrlo);
+ } else {
+ tcg_out_modrm_offset(s, OPC_LEA + trexw, TCG_REG_L1,
+ addrlo, s_mask - a_mask);
+ }
+ tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
+ tgen_arithi(s, ARITH_AND + trexw, TCG_REG_L1, tlb_mask, 0);
+
+ /* cmp 0(TCG_REG_L0), TCG_REG_L1 */
+ tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw,
+ TCG_REG_L1, TCG_REG_L0, cmp_ofs);
+
+ /*
+ * Prepare for both the fast path add of the tlb addend, and the slow
+ * path function argument setup.
+ */
+ *h = (HostAddress) {
+ .base = TCG_REG_L1,
+ .index = -1
+ };
+ tcg_out_mov(s, ttype, h->base, addrlo);
+
+ /* jne slow_path */
+ tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
+ ldst->label_ptr[0] = s->code_ptr;
+ s->code_ptr += 4;
+
+ if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
+ /* cmp 4(TCG_REG_L0), addrhi */
+ tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, TCG_REG_L0, cmp_ofs + 4);
+
+ /* jne slow_path */
+ tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
+ ldst->label_ptr[1] = s->code_ptr;
+ s->code_ptr += 4;
+ }
+
+ /* TLB Hit. */
+
+ /* add addend(TCG_REG_L0), TCG_REG_L1 */
+ tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, h->base, TCG_REG_L0,
+ offsetof(CPUTLBEntry, addend));
+#else
+ if (a_bits) {
+ ldst = new_ldst_label(s);
+
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addrlo;
+ ldst->addrhi_reg = addrhi;
+
+ tcg_out_testi(s, addrlo, a_mask);
+ /* jne slow_path */
+ tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
+ ldst->label_ptr[0] = s->code_ptr;
+ s->code_ptr += 4;
+ }
+
+ *h = x86_guest_base;
+ h->base = addrlo;
+#endif
+
+ return ldst;
+}
+
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
HostAddress h, TCGType type, MemOp memop)
{
@@ -2258,35 +2235,18 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
TCGReg addrlo, TCGReg addrhi,
MemOpIdx oi, TCGType data_type)
{
- MemOp opc = get_memop(oi);
+ TCGLabelQemuLdst *ldst;
HostAddress h;
-#if defined(CONFIG_SOFTMMU)
- tcg_insn_unit *label_ptr[2];
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
+ tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, get_memop(oi));
- tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc,
- label_ptr, offsetof(CPUTLBEntry, addr_read));
-
- /* TLB Hit. */
- h.base = TCG_REG_L1;
- h.index = -1;
- h.ofs = 0;
- h.seg = 0;
- tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, opc);
-
- /* Record the current context of a load into ldst label */
- add_qemu_ldst_label(s, true, data_type, oi, datalo, datahi,
- addrlo, addrhi, s->code_ptr, label_ptr);
-#else
- unsigned a_bits = get_alignment_bits(opc);
- if (a_bits) {
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = datalo;
+ ldst->datahi_reg = datahi;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
}
-
- h = x86_guest_base;
- h.base = addrlo;
- tcg_out_qemu_ld_direct(s, datalo, datahi, h, data_type, opc);
-#endif
}
static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
@@ -2345,36 +2305,18 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
TCGReg addrlo, TCGReg addrhi,
MemOpIdx oi, TCGType data_type)
{
- MemOp opc = get_memop(oi);
+ TCGLabelQemuLdst *ldst;
HostAddress h;
-#if defined(CONFIG_SOFTMMU)
- tcg_insn_unit *label_ptr[2];
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, false);
+ tcg_out_qemu_st_direct(s, datalo, datahi, h, get_memop(oi));
- tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc,
- label_ptr, offsetof(CPUTLBEntry, addr_write));
-
- /* TLB Hit. */
- h.base = TCG_REG_L1;
- h.index = -1;
- h.ofs = 0;
- h.seg = 0;
- tcg_out_qemu_st_direct(s, datalo, datahi, h, opc);
-
- /* Record the current context of a store into ldst label */
- add_qemu_ldst_label(s, false, data_type, oi, datalo, datahi,
- addrlo, addrhi, s->code_ptr, label_ptr);
-#else
- unsigned a_bits = get_alignment_bits(opc);
- if (a_bits) {
- tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = datalo;
+ ldst->datahi_reg = datahi;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
}
-
- h = x86_guest_base;
- h.base = addrlo;
-
- tcg_out_qemu_st_direct(s, datalo, datahi, h, opc);
-#endif
}
static void tcg_out_exit_tb(TCGContext *s, uintptr_t a0)
--
2.34.1
* [PATCH v4 07/54] tcg/i386: Use indexed addressing for softmmu fast path
From: Richard Henderson @ 2023-05-03 6:56 UTC
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Since the introduction of tcg_out_{ld,st}_helper_args, the slow path no
longer requires the address argument to be set up by the TLB load
sequence. Use a plain load for the addend, and indexed addressing with
the original input address register.
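In code (from the hunk below), the fast path now does:

    tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_L0, TCG_REG_L0,
               offsetof(CPUTLBEntry, addend));

    *h = (HostAddress) {
        .base = addrlo,
        .index = TCG_REG_L0,
    };

and the slow path loads the address from l->addrlo_reg itself instead
of relying on TCG_REG_L1 being pre-loaded.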
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/i386/tcg-target.c.inc | 25 ++++++++++---------------
1 file changed, 10 insertions(+), 15 deletions(-)
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 237b154194..8752968af2 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1837,7 +1837,8 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
tcg_out_sti(s, TCG_TYPE_PTR, (uintptr_t)l->raddr, TCG_REG_ESP, ofs);
} else {
tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
- /* The second argument is already loaded with addrlo. */
+ tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
+ l->addrlo_reg);
tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[2], oi);
tcg_out_movi(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[3],
(uintptr_t)l->raddr);
@@ -1910,7 +1911,8 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP, ofs);
} else {
tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
- /* The second argument is already loaded with addrlo. */
+ tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
+ l->addrlo_reg);
tcg_out_mov(s, (s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
tcg_target_call_iarg_regs[2], l->datalo_reg);
tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3], oi);
@@ -2083,16 +2085,6 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw,
TCG_REG_L1, TCG_REG_L0, cmp_ofs);
- /*
- * Prepare for both the fast path add of the tlb addend, and the slow
- * path function argument setup.
- */
- *h = (HostAddress) {
- .base = TCG_REG_L1,
- .index = -1
- };
- tcg_out_mov(s, ttype, h->base, addrlo);
-
/* jne slow_path */
tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
ldst->label_ptr[0] = s->code_ptr;
@@ -2109,10 +2101,13 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
}
/* TLB Hit. */
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_L0, TCG_REG_L0,
+ offsetof(CPUTLBEntry, addend));
- /* add addend(TCG_REG_L0), TCG_REG_L1 */
- tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, h->base, TCG_REG_L0,
- offsetof(CPUTLBEntry, addend));
+ *h = (HostAddress) {
+ .base = addrlo,
+ .index = TCG_REG_L0,
+ };
#else
if (a_bits) {
ldst = new_ldst_label(s);
--
2.34.1
* [PATCH v4 08/54] tcg/aarch64: Rationalize args to tcg_out_qemu_{ld,st}
From: Richard Henderson @ 2023-05-03 6:56 UTC
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Rename the 'ext' parameter to 'data_type' to make its use clearer;
pass it to tcg_out_qemu_st as well to even out the interfaces.
Rename the 'otype' local to 'addr_type' to make its use clearer.
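The two entry points now line up as:

    static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
                                MemOpIdx oi, TCGType data_type);
    static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
                                MemOpIdx oi, TCGType data_type);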
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/aarch64/tcg-target.c.inc | 36 +++++++++++++++++-------------------
1 file changed, 17 insertions(+), 19 deletions(-)
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 4ec3cf3172..ecbf6564fc 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1851,22 +1851,21 @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp memop,
}
static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
- MemOpIdx oi, TCGType ext)
+ MemOpIdx oi, TCGType data_type)
{
MemOp memop = get_memop(oi);
- const TCGType otype = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
+ TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
/* Byte swapping is left to middle-end expansion. */
tcg_debug_assert((memop & MO_BSWAP) == 0);
#ifdef CONFIG_SOFTMMU
- unsigned mem_index = get_mmuidx(oi);
tcg_insn_unit *label_ptr;
- tcg_out_tlb_read(s, addr_reg, memop, &label_ptr, mem_index, 1);
- tcg_out_qemu_ld_direct(s, memop, ext, data_reg,
- TCG_REG_X1, otype, addr_reg);
- add_qemu_ldst_label(s, true, oi, ext, data_reg, addr_reg,
+ tcg_out_tlb_read(s, addr_reg, memop, &label_ptr, get_mmuidx(oi), 1);
+ tcg_out_qemu_ld_direct(s, memop, data_type, data_reg,
+ TCG_REG_X1, addr_type, addr_reg);
+ add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
s->code_ptr, label_ptr);
#else /* !CONFIG_SOFTMMU */
unsigned a_bits = get_alignment_bits(memop);
@@ -1874,33 +1873,32 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
tcg_out_test_alignment(s, true, addr_reg, a_bits);
}
if (USE_GUEST_BASE) {
- tcg_out_qemu_ld_direct(s, memop, ext, data_reg,
- TCG_REG_GUEST_BASE, otype, addr_reg);
+ tcg_out_qemu_ld_direct(s, memop, data_type, data_reg,
+ TCG_REG_GUEST_BASE, addr_type, addr_reg);
} else {
- tcg_out_qemu_ld_direct(s, memop, ext, data_reg,
+ tcg_out_qemu_ld_direct(s, memop, data_type, data_reg,
addr_reg, TCG_TYPE_I64, TCG_REG_XZR);
}
#endif /* CONFIG_SOFTMMU */
}
static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
- MemOpIdx oi)
+ MemOpIdx oi, TCGType data_type)
{
MemOp memop = get_memop(oi);
- const TCGType otype = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
+ TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
/* Byte swapping is left to middle-end expansion. */
tcg_debug_assert((memop & MO_BSWAP) == 0);
#ifdef CONFIG_SOFTMMU
- unsigned mem_index = get_mmuidx(oi);
tcg_insn_unit *label_ptr;
- tcg_out_tlb_read(s, addr_reg, memop, &label_ptr, mem_index, 0);
+ tcg_out_tlb_read(s, addr_reg, memop, &label_ptr, get_mmuidx(oi), 0);
tcg_out_qemu_st_direct(s, memop, data_reg,
- TCG_REG_X1, otype, addr_reg);
- add_qemu_ldst_label(s, false, oi, (memop & MO_SIZE)== MO_64,
- data_reg, addr_reg, s->code_ptr, label_ptr);
+ TCG_REG_X1, addr_type, addr_reg);
+ add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
+ s->code_ptr, label_ptr);
#else /* !CONFIG_SOFTMMU */
unsigned a_bits = get_alignment_bits(memop);
if (a_bits) {
@@ -1908,7 +1906,7 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
}
if (USE_GUEST_BASE) {
tcg_out_qemu_st_direct(s, memop, data_reg,
- TCG_REG_GUEST_BASE, otype, addr_reg);
+ TCG_REG_GUEST_BASE, addr_type, addr_reg);
} else {
tcg_out_qemu_st_direct(s, memop, data_reg,
addr_reg, TCG_TYPE_I64, TCG_REG_XZR);
@@ -2249,7 +2247,7 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
break;
case INDEX_op_qemu_st_i32:
case INDEX_op_qemu_st_i64:
- tcg_out_qemu_st(s, REG0(0), a1, a2);
+ tcg_out_qemu_st(s, REG0(0), a1, a2, ext);
break;
case INDEX_op_bswap64_i64:
--
2.34.1
* [PATCH v4 09/54] tcg/aarch64: Introduce HostAddress
From: Richard Henderson @ 2023-05-03 6:56 UTC
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Collect the 3 potential parts of the host address into a struct.
Reorg tcg_out_qemu_{ld,st}_direct to use it.
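On aarch64 a host address is a base plus a (possibly extended) index
register, so the struct is (from the hunk below):

    typedef struct {
        TCGReg base;
        TCGReg index;
        TCGType index_ext;
    } HostAddress;

and, e.g., the user-only guest_base case builds:

    h = (HostAddress){
        .base = TCG_REG_GUEST_BASE,
        .index = addr_reg,
        .index_ext = addr_type
    };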
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/aarch64/tcg-target.c.inc | 86 +++++++++++++++++++++++++-----------
1 file changed, 59 insertions(+), 27 deletions(-)
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index ecbf6564fc..d8d464e4a0 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1587,6 +1587,12 @@ static void tcg_out_adr(TCGContext *s, TCGReg rd, const void *target)
tcg_out_insn(s, 3406, ADR, rd, offset);
}
+typedef struct {
+ TCGReg base;
+ TCGReg index;
+ TCGType index_ext;
+} HostAddress;
+
#ifdef CONFIG_SOFTMMU
/* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
* MemOpIdx oi, uintptr_t ra)
@@ -1796,32 +1802,31 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
#endif /* CONFIG_SOFTMMU */
static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp memop, TCGType ext,
- TCGReg data_r, TCGReg addr_r,
- TCGType otype, TCGReg off_r)
+ TCGReg data_r, HostAddress h)
{
switch (memop & MO_SSIZE) {
case MO_UB:
- tcg_out_ldst_r(s, I3312_LDRB, data_r, addr_r, otype, off_r);
+ tcg_out_ldst_r(s, I3312_LDRB, data_r, h.base, h.index_ext, h.index);
break;
case MO_SB:
tcg_out_ldst_r(s, ext ? I3312_LDRSBX : I3312_LDRSBW,
- data_r, addr_r, otype, off_r);
+ data_r, h.base, h.index_ext, h.index);
break;
case MO_UW:
- tcg_out_ldst_r(s, I3312_LDRH, data_r, addr_r, otype, off_r);
+ tcg_out_ldst_r(s, I3312_LDRH, data_r, h.base, h.index_ext, h.index);
break;
case MO_SW:
tcg_out_ldst_r(s, (ext ? I3312_LDRSHX : I3312_LDRSHW),
- data_r, addr_r, otype, off_r);
+ data_r, h.base, h.index_ext, h.index);
break;
case MO_UL:
- tcg_out_ldst_r(s, I3312_LDRW, data_r, addr_r, otype, off_r);
+ tcg_out_ldst_r(s, I3312_LDRW, data_r, h.base, h.index_ext, h.index);
break;
case MO_SL:
- tcg_out_ldst_r(s, I3312_LDRSWX, data_r, addr_r, otype, off_r);
+ tcg_out_ldst_r(s, I3312_LDRSWX, data_r, h.base, h.index_ext, h.index);
break;
case MO_UQ:
- tcg_out_ldst_r(s, I3312_LDRX, data_r, addr_r, otype, off_r);
+ tcg_out_ldst_r(s, I3312_LDRX, data_r, h.base, h.index_ext, h.index);
break;
default:
g_assert_not_reached();
@@ -1829,21 +1834,20 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp memop, TCGType ext,
}
static void tcg_out_qemu_st_direct(TCGContext *s, MemOp memop,
- TCGReg data_r, TCGReg addr_r,
- TCGType otype, TCGReg off_r)
+ TCGReg data_r, HostAddress h)
{
switch (memop & MO_SIZE) {
case MO_8:
- tcg_out_ldst_r(s, I3312_STRB, data_r, addr_r, otype, off_r);
+ tcg_out_ldst_r(s, I3312_STRB, data_r, h.base, h.index_ext, h.index);
break;
case MO_16:
- tcg_out_ldst_r(s, I3312_STRH, data_r, addr_r, otype, off_r);
+ tcg_out_ldst_r(s, I3312_STRH, data_r, h.base, h.index_ext, h.index);
break;
case MO_32:
- tcg_out_ldst_r(s, I3312_STRW, data_r, addr_r, otype, off_r);
+ tcg_out_ldst_r(s, I3312_STRW, data_r, h.base, h.index_ext, h.index);
break;
case MO_64:
- tcg_out_ldst_r(s, I3312_STRX, data_r, addr_r, otype, off_r);
+ tcg_out_ldst_r(s, I3312_STRX, data_r, h.base, h.index_ext, h.index);
break;
default:
g_assert_not_reached();
@@ -1855,6 +1859,7 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
{
MemOp memop = get_memop(oi);
TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
+ HostAddress h;
/* Byte swapping is left to middle-end expansion. */
tcg_debug_assert((memop & MO_BSWAP) == 0);
@@ -1863,8 +1868,14 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
tcg_insn_unit *label_ptr;
tcg_out_tlb_read(s, addr_reg, memop, &label_ptr, get_mmuidx(oi), 1);
- tcg_out_qemu_ld_direct(s, memop, data_type, data_reg,
- TCG_REG_X1, addr_type, addr_reg);
+
+ h = (HostAddress){
+ .base = TCG_REG_X1,
+ .index = addr_reg,
+ .index_ext = addr_type
+ };
+ tcg_out_qemu_ld_direct(s, memop, data_type, data_reg, h);
+
add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
s->code_ptr, label_ptr);
#else /* !CONFIG_SOFTMMU */
@@ -1873,12 +1884,19 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
tcg_out_test_alignment(s, true, addr_reg, a_bits);
}
if (USE_GUEST_BASE) {
- tcg_out_qemu_ld_direct(s, memop, data_type, data_reg,
- TCG_REG_GUEST_BASE, addr_type, addr_reg);
+ h = (HostAddress){
+ .base = TCG_REG_GUEST_BASE,
+ .index = addr_reg,
+ .index_ext = addr_type
+ };
} else {
- tcg_out_qemu_ld_direct(s, memop, data_type, data_reg,
- addr_reg, TCG_TYPE_I64, TCG_REG_XZR);
+ h = (HostAddress){
+ .base = addr_reg,
+ .index = TCG_REG_XZR,
+ .index_ext = TCG_TYPE_I64
+ };
}
+ tcg_out_qemu_ld_direct(s, memop, data_type, data_reg, h);
#endif /* CONFIG_SOFTMMU */
}
@@ -1887,6 +1905,7 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
{
MemOp memop = get_memop(oi);
TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
+ HostAddress h;
/* Byte swapping is left to middle-end expansion. */
tcg_debug_assert((memop & MO_BSWAP) == 0);
@@ -1895,8 +1914,14 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
tcg_insn_unit *label_ptr;
tcg_out_tlb_read(s, addr_reg, memop, &label_ptr, get_mmuidx(oi), 0);
- tcg_out_qemu_st_direct(s, memop, data_reg,
- TCG_REG_X1, addr_type, addr_reg);
+
+ h = (HostAddress){
+ .base = TCG_REG_X1,
+ .index = addr_reg,
+ .index_ext = addr_type
+ };
+ tcg_out_qemu_st_direct(s, memop, data_reg, h);
+
add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
s->code_ptr, label_ptr);
#else /* !CONFIG_SOFTMMU */
@@ -1905,12 +1930,19 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
tcg_out_test_alignment(s, false, addr_reg, a_bits);
}
if (USE_GUEST_BASE) {
- tcg_out_qemu_st_direct(s, memop, data_reg,
- TCG_REG_GUEST_BASE, addr_type, addr_reg);
+ h = (HostAddress){
+ .base = TCG_REG_GUEST_BASE,
+ .index = addr_reg,
+ .index_ext = addr_type
+ };
} else {
- tcg_out_qemu_st_direct(s, memop, data_reg,
- addr_reg, TCG_TYPE_I64, TCG_REG_XZR);
+ h = (HostAddress){
+ .base = addr_reg,
+ .index = TCG_REG_XZR,
+ .index_ext = TCG_TYPE_I64
+ };
}
+ tcg_out_qemu_st_direct(s, memop, data_reg, h);
#endif /* CONFIG_SOFTMMU */
}
--
2.34.1
* [PATCH v4 10/54] tcg/aarch64: Introduce prepare_host_addr
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (8 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 09/54] tcg/aarch64: Introduce HostAddress Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 11/54] tcg/arm: Rationalize args to tcg_out_qemu_{ld,st} Richard Henderson
` (43 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Merge tcg_out_tlb_read, add_qemu_ldst_label, tcg_out_test_alignment,
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
into one function that returns HostAddress and TCGLabelQemuLdst structures.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
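The caller-side pattern that results, condensed from the hunks below (the load side is shown; the store side differs only in the helper called and the is_ld flag):

    TCGLabelQemuLdst *ldst;
    HostAddress h;

    ldst = prepare_host_addr(s, &h, addr_reg, oi, true);
    tcg_out_qemu_ld_direct(s, get_memop(oi), data_type, data_reg, h);

    if (ldst) {
        /* A slow path was emitted; finish describing it. */
        ldst->type = data_type;
        ldst->datalo_reg = data_reg;
        ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
    }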
tcg/aarch64/tcg-target.c.inc | 313 +++++++++++++++--------------------
1 file changed, 133 insertions(+), 180 deletions(-)
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index d8d464e4a0..202b90c001 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1667,113 +1667,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
tcg_out_goto(s, lb->raddr);
return true;
}
-
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld, MemOpIdx oi,
- TCGType ext, TCGReg data_reg, TCGReg addr_reg,
- tcg_insn_unit *raddr, tcg_insn_unit *label_ptr)
-{
- TCGLabelQemuLdst *label = new_ldst_label(s);
-
- label->is_ld = is_ld;
- label->oi = oi;
- label->type = ext;
- label->datalo_reg = data_reg;
- label->addrlo_reg = addr_reg;
- label->raddr = tcg_splitwx_to_rx(raddr);
- label->label_ptr[0] = label_ptr;
-}
-
-/* We expect to use a 7-bit scaled negative offset from ENV. */
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -512);
-
-/* These offsets are built into the LDP below. */
-QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
-QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 8);
-
-/* Load and compare a TLB entry, emitting the conditional jump to the
- slow path for the failure case, which will be patched later when finalizing
- the slow path. Generated code returns the host addend in X1,
- clobbers X0,X2,X3,TMP. */
-static void tcg_out_tlb_read(TCGContext *s, TCGReg addr_reg, MemOp opc,
- tcg_insn_unit **label_ptr, int mem_index,
- bool is_read)
-{
- unsigned a_bits = get_alignment_bits(opc);
- unsigned s_bits = opc & MO_SIZE;
- unsigned a_mask = (1u << a_bits) - 1;
- unsigned s_mask = (1u << s_bits) - 1;
- TCGReg x3;
- TCGType mask_type;
- uint64_t compare_mask;
-
- mask_type = (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32
- ? TCG_TYPE_I64 : TCG_TYPE_I32);
-
- /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {x0,x1}. */
- tcg_out_insn(s, 3314, LDP, TCG_REG_X0, TCG_REG_X1, TCG_AREG0,
- TLB_MASK_TABLE_OFS(mem_index), 1, 0);
-
- /* Extract the TLB index from the address into X0. */
- tcg_out_insn(s, 3502S, AND_LSR, mask_type == TCG_TYPE_I64,
- TCG_REG_X0, TCG_REG_X0, addr_reg,
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
-
- /* Add the tlb_table pointer, creating the CPUTLBEntry address into X1. */
- tcg_out_insn(s, 3502, ADD, 1, TCG_REG_X1, TCG_REG_X1, TCG_REG_X0);
-
- /* Load the tlb comparator into X0, and the fast path addend into X1. */
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_X0, TCG_REG_X1, is_read
- ? offsetof(CPUTLBEntry, addr_read)
- : offsetof(CPUTLBEntry, addr_write));
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_X1, TCG_REG_X1,
- offsetof(CPUTLBEntry, addend));
-
- /* For aligned accesses, we check the first byte and include the alignment
- bits within the address. For unaligned access, we check that we don't
- cross pages using the address of the last byte of the access. */
- if (a_bits >= s_bits) {
- x3 = addr_reg;
- } else {
- tcg_out_insn(s, 3401, ADDI, TARGET_LONG_BITS == 64,
- TCG_REG_X3, addr_reg, s_mask - a_mask);
- x3 = TCG_REG_X3;
- }
- compare_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
-
- /* Store the page mask part of the address into X3. */
- tcg_out_logicali(s, I3404_ANDI, TARGET_LONG_BITS == 64,
- TCG_REG_X3, x3, compare_mask);
-
- /* Perform the address comparison. */
- tcg_out_cmp(s, TARGET_LONG_BITS == 64, TCG_REG_X0, TCG_REG_X3, 0);
-
- /* If not equal, we jump to the slow path. */
- *label_ptr = s->code_ptr;
- tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
-}
-
#else
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addr_reg,
- unsigned a_bits)
-{
- unsigned a_mask = (1 << a_bits) - 1;
- TCGLabelQemuLdst *label = new_ldst_label(s);
-
- label->is_ld = is_ld;
- label->addrlo_reg = addr_reg;
-
- /* tst addr, #mask */
- tcg_out_logicali(s, I3404_ANDSI, 0, TCG_REG_XZR, addr_reg, a_mask);
-
- label->label_ptr[0] = s->code_ptr;
-
- /* b.ne slow_path */
- tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
-
- label->raddr = tcg_splitwx_to_rx(s->code_ptr);
-}
-
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
{
if (!reloc_pc19(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
@@ -1801,6 +1695,125 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
}
#endif /* CONFIG_SOFTMMU */
+/*
+ * For softmmu, perform the TLB load and compare.
+ * For useronly, perform any required alignment tests.
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
+ * is required and fill in @h with the host address for the fast path.
+ */
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
+ TCGReg addr_reg, MemOpIdx oi,
+ bool is_ld)
+{
+ TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
+ TCGLabelQemuLdst *ldst = NULL;
+ MemOp opc = get_memop(oi);
+ unsigned a_bits = get_alignment_bits(opc);
+ unsigned a_mask = (1u << a_bits) - 1;
+
+#ifdef CONFIG_SOFTMMU
+ unsigned s_bits = opc & MO_SIZE;
+ unsigned s_mask = (1u << s_bits) - 1;
+ unsigned mem_index = get_mmuidx(oi);
+ TCGReg x3;
+ TCGType mask_type;
+ uint64_t compare_mask;
+
+ ldst = new_ldst_label(s);
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addr_reg;
+
+ mask_type = (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32
+ ? TCG_TYPE_I64 : TCG_TYPE_I32);
+
+ /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {x0,x1}. */
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -512);
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 8);
+ tcg_out_insn(s, 3314, LDP, TCG_REG_X0, TCG_REG_X1, TCG_AREG0,
+ TLB_MASK_TABLE_OFS(mem_index), 1, 0);
+
+ /* Extract the TLB index from the address into X0. */
+ tcg_out_insn(s, 3502S, AND_LSR, mask_type == TCG_TYPE_I64,
+ TCG_REG_X0, TCG_REG_X0, addr_reg,
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+
+ /* Add the tlb_table pointer, creating the CPUTLBEntry address into X1. */
+ tcg_out_insn(s, 3502, ADD, 1, TCG_REG_X1, TCG_REG_X1, TCG_REG_X0);
+
+ /* Load the tlb comparator into X0, and the fast path addend into X1. */
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_X0, TCG_REG_X1,
+ is_ld ? offsetof(CPUTLBEntry, addr_read)
+ : offsetof(CPUTLBEntry, addr_write));
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_X1, TCG_REG_X1,
+ offsetof(CPUTLBEntry, addend));
+
+ /*
+ * For aligned accesses, we check the first byte and include the alignment
+ * bits within the address. For unaligned access, we check that we don't
+ * cross pages using the address of the last byte of the access.
+ */
+ if (a_bits >= s_bits) {
+ x3 = addr_reg;
+ } else {
+ tcg_out_insn(s, 3401, ADDI, TARGET_LONG_BITS == 64,
+ TCG_REG_X3, addr_reg, s_mask - a_mask);
+ x3 = TCG_REG_X3;
+ }
+ compare_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
+
+ /* Store the page mask part of the address into X3. */
+ tcg_out_logicali(s, I3404_ANDI, TARGET_LONG_BITS == 64,
+ TCG_REG_X3, x3, compare_mask);
+
+ /* Perform the address comparison. */
+ tcg_out_cmp(s, TARGET_LONG_BITS == 64, TCG_REG_X0, TCG_REG_X3, 0);
+
+ /* If not equal, we jump to the slow path. */
+ ldst->label_ptr[0] = s->code_ptr;
+ tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
+
+ *h = (HostAddress){
+ .base = TCG_REG_X1,
+ .index = addr_reg,
+ .index_ext = addr_type
+ };
+#else
+ if (a_mask) {
+ ldst = new_ldst_label(s);
+
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addr_reg;
+
+ /* tst addr, #mask */
+ tcg_out_logicali(s, I3404_ANDSI, 0, TCG_REG_XZR, addr_reg, a_mask);
+
+ /* b.ne slow_path */
+ ldst->label_ptr[0] = s->code_ptr;
+ tcg_out_insn(s, 3202, B_C, TCG_COND_NE, 0);
+ }
+
+ if (USE_GUEST_BASE) {
+ *h = (HostAddress){
+ .base = TCG_REG_GUEST_BASE,
+ .index = addr_reg,
+ .index_ext = addr_type
+ };
+ } else {
+ *h = (HostAddress){
+ .base = addr_reg,
+ .index = TCG_REG_XZR,
+ .index_ext = TCG_TYPE_I64
+ };
+ }
+#endif
+
+ return ldst;
+}
+
static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp memop, TCGType ext,
TCGReg data_r, HostAddress h)
{
@@ -1857,93 +1870,33 @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp memop,
static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
MemOpIdx oi, TCGType data_type)
{
- MemOp memop = get_memop(oi);
- TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
+ TCGLabelQemuLdst *ldst;
HostAddress h;
- /* Byte swapping is left to middle-end expansion. */
- tcg_debug_assert((memop & MO_BSWAP) == 0);
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, true);
+ tcg_out_qemu_ld_direct(s, get_memop(oi), data_type, data_reg, h);
-#ifdef CONFIG_SOFTMMU
- tcg_insn_unit *label_ptr;
-
- tcg_out_tlb_read(s, addr_reg, memop, &label_ptr, get_mmuidx(oi), 1);
-
- h = (HostAddress){
- .base = TCG_REG_X1,
- .index = addr_reg,
- .index_ext = addr_type
- };
- tcg_out_qemu_ld_direct(s, memop, data_type, data_reg, h);
-
- add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
- s->code_ptr, label_ptr);
-#else /* !CONFIG_SOFTMMU */
- unsigned a_bits = get_alignment_bits(memop);
- if (a_bits) {
- tcg_out_test_alignment(s, true, addr_reg, a_bits);
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = data_reg;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
}
- if (USE_GUEST_BASE) {
- h = (HostAddress){
- .base = TCG_REG_GUEST_BASE,
- .index = addr_reg,
- .index_ext = addr_type
- };
- } else {
- h = (HostAddress){
- .base = addr_reg,
- .index = TCG_REG_XZR,
- .index_ext = TCG_TYPE_I64
- };
- }
- tcg_out_qemu_ld_direct(s, memop, data_type, data_reg, h);
-#endif /* CONFIG_SOFTMMU */
}
static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
MemOpIdx oi, TCGType data_type)
{
- MemOp memop = get_memop(oi);
- TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
+ TCGLabelQemuLdst *ldst;
HostAddress h;
- /* Byte swapping is left to middle-end expansion. */
- tcg_debug_assert((memop & MO_BSWAP) == 0);
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, false);
+ tcg_out_qemu_st_direct(s, get_memop(oi), data_reg, h);
-#ifdef CONFIG_SOFTMMU
- tcg_insn_unit *label_ptr;
-
- tcg_out_tlb_read(s, addr_reg, memop, &label_ptr, get_mmuidx(oi), 0);
-
- h = (HostAddress){
- .base = TCG_REG_X1,
- .index = addr_reg,
- .index_ext = addr_type
- };
- tcg_out_qemu_st_direct(s, memop, data_reg, h);
-
- add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
- s->code_ptr, label_ptr);
-#else /* !CONFIG_SOFTMMU */
- unsigned a_bits = get_alignment_bits(memop);
- if (a_bits) {
- tcg_out_test_alignment(s, false, addr_reg, a_bits);
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = data_reg;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
}
- if (USE_GUEST_BASE) {
- h = (HostAddress){
- .base = TCG_REG_GUEST_BASE,
- .index = addr_reg,
- .index_ext = addr_type
- };
- } else {
- h = (HostAddress){
- .base = addr_reg,
- .index = TCG_REG_XZR,
- .index_ext = TCG_TYPE_I64
- };
- }
- tcg_out_qemu_st_direct(s, memop, data_reg, h);
-#endif /* CONFIG_SOFTMMU */
}
static const tcg_insn_unit *tb_ret_addr;
--
2.34.1
* [PATCH v4 11/54] tcg/arm: Rationalize args to tcg_out_qemu_{ld,st}
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (9 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 10/54] tcg/aarch64: Introduce prepare_host_addr Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 12/54] tcg/arm: Introduce HostAddress Richard Henderson
` (42 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Interpret the variable argument placement in the caller.
Pass data_type instead of is_64. We need to set this in
TCGLabelQemuLdst, so plumb this all the way through from tcg_out_op.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
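For reference, the argument layouts being interpreted, taken from the tcg_out_op hunk at the end of this patch (-1 marks a register slot that is not used):

    /* qemu_ld_i64, 32-bit guest address:
       args[0..3] = datalo, datahi, addrlo, oi */
    tcg_out_qemu_ld(s, args[0], args[1], args[2], -1,
                    args[3], TCG_TYPE_I64);

    /* qemu_ld_i64, 64-bit guest address:
       args[0..4] = datalo, datahi, addrlo, addrhi, oi */
    tcg_out_qemu_ld(s, args[0], args[1], args[2], args[3],
                    args[4], TCG_TYPE_I64);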
tcg/arm/tcg-target.c.inc | 113 +++++++++++++++++++--------------------
1 file changed, 56 insertions(+), 57 deletions(-)
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 83c818a58b..6ce52b9612 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1526,15 +1526,18 @@ static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
/* Record the context of a call to the out of line helper code for the slow
path for a load or store, so that we can later generate the correct
helper code. */
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld, MemOpIdx oi,
- TCGReg datalo, TCGReg datahi, TCGReg addrlo,
- TCGReg addrhi, tcg_insn_unit *raddr,
+static void add_qemu_ldst_label(TCGContext *s, bool is_ld,
+ MemOpIdx oi, TCGType type,
+ TCGReg datalo, TCGReg datahi,
+ TCGReg addrlo, TCGReg addrhi,
+ tcg_insn_unit *raddr,
tcg_insn_unit *label_ptr)
{
TCGLabelQemuLdst *label = new_ldst_label(s);
label->is_ld = is_ld;
label->oi = oi;
+ label->type = type;
label->datalo_reg = datalo;
label->datahi_reg = datahi;
label->addrlo_reg = addrlo;
@@ -1796,41 +1799,28 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg datalo,
}
#endif
-static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
+static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
+ TCGReg addrlo, TCGReg addrhi,
+ MemOpIdx oi, TCGType data_type)
{
- TCGReg addrlo, datalo, datahi, addrhi __attribute__((unused));
- MemOpIdx oi;
- MemOp opc;
-#ifdef CONFIG_SOFTMMU
- int mem_index;
- TCGReg addend;
- tcg_insn_unit *label_ptr;
-#else
- unsigned a_bits;
-#endif
-
- datalo = *args++;
- datahi = (is64 ? *args++ : 0);
- addrlo = *args++;
- addrhi = (TARGET_LONG_BITS == 64 ? *args++ : 0);
- oi = *args++;
- opc = get_memop(oi);
+ MemOp opc = get_memop(oi);
#ifdef CONFIG_SOFTMMU
- mem_index = get_mmuidx(oi);
- addend = tcg_out_tlb_read(s, addrlo, addrhi, opc, mem_index, 1);
+ TCGReg addend = tcg_out_tlb_read(s, addrlo, addrhi, opc, get_mmuidx(oi), 1);
- /* This a conditional BL only to load a pointer within this opcode into LR
- for the slow path. We will not be using the value for a tail call. */
- label_ptr = s->code_ptr;
+ /*
+ * This is a conditional BL only to load a pointer within this opcode into
+ * LR for the slow path. We will not be using the value for a tail call.
+ */
+ tcg_insn_unit *label_ptr = s->code_ptr;
tcg_out_bl_imm(s, COND_NE, 0);
tcg_out_qemu_ld_index(s, opc, datalo, datahi, addrlo, addend, true);
- add_qemu_ldst_label(s, true, oi, datalo, datahi, addrlo, addrhi,
- s->code_ptr, label_ptr);
+ add_qemu_ldst_label(s, true, oi, data_type, datalo, datahi,
+ addrlo, addrhi, s->code_ptr, label_ptr);
#else /* !CONFIG_SOFTMMU */
- a_bits = get_alignment_bits(opc);
+ unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
}
@@ -1918,41 +1908,26 @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg datalo,
}
#endif
-static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
+static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
+ TCGReg addrlo, TCGReg addrhi,
+ MemOpIdx oi, TCGType data_type)
{
- TCGReg addrlo, datalo, datahi, addrhi __attribute__((unused));
- MemOpIdx oi;
- MemOp opc;
-#ifdef CONFIG_SOFTMMU
- int mem_index;
- TCGReg addend;
- tcg_insn_unit *label_ptr;
-#else
- unsigned a_bits;
-#endif
-
- datalo = *args++;
- datahi = (is64 ? *args++ : 0);
- addrlo = *args++;
- addrhi = (TARGET_LONG_BITS == 64 ? *args++ : 0);
- oi = *args++;
- opc = get_memop(oi);
+ MemOp opc = get_memop(oi);
#ifdef CONFIG_SOFTMMU
- mem_index = get_mmuidx(oi);
- addend = tcg_out_tlb_read(s, addrlo, addrhi, opc, mem_index, 0);
+ TCGReg addend = tcg_out_tlb_read(s, addrlo, addrhi, opc, get_mmuidx(oi), 0);
tcg_out_qemu_st_index(s, COND_EQ, opc, datalo, datahi,
addrlo, addend, true);
/* The conditional call must come last, as we're going to return here. */
- label_ptr = s->code_ptr;
+ tcg_insn_unit *label_ptr = s->code_ptr;
tcg_out_bl_imm(s, COND_NE, 0);
- add_qemu_ldst_label(s, false, oi, datalo, datahi, addrlo, addrhi,
- s->code_ptr, label_ptr);
+ add_qemu_ldst_label(s, false, oi, data_type, datalo, datahi,
+ addrlo, addrhi, s->code_ptr, label_ptr);
#else /* !CONFIG_SOFTMMU */
- a_bits = get_alignment_bits(opc);
+ unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
}
@@ -2245,16 +2220,40 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
break;
case INDEX_op_qemu_ld_i32:
- tcg_out_qemu_ld(s, args, 0);
+ if (TARGET_LONG_BITS == 32) {
+ tcg_out_qemu_ld(s, args[0], -1, args[1], -1,
+ args[2], TCG_TYPE_I32);
+ } else {
+ tcg_out_qemu_ld(s, args[0], -1, args[1], args[2],
+ args[3], TCG_TYPE_I32);
+ }
break;
case INDEX_op_qemu_ld_i64:
- tcg_out_qemu_ld(s, args, 1);
+ if (TARGET_LONG_BITS == 32) {
+ tcg_out_qemu_ld(s, args[0], args[1], args[2], -1,
+ args[3], TCG_TYPE_I64);
+ } else {
+ tcg_out_qemu_ld(s, args[0], args[1], args[2], args[3],
+ args[4], TCG_TYPE_I64);
+ }
break;
case INDEX_op_qemu_st_i32:
- tcg_out_qemu_st(s, args, 0);
+ if (TARGET_LONG_BITS == 32) {
+ tcg_out_qemu_st(s, args[0], -1, args[1], -1,
+ args[2], TCG_TYPE_I32);
+ } else {
+ tcg_out_qemu_st(s, args[0], -1, args[1], args[2],
+ args[3], TCG_TYPE_I32);
+ }
break;
case INDEX_op_qemu_st_i64:
- tcg_out_qemu_st(s, args, 1);
+ if (TARGET_LONG_BITS == 32) {
+ tcg_out_qemu_st(s, args[0], args[1], args[2], -1,
+ args[3], TCG_TYPE_I64);
+ } else {
+ tcg_out_qemu_st(s, args[0], args[1], args[2], args[3],
+ args[4], TCG_TYPE_I64);
+ }
break;
case INDEX_op_bswap16_i32:
--
2.34.1
* [PATCH v4 12/54] tcg/arm: Introduce HostAddress
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (10 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 11/54] tcg/arm: Rationalize args to tcg_out_qemu_{ld,st} Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 13/54] tcg/arm: Introduce prepare_host_addr Richard Henderson
` (41 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Collect the parts of the host address, together with the condition, into a struct.
Merge tcg_out_qemu_*_{index,direct} and use it.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
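A sketch of the struct and the conventions attached to it, drawn from the hunks below:

    typedef struct {
        ARMCond cond;       /* condition applied to every access emitted */
        TCGReg base;
        int index;          /* -1 means no index: use the imm-0 forms */
        bool index_scratch; /* true: index may be clobbered (writeback) */
    } HostAddress;

Each merged load/store helper then tests h.index < 0 to choose between the base+immediate (_12/_8) encodings and the base+register (_r) encodings.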
tcg/arm/tcg-target.c.inc | 248 ++++++++++++++++++---------------------
1 file changed, 115 insertions(+), 133 deletions(-)
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 6ce52b9612..b6b4ffc546 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1337,6 +1337,13 @@ static void tcg_out_vldst(TCGContext *s, ARMInsn insn,
tcg_out32(s, insn | (rn << 16) | encode_vd(rd) | 0xf);
}
+typedef struct {
+ ARMCond cond;
+ TCGReg base;
+ int index;
+ bool index_scratch;
+} HostAddress;
+
#ifdef CONFIG_SOFTMMU
/* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
* int mmu_idx, uintptr_t ra)
@@ -1696,29 +1703,49 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
}
#endif /* SOFTMMU */
-static void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
- TCGReg datalo, TCGReg datahi,
- TCGReg addrlo, TCGReg addend,
- bool scratch_addend)
+static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg datalo,
+ TCGReg datahi, HostAddress h)
{
+ TCGReg base;
+
/* Byte swapping is left to middle-end expansion. */
tcg_debug_assert((opc & MO_BSWAP) == 0);
switch (opc & MO_SSIZE) {
case MO_UB:
- tcg_out_ld8_r(s, COND_AL, datalo, addrlo, addend);
+ if (h.index < 0) {
+ tcg_out_ld8_12(s, h.cond, datalo, h.base, 0);
+ } else {
+ tcg_out_ld8_r(s, h.cond, datalo, h.base, h.index);
+ }
break;
case MO_SB:
- tcg_out_ld8s_r(s, COND_AL, datalo, addrlo, addend);
+ if (h.index < 0) {
+ tcg_out_ld8s_8(s, h.cond, datalo, h.base, 0);
+ } else {
+ tcg_out_ld8s_r(s, h.cond, datalo, h.base, h.index);
+ }
break;
case MO_UW:
- tcg_out_ld16u_r(s, COND_AL, datalo, addrlo, addend);
+ if (h.index < 0) {
+ tcg_out_ld16u_8(s, h.cond, datalo, h.base, 0);
+ } else {
+ tcg_out_ld16u_r(s, h.cond, datalo, h.base, h.index);
+ }
break;
case MO_SW:
- tcg_out_ld16s_r(s, COND_AL, datalo, addrlo, addend);
+ if (h.index < 0) {
+ tcg_out_ld16s_8(s, h.cond, datalo, h.base, 0);
+ } else {
+ tcg_out_ld16s_r(s, h.cond, datalo, h.base, h.index);
+ }
break;
case MO_UL:
- tcg_out_ld32_r(s, COND_AL, datalo, addrlo, addend);
+ if (h.index < 0) {
+ tcg_out_ld32_12(s, h.cond, datalo, h.base, 0);
+ } else {
+ tcg_out_ld32_r(s, h.cond, datalo, h.base, h.index);
+ }
break;
case MO_UQ:
/* We used pair allocation for datalo, so already should be aligned. */
@@ -1726,87 +1753,59 @@ static void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
tcg_debug_assert(datahi == datalo + 1);
/* LDRD requires alignment; double-check that. */
if (get_alignment_bits(opc) >= MO_64) {
+ if (h.index < 0) {
+ tcg_out_ldrd_8(s, h.cond, datalo, h.base, 0);
+ break;
+ }
/*
* Rm (the second address op) must not overlap Rt or Rt + 1.
* Since datalo is aligned, we can simplify the test via alignment.
* Flip the two address arguments if that works.
*/
- if ((addend & ~1) != datalo) {
- tcg_out_ldrd_r(s, COND_AL, datalo, addrlo, addend);
+ if ((h.index & ~1) != datalo) {
+ tcg_out_ldrd_r(s, h.cond, datalo, h.base, h.index);
break;
}
- if ((addrlo & ~1) != datalo) {
- tcg_out_ldrd_r(s, COND_AL, datalo, addend, addrlo);
+ if ((h.base & ~1) != datalo) {
+ tcg_out_ldrd_r(s, h.cond, datalo, h.index, h.base);
break;
}
}
- if (scratch_addend) {
- tcg_out_ld32_rwb(s, COND_AL, datalo, addend, addrlo);
- tcg_out_ld32_12(s, COND_AL, datahi, addend, 4);
+ if (h.index < 0) {
+ base = h.base;
+ if (datalo == h.base) {
+ tcg_out_mov_reg(s, h.cond, TCG_REG_TMP, base);
+ base = TCG_REG_TMP;
+ }
+ } else if (h.index_scratch) {
+ tcg_out_ld32_rwb(s, h.cond, datalo, h.index, h.base);
+ tcg_out_ld32_12(s, h.cond, datahi, h.index, 4);
+ break;
} else {
- tcg_out_dat_reg(s, COND_AL, ARITH_ADD, TCG_REG_TMP,
- addend, addrlo, SHIFT_IMM_LSL(0));
- tcg_out_ld32_12(s, COND_AL, datalo, TCG_REG_TMP, 0);
- tcg_out_ld32_12(s, COND_AL, datahi, TCG_REG_TMP, 4);
+ tcg_out_dat_reg(s, h.cond, ARITH_ADD, TCG_REG_TMP,
+ h.base, h.index, SHIFT_IMM_LSL(0));
+ base = TCG_REG_TMP;
}
+ tcg_out_ld32_12(s, h.cond, datalo, base, 0);
+ tcg_out_ld32_12(s, h.cond, datahi, base, 4);
break;
default:
g_assert_not_reached();
}
}
-#ifndef CONFIG_SOFTMMU
-static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg datalo,
- TCGReg datahi, TCGReg addrlo)
-{
- /* Byte swapping is left to middle-end expansion. */
- tcg_debug_assert((opc & MO_BSWAP) == 0);
-
- switch (opc & MO_SSIZE) {
- case MO_UB:
- tcg_out_ld8_12(s, COND_AL, datalo, addrlo, 0);
- break;
- case MO_SB:
- tcg_out_ld8s_8(s, COND_AL, datalo, addrlo, 0);
- break;
- case MO_UW:
- tcg_out_ld16u_8(s, COND_AL, datalo, addrlo, 0);
- break;
- case MO_SW:
- tcg_out_ld16s_8(s, COND_AL, datalo, addrlo, 0);
- break;
- case MO_UL:
- tcg_out_ld32_12(s, COND_AL, datalo, addrlo, 0);
- break;
- case MO_UQ:
- /* We used pair allocation for datalo, so already should be aligned. */
- tcg_debug_assert((datalo & 1) == 0);
- tcg_debug_assert(datahi == datalo + 1);
- /* LDRD requires alignment; double-check that. */
- if (get_alignment_bits(opc) >= MO_64) {
- tcg_out_ldrd_8(s, COND_AL, datalo, addrlo, 0);
- } else if (datalo == addrlo) {
- tcg_out_ld32_12(s, COND_AL, datahi, addrlo, 4);
- tcg_out_ld32_12(s, COND_AL, datalo, addrlo, 0);
- } else {
- tcg_out_ld32_12(s, COND_AL, datalo, addrlo, 0);
- tcg_out_ld32_12(s, COND_AL, datahi, addrlo, 4);
- }
- break;
- default:
- g_assert_not_reached();
- }
-}
-#endif
-
static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
TCGReg addrlo, TCGReg addrhi,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
+ HostAddress h;
#ifdef CONFIG_SOFTMMU
- TCGReg addend = tcg_out_tlb_read(s, addrlo, addrhi, opc, get_mmuidx(oi), 1);
+ h.cond = COND_AL;
+ h.base = addrlo;
+ h.index_scratch = true;
+ h.index = tcg_out_tlb_read(s, addrlo, addrhi, opc, get_mmuidx(oi), 1);
/*
* This is a conditional BL only to load a pointer within this opcode into
@@ -1815,80 +1814,51 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
tcg_insn_unit *label_ptr = s->code_ptr;
tcg_out_bl_imm(s, COND_NE, 0);
- tcg_out_qemu_ld_index(s, opc, datalo, datahi, addrlo, addend, true);
+ tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
add_qemu_ldst_label(s, true, oi, data_type, datalo, datahi,
addrlo, addrhi, s->code_ptr, label_ptr);
-#else /* !CONFIG_SOFTMMU */
+#else
unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
}
- if (guest_base) {
- tcg_out_qemu_ld_index(s, opc, datalo, datahi,
- addrlo, TCG_REG_GUEST_BASE, false);
- } else {
- tcg_out_qemu_ld_direct(s, opc, datalo, datahi, addrlo);
- }
+
+ h.cond = COND_AL;
+ h.base = addrlo;
+ h.index = guest_base ? TCG_REG_GUEST_BASE : -1;
+ h.index_scratch = false;
+ tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
#endif
}
-static void tcg_out_qemu_st_index(TCGContext *s, ARMCond cond, MemOp opc,
- TCGReg datalo, TCGReg datahi,
- TCGReg addrlo, TCGReg addend,
- bool scratch_addend)
-{
- /* Byte swapping is left to middle-end expansion. */
- tcg_debug_assert((opc & MO_BSWAP) == 0);
-
- switch (opc & MO_SIZE) {
- case MO_8:
- tcg_out_st8_r(s, cond, datalo, addrlo, addend);
- break;
- case MO_16:
- tcg_out_st16_r(s, cond, datalo, addrlo, addend);
- break;
- case MO_32:
- tcg_out_st32_r(s, cond, datalo, addrlo, addend);
- break;
- case MO_64:
- /* We used pair allocation for datalo, so already should be aligned. */
- tcg_debug_assert((datalo & 1) == 0);
- tcg_debug_assert(datahi == datalo + 1);
- /* STRD requires alignment; double-check that. */
- if (get_alignment_bits(opc) >= MO_64) {
- tcg_out_strd_r(s, cond, datalo, addrlo, addend);
- } else if (scratch_addend) {
- tcg_out_st32_rwb(s, cond, datalo, addend, addrlo);
- tcg_out_st32_12(s, cond, datahi, addend, 4);
- } else {
- tcg_out_dat_reg(s, cond, ARITH_ADD, TCG_REG_TMP,
- addend, addrlo, SHIFT_IMM_LSL(0));
- tcg_out_st32_12(s, cond, datalo, TCG_REG_TMP, 0);
- tcg_out_st32_12(s, cond, datahi, TCG_REG_TMP, 4);
- }
- break;
- default:
- g_assert_not_reached();
- }
-}
-
-#ifndef CONFIG_SOFTMMU
static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg datalo,
- TCGReg datahi, TCGReg addrlo)
+ TCGReg datahi, HostAddress h)
{
/* Byte swapping is left to middle-end expansion. */
tcg_debug_assert((opc & MO_BSWAP) == 0);
switch (opc & MO_SIZE) {
case MO_8:
- tcg_out_st8_12(s, COND_AL, datalo, addrlo, 0);
+ if (h.index < 0) {
+ tcg_out_st8_12(s, h.cond, datalo, h.base, 0);
+ } else {
+ tcg_out_st8_r(s, h.cond, datalo, h.base, h.index);
+ }
break;
case MO_16:
- tcg_out_st16_8(s, COND_AL, datalo, addrlo, 0);
+ if (h.index < 0) {
+ tcg_out_st16_8(s, h.cond, datalo, h.base, 0);
+ } else {
+ tcg_out_st16_r(s, h.cond, datalo, h.base, h.index);
+ }
break;
case MO_32:
- tcg_out_st32_12(s, COND_AL, datalo, addrlo, 0);
+ if (h.index < 0) {
+ tcg_out_st32_12(s, h.cond, datalo, h.base, 0);
+ } else {
+ tcg_out_st32_r(s, h.cond, datalo, h.base, h.index);
+ }
break;
case MO_64:
/* We used pair allocation for datalo, so already should be aligned. */
@@ -1896,29 +1866,39 @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg datalo,
tcg_debug_assert(datahi == datalo + 1);
/* STRD requires alignment; double-check that. */
if (get_alignment_bits(opc) >= MO_64) {
- tcg_out_strd_8(s, COND_AL, datalo, addrlo, 0);
+ if (h.index < 0) {
+ tcg_out_strd_8(s, h.cond, datalo, h.base, 0);
+ } else {
+ tcg_out_strd_r(s, h.cond, datalo, h.base, h.index);
+ }
+ } else if (h.index_scratch) {
+ tcg_out_st32_rwb(s, h.cond, datalo, h.index, h.base);
+ tcg_out_st32_12(s, h.cond, datahi, h.index, 4);
} else {
- tcg_out_st32_12(s, COND_AL, datalo, addrlo, 0);
- tcg_out_st32_12(s, COND_AL, datahi, addrlo, 4);
+ tcg_out_dat_reg(s, h.cond, ARITH_ADD, TCG_REG_TMP,
+ h.base, h.index, SHIFT_IMM_LSL(0));
+ tcg_out_st32_12(s, h.cond, datalo, TCG_REG_TMP, 0);
+ tcg_out_st32_12(s, h.cond, datahi, TCG_REG_TMP, 4);
}
break;
default:
g_assert_not_reached();
}
}
-#endif
static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
TCGReg addrlo, TCGReg addrhi,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
+ HostAddress h;
#ifdef CONFIG_SOFTMMU
- TCGReg addend = tcg_out_tlb_read(s, addrlo, addrhi, opc, get_mmuidx(oi), 0);
-
- tcg_out_qemu_st_index(s, COND_EQ, opc, datalo, datahi,
- addrlo, addend, true);
+ h.cond = COND_EQ;
+ h.base = addrlo;
+ h.index_scratch = true;
+ h.index = tcg_out_tlb_read(s, addrlo, addrhi, opc, get_mmuidx(oi), 0);
+ tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
/* The conditional call must come last, as we're going to return here. */
tcg_insn_unit *label_ptr = s->code_ptr;
@@ -1926,17 +1906,19 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
add_qemu_ldst_label(s, false, oi, data_type, datalo, datahi,
addrlo, addrhi, s->code_ptr, label_ptr);
-#else /* !CONFIG_SOFTMMU */
+#else
unsigned a_bits = get_alignment_bits(opc);
+
+ h.cond = COND_AL;
if (a_bits) {
tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
+ h.cond = COND_EQ;
}
- if (guest_base) {
- tcg_out_qemu_st_index(s, COND_AL, opc, datalo, datahi,
- addrlo, TCG_REG_GUEST_BASE, false);
- } else {
- tcg_out_qemu_st_direct(s, opc, datalo, datahi, addrlo);
- }
+
+ h.base = addrlo;
+ h.index = guest_base ? TCG_REG_GUEST_BASE : -1;
+ h.index_scratch = false;
+ tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
#endif
}
--
2.34.1
* [PATCH v4 13/54] tcg/arm: Introduce prepare_host_addr
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (11 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 12/54] tcg/arm: Introduce HostAddress Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 14/54] tcg/loongarch64: Rationalize args to tcg_out_qemu_{ld,st} Richard Henderson
` (40 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Merge tcg_out_tlb_read, add_qemu_ldst_label, and some code that lived
in both tcg_out_qemu_ld and tcg_out_qemu_st into one function that
returns HostAddress and TCGLabelQemuLdst structures.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
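One asymmetry the conversion preserves, condensed from the hunks below: the slow-path call is a conditional BL, emitted before the fast path for loads but after it for stores.

    /* load: BL first, only to place a return pointer in LR */
    ldst->label_ptr[0] = s->code_ptr;
    tcg_out_bl_imm(s, COND_NE, 0);
    tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
    ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);

    /* store: BL last, because execution returns here afterward */
    tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
    ldst->label_ptr[0] = s->code_ptr;
    tcg_out_bl_imm(s, COND_NE, 0);
    ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);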
tcg/arm/tcg-target.c.inc | 351 ++++++++++++++++++---------------------
1 file changed, 159 insertions(+), 192 deletions(-)
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index b6b4ffc546..c744512778 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1434,125 +1434,6 @@ static TCGReg tcg_out_arg_reg64(TCGContext *s, TCGReg argreg,
}
}
-#define TLB_SHIFT (CPU_TLB_ENTRY_BITS + CPU_TLB_BITS)
-
- /* We expect to use a 9-bit sign-magnitude negative offset from ENV. */
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -256);
-
-/* These offsets are built into the LDRD below. */
-QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
-QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 4);
-
-/* Load and compare a TLB entry, leaving the flags set. Returns the register
- containing the addend of the tlb entry. Clobbers R0, R1, R2, TMP. */
-
-static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
- MemOp opc, int mem_index, bool is_load)
-{
- int cmp_off = (is_load ? offsetof(CPUTLBEntry, addr_read)
- : offsetof(CPUTLBEntry, addr_write));
- int fast_off = TLB_MASK_TABLE_OFS(mem_index);
- unsigned s_mask = (1 << (opc & MO_SIZE)) - 1;
- unsigned a_mask = (1 << get_alignment_bits(opc)) - 1;
- TCGReg t_addr;
-
- /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {r0,r1}. */
- tcg_out_ldrd_8(s, COND_AL, TCG_REG_R0, TCG_AREG0, fast_off);
-
- /* Extract the tlb index from the address into R0. */
- tcg_out_dat_reg(s, COND_AL, ARITH_AND, TCG_REG_R0, TCG_REG_R0, addrlo,
- SHIFT_IMM_LSR(TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS));
-
- /*
- * Add the tlb_table pointer, creating the CPUTLBEntry address in R1.
- * Load the tlb comparator into R2/R3 and the fast path addend into R1.
- */
- if (cmp_off == 0) {
- if (TARGET_LONG_BITS == 64) {
- tcg_out_ldrd_rwb(s, COND_AL, TCG_REG_R2, TCG_REG_R1, TCG_REG_R0);
- } else {
- tcg_out_ld32_rwb(s, COND_AL, TCG_REG_R2, TCG_REG_R1, TCG_REG_R0);
- }
- } else {
- tcg_out_dat_reg(s, COND_AL, ARITH_ADD,
- TCG_REG_R1, TCG_REG_R1, TCG_REG_R0, 0);
- if (TARGET_LONG_BITS == 64) {
- tcg_out_ldrd_8(s, COND_AL, TCG_REG_R2, TCG_REG_R1, cmp_off);
- } else {
- tcg_out_ld32_12(s, COND_AL, TCG_REG_R2, TCG_REG_R1, cmp_off);
- }
- }
-
- /* Load the tlb addend. */
- tcg_out_ld32_12(s, COND_AL, TCG_REG_R1, TCG_REG_R1,
- offsetof(CPUTLBEntry, addend));
-
- /*
- * Check alignment, check comparators.
- * Do this in 2-4 insns. Use MOVW for v7, if possible,
- * to reduce the number of sequential conditional instructions.
- * Almost all guests have at least 4k pages, which means that we need
- * to clear at least 9 bits even for an 8-byte memory, which means it
- * isn't worth checking for an immediate operand for BIC.
- *
- * For unaligned accesses, test the page of the last unit of alignment.
- * This leaves the least significant alignment bits unchanged, and of
- * course must be zero.
- */
- t_addr = addrlo;
- if (a_mask < s_mask) {
- t_addr = TCG_REG_R0;
- tcg_out_dat_imm(s, COND_AL, ARITH_ADD, t_addr,
- addrlo, s_mask - a_mask);
- }
- if (use_armv7_instructions && TARGET_PAGE_BITS <= 16) {
- tcg_out_movi32(s, COND_AL, TCG_REG_TMP, ~(TARGET_PAGE_MASK | a_mask));
- tcg_out_dat_reg(s, COND_AL, ARITH_BIC, TCG_REG_TMP,
- t_addr, TCG_REG_TMP, 0);
- tcg_out_dat_reg(s, COND_AL, ARITH_CMP, 0, TCG_REG_R2, TCG_REG_TMP, 0);
- } else {
- if (a_mask) {
- tcg_debug_assert(a_mask <= 0xff);
- tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
- }
- tcg_out_dat_reg(s, COND_AL, ARITH_MOV, TCG_REG_TMP, 0, t_addr,
- SHIFT_IMM_LSR(TARGET_PAGE_BITS));
- tcg_out_dat_reg(s, (a_mask ? COND_EQ : COND_AL), ARITH_CMP,
- 0, TCG_REG_R2, TCG_REG_TMP,
- SHIFT_IMM_LSL(TARGET_PAGE_BITS));
- }
-
- if (TARGET_LONG_BITS == 64) {
- tcg_out_dat_reg(s, COND_EQ, ARITH_CMP, 0, TCG_REG_R3, addrhi, 0);
- }
-
- return TCG_REG_R1;
-}
-
-/* Record the context of a call to the out of line helper code for the slow
- path for a load or store, so that we can later generate the correct
- helper code. */
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld,
- MemOpIdx oi, TCGType type,
- TCGReg datalo, TCGReg datahi,
- TCGReg addrlo, TCGReg addrhi,
- tcg_insn_unit *raddr,
- tcg_insn_unit *label_ptr)
-{
- TCGLabelQemuLdst *label = new_ldst_label(s);
-
- label->is_ld = is_ld;
- label->oi = oi;
- label->type = type;
- label->datalo_reg = datalo;
- label->datahi_reg = datahi;
- label->addrlo_reg = addrlo;
- label->addrhi_reg = addrhi;
- label->raddr = tcg_splitwx_to_rx(raddr);
- label->label_ptr[0] = label_ptr;
-}
-
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
{
TCGReg argreg;
@@ -1636,29 +1517,6 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
return true;
}
#else
-
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
- TCGReg addrhi, unsigned a_bits)
-{
- unsigned a_mask = (1 << a_bits) - 1;
- TCGLabelQemuLdst *label = new_ldst_label(s);
-
- label->is_ld = is_ld;
- label->addrlo_reg = addrlo;
- label->addrhi_reg = addrhi;
-
- /* We are expecting a_bits to max out at 7, and can easily support 8. */
- tcg_debug_assert(a_mask <= 0xff);
- /* tst addr, #mask */
- tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
-
- /* blne slow_path */
- label->label_ptr[0] = s->code_ptr;
- tcg_out_bl_imm(s, COND_NE, 0);
-
- label->raddr = tcg_splitwx_to_rx(s->code_ptr);
-}
-
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
{
if (!reloc_pc24(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
@@ -1703,6 +1561,134 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
}
#endif /* SOFTMMU */
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
+ TCGReg addrlo, TCGReg addrhi,
+ MemOpIdx oi, bool is_ld)
+{
+ TCGLabelQemuLdst *ldst = NULL;
+ MemOp opc = get_memop(oi);
+ MemOp a_bits = get_alignment_bits(opc);
+ unsigned a_mask = (1 << a_bits) - 1;
+
+#ifdef CONFIG_SOFTMMU
+ int mem_index = get_mmuidx(oi);
+ int cmp_off = is_ld ? offsetof(CPUTLBEntry, addr_read)
+ : offsetof(CPUTLBEntry, addr_write);
+ int fast_off = TLB_MASK_TABLE_OFS(mem_index);
+ unsigned s_mask = (1 << (opc & MO_SIZE)) - 1;
+ TCGReg t_addr;
+
+ ldst = new_ldst_label(s);
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addrlo;
+ ldst->addrhi_reg = addrhi;
+
+ /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {r0,r1}. */
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -256);
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, mask) != 0);
+ QEMU_BUILD_BUG_ON(offsetof(CPUTLBDescFast, table) != 4);
+ tcg_out_ldrd_8(s, COND_AL, TCG_REG_R0, TCG_AREG0, fast_off);
+
+ /* Extract the tlb index from the address into R0. */
+ tcg_out_dat_reg(s, COND_AL, ARITH_AND, TCG_REG_R0, TCG_REG_R0, addrlo,
+ SHIFT_IMM_LSR(TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS));
+
+ /*
+ * Add the tlb_table pointer, creating the CPUTLBEntry address in R1.
+ * Load the tlb comparator into R2/R3 and the fast path addend into R1.
+ */
+ if (cmp_off == 0) {
+ if (TARGET_LONG_BITS == 64) {
+ tcg_out_ldrd_rwb(s, COND_AL, TCG_REG_R2, TCG_REG_R1, TCG_REG_R0);
+ } else {
+ tcg_out_ld32_rwb(s, COND_AL, TCG_REG_R2, TCG_REG_R1, TCG_REG_R0);
+ }
+ } else {
+ tcg_out_dat_reg(s, COND_AL, ARITH_ADD,
+ TCG_REG_R1, TCG_REG_R1, TCG_REG_R0, 0);
+ if (TARGET_LONG_BITS == 64) {
+ tcg_out_ldrd_8(s, COND_AL, TCG_REG_R2, TCG_REG_R1, cmp_off);
+ } else {
+ tcg_out_ld32_12(s, COND_AL, TCG_REG_R2, TCG_REG_R1, cmp_off);
+ }
+ }
+
+ /* Load the tlb addend. */
+ tcg_out_ld32_12(s, COND_AL, TCG_REG_R1, TCG_REG_R1,
+ offsetof(CPUTLBEntry, addend));
+
+ /*
+ * Check alignment, check comparators.
+ * Do this in 2-4 insns. Use MOVW for v7, if possible,
+ * to reduce the number of sequential conditional instructions.
+ * Almost all guests have at least 4k pages, which means that we need
+ * to clear at least 9 bits even for an 8-byte memory, which means it
+ * isn't worth checking for an immediate operand for BIC.
+ *
+ * For unaligned accesses, test the page of the last unit of alignment.
+ * This leaves the least significant alignment bits unchanged, and of
+ * course must be zero.
+ */
+ t_addr = addrlo;
+ if (a_mask < s_mask) {
+ t_addr = TCG_REG_R0;
+ tcg_out_dat_imm(s, COND_AL, ARITH_ADD, t_addr,
+ addrlo, s_mask - a_mask);
+ }
+ if (use_armv7_instructions && TARGET_PAGE_BITS <= 16) {
+ tcg_out_movi32(s, COND_AL, TCG_REG_TMP, ~(TARGET_PAGE_MASK | a_mask));
+ tcg_out_dat_reg(s, COND_AL, ARITH_BIC, TCG_REG_TMP,
+ t_addr, TCG_REG_TMP, 0);
+ tcg_out_dat_reg(s, COND_AL, ARITH_CMP, 0, TCG_REG_R2, TCG_REG_TMP, 0);
+ } else {
+ if (a_mask) {
+ tcg_debug_assert(a_mask <= 0xff);
+ tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
+ }
+ tcg_out_dat_reg(s, COND_AL, ARITH_MOV, TCG_REG_TMP, 0, t_addr,
+ SHIFT_IMM_LSR(TARGET_PAGE_BITS));
+ tcg_out_dat_reg(s, (a_mask ? COND_EQ : COND_AL), ARITH_CMP,
+ 0, TCG_REG_R2, TCG_REG_TMP,
+ SHIFT_IMM_LSL(TARGET_PAGE_BITS));
+ }
+
+ if (TARGET_LONG_BITS == 64) {
+ tcg_out_dat_reg(s, COND_EQ, ARITH_CMP, 0, TCG_REG_R3, addrhi, 0);
+ }
+
+ *h = (HostAddress){
+ .cond = COND_AL,
+ .base = addrlo,
+ .index = TCG_REG_R1,
+ .index_scratch = true,
+ };
+#else
+ if (a_mask) {
+ ldst = new_ldst_label(s);
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addrlo;
+ ldst->addrhi_reg = addrhi;
+
+ /* We are expecting a_bits to max out at 7 */
+ tcg_debug_assert(a_mask <= 0xff);
+ /* tst addr, #mask */
+ tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
+ }
+
+ *h = (HostAddress){
+ .cond = COND_AL,
+ .base = addrlo,
+ .index = guest_base ? TCG_REG_GUEST_BASE : -1,
+ .index_scratch = false,
+ };
+#endif
+
+ return ldst;
+}
+
static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg datalo,
TCGReg datahi, HostAddress h)
{
@@ -1799,37 +1785,28 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
+ TCGLabelQemuLdst *ldst;
HostAddress h;
-#ifdef CONFIG_SOFTMMU
- h.cond = COND_AL;
- h.base = addrlo;
- h.index_scratch = true;
- h.index = tcg_out_tlb_read(s, addrlo, addrhi, opc, get_mmuidx(oi), 1);
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = datalo;
+ ldst->datahi_reg = datahi;
- /*
- * This is a conditional BL only to load a pointer within this opcode into
- * LR for the slow path. We will not be using the value for a tail call.
- */
- tcg_insn_unit *label_ptr = s->code_ptr;
- tcg_out_bl_imm(s, COND_NE, 0);
+ /*
+ * This is a conditional BL only to load a pointer within this
+ * opcode into LR for the slow path. We will not be using
+ * the value for a tail call.
+ */
+ ldst->label_ptr[0] = s->code_ptr;
+ tcg_out_bl_imm(s, COND_NE, 0);
- tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
-
- add_qemu_ldst_label(s, true, oi, data_type, datalo, datahi,
- addrlo, addrhi, s->code_ptr, label_ptr);
-#else
- unsigned a_bits = get_alignment_bits(opc);
- if (a_bits) {
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
+ tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
+ } else {
+ tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
}
-
- h.cond = COND_AL;
- h.base = addrlo;
- h.index = guest_base ? TCG_REG_GUEST_BASE : -1;
- h.index_scratch = false;
- tcg_out_qemu_ld_direct(s, opc, datalo, datahi, h);
-#endif
}
static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg datalo,
@@ -1891,35 +1868,25 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
+ TCGLabelQemuLdst *ldst;
HostAddress h;
-#ifdef CONFIG_SOFTMMU
- h.cond = COND_EQ;
- h.base = addrlo;
- h.index_scratch = true;
- h.index = tcg_out_tlb_read(s, addrlo, addrhi, opc, get_mmuidx(oi), 0);
- tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, false);
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = datalo;
+ ldst->datahi_reg = datahi;
- /* The conditional call must come last, as we're going to return here. */
- tcg_insn_unit *label_ptr = s->code_ptr;
- tcg_out_bl_imm(s, COND_NE, 0);
-
- add_qemu_ldst_label(s, false, oi, data_type, datalo, datahi,
- addrlo, addrhi, s->code_ptr, label_ptr);
-#else
- unsigned a_bits = get_alignment_bits(opc);
-
- h.cond = COND_AL;
- if (a_bits) {
- tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
h.cond = COND_EQ;
- }
+ tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
- h.base = addrlo;
- h.index = guest_base ? TCG_REG_GUEST_BASE : -1;
- h.index_scratch = false;
- tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
-#endif
+ /* The conditional call is last, as we're going to return here. */
+ ldst->label_ptr[0] = s->code_ptr;
+ tcg_out_bl_imm(s, COND_NE, 0);
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
+ } else {
+ tcg_out_qemu_st_direct(s, opc, datalo, datahi, h);
+ }
}
static void tcg_out_epilogue(TCGContext *s);
--
2.34.1
* [PATCH v4 14/54] tcg/loongarch64: Rationalize args to tcg_out_qemu_{ld,st}
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (12 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 13/54] tcg/arm: Introduce prepare_host_addr Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 15/54] tcg/loongarch64: Introduce HostAddress Richard Henderson
` (39 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Interpret the variable argument placement in the caller. Shift some
code around slightly to share more between softmmu and user-only.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
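The shared shape after the change, condensed from the hunks below (load side shown):

    #ifdef CONFIG_SOFTMMU
        tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 1);
        index = TCG_REG_TMP2;                       /* TLB addend */
    #else
        unsigned a_bits = get_alignment_bits(opc);
        if (a_bits) {
            tcg_out_test_alignment(s, true, addr_reg, a_bits);
        }
        index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
    #endif
        base = tcg_out_zext_addr_if_32_bit(s, addr_reg, TCG_REG_TMP0);
        tcg_out_qemu_ld_indexed(s, data_reg, base, index, opc, data_type);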
tcg/loongarch64/tcg-target.c.inc | 100 +++++++++++++------------------
1 file changed, 42 insertions(+), 58 deletions(-)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 0940788c6f..2e3c67054b 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1049,39 +1049,31 @@ static void tcg_out_qemu_ld_indexed(TCGContext *s, TCGReg rd, TCGReg rj,
}
}
-static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, TCGType type)
+static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
+ MemOpIdx oi, TCGType data_type)
{
- TCGReg addr_regl;
- TCGReg data_regl;
- MemOpIdx oi;
- MemOp opc;
-#if defined(CONFIG_SOFTMMU)
+ MemOp opc = get_memop(oi);
+ TCGReg base, index;
+
+#ifdef CONFIG_SOFTMMU
tcg_insn_unit *label_ptr[1];
-#else
- unsigned a_bits;
-#endif
- TCGReg base;
- data_regl = *args++;
- addr_regl = *args++;
- oi = *args++;
- opc = get_memop(oi);
-
-#if defined(CONFIG_SOFTMMU)
- tcg_out_tlb_load(s, addr_regl, oi, label_ptr, 1);
- base = tcg_out_zext_addr_if_32_bit(s, addr_regl, TCG_REG_TMP0);
- tcg_out_qemu_ld_indexed(s, data_regl, base, TCG_REG_TMP2, opc, type);
- add_qemu_ldst_label(s, 1, oi, type,
- data_regl, addr_regl,
- s->code_ptr, label_ptr);
+ tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 1);
+ index = TCG_REG_TMP2;
#else
- a_bits = get_alignment_bits(opc);
+ unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
- tcg_out_test_alignment(s, true, addr_regl, a_bits);
+ tcg_out_test_alignment(s, true, addr_reg, a_bits);
}
- base = tcg_out_zext_addr_if_32_bit(s, addr_regl, TCG_REG_TMP0);
- TCGReg guest_base_reg = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
- tcg_out_qemu_ld_indexed(s, data_regl, base, guest_base_reg, opc, type);
+ index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
+#endif
+
+ base = tcg_out_zext_addr_if_32_bit(s, addr_reg, TCG_REG_TMP0);
+ tcg_out_qemu_ld_indexed(s, data_reg, base, index, opc, data_type);
+
+#ifdef CONFIG_SOFTMMU
+ add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
+ s->code_ptr, label_ptr);
#endif
}
@@ -1109,39 +1101,31 @@ static void tcg_out_qemu_st_indexed(TCGContext *s, TCGReg data,
}
}
-static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, TCGType type)
+static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
+ MemOpIdx oi, TCGType data_type)
{
- TCGReg addr_regl;
- TCGReg data_regl;
- MemOpIdx oi;
- MemOp opc;
-#if defined(CONFIG_SOFTMMU)
+ MemOp opc = get_memop(oi);
+ TCGReg base, index;
+
+#ifdef CONFIG_SOFTMMU
tcg_insn_unit *label_ptr[1];
-#else
- unsigned a_bits;
-#endif
- TCGReg base;
- data_regl = *args++;
- addr_regl = *args++;
- oi = *args++;
- opc = get_memop(oi);
-
-#if defined(CONFIG_SOFTMMU)
- tcg_out_tlb_load(s, addr_regl, oi, label_ptr, 0);
- base = tcg_out_zext_addr_if_32_bit(s, addr_regl, TCG_REG_TMP0);
- tcg_out_qemu_st_indexed(s, data_regl, base, TCG_REG_TMP2, opc);
- add_qemu_ldst_label(s, 0, oi, type,
- data_regl, addr_regl,
- s->code_ptr, label_ptr);
+ tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 0);
+ index = TCG_REG_TMP2;
#else
- a_bits = get_alignment_bits(opc);
+ unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
- tcg_out_test_alignment(s, false, addr_regl, a_bits);
+ tcg_out_test_alignment(s, false, addr_reg, a_bits);
}
- base = tcg_out_zext_addr_if_32_bit(s, addr_regl, TCG_REG_TMP0);
- TCGReg guest_base_reg = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
- tcg_out_qemu_st_indexed(s, data_regl, base, guest_base_reg, opc);
+ index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
+#endif
+
+ base = tcg_out_zext_addr_if_32_bit(s, addr_reg, TCG_REG_TMP0);
+ tcg_out_qemu_st_indexed(s, data_reg, base, index, opc);
+
+#ifdef CONFIG_SOFTMMU
+ add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
+ s->code_ptr, label_ptr);
#endif
}
@@ -1564,16 +1548,16 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
break;
case INDEX_op_qemu_ld_i32:
- tcg_out_qemu_ld(s, args, TCG_TYPE_I32);
+ tcg_out_qemu_ld(s, a0, a1, a2, TCG_TYPE_I32);
break;
case INDEX_op_qemu_ld_i64:
- tcg_out_qemu_ld(s, args, TCG_TYPE_I64);
+ tcg_out_qemu_ld(s, a0, a1, a2, TCG_TYPE_I64);
break;
case INDEX_op_qemu_st_i32:
- tcg_out_qemu_st(s, args, TCG_TYPE_I32);
+ tcg_out_qemu_st(s, a0, a1, a2, TCG_TYPE_I32);
break;
case INDEX_op_qemu_st_i64:
- tcg_out_qemu_st(s, args, TCG_TYPE_I64);
+ tcg_out_qemu_st(s, a0, a1, a2, TCG_TYPE_I64);
break;
case INDEX_op_mov_i32: /* Always emitted via tcg_out_mov. */
--
2.34.1
* [PATCH v4 15/54] tcg/loongarch64: Introduce HostAddress
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (13 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 14/54] tcg/loongarch64: Rationalize args to tcg_out_qemu_{ld,st} Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 16/54] tcg/loongarch64: Introduce prepare_host_addr Richard Henderson
` (38 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Collect the 2 parts of the host address into a struct.
Reorg tcg_out_qemu_{ld,st}_direct to use it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
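Here the struct needs only two parts, since every access below is emitted as a base+index form, with TCG_REG_ZERO standing in for a missing index:

    typedef struct {
        TCGReg base;
        TCGReg index;
    } HostAddress;

    /* softmmu:   h.index = TCG_REG_TMP2, the TLB addend       */
    /* user-only: h.index = TCG_GUEST_BASE_REG or TCG_REG_ZERO */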
tcg/loongarch64/tcg-target.c.inc | 55 +++++++++++++++++---------------
1 file changed, 30 insertions(+), 25 deletions(-)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 2e3c67054b..6a87a5e5a3 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1013,36 +1013,41 @@ static TCGReg tcg_out_zext_addr_if_32_bit(TCGContext *s,
return addr;
}
-static void tcg_out_qemu_ld_indexed(TCGContext *s, TCGReg rd, TCGReg rj,
- TCGReg rk, MemOp opc, TCGType type)
+typedef struct {
+ TCGReg base;
+ TCGReg index;
+} HostAddress;
+
+static void tcg_out_qemu_ld_indexed(TCGContext *s, MemOp opc, TCGType type,
+ TCGReg rd, HostAddress h)
{
/* Byte swapping is left to middle-end expansion. */
tcg_debug_assert((opc & MO_BSWAP) == 0);
switch (opc & MO_SSIZE) {
case MO_UB:
- tcg_out_opc_ldx_bu(s, rd, rj, rk);
+ tcg_out_opc_ldx_bu(s, rd, h.base, h.index);
break;
case MO_SB:
- tcg_out_opc_ldx_b(s, rd, rj, rk);
+ tcg_out_opc_ldx_b(s, rd, h.base, h.index);
break;
case MO_UW:
- tcg_out_opc_ldx_hu(s, rd, rj, rk);
+ tcg_out_opc_ldx_hu(s, rd, h.base, h.index);
break;
case MO_SW:
- tcg_out_opc_ldx_h(s, rd, rj, rk);
+ tcg_out_opc_ldx_h(s, rd, h.base, h.index);
break;
case MO_UL:
if (type == TCG_TYPE_I64) {
- tcg_out_opc_ldx_wu(s, rd, rj, rk);
+ tcg_out_opc_ldx_wu(s, rd, h.base, h.index);
break;
}
/* fallthrough */
case MO_SL:
- tcg_out_opc_ldx_w(s, rd, rj, rk);
+ tcg_out_opc_ldx_w(s, rd, h.base, h.index);
break;
case MO_UQ:
- tcg_out_opc_ldx_d(s, rd, rj, rk);
+ tcg_out_opc_ldx_d(s, rd, h.base, h.index);
break;
default:
g_assert_not_reached();
@@ -1053,23 +1058,23 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
- TCGReg base, index;
+ HostAddress h;
#ifdef CONFIG_SOFTMMU
tcg_insn_unit *label_ptr[1];
tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 1);
- index = TCG_REG_TMP2;
+ h.index = TCG_REG_TMP2;
#else
unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, true, addr_reg, a_bits);
}
- index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
+ h.index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
#endif
- base = tcg_out_zext_addr_if_32_bit(s, addr_reg, TCG_REG_TMP0);
- tcg_out_qemu_ld_indexed(s, data_reg, base, index, opc, data_type);
+ h.base = tcg_out_zext_addr_if_32_bit(s, addr_reg, TCG_REG_TMP0);
+ tcg_out_qemu_ld_indexed(s, opc, data_type, data_reg, h);
#ifdef CONFIG_SOFTMMU
add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
@@ -1077,24 +1082,24 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
#endif
}
-static void tcg_out_qemu_st_indexed(TCGContext *s, TCGReg data,
- TCGReg rj, TCGReg rk, MemOp opc)
+static void tcg_out_qemu_st_indexed(TCGContext *s, MemOp opc,
+ TCGReg rd, HostAddress h)
{
/* Byte swapping is left to middle-end expansion. */
tcg_debug_assert((opc & MO_BSWAP) == 0);
switch (opc & MO_SIZE) {
case MO_8:
- tcg_out_opc_stx_b(s, data, rj, rk);
+ tcg_out_opc_stx_b(s, rd, h.base, h.index);
break;
case MO_16:
- tcg_out_opc_stx_h(s, data, rj, rk);
+ tcg_out_opc_stx_h(s, rd, h.base, h.index);
break;
case MO_32:
- tcg_out_opc_stx_w(s, data, rj, rk);
+ tcg_out_opc_stx_w(s, rd, h.base, h.index);
break;
case MO_64:
- tcg_out_opc_stx_d(s, data, rj, rk);
+ tcg_out_opc_stx_d(s, rd, h.base, h.index);
break;
default:
g_assert_not_reached();
@@ -1105,23 +1110,23 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
- TCGReg base, index;
+ HostAddress h;
#ifdef CONFIG_SOFTMMU
tcg_insn_unit *label_ptr[1];
tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 0);
- index = TCG_REG_TMP2;
+ h.index = TCG_REG_TMP2;
#else
unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, false, addr_reg, a_bits);
}
- index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
+ h.index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
#endif
- base = tcg_out_zext_addr_if_32_bit(s, addr_reg, TCG_REG_TMP0);
- tcg_out_qemu_st_indexed(s, data_reg, base, index, opc);
+ h.base = tcg_out_zext_addr_if_32_bit(s, addr_reg, TCG_REG_TMP0);
+ tcg_out_qemu_st_indexed(s, opc, data_reg, h);
#ifdef CONFIG_SOFTMMU
add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
--
2.34.1
* [PATCH v4 16/54] tcg/loongarch64: Introduce prepare_host_addr
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (14 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 15/54] tcg/loongarch64: Introduce HostAddress Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 17/54] tcg/mips: Rationalize args to tcg_out_qemu_{ld,st} Richard Henderson
` (37 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
tcg_out_zext_addr_if_32_bit, and some code that lived in both
tcg_out_qemu_ld and tcg_out_qemu_st into one function that returns
HostAddress and TCGLabelQemuLdst structures.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/loongarch64/tcg-target.c.inc | 255 +++++++++++++------------------
1 file changed, 105 insertions(+), 150 deletions(-)
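The resulting fast-path skeleton, condensed from the diff below (QEMU-internal types assumed; not standalone code):

    TCGLabelQemuLdst *ldst;
    HostAddress h;

    /* TLB lookup (softmmu) or alignment test (user-only); returns
       NULL when no slow path is required. */
    ldst = prepare_host_addr(s, &h, addr_reg, oi, true);
    tcg_out_qemu_ld_indexed(s, get_memop(oi), data_type, data_reg, h);

    if (ldst) {
        /* prepare_host_addr fills the fields common to loads and
           stores; the caller adds the data-dependent ones. */
        ldst->type = data_type;
        ldst->datalo_reg = data_reg;
        ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
    }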
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 6a87a5e5a3..2f2c34b930 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -818,81 +818,12 @@ static void * const qemu_st_helpers[4] = {
[MO_64] = helper_le_stq_mmu,
};
-/* We expect to use a 12-bit negative offset from ENV. */
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
-
static bool tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
{
tcg_out_opc_b(s, 0);
return reloc_br_sd10k16(s->code_ptr - 1, target);
}
-/*
- * Emits common code for TLB addend lookup, that eventually loads the
- * addend in TCG_REG_TMP2.
- */
-static void tcg_out_tlb_load(TCGContext *s, TCGReg addrl, MemOpIdx oi,
- tcg_insn_unit **label_ptr, bool is_load)
-{
- MemOp opc = get_memop(oi);
- unsigned s_bits = opc & MO_SIZE;
- unsigned a_bits = get_alignment_bits(opc);
- tcg_target_long compare_mask;
- int mem_index = get_mmuidx(oi);
- int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
- int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
- int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
-
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, TCG_AREG0, mask_ofs);
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_AREG0, table_ofs);
-
- tcg_out_opc_srli_d(s, TCG_REG_TMP2, addrl,
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
- tcg_out_opc_and(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
- tcg_out_opc_add_d(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
-
- /* Load the tlb comparator and the addend. */
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
- is_load ? offsetof(CPUTLBEntry, addr_read)
- : offsetof(CPUTLBEntry, addr_write));
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
- offsetof(CPUTLBEntry, addend));
-
- /* We don't support unaligned accesses. */
- if (a_bits < s_bits) {
- a_bits = s_bits;
- }
- /* Clear the non-page, non-alignment bits from the address. */
- compare_mask = (tcg_target_long)TARGET_PAGE_MASK | ((1 << a_bits) - 1);
- tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
- tcg_out_opc_and(s, TCG_REG_TMP1, TCG_REG_TMP1, addrl);
-
- /* Compare masked address with the TLB entry. */
- label_ptr[0] = s->code_ptr;
- tcg_out_opc_bne(s, TCG_REG_TMP0, TCG_REG_TMP1, 0);
-
- /* TLB Hit - addend in TCG_REG_TMP2, ready for use. */
-}
-
-static void add_qemu_ldst_label(TCGContext *s, int is_ld, MemOpIdx oi,
- TCGType type,
- TCGReg datalo, TCGReg addrlo,
- void *raddr, tcg_insn_unit **label_ptr)
-{
- TCGLabelQemuLdst *label = new_ldst_label(s);
-
- label->is_ld = is_ld;
- label->oi = oi;
- label->type = type;
- label->datalo_reg = datalo;
- label->datahi_reg = 0; /* unused */
- label->addrlo_reg = addrlo;
- label->addrhi_reg = 0; /* unused */
- label->raddr = tcg_splitwx_to_rx(raddr);
- label->label_ptr[0] = label_ptr[0];
-}
-
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
MemOpIdx oi = l->oi;
@@ -941,33 +872,6 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
return tcg_out_goto(s, l->raddr);
}
#else
-
-/*
- * Alignment helpers for user-mode emulation
- */
-
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addr_reg,
- unsigned a_bits)
-{
- TCGLabelQemuLdst *l = new_ldst_label(s);
-
- l->is_ld = is_ld;
- l->addrlo_reg = addr_reg;
-
- /*
- * Without micro-architecture details, we don't know which of bstrpick or
- * andi is faster, so use bstrpick as it's not constrained by imm field
- * width. (Not to say alignments >= 2^12 are going to happen any time
- * soon, though)
- */
- tcg_out_opc_bstrpick_d(s, TCG_REG_TMP1, addr_reg, 0, a_bits - 1);
-
- l->label_ptr[0] = s->code_ptr;
- tcg_out_opc_bne(s, TCG_REG_TMP1, TCG_REG_ZERO, 0);
-
- l->raddr = tcg_splitwx_to_rx(s->code_ptr);
-}
-
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
{
/* resolve label address */
@@ -997,27 +901,102 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
#endif /* CONFIG_SOFTMMU */
-/*
- * `ext32u` the address register into the temp register given,
- * if target is 32-bit, no-op otherwise.
- *
- * Returns the address register ready for use with TLB addend.
- */
-static TCGReg tcg_out_zext_addr_if_32_bit(TCGContext *s,
- TCGReg addr, TCGReg tmp)
-{
- if (TARGET_LONG_BITS == 32) {
- tcg_out_ext32u(s, tmp, addr);
- return tmp;
- }
- return addr;
-}
-
typedef struct {
TCGReg base;
TCGReg index;
} HostAddress;
+/*
+ * For softmmu, perform the TLB load and compare.
+ * For useronly, perform any required alignment tests.
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
+ * is required and fill in @h with the host address for the fast path.
+ */
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
+ TCGReg addr_reg, MemOpIdx oi,
+ bool is_ld)
+{
+ TCGLabelQemuLdst *ldst = NULL;
+ MemOp opc = get_memop(oi);
+ unsigned a_bits = get_alignment_bits(opc);
+
+#ifdef CONFIG_SOFTMMU
+ unsigned s_bits = opc & MO_SIZE;
+ int mem_index = get_mmuidx(oi);
+ int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
+ int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
+ int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
+ tcg_target_long compare_mask;
+
+ ldst = new_ldst_label(s);
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addr_reg;
+
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, TCG_AREG0, mask_ofs);
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_AREG0, table_ofs);
+
+ tcg_out_opc_srli_d(s, TCG_REG_TMP2, addr_reg,
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+ tcg_out_opc_and(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
+ tcg_out_opc_add_d(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
+
+ /* Load the tlb comparator and the addend. */
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
+ is_ld ? offsetof(CPUTLBEntry, addr_read)
+ : offsetof(CPUTLBEntry, addr_write));
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
+ offsetof(CPUTLBEntry, addend));
+
+ /* We don't support unaligned accesses. */
+ if (a_bits < s_bits) {
+ a_bits = s_bits;
+ }
+ /* Clear the non-page, non-alignment bits from the address. */
+ compare_mask = (tcg_target_long)TARGET_PAGE_MASK | ((1 << a_bits) - 1);
+ tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
+ tcg_out_opc_and(s, TCG_REG_TMP1, TCG_REG_TMP1, addr_reg);
+
+ /* Compare masked address with the TLB entry. */
+ ldst->label_ptr[0] = s->code_ptr;
+ tcg_out_opc_bne(s, TCG_REG_TMP0, TCG_REG_TMP1, 0);
+
+ h->index = TCG_REG_TMP2;
+#else
+ if (a_bits) {
+ ldst = new_ldst_label(s);
+
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addr_reg;
+
+ /*
+ * Without micro-architecture details, we don't know which of
+ * bstrpick or andi is faster, so use bstrpick as it's not
+ * constrained by imm field width. Not to say alignments >= 2^12
+ * are going to happen any time soon.
+ */
+ tcg_out_opc_bstrpick_d(s, TCG_REG_TMP1, addr_reg, 0, a_bits - 1);
+
+ ldst->label_ptr[0] = s->code_ptr;
+ tcg_out_opc_bne(s, TCG_REG_TMP1, TCG_REG_ZERO, 0);
+ }
+
+ h->index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
+#endif
+
+ if (TARGET_LONG_BITS == 32) {
+ h->base = TCG_REG_TMP0;
+ tcg_out_ext32u(s, h->base, addr_reg);
+ } else {
+ h->base = addr_reg;
+ }
+
+ return ldst;
+}
+
static void tcg_out_qemu_ld_indexed(TCGContext *s, MemOp opc, TCGType type,
TCGReg rd, HostAddress h)
{
@@ -1057,29 +1036,17 @@ static void tcg_out_qemu_ld_indexed(TCGContext *s, MemOp opc, TCGType type,
static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
MemOpIdx oi, TCGType data_type)
{
- MemOp opc = get_memop(oi);
+ TCGLabelQemuLdst *ldst;
HostAddress h;
-#ifdef CONFIG_SOFTMMU
- tcg_insn_unit *label_ptr[1];
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, true);
+ tcg_out_qemu_ld_indexed(s, get_memop(oi), data_type, data_reg, h);
- tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 1);
- h.index = TCG_REG_TMP2;
-#else
- unsigned a_bits = get_alignment_bits(opc);
- if (a_bits) {
- tcg_out_test_alignment(s, true, addr_reg, a_bits);
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = data_reg;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
}
- h.index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
-#endif
-
- h.base = tcg_out_zext_addr_if_32_bit(s, addr_reg, TCG_REG_TMP0);
- tcg_out_qemu_ld_indexed(s, opc, data_type, data_reg, h);
-
-#ifdef CONFIG_SOFTMMU
- add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
- s->code_ptr, label_ptr);
-#endif
}
static void tcg_out_qemu_st_indexed(TCGContext *s, MemOp opc,
@@ -1109,29 +1076,17 @@ static void tcg_out_qemu_st_indexed(TCGContext *s, MemOp opc,
static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
MemOpIdx oi, TCGType data_type)
{
- MemOp opc = get_memop(oi);
+ TCGLabelQemuLdst *ldst;
HostAddress h;
-#ifdef CONFIG_SOFTMMU
- tcg_insn_unit *label_ptr[1];
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, false);
+ tcg_out_qemu_st_indexed(s, get_memop(oi), data_reg, h);
- tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 0);
- h.index = TCG_REG_TMP2;
-#else
- unsigned a_bits = get_alignment_bits(opc);
- if (a_bits) {
- tcg_out_test_alignment(s, false, addr_reg, a_bits);
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = data_reg;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
}
- h.index = USE_GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_ZERO;
-#endif
-
- h.base = tcg_out_zext_addr_if_32_bit(s, addr_reg, TCG_REG_TMP0);
- tcg_out_qemu_st_indexed(s, opc, data_reg, h);
-
-#ifdef CONFIG_SOFTMMU
- add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
- s->code_ptr, label_ptr);
-#endif
}
/*
--
2.34.1
* [PATCH v4 17/54] tcg/mips: Rationalize args to tcg_out_qemu_{ld,st}
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (15 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 16/54] tcg/loongarch64: Introduce prepare_host_addr Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 18/54] tcg/mips: Introduce prepare_host_addr Richard Henderson
` (36 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Interpret the variable argument placement in the caller. There are
several places where we already convert back from bool to type.
Clean things up by using type throughout.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/mips/tcg-target.c.inc | 186 +++++++++++++++++++-------------------
1 file changed, 95 insertions(+), 91 deletions(-)
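Concretely, instead of each helper walking a TCGArg array with *args++, the tcg_out_op dispatch now unpacks the operands explicitly, e.g. for a 64-bit load (condensed from the hunk below):

    case INDEX_op_qemu_ld_i64:
        if (TCG_TARGET_REG_BITS == 64) {
            tcg_out_qemu_ld(s, a0, 0, a1, 0, a2, TCG_TYPE_I64);
        } else if (TARGET_LONG_BITS == 32) {
            tcg_out_qemu_ld(s, a0, a1, a2, 0, args[3], TCG_TYPE_I64);
        } else {
            tcg_out_qemu_ld(s, a0, a1, a2, args[3], args[4], TCG_TYPE_I64);
        }
        break;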
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index a83ebe8729..ef8350e9cd 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -1479,7 +1479,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
#endif /* SOFTMMU */
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
- TCGReg base, MemOp opc, bool is_64)
+ TCGReg base, MemOp opc, TCGType type)
{
switch (opc & (MO_SSIZE | MO_BSWAP)) {
case MO_UB:
@@ -1503,7 +1503,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
tcg_out_opc_imm(s, OPC_LH, lo, base, 0);
break;
case MO_UL | MO_BSWAP:
- if (TCG_TARGET_REG_BITS == 64 && is_64) {
+ if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64) {
if (use_mips32r2_instructions) {
tcg_out_opc_imm(s, OPC_LWU, lo, base, 0);
tcg_out_bswap32(s, lo, lo, TCG_BSWAP_IZ | TCG_BSWAP_OZ);
@@ -1528,7 +1528,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
}
break;
case MO_UL:
- if (TCG_TARGET_REG_BITS == 64 && is_64) {
+ if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64) {
tcg_out_opc_imm(s, OPC_LWU, lo, base, 0);
break;
}
@@ -1583,7 +1583,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
}
static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
- TCGReg base, MemOp opc, bool is_64)
+ TCGReg base, MemOp opc, TCGType type)
{
const MIPSInsn lw1 = MIPS_BE ? OPC_LWL : OPC_LWR;
const MIPSInsn lw2 = MIPS_BE ? OPC_LWR : OPC_LWL;
@@ -1623,7 +1623,7 @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
case MO_UL:
tcg_out_opc_imm(s, lw1, lo, base, 0);
tcg_out_opc_imm(s, lw2, lo, base, 3);
- if (TCG_TARGET_REG_BITS == 64 && is_64 && !sgn) {
+ if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64 && !sgn) {
tcg_out_ext32u(s, lo, lo);
}
break;
@@ -1634,18 +1634,18 @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
tcg_out_opc_imm(s, lw1, lo, base, 0);
tcg_out_opc_imm(s, lw2, lo, base, 3);
tcg_out_bswap32(s, lo, lo,
- TCG_TARGET_REG_BITS == 64 && is_64
+ TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64
? (sgn ? TCG_BSWAP_OS : TCG_BSWAP_OZ) : 0);
} else {
const tcg_insn_unit *subr =
- (TCG_TARGET_REG_BITS == 64 && is_64 && !sgn
+ (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64 && !sgn
? bswap32u_addr : bswap32_addr);
tcg_out_opc_imm(s, lw1, TCG_TMP0, base, 0);
tcg_out_bswap_subr(s, subr);
/* delay slot */
tcg_out_opc_imm(s, lw2, TCG_TMP0, base, 3);
- tcg_out_mov(s, is_64 ? TCG_TYPE_I64 : TCG_TYPE_I32, lo, TCG_TMP3);
+ tcg_out_mov(s, type, lo, TCG_TMP3);
}
break;
@@ -1702,68 +1702,59 @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
}
}
-static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is_64)
+static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
+ TCGReg addrlo, TCGReg addrhi,
+ MemOpIdx oi, TCGType data_type)
{
- TCGReg addr_regl, addr_regh __attribute__((unused));
- TCGReg data_regl, data_regh;
- MemOpIdx oi;
- MemOp opc;
-#if defined(CONFIG_SOFTMMU)
- tcg_insn_unit *label_ptr[2];
-#else
-#endif
- unsigned a_bits, s_bits;
- TCGReg base = TCG_REG_A0;
-
- data_regl = *args++;
- data_regh = (TCG_TARGET_REG_BITS == 32 && is_64 ? *args++ : 0);
- addr_regl = *args++;
- addr_regh = (TCG_TARGET_REG_BITS < TARGET_LONG_BITS ? *args++ : 0);
- oi = *args++;
- opc = get_memop(oi);
- a_bits = get_alignment_bits(opc);
- s_bits = opc & MO_SIZE;
+ MemOp opc = get_memop(oi);
+ unsigned a_bits = get_alignment_bits(opc);
+ unsigned s_bits = opc & MO_SIZE;
+ TCGReg base;
/*
* R6 removes the left/right instructions but requires the
* system to support misaligned memory accesses.
*/
#if defined(CONFIG_SOFTMMU)
- tcg_out_tlb_load(s, base, addr_regl, addr_regh, oi, label_ptr, 1);
+ tcg_insn_unit *label_ptr[2];
+
+ base = TCG_REG_A0;
+ tcg_out_tlb_load(s, base, addrlo, addrhi, oi, label_ptr, 1);
if (use_mips32r6_instructions || a_bits >= s_bits) {
- tcg_out_qemu_ld_direct(s, data_regl, data_regh, base, opc, is_64);
+ tcg_out_qemu_ld_direct(s, datalo, datahi, base, opc, data_type);
} else {
- tcg_out_qemu_ld_unalign(s, data_regl, data_regh, base, opc, is_64);
+ tcg_out_qemu_ld_unalign(s, datalo, datahi, base, opc, data_type);
}
- add_qemu_ldst_label(s, 1, oi,
- (is_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
- data_regl, data_regh, addr_regl, addr_regh,
- s->code_ptr, label_ptr);
+ add_qemu_ldst_label(s, true, oi, data_type, datalo, datahi,
+ addrlo, addrhi, s->code_ptr, label_ptr);
#else
+ base = addrlo;
if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
- tcg_out_ext32u(s, base, addr_regl);
- addr_regl = base;
+ tcg_out_ext32u(s, TCG_REG_A0, base);
+ base = TCG_REG_A0;
}
- if (guest_base == 0 && data_regl != addr_regl) {
- base = addr_regl;
- } else if (guest_base == (int16_t)guest_base) {
- tcg_out_opc_imm(s, ALIAS_PADDI, base, addr_regl, guest_base);
- } else {
- tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_GUEST_BASE_REG, addr_regl);
+ if (guest_base) {
+ if (guest_base == (int16_t)guest_base) {
+ tcg_out_opc_imm(s, ALIAS_PADDI, TCG_REG_A0, base, guest_base);
+ } else {
+ tcg_out_opc_reg(s, ALIAS_PADD, TCG_REG_A0, base,
+ TCG_GUEST_BASE_REG);
+ }
+ base = TCG_REG_A0;
}
if (use_mips32r6_instructions) {
if (a_bits) {
- tcg_out_test_alignment(s, true, addr_regl, addr_regh, a_bits);
+ tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
}
- tcg_out_qemu_ld_direct(s, data_regl, data_regh, base, opc, is_64);
+ tcg_out_qemu_ld_direct(s, datalo, datahi, base, opc, data_type);
} else {
if (a_bits && a_bits != s_bits) {
- tcg_out_test_alignment(s, true, addr_regl, addr_regh, a_bits);
+ tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
}
if (a_bits >= s_bits) {
- tcg_out_qemu_ld_direct(s, data_regl, data_regh, base, opc, is_64);
+ tcg_out_qemu_ld_direct(s, datalo, datahi, base, opc, data_type);
} else {
- tcg_out_qemu_ld_unalign(s, data_regl, data_regh, base, opc, is_64);
+ tcg_out_qemu_ld_unalign(s, datalo, datahi, base, opc, data_type);
}
}
#endif
@@ -1902,67 +1893,60 @@ static void tcg_out_qemu_st_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
g_assert_not_reached();
}
}
-static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is_64)
-{
- TCGReg addr_regl, addr_regh __attribute__((unused));
- TCGReg data_regl, data_regh;
- MemOpIdx oi;
- MemOp opc;
-#if defined(CONFIG_SOFTMMU)
- tcg_insn_unit *label_ptr[2];
-#endif
- unsigned a_bits, s_bits;
- TCGReg base = TCG_REG_A0;
- data_regl = *args++;
- data_regh = (TCG_TARGET_REG_BITS == 32 && is_64 ? *args++ : 0);
- addr_regl = *args++;
- addr_regh = (TCG_TARGET_REG_BITS < TARGET_LONG_BITS ? *args++ : 0);
- oi = *args++;
- opc = get_memop(oi);
- a_bits = get_alignment_bits(opc);
- s_bits = opc & MO_SIZE;
+static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
+ TCGReg addrlo, TCGReg addrhi,
+ MemOpIdx oi, TCGType data_type)
+{
+ MemOp opc = get_memop(oi);
+ unsigned a_bits = get_alignment_bits(opc);
+ unsigned s_bits = opc & MO_SIZE;
+ TCGReg base;
/*
* R6 removes the left/right instructions but requires the
* system to support misaligned memory accesses.
*/
#if defined(CONFIG_SOFTMMU)
- tcg_out_tlb_load(s, base, addr_regl, addr_regh, oi, label_ptr, 0);
+ tcg_insn_unit *label_ptr[2];
+
+ base = TCG_REG_A0;
+ tcg_out_tlb_load(s, base, addrlo, addrhi, oi, label_ptr, 0);
if (use_mips32r6_instructions || a_bits >= s_bits) {
- tcg_out_qemu_st_direct(s, data_regl, data_regh, base, opc);
+ tcg_out_qemu_st_direct(s, datalo, datahi, base, opc);
} else {
- tcg_out_qemu_st_unalign(s, data_regl, data_regh, base, opc);
+ tcg_out_qemu_st_unalign(s, datalo, datahi, base, opc);
}
- add_qemu_ldst_label(s, 0, oi,
- (is_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
- data_regl, data_regh, addr_regl, addr_regh,
- s->code_ptr, label_ptr);
+ add_qemu_ldst_label(s, false, oi, data_type, datalo, datahi,
+ addrlo, addrhi, s->code_ptr, label_ptr);
#else
+ base = addrlo;
if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
- tcg_out_ext32u(s, base, addr_regl);
- addr_regl = base;
+ tcg_out_ext32u(s, TCG_REG_A0, base);
+ base = TCG_REG_A0;
}
- if (guest_base == 0) {
- base = addr_regl;
- } else if (guest_base == (int16_t)guest_base) {
- tcg_out_opc_imm(s, ALIAS_PADDI, base, addr_regl, guest_base);
- } else {
- tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_GUEST_BASE_REG, addr_regl);
+ if (guest_base) {
+ if (guest_base == (int16_t)guest_base) {
+ tcg_out_opc_imm(s, ALIAS_PADDI, TCG_REG_A0, base, guest_base);
+ } else {
+ tcg_out_opc_reg(s, ALIAS_PADD, TCG_REG_A0, base,
+ TCG_GUEST_BASE_REG);
+ }
+ base = TCG_REG_A0;
}
if (use_mips32r6_instructions) {
if (a_bits) {
- tcg_out_test_alignment(s, true, addr_regl, addr_regh, a_bits);
+ tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
}
- tcg_out_qemu_st_direct(s, data_regl, data_regh, base, opc);
+ tcg_out_qemu_st_direct(s, datalo, datahi, base, opc);
} else {
if (a_bits && a_bits != s_bits) {
- tcg_out_test_alignment(s, true, addr_regl, addr_regh, a_bits);
+ tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
}
if (a_bits >= s_bits) {
- tcg_out_qemu_st_direct(s, data_regl, data_regh, base, opc);
+ tcg_out_qemu_st_direct(s, datalo, datahi, base, opc);
} else {
- tcg_out_qemu_st_unalign(s, data_regl, data_regh, base, opc);
+ tcg_out_qemu_st_unalign(s, datalo, datahi, base, opc);
}
}
#endif
@@ -2425,16 +2409,36 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
break;
case INDEX_op_qemu_ld_i32:
- tcg_out_qemu_ld(s, args, false);
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
+ tcg_out_qemu_ld(s, a0, 0, a1, 0, a2, TCG_TYPE_I32);
+ } else {
+ tcg_out_qemu_ld(s, a0, 0, a1, a2, args[3], TCG_TYPE_I32);
+ }
break;
case INDEX_op_qemu_ld_i64:
- tcg_out_qemu_ld(s, args, true);
+ if (TCG_TARGET_REG_BITS == 64) {
+ tcg_out_qemu_ld(s, a0, 0, a1, 0, a2, TCG_TYPE_I64);
+ } else if (TARGET_LONG_BITS == 32) {
+ tcg_out_qemu_ld(s, a0, a1, a2, 0, args[3], TCG_TYPE_I64);
+ } else {
+ tcg_out_qemu_ld(s, a0, a1, a2, args[3], args[4], TCG_TYPE_I64);
+ }
break;
case INDEX_op_qemu_st_i32:
- tcg_out_qemu_st(s, args, false);
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
+ tcg_out_qemu_st(s, a0, 0, a1, 0, a2, TCG_TYPE_I32);
+ } else {
+ tcg_out_qemu_st(s, a0, 0, a1, a2, args[3], TCG_TYPE_I32);
+ }
break;
case INDEX_op_qemu_st_i64:
- tcg_out_qemu_st(s, args, true);
+ if (TCG_TARGET_REG_BITS == 64) {
+ tcg_out_qemu_st(s, a0, 0, a1, 0, a2, TCG_TYPE_I64);
+ } else if (TARGET_LONG_BITS == 32) {
+ tcg_out_qemu_st(s, a0, a1, a2, 0, args[3], TCG_TYPE_I64);
+ } else {
+ tcg_out_qemu_st(s, a0, a1, a2, args[3], args[4], TCG_TYPE_I64);
+ }
break;
case INDEX_op_add2_i32:
--
2.34.1
* [PATCH v4 18/54] tcg/mips: Introduce prepare_host_addr
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (16 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 17/54] tcg/mips: Rationalize args to tcg_out_qemu_{ld,st} Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 19/54] tcg/ppc: Rationalize args to tcg_out_qemu_{ld,st} Richard Henderson
` (35 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
into one function that returns HostAddress and TCGLabelQemuLdst structures.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/mips/tcg-target.c.inc | 404 ++++++++++++++++----------------------
1 file changed, 172 insertions(+), 232 deletions(-)
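Unlike the loongarch64 version, the MIPS HostAddress also records the guaranteed alignment, which the callers use to choose between the aligned and unaligned code paths (condensed from the diff below):

    typedef struct {
        TCGReg base;    /* complete host address */
        MemOp align;    /* alignment guaranteed by the guest access */
    } HostAddress;

    /* R6 removed lwl/lwr but requires misaligned access support. */
    if (use_mips32r6_instructions || h.align >= (opc & MO_SIZE)) {
        tcg_out_qemu_ld_direct(s, datalo, datahi, h.base, opc, data_type);
    } else {
        tcg_out_qemu_ld_unalign(s, datalo, datahi, h.base, opc, data_type);
    }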
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index ef8350e9cd..94708e6ea7 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -1181,120 +1181,6 @@ static int tcg_out_call_iarg_reg2(TCGContext *s, int i, TCGReg al, TCGReg ah)
return i;
}
-/* We expect to use a 16-bit negative offset from ENV. */
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
-
-/*
- * Perform the tlb comparison operation.
- * The complete host address is placed in BASE.
- * Clobbers TMP0, TMP1, TMP2, TMP3.
- */
-static void tcg_out_tlb_load(TCGContext *s, TCGReg base, TCGReg addrl,
- TCGReg addrh, MemOpIdx oi,
- tcg_insn_unit *label_ptr[2], bool is_load)
-{
- MemOp opc = get_memop(oi);
- unsigned a_bits = get_alignment_bits(opc);
- unsigned s_bits = opc & MO_SIZE;
- unsigned a_mask = (1 << a_bits) - 1;
- unsigned s_mask = (1 << s_bits) - 1;
- int mem_index = get_mmuidx(oi);
- int fast_off = TLB_MASK_TABLE_OFS(mem_index);
- int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
- int table_off = fast_off + offsetof(CPUTLBDescFast, table);
- int add_off = offsetof(CPUTLBEntry, addend);
- int cmp_off = (is_load ? offsetof(CPUTLBEntry, addr_read)
- : offsetof(CPUTLBEntry, addr_write));
- target_ulong tlb_mask;
-
- /* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP0, TCG_AREG0, mask_off);
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP1, TCG_AREG0, table_off);
-
- /* Extract the TLB index from the address into TMP3. */
- tcg_out_opc_sa(s, ALIAS_TSRL, TCG_TMP3, addrl,
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
- tcg_out_opc_reg(s, OPC_AND, TCG_TMP3, TCG_TMP3, TCG_TMP0);
-
- /* Add the tlb_table pointer, creating the CPUTLBEntry address in TMP3. */
- tcg_out_opc_reg(s, ALIAS_PADD, TCG_TMP3, TCG_TMP3, TCG_TMP1);
-
- /* Load the (low-half) tlb comparator. */
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
- tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + LO_OFF);
- } else {
- tcg_out_ldst(s, (TARGET_LONG_BITS == 64 ? OPC_LD
- : TCG_TARGET_REG_BITS == 64 ? OPC_LWU : OPC_LW),
- TCG_TMP0, TCG_TMP3, cmp_off);
- }
-
- /* Zero extend a 32-bit guest address for a 64-bit host. */
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
- tcg_out_ext32u(s, base, addrl);
- addrl = base;
- }
-
- /*
- * Mask the page bits, keeping the alignment bits to compare against.
- * For unaligned accesses, compare against the end of the access to
- * verify that it does not cross a page boundary.
- */
- tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
- tcg_out_movi(s, TCG_TYPE_I32, TCG_TMP1, tlb_mask);
- if (a_mask >= s_mask) {
- tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrl);
- } else {
- tcg_out_opc_imm(s, ALIAS_PADDI, TCG_TMP2, addrl, s_mask - a_mask);
- tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, TCG_TMP2);
- }
-
- if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
- /* Load the tlb addend for the fast path. */
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
- }
-
- label_ptr[0] = s->code_ptr;
- tcg_out_opc_br(s, OPC_BNE, TCG_TMP1, TCG_TMP0);
-
- /* Load and test the high half tlb comparator. */
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
- /* delay slot */
- tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + HI_OFF);
-
- /* Load the tlb addend for the fast path. */
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
-
- label_ptr[1] = s->code_ptr;
- tcg_out_opc_br(s, OPC_BNE, addrh, TCG_TMP0);
- }
-
- /* delay slot */
- tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_TMP2, addrl);
-}
-
-static void add_qemu_ldst_label(TCGContext *s, int is_ld, MemOpIdx oi,
- TCGType ext,
- TCGReg datalo, TCGReg datahi,
- TCGReg addrlo, TCGReg addrhi,
- void *raddr, tcg_insn_unit *label_ptr[2])
-{
- TCGLabelQemuLdst *label = new_ldst_label(s);
-
- label->is_ld = is_ld;
- label->oi = oi;
- label->type = ext;
- label->datalo_reg = datalo;
- label->datahi_reg = datahi;
- label->addrlo_reg = addrlo;
- label->addrhi_reg = addrhi;
- label->raddr = tcg_splitwx_to_rx(raddr);
- label->label_ptr[0] = label_ptr[0];
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
- label->label_ptr[1] = label_ptr[1];
- }
-}
-
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
const tcg_insn_unit *tgt_rx = tcg_splitwx_to_rx(s->code_ptr);
@@ -1403,32 +1289,6 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
}
#else
-
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
- TCGReg addrhi, unsigned a_bits)
-{
- unsigned a_mask = (1 << a_bits) - 1;
- TCGLabelQemuLdst *l = new_ldst_label(s);
-
- l->is_ld = is_ld;
- l->addrlo_reg = addrlo;
- l->addrhi_reg = addrhi;
-
- /* We are expecting a_bits to max out at 7, much lower than ANDI. */
- tcg_debug_assert(a_bits < 16);
- tcg_out_opc_imm(s, OPC_ANDI, TCG_TMP0, addrlo, a_mask);
-
- l->label_ptr[0] = s->code_ptr;
- if (use_mips32r6_instructions) {
- tcg_out_opc_br(s, OPC_BNEZALC_R6, TCG_REG_ZERO, TCG_TMP0);
- } else {
- tcg_out_opc_br(s, OPC_BNEL, TCG_TMP0, TCG_REG_ZERO);
- tcg_out_nop(s);
- }
-
- l->raddr = tcg_splitwx_to_rx(s->code_ptr);
-}
-
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
{
void *target;
@@ -1478,6 +1338,154 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
}
#endif /* SOFTMMU */
+typedef struct {
+ TCGReg base;
+ MemOp align;
+} HostAddress;
+
+/*
+ * For softmmu, perform the TLB load and compare.
+ * For useronly, perform any required alignment tests.
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
+ * is required and fill in @h with the host address for the fast path.
+ */
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
+ TCGReg addrlo, TCGReg addrhi,
+ MemOpIdx oi, bool is_ld)
+{
+ TCGLabelQemuLdst *ldst = NULL;
+ MemOp opc = get_memop(oi);
+ unsigned a_bits = get_alignment_bits(opc);
+ unsigned s_bits = opc & MO_SIZE;
+ unsigned a_mask = (1 << a_bits) - 1;
+ TCGReg base;
+
+#ifdef CONFIG_SOFTMMU
+ unsigned s_mask = (1 << s_bits) - 1;
+ int mem_index = get_mmuidx(oi);
+ int fast_off = TLB_MASK_TABLE_OFS(mem_index);
+ int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
+ int table_off = fast_off + offsetof(CPUTLBDescFast, table);
+ int add_off = offsetof(CPUTLBEntry, addend);
+ int cmp_off = is_ld ? offsetof(CPUTLBEntry, addr_read)
+ : offsetof(CPUTLBEntry, addr_write);
+ target_ulong tlb_mask;
+
+ ldst = new_ldst_label(s);
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addrlo;
+ ldst->addrhi_reg = addrhi;
+ base = TCG_REG_A0;
+
+ /* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP0, TCG_AREG0, mask_off);
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP1, TCG_AREG0, table_off);
+
+ /* Extract the TLB index from the address into TMP3. */
+ tcg_out_opc_sa(s, ALIAS_TSRL, TCG_TMP3, addrlo,
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+ tcg_out_opc_reg(s, OPC_AND, TCG_TMP3, TCG_TMP3, TCG_TMP0);
+
+ /* Add the tlb_table pointer, creating the CPUTLBEntry address in TMP3. */
+ tcg_out_opc_reg(s, ALIAS_PADD, TCG_TMP3, TCG_TMP3, TCG_TMP1);
+
+ /* Load the (low-half) tlb comparator. */
+ if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
+ tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + LO_OFF);
+ } else {
+ tcg_out_ldst(s, (TARGET_LONG_BITS == 64 ? OPC_LD
+ : TCG_TARGET_REG_BITS == 64 ? OPC_LWU : OPC_LW),
+ TCG_TMP0, TCG_TMP3, cmp_off);
+ }
+
+ /* Zero extend a 32-bit guest address for a 64-bit host. */
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
+ tcg_out_ext32u(s, base, addrlo);
+ addrlo = base;
+ }
+
+ /*
+ * Mask the page bits, keeping the alignment bits to compare against.
+ * For unaligned accesses, compare against the end of the access to
+ * verify that it does not cross a page boundary.
+ */
+ tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
+ tcg_out_movi(s, TCG_TYPE_I32, TCG_TMP1, tlb_mask);
+ if (a_mask >= s_mask) {
+ tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrlo);
+ } else {
+ tcg_out_opc_imm(s, ALIAS_PADDI, TCG_TMP2, addrlo, s_mask - a_mask);
+ tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, TCG_TMP2);
+ }
+
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
+ /* Load the tlb addend for the fast path. */
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
+ }
+
+ ldst->label_ptr[0] = s->code_ptr;
+ tcg_out_opc_br(s, OPC_BNE, TCG_TMP1, TCG_TMP0);
+
+ /* Load and test the high half tlb comparator. */
+ if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
+ /* delay slot */
+ tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + HI_OFF);
+
+ /* Load the tlb addend for the fast path. */
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
+
+ ldst->label_ptr[1] = s->code_ptr;
+ tcg_out_opc_br(s, OPC_BNE, addrhi, TCG_TMP0);
+ }
+
+ /* delay slot */
+ tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_TMP2, addrlo);
+#else
+ if (a_mask && (use_mips32r6_instructions || a_bits != s_bits)) {
+ ldst = new_ldst_label(s);
+
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addrlo;
+ ldst->addrhi_reg = addrhi;
+
+ /* We are expecting a_bits to max out at 7, much lower than ANDI. */
+ tcg_debug_assert(a_bits < 16);
+ tcg_out_opc_imm(s, OPC_ANDI, TCG_TMP0, addrlo, a_mask);
+
+ ldst->label_ptr[0] = s->code_ptr;
+ if (use_mips32r6_instructions) {
+ tcg_out_opc_br(s, OPC_BNEZALC_R6, TCG_REG_ZERO, TCG_TMP0);
+ } else {
+ tcg_out_opc_br(s, OPC_BNEL, TCG_TMP0, TCG_REG_ZERO);
+ tcg_out_nop(s);
+ }
+ }
+
+ base = addrlo;
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
+ tcg_out_ext32u(s, TCG_REG_A0, base);
+ base = TCG_REG_A0;
+ }
+ if (guest_base) {
+ if (guest_base == (int16_t)guest_base) {
+ tcg_out_opc_imm(s, ALIAS_PADDI, TCG_REG_A0, base, guest_base);
+ } else {
+ tcg_out_opc_reg(s, ALIAS_PADD, TCG_REG_A0, base,
+ TCG_GUEST_BASE_REG);
+ }
+ base = TCG_REG_A0;
+ }
+#endif
+
+ h->base = base;
+ h->align = a_bits;
+ return ldst;
+}
+
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
TCGReg base, MemOp opc, TCGType type)
{
@@ -1707,57 +1715,23 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
- unsigned a_bits = get_alignment_bits(opc);
- unsigned s_bits = opc & MO_SIZE;
- TCGReg base;
+ TCGLabelQemuLdst *ldst;
+ HostAddress h;
- /*
- * R6 removes the left/right instructions but requires the
- * system to support misaligned memory accesses.
- */
-#if defined(CONFIG_SOFTMMU)
- tcg_insn_unit *label_ptr[2];
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
- base = TCG_REG_A0;
- tcg_out_tlb_load(s, base, addrlo, addrhi, oi, label_ptr, 1);
- if (use_mips32r6_instructions || a_bits >= s_bits) {
- tcg_out_qemu_ld_direct(s, datalo, datahi, base, opc, data_type);
+ if (use_mips32r6_instructions || h.align >= (opc & MO_SIZE)) {
+ tcg_out_qemu_ld_direct(s, datalo, datahi, h.base, opc, data_type);
} else {
- tcg_out_qemu_ld_unalign(s, datalo, datahi, base, opc, data_type);
+ tcg_out_qemu_ld_unalign(s, datalo, datahi, h.base, opc, data_type);
}
- add_qemu_ldst_label(s, true, oi, data_type, datalo, datahi,
- addrlo, addrhi, s->code_ptr, label_ptr);
-#else
- base = addrlo;
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
- tcg_out_ext32u(s, TCG_REG_A0, base);
- base = TCG_REG_A0;
+
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = datalo;
+ ldst->datahi_reg = datahi;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
}
- if (guest_base) {
- if (guest_base == (int16_t)guest_base) {
- tcg_out_opc_imm(s, ALIAS_PADDI, TCG_REG_A0, base, guest_base);
- } else {
- tcg_out_opc_reg(s, ALIAS_PADD, TCG_REG_A0, base,
- TCG_GUEST_BASE_REG);
- }
- base = TCG_REG_A0;
- }
- if (use_mips32r6_instructions) {
- if (a_bits) {
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
- }
- tcg_out_qemu_ld_direct(s, datalo, datahi, base, opc, data_type);
- } else {
- if (a_bits && a_bits != s_bits) {
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
- }
- if (a_bits >= s_bits) {
- tcg_out_qemu_ld_direct(s, datalo, datahi, base, opc, data_type);
- } else {
- tcg_out_qemu_ld_unalign(s, datalo, datahi, base, opc, data_type);
- }
- }
-#endif
}
static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg lo, TCGReg hi,
@@ -1899,57 +1873,23 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
- unsigned a_bits = get_alignment_bits(opc);
- unsigned s_bits = opc & MO_SIZE;
- TCGReg base;
+ TCGLabelQemuLdst *ldst;
+ HostAddress h;
- /*
- * R6 removes the left/right instructions but requires the
- * system to support misaligned memory accesses.
- */
-#if defined(CONFIG_SOFTMMU)
- tcg_insn_unit *label_ptr[2];
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, false);
- base = TCG_REG_A0;
- tcg_out_tlb_load(s, base, addrlo, addrhi, oi, label_ptr, 0);
- if (use_mips32r6_instructions || a_bits >= s_bits) {
- tcg_out_qemu_st_direct(s, datalo, datahi, base, opc);
+ if (use_mips32r6_instructions || h.align >= (opc & MO_SIZE)) {
+ tcg_out_qemu_st_direct(s, datalo, datahi, h.base, opc);
} else {
- tcg_out_qemu_st_unalign(s, datalo, datahi, base, opc);
+ tcg_out_qemu_st_unalign(s, datalo, datahi, h.base, opc);
}
- add_qemu_ldst_label(s, false, oi, data_type, datalo, datahi,
- addrlo, addrhi, s->code_ptr, label_ptr);
-#else
- base = addrlo;
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
- tcg_out_ext32u(s, TCG_REG_A0, base);
- base = TCG_REG_A0;
+
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = datalo;
+ ldst->datahi_reg = datahi;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
}
- if (guest_base) {
- if (guest_base == (int16_t)guest_base) {
- tcg_out_opc_imm(s, ALIAS_PADDI, TCG_REG_A0, base, guest_base);
- } else {
- tcg_out_opc_reg(s, ALIAS_PADD, TCG_REG_A0, base,
- TCG_GUEST_BASE_REG);
- }
- base = TCG_REG_A0;
- }
- if (use_mips32r6_instructions) {
- if (a_bits) {
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
- }
- tcg_out_qemu_st_direct(s, datalo, datahi, base, opc);
- } else {
- if (a_bits && a_bits != s_bits) {
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
- }
- if (a_bits >= s_bits) {
- tcg_out_qemu_st_direct(s, datalo, datahi, base, opc);
- } else {
- tcg_out_qemu_st_unalign(s, datalo, datahi, base, opc);
- }
- }
-#endif
}
static void tcg_out_mb(TCGContext *s, TCGArg a0)
--
2.34.1
* [PATCH v4 19/54] tcg/ppc: Rationalize args to tcg_out_qemu_{ld,st}
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (17 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 18/54] tcg/mips: Introduce prepare_host_addr Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 20/54] tcg/ppc: Introduce HostAddress Richard Henderson
` (34 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel
Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x,
Daniel Henrique Barboza
Interpret the variable argument placement in the caller. Pass data_type
instead of is64 -- there are several places where we already convert back
from bool to type. Clean things up by using type throughout.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/ppc/tcg-target.c.inc | 110 +++++++++++++++++++++------------------
1 file changed, 59 insertions(+), 51 deletions(-)
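The signature change, for reference (taken from the diff below):

    /* before: operands fished out of a TCGArg array */
    static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is_64);

    /* after: operands and the value type passed explicitly */
    static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
                                TCGReg addrlo, TCGReg addrhi,
                                MemOpIdx oi, TCGType data_type);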
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 77abb7d20c..d1aa2a9f53 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -2118,7 +2118,8 @@ static TCGReg tcg_out_tlb_read(TCGContext *s, MemOp opc,
/* Record the context of a call to the out of line helper code for the slow
path for a load or store, so that we can later generate the correct
helper code. */
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld, MemOpIdx oi,
+static void add_qemu_ldst_label(TCGContext *s, bool is_ld,
+ TCGType type, MemOpIdx oi,
TCGReg datalo_reg, TCGReg datahi_reg,
TCGReg addrlo_reg, TCGReg addrhi_reg,
tcg_insn_unit *raddr, tcg_insn_unit *lptr)
@@ -2126,6 +2127,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, MemOpIdx oi,
TCGLabelQemuLdst *label = new_ldst_label(s);
label->is_ld = is_ld;
+ label->type = type;
label->oi = oi;
label->datalo_reg = datalo_reg;
label->datahi_reg = datahi_reg;
@@ -2288,30 +2290,18 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
#endif /* SOFTMMU */
-static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is_64)
+static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
+ TCGReg addrlo, TCGReg addrhi,
+ MemOpIdx oi, TCGType data_type)
{
- TCGReg datalo, datahi, addrlo, rbase;
- TCGReg addrhi __attribute__((unused));
- MemOpIdx oi;
- MemOp opc, s_bits;
+ MemOp opc = get_memop(oi);
+ MemOp s_bits = opc & MO_SIZE;
+ TCGReg rbase;
+
#ifdef CONFIG_SOFTMMU
- int mem_index;
tcg_insn_unit *label_ptr;
-#else
- unsigned a_bits;
-#endif
- datalo = *args++;
- datahi = (TCG_TARGET_REG_BITS == 32 && is_64 ? *args++ : 0);
- addrlo = *args++;
- addrhi = (TCG_TARGET_REG_BITS < TARGET_LONG_BITS ? *args++ : 0);
- oi = *args++;
- opc = get_memop(oi);
- s_bits = opc & MO_SIZE;
-
-#ifdef CONFIG_SOFTMMU
- mem_index = get_mmuidx(oi);
- addrlo = tcg_out_tlb_read(s, opc, addrlo, addrhi, mem_index, true);
+ addrlo = tcg_out_tlb_read(s, opc, addrlo, addrhi, get_mmuidx(oi), true);
/* Load a pointer into the current opcode w/conditional branch-link. */
label_ptr = s->code_ptr;
@@ -2319,7 +2309,7 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is_64)
rbase = TCG_REG_R3;
#else /* !CONFIG_SOFTMMU */
- a_bits = get_alignment_bits(opc);
+ unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
}
@@ -2364,35 +2354,23 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is_64)
}
#ifdef CONFIG_SOFTMMU
- add_qemu_ldst_label(s, true, oi, datalo, datahi, addrlo, addrhi,
- s->code_ptr, label_ptr);
+ add_qemu_ldst_label(s, true, data_type, oi, datalo, datahi,
+ addrlo, addrhi, s->code_ptr, label_ptr);
#endif
}
-static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is_64)
+static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
+ TCGReg addrlo, TCGReg addrhi,
+ MemOpIdx oi, TCGType data_type)
{
- TCGReg datalo, datahi, addrlo, rbase;
- TCGReg addrhi __attribute__((unused));
- MemOpIdx oi;
- MemOp opc, s_bits;
+ MemOp opc = get_memop(oi);
+ MemOp s_bits = opc & MO_SIZE;
+ TCGReg rbase;
+
#ifdef CONFIG_SOFTMMU
- int mem_index;
tcg_insn_unit *label_ptr;
-#else
- unsigned a_bits;
-#endif
- datalo = *args++;
- datahi = (TCG_TARGET_REG_BITS == 32 && is_64 ? *args++ : 0);
- addrlo = *args++;
- addrhi = (TCG_TARGET_REG_BITS < TARGET_LONG_BITS ? *args++ : 0);
- oi = *args++;
- opc = get_memop(oi);
- s_bits = opc & MO_SIZE;
-
-#ifdef CONFIG_SOFTMMU
- mem_index = get_mmuidx(oi);
- addrlo = tcg_out_tlb_read(s, opc, addrlo, addrhi, mem_index, false);
+ addrlo = tcg_out_tlb_read(s, opc, addrlo, addrhi, get_mmuidx(oi), false);
/* Load a pointer into the current opcode w/conditional branch-link. */
label_ptr = s->code_ptr;
@@ -2400,7 +2378,7 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is_64)
rbase = TCG_REG_R3;
#else /* !CONFIG_SOFTMMU */
- a_bits = get_alignment_bits(opc);
+ unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
}
@@ -2437,8 +2415,8 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is_64)
}
#ifdef CONFIG_SOFTMMU
- add_qemu_ldst_label(s, false, oi, datalo, datahi, addrlo, addrhi,
- s->code_ptr, label_ptr);
+ add_qemu_ldst_label(s, false, data_type, oi, datalo, datahi,
+ addrlo, addrhi, s->code_ptr, label_ptr);
#endif
}
@@ -2972,16 +2950,46 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
break;
case INDEX_op_qemu_ld_i32:
- tcg_out_qemu_ld(s, args, false);
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
+ tcg_out_qemu_ld(s, args[0], -1, args[1], -1,
+ args[2], TCG_TYPE_I32);
+ } else {
+ tcg_out_qemu_ld(s, args[0], -1, args[1], args[2],
+ args[3], TCG_TYPE_I32);
+ }
break;
case INDEX_op_qemu_ld_i64:
- tcg_out_qemu_ld(s, args, true);
+ if (TCG_TARGET_REG_BITS == 64) {
+ tcg_out_qemu_ld(s, args[0], -1, args[1], -1,
+ args[2], TCG_TYPE_I64);
+ } else if (TARGET_LONG_BITS == 32) {
+ tcg_out_qemu_ld(s, args[0], args[1], args[2], -1,
+ args[3], TCG_TYPE_I64);
+ } else {
+ tcg_out_qemu_ld(s, args[0], args[1], args[2], args[3],
+ args[4], TCG_TYPE_I64);
+ }
break;
case INDEX_op_qemu_st_i32:
- tcg_out_qemu_st(s, args, false);
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
+ tcg_out_qemu_st(s, args[0], -1, args[1], -1,
+ args[2], TCG_TYPE_I32);
+ } else {
+ tcg_out_qemu_st(s, args[0], -1, args[1], args[2],
+ args[3], TCG_TYPE_I32);
+ }
break;
case INDEX_op_qemu_st_i64:
- tcg_out_qemu_st(s, args, true);
+ if (TCG_TARGET_REG_BITS == 64) {
+ tcg_out_qemu_st(s, args[0], -1, args[1], -1,
+ args[2], TCG_TYPE_I64);
+ } else if (TARGET_LONG_BITS == 32) {
+ tcg_out_qemu_st(s, args[0], args[1], args[2], -1,
+ args[3], TCG_TYPE_I64);
+ } else {
+ tcg_out_qemu_st(s, args[0], args[1], args[2], args[3],
+ args[4], TCG_TYPE_I64);
+ }
break;
case INDEX_op_setcond_i32:
--
2.34.1
* [PATCH v4 20/54] tcg/ppc: Introduce HostAddress
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (18 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 19/54] tcg/ppc: Rationalize args to tcg_out_qemu_{ld,st} Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 21/54] tcg/ppc: Introduce prepare_host_addr Richard Henderson
` (33 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Collect the parts of the host address into a struct.
Reorg tcg_out_qemu_{ld,st} to use it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/ppc/tcg-target.c.inc | 90 +++++++++++++++++++++-------------------
1 file changed, 47 insertions(+), 43 deletions(-)
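One PPC-specific wrinkle visible in the diff below: h.base may legitimately be 0 when guest_base is zero, since (if I read the ISA correctly) an RA field of 0 in the indexed load/store forms means the constant zero rather than the contents of r0 -- hence the explicit h.base != 0 tests. A minimal sketch:

    h.base  = guest_base ? TCG_GUEST_BASE_REG : 0;   /* 0 == no base */
    h.index = addrlo;
    ...
    /* EA = (base|0) + index */
    tcg_out32(s, LWZX | TAB(datalo, h.base, h.index));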
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index d1aa2a9f53..cd473deb36 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -2287,67 +2287,71 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
return tcg_out_fail_alignment(s, l);
}
-
#endif /* SOFTMMU */
+typedef struct {
+ TCGReg base;
+ TCGReg index;
+} HostAddress;
+
static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
TCGReg addrlo, TCGReg addrhi,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
MemOp s_bits = opc & MO_SIZE;
- TCGReg rbase;
+ HostAddress h;
#ifdef CONFIG_SOFTMMU
tcg_insn_unit *label_ptr;
- addrlo = tcg_out_tlb_read(s, opc, addrlo, addrhi, get_mmuidx(oi), true);
+ h.index = tcg_out_tlb_read(s, opc, addrlo, addrhi, get_mmuidx(oi), true);
+ h.base = TCG_REG_R3;
/* Load a pointer into the current opcode w/conditional branch-link. */
label_ptr = s->code_ptr;
tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
-
- rbase = TCG_REG_R3;
#else /* !CONFIG_SOFTMMU */
unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
}
- rbase = guest_base ? TCG_GUEST_BASE_REG : 0;
+ h.base = guest_base ? TCG_GUEST_BASE_REG : 0;
+ h.index = addrlo;
if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
- addrlo = TCG_REG_TMP1;
+ h.index = TCG_REG_TMP1;
}
#endif
if (TCG_TARGET_REG_BITS == 32 && s_bits == MO_64) {
if (opc & MO_BSWAP) {
- tcg_out32(s, ADDI | TAI(TCG_REG_R0, addrlo, 4));
- tcg_out32(s, LWBRX | TAB(datalo, rbase, addrlo));
- tcg_out32(s, LWBRX | TAB(datahi, rbase, TCG_REG_R0));
- } else if (rbase != 0) {
- tcg_out32(s, ADDI | TAI(TCG_REG_R0, addrlo, 4));
- tcg_out32(s, LWZX | TAB(datahi, rbase, addrlo));
- tcg_out32(s, LWZX | TAB(datalo, rbase, TCG_REG_R0));
- } else if (addrlo == datahi) {
- tcg_out32(s, LWZ | TAI(datalo, addrlo, 4));
- tcg_out32(s, LWZ | TAI(datahi, addrlo, 0));
+ tcg_out32(s, ADDI | TAI(TCG_REG_R0, h.index, 4));
+ tcg_out32(s, LWBRX | TAB(datalo, h.base, h.index));
+ tcg_out32(s, LWBRX | TAB(datahi, h.base, TCG_REG_R0));
+ } else if (h.base != 0) {
+ tcg_out32(s, ADDI | TAI(TCG_REG_R0, h.index, 4));
+ tcg_out32(s, LWZX | TAB(datahi, h.base, h.index));
+ tcg_out32(s, LWZX | TAB(datalo, h.base, TCG_REG_R0));
+ } else if (h.index == datahi) {
+ tcg_out32(s, LWZ | TAI(datalo, h.index, 4));
+ tcg_out32(s, LWZ | TAI(datahi, h.index, 0));
} else {
- tcg_out32(s, LWZ | TAI(datahi, addrlo, 0));
- tcg_out32(s, LWZ | TAI(datalo, addrlo, 4));
+ tcg_out32(s, LWZ | TAI(datahi, h.index, 0));
+ tcg_out32(s, LWZ | TAI(datalo, h.index, 4));
}
} else {
uint32_t insn = qemu_ldx_opc[opc & (MO_BSWAP | MO_SSIZE)];
if (!have_isa_2_06 && insn == LDBRX) {
- tcg_out32(s, ADDI | TAI(TCG_REG_R0, addrlo, 4));
- tcg_out32(s, LWBRX | TAB(datalo, rbase, addrlo));
- tcg_out32(s, LWBRX | TAB(TCG_REG_R0, rbase, TCG_REG_R0));
+ tcg_out32(s, ADDI | TAI(TCG_REG_R0, h.index, 4));
+ tcg_out32(s, LWBRX | TAB(datalo, h.base, h.index));
+ tcg_out32(s, LWBRX | TAB(TCG_REG_R0, h.base, TCG_REG_R0));
tcg_out_rld(s, RLDIMI, datalo, TCG_REG_R0, 32, 0);
} else if (insn) {
- tcg_out32(s, insn | TAB(datalo, rbase, addrlo));
+ tcg_out32(s, insn | TAB(datalo, h.base, h.index));
} else {
insn = qemu_ldx_opc[opc & (MO_SIZE | MO_BSWAP)];
- tcg_out32(s, insn | TAB(datalo, rbase, addrlo));
+ tcg_out32(s, insn | TAB(datalo, h.base, h.index));
tcg_out_movext(s, TCG_TYPE_REG, datalo,
TCG_TYPE_REG, opc & MO_SSIZE, datalo);
}
@@ -2365,52 +2369,52 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
{
MemOp opc = get_memop(oi);
MemOp s_bits = opc & MO_SIZE;
- TCGReg rbase;
+ HostAddress h;
#ifdef CONFIG_SOFTMMU
tcg_insn_unit *label_ptr;
- addrlo = tcg_out_tlb_read(s, opc, addrlo, addrhi, get_mmuidx(oi), false);
+ h.index = tcg_out_tlb_read(s, opc, addrlo, addrhi, get_mmuidx(oi), false);
+ h.base = TCG_REG_R3;
/* Load a pointer into the current opcode w/conditional branch-link. */
label_ptr = s->code_ptr;
tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
-
- rbase = TCG_REG_R3;
#else /* !CONFIG_SOFTMMU */
unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
}
- rbase = guest_base ? TCG_GUEST_BASE_REG : 0;
+ h.base = guest_base ? TCG_GUEST_BASE_REG : 0;
+ h.index = addrlo;
if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
- addrlo = TCG_REG_TMP1;
+ h.index = TCG_REG_TMP1;
}
#endif
if (TCG_TARGET_REG_BITS == 32 && s_bits == MO_64) {
if (opc & MO_BSWAP) {
- tcg_out32(s, ADDI | TAI(TCG_REG_R0, addrlo, 4));
- tcg_out32(s, STWBRX | SAB(datalo, rbase, addrlo));
- tcg_out32(s, STWBRX | SAB(datahi, rbase, TCG_REG_R0));
- } else if (rbase != 0) {
- tcg_out32(s, ADDI | TAI(TCG_REG_R0, addrlo, 4));
- tcg_out32(s, STWX | SAB(datahi, rbase, addrlo));
- tcg_out32(s, STWX | SAB(datalo, rbase, TCG_REG_R0));
+ tcg_out32(s, ADDI | TAI(TCG_REG_R0, h.index, 4));
+ tcg_out32(s, STWBRX | SAB(datalo, h.base, h.index));
+ tcg_out32(s, STWBRX | SAB(datahi, h.base, TCG_REG_R0));
+ } else if (h.base != 0) {
+ tcg_out32(s, ADDI | TAI(TCG_REG_R0, h.index, 4));
+ tcg_out32(s, STWX | SAB(datahi, h.base, h.index));
+ tcg_out32(s, STWX | SAB(datalo, h.base, TCG_REG_R0));
} else {
- tcg_out32(s, STW | TAI(datahi, addrlo, 0));
- tcg_out32(s, STW | TAI(datalo, addrlo, 4));
+ tcg_out32(s, STW | TAI(datahi, h.index, 0));
+ tcg_out32(s, STW | TAI(datalo, h.index, 4));
}
} else {
uint32_t insn = qemu_stx_opc[opc & (MO_BSWAP | MO_SIZE)];
if (!have_isa_2_06 && insn == STDBRX) {
- tcg_out32(s, STWBRX | SAB(datalo, rbase, addrlo));
- tcg_out32(s, ADDI | TAI(TCG_REG_TMP1, addrlo, 4));
+ tcg_out32(s, STWBRX | SAB(datalo, h.base, h.index));
+ tcg_out32(s, ADDI | TAI(TCG_REG_TMP1, h.index, 4));
tcg_out_shri64(s, TCG_REG_R0, datalo, 32);
- tcg_out32(s, STWBRX | SAB(TCG_REG_R0, rbase, TCG_REG_TMP1));
+ tcg_out32(s, STWBRX | SAB(TCG_REG_R0, h.base, TCG_REG_TMP1));
} else {
- tcg_out32(s, insn | SAB(datalo, rbase, addrlo));
+ tcg_out32(s, insn | SAB(datalo, h.base, h.index));
}
}
--
2.34.1
* [PATCH v4 21/54] tcg/ppc: Introduce prepare_host_addr
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (19 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 20/54] tcg/ppc: Introduce HostAddress Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 22/54] tcg/riscv: Require TCG_TARGET_REG_BITS == 64 Richard Henderson
` (32 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
into one function that returns HostAddress and TCGLabelQemuLdst structures.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/ppc/tcg-target.c.inc | 377 +++++++++++++++++----------------------
1 file changed, 168 insertions(+), 209 deletions(-)
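The merged function keeps the page-crossing trick from the old tcg_out_tlb_read (see the removed hunk below): for an under-aligned access, add s_mask - a_mask to the address before masking, so an access that straddles a page overflows into the next one and fails the TLB compare. A worked example, assuming 4 KiB pages (TARGET_PAGE_MASK == ~0xfff):

    /* 8-byte load: s_bits = 3, s_mask = 7; no alignment: a_mask = 0. */
    target_ulong addr = 0xffd;
    target_ulong sum  = addr + (7 - 0);          /* 0x1004 */
    target_ulong tag  = sum & TARGET_PAGE_MASK;  /* 0x1000, the next page */
    /* tag != 0x0000 (the TLB entry for page 0), so the compare fails
       and the straddling access is routed to the slow path. */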
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index cd473deb36..7239335bdf 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -2003,140 +2003,6 @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
[MO_BEUQ] = helper_be_stq_mmu,
};
-/* We expect to use a 16-bit negative offset from ENV. */
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
-
-/* Perform the TLB load and compare. Places the result of the comparison
- in CR7, loads the addend of the TLB into R3, and returns the register
- containing the guest address (zero-extended into R4). Clobbers R0 and R2. */
-
-static TCGReg tcg_out_tlb_read(TCGContext *s, MemOp opc,
- TCGReg addrlo, TCGReg addrhi,
- int mem_index, bool is_read)
-{
- int cmp_off
- = (is_read
- ? offsetof(CPUTLBEntry, addr_read)
- : offsetof(CPUTLBEntry, addr_write));
- int fast_off = TLB_MASK_TABLE_OFS(mem_index);
- int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
- int table_off = fast_off + offsetof(CPUTLBDescFast, table);
- unsigned s_bits = opc & MO_SIZE;
- unsigned a_bits = get_alignment_bits(opc);
-
- /* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R3, TCG_AREG0, mask_off);
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R4, TCG_AREG0, table_off);
-
- /* Extract the page index, shifted into place for tlb index. */
- if (TCG_TARGET_REG_BITS == 32) {
- tcg_out_shri32(s, TCG_REG_TMP1, addrlo,
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
- } else {
- tcg_out_shri64(s, TCG_REG_TMP1, addrlo,
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
- }
- tcg_out32(s, AND | SAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_TMP1));
-
- /* Load the TLB comparator. */
- if (cmp_off == 0 && TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
- uint32_t lxu = (TCG_TARGET_REG_BITS == 32 || TARGET_LONG_BITS == 32
- ? LWZUX : LDUX);
- tcg_out32(s, lxu | TAB(TCG_REG_TMP1, TCG_REG_R3, TCG_REG_R4));
- } else {
- tcg_out32(s, ADD | TAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_R4));
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
- tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP1, TCG_REG_R3, cmp_off + 4);
- tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_R4, TCG_REG_R3, cmp_off);
- } else {
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP1, TCG_REG_R3, cmp_off);
- }
- }
-
- /* Load the TLB addend for use on the fast path. Do this asap
- to minimize any load use delay. */
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R3, TCG_REG_R3,
- offsetof(CPUTLBEntry, addend));
-
- /* Clear the non-page, non-alignment bits from the address */
- if (TCG_TARGET_REG_BITS == 32) {
- /* We don't support unaligned accesses on 32-bits.
- * Preserve the bottom bits and thus trigger a comparison
- * failure on unaligned accesses.
- */
- if (a_bits < s_bits) {
- a_bits = s_bits;
- }
- tcg_out_rlw(s, RLWINM, TCG_REG_R0, addrlo, 0,
- (32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
- } else {
- TCGReg t = addrlo;
-
- /* If the access is unaligned, we need to make sure we fail if we
- * cross a page boundary. The trick is to add the access size-1
- * to the address before masking the low bits. That will make the
- * address overflow to the next page if we cross a page boundary,
- * which will then force a mismatch of the TLB compare.
- */
- if (a_bits < s_bits) {
- unsigned a_mask = (1 << a_bits) - 1;
- unsigned s_mask = (1 << s_bits) - 1;
- tcg_out32(s, ADDI | TAI(TCG_REG_R0, t, s_mask - a_mask));
- t = TCG_REG_R0;
- }
-
- /* Mask the address for the requested alignment. */
- if (TARGET_LONG_BITS == 32) {
- tcg_out_rlw(s, RLWINM, TCG_REG_R0, t, 0,
- (32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
- /* Zero-extend the address for use in the final address. */
- tcg_out_ext32u(s, TCG_REG_R4, addrlo);
- addrlo = TCG_REG_R4;
- } else if (a_bits == 0) {
- tcg_out_rld(s, RLDICR, TCG_REG_R0, t, 0, 63 - TARGET_PAGE_BITS);
- } else {
- tcg_out_rld(s, RLDICL, TCG_REG_R0, t,
- 64 - TARGET_PAGE_BITS, TARGET_PAGE_BITS - a_bits);
- tcg_out_rld(s, RLDICL, TCG_REG_R0, TCG_REG_R0, TARGET_PAGE_BITS, 0);
- }
- }
-
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
- tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
- 0, 7, TCG_TYPE_I32);
- tcg_out_cmp(s, TCG_COND_EQ, addrhi, TCG_REG_R4, 0, 6, TCG_TYPE_I32);
- tcg_out32(s, CRAND | BT(7, CR_EQ) | BA(6, CR_EQ) | BB(7, CR_EQ));
- } else {
- tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
- 0, 7, TCG_TYPE_TL);
- }
-
- return addrlo;
-}
-
-/* Record the context of a call to the out of line helper code for the slow
- path for a load or store, so that we can later generate the correct
- helper code. */
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld,
- TCGType type, MemOpIdx oi,
- TCGReg datalo_reg, TCGReg datahi_reg,
- TCGReg addrlo_reg, TCGReg addrhi_reg,
- tcg_insn_unit *raddr, tcg_insn_unit *lptr)
-{
- TCGLabelQemuLdst *label = new_ldst_label(s);
-
- label->is_ld = is_ld;
- label->type = type;
- label->oi = oi;
- label->datalo_reg = datalo_reg;
- label->datahi_reg = datahi_reg;
- label->addrlo_reg = addrlo_reg;
- label->addrhi_reg = addrhi_reg;
- label->raddr = tcg_splitwx_to_rx(raddr);
- label->label_ptr[0] = lptr;
-}
-
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
{
MemOpIdx oi = lb->oi;
@@ -2225,27 +2091,6 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
return true;
}
#else
-
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
- TCGReg addrhi, unsigned a_bits)
-{
- unsigned a_mask = (1 << a_bits) - 1;
- TCGLabelQemuLdst *label = new_ldst_label(s);
-
- label->is_ld = is_ld;
- label->addrlo_reg = addrlo;
- label->addrhi_reg = addrhi;
-
- /* We are expecting a_bits to max out at 7, much lower than ANDI. */
- tcg_debug_assert(a_bits < 16);
- tcg_out32(s, ANDI | SAI(addrlo, TCG_REG_R0, a_mask));
-
- label->label_ptr[0] = s->code_ptr;
- tcg_out32(s, BC | BI(0, CR_EQ) | BO_COND_FALSE | LK);
-
- label->raddr = tcg_splitwx_to_rx(s->code_ptr);
-}
-
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
{
if (!reloc_pc14(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
@@ -2294,37 +2139,167 @@ typedef struct {
TCGReg index;
} HostAddress;
+/*
+ * For softmmu, perform the TLB load and compare.
+ * For useronly, perform any required alignment tests.
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
+ * is required and fill in @h with the host address for the fast path.
+ */
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
+ TCGReg addrlo, TCGReg addrhi,
+ MemOpIdx oi, bool is_ld)
+{
+ TCGLabelQemuLdst *ldst = NULL;
+ MemOp opc = get_memop(oi);
+ unsigned a_bits = get_alignment_bits(opc);
+
+#ifdef CONFIG_SOFTMMU
+ int mem_index = get_mmuidx(oi);
+ int cmp_off = is_ld ? offsetof(CPUTLBEntry, addr_read)
+ : offsetof(CPUTLBEntry, addr_write);
+ int fast_off = TLB_MASK_TABLE_OFS(mem_index);
+ int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
+ int table_off = fast_off + offsetof(CPUTLBDescFast, table);
+ unsigned s_bits = opc & MO_SIZE;
+
+ ldst = new_ldst_label(s);
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addrlo;
+ ldst->addrhi_reg = addrhi;
+
+ /* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R3, TCG_AREG0, mask_off);
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R4, TCG_AREG0, table_off);
+
+ /* Extract the page index, shifted into place for tlb index. */
+ if (TCG_TARGET_REG_BITS == 32) {
+ tcg_out_shri32(s, TCG_REG_TMP1, addrlo,
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+ } else {
+ tcg_out_shri64(s, TCG_REG_TMP1, addrlo,
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+ }
+ tcg_out32(s, AND | SAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_TMP1));
+
+ /* Load the TLB comparator. */
+ if (cmp_off == 0 && TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
+ uint32_t lxu = (TCG_TARGET_REG_BITS == 32 || TARGET_LONG_BITS == 32
+ ? LWZUX : LDUX);
+ tcg_out32(s, lxu | TAB(TCG_REG_TMP1, TCG_REG_R3, TCG_REG_R4));
+ } else {
+ tcg_out32(s, ADD | TAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_R4));
+ if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
+ tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP1, TCG_REG_R3, cmp_off + 4);
+ tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_R4, TCG_REG_R3, cmp_off);
+ } else {
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP1, TCG_REG_R3, cmp_off);
+ }
+ }
+
+ /* Load the TLB addend for use on the fast path. Do this asap
+ to minimize any load use delay. */
+ h->base = TCG_REG_R3;
+ tcg_out_ld(s, TCG_TYPE_PTR, h->base, TCG_REG_R3,
+ offsetof(CPUTLBEntry, addend));
+
+ /* Clear the non-page, non-alignment bits from the address */
+ if (TCG_TARGET_REG_BITS == 32) {
+ /* We don't support unaligned accesses on 32-bits.
+ * Preserve the bottom bits and thus trigger a comparison
+ * failure on unaligned accesses.
+ */
+ if (a_bits < s_bits) {
+ a_bits = s_bits;
+ }
+ tcg_out_rlw(s, RLWINM, TCG_REG_R0, addrlo, 0,
+ (32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
+ } else {
+ TCGReg t = addrlo;
+
+ /* If the access is unaligned, we need to make sure we fail if we
+ * cross a page boundary. The trick is to add the access size-1
+ * to the address before masking the low bits. That will make the
+ * address overflow to the next page if we cross a page boundary,
+ * which will then force a mismatch of the TLB compare.
+ */
+ if (a_bits < s_bits) {
+ unsigned a_mask = (1 << a_bits) - 1;
+ unsigned s_mask = (1 << s_bits) - 1;
+ tcg_out32(s, ADDI | TAI(TCG_REG_R0, t, s_mask - a_mask));
+ t = TCG_REG_R0;
+ }
+
+ /* Mask the address for the requested alignment. */
+ if (TARGET_LONG_BITS == 32) {
+ tcg_out_rlw(s, RLWINM, TCG_REG_R0, t, 0,
+ (32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
+ /* Zero-extend the address for use in the final address. */
+ tcg_out_ext32u(s, TCG_REG_R4, addrlo);
+ addrlo = TCG_REG_R4;
+ } else if (a_bits == 0) {
+ tcg_out_rld(s, RLDICR, TCG_REG_R0, t, 0, 63 - TARGET_PAGE_BITS);
+ } else {
+ tcg_out_rld(s, RLDICL, TCG_REG_R0, t,
+ 64 - TARGET_PAGE_BITS, TARGET_PAGE_BITS - a_bits);
+ tcg_out_rld(s, RLDICL, TCG_REG_R0, TCG_REG_R0, TARGET_PAGE_BITS, 0);
+ }
+ }
+ h->index = addrlo;
+
+ if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
+ tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
+ 0, 7, TCG_TYPE_I32);
+ tcg_out_cmp(s, TCG_COND_EQ, addrhi, TCG_REG_R4, 0, 6, TCG_TYPE_I32);
+ tcg_out32(s, CRAND | BT(7, CR_EQ) | BA(6, CR_EQ) | BB(7, CR_EQ));
+ } else {
+ tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
+ 0, 7, TCG_TYPE_TL);
+ }
+
+ /* Load a pointer into the current opcode w/conditional branch-link. */
+ ldst->label_ptr[0] = s->code_ptr;
+ tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
+#else
+ if (a_bits) {
+ ldst = new_ldst_label(s);
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addrlo;
+ ldst->addrhi_reg = addrhi;
+
+ /* We are expecting a_bits to max out at 7, much lower than ANDI. */
+ tcg_debug_assert(a_bits < 16);
+ tcg_out32(s, ANDI | SAI(addrlo, TCG_REG_R0, (1 << a_bits) - 1));
+
+ ldst->label_ptr[0] = s->code_ptr;
+ tcg_out32(s, BC | BI(0, CR_EQ) | BO_COND_FALSE | LK);
+ }
+
+ h->base = guest_base ? TCG_GUEST_BASE_REG : 0;
+ h->index = addrlo;
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
+ tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
+ h->index = TCG_REG_TMP1;
+ }
+#endif
+
+ return ldst;
+}
+
static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
TCGReg addrlo, TCGReg addrhi,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
- MemOp s_bits = opc & MO_SIZE;
+ TCGLabelQemuLdst *ldst;
HostAddress h;
-#ifdef CONFIG_SOFTMMU
- tcg_insn_unit *label_ptr;
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
- h.index = tcg_out_tlb_read(s, opc, addrlo, addrhi, get_mmuidx(oi), true);
- h.base = TCG_REG_R3;
-
- /* Load a pointer into the current opcode w/conditional branch-link. */
- label_ptr = s->code_ptr;
- tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
-#else /* !CONFIG_SOFTMMU */
- unsigned a_bits = get_alignment_bits(opc);
- if (a_bits) {
- tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
- }
- h.base = guest_base ? TCG_GUEST_BASE_REG : 0;
- h.index = addrlo;
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
- tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
- h.index = TCG_REG_TMP1;
- }
-#endif
-
- if (TCG_TARGET_REG_BITS == 32 && s_bits == MO_64) {
+ if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
if (opc & MO_BSWAP) {
tcg_out32(s, ADDI | TAI(TCG_REG_R0, h.index, 4));
tcg_out32(s, LWBRX | TAB(datalo, h.base, h.index));
@@ -2357,10 +2332,12 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
}
}
-#ifdef CONFIG_SOFTMMU
- add_qemu_ldst_label(s, true, data_type, oi, datalo, datahi,
- addrlo, addrhi, s->code_ptr, label_ptr);
-#endif
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = datalo;
+ ldst->datahi_reg = datahi;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
+ }
}
static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
@@ -2368,32 +2345,12 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
- MemOp s_bits = opc & MO_SIZE;
+ TCGLabelQemuLdst *ldst;
HostAddress h;
-#ifdef CONFIG_SOFTMMU
- tcg_insn_unit *label_ptr;
+ ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, false);
- h.index = tcg_out_tlb_read(s, opc, addrlo, addrhi, get_mmuidx(oi), false);
- h.base = TCG_REG_R3;
-
- /* Load a pointer into the current opcode w/conditional branch-link. */
- label_ptr = s->code_ptr;
- tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
-#else /* !CONFIG_SOFTMMU */
- unsigned a_bits = get_alignment_bits(opc);
- if (a_bits) {
- tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
- }
- h.base = guest_base ? TCG_GUEST_BASE_REG : 0;
- h.index = addrlo;
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
- tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
- h.index = TCG_REG_TMP1;
- }
-#endif
-
- if (TCG_TARGET_REG_BITS == 32 && s_bits == MO_64) {
+ if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
if (opc & MO_BSWAP) {
tcg_out32(s, ADDI | TAI(TCG_REG_R0, h.index, 4));
tcg_out32(s, STWBRX | SAB(datalo, h.base, h.index));
@@ -2418,10 +2375,12 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, TCGReg datahi,
}
}
-#ifdef CONFIG_SOFTMMU
- add_qemu_ldst_label(s, false, data_type, oi, datalo, datahi,
- addrlo, addrhi, s->code_ptr, label_ptr);
-#endif
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = datalo;
+ ldst->datahi_reg = datahi;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
+ }
}
static void tcg_out_nop_fill(tcg_insn_unit *p, int count)
--
2.34.1
* [PATCH v4 22/54] tcg/riscv: Require TCG_TARGET_REG_BITS == 64
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (20 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 21/54] tcg/ppc: Introduce prepare_host_addr Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 23/54] tcg/riscv: Rationalize args to tcg_out_qemu_{ld,st} Richard Henderson
` (31 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel
Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x,
Daniel Henrique Barboza
The port currently does not support "oversize" guests, which
means riscv32 can only target 32-bit guests. We will soon be
building TCG once for all guests. This implies that we can
only support riscv64.
Since all Linux distributions target riscv64, not riscv32,
this is not much of a restriction and simplifies the code.
The brcond2 and setcond2 opcodes are exclusive to 32-bit hosts,
so we can and should remove the stubs.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/riscv/tcg-target-con-set.h | 8 --
tcg/riscv/tcg-target.h | 22 ++--
tcg/riscv/tcg-target.c.inc | 232 +++++++++------------------------
3 files changed, 72 insertions(+), 190 deletions(-)
diff --git a/tcg/riscv/tcg-target-con-set.h b/tcg/riscv/tcg-target-con-set.h
index cf0ac4d751..d4cff673b0 100644
--- a/tcg/riscv/tcg-target-con-set.h
+++ b/tcg/riscv/tcg-target-con-set.h
@@ -13,18 +13,10 @@ C_O0_I1(r)
C_O0_I2(LZ, L)
C_O0_I2(rZ, r)
C_O0_I2(rZ, rZ)
-C_O0_I3(LZ, L, L)
-C_O0_I3(LZ, LZ, L)
-C_O0_I4(LZ, LZ, L, L)
-C_O0_I4(rZ, rZ, rZ, rZ)
C_O1_I1(r, L)
C_O1_I1(r, r)
-C_O1_I2(r, L, L)
C_O1_I2(r, r, ri)
C_O1_I2(r, r, rI)
C_O1_I2(r, rZ, rN)
C_O1_I2(r, rZ, rZ)
-C_O1_I4(r, rZ, rZ, rZ, rZ)
-C_O2_I1(r, r, L)
-C_O2_I2(r, r, L, L)
C_O2_I4(r, r, rZ, rZ, rM, rM)
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index 0deb33701f..dddf2486c1 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -25,11 +25,14 @@
#ifndef RISCV_TCG_TARGET_H
#define RISCV_TCG_TARGET_H
-#if __riscv_xlen == 32
-# define TCG_TARGET_REG_BITS 32
-#elif __riscv_xlen == 64
-# define TCG_TARGET_REG_BITS 64
+/*
+ * We don't support oversize guests.
+ * Since we will only build tcg once, this in turn requires a 64-bit host.
+ */
+#if __riscv_xlen != 64
+#error "unsupported code generation mode"
#endif
+#define TCG_TARGET_REG_BITS 64
#define TCG_TARGET_INSN_UNIT_SIZE 4
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 20
@@ -83,13 +86,8 @@ typedef enum {
#define TCG_TARGET_STACK_ALIGN 16
#define TCG_TARGET_CALL_STACK_OFFSET 0
#define TCG_TARGET_CALL_ARG_I32 TCG_CALL_ARG_NORMAL
-#if TCG_TARGET_REG_BITS == 32
-#define TCG_TARGET_CALL_ARG_I64 TCG_CALL_ARG_EVEN
-#define TCG_TARGET_CALL_ARG_I128 TCG_CALL_ARG_EVEN
-#else
#define TCG_TARGET_CALL_ARG_I64 TCG_CALL_ARG_NORMAL
#define TCG_TARGET_CALL_ARG_I128 TCG_CALL_ARG_NORMAL
-#endif
#define TCG_TARGET_CALL_RET_I128 TCG_CALL_RET_NORMAL
/* optional instructions */
@@ -106,8 +104,8 @@ typedef enum {
#define TCG_TARGET_HAS_sub2_i32 1
#define TCG_TARGET_HAS_mulu2_i32 0
#define TCG_TARGET_HAS_muls2_i32 0
-#define TCG_TARGET_HAS_muluh_i32 (TCG_TARGET_REG_BITS == 32)
-#define TCG_TARGET_HAS_mulsh_i32 (TCG_TARGET_REG_BITS == 32)
+#define TCG_TARGET_HAS_muluh_i32 0
+#define TCG_TARGET_HAS_mulsh_i32 0
#define TCG_TARGET_HAS_ext8s_i32 1
#define TCG_TARGET_HAS_ext16s_i32 1
#define TCG_TARGET_HAS_ext8u_i32 1
@@ -128,7 +126,6 @@ typedef enum {
#define TCG_TARGET_HAS_setcond2 1
#define TCG_TARGET_HAS_qemu_st8_i32 0
-#if TCG_TARGET_REG_BITS == 64
#define TCG_TARGET_HAS_movcond_i64 0
#define TCG_TARGET_HAS_div_i64 1
#define TCG_TARGET_HAS_rem_i64 1
@@ -165,7 +162,6 @@ typedef enum {
#define TCG_TARGET_HAS_muls2_i64 0
#define TCG_TARGET_HAS_muluh_i64 1
#define TCG_TARGET_HAS_mulsh_i64 1
-#endif
#define TCG_TARGET_DEFAULT_MO (0)
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 266fe1433d..7a674ff5ce 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -137,15 +137,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
#define SOFTMMU_RESERVE_REGS 0
#endif
-
-static inline tcg_target_long sextreg(tcg_target_long val, int pos, int len)
-{
- if (TCG_TARGET_REG_BITS == 32) {
- return sextract32(val, pos, len);
- } else {
- return sextract64(val, pos, len);
- }
-}
+#define sextreg sextract64
/* test if a constant matches the constraint */
static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
@@ -235,7 +227,6 @@ typedef enum {
OPC_XOR = 0x4033,
OPC_XORI = 0x4013,
-#if TCG_TARGET_REG_BITS == 64
OPC_ADDIW = 0x1b,
OPC_ADDW = 0x3b,
OPC_DIVUW = 0x200503b,
@@ -250,23 +241,6 @@ typedef enum {
OPC_SRLIW = 0x501b,
OPC_SRLW = 0x503b,
OPC_SUBW = 0x4000003b,
-#else
- /* Simplify code throughout by defining aliases for RV32. */
- OPC_ADDIW = OPC_ADDI,
- OPC_ADDW = OPC_ADD,
- OPC_DIVUW = OPC_DIVU,
- OPC_DIVW = OPC_DIV,
- OPC_MULW = OPC_MUL,
- OPC_REMUW = OPC_REMU,
- OPC_REMW = OPC_REM,
- OPC_SLLIW = OPC_SLLI,
- OPC_SLLW = OPC_SLL,
- OPC_SRAIW = OPC_SRAI,
- OPC_SRAW = OPC_SRA,
- OPC_SRLIW = OPC_SRLI,
- OPC_SRLW = OPC_SRL,
- OPC_SUBW = OPC_SUB,
-#endif
OPC_FENCE = 0x0000000f,
OPC_NOP = OPC_ADDI, /* nop = addi r0,r0,0 */
@@ -500,7 +474,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
tcg_target_long lo, hi, tmp;
int shift, ret;
- if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I32) {
+ if (type == TCG_TYPE_I32) {
val = (int32_t)val;
}
@@ -511,7 +485,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
}
hi = val - lo;
- if (TCG_TARGET_REG_BITS == 32 || val == (int32_t)val) {
+ if (val == (int32_t)val) {
tcg_out_opc_upper(s, OPC_LUI, rd, hi);
if (lo != 0) {
tcg_out_opc_imm(s, OPC_ADDIW, rd, rd, lo);
@@ -519,7 +493,6 @@ static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
return;
}
- /* We can only be here if TCG_TARGET_REG_BITS != 32 */
tmp = tcg_pcrel_diff(s, (void *)val);
if (tmp == (int32_t)tmp) {
tcg_out_opc_upper(s, OPC_AUIPC, rd, 0);
@@ -668,15 +641,15 @@ static void tcg_out_ldst(TCGContext *s, RISCVInsn opc, TCGReg data,
static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg arg,
TCGReg arg1, intptr_t arg2)
{
- bool is32bit = (TCG_TARGET_REG_BITS == 32 || type == TCG_TYPE_I32);
- tcg_out_ldst(s, is32bit ? OPC_LW : OPC_LD, arg, arg1, arg2);
+ RISCVInsn insn = type == TCG_TYPE_I32 ? OPC_LW : OPC_LD;
+ tcg_out_ldst(s, insn, arg, arg1, arg2);
}
static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg,
TCGReg arg1, intptr_t arg2)
{
- bool is32bit = (TCG_TARGET_REG_BITS == 32 || type == TCG_TYPE_I32);
- tcg_out_ldst(s, is32bit ? OPC_SW : OPC_SD, arg, arg1, arg2);
+ RISCVInsn insn = type == TCG_TYPE_I32 ? OPC_SW : OPC_SD;
+ tcg_out_ldst(s, insn, arg, arg1, arg2);
}
static bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
@@ -829,20 +802,6 @@ static void tcg_out_setcond(TCGContext *s, TCGCond cond, TCGReg ret,
}
}
-static void tcg_out_brcond2(TCGContext *s, TCGCond cond, TCGReg al, TCGReg ah,
- TCGReg bl, TCGReg bh, TCGLabel *l)
-{
- /* todo */
- g_assert_not_reached();
-}
-
-static void tcg_out_setcond2(TCGContext *s, TCGCond cond, TCGReg ret,
- TCGReg al, TCGReg ah, TCGReg bl, TCGReg bh)
-{
- /* todo */
- g_assert_not_reached();
-}
-
static void tcg_out_call_int(TCGContext *s, const tcg_insn_unit *arg, bool tail)
{
TCGReg link = tail ? TCG_REG_ZERO : TCG_REG_RA;
@@ -853,20 +812,18 @@ static void tcg_out_call_int(TCGContext *s, const tcg_insn_unit *arg, bool tail)
if (offset == sextreg(offset, 0, 20)) {
/* short jump: -2097150 to 2097152 */
tcg_out_opc_jump(s, OPC_JAL, link, offset);
- } else if (TCG_TARGET_REG_BITS == 32 || offset == (int32_t)offset) {
+ } else if (offset == (int32_t)offset) {
/* long jump: -2147483646 to 2147483648 */
tcg_out_opc_upper(s, OPC_AUIPC, TCG_REG_TMP0, 0);
tcg_out_opc_imm(s, OPC_JALR, link, TCG_REG_TMP0, 0);
ret = reloc_call(s->code_ptr - 2, arg);
tcg_debug_assert(ret == true);
- } else if (TCG_TARGET_REG_BITS == 64) {
+ } else {
/* far jump: 64-bit */
tcg_target_long imm = sextreg((tcg_target_long)arg, 0, 12);
tcg_target_long base = (tcg_target_long)arg - imm;
tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_TMP0, base);
tcg_out_opc_imm(s, OPC_JALR, link, TCG_REG_TMP0, imm);
- } else {
- g_assert_not_reached();
}
}
@@ -942,9 +899,6 @@ static void * const qemu_st_helpers[MO_SIZE + 1] = {
#endif
};
-/* We don't support oversize guests */
-QEMU_BUILD_BUG_ON(TCG_TARGET_REG_BITS < TARGET_LONG_BITS);
-
/* We expect to use a 12-bit negative offset from ENV. */
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
@@ -956,8 +910,7 @@ static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
tcg_debug_assert(ok);
}
-static TCGReg tcg_out_tlb_load(TCGContext *s, TCGReg addrl,
- TCGReg addrh, MemOpIdx oi,
+static TCGReg tcg_out_tlb_load(TCGContext *s, TCGReg addr, MemOpIdx oi,
tcg_insn_unit **label_ptr, bool is_load)
{
MemOp opc = get_memop(oi);
@@ -973,7 +926,7 @@ static TCGReg tcg_out_tlb_load(TCGContext *s, TCGReg addrl,
tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, mask_base, mask_ofs);
tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, table_base, table_ofs);
- tcg_out_opc_imm(s, OPC_SRLI, TCG_REG_TMP2, addrl,
+ tcg_out_opc_imm(s, OPC_SRLI, TCG_REG_TMP2, addr,
TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
@@ -992,10 +945,10 @@ static TCGReg tcg_out_tlb_load(TCGContext *s, TCGReg addrl,
/* Clear the non-page, non-alignment bits from the address. */
compare_mask = (tcg_target_long)TARGET_PAGE_MASK | ((1 << a_bits) - 1);
if (compare_mask == sextreg(compare_mask, 0, 12)) {
- tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addrl, compare_mask);
+ tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr, compare_mask);
} else {
tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
- tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP1, TCG_REG_TMP1, addrl);
+ tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP1, TCG_REG_TMP1, addr);
}
/* Compare masked address with the TLB entry. */
@@ -1003,29 +956,26 @@ static TCGReg tcg_out_tlb_load(TCGContext *s, TCGReg addrl,
tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP0, TCG_REG_TMP1, 0);
/* TLB Hit - translate address using addend. */
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
- tcg_out_ext32u(s, TCG_REG_TMP0, addrl);
- addrl = TCG_REG_TMP0;
+ if (TARGET_LONG_BITS == 32) {
+ tcg_out_ext32u(s, TCG_REG_TMP0, addr);
+ addr = TCG_REG_TMP0;
}
- tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_REG_TMP2, addrl);
+ tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_REG_TMP2, addr);
return TCG_REG_TMP0;
}
static void add_qemu_ldst_label(TCGContext *s, int is_ld, MemOpIdx oi,
- TCGType ext,
- TCGReg datalo, TCGReg datahi,
- TCGReg addrlo, TCGReg addrhi,
- void *raddr, tcg_insn_unit **label_ptr)
+ TCGType data_type, TCGReg data_reg,
+ TCGReg addr_reg, void *raddr,
+ tcg_insn_unit **label_ptr)
{
TCGLabelQemuLdst *label = new_ldst_label(s);
label->is_ld = is_ld;
label->oi = oi;
- label->type = ext;
- label->datalo_reg = datalo;
- label->datahi_reg = datahi;
- label->addrlo_reg = addrlo;
- label->addrhi_reg = addrhi;
+ label->type = data_type;
+ label->datalo_reg = data_reg;
+ label->addrlo_reg = addr_reg;
label->raddr = tcg_splitwx_to_rx(raddr);
label->label_ptr[0] = label_ptr[0];
}
@@ -1039,11 +989,6 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
TCGReg a2 = tcg_target_call_iarg_regs[2];
TCGReg a3 = tcg_target_call_iarg_regs[3];
- /* We don't support oversize guests */
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
- g_assert_not_reached();
- }
-
/* resolve label address */
if (!reloc_sbimm12(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
return false;
@@ -1073,11 +1018,6 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
TCGReg a3 = tcg_target_call_iarg_regs[3];
TCGReg a4 = tcg_target_call_iarg_regs[4];
- /* We don't support oversize guests */
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
- g_assert_not_reached();
- }
-
/* resolve label address */
if (!reloc_sbimm12(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
return false;
@@ -1146,7 +1086,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
#endif /* CONFIG_SOFTMMU */
-static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
+static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg val,
TCGReg base, MemOp opc, bool is_64)
{
/* Byte swapping is left to middle-end expansion. */
@@ -1154,37 +1094,28 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
switch (opc & (MO_SSIZE)) {
case MO_UB:
- tcg_out_opc_imm(s, OPC_LBU, lo, base, 0);
+ tcg_out_opc_imm(s, OPC_LBU, val, base, 0);
break;
case MO_SB:
- tcg_out_opc_imm(s, OPC_LB, lo, base, 0);
+ tcg_out_opc_imm(s, OPC_LB, val, base, 0);
break;
case MO_UW:
- tcg_out_opc_imm(s, OPC_LHU, lo, base, 0);
+ tcg_out_opc_imm(s, OPC_LHU, val, base, 0);
break;
case MO_SW:
- tcg_out_opc_imm(s, OPC_LH, lo, base, 0);
+ tcg_out_opc_imm(s, OPC_LH, val, base, 0);
break;
case MO_UL:
- if (TCG_TARGET_REG_BITS == 64 && is_64) {
- tcg_out_opc_imm(s, OPC_LWU, lo, base, 0);
+ if (is_64) {
+ tcg_out_opc_imm(s, OPC_LWU, val, base, 0);
break;
}
/* FALLTHRU */
case MO_SL:
- tcg_out_opc_imm(s, OPC_LW, lo, base, 0);
+ tcg_out_opc_imm(s, OPC_LW, val, base, 0);
break;
case MO_UQ:
- /* Prefer to load from offset 0 first, but allow for overlap. */
- if (TCG_TARGET_REG_BITS == 64) {
- tcg_out_opc_imm(s, OPC_LD, lo, base, 0);
- } else if (lo != base) {
- tcg_out_opc_imm(s, OPC_LW, lo, base, 0);
- tcg_out_opc_imm(s, OPC_LW, hi, base, 4);
- } else {
- tcg_out_opc_imm(s, OPC_LW, hi, base, 4);
- tcg_out_opc_imm(s, OPC_LW, lo, base, 0);
- }
+ tcg_out_opc_imm(s, OPC_LD, val, base, 0);
break;
default:
g_assert_not_reached();
@@ -1193,8 +1124,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is_64)
{
- TCGReg addr_regl, addr_regh __attribute__((unused));
- TCGReg data_regl, data_regh;
+ TCGReg addr_reg, data_reg;
MemOpIdx oi;
MemOp opc;
#if defined(CONFIG_SOFTMMU)
@@ -1204,27 +1134,23 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is_64)
#endif
TCGReg base;
- data_regl = *args++;
- data_regh = (TCG_TARGET_REG_BITS == 32 && is_64 ? *args++ : 0);
- addr_regl = *args++;
- addr_regh = (TCG_TARGET_REG_BITS < TARGET_LONG_BITS ? *args++ : 0);
+ data_reg = *args++;
+ addr_reg = *args++;
oi = *args++;
opc = get_memop(oi);
#if defined(CONFIG_SOFTMMU)
- base = tcg_out_tlb_load(s, addr_regl, addr_regh, oi, label_ptr, 1);
- tcg_out_qemu_ld_direct(s, data_regl, data_regh, base, opc, is_64);
- add_qemu_ldst_label(s, 1, oi,
- (is_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
- data_regl, data_regh, addr_regl, addr_regh,
- s->code_ptr, label_ptr);
+ base = tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 1);
+ tcg_out_qemu_ld_direct(s, data_reg, base, opc, is_64);
+ add_qemu_ldst_label(s, 1, oi, (is_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
+ data_reg, addr_reg, s->code_ptr, label_ptr);
#else
a_bits = get_alignment_bits(opc);
if (a_bits) {
- tcg_out_test_alignment(s, true, addr_regl, a_bits);
+ tcg_out_test_alignment(s, true, addr_reg, a_bits);
}
- base = addr_regl;
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
+ base = addr_reg;
+ if (TARGET_LONG_BITS == 32) {
tcg_out_ext32u(s, TCG_REG_TMP0, base);
base = TCG_REG_TMP0;
}
@@ -1232,11 +1158,11 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is_64)
tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_GUEST_BASE_REG, base);
base = TCG_REG_TMP0;
}
- tcg_out_qemu_ld_direct(s, data_regl, data_regh, base, opc, is_64);
+ tcg_out_qemu_ld_direct(s, data_reg, base, opc, is_64);
#endif
}
-static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg lo, TCGReg hi,
+static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg val,
TCGReg base, MemOp opc)
{
/* Byte swapping is left to middle-end expansion. */
@@ -1244,21 +1170,16 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg lo, TCGReg hi,
switch (opc & (MO_SSIZE)) {
case MO_8:
- tcg_out_opc_store(s, OPC_SB, base, lo, 0);
+ tcg_out_opc_store(s, OPC_SB, base, val, 0);
break;
case MO_16:
- tcg_out_opc_store(s, OPC_SH, base, lo, 0);
+ tcg_out_opc_store(s, OPC_SH, base, val, 0);
break;
case MO_32:
- tcg_out_opc_store(s, OPC_SW, base, lo, 0);
+ tcg_out_opc_store(s, OPC_SW, base, val, 0);
break;
case MO_64:
- if (TCG_TARGET_REG_BITS == 64) {
- tcg_out_opc_store(s, OPC_SD, base, lo, 0);
- } else {
- tcg_out_opc_store(s, OPC_SW, base, lo, 0);
- tcg_out_opc_store(s, OPC_SW, base, hi, 4);
- }
+ tcg_out_opc_store(s, OPC_SD, base, val, 0);
break;
default:
g_assert_not_reached();
@@ -1267,8 +1188,7 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg lo, TCGReg hi,
static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is_64)
{
- TCGReg addr_regl, addr_regh __attribute__((unused));
- TCGReg data_regl, data_regh;
+ TCGReg addr_reg, data_reg;
MemOpIdx oi;
MemOp opc;
#if defined(CONFIG_SOFTMMU)
@@ -1278,27 +1198,23 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is_64)
#endif
TCGReg base;
- data_regl = *args++;
- data_regh = (TCG_TARGET_REG_BITS == 32 && is_64 ? *args++ : 0);
- addr_regl = *args++;
- addr_regh = (TCG_TARGET_REG_BITS < TARGET_LONG_BITS ? *args++ : 0);
+ data_reg = *args++;
+ addr_reg = *args++;
oi = *args++;
opc = get_memop(oi);
#if defined(CONFIG_SOFTMMU)
- base = tcg_out_tlb_load(s, addr_regl, addr_regh, oi, label_ptr, 0);
- tcg_out_qemu_st_direct(s, data_regl, data_regh, base, opc);
- add_qemu_ldst_label(s, 0, oi,
- (is_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
- data_regl, data_regh, addr_regl, addr_regh,
- s->code_ptr, label_ptr);
+ base = tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 0);
+ tcg_out_qemu_st_direct(s, data_reg, base, opc);
+ add_qemu_ldst_label(s, 0, oi, (is_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
+ data_reg, addr_reg, s->code_ptr, label_ptr);
#else
a_bits = get_alignment_bits(opc);
if (a_bits) {
- tcg_out_test_alignment(s, false, addr_regl, a_bits);
+ tcg_out_test_alignment(s, false, addr_reg, a_bits);
}
- base = addr_regl;
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
+ base = addr_reg;
+ if (TARGET_LONG_BITS == 32) {
tcg_out_ext32u(s, TCG_REG_TMP0, base);
base = TCG_REG_TMP0;
}
@@ -1306,7 +1222,7 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is_64)
tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_GUEST_BASE_REG, base);
base = TCG_REG_TMP0;
}
- tcg_out_qemu_st_direct(s, data_regl, data_regh, base, opc);
+ tcg_out_qemu_st_direct(s, data_reg, base, opc);
#endif
}
@@ -1585,17 +1501,11 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
case INDEX_op_brcond_i64:
tcg_out_brcond(s, a2, a0, a1, arg_label(args[3]));
break;
- case INDEX_op_brcond2_i32:
- tcg_out_brcond2(s, args[4], a0, a1, a2, args[3], arg_label(args[5]));
- break;
case INDEX_op_setcond_i32:
case INDEX_op_setcond_i64:
tcg_out_setcond(s, args[3], a0, a1, a2);
break;
- case INDEX_op_setcond2_i32:
- tcg_out_setcond2(s, args[5], a0, a1, a2, args[3], args[4]);
- break;
case INDEX_op_qemu_ld_i32:
tcg_out_qemu_ld(s, args, false);
@@ -1748,26 +1658,12 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_sub2_i64:
return C_O2_I4(r, r, rZ, rZ, rM, rM);
- case INDEX_op_brcond2_i32:
- return C_O0_I4(rZ, rZ, rZ, rZ);
-
- case INDEX_op_setcond2_i32:
- return C_O1_I4(r, rZ, rZ, rZ, rZ);
-
case INDEX_op_qemu_ld_i32:
- return (TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
- ? C_O1_I1(r, L) : C_O1_I2(r, L, L));
- case INDEX_op_qemu_st_i32:
- return (TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
- ? C_O0_I2(LZ, L) : C_O0_I3(LZ, L, L));
case INDEX_op_qemu_ld_i64:
- return (TCG_TARGET_REG_BITS == 64 ? C_O1_I1(r, L)
- : TARGET_LONG_BITS <= TCG_TARGET_REG_BITS ? C_O2_I1(r, r, L)
- : C_O2_I2(r, r, L, L));
+ return C_O1_I1(r, L);
+ case INDEX_op_qemu_st_i32:
case INDEX_op_qemu_st_i64:
- return (TCG_TARGET_REG_BITS == 64 ? C_O0_I2(LZ, L)
- : TARGET_LONG_BITS <= TCG_TARGET_REG_BITS ? C_O0_I3(LZ, LZ, L)
- : C_O0_I4(LZ, LZ, L, L));
+ return C_O0_I2(LZ, L);
default:
g_assert_not_reached();
@@ -1843,9 +1739,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
static void tcg_target_init(TCGContext *s)
{
tcg_target_available_regs[TCG_TYPE_I32] = 0xffffffff;
- if (TCG_TARGET_REG_BITS == 64) {
- tcg_target_available_regs[TCG_TYPE_I64] = 0xffffffff;
- }
+ tcg_target_available_regs[TCG_TYPE_I64] = 0xffffffff;
tcg_target_call_clobber_regs = -1u;
tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S0);
--
2.34.1
* [PATCH v4 23/54] tcg/riscv: Rationalize args to tcg_out_qemu_{ld,st}
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (21 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 22/54] tcg/riscv: Require TCG_TARGET_REG_BITS == 64 Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:56 ` [PATCH v4 24/54] tcg/riscv: Introduce prepare_host_addr Richard Henderson
` (30 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel
Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x,
Daniel Henrique Barboza
Interpret the variable argument placement in the caller. Pass data_type
instead of is_64 -- there are several places where we already convert back
from bool to type. Clean things up by using type throughout.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
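A minimal sketch of the interface change, with stand-in types rather
than the real TCG declarations: the *args walk moves out of the
backend, and the callee takes named parameters plus a TCGType instead
of a bool:

    #include <stdint.h>
    #include <stdio.h>

    typedef uintptr_t TCGArg;
    typedef enum { TCG_TYPE_I32, TCG_TYPE_I64 } TCGType;

    /* Before: the backend decoded a variable-length argument array. */
    static void qemu_ld_old(const TCGArg *args, int is_64)
    {
        TCGArg data_reg = *args++;
        TCGArg addr_reg = *args++;
        TCGArg oi = *args++;
        printf("old: data=r%u addr=r%u oi=%u is_64=%d\n",
               (unsigned)data_reg, (unsigned)addr_reg, (unsigned)oi, is_64);
    }

    /* After: the caller interprets argument placement once. */
    static void qemu_ld_new(TCGArg data_reg, TCGArg addr_reg,
                            TCGArg oi, TCGType data_type)
    {
        printf("new: data=r%u addr=r%u oi=%u type=%s\n",
               (unsigned)data_reg, (unsigned)addr_reg, (unsigned)oi,
               data_type == TCG_TYPE_I64 ? "I64" : "I32");
    }

    int main(void)
    {
        TCGArg args[3] = { 10, 11, 42 };
        qemu_ld_old(args, 1);
        qemu_ld_new(args[0], args[1], args[2], TCG_TYPE_I64);
        return 0;
    }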
tcg/riscv/tcg-target.c.inc | 66 ++++++++++++++------------------------
1 file changed, 24 insertions(+), 42 deletions(-)
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 7a674ff5ce..a4cf60ca75 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -1087,7 +1087,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
#endif /* CONFIG_SOFTMMU */
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg val,
- TCGReg base, MemOp opc, bool is_64)
+ TCGReg base, MemOp opc, TCGType type)
{
/* Byte swapping is left to middle-end expansion. */
tcg_debug_assert((opc & MO_BSWAP) == 0);
@@ -1106,7 +1106,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg val,
tcg_out_opc_imm(s, OPC_LH, val, base, 0);
break;
case MO_UL:
- if (is_64) {
+ if (type == TCG_TYPE_I64) {
tcg_out_opc_imm(s, OPC_LWU, val, base, 0);
break;
}
@@ -1122,30 +1122,21 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg val,
}
}
-static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is_64)
+static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
+ MemOpIdx oi, TCGType data_type)
{
- TCGReg addr_reg, data_reg;
- MemOpIdx oi;
- MemOp opc;
-#if defined(CONFIG_SOFTMMU)
- tcg_insn_unit *label_ptr[1];
-#else
- unsigned a_bits;
-#endif
+ MemOp opc = get_memop(oi);
TCGReg base;
- data_reg = *args++;
- addr_reg = *args++;
- oi = *args++;
- opc = get_memop(oi);
-
#if defined(CONFIG_SOFTMMU)
+ tcg_insn_unit *label_ptr[1];
+
base = tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 1);
- tcg_out_qemu_ld_direct(s, data_reg, base, opc, is_64);
- add_qemu_ldst_label(s, 1, oi, (is_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
- data_reg, addr_reg, s->code_ptr, label_ptr);
+ tcg_out_qemu_ld_direct(s, data_reg, base, opc, data_type);
+ add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
+ s->code_ptr, label_ptr);
#else
- a_bits = get_alignment_bits(opc);
+ unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, true, addr_reg, a_bits);
}
@@ -1158,7 +1149,7 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is_64)
tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_GUEST_BASE_REG, base);
base = TCG_REG_TMP0;
}
- tcg_out_qemu_ld_direct(s, data_reg, base, opc, is_64);
+ tcg_out_qemu_ld_direct(s, data_reg, base, opc, data_type);
#endif
}
@@ -1186,30 +1177,21 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg val,
}
}
-static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is_64)
+static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
+ MemOpIdx oi, TCGType data_type)
{
- TCGReg addr_reg, data_reg;
- MemOpIdx oi;
- MemOp opc;
-#if defined(CONFIG_SOFTMMU)
- tcg_insn_unit *label_ptr[1];
-#else
- unsigned a_bits;
-#endif
+ MemOp opc = get_memop(oi);
TCGReg base;
- data_reg = *args++;
- addr_reg = *args++;
- oi = *args++;
- opc = get_memop(oi);
-
#if defined(CONFIG_SOFTMMU)
+ tcg_insn_unit *label_ptr[1];
+
base = tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 0);
tcg_out_qemu_st_direct(s, data_reg, base, opc);
- add_qemu_ldst_label(s, 0, oi, (is_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
- data_reg, addr_reg, s->code_ptr, label_ptr);
+ add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
+ s->code_ptr, label_ptr);
#else
- a_bits = get_alignment_bits(opc);
+ unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, false, addr_reg, a_bits);
}
@@ -1508,16 +1490,16 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
break;
case INDEX_op_qemu_ld_i32:
- tcg_out_qemu_ld(s, args, false);
+ tcg_out_qemu_ld(s, a0, a1, a2, TCG_TYPE_I32);
break;
case INDEX_op_qemu_ld_i64:
- tcg_out_qemu_ld(s, args, true);
+ tcg_out_qemu_ld(s, a0, a1, a2, TCG_TYPE_I64);
break;
case INDEX_op_qemu_st_i32:
- tcg_out_qemu_st(s, args, false);
+ tcg_out_qemu_st(s, a0, a1, a2, TCG_TYPE_I32);
break;
case INDEX_op_qemu_st_i64:
- tcg_out_qemu_st(s, args, true);
+ tcg_out_qemu_st(s, a0, a1, a2, TCG_TYPE_I64);
break;
case INDEX_op_extrh_i64_i32:
--
2.34.1
* [PATCH v4 24/54] tcg/riscv: Introduce prepare_host_addr
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (22 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 23/54] tcg/riscv: Rationalize args to tcg_out_qemu_{ld,st} Richard Henderson
@ 2023-05-03 6:56 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 25/54] tcg/s390x: Pass TCGType to tcg_out_qemu_{ld,st} Richard Henderson
` (29 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:56 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment,
and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st
into one function that returns TCGReg and TCGLabelQemuLdst.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
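For reference, a compilable sketch of the control flow this creates
(stand-in types, not QEMU code): prepare_host_addr always produces the
host base register, and returns a label only when a slow path was
emitted, which the caller then completes after the fast path:

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        bool is_ld;
        int type;
        int datalo_reg;
        const char *raddr;
    } TCGLabelQemuLdst;

    static TCGLabelQemuLdst *prepare_host_addr(int *pbase, int addr_reg,
                                               bool is_ld,
                                               bool need_slow_path)
    {
        TCGLabelQemuLdst *ldst = NULL;

        if (need_slow_path) {    /* softmmu, or user-only with a_bits */
            ldst = calloc(1, sizeof(*ldst));
            ldst->is_ld = is_ld;
        }
        *pbase = addr_reg;       /* stand-in for the translated address */
        return ldst;
    }

    static void qemu_ld_shape(bool need_slow_path)
    {
        int base;
        TCGLabelQemuLdst *ldst = prepare_host_addr(&base, 10, true,
                                                   need_slow_path);

        printf("fast path: load via base r%d\n", base);

        if (ldst) {              /* fill in what only the caller knows */
            ldst->type = 64;
            ldst->datalo_reg = 11;
            ldst->raddr = "code_ptr after the fast path";
            printf("slow-path label completed\n");
            free(ldst);
        }
    }

    int main(void)
    {
        qemu_ld_shape(true);
        qemu_ld_shape(false);
        return 0;
    }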
tcg/riscv/tcg-target.c.inc | 253 +++++++++++++++++--------------------
1 file changed, 114 insertions(+), 139 deletions(-)
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index a4cf60ca75..2b2d313fe2 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -899,10 +899,6 @@ static void * const qemu_st_helpers[MO_SIZE + 1] = {
#endif
};
-/* We expect to use a 12-bit negative offset from ENV. */
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
-
static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
{
tcg_out_opc_jump(s, OPC_JAL, TCG_REG_ZERO, 0);
@@ -910,76 +906,6 @@ static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
tcg_debug_assert(ok);
}
-static TCGReg tcg_out_tlb_load(TCGContext *s, TCGReg addr, MemOpIdx oi,
- tcg_insn_unit **label_ptr, bool is_load)
-{
- MemOp opc = get_memop(oi);
- unsigned s_bits = opc & MO_SIZE;
- unsigned a_bits = get_alignment_bits(opc);
- tcg_target_long compare_mask;
- int mem_index = get_mmuidx(oi);
- int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
- int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
- int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
- TCGReg mask_base = TCG_AREG0, table_base = TCG_AREG0;
-
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, mask_base, mask_ofs);
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, table_base, table_ofs);
-
- tcg_out_opc_imm(s, OPC_SRLI, TCG_REG_TMP2, addr,
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
- tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
- tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
-
- /* Load the tlb comparator and the addend. */
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
- is_load ? offsetof(CPUTLBEntry, addr_read)
- : offsetof(CPUTLBEntry, addr_write));
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
- offsetof(CPUTLBEntry, addend));
-
- /* We don't support unaligned accesses. */
- if (a_bits < s_bits) {
- a_bits = s_bits;
- }
- /* Clear the non-page, non-alignment bits from the address. */
- compare_mask = (tcg_target_long)TARGET_PAGE_MASK | ((1 << a_bits) - 1);
- if (compare_mask == sextreg(compare_mask, 0, 12)) {
- tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr, compare_mask);
- } else {
- tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
- tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP1, TCG_REG_TMP1, addr);
- }
-
- /* Compare masked address with the TLB entry. */
- label_ptr[0] = s->code_ptr;
- tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP0, TCG_REG_TMP1, 0);
-
- /* TLB Hit - translate address using addend. */
- if (TARGET_LONG_BITS == 32) {
- tcg_out_ext32u(s, TCG_REG_TMP0, addr);
- addr = TCG_REG_TMP0;
- }
- tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_REG_TMP2, addr);
- return TCG_REG_TMP0;
-}
-
-static void add_qemu_ldst_label(TCGContext *s, int is_ld, MemOpIdx oi,
- TCGType data_type, TCGReg data_reg,
- TCGReg addr_reg, void *raddr,
- tcg_insn_unit **label_ptr)
-{
- TCGLabelQemuLdst *label = new_ldst_label(s);
-
- label->is_ld = is_ld;
- label->oi = oi;
- label->type = data_type;
- label->datalo_reg = data_reg;
- label->addrlo_reg = addr_reg;
- label->raddr = tcg_splitwx_to_rx(raddr);
- label->label_ptr[0] = label_ptr[0];
-}
-
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
MemOpIdx oi = l->oi;
@@ -1037,26 +963,6 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
return true;
}
#else
-
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addr_reg,
- unsigned a_bits)
-{
- unsigned a_mask = (1 << a_bits) - 1;
- TCGLabelQemuLdst *l = new_ldst_label(s);
-
- l->is_ld = is_ld;
- l->addrlo_reg = addr_reg;
-
- /* We are expecting a_bits to max out at 7, so we can always use andi. */
- tcg_debug_assert(a_bits < 12);
- tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr_reg, a_mask);
-
- l->label_ptr[0] = s->code_ptr;
- tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP1, TCG_REG_ZERO, 0);
-
- l->raddr = tcg_splitwx_to_rx(s->code_ptr);
-}
-
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
{
/* resolve label address */
@@ -1083,9 +989,108 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
return tcg_out_fail_alignment(s, l);
}
-
#endif /* CONFIG_SOFTMMU */
+/*
+ * For softmmu, perform the TLB load and compare.
+ * For useronly, perform any required alignment tests.
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
+ * is required and fill in @h with the host address for the fast path.
+ */
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, TCGReg *pbase,
+ TCGReg addr_reg, MemOpIdx oi,
+ bool is_ld)
+{
+ TCGLabelQemuLdst *ldst = NULL;
+ MemOp opc = get_memop(oi);
+ unsigned a_bits = get_alignment_bits(opc);
+ unsigned a_mask = (1u << a_bits) - 1;
+
+#ifdef CONFIG_SOFTMMU
+ unsigned s_bits = opc & MO_SIZE;
+ int mem_index = get_mmuidx(oi);
+ int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
+ int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
+ int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
+ TCGReg mask_base = TCG_AREG0, table_base = TCG_AREG0;
+ tcg_target_long compare_mask;
+
+ ldst = new_ldst_label(s);
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addr_reg;
+
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, mask_base, mask_ofs);
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, table_base, table_ofs);
+
+ tcg_out_opc_imm(s, OPC_SRLI, TCG_REG_TMP2, addr_reg,
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+ tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
+ tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
+
+ /* Load the tlb comparator and the addend. */
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
+ is_ld ? offsetof(CPUTLBEntry, addr_read)
+ : offsetof(CPUTLBEntry, addr_write));
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
+ offsetof(CPUTLBEntry, addend));
+
+ /* We don't support unaligned accesses. */
+ if (a_bits < s_bits) {
+ a_bits = s_bits;
+ }
+ /* Clear the non-page, non-alignment bits from the address. */
+ compare_mask = (tcg_target_long)TARGET_PAGE_MASK | a_mask;
+ if (compare_mask == sextreg(compare_mask, 0, 12)) {
+ tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr_reg, compare_mask);
+ } else {
+ tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
+ tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP1, TCG_REG_TMP1, addr_reg);
+ }
+
+ /* Compare masked address with the TLB entry. */
+ ldst->label_ptr[0] = s->code_ptr;
+ tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP0, TCG_REG_TMP1, 0);
+
+ /* TLB Hit - translate address using addend. */
+ if (TARGET_LONG_BITS == 32) {
+ tcg_out_ext32u(s, TCG_REG_TMP0, addr_reg);
+ addr_reg = TCG_REG_TMP0;
+ }
+ tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_REG_TMP2, addr_reg);
+ *pbase = TCG_REG_TMP0;
+#else
+ if (a_mask) {
+ ldst = new_ldst_label(s);
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addr_reg;
+
+ /* We are expecting a_bits max 7, so we can always use andi. */
+ tcg_debug_assert(a_bits < 12);
+ tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr_reg, a_mask);
+
+ ldst->label_ptr[0] = s->code_ptr;
+ tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP1, TCG_REG_ZERO, 0);
+ }
+
+ TCGReg base = addr_reg;
+ if (TARGET_LONG_BITS == 32) {
+ tcg_out_ext32u(s, TCG_REG_TMP0, base);
+ base = TCG_REG_TMP0;
+ }
+ if (guest_base != 0) {
+ tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_GUEST_BASE_REG, base);
+ base = TCG_REG_TMP0;
+ }
+ *pbase = base;
+#endif
+
+ return ldst;
+}
+
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg val,
TCGReg base, MemOp opc, TCGType type)
{
@@ -1125,32 +1130,17 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg val,
static void tcg_out_qemu_ld(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
MemOpIdx oi, TCGType data_type)
{
- MemOp opc = get_memop(oi);
+ TCGLabelQemuLdst *ldst;
TCGReg base;
-#if defined(CONFIG_SOFTMMU)
- tcg_insn_unit *label_ptr[1];
+ ldst = prepare_host_addr(s, &base, addr_reg, oi, true);
+ tcg_out_qemu_ld_direct(s, data_reg, base, get_memop(oi), data_type);
- base = tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 1);
- tcg_out_qemu_ld_direct(s, data_reg, base, opc, data_type);
- add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
- s->code_ptr, label_ptr);
-#else
- unsigned a_bits = get_alignment_bits(opc);
- if (a_bits) {
- tcg_out_test_alignment(s, true, addr_reg, a_bits);
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = data_reg;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
}
- base = addr_reg;
- if (TARGET_LONG_BITS == 32) {
- tcg_out_ext32u(s, TCG_REG_TMP0, base);
- base = TCG_REG_TMP0;
- }
- if (guest_base != 0) {
- tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_GUEST_BASE_REG, base);
- base = TCG_REG_TMP0;
- }
- tcg_out_qemu_ld_direct(s, data_reg, base, opc, data_type);
-#endif
}
static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg val,
@@ -1180,32 +1170,17 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg val,
static void tcg_out_qemu_st(TCGContext *s, TCGReg data_reg, TCGReg addr_reg,
MemOpIdx oi, TCGType data_type)
{
- MemOp opc = get_memop(oi);
+ TCGLabelQemuLdst *ldst;
TCGReg base;
-#if defined(CONFIG_SOFTMMU)
- tcg_insn_unit *label_ptr[1];
+ ldst = prepare_host_addr(s, &base, addr_reg, oi, false);
+ tcg_out_qemu_st_direct(s, data_reg, base, get_memop(oi));
- base = tcg_out_tlb_load(s, addr_reg, oi, label_ptr, 0);
- tcg_out_qemu_st_direct(s, data_reg, base, opc);
- add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
- s->code_ptr, label_ptr);
-#else
- unsigned a_bits = get_alignment_bits(opc);
- if (a_bits) {
- tcg_out_test_alignment(s, false, addr_reg, a_bits);
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = data_reg;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
}
- base = addr_reg;
- if (TARGET_LONG_BITS == 32) {
- tcg_out_ext32u(s, TCG_REG_TMP0, base);
- base = TCG_REG_TMP0;
- }
- if (guest_base != 0) {
- tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_GUEST_BASE_REG, base);
- base = TCG_REG_TMP0;
- }
- tcg_out_qemu_st_direct(s, data_reg, base, opc);
-#endif
}
static const tcg_insn_unit *tb_ret_addr;
--
2.34.1
* [PATCH v4 25/54] tcg/s390x: Pass TCGType to tcg_out_qemu_{ld,st}
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (23 preceding siblings ...)
2023-05-03 6:56 ` [PATCH v4 24/54] tcg/riscv: Introduce prepare_host_addr Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 26/54] tcg/s390x: Introduce HostAddress Richard Henderson
` (28 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
We need to set the data type in TCGLabelQemuLdst, so plumb
the TCGType all the way through from tcg_out_op.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/s390x/tcg-target.c.inc | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index b399798664..e931f0cde4 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -1770,13 +1770,14 @@ static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg addr_reg, MemOp opc,
}
static void add_qemu_ldst_label(TCGContext *s, bool is_ld, MemOpIdx oi,
- TCGReg data, TCGReg addr,
+ TCGType type, TCGReg data, TCGReg addr,
tcg_insn_unit *raddr, tcg_insn_unit *label_ptr)
{
TCGLabelQemuLdst *label = new_ldst_label(s);
label->is_ld = is_ld;
label->oi = oi;
+ label->type = type;
label->datalo_reg = data;
label->addrlo_reg = addr;
label->raddr = tcg_splitwx_to_rx(raddr);
@@ -1900,7 +1901,7 @@ static void tcg_prepare_user_ldst(TCGContext *s, TCGReg *addr_reg,
#endif /* CONFIG_SOFTMMU */
static void tcg_out_qemu_ld(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
- MemOpIdx oi)
+ MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
#ifdef CONFIG_SOFTMMU
@@ -1916,7 +1917,8 @@ static void tcg_out_qemu_ld(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
tcg_out_qemu_ld_direct(s, opc, data_reg, base_reg, TCG_REG_R2, 0);
- add_qemu_ldst_label(s, 1, oi, data_reg, addr_reg, s->code_ptr, label_ptr);
+ add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
+ s->code_ptr, label_ptr);
#else
TCGReg index_reg;
tcg_target_long disp;
@@ -1931,7 +1933,7 @@ static void tcg_out_qemu_ld(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
}
static void tcg_out_qemu_st(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
- MemOpIdx oi)
+ MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
#ifdef CONFIG_SOFTMMU
@@ -1947,7 +1949,8 @@ static void tcg_out_qemu_st(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
tcg_out_qemu_st_direct(s, opc, data_reg, base_reg, TCG_REG_R2, 0);
- add_qemu_ldst_label(s, 0, oi, data_reg, addr_reg, s->code_ptr, label_ptr);
+ add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
+ s->code_ptr, label_ptr);
#else
TCGReg index_reg;
tcg_target_long disp;
@@ -2307,13 +2310,16 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
break;
case INDEX_op_qemu_ld_i32:
- /* ??? Technically we can use a non-extending instruction. */
+ tcg_out_qemu_ld(s, args[0], args[1], args[2], TCG_TYPE_I32);
+ break;
case INDEX_op_qemu_ld_i64:
- tcg_out_qemu_ld(s, args[0], args[1], args[2]);
+ tcg_out_qemu_ld(s, args[0], args[1], args[2], TCG_TYPE_I64);
break;
case INDEX_op_qemu_st_i32:
+ tcg_out_qemu_st(s, args[0], args[1], args[2], TCG_TYPE_I32);
+ break;
case INDEX_op_qemu_st_i64:
- tcg_out_qemu_st(s, args[0], args[1], args[2]);
+ tcg_out_qemu_st(s, args[0], args[1], args[2], TCG_TYPE_I64);
break;
case INDEX_op_ld16s_i64:
--
2.34.1
* [PATCH v4 26/54] tcg/s390x: Introduce HostAddress
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (24 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 25/54] tcg/s390x: Pass TCGType to tcg_out_qemu_{ld,st} Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 27/54] tcg/s390x: Introduce prepare_host_addr Richard Henderson
` (27 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Collect the 3 potential parts of the host address into a struct.
Reorg tcg_out_qemu_{ld,st}_direct to use it.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
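A toy model of the HostAddress triple, showing how an effective address
composes from base + index + displacement; the register file and values
here are hypothetical, not the s390x encoding (where RX forms take a
12-bit unsigned displacement and RXY forms a signed 20-bit one):

    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        int base;     /* register number, or -1 for none */
        int index;    /* register number, or -1 for none */
        int disp;     /* signed displacement */
    } HostAddress;

    static uint64_t effective(const uint64_t *regs, HostAddress h)
    {
        uint64_t ea = h.disp;
        if (h.base >= 0) {
            ea += regs[h.base];
        }
        if (h.index >= 0) {
            ea += regs[h.index];
        }
        return ea;
    }

    int main(void)
    {
        uint64_t regs[16] = { [2] = 0x1000, [3] = 0x40 };
        HostAddress h = { .base = 2, .index = 3, .disp = 8 };
        printf("ea = 0x%llx\n", (unsigned long long)effective(regs, h));
        return 0;
    }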
tcg/s390x/tcg-target.c.inc | 109 ++++++++++++++++++++-----------------
1 file changed, 60 insertions(+), 49 deletions(-)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index e931f0cde4..da7ee5b085 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -1606,58 +1606,64 @@ static void tcg_out_call(TCGContext *s, const tcg_insn_unit *dest,
tcg_out_call_int(s, dest);
}
+typedef struct {
+ TCGReg base;
+ TCGReg index;
+ int disp;
+} HostAddress;
+
static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg data,
- TCGReg base, TCGReg index, int disp)
+ HostAddress h)
{
switch (opc & (MO_SSIZE | MO_BSWAP)) {
case MO_UB:
- tcg_out_insn(s, RXY, LLGC, data, base, index, disp);
+ tcg_out_insn(s, RXY, LLGC, data, h.base, h.index, h.disp);
break;
case MO_SB:
- tcg_out_insn(s, RXY, LGB, data, base, index, disp);
+ tcg_out_insn(s, RXY, LGB, data, h.base, h.index, h.disp);
break;
case MO_UW | MO_BSWAP:
/* swapped unsigned halfword load with upper bits zeroed */
- tcg_out_insn(s, RXY, LRVH, data, base, index, disp);
+ tcg_out_insn(s, RXY, LRVH, data, h.base, h.index, h.disp);
tcg_out_ext16u(s, data, data);
break;
case MO_UW:
- tcg_out_insn(s, RXY, LLGH, data, base, index, disp);
+ tcg_out_insn(s, RXY, LLGH, data, h.base, h.index, h.disp);
break;
case MO_SW | MO_BSWAP:
/* swapped sign-extended halfword load */
- tcg_out_insn(s, RXY, LRVH, data, base, index, disp);
+ tcg_out_insn(s, RXY, LRVH, data, h.base, h.index, h.disp);
tcg_out_ext16s(s, TCG_TYPE_REG, data, data);
break;
case MO_SW:
- tcg_out_insn(s, RXY, LGH, data, base, index, disp);
+ tcg_out_insn(s, RXY, LGH, data, h.base, h.index, h.disp);
break;
case MO_UL | MO_BSWAP:
/* swapped unsigned int load with upper bits zeroed */
- tcg_out_insn(s, RXY, LRV, data, base, index, disp);
+ tcg_out_insn(s, RXY, LRV, data, h.base, h.index, h.disp);
tcg_out_ext32u(s, data, data);
break;
case MO_UL:
- tcg_out_insn(s, RXY, LLGF, data, base, index, disp);
+ tcg_out_insn(s, RXY, LLGF, data, h.base, h.index, h.disp);
break;
case MO_SL | MO_BSWAP:
/* swapped sign-extended int load */
- tcg_out_insn(s, RXY, LRV, data, base, index, disp);
+ tcg_out_insn(s, RXY, LRV, data, h.base, h.index, h.disp);
tcg_out_ext32s(s, data, data);
break;
case MO_SL:
- tcg_out_insn(s, RXY, LGF, data, base, index, disp);
+ tcg_out_insn(s, RXY, LGF, data, h.base, h.index, h.disp);
break;
case MO_UQ | MO_BSWAP:
- tcg_out_insn(s, RXY, LRVG, data, base, index, disp);
+ tcg_out_insn(s, RXY, LRVG, data, h.base, h.index, h.disp);
break;
case MO_UQ:
- tcg_out_insn(s, RXY, LG, data, base, index, disp);
+ tcg_out_insn(s, RXY, LG, data, h.base, h.index, h.disp);
break;
default:
@@ -1666,44 +1672,44 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg data,
}
static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg data,
- TCGReg base, TCGReg index, int disp)
+ HostAddress h)
{
switch (opc & (MO_SIZE | MO_BSWAP)) {
case MO_UB:
- if (disp >= 0 && disp < 0x1000) {
- tcg_out_insn(s, RX, STC, data, base, index, disp);
+ if (h.disp >= 0 && h.disp < 0x1000) {
+ tcg_out_insn(s, RX, STC, data, h.base, h.index, h.disp);
} else {
- tcg_out_insn(s, RXY, STCY, data, base, index, disp);
+ tcg_out_insn(s, RXY, STCY, data, h.base, h.index, h.disp);
}
break;
case MO_UW | MO_BSWAP:
- tcg_out_insn(s, RXY, STRVH, data, base, index, disp);
+ tcg_out_insn(s, RXY, STRVH, data, h.base, h.index, h.disp);
break;
case MO_UW:
- if (disp >= 0 && disp < 0x1000) {
- tcg_out_insn(s, RX, STH, data, base, index, disp);
+ if (h.disp >= 0 && h.disp < 0x1000) {
+ tcg_out_insn(s, RX, STH, data, h.base, h.index, h.disp);
} else {
- tcg_out_insn(s, RXY, STHY, data, base, index, disp);
+ tcg_out_insn(s, RXY, STHY, data, h.base, h.index, h.disp);
}
break;
case MO_UL | MO_BSWAP:
- tcg_out_insn(s, RXY, STRV, data, base, index, disp);
+ tcg_out_insn(s, RXY, STRV, data, h.base, h.index, h.disp);
break;
case MO_UL:
- if (disp >= 0 && disp < 0x1000) {
- tcg_out_insn(s, RX, ST, data, base, index, disp);
+ if (h.disp >= 0 && h.disp < 0x1000) {
+ tcg_out_insn(s, RX, ST, data, h.base, h.index, h.disp);
} else {
- tcg_out_insn(s, RXY, STY, data, base, index, disp);
+ tcg_out_insn(s, RXY, STY, data, h.base, h.index, h.disp);
}
break;
case MO_UQ | MO_BSWAP:
- tcg_out_insn(s, RXY, STRVG, data, base, index, disp);
+ tcg_out_insn(s, RXY, STRVG, data, h.base, h.index, h.disp);
break;
case MO_UQ:
- tcg_out_insn(s, RXY, STG, data, base, index, disp);
+ tcg_out_insn(s, RXY, STG, data, h.base, h.index, h.disp);
break;
default:
@@ -1883,20 +1889,23 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
return tcg_out_fail_alignment(s, l);
}
-static void tcg_prepare_user_ldst(TCGContext *s, TCGReg *addr_reg,
- TCGReg *index_reg, tcg_target_long *disp)
+static HostAddress tcg_prepare_user_ldst(TCGContext *s, TCGReg addr_reg)
{
+ TCGReg index;
+ int disp;
+
if (TARGET_LONG_BITS == 32) {
- tcg_out_ext32u(s, TCG_TMP0, *addr_reg);
- *addr_reg = TCG_TMP0;
+ tcg_out_ext32u(s, TCG_TMP0, addr_reg);
+ addr_reg = TCG_TMP0;
}
if (guest_base < 0x80000) {
- *index_reg = TCG_REG_NONE;
- *disp = guest_base;
+ index = TCG_REG_NONE;
+ disp = guest_base;
} else {
- *index_reg = TCG_GUEST_BASE_REG;
- *disp = 0;
+ index = TCG_GUEST_BASE_REG;
+ disp = 0;
}
+ return (HostAddress){ .base = addr_reg, .index = index, .disp = disp };
}
#endif /* CONFIG_SOFTMMU */
@@ -1904,31 +1913,32 @@ static void tcg_out_qemu_ld(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
+ HostAddress h;
+
#ifdef CONFIG_SOFTMMU
unsigned mem_index = get_mmuidx(oi);
tcg_insn_unit *label_ptr;
- TCGReg base_reg;
- base_reg = tcg_out_tlb_read(s, addr_reg, opc, mem_index, 1);
+ h.base = tcg_out_tlb_read(s, addr_reg, opc, mem_index, 1);
+ h.index = TCG_REG_R2;
+ h.disp = 0;
tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
label_ptr = s->code_ptr;
s->code_ptr += 1;
- tcg_out_qemu_ld_direct(s, opc, data_reg, base_reg, TCG_REG_R2, 0);
+ tcg_out_qemu_ld_direct(s, opc, data_reg, h);
add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
s->code_ptr, label_ptr);
#else
- TCGReg index_reg;
- tcg_target_long disp;
unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, true, addr_reg, a_bits);
}
- tcg_prepare_user_ldst(s, &addr_reg, &index_reg, &disp);
- tcg_out_qemu_ld_direct(s, opc, data_reg, addr_reg, index_reg, disp);
+ h = tcg_prepare_user_ldst(s, addr_reg);
+ tcg_out_qemu_ld_direct(s, opc, data_reg, h);
#endif
}
@@ -1936,31 +1946,32 @@ static void tcg_out_qemu_st(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
MemOpIdx oi, TCGType data_type)
{
MemOp opc = get_memop(oi);
+ HostAddress h;
+
#ifdef CONFIG_SOFTMMU
unsigned mem_index = get_mmuidx(oi);
tcg_insn_unit *label_ptr;
- TCGReg base_reg;
- base_reg = tcg_out_tlb_read(s, addr_reg, opc, mem_index, 0);
+ h.base = tcg_out_tlb_read(s, addr_reg, opc, mem_index, 0);
+ h.index = TCG_REG_R2;
+ h.disp = 0;
tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
label_ptr = s->code_ptr;
s->code_ptr += 1;
- tcg_out_qemu_st_direct(s, opc, data_reg, base_reg, TCG_REG_R2, 0);
+ tcg_out_qemu_st_direct(s, opc, data_reg, h);
add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
s->code_ptr, label_ptr);
#else
- TCGReg index_reg;
- tcg_target_long disp;
unsigned a_bits = get_alignment_bits(opc);
if (a_bits) {
tcg_out_test_alignment(s, false, addr_reg, a_bits);
}
- tcg_prepare_user_ldst(s, &addr_reg, &index_reg, &disp);
- tcg_out_qemu_st_direct(s, opc, data_reg, addr_reg, index_reg, disp);
+ h = tcg_prepare_user_ldst(s, addr_reg);
+ tcg_out_qemu_st_direct(s, opc, data_reg, h);
#endif
}
--
2.34.1
* [PATCH v4 27/54] tcg/s390x: Introduce prepare_host_addr
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Merge tcg_out_tlb_read, add_qemu_ldst_label, tcg_out_test_alignment,
tcg_prepare_user_ldst, and some code that lived in both tcg_out_qemu_ld
and tcg_out_qemu_st into one function that returns HostAddress and
TCGLabelQemuLdst structures.
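Sketching the resulting caller pattern from the diff below; the same
shape serves both loads and stores:
    TCGLabelQemuLdst *ldst;
    HostAddress h;

    ldst = prepare_host_addr(s, &h, addr_reg, oi, true /* is_ld */);
    tcg_out_qemu_ld_direct(s, get_memop(oi), data_reg, h);

    if (ldst) {   /* a slow path was emitted: softmmu or alignment test */
        ldst->type = data_type;
        ldst->datalo_reg = data_reg;
        ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
    }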
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/s390x/tcg-target.c.inc | 263 ++++++++++++++++---------------------
1 file changed, 113 insertions(+), 150 deletions(-)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index da7ee5b085..c3157d22be 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -1718,78 +1718,6 @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg data,
}
#if defined(CONFIG_SOFTMMU)
-/* We're expecting to use a 20-bit negative offset on the tlb memory ops. */
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
-QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 19));
-
-/* Load and compare a TLB entry, leaving the flags set. Loads the TLB
- addend into R2. Returns a register with the santitized guest address. */
-static TCGReg tcg_out_tlb_read(TCGContext *s, TCGReg addr_reg, MemOp opc,
- int mem_index, bool is_ld)
-{
- unsigned s_bits = opc & MO_SIZE;
- unsigned a_bits = get_alignment_bits(opc);
- unsigned s_mask = (1 << s_bits) - 1;
- unsigned a_mask = (1 << a_bits) - 1;
- int fast_off = TLB_MASK_TABLE_OFS(mem_index);
- int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
- int table_off = fast_off + offsetof(CPUTLBDescFast, table);
- int ofs, a_off;
- uint64_t tlb_mask;
-
- tcg_out_sh64(s, RSY_SRLG, TCG_REG_R2, addr_reg, TCG_REG_NONE,
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
- tcg_out_insn(s, RXY, NG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, mask_off);
- tcg_out_insn(s, RXY, AG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, table_off);
-
- /* For aligned accesses, we check the first byte and include the alignment
- bits within the address. For unaligned access, we check that we don't
- cross pages using the address of the last byte of the access. */
- a_off = (a_bits >= s_bits ? 0 : s_mask - a_mask);
- tlb_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
- if (a_off == 0) {
- tgen_andi_risbg(s, TCG_REG_R3, addr_reg, tlb_mask);
- } else {
- tcg_out_insn(s, RX, LA, TCG_REG_R3, addr_reg, TCG_REG_NONE, a_off);
- tgen_andi(s, TCG_TYPE_TL, TCG_REG_R3, tlb_mask);
- }
-
- if (is_ld) {
- ofs = offsetof(CPUTLBEntry, addr_read);
- } else {
- ofs = offsetof(CPUTLBEntry, addr_write);
- }
- if (TARGET_LONG_BITS == 32) {
- tcg_out_insn(s, RX, C, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
- } else {
- tcg_out_insn(s, RXY, CG, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
- }
-
- tcg_out_insn(s, RXY, LG, TCG_REG_R2, TCG_REG_R2, TCG_REG_NONE,
- offsetof(CPUTLBEntry, addend));
-
- if (TARGET_LONG_BITS == 32) {
- tcg_out_ext32u(s, TCG_REG_R3, addr_reg);
- return TCG_REG_R3;
- }
- return addr_reg;
-}
-
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld, MemOpIdx oi,
- TCGType type, TCGReg data, TCGReg addr,
- tcg_insn_unit *raddr, tcg_insn_unit *label_ptr)
-{
- TCGLabelQemuLdst *label = new_ldst_label(s);
-
- label->is_ld = is_ld;
- label->oi = oi;
- label->type = type;
- label->datalo_reg = data;
- label->addrlo_reg = addr;
- label->raddr = tcg_splitwx_to_rx(raddr);
- label->label_ptr[0] = label_ptr;
-}
-
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
{
TCGReg addr_reg = lb->addrlo_reg;
@@ -1842,26 +1770,6 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
return true;
}
#else
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld,
- TCGReg addrlo, unsigned a_bits)
-{
- unsigned a_mask = (1 << a_bits) - 1;
- TCGLabelQemuLdst *l = new_ldst_label(s);
-
- l->is_ld = is_ld;
- l->addrlo_reg = addrlo;
-
- /* We are expecting a_bits to max out at 7, much lower than TMLL. */
- tcg_debug_assert(a_bits < 16);
- tcg_out_insn(s, RI, TMLL, addrlo, a_mask);
-
- tcg_out16(s, RI_BRC | (7 << 4)); /* CC in {1,2,3} */
- l->label_ptr[0] = s->code_ptr;
- s->code_ptr += 1;
-
- l->raddr = tcg_splitwx_to_rx(s->code_ptr);
-}
-
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
{
if (!patch_reloc(l->label_ptr[0], R_390_PC16DBL,
@@ -1888,91 +1796,146 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
return tcg_out_fail_alignment(s, l);
}
+#endif /* CONFIG_SOFTMMU */
-static HostAddress tcg_prepare_user_ldst(TCGContext *s, TCGReg addr_reg)
+/*
+ * For softmmu, perform the TLB load and compare.
+ * For useronly, perform any required alignment tests.
+ * In both cases, return a TCGLabelQemuLdst structure if the slow path
+ * is required and fill in @h with the host address for the fast path.
+ */
+static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
+ TCGReg addr_reg, MemOpIdx oi,
+ bool is_ld)
{
- TCGReg index;
- int disp;
+ TCGLabelQemuLdst *ldst = NULL;
+ MemOp opc = get_memop(oi);
+ unsigned a_bits = get_alignment_bits(opc);
+ unsigned a_mask = (1u << a_bits) - 1;
+#ifdef CONFIG_SOFTMMU
+ unsigned s_bits = opc & MO_SIZE;
+ unsigned s_mask = (1 << s_bits) - 1;
+ int mem_index = get_mmuidx(oi);
+ int fast_off = TLB_MASK_TABLE_OFS(mem_index);
+ int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
+ int table_off = fast_off + offsetof(CPUTLBDescFast, table);
+ int ofs, a_off;
+ uint64_t tlb_mask;
+
+ ldst = new_ldst_label(s);
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addr_reg;
+
+ tcg_out_sh64(s, RSY_SRLG, TCG_REG_R2, addr_reg, TCG_REG_NONE,
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
+ QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 19));
+ tcg_out_insn(s, RXY, NG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, mask_off);
+ tcg_out_insn(s, RXY, AG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, table_off);
+
+ /*
+ * For aligned accesses, we check the first byte and include the alignment
+ * bits within the address. For unaligned access, we check that we don't
+ * cross pages using the address of the last byte of the access.
+ */
+ a_off = (a_bits >= s_bits ? 0 : s_mask - a_mask);
+ tlb_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
+ if (a_off == 0) {
+ tgen_andi_risbg(s, TCG_REG_R3, addr_reg, tlb_mask);
+ } else {
+ tcg_out_insn(s, RX, LA, TCG_REG_R3, addr_reg, TCG_REG_NONE, a_off);
+ tgen_andi(s, TCG_TYPE_TL, TCG_REG_R3, tlb_mask);
+ }
+
+ if (is_ld) {
+ ofs = offsetof(CPUTLBEntry, addr_read);
+ } else {
+ ofs = offsetof(CPUTLBEntry, addr_write);
+ }
+ if (TARGET_LONG_BITS == 32) {
+ tcg_out_insn(s, RX, C, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
+ } else {
+ tcg_out_insn(s, RXY, CG, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
+ }
+
+ tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
+ ldst->label_ptr[0] = s->code_ptr++;
+
+ h->index = TCG_REG_R2;
+ tcg_out_insn(s, RXY, LG, h->index, TCG_REG_R2, TCG_REG_NONE,
+ offsetof(CPUTLBEntry, addend));
+
+ h->base = addr_reg;
+ if (TARGET_LONG_BITS == 32) {
+ tcg_out_ext32u(s, TCG_REG_R3, addr_reg);
+ h->base = TCG_REG_R3;
+ }
+ h->disp = 0;
+#else
+ if (a_mask) {
+ ldst = new_ldst_label(s);
+ ldst->is_ld = is_ld;
+ ldst->oi = oi;
+ ldst->addrlo_reg = addr_reg;
+
+ /* We are expecting a_bits to max out at 7, much lower than TMLL. */
+ tcg_debug_assert(a_bits < 16);
+ tcg_out_insn(s, RI, TMLL, addr_reg, a_mask);
+
+ tcg_out16(s, RI_BRC | (7 << 4)); /* CC in {1,2,3} */
+ ldst->label_ptr[0] = s->code_ptr++;
+ }
+
+ h->base = addr_reg;
if (TARGET_LONG_BITS == 32) {
tcg_out_ext32u(s, TCG_TMP0, addr_reg);
- addr_reg = TCG_TMP0;
+ h->base = TCG_TMP0;
}
if (guest_base < 0x80000) {
- index = TCG_REG_NONE;
- disp = guest_base;
+ h->index = TCG_REG_NONE;
+ h->disp = guest_base;
} else {
- index = TCG_GUEST_BASE_REG;
- disp = 0;
+ h->index = TCG_GUEST_BASE_REG;
+ h->disp = 0;
}
- return (HostAddress){ .base = addr_reg, .index = index, .disp = disp };
+#endif
+
+ return ldst;
}
-#endif /* CONFIG_SOFTMMU */
static void tcg_out_qemu_ld(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
MemOpIdx oi, TCGType data_type)
{
- MemOp opc = get_memop(oi);
+ TCGLabelQemuLdst *ldst;
HostAddress h;
-#ifdef CONFIG_SOFTMMU
- unsigned mem_index = get_mmuidx(oi);
- tcg_insn_unit *label_ptr;
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, true);
+ tcg_out_qemu_ld_direct(s, get_memop(oi), data_reg, h);
- h.base = tcg_out_tlb_read(s, addr_reg, opc, mem_index, 1);
- h.index = TCG_REG_R2;
- h.disp = 0;
-
- tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
- label_ptr = s->code_ptr;
- s->code_ptr += 1;
-
- tcg_out_qemu_ld_direct(s, opc, data_reg, h);
-
- add_qemu_ldst_label(s, true, oi, data_type, data_reg, addr_reg,
- s->code_ptr, label_ptr);
-#else
- unsigned a_bits = get_alignment_bits(opc);
-
- if (a_bits) {
- tcg_out_test_alignment(s, true, addr_reg, a_bits);
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = data_reg;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
}
- h = tcg_prepare_user_ldst(s, addr_reg);
- tcg_out_qemu_ld_direct(s, opc, data_reg, h);
-#endif
}
static void tcg_out_qemu_st(TCGContext* s, TCGReg data_reg, TCGReg addr_reg,
MemOpIdx oi, TCGType data_type)
{
- MemOp opc = get_memop(oi);
+ TCGLabelQemuLdst *ldst;
HostAddress h;
-#ifdef CONFIG_SOFTMMU
- unsigned mem_index = get_mmuidx(oi);
- tcg_insn_unit *label_ptr;
+ ldst = prepare_host_addr(s, &h, addr_reg, oi, false);
+ tcg_out_qemu_st_direct(s, get_memop(oi), data_reg, h);
- h.base = tcg_out_tlb_read(s, addr_reg, opc, mem_index, 0);
- h.index = TCG_REG_R2;
- h.disp = 0;
-
- tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
- label_ptr = s->code_ptr;
- s->code_ptr += 1;
-
- tcg_out_qemu_st_direct(s, opc, data_reg, h);
-
- add_qemu_ldst_label(s, false, oi, data_type, data_reg, addr_reg,
- s->code_ptr, label_ptr);
-#else
- unsigned a_bits = get_alignment_bits(opc);
-
- if (a_bits) {
- tcg_out_test_alignment(s, false, addr_reg, a_bits);
+ if (ldst) {
+ ldst->type = data_type;
+ ldst->datalo_reg = data_reg;
+ ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
}
- h = tcg_prepare_user_ldst(s, addr_reg);
- tcg_out_qemu_st_direct(s, opc, data_reg, h);
-#endif
}
static void tcg_out_exit_tb(TCGContext *s, uintptr_t a0)
--
2.34.1
* [PATCH v4 28/54] tcg/sparc64: Drop is_64 test from tcg_out_qemu_ld data return
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
In tcg_canonicalize_memop, we remove MO_SIGN from MO_32 operations
with TCG_TYPE_I32. Thus MO_SL can only reach this point for a 64-bit
operation, and the is_64 qualifier is redundant. We already have an
identical test just above which does not include is_64.
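For reference, the canonicalization in question looks roughly like this
(a paraphrase, not the exact tcg-op code):
    /* A 32-bit result has no bits to sign-extend into, so MO_SIGN is
       dropped from MO_32 operations with a TCG_TYPE_I32 result. */
    if (!is64 && (op & MO_SIZE) == MO_32) {
        op &= ~MO_SIGN;
    }
    /* Hence (memop & MO_SSIZE) == MO_SL already implies is_64 here. */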
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/sparc64/tcg-target.c.inc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tcg/sparc64/tcg-target.c.inc b/tcg/sparc64/tcg-target.c.inc
index 4f477d539c..dbe4bf96b9 100644
--- a/tcg/sparc64/tcg-target.c.inc
+++ b/tcg/sparc64/tcg-target.c.inc
@@ -1220,7 +1220,7 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg data, TCGReg addr,
tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_O2, oi);
/* We let the helper sign-extend SB and SW, but leave SL for here. */
- if (is_64 && (memop & MO_SSIZE) == MO_SL) {
+ if ((memop & MO_SSIZE) == MO_SL) {
tcg_out_ext32s(s, data, TCG_REG_O0);
} else {
tcg_out_mov(s, TCG_TYPE_REG, data, TCG_REG_O0);
--
2.34.1
* [PATCH v4 29/54] tcg/sparc64: Pass TCGType to tcg_out_qemu_{ld,st}
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
We need to set this in TCGLabelQemuLdst, so plumb this
all the way through from tcg_out_op.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/sparc64/tcg-target.c.inc | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tcg/sparc64/tcg-target.c.inc b/tcg/sparc64/tcg-target.c.inc
index dbe4bf96b9..7e6466d3b6 100644
--- a/tcg/sparc64/tcg-target.c.inc
+++ b/tcg/sparc64/tcg-target.c.inc
@@ -1178,7 +1178,7 @@ static const int qemu_st_opc[(MO_SIZE | MO_BSWAP) + 1] = {
};
static void tcg_out_qemu_ld(TCGContext *s, TCGReg data, TCGReg addr,
- MemOpIdx oi, bool is_64)
+ MemOpIdx oi, TCGType data_type)
{
MemOp memop = get_memop(oi);
tcg_insn_unit *label_ptr;
@@ -1636,10 +1636,10 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
break;
case INDEX_op_qemu_ld_i32:
- tcg_out_qemu_ld(s, a0, a1, a2, false);
+ tcg_out_qemu_ld(s, a0, a1, a2, TCG_TYPE_I32);
break;
case INDEX_op_qemu_ld_i64:
- tcg_out_qemu_ld(s, a0, a1, a2, true);
+ tcg_out_qemu_ld(s, a0, a1, a2, TCG_TYPE_I64);
break;
case INDEX_op_qemu_st_i32:
tcg_out_qemu_st(s, a0, a1, a2, TCG_TYPE_I32);
--
2.34.1
* [PATCH v4 30/54] tcg: Move TCGLabelQemuLdst to tcg.c
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg.c | 13 +++++++++++++
tcg/tcg-ldst.c.inc | 14 --------------
2 files changed, 13 insertions(+), 14 deletions(-)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index cfd3262a4a..6f5daaee5f 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -94,6 +94,19 @@ typedef struct QEMU_PACKED {
DebugFrameFDEHeader fde;
} DebugFrameHeader;
+typedef struct TCGLabelQemuLdst {
+ bool is_ld; /* qemu_ld: true, qemu_st: false */
+ MemOpIdx oi;
+ TCGType type; /* result type of a load */
+ TCGReg addrlo_reg; /* reg index for low word of guest virtual addr */
+ TCGReg addrhi_reg; /* reg index for high word of guest virtual addr */
+ TCGReg datalo_reg; /* reg index for low word to be loaded or stored */
+ TCGReg datahi_reg; /* reg index for high word to be loaded or stored */
+ const tcg_insn_unit *raddr; /* addr of the next IR of qemu_ld/st IR */
+ tcg_insn_unit *label_ptr[2]; /* label pointers to be updated */
+ QSIMPLEQ_ENTRY(TCGLabelQemuLdst) next;
+} TCGLabelQemuLdst;
+
static void tcg_register_jit_int(const void *buf, size_t size,
const void *debug_frame,
size_t debug_frame_size)
diff --git a/tcg/tcg-ldst.c.inc b/tcg/tcg-ldst.c.inc
index 403cbb0f06..ffada04af0 100644
--- a/tcg/tcg-ldst.c.inc
+++ b/tcg/tcg-ldst.c.inc
@@ -20,20 +20,6 @@
* THE SOFTWARE.
*/
-typedef struct TCGLabelQemuLdst {
- bool is_ld; /* qemu_ld: true, qemu_st: false */
- MemOpIdx oi;
- TCGType type; /* result type of a load */
- TCGReg addrlo_reg; /* reg index for low word of guest virtual addr */
- TCGReg addrhi_reg; /* reg index for high word of guest virtual addr */
- TCGReg datalo_reg; /* reg index for low word to be loaded or stored */
- TCGReg datahi_reg; /* reg index for high word to be loaded or stored */
- const tcg_insn_unit *raddr; /* addr of the next IR of qemu_ld/st IR */
- tcg_insn_unit *label_ptr[2]; /* label pointers to be updated */
- QSIMPLEQ_ENTRY(TCGLabelQemuLdst) next;
-} TCGLabelQemuLdst;
-
-
/*
* Generate TB finalization at the end of block
*/
--
2.34.1
* [PATCH v4 31/54] tcg: Replace REG_P with arg_loc_reg_p
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
An inline function is safer than a macro, and REG_P
was rather too generic.
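Side by side, for illustration (definitions as in the diff below):
    /* The casts exist only to dodge -Werror about "unsigned < 0 is
       always false" when iarg_regs is empty; nothing checks that L
       is actually a TCGCallArgumentLoc. */
    #define REG_P(L) \
        ((int)(L)->arg_slot < (int)ARRAY_SIZE(tcg_target_call_iarg_regs))

    /* The inline function is type-checked, names the unit it operates
       on (an argument slot number), and needs no casts. */
    static inline bool arg_slot_reg_p(unsigned arg_slot)
    {
        unsigned nreg = ARRAY_SIZE(tcg_target_call_iarg_regs);
        return arg_slot < nreg;
    }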
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg-internal.h | 4 ----
tcg/tcg.c | 16 +++++++++++++---
2 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h
index e542a4e9b7..0f1ba01a9a 100644
--- a/tcg/tcg-internal.h
+++ b/tcg/tcg-internal.h
@@ -58,10 +58,6 @@ typedef struct TCGCallArgumentLoc {
unsigned tmp_subindex : 2;
} TCGCallArgumentLoc;
-/* Avoid "unsigned < 0 is always false" Werror, when iarg_regs is empty. */
-#define REG_P(L) \
- ((int)(L)->arg_slot < (int)ARRAY_SIZE(tcg_target_call_iarg_regs))
-
typedef struct TCGHelperInfo {
void *func;
const char *name;
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 6f5daaee5f..fa28db0188 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -806,6 +806,16 @@ static void init_ffi_layouts(void)
}
#endif /* CONFIG_TCG_INTERPRETER */
+static inline bool arg_slot_reg_p(unsigned arg_slot)
+{
+ /*
+ * Split the sizeof away from the comparison to avoid Werror from
+ * "unsigned < 0 is always false", when iarg_regs is empty.
+ */
+ unsigned nreg = ARRAY_SIZE(tcg_target_call_iarg_regs);
+ return arg_slot < nreg;
+}
+
typedef struct TCGCumulativeArgs {
int arg_idx; /* tcg_gen_callN args[] */
int info_in_idx; /* TCGHelperInfo in[] */
@@ -3231,7 +3241,7 @@ liveness_pass_1(TCGContext *s)
case TCG_CALL_ARG_NORMAL:
case TCG_CALL_ARG_EXTEND_U:
case TCG_CALL_ARG_EXTEND_S:
- if (REG_P(loc)) {
+ if (arg_slot_reg_p(loc->arg_slot)) {
*la_temp_pref(ts) = 0;
break;
}
@@ -3258,7 +3268,7 @@ liveness_pass_1(TCGContext *s)
case TCG_CALL_ARG_NORMAL:
case TCG_CALL_ARG_EXTEND_U:
case TCG_CALL_ARG_EXTEND_S:
- if (REG_P(loc)) {
+ if (arg_slot_reg_p(loc->arg_slot)) {
tcg_regset_set_reg(*la_temp_pref(ts),
tcg_target_call_iarg_regs[loc->arg_slot]);
}
@@ -4833,7 +4843,7 @@ static void load_arg_stk(TCGContext *s, int stk_slot, TCGTemp *ts,
static void load_arg_normal(TCGContext *s, const TCGCallArgumentLoc *l,
TCGTemp *ts, TCGRegSet *allocated_regs)
{
- if (REG_P(l)) {
+ if (arg_slot_reg_p(l->arg_slot)) {
TCGReg reg = tcg_target_call_iarg_regs[l->arg_slot];
load_arg_reg(s, reg, ts, *allocated_regs);
tcg_regset_set_reg(*allocated_regs, reg);
--
2.34.1
* [PATCH v4 32/54] tcg: Introduce arg_slot_stk_ofs
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Unify all computation of argument stack offset in one function.
This requires that we adjust ref_slot to be in the same units,
by adding max_reg_slots during init_call_layout.
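A worked example under assumed values, say an x86_64-like host with 6
integer argument registers and 8-byte tcg_target_long:
    unsigned arg_slot = 7;             /* one past the 6 register slots */
    unsigned stk_slot = arg_slot - 6;  /* = 1: the second stack slot */
    int ofs = TCG_TARGET_CALL_STACK_OFFSET + stk_slot * 8;

    /* Once init_call_layout has added max_reg_slots, a ref_slot is in
       these same units and may be fed to arg_slot_stk_ofs() as well. */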
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg.c | 29 +++++++++++++++++------------
1 file changed, 17 insertions(+), 12 deletions(-)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index fa28db0188..057423c121 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -816,6 +816,15 @@ static inline bool arg_slot_reg_p(unsigned arg_slot)
return arg_slot < nreg;
}
+static inline int arg_slot_stk_ofs(unsigned arg_slot)
+{
+ unsigned max = TCG_STATIC_CALL_ARGS_SIZE / sizeof(tcg_target_long);
+ unsigned stk_slot = arg_slot - ARRAY_SIZE(tcg_target_call_iarg_regs);
+
+ tcg_debug_assert(stk_slot < max);
+ return TCG_TARGET_CALL_STACK_OFFSET + stk_slot * sizeof(tcg_target_long);
+}
+
typedef struct TCGCumulativeArgs {
int arg_idx; /* tcg_gen_callN args[] */
int info_in_idx; /* TCGHelperInfo in[] */
@@ -1055,6 +1064,7 @@ static void init_call_layout(TCGHelperInfo *info)
}
}
assert(ref_base + cum.ref_slot <= max_stk_slots);
+ ref_base += max_reg_slots;
if (ref_base != 0) {
for (int i = cum.info_in_idx - 1; i >= 0; --i) {
@@ -4826,7 +4836,7 @@ static void load_arg_reg(TCGContext *s, TCGReg reg, TCGTemp *ts,
}
}
-static void load_arg_stk(TCGContext *s, int stk_slot, TCGTemp *ts,
+static void load_arg_stk(TCGContext *s, unsigned arg_slot, TCGTemp *ts,
TCGRegSet allocated_regs)
{
/*
@@ -4836,8 +4846,7 @@ static void load_arg_stk(TCGContext *s, int stk_slot, TCGTemp *ts,
*/
temp_load(s, ts, tcg_target_available_regs[ts->type], allocated_regs, 0);
tcg_out_st(s, ts->type, ts->reg, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET +
- stk_slot * sizeof(tcg_target_long));
+ arg_slot_stk_ofs(arg_slot));
}
static void load_arg_normal(TCGContext *s, const TCGCallArgumentLoc *l,
@@ -4848,18 +4857,16 @@ static void load_arg_normal(TCGContext *s, const TCGCallArgumentLoc *l,
load_arg_reg(s, reg, ts, *allocated_regs);
tcg_regset_set_reg(*allocated_regs, reg);
} else {
- load_arg_stk(s, l->arg_slot - ARRAY_SIZE(tcg_target_call_iarg_regs),
- ts, *allocated_regs);
+ load_arg_stk(s, l->arg_slot, ts, *allocated_regs);
}
}
-static void load_arg_ref(TCGContext *s, int arg_slot, TCGReg ref_base,
+static void load_arg_ref(TCGContext *s, unsigned arg_slot, TCGReg ref_base,
intptr_t ref_off, TCGRegSet *allocated_regs)
{
TCGReg reg;
- int stk_slot = arg_slot - ARRAY_SIZE(tcg_target_call_iarg_regs);
- if (stk_slot < 0) {
+ if (arg_slot_reg_p(arg_slot)) {
reg = tcg_target_call_iarg_regs[arg_slot];
tcg_reg_free(s, reg, *allocated_regs);
tcg_out_addi_ptr(s, reg, ref_base, ref_off);
@@ -4869,8 +4876,7 @@ static void load_arg_ref(TCGContext *s, int arg_slot, TCGReg ref_base,
*allocated_regs, 0, false);
tcg_out_addi_ptr(s, reg, ref_base, ref_off);
tcg_out_st(s, TCG_TYPE_PTR, reg, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET
- + stk_slot * sizeof(tcg_target_long));
+ arg_slot_stk_ofs(arg_slot));
}
}
@@ -4900,8 +4906,7 @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op)
case TCG_CALL_ARG_BY_REF:
load_arg_stk(s, loc->ref_slot, ts, allocated_regs);
load_arg_ref(s, loc->arg_slot, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET
- + loc->ref_slot * sizeof(tcg_target_long),
+ arg_slot_stk_ofs(loc->ref_slot),
&allocated_regs);
break;
case TCG_CALL_ARG_BY_REF_N:
--
2.34.1
* [PATCH v4 33/54] tcg: Widen helper_*_st[bw]_mmu val arguments
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
While the old type was correct in the ideal sense, some ABIs require
the argument to be zero-extended. Using uint32_t for all such values
is a decent compromise.
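As a hedged illustration of the hazard: with a sub-word prototype, some
ABIs make zero-extension the caller's job, so the compiled helper may
assume the high bits are already clear, an assumption TCG-generated
code would have to honor per host. Compare the two prototypes from the
diff below:
    /* Old: val's upper 24 bits are the caller's problem on some ABIs. */
    void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr,
                            uint8_t val, MemOpIdx oi, uintptr_t retaddr);

    /* New: a full 32-bit register is passed; the helper truncates. */
    void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr,
                            uint32_t val, MemOpIdx oi, uintptr_t retaddr);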
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/tcg/tcg-ldst.h | 10 +++++++---
accel/tcg/cputlb.c | 6 +++---
2 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/include/tcg/tcg-ldst.h b/include/tcg/tcg-ldst.h
index 2ba22bd5fe..684e394b06 100644
--- a/include/tcg/tcg-ldst.h
+++ b/include/tcg/tcg-ldst.h
@@ -55,15 +55,19 @@ tcg_target_ulong helper_be_ldsw_mmu(CPUArchState *env, target_ulong addr,
tcg_target_ulong helper_be_ldsl_mmu(CPUArchState *env, target_ulong addr,
MemOpIdx oi, uintptr_t retaddr);
-void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val,
+/*
+ * Value extended to at least uint32_t, so that some ABIs do not require
+ * zero-extension from uint8_t or uint16_t.
+ */
+void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
MemOpIdx oi, uintptr_t retaddr);
-void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
+void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
MemOpIdx oi, uintptr_t retaddr);
void helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
MemOpIdx oi, uintptr_t retaddr);
void helper_le_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
MemOpIdx oi, uintptr_t retaddr);
-void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
+void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
MemOpIdx oi, uintptr_t retaddr);
void helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
MemOpIdx oi, uintptr_t retaddr);
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index c8bd642d0e..3117886af1 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2508,7 +2508,7 @@ full_stb_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
store_helper(env, addr, val, oi, retaddr, MO_UB);
}
-void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val,
+void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
MemOpIdx oi, uintptr_t retaddr)
{
full_stb_mmu(env, addr, val, oi, retaddr);
@@ -2521,7 +2521,7 @@ static void full_le_stw_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
store_helper(env, addr, val, oi, retaddr, MO_LEUW);
}
-void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
+void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
MemOpIdx oi, uintptr_t retaddr)
{
full_le_stw_mmu(env, addr, val, oi, retaddr);
@@ -2534,7 +2534,7 @@ static void full_be_stw_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
store_helper(env, addr, val, oi, retaddr, MO_BEUW);
}
-void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val,
+void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
MemOpIdx oi, uintptr_t retaddr)
{
full_be_stw_mmu(env, addr, val, oi, retaddr);
--
2.34.1
* [PATCH v4 34/54] tcg: Add routines for calling slow-path helpers
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Add tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args. These and their subroutines
use the existing knowledge of the host function call abi
to load the function call arguments and return results.
These will be used to simplify the backends in turn.
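A sketch of what a converted backend looks like (this mirrors the
aarch64 conversion later in the series; the scratch register set is
backend-specific):
    static const TCGLdstHelperParam ldst_helper_param = {
        .ntmp = 1, .tmp = { TCG_REG_TMP }   /* temps the backend can spare */
    };

    static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
    {
        MemOp opc = get_memop(lb->oi);

        /* Marshal env, addr, oi, ra according to the host call ABI. */
        tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
        tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SIZE]);
        /* Move/extend the returned value into the destination reg(s). */
        tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);

        tcg_out_goto(s, lb->raddr);
        return true;
    }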
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg.c | 456 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 453 insertions(+), 3 deletions(-)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 057423c121..748be8426a 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -181,6 +181,22 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct);
static int tcg_out_ldst_finalize(TCGContext *s);
#endif
+typedef struct TCGLdstHelperParam {
+ TCGReg (*ra_gen)(TCGContext *s, const TCGLabelQemuLdst *l, int arg_reg);
+ unsigned ntmp;
+ int tmp[3];
+} TCGLdstHelperParam;
+
+static void tcg_out_ld_helper_args(TCGContext *s, const TCGLabelQemuLdst *l,
+ const TCGLdstHelperParam *p)
+ __attribute__((unused));
+static void tcg_out_ld_helper_ret(TCGContext *s, const TCGLabelQemuLdst *l,
+ bool load_sign, const TCGLdstHelperParam *p)
+ __attribute__((unused));
+static void tcg_out_st_helper_args(TCGContext *s, const TCGLabelQemuLdst *l,
+ const TCGLdstHelperParam *p)
+ __attribute__((unused));
+
TCGContext tcg_init_ctx;
__thread TCGContext *tcg_ctx;
@@ -459,9 +475,8 @@ static void tcg_out_movext1(TCGContext *s, const TCGMovExtend *i)
* between the sources and destinations.
*/
-static void __attribute__((unused))
-tcg_out_movext2(TCGContext *s, const TCGMovExtend *i1,
- const TCGMovExtend *i2, int scratch)
+static void tcg_out_movext2(TCGContext *s, const TCGMovExtend *i1,
+ const TCGMovExtend *i2, int scratch)
{
TCGReg src1 = i1->src;
TCGReg src2 = i2->src;
@@ -715,6 +730,50 @@ static TCGHelperInfo all_helpers[] = {
};
static GHashTable *helper_table;
+#if TCG_TARGET_REG_BITS == 32
+# define dh_typecode_ttl dh_typecode_i32
+#else
+# define dh_typecode_ttl dh_typecode_i64
+#endif
+
+static TCGHelperInfo info_helper_ld32_mmu = {
+ .flags = TCG_CALL_NO_WG,
+ .typemask = dh_typemask(ttl, 0) /* return tcg_target_ulong */
+ | dh_typemask(env, 1)
+ | dh_typemask(tl, 2) /* target_ulong addr */
+ | dh_typemask(i32, 3) /* unsigned oi */
+ | dh_typemask(ptr, 4) /* uintptr_t ra */
+};
+
+static TCGHelperInfo info_helper_ld64_mmu = {
+ .flags = TCG_CALL_NO_WG,
+ .typemask = dh_typemask(i64, 0) /* return uint64_t */
+ | dh_typemask(env, 1)
+ | dh_typemask(tl, 2) /* target_ulong addr */
+ | dh_typemask(i32, 3) /* unsigned oi */
+ | dh_typemask(ptr, 4) /* uintptr_t ra */
+};
+
+static TCGHelperInfo info_helper_st32_mmu = {
+ .flags = TCG_CALL_NO_WG,
+ .typemask = dh_typemask(void, 0)
+ | dh_typemask(env, 1)
+ | dh_typemask(tl, 2) /* target_ulong addr */
+ | dh_typemask(i32, 3) /* uint32_t data */
+ | dh_typemask(i32, 4) /* unsigned oi */
+ | dh_typemask(ptr, 5) /* uintptr_t ra */
+};
+
+static TCGHelperInfo info_helper_st64_mmu = {
+ .flags = TCG_CALL_NO_WG,
+ .typemask = dh_typemask(void, 0)
+ | dh_typemask(env, 1)
+ | dh_typemask(tl, 2) /* target_ulong addr */
+ | dh_typemask(i64, 3) /* uint64_t data */
+ | dh_typemask(i32, 4) /* unsigned oi */
+ | dh_typemask(ptr, 5) /* uintptr_t ra */
+};
+
#ifdef CONFIG_TCG_INTERPRETER
static ffi_type *typecode_to_ffi(int argmask)
{
@@ -1126,6 +1185,11 @@ static void tcg_context_init(unsigned max_cpus)
(gpointer)&all_helpers[i]);
}
+ init_call_layout(&info_helper_ld32_mmu);
+ init_call_layout(&info_helper_ld64_mmu);
+ init_call_layout(&info_helper_st32_mmu);
+ init_call_layout(&info_helper_st64_mmu);
+
#ifdef CONFIG_TCG_INTERPRETER
init_ffi_layouts();
#endif
@@ -5011,6 +5075,392 @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op)
}
}
+/*
+ * Similarly for qemu_ld/st slow path helpers.
+ * We must re-implement tcg_gen_callN and tcg_reg_alloc_call simultaneously,
+ * using only the provided backend tcg_out_* functions.
+ */
+
+static int tcg_out_helper_stk_ofs(TCGType type, unsigned slot)
+{
+ int ofs = arg_slot_stk_ofs(slot);
+
+ /*
+ * Each stack slot is TCG_TARGET_LONG_BITS. If the host does not
+ * require extension to uint64_t, adjust the address for uint32_t.
+ */
+ if (HOST_BIG_ENDIAN &&
+ TCG_TARGET_REG_BITS == 64 &&
+ type == TCG_TYPE_I32) {
+ ofs += 4;
+ }
+ return ofs;
+}
+
+static void tcg_out_helper_load_regs(TCGContext *s,
+ unsigned nmov, TCGMovExtend *mov,
+ unsigned ntmp, const int *tmp)
+{
+ switch (nmov) {
+ default:
+ /* The backend must have provided enough temps for the worst case. */
+ tcg_debug_assert(ntmp + 1 >= nmov);
+
+ for (unsigned i = nmov - 1; i >= 2; --i) {
+ TCGReg dst = mov[i].dst;
+
+ for (unsigned j = 0; j < i; ++j) {
+ if (dst == mov[j].src) {
+ /*
+ * Conflict.
+ * Copy the source to a temporary, recurse for the
+ * remaining moves, perform the extension from our
+ * scratch on the way out.
+ */
+ TCGReg scratch = tmp[--ntmp];
+ tcg_out_mov(s, mov[i].src_type, scratch, mov[i].src);
+ mov[i].src = scratch;
+
+ tcg_out_helper_load_regs(s, i, mov, ntmp, tmp);
+ tcg_out_movext1(s, &mov[i]);
+ return;
+ }
+ }
+
+ /* No conflicts: perform this move and continue. */
+ tcg_out_movext1(s, &mov[i]);
+ }
+ /* fall through for the final two moves */
+
+ case 2:
+ tcg_out_movext2(s, mov, mov + 1, ntmp ? tmp[0] : -1);
+ return;
+ case 1:
+ tcg_out_movext1(s, mov);
+ return;
+ case 0:
+ g_assert_not_reached();
+ }
+}
+
+static void tcg_out_helper_load_slots(TCGContext *s,
+ unsigned nmov, TCGMovExtend *mov,
+ const TCGLdstHelperParam *parm)
+{
+ unsigned i;
+
+ /*
+ * Start from the end, storing to the stack first.
+ * This frees those registers, so we need not consider overlap.
+ */
+ for (i = nmov; i-- > 0; ) {
+ unsigned slot = mov[i].dst;
+
+ if (arg_slot_reg_p(slot)) {
+ goto found_reg;
+ }
+
+ TCGReg src = mov[i].src;
+ TCGType dst_type = mov[i].dst_type;
+ MemOp dst_mo = dst_type == TCG_TYPE_I32 ? MO_32 : MO_64;
+
+ /* The argument is going onto the stack; extend into scratch. */
+ if ((mov[i].src_ext & MO_SIZE) != dst_mo) {
+ tcg_debug_assert(parm->ntmp != 0);
+ mov[i].dst = src = parm->tmp[0];
+ tcg_out_movext1(s, &mov[i]);
+ }
+
+ tcg_out_st(s, dst_type, src, TCG_REG_CALL_STACK,
+ tcg_out_helper_stk_ofs(dst_type, slot));
+ }
+ return;
+
+ found_reg:
+ /*
+ * The remaining arguments are in registers.
+ * Convert slot numbers to argument registers.
+ */
+ nmov = i + 1;
+ for (i = 0; i < nmov; ++i) {
+ mov[i].dst = tcg_target_call_iarg_regs[mov[i].dst];
+ }
+ tcg_out_helper_load_regs(s, nmov, mov, parm->ntmp, parm->tmp);
+}
+
+static void tcg_out_helper_load_imm(TCGContext *s, unsigned slot,
+ TCGType type, tcg_target_long imm,
+ const TCGLdstHelperParam *parm)
+{
+ if (arg_slot_reg_p(slot)) {
+ tcg_out_movi(s, type, tcg_target_call_iarg_regs[slot], imm);
+ } else {
+ int ofs = tcg_out_helper_stk_ofs(type, slot);
+ if (!tcg_out_sti(s, type, imm, TCG_REG_CALL_STACK, ofs)) {
+ tcg_debug_assert(parm->ntmp != 0);
+ tcg_out_movi(s, type, parm->tmp[0], imm);
+ tcg_out_st(s, type, parm->tmp[0], TCG_REG_CALL_STACK, ofs);
+ }
+ }
+}
+
+static void tcg_out_helper_load_common_args(TCGContext *s,
+ const TCGLabelQemuLdst *ldst,
+ const TCGLdstHelperParam *parm,
+ const TCGHelperInfo *info,
+ unsigned next_arg)
+{
+ TCGMovExtend ptr_mov = {
+ .dst_type = TCG_TYPE_PTR,
+ .src_type = TCG_TYPE_PTR,
+ .src_ext = sizeof(void *) == 4 ? MO_32 : MO_64
+ };
+ const TCGCallArgumentLoc *loc = &info->in[0];
+ TCGType type;
+ unsigned slot;
+ tcg_target_ulong imm;
+
+ /*
+ * Handle env, which is always first.
+ */
+ ptr_mov.dst = loc->arg_slot;
+ ptr_mov.src = TCG_AREG0;
+ tcg_out_helper_load_slots(s, 1, &ptr_mov, parm);
+
+ /*
+ * Handle oi.
+ */
+ imm = ldst->oi;
+ loc = &info->in[next_arg];
+ type = TCG_TYPE_I32;
+ switch (loc->kind) {
+ case TCG_CALL_ARG_NORMAL:
+ break;
+ case TCG_CALL_ARG_EXTEND_U:
+ case TCG_CALL_ARG_EXTEND_S:
+ /* No extension required for MemOpIdx. */
+ tcg_debug_assert(imm <= INT32_MAX);
+ type = TCG_TYPE_REG;
+ break;
+ default:
+ g_assert_not_reached();
+ }
+ tcg_out_helper_load_imm(s, loc->arg_slot, type, imm, parm);
+ next_arg++;
+
+ /*
+ * Handle ra.
+ */
+ loc = &info->in[next_arg];
+ slot = loc->arg_slot;
+ if (parm->ra_gen) {
+ int arg_reg = -1;
+ TCGReg ra_reg;
+
+ if (arg_slot_reg_p(slot)) {
+ arg_reg = tcg_target_call_iarg_regs[slot];
+ }
+ ra_reg = parm->ra_gen(s, ldst, arg_reg);
+
+ ptr_mov.dst = slot;
+ ptr_mov.src = ra_reg;
+ tcg_out_helper_load_slots(s, 1, &ptr_mov, parm);
+ } else {
+ imm = (uintptr_t)ldst->raddr;
+ tcg_out_helper_load_imm(s, slot, TCG_TYPE_PTR, imm, parm);
+ }
+}
+
+static unsigned tcg_out_helper_add_mov(TCGMovExtend *mov,
+ const TCGCallArgumentLoc *loc,
+ TCGType dst_type, TCGType src_type,
+ TCGReg lo, TCGReg hi)
+{
+ if (dst_type <= TCG_TYPE_REG) {
+ MemOp src_ext;
+
+ switch (loc->kind) {
+ case TCG_CALL_ARG_NORMAL:
+ src_ext = src_type == TCG_TYPE_I32 ? MO_32 : MO_64;
+ break;
+ case TCG_CALL_ARG_EXTEND_U:
+ dst_type = TCG_TYPE_REG;
+ src_ext = MO_UL;
+ break;
+ case TCG_CALL_ARG_EXTEND_S:
+ dst_type = TCG_TYPE_REG;
+ src_ext = MO_SL;
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+ mov[0].dst = loc->arg_slot;
+ mov[0].dst_type = dst_type;
+ mov[0].src = lo;
+ mov[0].src_type = src_type;
+ mov[0].src_ext = src_ext;
+ return 1;
+ }
+
+ assert(TCG_TARGET_REG_BITS == 32);
+
+ mov[0].dst = loc[HOST_BIG_ENDIAN].arg_slot;
+ mov[0].src = lo;
+ mov[0].dst_type = TCG_TYPE_I32;
+ mov[0].src_type = TCG_TYPE_I32;
+ mov[0].src_ext = MO_32;
+
+ mov[1].dst = loc[!HOST_BIG_ENDIAN].arg_slot;
+ mov[1].src = hi;
+ mov[1].dst_type = TCG_TYPE_I32;
+ mov[1].src_type = TCG_TYPE_I32;
+ mov[1].src_ext = MO_32;
+
+ return 2;
+}
+
+static void tcg_out_ld_helper_args(TCGContext *s, const TCGLabelQemuLdst *ldst,
+ const TCGLdstHelperParam *parm)
+{
+ const TCGHelperInfo *info;
+ const TCGCallArgumentLoc *loc;
+ TCGMovExtend mov[2];
+ unsigned next_arg, nmov;
+ MemOp mop = get_memop(ldst->oi);
+
+ switch (mop & MO_SIZE) {
+ case MO_8:
+ case MO_16:
+ case MO_32:
+ info = &info_helper_ld32_mmu;
+ break;
+ case MO_64:
+ info = &info_helper_ld64_mmu;
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+ /* Defer env argument. */
+ next_arg = 1;
+
+ loc = &info->in[next_arg];
+ nmov = tcg_out_helper_add_mov(mov, loc, TCG_TYPE_TL, TCG_TYPE_TL,
+ ldst->addrlo_reg, ldst->addrhi_reg);
+ next_arg += nmov;
+
+ tcg_out_helper_load_slots(s, nmov, mov, parm);
+
+ /* No special attention for 32 and 64-bit return values. */
+ tcg_debug_assert(info->out_kind == TCG_CALL_RET_NORMAL);
+
+ tcg_out_helper_load_common_args(s, ldst, parm, info, next_arg);
+}
+
+static void tcg_out_ld_helper_ret(TCGContext *s, const TCGLabelQemuLdst *ldst,
+ bool load_sign,
+ const TCGLdstHelperParam *parm)
+{
+ TCGMovExtend mov[2];
+
+ if (ldst->type <= TCG_TYPE_REG) {
+ MemOp mop = get_memop(ldst->oi);
+
+ mov[0].dst = ldst->datalo_reg;
+ mov[0].src = tcg_target_call_oarg_reg(TCG_CALL_RET_NORMAL, 0);
+ mov[0].dst_type = ldst->type;
+ mov[0].src_type = TCG_TYPE_REG;
+
+ /*
+ * If load_sign, then we allowed the helper to perform the
+ * appropriate sign extension to tcg_target_ulong, and all
+ * we need now is a plain move.
+ *
+ * If they do not, then we expect the relevant extension
+ * instruction to be no more expensive than a move, and
+ * we thus save the icache etc by only using one of two
+ * helper functions.
+ */
+ if (load_sign || !(mop & MO_SIGN)) {
+ if (TCG_TARGET_REG_BITS == 32 || ldst->type == TCG_TYPE_I32) {
+ mov[0].src_ext = MO_32;
+ } else {
+ mov[0].src_ext = MO_64;
+ }
+ } else {
+ mov[0].src_ext = mop & MO_SSIZE;
+ }
+ tcg_out_movext1(s, mov);
+ } else {
+ assert(TCG_TARGET_REG_BITS == 32);
+
+ mov[0].dst = ldst->datalo_reg;
+ mov[0].src =
+ tcg_target_call_oarg_reg(TCG_CALL_RET_NORMAL, HOST_BIG_ENDIAN);
+ mov[0].dst_type = TCG_TYPE_I32;
+ mov[0].src_type = TCG_TYPE_I32;
+ mov[0].src_ext = MO_32;
+
+ mov[1].dst = ldst->datahi_reg;
+ mov[1].src =
+ tcg_target_call_oarg_reg(TCG_CALL_RET_NORMAL, !HOST_BIG_ENDIAN);
+ mov[1].dst_type = TCG_TYPE_REG;
+ mov[1].src_type = TCG_TYPE_REG;
+ mov[1].src_ext = MO_32;
+
+ tcg_out_movext2(s, mov, mov + 1, parm->ntmp ? parm->tmp[0] : -1);
+ }
+}
+
+static void tcg_out_st_helper_args(TCGContext *s, const TCGLabelQemuLdst *ldst,
+ const TCGLdstHelperParam *parm)
+{
+ const TCGHelperInfo *info;
+ const TCGCallArgumentLoc *loc;
+ TCGMovExtend mov[4];
+ TCGType data_type;
+ unsigned next_arg, nmov, n;
+ MemOp mop = get_memop(ldst->oi);
+
+ switch (mop & MO_SIZE) {
+ case MO_8:
+ case MO_16:
+ case MO_32:
+ info = &info_helper_st32_mmu;
+ data_type = TCG_TYPE_I32;
+ break;
+ case MO_64:
+ info = &info_helper_st64_mmu;
+ data_type = TCG_TYPE_I64;
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+ /* Defer env argument. */
+ next_arg = 1;
+ nmov = 0;
+
+ /* Handle addr argument. */
+ loc = &info->in[next_arg];
+ n = tcg_out_helper_add_mov(mov, loc, TCG_TYPE_TL, TCG_TYPE_TL,
+ ldst->addrlo_reg, ldst->addrhi_reg);
+ next_arg += n;
+ nmov += n;
+
+ /* Handle data argument. */
+ loc = &info->in[next_arg];
+ n = tcg_out_helper_add_mov(mov + nmov, loc, data_type, ldst->type,
+ ldst->datalo_reg, ldst->datahi_reg);
+ next_arg += n;
+ nmov += n;
+ tcg_debug_assert(nmov <= ARRAY_SIZE(mov));
+
+ tcg_out_helper_load_slots(s, nmov, mov, parm);
+ tcg_out_helper_load_common_args(s, ldst, parm, info, next_arg);
+}
+
#ifdef CONFIG_PROFILER
/* avoid copy/paste errors */
--
2.34.1
* [PATCH v4 35/54] tcg/i386: Convert tcg_out_qemu_ld_slow_path
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Use tcg_out_ld_helper_args and tcg_out_ld_helper_ret.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/i386/tcg-target.c.inc | 71 +++++++++++++++------------------------
1 file changed, 28 insertions(+), 43 deletions(-)
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 8752968af2..17ad3c5963 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1802,13 +1802,37 @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
[MO_BEUQ] = helper_be_stq_mmu,
};
+/*
+ * Because i686 has no register parameters and because x86_64 has xchg
+ * to handle addr/data register overlap, we have placed all input arguments
+ * before we might need a scratch reg.
+ *
+ * Even then, a scratch is only needed for l->raddr. Rather than expose
+ * a general-purpose scratch when we don't actually know it's available,
+ * use the ra_gen hook to load into RAX if needed.
+ */
+#if TCG_TARGET_REG_BITS == 64
+static TCGReg ldst_ra_gen(TCGContext *s, const TCGLabelQemuLdst *l, int arg)
+{
+ if (arg < 0) {
+ arg = TCG_REG_RAX;
+ }
+ tcg_out_movi(s, TCG_TYPE_PTR, arg, (uintptr_t)l->raddr);
+ return arg;
+}
+static const TCGLdstHelperParam ldst_helper_param = {
+ .ra_gen = ldst_ra_gen
+};
+#else
+static const TCGLdstHelperParam ldst_helper_param = { };
+#endif
+
/*
* Generate code for the slow path for a load at the end of block
*/
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
- MemOpIdx oi = l->oi;
- MemOp opc = get_memop(oi);
+ MemOp opc = get_memop(l->oi);
tcg_insn_unit **label_ptr = &l->label_ptr[0];
/* resolve label address */
@@ -1817,49 +1841,10 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
tcg_patch32(label_ptr[1], s->code_ptr - label_ptr[1] - 4);
}
- if (TCG_TARGET_REG_BITS == 32) {
- int ofs = 0;
-
- tcg_out_st(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_ESP, ofs);
- ofs += 4;
-
- tcg_out_st(s, TCG_TYPE_I32, l->addrlo_reg, TCG_REG_ESP, ofs);
- ofs += 4;
-
- if (TARGET_LONG_BITS == 64) {
- tcg_out_st(s, TCG_TYPE_I32, l->addrhi_reg, TCG_REG_ESP, ofs);
- ofs += 4;
- }
-
- tcg_out_sti(s, TCG_TYPE_I32, oi, TCG_REG_ESP, ofs);
- ofs += 4;
-
- tcg_out_sti(s, TCG_TYPE_PTR, (uintptr_t)l->raddr, TCG_REG_ESP, ofs);
- } else {
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
- tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
- l->addrlo_reg);
- tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[2], oi);
- tcg_out_movi(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[3],
- (uintptr_t)l->raddr);
- }
-
+ tcg_out_ld_helper_args(s, l, &ldst_helper_param);
tcg_out_branch(s, 1, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]);
+ tcg_out_ld_helper_ret(s, l, false, &ldst_helper_param);
- if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
- TCGMovExtend ext[2] = {
- { .dst = l->datalo_reg, .dst_type = TCG_TYPE_I32,
- .src = TCG_REG_EAX, .src_type = TCG_TYPE_I32, .src_ext = MO_UL },
- { .dst = l->datahi_reg, .dst_type = TCG_TYPE_I32,
- .src = TCG_REG_EDX, .src_type = TCG_TYPE_I32, .src_ext = MO_UL },
- };
- tcg_out_movext2(s, &ext[0], &ext[1], -1);
- } else {
- tcg_out_movext(s, l->type, l->datalo_reg,
- TCG_TYPE_REG, opc & MO_SSIZE, TCG_REG_EAX);
- }
-
- /* Jump to the code corresponding to next IR of qemu_st */
tcg_out_jmp(s, l->raddr);
return true;
}
--
2.34.1
* [PATCH v4 36/54] tcg/i386: Convert tcg_out_qemu_st_slow_path
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Use tcg_out_st_helper_args. This eliminates the use of a tail call to
the store helper. This may or may not be an improvement, depending on
the call/return branch prediction of the host microarchitecture.
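In terms of the emitted code shape, the change is (sketch):
    /* Before: a "tail call" -- push the return address, then jump.
       The helper's final ret pairs with no call, which can defeat a
       return-address-stack predictor. */
    tcg_out_push(s, retaddr);
    tcg_out_jmp(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);

    /* After: an ordinary call, with the return jump emitted inline. */
    tcg_out_branch(s, 1, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
    tcg_out_jmp(s, l->raddr);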
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/i386/tcg-target.c.inc | 57 +++------------------------------------
1 file changed, 4 insertions(+), 53 deletions(-)
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 17ad3c5963..7dbfcbd20f 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1854,11 +1854,8 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
*/
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
- MemOpIdx oi = l->oi;
- MemOp opc = get_memop(oi);
- MemOp s_bits = opc & MO_SIZE;
+ MemOp opc = get_memop(l->oi);
tcg_insn_unit **label_ptr = &l->label_ptr[0];
- TCGReg retaddr;
/* resolve label address */
tcg_patch32(label_ptr[0], s->code_ptr - label_ptr[0] - 4);
@@ -1866,56 +1863,10 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
tcg_patch32(label_ptr[1], s->code_ptr - label_ptr[1] - 4);
}
- if (TCG_TARGET_REG_BITS == 32) {
- int ofs = 0;
+ tcg_out_st_helper_args(s, l, &ldst_helper_param);
+ tcg_out_branch(s, 1, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
- tcg_out_st(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_ESP, ofs);
- ofs += 4;
-
- tcg_out_st(s, TCG_TYPE_I32, l->addrlo_reg, TCG_REG_ESP, ofs);
- ofs += 4;
-
- if (TARGET_LONG_BITS == 64) {
- tcg_out_st(s, TCG_TYPE_I32, l->addrhi_reg, TCG_REG_ESP, ofs);
- ofs += 4;
- }
-
- tcg_out_st(s, TCG_TYPE_I32, l->datalo_reg, TCG_REG_ESP, ofs);
- ofs += 4;
-
- if (s_bits == MO_64) {
- tcg_out_st(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_ESP, ofs);
- ofs += 4;
- }
-
- tcg_out_sti(s, TCG_TYPE_I32, oi, TCG_REG_ESP, ofs);
- ofs += 4;
-
- retaddr = TCG_REG_EAX;
- tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
- tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP, ofs);
- } else {
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
- tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
- l->addrlo_reg);
- tcg_out_mov(s, (s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
- tcg_target_call_iarg_regs[2], l->datalo_reg);
- tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3], oi);
-
- if (ARRAY_SIZE(tcg_target_call_iarg_regs) > 4) {
- retaddr = tcg_target_call_iarg_regs[4];
- tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
- } else {
- retaddr = TCG_REG_RAX;
- tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
- tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP,
- TCG_TARGET_CALL_STACK_OFFSET);
- }
- }
-
- /* "Tail call" to the helper, with the return address back inline. */
- tcg_out_push(s, retaddr);
- tcg_out_jmp(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
+ tcg_out_jmp(s, l->raddr);
return true;
}
#else
--
2.34.1
* [PATCH v4 37/54] tcg/aarch64: Convert tcg_out_qemu_{ld,st}_slow_path
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/aarch64/tcg-target.c.inc | 40 +++++++++++++++---------------------
1 file changed, 16 insertions(+), 24 deletions(-)
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 202b90c001..62dd22d73c 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1580,13 +1580,6 @@ static void tcg_out_cltz(TCGContext *s, TCGType ext, TCGReg d,
}
}
-static void tcg_out_adr(TCGContext *s, TCGReg rd, const void *target)
-{
- ptrdiff_t offset = tcg_pcrel_diff(s, target);
- tcg_debug_assert(offset == sextract64(offset, 0, 21));
- tcg_out_insn(s, 3406, ADR, rd, offset);
-}
-
typedef struct {
TCGReg base;
TCGReg index;
@@ -1627,47 +1620,46 @@ static void * const qemu_st_helpers[MO_SIZE + 1] = {
#endif
};
+static const TCGLdstHelperParam ldst_helper_param = {
+ .ntmp = 1, .tmp = { TCG_REG_TMP }
+};
+
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
{
- MemOpIdx oi = lb->oi;
- MemOp opc = get_memop(oi);
+ MemOp opc = get_memop(lb->oi);
if (!reloc_pc19(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
return false;
}
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_X0, TCG_AREG0);
- tcg_out_mov(s, TARGET_LONG_BITS == 64, TCG_REG_X1, lb->addrlo_reg);
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_X2, oi);
- tcg_out_adr(s, TCG_REG_X3, lb->raddr);
+ tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SIZE]);
-
- tcg_out_movext(s, lb->type, lb->datalo_reg,
- TCG_TYPE_REG, opc & MO_SSIZE, TCG_REG_X0);
+ tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);
tcg_out_goto(s, lb->raddr);
return true;
}
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
{
- MemOpIdx oi = lb->oi;
- MemOp opc = get_memop(oi);
- MemOp size = opc & MO_SIZE;
+ MemOp opc = get_memop(lb->oi);
if (!reloc_pc19(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
return false;
}
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_X0, TCG_AREG0);
- tcg_out_mov(s, TARGET_LONG_BITS == 64, TCG_REG_X1, lb->addrlo_reg);
- tcg_out_mov(s, size == MO_64, TCG_REG_X2, lb->datalo_reg);
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_X3, oi);
- tcg_out_adr(s, TCG_REG_X4, lb->raddr);
+ tcg_out_st_helper_args(s, lb, &ldst_helper_param);
tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE]);
tcg_out_goto(s, lb->raddr);
return true;
}
#else
+static void tcg_out_adr(TCGContext *s, TCGReg rd, const void *target)
+{
+ ptrdiff_t offset = tcg_pcrel_diff(s, target);
+ tcg_debug_assert(offset == sextract64(offset, 0, 21));
+ tcg_out_insn(s, 3406, ADR, rd, offset);
+}
+
static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
{
if (!reloc_pc19(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
--
2.34.1
* [PATCH v4 38/54] tcg/arm: Convert tcg_out_qemu_{ld,st}_slow_path
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (36 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 37/54] tcg/aarch64: Convert tcg_out_qemu_{ld,st}_slow_path Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 39/54] tcg/loongarch64: Convert tcg_out_qemu_{ld, st}_slow_path Richard Henderson
` (15 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args. This allows our local
tcg_out_arg_* infrastructure to be removed.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/arm/tcg-target.c.inc | 140 +++++----------------------------------
1 file changed, 18 insertions(+), 122 deletions(-)
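For reference, the placement rules the deleted tcg_out_arg_* routines encoded
can be sketched standalone (a toy model of the 32-bit arm calling convention,
not the removed code): slots 0-3 are core registers r0-r3, slot 4 and up are
stack words, and a 64-bit value occupies an aligned even/odd pair.
#include <stdio.h>
static int place32(int slot, const char *name)
{
    if (slot < 4) {
        printf("%-6s -> r%d\n", name, slot);
    } else {
        printf("%-6s -> [sp + %d]\n", name, (slot - 4) * 4);
    }
    return slot + 1;
}
static int place64(int slot, const char *lo, const char *hi)
{
    slot += slot & 1;    /* advance to an even slot: reg pair, 8-aligned stack */
    slot = place32(slot, lo);
    return place32(slot, hi);
}
int main(void)
{
    /* e.g. a 64-bit store helper: (env, addr64, val64, oi, ra). */
    int slot = 0;
    slot = place32(slot, "env");
    slot = place64(slot, "addrlo", "addrhi");
    slot = place64(slot, "vallo", "valhi");
    slot = place32(slot, "oi");
    slot = place32(slot, "ra");
    return 0;
}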
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index c744512778..df514e56fc 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -690,8 +690,8 @@ tcg_out_ldrd_rwb(TCGContext *s, ARMCond cond, TCGReg rt, TCGReg rn, TCGReg rm)
tcg_out_memop_r(s, cond, INSN_LDRD_REG, rt, rn, rm, 1, 1, 1);
}
-static void tcg_out_strd_8(TCGContext *s, ARMCond cond, TCGReg rt,
- TCGReg rn, int imm8)
+static void __attribute__((unused))
+tcg_out_strd_8(TCGContext *s, ARMCond cond, TCGReg rt, TCGReg rn, int imm8)
{
tcg_out_memop_8(s, cond, INSN_STRD_IMM, rt, rn, imm8, 1, 0);
}
@@ -969,28 +969,16 @@ static void tcg_out_ext8u(TCGContext *s, TCGReg rd, TCGReg rn)
tcg_out_dat_imm(s, COND_AL, ARITH_AND, rd, rn, 0xff);
}
-static void __attribute__((unused))
-tcg_out_ext8u_cond(TCGContext *s, ARMCond cond, TCGReg rd, TCGReg rn)
-{
- tcg_out_dat_imm(s, cond, ARITH_AND, rd, rn, 0xff);
-}
-
static void tcg_out_ext16s(TCGContext *s, TCGType t, TCGReg rd, TCGReg rn)
{
/* sxth */
tcg_out32(s, 0x06bf0070 | (COND_AL << 28) | (rd << 12) | rn);
}
-static void tcg_out_ext16u_cond(TCGContext *s, ARMCond cond,
- TCGReg rd, TCGReg rn)
-{
- /* uxth */
- tcg_out32(s, 0x06ff0070 | (cond << 28) | (rd << 12) | rn);
-}
-
static void tcg_out_ext16u(TCGContext *s, TCGReg rd, TCGReg rn)
{
- tcg_out_ext16u_cond(s, COND_AL, rd, rn);
+ /* uxth */
+ tcg_out32(s, 0x06ff0070 | (COND_AL << 28) | (rd << 12) | rn);
}
static void tcg_out_ext32s(TCGContext *s, TCGReg rd, TCGReg rn)
@@ -1382,92 +1370,29 @@ static void * const qemu_st_helpers[MO_SIZE + 1] = {
#endif
};
-/* Helper routines for marshalling helper function arguments into
- * the correct registers and stack.
- * argreg is where we want to put this argument, arg is the argument itself.
- * Return value is the updated argreg ready for the next call.
- * Note that argreg 0..3 is real registers, 4+ on stack.
- *
- * We provide routines for arguments which are: immediate, 32 bit
- * value in register, 16 and 8 bit values in register (which must be zero
- * extended before use) and 64 bit value in a lo:hi register pair.
- */
-#define DEFINE_TCG_OUT_ARG(NAME, ARGTYPE, MOV_ARG, EXT_ARG) \
-static TCGReg NAME(TCGContext *s, TCGReg argreg, ARGTYPE arg) \
-{ \
- if (argreg < 4) { \
- MOV_ARG(s, COND_AL, argreg, arg); \
- } else { \
- int ofs = (argreg - 4) * 4; \
- EXT_ARG; \
- tcg_debug_assert(ofs + 4 <= TCG_STATIC_CALL_ARGS_SIZE); \
- tcg_out_st32_12(s, COND_AL, arg, TCG_REG_CALL_STACK, ofs); \
- } \
- return argreg + 1; \
-}
-
-DEFINE_TCG_OUT_ARG(tcg_out_arg_imm32, uint32_t, tcg_out_movi32,
- (tcg_out_movi32(s, COND_AL, TCG_REG_TMP, arg), arg = TCG_REG_TMP))
-DEFINE_TCG_OUT_ARG(tcg_out_arg_reg8, TCGReg, tcg_out_ext8u_cond,
- (tcg_out_ext8u_cond(s, COND_AL, TCG_REG_TMP, arg), arg = TCG_REG_TMP))
-DEFINE_TCG_OUT_ARG(tcg_out_arg_reg16, TCGReg, tcg_out_ext16u_cond,
- (tcg_out_ext16u_cond(s, COND_AL, TCG_REG_TMP, arg), arg = TCG_REG_TMP))
-DEFINE_TCG_OUT_ARG(tcg_out_arg_reg32, TCGReg, tcg_out_mov_reg, )
-
-static TCGReg tcg_out_arg_reg64(TCGContext *s, TCGReg argreg,
- TCGReg arglo, TCGReg arghi)
+static TCGReg ldst_ra_gen(TCGContext *s, const TCGLabelQemuLdst *l, int arg)
{
- /* 64 bit arguments must go in even/odd register pairs
- * and in 8-aligned stack slots.
- */
- if (argreg & 1) {
- argreg++;
- }
- if (argreg >= 4 && (arglo & 1) == 0 && arghi == arglo + 1) {
- tcg_out_strd_8(s, COND_AL, arglo,
- TCG_REG_CALL_STACK, (argreg - 4) * 4);
- return argreg + 2;
- } else {
- argreg = tcg_out_arg_reg32(s, argreg, arglo);
- argreg = tcg_out_arg_reg32(s, argreg, arghi);
- return argreg;
- }
+ /* We arrive at the slow path via "BLNE", so R14 contains l->raddr. */
+ return TCG_REG_R14;
}
+static const TCGLdstHelperParam ldst_helper_param = {
+ .ra_gen = ldst_ra_gen,
+ .ntmp = 1,
+ .tmp = { TCG_REG_TMP },
+};
+
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
{
- TCGReg argreg;
- MemOpIdx oi = lb->oi;
- MemOp opc = get_memop(oi);
+ MemOp opc = get_memop(lb->oi);
if (!reloc_pc24(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
return false;
}
- argreg = tcg_out_arg_reg32(s, TCG_REG_R0, TCG_AREG0);
- if (TARGET_LONG_BITS == 64) {
- argreg = tcg_out_arg_reg64(s, argreg, lb->addrlo_reg, lb->addrhi_reg);
- } else {
- argreg = tcg_out_arg_reg32(s, argreg, lb->addrlo_reg);
- }
- argreg = tcg_out_arg_imm32(s, argreg, oi);
- argreg = tcg_out_arg_reg32(s, argreg, TCG_REG_R14);
-
- /* Use the canonical unsigned helpers and minimize icache usage. */
+ tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SIZE]);
-
- if ((opc & MO_SIZE) == MO_64) {
- TCGMovExtend ext[2] = {
- { .dst = lb->datalo_reg, .dst_type = TCG_TYPE_I32,
- .src = TCG_REG_R0, .src_type = TCG_TYPE_I32, .src_ext = MO_UL },
- { .dst = lb->datahi_reg, .dst_type = TCG_TYPE_I32,
- .src = TCG_REG_R1, .src_type = TCG_TYPE_I32, .src_ext = MO_UL },
- };
- tcg_out_movext2(s, &ext[0], &ext[1], TCG_REG_TMP);
- } else {
- tcg_out_movext(s, TCG_TYPE_I32, lb->datalo_reg,
- TCG_TYPE_I32, opc & MO_SSIZE, TCG_REG_R0);
- }
+ tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);
tcg_out_goto(s, COND_AL, lb->raddr);
return true;
@@ -1475,42 +1400,13 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
{
- TCGReg argreg, datalo, datahi;
- MemOpIdx oi = lb->oi;
- MemOp opc = get_memop(oi);
+ MemOp opc = get_memop(lb->oi);
if (!reloc_pc24(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
return false;
}
- argreg = TCG_REG_R0;
- argreg = tcg_out_arg_reg32(s, argreg, TCG_AREG0);
- if (TARGET_LONG_BITS == 64) {
- argreg = tcg_out_arg_reg64(s, argreg, lb->addrlo_reg, lb->addrhi_reg);
- } else {
- argreg = tcg_out_arg_reg32(s, argreg, lb->addrlo_reg);
- }
-
- datalo = lb->datalo_reg;
- datahi = lb->datahi_reg;
- switch (opc & MO_SIZE) {
- case MO_8:
- argreg = tcg_out_arg_reg8(s, argreg, datalo);
- break;
- case MO_16:
- argreg = tcg_out_arg_reg16(s, argreg, datalo);
- break;
- case MO_32:
- default:
- argreg = tcg_out_arg_reg32(s, argreg, datalo);
- break;
- case MO_64:
- argreg = tcg_out_arg_reg64(s, argreg, datalo, datahi);
- break;
- }
-
- argreg = tcg_out_arg_imm32(s, argreg, oi);
- argreg = tcg_out_arg_reg32(s, argreg, TCG_REG_R14);
+ tcg_out_st_helper_args(s, lb, &ldst_helper_param);
/* Tail-call to the helper, which will return to the fast path. */
tcg_out_goto(s, COND_AL, qemu_st_helpers[opc & MO_SIZE]);
--
2.34.1
* [PATCH v4 39/54] tcg/loongarch64: Convert tcg_out_qemu_{ld, st}_slow_path
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (37 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 38/54] tcg/arm: " Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 40/54] tcg/mips: Convert tcg_out_qemu_{ld,st}_slow_path Richard Henderson
` (14 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/loongarch64/tcg-target.c.inc | 37 ++++++++++----------------------
1 file changed, 11 insertions(+), 26 deletions(-)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 2f2c34b930..60d2c904dd 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -824,51 +824,36 @@ static bool tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
return reloc_br_sd10k16(s->code_ptr - 1, target);
}
+static const TCGLdstHelperParam ldst_helper_param = {
+ .ntmp = 1, .tmp = { TCG_REG_TMP0 }
+};
+
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
- MemOpIdx oi = l->oi;
- MemOp opc = get_memop(oi);
- MemOp size = opc & MO_SIZE;
+ MemOp opc = get_memop(l->oi);
/* resolve label address */
if (!reloc_br_sk16(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
return false;
}
- /* call load helper */
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A0, TCG_AREG0);
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A1, l->addrlo_reg);
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A2, oi);
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A3, (tcg_target_long)l->raddr);
-
- tcg_out_call_int(s, qemu_ld_helpers[size], false);
-
- tcg_out_movext(s, l->type, l->datalo_reg,
- TCG_TYPE_REG, opc & MO_SSIZE, TCG_REG_A0);
+ tcg_out_ld_helper_args(s, l, &ldst_helper_param);
+ tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SIZE], false);
+ tcg_out_ld_helper_ret(s, l, false, &ldst_helper_param);
return tcg_out_goto(s, l->raddr);
}
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
- MemOpIdx oi = l->oi;
- MemOp opc = get_memop(oi);
- MemOp size = opc & MO_SIZE;
+ MemOp opc = get_memop(l->oi);
/* resolve label address */
if (!reloc_br_sk16(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
return false;
}
- /* call store helper */
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A0, TCG_AREG0);
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A1, l->addrlo_reg);
- tcg_out_movext(s, size == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32, TCG_REG_A2,
- l->type, size, l->datalo_reg);
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A3, oi);
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A4, (tcg_target_long)l->raddr);
-
- tcg_out_call_int(s, qemu_st_helpers[size], false);
-
+ tcg_out_st_helper_args(s, l, &ldst_helper_param);
+ tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE], false);
return tcg_out_goto(s, l->raddr);
}
#else
--
2.34.1
* [PATCH v4 40/54] tcg/mips: Convert tcg_out_qemu_{ld,st}_slow_path
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (38 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 39/54] tcg/loongarch64: Convert tcg_out_qemu_{ld, st}_slow_path Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 41/54] tcg/ppc: " Richard Henderson
` (13 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args. This allows our local
tcg_out_arg_* infrastructure to be removed.
We are no longer filling the call or return branch
delay slots, nor are we tail-calling for the store,
but this seems a small price to pay.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/mips/tcg-target.c.inc | 154 ++++++--------------------------------
1 file changed, 22 insertions(+), 132 deletions(-)
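To see the delay-slot cost being accepted, compare the shape of the emitted
code before and after; the mnemonics below are only illustrative (a toy trace,
not actual TCG output):
#include <stdio.h>
/* MIPS executes the instruction after a branch or call (the delay
 * slot) unconditionally, so useful work can be hidden there. */
static void emit(const char *insn)
{
    printf("  %s\n", insn);
}
int main(void)
{
    puts("old store slow path (tail call, filled slots):");
    emit("jr    helper         # tail call, $ra preloaded with raddr");
    emit("move  a0, env        # delay slot does useful work");
    puts("new store slow path (plain call, nop slots):");
    emit("jalr  helper");
    emit("nop                  # delay slot wasted");
    emit("b     raddr");
    emit("nop                  # delay slot wasted");
    return 0;
}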
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index 94708e6ea7..022960d79a 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -1115,79 +1115,15 @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
[MO_BEUQ] = helper_be_stq_mmu,
};
-/* Helper routines for marshalling helper function arguments into
- * the correct registers and stack.
- * I is where we want to put this argument, and is updated and returned
- * for the next call. ARG is the argument itself.
- *
- * We provide routines for arguments which are: immediate, 32 bit
- * value in register, 16 and 8 bit values in register (which must be zero
- * extended before use) and 64 bit value in a lo:hi register pair.
- */
-
-static int tcg_out_call_iarg_reg(TCGContext *s, int i, TCGReg arg)
-{
- if (i < ARRAY_SIZE(tcg_target_call_iarg_regs)) {
- tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[i], arg);
- } else {
- /* For N32 and N64, the initial offset is different. But there
- we also have 8 argument register so we don't run out here. */
- tcg_debug_assert(TCG_TARGET_REG_BITS == 32);
- tcg_out_st(s, TCG_TYPE_REG, arg, TCG_REG_SP, 4 * i);
- }
- return i + 1;
-}
-
-static int tcg_out_call_iarg_reg8(TCGContext *s, int i, TCGReg arg)
-{
- TCGReg tmp = TCG_TMP0;
- if (i < ARRAY_SIZE(tcg_target_call_iarg_regs)) {
- tmp = tcg_target_call_iarg_regs[i];
- }
- tcg_out_ext8u(s, tmp, arg);
- return tcg_out_call_iarg_reg(s, i, tmp);
-}
-
-static int tcg_out_call_iarg_reg16(TCGContext *s, int i, TCGReg arg)
-{
- TCGReg tmp = TCG_TMP0;
- if (i < ARRAY_SIZE(tcg_target_call_iarg_regs)) {
- tmp = tcg_target_call_iarg_regs[i];
- }
- tcg_out_opc_imm(s, OPC_ANDI, tmp, arg, 0xffff);
- return tcg_out_call_iarg_reg(s, i, tmp);
-}
-
-static int tcg_out_call_iarg_imm(TCGContext *s, int i, TCGArg arg)
-{
- TCGReg tmp = TCG_TMP0;
- if (arg == 0) {
- tmp = TCG_REG_ZERO;
- } else {
- if (i < ARRAY_SIZE(tcg_target_call_iarg_regs)) {
- tmp = tcg_target_call_iarg_regs[i];
- }
- tcg_out_movi(s, TCG_TYPE_REG, tmp, arg);
- }
- return tcg_out_call_iarg_reg(s, i, tmp);
-}
-
-static int tcg_out_call_iarg_reg2(TCGContext *s, int i, TCGReg al, TCGReg ah)
-{
- tcg_debug_assert(TCG_TARGET_REG_BITS == 32);
- i = (i + 1) & ~1;
- i = tcg_out_call_iarg_reg(s, i, (MIPS_BE ? ah : al));
- i = tcg_out_call_iarg_reg(s, i, (MIPS_BE ? al : ah));
- return i;
-}
+/* We have four temps, we might as well expose three of them. */
+static const TCGLdstHelperParam ldst_helper_param = {
+ .ntmp = 3, .tmp = { TCG_TMP0, TCG_TMP1, TCG_TMP2 }
+};
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
const tcg_insn_unit *tgt_rx = tcg_splitwx_to_rx(s->code_ptr);
- MemOpIdx oi = l->oi;
- MemOp opc = get_memop(oi);
- TCGReg v0;
- int i;
+ MemOp opc = get_memop(l->oi);
/* resolve label address */
if (!reloc_pc16(l->label_ptr[0], tgt_rx)
@@ -1196,29 +1132,13 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
return false;
}
- i = 1;
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
- i = tcg_out_call_iarg_reg2(s, i, l->addrlo_reg, l->addrhi_reg);
- } else {
- i = tcg_out_call_iarg_reg(s, i, l->addrlo_reg);
- }
- i = tcg_out_call_iarg_imm(s, i, oi);
- i = tcg_out_call_iarg_imm(s, i, (intptr_t)l->raddr);
+ tcg_out_ld_helper_args(s, l, &ldst_helper_param);
+
tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)], false);
/* delay slot */
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
+ tcg_out_nop(s);
- v0 = l->datalo_reg;
- if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
- /* We eliminated V0 from the possible output registers, so it
- cannot be clobbered here. So we must move V1 first. */
- if (MIPS_BE) {
- tcg_out_mov(s, TCG_TYPE_I32, v0, TCG_REG_V1);
- v0 = l->datahi_reg;
- } else {
- tcg_out_mov(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_V1);
- }
- }
+ tcg_out_ld_helper_ret(s, l, true, &ldst_helper_param);
tcg_out_opc_br(s, OPC_BEQ, TCG_REG_ZERO, TCG_REG_ZERO);
if (!reloc_pc16(s->code_ptr - 1, l->raddr)) {
@@ -1226,22 +1146,14 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
}
/* delay slot */
- if (TCG_TARGET_REG_BITS == 64 && l->type == TCG_TYPE_I32) {
- /* we always sign-extend 32-bit loads */
- tcg_out_ext32s(s, v0, TCG_REG_V0);
- } else {
- tcg_out_opc_reg(s, OPC_OR, v0, TCG_REG_V0, TCG_REG_ZERO);
- }
+ tcg_out_nop(s);
return true;
}
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
const tcg_insn_unit *tgt_rx = tcg_splitwx_to_rx(s->code_ptr);
- MemOpIdx oi = l->oi;
- MemOp opc = get_memop(oi);
- MemOp s_bits = opc & MO_SIZE;
- int i;
+ MemOp opc = get_memop(l->oi);
/* resolve label address */
if (!reloc_pc16(l->label_ptr[0], tgt_rx)
@@ -1250,41 +1162,19 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
return false;
}
- i = 1;
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
- i = tcg_out_call_iarg_reg2(s, i, l->addrlo_reg, l->addrhi_reg);
- } else {
- i = tcg_out_call_iarg_reg(s, i, l->addrlo_reg);
- }
- switch (s_bits) {
- case MO_8:
- i = tcg_out_call_iarg_reg8(s, i, l->datalo_reg);
- break;
- case MO_16:
- i = tcg_out_call_iarg_reg16(s, i, l->datalo_reg);
- break;
- case MO_32:
- i = tcg_out_call_iarg_reg(s, i, l->datalo_reg);
- break;
- case MO_64:
- if (TCG_TARGET_REG_BITS == 32) {
- i = tcg_out_call_iarg_reg2(s, i, l->datalo_reg, l->datahi_reg);
- } else {
- i = tcg_out_call_iarg_reg(s, i, l->datalo_reg);
- }
- break;
- default:
- g_assert_not_reached();
- }
- i = tcg_out_call_iarg_imm(s, i, oi);
+ tcg_out_st_helper_args(s, l, &ldst_helper_param);
- /* Tail call to the store helper. Thus force the return address
- computation to take place in the return address register. */
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_RA, (intptr_t)l->raddr);
- i = tcg_out_call_iarg_reg(s, i, TCG_REG_RA);
- tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)], true);
+ tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)], false);
/* delay slot */
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
+ tcg_out_nop(s);
+
+ tcg_out_opc_br(s, OPC_BEQ, TCG_REG_ZERO, TCG_REG_ZERO);
+ if (!reloc_pc16(s->code_ptr - 1, l->raddr)) {
+ return false;
+ }
+
+ /* delay slot */
+ tcg_out_nop(s);
return true;
}
--
2.34.1
* [PATCH v4 41/54] tcg/ppc: Convert tcg_out_qemu_{ld,st}_slow_path
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (39 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 40/54] tcg/mips: Convert tcg_out_qemu_{ld,st}_slow_path Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 42/54] tcg/riscv: " Richard Henderson
` (12 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel
Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x,
Daniel Henrique Barboza
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/ppc/tcg-target.c.inc | 88 ++++++++++++----------------------------
1 file changed, 26 insertions(+), 62 deletions(-)
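The comment about three temps added below stems from a parallel-move problem:
when the common code shuffles addr/data registers into argument registers,
the assignments can form a cycle that ordered copies alone cannot resolve.
A minimal standalone illustration (toy code, not tcg.c):
#include <stdio.h>
int main(void)
{
    /* Suppose r3 and r4 must swap to reach their argument slots:
     * r3 <- old r4 and r4 <- old r3, a 2-cycle. */
    int r3 = 33, r4 = 44;
    /* Ordered copies lose a value: */
    int n3 = r4;               /* "r3" = 44 */
    int n4 = n3;               /* "r4" = 44; the old r3 is gone */
    /* One scratch register resolves the cycle: */
    int tmp = r3;
    r3 = r4;
    r4 = tmp;
    printf("ordered: r3=%d r4=%d   with temp: r3=%d r4=%d\n",
           n3, n4, r3, r4);
    return 0;
}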
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 7239335bdf..042136fee7 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -2003,44 +2003,38 @@ static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
[MO_BEUQ] = helper_be_stq_mmu,
};
+static TCGReg ldst_ra_gen(TCGContext *s, const TCGLabelQemuLdst *l, int arg)
+{
+ if (arg < 0) {
+ arg = TCG_REG_TMP1;
+ }
+ tcg_out32(s, MFSPR | RT(arg) | LR);
+ return arg;
+}
+
+/*
+ * For the purposes of ppc32 sorting 4 input registers into 4 argument
+ * registers, there is an outside chance we would require 3 temps.
+ * Because of constraints, no inputs are in r3, and env will not be
+ * placed into r3 until after the sorting is done, and is thus free.
+ */
+static const TCGLdstHelperParam ldst_helper_param = {
+ .ra_gen = ldst_ra_gen,
+ .ntmp = 3,
+ .tmp = { TCG_REG_TMP1, TCG_REG_R0, TCG_REG_R3 }
+};
+
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
{
- MemOpIdx oi = lb->oi;
- MemOp opc = get_memop(oi);
- TCGReg hi, lo, arg = TCG_REG_R3;
+ MemOp opc = get_memop(lb->oi);
if (!reloc_pc14(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
return false;
}
- tcg_out_mov(s, TCG_TYPE_PTR, arg++, TCG_AREG0);
-
- lo = lb->addrlo_reg;
- hi = lb->addrhi_reg;
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
- arg |= (TCG_TARGET_CALL_ARG_I64 == TCG_CALL_ARG_EVEN);
- tcg_out_mov(s, TCG_TYPE_I32, arg++, hi);
- tcg_out_mov(s, TCG_TYPE_I32, arg++, lo);
- } else {
- /* If the address needed to be zero-extended, we'll have already
- placed it in R4. The only remaining case is 64-bit guest. */
- tcg_out_mov(s, TCG_TYPE_TL, arg++, lo);
- }
-
- tcg_out_movi(s, TCG_TYPE_I32, arg++, oi);
- tcg_out32(s, MFSPR | RT(arg) | LR);
-
+ tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
tcg_out_call_int(s, LK, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]);
-
- lo = lb->datalo_reg;
- hi = lb->datahi_reg;
- if (TCG_TARGET_REG_BITS == 32 && (opc & MO_SIZE) == MO_64) {
- tcg_out_mov(s, TCG_TYPE_I32, lo, TCG_REG_R4);
- tcg_out_mov(s, TCG_TYPE_I32, hi, TCG_REG_R3);
- } else {
- tcg_out_movext(s, lb->type, lo,
- TCG_TYPE_REG, opc & MO_SSIZE, TCG_REG_R3);
- }
+ tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);
tcg_out_b(s, 0, lb->raddr);
return true;
@@ -2048,43 +2042,13 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
{
- MemOpIdx oi = lb->oi;
- MemOp opc = get_memop(oi);
- MemOp s_bits = opc & MO_SIZE;
- TCGReg hi, lo, arg = TCG_REG_R3;
+ MemOp opc = get_memop(lb->oi);
if (!reloc_pc14(lb->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
return false;
}
- tcg_out_mov(s, TCG_TYPE_PTR, arg++, TCG_AREG0);
-
- lo = lb->addrlo_reg;
- hi = lb->addrhi_reg;
- if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
- arg |= (TCG_TARGET_CALL_ARG_I64 == TCG_CALL_ARG_EVEN);
- tcg_out_mov(s, TCG_TYPE_I32, arg++, hi);
- tcg_out_mov(s, TCG_TYPE_I32, arg++, lo);
- } else {
- /* If the address needed to be zero-extended, we'll have already
- placed it in R4. The only remaining case is 64-bit guest. */
- tcg_out_mov(s, TCG_TYPE_TL, arg++, lo);
- }
-
- lo = lb->datalo_reg;
- hi = lb->datahi_reg;
- if (TCG_TARGET_REG_BITS == 32 && s_bits == MO_64) {
- arg |= (TCG_TARGET_CALL_ARG_I64 == TCG_CALL_ARG_EVEN);
- tcg_out_mov(s, TCG_TYPE_I32, arg++, hi);
- tcg_out_mov(s, TCG_TYPE_I32, arg++, lo);
- } else {
- tcg_out_movext(s, s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32,
- arg++, lb->type, s_bits, lo);
- }
-
- tcg_out_movi(s, TCG_TYPE_I32, arg++, oi);
- tcg_out32(s, MFSPR | RT(arg) | LR);
-
+ tcg_out_st_helper_args(s, lb, &ldst_helper_param);
tcg_out_call_int(s, LK, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
tcg_out_b(s, 0, lb->raddr);
--
2.34.1
* [PATCH v4 42/54] tcg/riscv: Convert tcg_out_qemu_{ld,st}_slow_path
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (40 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 41/54] tcg/ppc: " Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 43/54] tcg/s390x: " Richard Henderson
` (11 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel
Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x,
Daniel Henrique Barboza
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/riscv/tcg-target.c.inc | 37 ++++++++++---------------------------
1 file changed, 10 insertions(+), 27 deletions(-)
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 2b2d313fe2..c22d1e35ac 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -906,14 +906,14 @@ static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
tcg_debug_assert(ok);
}
+/* We have three temps, we might as well expose them. */
+static const TCGLdstHelperParam ldst_helper_param = {
+ .ntmp = 3, .tmp = { TCG_REG_TMP0, TCG_REG_TMP1, TCG_REG_TMP2 }
+};
+
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
- MemOpIdx oi = l->oi;
- MemOp opc = get_memop(oi);
- TCGReg a0 = tcg_target_call_iarg_regs[0];
- TCGReg a1 = tcg_target_call_iarg_regs[1];
- TCGReg a2 = tcg_target_call_iarg_regs[2];
- TCGReg a3 = tcg_target_call_iarg_regs[3];
+ MemOp opc = get_memop(l->oi);
/* resolve label address */
if (!reloc_sbimm12(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
@@ -921,13 +921,9 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
}
/* call load helper */
- tcg_out_mov(s, TCG_TYPE_PTR, a0, TCG_AREG0);
- tcg_out_mov(s, TCG_TYPE_PTR, a1, l->addrlo_reg);
- tcg_out_movi(s, TCG_TYPE_PTR, a2, oi);
- tcg_out_movi(s, TCG_TYPE_PTR, a3, (tcg_target_long)l->raddr);
-
+ tcg_out_ld_helper_args(s, l, &ldst_helper_param);
tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SSIZE], false);
- tcg_out_mov(s, (opc & MO_SIZE) == MO_64, l->datalo_reg, a0);
+ tcg_out_ld_helper_ret(s, l, true, &ldst_helper_param);
tcg_out_goto(s, l->raddr);
return true;
@@ -935,14 +931,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
{
- MemOpIdx oi = l->oi;
- MemOp opc = get_memop(oi);
- MemOp s_bits = opc & MO_SIZE;
- TCGReg a0 = tcg_target_call_iarg_regs[0];
- TCGReg a1 = tcg_target_call_iarg_regs[1];
- TCGReg a2 = tcg_target_call_iarg_regs[2];
- TCGReg a3 = tcg_target_call_iarg_regs[3];
- TCGReg a4 = tcg_target_call_iarg_regs[4];
+ MemOp opc = get_memop(l->oi);
/* resolve label address */
if (!reloc_sbimm12(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
@@ -950,13 +939,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
}
/* call store helper */
- tcg_out_mov(s, TCG_TYPE_PTR, a0, TCG_AREG0);
- tcg_out_mov(s, TCG_TYPE_PTR, a1, l->addrlo_reg);
- tcg_out_movext(s, s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32, a2,
- l->type, s_bits, l->datalo_reg);
- tcg_out_movi(s, TCG_TYPE_PTR, a3, oi);
- tcg_out_movi(s, TCG_TYPE_PTR, a4, (tcg_target_long)l->raddr);
-
+ tcg_out_st_helper_args(s, l, &ldst_helper_param);
tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE], false);
tcg_out_goto(s, l->raddr);
--
2.34.1
* [PATCH v4 43/54] tcg/s390x: Convert tcg_out_qemu_{ld,st}_slow_path
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (41 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 42/54] tcg/riscv: " Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 44/54] tcg/loongarch64: Simplify constraints on qemu_ld/st Richard Henderson
` (10 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret,
and tcg_out_st_helper_args.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/s390x/tcg-target.c.inc | 35 ++++++++++-------------------------
1 file changed, 10 insertions(+), 25 deletions(-)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index c3157d22be..dfcf4d9e34 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -1718,26 +1718,22 @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg data,
}
#if defined(CONFIG_SOFTMMU)
+static const TCGLdstHelperParam ldst_helper_param = {
+ .ntmp = 1, .tmp = { TCG_TMP0 }
+};
+
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
{
- TCGReg addr_reg = lb->addrlo_reg;
- TCGReg data_reg = lb->datalo_reg;
- MemOpIdx oi = lb->oi;
- MemOp opc = get_memop(oi);
+ MemOp opc = get_memop(lb->oi);
if (!patch_reloc(lb->label_ptr[0], R_390_PC16DBL,
(intptr_t)tcg_splitwx_to_rx(s->code_ptr), 2)) {
return false;
}
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_R2, TCG_AREG0);
- if (TARGET_LONG_BITS == 64) {
- tcg_out_mov(s, TCG_TYPE_I64, TCG_REG_R3, addr_reg);
- }
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R4, oi);
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R5, (uintptr_t)lb->raddr);
- tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)]);
- tcg_out_mov(s, TCG_TYPE_I64, data_reg, TCG_REG_R2);
+ tcg_out_ld_helper_args(s, lb, &ldst_helper_param);
+ tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]);
+ tcg_out_ld_helper_ret(s, lb, false, &ldst_helper_param);
tgen_gotoi(s, S390_CC_ALWAYS, lb->raddr);
return true;
@@ -1745,25 +1741,14 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
{
- TCGReg addr_reg = lb->addrlo_reg;
- TCGReg data_reg = lb->datalo_reg;
- MemOpIdx oi = lb->oi;
- MemOp opc = get_memop(oi);
- MemOp size = opc & MO_SIZE;
+ MemOp opc = get_memop(lb->oi);
if (!patch_reloc(lb->label_ptr[0], R_390_PC16DBL,
(intptr_t)tcg_splitwx_to_rx(s->code_ptr), 2)) {
return false;
}
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_R2, TCG_AREG0);
- if (TARGET_LONG_BITS == 64) {
- tcg_out_mov(s, TCG_TYPE_I64, TCG_REG_R3, addr_reg);
- }
- tcg_out_movext(s, size == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32,
- TCG_REG_R4, lb->type, size, data_reg);
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R5, oi);
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R6, (uintptr_t)lb->raddr);
+ tcg_out_st_helper_args(s, lb, &ldst_helper_param);
tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]);
tgen_gotoi(s, S390_CC_ALWAYS, lb->raddr);
--
2.34.1
* [PATCH v4 44/54] tcg/loongarch64: Simplify constraints on qemu_ld/st
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (42 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 43/54] tcg/s390x: " Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 45/54] tcg/mips: Remove MO_BSWAP handling Richard Henderson
` (9 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
The softmmu tlb uses TCG_REG_TMP[0-2], not any of the normally available
registers. Now that we handle overlap between inputs and helper arguments,
we can allow any allocatable reg.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/loongarch64/tcg-target-con-set.h | 2 --
tcg/loongarch64/tcg-target-con-str.h | 1 -
tcg/loongarch64/tcg-target.c.inc | 23 ++++-------------------
3 files changed, 4 insertions(+), 22 deletions(-)
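The mechanics of the dropped 'L' constraint amount to a register-mask
subtraction; a standalone sketch (register numbering assumed for loongarch64,
where a0 is r4, and __builtin_popcountll is a gcc/clang builtin):
#include <stdint.h>
#include <stdio.h>
#define MAKE_64BIT_MASK(shift, length) \
    ((~0ULL >> (64 - (length))) << (shift))
int main(void)
{
    uint64_t all_general = MAKE_64BIT_MASK(0, 32);     /* 'r' */
    uint64_t reserve = MAKE_64BIT_MASK(4, 5);          /* a0-a4, i.e. r4-r8 */
    uint64_t old_L = all_general & ~reserve;           /* old 'L' */
    printf("'r' allows %d registers; old 'L' allowed only %d\n",
           __builtin_popcountll(all_general),
           __builtin_popcountll(old_L));
    return 0;
}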
diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 172c107289..c2bde44613 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -17,9 +17,7 @@
C_O0_I1(r)
C_O0_I2(rZ, r)
C_O0_I2(rZ, rZ)
-C_O0_I2(LZ, L)
C_O1_I1(r, r)
-C_O1_I1(r, L)
C_O1_I2(r, r, rC)
C_O1_I2(r, r, ri)
C_O1_I2(r, r, rI)
diff --git a/tcg/loongarch64/tcg-target-con-str.h b/tcg/loongarch64/tcg-target-con-str.h
index 541ff47fa9..6e9ccca3ad 100644
--- a/tcg/loongarch64/tcg-target-con-str.h
+++ b/tcg/loongarch64/tcg-target-con-str.h
@@ -14,7 +14,6 @@
* REGS(letter, register_mask)
*/
REGS('r', ALL_GENERAL_REGS)
-REGS('L', ALL_GENERAL_REGS & ~SOFTMMU_RESERVE_REGS)
/*
* Define constraint letters for constants:
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 60d2c904dd..83fa45c802 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -133,18 +133,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
#define TCG_CT_CONST_C12 0x1000
#define TCG_CT_CONST_WSZ 0x2000
-#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
-/*
- * For softmmu, we need to avoid conflicts with the first 5
- * argument registers to call the helper. Some of these are
- * also used for the tlb lookup.
- */
-#ifdef CONFIG_SOFTMMU
-#define SOFTMMU_RESERVE_REGS MAKE_64BIT_MASK(TCG_REG_A0, 5)
-#else
-#define SOFTMMU_RESERVE_REGS 0
-#endif
-
+#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
static inline tcg_target_long sextreg(tcg_target_long val, int pos, int len)
{
@@ -1541,16 +1530,14 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_st32_i64:
case INDEX_op_st_i32:
case INDEX_op_st_i64:
+ case INDEX_op_qemu_st_i32:
+ case INDEX_op_qemu_st_i64:
return C_O0_I2(rZ, r);
case INDEX_op_brcond_i32:
case INDEX_op_brcond_i64:
return C_O0_I2(rZ, rZ);
- case INDEX_op_qemu_st_i32:
- case INDEX_op_qemu_st_i64:
- return C_O0_I2(LZ, L);
-
case INDEX_op_ext8s_i32:
case INDEX_op_ext8s_i64:
case INDEX_op_ext8u_i32:
@@ -1586,11 +1573,9 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_ld32u_i64:
case INDEX_op_ld_i32:
case INDEX_op_ld_i64:
- return C_O1_I1(r, r);
-
case INDEX_op_qemu_ld_i32:
case INDEX_op_qemu_ld_i64:
- return C_O1_I1(r, L);
+ return C_O1_I1(r, r);
case INDEX_op_andc_i32:
case INDEX_op_andc_i64:
--
2.34.1
* [PATCH v4 45/54] tcg/mips: Remove MO_BSWAP handling
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (43 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 44/54] tcg/loongarch64: Simplify constraints on qemu_ld/st Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 46/54] tcg/mips: Reorg tlb load within prepare_host_addr Richard Henderson
` (8 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
While performing the load in the delay slot of the call to the common
bswap helper function is cute, it is not worth the added complexity.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/mips/tcg-target.h | 4 +-
tcg/mips/tcg-target.c.inc | 284 ++++++--------------------------------
2 files changed, 48 insertions(+), 240 deletions(-)
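With TCG_TARGET_HAS_MEMORY_BSWAP set to 0, common code byte-swaps any guest
access whose endianness differs from the host, so the backend only ever sees
host-order accesses and its helper table collapses to one entry per size.
A standalone sketch of that compile-time selection (the ld4_* functions are
toy stand-ins for the real helpers; the endianness test uses gcc/clang
predefined macros):
#include <stdint.h>
#include <stdio.h>
#define HOST_BIG_ENDIAN (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
static uint32_t ld4_be(const uint8_t *p)
{
    return (uint32_t)p[0] << 24 | p[1] << 16 | p[2] << 8 | p[3];
}
static uint32_t ld4_le(const uint8_t *p)
{
    return (uint32_t)p[3] << 24 | p[2] << 16 | p[1] << 8 | p[0];
}
int main(void)
{
    /* One host-order helper per size; no MO_BSWAP dimension. */
    uint32_t (*ld4)(const uint8_t *) = HOST_BIG_ENDIAN ? ld4_be : ld4_le;
    uint8_t mem[4] = { 0x11, 0x22, 0x33, 0x44 };
    printf("host-order 4-byte load: 0x%08x\n", ld4(mem));
    return 0;
}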
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index 2431fc5353..42bd7fff01 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -204,8 +204,8 @@ extern bool use_mips32r2_instructions;
#define TCG_TARGET_HAS_ext16u_i64 0 /* andi rt, rs, 0xffff */
#endif
-#define TCG_TARGET_DEFAULT_MO (0)
-#define TCG_TARGET_HAS_MEMORY_BSWAP 1
+#define TCG_TARGET_DEFAULT_MO 0
+#define TCG_TARGET_HAS_MEMORY_BSWAP 0
#define TCG_TARGET_NEED_LDST_LABELS
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index 022960d79a..31d58e1977 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -1088,31 +1088,35 @@ static void tcg_out_call(TCGContext *s, const tcg_insn_unit *arg,
}
#if defined(CONFIG_SOFTMMU)
-static void * const qemu_ld_helpers[(MO_SSIZE | MO_BSWAP) + 1] = {
+static void * const qemu_ld_helpers[MO_SSIZE + 1] = {
[MO_UB] = helper_ret_ldub_mmu,
[MO_SB] = helper_ret_ldsb_mmu,
- [MO_LEUW] = helper_le_lduw_mmu,
- [MO_LESW] = helper_le_ldsw_mmu,
- [MO_LEUL] = helper_le_ldul_mmu,
- [MO_LEUQ] = helper_le_ldq_mmu,
- [MO_BEUW] = helper_be_lduw_mmu,
- [MO_BESW] = helper_be_ldsw_mmu,
- [MO_BEUL] = helper_be_ldul_mmu,
- [MO_BEUQ] = helper_be_ldq_mmu,
-#if TCG_TARGET_REG_BITS == 64
- [MO_LESL] = helper_le_ldsl_mmu,
- [MO_BESL] = helper_be_ldsl_mmu,
+#if HOST_BIG_ENDIAN
+ [MO_UW] = helper_be_lduw_mmu,
+ [MO_SW] = helper_be_ldsw_mmu,
+ [MO_UL] = helper_be_ldul_mmu,
+ [MO_SL] = helper_be_ldsl_mmu,
+ [MO_UQ] = helper_be_ldq_mmu,
+#else
+ [MO_UW] = helper_le_lduw_mmu,
+ [MO_SW] = helper_le_ldsw_mmu,
+ [MO_UL] = helper_le_ldul_mmu,
+ [MO_UQ] = helper_le_ldq_mmu,
+ [MO_SL] = helper_le_ldsl_mmu,
#endif
};
-static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = {
+static void * const qemu_st_helpers[MO_SIZE + 1] = {
[MO_UB] = helper_ret_stb_mmu,
- [MO_LEUW] = helper_le_stw_mmu,
- [MO_LEUL] = helper_le_stl_mmu,
- [MO_LEUQ] = helper_le_stq_mmu,
- [MO_BEUW] = helper_be_stw_mmu,
- [MO_BEUL] = helper_be_stl_mmu,
- [MO_BEUQ] = helper_be_stq_mmu,
+#if HOST_BIG_ENDIAN
+ [MO_UW] = helper_be_stw_mmu,
+ [MO_UL] = helper_be_stl_mmu,
+ [MO_UQ] = helper_be_stq_mmu,
+#else
+ [MO_UW] = helper_le_stw_mmu,
+ [MO_UL] = helper_le_stl_mmu,
+ [MO_UQ] = helper_le_stq_mmu,
+#endif
};
/* We have four temps, we might as well expose three of them. */
@@ -1134,7 +1138,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
tcg_out_ld_helper_args(s, l, &ldst_helper_param);
- tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)], false);
+ tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SSIZE], false);
/* delay slot */
tcg_out_nop(s);
@@ -1164,7 +1168,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
tcg_out_st_helper_args(s, l, &ldst_helper_param);
- tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)], false);
+ tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE], false);
/* delay slot */
tcg_out_nop(s);
@@ -1379,52 +1383,19 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
TCGReg base, MemOp opc, TCGType type)
{
- switch (opc & (MO_SSIZE | MO_BSWAP)) {
+ switch (opc & MO_SSIZE) {
case MO_UB:
tcg_out_opc_imm(s, OPC_LBU, lo, base, 0);
break;
case MO_SB:
tcg_out_opc_imm(s, OPC_LB, lo, base, 0);
break;
- case MO_UW | MO_BSWAP:
- tcg_out_opc_imm(s, OPC_LHU, TCG_TMP1, base, 0);
- tcg_out_bswap16(s, lo, TCG_TMP1, TCG_BSWAP_IZ | TCG_BSWAP_OZ);
- break;
case MO_UW:
tcg_out_opc_imm(s, OPC_LHU, lo, base, 0);
break;
- case MO_SW | MO_BSWAP:
- tcg_out_opc_imm(s, OPC_LHU, TCG_TMP1, base, 0);
- tcg_out_bswap16(s, lo, TCG_TMP1, TCG_BSWAP_IZ | TCG_BSWAP_OS);
- break;
case MO_SW:
tcg_out_opc_imm(s, OPC_LH, lo, base, 0);
break;
- case MO_UL | MO_BSWAP:
- if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64) {
- if (use_mips32r2_instructions) {
- tcg_out_opc_imm(s, OPC_LWU, lo, base, 0);
- tcg_out_bswap32(s, lo, lo, TCG_BSWAP_IZ | TCG_BSWAP_OZ);
- } else {
- tcg_out_bswap_subr(s, bswap32u_addr);
- /* delay slot */
- tcg_out_opc_imm(s, OPC_LWU, TCG_TMP0, base, 0);
- tcg_out_mov(s, TCG_TYPE_I64, lo, TCG_TMP3);
- }
- break;
- }
- /* FALLTHRU */
- case MO_SL | MO_BSWAP:
- if (use_mips32r2_instructions) {
- tcg_out_opc_imm(s, OPC_LW, lo, base, 0);
- tcg_out_bswap32(s, lo, lo, 0);
- } else {
- tcg_out_bswap_subr(s, bswap32_addr);
- /* delay slot */
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, base, 0);
- tcg_out_mov(s, TCG_TYPE_I32, lo, TCG_TMP3);
- }
- break;
case MO_UL:
if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64) {
tcg_out_opc_imm(s, OPC_LWU, lo, base, 0);
@@ -1434,35 +1405,6 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
case MO_SL:
tcg_out_opc_imm(s, OPC_LW, lo, base, 0);
break;
- case MO_UQ | MO_BSWAP:
- if (TCG_TARGET_REG_BITS == 64) {
- if (use_mips32r2_instructions) {
- tcg_out_opc_imm(s, OPC_LD, lo, base, 0);
- tcg_out_bswap64(s, lo, lo);
- } else {
- tcg_out_bswap_subr(s, bswap64_addr);
- /* delay slot */
- tcg_out_opc_imm(s, OPC_LD, TCG_TMP0, base, 0);
- tcg_out_mov(s, TCG_TYPE_I64, lo, TCG_TMP3);
- }
- } else if (use_mips32r2_instructions) {
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, base, 0);
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP1, base, 4);
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP0, 0, TCG_TMP0);
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP1, 0, TCG_TMP1);
- tcg_out_opc_sa(s, OPC_ROTR, MIPS_BE ? lo : hi, TCG_TMP0, 16);
- tcg_out_opc_sa(s, OPC_ROTR, MIPS_BE ? hi : lo, TCG_TMP1, 16);
- } else {
- tcg_out_bswap_subr(s, bswap32_addr);
- /* delay slot */
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, base, 0);
- tcg_out_opc_imm(s, OPC_LW, TCG_TMP0, base, 4);
- tcg_out_bswap_subr(s, bswap32_addr);
- /* delay slot */
- tcg_out_mov(s, TCG_TYPE_I32, MIPS_BE ? lo : hi, TCG_TMP3);
- tcg_out_mov(s, TCG_TYPE_I32, MIPS_BE ? hi : lo, TCG_TMP3);
- }
- break;
case MO_UQ:
/* Prefer to load from offset 0 first, but allow for overlap. */
if (TCG_TARGET_REG_BITS == 64) {
@@ -1487,25 +1429,20 @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
const MIPSInsn lw2 = MIPS_BE ? OPC_LWR : OPC_LWL;
const MIPSInsn ld1 = MIPS_BE ? OPC_LDL : OPC_LDR;
const MIPSInsn ld2 = MIPS_BE ? OPC_LDR : OPC_LDL;
+ bool sgn = opc & MO_SIGN;
- bool sgn = (opc & MO_SIGN);
-
- switch (opc & (MO_SSIZE | MO_BSWAP)) {
- case MO_SW | MO_BE:
- case MO_UW | MO_BE:
- tcg_out_opc_imm(s, sgn ? OPC_LB : OPC_LBU, TCG_TMP0, base, 0);
- tcg_out_opc_imm(s, OPC_LBU, lo, base, 1);
- if (use_mips32r2_instructions) {
- tcg_out_opc_bf(s, OPC_INS, lo, TCG_TMP0, 31, 8);
- } else {
- tcg_out_opc_sa(s, OPC_SLL, TCG_TMP0, TCG_TMP0, 8);
- tcg_out_opc_reg(s, OPC_OR, lo, TCG_TMP0, TCG_TMP1);
- }
- break;
-
- case MO_SW | MO_LE:
- case MO_UW | MO_LE:
- if (use_mips32r2_instructions && lo != base) {
+ switch (opc & MO_SIZE) {
+ case MO_16:
+ if (HOST_BIG_ENDIAN) {
+ tcg_out_opc_imm(s, sgn ? OPC_LB : OPC_LBU, TCG_TMP0, base, 0);
+ tcg_out_opc_imm(s, OPC_LBU, lo, base, 1);
+ if (use_mips32r2_instructions) {
+ tcg_out_opc_bf(s, OPC_INS, lo, TCG_TMP0, 31, 8);
+ } else {
+ tcg_out_opc_sa(s, OPC_SLL, TCG_TMP0, TCG_TMP0, 8);
+ tcg_out_opc_reg(s, OPC_OR, lo, lo, TCG_TMP0);
+ }
+ } else if (use_mips32r2_instructions && lo != base) {
tcg_out_opc_imm(s, OPC_LBU, lo, base, 0);
tcg_out_opc_imm(s, sgn ? OPC_LB : OPC_LBU, TCG_TMP0, base, 1);
tcg_out_opc_bf(s, OPC_INS, lo, TCG_TMP0, 31, 8);
@@ -1517,8 +1454,7 @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
}
break;
- case MO_SL:
- case MO_UL:
+ case MO_32:
tcg_out_opc_imm(s, lw1, lo, base, 0);
tcg_out_opc_imm(s, lw2, lo, base, 3);
if (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64 && !sgn) {
@@ -1526,28 +1462,7 @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
}
break;
- case MO_UL | MO_BSWAP:
- case MO_SL | MO_BSWAP:
- if (use_mips32r2_instructions) {
- tcg_out_opc_imm(s, lw1, lo, base, 0);
- tcg_out_opc_imm(s, lw2, lo, base, 3);
- tcg_out_bswap32(s, lo, lo,
- TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64
- ? (sgn ? TCG_BSWAP_OS : TCG_BSWAP_OZ) : 0);
- } else {
- const tcg_insn_unit *subr =
- (TCG_TARGET_REG_BITS == 64 && type == TCG_TYPE_I64 && !sgn
- ? bswap32u_addr : bswap32_addr);
-
- tcg_out_opc_imm(s, lw1, TCG_TMP0, base, 0);
- tcg_out_bswap_subr(s, subr);
- /* delay slot */
- tcg_out_opc_imm(s, lw2, TCG_TMP0, base, 3);
- tcg_out_mov(s, type, lo, TCG_TMP3);
- }
- break;
-
- case MO_UQ:
+ case MO_64:
if (TCG_TARGET_REG_BITS == 64) {
tcg_out_opc_imm(s, ld1, lo, base, 0);
tcg_out_opc_imm(s, ld2, lo, base, 7);
@@ -1559,42 +1474,6 @@ static void tcg_out_qemu_ld_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
}
break;
- case MO_UQ | MO_BSWAP:
- if (TCG_TARGET_REG_BITS == 64) {
- if (use_mips32r2_instructions) {
- tcg_out_opc_imm(s, ld1, lo, base, 0);
- tcg_out_opc_imm(s, ld2, lo, base, 7);
- tcg_out_bswap64(s, lo, lo);
- } else {
- tcg_out_opc_imm(s, ld1, TCG_TMP0, base, 0);
- tcg_out_bswap_subr(s, bswap64_addr);
- /* delay slot */
- tcg_out_opc_imm(s, ld2, TCG_TMP0, base, 7);
- tcg_out_mov(s, TCG_TYPE_I64, lo, TCG_TMP3);
- }
- } else if (use_mips32r2_instructions) {
- tcg_out_opc_imm(s, lw1, TCG_TMP0, base, 0 + 0);
- tcg_out_opc_imm(s, lw2, TCG_TMP0, base, 0 + 3);
- tcg_out_opc_imm(s, lw1, TCG_TMP1, base, 4 + 0);
- tcg_out_opc_imm(s, lw2, TCG_TMP1, base, 4 + 3);
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP0, 0, TCG_TMP0);
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP1, 0, TCG_TMP1);
- tcg_out_opc_sa(s, OPC_ROTR, MIPS_BE ? lo : hi, TCG_TMP0, 16);
- tcg_out_opc_sa(s, OPC_ROTR, MIPS_BE ? hi : lo, TCG_TMP1, 16);
- } else {
- tcg_out_opc_imm(s, lw1, TCG_TMP0, base, 0 + 0);
- tcg_out_bswap_subr(s, bswap32_addr);
- /* delay slot */
- tcg_out_opc_imm(s, lw2, TCG_TMP0, base, 0 + 3);
- tcg_out_opc_imm(s, lw1, TCG_TMP0, base, 4 + 0);
- tcg_out_mov(s, TCG_TYPE_I32, MIPS_BE ? lo : hi, TCG_TMP3);
- tcg_out_bswap_subr(s, bswap32_addr);
- /* delay slot */
- tcg_out_opc_imm(s, lw2, TCG_TMP0, base, 4 + 3);
- tcg_out_mov(s, TCG_TYPE_I32, MIPS_BE ? hi : lo, TCG_TMP3);
- }
- break;
-
default:
g_assert_not_reached();
}
@@ -1627,50 +1506,16 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, TCGReg datahi,
static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg lo, TCGReg hi,
TCGReg base, MemOp opc)
{
- /* Don't clutter the code below with checks to avoid bswapping ZERO. */
- if ((lo | hi) == 0) {
- opc &= ~MO_BSWAP;
- }
-
- switch (opc & (MO_SIZE | MO_BSWAP)) {
+ switch (opc & MO_SIZE) {
case MO_8:
tcg_out_opc_imm(s, OPC_SB, lo, base, 0);
break;
-
- case MO_16 | MO_BSWAP:
- tcg_out_bswap16(s, TCG_TMP1, lo, 0);
- lo = TCG_TMP1;
- /* FALLTHRU */
case MO_16:
tcg_out_opc_imm(s, OPC_SH, lo, base, 0);
break;
-
- case MO_32 | MO_BSWAP:
- tcg_out_bswap32(s, TCG_TMP3, lo, 0);
- lo = TCG_TMP3;
- /* FALLTHRU */
case MO_32:
tcg_out_opc_imm(s, OPC_SW, lo, base, 0);
break;
-
- case MO_64 | MO_BSWAP:
- if (TCG_TARGET_REG_BITS == 64) {
- tcg_out_bswap64(s, TCG_TMP3, lo);
- tcg_out_opc_imm(s, OPC_SD, TCG_TMP3, base, 0);
- } else if (use_mips32r2_instructions) {
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP0, 0, MIPS_BE ? lo : hi);
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP1, 0, MIPS_BE ? hi : lo);
- tcg_out_opc_sa(s, OPC_ROTR, TCG_TMP0, TCG_TMP0, 16);
- tcg_out_opc_sa(s, OPC_ROTR, TCG_TMP1, TCG_TMP1, 16);
- tcg_out_opc_imm(s, OPC_SW, TCG_TMP0, base, 0);
- tcg_out_opc_imm(s, OPC_SW, TCG_TMP1, base, 4);
- } else {
- tcg_out_bswap32(s, TCG_TMP3, MIPS_BE ? lo : hi, 0);
- tcg_out_opc_imm(s, OPC_SW, TCG_TMP3, base, 0);
- tcg_out_bswap32(s, TCG_TMP3, MIPS_BE ? hi : lo, 0);
- tcg_out_opc_imm(s, OPC_SW, TCG_TMP3, base, 4);
- }
- break;
case MO_64:
if (TCG_TARGET_REG_BITS == 64) {
tcg_out_opc_imm(s, OPC_SD, lo, base, 0);
@@ -1679,7 +1524,6 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg lo, TCGReg hi,
tcg_out_opc_imm(s, OPC_SW, MIPS_BE ? lo : hi, base, 4);
}
break;
-
default:
g_assert_not_reached();
}
@@ -1693,54 +1537,18 @@ static void tcg_out_qemu_st_unalign(TCGContext *s, TCGReg lo, TCGReg hi,
const MIPSInsn sd1 = MIPS_BE ? OPC_SDL : OPC_SDR;
const MIPSInsn sd2 = MIPS_BE ? OPC_SDR : OPC_SDL;
- /* Don't clutter the code below with checks to avoid bswapping ZERO. */
- if ((lo | hi) == 0) {
- opc &= ~MO_BSWAP;
- }
-
- switch (opc & (MO_SIZE | MO_BSWAP)) {
- case MO_16 | MO_BE:
+ switch (opc & MO_SIZE) {
+ case MO_16:
tcg_out_opc_sa(s, OPC_SRL, TCG_TMP0, lo, 8);
- tcg_out_opc_imm(s, OPC_SB, TCG_TMP0, base, 0);
- tcg_out_opc_imm(s, OPC_SB, lo, base, 1);
+ tcg_out_opc_imm(s, OPC_SB, HOST_BIG_ENDIAN ? TCG_TMP0 : lo, base, 0);
+ tcg_out_opc_imm(s, OPC_SB, HOST_BIG_ENDIAN ? lo : TCG_TMP0, base, 1);
break;
- case MO_16 | MO_LE:
- tcg_out_opc_sa(s, OPC_SRL, TCG_TMP0, lo, 8);
- tcg_out_opc_imm(s, OPC_SB, lo, base, 0);
- tcg_out_opc_imm(s, OPC_SB, TCG_TMP0, base, 1);
- break;
-
- case MO_32 | MO_BSWAP:
- tcg_out_bswap32(s, TCG_TMP3, lo, 0);
- lo = TCG_TMP3;
- /* fall through */
case MO_32:
tcg_out_opc_imm(s, sw1, lo, base, 0);
tcg_out_opc_imm(s, sw2, lo, base, 3);
break;
- case MO_64 | MO_BSWAP:
- if (TCG_TARGET_REG_BITS == 64) {
- tcg_out_bswap64(s, TCG_TMP3, lo);
- lo = TCG_TMP3;
- } else if (use_mips32r2_instructions) {
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP0, 0, MIPS_BE ? hi : lo);
- tcg_out_opc_reg(s, OPC_WSBH, TCG_TMP1, 0, MIPS_BE ? lo : hi);
- tcg_out_opc_sa(s, OPC_ROTR, TCG_TMP0, TCG_TMP0, 16);
- tcg_out_opc_sa(s, OPC_ROTR, TCG_TMP1, TCG_TMP1, 16);
- hi = MIPS_BE ? TCG_TMP0 : TCG_TMP1;
- lo = MIPS_BE ? TCG_TMP1 : TCG_TMP0;
- } else {
- tcg_out_bswap32(s, TCG_TMP3, MIPS_BE ? lo : hi, 0);
- tcg_out_opc_imm(s, sw1, TCG_TMP3, base, 0 + 0);
- tcg_out_opc_imm(s, sw2, TCG_TMP3, base, 0 + 3);
- tcg_out_bswap32(s, TCG_TMP3, MIPS_BE ? hi : lo, 0);
- tcg_out_opc_imm(s, sw1, TCG_TMP3, base, 4 + 0);
- tcg_out_opc_imm(s, sw2, TCG_TMP3, base, 4 + 3);
- break;
- }
- /* fall through */
case MO_64:
if (TCG_TARGET_REG_BITS == 64) {
tcg_out_opc_imm(s, sd1, lo, base, 0);
--
2.34.1
* [PATCH v4 46/54] tcg/mips: Reorg tlb load within prepare_host_addr
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (44 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 45/54] tcg/mips: Remove MO_BSWAP handling Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 47/54] tcg/mips: Simplify constraints on qemu_ld/st Richard Henderson
` (7 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Compare the address vs the tlb entry with sign-extended values.
This simplifies the page+alignment mask constant, and the
generation of the last byte address for the misaligned test.
Move the tlb addend load up, and the zero-extension down.
This frees up a register, which allows us to use TMP3 as the returned base
address register instead of A0, which we were using as a 5th temporary.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/mips/tcg-target.c.inc | 38 ++++++++++++++++++--------------------
1 file changed, 18 insertions(+), 20 deletions(-)
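A standalone demonstration of both points, the simpler mask constant and the
last-byte page check (page size and values assumed; toy code, not TCG output):
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#define TARGET_PAGE_BITS 12
#define TARGET_PAGE_MASK (-(INT64_C(1) << TARGET_PAGE_BITS))
int main(void)
{
    int s_mask = 4 - 1;        /* 4-byte access */
    int a_mask = 1 - 1;        /* no alignment required */
    /* Sign-extended, the combined constant is one small negative
     * number, identical for 32-bit and 64-bit guests. */
    int64_t mask = TARGET_PAGE_MASK | a_mask;
    /* A sign-extended 32-bit guest address whose access straddles a page. */
    int64_t addr = (int32_t)0x80000ffe;
    int64_t last = addr + (s_mask - a_mask);   /* address of the last byte */
    printf("mask       = 0x%016" PRIx64 "\n", (uint64_t)mask);
    printf("page(addr) = 0x%016" PRIx64 "\n", (uint64_t)(addr & mask));
    printf("page(last) = 0x%016" PRIx64 "\n", (uint64_t)(last & mask));
    /* page(last) differs from page(addr), so the comparison against
     * the tlb entry fails and the access takes the slow path. */
    return 0;
}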
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index 31d58e1977..695c137023 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -370,6 +370,8 @@ typedef enum {
ALIAS_PADDI = sizeof(void *) == 4 ? OPC_ADDIU : OPC_DADDIU,
ALIAS_TSRL = TARGET_LONG_BITS == 32 || TCG_TARGET_REG_BITS == 32
? OPC_SRL : OPC_DSRL,
+ ALIAS_TADDI = TARGET_LONG_BITS == 32 || TCG_TARGET_REG_BITS == 32
+ ? OPC_ADDIU : OPC_DADDIU,
} MIPSInsn;
/*
@@ -1263,14 +1265,12 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
int add_off = offsetof(CPUTLBEntry, addend);
int cmp_off = is_ld ? offsetof(CPUTLBEntry, addr_read)
: offsetof(CPUTLBEntry, addr_write);
- target_ulong tlb_mask;
ldst = new_ldst_label(s);
ldst->is_ld = is_ld;
ldst->oi = oi;
ldst->addrlo_reg = addrlo;
ldst->addrhi_reg = addrhi;
- base = TCG_REG_A0;
/* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
@@ -1290,15 +1290,12 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + LO_OFF);
} else {
- tcg_out_ldst(s, (TARGET_LONG_BITS == 64 ? OPC_LD
- : TCG_TARGET_REG_BITS == 64 ? OPC_LWU : OPC_LW),
- TCG_TMP0, TCG_TMP3, cmp_off);
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_TMP0, TCG_TMP3, cmp_off);
}
- /* Zero extend a 32-bit guest address for a 64-bit host. */
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
- tcg_out_ext32u(s, base, addrlo);
- addrlo = base;
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
+ /* Load the tlb addend for the fast path. */
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP3, TCG_TMP3, add_off);
}
/*
@@ -1306,18 +1303,18 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
* For unaligned accesses, compare against the end of the access to
* verify that it does not cross a page boundary.
*/
- tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
- tcg_out_movi(s, TCG_TYPE_I32, TCG_TMP1, tlb_mask);
- if (a_mask >= s_mask) {
- tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrlo);
- } else {
- tcg_out_opc_imm(s, ALIAS_PADDI, TCG_TMP2, addrlo, s_mask - a_mask);
+ tcg_out_movi(s, TCG_TYPE_TL, TCG_TMP1, TARGET_PAGE_MASK | a_mask);
+ if (a_mask < s_mask) {
+ tcg_out_opc_imm(s, ALIAS_TADDI, TCG_TMP2, addrlo, s_mask - a_mask);
tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, TCG_TMP2);
+ } else {
+ tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrlo);
}
- if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
- /* Load the tlb addend for the fast path. */
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
+ /* Zero extend a 32-bit guest address for a 64-bit host. */
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
+ tcg_out_ext32u(s, TCG_TMP2, addrlo);
+ addrlo = TCG_TMP2;
}
ldst->label_ptr[0] = s->code_ptr;
@@ -1329,14 +1326,15 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + HI_OFF);
/* Load the tlb addend for the fast path. */
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP2, TCG_TMP3, add_off);
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP3, TCG_TMP3, add_off);
ldst->label_ptr[1] = s->code_ptr;
tcg_out_opc_br(s, OPC_BNE, addrhi, TCG_TMP0);
}
/* delay slot */
- tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_TMP2, addrlo);
+ base = TCG_TMP3;
+ tcg_out_opc_reg(s, ALIAS_PADD, base, TCG_TMP3, addrlo);
#else
if (a_mask && (use_mips32r6_instructions || a_bits != s_bits)) {
ldst = new_ldst_label(s);
--
2.34.1
* [PATCH v4 47/54] tcg/mips: Simplify constraints on qemu_ld/st
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (45 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 46/54] tcg/mips: Reorg tlb load within prepare_host_addr Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 48/54] tcg/ppc: Reorg tcg_out_tlb_read Richard Henderson
` (6 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
The softmmu tlb uses TCG_REG_TMP[0-3], not any of the normally available
registers. Now that we handle overlap between inputs and helper arguments,
and have eliminated use of A0, we can allow any allocatable reg.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/mips/tcg-target-con-set.h | 13 +++++--------
tcg/mips/tcg-target-con-str.h | 2 --
tcg/mips/tcg-target.c.inc | 30 ++++++++----------------------
3 files changed, 13 insertions(+), 32 deletions(-)
diff --git a/tcg/mips/tcg-target-con-set.h b/tcg/mips/tcg-target-con-set.h
index fe3e868a2f..864034f468 100644
--- a/tcg/mips/tcg-target-con-set.h
+++ b/tcg/mips/tcg-target-con-set.h
@@ -12,15 +12,13 @@
C_O0_I1(r)
C_O0_I2(rZ, r)
C_O0_I2(rZ, rZ)
-C_O0_I2(SZ, S)
-C_O0_I3(SZ, S, S)
-C_O0_I3(SZ, SZ, S)
+C_O0_I3(rZ, r, r)
+C_O0_I3(rZ, rZ, r)
C_O0_I4(rZ, rZ, rZ, rZ)
-C_O0_I4(SZ, SZ, S, S)
-C_O1_I1(r, L)
+C_O0_I4(rZ, rZ, r, r)
C_O1_I1(r, r)
C_O1_I2(r, 0, rZ)
-C_O1_I2(r, L, L)
+C_O1_I2(r, r, r)
C_O1_I2(r, r, ri)
C_O1_I2(r, r, rI)
C_O1_I2(r, r, rIK)
@@ -30,7 +28,6 @@ C_O1_I2(r, rZ, rN)
C_O1_I2(r, rZ, rZ)
C_O1_I4(r, rZ, rZ, rZ, 0)
C_O1_I4(r, rZ, rZ, rZ, rZ)
-C_O2_I1(r, r, L)
-C_O2_I2(r, r, L, L)
+C_O2_I1(r, r, r)
C_O2_I2(r, r, r, r)
C_O2_I4(r, r, rZ, rZ, rN, rN)
diff --git a/tcg/mips/tcg-target-con-str.h b/tcg/mips/tcg-target-con-str.h
index e4b2965c72..413c280a7a 100644
--- a/tcg/mips/tcg-target-con-str.h
+++ b/tcg/mips/tcg-target-con-str.h
@@ -9,8 +9,6 @@
* REGS(letter, register_mask)
*/
REGS('r', ALL_GENERAL_REGS)
-REGS('L', ALL_QLOAD_REGS)
-REGS('S', ALL_QSTORE_REGS)
/*
* Define constraint letters for constants:
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index 695c137023..5ad9867882 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -176,20 +176,6 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
#define TCG_CT_CONST_WSZ 0x2000 /* word size */
#define ALL_GENERAL_REGS 0xffffffffu
-#define NOA0_REGS (ALL_GENERAL_REGS & ~(1 << TCG_REG_A0))
-
-#ifdef CONFIG_SOFTMMU
-#define ALL_QLOAD_REGS \
- (NOA0_REGS & ~((TCG_TARGET_REG_BITS < TARGET_LONG_BITS) << TCG_REG_A2))
-#define ALL_QSTORE_REGS \
- (NOA0_REGS & ~(TCG_TARGET_REG_BITS < TARGET_LONG_BITS \
- ? (1 << TCG_REG_A2) | (1 << TCG_REG_A3) \
- : (1 << TCG_REG_A1)))
-#else
-#define ALL_QLOAD_REGS NOA0_REGS
-#define ALL_QSTORE_REGS NOA0_REGS
-#endif
-
static bool is_p2m1(tcg_target_long val)
{
@@ -2232,18 +2218,18 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_qemu_ld_i32:
return (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 32
- ? C_O1_I1(r, L) : C_O1_I2(r, L, L));
+ ? C_O1_I1(r, r) : C_O1_I2(r, r, r));
case INDEX_op_qemu_st_i32:
return (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 32
- ? C_O0_I2(SZ, S) : C_O0_I3(SZ, S, S));
+ ? C_O0_I2(rZ, r) : C_O0_I3(rZ, r, r));
case INDEX_op_qemu_ld_i64:
- return (TCG_TARGET_REG_BITS == 64 ? C_O1_I1(r, L)
- : TARGET_LONG_BITS == 32 ? C_O2_I1(r, r, L)
- : C_O2_I2(r, r, L, L));
+ return (TCG_TARGET_REG_BITS == 64 ? C_O1_I1(r, r)
+ : TARGET_LONG_BITS == 32 ? C_O2_I1(r, r, r)
+ : C_O2_I2(r, r, r, r));
case INDEX_op_qemu_st_i64:
- return (TCG_TARGET_REG_BITS == 64 ? C_O0_I2(SZ, S)
- : TARGET_LONG_BITS == 32 ? C_O0_I3(SZ, SZ, S)
- : C_O0_I4(SZ, SZ, S, S));
+ return (TCG_TARGET_REG_BITS == 64 ? C_O0_I2(rZ, r)
+ : TARGET_LONG_BITS == 32 ? C_O0_I3(rZ, rZ, r)
+ : C_O0_I4(rZ, rZ, r, r));
default:
g_assert_not_reached();
--
2.34.1
* [PATCH v4 48/54] tcg/ppc: Reorg tcg_out_tlb_read
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (46 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 47/54] tcg/mips: Simplify constraints on qemu_ld/st Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 49/54] tcg/ppc: Adjust constraints on qemu_ld/st Richard Henderson
` (5 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel
Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x,
Daniel Henrique Barboza
Allocate TCG_REG_TMP2. Use R0, TMP1, TMP2 instead of any of
the normally allocated registers for the tlb load.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
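A note on the split comparator loads below, for the 32-bit-host,
64-bit-guest case: which half of the 64-bit comparator sits at cmp_off
versus cmp_off + 4 depends on host endianness. A sketch, assuming the
comparator is stored as a single 64-bit target_ulong:
/*
 * low 32 bits:  cmp_off + 4 * HOST_BIG_ENDIAN
 * high 32 bits: cmp_off + 4 * !HOST_BIG_ENDIAN
 *
 * On a big-endian host (HOST_BIG_ENDIAN == 1) the most significant
 * word comes first in memory, so the low half lives 4 bytes in;
 * on a little-endian host the two offsets swap.
 */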
tcg/ppc/tcg-target.c.inc | 84 ++++++++++++++++++++++++----------------
1 file changed, 51 insertions(+), 33 deletions(-)
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 042136fee7..6850ecbc80 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -68,6 +68,7 @@
#else
# define TCG_REG_TMP1 TCG_REG_R12
#endif
+#define TCG_REG_TMP2 TCG_REG_R11
#define TCG_VEC_TMP1 TCG_REG_V0
#define TCG_VEC_TMP2 TCG_REG_V1
@@ -2015,13 +2016,11 @@ static TCGReg ldst_ra_gen(TCGContext *s, const TCGLabelQemuLdst *l, int arg)
/*
* For the purposes of ppc32 sorting 4 input registers into 4 argument
* registers, there is an outside chance we would require 3 temps.
- * Because of constraints, no inputs are in r3, and env will not be
- * placed into r3 until after the sorting is done, and is thus free.
*/
static const TCGLdstHelperParam ldst_helper_param = {
.ra_gen = ldst_ra_gen,
.ntmp = 3,
- .tmp = { TCG_REG_TMP1, TCG_REG_R0, TCG_REG_R3 }
+ .tmp = { TCG_REG_TMP1, TCG_REG_TMP2, TCG_REG_R0 }
};
static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
@@ -2135,41 +2134,44 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
/* Load tlb_mask[mmu_idx] and tlb_table[mmu_idx]. */
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -32768);
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R3, TCG_AREG0, mask_off);
- tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_R4, TCG_AREG0, table_off);
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_AREG0, mask_off);
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_AREG0, table_off);
/* Extract the page index, shifted into place for tlb index. */
if (TCG_TARGET_REG_BITS == 32) {
- tcg_out_shri32(s, TCG_REG_TMP1, addrlo,
+ tcg_out_shri32(s, TCG_REG_R0, addrlo,
TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
} else {
- tcg_out_shri64(s, TCG_REG_TMP1, addrlo,
+ tcg_out_shri64(s, TCG_REG_R0, addrlo,
TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
}
- tcg_out32(s, AND | SAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_TMP1));
+ tcg_out32(s, AND | SAB(TCG_REG_TMP1, TCG_REG_TMP1, TCG_REG_R0));
- /* Load the TLB comparator. */
+ /* Load the (low part) TLB comparator into TMP2. */
if (cmp_off == 0 && TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
uint32_t lxu = (TCG_TARGET_REG_BITS == 32 || TARGET_LONG_BITS == 32
? LWZUX : LDUX);
- tcg_out32(s, lxu | TAB(TCG_REG_TMP1, TCG_REG_R3, TCG_REG_R4));
+ tcg_out32(s, lxu | TAB(TCG_REG_TMP2, TCG_REG_TMP1, TCG_REG_TMP2));
} else {
- tcg_out32(s, ADD | TAB(TCG_REG_R3, TCG_REG_R3, TCG_REG_R4));
+ tcg_out32(s, ADD | TAB(TCG_REG_TMP1, TCG_REG_TMP1, TCG_REG_TMP2));
if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
- tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP1, TCG_REG_R3, cmp_off + 4);
- tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_R4, TCG_REG_R3, cmp_off);
+ tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP2,
+ TCG_REG_TMP1, cmp_off + 4 * HOST_BIG_ENDIAN);
} else {
- tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP1, TCG_REG_R3, cmp_off);
+ tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP2, TCG_REG_TMP1, cmp_off);
}
}
- /* Load the TLB addend for use on the fast path. Do this asap
- to minimize any load use delay. */
- h->base = TCG_REG_R3;
- tcg_out_ld(s, TCG_TYPE_PTR, h->base, TCG_REG_R3,
- offsetof(CPUTLBEntry, addend));
+ /*
+ * Load the TLB addend for use on the fast path.
+ * Do this asap to minimize any load use delay.
+ */
+ if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_REG_TMP1,
+ offsetof(CPUTLBEntry, addend));
+ }
- /* Clear the non-page, non-alignment bits from the address */
+ /* Clear the non-page, non-alignment bits from the address in R0. */
if (TCG_TARGET_REG_BITS == 32) {
/* We don't support unaligned accesses on 32-bits.
* Preserve the bottom bits and thus trigger a comparison
@@ -2200,9 +2202,6 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
if (TARGET_LONG_BITS == 32) {
tcg_out_rlw(s, RLWINM, TCG_REG_R0, t, 0,
(32 - a_bits) & 31, 31 - TARGET_PAGE_BITS);
- /* Zero-extend the address for use in the final address. */
- tcg_out_ext32u(s, TCG_REG_R4, addrlo);
- addrlo = TCG_REG_R4;
} else if (a_bits == 0) {
tcg_out_rld(s, RLDICR, TCG_REG_R0, t, 0, 63 - TARGET_PAGE_BITS);
} else {
@@ -2211,21 +2210,36 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
tcg_out_rld(s, RLDICL, TCG_REG_R0, TCG_REG_R0, TARGET_PAGE_BITS, 0);
}
}
- h->index = addrlo;
if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
- tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
+ /* Low part comparison into cr7. */
+ tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP2,
0, 7, TCG_TYPE_I32);
- tcg_out_cmp(s, TCG_COND_EQ, addrhi, TCG_REG_R4, 0, 6, TCG_TYPE_I32);
+
+ /* Load the high part TLB comparator into TMP2. */
+ tcg_out_ld(s, TCG_TYPE_I32, TCG_REG_TMP2, TCG_REG_TMP1,
+ cmp_off + 4 * !HOST_BIG_ENDIAN);
+
+ /* Load addend, deferred for this case. */
+ tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_REG_TMP1,
+ offsetof(CPUTLBEntry, addend));
+
+ /* High part comparison into cr6. */
+ tcg_out_cmp(s, TCG_COND_EQ, addrhi, TCG_REG_TMP2, 0, 6, TCG_TYPE_I32);
+
+ /* Combine comparisons into cr7. */
tcg_out32(s, CRAND | BT(7, CR_EQ) | BA(6, CR_EQ) | BB(7, CR_EQ));
} else {
- tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP1,
+ /* Full comparison into cr7. */
+ tcg_out_cmp(s, TCG_COND_EQ, TCG_REG_R0, TCG_REG_TMP2,
0, 7, TCG_TYPE_TL);
}
/* Load a pointer into the current opcode w/conditional branch-link. */
ldst->label_ptr[0] = s->code_ptr;
tcg_out32(s, BC | BI(7, CR_EQ) | BO_COND_FALSE | LK);
+
+ h->base = TCG_REG_TMP1;
#else
if (a_bits) {
ldst = new_ldst_label(s);
@@ -2243,13 +2257,16 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
}
h->base = guest_base ? TCG_GUEST_BASE_REG : 0;
- h->index = addrlo;
- if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
- tcg_out_ext32u(s, TCG_REG_TMP1, addrlo);
- h->index = TCG_REG_TMP1;
- }
#endif
+ if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
+ /* Zero-extend the guest address for use in the host address. */
+ tcg_out_ext32u(s, TCG_REG_R0, addrlo);
+ h->index = TCG_REG_R0;
+ } else {
+ h->index = addrlo;
+ }
+
return ldst;
}
@@ -3901,7 +3918,8 @@ static void tcg_target_init(TCGContext *s)
#if defined(_CALL_SYSV) || TCG_TARGET_REG_BITS == 64
tcg_regset_set_reg(s->reserved_regs, TCG_REG_R13); /* thread pointer */
#endif
- tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP1); /* mem temp */
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP1);
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP2);
tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP1);
tcg_regset_set_reg(s->reserved_regs, TCG_VEC_TMP2);
if (USE_REG_TB) {
--
2.34.1
* [PATCH v4 49/54] tcg/ppc: Adjust constraints on qemu_ld/st
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (47 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 48/54] tcg/ppc: Reorg tcg_out_tlb_read Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 50/54] tcg/ppc: Remove unused constraints A, B, C, D Richard Henderson
` (4 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel
Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x,
Daniel Henrique Barboza
The softmmu tlb uses TCG_REG_{TMP1,TMP2,R0}, not any of the normally
available registers. Now that we handle overlap between inputs and
helper arguments, we can allow any allocatable reg.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
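The shape of the ternary chains below follows from how many host
registers each value occupies. A sketch of the qemu_ld_i64 cases,
assuming one 'r' per 32-bit half on a 32-bit host:
/*
 * 64-bit host:                C_O1_I1(r, r)        1 out, 1 addr reg
 * 32-bit host, 32-bit guest:  C_O2_I1(r, r, r)     2 outs, 1 addr reg
 * 32-bit host, 64-bit guest:  C_O2_I2(r, r, r, r)  2 outs, 2 addr regs
 */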
tcg/ppc/tcg-target-con-set.h | 11 ++++-------
tcg/ppc/tcg-target-con-str.h | 2 --
tcg/ppc/tcg-target.c.inc | 32 ++++++++++----------------------
3 files changed, 14 insertions(+), 31 deletions(-)
diff --git a/tcg/ppc/tcg-target-con-set.h b/tcg/ppc/tcg-target-con-set.h
index a1a345883d..f206b29205 100644
--- a/tcg/ppc/tcg-target-con-set.h
+++ b/tcg/ppc/tcg-target-con-set.h
@@ -12,18 +12,15 @@
C_O0_I1(r)
C_O0_I2(r, r)
C_O0_I2(r, ri)
-C_O0_I2(S, S)
C_O0_I2(v, r)
-C_O0_I3(S, S, S)
+C_O0_I3(r, r, r)
C_O0_I4(r, r, ri, ri)
-C_O0_I4(S, S, S, S)
-C_O1_I1(r, L)
+C_O0_I4(r, r, r, r)
C_O1_I1(r, r)
C_O1_I1(v, r)
C_O1_I1(v, v)
C_O1_I1(v, vr)
C_O1_I2(r, 0, rZ)
-C_O1_I2(r, L, L)
C_O1_I2(r, rI, ri)
C_O1_I2(r, rI, rT)
C_O1_I2(r, r, r)
@@ -36,7 +33,7 @@ C_O1_I2(v, v, v)
C_O1_I3(v, v, v, v)
C_O1_I4(r, r, ri, rZ, rZ)
C_O1_I4(r, r, r, ri, ri)
-C_O2_I1(L, L, L)
-C_O2_I2(L, L, L, L)
+C_O2_I1(r, r, r)
+C_O2_I2(r, r, r, r)
C_O2_I4(r, r, rI, rZM, r, r)
C_O2_I4(r, r, r, r, rI, rZM)
diff --git a/tcg/ppc/tcg-target-con-str.h b/tcg/ppc/tcg-target-con-str.h
index 298ca20d5b..f3bf030bc3 100644
--- a/tcg/ppc/tcg-target-con-str.h
+++ b/tcg/ppc/tcg-target-con-str.h
@@ -14,8 +14,6 @@ REGS('A', 1u << TCG_REG_R3)
REGS('B', 1u << TCG_REG_R4)
REGS('C', 1u << TCG_REG_R5)
REGS('D', 1u << TCG_REG_R6)
-REGS('L', ALL_QLOAD_REGS)
-REGS('S', ALL_QSTORE_REGS)
/*
* Define constraint letters for constants:
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 6850ecbc80..5a4ec0470a 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -93,18 +93,6 @@
#define ALL_GENERAL_REGS 0xffffffffu
#define ALL_VECTOR_REGS 0xffffffff00000000ull
-#ifdef CONFIG_SOFTMMU
-#define ALL_QLOAD_REGS \
- (ALL_GENERAL_REGS & \
- ~((1 << TCG_REG_R3) | (1 << TCG_REG_R4) | (1 << TCG_REG_R5)))
-#define ALL_QSTORE_REGS \
- (ALL_GENERAL_REGS & ~((1 << TCG_REG_R3) | (1 << TCG_REG_R4) | \
- (1 << TCG_REG_R5) | (1 << TCG_REG_R6)))
-#else
-#define ALL_QLOAD_REGS (ALL_GENERAL_REGS & ~(1 << TCG_REG_R3))
-#define ALL_QSTORE_REGS ALL_QLOAD_REGS
-#endif
-
TCGPowerISA have_isa;
static bool have_isel;
bool have_altivec;
@@ -3752,23 +3740,23 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_qemu_ld_i32:
return (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 32
- ? C_O1_I1(r, L)
- : C_O1_I2(r, L, L));
+ ? C_O1_I1(r, r)
+ : C_O1_I2(r, r, r));
case INDEX_op_qemu_st_i32:
return (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 32
- ? C_O0_I2(S, S)
- : C_O0_I3(S, S, S));
+ ? C_O0_I2(r, r)
+ : C_O0_I3(r, r, r));
case INDEX_op_qemu_ld_i64:
- return (TCG_TARGET_REG_BITS == 64 ? C_O1_I1(r, L)
- : TARGET_LONG_BITS == 32 ? C_O2_I1(L, L, L)
- : C_O2_I2(L, L, L, L));
+ return (TCG_TARGET_REG_BITS == 64 ? C_O1_I1(r, r)
+ : TARGET_LONG_BITS == 32 ? C_O2_I1(r, r, r)
+ : C_O2_I2(r, r, r, r));
case INDEX_op_qemu_st_i64:
- return (TCG_TARGET_REG_BITS == 64 ? C_O0_I2(S, S)
- : TARGET_LONG_BITS == 32 ? C_O0_I3(S, S, S)
- : C_O0_I4(S, S, S, S));
+ return (TCG_TARGET_REG_BITS == 64 ? C_O0_I2(r, r)
+ : TARGET_LONG_BITS == 32 ? C_O0_I3(r, r, r)
+ : C_O0_I4(r, r, r, r));
case INDEX_op_add_vec:
case INDEX_op_sub_vec:
--
2.34.1
* [PATCH v4 50/54] tcg/ppc: Remove unused constraints A, B, C, D
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (48 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 49/54] tcg/ppc: Adjust constraints on qemu_ld/st Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 51/54] tcg/ppc: Remove unused constraint J Richard Henderson
` (3 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel
Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x,
Daniel Henrique Barboza
These constraints have not been used for quite some time.
Fixes: 77b73de67632 ("Use rem/div[u]_i32 drop div[u]2_i32")
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/ppc/tcg-target-con-str.h | 4 ----
1 file changed, 4 deletions(-)
diff --git a/tcg/ppc/tcg-target-con-str.h b/tcg/ppc/tcg-target-con-str.h
index f3bf030bc3..9dcbc3df50 100644
--- a/tcg/ppc/tcg-target-con-str.h
+++ b/tcg/ppc/tcg-target-con-str.h
@@ -10,10 +10,6 @@
*/
REGS('r', ALL_GENERAL_REGS)
REGS('v', ALL_VECTOR_REGS)
-REGS('A', 1u << TCG_REG_R3)
-REGS('B', 1u << TCG_REG_R4)
-REGS('C', 1u << TCG_REG_R5)
-REGS('D', 1u << TCG_REG_R6)
/*
* Define constraint letters for constants:
--
2.34.1
* [PATCH v4 51/54] tcg/ppc: Remove unused constraint J
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (49 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 50/54] tcg/ppc: Remove unused constraints A, B, C, D Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 52/54] tcg/riscv: Simplify constraints on qemu_ld/st Richard Henderson
` (2 subsequent siblings)
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Never used since its introduction.
Fixes: 3d582c6179c ("tcg-ppc64: Rearrange integer constant constraints")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
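For reference, the cast-and-compare idiom in tcg_target_const_match
tests "fits in N bits" by narrowing and widening again; the deleted
U16 check was the unsigned analogue of the surviving S16 one. A
worked example in plain C, not from the patch:
/*
 * 0x7fff == (int16_t)0x7fff   -> true   (fits in signed 16 bits)
 * 0x8000 == (int16_t)0x8000   -> false  ((int16_t)0x8000 is -32768)
 * 0xffff == (uint16_t)0xffff  -> true   (would have matched 'J')
 */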
tcg/ppc/tcg-target-con-str.h | 1 -
tcg/ppc/tcg-target.c.inc | 3 ---
2 files changed, 4 deletions(-)
diff --git a/tcg/ppc/tcg-target-con-str.h b/tcg/ppc/tcg-target-con-str.h
index 9dcbc3df50..094613cbcb 100644
--- a/tcg/ppc/tcg-target-con-str.h
+++ b/tcg/ppc/tcg-target-con-str.h
@@ -16,7 +16,6 @@ REGS('v', ALL_VECTOR_REGS)
* CONST(letter, TCG_CT_CONST_* bit set)
*/
CONST('I', TCG_CT_CONST_S16)
-CONST('J', TCG_CT_CONST_U16)
CONST('M', TCG_CT_CONST_MONE)
CONST('T', TCG_CT_CONST_S32)
CONST('U', TCG_CT_CONST_U32)
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 5a4ec0470a..0a14c3e997 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -83,7 +83,6 @@
#define SZR (TCG_TARGET_REG_BITS / 8)
#define TCG_CT_CONST_S16 0x100
-#define TCG_CT_CONST_U16 0x200
#define TCG_CT_CONST_S32 0x400
#define TCG_CT_CONST_U32 0x800
#define TCG_CT_CONST_ZERO 0x1000
@@ -270,8 +269,6 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
if ((ct & TCG_CT_CONST_S16) && val == (int16_t)val) {
return 1;
- } else if ((ct & TCG_CT_CONST_U16) && val == (uint16_t)val) {
- return 1;
} else if ((ct & TCG_CT_CONST_S32) && val == (int32_t)val) {
return 1;
} else if ((ct & TCG_CT_CONST_U32) && val == (uint32_t)val) {
--
2.34.1
* [PATCH v4 52/54] tcg/riscv: Simplify constraints on qemu_ld/st
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (50 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 51/54] tcg/ppc: Remove unused constraint J Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 53/54] tcg/s390x: Use ALGFR in constructing softmmu host address Richard Henderson
2023-05-03 6:57 ` [PATCH v4 54/54] tcg/s390x: Simplify constraints on qemu_ld/st Richard Henderson
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel
Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x,
Daniel Henrique Barboza
The softmmu tlb uses TCG_REG_TMP[0-2], not any of the normally available
registers. Now that we handle overlap between inputs and helper arguments,
we can allow any allocatable reg.
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
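For context on what the deleted mask reserved: MAKE_64BIT_MASK(shift,
len) builds len consecutive one-bits starting at bit shift. A sketch,
assuming the usual RISC-V encoding TCG_REG_A0 == 10:
/*
 * SOFTMMU_RESERVE_REGS == MAKE_64BIT_MASK(10, 5)
 *                      == ((~0ULL >> (64 - 5)) << 10)
 *                      == 0x7c00
 * i.e. a0-a4, the first five integer argument registers, were
 * carved out of the old 'L' class.
 */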
tcg/riscv/tcg-target-con-set.h | 2 --
tcg/riscv/tcg-target-con-str.h | 1 -
tcg/riscv/tcg-target.c.inc | 16 +++-------------
3 files changed, 3 insertions(+), 16 deletions(-)
diff --git a/tcg/riscv/tcg-target-con-set.h b/tcg/riscv/tcg-target-con-set.h
index d4cff673b0..d88888d3ac 100644
--- a/tcg/riscv/tcg-target-con-set.h
+++ b/tcg/riscv/tcg-target-con-set.h
@@ -10,10 +10,8 @@
* tcg-target-con-str.h; the constraint combination is inclusive or.
*/
C_O0_I1(r)
-C_O0_I2(LZ, L)
C_O0_I2(rZ, r)
C_O0_I2(rZ, rZ)
-C_O1_I1(r, L)
C_O1_I1(r, r)
C_O1_I2(r, r, ri)
C_O1_I2(r, r, rI)
diff --git a/tcg/riscv/tcg-target-con-str.h b/tcg/riscv/tcg-target-con-str.h
index 8d8afaee53..6f1cfb976c 100644
--- a/tcg/riscv/tcg-target-con-str.h
+++ b/tcg/riscv/tcg-target-con-str.h
@@ -9,7 +9,6 @@
* REGS(letter, register_mask)
*/
REGS('r', ALL_GENERAL_REGS)
-REGS('L', ALL_GENERAL_REGS & ~SOFTMMU_RESERVE_REGS)
/*
* Define constraint letters for constants:
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index c22d1e35ac..d12b824d8c 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -125,17 +125,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
#define TCG_CT_CONST_N12 0x400
#define TCG_CT_CONST_M12 0x800
-#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
-/*
- * For softmmu, we need to avoid conflicts with the first 5
- * argument registers to call the helper. Some of these are
- * also used for the tlb lookup.
- */
-#ifdef CONFIG_SOFTMMU
-#define SOFTMMU_RESERVE_REGS MAKE_64BIT_MASK(TCG_REG_A0, 5)
-#else
-#define SOFTMMU_RESERVE_REGS 0
-#endif
+#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 32)
#define sextreg sextract64
@@ -1600,10 +1590,10 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_qemu_ld_i32:
case INDEX_op_qemu_ld_i64:
- return C_O1_I1(r, L);
+ return C_O1_I1(r, r);
case INDEX_op_qemu_st_i32:
case INDEX_op_qemu_st_i64:
- return C_O0_I2(LZ, L);
+ return C_O0_I2(rZ, r);
default:
g_assert_not_reached();
--
2.34.1
* [PATCH v4 53/54] tcg/s390x: Use ALGFR in constructing softmmu host address
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (51 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 52/54] tcg/riscv: Simplify constraints on qemu_ld/st Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
2023-05-03 6:57 ` [PATCH v4 54/54] tcg/s390x: Simplify constraints on qemu_ld/st Richard Henderson
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Rather than zero-extend the guest address into a register,
use an add instruction which zero-extends the second input.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
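The instruction semantics relied on, sketched in C: ALGFR is ADD
LOGICAL with a 32-to-64-bit zero extension of the second operand.
/*
 * ALGFR r1, r2:
 *     r1 = r1 + (uint64_t)(uint32_t)r2;
 *
 * With r1 holding the tlb addend and r2 the 32-bit guest address,
 * the zero extension and the add collapse into one instruction,
 * leaving h->base free to be TCG_REG_NONE.
 */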
tcg/s390x/tcg-target.c.inc | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index dfcf4d9e34..dd13326670 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -149,6 +149,7 @@ typedef enum S390Opcode {
RRE_ALGR = 0xb90a,
RRE_ALCR = 0xb998,
RRE_ALCGR = 0xb988,
+ RRE_ALGFR = 0xb91a,
RRE_CGR = 0xb920,
RRE_CLGR = 0xb921,
RRE_DLGR = 0xb987,
@@ -1853,10 +1854,11 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
tcg_out_insn(s, RXY, LG, h->index, TCG_REG_R2, TCG_REG_NONE,
offsetof(CPUTLBEntry, addend));
- h->base = addr_reg;
if (TARGET_LONG_BITS == 32) {
- tcg_out_ext32u(s, TCG_REG_R3, addr_reg);
- h->base = TCG_REG_R3;
+ tcg_out_insn(s, RRE, ALGFR, h->index, addr_reg);
+ h->base = TCG_REG_NONE;
+ } else {
+ h->base = addr_reg;
}
h->disp = 0;
#else
--
2.34.1
* [PATCH v4 54/54] tcg/s390x: Simplify constraints on qemu_ld/st
2023-05-03 6:56 [PATCH v4 00/54] tcg: Simplify calls to load/store helpers Richard Henderson
` (52 preceding siblings ...)
2023-05-03 6:57 ` [PATCH v4 53/54] tcg/s390x: Use ALGFR in constructing softmmu host address Richard Henderson
@ 2023-05-03 6:57 ` Richard Henderson
53 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2023-05-03 6:57 UTC (permalink / raw)
To: qemu-devel; +Cc: git, philmd, qemu-arm, qemu-riscv, qemu-s390x
Adjust the softmmu tlb to use R0+R1, not any of the normally available
registers. Since we handle overlap between inputs and helper arguments,
we can allow any allocatable reg.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
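On the register naming: the diff below writes TCG_TMP0 and TCG_REG_R0;
the "R0+R1" in the commit message follows on the assumption that this
backend defines TCG_TMP0 as TCG_REG_R1. A sketch of the allocation
picture after the patch:
/*
 * tlb index and addend:   TCG_TMP0 (assumed R1)
 * masked page address:    TCG_REG_R0
 * guest address and data: any allocatable 'r' register; the old
 *                         reserved R2-R4 'L' class is gone.
 */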
tcg/s390x/tcg-target-con-set.h | 2 --
tcg/s390x/tcg-target-con-str.h | 1 -
tcg/s390x/tcg-target.c.inc | 36 ++++++++++++----------------------
3 files changed, 12 insertions(+), 27 deletions(-)
diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
index 15f1c55103..ecc079bb6d 100644
--- a/tcg/s390x/tcg-target-con-set.h
+++ b/tcg/s390x/tcg-target-con-set.h
@@ -10,12 +10,10 @@
* tcg-target-con-str.h; the constraint combination is inclusive or.
*/
C_O0_I1(r)
-C_O0_I2(L, L)
C_O0_I2(r, r)
C_O0_I2(r, ri)
C_O0_I2(r, rA)
C_O0_I2(v, r)
-C_O1_I1(r, L)
C_O1_I1(r, r)
C_O1_I1(v, r)
C_O1_I1(v, v)
diff --git a/tcg/s390x/tcg-target-con-str.h b/tcg/s390x/tcg-target-con-str.h
index 6fa64a1ed6..25675b449e 100644
--- a/tcg/s390x/tcg-target-con-str.h
+++ b/tcg/s390x/tcg-target-con-str.h
@@ -9,7 +9,6 @@
* REGS(letter, register_mask)
*/
REGS('r', ALL_GENERAL_REGS)
-REGS('L', ALL_GENERAL_REGS & ~SOFTMMU_RESERVE_REGS)
REGS('v', ALL_VECTOR_REGS)
REGS('o', 0xaaaa) /* odd numbered general regs */
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index dd13326670..aacbaf21d5 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -44,18 +44,6 @@
#define ALL_GENERAL_REGS MAKE_64BIT_MASK(0, 16)
#define ALL_VECTOR_REGS MAKE_64BIT_MASK(32, 32)
-/*
- * For softmmu, we need to avoid conflicts with the first 3
- * argument registers to perform the tlb lookup, and to call
- * the helper function.
- */
-#ifdef CONFIG_SOFTMMU
-#define SOFTMMU_RESERVE_REGS MAKE_64BIT_MASK(TCG_REG_R2, 3)
-#else
-#define SOFTMMU_RESERVE_REGS 0
-#endif
-
-
/* Several places within the instruction set 0 means "no register"
rather than TCG_REG_R0. */
#define TCG_REG_NONE 0
@@ -1814,13 +1802,13 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
ldst->oi = oi;
ldst->addrlo_reg = addr_reg;
- tcg_out_sh64(s, RSY_SRLG, TCG_REG_R2, addr_reg, TCG_REG_NONE,
+ tcg_out_sh64(s, RSY_SRLG, TCG_TMP0, addr_reg, TCG_REG_NONE,
TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 19));
- tcg_out_insn(s, RXY, NG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, mask_off);
- tcg_out_insn(s, RXY, AG, TCG_REG_R2, TCG_AREG0, TCG_REG_NONE, table_off);
+ tcg_out_insn(s, RXY, NG, TCG_TMP0, TCG_AREG0, TCG_REG_NONE, mask_off);
+ tcg_out_insn(s, RXY, AG, TCG_TMP0, TCG_AREG0, TCG_REG_NONE, table_off);
/*
* For aligned accesses, we check the first byte and include the alignment
@@ -1830,10 +1818,10 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
a_off = (a_bits >= s_bits ? 0 : s_mask - a_mask);
tlb_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
if (a_off == 0) {
- tgen_andi_risbg(s, TCG_REG_R3, addr_reg, tlb_mask);
+ tgen_andi_risbg(s, TCG_REG_R0, addr_reg, tlb_mask);
} else {
- tcg_out_insn(s, RX, LA, TCG_REG_R3, addr_reg, TCG_REG_NONE, a_off);
- tgen_andi(s, TCG_TYPE_TL, TCG_REG_R3, tlb_mask);
+ tcg_out_insn(s, RX, LA, TCG_REG_R0, addr_reg, TCG_REG_NONE, a_off);
+ tgen_andi(s, TCG_TYPE_TL, TCG_REG_R0, tlb_mask);
}
if (is_ld) {
@@ -1842,16 +1830,16 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, HostAddress *h,
ofs = offsetof(CPUTLBEntry, addr_write);
}
if (TARGET_LONG_BITS == 32) {
- tcg_out_insn(s, RX, C, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
+ tcg_out_insn(s, RX, C, TCG_REG_R0, TCG_TMP0, TCG_REG_NONE, ofs);
} else {
- tcg_out_insn(s, RXY, CG, TCG_REG_R3, TCG_REG_R2, TCG_REG_NONE, ofs);
+ tcg_out_insn(s, RXY, CG, TCG_REG_R0, TCG_TMP0, TCG_REG_NONE, ofs);
}
tcg_out16(s, RI_BRC | (S390_CC_NE << 4));
ldst->label_ptr[0] = s->code_ptr++;
- h->index = TCG_REG_R2;
- tcg_out_insn(s, RXY, LG, h->index, TCG_REG_R2, TCG_REG_NONE,
+ h->index = TCG_TMP0;
+ tcg_out_insn(s, RXY, LG, h->index, TCG_TMP0, TCG_REG_NONE,
offsetof(CPUTLBEntry, addend));
if (TARGET_LONG_BITS == 32) {
@@ -3155,10 +3143,10 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
case INDEX_op_qemu_ld_i32:
case INDEX_op_qemu_ld_i64:
- return C_O1_I1(r, L);
+ return C_O1_I1(r, r);
case INDEX_op_qemu_st_i64:
case INDEX_op_qemu_st_i32:
- return C_O0_I2(L, L);
+ return C_O0_I2(r, r);
case INDEX_op_deposit_i32:
case INDEX_op_deposit_i64:
--
2.34.1