* [Qemu-devel] [PATCH 00/15] tcg-sparc improvements
From: Richard Henderson @ 2012-03-25 22:27 UTC
To: qemu-devel; +Cc: Blue Swirl
32-bit sparc hasn't worked in quite a while. Missing opcodes,
incorrect opcodes, unconditional use of ASI_PRIMARY_LITTLE.

This patch set begins by dropping support for pre-v9 sparc.
This lets us clean things up quite a bit, using 64-bit load
and store operations.

I was still having problems with %g6 being clobbered in glibc.
Patches 7-10 drop the use of global registers for the sparc
port entirely. Given the hoops being used to protect areg0
around calls within the tcg generated code, deferring to a
%g7-relative tls access in the helpers is approximately as
efficient. As targets are converted to CONFIG_TCG_PASS_AREG0
even this will improve as direct register access is available.

r~
Richard Henderson (15):
tcg-sparc: Hack in qemu_ld/st64 for 32-bit.
tcg-sparc: Fix ADDX opcode.
tcg-sparc: Assume v9 cpu always, i.e. force v8plus in 32-bit mode.
tcg-sparc: Fix qemu_ld/st to handle 32-bit host.
tcg-sparc: Simplify qemu_ld/st direct memory paths.
tcg-sparc: Support GUEST_BASE.
tcg-sparc: Streamline qemu_ld/st more.
Avoid declaring the env variable at all if CONFIG_TCG_PASS_AREG0.
tcg-sparc: Do not use a global register for AREG0.
tcg-sparc: Change AREG0 in generated code to %i0.
tcg-sparc: Clean up cruft stemming from attempts to use global
registers.
tcg-sparc: Mask shift immediates to avoid illegal insns.
tcg-sparc: Use defines for temporaries.
tcg-sparc: Add %g/%o registers to alloc_order
tcg-sparc: Fix and enable direct TB chaining.
configure | 53 +---
dyngen-exec.h | 27 +-
exec-all.h | 9 +-
exec.c | 16 +-
tcg/sparc/tcg-target.c | 951 +++++++++++++++++++++++-------------------------
tcg/sparc/tcg-target.h | 34 +-
user-exec.c | 17 +-
7 files changed, 520 insertions(+), 587 deletions(-)
--
1.7.7.6
* [Qemu-devel] [PATCH 01/15] tcg-sparc: Hack in qemu_ld/st64 for 32-bit.
From: Richard Henderson @ 2012-03-25 22:27 UTC
To: qemu-devel; +Cc: Blue Swirl
The 64-bit qemu_ld/st opcodes are not actually implemented for 32-bit
hosts yet, but listing them at least avoids the tcg assert at startup.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
tcg/sparc/tcg-target.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index 247a278..0e71618 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -1586,6 +1586,9 @@ static const TCGTargetOpDef sparc_op_defs[] = {
{ INDEX_op_brcond_i64, { "r", "rJ" } },
{ INDEX_op_setcond_i64, { "r", "r", "rJ" } },
+#else
+ { INDEX_op_qemu_ld64, { "L", "L", "L" } },
+ { INDEX_op_qemu_st64, { "L", "L", "L" } },
#endif
{ -1 },
};
--
1.7.7.6
* [Qemu-devel] [PATCH 02/15] tcg-sparc: Fix ADDX opcode.
From: Richard Henderson @ 2012-03-25 22:27 UTC
To: qemu-devel; +Cc: Blue Swirl
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
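Editorial note on the one-line change in this patch: in the SPARC format-3
opcode map, op3 0x10 selects ADDcc, while 0x08 selects ADDC (the v9 name
for ADDX), so the old constant emitted the wrong instruction. A minimal
sketch of the two encodings follows. INSN_OP and INSN_OP3 are restated
here from the architectural field positions (op in bits 31:30, op3 in bits
24:19) and are assumptions for illustration, not copies of the macros in
tcg-target.c; insn_op3 and addx_selfcheck are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SPARC format-3 field placement: op in bits 31:30, op3 in 24:19. */
#define INSN_OP(x)   ((uint32_t)(x) << 30)
#define INSN_OP3(x)  ((uint32_t)(x) << 19)

#define ARITH_ADDCC     (INSN_OP(2) | INSN_OP3(0x10))
#define ARITH_ADDX_OLD  (INSN_OP(2) | INSN_OP3(0x10))  /* value before this patch */
#define ARITH_ADDX_NEW  (INSN_OP(2) | INSN_OP3(0x08))  /* value after this patch */

/* Extract the 6-bit op3 field from an instruction word. */
static unsigned insn_op3(uint32_t insn)
{
    return (insn >> 19) & 0x3f;
}

/* Hypothetical self-check: the old constant collided with ADDcc,
   the new one carries op3 = 0x08. */
static void addx_selfcheck(void)
{
    assert(ARITH_ADDX_OLD == ARITH_ADDCC);
    assert(insn_op3(ARITH_ADDX_NEW) == 0x08);
}
```

Calling addx_selfcheck() from a test harness exercises both assertions; the
register and immediate fields are left at zero because only the opcode
fields matter here.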
tcg/sparc/tcg-target.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index 0e71618..358a70c 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -242,7 +242,7 @@ static inline int tcg_target_const_match(tcg_target_long val,
#define ARITH_XOR (INSN_OP(2) | INSN_OP3(0x03))
#define ARITH_SUB (INSN_OP(2) | INSN_OP3(0x04))
#define ARITH_SUBCC (INSN_OP(2) | INSN_OP3(0x14))
-#define ARITH_ADDX (INSN_OP(2) | INSN_OP3(0x10))
+#define ARITH_ADDX (INSN_OP(2) | INSN_OP3(0x08))
#define ARITH_SUBX (INSN_OP(2) | INSN_OP3(0x0c))
#define ARITH_UMUL (INSN_OP(2) | INSN_OP3(0x0a))
#define ARITH_UDIV (INSN_OP(2) | INSN_OP3(0x0e))
--
1.7.7.6
* [Qemu-devel] [PATCH 03/15] tcg-sparc: Assume v9 cpu always, i.e. force v8plus in 32-bit mode.
From: Richard Henderson @ 2012-03-25 22:27 UTC
To: qemu-devel; +Cc: Blue Swirl
Current code doesn't actually work in 32-bit mode at all. Since
no one really noticed, drop the complication of v7 and v8 cpus.
Eliminate the --sparc_cpu configure option and standardize macro
testing on TCG_TARGET_REG_BITS.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
configure | 41 ++++-------------------------------------
dyngen-exec.h | 4 +---
tcg/sparc/tcg-target.c | 16 ++++------------
tcg/sparc/tcg-target.h | 7 ++++---
4 files changed, 13 insertions(+), 55 deletions(-)
diff --git a/configure b/configure
index 80ca430..7741ba9 100755
--- a/configure
+++ b/configure
@@ -86,7 +86,6 @@ source_path=`dirname "$0"`
cpu=""
interp_prefix="/usr/gnemul/qemu-%M"
static="no"
-sparc_cpu=""
cross_prefix=""
audio_drv_list=""
audio_card_list="ac97 es1370 sb16 hda"
@@ -216,21 +215,6 @@ for opt do
;;
--disable-debug-info) debug_info="no"
;;
- --sparc_cpu=*)
- sparc_cpu="$optarg"
- case $sparc_cpu in
- v7|v8|v8plus|v8plusa)
- cpu="sparc"
- ;;
- v9)
- cpu="sparc64"
- ;;
- *)
- echo "undefined SPARC architecture. Exiting";
- exit 1
- ;;
- esac
- ;;
esac
done
# OS specific
@@ -284,8 +268,6 @@ elif check_define __i386__ ; then
elif check_define __x86_64__ ; then
cpu="x86_64"
elif check_define __sparc__ ; then
- # We can't check for 64 bit (when gcc is biarch) or V8PLUSA
- # They must be specified using --sparc_cpu
if check_define __arch64__ ; then
cpu="sparc64"
else
@@ -749,8 +731,6 @@ for opt do
;;
--enable-uname-release=*) uname_release="$optarg"
;;
- --sparc_cpu=*)
- ;;
--enable-werror) werror="yes"
;;
--disable-werror) werror="no"
@@ -830,32 +810,19 @@ for opt do
esac
done
-#
-# If cpu ~= sparc and sparc_cpu hasn't been defined, plug in the right
-# QEMU_CFLAGS/LDFLAGS (assume sparc_v8plus for 32-bit and sparc_v9 for 64-bit)
-#
host_guest_base="no"
case "$cpu" in
- sparc) case $sparc_cpu in
- v7|v8)
- QEMU_CFLAGS="-mcpu=${sparc_cpu} -D__sparc_${sparc_cpu}__ $QEMU_CFLAGS"
- ;;
- v8plus|v8plusa)
- QEMU_CFLAGS="-mcpu=ultrasparc -D__sparc_${sparc_cpu}__ $QEMU_CFLAGS"
- ;;
- *) # sparc_cpu not defined in the command line
- QEMU_CFLAGS="-mcpu=ultrasparc -D__sparc_v8plus__ $QEMU_CFLAGS"
- esac
+ sparc)
LDFLAGS="-m32 $LDFLAGS"
- QEMU_CFLAGS="-m32 -ffixed-g2 -ffixed-g3 $QEMU_CFLAGS"
+ QEMU_CFLAGS="-m32 -mcpu=ultrasparc $QEMU_CFLAGS"
+ QEMU_CFLAGS="-ffixed-g2 -ffixed-g3 $QEMU_CFLAGS"
if test "$solaris" = "no" ; then
QEMU_CFLAGS="-ffixed-g1 -ffixed-g6 $QEMU_CFLAGS"
- helper_cflags="-ffixed-i0"
fi
;;
sparc64)
- QEMU_CFLAGS="-m64 -mcpu=ultrasparc -D__sparc_v9__ $QEMU_CFLAGS"
LDFLAGS="-m64 $LDFLAGS"
+ QEMU_CFLAGS="-m64 -mcpu=ultrasparc $QEMU_CFLAGS"
QEMU_CFLAGS="-ffixed-g5 -ffixed-g6 -ffixed-g7 $QEMU_CFLAGS"
if test "$solaris" != "no" ; then
QEMU_CFLAGS="-ffixed-g1 $QEMU_CFLAGS"
diff --git a/dyngen-exec.h b/dyngen-exec.h
index 083e20b..cfeef99 100644
--- a/dyngen-exec.h
+++ b/dyngen-exec.h
@@ -39,13 +39,11 @@
#elif defined(__sparc__)
#ifdef CONFIG_SOLARIS
#define AREG0 "g2"
-#else
-#ifdef __sparc_v9__
+#elif HOST_LONG_BITS == 64
#define AREG0 "g5"
#else
#define AREG0 "g6"
#endif
-#endif
#elif defined(__s390__)
#define AREG0 "r10"
#elif defined(__alpha__)
diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index 358a70c..257d20a 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -627,18 +627,10 @@ static void tcg_out_setcond_i32(TCGContext *s, TCGCond cond, TCGArg ret,
default:
tcg_out_cmp(s, c1, c2, c2const);
-#if defined(__sparc_v9__) || defined(__sparc_v8plus__)
tcg_out_movi_imm13(s, ret, 0);
- tcg_out32 (s, ARITH_MOVCC | INSN_RD(ret)
- | INSN_RS1(tcg_cond_to_bcond[cond])
- | MOVCC_ICC | INSN_IMM11(1));
-#else
- t = gen_new_label();
- tcg_out_branch_i32(s, INSN_COND(tcg_cond_to_bcond[cond], 1), t);
- tcg_out_movi_imm13(s, ret, 1);
- tcg_out_movi_imm13(s, ret, 0);
- tcg_out_label(s, t, s->code_ptr);
-#endif
+ tcg_out32(s, ARITH_MOVCC | INSN_RD(ret)
+ | INSN_RS1(tcg_cond_to_bcond[cond])
+ | MOVCC_ICC | INSN_IMM11(1));
return;
}
@@ -768,7 +760,7 @@ static const void * const qemu_st_helpers[4] = {
#endif
#endif
-#ifdef __arch64__
+#if TCG_TARGET_REG_BITS == 64
#define HOST_LD_OP LDX
#define HOST_ST_OP STX
#define HOST_SLL_OP SHIFT_SLLX
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index ee2274d..56742bf 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -67,7 +67,8 @@ typedef enum {
/* used for function call generation */
#define TCG_REG_CALL_STACK TCG_REG_I6
-#ifdef __arch64__
+
+#if TCG_TARGET_REG_BITS == 64
// Reserve space for AREG0
#define TCG_TARGET_STACK_MINFRAME (176 + 4 * (int)sizeof(long) + \
TCG_STATIC_CALL_ARGS_SIZE)
@@ -81,7 +82,7 @@ typedef enum {
#define TCG_TARGET_STACK_ALIGN 8
#endif
-#ifdef __arch64__
+#if TCG_TARGET_REG_BITS == 64
#define TCG_TARGET_EXTEND_ARGS 1
#endif
@@ -128,7 +129,7 @@ typedef enum {
/* Note: must be synced with dyngen-exec.h */
#ifdef CONFIG_SOLARIS
#define TCG_AREG0 TCG_REG_G2
-#elif defined(__sparc_v9__)
+#elif HOST_LONG_BITS == 64
#define TCG_AREG0 TCG_REG_G5
#else
#define TCG_AREG0 TCG_REG_G6
--
1.7.7.6
* [Qemu-devel] [PATCH 04/15] tcg-sparc: Fix qemu_ld/st to handle 32-bit host.
From: Richard Henderson @ 2012-03-25 22:27 UTC
To: qemu-devel; +Cc: Blue Swirl
At the same time, split out the tlb load logic to a new function.
Fixes the cases of two data registers and two address registers.
Fixes the signature of, and adds missing, qemu_ld/st opcodes.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
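Editorial note: the arithmetic that the new tcg_out_tlb_load emits (shift
the page number down to an entry offset, mask it to the table size, and
compare the page-masked address against the comparator) can be modeled in
plain C. This is only a sketch under stated assumptions: the constants
below (4 KiB pages, 256 entries of 32 bytes) are illustrative values, and
tlb_entry_offset, tlb_compare_value and tlb_probe are hypothetical names
that do not exist in QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; real targets define their own. */
#define TARGET_PAGE_BITS    12
#define TARGET_PAGE_MASK    (~(((uint64_t)1 << TARGET_PAGE_BITS) - 1))
#define CPU_TLB_BITS        8
#define CPU_TLB_SIZE        (1 << CPU_TLB_BITS)
#define CPU_TLB_ENTRY_BITS  5

/* Byte offset of the TLB entry for addr: shift right by
   PAGE_BITS - ENTRY_BITS, then mask with (SIZE - 1) << ENTRY_BITS. */
static uint64_t tlb_entry_offset(uint64_t addr)
{
    uint64_t t = addr >> (TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
    return t & ((uint64_t)(CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
}

/* Value compared against the entry's addr_read or addr_write field.
   Keeping the low s_bits bits forces a misaligned access to miss. */
static uint64_t tlb_compare_value(uint64_t addr, int s_bits)
{
    return addr & (TARGET_PAGE_MASK | (((uint64_t)1 << s_bits) - 1));
}

/* Hypothetical probe: nonzero when addr hits an entry whose
   comparator holds the given page address. */
static int tlb_probe(uint64_t comparator, uint64_t addr, int s_bits)
{
    return tlb_compare_value(addr, s_bits) == comparator;
}
```

For example, with these constants the address 0x12345678 maps to entry
index 0x45, and a 4-byte access (s_bits = 2) at 0x1234567A misses because
bit 1 survives the mask and makes the compare fail.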
tcg/sparc/tcg-target.c | 751 ++++++++++++++++++++++++------------------------
1 files changed, 378 insertions(+), 373 deletions(-)
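Editorial note: the slow path now sign-extends the helper result with the
32-bit SLL and SRA forms at shift counts 24 and 16, and with "sra x, 0"
for 32-bit values. This relies on the v9 behaviour that SRA sign-extends
from bit 31 through bit 63, so the same counts work for 32-bit and 64-bit
registers. The sketch below models that behaviour in portable C; the
function names are hypothetical and the model is an assumption-laden
illustration, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of v9 SLL: shift the full 64-bit register left by count (0..31). */
static uint64_t sparc_sll(uint64_t x, unsigned count)
{
    return x << (count & 31);
}

/* Model of v9 SRA: replicate bit 31 into bits 63:32, then shift right
   arithmetically by count (0..31). */
static uint64_t sparc_sra(uint64_t x, unsigned count)
{
    uint64_t v = x & 0xffffffffULL;
    uint64_t r;

    if (v & 0x80000000ULL) {
        v |= 0xffffffff00000000ULL;         /* replicate bit 31 upward */
    }
    count &= 31;
    r = v >> count;
    if (count != 0 && (v >> 63)) {
        r |= ~(uint64_t)0 << (64 - count);  /* refill vacated bits with sign */
    }
    return r;
}

/* The three sequences used after the helper call. */
static uint64_t sign_extend8(uint64_t x)  { return sparc_sra(sparc_sll(x, 24), 24); }
static uint64_t sign_extend16(uint64_t x) { return sparc_sra(sparc_sll(x, 16), 16); }
static uint64_t sign_extend32(uint64_t x) { return sparc_sra(x, 0); }
```

Under this model sign_extend8(0x80) yields 0xffffffffffffff80, and any
garbage in the upper 32 bits of the input is discarded by the SRA step.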
diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index 257d20a..8763b03 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -448,14 +448,15 @@ static inline void tcg_out_addi(TCGContext *s, int reg, tcg_target_long val)
}
}
-static inline void tcg_out_andi(TCGContext *s, int reg, tcg_target_long val)
+static inline void tcg_out_andi(TCGContext *s, int rd, int rs,
+ tcg_target_long val)
{
if (val != 0) {
if (check_fit_tl(val, 13))
- tcg_out_arithi(s, reg, reg, val, ARITH_AND);
+ tcg_out_arithi(s, rd, rs, val, ARITH_AND);
else {
tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_I5, val);
- tcg_out_arith(s, reg, reg, TCG_REG_I5, ARITH_AND);
+ tcg_out_arith(s, rd, rs, TCG_REG_I5, ARITH_AND);
}
}
}
@@ -744,422 +745,405 @@ static const void * const qemu_st_helpers[4] = {
__stq_mmu,
};
#endif
-#endif
-#if TARGET_LONG_BITS == 32
-#define TARGET_LD_OP LDUW
-#else
-#define TARGET_LD_OP LDX
-#endif
+/* Perform the TLB load and compare.
-#if defined(CONFIG_SOFTMMU)
-#if HOST_LONG_BITS == 32
-#define TARGET_ADDEND_LD_OP LDUW
-#else
-#define TARGET_ADDEND_LD_OP LDX
-#endif
-#endif
+ Inputs:
+ ADDRLO_IDX contains the index into ARGS of the low part of the
+ address; the high part of the address is at ADDRLO_IDX+1.
-#if TCG_TARGET_REG_BITS == 64
-#define HOST_LD_OP LDX
-#define HOST_ST_OP STX
-#define HOST_SLL_OP SHIFT_SLLX
-#define HOST_SRA_OP SHIFT_SRAX
+ MEM_INDEX and S_BITS are the memory context and log2 size of the load.
+
+ WHICH is the offset into the CPUTLBEntry structure of the slot to read.
+ This should be offsetof addr_read or addr_write.
+
+ Outputs:
+ LABEL_PTR is filled with the position of the forward jumps to the
+ TLB miss case. This will always be a ,PN insn, so a 19-bit offset.
+
+ Returns a register loaded with the low part of the address, adjusted
+ as indicated by the TLB and so is a host address. Undefined in the
+ TLB miss case. */
+
+static int tcg_out_tlb_load(TCGContext *s, int addrlo_idx, int mem_index,
+ int s_bits, const TCGArg *args,
+ uint32_t **label_ptr, int which)
+{
+ const int addrlo = args[addrlo_idx];
+ const int r0 = tcg_target_call_iarg_regs[0];
+ const int r1 = tcg_target_call_iarg_regs[1];
+ const int r2 = tcg_target_call_iarg_regs[2];
+ int addr = addrlo;
+ int tlb_ofs;
+
+ if (TCG_TARGET_REG_BITS == 32 && TARGET_LONG_BITS == 64) {
+ /* Assemble the 64-bit address in R0. */
+ tcg_out_arithi(s, r0, addrlo, 0, SHIFT_SRL);
+ tcg_out_arithi(s, r1, args[addrlo_idx + 1], 32, SHIFT_SLLX);
+ tcg_out_arith(s, r0, r0, r1, ARITH_OR);
+ }
+
+ /* Shift the page number down to tlb-entry. */
+ tcg_out_arithi(s, r1, addrlo,
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS, SHIFT_SRL);
+
+ /* Mask out the page offset, except for the required alignment. */
+ tcg_out_andi(s, r0, addr, TARGET_PAGE_MASK | ((1 << s_bits) - 1));
+
+ /* Compute tlb index, modulo tlb size. */
+ tcg_out_andi(s, r1, r1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
+
+ /* Relative to the current ENV. */
+ tcg_out_arith(s, r1, TCG_AREG0, r1, ARITH_ADD);
+
+ /* Find a base address that can load both tlb comparator and addend. */
+ tlb_ofs = offsetof(CPUArchState, tlb_table[mem_index][0]);
+ if (!check_fit_tl(tlb_ofs + sizeof(CPUTLBEntry), 13)) {
+ tcg_out_addi(s, r1, tlb_ofs);
+ tlb_ofs = 0;
+ }
+
+ /* ld [arg1 + which], arg2 */
+ tcg_out_ld(s, TCG_TYPE_TL, r2, r1, tlb_ofs + which);
+
+ /* subcc arg0, arg2, %g0 */
+ tcg_out_cmp(s, r0, r2, 0);
+
+ /* bne,pn %[ix]cc, label0 */
+ *label_ptr = (uint32_t *)s->code_ptr;
+ tcg_out32(s, (INSN_OP(0) | INSN_COND(COND_NE, 0) | INSN_OP2(0x1) |
+ ((TARGET_LONG_BITS == 64) << 21)));
+
+ /* TLB Hit. Compute the host address into r1. The ld is in the
+ branch delay slot; harmless for the TLB miss case. */
+ tcg_out_ld(s, TCG_TYPE_PTR, r1, r1, tlb_ofs+offsetof(CPUTLBEntry, addend));
+
+ if (TCG_TARGET_REG_BITS == 64 && TARGET_LONG_BITS == 32) {
+ tcg_out_arithi(s, r0, addrlo, 0, SHIFT_SRL);
+ tcg_out_arith(s, r1, r0, r1, ARITH_ADD);
+ } else {
+ tcg_out_arith(s, r1, addrlo, r1, ARITH_ADD);
+ }
+
+ return r1;
+}
+#endif /* CONFIG_SOFTMMU */
+
+static void tcg_out_qemu_ld_direct(TCGContext *s, int addr, int datalo,
+ int datahi, int sizeop)
+{
+#ifdef TARGET_WORDS_BIGENDIAN
+ const int bigendian = 1;
#else
-#define HOST_LD_OP LDUW
-#define HOST_ST_OP STW
-#define HOST_SLL_OP SHIFT_SLL
-#define HOST_SRA_OP SHIFT_SRA
+ const int bigendian = 0;
#endif
+ switch (sizeop) {
+ case 0:
+ /* ldub [addr], datalo */
+ tcg_out_ldst(s, datalo, addr, 0, LDUB);
+ break;
+ case 0 | 4:
+ /* ldsb [addr], datalo */
+ tcg_out_ldst(s, datalo, addr, 0, LDSB);
+ break;
+ case 1:
+ if (bigendian) {
+ /* lduh [addr], datalo */
+ tcg_out_ldst(s, datalo, addr, 0, LDUH);
+ } else {
+ /* lduha [addr] ASI_PRIMARY_LITTLE, datalo */
+ tcg_out_ldst_asi(s, datalo, addr, 0, LDUHA, ASI_PRIMARY_LITTLE);
+ }
+ break;
+ case 1 | 4:
+ if (bigendian) {
+ /* ldsh [addr], datalo */
+ tcg_out_ldst(s, datalo, addr, 0, LDSH);
+ } else {
+ /* ldsha [addr] ASI_PRIMARY_LITTLE, datalo */
+ tcg_out_ldst_asi(s, datalo, addr, 0, LDSHA, ASI_PRIMARY_LITTLE);
+ }
+ break;
+ case 2:
+ if (bigendian) {
+ /* lduw [addr], datalo */
+ tcg_out_ldst(s, datalo, addr, 0, LDUW);
+ } else {
+ /* lduwa [addr] ASI_PRIMARY_LITTLE, datalo */
+ tcg_out_ldst_asi(s, datalo, addr, 0, LDUWA, ASI_PRIMARY_LITTLE);
+ }
+ break;
+ case 2 | 4:
+ if (bigendian) {
+ /* ldsw [addr], datalo */
+ tcg_out_ldst(s, datalo, addr, 0, LDSW);
+ } else {
+ /* ldswa [addr] ASI_PRIMARY_LITTLE, datalo */
+ tcg_out_ldst_asi(s, datalo, addr, 0, LDSWA, ASI_PRIMARY_LITTLE);
+ }
+ break;
+ case 3:
+ if (TCG_TARGET_REG_BITS == 64) {
+ if (bigendian) {
+ /* ldx [addr], datalo */
+ tcg_out_ldst(s, datalo, addr, 0, LDX);
+ } else {
+ /* ldxa [addr] ASI_PRIMARY_LITTLE, datalo */
+ tcg_out_ldst_asi(s, datalo, addr, 0, LDXA, ASI_PRIMARY_LITTLE);
+ }
+ } else {
+ if (bigendian) {
+ tcg_out_ldst(s, datahi, addr, 0, LDUW);
+ tcg_out_ldst(s, datalo, addr, 4, LDUW);
+ } else {
+ tcg_out_ldst_asi(s, datalo, addr, 0, LDUWA, ASI_PRIMARY_LITTLE);
+ tcg_out_ldst_asi(s, datahi, addr, 4, LDUWA, ASI_PRIMARY_LITTLE);
+ }
+ }
+ break;
+ default:
+ tcg_abort();
+ }
+}
-static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args,
- int opc)
+static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc)
{
- int addr_reg, data_reg, arg0, arg1, arg2, mem_index, s_bits;
+ int addrlo_idx = 1, datalo, datahi, addr_reg;
#if defined(CONFIG_SOFTMMU)
- uint32_t *label1_ptr, *label2_ptr;
+ int memi_idx, memi, s_bits, n;
+ uint32_t *label_ptr[2];
#endif
- data_reg = *args++;
- addr_reg = *args++;
- mem_index = *args;
- s_bits = opc & 3;
-
- arg0 = TCG_REG_O0;
- arg1 = TCG_REG_O1;
- arg2 = TCG_REG_O2;
+ datahi = datalo = args[0];
+ if (TCG_TARGET_REG_BITS == 32 && opc == 3) {
+ datahi = args[1];
+ addrlo_idx = 2;
+ }
#if defined(CONFIG_SOFTMMU)
- /* srl addr_reg, x, arg1 */
- tcg_out_arithi(s, arg1, addr_reg, TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS,
- SHIFT_SRL);
- /* and addr_reg, x, arg0 */
- tcg_out_arithi(s, arg0, addr_reg, TARGET_PAGE_MASK | ((1 << s_bits) - 1),
- ARITH_AND);
-
- /* and arg1, x, arg1 */
- tcg_out_andi(s, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
-
- /* add arg1, x, arg1 */
- tcg_out_addi(s, arg1, offsetof(CPUArchState,
- tlb_table[mem_index][0].addr_read));
-
- /* add env, arg1, arg1 */
- tcg_out_arith(s, arg1, TCG_AREG0, arg1, ARITH_ADD);
+ memi_idx = addrlo_idx + 1 + (TARGET_LONG_BITS > TCG_TARGET_REG_BITS);
+ memi = args[memi_idx];
+ s_bits = opc & 3;
- /* ld [arg1], arg2 */
- tcg_out32(s, TARGET_LD_OP | INSN_RD(arg2) | INSN_RS1(arg1) |
- INSN_RS2(TCG_REG_G0));
+ addr_reg = tcg_out_tlb_load(s, addrlo_idx, memi, s_bits, args,
+ label_ptr, offsetof(CPUTLBEntry, addr_read));
- /* subcc arg0, arg2, %g0 */
- tcg_out_arith(s, TCG_REG_G0, arg0, arg2, ARITH_SUBCC);
+ /* TLB Hit. */
+ tcg_out_qemu_ld_direct(s, addr_reg, datalo, datahi, opc);
- /* will become:
- be label1
- or
- be,pt %xcc label1 */
- label1_ptr = (uint32_t *)s->code_ptr;
- tcg_out32(s, 0);
+ /* b,pt,n label1 */
+ label_ptr[1] = (uint32_t *)s->code_ptr;
+ tcg_out32(s, (INSN_OP(0) | INSN_COND(COND_A, 0) | INSN_OP2(0x1)
+ | (1 << 29) | (1 << 19)));
- /* mov (delay slot) */
- tcg_out_mov(s, TCG_TYPE_PTR, arg0, addr_reg);
+ /* TLB Miss. */
- /* mov */
- tcg_out_movi(s, TCG_TYPE_I32, arg1, mem_index);
+ *label_ptr[0] |= INSN_OFF19((unsigned long)s->code_ptr -
+ (unsigned long)label_ptr[0]);
+ n = 0;
#ifdef CONFIG_TCG_PASS_AREG0
- /* XXX/FIXME: suboptimal */
- tcg_out_mov(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3],
- tcg_target_call_iarg_regs[2]);
- tcg_out_mov(s, TCG_TYPE_I64, tcg_target_call_iarg_regs[2],
- tcg_target_call_iarg_regs[1]);
- tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
- tcg_target_call_iarg_regs[0]);
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0],
- TCG_AREG0);
+ tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[n++], TCG_AREG0);
#endif
+ if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
+ tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++],
+ args[addrlo_idx + 1]);
+ }
+ tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++],
+ args[addrlo_idx]);
+
+ /* Store AREG0 in stack to avoid ugly glibc bugs that mangle
+ global registers */
+ tcg_out_st(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK,
+ TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
+ sizeof(long));
- /* XXX: move that code at the end of the TB */
/* qemu_ld_helper[s_bits](arg0, arg1) */
tcg_out32(s, CALL | ((((tcg_target_ulong)qemu_ld_helpers[s_bits]
- (tcg_target_ulong)s->code_ptr) >> 2)
& 0x3fffffff));
- /* Store AREG0 in stack to avoid ugly glibc bugs that mangle
- global registers */
- // delay slot
- tcg_out_ldst(s, TCG_AREG0, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
- sizeof(long), HOST_ST_OP);
- tcg_out_ldst(s, TCG_AREG0, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
- sizeof(long), HOST_LD_OP);
-
- /* data_reg = sign_extend(arg0) */
+ /* delay slot */
+ tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[n], memi);
+
+ /* Reload AREG0. */
+ tcg_out_ld(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK,
+ TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
+ sizeof(long));
+
+ n = tcg_target_call_oarg_regs[0];
+ /* datalo = sign_extend(arg0) */
switch(opc) {
case 0 | 4:
- /* sll arg0, 24/56, data_reg */
- tcg_out_arithi(s, data_reg, arg0, (int)sizeof(tcg_target_long) * 8 - 8,
- HOST_SLL_OP);
- /* sra data_reg, 24/56, data_reg */
- tcg_out_arithi(s, data_reg, data_reg,
- (int)sizeof(tcg_target_long) * 8 - 8, HOST_SRA_OP);
+ /* Recall that SRA sign extends from bit 31 through bit 63. */
+ tcg_out_arithi(s, datalo, n, 24, SHIFT_SLL);
+ tcg_out_arithi(s, datalo, datalo, 24, SHIFT_SRA);
break;
case 1 | 4:
- /* sll arg0, 16/48, data_reg */
- tcg_out_arithi(s, data_reg, arg0,
- (int)sizeof(tcg_target_long) * 8 - 16, HOST_SLL_OP);
- /* sra data_reg, 16/48, data_reg */
- tcg_out_arithi(s, data_reg, data_reg,
- (int)sizeof(tcg_target_long) * 8 - 16, HOST_SRA_OP);
+ tcg_out_arithi(s, datalo, n, 16, SHIFT_SLL);
+ tcg_out_arithi(s, datalo, datalo, 16, SHIFT_SRA);
break;
case 2 | 4:
- /* sll arg0, 32, data_reg */
- tcg_out_arithi(s, data_reg, arg0, 32, HOST_SLL_OP);
- /* sra data_reg, 32, data_reg */
- tcg_out_arithi(s, data_reg, data_reg, 32, HOST_SRA_OP);
+ tcg_out_arithi(s, datalo, n, 0, SHIFT_SRA);
break;
+ case 3:
+ if (TCG_TARGET_REG_BITS == 32) {
+ tcg_out_mov(s, TCG_TYPE_REG, datahi, n);
+ tcg_out_mov(s, TCG_TYPE_REG, datalo, n + 1);
+ break;
+ }
+ /* FALLTHRU */
case 0:
case 1:
case 2:
- case 3:
default:
/* mov */
- tcg_out_mov(s, TCG_TYPE_REG, data_reg, arg0);
+ tcg_out_mov(s, TCG_TYPE_REG, datalo, n);
break;
}
- /* will become:
- ba label2 */
- label2_ptr = (uint32_t *)s->code_ptr;
- tcg_out32(s, 0);
-
- /* nop (delay slot */
- tcg_out_nop(s);
-
- /* label1: */
-#if TARGET_LONG_BITS == 32
- /* be label1 */
- *label1_ptr = (INSN_OP(0) | INSN_COND(COND_E, 0) | INSN_OP2(0x2) |
- INSN_OFF22((unsigned long)s->code_ptr -
- (unsigned long)label1_ptr));
+ *label_ptr[1] |= INSN_OFF19((unsigned long)s->code_ptr -
+ (unsigned long)label_ptr[1]);
#else
- /* be,pt %xcc label1 */
- *label1_ptr = (INSN_OP(0) | INSN_COND(COND_E, 0) | INSN_OP2(0x1) |
- (0x5 << 19) | INSN_OFF19((unsigned long)s->code_ptr -
- (unsigned long)label1_ptr));
-#endif
-
- /* ld [arg1 + x], arg1 */
- tcg_out_ldst(s, arg1, arg1, offsetof(CPUTLBEntry, addend) -
- offsetof(CPUTLBEntry, addr_read), TARGET_ADDEND_LD_OP);
-
-#if TARGET_LONG_BITS == 32
- /* and addr_reg, x, arg0 */
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_I5, 0xffffffff);
- tcg_out_arith(s, arg0, addr_reg, TCG_REG_I5, ARITH_AND);
- /* add arg0, arg1, arg0 */
- tcg_out_arith(s, arg0, arg0, arg1, ARITH_ADD);
-#else
- /* add addr_reg, arg1, arg0 */
- tcg_out_arith(s, arg0, addr_reg, arg1, ARITH_ADD);
-#endif
+ addr_reg = args[addrlo_idx];
+ if (TCG_TARGET_REG_BITS == 64 && TARGET_LONG_BITS == 32) {
+ tcg_out_arithi(s, TCG_REG_I5, addr_reg, 0, SHIFT_SRL);
+ addr_reg = TCG_REG_I5;
+ }
+ tcg_out_qemu_ld_direct(s, addr_reg, datalo, datahi, opc);
+#endif /* CONFIG_SOFTMMU */
+}
+static void tcg_out_qemu_st_direct(TCGContext *s, int addr, int datalo,
+ int datahi, int sizeop)
+{
+#ifdef TARGET_WORDS_BIGENDIAN
+ const int bigendian = 1;
#else
- arg0 = addr_reg;
+ const int bigendian = 0;
#endif
-
- switch(opc) {
+ switch (sizeop) {
case 0:
- /* ldub [arg0], data_reg */
- tcg_out_ldst(s, data_reg, arg0, 0, LDUB);
- break;
- case 0 | 4:
- /* ldsb [arg0], data_reg */
- tcg_out_ldst(s, data_reg, arg0, 0, LDSB);
+ /* stb datalo, [addr] */
+ tcg_out_ldst(s, datalo, addr, 0, STB);
break;
case 1:
-#ifdef TARGET_WORDS_BIGENDIAN
- /* lduh [arg0], data_reg */
- tcg_out_ldst(s, data_reg, arg0, 0, LDUH);
-#else
- /* lduha [arg0] ASI_PRIMARY_LITTLE, data_reg */
- tcg_out_ldst_asi(s, data_reg, arg0, 0, LDUHA, ASI_PRIMARY_LITTLE);
-#endif
- break;
- case 1 | 4:
-#ifdef TARGET_WORDS_BIGENDIAN
- /* ldsh [arg0], data_reg */
- tcg_out_ldst(s, data_reg, arg0, 0, LDSH);
-#else
- /* ldsha [arg0] ASI_PRIMARY_LITTLE, data_reg */
- tcg_out_ldst_asi(s, data_reg, arg0, 0, LDSHA, ASI_PRIMARY_LITTLE);
-#endif
+ if (bigendian) {
+ /* sth datalo, [addr] */
+ tcg_out_ldst(s, datalo, addr, 0, STH);
+ } else {
+ /* stha datalo, [addr] ASI_PRIMARY_LITTLE */
+ tcg_out_ldst_asi(s, datalo, addr, 0, STHA, ASI_PRIMARY_LITTLE);
+ }
break;
case 2:
-#ifdef TARGET_WORDS_BIGENDIAN
- /* lduw [arg0], data_reg */
- tcg_out_ldst(s, data_reg, arg0, 0, LDUW);
-#else
- /* lduwa [arg0] ASI_PRIMARY_LITTLE, data_reg */
- tcg_out_ldst_asi(s, data_reg, arg0, 0, LDUWA, ASI_PRIMARY_LITTLE);
-#endif
- break;
- case 2 | 4:
-#ifdef TARGET_WORDS_BIGENDIAN
- /* ldsw [arg0], data_reg */
- tcg_out_ldst(s, data_reg, arg0, 0, LDSW);
-#else
- /* ldswa [arg0] ASI_PRIMARY_LITTLE, data_reg */
- tcg_out_ldst_asi(s, data_reg, arg0, 0, LDSWA, ASI_PRIMARY_LITTLE);
-#endif
+ if (bigendian) {
+ /* stw datalo, [addr] */
+ tcg_out_ldst(s, datalo, addr, 0, STW);
+ } else {
+ /* stwa datalo, [addr] ASI_PRIMARY_LITTLE */
+ tcg_out_ldst_asi(s, datalo, addr, 0, STWA, ASI_PRIMARY_LITTLE);
+ }
break;
case 3:
-#ifdef TARGET_WORDS_BIGENDIAN
- /* ldx [arg0], data_reg */
- tcg_out_ldst(s, data_reg, arg0, 0, LDX);
-#else
- /* ldxa [arg0] ASI_PRIMARY_LITTLE, data_reg */
- tcg_out_ldst_asi(s, data_reg, arg0, 0, LDXA, ASI_PRIMARY_LITTLE);
-#endif
+ if (TCG_TARGET_REG_BITS == 64) {
+ if (bigendian) {
+ /* stx datalo, [addr] */
+ tcg_out_ldst(s, datalo, addr, 0, STX);
+ } else {
+ /* stxa datalo, [addr] ASI_PRIMARY_LITTLE */
+ tcg_out_ldst_asi(s, datalo, addr, 0, STXA, ASI_PRIMARY_LITTLE);
+ }
+ } else {
+ if (bigendian) {
+ tcg_out_ldst(s, datahi, addr, 0, STW);
+ tcg_out_ldst(s, datalo, addr, 4, STW);
+ } else {
+ tcg_out_ldst_asi(s, datalo, addr, 0, STWA, ASI_PRIMARY_LITTLE);
+ tcg_out_ldst_asi(s, datahi, addr, 4, STWA, ASI_PRIMARY_LITTLE);
+ }
+ }
break;
default:
tcg_abort();
}
-
-#if defined(CONFIG_SOFTMMU)
- /* label2: */
- *label2_ptr = (INSN_OP(0) | INSN_COND(COND_A, 0) | INSN_OP2(0x2) |
- INSN_OFF22((unsigned long)s->code_ptr -
- (unsigned long)label2_ptr));
-#endif
}
-static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args,
- int opc)
+static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
{
- int addr_reg, data_reg, arg0, arg1, arg2, mem_index, s_bits;
+ int addrlo_idx = 1, datalo, datahi, addr_reg;
#if defined(CONFIG_SOFTMMU)
- uint32_t *label1_ptr, *label2_ptr;
+ int memi_idx, memi, n;
+ uint32_t *label_ptr[2];
#endif
- data_reg = *args++;
- addr_reg = *args++;
- mem_index = *args;
-
- s_bits = opc;
-
- arg0 = TCG_REG_O0;
- arg1 = TCG_REG_O1;
- arg2 = TCG_REG_O2;
+ datahi = datalo = args[0];
+ if (TCG_TARGET_REG_BITS == 32 && opc == 3) {
+ datahi = args[1];
+ addrlo_idx = 2;
+ }
#if defined(CONFIG_SOFTMMU)
- /* srl addr_reg, x, arg1 */
- tcg_out_arithi(s, arg1, addr_reg, TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS,
- SHIFT_SRL);
+ memi_idx = addrlo_idx + 1 + (TARGET_LONG_BITS > TCG_TARGET_REG_BITS);
+ memi = args[memi_idx];
- /* and addr_reg, x, arg0 */
- tcg_out_arithi(s, arg0, addr_reg, TARGET_PAGE_MASK | ((1 << s_bits) - 1),
- ARITH_AND);
+ addr_reg = tcg_out_tlb_load(s, addrlo_idx, memi, opc, args,
+ label_ptr, offsetof(CPUTLBEntry, addr_write));
- /* and arg1, x, arg1 */
- tcg_out_andi(s, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
+ /* TLB Hit. */
+ tcg_out_qemu_st_direct(s, addr_reg, datalo, datahi, opc);
- /* add arg1, x, arg1 */
- tcg_out_addi(s, arg1, offsetof(CPUArchState,
- tlb_table[mem_index][0].addr_write));
+ /* b,pt,n label1 */
+ label_ptr[1] = (uint32_t *)s->code_ptr;
+ tcg_out32(s, (INSN_OP(0) | INSN_COND(COND_A, 0) | INSN_OP2(0x1)
+ | (1 << 29) | (1 << 19)));
- /* add env, arg1, arg1 */
- tcg_out_arith(s, arg1, TCG_AREG0, arg1, ARITH_ADD);
+ /* TLB Miss. */
- /* ld [arg1], arg2 */
- tcg_out32(s, TARGET_LD_OP | INSN_RD(arg2) | INSN_RS1(arg1) |
- INSN_RS2(TCG_REG_G0));
-
- /* subcc arg0, arg2, %g0 */
- tcg_out_arith(s, TCG_REG_G0, arg0, arg2, ARITH_SUBCC);
-
- /* will become:
- be label1
- or
- be,pt %xcc label1 */
- label1_ptr = (uint32_t *)s->code_ptr;
- tcg_out32(s, 0);
-
- /* mov (delay slot) */
- tcg_out_mov(s, TCG_TYPE_PTR, arg0, addr_reg);
-
- /* mov */
- tcg_out_mov(s, TCG_TYPE_REG, arg1, data_reg);
-
- /* mov */
- tcg_out_movi(s, TCG_TYPE_I32, arg2, mem_index);
+ *label_ptr[0] |= INSN_OFF19((unsigned long)s->code_ptr -
+ (unsigned long)label_ptr[0]);
+ n = 0;
#ifdef CONFIG_TCG_PASS_AREG0
- /* XXX/FIXME: suboptimal */
- tcg_out_mov(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3],
- tcg_target_call_iarg_regs[2]);
- tcg_out_mov(s, TCG_TYPE_I64, tcg_target_call_iarg_regs[2],
- tcg_target_call_iarg_regs[1]);
- tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
- tcg_target_call_iarg_regs[0]);
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0],
- TCG_AREG0);
+ tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[n++], TCG_AREG0);
#endif
- /* XXX: move that code at the end of the TB */
- /* qemu_st_helper[s_bits](arg0, arg1, arg2) */
- tcg_out32(s, CALL | ((((tcg_target_ulong)qemu_st_helpers[s_bits]
- - (tcg_target_ulong)s->code_ptr) >> 2)
- & 0x3fffffff));
+ if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
+ tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++],
+ args[addrlo_idx + 1]);
+ }
+ tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++],
+ args[addrlo_idx]);
+ if (TCG_TARGET_REG_BITS == 32 && opc == 3) {
+ tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++], datahi);
+ }
+ tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++], datalo);
+
/* Store AREG0 in stack to avoid ugly glibc bugs that mangle
global registers */
- // delay slot
- tcg_out_ldst(s, TCG_AREG0, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
- sizeof(long), HOST_ST_OP);
- tcg_out_ldst(s, TCG_AREG0, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
- sizeof(long), HOST_LD_OP);
-
- /* will become:
- ba label2 */
- label2_ptr = (uint32_t *)s->code_ptr;
- tcg_out32(s, 0);
-
- /* nop (delay slot) */
- tcg_out_nop(s);
-
-#if TARGET_LONG_BITS == 32
- /* be label1 */
- *label1_ptr = (INSN_OP(0) | INSN_COND(COND_E, 0) | INSN_OP2(0x2) |
- INSN_OFF22((unsigned long)s->code_ptr -
- (unsigned long)label1_ptr));
-#else
- /* be,pt %xcc label1 */
- *label1_ptr = (INSN_OP(0) | INSN_COND(COND_E, 0) | INSN_OP2(0x1) |
- (0x5 << 19) | INSN_OFF19((unsigned long)s->code_ptr -
- (unsigned long)label1_ptr));
-#endif
+ tcg_out_st(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK,
+ TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
+ sizeof(long));
- /* ld [arg1 + x], arg1 */
- tcg_out_ldst(s, arg1, arg1, offsetof(CPUTLBEntry, addend) -
- offsetof(CPUTLBEntry, addr_write), TARGET_ADDEND_LD_OP);
+ /* qemu_st_helper[s_bits](arg0, arg1, arg2) */
+ tcg_out32(s, CALL | ((((tcg_target_ulong)qemu_st_helpers[opc]
+ - (tcg_target_ulong)s->code_ptr) >> 2)
+ & 0x3fffffff));
+ /* delay slot */
+ tcg_out_movi(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n], memi);
-#if TARGET_LONG_BITS == 32
- /* and addr_reg, x, arg0 */
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_I5, 0xffffffff);
- tcg_out_arith(s, arg0, addr_reg, TCG_REG_I5, ARITH_AND);
- /* add arg0, arg1, arg0 */
- tcg_out_arith(s, arg0, arg0, arg1, ARITH_ADD);
-#else
- /* add addr_reg, arg1, arg0 */
- tcg_out_arith(s, arg0, addr_reg, arg1, ARITH_ADD);
-#endif
+ /* Reload AREG0. */
+ tcg_out_ld(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK,
+ TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
+ sizeof(long));
+ *label_ptr[1] |= INSN_OFF19((unsigned long)s->code_ptr -
+ (unsigned long)label_ptr[1]);
#else
- arg0 = addr_reg;
-#endif
-
- switch(opc) {
- case 0:
- /* stb data_reg, [arg0] */
- tcg_out_ldst(s, data_reg, arg0, 0, STB);
- break;
- case 1:
-#ifdef TARGET_WORDS_BIGENDIAN
- /* sth data_reg, [arg0] */
- tcg_out_ldst(s, data_reg, arg0, 0, STH);
-#else
- /* stha data_reg, [arg0] ASI_PRIMARY_LITTLE */
- tcg_out_ldst_asi(s, data_reg, arg0, 0, STHA, ASI_PRIMARY_LITTLE);
-#endif
- break;
- case 2:
-#ifdef TARGET_WORDS_BIGENDIAN
- /* stw data_reg, [arg0] */
- tcg_out_ldst(s, data_reg, arg0, 0, STW);
-#else
- /* stwa data_reg, [arg0] ASI_PRIMARY_LITTLE */
- tcg_out_ldst_asi(s, data_reg, arg0, 0, STWA, ASI_PRIMARY_LITTLE);
-#endif
- break;
- case 3:
-#ifdef TARGET_WORDS_BIGENDIAN
- /* stx data_reg, [arg0] */
- tcg_out_ldst(s, data_reg, arg0, 0, STX);
-#else
- /* stxa data_reg, [arg0] ASI_PRIMARY_LITTLE */
- tcg_out_ldst_asi(s, data_reg, arg0, 0, STXA, ASI_PRIMARY_LITTLE);
-#endif
- break;
- default:
- tcg_abort();
+ addr_reg = args[addrlo_idx];
+ if (TCG_TARGET_REG_BITS == 64 && TARGET_LONG_BITS == 32) {
+ tcg_out_arithi(s, TCG_REG_I5, addr_reg, 0, SHIFT_SRL);
+ addr_reg = TCG_REG_I5;
}
-
-#if defined(CONFIG_SOFTMMU)
- /* label2: */
- *label2_ptr = (INSN_OP(0) | INSN_COND(COND_A, 0) | INSN_OP2(0x2) |
- INSN_OFF22((unsigned long)s->code_ptr -
- (unsigned long)label2_ptr));
-#endif
+ tcg_out_qemu_st_direct(s, addr_reg, datalo, datahi, opc);
+#endif /* CONFIG_SOFTMMU */
}
static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
@@ -1205,12 +1189,12 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
/* Store AREG0 in stack to avoid ugly glibc bugs that mangle
global registers */
// delay slot
- tcg_out_ldst(s, TCG_AREG0, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
- sizeof(long), HOST_ST_OP);
- tcg_out_ldst(s, TCG_AREG0, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
- sizeof(long), HOST_LD_OP);
+ tcg_out_st(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK,
+ TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
+ sizeof(long));
+ tcg_out_ld(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK,
+ TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
+ sizeof(long));
break;
case INDEX_op_jmp:
case INDEX_op_br:
@@ -1378,6 +1362,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
tcg_out_qemu_ld(s, args, 2 | 4);
break;
#endif
+ case INDEX_op_qemu_ld64:
+ tcg_out_qemu_ld(s, args, 3);
+ break;
case INDEX_op_qemu_st8:
tcg_out_qemu_st(s, args, 0);
break;
@@ -1387,6 +1374,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
case INDEX_op_qemu_st32:
tcg_out_qemu_st(s, args, 2);
break;
+ case INDEX_op_qemu_st64:
+ tcg_out_qemu_st(s, args, 3);
+ break;
#if TCG_TARGET_REG_BITS == 64
case INDEX_op_movi_i64:
@@ -1451,13 +1441,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
args[2], const_args[2]);
break;
- case INDEX_op_qemu_ld64:
- tcg_out_qemu_ld(s, args, 3);
- break;
- case INDEX_op_qemu_st64:
- tcg_out_qemu_st(s, args, 3);
- break;
-
#endif
gen_arith:
tcg_out_arithc(s, args[0], args[1], args[2], const_args[2], c);
@@ -1522,20 +1505,6 @@ static const TCGTargetOpDef sparc_op_defs[] = {
{ INDEX_op_mulu2_i32, { "r", "r", "r", "rJ" } },
#endif
- { INDEX_op_qemu_ld8u, { "r", "L" } },
- { INDEX_op_qemu_ld8s, { "r", "L" } },
- { INDEX_op_qemu_ld16u, { "r", "L" } },
- { INDEX_op_qemu_ld16s, { "r", "L" } },
- { INDEX_op_qemu_ld32, { "r", "L" } },
-#if TCG_TARGET_REG_BITS == 64
- { INDEX_op_qemu_ld32u, { "r", "L" } },
- { INDEX_op_qemu_ld32s, { "r", "L" } },
-#endif
-
- { INDEX_op_qemu_st8, { "L", "L" } },
- { INDEX_op_qemu_st16, { "L", "L" } },
- { INDEX_op_qemu_st32, { "L", "L" } },
-
#if TCG_TARGET_REG_BITS == 64
{ INDEX_op_mov_i64, { "r", "r" } },
{ INDEX_op_movi_i64, { "r" } },
@@ -1550,8 +1519,6 @@ static const TCGTargetOpDef sparc_op_defs[] = {
{ INDEX_op_st16_i64, { "r", "r" } },
{ INDEX_op_st32_i64, { "r", "r" } },
{ INDEX_op_st_i64, { "r", "r" } },
- { INDEX_op_qemu_ld64, { "L", "L" } },
- { INDEX_op_qemu_st64, { "L", "L" } },
{ INDEX_op_add_i64, { "r", "r", "rJ" } },
{ INDEX_op_mul_i64, { "r", "r", "rJ" } },
@@ -1578,10 +1545,48 @@ static const TCGTargetOpDef sparc_op_defs[] = {
{ INDEX_op_brcond_i64, { "r", "rJ" } },
{ INDEX_op_setcond_i64, { "r", "r", "rJ" } },
-#else
- { INDEX_op_qemu_ld64, { "L", "L", "L" } },
+#endif
+
+#if TCG_TARGET_REG_BITS == 64
+ { INDEX_op_qemu_ld8u, { "r", "L" } },
+ { INDEX_op_qemu_ld8s, { "r", "L" } },
+ { INDEX_op_qemu_ld16u, { "r", "L" } },
+ { INDEX_op_qemu_ld16s, { "r", "L" } },
+ { INDEX_op_qemu_ld32, { "r", "L" } },
+ { INDEX_op_qemu_ld32u, { "r", "L" } },
+ { INDEX_op_qemu_ld32s, { "r", "L" } },
+ { INDEX_op_qemu_ld64, { "r", "L" } },
+
+ { INDEX_op_qemu_st8, { "L", "L" } },
+ { INDEX_op_qemu_st16, { "L", "L" } },
+ { INDEX_op_qemu_st32, { "L", "L" } },
+ { INDEX_op_qemu_st64, { "L", "L" } },
+#elif TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
+ { INDEX_op_qemu_ld8u, { "r", "L" } },
+ { INDEX_op_qemu_ld8s, { "r", "L" } },
+ { INDEX_op_qemu_ld16u, { "r", "L" } },
+ { INDEX_op_qemu_ld16s, { "r", "L" } },
+ { INDEX_op_qemu_ld32, { "r", "L" } },
+ { INDEX_op_qemu_ld64, { "r", "r", "L" } },
+
+ { INDEX_op_qemu_st8, { "L", "L" } },
+ { INDEX_op_qemu_st16, { "L", "L" } },
+ { INDEX_op_qemu_st32, { "L", "L" } },
{ INDEX_op_qemu_st64, { "L", "L", "L" } },
+#else
+ { INDEX_op_qemu_ld8u, { "r", "L", "L" } },
+ { INDEX_op_qemu_ld8s, { "r", "L", "L" } },
+ { INDEX_op_qemu_ld16u, { "r", "L", "L" } },
+ { INDEX_op_qemu_ld16s, { "r", "L", "L" } },
+ { INDEX_op_qemu_ld32, { "r", "L", "L" } },
+ { INDEX_op_qemu_ld64, { "L", "L", "L", "L" } },
+
+ { INDEX_op_qemu_st8, { "L", "L", "L" } },
+ { INDEX_op_qemu_st16, { "L", "L", "L" } },
+ { INDEX_op_qemu_st32, { "L", "L", "L" } },
+ { INDEX_op_qemu_st64, { "L", "L", "L", "L" } },
#endif
+
{ -1 },
};
--
1.7.7.6
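
Note on the helper call emitted in the TLB-miss path above: the word written by tcg_out32() is a SPARC CALL, whose low 30 bits hold the word displacement from the call site to the target. The sketch below restates that arithmetic in isolation. The names encode_call and CALL_OP are hypothetical and do not appear in the patch; CALL_OP assumes the standard SPARC format-1 encoding (op field 01 in bits 31:30).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SPARC format-1 opcode: op = 01 in bits 31:30. */
#define CALL_OP (1u << 30)

/* Hypothetical helper: build a CALL word for a call site at 'pc' that
   targets 'target'.  Both addresses must be 4-byte aligned. */
static uint32_t encode_call(uint64_t pc, uint64_t target)
{
    uint64_t delta = target - pc;   /* wraps for backward calls */

    assert((delta & 3) == 0);
    /* disp30 is the byte delta divided by 4, truncated to 30 bits. */
    return CALL_OP | (uint32_t)((delta >> 2) & 0x3fffffff);
}
```

The unsigned subtraction wraps for backward targets, and masking to 30 bits leaves the two's-complement displacement, which is the same trick the patch relies on.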
* [Qemu-devel] [PATCH 05/15] tcg-sparc: Simplify qemu_ld/st direct memory paths.
From: Richard Henderson @ 2012-03-25 22:27 UTC (permalink / raw)
To: qemu-devel; +Cc: Blue Swirl
Given that we have an opcode for every access size and for both
endiannesses, turn the direct load/store functions into a simple
table lookup.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
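
A minimal sketch of the table-lookup idea, for readers skimming the diff. It assumes the index convention visible in the patch (low two bits are log2 of the access size, bit 2 requests sign extension). The enum values are symbolic tags only, not instruction words, and pick_load is a hypothetical name that does not exist in tcg-target.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Symbolic tags; the real table stores encoded instruction words. */
enum ld_op {
    LDUB, LDUH, LDUW, LDX, LDSB, LDSH, LDSW,
    LDUH_LE, LDUW_LE, LDX_LE, LDSH_LE, LDSW_LE
};

/* Index = log2(size in bytes) | (sign_extend ? 4 : 0). */
static const enum ld_op ld_be[8] = {
    LDUB, LDUH, LDUW, LDX, LDSB, LDSH, LDSW, LDX
};
static const enum ld_op ld_le[8] = {
    LDUB, LDUH_LE, LDUW_LE, LDX_LE, LDSB, LDSH_LE, LDSW_LE, LDX_LE
};

/* Hypothetical helper: choose the load for a guest access. */
static enum ld_op pick_load(int sizeop, bool guest_big_endian)
{
    assert(sizeop >= 0 && sizeop < 8);
    return guest_big_endian ? ld_be[sizeop] : ld_le[sizeop];
}
```

Byte loads need no little-endian variant, and a 64-bit load has nothing left to sign-extend, which is why both tables repeat LDUB/LDSB and the LDX entry.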
tcg/sparc/tcg-target.c | 209 +++++++++++++-----------------------------------
1 files changed, 56 insertions(+), 153 deletions(-)
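
For the 32-bit (v8plus) host, the diff below handles a 64-bit guest access by moving one 64-bit value between memory and a register pair. The following sketch models the register arithmetic only, under the assumption that a register holding a 32-bit value may carry garbage in its upper half. merge64 and split64 are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Store side: srl datalo,0 ; sllx datahi,32 ; or. */
static uint64_t merge64(uint64_t datalo_reg, uint64_t datahi_reg)
{
    uint64_t lo = datalo_reg & 0xffffffffu;  /* srl by 0 clears bits 63:32 */
    uint64_t hi = datahi_reg << 32;          /* sllx by 32 */

    return lo | hi;
}

/* Load side: srlx reg64,32 -> datahi ; low half stays in datalo. */
static void split64(uint64_t reg64, uint32_t *datalo, uint32_t *datahi)
{
    assert(datalo != 0 && datahi != 0);
    *datahi = (uint32_t)(reg64 >> 32);
    *datalo = (uint32_t)reg64;
}
```

The zero-extending shift on the low half matters: without it, stale upper bits in the low register would be ORed into the high word of the stored value.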
diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index 8763b03..1b27626 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -294,6 +294,16 @@ static inline int tcg_target_const_match(tcg_target_long val,
#define ASI_PRIMARY_LITTLE 0x88
#endif
+#define LDUH_LE (LDUHA | INSN_ASI(ASI_PRIMARY_LITTLE))
+#define LDSH_LE (LDSHA | INSN_ASI(ASI_PRIMARY_LITTLE))
+#define LDUW_LE (LDUWA | INSN_ASI(ASI_PRIMARY_LITTLE))
+#define LDSW_LE (LDSWA | INSN_ASI(ASI_PRIMARY_LITTLE))
+#define LDX_LE (LDXA | INSN_ASI(ASI_PRIMARY_LITTLE))
+
+#define STH_LE (STHA | INSN_ASI(ASI_PRIMARY_LITTLE))
+#define STW_LE (STWA | INSN_ASI(ASI_PRIMARY_LITTLE))
+#define STX_LE (STXA | INSN_ASI(ASI_PRIMARY_LITTLE))
+
static inline void tcg_out_arith(TCGContext *s, int rd, int rs1, int rs2,
int op)
{
@@ -366,66 +376,46 @@ static inline void tcg_out_movi(TCGContext *s, TCGType type,
}
}
-static inline void tcg_out_ld_raw(TCGContext *s, int ret,
- tcg_target_long arg)
+static inline void tcg_out_ldst_rr(TCGContext *s, int data, int a1,
+ int a2, int op)
{
- tcg_out_sethi(s, ret, arg);
- tcg_out32(s, LDUW | INSN_RD(ret) | INSN_RS1(ret) |
- INSN_IMM13(arg & 0x3ff));
+ tcg_out32(s, op | INSN_RD(data) | INSN_RS1(a1) | INSN_RS2(a2));
}
-static inline void tcg_out_ld_ptr(TCGContext *s, int ret,
- tcg_target_long arg)
+static inline void tcg_out_ldst(TCGContext *s, int ret, int addr,
+ int offset, int op)
{
- if (!check_fit_tl(arg, 10))
- tcg_out_movi(s, TCG_TYPE_PTR, ret, arg & ~0x3ffULL);
- if (TCG_TARGET_REG_BITS == 64) {
- tcg_out32(s, LDX | INSN_RD(ret) | INSN_RS1(ret) |
- INSN_IMM13(arg & 0x3ff));
- } else {
- tcg_out32(s, LDUW | INSN_RD(ret) | INSN_RS1(ret) |
- INSN_IMM13(arg & 0x3ff));
- }
-}
-
-static inline void tcg_out_ldst(TCGContext *s, int ret, int addr, int offset, int op)
-{
- if (check_fit_tl(offset, 13))
+ if (check_fit_tl(offset, 13)) {
tcg_out32(s, op | INSN_RD(ret) | INSN_RS1(addr) |
INSN_IMM13(offset));
- else {
+ } else {
tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_I5, offset);
- tcg_out32(s, op | INSN_RD(ret) | INSN_RS1(TCG_REG_I5) |
- INSN_RS2(addr));
+ tcg_out_ldst_rr(s, ret, addr, TCG_REG_I5, op);
}
}
-static inline void tcg_out_ldst_asi(TCGContext *s, int ret, int addr,
- int offset, int op, int asi)
-{
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_I5, offset);
- tcg_out32(s, op | INSN_RD(ret) | INSN_RS1(TCG_REG_I5) |
- INSN_ASI(asi) | INSN_RS2(addr));
-}
-
static inline void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret,
TCGReg arg1, tcg_target_long arg2)
{
- if (type == TCG_TYPE_I32)
- tcg_out_ldst(s, ret, arg1, arg2, LDUW);
- else
- tcg_out_ldst(s, ret, arg1, arg2, LDX);
+ tcg_out_ldst(s, ret, arg1, arg2, (type == TCG_TYPE_I32 ? LDUW : LDX));
}
static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg,
TCGReg arg1, tcg_target_long arg2)
{
- if (type == TCG_TYPE_I32)
- tcg_out_ldst(s, arg, arg1, arg2, STW);
- else
- tcg_out_ldst(s, arg, arg1, arg2, STX);
+ tcg_out_ldst(s, arg, arg1, arg2, (type == TCG_TYPE_I32 ? STW : STX));
}
+static inline void tcg_out_ld_ptr(TCGContext *s, int ret,
+ tcg_target_long arg)
+{
+ if (!check_fit_tl(arg, 10)) {
+ tcg_out_movi(s, TCG_TYPE_PTR, ret, arg & ~0x3ff);
+ }
+ tcg_out_ld(s, TCG_TYPE_PTR, ret, ret, arg & 0x3ff);
+}
+
+
static inline void tcg_out_sety(TCGContext *s, int rs)
{
tcg_out32(s, WRY | INSN_RS1(TCG_REG_G0) | INSN_RS2(rs));
@@ -833,76 +823,26 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, int addr, int datalo,
int datahi, int sizeop)
{
#ifdef TARGET_WORDS_BIGENDIAN
- const int bigendian = 1;
+ static const int ld_opc[8] = {
+ LDUB, LDUH, LDUW, LDX, LDSB, LDSH, LDSW, LDX
+ };
#else
- const int bigendian = 0;
+ static const int ld_opc[8] = {
+ LDUB, LDUH_LE, LDUW_LE, LDX_LE, LDSB, LDSH_LE, LDSW_LE, LDX_LE
+ };
#endif
- switch (sizeop) {
- case 0:
- /* ldub [addr], datalo */
- tcg_out_ldst(s, datalo, addr, 0, LDUB);
- break;
- case 0 | 4:
- /* ldsb [addr], datalo */
- tcg_out_ldst(s, datalo, addr, 0, LDSB);
- break;
- case 1:
- if (bigendian) {
- /* lduh [addr], datalo */
- tcg_out_ldst(s, datalo, addr, 0, LDUH);
- } else {
- /* lduha [addr] ASI_PRIMARY_LITTLE, datalo */
- tcg_out_ldst_asi(s, datalo, addr, 0, LDUHA, ASI_PRIMARY_LITTLE);
- }
- break;
- case 1 | 4:
- if (bigendian) {
- /* ldsh [addr], datalo */
- tcg_out_ldst(s, datalo, addr, 0, LDSH);
- } else {
- /* ldsha [addr] ASI_PRIMARY_LITTLE, datalo */
- tcg_out_ldst_asi(s, datalo, addr, 0, LDSHA, ASI_PRIMARY_LITTLE);
- }
- break;
- case 2:
- if (bigendian) {
- /* lduw [addr], datalo */
- tcg_out_ldst(s, datalo, addr, 0, LDUW);
- } else {
- /* lduwa [addr] ASI_PRIMARY_LITTLE, datalo */
- tcg_out_ldst_asi(s, datalo, addr, 0, LDUWA, ASI_PRIMARY_LITTLE);
- }
- break;
- case 2 | 4:
- if (bigendian) {
- /* ldsw [addr], datalo */
- tcg_out_ldst(s, datalo, addr, 0, LDSW);
- } else {
- /* ldswa [addr] ASI_PRIMARY_LITTLE, datalo */
- tcg_out_ldst_asi(s, datalo, addr, 0, LDSWA, ASI_PRIMARY_LITTLE);
- }
- break;
- case 3:
- if (TCG_TARGET_REG_BITS == 64) {
- if (bigendian) {
- /* ldx [addr], datalo */
- tcg_out_ldst(s, datalo, addr, 0, LDX);
- } else {
- /* ldxa [addr] ASI_PRIMARY_LITTLE, datalo */
- tcg_out_ldst_asi(s, datalo, addr, 0, LDXA, ASI_PRIMARY_LITTLE);
- }
- } else {
- if (bigendian) {
- tcg_out_ldst(s, datahi, addr, 0, LDUW);
- tcg_out_ldst(s, datalo, addr, 4, LDUW);
- } else {
- tcg_out_ldst_asi(s, datalo, addr, 0, LDUWA, ASI_PRIMARY_LITTLE);
- tcg_out_ldst_asi(s, datahi, addr, 4, LDUWA, ASI_PRIMARY_LITTLE);
- }
+
+ if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
+ /* Load all 64-bits into an O/G register. */
+ int reg64 = (datalo < 16 ? datalo : TCG_REG_O0);
+ tcg_out_ldst_rr(s, reg64, addr, TCG_REG_G0, ld_opc[sizeop]);
+ /* Move the two 32-bit pieces into the destination registers. */
+ tcg_out_arithi(s, datahi, reg64, 32, SHIFT_SRLX);
+ if (reg64 != datalo) {
+ tcg_out_mov(s, TCG_TYPE_I32, datalo, reg64);
}
- break;
- default:
- tcg_abort();
+ } else {
+ tcg_out_ldst_rr(s, datalo, addr, TCG_REG_G0, ld_opc[sizeop]);
}
}
@@ -1016,55 +956,18 @@ static void tcg_out_qemu_st_direct(TCGContext *s, int addr, int datalo,
int datahi, int sizeop)
{
#ifdef TARGET_WORDS_BIGENDIAN
- const int bigendian = 1;
+ static const int st_opc[4] = { STB, STH, STW, STX };
#else
- const int bigendian = 0;
+ static const int st_opc[4] = { STB, STH_LE, STW_LE, STX_LE };
#endif
- switch (sizeop) {
- case 0:
- /* stb datalo, [addr] */
- tcg_out_ldst(s, datalo, addr, 0, STB);
- break;
- case 1:
- if (bigendian) {
- /* sth datalo, [addr] */
- tcg_out_ldst(s, datalo, addr, 0, STH);
- } else {
- /* stha datalo, [addr] ASI_PRIMARY_LITTLE */
- tcg_out_ldst_asi(s, datalo, addr, 0, STHA, ASI_PRIMARY_LITTLE);
- }
- break;
- case 2:
- if (bigendian) {
- /* stw datalo, [addr] */
- tcg_out_ldst(s, datalo, addr, 0, STW);
- } else {
- /* stwa datalo, [addr] ASI_PRIMARY_LITTLE */
- tcg_out_ldst_asi(s, datalo, addr, 0, STWA, ASI_PRIMARY_LITTLE);
- }
- break;
- case 3:
- if (TCG_TARGET_REG_BITS == 64) {
- if (bigendian) {
- /* stx datalo, [addr] */
- tcg_out_ldst(s, datalo, addr, 0, STX);
- } else {
- /* stxa datalo, [addr] ASI_PRIMARY_LITTLE */
- tcg_out_ldst_asi(s, datalo, addr, 0, STXA, ASI_PRIMARY_LITTLE);
- }
- } else {
- if (bigendian) {
- tcg_out_ldst(s, datahi, addr, 0, STW);
- tcg_out_ldst(s, datalo, addr, 4, STW);
- } else {
- tcg_out_ldst_asi(s, datalo, addr, 0, STWA, ASI_PRIMARY_LITTLE);
- tcg_out_ldst_asi(s, datahi, addr, 4, STWA, ASI_PRIMARY_LITTLE);
- }
- }
- break;
- default:
- tcg_abort();
+
+ if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
+ tcg_out_arithi(s, TCG_REG_O0, datalo, 0, SHIFT_SRL);
+ tcg_out_arithi(s, TCG_REG_O2, datahi, 32, SHIFT_SLLX);
+ tcg_out_arith(s, TCG_REG_O0, TCG_REG_O0, TCG_REG_O2, ARITH_OR);
+ datalo = TCG_REG_O0;
}
+ tcg_out_ldst_rr(s, datalo, addr, TCG_REG_G0, st_opc[sizeop]);
}
static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
--
1.7.7.6
* [Qemu-devel] [PATCH 06/15] tcg-sparc: Support GUEST_BASE.
From: Richard Henderson @ 2012-03-25 22:27 UTC (permalink / raw)
To: qemu-devel; +Cc: Blue Swirl
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
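
This patch carries no description, so here is a rough model of what the user-mode (non-SOFTMMU) path computes after the change. It assumes a 64-bit host; host_address is a hypothetical name. In the generated code the addition is free, because the load or store uses register+register addressing with either the GUEST_BASE register or %g0 (which always reads as zero) as the second operand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the address used by a direct guest access. */
static uint64_t host_address(uint64_t addr_reg, uint64_t guest_base,
                             int target_long_bits)
{
    assert(target_long_bits == 32 || target_long_bits == 64);
    if (target_long_bits == 32) {
        /* srl addr,0: drop whatever sits above a 32-bit guest address. */
        addr_reg &= 0xffffffffu;
    }
    /* ld/st [addr + base]; base is %g0 when GUEST_BASE is zero. */
    return addr_reg + guest_base;
}
```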
configure | 2 ++
tcg/sparc/tcg-target.c | 40 +++++++++++++++++++++++++++++-----------
tcg/sparc/tcg-target.h | 2 ++
3 files changed, 33 insertions(+), 11 deletions(-)
diff --git a/configure b/configure
index 7741ba9..a79a090 100755
--- a/configure
+++ b/configure
@@ -819,6 +819,7 @@ case "$cpu" in
if test "$solaris" = "no" ; then
QEMU_CFLAGS="-ffixed-g1 -ffixed-g6 $QEMU_CFLAGS"
fi
+ host_guest_base="yes"
;;
sparc64)
LDFLAGS="-m64 $LDFLAGS"
@@ -827,6 +828,7 @@ case "$cpu" in
if test "$solaris" != "no" ; then
QEMU_CFLAGS="-ffixed-g1 $QEMU_CFLAGS"
fi
+ host_guest_base="yes"
;;
s390)
QEMU_CFLAGS="-m31 -march=z990 $QEMU_CFLAGS"
diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index 1b27626..9891648 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -59,6 +59,12 @@ static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
};
#endif
+#ifdef CONFIG_USE_GUEST_BASE
+# define TCG_GUEST_BASE_REG TCG_REG_I3
+#else
+# define TCG_GUEST_BASE_REG TCG_REG_G0
+#endif
+
#ifdef CONFIG_TCG_PASS_AREG0
#define ARG_OFFSET 1
#else
@@ -689,6 +695,14 @@ static void tcg_target_qemu_prologue(TCGContext *s)
tcg_out32(s, SAVE | INSN_RD(TCG_REG_O6) | INSN_RS1(TCG_REG_O6) |
INSN_IMM13(-(TCG_TARGET_STACK_MINFRAME +
CPU_TEMP_BUF_NLONGS * (int)sizeof(long))));
+
+#ifdef CONFIG_USE_GUEST_BASE
+ if (GUEST_BASE != 0) {
+ tcg_out_movi(s, TCG_TYPE_PTR, TCG_GUEST_BASE_REG, GUEST_BASE);
+ tcg_regset_set_reg(s->reserved_regs, TCG_GUEST_BASE_REG);
+ }
+#endif
+
tcg_out32(s, JMPL | INSN_RD(TCG_REG_G0) | INSN_RS1(TCG_REG_I1) |
INSN_RS2(TCG_REG_G0));
tcg_out_mov(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_I0);
@@ -819,8 +833,8 @@ static int tcg_out_tlb_load(TCGContext *s, int addrlo_idx, int mem_index,
}
#endif /* CONFIG_SOFTMMU */
-static void tcg_out_qemu_ld_direct(TCGContext *s, int addr, int datalo,
- int datahi, int sizeop)
+static void tcg_out_qemu_ld_direct(TCGContext *s, int addr, int addend,
+ int datalo, int datahi, int sizeop)
{
#ifdef TARGET_WORDS_BIGENDIAN
static const int ld_opc[8] = {
@@ -835,14 +849,14 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, int addr, int datalo,
if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
/* Load all 64-bits into an O/G register. */
int reg64 = (datalo < 16 ? datalo : TCG_REG_O0);
- tcg_out_ldst_rr(s, reg64, addr, TCG_REG_G0, ld_opc[sizeop]);
+ tcg_out_ldst_rr(s, reg64, addr, addend, ld_opc[sizeop]);
/* Move the two 32-bit pieces into the destination registers. */
tcg_out_arithi(s, datahi, reg64, 32, SHIFT_SRLX);
if (reg64 != datalo) {
tcg_out_mov(s, TCG_TYPE_I32, datalo, reg64);
}
} else {
- tcg_out_ldst_rr(s, datalo, addr, TCG_REG_G0, ld_opc[sizeop]);
+ tcg_out_ldst_rr(s, datalo, addr, addend, ld_opc[sizeop]);
}
}
@@ -869,7 +883,7 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc)
label_ptr, offsetof(CPUTLBEntry, addr_read));
/* TLB Hit. */
- tcg_out_qemu_ld_direct(s, addr_reg, datalo, datahi, opc);
+ tcg_out_qemu_ld_direct(s, addr_reg, TCG_REG_G0, datalo, datahi, opc);
/* b,pt,n label1 */
label_ptr[1] = (uint32_t *)s->code_ptr;
@@ -948,12 +962,14 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc)
tcg_out_arithi(s, TCG_REG_I5, addr_reg, 0, SHIFT_SRL);
addr_reg = TCG_REG_I5;
}
- tcg_out_qemu_ld_direct(s, addr_reg, datalo, datahi, opc);
+ tcg_out_qemu_ld_direct(s, addr_reg,
+ (GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_G0),
+ datalo, datahi, opc);
#endif /* CONFIG_SOFTMMU */
}
-static void tcg_out_qemu_st_direct(TCGContext *s, int addr, int datalo,
- int datahi, int sizeop)
+static void tcg_out_qemu_st_direct(TCGContext *s, int addr, int addend,
+ int datalo, int datahi, int sizeop)
{
#ifdef TARGET_WORDS_BIGENDIAN
static const int st_opc[4] = { STB, STH, STW, STX };
@@ -967,7 +983,7 @@ static void tcg_out_qemu_st_direct(TCGContext *s, int addr, int datalo,
tcg_out_arith(s, TCG_REG_O0, TCG_REG_O0, TCG_REG_O2, ARITH_OR);
datalo = TCG_REG_O0;
}
- tcg_out_ldst_rr(s, datalo, addr, TCG_REG_G0, st_opc[sizeop]);
+ tcg_out_ldst_rr(s, datalo, addr, addend, st_opc[sizeop]);
}
static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
@@ -992,7 +1008,7 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
label_ptr, offsetof(CPUTLBEntry, addr_write));
/* TLB Hit. */
- tcg_out_qemu_st_direct(s, addr_reg, datalo, datahi, opc);
+ tcg_out_qemu_st_direct(s, addr_reg, TCG_REG_G0, datalo, datahi, opc);
/* b,pt,n label1 */
label_ptr[1] = (uint32_t *)s->code_ptr;
@@ -1045,7 +1061,9 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
tcg_out_arithi(s, TCG_REG_I5, addr_reg, 0, SHIFT_SRL);
addr_reg = TCG_REG_I5;
}
- tcg_out_qemu_st_direct(s, addr_reg, datalo, datahi, opc);
+ tcg_out_qemu_st_direct(s, addr_reg,
+ (GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_G0),
+ datalo, datahi, opc);
#endif /* CONFIG_SOFTMMU */
}
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index 56742bf..e69dfc8 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -126,6 +126,8 @@ typedef enum {
#define TCG_TARGET_HAS_deposit_i64 0
#endif
+#define TCG_TARGET_HAS_GUEST_BASE
+
/* Note: must be synced with dyngen-exec.h */
#ifdef CONFIG_SOLARIS
#define TCG_AREG0 TCG_REG_G2
--
1.7.7.6
* [Qemu-devel] [PATCH 07/15] tcg-sparc: Streamline qemu_ld/st more.
From: Richard Henderson @ 2012-03-25 22:27 UTC (permalink / raw)
To: qemu-devel; +Cc: Blue Swirl
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
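
The core of this patch is the "beq,a,pt" branch whose annulled delay slot holds the single fast-path load or store: the slot executes only when the TLB comparison matched, and the branch then skips the miss handler. The sketch below rebuilds that instruction word and the later displacement patch. The macro definitions are not part of this diff; they are written here from the SPARC V9 BPcc format (a in bit 29, cond in 28:25, op2 in 24:22, cc1 in bit 21, p in bit 19, disp19 in 18:0) and should be treated as assumptions. beq_a_pt and patch_off19 are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field encoders, following the SPARC V9 BPcc layout. */
#define INSN_OP(x)      ((uint32_t)(x) << 30)
#define INSN_OP2(x)     ((uint32_t)(x) << 22)
#define INSN_COND(x, a) (((uint32_t)(x) << 25) | ((uint32_t)(a) << 29))
#define INSN_OFF19(x)   (((uint32_t)(x) >> 2) & 0x07ffff)
#define COND_E          0x1

/* beq,a,pt %icc/%xcc with a zero displacement, to be patched later. */
static uint32_t beq_a_pt(int use_xcc)
{
    return INSN_OP(0) | INSN_COND(COND_E, 0) | INSN_OP2(0x1)
         | ((uint32_t)(use_xcc != 0) << 21)   /* cc1: select %xcc */
         | (1u << 29)                         /* annul bit */
         | (1u << 19);                        /* predict taken */
}

/* OR in the word displacement once the branch target is known.
   Assumes the displacement field of 'insn' is still zero. */
static uint32_t patch_off19(uint32_t insn, int32_t byte_offset)
{
    assert((byte_offset & 3) == 0);
    return insn | INSN_OFF19(byte_offset);
}
```

Because the target is resolved only after the miss path has been emitted, the branch is written with an empty displacement and completed with a plain OR, as the "*label_ptr |= INSN_OFF19(...)" lines in the diff do.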
tcg/sparc/tcg-target.c | 235 +++++++++++++++++++++++++----------------------
1 files changed, 125 insertions(+), 110 deletions(-)
diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index 9891648..d45114f 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -761,22 +761,16 @@ static const void * const qemu_st_helpers[4] = {
WHICH is the offset into the CPUTLBEntry structure of the slot to read.
This should be offsetof addr_read or addr_write.
- Outputs:
- LABEL_PTRS is filled with the position of the forward jumps to the
- TLB miss case. This will always be a ,PN insn, so a 19-bit offset.
-
- Returns a register loaded with the low part of the address, adjusted
- as indicated by the TLB and so is a host address. Undefined in the
- TLB miss case. */
+ The result of the TLB comparison is in %[ix]cc. The sanitized address
+ is in the returned register, maybe %o0. The TLB addend is in %o1. */
static int tcg_out_tlb_load(TCGContext *s, int addrlo_idx, int mem_index,
- int s_bits, const TCGArg *args,
- uint32_t **label_ptr, int which)
+ int s_bits, const TCGArg *args, int which)
{
const int addrlo = args[addrlo_idx];
- const int r0 = tcg_target_call_iarg_regs[0];
- const int r1 = tcg_target_call_iarg_regs[1];
- const int r2 = tcg_target_call_iarg_regs[2];
+ const int r0 = TCG_REG_O0;
+ const int r1 = TCG_REG_O1;
+ const int r2 = TCG_REG_O2;
int addr = addrlo;
int tlb_ofs;
@@ -807,60 +801,39 @@ static int tcg_out_tlb_load(TCGContext *s, int addrlo_idx, int mem_index,
tlb_ofs = 0;
}
- /* ld [arg1 + which], arg2 */
+ /* Load the tlb comparator and the addend. */
tcg_out_ld(s, TCG_TYPE_TL, r2, r1, tlb_ofs + which);
+ tcg_out_ld(s, TCG_TYPE_PTR, r1, r1, tlb_ofs+offsetof(CPUTLBEntry, addend));
/* subcc arg0, arg2, %g0 */
tcg_out_cmp(s, r0, r2, 0);
- /* bne,pn %[ix]cc, label0 */
- *label_ptr = (uint32_t *)s->code_ptr;
- tcg_out32(s, (INSN_OP(0) | INSN_COND(COND_NE, 0) | INSN_OP2(0x1) |
- ((TARGET_LONG_BITS == 64) << 21)));
-
- /* TLB Hit. Compute the host address into r1. The ld is in the
- branch delay slot; harmless for the TLB miss case. */
- tcg_out_ld(s, TCG_TYPE_PTR, r1, r1, tlb_ofs+offsetof(CPUTLBEntry, addend));
-
+ /* If the guest address must be zero-extended, do so now. */
if (TCG_TARGET_REG_BITS == 64 && TARGET_LONG_BITS == 32) {
tcg_out_arithi(s, r0, addrlo, 0, SHIFT_SRL);
- tcg_out_arith(s, r1, r0, r1, ARITH_ADD);
- } else {
- tcg_out_arith(s, r1, addrlo, r1, ARITH_ADD);
+ return r0;
}
-
- return r1;
+ return addrlo;
}
#endif /* CONFIG_SOFTMMU */
-static void tcg_out_qemu_ld_direct(TCGContext *s, int addr, int addend,
- int datalo, int datahi, int sizeop)
-{
+static const int qemu_ld_opc[8] = {
#ifdef TARGET_WORDS_BIGENDIAN
- static const int ld_opc[8] = {
- LDUB, LDUH, LDUW, LDX, LDSB, LDSH, LDSW, LDX
- };
+ LDUB, LDUH, LDUW, LDX, LDSB, LDSH, LDSW, LDX
#else
- static const int ld_opc[8] = {
- LDUB, LDUH_LE, LDUW_LE, LDX_LE, LDSB, LDSH_LE, LDSW_LE, LDX_LE
- };
+ LDUB, LDUH_LE, LDUW_LE, LDX_LE, LDSB, LDSH_LE, LDSW_LE, LDX_LE
#endif
+};
- if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
- /* Load all 64-bits into an O/G register. */
- int reg64 = (datalo < 16 ? datalo : TCG_REG_O0);
- tcg_out_ldst_rr(s, reg64, addr, addend, ld_opc[sizeop]);
- /* Move the two 32-bit pieces into the destination registers. */
- tcg_out_arithi(s, datahi, reg64, 32, SHIFT_SRLX);
- if (reg64 != datalo) {
- tcg_out_mov(s, TCG_TYPE_I32, datalo, reg64);
- }
- } else {
- tcg_out_ldst_rr(s, datalo, addr, addend, ld_opc[sizeop]);
- }
-}
+static const int qemu_st_opc[4] = {
+#ifdef TARGET_WORDS_BIGENDIAN
+ STB, STH, STW, STX
+#else
+ STB, STH_LE, STW_LE, STX_LE
+#endif
+};
-static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc)
+static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int sizeop)
{
int addrlo_idx = 1, datalo, datahi, addr_reg;
#if defined(CONFIG_SOFTMMU)
@@ -869,7 +842,7 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc)
#endif
datahi = datalo = args[0];
- if (TCG_TARGET_REG_BITS == 32 && opc == 3) {
+ if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
datahi = args[1];
addrlo_idx = 2;
}
@@ -877,27 +850,59 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc)
#if defined(CONFIG_SOFTMMU)
memi_idx = addrlo_idx + 1 + (TARGET_LONG_BITS > TCG_TARGET_REG_BITS);
memi = args[memi_idx];
- s_bits = opc & 3;
+ s_bits = sizeop & 3;
addr_reg = tcg_out_tlb_load(s, addrlo_idx, memi, s_bits, args,
- label_ptr, offsetof(CPUTLBEntry, addr_read));
+ offsetof(CPUTLBEntry, addr_read));
- /* TLB Hit. */
- tcg_out_qemu_ld_direct(s, addr_reg, TCG_REG_G0, datalo, datahi, opc);
+ if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
+ int reg64;
- /* b,pt,n label1 */
- label_ptr[1] = (uint32_t *)s->code_ptr;
- tcg_out32(s, (INSN_OP(0) | INSN_COND(COND_A, 0) | INSN_OP2(0x1)
- | (1 << 29) | (1 << 19)));
+ /* bne,pn %[xi]cc, label0 */
+ label_ptr[0] = (uint32_t *)s->code_ptr;
+ tcg_out32(s, (INSN_OP(0) | INSN_COND(COND_NE, 0) | INSN_OP2(0x1)
+ | ((TARGET_LONG_BITS == 64) << 21)));
+
+ /* TLB Hit. */
+ /* Load all 64-bits into an O/G register. */
+ reg64 = (datalo < 16 ? datalo : TCG_REG_O0);
+ tcg_out_ldst_rr(s, reg64, addr_reg, TCG_REG_O1, qemu_ld_opc[sizeop]);
+
+ /* Move the two 32-bit pieces into the destination registers. */
+ tcg_out_arithi(s, datahi, reg64, 32, SHIFT_SRLX);
+ if (reg64 != datalo) {
+ tcg_out_mov(s, TCG_TYPE_I32, datalo, reg64);
+ }
+
+ /* b,pt,n label1 */
+ label_ptr[1] = (uint32_t *)s->code_ptr;
+ tcg_out32(s, (INSN_OP(0) | INSN_COND(COND_A, 0) | INSN_OP2(0x1)
+ | (1 << 29) | (1 << 19)));
+ } else {
+ /* The fast path is exactly one insn. Thus we can perform the
+ entire TLB Hit in the (annulled) delay slot of the branch
+ over the TLB Miss case. */
+
+ /* beq,a,pt %[xi]cc, label0 */
+ label_ptr[0] = NULL;
+ label_ptr[1] = (uint32_t *)s->code_ptr;
+ tcg_out32(s, (INSN_OP(0) | INSN_COND(COND_E, 0) | INSN_OP2(0x1)
+ | ((TARGET_LONG_BITS == 64) << 21)
+ | (1 << 29) | (1 << 19)));
+ /* delay slot */
+ tcg_out_ldst_rr(s, datalo, addr_reg, TCG_REG_O1, qemu_ld_opc[sizeop]);
+ }
/* TLB Miss. */
- *label_ptr[0] |= INSN_OFF19((unsigned long)s->code_ptr -
- (unsigned long)label_ptr[0]);
- n = 0;
-#ifdef CONFIG_TCG_PASS_AREG0
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[n++], TCG_AREG0);
-#endif
+ if (label_ptr[0]) {
+ *label_ptr[0] |= INSN_OFF19((unsigned long)s->code_ptr -
+ (unsigned long)label_ptr[0]);
+ }
+ n = ARG_OFFSET;
+ if (ARG_OFFSET) {
+ tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
+ }
if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++],
args[addrlo_idx + 1]);
@@ -925,7 +930,7 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc)
n = tcg_target_call_oarg_regs[0];
/* datalo = sign_extend(arg0) */
- switch(opc) {
+ switch (sizeop) {
case 0 | 4:
/* Recall that SRA sign extends from bit 31 through bit 63. */
tcg_out_arithi(s, datalo, n, 24, SHIFT_SLL);
@@ -962,40 +967,35 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc)
tcg_out_arithi(s, TCG_REG_I5, addr_reg, 0, SHIFT_SRL);
addr_reg = TCG_REG_I5;
}
- tcg_out_qemu_ld_direct(s, addr_reg,
- (GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_G0),
- datalo, datahi, opc);
-#endif /* CONFIG_SOFTMMU */
-}
+ if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
+ int reg64 = (datalo < 16 ? datalo : TCG_REG_O0);
-static void tcg_out_qemu_st_direct(TCGContext *s, int addr, int addend,
- int datalo, int datahi, int sizeop)
-{
-#ifdef TARGET_WORDS_BIGENDIAN
- static const int st_opc[4] = { STB, STH, STW, STX };
-#else
- static const int st_opc[4] = { STB, STH_LE, STW_LE, STX_LE };
-#endif
+ tcg_out_ldst_rr(s, reg64, addr_reg,
+ (GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_G0),
+ qemu_ld_opc[sizeop]);
- if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
- tcg_out_arithi(s, TCG_REG_O0, datalo, 0, SHIFT_SRL);
- tcg_out_arithi(s, TCG_REG_O2, datahi, 32, SHIFT_SLLX);
- tcg_out_arith(s, TCG_REG_O0, TCG_REG_O0, TCG_REG_O2, ARITH_OR);
- datalo = TCG_REG_O0;
+ tcg_out_arithi(s, datahi, reg64, 32, SHIFT_SRLX);
+ if (reg64 != datalo) {
+ tcg_out_mov(s, TCG_TYPE_I32, datalo, reg64);
+ }
+ } else {
+ tcg_out_ldst_rr(s, datalo, addr_reg,
+ (GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_G0),
+ qemu_ld_opc[sizeop]);
}
- tcg_out_ldst_rr(s, datalo, addr, addend, st_opc[sizeop]);
+#endif /* CONFIG_SOFTMMU */
}
-static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
+static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int sizeop)
{
int addrlo_idx = 1, datalo, datahi, addr_reg;
#if defined(CONFIG_SOFTMMU)
int memi_idx, memi, n;
- uint32_t *label_ptr[2];
+ uint32_t *label_ptr;
#endif
datahi = datalo = args[0];
- if (TCG_TARGET_REG_BITS == 32 && opc == 3) {
+ if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
datahi = args[1];
addrlo_idx = 2;
}
@@ -1004,33 +1004,40 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
memi_idx = addrlo_idx + 1 + (TARGET_LONG_BITS > TCG_TARGET_REG_BITS);
memi = args[memi_idx];
- addr_reg = tcg_out_tlb_load(s, addrlo_idx, memi, opc, args,
- label_ptr, offsetof(CPUTLBEntry, addr_write));
+ addr_reg = tcg_out_tlb_load(s, addrlo_idx, memi, sizeop, args,
+ offsetof(CPUTLBEntry, addr_write));
- /* TLB Hit. */
- tcg_out_qemu_st_direct(s, addr_reg, TCG_REG_G0, datalo, datahi, opc);
+ if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
+ /* Reconstruct the full 64-bit value in %g1, using %o2 as temp. */
+ /* ??? Redefine the temps from %i4/%i5 so that we have a o/g temp. */
+ tcg_out_arithi(s, TCG_REG_G1, datalo, 0, SHIFT_SRL);
+ tcg_out_arithi(s, TCG_REG_O2, datahi, 32, SHIFT_SLLX);
+ tcg_out_arith(s, TCG_REG_G1, TCG_REG_G1, TCG_REG_O2, ARITH_OR);
+ datalo = TCG_REG_G1;
+ }
- /* b,pt,n label1 */
- label_ptr[1] = (uint32_t *)s->code_ptr;
- tcg_out32(s, (INSN_OP(0) | INSN_COND(COND_A, 0) | INSN_OP2(0x1)
+ /* The fast path is exactly one insn. Thus we can perform the entire
+ TLB Hit in the (annulled) delay slot of the branch over TLB Miss. */
+ /* beq,a,pt %[xi]cc, label0 */
+ label_ptr = (uint32_t *)s->code_ptr;
+ tcg_out32(s, (INSN_OP(0) | INSN_COND(COND_E, 0) | INSN_OP2(0x1)
+ | ((TARGET_LONG_BITS == 64) << 21)
| (1 << 29) | (1 << 19)));
+ /* delay slot */
+ tcg_out_ldst_rr(s, datalo, addr_reg, TCG_REG_O1, qemu_st_opc[sizeop]);
/* TLB Miss. */
-
- *label_ptr[0] |= INSN_OFF19((unsigned long)s->code_ptr -
- (unsigned long)label_ptr[0]);
-
- n = 0;
-#ifdef CONFIG_TCG_PASS_AREG0
- tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[n++], TCG_AREG0);
-#endif
+ n = ARG_OFFSET;
+ if (ARG_OFFSET) {
+ tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
+ }
if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++],
args[addrlo_idx + 1]);
}
tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++],
args[addrlo_idx]);
- if (TCG_TARGET_REG_BITS == 32 && opc == 3) {
+ if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++], datahi);
}
tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++], datalo);
@@ -1042,7 +1049,7 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
sizeof(long));
/* qemu_st_helper[s_bits](arg0, arg1, arg2) */
- tcg_out32(s, CALL | ((((tcg_target_ulong)qemu_st_helpers[opc]
+ tcg_out32(s, CALL | ((((tcg_target_ulong)qemu_st_helpers[sizeop]
- (tcg_target_ulong)s->code_ptr) >> 2)
& 0x3fffffff));
/* delay slot */
@@ -1053,17 +1060,25 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
sizeof(long));
- *label_ptr[1] |= INSN_OFF19((unsigned long)s->code_ptr -
- (unsigned long)label_ptr[1]);
+ *label_ptr |= INSN_OFF19((unsigned long)s->code_ptr -
+ (unsigned long)label_ptr);
#else
addr_reg = args[addrlo_idx];
if (TCG_TARGET_REG_BITS == 64 && TARGET_LONG_BITS == 32) {
tcg_out_arithi(s, TCG_REG_I5, addr_reg, 0, SHIFT_SRL);
addr_reg = TCG_REG_I5;
}
- tcg_out_qemu_st_direct(s, addr_reg,
- (GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_G0),
- datalo, datahi, opc);
+ if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
+ /* Reconstruct the full 64-bit value in %g1, using %o2 as temp. */
+ /* ??? Redefine the temps from %i4/%i5 so that we have a o/g temp. */
+ tcg_out_arithi(s, TCG_REG_G1, datalo, 0, SHIFT_SRL);
+ tcg_out_arithi(s, TCG_REG_O2, datahi, 32, SHIFT_SLLX);
+ tcg_out_arith(s, TCG_REG_G1, TCG_REG_G1, TCG_REG_O2, ARITH_OR);
+ datalo = TCG_REG_G1;
+ }
+ tcg_out_ldst_rr(s, datalo, addr_reg,
+ (GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_G0),
+ qemu_st_opc[sizeop]);
#endif /* CONFIG_SOFTMMU */
}
--
1.7.7.6
* [Qemu-devel] [PATCH 08/15] Avoid declaring the env variable at all if CONFIG_TCG_PASS_AREG0.
2012-03-25 22:27 [Qemu-devel] [PATCH 00/15] tcg-sparc improvments Richard Henderson
` (6 preceding siblings ...)
2012-03-25 22:27 ` [Qemu-devel] [PATCH 07/15] tcg-sparc: Steamline qemu_ld/st more Richard Henderson
@ 2012-03-25 22:27 ` Richard Henderson
2012-03-26 16:26 ` Blue Swirl
2012-03-25 22:27 ` [Qemu-devel] [PATCH 09/15] tcg-sparc: Do not use a global register for AREG0 Richard Henderson
` (6 subsequent siblings)
14 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2012-03-25 22:27 UTC (permalink / raw)
To: qemu-devel; +Cc: Blue Swirl
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
dyngen-exec.h | 5 +++++
user-exec.c | 17 ++++++++++++++---
2 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/dyngen-exec.h b/dyngen-exec.h
index cfeef99..65fcb43 100644
--- a/dyngen-exec.h
+++ b/dyngen-exec.h
@@ -19,6 +19,10 @@
#if !defined(__DYNGEN_EXEC_H__)
#define __DYNGEN_EXEC_H__
+/* If the target has indicated that it does not need an AREG0,
+ don't declare the env variable at all, much less as a register. */
+#if !defined(CONFIG_TCG_PASS_AREG0)
+
#if defined(CONFIG_TCG_INTERPRETER)
/* The TCG interpreter does not need a special register AREG0,
* but it is possible to use one by defining AREG0.
@@ -65,4 +69,5 @@ register CPUArchState *env asm(AREG0);
extern CPUArchState *env;
#endif
+#endif /* !CONFIG_TCG_PASS_AREG0 */
#endif /* !defined(__DYNGEN_EXEC_H__) */
diff --git a/user-exec.c b/user-exec.c
index cd905ff..e326104 100644
--- a/user-exec.c
+++ b/user-exec.c
@@ -58,7 +58,9 @@ void cpu_resume_from_signal(CPUArchState *env1, void *puc)
struct sigcontext *uc = puc;
#endif
+#ifndef CONFIG_TCG_PASS_AREG0
env = env1;
+#endif
/* XXX: restore cpu registers saved in host registers */
@@ -74,8 +76,8 @@ void cpu_resume_from_signal(CPUArchState *env1, void *puc)
sigprocmask(SIG_SETMASK, &uc->sc_mask, NULL);
#endif
}
- env->exception_index = -1;
- longjmp(env->jmp_env, 1);
+ env1->exception_index = -1;
+ longjmp(env1->jmp_env, 1);
}
/* 'pc' is the host PC at which the exception was raised. 'address' is
@@ -89,9 +91,18 @@ static inline int handle_cpu_signal(unsigned long pc, unsigned long address,
TranslationBlock *tb;
int ret;
+ /* XXX: find a correct solution for multithread */
+#ifdef CONFIG_TCG_PASS_AREG0
+ /* ??? While we no longer have a global env register, if PC is within
+ the code_gen_buffer then we know that env is within a known register
+ there, and we could have the signal handler extract that value. */
+ CPUArchState *env = cpu_single_env;
+#else
if (cpu_single_env) {
- env = cpu_single_env; /* XXX: find a correct solution for multithread */
+ env = cpu_single_env;
}
+#endif
+
#if defined(DEBUG_SIGNAL)
qemu_printf("qemu: SIGSEGV pc=0x%08lx address=%08lx w=%d oldset=0x%08lx\n",
pc, address, is_write, *(unsigned long *)old_set);
--
1.7.7.6
* [Qemu-devel] [PATCH 09/15] tcg-sparc: Do not use a global register for AREG0.
2012-03-25 22:27 [Qemu-devel] [PATCH 00/15] tcg-sparc improvments Richard Henderson
` (7 preceding siblings ...)
2012-03-25 22:27 ` [Qemu-devel] [PATCH 08/15] Avoid declaring the env variable at all if CONFIG_TCG_PASS_AREG0 Richard Henderson
@ 2012-03-25 22:27 ` Richard Henderson
2012-03-26 16:31 ` Blue Swirl
2012-03-25 22:27 ` [Qemu-devel] [PATCH 10/15] tcg-sparc: Change AREG0 in generated code to %i0 Richard Henderson
` (5 subsequent siblings)
14 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2012-03-25 22:27 UTC (permalink / raw)
To: qemu-devel; +Cc: Blue Swirl
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
dyngen-exec.h | 20 +++++++++++---------
exec.c | 16 ++++++++++++++--
2 files changed, 25 insertions(+), 11 deletions(-)
diff --git a/dyngen-exec.h b/dyngen-exec.h
index 65fcb43..d673f9f 100644
--- a/dyngen-exec.h
+++ b/dyngen-exec.h
@@ -41,13 +41,8 @@
#elif defined(__mips__)
#define AREG0 "s0"
#elif defined(__sparc__)
-#ifdef CONFIG_SOLARIS
-#define AREG0 "g2"
-#elif HOST_LONG_BITS == 64
-#define AREG0 "g5"
-#else
-#define AREG0 "g6"
-#endif
+/* Don't use a global register. Working around glibc clobbering these
+ global registers is more trouble than just using TLS. */
#elif defined(__s390__)
#define AREG0 "r10"
#elif defined(__alpha__)
@@ -62,12 +57,19 @@
#error unsupported CPU
#endif
-#if defined(AREG0)
+#ifdef AREG0
register CPUArchState *env asm(AREG0);
#else
-/* TODO: Try env = cpu_single_env. */
+/* It's tempting to #define env cpu_single_env, but that runs afoul of
+ the other macro usage in target-foo/helper.h. Instead use an alias.
+ That has to happen where cpu_single_env is defined, so just a
+ declaration here. */
+#ifdef __linux__
+extern __thread CPUArchState *env;
+#else
extern CPUArchState *env;
#endif
+#endif /* AREG0 */
#endif /* !CONFIG_TCG_PASS_AREG0 */
#endif /* !defined(__DYNGEN_EXEC_H__) */
diff --git a/exec.c b/exec.c
index 6731ab8..d84caa5 100644
--- a/exec.c
+++ b/exec.c
@@ -124,9 +124,21 @@ static MemoryRegion io_mem_subpage_ram;
#endif
CPUArchState *first_cpu;
-/* current CPU in the current thread. It is only valid inside
- cpu_exec() */
+
+/* Current CPU in the current thread. It is only valid inside cpu_exec(). */
DEFINE_TLS(CPUArchState *,cpu_single_env);
+
+/* In dyngen-exec.h, without AREG0, we fall back to an alias to cpu_single_env.
+ We can't actually tell from here whether that's needed or not, but it does
+ not hurt to go ahead and make the declaration. */
+#ifndef CONFIG_TCG_PASS_AREG0
+extern
+#ifdef __linux__
+ __thread
+#endif
+ CPUArchState *env __attribute__((alias("tls__cpu_single_env")));
+#endif /* !CONFIG_TCG_PASS_AREG0 */
+
/* 0 = Do not count executed instructions.
1 = Precise instruction counting.
2 = Adaptive rate instruction counting. */
--
1.7.7.6
* [Qemu-devel] [PATCH 10/15] tcg-sparc: Change AREG0 in generated code to %i0.
2012-03-25 22:27 [Qemu-devel] [PATCH 00/15] tcg-sparc improvments Richard Henderson
` (8 preceding siblings ...)
2012-03-25 22:27 ` [Qemu-devel] [PATCH 09/15] tcg-sparc: Do not use a global register for AREG0 Richard Henderson
@ 2012-03-25 22:27 ` Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 11/15] tcg-sparc: Clean up cruft stemming from attempts to use global registers Richard Henderson
` (4 subsequent siblings)
14 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2012-03-25 22:27 UTC (permalink / raw)
To: qemu-devel; +Cc: Blue Swirl
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
tcg/sparc/tcg-target.c | 3 ++-
tcg/sparc/tcg-target.h | 9 +--------
2 files changed, 3 insertions(+), 9 deletions(-)
diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index d45114f..dc36840 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -705,7 +705,8 @@ static void tcg_target_qemu_prologue(TCGContext *s)
tcg_out32(s, JMPL | INSN_RD(TCG_REG_G0) | INSN_RS1(TCG_REG_I1) |
INSN_RS2(TCG_REG_G0));
- tcg_out_mov(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_I0);
+ /* delay slot */
+ tcg_out_nop(s);
}
#if defined(CONFIG_SOFTMMU)
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index e69dfc8..31b98e2 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -128,14 +128,7 @@ typedef enum {
#define TCG_TARGET_HAS_GUEST_BASE
-/* Note: must be synced with dyngen-exec.h */
-#ifdef CONFIG_SOLARIS
-#define TCG_AREG0 TCG_REG_G2
-#elif HOST_LONG_BITS == 64
-#define TCG_AREG0 TCG_REG_G5
-#else
-#define TCG_AREG0 TCG_REG_G6
-#endif
+#define TCG_AREG0 TCG_REG_I0
static inline void flush_icache_range(tcg_target_ulong start,
tcg_target_ulong stop)
--
1.7.7.6
* [Qemu-devel] [PATCH 11/15] tcg-sparc: Clean up cruft stemming from attempts to use global registers.
2012-03-25 22:27 [Qemu-devel] [PATCH 00/15] tcg-sparc improvments Richard Henderson
` (9 preceding siblings ...)
2012-03-25 22:27 ` [Qemu-devel] [PATCH 10/15] tcg-sparc: Change AREG0 in generated code to %i0 Richard Henderson
@ 2012-03-25 22:27 ` Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 12/15] tcg-sparc: Mask shift immediates to avoid illegal insns Richard Henderson
` (3 subsequent siblings)
14 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2012-03-25 22:27 UTC (permalink / raw)
To: qemu-devel; +Cc: Blue Swirl
Don't use -ffixed-gN. Don't link statically. Don't save/restore
AREG0 around calls. Don't allocate space on the stack for AREG0 save.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
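A rough sketch of the frame-size arithmetic the new prologue performs, for
anyone checking the numbers. The function and macro names below are
illustrative only, and the values used for TCG_STATIC_CALL_ARGS_SIZE and
CPU_TEMP_BUF_NLONGS (128 each) are assumptions that should be checked
against tcg/tcg.h in the tree being built.

```c
#include <assert.h>

/* Assumed values, not taken from this patch; verify against tcg/tcg.h. */
#define STATIC_CALL_ARGS_SIZE 128   /* stands in for TCG_STATIC_CALL_ARGS_SIZE */
#define TEMP_BUF_NLONGS       128   /* stands in for CPU_TEMP_BUF_NLONGS */

/* Round size up to a power-of-two alignment, as the prologue does. */
static int align_up(int size, int align)
{
    assert(align > 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & -align;
}

/* Frame size handed to SAVE.  call_stack_offset and stack_bias correspond
   to TCG_TARGET_CALL_STACK_OFFSET and TCG_TARGET_STACK_BIAS. */
static int sparc_frame_size(int call_stack_offset, int stack_bias,
                            int stack_align, int long_size)
{
    int tmp_buf_size = TEMP_BUF_NLONGS * long_size;
    int frame_size = call_stack_offset - stack_bias;

    frame_size += STATIC_CALL_ARGS_SIZE + tmp_buf_size;
    return align_up(frame_size, stack_align);
}

/* SAVE takes a signed 13-bit immediate; the negated frame size must fit. */
static int fits_simm13(int val)
{
    return val >= -4096 && val <= 4095;
}
```

Under those assumptions a 64-bit host gets a 1328-byte frame and a 32-bit
host gets 736 bytes; both negated values fit the 13-bit immediate of SAVE.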
configure | 12 ----------
tcg/sparc/tcg-target.c | 57 ++++++++++++++++--------------------------------
tcg/sparc/tcg-target.h | 18 ++++++---------
3 files changed, 26 insertions(+), 61 deletions(-)
diff --git a/configure b/configure
index a79a090..4ae70c0 100755
--- a/configure
+++ b/configure
@@ -815,19 +815,11 @@ case "$cpu" in
sparc)
LDFLAGS="-m32 $LDFLAGS"
QEMU_CFLAGS="-m32 -mcpu=ultrasparc $QEMU_CFLAGS"
- QEMU_CFLAGS="-ffixed-g2 -ffixed-g3 $QEMU_CFLAGS"
- if test "$solaris" = "no" ; then
- QEMU_CFLAGS="-ffixed-g1 -ffixed-g6 $QEMU_CFLAGS"
- fi
host_guest_base="yes"
;;
sparc64)
LDFLAGS="-m64 $LDFLAGS"
QEMU_CFLAGS="-m64 -mcpu=ultrasparc $QEMU_CFLAGS"
- QEMU_CFLAGS="-ffixed-g5 -ffixed-g6 -ffixed-g7 $QEMU_CFLAGS"
- if test "$solaris" != "no" ; then
- QEMU_CFLAGS="-ffixed-g1 $QEMU_CFLAGS"
- fi
host_guest_base="yes"
;;
s390)
@@ -3817,10 +3809,6 @@ fi
if test "$target_linux_user" = "yes" -o "$target_bsd_user" = "yes" ; then
case "$ARCH" in
- sparc)
- # -static is used to avoid g1/g3 usage by the dynamic linker
- ldflags="$linker_script -static $ldflags"
- ;;
alpha | s390x)
# The default placement of the application is fine.
;;
diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index dc36840..c1d5ab1 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -167,9 +167,6 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
tcg_regset_reset_reg(ct->u.regs, TCG_REG_O0);
tcg_regset_reset_reg(ct->u.regs, TCG_REG_O1);
tcg_regset_reset_reg(ct->u.regs, TCG_REG_O2);
-#ifdef CONFIG_TCG_PASS_AREG0
- tcg_regset_reset_reg(ct->u.regs, TCG_REG_O3);
-#endif
break;
case 'I':
ct->ct |= TCG_CT_CONST_S11;
@@ -690,11 +687,22 @@ static void tcg_out_setcond2_i32(TCGContext *s, TCGCond cond, TCGArg ret,
/* Generate global QEMU prologue and epilogue code */
static void tcg_target_qemu_prologue(TCGContext *s)
{
- tcg_set_frame(s, TCG_REG_I6, TCG_TARGET_CALL_STACK_OFFSET,
- CPU_TEMP_BUF_NLONGS * (int)sizeof(long));
+ int tmp_buf_size, frame_size;
+
+ /* The TCG temp buffer is at the top of the frame, immediately
+ below the frame pointer. */
+ tmp_buf_size = CPU_TEMP_BUF_NLONGS * (int)sizeof(long);
+ tcg_set_frame(s, TCG_REG_I6, TCG_TARGET_STACK_BIAS - tmp_buf_size,
+ tmp_buf_size);
+
+ /* TCG_TARGET_CALL_STACK_OFFSET includes the stack bias, but is
+ otherwise the minimal frame usable by callees. */
+ frame_size = TCG_TARGET_CALL_STACK_OFFSET - TCG_TARGET_STACK_BIAS;
+ frame_size += TCG_STATIC_CALL_ARGS_SIZE + tmp_buf_size;
+ frame_size += TCG_TARGET_STACK_ALIGN - 1;
+ frame_size &= -TCG_TARGET_STACK_ALIGN;
tcg_out32(s, SAVE | INSN_RD(TCG_REG_O6) | INSN_RS1(TCG_REG_O6) |
- INSN_IMM13(-(TCG_TARGET_STACK_MINFRAME +
- CPU_TEMP_BUF_NLONGS * (int)sizeof(long))));
+ INSN_IMM13(-frame_size));
#ifdef CONFIG_USE_GUEST_BASE
if (GUEST_BASE != 0) {
@@ -707,6 +715,8 @@ static void tcg_target_qemu_prologue(TCGContext *s)
INSN_RS2(TCG_REG_G0));
/* delay slot */
tcg_out_nop(s);
+
+ /* No epilogue required. We issue ret + restore directly in the TB. */
}
#if defined(CONFIG_SOFTMMU)
@@ -911,12 +921,6 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int sizeop)
tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++],
args[addrlo_idx]);
- /* Store AREG0 in stack to avoid ugly glibc bugs that mangle
- global registers */
- tcg_out_st(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
- sizeof(long));
-
/* qemu_ld_helper[s_bits](arg0, arg1) */
tcg_out32(s, CALL | ((((tcg_target_ulong)qemu_ld_helpers[s_bits]
- (tcg_target_ulong)s->code_ptr) >> 2)
@@ -924,11 +928,6 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int sizeop)
/* delay slot */
tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[n], memi);
- /* Reload AREG0. */
- tcg_out_ld(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
- sizeof(long));
-
n = tcg_target_call_oarg_regs[0];
/* datalo = sign_extend(arg0) */
switch (sizeop) {
@@ -1043,12 +1042,6 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int sizeop)
}
tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++], datalo);
- /* Store AREG0 in stack to avoid ugly glibc bugs that mangle
- global registers */
- tcg_out_st(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
- sizeof(long));
-
/* qemu_st_helper[s_bits](arg0, arg1, arg2) */
tcg_out32(s, CALL | ((((tcg_target_ulong)qemu_st_helpers[sizeop]
- (tcg_target_ulong)s->code_ptr) >> 2)
@@ -1056,11 +1049,6 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int sizeop)
/* delay slot */
tcg_out_movi(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n], memi);
- /* Reload AREG0. */
- tcg_out_ld(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
- sizeof(long));
-
*label_ptr |= INSN_OFF19((unsigned long)s->code_ptr -
(unsigned long)label_ptr);
#else
@@ -1123,15 +1111,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
tcg_out32(s, JMPL | INSN_RD(TCG_REG_O7) | INSN_RS1(TCG_REG_I5) |
INSN_RS2(TCG_REG_G0));
}
- /* Store AREG0 in stack to avoid ugly glibc bugs that mangle
- global registers */
- // delay slot
- tcg_out_st(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
- sizeof(long));
- tcg_out_ld(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK,
- TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE -
- sizeof(long));
+ /* delay slot */
+ tcg_out_nop(s);
break;
case INDEX_op_jmp:
case INDEX_op_br:
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index 31b98e2..b7afa7b 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -66,20 +66,16 @@ typedef enum {
#define TCG_CT_CONST_S13 0x200
/* used for function call generation */
-#define TCG_REG_CALL_STACK TCG_REG_I6
+#define TCG_REG_CALL_STACK TCG_REG_O6
#if TCG_TARGET_REG_BITS == 64
-// Reserve space for AREG0
-#define TCG_TARGET_STACK_MINFRAME (176 + 4 * (int)sizeof(long) + \
- TCG_STATIC_CALL_ARGS_SIZE)
-#define TCG_TARGET_CALL_STACK_OFFSET (2047 - 16)
-#define TCG_TARGET_STACK_ALIGN 16
+#define TCG_TARGET_STACK_BIAS 2047
+#define TCG_TARGET_STACK_ALIGN 16
+#define TCG_TARGET_CALL_STACK_OFFSET (128 + 6*8 + TCG_TARGET_STACK_BIAS)
#else
-// AREG0 + one word for alignment
-#define TCG_TARGET_STACK_MINFRAME (92 + (2 + 1) * (int)sizeof(long) + \
- TCG_STATIC_CALL_ARGS_SIZE)
-#define TCG_TARGET_CALL_STACK_OFFSET TCG_TARGET_STACK_MINFRAME
-#define TCG_TARGET_STACK_ALIGN 8
+#define TCG_TARGET_STACK_BIAS 0
+#define TCG_TARGET_STACK_ALIGN 8
+#define TCG_TARGET_CALL_STACK_OFFSET (64 + 4 + 6*4)
#endif
#if TCG_TARGET_REG_BITS == 64
--
1.7.7.6
* [Qemu-devel] [PATCH 12/15] tcg-sparc: Mask shift immediates to avoid illegal insns.
2012-03-25 22:27 [Qemu-devel] [PATCH 00/15] tcg-sparc improvments Richard Henderson
` (10 preceding siblings ...)
2012-03-25 22:27 ` [Qemu-devel] [PATCH 11/15] tcg-sparc: Clean up cruft stemming from attempts to use global registers Richard Henderson
@ 2012-03-25 22:27 ` Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 13/15] tcg-sparc: Use defines for temporaries Richard Henderson
` (2 subsequent siblings)
14 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2012-03-25 22:27 UTC (permalink / raw)
To: qemu-devel; +Cc: Blue Swirl
The xtensa-test image generates a sra_i32 with count 0x40.
Whether this is an accident of tcg constant propagation or
originates directly from the instruction stream is immaterial;
either way the immediate must be masked to the operand width
before it is encoded.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
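A minimal illustration of the masking, separate from the backend code. The
helper names are made up for this note. The second helper assumes the usual
SPARC immediate-shift layout, in which only the low 5 bits (32-bit shifts)
or 6 bits (64-bit shifts) of the immediate field hold the count and the
bits above them must be zero.

```c
#include <assert.h>
#include <stdint.h>

/* Mask a constant shift count to the operation width (32 or 64),
   mirroring the "& 31" and "& 63" in the patch. */
static uint32_t mask_shift_count(uint32_t count, int width)
{
    assert(width == 32 || width == 64);
    return count & (uint32_t)(width - 1);
}

/* Bits of the count that would land outside the shift-count field if the
   value were encoded unmasked.  Nonzero means the insn would be malformed. */
static uint32_t stray_count_bits(uint32_t count, int width)
{
    assert(width == 32 || width == 64);
    return (count & 0x1fff) & ~(uint32_t)(width - 1);
}
```

For the xtensa case, stray_count_bits(0x40, 32) is 0x40, and after masking
the count becomes 0, which leaves no stray bits.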
tcg/sparc/tcg-target.c | 18 ++++++++++++------
1 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index c1d5ab1..181ba26 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -1184,13 +1184,16 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
goto gen_arith;
case INDEX_op_shl_i32:
c = SHIFT_SLL;
- goto gen_arith;
+ do_shift32:
+ /* Limit immediate shift count lest we create an illegal insn. */
+ tcg_out_arithc(s, args[0], args[1], args[2] & 31, const_args[2], c);
+ break;
case INDEX_op_shr_i32:
c = SHIFT_SRL;
- goto gen_arith;
+ goto do_shift32;
case INDEX_op_sar_i32:
c = SHIFT_SRA;
- goto gen_arith;
+ goto do_shift32;
case INDEX_op_mul_i32:
c = ARITH_UMUL;
goto gen_arith;
@@ -1311,13 +1314,16 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
break;
case INDEX_op_shl_i64:
c = SHIFT_SLLX;
- goto gen_arith;
+ do_shift64:
+ /* Limit immediate shift count lest we create an illegal insn. */
+ tcg_out_arithc(s, args[0], args[1], args[2] & 63, const_args[2], c);
+ break;
case INDEX_op_shr_i64:
c = SHIFT_SRLX;
- goto gen_arith;
+ goto do_shift64;
case INDEX_op_sar_i64:
c = SHIFT_SRAX;
- goto gen_arith;
+ goto do_shift64;
case INDEX_op_mul_i64:
c = ARITH_MULX;
goto gen_arith;
--
1.7.7.6
* [Qemu-devel] [PATCH 13/15] tcg-sparc: Use defines for temporaries.
2012-03-25 22:27 [Qemu-devel] [PATCH 00/15] tcg-sparc improvments Richard Henderson
` (11 preceding siblings ...)
2012-03-25 22:27 ` [Qemu-devel] [PATCH 12/15] tcg-sparc: Mask shift immediates to avoid illegal insns Richard Henderson
@ 2012-03-25 22:27 ` Richard Henderson
2012-03-26 16:38 ` Blue Swirl
2012-03-25 22:27 ` [Qemu-devel] [PATCH 14/15] tcg-sparc: Add %g/%o registers to alloc_order Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 15/15] tcg-sparc: Fix and enable direct TB chaining Richard Henderson
14 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2012-03-25 22:27 UTC (permalink / raw)
To: qemu-devel; +Cc: Blue Swirl
Also change the temporary from %i4 to %g1, which removes a v8plus fixme.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
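For reference, the three-insn sequence that qemu_st uses on a 32-bit
(v8plus) host to rebuild a 64-bit value from its halves is equivalent to
the C below, and the split is what the qemu_ld side does in reverse. This
is only a sketch of the arithmetic; the function names are invented for
this note and do not exist in the backend.

```c
#include <stdint.h>

/* srl datalo, 0, tmp ; sllx datahi, 32, %o2 ; or tmp, %o2, %o2 */
static uint64_t combine_halves(uint64_t datalo, uint64_t datahi)
{
    uint64_t lo = datalo & 0xffffffffu;   /* srl by 0 zero-extends the low half */
    uint64_t hi = datahi << 32;           /* sllx moves the high half into place */
    return lo | hi;
}

/* srlx reg64, 32, datahi ; the low half is read as a 32-bit value */
static void split_halves(uint64_t reg64, uint32_t *datalo, uint32_t *datahi)
{
    *datahi = (uint32_t)(reg64 >> 32);
    *datalo = (uint32_t)reg64;
}
```

The zero-extension of the low half matters because the upper 32 bits of a
register holding a 32-bit value are not guaranteed to be clean.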
tcg/sparc/tcg-target.c | 110 ++++++++++++++++++++++++-----------------------
1 files changed, 56 insertions(+), 54 deletions(-)
diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index 181ba26..896fab1 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -59,8 +59,11 @@ static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
};
#endif
+#define TCG_REG_TMP TCG_REG_G1
+#define TCG_REG_TMP2 TCG_REG_I5
+
#ifdef CONFIG_USE_GUEST_BASE
-# define TCG_GUEST_BASE_REG TCG_REG_I3
+# define TCG_GUEST_BASE_REG TCG_REG_I4
#else
# define TCG_GUEST_BASE_REG TCG_REG_G0
#endif
@@ -372,10 +375,10 @@ static inline void tcg_out_movi(TCGContext *s, TCGType type,
tcg_out_sethi(s, ret, ~arg);
tcg_out_arithi(s, ret, ret, (arg & 0x3ff) | -0x400, ARITH_XOR);
} else {
- tcg_out_movi_imm32(s, TCG_REG_I4, arg >> (TCG_TARGET_REG_BITS / 2));
- tcg_out_arithi(s, TCG_REG_I4, TCG_REG_I4, 32, SHIFT_SLLX);
- tcg_out_movi_imm32(s, ret, arg);
- tcg_out_arith(s, ret, ret, TCG_REG_I4, ARITH_OR);
+ tcg_out_movi_imm32(s, ret, arg >> (TCG_TARGET_REG_BITS / 2));
+ tcg_out_arithi(s, ret, ret, 32, SHIFT_SLLX);
+ tcg_out_movi_imm32(s, TCG_REG_TMP2, arg);
+ tcg_out_arith(s, ret, ret, TCG_REG_TMP2, ARITH_OR);
}
}
@@ -392,8 +395,8 @@ static inline void tcg_out_ldst(TCGContext *s, int ret, int addr,
tcg_out32(s, op | INSN_RD(ret) | INSN_RS1(addr) |
INSN_IMM13(offset));
} else {
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_I5, offset);
- tcg_out_ldst_rr(s, ret, addr, TCG_REG_I5, op);
+ tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_TMP, offset);
+ tcg_out_ldst_rr(s, ret, addr, TCG_REG_TMP, op);
}
}
@@ -435,8 +438,8 @@ static inline void tcg_out_addi(TCGContext *s, int reg, tcg_target_long val)
if (check_fit_tl(val, 13))
tcg_out_arithi(s, reg, reg, val, ARITH_ADD);
else {
- tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_I5, val);
- tcg_out_arith(s, reg, reg, TCG_REG_I5, ARITH_ADD);
+ tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_TMP, val);
+ tcg_out_arith(s, reg, reg, TCG_REG_TMP, ARITH_ADD);
}
}
}
@@ -448,8 +451,8 @@ static inline void tcg_out_andi(TCGContext *s, int rd, int rs,
if (check_fit_tl(val, 13))
tcg_out_arithi(s, rd, rs, val, ARITH_AND);
else {
- tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_I5, val);
- tcg_out_arith(s, rd, rs, TCG_REG_I5, ARITH_AND);
+ tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_TMP, val);
+ tcg_out_arith(s, rd, rs, TCG_REG_TMP, ARITH_AND);
}
}
}
@@ -461,8 +464,8 @@ static void tcg_out_div32(TCGContext *s, int rd, int rs1,
if (uns) {
tcg_out_sety(s, TCG_REG_G0);
} else {
- tcg_out_arithi(s, TCG_REG_I5, rs1, 31, SHIFT_SRA);
- tcg_out_sety(s, TCG_REG_I5);
+ tcg_out_arithi(s, TCG_REG_TMP, rs1, 31, SHIFT_SRA);
+ tcg_out_sety(s, TCG_REG_TMP);
}
tcg_out_arithc(s, rd, rs1, val2, val2const,
@@ -608,8 +611,8 @@ static void tcg_out_setcond_i32(TCGContext *s, TCGCond cond, TCGArg ret,
case TCG_COND_GTU:
case TCG_COND_GEU:
if (c2const && c2 != 0) {
- tcg_out_movi_imm13(s, TCG_REG_I5, c2);
- c2 = TCG_REG_I5;
+ tcg_out_movi_imm13(s, TCG_REG_TMP, c2);
+ c2 = TCG_REG_TMP;
}
t = c1, c1 = c2, c2 = t, c2const = 0;
cond = tcg_swap_cond(cond);
@@ -656,15 +659,15 @@ static void tcg_out_setcond2_i32(TCGContext *s, TCGCond cond, TCGArg ret,
switch (cond) {
case TCG_COND_EQ:
- tcg_out_setcond_i32(s, TCG_COND_EQ, TCG_REG_I5, al, bl, blconst);
+ tcg_out_setcond_i32(s, TCG_COND_EQ, TCG_REG_TMP, al, bl, blconst);
tcg_out_setcond_i32(s, TCG_COND_EQ, ret, ah, bh, bhconst);
- tcg_out_arith(s, ret, ret, TCG_REG_I5, ARITH_AND);
+ tcg_out_arith(s, ret, ret, TCG_REG_TMP, ARITH_AND);
break;
case TCG_COND_NE:
- tcg_out_setcond_i32(s, TCG_COND_NE, TCG_REG_I5, al, al, blconst);
+ tcg_out_setcond_i32(s, TCG_COND_NE, TCG_REG_TMP, al, al, blconst);
tcg_out_setcond_i32(s, TCG_COND_NE, ret, ah, bh, bhconst);
- tcg_out_arith(s, ret, ret, TCG_REG_I5, ARITH_OR);
+ tcg_out_arith(s, ret, ret, TCG_REG_TMP, ARITH_OR);
break;
default:
@@ -964,8 +967,8 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int sizeop)
#else
addr_reg = args[addrlo_idx];
if (TCG_TARGET_REG_BITS == 64 && TARGET_LONG_BITS == 32) {
- tcg_out_arithi(s, TCG_REG_I5, addr_reg, 0, SHIFT_SRL);
- addr_reg = TCG_REG_I5;
+ tcg_out_arithi(s, TCG_REG_TMP, addr_reg, 0, SHIFT_SRL);
+ addr_reg = TCG_REG_TMP;
}
if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
int reg64 = (datalo < 16 ? datalo : TCG_REG_O0);
@@ -1008,12 +1011,11 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int sizeop)
offsetof(CPUTLBEntry, addr_write));
if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
- /* Reconstruct the full 64-bit value in %g1, using %o2 as temp. */
- /* ??? Redefine the temps from %i4/%i5 so that we have a o/g temp. */
- tcg_out_arithi(s, TCG_REG_G1, datalo, 0, SHIFT_SRL);
+ /* Reconstruct the full 64-bit value. */
+ tcg_out_arithi(s, TCG_REG_TMP, datalo, 0, SHIFT_SRL);
tcg_out_arithi(s, TCG_REG_O2, datahi, 32, SHIFT_SLLX);
- tcg_out_arith(s, TCG_REG_G1, TCG_REG_G1, TCG_REG_O2, ARITH_OR);
- datalo = TCG_REG_G1;
+ tcg_out_arith(s, TCG_REG_O2, TCG_REG_TMP, TCG_REG_O2, ARITH_OR);
+ datalo = TCG_REG_O2;
}
/* The fast path is exactly one insn. Thus we can perform the entire
@@ -1054,16 +1056,14 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int sizeop)
#else
addr_reg = args[addrlo_idx];
if (TCG_TARGET_REG_BITS == 64 && TARGET_LONG_BITS == 32) {
- tcg_out_arithi(s, TCG_REG_I5, addr_reg, 0, SHIFT_SRL);
- addr_reg = TCG_REG_I5;
+ tcg_out_arithi(s, TCG_REG_TMP, addr_reg, 0, SHIFT_SRL);
+ addr_reg = TCG_REG_TMP;
}
if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
- /* Reconstruct the full 64-bit value in %g1, using %o2 as temp. */
- /* ??? Redefine the temps from %i4/%i5 so that we have a o/g temp. */
- tcg_out_arithi(s, TCG_REG_G1, datalo, 0, SHIFT_SRL);
+ tcg_out_arithi(s, TCG_REG_TMP, datalo, 0, SHIFT_SRL);
tcg_out_arithi(s, TCG_REG_O2, datahi, 32, SHIFT_SLLX);
- tcg_out_arith(s, TCG_REG_G1, TCG_REG_G1, TCG_REG_O2, ARITH_OR);
- datalo = TCG_REG_G1;
+ tcg_out_arith(s, TCG_REG_O2, TCG_REG_TMP, TCG_REG_O2, ARITH_OR);
+ datalo = TCG_REG_O2;
}
tcg_out_ldst_rr(s, datalo, addr_reg,
(GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_G0),
@@ -1087,14 +1087,14 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
case INDEX_op_goto_tb:
if (s->tb_jmp_offset) {
/* direct jump method */
- tcg_out_sethi(s, TCG_REG_I5, args[0] & 0xffffe000);
- tcg_out32(s, JMPL | INSN_RD(TCG_REG_G0) | INSN_RS1(TCG_REG_I5) |
+ tcg_out_sethi(s, TCG_REG_TMP, args[0] & 0xffffe000);
+ tcg_out32(s, JMPL | INSN_RD(TCG_REG_G0) | INSN_RS1(TCG_REG_TMP) |
INSN_IMM13((args[0] & 0x1fff)));
s->tb_jmp_offset[args[0]] = s->code_ptr - s->code_buf;
} else {
/* indirect jump method */
- tcg_out_ld_ptr(s, TCG_REG_I5, (tcg_target_long)(s->tb_next + args[0]));
- tcg_out32(s, JMPL | INSN_RD(TCG_REG_G0) | INSN_RS1(TCG_REG_I5) |
+ tcg_out_ld_ptr(s, TCG_REG_TMP, (tcg_target_long)(s->tb_next + args[0]));
+ tcg_out32(s, JMPL | INSN_RD(TCG_REG_G0) | INSN_RS1(TCG_REG_TMP) |
INSN_RS2(TCG_REG_G0));
}
tcg_out_nop(s);
@@ -1106,9 +1106,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
- (tcg_target_ulong)s->code_ptr) >> 2)
& 0x3fffffff));
else {
- tcg_out_ld_ptr(s, TCG_REG_I5,
+ tcg_out_ld_ptr(s, TCG_REG_TMP,
(tcg_target_long)(s->tb_next + args[0]));
- tcg_out32(s, JMPL | INSN_RD(TCG_REG_O7) | INSN_RS1(TCG_REG_I5) |
+ tcg_out32(s, JMPL | INSN_RD(TCG_REG_O7) | INSN_RS1(TCG_REG_TMP) |
INSN_RS2(TCG_REG_G0));
}
/* delay slot */
@@ -1214,11 +1214,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
case INDEX_op_rem_i32:
case INDEX_op_remu_i32:
- tcg_out_div32(s, TCG_REG_I5, args[1], args[2], const_args[2],
+ tcg_out_div32(s, TCG_REG_TMP, args[1], args[2], const_args[2],
opc == INDEX_op_remu_i32);
- tcg_out_arithc(s, TCG_REG_I5, TCG_REG_I5, args[2], const_args[2],
+ tcg_out_arithc(s, TCG_REG_TMP, TCG_REG_TMP, args[2], const_args[2],
ARITH_UMUL);
- tcg_out_arith(s, args[0], args[1], TCG_REG_I5, ARITH_SUB);
+ tcg_out_arith(s, args[0], args[1], TCG_REG_TMP, ARITH_SUB);
break;
case INDEX_op_brcond_i32:
@@ -1335,11 +1335,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
goto gen_arith;
case INDEX_op_rem_i64:
case INDEX_op_remu_i64:
- tcg_out_arithc(s, TCG_REG_I5, args[1], args[2], const_args[2],
+ tcg_out_arithc(s, TCG_REG_TMP, args[1], args[2], const_args[2],
opc == INDEX_op_rem_i64 ? ARITH_SDIVX : ARITH_UDIVX);
- tcg_out_arithc(s, TCG_REG_I5, TCG_REG_I5, args[2], const_args[2],
+ tcg_out_arithc(s, TCG_REG_TMP, TCG_REG_TMP, args[2], const_args[2],
ARITH_MULX);
- tcg_out_arith(s, args[0], args[1], TCG_REG_I5, ARITH_SUB);
+ tcg_out_arith(s, args[0], args[1], TCG_REG_TMP, ARITH_SUB);
break;
case INDEX_op_ext32s_i64:
if (const_args[1]) {
@@ -1537,15 +1537,17 @@ static void tcg_target_init(TCGContext *s)
(1 << TCG_REG_O7));
tcg_regset_clear(s->reserved_regs);
- tcg_regset_set_reg(s->reserved_regs, TCG_REG_G0);
-#if TCG_TARGET_REG_BITS == 64
- tcg_regset_set_reg(s->reserved_regs, TCG_REG_I4); // for internal use
-#endif
- tcg_regset_set_reg(s->reserved_regs, TCG_REG_I5); // for internal use
- tcg_regset_set_reg(s->reserved_regs, TCG_REG_I6);
- tcg_regset_set_reg(s->reserved_regs, TCG_REG_I7);
- tcg_regset_set_reg(s->reserved_regs, TCG_REG_O6);
- tcg_regset_set_reg(s->reserved_regs, TCG_REG_O7);
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_G0); // zero
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_G6); // reserved for os
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_G7); // thread pointer
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_I6); // frame pointer
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_I7); // return address
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_O6); // stack pointer
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP); // for internal use
+ if (TCG_TARGET_REG_BITS == 64) {
+ tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP2); // for internal use
+ }
+
tcg_add_target_add_op_defs(sparc_op_defs);
}
--
1.7.7.6
* [Qemu-devel] [PATCH 14/15] tcg-sparc: Add %g/%o registers to alloc_order
2012-03-25 22:27 [Qemu-devel] [PATCH 00/15] tcg-sparc improvments Richard Henderson
` (12 preceding siblings ...)
2012-03-25 22:27 ` [Qemu-devel] [PATCH 13/15] tcg-sparc: Use defines for temporaries Richard Henderson
@ 2012-03-25 22:27 ` Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 15/15] tcg-sparc: Fix and enable direct TB chaining Richard Henderson
14 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2012-03-25 22:27 UTC (permalink / raw)
To: qemu-devel; +Cc: Blue Swirl
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
tcg/sparc/tcg-target.c | 14 ++++++++++++++
1 files changed, 14 insertions(+), 0 deletions(-)
diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index 896fab1..ce7c44e 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -83,11 +83,25 @@ static const int tcg_target_reg_alloc_order[] = {
TCG_REG_L5,
TCG_REG_L6,
TCG_REG_L7,
+
TCG_REG_I0,
TCG_REG_I1,
TCG_REG_I2,
TCG_REG_I3,
TCG_REG_I4,
+
+ TCG_REG_G2,
+ TCG_REG_G3,
+ TCG_REG_G4,
+ TCG_REG_G5,
+
+ TCG_REG_O0,
+ TCG_REG_O1,
+ TCG_REG_O2,
+ TCG_REG_O3,
+ TCG_REG_O4,
+ TCG_REG_O5,
+ TCG_REG_O7,
};
static const int tcg_target_call_iarg_regs[6] = {
--
1.7.7.6
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [Qemu-devel] [PATCH 15/15] tcg-sparc: Fix and enable direct TB chaining.
2012-03-25 22:27 [Qemu-devel] [PATCH 00/15] tcg-sparc improvments Richard Henderson
` (13 preceding siblings ...)
2012-03-25 22:27 ` [Qemu-devel] [PATCH 14/15] tcg-sparc: Add %g/%o registers to alloc_order Richard Henderson
@ 2012-03-25 22:27 ` Richard Henderson
14 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2012-03-25 22:27 UTC (permalink / raw)
To: qemu-devel; +Cc: Blue Swirl
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
exec-all.h | 9 ++++++---
tcg/sparc/tcg-target.c | 19 ++++++++++++++++---
2 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/exec-all.h b/exec-all.h
index 93a5b22..f7d4708 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -120,9 +120,10 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
#define CODE_GEN_AVG_BLOCK_SIZE 64
#endif
-#if defined(_ARCH_PPC) || defined(__x86_64__) || defined(__arm__) || defined(__i386__)
-#define USE_DIRECT_JUMP
-#elif defined(CONFIG_TCG_INTERPRETER)
+#if defined(__arm__) || defined(_ARCH_PPC) \
+ || defined(__x86_64__) || defined(__i386__) \
+ || defined(__sparc__) \
+ || defined(CONFIG_TCG_INTERPRETER)
#define USE_DIRECT_JUMP
#endif
@@ -232,6 +233,8 @@ static inline void tb_set_jmp_target1(unsigned long jmp_addr, unsigned long addr
__asm __volatile__ ("swi 0x9f0002" : : "r" (_beg), "r" (_end), "r" (_flg));
#endif
}
+#elif defined(__sparc__)
+extern void tb_set_jmp_target1(unsigned long jmp_addr, unsigned long addr);
#else
#error tb_set_jmp_target1 is missing
#endif
diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index ce7c44e..2a09e23 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -1101,10 +1101,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
case INDEX_op_goto_tb:
if (s->tb_jmp_offset) {
/* direct jump method */
- tcg_out_sethi(s, TCG_REG_TMP, args[0] & 0xffffe000);
- tcg_out32(s, JMPL | INSN_RD(TCG_REG_G0) | INSN_RS1(TCG_REG_TMP) |
- INSN_IMM13((args[0] & 0x1fff)));
s->tb_jmp_offset[args[0]] = s->code_ptr - s->code_buf;
+ tcg_out32(s, CALL | (8 >> 2));
} else {
/* indirect jump method */
tcg_out_ld_ptr(s, TCG_REG_TMP, (tcg_target_long)(s->tb_next + args[0]));
@@ -1627,3 +1625,18 @@ void tcg_register_jit(void *buf, size_t buf_size)
tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
}
+
+void tb_set_jmp_target1(unsigned long jmp_addr, unsigned long addr)
+{
+ uint32_t *ptr = (uint32_t *)jmp_addr;
+ tcg_target_long disp = (tcg_target_long)(addr - jmp_addr) >> 2;
+
+ /* We can reach the entire address space for 32-bit. For 64-bit
+ the code_gen_buffer can't be larger than 2GB. */
+ if (TCG_TARGET_REG_BITS == 64 && !check_fit_tl(disp, 30)) {
+ abort();
+ }
+
+ *ptr = CALL | (disp & 0x3fffffff);
+ flush_icache_range(jmp_addr, jmp_addr + 4);
+}
--
1.7.7.6
^ permalink raw reply related [flat|nested] 22+ messages in thread
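The direct-chaining patch above rewrites a single CALL instruction in place. As a rough, standalone sketch of that encoding (not the code from the patch): a SPARC CALL has op = 01 in bits 31:30 and a 30-bit signed word displacement in bits 29:0, so the target is pc + 4 * disp30. The names SPARC_CALL_OP, fits_signed, encode_call and call_target are made up for illustration. Unlike the patch, which only range-checks on 64-bit hosts, the sketch always works on 64-bit addresses and assumes both addresses are word-aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: CALL opcode, op = 01 in bits 31:30. */
#define SPARC_CALL_OP 0x40000000u

/* True if val is representable in a signed field of `bits` bits. */
static bool fits_signed(int64_t val, int bits)
{
    int64_t lim = (int64_t)1 << (bits - 1);
    return val >= -lim && val < lim;
}

/* Build "call addr" for an instruction located at jmp_addr.
   Returns false if the word displacement does not fit in 30 bits,
   i.e. the target is 2GB or more away. */
static bool encode_call(uint64_t jmp_addr, uint64_t addr, uint32_t *insn)
{
    /* Word displacement; relies on the usual two's-complement conversion. */
    int64_t disp = (int64_t)(addr - jmp_addr) / 4;

    if (!fits_signed(disp, 30)) {
        return false;
    }
    *insn = SPARC_CALL_OP | ((uint32_t)disp & 0x3fffffffu);
    return true;
}

/* Inverse: recover the branch target from an encoded CALL at pc. */
static uint64_t call_target(uint64_t pc, uint32_t insn)
{
    int64_t disp = insn & 0x3fffffffu;

    if (disp & 0x20000000) {        /* sign bit of the 30-bit field */
        disp -= 0x40000000;
    }
    return pc + (uint64_t)(disp * 4);
}
```

With these helpers, a call to the instruction two words ahead encodes with a displacement of 2, which matches the CALL | (8 >> 2) placeholder that the patch emits before the jump is first patched.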
* Re: [Qemu-devel] [PATCH 08/15] Avoid declaring the env variable at all if CONFIG_TCG_PASS_AREG0.
2012-03-25 22:27 ` [Qemu-devel] [PATCH 08/15] Avoid declaring the env variable at all if CONFIG_TCG_PASS_AREG0 Richard Henderson
@ 2012-03-26 16:26 ` Blue Swirl
2012-03-26 16:31 ` Richard Henderson
0 siblings, 1 reply; 22+ messages in thread
From: Blue Swirl @ 2012-03-26 16:26 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel
On Sun, Mar 25, 2012 at 22:27, Richard Henderson <rth@twiddle.net> wrote:
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> dyngen-exec.h | 5 +++++
> user-exec.c | 17 ++++++++++++++---
> 2 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/dyngen-exec.h b/dyngen-exec.h
> index cfeef99..65fcb43 100644
> --- a/dyngen-exec.h
> +++ b/dyngen-exec.h
> @@ -19,6 +19,10 @@
> #if !defined(__DYNGEN_EXEC_H__)
> #define __DYNGEN_EXEC_H__
>
> +/* If the target has indicated that it does not need an AREG0,
> + don't declare the env variable at all, much less as a register. */
> +#if !defined(CONFIG_TCG_PASS_AREG0)
> +
> #if defined(CONFIG_TCG_INTERPRETER)
> /* The TCG interpreter does not need a special register AREG0,
> * but it is possible to use one by defining AREG0.
> @@ -65,4 +69,5 @@ register CPUArchState *env asm(AREG0);
> extern CPUArchState *env;
> #endif
>
> +#endif /* !CONFIG_TCG_PASS_AREG0 */
> #endif /* !defined(__DYNGEN_EXEC_H__) */
> diff --git a/user-exec.c b/user-exec.c
> index cd905ff..e326104 100644
> --- a/user-exec.c
> +++ b/user-exec.c
> @@ -58,7 +58,9 @@ void cpu_resume_from_signal(CPUArchState *env1, void *puc)
> struct sigcontext *uc = puc;
> #endif
>
> +#ifndef CONFIG_TCG_PASS_AREG0
> env = env1;
> +#endif
Shouldn't longjmp() restore global registers as well? Actually, we
return to cpu-exec.c which does not use global env. Isn't this
useless?
>
> /* XXX: restore cpu registers saved in host registers */
>
> @@ -74,8 +76,8 @@ void cpu_resume_from_signal(CPUArchState *env1, void *puc)
> sigprocmask(SIG_SETMASK, &uc->sc_mask, NULL);
> #endif
> }
> - env->exception_index = -1;
> - longjmp(env->jmp_env, 1);
> + env1->exception_index = -1;
> + longjmp(env1->jmp_env, 1);
> }
>
> /* 'pc' is the host PC at which the exception was raised. 'address' is
> @@ -89,9 +91,18 @@ static inline int handle_cpu_signal(unsigned long pc, unsigned long address,
> TranslationBlock *tb;
> int ret;
>
> + /* XXX: find a correct solution for multithread */
> +#ifdef CONFIG_TCG_PASS_AREG0
> + /* ??? While we no longer have a global env register, if PC is within
> + the code_gen_buffer then we know that env is within a known register
> + there, and we could have the signal handler extract that value. */
> + CPUArchState *env = cpu_single_env;
This just makes env a useless variable. The original code was trying
to restore the global variable, but the functions called later do not
use global env.
I'd change user-exec.c to work without global env use.
> +#else
> if (cpu_single_env) {
> - env = cpu_single_env; /* XXX: find a correct solution for multithread */
> + env = cpu_single_env;
> }
> +#endif
> +
> #if defined(DEBUG_SIGNAL)
> qemu_printf("qemu: SIGSEGV pc=0x%08lx address=%08lx w=%d oldset=0x%08lx\n",
> pc, address, is_write, *(unsigned long *)old_set);
> --
> 1.7.7.6
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH 08/15] Avoid declaring the env variable at all if CONFIG_TCG_PASS_AREG0.
2012-03-26 16:26 ` Blue Swirl
@ 2012-03-26 16:31 ` Richard Henderson
0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2012-03-26 16:31 UTC (permalink / raw)
To: Blue Swirl; +Cc: qemu-devel
On 03/26/12 09:26, Blue Swirl wrote:
>> +#ifndef CONFIG_TCG_PASS_AREG0
>> env = env1;
>> +#endif
>
> Shouldn't longjmp() restore global registers as well? Actually, we
> return to cpu-exec.c which does not use global env. Isn't this
> useless?
Possibly. I didn't think to try to actually remove these uses,
just get the code to compile without env being declared.
> I'd change user-exec.c to work without global env use.
I'll give it a shot...
r~
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH 09/15] tcg-sparc: Do not use a global register for AREG0.
2012-03-25 22:27 ` [Qemu-devel] [PATCH 09/15] tcg-sparc: Do not use a global register for AREG0 Richard Henderson
@ 2012-03-26 16:31 ` Blue Swirl
2012-03-26 16:52 ` Richard Henderson
0 siblings, 1 reply; 22+ messages in thread
From: Blue Swirl @ 2012-03-26 16:31 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel
On Sun, Mar 25, 2012 at 22:27, Richard Henderson <rth@twiddle.net> wrote:
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> dyngen-exec.h | 20 +++++++++++---------
> exec.c | 16 ++++++++++++++--
> 2 files changed, 25 insertions(+), 11 deletions(-)
>
> diff --git a/dyngen-exec.h b/dyngen-exec.h
> index 65fcb43..d673f9f 100644
> --- a/dyngen-exec.h
> +++ b/dyngen-exec.h
> @@ -41,13 +41,8 @@
> #elif defined(__mips__)
> #define AREG0 "s0"
> #elif defined(__sparc__)
> -#ifdef CONFIG_SOLARIS
> -#define AREG0 "g2"
> -#elif HOST_LONG_BITS == 64
> -#define AREG0 "g5"
> -#else
> -#define AREG0 "g6"
> -#endif
> +/* Don't use a global register. Working around glibc clobbering these
> + global registers is more trouble than just using TLS. */
> #elif defined(__s390__)
> #define AREG0 "r10"
> #elif defined(__alpha__)
> @@ -62,12 +57,19 @@
> #error unsupported CPU
> #endif
>
> -#if defined(AREG0)
> +#ifdef AREG0
> register CPUArchState *env asm(AREG0);
> #else
> -/* TODO: Try env = cpu_single_env. */
> +/* It's tempting to #define env cpu_single_env, but that runs afoul of
> + the other macro usage in target-foo/helper.h. Instead use an alias.
> + That has to happen where cpu_single_env is defined, so just a
> + declaration here. */
> +#ifdef __linux__
> +extern __thread CPUArchState *env;
> +#else
> extern CPUArchState *env;
> #endif
> +#endif /* AREG0 */
>
> #endif /* !CONFIG_TCG_PASS_AREG0 */
> #endif /* !defined(__DYNGEN_EXEC_H__) */
> diff --git a/exec.c b/exec.c
> index 6731ab8..d84caa5 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -124,9 +124,21 @@ static MemoryRegion io_mem_subpage_ram;
> #endif
>
> CPUArchState *first_cpu;
> -/* current CPU in the current thread. It is only valid inside
> - cpu_exec() */
> +
> +/* Current CPU in the current thread. It is only valid inside cpu_exec(). */
> DEFINE_TLS(CPUArchState *,cpu_single_env);
> +
> +/* In dyngen-exec.h, without AREG0, we fall back to an alias to cpu_single_env.
> + We can't actually tell from here whether that's needed or not, but it does
> + not hurt to go ahead and make the declaration. */
> +#ifndef CONFIG_TCG_PASS_AREG0
> +extern
> +#ifdef __linux__
> + __thread
> +#endif
> + CPUArchState *env __attribute__((alias("tls__cpu_single_env")));
> +#endif /* CONFIG_TCG_PASS_AREG0 */
Please use DECLARE_TLS/DEFINE_TLS and global env accesses should also
use tls_var().
> +
> /* 0 = Do not count executed instructions.
> 1 = Precise instruction counting.
> 2 = Adaptive rate instruction counting. */
> --
> 1.7.7.6
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH 13/15] tcg-sparc: Use defines for temporaries.
2012-03-25 22:27 ` [Qemu-devel] [PATCH 13/15] tcg-sparc: Use defines for temporaries Richard Henderson
@ 2012-03-26 16:38 ` Blue Swirl
0 siblings, 0 replies; 22+ messages in thread
From: Blue Swirl @ 2012-03-26 16:38 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel
On Sun, Mar 25, 2012 at 22:27, Richard Henderson <rth@twiddle.net> wrote:
> And change from %i4 to %g1 to remove a v8plus fixme.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> tcg/sparc/tcg-target.c | 110 ++++++++++++++++++++++++-----------------------
> 1 files changed, 56 insertions(+), 54 deletions(-)
>
> diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
> index 181ba26..896fab1 100644
> --- a/tcg/sparc/tcg-target.c
> +++ b/tcg/sparc/tcg-target.c
> @@ -59,8 +59,11 @@ static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
> };
> #endif
>
> +#define TCG_REG_TMP TCG_REG_G1
> +#define TCG_REG_TMP2 TCG_REG_I5
> +
> #ifdef CONFIG_USE_GUEST_BASE
> -# define TCG_GUEST_BASE_REG TCG_REG_I3
> +# define TCG_GUEST_BASE_REG TCG_REG_I4
> #else
> # define TCG_GUEST_BASE_REG TCG_REG_G0
> #endif
> @@ -372,10 +375,10 @@ static inline void tcg_out_movi(TCGContext *s, TCGType type,
> tcg_out_sethi(s, ret, ~arg);
> tcg_out_arithi(s, ret, ret, (arg & 0x3ff) | -0x400, ARITH_XOR);
> } else {
> - tcg_out_movi_imm32(s, TCG_REG_I4, arg >> (TCG_TARGET_REG_BITS / 2));
> - tcg_out_arithi(s, TCG_REG_I4, TCG_REG_I4, 32, SHIFT_SLLX);
> - tcg_out_movi_imm32(s, ret, arg);
> - tcg_out_arith(s, ret, ret, TCG_REG_I4, ARITH_OR);
> + tcg_out_movi_imm32(s, ret, arg >> (TCG_TARGET_REG_BITS / 2));
> + tcg_out_arithi(s, ret, ret, 32, SHIFT_SLLX);
> + tcg_out_movi_imm32(s, TCG_REG_TMP2, arg);
> + tcg_out_arith(s, ret, ret, TCG_REG_TMP2, ARITH_OR);
> }
> }
>
> @@ -392,8 +395,8 @@ static inline void tcg_out_ldst(TCGContext *s, int ret, int addr,
> tcg_out32(s, op | INSN_RD(ret) | INSN_RS1(addr) |
> INSN_IMM13(offset));
> } else {
> - tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_I5, offset);
> - tcg_out_ldst_rr(s, ret, addr, TCG_REG_I5, op);
> + tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_TMP, offset);
> + tcg_out_ldst_rr(s, ret, addr, TCG_REG_TMP, op);
> }
> }
>
> @@ -435,8 +438,8 @@ static inline void tcg_out_addi(TCGContext *s, int reg, tcg_target_long val)
> if (check_fit_tl(val, 13))
> tcg_out_arithi(s, reg, reg, val, ARITH_ADD);
> else {
> - tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_I5, val);
> - tcg_out_arith(s, reg, reg, TCG_REG_I5, ARITH_ADD);
> + tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_TMP, val);
> + tcg_out_arith(s, reg, reg, TCG_REG_TMP, ARITH_ADD);
> }
> }
> }
> @@ -448,8 +451,8 @@ static inline void tcg_out_andi(TCGContext *s, int rd, int rs,
> if (check_fit_tl(val, 13))
> tcg_out_arithi(s, rd, rs, val, ARITH_AND);
> else {
> - tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_I5, val);
> - tcg_out_arith(s, rd, rs, TCG_REG_I5, ARITH_AND);
> + tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_TMP, val);
> + tcg_out_arith(s, rd, rs, TCG_REG_TMP, ARITH_AND);
> }
> }
> }
> @@ -461,8 +464,8 @@ static void tcg_out_div32(TCGContext *s, int rd, int rs1,
> if (uns) {
> tcg_out_sety(s, TCG_REG_G0);
> } else {
> - tcg_out_arithi(s, TCG_REG_I5, rs1, 31, SHIFT_SRA);
> - tcg_out_sety(s, TCG_REG_I5);
> + tcg_out_arithi(s, TCG_REG_TMP, rs1, 31, SHIFT_SRA);
> + tcg_out_sety(s, TCG_REG_TMP);
> }
>
> tcg_out_arithc(s, rd, rs1, val2, val2const,
> @@ -608,8 +611,8 @@ static void tcg_out_setcond_i32(TCGContext *s, TCGCond cond, TCGArg ret,
> case TCG_COND_GTU:
> case TCG_COND_GEU:
> if (c2const && c2 != 0) {
> - tcg_out_movi_imm13(s, TCG_REG_I5, c2);
> - c2 = TCG_REG_I5;
> + tcg_out_movi_imm13(s, TCG_REG_TMP, c2);
> + c2 = TCG_REG_TMP;
> }
> t = c1, c1 = c2, c2 = t, c2const = 0;
> cond = tcg_swap_cond(cond);
> @@ -656,15 +659,15 @@ static void tcg_out_setcond2_i32(TCGContext *s, TCGCond cond, TCGArg ret,
>
> switch (cond) {
> case TCG_COND_EQ:
> - tcg_out_setcond_i32(s, TCG_COND_EQ, TCG_REG_I5, al, bl, blconst);
> + tcg_out_setcond_i32(s, TCG_COND_EQ, TCG_REG_TMP, al, bl, blconst);
> tcg_out_setcond_i32(s, TCG_COND_EQ, ret, ah, bh, bhconst);
> - tcg_out_arith(s, ret, ret, TCG_REG_I5, ARITH_AND);
> + tcg_out_arith(s, ret, ret, TCG_REG_TMP, ARITH_AND);
> break;
>
> case TCG_COND_NE:
> - tcg_out_setcond_i32(s, TCG_COND_NE, TCG_REG_I5, al, al, blconst);
> + tcg_out_setcond_i32(s, TCG_COND_NE, TCG_REG_TMP, al, al, blconst);
> tcg_out_setcond_i32(s, TCG_COND_NE, ret, ah, bh, bhconst);
> - tcg_out_arith(s, ret, ret, TCG_REG_I5, ARITH_OR);
> + tcg_out_arith(s, ret, ret, TCG_REG_TMP, ARITH_OR);
> break;
>
> default:
> @@ -964,8 +967,8 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int sizeop)
> #else
> addr_reg = args[addrlo_idx];
> if (TCG_TARGET_REG_BITS == 64 && TARGET_LONG_BITS == 32) {
> - tcg_out_arithi(s, TCG_REG_I5, addr_reg, 0, SHIFT_SRL);
> - addr_reg = TCG_REG_I5;
> + tcg_out_arithi(s, TCG_REG_TMP, addr_reg, 0, SHIFT_SRL);
> + addr_reg = TCG_REG_TMP;
> }
> if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
> int reg64 = (datalo < 16 ? datalo : TCG_REG_O0);
> @@ -1008,12 +1011,11 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int sizeop)
> offsetof(CPUTLBEntry, addr_write));
>
> if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
> - /* Reconstruct the full 64-bit value in %g1, using %o2 as temp. */
> - /* ??? Redefine the temps from %i4/%i5 so that we have a o/g temp. */
> - tcg_out_arithi(s, TCG_REG_G1, datalo, 0, SHIFT_SRL);
> + /* Reconstruct the full 64-bit value. */
> + tcg_out_arithi(s, TCG_REG_TMP, datalo, 0, SHIFT_SRL);
> tcg_out_arithi(s, TCG_REG_O2, datahi, 32, SHIFT_SLLX);
> - tcg_out_arith(s, TCG_REG_G1, TCG_REG_G1, TCG_REG_O2, ARITH_OR);
> - datalo = TCG_REG_G1;
> + tcg_out_arith(s, TCG_REG_O2, TCG_REG_TMP, TCG_REG_O2, ARITH_OR);
> + datalo = TCG_REG_O2;
> }
>
> /* The fast path is exactly one insn. Thus we can perform the entire
> @@ -1054,16 +1056,14 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int sizeop)
> #else
> addr_reg = args[addrlo_idx];
> if (TCG_TARGET_REG_BITS == 64 && TARGET_LONG_BITS == 32) {
> - tcg_out_arithi(s, TCG_REG_I5, addr_reg, 0, SHIFT_SRL);
> - addr_reg = TCG_REG_I5;
> + tcg_out_arithi(s, TCG_REG_TMP, addr_reg, 0, SHIFT_SRL);
> + addr_reg = TCG_REG_TMP;
> }
> if (TCG_TARGET_REG_BITS == 32 && sizeop == 3) {
> - /* Reconstruct the full 64-bit value in %g1, using %o2 as temp. */
> - /* ??? Redefine the temps from %i4/%i5 so that we have a o/g temp. */
> - tcg_out_arithi(s, TCG_REG_G1, datalo, 0, SHIFT_SRL);
> + tcg_out_arithi(s, TCG_REG_TMP, datalo, 0, SHIFT_SRL);
> tcg_out_arithi(s, TCG_REG_O2, datahi, 32, SHIFT_SLLX);
> - tcg_out_arith(s, TCG_REG_G1, TCG_REG_G1, TCG_REG_O2, ARITH_OR);
> - datalo = TCG_REG_G1;
> + tcg_out_arith(s, TCG_REG_O2, TCG_REG_TMP, TCG_REG_O2, ARITH_OR);
> + datalo = TCG_REG_O2;
> }
> tcg_out_ldst_rr(s, datalo, addr_reg,
> (GUEST_BASE ? TCG_GUEST_BASE_REG : TCG_REG_G0),
> @@ -1087,14 +1087,14 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
> case INDEX_op_goto_tb:
> if (s->tb_jmp_offset) {
> /* direct jump method */
> - tcg_out_sethi(s, TCG_REG_I5, args[0] & 0xffffe000);
> - tcg_out32(s, JMPL | INSN_RD(TCG_REG_G0) | INSN_RS1(TCG_REG_I5) |
> + tcg_out_sethi(s, TCG_REG_TMP, args[0] & 0xffffe000);
> + tcg_out32(s, JMPL | INSN_RD(TCG_REG_G0) | INSN_RS1(TCG_REG_TMP) |
> INSN_IMM13((args[0] & 0x1fff)));
> s->tb_jmp_offset[args[0]] = s->code_ptr - s->code_buf;
> } else {
> /* indirect jump method */
> - tcg_out_ld_ptr(s, TCG_REG_I5, (tcg_target_long)(s->tb_next + args[0]));
> - tcg_out32(s, JMPL | INSN_RD(TCG_REG_G0) | INSN_RS1(TCG_REG_I5) |
> + tcg_out_ld_ptr(s, TCG_REG_TMP, (tcg_target_long)(s->tb_next + args[0]));
> + tcg_out32(s, JMPL | INSN_RD(TCG_REG_G0) | INSN_RS1(TCG_REG_TMP) |
> INSN_RS2(TCG_REG_G0));
> }
> tcg_out_nop(s);
> @@ -1106,9 +1106,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
> - (tcg_target_ulong)s->code_ptr) >> 2)
> & 0x3fffffff));
> else {
> - tcg_out_ld_ptr(s, TCG_REG_I5,
> + tcg_out_ld_ptr(s, TCG_REG_TMP,
> (tcg_target_long)(s->tb_next + args[0]));
> - tcg_out32(s, JMPL | INSN_RD(TCG_REG_O7) | INSN_RS1(TCG_REG_I5) |
> + tcg_out32(s, JMPL | INSN_RD(TCG_REG_O7) | INSN_RS1(TCG_REG_TMP) |
> INSN_RS2(TCG_REG_G0));
> }
> /* delay slot */
> @@ -1214,11 +1214,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
>
> case INDEX_op_rem_i32:
> case INDEX_op_remu_i32:
> - tcg_out_div32(s, TCG_REG_I5, args[1], args[2], const_args[2],
> + tcg_out_div32(s, TCG_REG_TMP, args[1], args[2], const_args[2],
> opc == INDEX_op_remu_i32);
> - tcg_out_arithc(s, TCG_REG_I5, TCG_REG_I5, args[2], const_args[2],
> + tcg_out_arithc(s, TCG_REG_TMP, TCG_REG_TMP, args[2], const_args[2],
> ARITH_UMUL);
> - tcg_out_arith(s, args[0], args[1], TCG_REG_I5, ARITH_SUB);
> + tcg_out_arith(s, args[0], args[1], TCG_REG_TMP, ARITH_SUB);
> break;
>
> case INDEX_op_brcond_i32:
> @@ -1335,11 +1335,11 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args,
> goto gen_arith;
> case INDEX_op_rem_i64:
> case INDEX_op_remu_i64:
> - tcg_out_arithc(s, TCG_REG_I5, args[1], args[2], const_args[2],
> + tcg_out_arithc(s, TCG_REG_TMP, args[1], args[2], const_args[2],
> opc == INDEX_op_rem_i64 ? ARITH_SDIVX : ARITH_UDIVX);
> - tcg_out_arithc(s, TCG_REG_I5, TCG_REG_I5, args[2], const_args[2],
> + tcg_out_arithc(s, TCG_REG_TMP, TCG_REG_TMP, args[2], const_args[2],
> ARITH_MULX);
> - tcg_out_arith(s, args[0], args[1], TCG_REG_I5, ARITH_SUB);
> + tcg_out_arith(s, args[0], args[1], TCG_REG_TMP, ARITH_SUB);
> break;
> case INDEX_op_ext32s_i64:
> if (const_args[1]) {
> @@ -1537,15 +1537,17 @@ static void tcg_target_init(TCGContext *s)
> (1 << TCG_REG_O7));
>
> tcg_regset_clear(s->reserved_regs);
> - tcg_regset_set_reg(s->reserved_regs, TCG_REG_G0);
> -#if TCG_TARGET_REG_BITS == 64
> - tcg_regset_set_reg(s->reserved_regs, TCG_REG_I4); // for internal use
> -#endif
> - tcg_regset_set_reg(s->reserved_regs, TCG_REG_I5); // for internal use
> - tcg_regset_set_reg(s->reserved_regs, TCG_REG_I6);
> - tcg_regset_set_reg(s->reserved_regs, TCG_REG_I7);
> - tcg_regset_set_reg(s->reserved_regs, TCG_REG_O6);
> - tcg_regset_set_reg(s->reserved_regs, TCG_REG_O7);
> + tcg_regset_set_reg(s->reserved_regs, TCG_REG_G0); // zero
> + tcg_regset_set_reg(s->reserved_regs, TCG_REG_G6); // reserved for os
> + tcg_regset_set_reg(s->reserved_regs, TCG_REG_G7); // thread pointer
> + tcg_regset_set_reg(s->reserved_regs, TCG_REG_I6); // frame pointer
> + tcg_regset_set_reg(s->reserved_regs, TCG_REG_I7); // return address
> + tcg_regset_set_reg(s->reserved_regs, TCG_REG_O6); // stack pointer
> + tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP); // for internal use
> + if (TCG_TARGET_REG_BITS == 64) {
> + tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP2); // for internal use
Please fix the comment style above.
> + }
> +
> tcg_add_target_add_op_defs(sparc_op_defs);
> }
>
> --
> 1.7.7.6
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Qemu-devel] [PATCH 09/15] tcg-sparc: Do not use a global register for AREG0.
2012-03-26 16:31 ` Blue Swirl
@ 2012-03-26 16:52 ` Richard Henderson
2012-03-26 17:22 ` Blue Swirl
0 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2012-03-26 16:52 UTC (permalink / raw)
To: Blue Swirl; +Cc: qemu-devel
On 03/26/12 09:31, Blue Swirl wrote:
>> > +/* In dyngen-exec.h, without AREG0, we fall back to an alias to cpu_single_env.
>> > + We can't actually tell from here whether that's needed or not, but it does
>> > + not hurt to go ahead and make the declaration. */
>> > +#ifndef CONFIG_TCG_PASS_AREG0
>> > +extern
>> > +#ifdef __linux__
>> > + __thread
>> > +#endif
>> > + CPUArchState *env __attribute__((alias("tls__cpu_single_env")));
>> > +#endif /* CONFIG_TCG_PASS_AREG0 */
> Please use DECLARE_TLS/DEFINE_TLS and global env accesses should also
> use tls_var().
>
That won't work.
This is intended to be a drop-in replacement for the "env" symbol that
we declare in dyngen-exec.h. For all other hosts, this symbol is a
global register variable. We can't go wrapping tls_var around all uses
in all target backends.
As I say in the comment, the most natural replacement is a preprocessor
macro, but then that fails with the uses of "env" in the DEF_HELPER_N
macros.
Which leaves no alternative -- short of converting *all* targets to
CONFIG_TCG_PASS_AREG0 first -- except the symbol alias you see there.
Hmm... actually... I'm wrong about the use of preprocessor macros.
The simple solution there is to re-order the includes on a few ports.
I.e. "helper.h" must come before "dyngen-exec.h". Now that's a much
simpler fix...
r~
^ permalink raw reply [flat|nested] 22+ messages in thread
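The include-order point in the message above can be sketched in isolation. This is a simplified stand-in, not QEMU's real helper.h: the CPUState struct, the dh_ctype_* macros, DEF_HELPER_1 and helper_get_index are hypothetical reductions, and cpu_single_env is a plain static here rather than a TLS variable. The idea it illustrates is that the helper declaration macros use the bare token "env" as a type tag, so they have to be expanded before "env" becomes an object-like macro.

```c
#include <assert.h>
#include <stddef.h>

typedef struct CPUState {
    int exception_index;
} CPUState;

/* --- stand-in for "helper.h": must be seen first ------------------- */
#define dh_ctype_env CPUState *
#define dh_ctype_int int
#define dh_ctype(t) dh_ctype_##t
/* ret and t1 are macro-expanded before being handed to dh_ctype(), so
   the token "env" must not yet be a macro at this point. */
#define DEF_HELPER_1(name, ret, t1) \
    dh_ctype(ret) helper_##name(dh_ctype(t1));

DEF_HELPER_1(get_index, int, env)   /* int helper_get_index(CPUState *); */

/* --- stand-in for "dyngen-exec.h": included afterwards ------------- */
static CPUState *cpu_single_env;
#define env cpu_single_env

/* Had the #define above come before DEF_HELPER_1, the "env" argument
   would have expanded to cpu_single_env and dh_ctype() would have
   produced the undefined token dh_ctype_cpu_single_env. */

int helper_get_index(CPUState *e)
{
    return e->exception_index;
}

/* Legacy-style code that refers to the "global" env still compiles. */
static int current_index(void)
{
    return env->exception_index;
}
```

Swapping the two marked sections reproduces the failure described in the message, which is why reordering the includes on the affected ports is enough.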
* Re: [Qemu-devel] [PATCH 09/15] tcg-sparc: Do not use a global register for AREG0.
2012-03-26 16:52 ` Richard Henderson
@ 2012-03-26 17:22 ` Blue Swirl
0 siblings, 0 replies; 22+ messages in thread
From: Blue Swirl @ 2012-03-26 17:22 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel
On Mon, Mar 26, 2012 at 16:52, Richard Henderson <rth@twiddle.net> wrote:
> On 03/26/12 09:31, Blue Swirl wrote:
>>> > +/* In dyngen-exec.h, without AREG0, we fall back to an alias to cpu_single_env.
>>> > + We can't actually tell from here whether that's needed or not, but it does
>>> > + not hurt to go ahead and make the declaration. */
>>> > +#ifndef CONFIG_TCG_PASS_AREG0
>>> > +extern
>>> > +#ifdef __linux__
>>> > + __thread
>>> > +#endif
>>> > + CPUArchState *env __attribute__((alias("tls__cpu_single_env")));
>>> > +#endif /* CONFIG_TCG_PASS_AREG0 */
>> Please use DECLARE_TLS/DEFINE_TLS and global env accesses should also
>> use tls_var().
>>
>
> That won't work.
>
> This is intended to be a drop-in replacement for the "env" symbol that
> we declare in dyngen-exec.h. For all other hosts, this symbol is a
> global register variable. We can't go wrapping tls_var around all uses
> in all target backends.
>
> As I say in the comment, the most natural replacement is a preprocessor
> macro, but then that fails with the uses of "env" in the DEF_HELPER_N
> macros.
>
> Which leaves no alternative -- short of converting *all* targets to
> CONFIG_TCG_PASS_AREG0 first -- except the symbol alias you see there.
But at that point there will be no global env use anymore, so
dyngen-exec.h etc. can be removed. Perhaps this patch and its
dependencies should wait for that to happen. As an intermediate hack
it's sort of OK.
> Hmm... actually... I'm wrong about the use of preprocessor macros.
> The simple solution there is to re-order the includes on a few ports.
> I.e. "helper.h" must come before "dyngen-exec.h". Now that's a much
> simpler fix...
OK.
>
>
> r~
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread (last message ~2012-03-26 17:23 UTC)
Thread overview: 22+ messages
2012-03-25 22:27 [Qemu-devel] [PATCH 00/15] tcg-sparc improvments Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 01/15] tcg-sparc: Hack in qemu_ld/st64 for 32-bit Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 02/15] tcg-sparc: Fix ADDX opcode Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 03/15] tcg-sparc: Assume v9 cpu always, i.e. force v8plus in 32-bit mode Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 04/15] tcg-sparc: Fix qemu_ld/st to handle 32-bit host Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 05/15] tcg-sparc: Simplify qemu_ld/st direct memory paths Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 06/15] tcg-sparc: Support GUEST_BASE Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 07/15] tcg-sparc: Steamline qemu_ld/st more Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 08/15] Avoid declaring the env variable at all if CONFIG_TCG_PASS_AREG0 Richard Henderson
2012-03-26 16:26 ` Blue Swirl
2012-03-26 16:31 ` Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 09/15] tcg-sparc: Do not use a global register for AREG0 Richard Henderson
2012-03-26 16:31 ` Blue Swirl
2012-03-26 16:52 ` Richard Henderson
2012-03-26 17:22 ` Blue Swirl
2012-03-25 22:27 ` [Qemu-devel] [PATCH 10/15] tcg-sparc: Change AREG0 in generated code to %i0 Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 11/15] tcg-sparc: Clean up cruft stemming from attempts to use global registers Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 12/15] tcg-sparc: Mask shift immediates to avoid illegal insns Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 13/15] tcg-sparc: Use defines for temporaries Richard Henderson
2012-03-26 16:38 ` Blue Swirl
2012-03-25 22:27 ` [Qemu-devel] [PATCH 14/15] tcg-sparc: Add %g/%o registers to alloc_order Richard Henderson
2012-03-25 22:27 ` [Qemu-devel] [PATCH 15/15] tcg-sparc: Fix and enable direct TB chaining Richard Henderson