* [Qemu-devel] [PATCH 02/20] target-*: Unconditionally emit tcg_gen_insn_start
2015-09-02 5:51 [Qemu-devel] [RFC 00/20] Do away with TB retranslation Richard Henderson
@ 2015-09-02 5:51 ` Richard Henderson
2015-09-02 5:51 ` [Qemu-devel] [PATCH 03/20] tcg: Allow extra data to be attached to insn_start Richard Henderson
` (21 subsequent siblings)
22 siblings, 0 replies; 62+ messages in thread
From: Richard Henderson @ 2015-09-02 5:51 UTC
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
While we're at it, emit the opcode adjacent to where we currently
record data for search_pc. This puts gen_io_start et al on the
"correct" side of the marker.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target-alpha/translate.c | 6 ++----
target-arm/translate-a64.c | 5 +----
target-arm/translate.c | 5 +----
target-cris/translate.c | 5 +----
target-cris/translate_v10.c | 3 ---
target-i386/translate.c | 5 ++---
target-lm32/translate.c | 5 +----
target-m68k/translate.c | 10 +++++-----
target-microblaze/translate.c | 5 +----
target-mips/translate.c | 9 ++++-----
target-moxie/translate.c | 6 ++----
target-openrisc/translate.c | 5 +----
target-ppc/translate.c | 5 ++---
target-s390x/translate.c | 6 ++----
target-sh4/translate.c | 14 +++++---------
target-sparc/translate.c | 10 +++++-----
target-tricore/translate.c | 2 ++
target-unicore32/translate.c | 5 +----
target-xtensa/translate.c | 5 +----
19 files changed, 39 insertions(+), 77 deletions(-)
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index fe0e841..0c43ffa 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -2928,16 +2928,14 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
+ tcg_gen_insn_start(ctx.pc);
+
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
}
insn = cpu_ldl_code(env, ctx.pc);
num_insns++;
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(ctx.pc);
- }
-
TCGV_UNUSED_I64(ctx.zero);
TCGV_UNUSED_I64(ctx.sink);
TCGV_UNUSED_I64(ctx.lit);
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 549113c..48c34d1 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -11021,15 +11021,12 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
+ tcg_gen_insn_start(dc->pc);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
}
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(dc->pc);
- }
-
if (dc->ss_active && !dc->pstate_ss) {
/* Singlestep state is Active-pending.
* If we're in this state at the start of a TB then either
diff --git a/target-arm/translate.c b/target-arm/translate.c
index a9c4d6b..8fc7edd 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -11312,14 +11312,11 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
+ tcg_gen_insn_start(dc->pc);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
gen_io_start();
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(dc->pc);
- }
-
if (dc->ss_active && !dc->pstate_ss) {
/* Singlestep state is Active-pending.
* If we're in this state at the start of a TB then either
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 199e7d9..3279bad 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3007,10 +3007,6 @@ static unsigned int crisv32_decoder(CPUCRISState *env, DisasContext *dc)
int insn_len = 2;
int i;
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(dc->pc);
- }
-
/* Load a halfword onto the instruction register. */
dc->ir = cris_fetch(env, dc, dc->pc, 2, 0);
@@ -3210,6 +3206,7 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
+ tcg_gen_insn_start(dc->pc);
/* Pretty disas. */
LOG_DIS("%8.8x:\t", dc->pc);
diff --git a/target-cris/translate_v10.c b/target-cris/translate_v10.c
index e0a271e..5f1b5ce 100644
--- a/target-cris/translate_v10.c
+++ b/target-cris/translate_v10.c
@@ -1207,9 +1207,6 @@ static unsigned int crisv10_decoder(CPUCRISState *env, DisasContext *dc)
{
unsigned int insn_len = 2;
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP)))
- tcg_gen_insn_start(dc->pc);
-
/* Load a halfword onto the instruction register. */
dc->ir = cpu_lduw_code(env, dc->pc);
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 8089b2e..4c84c9a 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -4412,9 +4412,6 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
target_ulong next_eip, tval;
int rex_w, rex_r;
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(pc_start);
- }
s->pc = pc_start;
prefixes = 0;
s->override = -1;
@@ -8022,6 +8019,8 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
+ tcg_gen_insn_start(pc_ptr);
+
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
gen_io_start();
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index b1b4cbb..84eeac3 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1005,10 +1005,6 @@ static const DecoderInfo decinfo[] = {
static inline void decode(DisasContext *dc, uint32_t ir)
{
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(dc->pc);
- }
-
dc->ir = ir;
LOG_DIS("%8.8x\t", dc->ir);
@@ -1106,6 +1102,7 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
+ tcg_gen_insn_start(dc->pc);
/* Pretty disas. */
LOG_DIS("%8.8x:\t", dc->pc);
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index e34bf2b..bfd9c00 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2955,10 +2955,6 @@ static void disas_m68k_insn(CPUM68KState * env, DisasContext *s)
{
uint16_t insn;
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(s->pc);
- }
-
insn = cpu_lduw_code(env, s->pc);
s->pc += 2;
@@ -3025,8 +3021,12 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
- if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
+ tcg_gen_insn_start(dc->pc);
+
+ if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
+ }
+
dc->insn_pc = dc->pc;
disas_m68k_insn(env, dc);
num_insns++;
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index bd2f7cd..9e046f7 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1603,10 +1603,6 @@ static inline void decode(DisasContext *dc, uint32_t ir)
{
int i;
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(dc->pc);
- }
-
dc->ir = ir;
LOG_DIS("%8.8x\t", dc->ir);
@@ -1733,6 +1729,7 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
+ tcg_gen_insn_start(dc->pc);
/* Pretty disas. */
LOG_DIS("%8.8x:\t", dc->pc);
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 28cdc31..9226420 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -19515,10 +19515,6 @@ static void decode_opc(CPUMIPSState *env, DisasContext *ctx)
gen_set_label(l1);
}
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(ctx->pc);
- }
-
op = MASK_OP_MAJOR(ctx->opcode);
rs = (ctx->opcode >> 21) & 0x1f;
rt = (ctx->opcode >> 16) & 0x1f;
@@ -20234,8 +20230,11 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
- if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
+ tcg_gen_insn_start(ctx.pc);
+
+ if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
+ }
is_slot = ctx.hflags & MIPS_HFLAG_BMASK;
if (!(ctx.hflags & MIPS_HFLAG_M16)) {
diff --git a/target-moxie/translate.c b/target-moxie/translate.c
index c5d3741..cfc3cec 100644
--- a/target-moxie/translate.c
+++ b/target-moxie/translate.c
@@ -154,10 +154,6 @@ static int decode_opc(MoxieCPU *cpu, DisasContext *ctx)
/* Set the default instruction length. */
int length = 2;
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(ctx->pc);
- }
-
/* Examine the 16-bit opcode. */
opcode = ctx->opcode;
@@ -866,6 +862,8 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
+ tcg_gen_insn_start(ctx.pc);
+
ctx.opcode = cpu_lduw_code(env, ctx.pc);
ctx.pc += decode_opc(cpu, &ctx);
num_insns++;
diff --git a/target-openrisc/translate.c b/target-openrisc/translate.c
index b316f3b..d5da295 100644
--- a/target-openrisc/translate.c
+++ b/target-openrisc/translate.c
@@ -1687,10 +1687,7 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
tcg_ctx.gen_opc_instr_start[k] = 1;
tcg_ctx.gen_opc_icount[k] = num_insns;
}
-
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(dc->pc);
- }
+ tcg_gen_insn_start(dc->pc);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 2759793..2872c77 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11495,6 +11495,8 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
+ tcg_gen_insn_start(ctx.nip);
+
LOG_DISAS("----------------\n");
LOG_DISAS("nip=" TARGET_FMT_lx " super=%d ir=%d\n",
ctx.nip, ctx.mem_idx, (int)msr_ir);
@@ -11508,9 +11510,6 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
LOG_DISAS("translate opcode %08x (%02x %02x %02x) (%s)\n",
ctx.opcode, opc1(ctx.opcode), opc2(ctx.opcode),
opc3(ctx.opcode), ctx.le_mode ? "little" : "big");
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(ctx.nip);
- }
ctx.nip += 4;
table = env->opcodes;
num_insns++;
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index a87d83c..2767f6a 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -5370,14 +5370,12 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
+ tcg_gen_insn_start(dc.pc);
+
if (++num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
}
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(dc.pc);
- }
-
status = NO_EXIT;
if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index c73ee10..645e9d4 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1814,10 +1814,6 @@ static void decode_opc(DisasContext * ctx)
{
uint32_t old_flags = ctx->flags;
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(ctx->pc);
- }
-
_decode_opc(ctx);
if (old_flags & (DELAY_SLOT | DELAY_SLOT_CONDITIONAL)) {
@@ -1900,12 +1896,12 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
tcg_ctx.gen_opc_instr_start[ii] = 1;
tcg_ctx.gen_opc_icount[ii] = num_insns;
}
- if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
+ tcg_gen_insn_start(ctx.pc);
+
+ if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
-#if 0
- fprintf(stderr, "Loading opcode at address 0x%08x\n", ctx.pc);
- fflush(stderr);
-#endif
+ }
+
ctx.opcode = cpu_lduw_code(env, ctx.pc);
decode_opc(&ctx);
num_insns++;
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index fdf62c3..cb7b5f5 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -2482,10 +2482,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
TCGv_i64 cpu_src1_64, cpu_src2_64, cpu_dst_64;
target_long simm;
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(dc->pc);
- }
-
opc = GET_FIELD(insn, 0, 1);
rd = GET_FIELD(insn, 2, 6);
@@ -5271,8 +5267,12 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
}
- if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
+ tcg_gen_insn_start(dc->pc);
+
+ if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
+ }
+
last_pc = dc->pc;
insn = cpu_ldl_code(env, dc->pc);
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index f02bef4..a5e4ddb 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -8292,6 +8292,8 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
tcg_clear_temp_count();
gen_tb_start(tb);
while (ctx.bstate == BS_NONE) {
+ tcg_gen_insn_start(ctx.pc);
+
ctx.opcode = cpu_ldl_code(env, ctx.pc);
decode_opc(env, &ctx, 0);
diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c
index 63a5192..28db34a 100644
--- a/target-unicore32/translate.c
+++ b/target-unicore32/translate.c
@@ -1794,10 +1794,6 @@ static void disas_uc32_insn(CPUUniCore32State *env, DisasContext *s)
UniCore32CPU *cpu = uc32_env_get_cpu(env);
unsigned int insn;
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(s->pc);
- }
-
insn = cpu_ldl_code(env, s->pc);
s->pc += 4;
@@ -1941,6 +1937,7 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
+ tcg_gen_insn_start(dc->pc);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index ea777da..ab9e8f9 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -3076,10 +3076,7 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = insn_count;
}
-
- if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
- tcg_gen_insn_start(dc.pc);
- }
+ tcg_gen_insn_start(dc.pc);
++dc.ccount_delta;
--
2.4.3
* [Qemu-devel] [PATCH 03/20] tcg: Allow extra data to be attached to insn_start
From: Richard Henderson @ 2015-09-02 5:51 UTC
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
Do this with an eye toward having this data replace the gen_opc_* arrays
that each target collects in order to enable restore_state_from_tb.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
tcg/tcg-op.h | 52 ++++++++++++++++++++++++++++++++++++++++++++--------
tcg/tcg-opc.h | 4 ++--
tcg/tcg.c | 14 ++++++++------
tcg/tcg.h | 6 ++++++
4 files changed, 60 insertions(+), 16 deletions(-)
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 6409db8..4e20dc1 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -700,17 +700,53 @@ static inline void tcg_gen_concat32_i64(TCGv_i64 ret, TCGv_i64 lo, TCGv_i64 hi)
#error must include QEMU headers
#endif
-/* debug info: write the PC of the corresponding QEMU CPU instruction */
-static inline void tcg_gen_insn_start(uint64_t pc)
+#if TARGET_INSN_START_WORDS == 1
+# if TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
+static inline void tcg_gen_insn_start(target_ulong pc)
{
- /* XXX: must really use a 32 bit size for TCGArg in all cases */
-#if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
- tcg_gen_op2ii(INDEX_op_insn_start,
- (uint32_t)(pc), (uint32_t)(pc >> 32));
+ tcg_gen_op1(&tcg_ctx, INDEX_op_insn_start, pc);
+}
+# else
+static inline void tcg_gen_insn_start(target_ulong pc)
+{
+ tcg_gen_op2(&tcg_ctx, INDEX_op_insn_start,
+ (uint32_t)pc, (uint32_t)(pc >> 32));
+}
+# endif
+#elif TARGET_INSN_START_WORDS == 2
+# if TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
+static inline void tcg_gen_insn_start(target_ulong pc, target_ulong a1)
+{
+ tcg_gen_op2(&tcg_ctx, INDEX_op_insn_start, pc, a1);
+}
+# else
+static inline void tcg_gen_insn_start(target_ulong pc, target_ulong a1)
+{
+ tcg_gen_op4(&tcg_ctx, INDEX_op_insn_start,
+ (uint32_t)pc, (uint32_t)(pc >> 32),
+ (uint32_t)a1, (uint32_t)(a1 >> 32));
+}
+# endif
+#elif TARGET_INSN_START_WORDS == 3
+# if TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
+static inline void tcg_gen_insn_start(target_ulong pc, target_ulong a1,
+ target_ulong a2)
+{
+ tcg_gen_op3(&tcg_ctx, INDEX_op_insn_start, pc, a1, a2);
+}
+# else
+static inline void tcg_gen_insn_start(target_ulong pc, target_ulong a1,
+ target_ulong a2)
+{
+ tcg_gen_op6(&tcg_ctx, INDEX_op_insn_start,
+ (uint32_t)pc, (uint32_t)(pc >> 32),
+ (uint32_t)a1, (uint32_t)(a1 >> 32),
+ (uint32_t)a2, (uint32_t)(a2 >> 32));
+}
+# endif
#else
- tcg_gen_op1i(INDEX_op_insn_start, pc);
+# error "Unhandled number of operands to insn_start"
#endif
-}
static inline void tcg_gen_exit_tb(uintptr_t val)
{
diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
index f60d3c2..c6f9570 100644
--- a/tcg/tcg-opc.h
+++ b/tcg/tcg-opc.h
@@ -175,9 +175,9 @@ DEF(mulsh_i64, 1, 2, 0, IMPL(TCG_TARGET_HAS_mulsh_i64))
/* QEMU specific */
#if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
-DEF(insn_start, 0, 0, 2, TCG_OPF_NOT_PRESENT)
+DEF(insn_start, 0, 0, 2 * TARGET_INSN_START_WORDS, TCG_OPF_NOT_PRESENT)
#else
-DEF(insn_start, 0, 0, 1, TCG_OPF_NOT_PRESENT)
+DEF(insn_start, 0, 0, TARGET_INSN_START_WORDS, TCG_OPF_NOT_PRESENT)
#endif
DEF(exit_tb, 0, 0, 1, TCG_OPF_BB_END)
DEF(goto_tb, 0, 0, 1, TCG_OPF_BB_END)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 4087f76..a44b834 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -997,16 +997,18 @@ void tcg_dump_ops(TCGContext *s)
args = &s->gen_opparam_buf[op->args];
if (c == INDEX_op_insn_start) {
- uint64_t pc;
+ qemu_log("%s ----", oi != s->gen_first_op_idx ? "\n" : "");
+
+ for (i = 0; i < TARGET_INSN_START_WORDS; ++i) {
+ target_ulong a;
#if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
- pc = ((uint64_t)args[1] << 32) | args[0];
+ a = ((target_ulong)args[i * 2 + 1] << 32) | args[i * 2];
#else
- pc = args[0];
+ a = args[i];
#endif
- if (oi != s->gen_first_op_idx) {
- qemu_log("\n");
+ qemu_log(" " TARGET_FMT_lx, a);
+
}
- qemu_log(" ---- 0x%" PRIx64, pc);
} else if (c == INDEX_op_call) {
/* variable number of arguments */
nb_oargs = op->callo;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index f437824..455c229 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -129,6 +129,12 @@ typedef uint64_t TCGRegSet;
# error "Missing unsigned widening multiply"
#endif
+#ifndef TARGET_INSN_START_EXTRA_WORDS
+# define TARGET_INSN_START_WORDS 1
+#else
+# define TARGET_INSN_START_WORDS (1 + TARGET_INSN_START_EXTRA_WORDS)
+#endif
+
typedef enum TCGOpcode {
#define DEF(name, oargs, iargs, cargs, flags) INDEX_op_ ## name,
#include "tcg-opc.h"
--
2.4.3
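
For readers following the operand layout: when TARGET_LONG_BITS is larger than TCG_TARGET_REG_BITS, each insn_start word is stored as a low/high pair of 32-bit arguments, and tcg_dump_ops reassembles it. The following is only a standalone sketch of that round trip, not code from the patch. Every sketch_* name and SKETCH_INSN_START_WORDS is a hypothetical stand-in, and a 64-bit guest on a 32-bit host is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for TARGET_INSN_START_WORDS: pc plus one extra word. */
#define SKETCH_INSN_START_WORDS 2

typedef uint64_t sketch_target_ulong;  /* assumed 64-bit guest word */
typedef uint32_t sketch_tcg_arg;       /* assumed 32-bit host argument */

/* Split each 64-bit word into a low/high pair of 32-bit arguments. */
static void sketch_pack(const sketch_target_ulong *words, sketch_tcg_arg *args)
{
    for (int i = 0; i < SKETCH_INSN_START_WORDS; ++i) {
        args[i * 2] = (uint32_t)words[i];
        args[i * 2 + 1] = (uint32_t)(words[i] >> 32);
    }
}

/* Reassemble word i with the same expression the dump loop uses. */
static sketch_target_ulong sketch_unpack(const sketch_tcg_arg *args, int i)
{
    assert(i >= 0 && i < SKETCH_INSN_START_WORDS);
    return ((sketch_target_ulong)args[i * 2 + 1] << 32) | args[i * 2];
}
```

With this layout the argument count is 2 * TARGET_INSN_START_WORDS, which matches the DEF(insn_start, ...) change in tcg-opc.h.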
* [Qemu-devel] [PATCH 04/20] target-arm: Add condexec state to insn_start
From: Richard Henderson @ 2015-09-02 5:51 UTC
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target-arm/cpu.h | 1 +
target-arm/translate-a64.c | 2 +-
target-arm/translate.c | 3 ++-
3 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 31825d3..8d5ae3e 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -96,6 +96,7 @@
struct arm_boot_info;
#define NB_MMU_MODES 7
+#define TARGET_INSN_START_EXTRA_WORDS 1
/* We currently assume float and double are IEEE single and double
precision respectively.
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 48c34d1..4fb4a9f 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -11021,7 +11021,7 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
- tcg_gen_insn_start(dc->pc);
+ tcg_gen_insn_start(dc->pc, 0);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 8fc7edd..c9de455 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -11312,7 +11312,8 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
- tcg_gen_insn_start(dc->pc);
+ tcg_gen_insn_start(dc->pc,
+ (dc->condexec_cond << 4) | (dc->condexec_mask >> 1));
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
gen_io_start();
--
2.4.3
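
A note on the extra word used here: the expression (condexec_cond << 4) | (condexec_mask >> 1) folds the two translator fields into a single value. The sketch below shows that packing and one possible way to undo it. The unpack helper and all sketch_* names are hypothetical and not part of this patch, and the round trip is lossless only if bit 0 of the mask is clear, which the sketch assumes.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the condition and mask the same way the patch does. */
static uint32_t sketch_pack_condexec(uint32_t cond, uint32_t mask)
{
    /* Assumed ranges: 4-bit condition, 5-bit mask with bit 0 clear. */
    assert(cond <= 0xf && mask <= 0x1f && (mask & 1) == 0);
    return (cond << 4) | (mask >> 1);
}

/* Hypothetical inverse: recover both fields from the packed word. */
static void sketch_unpack_condexec(uint32_t bits, uint32_t *cond,
                                   uint32_t *mask)
{
    *cond = bits >> 4;
    *mask = (bits & 0xf) << 1;
}
```

The AArch64 translator passes 0 for this word, which unpacks to a zero condition and a zero mask in this sketch.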
* [Qemu-devel] [PATCH 05/20] target-i386: Add cc_op state to insn_start
From: Richard Henderson @ 2015-09-02 5:51 UTC
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target-i386/cpu.h | 1 +
target-i386/translate.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 74b674d..f0e381c 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -793,6 +793,7 @@ typedef struct {
#define MAX_GP_COUNTERS (MSR_IA32_PERF_STATUS - MSR_P6_EVNTSEL0)
#define NB_MMU_MODES 3
+#define TARGET_INSN_START_EXTRA_WORDS 1
#define NB_OPMASK_REGS 8
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 4c84c9a..4497c13 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -8019,7 +8019,7 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
- tcg_gen_insn_start(pc_ptr);
+ tcg_gen_insn_start(pc_ptr, dc->cc_op);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
gen_io_start();
--
2.4.3
* [Qemu-devel] [PATCH 06/20] target-mips: Add delayed branch state to insn_start
From: Richard Henderson @ 2015-09-02 5:51 UTC
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target-mips/cpu.h | 1 +
target-mips/translate.c | 3 ++-
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/target-mips/cpu.h b/target-mips/cpu.h
index c91883d..0a53568 100644
--- a/target-mips/cpu.h
+++ b/target-mips/cpu.h
@@ -132,6 +132,7 @@ struct CPUMIPSFPUContext {
};
#define NB_MMU_MODES 3
+#define TARGET_INSN_START_EXTRA_WORDS 2
typedef struct CPUMIPSMVPContext CPUMIPSMVPContext;
struct CPUMIPSMVPContext {
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 9226420..320adef 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -20175,6 +20175,7 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
ctx.CP0_Config1 = env->CP0_Config1;
ctx.tb = tb;
ctx.bstate = BS_NONE;
+ ctx.btarget = 0;
ctx.kscrexist = (env->CP0_Config4 >> CP0C4_KScrExist) & 0xff;
ctx.rxi = (env->CP0_Config3 >> CP0C3_RXI) & 1;
ctx.ie = (env->CP0_Config4 >> CP0C4_IE) & 3;
@@ -20230,7 +20231,7 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
- tcg_gen_insn_start(ctx.pc);
+ tcg_gen_insn_start(ctx.pc, ctx.hflags & MIPS_HFLAG_BMASK, ctx.btarget);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
--
2.4.3
* [Qemu-devel] [PATCH 07/20] target-s390x: Add cc_op state to insn_start
From: Richard Henderson @ 2015-09-02 5:51 UTC
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target-s390x/cpu.h | 1 +
target-s390x/translate.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
index 63aebf4..6515351 100644
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -43,6 +43,7 @@
#include "fpu/softfloat.h"
#define NB_MMU_MODES 3
+#define TARGET_INSN_START_EXTRA_WORDS 1
#define MMU_MODE0_SUFFIX _primary
#define MMU_MODE1_SUFFIX _secondary
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index 2767f6a..d62e4a3 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -5370,7 +5370,7 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
- tcg_gen_insn_start(dc.pc);
+ tcg_gen_insn_start(dc.pc, dc.cc_op);
if (++num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
--
2.4.3
* [Qemu-devel] [PATCH 08/20] target-sh4: Add flags state to insn_start
From: Richard Henderson @ 2015-09-02 5:51 UTC
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target-sh4/cpu.h | 1 +
target-sh4/translate.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/target-sh4/cpu.h b/target-sh4/cpu.h
index 34bb3d7..4fc7b1d 100644
--- a/target-sh4/cpu.h
+++ b/target-sh4/cpu.h
@@ -122,6 +122,7 @@ typedef struct tlb_t {
#define ITLB_SIZE 4
#define NB_MMU_MODES 2
+#define TARGET_INSN_START_EXTRA_WORDS 1
enum sh_features {
SH_FEATURE_SH4A = 1,
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index 645e9d4..740bf66 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1896,7 +1896,7 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
tcg_ctx.gen_opc_instr_start[ii] = 1;
tcg_ctx.gen_opc_icount[ii] = num_insns;
}
- tcg_gen_insn_start(ctx.pc);
+ tcg_gen_insn_start(ctx.pc, ctx.flags);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
--
2.4.3
* [Qemu-devel] [PATCH 09/20] target-cris: Mirror gen_opc_pc into insn_start
From: Richard Henderson @ 2015-09-02 5:51 UTC
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
This perhaps isn't ideal in terms of (ab)using the "pc" field
to encode both pc and ppc + delay branch state, as one has to
be aware of this when examining opcode dumps.
But it preserves existing logic, which will be good for bisection,
and it certainly does save storage space.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target-cris/translate.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 3279bad..3c8fac4 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3206,7 +3206,8 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
tcg_ctx.gen_opc_instr_start[lj] = 1;
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
- tcg_gen_insn_start(dc->pc);
+ tcg_gen_insn_start(dc->delayed_branch == 1
+ ? dc->ppc | 1 : dc->pc);
/* Pretty disas. */
LOG_DIS("%8.8x:\t", dc->pc);
--
2.4.3
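
To make the dual use of the "pc" field concrete when reading opcode dumps: the value is either the plain pc, or ppc with bit 0 set when the insn sits in a delay slot. The sketch below shows the encoding from the patch plus a possible decoder. The decoder and all sketch_* names are hypothetical, and the sketch assumes instruction addresses are even so that bit 0 is free to act as the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Encode as in the patch: ppc | 1 inside a delay slot, plain pc otherwise. */
static uint32_t sketch_encode_pc(uint32_t pc, uint32_t ppc, int delayed_branch)
{
    /* Assumption: addresses are 2-byte aligned, so bit 0 is unused. */
    assert((pc & 1) == 0 && (ppc & 1) == 0);
    return delayed_branch == 1 ? (ppc | 1) : pc;
}

/* Hypothetical decoder: bit 0 says which of the two cases was recorded. */
static bool sketch_is_delay_slot(uint32_t word)
{
    return (word & 1) != 0;
}

/* Hypothetical decoder: strip the flag bit to recover the address. */
static uint32_t sketch_decode_addr(uint32_t word)
{
    return word & ~(uint32_t)1;
}
```

An odd value in a dump would therefore be read as "ppc, in a delay slot" rather than as a misaligned pc.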
* [Qemu-devel] [PATCH 10/20] target-sparc: Tidy gen_branch_a interface
From: Richard Henderson @ 2015-09-02 5:51 UTC
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
We always pass pc2 == dc->npc and r_cond == cpu_cond,
and always set is_br afterward. Infer all of that.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target-sparc/translate.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index cb7b5f5..f536ae3 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -955,17 +955,19 @@ static inline void gen_branch2(DisasContext *dc, target_ulong pc1,
gen_goto_tb(dc, 1, pc2, pc2 + 4);
}
-static inline void gen_branch_a(DisasContext *dc, target_ulong pc1,
- target_ulong pc2, TCGv r_cond)
+static void gen_branch_a(DisasContext *dc, target_ulong pc1)
{
TCGLabel *l1 = gen_new_label();
+ target_ulong npc = dc->npc;
- tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
+ tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_cond, 0, l1);
- gen_goto_tb(dc, 0, pc2, pc1);
+ gen_goto_tb(dc, 0, npc, pc1);
gen_set_label(l1);
- gen_goto_tb(dc, 1, pc2 + 4, pc2 + 8);
+ gen_goto_tb(dc, 1, npc + 4, npc + 8);
+
+ dc->is_br = 1;
}
static inline void gen_generic_branch(DisasContext *dc)
@@ -1398,8 +1400,7 @@ static void do_branch(DisasContext *dc, int32_t offset, uint32_t insn, int cc)
flush_cond(dc);
gen_cond(cpu_cond, cc, cond, dc);
if (a) {
- gen_branch_a(dc, target, dc->npc, cpu_cond);
- dc->is_br = 1;
+ gen_branch_a(dc, target);
} else {
dc->pc = dc->npc;
dc->jump_pc[0] = target;
@@ -1447,8 +1448,7 @@ static void do_fbranch(DisasContext *dc, int32_t offset, uint32_t insn, int cc)
flush_cond(dc);
gen_fcond(cpu_cond, cc, cond);
if (a) {
- gen_branch_a(dc, target, dc->npc, cpu_cond);
- dc->is_br = 1;
+ gen_branch_a(dc, target);
} else {
dc->pc = dc->npc;
dc->jump_pc[0] = target;
@@ -1476,8 +1476,7 @@ static void do_branch_reg(DisasContext *dc, int32_t offset, uint32_t insn,
flush_cond(dc);
gen_cond_reg(cpu_cond, cond, r_reg);
if (a) {
- gen_branch_a(dc, target, dc->npc, cpu_cond);
- dc->is_br = 1;
+ gen_branch_a(dc, target);
} else {
dc->pc = dc->npc;
dc->jump_pc[0] = target;
--
2.4.3
* [Qemu-devel] [PATCH 11/20] target-sparc: Split out gen_branch_n
From: Richard Henderson @ 2015-09-02 5:51 UTC (permalink / raw)
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
Unify three copies of this code from different
branch types. Fix the case when npc == DYNAMIC_PC,
i.e. a branch within a delay slot.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
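A rough sketch, not part of the patch: a plain-C model of what the
DYNAMIC_PC path of gen_branch_n computes at run time, read off the three
TCG ops it emits. The names model_cpu and model_branch_n are hypothetical
stand-ins for the TCG globals cpu_pc, cpu_npc and cpu_cond, and 32-bit
addresses are assumed for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the TCG globals cpu_pc, cpu_npc, cpu_cond. */
typedef struct {
    uint32_t pc;
    uint32_t npc;
    uint32_t cond;   /* non-zero when the branch is taken */
} model_cpu;

/*
 * Model of the run-time effect of the DYNAMIC_PC path:
 *   pc  <- npc                        (tcg_gen_mov_tl)
 *   npc <- npc + 4                    (tcg_gen_addi_tl)
 *   npc <- cond != 0 ? target : npc   (tcg_gen_movcond_tl, TCG_COND_NE)
 */
static void model_branch_n(model_cpu *cpu, uint32_t target)
{
    cpu->pc = cpu->npc;
    cpu->npc = cpu->npc + 4;
    if (cpu->cond != 0) {
        cpu->npc = target;
    }
}
```

In this model the delay-slot instruction at the old npc always runs next;
only the instruction after it depends on the condition.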
target-sparc/translate.c | 55 ++++++++++++++++++++++++------------------------
1 file changed, 28 insertions(+), 27 deletions(-)
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index f536ae3..8aa19e1 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -970,6 +970,31 @@ static void gen_branch_a(DisasContext *dc, target_ulong pc1)
dc->is_br = 1;
}
+static void gen_branch_n(DisasContext *dc, target_ulong pc1)
+{
+ target_ulong npc = dc->npc;
+
+ if (likely(npc != DYNAMIC_PC)) {
+ dc->pc = npc;
+ dc->jump_pc[0] = pc1;
+ dc->jump_pc[1] = npc + 4;
+ dc->npc = JUMP_PC;
+ } else {
+ TCGv t, z;
+
+ tcg_gen_mov_tl(cpu_pc, cpu_npc);
+
+ tcg_gen_addi_tl(cpu_npc, cpu_npc, 4);
+ t = tcg_const_tl(pc1);
+ z = tcg_const_tl(0);
+ tcg_gen_movcond_tl(TCG_COND_NE, cpu_npc, cpu_cond, z, t, cpu_npc);
+ tcg_temp_free(t);
+ tcg_temp_free(z);
+
+ dc->pc = DYNAMIC_PC;
+ }
+}
+
static inline void gen_generic_branch(DisasContext *dc)
{
TCGv npc0 = tcg_const_tl(dc->jump_pc[0]);
@@ -1402,15 +1427,7 @@ static void do_branch(DisasContext *dc, int32_t offset, uint32_t insn, int cc)
if (a) {
gen_branch_a(dc, target);
} else {
- dc->pc = dc->npc;
- dc->jump_pc[0] = target;
- if (unlikely(dc->npc == DYNAMIC_PC)) {
- dc->jump_pc[1] = DYNAMIC_PC;
- tcg_gen_addi_tl(cpu_pc, cpu_npc, 4);
- } else {
- dc->jump_pc[1] = dc->npc + 4;
- dc->npc = JUMP_PC;
- }
+ gen_branch_n(dc, target);
}
}
}
@@ -1450,15 +1467,7 @@ static void do_fbranch(DisasContext *dc, int32_t offset, uint32_t insn, int cc)
if (a) {
gen_branch_a(dc, target);
} else {
- dc->pc = dc->npc;
- dc->jump_pc[0] = target;
- if (unlikely(dc->npc == DYNAMIC_PC)) {
- dc->jump_pc[1] = DYNAMIC_PC;
- tcg_gen_addi_tl(cpu_pc, cpu_npc, 4);
- } else {
- dc->jump_pc[1] = dc->npc + 4;
- dc->npc = JUMP_PC;
- }
+ gen_branch_n(dc, target);
}
}
}
@@ -1478,15 +1487,7 @@ static void do_branch_reg(DisasContext *dc, int32_t offset, uint32_t insn,
if (a) {
gen_branch_a(dc, target);
} else {
- dc->pc = dc->npc;
- dc->jump_pc[0] = target;
- if (unlikely(dc->npc == DYNAMIC_PC)) {
- dc->jump_pc[1] = DYNAMIC_PC;
- tcg_gen_addi_tl(cpu_pc, cpu_npc, 4);
- } else {
- dc->jump_pc[1] = dc->npc + 4;
- dc->npc = JUMP_PC;
- }
+ gen_branch_n(dc, target);
}
}
--
2.4.3
* [Qemu-devel] [PATCH 12/20] target-sparc: Remove gen_opc_jump_pc
From: Richard Henderson @ 2015-09-02 5:51 UTC (permalink / raw)
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
Since jump_pc[1] is always npc + 4, and since we
only continue translation when pc + 4 == npc, we
can infer that jump_pc[1] == pc + 8.
Because of that, we can encode the branch destination
into a single word, and store that in npc.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
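A minimal sketch, not part of the patch, of the single-word encoding the
commit message describes: the branch target is 4-byte aligned, so its low
two bits are free to carry the JUMP_PC tag. The DEMO_* constants and demo_*
functions are hypothetical names; the resolve step mirrors the logic in
restore_state_to_opc, with 32-bit addresses assumed for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_DYNAMIC_PC 1u  /* mirrors DYNAMIC_PC in translate.c */
#define DEMO_JUMP_PC    2u  /* mirrors JUMP_PC in translate.c */

/* Pack a 4-byte-aligned branch target together with the JUMP_PC tag. */
static uint32_t demo_encode_jump_npc(uint32_t target)
{
    assert((target & 3u) == 0);
    return target | DEMO_JUMP_PC;
}

/*
 * Resolve the recorded npc for the instruction at pc.
 * live_npc models the value already held in env->npc.
 */
static uint32_t demo_resolve_npc(uint32_t pc, uint32_t npc,
                                 uint32_t cond, uint32_t live_npc)
{
    if (npc == DEMO_DYNAMIC_PC) {
        return live_npc;                    /* already stored */
    }
    if (npc & DEMO_JUMP_PC) {
        /* taken: strip the tag; not taken: fall through to pc + 8 */
        return cond ? (npc & ~3u) : pc + 8;
    }
    return npc;                             /* plain static npc */
}
```

The DYNAMIC_PC comparison has to come first: a plain aligned address never
has bit 1 set, so the JUMP_PC bit test is safe only once the special value
1 has been ruled out.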
target-sparc/translate.c | 50 +++++++++++++++++++++++-------------------------
1 file changed, 24 insertions(+), 26 deletions(-)
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 8aa19e1..b1e533f 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -39,7 +39,7 @@
#define DYNAMIC_PC 1 /* dynamic pc value */
#define JUMP_PC 2 /* dynamic pc value which takes only two values
- according to jump_pc[T2] */
+ according to npc & ~3 and pc + 4. */
/* global register indexes */
static TCGv_ptr cpu_env, cpu_regwptr;
@@ -65,14 +65,12 @@ static TCGv cpu_wim;
static TCGv_i64 cpu_fpr[TARGET_DPREGS];
static target_ulong gen_opc_npc[OPC_BUF_SIZE];
-static target_ulong gen_opc_jump_pc[2];
#include "exec/gen-icount.h"
typedef struct DisasContext {
target_ulong pc; /* current Program Counter: integer or DYNAMIC_PC */
target_ulong npc; /* next PC: integer or DYNAMIC_PC or JUMP_PC */
- target_ulong jump_pc[2]; /* used when JUMP_PC pc value is used */
int is_br;
int mem_idx;
int fpu_enabled;
@@ -976,9 +974,7 @@ static void gen_branch_n(DisasContext *dc, target_ulong pc1)
if (likely(npc != DYNAMIC_PC)) {
dc->pc = npc;
- dc->jump_pc[0] = pc1;
- dc->jump_pc[1] = npc + 4;
- dc->npc = JUMP_PC;
+ dc->npc = pc1 | JUMP_PC;
} else {
TCGv t, z;
@@ -997,12 +993,15 @@ static void gen_branch_n(DisasContext *dc, target_ulong pc1)
static inline void gen_generic_branch(DisasContext *dc)
{
- TCGv npc0 = tcg_const_tl(dc->jump_pc[0]);
- TCGv npc1 = tcg_const_tl(dc->jump_pc[1]);
- TCGv zero = tcg_const_tl(0);
+ TCGv npc0, npc1, zero;
- tcg_gen_movcond_tl(TCG_COND_NE, cpu_npc, cpu_cond, zero, npc0, npc1);
+ assert((dc->npc & 3) == JUMP_PC);
+ assert((dc->pc & 3) == 0);
+ npc0 = tcg_const_tl(dc->npc & ~3);
+ npc1 = tcg_const_tl(dc->pc + 4);
+ zero = tcg_const_tl(0);
+ tcg_gen_movcond_tl(TCG_COND_NE, cpu_npc, cpu_cond, zero, npc0, npc1);
tcg_temp_free(npc0);
tcg_temp_free(npc1);
tcg_temp_free(zero);
@@ -1012,7 +1011,7 @@ static inline void gen_generic_branch(DisasContext *dc)
have been set for a jump */
static inline void flush_cond(DisasContext *dc)
{
- if (dc->npc == JUMP_PC) {
+ if (dc->npc & JUMP_PC) {
gen_generic_branch(dc);
dc->npc = DYNAMIC_PC;
}
@@ -1020,7 +1019,7 @@ static inline void flush_cond(DisasContext *dc)
static inline void save_npc(DisasContext *dc)
{
- if (dc->npc == JUMP_PC) {
+ if (dc->npc & JUMP_PC) {
gen_generic_branch(dc);
dc->npc = DYNAMIC_PC;
} else if (dc->npc != DYNAMIC_PC) {
@@ -1044,7 +1043,7 @@ static inline void save_state(DisasContext *dc)
static inline void gen_mov_pc_npc(DisasContext *dc)
{
- if (dc->npc == JUMP_PC) {
+ if (dc->npc & JUMP_PC) {
gen_generic_branch(dc);
tcg_gen_mov_tl(cpu_pc, cpu_npc);
dc->pc = DYNAMIC_PC;
@@ -5122,9 +5121,9 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
if (dc->npc == DYNAMIC_PC) {
dc->pc = DYNAMIC_PC;
gen_op_next_insn();
- } else if (dc->npc == JUMP_PC) {
- /* we can do a static jump */
- gen_branch2(dc, dc->jump_pc[0], dc->jump_pc[1], cpu_cond);
+ } else if (dc->npc & JUMP_PC) {
+ assert((dc->pc & 3) == 0);
+ gen_branch2(dc, dc->npc & ~3, dc->pc + 4, cpu_cond);
dc->is_br = 1;
} else {
dc->pc = dc->npc;
@@ -5324,8 +5323,6 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
#if 0
log_page_dump();
#endif
- gen_opc_jump_pc[0] = dc->jump_pc[0];
- gen_opc_jump_pc[1] = dc->jump_pc[1];
} else {
tb->size = last_pc + 4 - pc_start;
tb->icount = num_insns;
@@ -5453,17 +5450,18 @@ void gen_intermediate_code_init(CPUSPARCState *env)
void restore_state_to_opc(CPUSPARCState *env, TranslationBlock *tb, int pc_pos)
{
- target_ulong npc;
- env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+ target_ulong pc, npc;
+
+ env->pc = pc = tcg_ctx.gen_opc_pc[pc_pos];
npc = gen_opc_npc[pc_pos];
- if (npc == 1) {
- /* dynamic NPC: already stored */
- } else if (npc == 2) {
- /* jump PC: use 'cond' and the jump targets of the translation */
+ if (npc == DYNAMIC_PC) {
+ /* already stored */
+ } else if (npc & JUMP_PC) {
+ /* use 'cond' and the jump targets of the translation */
if (env->cond) {
- env->npc = gen_opc_jump_pc[0];
+ env->npc = npc & ~3;
} else {
- env->npc = gen_opc_jump_pc[1];
+ env->npc = pc + 8;
}
} else {
env->npc = npc;
--
2.4.3
* [Qemu-devel] [PATCH 13/20] target-sparc: Add npc state to insn_start
From: Richard Henderson @ 2015-09-02 5:51 UTC (permalink / raw)
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target-sparc/cpu.h | 1 +
target-sparc/translate.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h
index 0522b65..40b6625 100644
--- a/target-sparc/cpu.h
+++ b/target-sparc/cpu.h
@@ -236,6 +236,7 @@ typedef struct trap_state {
uint32_t tt;
} trap_state;
#endif
+#define TARGET_INSN_START_EXTRA_WORDS 1
typedef struct sparc_def_t {
const char *name;
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index b1e533f..8f7bfb5 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -5266,7 +5266,7 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
tcg_ctx.gen_opc_icount[lj] = num_insns;
}
}
- tcg_gen_insn_start(dc->pc);
+ tcg_gen_insn_start(dc->pc, dc->npc);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
gen_io_start();
--
2.4.3
* [Qemu-devel] [PATCH 14/20] tcg: Merge cpu_gen_code into tb_gen_code
From: Richard Henderson @ 2015-09-02 5:51 UTC (permalink / raw)
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
Since tb_gen_code is its only caller, this tidies things a bit.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
include/exec/exec-all.h | 2 -
translate-all.c | 126 ++++++++++++++++++++++--------------------------
2 files changed, 58 insertions(+), 70 deletions(-)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 83b9251..070291e 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -78,8 +78,6 @@ void restore_state_to_opc(CPUArchState *env, struct TranslationBlock *tb,
int pc_pos);
void cpu_gen_init(void);
-int cpu_gen_code(CPUArchState *env, struct TranslationBlock *tb,
- int *gen_code_size_ptr);
bool cpu_restore_state(CPUState *cpu, uintptr_t searched_pc);
void page_size_init(void);
diff --git a/translate-all.c b/translate-all.c
index 2a40530..a5f7e78 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -138,69 +138,6 @@ void cpu_gen_init(void)
tcg_context_init(&tcg_ctx);
}
-/* return non zero if the very first instruction is invalid so that
- the virtual CPU can trigger an exception.
-
- '*gen_code_size_ptr' contains the size of the generated code (host
- code).
-*/
-int cpu_gen_code(CPUArchState *env, TranslationBlock *tb, int *gen_code_size_ptr)
-{
- TCGContext *s = &tcg_ctx;
- tcg_insn_unit *gen_code_buf;
- int gen_code_size;
-#ifdef CONFIG_PROFILER
- int64_t ti;
-#endif
-
-#ifdef CONFIG_PROFILER
- s->tb_count1++; /* includes aborted translations because of
- exceptions */
- ti = profile_getclock();
-#endif
- tcg_func_start(s);
-
- gen_intermediate_code(env, tb);
-
- trace_translate_block(tb, tb->pc, tb->tc_ptr);
-
- /* generate machine code */
- gen_code_buf = tb->tc_ptr;
- tb->tb_next_offset[0] = 0xffff;
- tb->tb_next_offset[1] = 0xffff;
- s->tb_next_offset = tb->tb_next_offset;
-#ifdef USE_DIRECT_JUMP
- s->tb_jmp_offset = tb->tb_jmp_offset;
- s->tb_next = NULL;
-#else
- s->tb_jmp_offset = NULL;
- s->tb_next = tb->tb_next;
-#endif
-
-#ifdef CONFIG_PROFILER
- s->tb_count++;
- s->interm_time += profile_getclock() - ti;
- s->code_time -= profile_getclock();
-#endif
- gen_code_size = tcg_gen_code(s, gen_code_buf);
- *gen_code_size_ptr = gen_code_size;
-#ifdef CONFIG_PROFILER
- s->code_time += profile_getclock();
- s->code_in_len += tb->size;
- s->code_out_len += gen_code_size;
-#endif
-
-#ifdef DEBUG_DISAS
- if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
- qemu_log("OUT: [size=%d]\n", gen_code_size);
- log_disas(tb->tc_ptr, gen_code_size);
- qemu_log("\n");
- qemu_log_flush();
- }
-#endif
- return 0;
-}
-
/* The cpu state corresponding to 'searched_pc' is restored.
*/
static int cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
@@ -1004,7 +941,11 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
TranslationBlock *tb;
tb_page_addr_t phys_pc, phys_page2;
target_ulong virt_page2;
- int code_gen_size;
+ tcg_insn_unit *gen_code_buf;
+ int gen_code_size;
+#ifdef CONFIG_PROFILER
+ int64_t ti;
+#endif
phys_pc = get_page_addr_code(env, pc);
if (use_icount) {
@@ -1019,13 +960,62 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
/* Don't forget to invalidate previous TB info. */
tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
}
- tb->tc_ptr = tcg_ctx.code_gen_ptr;
+
+ gen_code_buf = tcg_ctx.code_gen_ptr;
+ tb->tc_ptr = gen_code_buf;
tb->cs_base = cs_base;
tb->flags = flags;
tb->cflags = cflags;
- cpu_gen_code(env, tb, &code_gen_size);
- tcg_ctx.code_gen_ptr = (void *)(((uintptr_t)tcg_ctx.code_gen_ptr +
- code_gen_size + CODE_GEN_ALIGN - 1) & ~(CODE_GEN_ALIGN - 1));
+
+#ifdef CONFIG_PROFILER
+ tcg_ctx.tb_count1++; /* includes aborted translations because of
+ exceptions */
+ ti = profile_getclock();
+#endif
+
+ tcg_func_start(&tcg_ctx);
+
+ gen_intermediate_code(env, tb);
+
+ trace_translate_block(tb, tb->pc, tb->tc_ptr);
+
+ /* generate machine code */
+ tb->tb_next_offset[0] = 0xffff;
+ tb->tb_next_offset[1] = 0xffff;
+ tcg_ctx.tb_next_offset = tb->tb_next_offset;
+#ifdef USE_DIRECT_JUMP
+ tcg_ctx.tb_jmp_offset = tb->tb_jmp_offset;
+ tcg_ctx.tb_next = NULL;
+#else
+ tcg_ctx.tb_jmp_offset = NULL;
+ tcg_ctx.tb_next = tb->tb_next;
+#endif
+
+#ifdef CONFIG_PROFILER
+ tcg_ctx.tb_count++;
+ tcg_ctx.interm_time += profile_getclock() - ti;
+ tcg_ctx.code_time -= profile_getclock();
+#endif
+
+ gen_code_size = tcg_gen_code(&tcg_ctx, gen_code_buf);
+
+#ifdef CONFIG_PROFILER
+ tcg_ctx.code_time += profile_getclock();
+ tcg_ctx.code_in_len += tb->size;
+ tcg_ctx.code_out_len += gen_code_size;
+#endif
+
+#ifdef DEBUG_DISAS
+ if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
+ qemu_log("OUT: [size=%d]\n", gen_code_size);
+ log_disas(tb->tc_ptr, gen_code_size);
+ qemu_log("\n");
+ qemu_log_flush();
+ }
+#endif
+
+ tcg_ctx.code_gen_ptr = (void *)(((uintptr_t)gen_code_buf +
+ gen_code_size + CODE_GEN_ALIGN - 1) & ~(CODE_GEN_ALIGN - 1));
/* check next page if needed */
virt_page2 = (pc + tb->size - 1) & TARGET_PAGE_MASK;
--
2.4.3
* [Qemu-devel] [PATCH 15/20] target-*: Drop cpu_gen_code define
From: Richard Henderson @ 2015-09-02 5:51 UTC (permalink / raw)
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
This symbol no longer exists.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
target-alpha/cpu.h | 1 -
target-arm/cpu.h | 1 -
target-cris/cpu.h | 1 -
target-i386/cpu.h | 1 -
target-lm32/cpu.h | 1 -
target-m68k/cpu.h | 1 -
target-microblaze/cpu.h | 1 -
target-mips/cpu.h | 1 -
target-moxie/cpu.h | 1 -
target-openrisc/cpu.h | 1 -
target-ppc/cpu.h | 1 -
target-s390x/cpu.h | 1 -
target-sh4/cpu.h | 1 -
target-sparc/cpu.h | 1 -
target-xtensa/cpu.h | 1 -
15 files changed, 15 deletions(-)
diff --git a/target-alpha/cpu.h b/target-alpha/cpu.h
index 3f1ece3..f1e8fa7 100644
--- a/target-alpha/cpu.h
+++ b/target-alpha/cpu.h
@@ -289,7 +289,6 @@ struct CPUAlphaState {
#define cpu_list alpha_cpu_list
#define cpu_exec cpu_alpha_exec
-#define cpu_gen_code cpu_alpha_gen_code
#define cpu_signal_handler cpu_alpha_signal_handler
#include "exec/cpu-all.h"
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 8d5ae3e..953540d 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -1598,7 +1598,6 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
#define cpu_init(cpu_model) CPU(cpu_arm_init(cpu_model))
#define cpu_exec cpu_arm_exec
-#define cpu_gen_code cpu_arm_gen_code
#define cpu_signal_handler cpu_arm_signal_handler
#define cpu_list arm_cpu_list
diff --git a/target-cris/cpu.h b/target-cris/cpu.h
index d422e35..64ea33d 100644
--- a/target-cris/cpu.h
+++ b/target-cris/cpu.h
@@ -224,7 +224,6 @@ enum {
#define cpu_init(cpu_model) CPU(cpu_cris_init(cpu_model))
#define cpu_exec cpu_cris_exec
-#define cpu_gen_code cpu_cris_gen_code
#define cpu_signal_handler cpu_cris_signal_handler
#define CPU_SAVE_VERSION 1
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index f0e381c..47d56bc 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -1188,7 +1188,6 @@ uint64_t cpu_get_tsc(CPUX86State *env);
#define cpu_init(cpu_model) CPU(cpu_x86_init(cpu_model))
#define cpu_exec cpu_x86_exec
-#define cpu_gen_code cpu_x86_gen_code
#define cpu_signal_handler cpu_x86_signal_handler
#define cpu_list x86_cpu_list
#define cpudef_setup x86_cpudef_setup
diff --git a/target-lm32/cpu.h b/target-lm32/cpu.h
index 944777d..70b6597 100644
--- a/target-lm32/cpu.h
+++ b/target-lm32/cpu.h
@@ -221,7 +221,6 @@ bool lm32_cpu_do_semihosting(CPUState *cs);
#define cpu_list lm32_cpu_list
#define cpu_exec cpu_lm32_exec
-#define cpu_gen_code cpu_lm32_gen_code
#define cpu_signal_handler cpu_lm32_signal_handler
int lm32_cpu_handle_mmu_fault(CPUState *cpu, vaddr address, int rw,
diff --git a/target-m68k/cpu.h b/target-m68k/cpu.h
index 9a62f6c..b57f03c 100644
--- a/target-m68k/cpu.h
+++ b/target-m68k/cpu.h
@@ -215,7 +215,6 @@ void register_m68k_insns (CPUM68KState *env);
#define cpu_init(cpu_model) CPU(cpu_m68k_init(cpu_model))
#define cpu_exec cpu_m68k_exec
-#define cpu_gen_code cpu_m68k_gen_code
#define cpu_signal_handler cpu_m68k_signal_handler
#define cpu_list m68k_cpu_list
diff --git a/target-microblaze/cpu.h b/target-microblaze/cpu.h
index 7e20e59..a67bb9f 100644
--- a/target-microblaze/cpu.h
+++ b/target-microblaze/cpu.h
@@ -297,7 +297,6 @@ int cpu_mb_signal_handler(int host_signum, void *pinfo,
#define cpu_init(cpu_model) CPU(cpu_mb_init(cpu_model))
#define cpu_exec cpu_mb_exec
-#define cpu_gen_code cpu_mb_gen_code
#define cpu_signal_handler cpu_mb_signal_handler
/* MMU modes definitions */
diff --git a/target-mips/cpu.h b/target-mips/cpu.h
index 0a53568..a93a98f 100644
--- a/target-mips/cpu.h
+++ b/target-mips/cpu.h
@@ -622,7 +622,6 @@ void mips_cpu_unassigned_access(CPUState *cpu, hwaddr addr,
void mips_cpu_list (FILE *f, fprintf_function cpu_fprintf);
#define cpu_exec cpu_mips_exec
-#define cpu_gen_code cpu_mips_gen_code
#define cpu_signal_handler cpu_mips_signal_handler
#define cpu_list mips_cpu_list
diff --git a/target-moxie/cpu.h b/target-moxie/cpu.h
index 29572aa..ad1ba36 100644
--- a/target-moxie/cpu.h
+++ b/target-moxie/cpu.h
@@ -124,7 +124,6 @@ int cpu_moxie_signal_handler(int host_signum, void *pinfo,
#define cpu_init(cpu_model) CPU(cpu_moxie_init(cpu_model))
#define cpu_exec cpu_moxie_exec
-#define cpu_gen_code cpu_moxie_gen_code
#define cpu_signal_handler cpu_moxie_signal_handler
static inline int cpu_mmu_index(CPUMoxieState *env)
diff --git a/target-openrisc/cpu.h b/target-openrisc/cpu.h
index 36c4f20..fa438b2 100644
--- a/target-openrisc/cpu.h
+++ b/target-openrisc/cpu.h
@@ -361,7 +361,6 @@ int cpu_openrisc_signal_handler(int host_signum, void *pinfo, void *puc);
#define cpu_list cpu_openrisc_list
#define cpu_exec cpu_openrisc_exec
-#define cpu_gen_code cpu_openrisc_gen_code
#define cpu_signal_handler cpu_openrisc_signal_handler
#ifndef CONFIG_USER_ONLY
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 6f76674..472172b 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1241,7 +1241,6 @@ int ppc_dcr_write (ppc_dcr_t *dcr_env, int dcrn, uint32_t val);
#define cpu_init(cpu_model) CPU(cpu_ppc_init(cpu_model))
#define cpu_exec cpu_ppc_exec
-#define cpu_gen_code cpu_ppc_gen_code
#define cpu_signal_handler cpu_ppc_signal_handler
#define cpu_list ppc_cpu_list
diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
index 6515351..dfeb6d2 100644
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -601,7 +601,6 @@ bool css_present(uint8_t cssid);
#define cpu_init(model) CPU(cpu_s390x_init(model))
#define cpu_exec cpu_s390x_exec
-#define cpu_gen_code cpu_s390x_gen_code
#define cpu_signal_handler cpu_s390x_signal_handler
void s390_cpu_list(FILE *f, fprintf_function cpu_fprintf);
diff --git a/target-sh4/cpu.h b/target-sh4/cpu.h
index 4fc7b1d..21de302 100644
--- a/target-sh4/cpu.h
+++ b/target-sh4/cpu.h
@@ -228,7 +228,6 @@ void cpu_load_tlb(CPUSH4State * env);
#define cpu_init(cpu_model) CPU(cpu_sh4_init(cpu_model))
#define cpu_exec cpu_sh4_exec
-#define cpu_gen_code cpu_sh4_gen_code
#define cpu_signal_handler cpu_sh4_signal_handler
#define cpu_list sh4_cpu_list
diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h
index 40b6625..62ec9f1 100644
--- a/target-sparc/cpu.h
+++ b/target-sparc/cpu.h
@@ -599,7 +599,6 @@ int cpu_sparc_signal_handler(int host_signum, void *pinfo, void *puc);
#endif
#define cpu_exec cpu_sparc_exec
-#define cpu_gen_code cpu_sparc_gen_code
#define cpu_signal_handler cpu_sparc_signal_handler
#define cpu_list sparc_cpu_list
diff --git a/target-xtensa/cpu.h b/target-xtensa/cpu.h
index 96bfc82..c8e994f 100644
--- a/target-xtensa/cpu.h
+++ b/target-xtensa/cpu.h
@@ -383,7 +383,6 @@ typedef struct CPUXtensaState {
#include "cpu-qom.h"
#define cpu_exec cpu_xtensa_exec
-#define cpu_gen_code cpu_xtensa_gen_code
#define cpu_signal_handler cpu_xtensa_signal_handler
#define cpu_list xtensa_cpu_list
--
2.4.3
* [Qemu-devel] [PATCH 16/20] tcg: Add TCG_MAX_INSNS
From: Richard Henderson @ 2015-09-02 5:51 UTC (permalink / raw)
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
Adjust all translators to respect it.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
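A small sketch, not part of the patch, of the clamp that each translator
gains here, written as a standalone helper. The helper name demo_max_insns
is hypothetical, and the two DEMO_* values are assumptions chosen for
illustration rather than taken from the headers.

```c
#include <assert.h>

#define DEMO_CF_COUNT_MASK 0x7fff  /* assumed value, for illustration */
#define DEMO_TCG_MAX_INSNS 512     /* assumed value, for illustration */

/* Derive the per-TB instruction limit from the TB's cflags. */
static int demo_max_insns(unsigned cflags)
{
    int max_insns = cflags & DEMO_CF_COUNT_MASK;

    if (max_insns == 0) {
        /* no explicit count requested: start from the largest count */
        max_insns = DEMO_CF_COUNT_MASK;
    }
    if (max_insns > DEMO_TCG_MAX_INSNS) {
        /* never exceed what the TCG side is prepared to record */
        max_insns = DEMO_TCG_MAX_INSNS;
    }
    return max_insns;
}
```

An explicit small count passes through unchanged; only a zero or oversized
count is pulled down to the cap.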
target-alpha/translate.c | 3 +++
target-arm/translate-a64.c | 3 +++
target-arm/translate.c | 6 +++++-
target-cris/translate.c | 3 +++
target-i386/translate.c | 6 +++++-
target-lm32/translate.c | 3 +++
target-m68k/translate.c | 6 +++++-
target-microblaze/translate.c | 6 +++++-
target-mips/translate.c | 7 ++++++-
target-moxie/translate.c | 13 +++++++++++--
target-openrisc/translate.c | 3 +++
target-ppc/translate.c | 6 +++++-
target-s390x/translate.c | 3 +++
target-sh4/translate.c | 7 ++++++-
target-sparc/translate.c | 7 ++++++-
target-tricore/translate.c | 17 ++++++++++-------
target-unicore32/translate.c | 3 +++
target-xtensa/translate.c | 3 +++
tcg/tcg.h | 1 +
19 files changed, 89 insertions(+), 17 deletions(-)
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 0c43ffa..0229a03 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -2899,6 +2899,9 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
}
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
if (in_superpage(&ctx, pc_start)) {
pc_mask = (1ULL << 41) - 1;
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 4fb4a9f..10173a4 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -10991,6 +10991,9 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
}
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
gen_tb_start(tb);
diff --git a/target-arm/translate.c b/target-arm/translate.c
index c9de455..b4c5dd9 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -11223,8 +11223,12 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
lj = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
- if (max_insns == 0)
+ if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
+ }
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
gen_tb_start(tb);
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 3c8fac4..716d961 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3185,6 +3185,9 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
}
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
gen_tb_start(tb);
do {
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 4497c13..e272409 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7993,8 +7993,12 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
lj = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
- if (max_insns == 0)
+ if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
+ }
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
gen_tb_start(tb);
for(;;) {
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index 84eeac3..67fdb09 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1085,6 +1085,9 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
}
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
gen_tb_start(tb);
do {
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index bfd9c00..9ac2cea 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2992,8 +2992,12 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
lj = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
- if (max_insns == 0)
+ if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
+ }
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
gen_tb_start(tb);
do {
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index 9e046f7..d4ec25c 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1704,8 +1704,12 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
lj = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
- if (max_insns == 0)
+ if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
+ }
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
gen_tb_start(tb);
do
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 320adef..a1e6b68 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -20199,8 +20199,13 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
MO_UNALN : MO_ALIGN;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
- if (max_insns == 0)
+ if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
+ }
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
+
LOG_DISAS("\ntb %p idx %d hflags %04x\n", tb, ctx.mem_idx, ctx.hflags);
gen_tb_start(tb);
while (ctx.bstate == BS_NONE) {
diff --git a/target-moxie/translate.c b/target-moxie/translate.c
index cfc3cec..8741bba 100644
--- a/target-moxie/translate.c
+++ b/target-moxie/translate.c
@@ -826,7 +826,7 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
CPUBreakpoint *bp;
int j, lj = -1;
CPUMoxieState *env = &cpu->env;
- int num_insns;
+ int num_insns, max_insns;
pc_start = tb->pc;
ctx.pc = pc_start;
@@ -836,6 +836,13 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
ctx.singlestep_enabled = 0;
ctx.bstate = BS_NONE;
num_insns = 0;
+ max_insns = tb->cflags & CF_COUNT_MASK;
+ if (max_insns == 0) {
+ max_insns = CF_COUNT_MASK;
+ }
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
gen_tb_start(tb);
do {
@@ -868,10 +875,12 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
ctx.pc += decode_opc(cpu, &ctx);
num_insns++;
+ if (num_insns >= max_insns) {
+ break;
+ }
if (cs->singlestep_enabled) {
break;
}
-
if ((ctx.pc & (TARGET_PAGE_SIZE - 1)) == 0) {
break;
}
diff --git a/target-openrisc/translate.c b/target-openrisc/translate.c
index d5da295..002c9a4 100644
--- a/target-openrisc/translate.c
+++ b/target-openrisc/translate.c
@@ -1670,6 +1670,9 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
}
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
gen_tb_start(tb);
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 2872c77..f576ecb 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11469,8 +11469,12 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
#endif
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
- if (max_insns == 0)
+ if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
+ }
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
gen_tb_start(tb);
tcg_clear_temp_count();
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index d62e4a3..4518571 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -5353,6 +5353,9 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
}
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
gen_tb_start(tb);
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index 740bf66..ac77ab2 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1869,8 +1869,13 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
ii = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
- if (max_insns == 0)
+ if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
+ }
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
+
gen_tb_start(tb);
while (ctx.bstate == BS_NONE && !tcg_op_buf_full()) {
if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 8f7bfb5..4e3760b 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -5237,8 +5237,13 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
- if (max_insns == 0)
+ if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
+ }
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
+
gen_tb_start(tb);
do {
if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index a5e4ddb..8173055 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -8274,13 +8274,21 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
CPUTriCoreState *env = &cpu->env;
DisasContext ctx;
target_ulong pc_start;
- int num_insns;
+ int num_insns, max_insns;
if (search_pc) {
qemu_log("search pc %d\n", search_pc);
}
num_insns = 0;
+ max_insns = tb->cflags & CF_COUNT_MASK;
+ if (max_insns == 0) {
+ max_insns = CF_COUNT_MASK;
+ }
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
+
pc_start = tb->pc;
ctx.pc = pc_start;
ctx.saved_pc = -1;
@@ -8299,12 +8307,7 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
num_insns++;
- if (tcg_op_buf_full()) {
- gen_save_pc(ctx.next_pc);
- tcg_gen_exit_tb(0);
- break;
- }
- if (singlestep) {
+ if (num_insns >= max_insns || singlestep || tcg_op_buf_full()) {
gen_save_pc(ctx.next_pc);
tcg_gen_exit_tb(0);
break;
diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c
index 28db34a..b701c51 100644
--- a/target-unicore32/translate.c
+++ b/target-unicore32/translate.c
@@ -1901,6 +1901,9 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
}
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
#ifndef CONFIG_USER_ONLY
if ((env->uncached_asr & ASR_M) == ASR_MODE_USER) {
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index ab9e8f9..c7151bb 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -3030,6 +3030,9 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
}
+ if (max_insns > TCG_MAX_INSNS) {
+ max_insns = TCG_MAX_INSNS;
+ }
dc.config = env->config;
dc.singlestep_enabled = cs->singlestep_enabled;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 455c229..8e67e41 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -194,6 +194,7 @@ typedef struct TCGPool {
#define TCG_POOL_CHUNK_SIZE 32768
#define TCG_MAX_TEMPS 512
+#define TCG_MAX_INSNS 512
/* when the size of the arguments of a called function is smaller than
this value, they are statically allocated in the TB stack frame */
--
2.4.3
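For readers skimming the per-target hunks, the clamp that every translator now applies can be sketched in isolation. This is only an illustration: the helper name and the `SKETCH_` constants are hypothetical, the value 512 is taken from the tcg.h hunk above, and the 16-bit width of the count mask is an assumption.

```c
#include <assert.h>

/* Assumed values for illustration only: 512 mirrors TCG_MAX_INSNS from the
   tcg.h hunk; the mask width stands in for CF_COUNT_MASK and is a guess. */
#define SKETCH_CF_COUNT_MASK 0xffffu
#define SKETCH_TCG_MAX_INSNS 512

/* Hypothetical helper: compute the per-TB instruction budget from cflags. */
static int sketch_max_insns(unsigned cflags)
{
    int max_insns = (int)(cflags & SKETCH_CF_COUNT_MASK);

    /* A count of zero means "no explicit limit requested". */
    if (max_insns == 0) {
        max_insns = SKETCH_CF_COUNT_MASK;
    }
    /* Never exceed what the fixed-size per-insn arrays can hold. */
    if (max_insns > SKETCH_TCG_MAX_INSNS) {
        max_insns = SKETCH_TCG_MAX_INSNS;
    }
    assert(max_insns >= 1 && max_insns <= SKETCH_TCG_MAX_INSNS);
    return max_insns;
}
```

The point of the second clamp is that later patches size arrays by TCG_MAX_INSNS, so the translation loop must not produce more guest instructions than that in one TB.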
* [Qemu-devel] [PATCH 17/20] tcg: Pass data argument to restore_state_to_opc
2015-09-02 5:51 [Qemu-devel] [RFC 00/20] Do away with TB retranslation Richard Henderson
` (14 preceding siblings ...)
2015-09-02 5:51 ` [Qemu-devel] [PATCH 16/20] tcg: Add TCG_MAX_INSNS Richard Henderson
@ 2015-09-02 5:52 ` Richard Henderson
2015-09-08 18:46 ` Peter Maydell
2015-09-02 5:52 ` [Qemu-devel] [PATCH 18/20] tcg: Save insn data and use it in cpu_restore_state_from_tb Richard Henderson
` (6 subsequent siblings)
22 siblings, 1 reply; 62+ messages in thread
From: Richard Henderson @ 2015-09-02 5:52 UTC (permalink / raw)
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
The gen_opc_* arrays are already redundant with the data stored in
the insn_start arguments. Transition restore_state_to_opc to use
data from the later.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
include/exec/exec-all.h | 2 +-
target-alpha/translate.c | 5 +++--
target-arm/translate.c | 9 +++++----
target-cris/translate.c | 5 +++--
target-i386/translate.c | 26 ++++++--------------------
target-lm32/translate.c | 5 +++--
target-m68k/translate.c | 5 +++--
target-microblaze/translate.c | 5 +++--
target-mips/translate.c | 9 +++++----
target-moxie/translate.c | 5 +++--
target-openrisc/translate.c | 4 ++--
target-ppc/translate.c | 5 +++--
target-s390x/translate.c | 8 ++++----
target-sh4/translate.c | 7 ++++---
target-sparc/translate.c | 9 +++++----
target-tricore/translate.c | 5 +++--
target-unicore32/translate.c | 5 +++--
target-xtensa/translate.c | 5 +++--
tcg/tcg.c | 12 ++++++++++--
tcg/tcg.h | 2 ++
translate-all.c | 2 +-
21 files changed, 75 insertions(+), 65 deletions(-)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 070291e..8c347ba 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -75,7 +75,7 @@ typedef struct TranslationBlock TranslationBlock;
void gen_intermediate_code(CPUArchState *env, struct TranslationBlock *tb);
void gen_intermediate_code_pc(CPUArchState *env, struct TranslationBlock *tb);
void restore_state_to_opc(CPUArchState *env, struct TranslationBlock *tb,
- int pc_pos);
+ target_ulong *data);
void cpu_gen_init(void);
bool cpu_restore_state(CPUState *cpu, uintptr_t searched_pc);
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 0229a03..27c9942 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3023,7 +3023,8 @@ void gen_intermediate_code_pc (CPUAlphaState *env, struct TranslationBlock *tb)
gen_intermediate_code_internal(alpha_env_get_cpu(env), tb, true);
}
-void restore_state_to_opc(CPUAlphaState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUAlphaState *env, TranslationBlock *tb,
+ target_ulong *data)
{
- env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+ env->pc = data[0];
}
diff --git a/target-arm/translate.c b/target-arm/translate.c
index b4c5dd9..2940d07 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -11574,13 +11574,14 @@ void arm_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
}
}
-void restore_state_to_opc(CPUARMState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUARMState *env, TranslationBlock *tb,
+ target_ulong *data)
{
if (is_a64(env)) {
- env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+ env->pc = data[0];
env->condexec_bits = 0;
} else {
- env->regs[15] = tcg_ctx.gen_opc_pc[pc_pos];
- env->condexec_bits = gen_opc_condexec_bits[pc_pos];
+ env->regs[15] = data[0];
+ env->condexec_bits = data[1];
}
}
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 716d961..ce2d3a0 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3457,7 +3457,8 @@ void cris_initialize_tcg(void)
}
}
-void restore_state_to_opc(CPUCRISState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUCRISState *env, TranslationBlock *tb,
+ target_ulong *data)
{
- env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+ env->pc = data[0];
}
diff --git a/target-i386/translate.c b/target-i386/translate.c
index e272409..49944df 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -8117,26 +8117,12 @@ void gen_intermediate_code_pc(CPUX86State *env, TranslationBlock *tb)
gen_intermediate_code_internal(x86_env_get_cpu(env), tb, true);
}
-void restore_state_to_opc(CPUX86State *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUX86State *env, TranslationBlock *tb,
+ target_ulong *data)
{
- int cc_op;
-#ifdef DEBUG_DISAS
- if (qemu_loglevel_mask(CPU_LOG_TB_OP)) {
- int i;
- qemu_log("RESTORE:\n");
- for(i = 0;i <= pc_pos; i++) {
- if (tcg_ctx.gen_opc_instr_start[i]) {
- qemu_log("0x%04x: " TARGET_FMT_lx "\n", i,
- tcg_ctx.gen_opc_pc[i]);
- }
- }
- qemu_log("pc_pos=0x%x eip=" TARGET_FMT_lx " cs_base=%x\n",
- pc_pos, tcg_ctx.gen_opc_pc[pc_pos] - tb->cs_base,
- (uint32_t)tb->cs_base);
- }
-#endif
- env->eip = tcg_ctx.gen_opc_pc[pc_pos] - tb->cs_base;
- cc_op = gen_opc_cc_op[pc_pos];
- if (cc_op != CC_OP_DYNAMIC)
+ int cc_op = data[1];
+ env->eip = data[0] - tb->cs_base;
+ if (cc_op != CC_OP_DYNAMIC) {
env->cc_op = cc_op;
+ }
}
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index 67fdb09..d0aaea2 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1219,9 +1219,10 @@ void lm32_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
cpu_fprintf(f, "\n\n");
}
-void restore_state_to_opc(CPULM32State *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPULM32State *env, TranslationBlock *tb,
+ target_ulong *data)
{
- env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+ env->pc = data[0];
}
void lm32_translate_init(void)
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 9ac2cea..2fc9b68 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -3124,7 +3124,8 @@ void m68k_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
cpu_fprintf (f, "FPRESULT = %12g\n", *(double *)&env->fp_result);
}
-void restore_state_to_opc(CPUM68KState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUM68KState *env, TranslationBlock *tb,
+ target_ulong *data)
{
- env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+ env->pc = data[0];
}
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index d4ec25c..57c79a6 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1952,7 +1952,8 @@ void mb_tcg_init(void)
}
}
-void restore_state_to_opc(CPUMBState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUMBState *env, TranslationBlock *tb,
+ target_ulong *data)
{
- env->sregs[SR_PC] = tcg_ctx.gen_opc_pc[pc_pos];
+ env->sregs[SR_PC] = data[0];
}
diff --git a/target-mips/translate.c b/target-mips/translate.c
index a1e6b68..ad9e5d2 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -20717,18 +20717,19 @@ void cpu_state_reset(CPUMIPSState *env)
}
}
-void restore_state_to_opc(CPUMIPSState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUMIPSState *env, TranslationBlock *tb,
+ target_ulong *data)
{
- env->active_tc.PC = tcg_ctx.gen_opc_pc[pc_pos];
+ env->active_tc.PC = data[0];
env->hflags &= ~MIPS_HFLAG_BMASK;
- env->hflags |= gen_opc_hflags[pc_pos];
+ env->hflags |= data[1];
switch (env->hflags & MIPS_HFLAG_BMASK_BASE) {
case MIPS_HFLAG_BR:
break;
case MIPS_HFLAG_BC:
case MIPS_HFLAG_BL:
case MIPS_HFLAG_B:
- env->btarget = gen_opc_btarget[pc_pos];
+ env->btarget = data[2];
break;
}
}
diff --git a/target-moxie/translate.c b/target-moxie/translate.c
index 8741bba..9fa8a43 100644
--- a/target-moxie/translate.c
+++ b/target-moxie/translate.c
@@ -928,7 +928,8 @@ void gen_intermediate_code_pc(CPUMoxieState *env, struct TranslationBlock *tb)
gen_intermediate_code_internal(moxie_env_get_cpu(env), tb, true);
}
-void restore_state_to_opc(CPUMoxieState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUMoxieState *env, TranslationBlock *tb,
+ target_ulong *data)
{
- env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+ env->pc = data[0];
}
diff --git a/target-openrisc/translate.c b/target-openrisc/translate.c
index 002c9a4..78c157b 100644
--- a/target-openrisc/translate.c
+++ b/target-openrisc/translate.c
@@ -1804,7 +1804,7 @@ void openrisc_cpu_dump_state(CPUState *cs, FILE *f,
}
void restore_state_to_opc(CPUOpenRISCState *env, TranslationBlock *tb,
- int pc_pos)
+ target_ulong *data)
{
- env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+ env->pc = data[0];
}
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index f576ecb..1cfc1ea 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11626,7 +11626,8 @@ void gen_intermediate_code_pc (CPUPPCState *env, struct TranslationBlock *tb)
gen_intermediate_code_internal(ppc_env_get_cpu(env), tb, true);
}
-void restore_state_to_opc(CPUPPCState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUPPCState *env, TranslationBlock *tb,
+ target_ulong *data)
{
- env->nip = tcg_ctx.gen_opc_pc[pc_pos];
+ env->nip = data[0];
}
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index 4518571..047685c 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -5463,11 +5463,11 @@ void gen_intermediate_code_pc (CPUS390XState *env, struct TranslationBlock *tb)
gen_intermediate_code_internal(s390_env_get_cpu(env), tb, true);
}
-void restore_state_to_opc(CPUS390XState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUS390XState *env, TranslationBlock *tb,
+ target_ulong *data)
{
- int cc_op;
- env->psw.addr = tcg_ctx.gen_opc_pc[pc_pos];
- cc_op = gen_opc_cc_op[pc_pos];
+ int cc_op = data[1];
+ env->psw.addr = data[0];
if ((cc_op != CC_OP_DYNAMIC) && (cc_op != CC_OP_STATIC)) {
env->cc_op = cc_op;
}
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index ac77ab2..db41d0b 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1978,8 +1978,9 @@ void gen_intermediate_code_pc(CPUSH4State * env, struct TranslationBlock *tb)
gen_intermediate_code_internal(sh_env_get_cpu(env), tb, true);
}
-void restore_state_to_opc(CPUSH4State *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUSH4State *env, TranslationBlock *tb,
+ target_ulong *data)
{
- env->pc = tcg_ctx.gen_opc_pc[pc_pos];
- env->flags = gen_opc_hflags[pc_pos];
+ env->pc = data[0];
+ env->flags = data[1];
}
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 4e3760b..a208a6b 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -5453,12 +5453,13 @@ void gen_intermediate_code_init(CPUSPARCState *env)
}
}
-void restore_state_to_opc(CPUSPARCState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUSPARCState *env, TranslationBlock *tb,
+ target_ulong *data)
{
- target_ulong pc, npc;
+ target_ulong pc = data[0];
+ target_ulong npc = data[1];
- env->pc = pc = tcg_ctx.gen_opc_pc[pc_pos];
- npc = gen_opc_npc[pc_pos];
+ env->pc = pc;
if (npc == DYNAMIC_PC) {
/* already stored */
} else if (npc & JUMP_PC) {
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index 8173055..a23bfcd 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -8348,9 +8348,10 @@ gen_intermediate_code_pc(CPUTriCoreState *env, struct TranslationBlock *tb)
}
void
-restore_state_to_opc(CPUTriCoreState *env, TranslationBlock *tb, int pc_pos)
+restore_state_to_opc(CPUTriCoreState *env, TranslationBlock *tb,
+ target_ulong *data)
{
- env->PC = tcg_ctx.gen_opc_pc[pc_pos];
+ env->PC = data[0];
}
/*
*
diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c
index b701c51..75c7d65 100644
--- a/target-unicore32/translate.c
+++ b/target-unicore32/translate.c
@@ -2133,7 +2133,8 @@ void uc32_cpu_dump_state(CPUState *cs, FILE *f,
cpu_dump_state_ucf64(env, f, cpu_fprintf, flags);
}
-void restore_state_to_opc(CPUUniCore32State *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUUniCore32State *env, TranslationBlock *tb,
+ target_ulong *data)
{
- env->regs[31] = tcg_ctx.gen_opc_pc[pc_pos];
+ env->regs[31] = data[0];
}
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index c7151bb..5df3913 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -3213,7 +3213,8 @@ void xtensa_cpu_dump_state(CPUState *cs, FILE *f,
}
}
-void restore_state_to_opc(CPUXtensaState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUXtensaState *env, TranslationBlock *tb,
+ target_ulong *data)
{
- env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+ env->pc = data[0];
}
diff --git a/tcg/tcg.c b/tcg/tcg.c
index a44b834..d956f0b 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1007,7 +1007,6 @@ void tcg_dump_ops(TCGContext *s)
a = args[i];
#endif
qemu_log(" " TARGET_FMT_lx, a);
-
}
} else if (c == INDEX_op_call) {
/* variable number of arguments */
@@ -2301,7 +2300,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
tcg_insn_unit *gen_code_buf,
long search_pc)
{
- int oi, oi_next;
+ int i, oi, oi_next;
#ifdef DEBUG_DISAS
if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP))) {
@@ -2368,6 +2367,15 @@ static inline int tcg_gen_code_common(TCGContext *s,
tcg_reg_alloc_movi(s, args, dead_args, sync_args);
break;
case INDEX_op_insn_start:
+ for (i = 0; i < TARGET_INSN_START_WORDS; ++i) {
+ target_ulong a;
+#if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
+ a = ((target_ulong)args[i * 2 + 1] << 32) | args[i * 2];
+#else
+ a = args[i];
+#endif
+ s->gen_opc_data[i] = a;
+ }
break;
case INDEX_op_discard:
temp_dead(s, args[0]);
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 8e67e41..794b757 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -580,6 +580,8 @@ struct TCGContext {
target_ulong gen_opc_pc[OPC_BUF_SIZE];
uint16_t gen_opc_icount[OPC_BUF_SIZE];
uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
+
+ target_ulong gen_opc_data[TARGET_INSN_START_WORDS];
};
extern TCGContext tcg_ctx;
diff --git a/translate-all.c b/translate-all.c
index a5f7e78..74be98a 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -189,7 +189,7 @@ static int cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
}
cpu->icount_decr.u16.low -= s->gen_opc_icount[j];
- restore_state_to_opc(env, tb, j);
+ restore_state_to_opc(env, tb, s->gen_opc_data);
#ifdef CONFIG_PROFILER
s->restore_time += profile_getclock() - ti;
--
2.4.3
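The `#if TARGET_LONG_BITS > TCG_TARGET_REG_BITS` branch in the tcg.c hunk is the least obvious part of this patch, so here is a minimal sketch of what it computes. It assumes a 64-bit guest word carried in two 32-bit argument slots, low half first; the function name is hypothetical and the fixed-width types stand in for target_ulong and TCGArg.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the 64-bit-guest-on-32-bit-host case: word i of
   the insn_start data occupies argument slots 2*i (low) and 2*i+1 (high). */
static uint64_t sketch_insn_start_word(const uint32_t *args, int i)
{
    assert(args != 0 && i >= 0);
    return ((uint64_t)args[i * 2 + 1] << 32) | args[i * 2];
}
```

When the host register is at least as wide as the guest word, no reassembly is needed and the word is simply `args[i]`, which is the `#else` branch.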
* Re: [Qemu-devel] [PATCH 17/20] tcg: Pass data argument to restore_state_to_opc
2015-09-02 5:52 ` [Qemu-devel] [PATCH 17/20] tcg: Pass data argument to restore_state_to_opc Richard Henderson
@ 2015-09-08 18:46 ` Peter Maydell
2015-09-17 19:39 ` Richard Henderson
0 siblings, 1 reply; 62+ messages in thread
From: Peter Maydell @ 2015-09-08 18:46 UTC (permalink / raw)
To: Richard Henderson
Cc: Aurelien Jarno, QEMU Developers, dl.soluz, Artyom Tarasenko
On 2 September 2015 at 06:52, Richard Henderson <rth@twiddle.net> wrote:
> The gen_opc_* arrays are already redundant with the data stored in
> the insn_start arguments. Transition restore_state_to_opc to use
> data from the later.
Typo: "latter".
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index a44b834..d956f0b 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -1007,7 +1007,6 @@ void tcg_dump_ops(TCGContext *s)
> a = args[i];
> #endif
> qemu_log(" " TARGET_FMT_lx, a);
> -
> }
> } else if (c == INDEX_op_call) {
> /* variable number of arguments */
Stray whitespace change.
Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM
* Re: [Qemu-devel] [PATCH 17/20] tcg: Pass data argument to restore_state_to_opc
2015-09-08 18:46 ` Peter Maydell
@ 2015-09-17 19:39 ` Richard Henderson
0 siblings, 0 replies; 62+ messages in thread
From: Richard Henderson @ 2015-09-17 19:39 UTC (permalink / raw)
To: Peter Maydell; +Cc: dl.soluz, QEMU Developers, Aurelien Jarno, Artyom Tarasenko
On 09/08/2015 11:46 AM, Peter Maydell wrote:
> On 2 September 2015 at 06:52, Richard Henderson <rth@twiddle.net> wrote:
>> The gen_opc_* arrays are already redundant with the data stored in
>> the insn_start arguments. Transition restore_state_to_opc to use
>> data from the later.
>
> Typo: "latter".
>
>> Signed-off-by: Richard Henderson <rth@twiddle.net>
>> diff --git a/tcg/tcg.c b/tcg/tcg.c
>> index a44b834..d956f0b 100644
>> --- a/tcg/tcg.c
>> +++ b/tcg/tcg.c
>> @@ -1007,7 +1007,6 @@ void tcg_dump_ops(TCGContext *s)
>> a = args[i];
>> #endif
>> qemu_log(" " TARGET_FMT_lx, a);
>> -
>> }
>> } else if (c == INDEX_op_call) {
>> /* variable number of arguments */
>
> Stray whitespace change.
Amusingly, this is the whitespace that checkpatch complained about in patch 3.
r~
* [Qemu-devel] [PATCH 18/20] tcg: Save insn data and use it in cpu_restore_state_from_tb
2015-09-02 5:51 [Qemu-devel] [RFC 00/20] Do away with TB retranslation Richard Henderson
` (15 preceding siblings ...)
2015-09-02 5:52 ` [Qemu-devel] [PATCH 17/20] tcg: Pass data argument to restore_state_to_opc Richard Henderson
@ 2015-09-02 5:52 ` Richard Henderson
2015-09-10 13:49 ` Peter Maydell
2015-09-02 5:52 ` [Qemu-devel] [PATCH 19/20] tcg: Remove gen_intermediate_code_pc Richard Henderson
` (5 subsequent siblings)
22 siblings, 1 reply; 62+ messages in thread
From: Richard Henderson @ 2015-09-02 5:52 UTC (permalink / raw)
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
We can now restore state without retranslation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
include/exec/exec-all.h | 1 +
tcg/tcg.c | 11 ++++-
tcg/tcg.h | 3 +-
translate-all.c | 129 +++++++++++++++++++++++++++++++++---------------
4 files changed, 100 insertions(+), 44 deletions(-)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 8c347ba..315f20a 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -198,6 +198,7 @@ struct TranslationBlock {
#define CF_USE_ICOUNT 0x20000
void *tc_ptr; /* pointer to the translated code */
+ uint8_t *tc_search; /* pointer to search data */
/* next matching tb for physical address. */
struct TranslationBlock *phys_hash_next;
/* original tb when cflags has CF_NOCACHE */
diff --git a/tcg/tcg.c b/tcg/tcg.c
index d956f0b..3541d4c 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2300,7 +2300,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
tcg_insn_unit *gen_code_buf,
long search_pc)
{
- int i, oi, oi_next;
+ int i, oi, oi_next, num_insns;
#ifdef DEBUG_DISAS
if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP))) {
@@ -2344,6 +2344,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
tcg_out_tb_init(s);
+ num_insns = -1;
for (oi = s->gen_first_op_idx; oi >= 0; oi = oi_next) {
TCGOp * const op = &s->gen_op_buf[oi];
TCGArg * const args = &s->gen_opparam_buf[op->args];
@@ -2367,6 +2368,10 @@ static inline int tcg_gen_code_common(TCGContext *s,
tcg_reg_alloc_movi(s, args, dead_args, sync_args);
break;
case INDEX_op_insn_start:
+ if (num_insns >= 0) {
+ s->gen_insn_end_off[num_insns] = tcg_current_code_size(s);
+ }
+ num_insns++;
for (i = 0; i < TARGET_INSN_START_WORDS; ++i) {
target_ulong a;
#if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
@@ -2374,7 +2379,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
#else
a = args[i];
#endif
- s->gen_opc_data[i] = a;
+ s->gen_insn_data[num_insns][i] = a;
}
break;
case INDEX_op_discard:
@@ -2406,6 +2411,8 @@ static inline int tcg_gen_code_common(TCGContext *s,
check_regs(s);
#endif
}
+ tcg_debug_assert(num_insns >= 0);
+ s->gen_insn_end_off[num_insns] = tcg_current_code_size(s);
/* Generate TB finalization at the end of block */
tcg_out_tb_finalize(s);
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 794b757..11cc107 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -581,7 +581,8 @@ struct TCGContext {
uint16_t gen_opc_icount[OPC_BUF_SIZE];
uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
- target_ulong gen_opc_data[TARGET_INSN_START_WORDS];
+ uint16_t gen_insn_end_off[TCG_MAX_INSNS];
+ target_ulong gen_insn_data[TCG_MAX_INSNS][TARGET_INSN_START_WORDS];
};
extern TCGContext tcg_ctx;
diff --git a/translate-all.c b/translate-all.c
index 74be98a..a31f839 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -138,58 +138,65 @@ void cpu_gen_init(void)
tcg_context_init(&tcg_ctx);
}
-/* The cpu state corresponding to 'searched_pc' is restored.
- */
+static target_long decode_sleb128(uint8_t **pp)
+{
+ uint8_t *p = *pp;
+ target_long val = 0;
+ int byte, shift = 0;
+
+ do {
+ byte = *p++;
+ val |= (target_ulong)(byte & 0x7f) << shift;
+ shift += 7;
+ } while (byte & 0x80);
+ if (shift < TARGET_LONG_BITS && (byte & 0x40)) {
+ val |= -(target_ulong)1 << shift;
+ }
+
+ *pp = p;
+ return val;
+}
+
+/* The cpu state corresponding to 'searched_pc' is restored. */
static int cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
uintptr_t searched_pc)
{
+ target_ulong data[TARGET_INSN_START_WORDS] = { };
+ uintptr_t host_pc = (uintptr_t)tb->tc_ptr;
CPUArchState *env = cpu->env_ptr;
- TCGContext *s = &tcg_ctx;
- int j;
- uintptr_t tc_ptr;
+ uint8_t *p = tb->tc_search;
+ int i, j, num_insns = tb->icount;
#ifdef CONFIG_PROFILER
- int64_t ti;
+ int64_t ti = profile_getclock();
#endif
-#ifdef CONFIG_PROFILER
- ti = profile_getclock();
-#endif
- tcg_func_start(s);
+ if (searched_pc < host_pc) {
+ return -1;
+ }
- gen_intermediate_code_pc(env, tb);
+ /* Reconstruct the stored insn data while looking for the point at
+ which the end of the insn exceeds the searched_pc. */
+ for (i = 0; i < num_insns; ++i) {
+ for (j = 0; j < TARGET_INSN_START_WORDS; ++j) {
+ data[j] += decode_sleb128(&p);
+ }
+ host_pc += decode_sleb128(&p);
+ if (host_pc > searched_pc) {
+ goto found;
+ }
+ }
+ return -1;
+ found:
if (tb->cflags & CF_USE_ICOUNT) {
assert(use_icount);
/* Reset the cycle counter to the start of the block. */
- cpu->icount_decr.u16.low += tb->icount;
+ cpu->icount_decr.u16.low += num_insns;
/* Clear the IO flag. */
cpu->can_do_io = 0;
}
-
- /* find opc index corresponding to search_pc */
- tc_ptr = (uintptr_t)tb->tc_ptr;
- if (searched_pc < tc_ptr)
- return -1;
-
- s->tb_next_offset = tb->tb_next_offset;
-#ifdef USE_DIRECT_JUMP
- s->tb_jmp_offset = tb->tb_jmp_offset;
- s->tb_next = NULL;
-#else
- s->tb_jmp_offset = NULL;
- s->tb_next = tb->tb_next;
-#endif
- j = tcg_gen_code_search_pc(s, (tcg_insn_unit *)tc_ptr,
- searched_pc - tc_ptr);
- if (j < 0)
- return -1;
- /* now find start of instruction before */
- while (s->gen_opc_instr_start[j] == 0) {
- j--;
- }
- cpu->icount_decr.u16.low -= s->gen_opc_icount[j];
-
- restore_state_to_opc(env, tb, s->gen_opc_data);
+ cpu->icount_decr.u16.low -= i;
+ restore_state_to_opc(env, tb, data);
#ifdef CONFIG_PROFILER
s->restore_time += profile_getclock() - ti;
@@ -933,6 +940,44 @@ static void build_page_bitmap(PageDesc *p)
}
}
+static uint8_t *encode_sleb128(uint8_t *p, target_long val)
+{
+ int more, byte;
+
+ do {
+ byte = val & 0x7f;
+ val >>= 7;
+ more = !((val == 0 && (byte & 0x40) == 0)
+ || (val == -1 && (byte & 0x40) != 0));
+ if (more)
+ byte |= 0x80;
+ *p++ = byte;
+ } while (more);
+
+ return p;
+}
+
+static int encode_search(TranslationBlock *tb, uint8_t *block)
+{
+ uint8_t *p = block;
+ int i, j, n;
+
+ tb->tc_search = block;
+
+ for (i = 0, n = tb->icount; i < n; ++i) {
+ target_ulong prev;
+
+ for (j = 0; j < TARGET_INSN_START_WORDS; ++j) {
+ prev = (i == 0 ? 0 : tcg_ctx.gen_insn_data[i - 1][j]);
+ p = encode_sleb128(p, tcg_ctx.gen_insn_data[i][j] - prev);
+ }
+ prev = (i == 0 ? 0 : tcg_ctx.gen_insn_end_off[i - 1]);
+ p = encode_sleb128(p, tcg_ctx.gen_insn_end_off[i] - prev);
+ }
+
+ return p - block;
+}
+
TranslationBlock *tb_gen_code(CPUState *cpu,
target_ulong pc, target_ulong cs_base,
int flags, int cflags)
@@ -942,7 +987,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
tb_page_addr_t phys_pc, phys_page2;
target_ulong virt_page2;
tcg_insn_unit *gen_code_buf;
- int gen_code_size;
+ int gen_code_size, search_size;
#ifdef CONFIG_PROFILER
int64_t ti;
#endif
@@ -998,11 +1043,12 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
#endif
gen_code_size = tcg_gen_code(&tcg_ctx, gen_code_buf);
+ search_size = encode_search(tb, (void *)gen_code_buf + gen_code_size);
#ifdef CONFIG_PROFILER
tcg_ctx.code_time += profile_getclock();
tcg_ctx.code_in_len += tb->size;
- tcg_ctx.code_out_len += gen_code_size;
+ tcg_ctx.code_out_len += gen_code_size + search_size;
#endif
#ifdef DEBUG_DISAS
@@ -1014,8 +1060,9 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
}
#endif
- tcg_ctx.code_gen_ptr = (void *)(((uintptr_t)gen_code_buf +
- gen_code_size + CODE_GEN_ALIGN - 1) & ~(CODE_GEN_ALIGN - 1));
+ tcg_ctx.code_gen_ptr = (void *)
+ (((uintptr_t)gen_code_buf + gen_code_size + search_size
+ + CODE_GEN_ALIGN - 1) & -CODE_GEN_ALIGN);
/* check next page if needed */
virt_page2 = (pc + tb->size - 1) & TARGET_PAGE_MASK;
--
2.4.3
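The scheme in this patch can be sketched outside of QEMU: each instruction contributes one row of (insn_start words, end offset of its host code), rows are stored as signed LEB128 deltas against the previous row, and a lookup replays the deltas until the accumulated end offset passes the searched offset. The sketch below is an assumption-laden illustration, not the patch itself: all `sk_` names are hypothetical, it uses offsets from the start of the TB rather than absolute host addresses, and it fixes the row width at two 64-bit words.

```c
#include <assert.h>
#include <stdint.h>

#define SK_WORDS 2   /* stand-in for TARGET_INSN_START_WORDS */

/* Signed LEB128 writer; relies on arithmetic right shift of negatives. */
static uint8_t *sk_put(uint8_t *p, int64_t val)
{
    int more, byte;
    do {
        byte = val & 0x7f;
        val >>= 7;
        more = !((val == 0 && (byte & 0x40) == 0)
                 || (val == -1 && (byte & 0x40) != 0));
        if (more) {
            byte |= 0x80;
        }
        *p++ = (uint8_t)byte;
    } while (more);
    return p;
}

/* Signed LEB128 reader; advances *pp past the value. */
static int64_t sk_get(const uint8_t **pp)
{
    const uint8_t *p = *pp;
    uint64_t val = 0;
    int byte, shift = 0;
    do {
        byte = *p++;
        val |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    if (shift < 64 && (byte & 0x40)) {
        val |= ~(uint64_t)0 << shift;   /* sign-extend */
    }
    *pp = p;
    return (int64_t)val;
}

/* Encode n rows as deltas against the previous row; returns bytes written. */
static int sk_encode(uint8_t *block, int n, uint64_t data[][SK_WORDS],
                     const uint16_t *end_off)
{
    uint8_t *p = block;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < SK_WORDS; ++j) {
            uint64_t prev = (i == 0 ? 0 : data[i - 1][j]);
            p = sk_put(p, (int64_t)(data[i][j] - prev));
        }
        int prev_off = (i == 0 ? 0 : end_off[i - 1]);
        p = sk_put(p, (int64_t)end_off[i] - prev_off);
    }
    return (int)(p - block);
}

/* Find the insn whose host code covers offset 'searched'; fills out[] with
   its reconstructed data words. Returns the insn index, or -1 if not found. */
static int sk_lookup(const uint8_t *block, int n, uint64_t searched,
                     uint64_t out[SK_WORDS])
{
    const uint8_t *p = block;
    uint64_t host_off = 0;
    for (int j = 0; j < SK_WORDS; ++j) {
        out[j] = 0;
    }
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < SK_WORDS; ++j) {
            out[j] += (uint64_t)sk_get(&p);
        }
        host_off += (uint64_t)sk_get(&p);
        if (host_off > searched) {
            return i;
        }
    }
    return -1;
}
```

Delta encoding keeps the table small because consecutive guest PCs and host offsets usually differ by a few bytes, so most entries fit in a single LEB128 byte.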
* Re: [Qemu-devel] [PATCH 18/20] tcg: Save insn data and use it in cpu_restore_state_from_tb
2015-09-02 5:52 ` [Qemu-devel] [PATCH 18/20] tcg: Save insn data and use it in cpu_restore_state_from_tb Richard Henderson
@ 2015-09-10 13:49 ` Peter Maydell
2015-09-11 10:29 ` Sergey Fedorov
2015-09-15 20:08 ` Richard Henderson
0 siblings, 2 replies; 62+ messages in thread
From: Peter Maydell @ 2015-09-10 13:49 UTC (permalink / raw)
To: Richard Henderson
Cc: Aurelien Jarno, QEMU Developers, dl.soluz, Artyom Tarasenko
On 2 September 2015 at 06:52, Richard Henderson <rth@twiddle.net> wrote:
> We can now restore state without retranslation.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
> include/exec/exec-all.h | 1 +
> tcg/tcg.c | 11 ++++-
> tcg/tcg.h | 3 +-
> translate-all.c | 129 +++++++++++++++++++++++++++++++++---------------
> 4 files changed, 100 insertions(+), 44 deletions(-)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 8c347ba..315f20a 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -198,6 +198,7 @@ struct TranslationBlock {
> #define CF_USE_ICOUNT 0x20000
>
> void *tc_ptr; /* pointer to the translated code */
> + uint8_t *tc_search; /* pointer to search data */
> /* next matching tb for physical address. */
> struct TranslationBlock *phys_hash_next;
> /* original tb when cflags has CF_NOCACHE */
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index d956f0b..3541d4c 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -2300,7 +2300,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
> tcg_insn_unit *gen_code_buf,
> long search_pc)
> {
> - int i, oi, oi_next;
> + int i, oi, oi_next, num_insns;
>
> #ifdef DEBUG_DISAS
> if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP))) {
> @@ -2344,6 +2344,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
>
> tcg_out_tb_init(s);
>
> + num_insns = -1;
> for (oi = s->gen_first_op_idx; oi >= 0; oi = oi_next) {
> TCGOp * const op = &s->gen_op_buf[oi];
> TCGArg * const args = &s->gen_opparam_buf[op->args];
> @@ -2367,6 +2368,10 @@ static inline int tcg_gen_code_common(TCGContext *s,
> tcg_reg_alloc_movi(s, args, dead_args, sync_args);
> break;
> case INDEX_op_insn_start:
> + if (num_insns >= 0) {
> + s->gen_insn_end_off[num_insns] = tcg_current_code_size(s);
> + }
> + num_insns++;
> for (i = 0; i < TARGET_INSN_START_WORDS; ++i) {
> target_ulong a;
> #if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
> @@ -2374,7 +2379,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
> #else
> a = args[i];
> #endif
> - s->gen_opc_data[i] = a;
> + s->gen_insn_data[num_insns][i] = a;
> }
> break;
> case INDEX_op_discard:
> @@ -2406,6 +2411,8 @@ static inline int tcg_gen_code_common(TCGContext *s,
> check_regs(s);
> #endif
> }
> + tcg_debug_assert(num_insns >= 0);
This is claiming that every TB will have at least one insn_start,
right? I think that most targets will violate that in the breakpoint
case, because the "if we have a bp for this insn then generate a
debug insn and break out of the loop" code is before the call
to tcg_gen_insn_start().
We should probably assert that num_insns < TCG_MAX_INSNS while
we're here.
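To make the two conditions concrete, here is a minimal standalone sketch of the bookkeeping with both assertions in place. The struct and function names are hypothetical stand-ins, not the real TCGContext, and 512 simply mirrors TCG_MAX_INSNS from patch 16.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_INSNS 512   /* mirrors TCG_MAX_INSNS */

/* Hypothetical stand-in for the relevant fields of TCGContext. */
struct sketch_ctx {
    int num_insns;                        /* index of current insn, -1 before the first */
    uint16_t end_off[SKETCH_MAX_INSNS];   /* host code size at the end of each insn */
};

static void sketch_init(struct sketch_ctx *s)
{
    s->num_insns = -1;
}

/* Called at each insn_start marker; code_size is the host code emitted so far. */
static void sketch_insn_start(struct sketch_ctx *s, uint16_t code_size)
{
    if (s->num_insns >= 0) {
        s->end_off[s->num_insns] = code_size;   /* close the previous insn */
    }
    s->num_insns++;
    assert(s->num_insns < SKETCH_MAX_INSNS);    /* the extra bound check */
}

/* Called once after the last op; returns the number of insns recorded. */
static int sketch_finish(struct sketch_ctx *s, uint16_t code_size)
{
    assert(s->num_insns >= 0);   /* fires for a TB with no insn_start at all */
    s->end_off[s->num_insns] = code_size;
    return s->num_insns + 1;
}
```

With this shape, a TB that breaks out of the loop before emitting any insn_start trips the first assertion in `sketch_finish`, which is the breakpoint case described above.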
> + s->gen_insn_end_off[num_insns] = tcg_current_code_size(s);
>
> /* Generate TB finalization at the end of block */
> tcg_out_tb_finalize(s);
> diff --git a/tcg/tcg.h b/tcg/tcg.h
> index 794b757..11cc107 100644
> --- a/tcg/tcg.h
> +++ b/tcg/tcg.h
> @@ -581,7 +581,8 @@ struct TCGContext {
> uint16_t gen_opc_icount[OPC_BUF_SIZE];
> uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
>
> - target_ulong gen_opc_data[TARGET_INSN_START_WORDS];
> + uint16_t gen_insn_end_off[TCG_MAX_INSNS];
> + target_ulong gen_insn_data[TCG_MAX_INSNS][TARGET_INSN_START_WORDS];
> };
>
> extern TCGContext tcg_ctx;
> diff --git a/translate-all.c b/translate-all.c
> index 74be98a..a31f839 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -138,58 +138,65 @@ void cpu_gen_init(void)
> tcg_context_init(&tcg_ctx);
> }
>
> -/* The cpu state corresponding to 'searched_pc' is restored.
> - */
> +static target_long decode_sleb128(uint8_t **pp)
> +{
> + uint8_t *p = *pp;
> + target_long val = 0;
> + int byte, shift = 0;
> +
> + do {
> + byte = *p++;
> + val |= (target_ulong)(byte & 0x7f) << shift;
> + shift += 7;
> + } while (byte & 0x80);
> + if (shift < TARGET_LONG_BITS && (byte & 0x40)) {
> + val |= -(target_ulong)1 << shift;
> + }
> +
> + *pp = p;
> + return val;
> +}
Are the encode/decode sleb128 functions known-good ones
borrowed from somewhere else?
(PS: checkpatch complains about missing braces.)
> +
> +/* The cpu state corresponding to 'searched_pc' is restored. */
> static int cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
> uintptr_t searched_pc)
> {
> + target_ulong data[TARGET_INSN_START_WORDS] = { };
> + uintptr_t host_pc = (uintptr_t)tb->tc_ptr;
> CPUArchState *env = cpu->env_ptr;
> - TCGContext *s = &tcg_ctx;
> - int j;
> - uintptr_t tc_ptr;
> + uint8_t *p = tb->tc_search;
> + int i, j, num_insns = tb->icount;
> #ifdef CONFIG_PROFILER
> - int64_t ti;
> + int64_t ti = profile_getclock();
> #endif
>
> -#ifdef CONFIG_PROFILER
> - ti = profile_getclock();
> -#endif
> - tcg_func_start(s);
> + if (searched_pc < host_pc) {
> + return -1;
> + }
>
> - gen_intermediate_code_pc(env, tb);
> + /* Reconstruct the stored insn data while looking for the point at
> + which the end of the insn exceeds the searched_pc. */
> + for (i = 0; i < num_insns; ++i) {
> + for (j = 0; j < TARGET_INSN_START_WORDS; ++j) {
> + data[j] += decode_sleb128(&p);
> + }
> + host_pc += decode_sleb128(&p);
> + if (host_pc > searched_pc) {
> + goto found;
> + }
> + }
> + return -1;
>
> + found:
> if (tb->cflags & CF_USE_ICOUNT) {
> assert(use_icount);
> /* Reset the cycle counter to the start of the block. */
> - cpu->icount_decr.u16.low += tb->icount;
> + cpu->icount_decr.u16.low += num_insns;
> /* Clear the IO flag. */
> cpu->can_do_io = 0;
> }
> -
> - /* find opc index corresponding to search_pc */
> - tc_ptr = (uintptr_t)tb->tc_ptr;
> - if (searched_pc < tc_ptr)
> - return -1;
> -
> - s->tb_next_offset = tb->tb_next_offset;
> -#ifdef USE_DIRECT_JUMP
> - s->tb_jmp_offset = tb->tb_jmp_offset;
> - s->tb_next = NULL;
> -#else
> - s->tb_jmp_offset = NULL;
> - s->tb_next = tb->tb_next;
> -#endif
> - j = tcg_gen_code_search_pc(s, (tcg_insn_unit *)tc_ptr,
> - searched_pc - tc_ptr);
> - if (j < 0)
> - return -1;
> - /* now find start of instruction before */
> - while (s->gen_opc_instr_start[j] == 0) {
> - j--;
> - }
> - cpu->icount_decr.u16.low -= s->gen_opc_icount[j];
> -
> - restore_state_to_opc(env, tb, s->gen_opc_data);
> + cpu->icount_decr.u16.low -= i;
> + restore_state_to_opc(env, tb, data);
>
> #ifdef CONFIG_PROFILER
> s->restore_time += profile_getclock() - ti;
> @@ -933,6 +940,44 @@ static void build_page_bitmap(PageDesc *p)
> }
> }
>
> +static uint8_t *encode_sleb128(uint8_t *p, target_long val)
> +{
> + int more, byte;
> +
> + do {
> + byte = val & 0x7f;
> + val >>= 7;
> + more = !((val == 0 && (byte & 0x40) == 0)
> + || (val == -1 && (byte & 0x40) != 0));
> + if (more)
> + byte |= 0x80;
> + *p++ = byte;
> + } while (more);
> +
> + return p;
> +}
> +
> +static int encode_search(TranslationBlock *tb, uint8_t *block)
> +{
I think this function would benefit from a brief comment
describing the compressed format we're creating here.
> + uint8_t *p = block;
> + int i, j, n;
> +
> + tb->tc_search = block;
> +
> + for (i = 0, n = tb->icount; i < n; ++i) {
> + target_ulong prev;
> +
> + for (j = 0; j < TARGET_INSN_START_WORDS; ++j) {
> + prev = (i == 0 ? 0 : tcg_ctx.gen_insn_data[i - 1][j]);
> + p = encode_sleb128(p, tcg_ctx.gen_insn_data[i][j] - prev);
> + }
> + prev = (i == 0 ? 0 : tcg_ctx.gen_insn_end_off[i - 1]);
> + p = encode_sleb128(p, tcg_ctx.gen_insn_end_off[i] - prev);
> + }
> +
> + return p - block;
> +}
> +
> TranslationBlock *tb_gen_code(CPUState *cpu,
> target_ulong pc, target_ulong cs_base,
> int flags, int cflags)
> @@ -942,7 +987,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
> tb_page_addr_t phys_pc, phys_page2;
> target_ulong virt_page2;
> tcg_insn_unit *gen_code_buf;
> - int gen_code_size;
> + int gen_code_size, search_size;
> #ifdef CONFIG_PROFILER
> int64_t ti;
> #endif
> @@ -998,11 +1043,12 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
> #endif
>
> gen_code_size = tcg_gen_code(&tcg_ctx, gen_code_buf);
> + search_size = encode_search(tb, (void *)gen_code_buf + gen_code_size);
Now we're putting the encoded search info in the codegen buffer,
don't we need to adjust the calculation of code_gen_buffer_max_size
to avoid falling off the end if the last TB in the buffer has a very
large set of generated TCG code and also a big encoded search buffer?
It would also be nice to assert if we do fall off the end of the
buffer somehow.
>
> #ifdef CONFIG_PROFILER
> tcg_ctx.code_time += profile_getclock();
> tcg_ctx.code_in_len += tb->size;
> - tcg_ctx.code_out_len += gen_code_size;
> + tcg_ctx.code_out_len += gen_code_size + search_size;
> #endif
How much extra space does the encoded search typically take (as a
% of the gen_code_size, say)?
>
> #ifdef DEBUG_DISAS
> @@ -1014,8 +1060,9 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
> }
> #endif
>
> - tcg_ctx.code_gen_ptr = (void *)(((uintptr_t)gen_code_buf +
> - gen_code_size + CODE_GEN_ALIGN - 1) & ~(CODE_GEN_ALIGN - 1));
> + tcg_ctx.code_gen_ptr = (void *)
> + (((uintptr_t)gen_code_buf + gen_code_size + search_size
> + + CODE_GEN_ALIGN - 1) & -CODE_GEN_ALIGN);
If we're messing with this line anyway we might as well use ROUND_UP:
tcg_ctx.code_gen_ptr = (void *)
ROUND_UP((uintptr_t)gen_code_buf + gen_code_size + search_size,
CODE_GEN_ALIGN);
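To check that the suggested form is a pure refactoring, here is a minimal standalone sketch comparing the two expressions. The ROUND_UP definition below is a local stand-in in the usual power-of-two form, CODE_GEN_ALIGN is assumed to be 16 for illustration, and the helper names align_old/align_new are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for illustration; the alignment must be a power of two. */
#define ROUND_UP(n, d) (((n) + (d) - 1) & -(d))
#define CODE_GEN_ALIGN 16   /* assumed value, not taken from the patch */

/* The open-coded form: add (align - 1), then mask off the low bits. */
static uintptr_t align_old(uintptr_t buf, int code_size, int search_size)
{
    return (buf + code_size + search_size + CODE_GEN_ALIGN - 1)
           & ~(uintptr_t)(CODE_GEN_ALIGN - 1);
}

/* The same computation expressed through ROUND_UP. */
static uintptr_t align_new(uintptr_t buf, int code_size, int search_size)
{
    return ROUND_UP(buf + code_size + search_size,
                    (uintptr_t)CODE_GEN_ALIGN);
}
```

Both helpers round the end of the generated code plus search data up to the next multiple of the alignment, so the next TB starts on an aligned address either way.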
>
> /* check next page if needed */
> virt_page2 = (pc + tb->size - 1) & TARGET_PAGE_MASK;
> --
> 2.4.3
thanks
-- PMM
* Re: [Qemu-devel] [PATCH 18/20] tcg: Save insn data and use it in cpu_restore_state_from_tb
2015-09-10 13:49 ` Peter Maydell
@ 2015-09-11 10:29 ` Sergey Fedorov
2015-09-11 10:32 ` Peter Maydell
2015-09-15 20:08 ` Richard Henderson
1 sibling, 1 reply; 62+ messages in thread
From: Sergey Fedorov @ 2015-09-11 10:29 UTC (permalink / raw)
To: Peter Maydell, Richard Henderson
Cc: dl.soluz, QEMU Developers, Aurelien Jarno, Artyom Tarasenko
On 10.09.2015 16:49, Peter Maydell wrote:
>> @@ -2406,6 +2411,8 @@ static inline int tcg_gen_code_common(TCGContext *s,
>> > check_regs(s);
>> > #endif
>> > }
>> > + tcg_debug_assert(num_insns >= 0);
> This is claiming that every TB will have at least one insn_start,
> right? I think that most targets will violate that in the breakpoint
> case, because the "if we have a bp for this insn then generate a
> debug insn and break out of the loop" code is before the call
> to tcg_gen_insn_start().
>
> We should probably assert that num_insns < TCG_MAX_INSNS while
> we're here.
>
BTW, such skipping of instruction generation seems to be the cause of
getting a confusing "Disassembler disagrees with translator over
instruction" message in qemu log.
* Re: [Qemu-devel] [PATCH 18/20] tcg: Save insn data and use it in cpu_restore_state_from_tb
2015-09-11 10:29 ` Sergey Fedorov
@ 2015-09-11 10:32 ` Peter Maydell
2015-09-11 10:46 ` Sergey Fedorov
0 siblings, 1 reply; 62+ messages in thread
From: Peter Maydell @ 2015-09-11 10:32 UTC (permalink / raw)
To: Sergey Fedorov
Cc: dl.soluz, Artyom Tarasenko, QEMU Developers, Aurelien Jarno,
Richard Henderson
On 11 September 2015 at 11:29, Sergey Fedorov <serge.fdrv@gmail.com> wrote:
> On 10.09.2015 16:49, Peter Maydell wrote:
>>> @@ -2406,6 +2411,8 @@ static inline int tcg_gen_code_common(TCGContext *s,
>>> > check_regs(s);
>>> > #endif
>>> > }
>>> > + tcg_debug_assert(num_insns >= 0);
>> This is claiming that every TB will have at least one insn_start,
>> right? I think that most targets will violate that in the breakpoint
>> case, because the "if we have a bp for this insn then generate a
>> debug insn and break out of the loop" code is before the call
>> to tcg_gen_insn_start().
>>
>> We should probably assert that num_insns < TCG_MAX_INSNS while
>> we're here.
>>
>
> BTW, such skipping of instruction generation seems to be the cause of
> getting a confusing "Disassembler disagrees with translator over
> instruction" message in qemu log.
...I'd been meaning to try to track down what was provoking that :-)
thanks
-- PMM
* Re: [Qemu-devel] [PATCH 18/20] tcg: Save insn data and use it in cpu_restore_state_from_tb
2015-09-11 10:32 ` Peter Maydell
@ 2015-09-11 10:46 ` Sergey Fedorov
0 siblings, 0 replies; 62+ messages in thread
From: Sergey Fedorov @ 2015-09-11 10:46 UTC (permalink / raw)
To: Peter Maydell
Cc: dl.soluz, Artyom Tarasenko, QEMU Developers, Aurelien Jarno,
Richard Henderson
On 11.09.2015 13:32, Peter Maydell wrote:
> On 11 September 2015 at 11:29, Sergey Fedorov <serge.fdrv@gmail.com> wrote:
>> On 10.09.2015 16:49, Peter Maydell wrote:
>>>> @@ -2406,6 +2411,8 @@ static inline int tcg_gen_code_common(TCGContext *s,
>>>>> check_regs(s);
>>>>> #endif
>>>>> }
>>>>> + tcg_debug_assert(num_insns >= 0);
>>> This is claiming that every TB will have at least one insn_start,
>>> right? I think that most targets will violate that in the breakpoint
>>> case, because the "if we have a bp for this insn then generate a
>>> debug insn and break out of the loop" code is before the call
>>> to tcg_gen_insn_start().
>>>
>>> We should probably assert that num_insns < TCG_MAX_INSNS while
>>> we're here.
>>>
>> BTW, such skipping of instruction generation seems to be the cause of
>> getting a confusing "Disassembler disagrees with translator over
>> instruction" message in qemu log.
> ...I'd been meaning to try to track down what was provoking that :-)
It seems to have been a wrong pc increment when handling a fired breakpoint.
Best,
Sergey
* Re: [Qemu-devel] [PATCH 18/20] tcg: Save insn data and use it in cpu_restore_state_from_tb
2015-09-10 13:49 ` Peter Maydell
2015-09-11 10:29 ` Sergey Fedorov
@ 2015-09-15 20:08 ` Richard Henderson
1 sibling, 0 replies; 62+ messages in thread
From: Richard Henderson @ 2015-09-15 20:08 UTC (permalink / raw)
To: Peter Maydell; +Cc: dl.soluz, QEMU Developers, Aurelien Jarno, Artyom Tarasenko
On 09/10/2015 06:49 AM, Peter Maydell wrote:
>> + tcg_debug_assert(num_insns >= 0);
>
> This is claiming that every TB will have at least one insn_start,
> right? I think that most targets will violate that in the breakpoint
> case, because the "if we have a bp for this insn then generate a
> debug insn and break out of the loop" code is before the call
> to tcg_gen_insn_start().
>
> We should probably assert that num_insns < TCG_MAX_INSNS while
> we're here.
True. I wonder if we shouldn't fix bp placement while I'm at it. And the
assertion should really be num_insns == tb->icount.
>> +static target_long decode_sleb128(uint8_t **pp)
>> +{
>> + uint8_t *p = *pp;
>> + target_long val = 0;
>> + int byte, shift = 0;
>> +
>> + do {
>> + byte = *p++;
>> + val |= (target_ulong)(byte & 0x7f) << shift;
>> + shift += 7;
>> + } while (byte & 0x80);
>> + if (shift < TARGET_LONG_BITS && (byte & 0x40)) {
>> + val |= -(target_ulong)1 << shift;
>> + }
>> +
>> + *pp = p;
>> + return val;
>> +}
>
> Are the encode/decode sleb128 functions known-good ones
> borrowed from somewhere else?
Yes, from libgcc.
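For anyone who wants to exercise the pair in isolation, the following is a minimal standalone sketch of the round trip. It assumes a 64-bit target (the typedefs are stand-ins, not QEMU's headers), accumulates the decoded value in an unsigned variable rather than the signed one used in the patch, and adds a hypothetical helper, sleb128_round_trip, that is not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for QEMU's target types; a 64-bit target is assumed. */
typedef int64_t target_long;
typedef uint64_t target_ulong;
#define TARGET_LONG_BITS 64

/* Write val as signed LEB128 at p; return the byte after the encoding. */
static uint8_t *encode_sleb128(uint8_t *p, target_long val)
{
    int more, byte;

    do {
        byte = val & 0x7f;
        val >>= 7;      /* relies on arithmetic right shift of negatives */
        more = !((val == 0 && (byte & 0x40) == 0)
                 || (val == -1 && (byte & 0x40) != 0));
        if (more) {
            byte |= 0x80;
        }
        *p++ = byte;
    } while (more);

    return p;
}

/* Read one signed LEB128 value at *pp and advance *pp past it. */
static target_long decode_sleb128(uint8_t **pp)
{
    uint8_t *p = *pp;
    target_ulong val = 0;
    int byte, shift = 0;

    do {
        byte = *p++;
        val |= (target_ulong)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    /* Sign-extend if the last byte carried the sign bit. */
    if (shift < TARGET_LONG_BITS && (byte & 0x40)) {
        val |= -(target_ulong)1 << shift;
    }

    *pp = p;
    return (target_long)val;
}

/* Hypothetical helper: encode then decode v; return the encoded
   length in bytes, or -1 if the value did not survive. */
static int sleb128_round_trip(target_long v)
{
    uint8_t buf[16], *end = encode_sleb128(buf, v), *p = buf;

    return (decode_sleb128(&p) == v && p == end) ? (int)(end - buf) : -1;
}
```

Small deltas in either direction take a single byte, which is what makes the format attractive for per-insn differences; a full 64-bit value needs at most ten bytes.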
> (PS: checkpatch complains about missing braces.)
Ho hum...
>> +static int encode_search(TranslationBlock *tb, uint8_t *block)
>> +{
>
> I think this function would benefit from a brief comment
> describing the compressed format we're creating here.
Yes.
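As a rough illustration of the format in question: each insn contributes one row holding TARGET_INSN_START_WORDS data words followed by the end offset of its host code, and every field is stored as the sleb128 delta from the same field in the previous row (the first row is relative to zero). The sketch below is a standalone model of that layout, not the patch code; INSN_WORDS, encode_rows and search_rows are hypothetical names, a 64-bit target is assumed, and offsets are relative to the start of the TB rather than absolute host addresses.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t target_long;    /* stand-ins; 64-bit target assumed */
typedef uint64_t target_ulong;
#define INSN_WORDS 2            /* plays the role of TARGET_INSN_START_WORDS */

static uint8_t *put_sleb128(uint8_t *p, target_long val)
{
    int more, byte;
    do {
        byte = val & 0x7f;
        val >>= 7;
        more = !((val == 0 && !(byte & 0x40)) || (val == -1 && (byte & 0x40)));
        *p++ = byte | (more ? 0x80 : 0);
    } while (more);
    return p;
}

static target_long get_sleb128(uint8_t **pp)
{
    target_ulong val = 0;
    int byte, shift = 0;
    do {
        byte = *(*pp)++;
        val |= (target_ulong)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    if (shift < 64 && (byte & 0x40)) {
        val |= -(target_ulong)1 << shift;
    }
    return (target_long)val;
}

/* Emit n rows of deltas into block; return the number of bytes used. */
static int encode_rows(uint8_t *block, int n,
                       target_ulong data[][INSN_WORDS],
                       const uint16_t *end_off)
{
    uint8_t *p = block;

    for (int i = 0; i < n; i++) {
        for (int j = 0; j < INSN_WORDS; j++) {
            target_ulong prev = i ? data[i - 1][j] : 0;
            p = put_sleb128(p, (target_long)(data[i][j] - prev));
        }
        p = put_sleb128(p, end_off[i] - (i ? end_off[i - 1] : 0));
    }
    return (int)(p - block);
}

/* Walk the rows, summing deltas, until an insn's end offset exceeds off.
   Returns that insn's index with its data words in data[], or -1. */
static int search_rows(uint8_t *block, int n, target_ulong off,
                       target_ulong data[INSN_WORDS])
{
    uint8_t *p = block;
    target_ulong end = 0;

    for (int j = 0; j < INSN_WORDS; j++) {
        data[j] = 0;
    }
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < INSN_WORDS; j++) {
            data[j] += (target_ulong)get_sleb128(&p);
        }
        end += (target_ulong)get_sleb128(&p);
        if (end > off) {
            return i;
        }
    }
    return -1;
}
```

With three rows of two data words each, the raw arrays would occupy 3 * (2 * 8 + 2) = 54 bytes, while sequential PCs and short host sequences typically compress to a few bytes per row. The price is that lookup is a linear walk from the start of the table, which seems acceptable given that state restoration is the slow path.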
>> gen_code_size = tcg_gen_code(&tcg_ctx, gen_code_buf);
>> + search_size = encode_search(tb, (void *)gen_code_buf + gen_code_size);
>
> Now we're putting the encoded search info in the codegen buffer,
> don't we need to adjust the calculation of code_gen_buffer_max_size
> to avoid falling off the end if the last TB in the buffer has a very
> large set of generated TCG code and also a big encoded search buffer?
Dunno. It's not as if we've ever checked for this before; I'm not sure what
factor I would actually apply.
> It would also be nice to assert if we do fall off the end of the
> buffer somehow.
Given that we generally use a very large mmap to allocate it, perhaps simply
adding a guard page would be best.
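A guard page of the kind mentioned here could look roughly like the sketch below. It is a generic POSIX illustration, not QEMU's allocator: alloc_with_guard is a hypothetical name, the mapping is plain read/write (a real code buffer would also need execute permission), and MAP_ANONYMOUS is assumed to be available.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on strict glibc builds */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map at least 'size' usable bytes followed by one inaccessible page.
   On success, store the total mapping length in *mapped and return the
   base address; return NULL on failure. */
static void *alloc_with_guard(size_t size, size_t *mapped)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t usable = (size + page - 1) & ~(page - 1);
    size_t total = usable + page;
    void *buf = mmap(NULL, total, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (buf == MAP_FAILED) {
        return NULL;
    }
    /* Revoke all access to the final page so an overrun faults. */
    if (mprotect((char *)buf + usable, page, PROT_NONE) != 0) {
        munmap(buf, total);
        return NULL;
    }
    *mapped = total;
    return buf;
}
```

A write that runs past the usable area then faults immediately instead of silently corrupting whatever happens to be mapped next, at the cost of one extra page of address space.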
> How much extra space does the encoded search typically take (as a
> % of the gen_code_size, say)?
Dunno; I'll have to have a look at that. Probably easiest to just enhance info
jit...
r~
* [Qemu-devel] [PATCH 19/20] tcg: Remove gen_intermediate_code_pc
2015-09-02 5:51 [Qemu-devel] [RFC 00/20] Do away with TB retranslation Richard Henderson
` (16 preceding siblings ...)
2015-09-02 5:52 ` [Qemu-devel] [PATCH 18/20] tcg: Save insn data and use it in cpu_restore_state_from_tb Richard Henderson
@ 2015-09-02 5:52 ` Richard Henderson
2015-09-08 18:49 ` Peter Maydell
2015-09-02 5:52 ` [Qemu-devel] [PATCH 20/20] tcg: Remove tcg_gen_code_search_pc Richard Henderson
` (4 subsequent siblings)
22 siblings, 1 reply; 62+ messages in thread
From: Richard Henderson @ 2015-09-02 5:52 UTC (permalink / raw)
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
It's no longer used, so tidy up everything reached by it.
This includes the gen_opc_* arrays, the search_pc parameter
and the inline gen_intermediate_code_internal functions.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
include/exec/exec-all.h | 1 -
target-alpha/translate.c | 41 ++++----------------------------
target-arm/translate-a64.c | 31 +++----------------------
target-arm/translate.c | 54 ++++++++-----------------------------------
target-arm/translate.h | 8 ++-----
target-cris/translate.c | 50 +++++----------------------------------
target-i386/translate.c | 49 ++++-----------------------------------
target-lm32/translate.c | 42 ++++-----------------------------
target-m68k/translate.c | 43 ++++------------------------------
target-microblaze/translate.c | 40 ++++----------------------------
target-mips/translate.c | 49 ++++-----------------------------------
target-moxie/translate.c | 41 ++++----------------------------
target-openrisc/translate.c | 42 ++++-----------------------------
target-ppc/translate.c | 40 ++++----------------------------
target-s390x/translate.c | 44 ++++-------------------------------
target-sh4/translate.c | 43 ++++------------------------------
target-sparc/translate.c | 47 ++++---------------------------------
target-tricore/translate.c | 31 ++++---------------------
target-unicore32/translate.c | 44 ++++-------------------------------
target-xtensa/translate.c | 39 ++++---------------------------
tcg/tcg.h | 4 ----
21 files changed, 86 insertions(+), 697 deletions(-)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 315f20a..c27a99a 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -73,7 +73,6 @@ typedef struct TranslationBlock TranslationBlock;
#include "qemu/log.h"
void gen_intermediate_code(CPUArchState *env, struct TranslationBlock *tb);
-void gen_intermediate_code_pc(CPUArchState *env, struct TranslationBlock *tb);
void restore_state_to_opc(CPUArchState *env, struct TranslationBlock *tb,
target_ulong *data);
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 27c9942..6ff3d08 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -2853,18 +2853,15 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
return ret;
}
-static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
- TranslationBlock *tb,
- bool search_pc)
+void gen_intermediate_code(CPUAlphaState *env, struct TranslationBlock *tb)
{
+ AlphaCPU *cpu = alpha_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUAlphaState *env = &cpu->env;
DisasContext ctx, *ctxp = &ctx;
target_ulong pc_start;
target_ulong pc_mask;
uint32_t insn;
CPUBreakpoint *bp;
- int j, lj = -1;
ExitStatus ret;
int num_insns;
int max_insns;
@@ -2919,18 +2916,6 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
}
}
}
- if (search_pc) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- }
- tcg_ctx.gen_opc_pc[lj] = ctx.pc;
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
tcg_gen_insn_start(ctx.pc);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
@@ -2993,16 +2978,8 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
gen_tb_end(tb, num_insns);
- if (search_pc) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- } else {
- tb->size = ctx.pc - pc_start;
- tb->icount = num_insns;
- }
+ tb->size = ctx.pc - pc_start;
+ tb->icount = num_insns;
#ifdef DEBUG_DISAS
if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -3013,16 +2990,6 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
#endif
}
-void gen_intermediate_code (CPUAlphaState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(alpha_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc (CPUAlphaState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(alpha_env_get_cpu(env), tb, true);
-}
-
void restore_state_to_opc(CPUAlphaState *env, TranslationBlock *tb,
target_ulong *data)
{
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 10173a4..c277ab9 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -10922,15 +10922,12 @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
free_tmp_a64(s);
}
-void gen_intermediate_code_internal_a64(ARMCPU *cpu,
- TranslationBlock *tb,
- bool search_pc)
+void gen_intermediate_code_a64(ARMCPU *cpu, TranslationBlock *tb)
{
CPUState *cs = CPU(cpu);
CPUARMState *env = &cpu->env;
DisasContext dc1, *dc = &dc1;
CPUBreakpoint *bp;
- int j, lj;
target_ulong pc_start;
target_ulong next_page_start;
int num_insns;
@@ -10985,7 +10982,6 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
init_tmp_a64_array(dc);
next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
- lj = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
if (max_insns == 0) {
@@ -11011,19 +11007,6 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
}
}
}
-
- if (search_pc) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- }
- tcg_ctx.gen_opc_pc[lj] = dc->pc;
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
tcg_gen_insn_start(dc->pc, 0);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
@@ -11139,14 +11122,6 @@ done_generating:
qemu_log("\n");
}
#endif
- if (search_pc) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- } else {
- tb->size = dc->pc - pc_start;
- tb->icount = num_insns;
- }
+ tb->size = dc->pc - pc_start;
+ tb->icount = num_insns;
}
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 2940d07..70e0187 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -52,7 +52,6 @@
#define ARCH(x) do { if (!ENABLE_ARCH_##x) goto illegal_op; } while(0)
#include "translate.h"
-static uint32_t gen_opc_condexec_bits[OPC_BUF_SIZE];
#if defined(CONFIG_USER_ONLY)
#define IS_USER(s) 1
@@ -11136,17 +11135,13 @@ undef:
}
/* generate intermediate code in gen_opc_buf and gen_opparam_buf for
- basic block 'tb'. If search_pc is TRUE, also generate PC
- information for each intermediate instruction. */
-static inline void gen_intermediate_code_internal(ARMCPU *cpu,
- TranslationBlock *tb,
- bool search_pc)
+ basic block 'tb'. */
+void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb)
{
+ ARMCPU *cpu = arm_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUARMState *env = &cpu->env;
DisasContext dc1, *dc = &dc1;
CPUBreakpoint *bp;
- int j, lj;
target_ulong pc_start;
target_ulong next_page_start;
int num_insns;
@@ -11158,7 +11153,7 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
* the A32/T32 complexity to do with conditional execution/IT blocks/etc.
*/
if (ARM_TBFLAG_AARCH64_STATE(tb->flags)) {
- gen_intermediate_code_internal_a64(cpu, tb, search_pc);
+ gen_intermediate_code_a64(cpu, tb);
return;
}
@@ -11220,7 +11215,6 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
/* FIXME: cpu_M0 can probably be the same as cpu_V0. */
cpu_M0 = tcg_temp_new_i64();
next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
- lj = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
if (max_insns == 0) {
@@ -11254,10 +11248,9 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
* (3) if we leave the TB unexpectedly (eg a data abort on a load)
* then the CPUARMState will be wrong and we need to reset it.
* This is handled in the same way as restoration of the
- * PC in these situations: we will be called again with search_pc=1
- * and generate a mapping of the condexec bits for each PC in
- * gen_opc_condexec_bits[]. restore_state_to_opc() then uses
- * this to restore the condexec bits.
+ * PC in these situations; we save the value of the condexec bits
+ * for each PC via tcg_gen_insn_start(), and restore_state_to_opc()
+ * then uses this to restore them after an exception.
*
* Note that there are no instructions which can read the condexec
* bits, and none which can write non-static values to them, so
@@ -11304,18 +11297,6 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
}
}
}
- if (search_pc) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- tcg_ctx.gen_opc_pc[lj] = dc->pc;
- gen_opc_condexec_bits[lj] = (dc->condexec_cond << 4) | (dc->condexec_mask >> 1);
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
tcg_gen_insn_start(dc->pc,
(dc->condexec_cond << 4) | (dc->condexec_mask >> 1));
@@ -11499,25 +11480,8 @@ done_generating:
qemu_log("\n");
}
#endif
- if (search_pc) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- } else {
- tb->size = dc->pc - pc_start;
- tb->icount = num_insns;
- }
-}
-
-void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb)
-{
- gen_intermediate_code_internal(arm_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUARMState *env, TranslationBlock *tb)
-{
- gen_intermediate_code_internal(arm_env_get_cpu(env), tb, true);
+ tb->size = dc->pc - pc_start;
+ tb->icount = num_insns;
}
static const char *cpu_mode_names[16] = {
diff --git a/target-arm/translate.h b/target-arm/translate.h
index 9ab978f..628f4b0 100644
--- a/target-arm/translate.h
+++ b/target-arm/translate.h
@@ -107,9 +107,7 @@ static inline int default_exception_el(DisasContext *s)
#ifdef TARGET_AARCH64
void a64_translate_init(void);
-void gen_intermediate_code_internal_a64(ARMCPU *cpu,
- TranslationBlock *tb,
- bool search_pc);
+void gen_intermediate_code_a64(ARMCPU *cpu, TranslationBlock *tb);
void gen_a64_set_pc_im(uint64_t val);
void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
fprintf_function cpu_fprintf, int flags);
@@ -118,9 +116,7 @@ static inline void a64_translate_init(void)
{
}
-static inline void gen_intermediate_code_internal_a64(ARMCPU *cpu,
- TranslationBlock *tb,
- bool search_pc)
+static inline void gen_intermediate_code_a64(ARMCPU *cpu, TranslationBlock *tb)
{
}
diff --git a/target-cris/translate.c b/target-cris/translate.c
index ce2d3a0..f7bed9b 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3097,15 +3097,12 @@ static void check_breakpoint(CPUCRISState *env, DisasContext *dc)
*/
/* generate intermediate code for basic block 'tb'. */
-static inline void
-gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
- bool search_pc)
+void gen_intermediate_code(CPUCRISState *env, struct TranslationBlock *tb)
{
+ CRISCPU *cpu = cris_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUCRISState *env = &cpu->env;
uint32_t pc_start;
unsigned int insn_len;
- int j, lj;
struct DisasContext ctx;
struct DisasContext *dc = &ctx;
uint32_t next_page_start;
@@ -3157,13 +3154,13 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
qemu_log(
- "srch=%d pc=%x %x flg=%" PRIx64 " bt=%x ds=%u ccs=%x\n"
+ "pc=%x %x flg=%" PRIx64 " bt=%x ds=%u ccs=%x\n"
"pid=%x usp=%x\n"
"%x.%x.%x.%x\n"
"%x.%x.%x.%x\n"
"%x.%x.%x.%x\n"
"%x.%x.%x.%x\n",
- search_pc, dc->pc, dc->ppc,
+ dc->pc, dc->ppc,
(uint64_t)tb->flags,
env->btarget, (unsigned)tb->flags & 7,
env->pregs[PR_CCS],
@@ -3179,7 +3176,6 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
}
next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
- lj = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
if (max_insns == 0) {
@@ -3193,22 +3189,6 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
do {
check_breakpoint(env, dc);
- if (search_pc) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- }
- if (dc->delayed_branch == 1) {
- tcg_ctx.gen_opc_pc[lj] = dc->ppc | 1;
- } else {
- tcg_ctx.gen_opc_pc[lj] = dc->pc;
- }
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
tcg_gen_insn_start(dc->delayed_branch == 1
? dc->ppc | 1 : dc->pc);
@@ -3332,16 +3312,8 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
}
gen_tb_end(tb, num_insns);
- if (search_pc) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- } else {
- tb->size = dc->pc - pc_start;
- tb->icount = num_insns;
- }
+ tb->size = dc->pc - pc_start;
+ tb->icount = num_insns;
#ifdef DEBUG_DISAS
#if !DISAS_CRIS
@@ -3355,16 +3327,6 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
#endif
}
-void gen_intermediate_code (CPUCRISState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(cris_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc (CPUCRISState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(cris_env_get_cpu(env), tb, true);
-}
-
void cris_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
int flags)
{
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 49944df..d0586f0 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -76,8 +76,6 @@ static TCGv_ptr cpu_ptr0, cpu_ptr1;
static TCGv_i32 cpu_tmp2_i32, cpu_tmp3_i32;
static TCGv_i64 cpu_tmp1_i64;
-static uint8_t gen_opc_cc_op[OPC_BUF_SIZE];
-
#include "exec/gen-icount.h"
#ifdef TARGET_X86_64
@@ -7899,18 +7897,14 @@ void optimize_flags_init(void)
}
/* generate intermediate code in gen_opc_buf and gen_opparam_buf for
- basic block 'tb'. If search_pc is TRUE, also generate PC
- information for each intermediate instruction. */
-static inline void gen_intermediate_code_internal(X86CPU *cpu,
- TranslationBlock *tb,
- bool search_pc)
+ basic block 'tb'. */
+void gen_intermediate_code(CPUX86State *env, TranslationBlock *tb)
{
+ X86CPU *cpu = x86_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUX86State *env = &cpu->env;
DisasContext dc1, *dc = &dc1;
target_ulong pc_ptr;
CPUBreakpoint *bp;
- int j, lj;
uint64_t flags;
target_ulong pc_start;
target_ulong cs_base;
@@ -7990,7 +7984,6 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
dc->is_jmp = DISAS_NEXT;
pc_ptr = pc_start;
- lj = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
if (max_insns == 0) {
@@ -8011,18 +8004,6 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
}
}
}
- if (search_pc) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- tcg_ctx.gen_opc_pc[lj] = pc_ptr;
- gen_opc_cc_op[lj] = dc->cc_op;
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
tcg_gen_insn_start(pc_ptr, dc->cc_op);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
@@ -8077,14 +8058,6 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
done_generating:
gen_tb_end(tb, num_insns);
- /* we don't forget to fill the last values */
- if (search_pc) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
-
#ifdef DEBUG_DISAS
if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
int disas_flags;
@@ -8101,20 +8074,8 @@ done_generating:
}
#endif
- if (!search_pc) {
- tb->size = pc_ptr - pc_start;
- tb->icount = num_insns;
- }
-}
-
-void gen_intermediate_code(CPUX86State *env, TranslationBlock *tb)
-{
- gen_intermediate_code_internal(x86_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUX86State *env, TranslationBlock *tb)
-{
- gen_intermediate_code_internal(x86_env_get_cpu(env), tb, true);
+ tb->size = pc_ptr - pc_start;
+ tb->icount = num_insns;
}
void restore_state_to_opc(CPUX86State *env, TranslationBlock *tb,
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index d0aaea2..bda1f5c 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1049,15 +1049,12 @@ static void check_breakpoint(CPULM32State *env, DisasContext *dc)
}
/* generate intermediate code for basic block 'tb'. */
-static inline
-void gen_intermediate_code_internal(LM32CPU *cpu,
- TranslationBlock *tb, bool search_pc)
+void gen_intermediate_code(CPULM32State *env, struct TranslationBlock *tb)
{
+ LM32CPU *cpu = lm32_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPULM32State *env = &cpu->env;
struct DisasContext ctx, *dc = &ctx;
uint32_t pc_start;
- int j, lj;
uint32_t next_page_start;
int num_insns;
int max_insns;
@@ -1079,7 +1076,6 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
}
next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
- lj = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
if (max_insns == 0) {
@@ -1093,18 +1089,6 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
do {
check_breakpoint(env, dc);
- if (search_pc) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- }
- tcg_ctx.gen_opc_pc[lj] = dc->pc;
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
tcg_gen_insn_start(dc->pc);
/* Pretty disas. */
@@ -1154,16 +1138,8 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
gen_tb_end(tb, num_insns);
- if (search_pc) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- } else {
- tb->size = dc->pc - pc_start;
- tb->icount = num_insns;
- }
+ tb->size = dc->pc - pc_start;
+ tb->icount = num_insns;
#ifdef DEBUG_DISAS
if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -1175,16 +1151,6 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
#endif
}
-void gen_intermediate_code(CPULM32State *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(lm32_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPULM32State *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(lm32_env_get_cpu(env), tb, true);
-}
-
void lm32_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
int flags)
{
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 2fc9b68..8b2750f 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2962,15 +2962,12 @@ static void disas_m68k_insn(CPUM68KState * env, DisasContext *s)
}
/* generate intermediate code for basic block 'tb'. */
-static inline void
-gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
- bool search_pc)
+void gen_intermediate_code(CPUM68KState *env, TranslationBlock *tb)
{
+ M68kCPU *cpu = m68k_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUM68KState *env = &cpu->env;
DisasContext dc1, *dc = &dc1;
CPUBreakpoint *bp;
- int j, lj;
target_ulong pc_start;
int pc_offset;
int num_insns;
@@ -2989,7 +2986,6 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
dc->fpcr = env->fpcr;
dc->user = (env->sr & SR_S) == 0;
dc->done_mac = 0;
- lj = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
if (max_insns == 0) {
@@ -3014,17 +3010,6 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
if (dc->is_jmp)
break;
}
- if (search_pc) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- tcg_ctx.gen_opc_pc[lj] = dc->pc;
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
tcg_gen_insn_start(dc->pc);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
@@ -3077,28 +3062,8 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
qemu_log("\n");
}
#endif
- if (search_pc) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- } else {
- tb->size = dc->pc - pc_start;
- tb->icount = num_insns;
- }
-
- //optimize_flags();
- //expand_target_qops();
-}
-
-void gen_intermediate_code(CPUM68KState *env, TranslationBlock *tb)
-{
- gen_intermediate_code_internal(m68k_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUM68KState *env, TranslationBlock *tb)
-{
- gen_intermediate_code_internal(m68k_env_get_cpu(env), tb, true);
+ tb->size = dc->pc - pc_start;
+ tb->icount = num_insns;
}
void m68k_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index 57c79a6..3f88074 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1657,14 +1657,11 @@ static void check_breakpoint(CPUMBState *env, DisasContext *dc)
}
/* generate intermediate code for basic block 'tb'. */
-static inline void
-gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
- bool search_pc)
+void gen_intermediate_code(CPUMBState *env, struct TranslationBlock *tb)
{
+ MicroBlazeCPU *cpu = mb_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUMBState *env = &cpu->env;
uint32_t pc_start;
- int j, lj;
struct DisasContext ctx;
struct DisasContext *dc = &ctx;
uint32_t next_page_start, org_flags;
@@ -1701,7 +1698,6 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
}
next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
- lj = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
if (max_insns == 0) {
@@ -1722,17 +1718,6 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
#endif
check_breakpoint(env, dc);
- if (search_pc) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- tcg_ctx.gen_opc_pc[lj] = dc->pc;
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
tcg_gen_insn_start(dc->pc);
/* Pretty disas. */
@@ -1837,15 +1822,8 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
}
gen_tb_end(tb, num_insns);
- if (search_pc) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- } else {
- tb->size = dc->pc - pc_start;
- tb->icount = num_insns;
- }
+ tb->size = dc->pc - pc_start;
+ tb->icount = num_insns;
#ifdef DEBUG_DISAS
#if !SIM_COMPAT
@@ -1862,16 +1840,6 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
assert(!dc->abort_at_next_insn);
}
-void gen_intermediate_code (CPUMBState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(mb_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc (CPUMBState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(mb_env_get_cpu(env), tb, true);
-}
-
void mb_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
int flags)
{
diff --git a/target-mips/translate.c b/target-mips/translate.c
index ad9e5d2..219aab2 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -1361,9 +1361,6 @@ static TCGv_i32 fpu_fcr0, fpu_fcr31;
static TCGv_i64 fpu_f64[32];
static TCGv_i64 msa_wr_d[64];
-static uint32_t gen_opc_hflags[OPC_BUF_SIZE];
-static target_ulong gen_opc_btarget[OPC_BUF_SIZE];
-
#include "exec/gen-icount.h"
#define gen_helper_0e0i(name, arg) do { \
@@ -20147,25 +20144,19 @@ static void decode_opc(CPUMIPSState *env, DisasContext *ctx)
}
}
-static inline void
-gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
- bool search_pc)
+void gen_intermediate_code(CPUMIPSState *env, struct TranslationBlock *tb)
{
+ MIPSCPU *cpu = mips_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUMIPSState *env = &cpu->env;
DisasContext ctx;
target_ulong pc_start;
target_ulong next_page_start;
CPUBreakpoint *bp;
- int j, lj = -1;
int num_insns;
int max_insns;
int insn_bytes;
int is_slot;
- if (search_pc)
- qemu_log("search pc %d\n", search_pc);
-
pc_start = tb->pc;
next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
ctx.pc = pc_start;
@@ -20222,20 +20213,6 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
}
}
}
-
- if (search_pc) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- tcg_ctx.gen_opc_pc[lj] = ctx.pc;
- gen_opc_hflags[lj] = ctx.hflags & MIPS_HFLAG_BMASK;
- gen_opc_btarget[lj] = ctx.btarget;
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
tcg_gen_insn_start(ctx.pc, ctx.hflags & MIPS_HFLAG_BMASK, ctx.btarget);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
@@ -20328,15 +20305,9 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
done_generating:
gen_tb_end(tb, num_insns);
- if (search_pc) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- } else {
- tb->size = ctx.pc - pc_start;
- tb->icount = num_insns;
- }
+ tb->size = ctx.pc - pc_start;
+ tb->icount = num_insns;
+
#ifdef DEBUG_DISAS
LOG_DISAS("\n");
if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -20347,16 +20318,6 @@ done_generating:
#endif
}
-void gen_intermediate_code (CPUMIPSState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(mips_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc (CPUMIPSState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(mips_env_get_cpu(env), tb, true);
-}
-
static void fpu_dump_state(CPUMIPSState *env, FILE *f, fprintf_function fpu_fprintf,
int flags)
{
diff --git a/target-moxie/translate.c b/target-moxie/translate.c
index 9fa8a43..448200f 100644
--- a/target-moxie/translate.c
+++ b/target-moxie/translate.c
@@ -816,16 +816,13 @@ static int decode_opc(MoxieCPU *cpu, DisasContext *ctx)
}
/* generate intermediate code for basic block 'tb'. */
-static inline void
-gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
- bool search_pc)
+void gen_intermediate_code(CPUMoxieState *env, struct TranslationBlock *tb)
{
+ MoxieCPU *cpu = moxie_env_get_cpu(env);
CPUState *cs = CPU(cpu);
DisasContext ctx;
target_ulong pc_start;
CPUBreakpoint *bp;
- int j, lj = -1;
- CPUMoxieState *env = &cpu->env;
int num_insns, max_insns;
pc_start = tb->pc;
@@ -857,18 +854,6 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
}
}
- if (search_pc) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- }
- tcg_ctx.gen_opc_pc[lj] = ctx.pc;
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
tcg_gen_insn_start(ctx.pc);
ctx.opcode = cpu_lduw_code(env, ctx.pc);
@@ -906,26 +891,8 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
done_generating:
gen_tb_end(tb, num_insns);
- if (search_pc) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- } else {
- tb->size = ctx.pc - pc_start;
- tb->icount = num_insns;
- }
-}
-
-void gen_intermediate_code(CPUMoxieState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(moxie_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUMoxieState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(moxie_env_get_cpu(env), tb, true);
+ tb->size = ctx.pc - pc_start;
+ tb->icount = num_insns;
}
void restore_state_to_opc(CPUMoxieState *env, TranslationBlock *tb,
diff --git a/target-openrisc/translate.c b/target-openrisc/translate.c
index 78c157b..60a57b4 100644
--- a/target-openrisc/translate.c
+++ b/target-openrisc/translate.c
@@ -1634,14 +1634,12 @@ static void check_breakpoint(OpenRISCCPU *cpu, DisasContext *dc)
}
}
-static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
- TranslationBlock *tb,
- int search_pc)
+void gen_intermediate_code(CPUOpenRISCState *env, struct TranslationBlock *tb)
{
+ OpenRISCCPU *cpu = openrisc_env_get_cpu(env);
CPUState *cs = CPU(cpu);
struct DisasContext ctx, *dc = &ctx;
uint32_t pc_start;
- int j, k;
uint32_t next_page_start;
int num_insns;
int max_insns;
@@ -1663,7 +1661,6 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
}
next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
- k = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
@@ -1678,18 +1675,6 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
do {
check_breakpoint(cpu, dc);
- if (search_pc) {
- j = tcg_op_buf_count();
- if (k < j) {
- k++;
- while (k < j) {
- tcg_ctx.gen_opc_instr_start[k++] = 0;
- }
- }
- tcg_ctx.gen_opc_pc[k] = dc->pc;
- tcg_ctx.gen_opc_instr_start[k] = 1;
- tcg_ctx.gen_opc_icount[k] = num_insns;
- }
tcg_gen_insn_start(dc->pc);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
@@ -1756,16 +1741,8 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
gen_tb_end(tb, num_insns);
- if (search_pc) {
- j = tcg_op_buf_count();
- k++;
- while (k <= j) {
- tcg_ctx.gen_opc_instr_start[k++] = 0;
- }
- } else {
- tb->size = dc->pc - pc_start;
- tb->icount = num_insns;
- }
+ tb->size = dc->pc - pc_start;
+ tb->icount = num_insns;
#ifdef DEBUG_DISAS
if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -1777,17 +1754,6 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
#endif
}
-void gen_intermediate_code(CPUOpenRISCState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(openrisc_env_get_cpu(env), tb, 0);
-}
-
-void gen_intermediate_code_pc(CPUOpenRISCState *env,
- struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(openrisc_env_get_cpu(env), tb, 1);
-}
-
void openrisc_cpu_dump_state(CPUState *cs, FILE *f,
fprintf_function cpu_fprintf,
int flags)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 1cfc1ea..a1b006f 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11402,17 +11402,14 @@ void ppc_cpu_dump_statistics(CPUState *cs, FILE*f,
}
/*****************************************************************************/
-static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
- TranslationBlock *tb,
- bool search_pc)
+void gen_intermediate_code(CPUPPCState *env, struct TranslationBlock *tb)
{
+ PowerPCCPU *cpu = ppc_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUPPCState *env = &cpu->env;
DisasContext ctx, *ctxp = &ctx;
opc_handler_t **table, *handler;
target_ulong pc_start;
CPUBreakpoint *bp;
- int j, lj = -1;
int num_insns;
int max_insns;
@@ -11488,17 +11485,6 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
}
}
}
- if (unlikely(search_pc)) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- tcg_ctx.gen_opc_pc[lj] = ctx.nip;
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
tcg_gen_insn_start(ctx.nip);
LOG_DISAS("----------------\n");
@@ -11595,15 +11581,9 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
}
gen_tb_end(tb, num_insns);
- if (unlikely(search_pc)) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- } else {
- tb->size = ctx.nip - pc_start;
- tb->icount = num_insns;
- }
+ tb->size = ctx.nip - pc_start;
+ tb->icount = num_insns;
+
#if defined(DEBUG_DISAS)
if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
int flags;
@@ -11616,16 +11596,6 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
#endif
}
-void gen_intermediate_code (CPUPPCState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(ppc_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc (CPUPPCState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(ppc_env_get_cpu(env), tb, true);
-}
-
void restore_state_to_opc(CPUPPCState *env, TranslationBlock *tb,
target_ulong *data)
{
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index 047685c..90a09c2 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -161,8 +161,6 @@ static char cpu_reg_names[32][4];
static TCGv_i64 regs[16];
static TCGv_i64 fregs[16];
-static uint8_t gen_opc_cc_op[OPC_BUF_SIZE];
-
void s390x_translate_init(void)
{
int i;
@@ -5319,16 +5317,13 @@ static ExitStatus translate_one(CPUS390XState *env, DisasContext *s)
return ret;
}
-static inline void gen_intermediate_code_internal(S390CPU *cpu,
- TranslationBlock *tb,
- bool search_pc)
+void gen_intermediate_code(CPUS390XState *env, struct TranslationBlock *tb)
{
+ S390CPU *cpu = s390_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUS390XState *env = &cpu->env;
DisasContext dc;
target_ulong pc_start;
uint64_t next_page_start;
- int j, lj = -1;
int num_insns, max_insns;
CPUBreakpoint *bp;
ExitStatus status;
@@ -5360,19 +5355,6 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
gen_tb_start(tb);
do {
- if (search_pc) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- }
- tcg_ctx.gen_opc_pc[lj] = dc.pc;
- gen_opc_cc_op[lj] = dc.cc_op;
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
tcg_gen_insn_start(dc.pc, dc.cc_op);
if (++num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
@@ -5433,16 +5415,8 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
gen_tb_end(tb, num_insns);
- if (search_pc) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- } else {
- tb->size = dc.pc - pc_start;
- tb->icount = num_insns;
- }
+ tb->size = dc.pc - pc_start;
+ tb->icount = num_insns;
#if defined(S390X_DEBUG_DISAS)
if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -5453,16 +5427,6 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
#endif
}
-void gen_intermediate_code (CPUS390XState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(s390_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc (CPUS390XState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(s390_env_get_cpu(env), tb, true);
-}
-
void restore_state_to_opc(CPUS390XState *env, TranslationBlock *tb,
target_ulong *data)
{
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index db41d0b..2da6bac 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -70,8 +70,6 @@ static TCGv cpu_fregs[32];
/* internal register indexes */
static TCGv cpu_flags, cpu_delayed_pc;
-static uint32_t gen_opc_hflags[OPC_BUF_SIZE];
-
#include "exec/gen-icount.h"
void sh4_translate_init(void)
@@ -1840,16 +1838,13 @@ static void decode_opc(DisasContext * ctx)
gen_store_flags(ctx->flags);
}
-static inline void
-gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
- bool search_pc)
+void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
{
+ SuperHCPU *cpu = sh_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUSH4State *env = &cpu->env;
DisasContext ctx;
target_ulong pc_start;
CPUBreakpoint *bp;
- int i, ii;
int num_insns;
int max_insns;
@@ -1866,7 +1861,6 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
ctx.features = env->features;
ctx.has_movcal = (ctx.flags & TB_FLAG_PENDING_MOVCA);
- ii = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
if (max_insns == 0) {
@@ -1889,18 +1883,6 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
}
}
}
- if (search_pc) {
- i = tcg_op_buf_count();
- if (ii < i) {
- ii++;
- while (ii < i)
- tcg_ctx.gen_opc_instr_start[ii++] = 0;
- }
- tcg_ctx.gen_opc_pc[ii] = ctx.pc;
- gen_opc_hflags[ii] = ctx.flags;
- tcg_ctx.gen_opc_instr_start[ii] = 1;
- tcg_ctx.gen_opc_icount[ii] = num_insns;
- }
tcg_gen_insn_start(ctx.pc, ctx.flags);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
@@ -1949,15 +1931,8 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
gen_tb_end(tb, num_insns);
- if (search_pc) {
- i = tcg_op_buf_count();
- ii++;
- while (ii <= i)
- tcg_ctx.gen_opc_instr_start[ii++] = 0;
- } else {
- tb->size = ctx.pc - pc_start;
- tb->icount = num_insns;
- }
+ tb->size = ctx.pc - pc_start;
+ tb->icount = num_insns;
#ifdef DEBUG_DISAS
if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -1968,16 +1943,6 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
#endif
}
-void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(sh_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUSH4State * env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(sh_env_get_cpu(env), tb, true);
-}
-
void restore_state_to_opc(CPUSH4State *env, TranslationBlock *tb,
target_ulong *data)
{
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index a208a6b..8faa434 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -64,8 +64,6 @@ static TCGv cpu_wim;
/* Floating point registers */
static TCGv_i64 cpu_fpr[TARGET_DPREGS];
-static target_ulong gen_opc_npc[OPC_BUF_SIZE];
-
#include "exec/gen-icount.h"
typedef struct DisasContext {
@@ -5208,16 +5206,13 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
}
}
-static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
- TranslationBlock *tb,
- bool spc)
+void gen_intermediate_code(CPUSPARCState * env, TranslationBlock * tb)
{
+ SPARCCPU *cpu = sparc_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUSPARCState *env = &cpu->env;
target_ulong pc_start, last_pc;
DisasContext dc1, *dc = &dc1;
CPUBreakpoint *bp;
- int j, lj = -1;
int num_insns;
int max_insns;
unsigned int insn;
@@ -5258,19 +5253,6 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
}
}
}
- if (spc) {
- qemu_log("Search PC...\n");
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- tcg_ctx.gen_opc_pc[lj] = dc->pc;
- gen_opc_npc[lj] = dc->npc;
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
- }
tcg_gen_insn_start(dc->pc, dc->npc);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
@@ -5320,18 +5302,9 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
}
gen_tb_end(tb, num_insns);
- if (spc) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j)
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
-#if 0
- log_page_dump();
-#endif
- } else {
- tb->size = last_pc + 4 - pc_start;
- tb->icount = num_insns;
- }
+ tb->size = last_pc + 4 - pc_start;
+ tb->icount = num_insns;
+
#ifdef DEBUG_DISAS
if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
qemu_log("--------------\n");
@@ -5342,16 +5315,6 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
#endif
}
-void gen_intermediate_code(CPUSPARCState * env, TranslationBlock * tb)
-{
- gen_intermediate_code_internal(sparc_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUSPARCState * env, TranslationBlock * tb)
-{
- gen_intermediate_code_internal(sparc_env_get_cpu(env), tb, true);
-}
-
void gen_intermediate_code_init(CPUSPARCState *env)
{
unsigned int i;
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index a23bfcd..73e8e04 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -8266,20 +8266,14 @@ static void decode_opc(CPUTriCoreState *env, DisasContext *ctx, int *is_branch)
}
}
-static inline void
-gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
- int search_pc)
+void gen_intermediate_code(CPUTriCoreState *env, struct TranslationBlock *tb)
{
+ TriCoreCPU *cpu = tricore_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUTriCoreState *env = &cpu->env;
DisasContext ctx;
target_ulong pc_start;
int num_insns, max_insns;
- if (search_pc) {
- qemu_log("search pc %d\n", search_pc);
- }
-
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
if (max_insns == 0) {
@@ -8316,12 +8310,9 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
}
gen_tb_end(tb, num_insns);
- if (search_pc) {
- printf("done_generating search pc\n");
- } else {
- tb->size = ctx.pc - pc_start;
- tb->icount = num_insns;
- }
+ tb->size = ctx.pc - pc_start;
+ tb->icount = num_insns;
+
if (tcg_check_temp_count()) {
printf("LEAK at %08x\n", env->PC);
}
@@ -8336,18 +8327,6 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
}
void
-gen_intermediate_code(CPUTriCoreState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(tricore_env_get_cpu(env), tb, false);
-}
-
-void
-gen_intermediate_code_pc(CPUTriCoreState *env, struct TranslationBlock *tb)
-{
- gen_intermediate_code_internal(tricore_env_get_cpu(env), tb, true);
-}
-
-void
restore_state_to_opc(CPUTriCoreState *env, TranslationBlock *tb,
target_ulong *data)
{
diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c
index 75c7d65..503770d 100644
--- a/target-unicore32/translate.c
+++ b/target-unicore32/translate.c
@@ -1864,16 +1864,13 @@ static void disas_uc32_insn(CPUUniCore32State *env, DisasContext *s)
}
/* generate intermediate code in gen_opc_buf and gen_opparam_buf for
- basic block 'tb'. If search_pc is TRUE, also generate PC
- information for each intermediate instruction. */
-static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
- TranslationBlock *tb, bool search_pc)
+ basic block 'tb'. */
+void gen_intermediate_code(CPUUniCore32State *env, TranslationBlock *tb)
{
+ UniCore32CPU *cpu = uc32_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUUniCore32State *env = &cpu->env;
DisasContext dc1, *dc = &dc1;
CPUBreakpoint *bp;
- int j, lj;
target_ulong pc_start;
uint32_t next_page_start;
int num_insns;
@@ -1895,7 +1892,6 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
cpu_F0d = tcg_temp_new_i64();
cpu_F1d = tcg_temp_new_i64();
next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
- lj = -1;
num_insns = 0;
max_insns = tb->cflags & CF_COUNT_MASK;
if (max_insns == 0) {
@@ -1928,18 +1924,6 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
}
}
}
- if (search_pc) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- }
- tcg_ctx.gen_opc_pc[lj] = dc->pc;
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = num_insns;
- }
tcg_gen_insn_start(dc->pc);
if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
@@ -2043,26 +2027,8 @@ done_generating:
qemu_log("\n");
}
#endif
- if (search_pc) {
- j = tcg_op_buf_count();
- lj++;
- while (lj <= j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- } else {
- tb->size = dc->pc - pc_start;
- tb->icount = num_insns;
- }
-}
-
-void gen_intermediate_code(CPUUniCore32State *env, TranslationBlock *tb)
-{
- gen_intermediate_code_internal(uc32_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUUniCore32State *env, TranslationBlock *tb)
-{
- gen_intermediate_code_internal(uc32_env_get_cpu(env), tb, true);
+ tb->size = dc->pc - pc_start;
+ tb->icount = num_insns;
}
static const char *cpu_mode_names[16] = {
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index 5df3913..335ab32 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -3013,15 +3013,12 @@ static void gen_ibreak_check(CPUXtensaState *env, DisasContext *dc)
}
}
-static inline
-void gen_intermediate_code_internal(XtensaCPU *cpu,
- TranslationBlock *tb, bool search_pc)
+void gen_intermediate_code(CPUXtensaState *env, TranslationBlock *tb)
{
+ XtensaCPU *cpu = xtensa_env_get_cpu(env);
CPUState *cs = CPU(cpu);
- CPUXtensaState *env = &cpu->env;
DisasContext dc;
int insn_count = 0;
- int j, lj = -1;
int max_insns = tb->cflags & CF_COUNT_MASK;
uint32_t pc_start = tb->pc;
uint32_t next_page_start =
@@ -3067,18 +3064,6 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
do {
check_breakpoint(env, &dc);
- if (search_pc) {
- j = tcg_op_buf_count();
- if (lj < j) {
- lj++;
- while (lj < j) {
- tcg_ctx.gen_opc_instr_start[lj++] = 0;
- }
- }
- tcg_ctx.gen_opc_pc[lj] = dc.pc;
- tcg_ctx.gen_opc_instr_start[lj] = 1;
- tcg_ctx.gen_opc_icount[lj] = insn_count;
- }
tcg_gen_insn_start(dc.pc);
++dc.ccount_delta;
@@ -3142,24 +3127,8 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
qemu_log("\n");
}
#endif
- if (search_pc) {
- j = tcg_op_buf_count();
- memset(tcg_ctx.gen_opc_instr_start + lj + 1, 0,
- (j - lj) * sizeof(tcg_ctx.gen_opc_instr_start[0]));
- } else {
- tb->size = dc.pc - pc_start;
- tb->icount = insn_count;
- }
-}
-
-void gen_intermediate_code(CPUXtensaState *env, TranslationBlock *tb)
-{
- gen_intermediate_code_internal(xtensa_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUXtensaState *env, TranslationBlock *tb)
-{
- gen_intermediate_code_internal(xtensa_env_get_cpu(env), tb, true);
+ tb->size = dc.pc - pc_start;
+ tb->icount = insn_count;
}
void xtensa_cpu_dump_state(CPUState *cs, FILE *f,
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 11cc107..6055715 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -577,10 +577,6 @@ struct TCGContext {
TCGOp gen_op_buf[OPC_BUF_SIZE];
TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
- target_ulong gen_opc_pc[OPC_BUF_SIZE];
- uint16_t gen_opc_icount[OPC_BUF_SIZE];
- uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-
uint16_t gen_insn_end_off[TCG_MAX_INSNS];
target_ulong gen_insn_data[TCG_MAX_INSNS][TARGET_INSN_START_WORDS];
};
--
2.4.3
^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [PATCH 19/20] tcg: Remove gen_intermediate_code_pc
2015-09-02 5:52 ` [Qemu-devel] [PATCH 19/20] tcg: Remove gen_intermediate_code_pc Richard Henderson
@ 2015-09-08 18:49 ` Peter Maydell
0 siblings, 0 replies; 62+ messages in thread
From: Peter Maydell @ 2015-09-08 18:49 UTC (permalink / raw)
To: Richard Henderson
Cc: Aurelien Jarno, QEMU Developers, dl.soluz, Artyom Tarasenko
On 2 September 2015 at 06:52, Richard Henderson <rth@twiddle.net> wrote:
> It's no longer used, so tidy up everything reached by it.
> This includes the gen_opc_* arrays, the search_pc parameter
> and the inline gen_intermediate_code_internal functions.
> @@ -11254,10 +11248,9 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
> * (3) if we leave the TB unexpectedly (eg a data abort on a load)
> * then the CPUARMState will be wrong and we need to reset it.
> * This is handled in the same way as restoration of the
> - * PC in these situations: we will be called again with search_pc=1
> - * and generate a mapping of the condexec bits for each PC in
> - * gen_opc_condexec_bits[]. restore_state_to_opc() then uses
> - * this to restore the condexec bits.
> + * PC in these situations; the saved mapping of the condexec bits
> + * for each PC which restore_state_to_opc() then uses this to
> + * restore the condexec bits.
The grammar here is a bit flaky. Try
> + * PC in these situations; we save the value of the condexec bits
> + * for each PC via tcg_gen_insn_start(), and restore_state_to_opc()
> + * then uses this to restore them after an exception.
Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM
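
[Editorial sketch, not part of the series.] The comment wording suggested
above describes per-instruction data being saved at translation time and
read back by restore_state_to_opc() after an exception. The toy model below
illustrates that idea only. Every name in it (ToyTB, ToyCPUState,
toy_insn_start, toy_restore_state) is hypothetical, the two-word layout
(pc plus one extra word such as the condexec bits) is an assumption, and
recording the host end offset in the same call is a simplification; in the
series the arrays live in TCGContext as gen_insn_end_off[] and
gen_insn_data[][].

```c
#include <stdint.h>

#define TOY_MAX_INSNS        8
#define TOY_INSN_START_WORDS 2   /* assumed: word 0 = pc, word 1 = extra state */

/* Hypothetical per-TB bookkeeping; not QEMU's real layout. */
typedef struct {
    uint16_t insn_end_off[TOY_MAX_INSNS];   /* host offset where each guest insn ends */
    uint32_t insn_data[TOY_MAX_INSNS][TOY_INSN_START_WORDS];
    int num_insns;
} ToyTB;

/* Hypothetical guest CPU state holding only what we restore. */
typedef struct {
    uint32_t pc;
    uint32_t condexec;
} ToyCPUState;

/*
 * Record the data for one guest instruction at translation time.
 * Returns the instruction index, or -1 if the table is full.
 */
static int toy_insn_start(ToyTB *tb, uint32_t pc, uint32_t extra,
                          uint16_t end_off)
{
    if (tb->num_insns >= TOY_MAX_INSNS) {
        return -1;
    }
    int i = tb->num_insns++;
    tb->insn_data[i][0] = pc;
    tb->insn_data[i][1] = extra;
    tb->insn_end_off[i] = end_off;
    return i;
}

/*
 * Given a host code offset inside the TB, find the guest instruction
 * whose host code covers it and copy the saved words back into env.
 * Returns 0 on success, -1 if the offset lies past the recorded code.
 * No retranslation is needed: the lookup only reads the saved table.
 */
static int toy_restore_state(const ToyTB *tb, uint16_t host_off,
                             ToyCPUState *env)
{
    for (int i = 0; i < tb->num_insns; i++) {
        if (host_off < tb->insn_end_off[i]) {
            env->pc = tb->insn_data[i][0];
            env->condexec = tb->insn_data[i][1];
            return 0;
        }
    }
    return -1;
}
```

In this model a fault at host offset 20, with two instructions ending at
offsets 12 and 40, resolves to the second instruction's saved pc and extra
word without running the translator again.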
^ permalink raw reply [flat|nested] 62+ messages in thread
* [Qemu-devel] [PATCH 20/20] tcg: Remove tcg_gen_code_search_pc
2015-09-02 5:51 [Qemu-devel] [RFC 00/20] Do away with TB retranslation Richard Henderson
` (17 preceding siblings ...)
2015-09-02 5:52 ` [Qemu-devel] [PATCH 19/20] tcg: Remove gen_intermediate_code_pc Richard Henderson
@ 2015-09-02 5:52 ` Richard Henderson
2015-09-02 12:21 ` [Qemu-devel] [RFC 00/20] Do away with TB retranslation Max Filippov
` (3 subsequent siblings)
22 siblings, 0 replies; 62+ messages in thread
From: Richard Henderson @ 2015-09-02 5:52 UTC (permalink / raw)
To: qemu-devel; +Cc: dl.soluz, atar4qemu, aurelien
It's no longer used, so tidy up everything reached by it.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
tcg/tcg.c | 59 +++++++++++++++++++----------------------------------------
tcg/tcg.h | 2 --
2 files changed, 19 insertions(+), 42 deletions(-)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 3541d4c..98aa0f4 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2296,12 +2296,28 @@ void tcg_dump_op_count(FILE *f, fprintf_function cpu_fprintf)
#endif
-static inline int tcg_gen_code_common(TCGContext *s,
- tcg_insn_unit *gen_code_buf,
- long search_pc)
+int tcg_gen_code(TCGContext *s, tcg_insn_unit *gen_code_buf)
{
int i, oi, oi_next, num_insns;
+#ifdef CONFIG_PROFILER
+ {
+ int n;
+
+ n = s->gen_last_op_idx + 1;
+ s->op_count += n;
+ if (n > s->op_count_max) {
+ s->op_count_max = n;
+ }
+
+ n = s->nb_temps;
+ s->temp_count += n;
+ if (n > s->temp_count_max) {
+ s->temp_count_max = n;
+ }
+ }
+#endif
+
#ifdef DEBUG_DISAS
if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP))) {
qemu_log("OP:\n");
@@ -2404,9 +2420,6 @@ static inline int tcg_gen_code_common(TCGContext *s,
tcg_reg_alloc_op(s, def, opc, args, dead_args, sync_args);
break;
}
- if (search_pc >= 0 && search_pc < tcg_current_code_size(s)) {
- return oi;
- }
#ifndef NDEBUG
check_regs(s);
#endif
@@ -2416,30 +2429,6 @@ static inline int tcg_gen_code_common(TCGContext *s,
/* Generate TB finalization at the end of block */
tcg_out_tb_finalize(s);
- return -1;
-}
-
-int tcg_gen_code(TCGContext *s, tcg_insn_unit *gen_code_buf)
-{
-#ifdef CONFIG_PROFILER
- {
- int n;
-
- n = s->gen_last_op_idx + 1;
- s->op_count += n;
- if (n > s->op_count_max) {
- s->op_count_max = n;
- }
-
- n = s->nb_temps;
- s->temp_count += n;
- if (n > s->temp_count_max) {
- s->temp_count_max = n;
- }
- }
-#endif
-
- tcg_gen_code_common(s, gen_code_buf, -1);
/* flush instruction cache */
flush_icache_range((uintptr_t)s->code_buf, (uintptr_t)s->code_ptr);
@@ -2447,16 +2436,6 @@ int tcg_gen_code(TCGContext *s, tcg_insn_unit *gen_code_buf)
return tcg_current_code_size(s);
}
-/* Return the index of the micro operation such as the pc after is <
- offset bytes from the start of the TB. The contents of gen_code_buf must
- not be changed, though writing the same values is ok.
- Return -1 if not found. */
-int tcg_gen_code_search_pc(TCGContext *s, tcg_insn_unit *gen_code_buf,
- long offset)
-{
- return tcg_gen_code_common(s, gen_code_buf, offset);
-}
-
#ifdef CONFIG_PROFILER
void tcg_dump_info(FILE *f, fprintf_function cpu_fprintf)
{
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 6055715..d2c6ac7 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -621,8 +621,6 @@ void tcg_prologue_init(TCGContext *s);
void tcg_func_start(TCGContext *s);
int tcg_gen_code(TCGContext *s, tcg_insn_unit *gen_code_buf);
-int tcg_gen_code_search_pc(TCGContext *s, tcg_insn_unit *gen_code_buf,
- long offset);
void tcg_set_frame(TCGContext *s, int reg, intptr_t start, intptr_t size);
--
2.4.3
^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-02 5:51 [Qemu-devel] [RFC 00/20] Do away with TB retranslation Richard Henderson
` (18 preceding siblings ...)
2015-09-02 5:52 ` [Qemu-devel] [PATCH 20/20] tcg: Remove tcg_gen_code_search_pc Richard Henderson
@ 2015-09-02 12:21 ` Max Filippov
2015-09-02 14:21 ` Richard Henderson
2015-09-08 18:56 ` Peter Maydell
` (2 subsequent siblings)
22 siblings, 1 reply; 62+ messages in thread
From: Max Filippov @ 2015-09-02 12:21 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel
Richard,
patch 01/20 hasn't got to the list. Do you have that series somewhere
in a public git?
--
Thanks.
-- Max
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-02 12:21 ` [Qemu-devel] [RFC 00/20] Do away with TB retranslation Max Filippov
@ 2015-09-02 14:21 ` Richard Henderson
2015-09-04 15:18 ` Max Filippov
0 siblings, 1 reply; 62+ messages in thread
From: Richard Henderson @ 2015-09-02 14:21 UTC (permalink / raw)
To: Max Filippov; +Cc: qemu-devel
On 09/02/2015 05:21 AM, Max Filippov wrote:
> Richard,
>
> patch 01/20 hasn't made it to the list. Do you have the series somewhere
> in a public git tree?
git://github.com/rth7680/qemu.git tcg-search-2
r~
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-02 14:21 ` Richard Henderson
@ 2015-09-04 15:18 ` Max Filippov
2015-09-04 15:31 ` Peter Maydell
2015-09-04 16:46 ` Richard Henderson
0 siblings, 2 replies; 62+ messages in thread
From: Max Filippov @ 2015-09-04 15:18 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel
On Wed, Sep 2, 2015 at 5:21 PM, Richard Henderson <rth@twiddle.net> wrote:
> git://github.com/rth7680/qemu.git tcg-search-2
That makes an impressive speedup for native kernel build on xtensa softmmu:
down from 1240 minutes to 690.
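For scale, the two figures above work out to roughly a 1.8x speedup, or about 44% less wall-clock time. A minimal C sketch of that arithmetic follows; the helper names speedup() and time_saved() are made up for illustration and are not part of QEMU.

```c
#include <assert.h>

/* Hypothetical helper: ratio of old to new run time.
 * Values above 1.0 mean the new build is faster. */
static double speedup(double old_minutes, double new_minutes)
{
    assert(new_minutes > 0.0);
    return old_minutes / new_minutes;
}

/* Hypothetical helper: fraction of the original run time that was saved. */
static double time_saved(double old_minutes, double new_minutes)
{
    assert(old_minutes > 0.0);
    return (old_minutes - new_minutes) / old_minutes;
}
```

With the numbers quoted here, speedup(1240, 690) is about 1.80 and time_saved(1240, 690) is about 0.44.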
--
Thanks.
-- Max
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-04 15:18 ` Max Filippov
@ 2015-09-04 15:31 ` Peter Maydell
2015-09-04 16:46 ` Richard Henderson
1 sibling, 0 replies; 62+ messages in thread
From: Peter Maydell @ 2015-09-04 15:31 UTC (permalink / raw)
To: Max Filippov; +Cc: qemu-devel, Richard Henderson
On 4 September 2015 at 16:18, Max Filippov <jcmvbkbc@gmail.com> wrote:
> On Wed, Sep 2, 2015 at 5:21 PM, Richard Henderson <rth@twiddle.net> wrote:
>> git://github.com/rth7680/qemu.git tcg-search-2
>
> That makes an impressive speedup for native kernel build on xtensa softmmu:
> down from 1240 minutes to 690.
Blimey.
-- PMM
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-04 15:18 ` Max Filippov
2015-09-04 15:31 ` Peter Maydell
@ 2015-09-04 16:46 ` Richard Henderson
2015-09-04 17:07 ` Max Filippov
2015-09-05 14:11 ` Mark Cave-Ayland
1 sibling, 2 replies; 62+ messages in thread
From: Richard Henderson @ 2015-09-04 16:46 UTC (permalink / raw)
To: Max Filippov; +Cc: qemu-devel
On 09/04/2015 08:18 AM, Max Filippov wrote:
> On Wed, Sep 2, 2015 at 5:21 PM, Richard Henderson <rth@twiddle.net> wrote:
>> git://github.com/rth7680/qemu.git tcg-search-2
>
> That makes an impressive speedup for native kernel build on xtensa softmmu:
> down from 1240 minutes to 690.
>
Yowza. Is xtensa a hardware or software managed tlb?
I guess that's a data point for going ahead with this, supposing that I can
find the mips+sparc bugs I introduced.
I recall xtensa being one of the trivial conversions. You didn't notice
anything odd/broken on your target, then?
r~
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-04 16:46 ` Richard Henderson
@ 2015-09-04 17:07 ` Max Filippov
2015-09-05 14:11 ` Mark Cave-Ayland
1 sibling, 0 replies; 62+ messages in thread
From: Max Filippov @ 2015-09-04 17:07 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel
On Fri, Sep 4, 2015 at 7:46 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 09/04/2015 08:18 AM, Max Filippov wrote:
>> On Wed, Sep 2, 2015 at 5:21 PM, Richard Henderson <rth@twiddle.net> wrote:
>>> git://github.com/rth7680/qemu.git tcg-search-2
>>
>> That makes an impressive speedup for native kernel build on xtensa softmmu:
>> down from 1240 minutes to 690.
>>
>
> Yowza. Is xtensa a hardware or software managed tlb?
It's mixed: it can walk 2-level page tables automatically, but in Linux
the second level of the page tables is handled through exceptions.
> I guess that's a data point for going ahead with this, supposing that I can
> find the mips+sparc bugs I introduced.
>
> I recall xtensa being one of the trivial conversions. You didn't notice
> anything odd/broken on your target, then?
No, nothing obvious.
I've taken a look at the changes to generic code; they look good to me.
--
Thanks.
-- Max
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-04 16:46 ` Richard Henderson
2015-09-04 17:07 ` Max Filippov
@ 2015-09-05 14:11 ` Mark Cave-Ayland
2015-09-06 20:19 ` Richard Henderson
1 sibling, 1 reply; 62+ messages in thread
From: Mark Cave-Ayland @ 2015-09-05 14:11 UTC (permalink / raw)
To: Richard Henderson, Max Filippov; +Cc: qemu-devel
On 04/09/15 17:46, Richard Henderson wrote:
> On 09/04/2015 08:18 AM, Max Filippov wrote:
>> On Wed, Sep 2, 2015 at 5:21 PM, Richard Henderson <rth@twiddle.net> wrote:
>>> git://github.com/rth7680/qemu.git tcg-search-2
>>
>> That makes an impressive speedup for native kernel build on xtensa softmmu:
>> down from 1240 minutes to 690.
>>
>
> Yowza. Is xtensa a hardware or software managed tlb?
>
> I guess that's a data point for going ahead with this, supposing that I can
> find the mips+sparc bugs I introduced.
While I probably can't help too much on the TCG side, if it helps I can
rustle up some SPARC images in order to help with testing?
ATB,
Mark.
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-05 14:11 ` Mark Cave-Ayland
@ 2015-09-06 20:19 ` Richard Henderson
2015-09-09 15:35 ` Artyom Tarasenko
0 siblings, 1 reply; 62+ messages in thread
From: Richard Henderson @ 2015-09-06 20:19 UTC (permalink / raw)
To: Mark Cave-Ayland; +Cc: Max Filippov, qemu-devel
On Sep 5, 2015 07:11, Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> While I probably can't help too much on the TCG side, if it helps I can
> rustle up some SPARC images in order to help with testing?
That would be helpful. Something more than the trivial sparc-test image on the wiki. Something with a sparc64 userland would be best.
r~
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-06 20:19 ` Richard Henderson
@ 2015-09-09 15:35 ` Artyom Tarasenko
0 siblings, 0 replies; 62+ messages in thread
From: Artyom Tarasenko @ 2015-09-09 15:35 UTC (permalink / raw)
To: Richard Henderson; +Cc: Max Filippov, Mark Cave-Ayland, qemu-devel
On Sun, Sep 6, 2015 at 10:19 PM, Richard Henderson <rth@twiddle.net> wrote:
> On Sep 5, 2015 07:11, Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
>> While I probably can't help too much on the TCG side, if it helps I can
>> rustle up some SPARC images in order to help with testing?
>
> That would be helpful. Something more than the trivial sparc-test image on the wiki. Something with a sparc64 userland would be best.
In case you still need it, I can give you my working debian-sid/sparc64
image, but it's huge (~4GB), too big to send by mail.
Artyom
--
Regards,
Artyom Tarasenko
SPARC and PPC PReP under qemu blog: http://tyom.blogspot.com/search/label/qemu
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-02 5:51 [Qemu-devel] [RFC 00/20] Do away with TB retranslation Richard Henderson
` (19 preceding siblings ...)
2015-09-02 12:21 ` [Qemu-devel] [RFC 00/20] Do away with TB retranslation Max Filippov
@ 2015-09-08 18:56 ` Peter Maydell
2015-09-08 19:00 ` Richard Henderson
2015-09-10 13:54 ` Peter Maydell
2015-09-10 17:48 ` Aurelien Jarno
2015-09-10 18:55 ` Alex Bennée
22 siblings, 2 replies; 62+ messages in thread
From: Peter Maydell @ 2015-09-08 18:56 UTC (permalink / raw)
To: Richard Henderson
Cc: Aurelien Jarno, QEMU Developers, dl.soluz, Artyom Tarasenko
On 2 September 2015 at 06:51, Richard Henderson <rth@twiddle.net> wrote:
> I've been looking at this problem off and on for the last week or so,
> prompted by the sparc performance work. Although I haven't been able
> to get a proper sparc64 guest install working, I see the exact same
> problem with a mips guest.
>
> On alpha or x86, which seem to perform well, perf numbers for the
> executable have about 30% of the execution time spent in cpu_exec.
> For mips, on the other hand, we spend about 30% of the time in
> routines related to tcg (re-)translation.
>
> Aurelien has a patch in his own branches that attempts to mitigate this
> on mips by shadow caching more tlb entries. While this does improve
> performance a bit, it employs a linear search through a large buffer,
> with the effect of 30-ish % perf numbers for r4k_map_address.
>
> (One could probably improve things by hashing the data in that array,
> rather than a linear search, but...)
>
> In the past we've talked about getting rid of retranslation entirely.
> It's clever, but it certainly has its share of problems. I gave it
> a go this weekend.
>
> The following isn't quite right. It fails to boot on sparc even with
> our tiny test kernel. It also triggers an abort on mips, eventually.
> But it's able to get all the way through to a prompt, and in the
> process I can see that perf results are quite different -- much more
> like results I see for alpha.
>
> Thoughts on the approach?
Looks sensible to me. For patches 1, 2, 4..16 and 20:
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Patches 3, 17, 19 I've sent "minor nit, otherwise r-by" followups to.
Patch 18 is of course the meat of this series. It doesn't look
obviously wrong but I want to put some more time into reviewing
it tomorrow.
My sparc test image (which is just the 32-bit debian from
Aurelien's website) boots fine even with this patchset, though
I didn't try stressing it at all.
thanks
-- PMM
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-08 18:56 ` Peter Maydell
@ 2015-09-08 19:00 ` Richard Henderson
2015-09-08 19:06 ` Peter Maydell
` (2 more replies)
2015-09-10 13:54 ` Peter Maydell
1 sibling, 3 replies; 62+ messages in thread
From: Richard Henderson @ 2015-09-08 19:00 UTC (permalink / raw)
To: Peter Maydell; +Cc: Aurelien Jarno, QEMU Developers, dl.soluz, Artyom Tarasenko
On 09/08/2015 11:56 AM, Peter Maydell wrote:
> My sparc test image (which is just the 32-bit debian from
> Aurelien's website) boots fine even with this patchset...
Odd, it shouldn't. ;-)
Anyway, I've just fixed the sparc problem and re-pushed the tree to
git://github.com/rth7680/qemu.git tcg-search-2
for anyone who wants to do any more testing.
I'll address your review nits shortly.
r~
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-08 19:00 ` Richard Henderson
@ 2015-09-08 19:06 ` Peter Maydell
2015-09-08 19:28 ` Richard Henderson
2015-09-09 15:05 ` Artyom Tarasenko
2015-09-10 6:07 ` Dennis Luehring
2 siblings, 1 reply; 62+ messages in thread
From: Peter Maydell @ 2015-09-08 19:06 UTC (permalink / raw)
To: Richard Henderson
Cc: Aurelien Jarno, QEMU Developers, dl.soluz, Artyom Tarasenko
On 8 September 2015 at 20:00, Richard Henderson <rth@twiddle.net> wrote:
> On 09/08/2015 11:56 AM, Peter Maydell wrote:
>> My sparc test image (which is just the 32-bit debian from
>> Aurelien's website) boots fine even with this patchset...
>
> Odd, it shouldn't. ;-)
>
> Anyway, I've just fixed the sparc problem and re-pushed the tree to
>
> git://github.com/rth7680/qemu.git tcg-search-2
>
> for anyone who wants to do any more testing.
...so what was the bug? (Push doesn't seem to have made it
to github yet.)
thanks
-- PMM
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-08 19:06 ` Peter Maydell
@ 2015-09-08 19:28 ` Richard Henderson
2015-09-08 20:25 ` Peter Maydell
0 siblings, 1 reply; 62+ messages in thread
From: Richard Henderson @ 2015-09-08 19:28 UTC (permalink / raw)
To: Peter Maydell; +Cc: Aurelien Jarno, QEMU Developers, dl.soluz, Artyom Tarasenko
On 09/08/2015 12:06 PM, Peter Maydell wrote:
> On 8 September 2015 at 20:00, Richard Henderson <rth@twiddle.net> wrote:
>> On 09/08/2015 11:56 AM, Peter Maydell wrote:
>>> My sparc test image (which is just the 32-bit debian from
>>> Aurelien's website) boots fine even with this patchset...
>>
>> Odd, it shouldn't. ;-)
>>
>> Anyway, I've just fixed the sparc problem and re-pushed the tree to
>>
>> git://github.com/rth7680/qemu.git tcg-search-2
>>
>> for anyone who wants to do any more testing.
>
> ...so what was the bug? (Push doesn't seem to have made it
> to github yet.)
Err.. it has. Tip should be 98cb3e2ecffd126177f43634b643be81bdc764e7.
So I guess you pulled it post fix?
The problem was in 12/20, "target-sparc: Remove gen_opc_jump_pc".
The original was slightly off in how it was computing the npc in a delay slot.
The replacement keeps the dc->jump_pc array, but verifies that the value of
dc->jump_pc[1] is as expected: jump false to next insn. It's a smaller change
to the translator, and easier to verify correctness.
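As a rough illustration of the check described above (not the actual target-sparc code): a conditional branch leaves two candidate next-PC values in jump_pc[], and the sanity check is that the branch-not-taken candidate is simply the instruction after the current one. The struct and function names below are hypothetical; only the jump_pc[1] field and the "jump false to next insn" rule come from the description, and the 4-byte step is the fixed SPARC instruction width. That only the taken target then needs recording is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INSN_BYTES 4u  /* SPARC instructions are a fixed 4 bytes wide */

/* Hypothetical, cut-down stand-in for the translator's per-TB state. */
struct disas_sketch {
    uint64_t pc;          /* address of the insn being translated */
    uint64_t jump_pc[2];  /* [0] = branch taken, [1] = branch not taken */
};

/* True when the not-taken target is just the insn following pc. */
static bool jump_false_is_next_insn(const struct disas_sketch *dc)
{
    return dc->jump_pc[1] == dc->pc + INSN_BYTES;
}

/*
 * Assumption of this sketch: once the check holds, only the taken
 * target has to be recorded, because the not-taken one can be
 * recomputed from pc.
 */
static uint64_t target_to_record(const struct disas_sketch *dc)
{
    assert(jump_false_is_next_insn(dc));
    return dc->jump_pc[0];
}
```

The point of the assert is that a mismatch fails loudly at translation time instead of silently restoring a wrong npc later.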
r~
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-08 19:28 ` Richard Henderson
@ 2015-09-08 20:25 ` Peter Maydell
0 siblings, 0 replies; 62+ messages in thread
From: Peter Maydell @ 2015-09-08 20:25 UTC (permalink / raw)
To: Richard Henderson
Cc: Aurelien Jarno, QEMU Developers, dl.soluz, Artyom Tarasenko
On 8 September 2015 at 20:28, Richard Henderson <rth@twiddle.net> wrote:
> On 09/08/2015 12:06 PM, Peter Maydell wrote:
>> On 8 September 2015 at 20:00, Richard Henderson <rth@twiddle.net> wrote:
>>> On 09/08/2015 11:56 AM, Peter Maydell wrote:
>>>> My sparc test image (which is just the 32-bit debian from
>>>> Aurelien's website) boots fine even with this patchset...
>>>
>>> Odd, it shouldn't. ;-)
>>>
>>> Anyway, I've just fixed the sparc problem and re-pushed the tree to
>>>
>>> git://github.com/rth7680/qemu.git tcg-search-2
>>>
>>> for anyone who wants to do any more testing.
>>
>> ...so what was the bug? (Push doesn't seem to have made it
>> to github yet.)
>
> Err.. it has. Tip should be 98cb3e2ecffd126177f43634b643be81bdc764e7.
> So I guess you pulled it post fix?
Yep, that's what I've been reviewing...would explain why I
couldn't get it to fall over in testing :-)
> The problem was in 12/20, "target-sparc: Remove gen_opc_jump_pc".
>
> The original was slightly off in how it was computing the npc in a delay slot.
> The replacement keeps the dc->jump_pc array, but verifies that the value of
> dc->jump_pc[1] is as expected: jump false to next insn. It's a smaller change
> to the translator, and easier to verify correctness.
Yeah. My r-b applies to the fixed version, not the mail on the list.
thanks
-- PMM
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-08 19:00 ` Richard Henderson
2015-09-08 19:06 ` Peter Maydell
@ 2015-09-09 15:05 ` Artyom Tarasenko
2015-09-09 16:18 ` Paolo Bonzini
2015-09-10 6:07 ` Dennis Luehring
2 siblings, 1 reply; 62+ messages in thread
From: Artyom Tarasenko @ 2015-09-09 15:05 UTC (permalink / raw)
To: Richard Henderson
Cc: Peter Maydell, Aurelien Jarno, QEMU Developers, Dennis Luehring
Hi Richard,
On Tue, Sep 8, 2015 at 9:00 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 09/08/2015 11:56 AM, Peter Maydell wrote:
>> My sparc test image (which is just the 32-bit debian from
>> Aurelien's website) boots fine even with this patchset...
>
> Odd, it shouldn't. ;-)
>
> Anyway, I've just fixed the sparc problem and re-pushed the tree to
>
> git://github.com/rth7680/qemu.git tcg-search-2
>
> for anyone who wants to do any more testing.
Great work! Both debian wheezy/sparc and debian sid/sparc64 work.
My benchmark (running g++ as described here [1]) shows ~3x speed up
factor. Not bad!
So, for the sparc64 part,
Tested-By: Artyom Tarasenko <atar4qemu@gmail.com>
1. https://lists.gnu.org/archive/html/qemu-devel/2015-08/msg02220.html
--
Regards,
Artyom Tarasenko
SPARC and PPC PReP under qemu blog: http://tyom.blogspot.com/search/label/qemu
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-09 15:05 ` Artyom Tarasenko
@ 2015-09-09 16:18 ` Paolo Bonzini
2015-09-09 17:48 ` Artyom Tarasenko
0 siblings, 1 reply; 62+ messages in thread
From: Paolo Bonzini @ 2015-09-09 16:18 UTC (permalink / raw)
To: Artyom Tarasenko, Richard Henderson
Cc: Dennis Luehring, Peter Maydell, QEMU Developers, Aurelien Jarno
On 09/09/2015 17:05, Artyom Tarasenko wrote:
> Hi Richard,
>
> On Tue, Sep 8, 2015 at 9:00 PM, Richard Henderson <rth@twiddle.net> wrote:
>> On 09/08/2015 11:56 AM, Peter Maydell wrote:
>>> My sparc test image (which is just the 32-bit debian from
>>> Aurelien's website) boots fine even with this patchset...
>>
>> Odd, it shouldn't. ;-)
>>
>> Anyway, I've just fixed the sparc problem and re-pushed the tree to
>>
>> git://github.com/rth7680/qemu.git tcg-search-2
>>
>> for anyone who wants to do any more testing.
>
> Great work! Both debian wheezy/sparc and debian sid/sparc64 work.
> My benchmark (running g++ as described here [1]) shows ~3x speed up
> factor. Not bad!
Does the optimizer still make things slower?
Paolo
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-09 16:18 ` Paolo Bonzini
@ 2015-09-09 17:48 ` Artyom Tarasenko
0 siblings, 0 replies; 62+ messages in thread
From: Artyom Tarasenko @ 2015-09-09 17:48 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Dennis Luehring, Peter Maydell, QEMU Developers, Aurelien Jarno,
Richard Henderson
On Wed, Sep 9, 2015 at 6:18 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 09/09/2015 17:05, Artyom Tarasenko wrote:
>> Hi Richard,
>>
>> On Tue, Sep 8, 2015 at 9:00 PM, Richard Henderson <rth@twiddle.net> wrote:
>>> On 09/08/2015 11:56 AM, Peter Maydell wrote:
>>>> My sparc test image (which is just the 32-bit debian from
>>>> Aurelien's website) boots fine even with this patchset...
>>>
>>> Odd, it shouldn't. ;-)
>>>
>>> Anyway, I've just fixed the sparc problem and re-pushed the tree to
>>>
>>> git://github.com/rth7680/qemu.git tcg-search-2
>>>
>>> for anyone who wants to do any more testing.
>>
>> Great work! Both debian wheezy/sparc and debian sid/sparc64 work.
>> My benchmark (running g++ as described here [1]) shows ~3x speed up
>> factor. Not bad!
>
> Does the optimizer still make things slower?
Haven't tried without it yet, but would be surprised if it still does:
tcg_optimize doesn't appear in the "perf top" output anymore.
At least not in the benchmark I use.
Artyom
--
Regards,
Artyom Tarasenko
SPARC and PPC PReP under qemu blog: http://tyom.blogspot.com/search/label/qemu
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-08 19:00 ` Richard Henderson
2015-09-08 19:06 ` Peter Maydell
2015-09-09 15:05 ` Artyom Tarasenko
@ 2015-09-10 6:07 ` Dennis Luehring
2015-09-10 7:00 ` Artyom Tarasenko
2 siblings, 1 reply; 62+ messages in thread
From: Dennis Luehring @ 2015-09-10 6:07 UTC (permalink / raw)
To: Richard Henderson, Peter Maydell
Cc: Aurelien Jarno, QEMU Developers, Artyom Tarasenko
Am 08.09.2015 um 21:00 schrieb Richard Henderson:
> Anyway, I've just fixed the sparc problem and re-pushed the tree to
>
> git://github.com/rth7680/qemu.git tcg-search-2
>
> for anyone who wants to do any more testing.
Strangely, your branch doesn't change anything for pure SPARC64 in my
tests. I always completely removed the qemu folder and did a clean rebuild
(all driven by stable shell scripts).
Now even the stream results for qemu git master have degraded to the worst
results from my former tests.
Last stream result on a 3-4 week old git master (the best of all my tests):
Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:                771.5    0.214717    0.207377    0.244214
Scale:               288.1    0.573320    0.555401    0.660161
Add:                 423.5    0.633523    0.566661    1.092067
Triad:               242.9    1.053032    0.987970    1.499563
Are parts of your patches already included in master?
~/qemu/sparc64-softmmu/qemu-system-sparc64 \
-m 1G \
-nographic \
-serial mon:telnet::3000,server,wait \
-monitor telnet::4440,server,nowait \
-netdev user,id=mynet0 \
-device ne2k_pci,netdev=mynet0 \
-hda ~/ramdisk/netbsd-615-sparc64.qcow2 \
-cdrom ~/netbsd/NetBSD-6.1.5-sparc64.iso -boot c \
NetBSD SPARC64 (pure 64Bit)
file /usr/bin/gcc
/usr/bin/gcc: ELF 64-bit MSB executable, SPARC V9, relaxed memory
ordering, (SYSV), dynamically linked (uses shared libs), for NetBSD
6.1.5, not stripped
gcc prime.c -o prime.out -lm
file prime.out
prime.out: ELF 64-bit MSB executable, SPARC V9, relaxed memory ordering,
(SYSV), dynamically linked (uses shared libs), for NetBSD 6.1.5, not
stripped
gcc stream.c -o stream.out -lm
file stream.out
stream.out: ELF 64-bit MSB executable, SPARC V9, relaxed memory
ordering, (SYSV), dynamically linked (uses shared libs), for NetBSD
6.1.5, not stripped
qemu rth7680 tcg-search-2
cd ~/
rm -rf qemu
git clone git://github.com/rth7680/qemu.git
git checkout tcg-search-2
cd qemu
./configure --target-list=sparc64-softmmu
make
prime.out runtimes
11.22 real 11.19 user 0.00 sys
8.99 real 8.96 user 0.01 sys
9.03 real 8.99 user 0.01 sys
8.98 real 8.92 user 0.02 sys
stream benchmark results
#1
Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:                277.5    0.584221    0.576543    0.599850
Scale:               178.4    0.915691    0.896731    0.973512
Add:                 215.3    1.136872    1.114746    1.185749
Triad:               166.5    1.473816    1.441169    1.525007
#2
Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:                278.1    0.577109    0.575259    0.583267
Scale:               179.3    0.901238    0.892399    0.911371
Add:                 215.2    1.124701    1.115341    1.176286
Triad:               167.0    1.448070    1.437424    1.457319
#3
Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:                276.5    0.580420    0.578757    0.583966
Scale:               179.8    0.898168    0.889899    0.903530
Add:                 212.7    1.130851    1.128318    1.136284
Triad:               164.1    1.464667    1.462173    1.467856
#4
Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:                276.0    0.580713    0.579757    0.583575
Scale:               177.8    0.906521    0.899661    0.910353
Add:                 213.9    1.124599    1.122229    1.128944
Triad:               166.0    1.450210    1.446169    1.457350
g++ src/pugixml.cpp -g -Wall -Wextra -Werror -pedantic -std=c++0x -c
-MMD -MP
compile times
156.49 real 149.09 user 6.97 sys
156.54 real 149.01 user 6.86 sys
155.91 real 148.71 user 6.93 sys
156.51 real 149.12 user 6.83 sys
qemu git master
cd ~/
rm -rf qemu
git clone git://git.qemu-project.org/qemu.git
cd qemu
./configure --target-list=sparc64-softmmu
make
prime.out runtimes
9.32 real 9.26 user 0.03 sys
8.98 real 8.92 user 0.04 sys
8.97 real 8.93 user 0.02 sys
9.00 real 8.97 user 0.02 sys
stream benchmark results
#1
Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:                276.6    0.584092    0.578538    0.592748
Scale:               179.0    0.908465    0.893868    0.924004
Add:                 215.5    1.122398    1.113689    1.128655
Triad:               167.6    1.449198    1.432321    1.457774
#2
Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:                276.8    0.579905    0.577936    0.585222
Scale:               177.6    0.916502    0.900911    0.924964
Add:                 213.0    1.128311    1.126591    1.132405
Triad:               167.7    1.441520    1.431449    1.447586
#3
Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:                278.4    0.576423    0.574611    0.580779
Scale:               179.4    0.899889    0.892094    0.904380
Add:                 215.6    1.117374    1.112954    1.128532
Triad:               165.1    1.457329    1.453899    1.465867
#4
Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:                278.1    0.578641    0.575289    0.586838
Scale:               178.7    0.903208    0.895518    0.907769
Add:                 210.8    1.144200    1.138318    1.156063
Triad:               165.2    1.459674    1.453116    1.479194
g++ src/pugixml.cpp -g -Wall -Wextra -Werror -pedantic -std=c++0x -c
-MMD -MP
compile times
156.85 real 148.98 user 7.17 sys
157.05 real 149.99 user 6.78 sys
157.48 real 150.25 user 6.92 sys
156.50 real 149.06 user 6.60 sys
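To put a number on "doesn't change anything", the four "real" compile times from each build can be averaged and compared. The sketch below does just that; mean() and the two array names are illustrative helpers, and the values are copied from the listings above.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of count samples. */
static double mean(const double *samples, size_t count)
{
    assert(count > 0);
    double sum = 0.0;
    for (size_t i = 0; i < count; i++) {
        sum += samples[i];
    }
    return sum / (double)count;
}

/* "real" seconds for the g++ compile, copied from the two listings above. */
static const double branch_real[] = { 156.49, 156.54, 155.91, 156.51 };
static const double master_real[] = { 156.85, 157.05, 157.48, 156.50 };
```

The means come out at roughly 156.36 s for the tcg-search-2 build and 156.97 s for master, a difference of well under 1%.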
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-10 6:07 ` Dennis Luehring
@ 2015-09-10 7:00 ` Artyom Tarasenko
2015-09-10 9:32 ` Dennis Luehring
0 siblings, 1 reply; 62+ messages in thread
From: Artyom Tarasenko @ 2015-09-10 7:00 UTC (permalink / raw)
To: Dennis Luehring
Cc: Peter Maydell, QEMU Developers, Aurelien Jarno, Richard Henderson
On Thu, Sep 10, 2015 at 8:07 AM, Dennis Luehring <dl.soluz@gmx.net> wrote:
> Am 08.09.2015 um 21:00 schrieb Richard Henderson:
>>
>> Anyway, I've just fixed the sparc problem and re-pushed the tree to
>>
>> git://github.com/rth7680/qemu.git tcg-search-2
>>
>> for anyone who wants to do any more testing.
>
>
> Strangely, your branch doesn't change anything for pure SPARC64 in my tests.
> I always completely removed the qemu folder and did a clean rebuild
> (all driven by stable shell scripts).
Can you please show "perf top" of the qemu-system-sparc64 process on
the host, when your g++ compilation is running?
Artyom
> Now even the stream results for qemu git master have degraded to the worst
> results from my former tests.
>
> Last stream result on a 3-4 week old git master (the best of all my tests):
> Function Best Rate MB/s Avg time Min time Max time
> Copy: 771.5 0.214717 0.207377 0.244214
> Scale: 288.1 0.573320 0.555401 0.660161
> Add: 423.5 0.633523 0.566661 1.092067
> Triad: 242.9 1.053032 0.987970 1.499563
>
> Are parts of your patches already included in master?
>
> ~/qemu/sparc64-softmmu/qemu-system-sparc64 \
> -m 1G \
> -nographic \
> -serial mon:telnet::3000,server,wait \
> -monitor telnet::4440,server,nowait \
> -netdev user,id=mynet0 \
> -device ne2k_pci,netdev=mynet0 \
> -hda ~/ramdisk/netbsd-615-sparc64.qcow2 \
> -cdrom ~/netbsd/NetBSD-6.1.5-sparc64.iso -boot c \
>
> NetBSD SPARC64 (pure 64Bit)
>
> file /usr/bin/gcc
> /usr/bin/gcc: ELF 64-bit MSB executable, SPARC V9, relaxed memory ordering,
> (SYSV), dynamically linked (uses shared libs), for NetBSD 6.1.5, not
> stripped
>
> gcc prime.c -o prime.out -lm
> file prime.out
> prime.out: ELF 64-bit MSB executable, SPARC V9, relaxed memory ordering,
> (SYSV), dynamically linked (uses shared libs), for NetBSD 6.1.5, not
> stripped
>
> gcc stream.c -o stream.out -lm
> file stream.out
> stream.out: ELF 64-bit MSB executable, SPARC V9, relaxed memory ordering,
> (SYSV), dynamically linked (uses shared libs), for NetBSD 6.1.5, not
> stripped
>
> qemu rth7680 tcg-search-2
>
> cd ~/
> rm -rf qemu
> git clone git://github.com/rth7680/qemu.git
> git checkout tcg-search-2
> cd qemu
> ./configure --target-list=sparc64-softmmu
> make
>
> prime.out runtimes
>
> 11.22 real 11.19 user 0.00 sys
> 8.99 real 8.96 user 0.01 sys
> 9.03 real 8.99 user 0.01 sys
> 8.98 real 8.92 user 0.02 sys
>
> stream benchmark results
>
> #1
> Function Best Rate MB/s Avg time Min time Max time
> Copy: 277.5 0.584221 0.576543 0.599850
> Scale: 178.4 0.915691 0.896731 0.973512
> Add: 215.3 1.136872 1.114746 1.185749
> Triad: 166.5 1.473816 1.441169 1.525007
>
> #2
> Function Best Rate MB/s Avg time Min time Max time
> Copy: 278.1 0.577109 0.575259 0.583267
> Scale: 179.3 0.901238 0.892399 0.911371
> Add: 215.2 1.124701 1.115341 1.176286
> Triad: 167.0 1.448070 1.437424 1.457319
>
> #3
> Function Best Rate MB/s Avg time Min time Max time
> Copy: 276.5 0.580420 0.578757 0.583966
> Scale: 179.8 0.898168 0.889899 0.903530
> Add: 212.7 1.130851 1.128318 1.136284
> Triad: 164.1 1.464667 1.462173 1.467856
>
> #4
> Function Best Rate MB/s Avg time Min time Max time
> Copy: 276.0 0.580713 0.579757 0.583575
> Scale: 177.8 0.906521 0.899661 0.910353
> Add: 213.9 1.124599 1.122229 1.128944
> Triad: 166.0 1.450210 1.446169 1.457350
>
> g++ src/pugixml.cpp -g -Wall -Wextra -Werror -pedantic -std=c++0x -c -MMD
> -MP
>
> compile times
> 156.49 real 149.09 user 6.97 sys
> 156.54 real 149.01 user 6.86 sys
> 155.91 real 148.71 user 6.93 sys
> 156.51 real 149.12 user 6.83 sys
>
> qemu git master
>
> cd ~/
> rm -rf qemu
> git clone git://git.qemu-project.org/qemu.git
> cd qemu
> ./configure --target-list=sparc64-softmmu
> make
>
> prime.out runtimes
>
> 9.32 real 9.26 user 0.03 sys
> 8.98 real 8.92 user 0.04 sys
> 8.97 real 8.93 user 0.02 sys
> 9.00 real 8.97 user 0.02 sys
>
> stream benchmark results
>
> #1
> Function Best Rate MB/s Avg time Min time Max time
> Copy: 276.6 0.584092 0.578538 0.592748
> Scale: 179.0 0.908465 0.893868 0.924004
> Add: 215.5 1.122398 1.113689 1.128655
> Triad: 167.6 1.449198 1.432321 1.457774
>
> #2
> Function Best Rate MB/s Avg time Min time Max time
> Copy: 276.8 0.579905 0.577936 0.585222
> Scale: 177.6 0.916502 0.900911 0.924964
> Add: 213.0 1.128311 1.126591 1.132405
> Triad: 167.7 1.441520 1.431449 1.447586
>
> #3
> Function Best Rate MB/s Avg time Min time Max time
> Copy: 278.4 0.576423 0.574611 0.580779
> Scale: 179.4 0.899889 0.892094 0.904380
> Add: 215.6 1.117374 1.112954 1.128532
> Triad: 165.1 1.457329 1.453899 1.465867
>
> #4
> Function Best Rate MB/s Avg time Min time Max time
> Copy: 278.1 0.578641 0.575289 0.586838
> Scale: 178.7 0.903208 0.895518 0.907769
> Add: 210.8 1.144200 1.138318 1.156063
> Triad: 165.2 1.459674 1.453116 1.479194
>
> g++ src/pugixml.cpp -g -Wall -Wextra -Werror -pedantic -std=c++0x -c -MMD
> -MP
>
> compile times
> 156.85 real 148.98 user 7.17 sys
> 157.05 real 149.99 user 6.78 sys
> 157.48 real 150.25 user 6.92 sys
> 156.50 real 149.06 user 6.60 sys
>
--
Regards,
Artyom Tarasenko
SPARC and PPC PReP under qemu blog: http://tyom.blogspot.com/search/label/qemu
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-10 7:00 ` Artyom Tarasenko
@ 2015-09-10 9:32 ` Dennis Luehring
2015-09-10 9:54 ` Artyom Tarasenko
0 siblings, 1 reply; 62+ messages in thread
From: Dennis Luehring @ 2015-09-10 9:32 UTC (permalink / raw)
To: Artyom Tarasenko
Cc: Peter Maydell, QEMU Developers, Aurelien Jarno, Richard Henderson
Am 10.09.2015 um 09:00 schrieb Artyom Tarasenko:
>> >Strangely, your branch doesn't change anything for pure SPARC64 in my tests.
>> >I always completely removed the qemu folder and did a clean rebuild
>> >(all driven by stable shell scripts).
> Can you please show "perf top" of the qemu-system-sparc64 process on
> the host, when your g++ compilation is running?
http://pastebin.com/nUkhWTj4
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-10 9:32 ` Dennis Luehring
@ 2015-09-10 9:54 ` Artyom Tarasenko
2015-09-10 10:37 ` Dennis Luehring
0 siblings, 1 reply; 62+ messages in thread
From: Artyom Tarasenko @ 2015-09-10 9:54 UTC (permalink / raw)
To: Dennis Luehring
Cc: Peter Maydell, QEMU Developers, Aurelien Jarno, Richard Henderson
On Thu, Sep 10, 2015 at 11:32 AM, Dennis Luehring <dl.soluz@gmx.net> wrote:
> Am 10.09.2015 um 09:00 schrieb Artyom Tarasenko:
>>>
>>> >Strangely, your branch doesn't change anything for pure SPARC64 in my
>>> > tests.
>>> >I always completely removed the qemu folder and did a clean rebuild
>>> >(all driven by stable shell scripts).
>>
>> Can you please show "perf top" of the qemu-system-sparc64 process on
>> the host, when your g++ compilation is running?
>
>
> http://pastebin.com/nUkhWTj4
Thanks.
1.94% qemu-system-sparc64 [.] gen_intermediate_code_pc
I suppose this was done on master? Richard's patch 5e3407a removes the
gen_intermediate_code_pc.
Can you also show the perf top of the tcg-search-2 branch run?
Artyom
--
Regards,
Artyom Tarasenko
SPARC and PPC PReP under qemu blog: http://tyom.blogspot.com/search/label/qemu
^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-10 9:54 ` Artyom Tarasenko
@ 2015-09-10 10:37 ` Dennis Luehring
2015-09-10 10:57 ` Paolo Bonzini
2015-09-10 11:02 ` Dennis Luehring
0 siblings, 2 replies; 62+ messages in thread
From: Dennis Luehring @ 2015-09-10 10:37 UTC (permalink / raw)
To: Artyom Tarasenko
Cc: Peter Maydell, QEMU Developers, Aurelien Jarno, Richard Henderson
Am 10.09.2015 um 11:54 schrieb Artyom Tarasenko:
> On Thu, Sep 10, 2015 at 11:32 AM, Dennis Luehring<dl.soluz@gmx.net> wrote:
>> >Am 10.09.2015 um 09:00 schrieb Artyom Tarasenko:
>>>> >>>
>>>>> >>> >strangely your branch doesn't change anything for pure SPARC64 in my
>>>>> >>> >tests -
>>>>> >>> >i've always completely removed the qemu folder and cleanly rebuilt
>>>>> >>> >(all based on stable shell-scripts)
>>> >>
>>> >>Can you please show "perf top" of the qemu-system-sparc64 process on
>>> >>the host, when your g++ compilation is running?
>> >
>> >
>> >http://pastebin.com/nUkhWTj4
> Thanks.
> 1.94% qemu-system-sparc64 [.] gen_intermediate_code_pc
>
> I suppose this was done on master? Richard's patch 5e3407a removes the
> gen_intermediate_code_pc.
i didn't include the patches - just git-master and tcg-search-2 branch
>
> Can you also show the perf top of the tcg-search-2 branch run?
perf top from tcg-search-2 branch
http://pastebin.com/AtASpQvk
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-10 10:37 ` Dennis Luehring
@ 2015-09-10 10:57 ` Paolo Bonzini
2015-09-10 11:02 ` Dennis Luehring
1 sibling, 0 replies; 62+ messages in thread
From: Paolo Bonzini @ 2015-09-10 10:57 UTC (permalink / raw)
To: Dennis Luehring, Artyom Tarasenko
Cc: Peter Maydell, QEMU Developers, Aurelien Jarno, Richard Henderson
On 10/09/2015 12:37, Dennis Luehring wrote:
>>
>> Can you also show the perf top of the tcg-search-2 branch run?
>
> perf top from tcg-search-2 branch
>
> http://pastebin.com/AtASpQvk
Still has gen_intermediate_code_pc in it.
Paolo
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-10 10:37 ` Dennis Luehring
2015-09-10 10:57 ` Paolo Bonzini
@ 2015-09-10 11:02 ` Dennis Luehring
2015-09-10 11:20 ` Artyom Tarasenko
1 sibling, 1 reply; 62+ messages in thread
From: Dennis Luehring @ 2015-09-10 11:02 UTC (permalink / raw)
To: Artyom Tarasenko
Cc: Peter Maydell, QEMU Developers, Aurelien Jarno, Richard Henderson
Am 10.09.2015 um 12:37 schrieb Dennis Luehring:
> Am 10.09.2015 um 11:54 schrieb Artyom Tarasenko:
> > On Thu, Sep 10, 2015 at 11:32 AM, Dennis Luehring<dl.soluz@gmx.net> wrote:
> >> >Am 10.09.2015 um 09:00 schrieb Artyom Tarasenko:
> >>>> >>>
> >>>>> >>> >strangely your branch doesn't change anything for pure SPARC64 in my
> >>>>> >>> >tests -
> >>>>> >>> >i've always completely removed the qemu folder and cleanly rebuilt
> >>>>> >>> >(all based on stable shell-scripts)
> >>> >>
> >>> >>Can you please show "perf top" of the qemu-system-sparc64 process on
> >>> >>the host, when your g++ compilation is running?
> >> >
> >> >
> >> >http://pastebin.com/nUkhWTj4
> > Thanks.
> > 1.94% qemu-system-sparc64 [.] gen_intermediate_code_pc
> >
> > I suppose this was done on master? Richard's patch 5e3407a removes the
> > gen_intermediate_code_pc.
>
> i didn't include the patches - just git-master and tcg-search-2 branch
>
> >
> > Can you also show the perf top of the tcg-search-2 branch run?
>
> perf top from tcg-search-2 branch
>
> http://pastebin.com/AtASpQvk
cd ~/
rm -rf qemu
git clone git://github.com/rth7680/qemu.git
git checkout tcg-search-2
cd qemu
./configure --target-list=sparc64-softmmu
make
what else can i do to get the correct source? there is no other version of qemu on my
system, only the always-rebuilt ~/qemu
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-10 11:02 ` Dennis Luehring
@ 2015-09-10 11:20 ` Artyom Tarasenko
0 siblings, 0 replies; 62+ messages in thread
From: Artyom Tarasenko @ 2015-09-10 11:20 UTC (permalink / raw)
To: Dennis Luehring
Cc: Peter Maydell, QEMU Developers, Aurelien Jarno, Richard Henderson
On Thu, Sep 10, 2015 at 1:02 PM, Dennis Luehring <dl.soluz@gmx.net> wrote:
> Am 10.09.2015 um 12:37 schrieb Dennis Luehring:
>>
>> Am 10.09.2015 um 11:54 schrieb Artyom Tarasenko:
>> > On Thu, Sep 10, 2015 at 11:32 AM, Dennis Luehring<dl.soluz@gmx.net>
>> > wrote:
>> >> >Am 10.09.2015 um 09:00 schrieb Artyom Tarasenko:
>> >>>> >>>
>> >>>>> >>> >strangely your branch doesn't change anything for pure SPARC64
>> >>>>> >>> > in my
>> >>>>> >>> >tests -
>> >>>>> >>> >i've always completely removed the qemu folder and cleanly
>> >>>>> >>> > rebuilt
>> >>>>> >>> >(all based on stable shell-scripts)
>> >>> >>
>> >>> >>Can you please show "perf top" of the qemu-system-sparc64 process on
>> >>> >>the host, when your g++ compilation is running?
>> >> >
>> >> >
>> >> >http://pastebin.com/nUkhWTj4
>> > Thanks.
>> > 1.94% qemu-system-sparc64 [.] gen_intermediate_code_pc
>> >
>> > I suppose this was done on master? Richard's patch 5e3407a removes the
>> > gen_intermediate_code_pc.
>>
>> i didn't include the patches - just git-master and tcg-search-2 branch
>>
>> >
>> > Can you also show the perf top of the tcg-search-2 branch run?
>>
>> perf top from tcg-search-2 branch
>>
>> http://pastebin.com/AtASpQvk
>
>
> cd ~/
> rm -rf qemu
> git clone git://github.com/rth7680/qemu.git
> git checkout tcg-search-2
> cd qemu
> ./configure --target-list=sparc64-softmmu
> make
>
> what else can i do to get the correct source? there is no other version of
> qemu on my
> system, only the always-rebuilt ~/qemu
I usually build out of tree; this way it's possible to remove the
build directory completely without having to fetch the source
tree again.
You can try
make distclean
# ensure that you are in the proper branch:
$ git log -1
commit 98cb3e2ecffd126177f43634b643be81bdc764e7
Author: Richard Henderson <rth@twiddle.net>
Date: Tue Sep 1 20:07:48 2015 -0700
tcg: Remove tcg_gen_code_search_pc
It's no longer used, so tidy up everything reached by it.
Artyom
--
Regards,
Artyom Tarasenko
SPARC and PPC PReP under qemu blog: http://tyom.blogspot.com/search/label/qemu
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-08 18:56 ` Peter Maydell
2015-09-08 19:00 ` Richard Henderson
@ 2015-09-10 13:54 ` Peter Maydell
1 sibling, 0 replies; 62+ messages in thread
From: Peter Maydell @ 2015-09-10 13:54 UTC (permalink / raw)
To: Richard Henderson
Cc: Aurelien Jarno, QEMU Developers, dl.soluz, Artyom Tarasenko
On 8 September 2015 at 19:56, Peter Maydell <peter.maydell@linaro.org> wrote:
> Looks sensible to me. For patches 1 2 4..16 20
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>
> Patches 3, 17, 19 I've sent "minor nit, otherwise r-by" followups to.
>
> Patch 18 is of course the meat of this series. It doesn't look
> obviously wrong but I want to put some more time into reviewing
> it tomorrow.
...and now I've sent the patch 18 review comments.
thanks
-- PMM
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-02 5:51 [Qemu-devel] [RFC 00/20] Do away with TB retranslation Richard Henderson
` (20 preceding siblings ...)
2015-09-08 18:56 ` Peter Maydell
@ 2015-09-10 17:48 ` Aurelien Jarno
2015-09-13 21:00 ` Aurelien Jarno
2015-09-10 18:55 ` Alex Bennée
22 siblings, 1 reply; 62+ messages in thread
From: Aurelien Jarno @ 2015-09-10 17:48 UTC (permalink / raw)
To: Richard Henderson; +Cc: dl.soluz, qemu-devel, atar4qemu
On 2015-09-01 22:51, Richard Henderson wrote:
> I've been looking at this problem off and on for the last week or so,
> prompted by the sparc performance work. Although I haven't been able
> to get a proper sparc64 guest install working, I see the exact same
> problem with a mips guest.
>
> On alpha or x86, which seem to perform well, perf numbers for the
> executable have about 30% of the execution time spent in cpu_exec.
> For mips, on the other hand, we spend about 30% of the time in
> routines related to tcg (re-)translation.
Indeed the problem happens on CPUs which implement the MMU as a
"software assisted TLB" (or any other marketing name), as opposed to
a hardware page walk MMU. They can hold a limited number of TLB entries
at a given time, and require the OS to do the page walk to refill the
TLB. For that an exception is generated, and the faulting address has
to be determined. That's where the TB retranslation takes place, and
that's why it happens a lot more on these CPUs.
A few years ago, I measured about 45% of the TB translation actually
being retranslation for mips and 60% for SH4 for a standard workload.
For comparison, these values are around 1% on i386 and around 5% on ARM.
That's why each time we add an optimization to the optimizer, we get
faster code, but we might lose because it takes longer to generate.
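A toy illustration of why this hurts, not QEMU code: assume a
hypothetical guest with fixed 4-byte instructions, where the host code
size of each instruction is only known by running the translator
(toy_translate_insn is a made-up stand-in). Recovering the guest PC for
a fault at some host offset then means replaying the translation from
the start of the block:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical translation block: a run of fixed 4-byte guest insns. */
typedef struct {
    uint64_t guest_pc;   /* address of the first guest insn */
    size_t   num_insns;  /* number of guest insns in the block */
} ToyTB;

/* Stand-in for real code generation: returns the number of host bytes
 * emitted for the guest insn at 'pc' (8, 12 or 16). Illustrative only. */
static size_t toy_translate_insn(uint64_t pc)
{
    return 8 + (size_t)((pc >> 2) % 3) * 4;
}

/* Retranslation approach: replay the translator from the start of the
 * block until the emitted code covers 'host_off'. If 'replayed' is not
 * NULL it is incremented once per insn that had to be translated again. */
static uint64_t guest_pc_by_retranslation(const ToyTB *tb, size_t host_off,
                                          size_t *replayed)
{
    size_t emitted = 0;

    for (size_t i = 0; i < tb->num_insns; i++) {
        uint64_t pc = tb->guest_pc + 4 * i;

        emitted += toy_translate_insn(pc);
        if (replayed) {
            (*replayed)++;
        }
        if (host_off < emitted) {
            return pc;
        }
    }
    assert(!"host_off outside the translated block");
    return 0;
}
```

For a four-instruction block starting at 0x1000, a fault at host offset
40 replays all four instructions before it can report guest PC 0x100c,
and a soft-TLB guest pays that on every refill exception.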
> Aurelien has a patch in his own branches that attempts to mitigate this
> on mips by shadow caching more tlb entries. While this does improve
> performance a bit, it employs a linear search through a large buffer,
> with the effect of 30-ish % perf numbers for r4k_map_address.
> (One could probably improve things by hashing the data in that array,
> rather than a linear search, but...)
Yes, that is just a workaround and probably highly workload dependent,
that's why I never submitted it.
> In the past we've talked about getting rid of retranslation entirely.
> It's clever, but it certainly has its share of problems. I gave it
> a go this weekend.
Really great that you have been able to implement that.
> The following isn't quite right. It fails to boot on sparc even with
> our tiny test kernel. It also triggers an abort on mips, eventually.
> But it's able to get all the way through to a prompt, and in the
> process I can see that perf results are quite different -- much more
> like results I see for alpha.
>
> Thoughts on the approach?
It looks like the approach we discussed with Paolo back in June:
http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg04885.html
To me it looks like a good way to proceed; we just have to take care
that the information to store does not take too much space compared to
the actual translated code.
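One way to keep that overhead small, sketched under assumptions: store
the per-instruction data as deltas from the previous instruction and
pack them with LEB128, so that typical entries take one byte each. The
record layout (guest PC plus end offset of the host code), the function
names and the restriction to increasing guest PCs are all hypothetical;
nothing here is taken from the series itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append 'v' as unsigned LEB128; returns the new write position. */
static uint8_t *put_uleb128(uint8_t *p, uint64_t v)
{
    do {
        uint8_t byte = v & 0x7f;
        v >>= 7;
        if (v) {
            byte |= 0x80;       /* more bytes follow */
        }
        *p++ = byte;
    } while (v);
    return p;
}

/* Read one unsigned LEB128 value; returns the new read position. */
static const uint8_t *get_uleb128(const uint8_t *p, uint64_t *out)
{
    uint64_t v = 0;
    unsigned shift = 0;
    uint8_t byte;

    do {
        byte = *p++;
        v |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    *out = v;
    return p;
}

/* Encode n records (guest_pc, host_end) as deltas from the previous
 * record; the first record is relative to tb_pc and host offset 0.
 * Returns the number of bytes written to buf. */
static size_t encode_insn_table(uint8_t *buf, uint64_t tb_pc,
                                const uint64_t *guest_pc,
                                const uint32_t *host_end, size_t n)
{
    uint8_t *p = buf;
    uint64_t prev_pc = tb_pc;
    uint32_t prev_end = 0;

    for (size_t i = 0; i < n; i++) {
        /* Assumption of this sketch: both columns only ever increase. */
        assert(guest_pc[i] >= prev_pc && host_end[i] >= prev_end);
        p = put_uleb128(p, guest_pc[i] - prev_pc);
        p = put_uleb128(p, host_end[i] - prev_end);
        prev_pc = guest_pc[i];
        prev_end = host_end[i];
    }
    return (size_t)(p - buf);
}

/* Walk the table until reaching the insn whose host code contains
 * host_off. Returns 1 and sets *pc on success, 0 if out of range. */
static int lookup_guest_pc(const uint8_t *buf, size_t n, uint64_t tb_pc,
                           uint32_t host_off, uint64_t *pc)
{
    const uint8_t *p = buf;
    uint64_t cur_pc = tb_pc, cur_end = 0, d;

    for (size_t i = 0; i < n; i++) {
        p = get_uleb128(p, &d);
        cur_pc += d;
        p = get_uleb128(p, &d);
        cur_end += d;
        if (host_off < cur_end) {
            *pc = cur_pc;
            return 1;
        }
    }
    return 0;
}
```

For four instructions at 0x1000..0x100c whose host code ends at offsets
12, 28, 36 and 48, this table takes 8 bytes, against 48 bytes for raw
64-bit PC plus 32-bit offset pairs.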
I'll take a look and test it asap.
Aurelien
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-10 17:48 ` Aurelien Jarno
@ 2015-09-13 21:00 ` Aurelien Jarno
0 siblings, 0 replies; 62+ messages in thread
From: Aurelien Jarno @ 2015-09-13 21:00 UTC (permalink / raw)
To: Richard Henderson, qemu-devel, atar4qemu, dl.soluz
On 2015-09-10 19:48, Aurelien Jarno wrote:
> On 2015-09-01 22:51, Richard Henderson wrote:
> > I've been looking at this problem off and on for the last week or so,
> > prompted by the sparc performance work. Although I haven't been able
> > to get a proper sparc64 guest install working, I see the exact same
> > problem with a mips guest.
> >
> > On alpha or x86, which seem to perform well, perf numbers for the
> > executable have about 30% of the execution time spent in cpu_exec.
> > For mips, on the other hand, we spend about 30% of the time in
> > routines related to tcg (re-)translation.
>
> Indeed the problem happens on CPUs which implement the MMU as a
> "software assisted TLB" (or any other marketing name), as opposed to
> a hardware page walk MMU. They can hold a limited number of TLB entries
> at a given time, and require the OS to do the page walk to refill the
> TLB. For that an exception is generated, and the faulting address has
> to be determined. That's where the TB retranslation takes place, and
> that's why it happens a lot more on these CPUs.
>
> A few years ago, I measured about 45% of the TB translation actually
> being retranslation for mips and 60% for SH4 for a standard workload.
> For comparison, these values are around 1% on i386 and around 5% on ARM.
>
> That's why each time we add an optimization to the optimizer, we get
> faster code, but we might lose because it takes longer to generate.
>
> > Aurelien has a patch in his own branches that attempts to mitigate this
> > on mips by shadow caching more tlb entries. While this does improve
> > performance a bit, it employs a linear search through a large buffer,
> > with the effect of 30-ish % perf numbers for r4k_map_address.
> > (One could probably improve things by hashing the data in that array,
> > rather than a linear search, but...)
>
> Yes, that is just a workaround and probably highly workload dependent,
> that's why I never submitted it.
>
> > In the past we've talked about getting rid of retranslation entirely.
> > It's clever, but it certainly has its share of problems. I gave it
> > a go this weekend.
>
> Really great that you have been able to implement that.
>
> > The following isn't quite right. It fails to boot on sparc even with
> > our tiny test kernel. It also triggers an abort on mips, eventually.
> > But it's able to get all the way through to a prompt, and in the
> > process I can see that perf results are quite different -- much more
> > like results I see for alpha.
> >
> > Thoughts on the approach?
>
> It looks like the approach we discussed with Paolo back in June:
>
> http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg04885.html
>
> To me it looks like a good way to proceed; we just have to take care
> that the information to store does not take too much space compared to
> the actual translated code.
>
> I'll take a look and test it asap.
I haven't really reviewed the code yet, but I have been able to test
your tcg-search-2 branch.
First of all I have tested half of the targets (alpha, arm, cris, i386,
mips, ppc, s390x, sh4 and sparc), and I haven't noticed any regression.
They now have more than 50 hours of uptime, some of them have been
building stuff most of the time, so they are quite stable. That said
I have only tested your branch on an x86-64 host, and it might be a
good idea to test it in one or two different host architectures (I put
that on my todo list, but no promise there).
On the performance side, I have done real measurements only on i386 and
mips. On i386, I haven't seen any measurable difference. On mips, the
boot time is unchanged, but then some workloads are noticeably faster. The
best I have measured is on perl code, with a x2.4 improvement, while
on an average workload, the gain is around x1.5.
With all that said, you can get:
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
I hope to give you the corresponding reviewed-by in the next few days.
Aurelien
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-02 5:51 [Qemu-devel] [RFC 00/20] Do away with TB retranslation Richard Henderson
` (21 preceding siblings ...)
2015-09-10 17:48 ` Aurelien Jarno
@ 2015-09-10 18:55 ` Alex Bennée
2015-09-15 20:19 ` Richard Henderson
22 siblings, 1 reply; 62+ messages in thread
From: Alex Bennée @ 2015-09-10 18:55 UTC (permalink / raw)
To: Richard Henderson; +Cc: aurelien, qemu-devel, dl.soluz, atar4qemu
Richard Henderson <rth@twiddle.net> writes:
> I've been looking at this problem off and on for the last week or so,
> prompted by the sparc performance work. Although I haven't been able
> to get a proper sparc64 guest install working, I see the exact same
> problem with a mips guest.
>
<snip>
> In the past we've talked about getting rid of retranslation entirely.
> It's clever, but it certainly has its share of problems. I gave it
> a go this weekend.
>
<snip>
> Thoughts on the approach?
I've only had a quick glance so far but I'm fairly familiar with the
concept from a previous life. I'll aim to do a full review later once
I've gotten through my MTTCG review backlog.
Anyway some quick points:
* You can save data by only marking faulting instructions
Assuming that all asynchronous instructions trigger at the end/prologue
of basic blocks you only actually need to record the address of
potentially faulting instructions. In fact only a few backend
instructions will actually synchronously fault.
Of course this does have the downside of having to mark all those
instructions in the front end.
* This method can also be used for additional rectification data
AIUI we currently ensure all load/stores are barriers and ensure the CPU
register file is updated before they occur. However if you wanted to you
could drop that requirement and mark the target-host register pair and
only fish it out when required on a fault.
* Test suites are essential if you're going to get clever
Last time I went through this I built a SPARC test suite to cover all
faulting instructions in all the various addressing modes. It flushed
out a lot of bugs.
I appreciate that QEMU's aims may be a bit less demanding: it need not
be fully complete, and problems can be fixed up as we hit them in the field.
However consider at least a framework of a testcase for checking PC
rectification as it will help in validating those fixes.
* Delay slot/nPCs are a pain
Faults in delay slots are a pain to get right although maybe QEMU's
architecture makes it a little easier to do. Fortunately for me I no
longer have to worry too hard about these architectures, good luck ;-)
Anyway anything that gets rid of the re-translation cost I'm broadly
supportive of. I shall review the code later!
>
>
> r~
>
>
> Richard Henderson (20):
> tcg: Rename debug_insn_start to insn_start
> target-*: Unconditionally emit tcg_gen_insn_start
> tcg: Allow extra data to be attached to insn_start
> target-arm: Add condexec state to insn_start
> target-i386: Add cc_op state to insn_start
> target-mips: Add delayed branch state to insn_start
> target-s390x: Add cc_op state to insn_start
> target-sh4: Add flags state to insn_start
> target-cris: Mirror gen_opc_pc into insn_start
> target-sparc: Tidy gen_branch_a interface
> target-sparc: Split out gen_branch_n
> target-sparc: Remove gen_opc_jump_pc
> target-sparc: Add npc state to insn_start
> tcg: Merge cpu_gen_code into tb_gen_code
> target-*: Drop cpu_gen_code define
> tcg: Add TCG_MAX_INSNS
> tcg: Pass data argument to restore_state_to_opc
> tcg: Save insn data and use it in cpu_restore_state_from_tb
> tcg: Remove gen_intermediate_code_pc
> tcg: Remove tcg_gen_code_search_pc
>
> include/exec/exec-all.h | 6 +-
> target-alpha/cpu.h | 1 -
> target-alpha/translate.c | 55 +++-------
> target-arm/cpu.h | 2 +-
> target-arm/translate-a64.c | 39 ++-----
> target-arm/translate.c | 75 ++++---------
> target-arm/translate.h | 8 +-
> target-cris/cpu.h | 1 -
> target-cris/translate.c | 64 +++---------
> target-cris/translate_v10.c | 3 -
> target-i386/cpu.h | 2 +-
> target-i386/translate.c | 86 ++++-----------
> target-lm32/cpu.h | 1 -
> target-lm32/translate.c | 55 ++--------
> target-m68k/cpu.h | 1 -
> target-m68k/translate.c | 64 +++---------
> target-microblaze/cpu.h | 1 -
> target-microblaze/translate.c | 56 +++-------
> target-mips/cpu.h | 2 +-
> target-mips/translate.c | 73 ++++---------
> target-moxie/cpu.h | 1 -
> target-moxie/translate.c | 65 ++++--------
> target-openrisc/cpu.h | 1 -
> target-openrisc/translate.c | 54 ++--------
> target-ppc/cpu.h | 1 -
> target-ppc/translate.c | 56 +++-------
> target-s390x/cpu.h | 2 +-
> target-s390x/translate.c | 61 +++--------
> target-sh4/cpu.h | 2 +-
> target-sh4/translate.c | 71 ++++---------
> target-sparc/cpu.h | 2 +-
> target-sparc/translate.c | 189 ++++++++++++++-------------------
> target-tricore/translate.c | 53 ++++------
> target-unicore32/translate.c | 57 +++-------
> target-xtensa/cpu.h | 1 -
> target-xtensa/translate.c | 52 ++-------
> tcg/tcg-op.h | 52 +++++++--
> tcg/tcg-opc.h | 4 +-
> tcg/tcg.c | 96 ++++++++---------
> tcg/tcg.h | 14 ++-
> tci.c | 9 --
> translate-all.c | 237 ++++++++++++++++++++++++------------------
> 42 files changed, 578 insertions(+), 1097 deletions(-)
--
Alex Bennée
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-10 18:55 ` Alex Bennée
@ 2015-09-15 20:19 ` Richard Henderson
2015-09-16 6:19 ` Dennis Luehring
2015-09-16 8:59 ` Alex Bennée
0 siblings, 2 replies; 62+ messages in thread
From: Richard Henderson @ 2015-09-15 20:19 UTC (permalink / raw)
To: Alex Bennée; +Cc: dl.soluz, qemu-devel, aurelien, atar4qemu
On 09/10/2015 11:55 AM, Alex Bennée wrote:
> I've only had a quick glance so far but I'm fairly familiar with the
> concept from a previous life. I'll aim to do a full review later once
> I've gotten through my MTTCG review backlog.
>
> Anyway some quick points:
>
> * You can save data by only marking faulting instructions
>
> Assuming that all asynchronous instructions trigger at the end/prologue
> of basic blocks you only actually need to record the address of
> potentially faulting instructions. In fact only a few backend
> instructions will actually synchronously fault.
>
> Of course this does have the downside of having to mark all those
> instructions in the front end.
We have that. The only tcg opcodes that can fault are qemu_ld, qemu_st, and
call. So, yes, I could do exactly this. Perhaps not for round 1, however?
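As a rough sketch of what that could look like (the opcode enum below is
hypothetical and only borrows the names used in this thread, it is not
the real opcode list): scan the op stream of a block and count the guest
instructions that contain at least one op that may fault. Only those
would need an entry in the restore data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcode set for illustration only. */
typedef enum {
    TOY_INSN_START,  /* marks the start of a guest insn */
    TOY_MOV,
    TOY_ADD,
    TOY_QEMU_LD,
    TOY_QEMU_ST,
    TOY_CALL,
} ToyOp;

/* The "might it trap" test: loads, stores and helper calls only. */
static bool toy_op_can_fault(ToyOp op)
{
    switch (op) {
    case TOY_QEMU_LD:
    case TOY_QEMU_ST:
    case TOY_CALL:
        return true;
    default:
        return false;
    }
}

/* Count the guest insns that contain at least one op that may fault.
 * Each such insn is counted once, however many faulting ops it has. */
static size_t count_insns_needing_record(const ToyOp *ops, size_t n)
{
    size_t count = 0;
    bool in_insn = false;
    bool marked = false;

    for (size_t i = 0; i < n; i++) {
        if (ops[i] == TOY_INSN_START) {
            in_insn = true;
            marked = false;
        } else if (in_insn && !marked && toy_op_can_fault(ops[i])) {
            marked = true;
            count++;
        }
    }
    return count;
}
```

In a block of four guest insns where only two contain a load, store or
call, this would halve the number of records.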
> * This method can also be used for additional rectification data
>
> AIUI we currently ensure all load/stores are barriers and ensure the CPU
> register file is updated before they occur. However if you wanted to you
> could drop that requirement and mark the target-host register pair and
> only fish it out when required on a fault.
Maybe. I'd have to think about this. We'd probably want to study how many
flushes to the register file this could elide. My off-the-cuff guess is not
enough to make the extra overhead useful.
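To make the idea under discussion concrete, a hypothetical sketch (none
of these structures come from the series): each potentially faulting op
carries a short list of guest registers whose current value lives only
in a host register, and the fault path copies them back to the register
file before the exception is delivered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PAIRS 4

/* One guest register currently held only in a host register. */
typedef struct {
    uint8_t guest_reg;
    uint8_t host_reg;
} ToyRegPair;

/* Hypothetical per-op rectification record. */
typedef struct {
    uint32_t   host_off;              /* host offset of the faulting op */
    uint8_t    npairs;                /* valid entries in pairs[] */
    ToyRegPair pairs[TOY_MAX_PAIRS];
} ToyFaultRecord;

/* On a fault, write the live host register values back into the guest
 * register file; on the normal path no flush would have been emitted. */
static void toy_rectify(const ToyFaultRecord *rec,
                        const uint64_t *host_regs,
                        uint64_t *guest_regs)
{
    assert(rec->npairs <= TOY_MAX_PAIRS);
    for (size_t i = 0; i < rec->npairs; i++) {
        guest_regs[rec->pairs[i].guest_reg] =
            host_regs[rec->pairs[i].host_reg];
    }
}
```

The trade-off is visible in the record size: each entry costs space per
faulting op, so it only pays off if enough register file flushes can be
elided.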
> * Test suites are essential if you're going to get clever
>
> Last time I went through this I built a SPARC test suite to cover all
> faulting instructions in all the various addressing modes. It flushed
> out a lot of bugs.
Indeed. It would indeed be good to add a bunch of bare-metal tests.
Pre-compiled and checked in so that one doesn't have to have a suite of
cross-compilers in order to use them.
OTOH, I don't see myself (or anyone else) really having the time to do that.
r~
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-15 20:19 ` Richard Henderson
@ 2015-09-16 6:19 ` Dennis Luehring
2015-09-16 8:59 ` Alex Bennée
1 sibling, 0 replies; 62+ messages in thread
From: Dennis Luehring @ 2015-09-16 6:19 UTC (permalink / raw)
To: Richard Henderson, Alex Bennée; +Cc: qemu-devel, aurelien, atar4qemu
Am 15.09.2015 um 22:19 schrieb Richard Henderson:
> Indeed. It would indeed be good to add a bunch of bare-metal tests.
> Pre-compiled and checked in so that one doesn't have to have a suite of
> cross-compilers in order to use them.
>
> OTOH, I don't see myself (or anyone else) really having the time to do that.
i'm working on that - i've got a clfs (clfs.org) based script that
generates a sparc64-64 cross-compile suite
and a sparc64-64 linux system with my benchmark tests (prime, stream
working, g++ pugixml missing)
starting from an initramfs+ramdisk - currently it is still too big (no
stripping done, too many packages installed)
but i will reduce that
next steps:
- get the base system and benchmarks running
- create a script that starts up qemu, runs the tests, collects results, shuts down
- mips64-64, alpha
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-15 20:19 ` Richard Henderson
2015-09-16 6:19 ` Dennis Luehring
@ 2015-09-16 8:59 ` Alex Bennée
2015-09-16 20:41 ` Richard Henderson
1 sibling, 1 reply; 62+ messages in thread
From: Alex Bennée @ 2015-09-16 8:59 UTC (permalink / raw)
To: Richard Henderson; +Cc: dl.soluz, qemu-devel, aurelien, atar4qemu
Richard Henderson <rth@twiddle.net> writes:
> On 09/10/2015 11:55 AM, Alex Bennée wrote:
>> I've only had a quick glance so far but I'm fairly familiar with the
>> concept from a previous life. I'll aim to do a full review later once
>> I've gotten through my MTTCG review backlog.
>>
>> Anyway some quick points:
>>
>> * You can save data by only marking faulting instructions
>>
>> Assuming that all asynchronous instructions trigger at the end/prologue
>> of basic blocks you only actually need to record the address of
>> potentially faulting instructions. In fact only a few backend
>> instructions will actually synchronously fault.
>>
>> Of course this does have the downside of having to mark all those
>> instructions in the front end.
>
> We have that. The only tcg opcodes that can fault are qemu_ld, qemu_st, and
> call. So, yes, I could do exactly this. Perhaps not for round 1,
> however?
I unfortunately haven't got a SPARC manual handy (or my old testsuite)
but I was sure there was more than just loads/stores. I guess all the FP
related exceptions are handled in Softfloat for QEMU and the hardcoded
cases caught during instruction decode.
>> * This method can also be used for additional rectification data
>>
>> AIUI we currently ensure all load/stores are barriers and ensure the CPU
>> register file is updated before they occur. However if you wanted to you
>> could drop that requirement and mark the target-host register pair and
>> only fish it out when required on a fault.
>
> Maybe. I'd have to think about this. We'd probably want to study how many
> flushes to the register file this could elide. My off-the-cuff guess is not
> enough to make the extra overhead useful.
Sure - more instrumentation and data would be useful for this. It's an
area I'd like to look at once MTTCG is done as I think the next wins are
going to be in code generation and optimization.
>> * Test suites are essential if you're going to get clever
>>
>> Last time I went through this I built a SPARC test suite to cover all
>> faulting instructions in all the various addressing modes. It flushed
>> out a lot of bugs.
>
> Indeed. It would indeed be good to add a bunch of bare-metal tests.
> Pre-compiled and checked in so that one doesn't have to have a suite of
> cross-compilers in order to use them.
>
> OTOH, I don't see myself (or anyone else) really having the time to do
> that.
The eternal problem ;-)
I'll send an email to my old employer and see if the unit tests can be
liberated. The worst that could happen is they say no.
>
>
> r~
--
Alex Bennée
* Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
2015-09-16 8:59 ` Alex Bennée
@ 2015-09-16 20:41 ` Richard Henderson
0 siblings, 0 replies; 62+ messages in thread
From: Richard Henderson @ 2015-09-16 20:41 UTC (permalink / raw)
To: Alex Bennée; +Cc: aurelien, qemu-devel, dl.soluz, atar4qemu
On 09/16/2015 01:59 AM, Alex Bennée wrote:
>
> Richard Henderson <rth@twiddle.net> writes:
>
>> On 09/10/2015 11:55 AM, Alex Bennée wrote:
>>> I've only had a quick glance so far but I'm fairly familiar with the
>>> concept from a previous life. I'll aim to do a full review later once
>>> I've gotten through my MTTCG review backlog.
>>>
>>> Anyway some quick points:
>>>
>>> * You can save data by only marking faulting instructions
>>>
>>> Assuming that all asynchronous instructions trigger at the end/prologue
>>> of basic blocks you only actually need to record the address of
>>> potentially faulting instructions. In fact only a few backend
>>> instructions will actually synchronously fault.
>>>
>>> Of course this does have the downside of having to mark all those
>>> instructions in the front end.
>>
>> We have that. The only tcg opcodes that can fault are qemu_ld, qemu_st, and
>> call. So, yes, I could do exactly this. Perhaps not for round 1,
>> however?
>
> I unfortunately haven't got a SPARC manual handy (or my old testsuite)
> but I was sure there was more than just loads/stores. I guess all the FP
> related exceptions are handled in Softfloat for QEMU and the hardcoded
> cases caught during instruction decode.
Yes indeed. A tcg-opcode based "might it trap" test will be more inclusive
than one that actually examined the target insns. But it'll work universally
without needing to modify the translators, and it will save *some* space.
r~