* [Qemu-devel] [PATCH 1/7] cpu-exec: fix jmp_first out-of-bounds access with icount
2017-01-29 21:09 [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases Paolo Bonzini
@ 2017-01-29 21:09 ` Paolo Bonzini
2017-01-29 21:09 ` [Qemu-devel] [PATCH 2/7] cpu-exec: tighten barrier on TCG_EXIT_REQUESTED Paolo Bonzini
` (8 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2017-01-29 21:09 UTC (permalink / raw)
To: qemu-devel; +Cc: serge.fdrv, peter.maydell, pavel.dovgaluk, qemu-stable
When icount is active, tb_add_jump is surprisingly called with an out of
bounds basic block index. I have no idea how that can work (it overwrites
jmp_first so at least it doesn't cause an immediate segv), but it does
not seem like a good idea. Clear *last_tb for all TB_EXIT_ICOUNT_EXPIRED
cases, even when all you have to do is refill icount_extra.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
cpu-exec.c | 7 ++++---
include/exec/exec-all.h | 1 +
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/cpu-exec.c b/cpu-exec.c
index fa08c73..2dc10c1 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -542,7 +542,7 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
trace_exec_tb(tb, tb->pc);
ret = cpu_tb_exec(cpu, tb);
- *last_tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
+ tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
*tb_exit = ret & TB_EXIT_MASK;
switch (*tb_exit) {
case TB_EXIT_REQUESTED:
@@ -566,6 +566,7 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
abort();
#else
int insns_left = cpu->icount_decr.u32;
+ *last_tb = NULL;
if (cpu->icount_extra && insns_left >= 0) {
/* Refill decrementer and continue execution. */
cpu->icount_extra += insns_left;
@@ -575,17 +576,17 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
} else {
if (insns_left > 0) {
/* Execute remaining instructions. */
- cpu_exec_nocache(cpu, insns_left, *last_tb, false);
+ cpu_exec_nocache(cpu, insns_left, tb, false);
align_clocks(sc, cpu);
}
cpu->exception_index = EXCP_INTERRUPT;
- *last_tb = NULL;
cpu_loop_exit(cpu);
}
break;
#endif
}
default:
+ *last_tb = tb;
break;
}
}
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index bbc9478..21ab7bf 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -318,6 +318,7 @@ static inline void tb_set_jmp_target(TranslationBlock *tb,
static inline void tb_add_jump(TranslationBlock *tb, int n,
TranslationBlock *tb_next)
{
+ assert(n < ARRAY_SIZE(tb->jmp_list_next));
if (tb->jmp_list_next[n]) {
/* Another thread has already done this while we were
* outside of the lock; nothing to do in this case */
--
2.9.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Qemu-devel] [PATCH 2/7] cpu-exec: tighten barrier on TCG_EXIT_REQUESTED
2017-01-29 21:09 [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases Paolo Bonzini
2017-01-29 21:09 ` [Qemu-devel] [PATCH 1/7] cpu-exec: fix jmp_first out-of-bounds access with icount Paolo Bonzini
@ 2017-01-29 21:09 ` Paolo Bonzini
2017-01-29 21:09 ` [Qemu-devel] [PATCH 3/7] cpu-exec: avoid cpu_loop_exit in cpu_handle_interrupt Paolo Bonzini
` (7 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2017-01-29 21:09 UTC (permalink / raw)
To: qemu-devel; +Cc: serge.fdrv, peter.maydell, pavel.dovgaluk
This seems to have worked just fine so far on weakly-ordered
architectures, but I don't see anything that prevents the
reordering from:
store 1 to exit_request
store 1 to tcg_exit_req
load tcg_exit_req
store 0 to tcg_exit_req
load exit_request
store 0 to exit_request
store 1 to exit_request
store 1 to tcg_exit_req
to this:
store 1 to exit_request
store 1 to tcg_exit_req
load tcg_exit_req
load exit_request
store 1 to exit_request
store 1 to tcg_exit_req
store 0 to tcg_exit_req
store 0 to exit_request
therefore losing a request. It's possible that other memory barriers
(e.g. in rcu_read_unlock) are hiding it, but better safe than
sorry.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
cpu-exec.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/cpu-exec.c b/cpu-exec.c
index 2dc10c1..4065e0e 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -552,11 +552,11 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
* have set something else (eg exit_request or
* interrupt_request) which we will handle
* next time around the loop. But we need to
- * ensure the tcg_exit_req read in generated code
+ * ensure the zeroing of tcg_exit_req (see cpu_tb_exec)
* comes before the next read of cpu->exit_request
* or cpu->interrupt_request.
*/
- smp_rmb();
+ smp_mb();
*last_tb = NULL;
break;
case TB_EXIT_ICOUNT_EXPIRED:
--
2.9.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Qemu-devel] [PATCH 3/7] cpu-exec: avoid cpu_loop_exit in cpu_handle_interrupt
2017-01-29 21:09 [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases Paolo Bonzini
2017-01-29 21:09 ` [Qemu-devel] [PATCH 1/7] cpu-exec: fix jmp_first out-of-bounds access with icount Paolo Bonzini
2017-01-29 21:09 ` [Qemu-devel] [PATCH 2/7] cpu-exec: tighten barrier on TCG_EXIT_REQUESTED Paolo Bonzini
@ 2017-01-29 21:09 ` Paolo Bonzini
2017-01-29 21:09 ` [Qemu-devel] [PATCH 4/7] cpu-exec: avoid repeated sigsetjmp on interrupts Paolo Bonzini
` (6 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2017-01-29 21:09 UTC (permalink / raw)
To: qemu-devel; +Cc: serge.fdrv, peter.maydell, pavel.dovgaluk
The siglongjmp goes straight back to the beginning of cpu_exec's
outermost loop. We do not need a non-local jump; we can simply
leave the inner TB execution loop whose structure now becomes
clearer, too.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
cpu-exec.c | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/cpu-exec.c b/cpu-exec.c
index 4065e0e..e62abed 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -461,7 +461,7 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
return false;
}
-static inline void cpu_handle_interrupt(CPUState *cpu,
+static inline bool cpu_handle_interrupt(CPUState *cpu,
TranslationBlock **last_tb)
{
CPUClass *cc = CPU_GET_CLASS(cpu);
@@ -475,7 +475,7 @@ static inline void cpu_handle_interrupt(CPUState *cpu,
if (interrupt_request & CPU_INTERRUPT_DEBUG) {
cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
cpu->exception_index = EXCP_DEBUG;
- cpu_loop_exit(cpu);
+ return true;
}
if (replay_mode == REPLAY_MODE_PLAY && !replay_has_interrupt()) {
/* Do nothing */
@@ -484,7 +484,7 @@ static inline void cpu_handle_interrupt(CPUState *cpu,
cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
cpu->halted = 1;
cpu->exception_index = EXCP_HLT;
- cpu_loop_exit(cpu);
+ return true;
}
#if defined(TARGET_I386)
else if (interrupt_request & CPU_INTERRUPT_INIT) {
@@ -494,13 +494,13 @@ static inline void cpu_handle_interrupt(CPUState *cpu,
cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0);
do_cpu_init(x86_cpu);
cpu->exception_index = EXCP_HALTED;
- cpu_loop_exit(cpu);
+ return true;
}
#else
else if (interrupt_request & CPU_INTERRUPT_RESET) {
replay_interrupt();
cpu_reset(cpu);
- cpu_loop_exit(cpu);
+ return true;
}
#endif
/* The target hook has 3 exit conditions:
@@ -526,8 +526,10 @@ static inline void cpu_handle_interrupt(CPUState *cpu,
if (unlikely(atomic_read(&cpu->exit_request) || replay_has_interrupt())) {
atomic_set(&cpu->exit_request, 0);
cpu->exception_index = EXCP_INTERRUPT;
- cpu_loop_exit(cpu);
+ return true;
}
+
+ return false;
}
static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
@@ -625,7 +627,7 @@ int cpu_exec(CPUState *cpu)
for(;;) {
/* prepare setjmp context for exception handling */
if (sigsetjmp(cpu->jmp_env, 0) == 0) {
- TranslationBlock *tb, *last_tb = NULL;
+ TranslationBlock *last_tb = NULL;
int tb_exit = 0;
/* if an exception is pending, we execute it here */
@@ -633,14 +635,13 @@ int cpu_exec(CPUState *cpu)
break;
}
- for(;;) {
- cpu_handle_interrupt(cpu, &last_tb);
- tb = tb_find(cpu, last_tb, tb_exit);
+ while (!cpu_handle_interrupt(cpu, &last_tb)) {
+ TranslationBlock *tb = tb_find(cpu, last_tb, tb_exit);
cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit, &sc);
/* Try to align the host and virtual clocks
if the guest is in advance */
align_clocks(&sc, cpu);
- } /* for(;;) */
+ }
} else {
#if defined(__clang__) || !QEMU_GNUC_PREREQ(4, 6)
/* Some compilers wrongly smash all local variables after
--
2.9.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Qemu-devel] [PATCH 4/7] cpu-exec: avoid repeated sigsetjmp on interrupts
2017-01-29 21:09 [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases Paolo Bonzini
` (2 preceding siblings ...)
2017-01-29 21:09 ` [Qemu-devel] [PATCH 3/7] cpu-exec: avoid cpu_loop_exit in cpu_handle_interrupt Paolo Bonzini
@ 2017-01-29 21:09 ` Paolo Bonzini
2017-01-29 21:09 ` [Qemu-devel] [PATCH 5/7] cpu-exec: remove outermost infinite loop Paolo Bonzini
` (5 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2017-01-29 21:09 UTC (permalink / raw)
To: qemu-devel; +Cc: serge.fdrv, peter.maydell, pavel.dovgaluk
The sigsetjmp only needs to be prepared once for the whole execution
of cpu_exec. This patch takes care of the "== 0" side, using a
nested loop so that cpu_handle_interrupt goes straight back to
cpu_handle_exception without doing another sigsetjmp.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
cpu-exec.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/cpu-exec.c b/cpu-exec.c
index e62abed..fcf37ba 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -627,21 +627,21 @@ int cpu_exec(CPUState *cpu)
for(;;) {
/* prepare setjmp context for exception handling */
if (sigsetjmp(cpu->jmp_env, 0) == 0) {
- TranslationBlock *last_tb = NULL;
- int tb_exit = 0;
-
/* if an exception is pending, we execute it here */
- if (cpu_handle_exception(cpu, &ret)) {
- break;
+ while (!cpu_handle_exception(cpu, &ret)) {
+ TranslationBlock *last_tb = NULL;
+ int tb_exit = 0;
+
+ while (!cpu_handle_interrupt(cpu, &last_tb)) {
+ TranslationBlock *tb = tb_find(cpu, last_tb, tb_exit);
+ cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit, &sc);
+ /* Try to align the host and virtual clocks
+ if the guest is in advance */
+ align_clocks(&sc, cpu);
+ }
}
+ break;
- while (!cpu_handle_interrupt(cpu, &last_tb)) {
- TranslationBlock *tb = tb_find(cpu, last_tb, tb_exit);
- cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit, &sc);
- /* Try to align the host and virtual clocks
- if the guest is in advance */
- align_clocks(&sc, cpu);
- }
} else {
#if defined(__clang__) || !QEMU_GNUC_PREREQ(4, 6)
/* Some compilers wrongly smash all local variables after
--
2.9.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Qemu-devel] [PATCH 5/7] cpu-exec: remove outermost infinite loop
2017-01-29 21:09 [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases Paolo Bonzini
` (3 preceding siblings ...)
2017-01-29 21:09 ` [Qemu-devel] [PATCH 4/7] cpu-exec: avoid repeated sigsetjmp on interrupts Paolo Bonzini
@ 2017-01-29 21:09 ` Paolo Bonzini
2017-01-29 21:09 ` [Qemu-devel] [PATCH 6/7] cpu-exec: unify icount_decr and tcg_exit_req Paolo Bonzini
` (4 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2017-01-29 21:09 UTC (permalink / raw)
To: qemu-devel; +Cc: serge.fdrv, peter.maydell, pavel.dovgaluk
Reorganize the sigsetjmp so that the restart case falls through
to cpu_handle_exception and the execution loop.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
cpu-exec.c | 58 +++++++++++++++++++++++++++-------------------------------
1 file changed, 27 insertions(+), 31 deletions(-)
diff --git a/cpu-exec.c b/cpu-exec.c
index fcf37ba..1226362 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -624,41 +624,37 @@ int cpu_exec(CPUState *cpu)
*/
init_delay_params(&sc, cpu);
- for(;;) {
- /* prepare setjmp context for exception handling */
- if (sigsetjmp(cpu->jmp_env, 0) == 0) {
- /* if an exception is pending, we execute it here */
- while (!cpu_handle_exception(cpu, &ret)) {
- TranslationBlock *last_tb = NULL;
- int tb_exit = 0;
-
- while (!cpu_handle_interrupt(cpu, &last_tb)) {
- TranslationBlock *tb = tb_find(cpu, last_tb, tb_exit);
- cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit, &sc);
- /* Try to align the host and virtual clocks
- if the guest is in advance */
- align_clocks(&sc, cpu);
- }
- }
- break;
-
- } else {
+ /* prepare setjmp context for exception handling */
+ if (sigsetjmp(cpu->jmp_env, 0) != 0) {
#if defined(__clang__) || !QEMU_GNUC_PREREQ(4, 6)
- /* Some compilers wrongly smash all local variables after
- * siglongjmp. There were bug reports for gcc 4.5.0 and clang.
- * Reload essential local variables here for those compilers.
- * Newer versions of gcc would complain about this code (-Wclobbered). */
- cpu = current_cpu;
- cc = CPU_GET_CLASS(cpu);
+ /* Some compilers wrongly smash all local variables after
+ * siglongjmp. There were bug reports for gcc 4.5.0 and clang.
+ * Reload essential local variables here for those compilers.
+ * Newer versions of gcc would complain about this code (-Wclobbered). */
+ cpu = current_cpu;
+ cc = CPU_GET_CLASS(cpu);
#else /* buggy compiler */
- /* Assert that the compiler does not smash local variables. */
- g_assert(cpu == current_cpu);
- g_assert(cc == CPU_GET_CLASS(cpu));
+ /* Assert that the compiler does not smash local variables. */
+ g_assert(cpu == current_cpu);
+ g_assert(cc == CPU_GET_CLASS(cpu));
#endif /* buggy compiler */
- cpu->can_do_io = 1;
- tb_lock_reset();
+ cpu->can_do_io = 1;
+ tb_lock_reset();
+ }
+
+ /* if an exception is pending, we execute it here */
+ while (!cpu_handle_exception(cpu, &ret)) {
+ TranslationBlock *last_tb = NULL;
+ int tb_exit = 0;
+
+ while (!cpu_handle_interrupt(cpu, &last_tb)) {
+ TranslationBlock *tb = tb_find(cpu, last_tb, tb_exit);
+ cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit, &sc);
+ /* Try to align the host and virtual clocks
+ if the guest is in advance */
+ align_clocks(&sc, cpu);
}
- } /* for(;;) */
+ }
cc->cpu_exec_exit(cpu);
rcu_read_unlock();
--
2.9.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Qemu-devel] [PATCH 6/7] cpu-exec: unify icount_decr and tcg_exit_req
2017-01-29 21:09 [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases Paolo Bonzini
` (4 preceding siblings ...)
2017-01-29 21:09 ` [Qemu-devel] [PATCH 5/7] cpu-exec: remove outermost infinite loop Paolo Bonzini
@ 2017-01-29 21:09 ` Paolo Bonzini
2017-01-29 21:09 ` [Qemu-devel] [PATCH 7/7] cpu-exec: centralize exiting to the main loop Paolo Bonzini
` (3 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2017-01-29 21:09 UTC (permalink / raw)
To: qemu-devel; +Cc: serge.fdrv, peter.maydell, pavel.dovgaluk
The icount interrupt flag and tcg_exit_req serve almost the same
purpose, let's make them completely the same.
The former TB_EXIT_REQUESTED and TB_EXIT_ICOUNT_EXPIRED cases are
unified, since we can distinguish them from the value of the
interrupt flag.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
cpu-exec.c | 68 +++++++++++++++++++++--------------------------
include/exec/gen-icount.h | 53 +++++++++++++++++-------------------
include/qom/cpu.h | 15 +++++------
qom/cpu.c | 2 +-
tcg/tcg.h | 1 -
translate-all.c | 2 +-
translate-common.c | 13 ++++-----
7 files changed, 69 insertions(+), 85 deletions(-)
diff --git a/cpu-exec.c b/cpu-exec.c
index 1226362..903a834 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -185,12 +185,6 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, TranslationBlock *itb)
cc->set_pc(cpu, last_tb->pc);
}
}
- if (tb_exit == TB_EXIT_REQUESTED) {
- /* We were asked to stop executing TBs (probably a pending
- * interrupt. We've now stopped, so clear the flag.
- */
- atomic_set(&cpu->tcg_exit_req, 0);
- }
return ret;
}
@@ -537,6 +531,7 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
SyncClocks *sc)
{
uintptr_t ret;
+ int32_t insns_left;
if (unlikely(atomic_read(&cpu->exit_request))) {
return;
@@ -546,8 +541,15 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
ret = cpu_tb_exec(cpu, tb);
tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
*tb_exit = ret & TB_EXIT_MASK;
- switch (*tb_exit) {
- case TB_EXIT_REQUESTED:
+ if (*tb_exit != TB_EXIT_REQUESTED) {
+ *last_tb = tb;
+ return;
+ }
+
+ *last_tb = NULL;
+ insns_left = atomic_read(&cpu->icount_decr.u32);
+ atomic_set(&cpu->icount_decr.u16.high, 0);
+ if (insns_left < 0) {
/* Something asked us to stop executing
* chained TBs; just continue round the main
* loop. Whatever requested the exit will also
@@ -559,37 +561,27 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
* or cpu->interrupt_request.
*/
smp_mb();
- *last_tb = NULL;
- break;
- case TB_EXIT_ICOUNT_EXPIRED:
- {
- /* Instruction counter expired. */
-#ifdef CONFIG_USER_ONLY
- abort();
-#else
- int insns_left = cpu->icount_decr.u32;
- *last_tb = NULL;
- if (cpu->icount_extra && insns_left >= 0) {
- /* Refill decrementer and continue execution. */
- cpu->icount_extra += insns_left;
- insns_left = MIN(0xffff, cpu->icount_extra);
- cpu->icount_extra -= insns_left;
- cpu->icount_decr.u16.low = insns_left;
- } else {
- if (insns_left > 0) {
- /* Execute remaining instructions. */
- cpu_exec_nocache(cpu, insns_left, tb, false);
- align_clocks(sc, cpu);
- }
- cpu->exception_index = EXCP_INTERRUPT;
- cpu_loop_exit(cpu);
- }
- break;
-#endif
+ return;
}
- default:
- *last_tb = tb;
- break;
+
+ /* Instruction counter expired. */
+ assert(use_icount);
+ if (cpu->icount_extra) {
+ /* Refill decrementer and continue execution. */
+ cpu->icount_extra += insns_left;
+ insns_left = MIN(0xffff, cpu->icount_extra);
+ cpu->icount_extra -= insns_left;
+ cpu->icount_decr.u16.low = insns_left;
+ } else {
+ /* Execute any remaining instructions, then let the main loop
+ * handle the next event.
+ */
+ if (insns_left > 0) {
+ cpu_exec_nocache(cpu, insns_left, tb, false);
+ align_clocks(sc, cpu);
+ }
+ cpu->exception_index = EXCP_INTERRUPT;
+ cpu_loop_exit(cpu);
}
}
diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index 050de59..62d462e 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -6,58 +6,55 @@
/* Helpers for instruction counting code generation. */
static int icount_start_insn_idx;
-static TCGLabel *icount_label;
static TCGLabel *exitreq_label;
static inline void gen_tb_start(TranslationBlock *tb)
{
- TCGv_i32 count, flag, imm;
+ TCGv_i32 count, imm;
exitreq_label = gen_new_label();
- flag = tcg_temp_new_i32();
- tcg_gen_ld_i32(flag, cpu_env,
- offsetof(CPUState, tcg_exit_req) - ENV_OFFSET);
- tcg_gen_brcondi_i32(TCG_COND_NE, flag, 0, exitreq_label);
- tcg_temp_free_i32(flag);
-
- if (!(tb->cflags & CF_USE_ICOUNT)) {
- return;
+ if (tb->cflags & CF_USE_ICOUNT) {
+ count = tcg_temp_local_new_i32();
+ } else {
+ count = tcg_temp_new_i32();
}
- icount_label = gen_new_label();
- count = tcg_temp_local_new_i32();
tcg_gen_ld_i32(count, cpu_env,
-ENV_OFFSET + offsetof(CPUState, icount_decr.u32));
- imm = tcg_temp_new_i32();
- /* We emit a movi with a dummy immediate argument. Keep the insn index
- * of the movi so that we later (when we know the actual insn count)
- * can update the immediate argument with the actual insn count. */
- icount_start_insn_idx = tcg_op_buf_count();
- tcg_gen_movi_i32(imm, 0xdeadbeef);
+ if (tb->cflags & CF_USE_ICOUNT) {
+ imm = tcg_temp_new_i32();
+ /* We emit a movi with a dummy immediate argument. Keep the insn index
+ * of the movi so that we later (when we know the actual insn count)
+ * can update the immediate argument with the actual insn count. */
+ icount_start_insn_idx = tcg_op_buf_count();
+ tcg_gen_movi_i32(imm, 0xdeadbeef);
+
+ tcg_gen_sub_i32(count, count, imm);
+ tcg_temp_free_i32(imm);
+ }
+
+ tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, exitreq_label);
- tcg_gen_sub_i32(count, count, imm);
- tcg_temp_free_i32(imm);
+ if (tb->cflags & CF_USE_ICOUNT) {
+ tcg_gen_st16_i32(count, cpu_env,
+ -ENV_OFFSET + offsetof(CPUState, icount_decr.u16.low));
+ }
- tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, icount_label);
- tcg_gen_st16_i32(count, cpu_env,
- -ENV_OFFSET + offsetof(CPUState, icount_decr.u16.low));
tcg_temp_free_i32(count);
}
static void gen_tb_end(TranslationBlock *tb, int num_insns)
{
- gen_set_label(exitreq_label);
- tcg_gen_exit_tb((uintptr_t)tb + TB_EXIT_REQUESTED);
-
if (tb->cflags & CF_USE_ICOUNT) {
/* Update the num_insn immediate parameter now that we know
* the actual insn count. */
tcg_set_insn_param(icount_start_insn_idx, 1, num_insns);
- gen_set_label(icount_label);
- tcg_gen_exit_tb((uintptr_t)tb + TB_EXIT_ICOUNT_EXPIRED);
}
+ gen_set_label(exitreq_label);
+ tcg_gen_exit_tb((uintptr_t)tb + TB_EXIT_REQUESTED);
+
/* Terminate the linked list. */
tcg_ctx.gen_op_buf[tcg_ctx.gen_op_buf[0].prev].next = 0;
}
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index ca4d0fb..e0b93ce 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -271,11 +271,11 @@ struct qemu_work_item;
* @stopped: Indicates the CPU has been artificially stopped.
* @unplug: Indicates a pending CPU unplug request.
* @crash_occurred: Indicates the OS reported a crash (panic) for this CPU
- * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this
- * CPU and return to its top level loop.
* @singlestep_enabled: Flags for single-stepping.
* @icount_extra: Instructions until next timer event.
- * @icount_decr: Number of cycles left, with interrupt flag in high bit.
+ * @icount_decr: Low 16 bits: number of cycles left, only used in icount mode.
+ * High 16 bits: Set to -1 to force TCG to stop executing linked TBs for this
+ * CPU and return to its top level loop (even in non-icount mode).
* This allows a single read-compare-cbranch-write sequence to test
* for both decrementer underflow and exceptions.
* @can_do_io: Nonzero if memory-mapped IO is safe. Deterministic execution
@@ -377,10 +377,6 @@ struct CPUState {
/* TODO Move common fields from CPUArchState here. */
int cpu_index; /* used by alpha TCG */
uint32_t halted; /* used by alpha, cris, ppc TCG */
- union {
- uint32_t u32;
- icount_decr_u16 u16;
- } icount_decr;
uint32_t can_do_io;
int32_t exception_index; /* used by m68k TCG */
@@ -393,7 +389,10 @@ struct CPUState {
offset from AREG0. Leave this field at the end so as to make the
(absolute value) offset as small as possible. This reduces code
size, especially for hosts without large memory offsets. */
- uint32_t tcg_exit_req;
+ union {
+ uint32_t u32;
+ icount_decr_u16 u16;
+ } icount_decr;
bool hax_vcpu_dirty;
struct hax_vcpu_state *hax_vcpu;
diff --git a/qom/cpu.c b/qom/cpu.c
index e815db7..e63f4b9 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -123,7 +123,7 @@ void cpu_exit(CPUState *cpu)
atomic_set(&cpu->exit_request, 1);
/* Ensure cpu_exec will see the exit request after TCG has exited. */
smp_wmb();
- atomic_set(&cpu->tcg_exit_req, 1);
+ atomic_set(&cpu->icount_decr.u16.high, -1);
}
int cpu_write_elf32_qemunote(WriteCoreDumpFunction f, CPUState *cpu,
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 631c6f6..167aa30 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -1108,7 +1108,6 @@ static inline unsigned get_mmuidx(TCGMemOpIdx oi)
#define TB_EXIT_MASK 3
#define TB_EXIT_IDX0 0
#define TB_EXIT_IDX1 1
-#define TB_EXIT_ICOUNT_EXPIRED 2
#define TB_EXIT_REQUESTED 3
#ifdef HAVE_TCG_QEMU_TB_EXEC
diff --git a/translate-all.c b/translate-all.c
index 6d2fcab..b239e32 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -1958,7 +1958,7 @@ void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf)
void cpu_interrupt(CPUState *cpu, int mask)
{
cpu->interrupt_request |= mask;
- cpu->tcg_exit_req = 1;
+ cpu->icount_decr.u16.high = -1;
}
/*
diff --git a/translate-common.c b/translate-common.c
index 5e989cd..77762fd 100644
--- a/translate-common.c
+++ b/translate-common.c
@@ -43,14 +43,11 @@ static void tcg_handle_interrupt(CPUState *cpu, int mask)
return;
}
- if (use_icount) {
- cpu->icount_decr.u16.high = 0xffff;
- if (!cpu->can_do_io
- && (mask & ~old_mask) != 0) {
- cpu_abort(cpu, "Raised interrupt while not in I/O function");
- }
- } else {
- cpu->tcg_exit_req = 1;
+ cpu->icount_decr.u16.high = -1;
+ if (use_icount &&
+ !cpu->can_do_io
+ && (mask & ~old_mask) != 0) {
+ cpu_abort(cpu, "Raised interrupt while not in I/O function");
}
}
--
2.9.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [Qemu-devel] [PATCH 7/7] cpu-exec: centralize exiting to the main loop
2017-01-29 21:09 [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases Paolo Bonzini
` (5 preceding siblings ...)
2017-01-29 21:09 ` [Qemu-devel] [PATCH 6/7] cpu-exec: unify icount_decr and tcg_exit_req Paolo Bonzini
@ 2017-01-29 21:09 ` Paolo Bonzini
2017-01-29 21:18 ` [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases no-reply
` (2 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2017-01-29 21:09 UTC (permalink / raw)
To: qemu-devel; +Cc: serge.fdrv, peter.maydell, pavel.dovgaluk
The cpu->exit_request check in cpu_loop_exec_tb is unnecessary,
because cpu->tcg_exit_req is always set after cpu->exit_request.
So let the TB exit and we will pick up the exit request later
in cpu_handle_interrupt.
Likewise, use cpu->exit_request to set EXCP_INTERRUPT when the
instruction counter expires. Because cpu_loop_exec_tb is
followed immediately by align_clocks, this lets us remove the
SyncClocks *sc argument to cpu_loop_exec_tb.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
cpu-exec.c | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/cpu-exec.c b/cpu-exec.c
index 903a834..3838eb8 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -517,6 +517,8 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
*last_tb = NULL;
}
}
+
+ /* Finally, check if we need to exit to the main loop. */
if (unlikely(atomic_read(&cpu->exit_request) || replay_has_interrupt())) {
atomic_set(&cpu->exit_request, 0);
cpu->exception_index = EXCP_INTERRUPT;
@@ -527,16 +529,11 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
}
static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
- TranslationBlock **last_tb, int *tb_exit,
- SyncClocks *sc)
+ TranslationBlock **last_tb, int *tb_exit)
{
uintptr_t ret;
int32_t insns_left;
- if (unlikely(atomic_read(&cpu->exit_request))) {
- return;
- }
-
trace_exec_tb(tb, tb->pc);
ret = cpu_tb_exec(cpu, tb);
tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
@@ -578,10 +575,8 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
*/
if (insns_left > 0) {
cpu_exec_nocache(cpu, insns_left, tb, false);
- align_clocks(sc, cpu);
}
- cpu->exception_index = EXCP_INTERRUPT;
- cpu_loop_exit(cpu);
+ atomic_set(&cpu->exit_request, 1);
}
}
@@ -641,7 +636,7 @@ int cpu_exec(CPUState *cpu)
while (!cpu_handle_interrupt(cpu, &last_tb)) {
TranslationBlock *tb = tb_find(cpu, last_tb, tb_exit);
- cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit, &sc);
+ cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit);
/* Try to align the host and virtual clocks
if the guest is in advance */
align_clocks(&sc, cpu);
--
2.9.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases
2017-01-29 21:09 [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases Paolo Bonzini
` (6 preceding siblings ...)
2017-01-29 21:09 ` [Qemu-devel] [PATCH 7/7] cpu-exec: centralize exiting to the main loop Paolo Bonzini
@ 2017-01-29 21:18 ` no-reply
2017-01-31 9:05 ` Pavel Dovgalyuk
2017-02-15 12:42 ` Paolo Bonzini
9 siblings, 0 replies; 16+ messages in thread
From: no-reply @ 2017-01-29 21:18 UTC (permalink / raw)
To: pbonzini; +Cc: famz, qemu-devel, serge.fdrv, pavel.dovgaluk, peter.maydell
Hi,
Your series seems to have some coding style problems. See output below for
more information:
Type: series
Subject: [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases
Message-id: 20170129210910.6333-1-pbonzini@redhat.com
=== TEST SCRIPT BEGIN ===
#!/bin/bash
BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0
# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True
commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done
exit $failed
=== TEST SCRIPT END ===
Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
- [tag update] patchew/20161220191523.5779-1-eblake@redhat.com -> patchew/20161220191523.5779-1-eblake@redhat.com
- [tag update] patchew/20170127103922.19658-1-alex.bennee@linaro.org -> patchew/20170127103922.19658-1-alex.bennee@linaro.org
* [new tag] patchew/20170129210910.6333-1-pbonzini@redhat.com -> patchew/20170129210910.6333-1-pbonzini@redhat.com
Switched to a new branch 'test'
0f5e3df cpu-exec: centralize exiting to the main loop
c58d84e cpu-exec: unify icount_decr and tcg_exit_req
a1bbdfc cpu-exec: remove outermost infinite loop
a984d03 cpu-exec: avoid repeated sigsetjmp on interrupts
0dc98e6 cpu-exec: avoid cpu_loop_exit in cpu_handle_interrupt
218de04 cpu-exec: tighten barrier on TCG_EXIT_REQUESTED
bf63e5a cpu-exec: fix jmp_first out-of-bounds access with icount
=== OUTPUT BEGIN ===
Checking PATCH 1/7: cpu-exec: fix jmp_first out-of-bounds access with icount...
Checking PATCH 2/7: cpu-exec: tighten barrier on TCG_EXIT_REQUESTED...
ERROR: memory barrier without comment
#51: FILE: cpu-exec.c:559:
+ smp_mb();
total: 1 errors, 0 warnings, 13 lines checked
Your patch has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 3/7: cpu-exec: avoid cpu_loop_exit in cpu_handle_interrupt...
Checking PATCH 4/7: cpu-exec: avoid repeated sigsetjmp on interrupts...
Checking PATCH 5/7: cpu-exec: remove outermost infinite loop...
WARNING: line over 80 characters
#51: FILE: cpu-exec.c:633:
+ * Newer versions of gcc would complain about this code (-Wclobbered). */
total: 0 errors, 1 warnings, 68 lines checked
Your patch has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 6/7: cpu-exec: unify icount_decr and tcg_exit_req...
Checking PATCH 7/7: cpu-exec: centralize exiting to the main loop...
=== OUTPUT END ===
Test command exited with code: 1
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases
2017-01-29 21:09 [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases Paolo Bonzini
` (7 preceding siblings ...)
2017-01-29 21:18 ` [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases no-reply
@ 2017-01-31 9:05 ` Pavel Dovgalyuk
2017-02-01 20:54 ` Paolo Bonzini
2017-02-15 12:42 ` Paolo Bonzini
9 siblings, 1 reply; 16+ messages in thread
From: Pavel Dovgalyuk @ 2017-01-31 9:05 UTC (permalink / raw)
To: 'Paolo Bonzini', qemu-devel
Cc: serge.fdrv, peter.maydell, pavel.dovgaluk
Hi, Paolo!
Thanks for refactoring.
I tested these patches with icount record/replay on i386 machine.
It works, but the following changes should be applied.
I also removed call to replay_has_interrupt, because now it is not needed here.
It seems, that this call is an artifact of an older record/replay revision.
diff --git a/cpu-exec.c b/cpu-exec.c
index 3838eb8..5cef8bc 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -519,7 +519,8 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
}
/* Finally, check if we need to exit to the main loop. */
- if (unlikely(atomic_read(&cpu->exit_request) || replay_has_interrupt())) {
+ if (unlikely(atomic_read(&cpu->exit_request)
+ || (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 0))) {
atomic_set(&cpu->exit_request, 0);
cpu->exception_index = EXCP_INTERRUPT;
return true;
Pavel Dovgalyuk
> -----Original Message-----
> From: Paolo Bonzini [mailto:paolo.bonzini@gmail.com] On Behalf Of Paolo Bonzini
> Sent: Monday, January 30, 2017 12:09 AM
> To: qemu-devel@nongnu.org
> Cc: serge.fdrv@gmail.com; peter.maydell@linaro.org; pavel.dovgaluk@ispras.ru
> Subject: [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases
>
> The series includes three parts:
>
> 1-2: fix two bugs, the first one pretty bad, the second seems
> to be theoretical only.
>
> 3-5: simplify cpu_exec. This builds on Sergey's conversion
> of cpu_exec to a simple top-down logic, making the phases
> clearer and saving on the cost of siglongjmp in the meanwhile.
>
> 6-7: these are intended to be a base for Pavel's record/replay
> fixes. The main thing I noticed while reviewing is that icount
> is redoing (with u16.high) a lot of things that tcg_exit_req is
> doing too. This is because, at the time icount was introduced,
> tcg_exit_req didn't exist and QEMU instead unwound chained TBs
> through POSIX signals. But now we have essentially two ways to
> do the same thing with subtly different invariants or downright
> bugs (such as the one fixed by patch 1). Patch 6 therefore
> unifies tcg_exit_req and the icount interrupt flag. It saves a
> handful of instructions per TB in icount mode and generally
> makes icount mode "less special", which is a good thing since
> no one seems to understand it well. Patch 7 then removes another
> EXCP_INTERRUPT/cpu_loop_exit pair; by exiting to main loop simply
> through cpu->exit_request, hopefully it fixes one of the issues that
> Pavel was seeing.
>
> For now I've tested this only on an aarch64 Linux image (with
> and without -icount). Thanks,
>
> Paolo
>
> Paolo Bonzini (7):
> cpu-exec: fix jmp_first out-of-bounds access with icount
> cpu-exec: tighten barrier on TCG_EXIT_REQUESTED
> cpu-exec: avoid cpu_loop_exit in cpu_handle_interrupt
> cpu-exec: avoid repeated sigsetjmp on interrupts
> cpu-exec: remove outermost infinite loop
> cpu-exec: unify icount_decr and tcg_exit_req
> cpu-exec: centralize exiting to the main loop
>
> cpu-exec.c | 153 +++++++++++++++++++++-------------------------
> include/exec/exec-all.h | 1 +
> include/exec/gen-icount.h | 53 ++++++++--------
> include/qom/cpu.h | 15 +++--
> qom/cpu.c | 2 +-
> tcg/tcg.h | 1 -
> translate-all.c | 2 +-
> translate-common.c | 13 ++--
> 8 files changed, 109 insertions(+), 131 deletions(-)
>
> --
> 2.9.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases
2017-01-31 9:05 ` Pavel Dovgalyuk
@ 2017-02-01 20:54 ` Paolo Bonzini
2017-02-03 7:07 ` Pavel Dovgalyuk
0 siblings, 1 reply; 16+ messages in thread
From: Paolo Bonzini @ 2017-02-01 20:54 UTC (permalink / raw)
To: Pavel Dovgalyuk, qemu-devel; +Cc: serge.fdrv, pavel.dovgaluk, peter.maydell
On 31/01/2017 01:05, Pavel Dovgalyuk wrote:
> Hi, Paolo!
>
> Thanks for refactoring.
> I tested these patches with icount record/replay on i386 machine.
> It works, but the following changes should be applied.
> I also removed call to replay_has_interrupt, because now it is not needed here.
> It seems, that this call is an artifact of an older record/replay revision.
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 3838eb8..5cef8bc 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -519,7 +519,8 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
> }
>
> /* Finally, check if we need to exit to the main loop. */
> - if (unlikely(atomic_read(&cpu->exit_request) || replay_has_interrupt())) {
> + if (unlikely(atomic_read(&cpu->exit_request)
> + || (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 0))) {
> atomic_set(&cpu->exit_request, 0);
> cpu->exception_index = EXCP_INTERRUPT;
> return true;
So is this needed to avoid exceptions in tb_find? Please add a comment
about this and check if you can also replace:
atomic_set(&cpu->exit_request, 1);
in cpu_loop_exec_tb with
cpu->icount_decr.u16.low = 0;
?
Thanks,
Paolo
> Pavel Dovgalyuk
>
>> -----Original Message-----
>> From: Paolo Bonzini [mailto:paolo.bonzini@gmail.com] On Behalf Of Paolo Bonzini
>> Sent: Monday, January 30, 2017 12:09 AM
>> To: qemu-devel@nongnu.org
>> Cc: serge.fdrv@gmail.com; peter.maydell@linaro.org; pavel.dovgaluk@ispras.ru
>> Subject: [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases
>>
>> The series includes three parts:
>>
>> 1-2: fix two bugs, the first one pretty bad, the second seems
>> to be theoretical only.
>>
>> 3-5: simplify cpu_exec. This builds on Sergey's conversion
>> of cpu_exec to a simple top-down logic, making the phases
>> clearer and saving on the cost of siglongjmp in the meanwhile.
>>
>> 6-7: these are intended to be a base for Pavel's record/replay
>> fixes. The main thing I noticed while reviewing is that icount
>> is redoing (with u16.high) a lot of things that tcg_exit_req is
>> doing too. This is because, at the time icount was introduced,
>> tcg_exit_req didn't exist and QEMU instead unwound chained TBs
>> through POSIX signals. But now we have essentially two ways to
>> do the same thing with subtly different invariants or downright
>> bugs (such as the one fixed by patch 1). Patch 6 therefore
>> unifies tcg_exit_req and the icount interrupt flag. It saves a
>> handful of instructions per TB in icount mode and generally
>> makes icount mode "less special", which is a good thing since
>> no one seems to understand it well. Patch 7 then removes another
>> EXCP_INTERRUPT/cpu_loop_exit pair; by exiting to main loop simply
>> through cpu->exit_request, hopefully it fixes one of the issues that
>> Pavel was seeing.
>>
>> For now I've tested this only on an aarch64 Linux image (with
>> and without -icount). Thanks,
>>
>> Paolo
>>
>> Paolo Bonzini (7):
>> cpu-exec: fix jmp_first out-of-bounds access with icount
>> cpu-exec: tighten barrier on TCG_EXIT_REQUESTED
>> cpu-exec: avoid cpu_loop_exit in cpu_handle_interrupt
>> cpu-exec: avoid repeated sigsetjmp on interrupts
>> cpu-exec: remove outermost infinite loop
>> cpu-exec: unify icount_decr and tcg_exit_req
>> cpu-exec: centralize exiting to the main loop
>>
>> cpu-exec.c | 153 +++++++++++++++++++++-------------------------
>> include/exec/exec-all.h | 1 +
>> include/exec/gen-icount.h | 53 ++++++++--------
>> include/qom/cpu.h | 15 +++--
>> qom/cpu.c | 2 +-
>> tcg/tcg.h | 1 -
>> translate-all.c | 2 +-
>> translate-common.c | 13 ++--
>> 8 files changed, 109 insertions(+), 131 deletions(-)
>>
>> --
>> 2.9.3
>
>
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases
2017-02-01 20:54 ` Paolo Bonzini
@ 2017-02-03 7:07 ` Pavel Dovgalyuk
2017-02-03 15:07 ` Paolo Bonzini
0 siblings, 1 reply; 16+ messages in thread
From: Pavel Dovgalyuk @ 2017-02-03 7:07 UTC (permalink / raw)
To: 'Paolo Bonzini', qemu-devel
Cc: serge.fdrv, pavel.dovgaluk, peter.maydell
> From: Paolo Bonzini [mailto:paolo.bonzini@gmail.com] On Behalf Of Paolo Bonzini
> On 31/01/2017 01:05, Pavel Dovgalyuk wrote:
> > Hi, Paolo!
> >
> > Thanks for refactoring.
> > I tested these patches with icount record/replay on i386 machine.
> > It works, but the following changes should be applied.
> > I also removed call to replay_has_interrupt, because now it is not needed here.
> > It seems, that this call is an artifact of an older record/replay revision.
> >
> > diff --git a/cpu-exec.c b/cpu-exec.c
> > index 3838eb8..5cef8bc 100644
> > --- a/cpu-exec.c
> > +++ b/cpu-exec.c
> > @@ -519,7 +519,8 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
> > }
> >
> > /* Finally, check if we need to exit to the main loop. */
> > - if (unlikely(atomic_read(&cpu->exit_request) || replay_has_interrupt())) {
> > + if (unlikely(atomic_read(&cpu->exit_request)
> > + || (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 0))) {
> > atomic_set(&cpu->exit_request, 0);
> > cpu->exception_index = EXCP_INTERRUPT;
> > return true;
>
> So is this needed to avoid exceptions in tb_find? Please add a comment
> about this
This code comes from my last patch, that was not applied.
Here is the comment:
It adds check to break cpu loop when icount expires without
setting the TB_EXIT_ICOUNT_EXPIRED flag. It happens when there is no
available translated blocks and all instructions were executed.
In icount replay mode unnecessary tb_find will be called (which may
cause an exception) and execution will be non-deterministic.
> and check if you can also replace:
>
> atomic_set(&cpu->exit_request, 1);
>
> in cpu_loop_exec_tb with
>
> cpu->icount_decr.u16.low = 0;
>
> ?
>
This line is not needed at all, because the following code decrements
icount automatically.
if (insns_left > 0) {
cpu_exec_nocache(cpu, insns_left, tb, false);
}
Pavel Dovgalyuk
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases
2017-02-03 7:07 ` Pavel Dovgalyuk
@ 2017-02-03 15:07 ` Paolo Bonzini
0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2017-02-03 15:07 UTC (permalink / raw)
To: Pavel Dovgalyuk; +Cc: qemu-devel, serge fdrv, pavel dovgaluk, peter maydell
----- Original Message -----
> From: "Pavel Dovgalyuk" <dovgaluk@ispras.ru>
> To: "Paolo Bonzini" <pbonzini@redhat.com>, qemu-devel@nongnu.org
> Cc: "serge fdrv" <serge.fdrv@gmail.com>, "pavel dovgaluk" <pavel.dovgaluk@ispras.ru>, "peter maydell"
> <peter.maydell@linaro.org>
> Sent: Thursday, February 2, 2017 11:07:12 PM
> Subject: RE: [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases
>
> > From: Paolo Bonzini [mailto:paolo.bonzini@gmail.com] On Behalf Of Paolo
> > Bonzini
> > On 31/01/2017 01:05, Pavel Dovgalyuk wrote:
> > > Hi, Paolo!
> > >
> > > Thanks for refactoring.
> > > I tested these patches with icount record/replay on i386 machine.
> > > It works, but the following changes should be applied.
> > > I also removed call to replay_has_interrupt, because now it is not needed
> > > here.
> > > It seems, that this call is an artifact of an older record/replay
> > > revision.
> > >
> > > diff --git a/cpu-exec.c b/cpu-exec.c
> > > index 3838eb8..5cef8bc 100644
> > > --- a/cpu-exec.c
> > > +++ b/cpu-exec.c
> > > @@ -519,7 +519,8 @@ static inline bool cpu_handle_interrupt(CPUState
> > > *cpu,
> > > }
> > >
> > > /* Finally, check if we need to exit to the main loop. */
> > > - if (unlikely(atomic_read(&cpu->exit_request) ||
> > > replay_has_interrupt())) {
> > > + if (unlikely(atomic_read(&cpu->exit_request)
> > > + || (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra
> > > == 0))) {
> > > atomic_set(&cpu->exit_request, 0);
> > > cpu->exception_index = EXCP_INTERRUPT;
> > > return true;
> >
> > So is this needed to avoid exceptions in tb_find? Please add a comment
> > about this
>
> This code comes from my last patch, that was not applied.
> Here is the comment:
>
> It adds check to break cpu loop when icount expires without
> setting the TB_EXIT_ICOUNT_EXPIRED flag. It happens when there is no
> available translated blocks and all instructions were executed.
> In icount replay mode unnecessary tb_find will be called (which may
> cause an exception) and execution will be non-deterministic.
>
> > and check if you can also replace:
> >
> > atomic_set(&cpu->exit_request, 1);
> >
> > in cpu_loop_exec_tb with
> >
> > cpu->icount_decr.u16.low = 0;
> >
> > ?
> >
>
> This line is not needed at all, because the following code decrements
> icount automatically.
>
> if (insns_left > 0) {
> cpu_exec_nocache(cpu, insns_left, tb, false);
> }
Right, so please send a patch and we can apply my series + yours.
Paolo
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases
2017-01-29 21:09 [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases Paolo Bonzini
` (8 preceding siblings ...)
2017-01-31 9:05 ` Pavel Dovgalyuk
@ 2017-02-15 12:42 ` Paolo Bonzini
2017-02-15 12:45 ` Pavel Dovgalyuk
9 siblings, 1 reply; 16+ messages in thread
From: Paolo Bonzini @ 2017-02-15 12:42 UTC (permalink / raw)
To: qemu-devel; +Cc: serge.fdrv, pavel.dovgaluk, peter.maydell
On 29/01/2017 22:09, Paolo Bonzini wrote:
> The series includes three parts:
>
> 1-2: fix two bugs, the first one pretty bad, the second seems
> to be theoretical only.
>
> 3-5: simplify cpu_exec. This builds on Sergey's conversion
> of cpu_exec to a simple top-down logic, making the phases
> clearer and saving on the cost of siglongjmp in the meanwhile.
I'll include these five in my next pull request. Review for patch 6 is
welcome!
Paolo
> 6-7: these are intended to be a base for Pavel's record/replay
> fixes. The main thing I noticed while reviewing is that icount
> is redoing (with u16.high) a lot of things that tcg_exit_req is
> doing too. This is because, at the time icount was introduced,
> tcg_exit_req didn't exist and QEMU instead unwound chained TBs
> through POSIX signals. But now we have essentially two ways to
> do the same thing with subtly different invariants or downright
> bugs (such as the one fixed by patch 1). Patch 6 therefore
> unifies tcg_exit_req and the icount interrupt flag. It saves a
> handful of instructions per TB in icount mode and generally
> makes icount mode "less special", which is a good thing since
> no one seems to understand it well. Patch 7 then removes another
> EXCP_INTERRUPT/cpu_loop_exit pair; by exiting to main loop simply
> through cpu->exit_request, hopefully it fixes one of the issues that
> Pavel was seeing.
>
> For now I've tested this only on an aarch64 Linux image (with
> and without -icount). Thanks,
>
> Paolo
>
> Paolo Bonzini (7):
> cpu-exec: fix jmp_first out-of-bounds access with icount
> cpu-exec: tighten barrier on TCG_EXIT_REQUESTED
> cpu-exec: avoid cpu_loop_exit in cpu_handle_interrupt
> cpu-exec: avoid repeated sigsetjmp on interrupts
> cpu-exec: remove outermost infinite loop
> cpu-exec: unify icount_decr and tcg_exit_req
> cpu-exec: centralize exiting to the main loop
>
> cpu-exec.c | 153 +++++++++++++++++++++-------------------------
> include/exec/exec-all.h | 1 +
> include/exec/gen-icount.h | 53 ++++++++--------
> include/qom/cpu.h | 15 +++--
> qom/cpu.c | 2 +-
> tcg/tcg.h | 1 -
> translate-all.c | 2 +-
> translate-common.c | 13 ++--
> 8 files changed, 109 insertions(+), 131 deletions(-)
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases
2017-02-15 12:42 ` Paolo Bonzini
@ 2017-02-15 12:45 ` Pavel Dovgalyuk
2017-02-15 12:57 ` Paolo Bonzini
0 siblings, 1 reply; 16+ messages in thread
From: Pavel Dovgalyuk @ 2017-02-15 12:45 UTC (permalink / raw)
To: 'Paolo Bonzini', qemu-devel
Cc: serge.fdrv, pavel.dovgaluk, peter.maydell
> From: Paolo Bonzini [mailto:paolo.bonzini@gmail.com] On Behalf Of Paolo Bonzini
> On 29/01/2017 22:09, Paolo Bonzini wrote:
> > The series includes three parts:
> >
> > 1-2: fix two bugs, the first one pretty bad, the second seems
> > to be theoretical only.
> >
> > 3-5: simplify cpu_exec. This builds on Sergey's conversion
> > of cpu_exec to a simple top-down logic, making the phases
> > clearer and saving on the cost of siglongjmp in the meanwhile.
>
> I'll include these five in my next pull request. Review for patch 6 is
> welcome!
I've tested all 7 and briefly looked at their code.
Do you need official reviewed-by?
Pavel Dovgalyuk
>
> > 6-7: these are intended to be a base for Pavel's record/replay
> > fixes. The main thing I noticed while reviewing is that icount
> > is redoing (with u16.high) a lot of things that tcg_exit_req is
> > doing too. This is because, at the time icount was introduced,
> > tcg_exit_req didn't exist and QEMU instead unwound chained TBs
> > through POSIX signals. But now we have essentially two ways to
> > do the same thing with subtly different invariants or downright
> > bugs (such as the one fixed by patch 1). Patch 6 therefore
> > unifies tcg_exit_req and the icount interrupt flag. It saves a
> > handful of instructions per TB in icount mode and generally
> > makes icount mode "less special", which is a good thing since
> > no one seems to understand it well. Patch 7 then removes another
> > EXCP_INTERRUPT/cpu_loop_exit pair; by exiting to main loop simply
> > through cpu->exit_request, hopefully it fixes one of the issues that
> > Pavel was seeing.
> >
> > For now I've tested this only on an aarch64 Linux image (with
> > and without -icount). Thanks,
> >
> > Paolo
> >
> > Paolo Bonzini (7):
> > cpu-exec: fix jmp_first out-of-bounds access with icount
> > cpu-exec: tighten barrier on TCG_EXIT_REQUESTED
> > cpu-exec: avoid cpu_loop_exit in cpu_handle_interrupt
> > cpu-exec: avoid repeated sigsetjmp on interrupts
> > cpu-exec: remove outermost infinite loop
> > cpu-exec: unify icount_decr and tcg_exit_req
> > cpu-exec: centralize exiting to the main loop
> >
> > cpu-exec.c | 153 +++++++++++++++++++++-------------------------
> > include/exec/exec-all.h | 1 +
> > include/exec/gen-icount.h | 53 ++++++++--------
> > include/qom/cpu.h | 15 +++--
> > qom/cpu.c | 2 +-
> > tcg/tcg.h | 1 -
> > translate-all.c | 2 +-
> > translate-common.c | 13 ++--
> > 8 files changed, 109 insertions(+), 131 deletions(-)
> >
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC/RFT PATCH 0/7] cpu-exec: simplify cpu_exec and remove some icount special cases
2017-02-15 12:45 ` Pavel Dovgalyuk
@ 2017-02-15 12:57 ` Paolo Bonzini
0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2017-02-15 12:57 UTC (permalink / raw)
To: Pavel Dovgalyuk, qemu-devel; +Cc: serge.fdrv, pavel.dovgaluk, peter.maydell
On 15/02/2017 13:45, Pavel Dovgalyuk wrote:
>> From: Paolo Bonzini [mailto:paolo.bonzini@gmail.com] On Behalf Of Paolo Bonzini
>> On 29/01/2017 22:09, Paolo Bonzini wrote:
>>> The series includes three parts:
>>>
>>> 1-2: fix two bugs, the first one pretty bad, the second seems
>>> to be theoretical only.
>>>
>>> 3-5: simplify cpu_exec. This builds on Sergey's conversion
>>> of cpu_exec to a simple top-down logic, making the phases
>>> clearer and saving on the cost of siglongjmp in the meanwhile.
>>
>> I'll include these five in my next pull request. Review for patch 6 is
>> welcome!
>
> I've tested all 7 and briefly looked at their code.
> Do you need official reviewed-by?
No, but for patch 6 specifically, I would like more than one review. :)
Paolo
> Pavel Dovgalyuk
>>
>>> 6-7: these are intended to be a base for Pavel's record/replay
>>> fixes. The main thing I noticed while reviewing is that icount
>>> is redoing (with u16.high) a lot of things that tcg_exit_req is
>>> doing too. This is because, at the time icount was introduced,
>>> tcg_exit_req didn't exist and QEMU instead unwound chained TBs
>>> through POSIX signals. But now we have essentially two ways to
>>> do the same thing with subtly different invariants or downright
>>> bugs (such as the one fixed by patch 1). Patch 6 therefore
>>> unifies tcg_exit_req and the icount interrupt flag. It saves a
>>> handful of instructions per TB in icount mode and generally
>>> makes icount mode "less special", which is a good thing since
>>> no one seems to understand it well. Patch 7 then removes another
>>> EXCP_INTERRUPT/cpu_loop_exit pair; by exiting to main loop simply
>>> through cpu->exit_request, hopefully it fixes one of the issues that
>>> Pavel was seeing.
>>>
>>> For now I've tested this only on an aarch64 Linux image (with
>>> and without -icount). Thanks,
>>>
>>> Paolo
>>>
>>> Paolo Bonzini (7):
>>> cpu-exec: fix jmp_first out-of-bounds access with icount
>>> cpu-exec: tighten barrier on TCG_EXIT_REQUESTED
>>> cpu-exec: avoid cpu_loop_exit in cpu_handle_interrupt
>>> cpu-exec: avoid repeated sigsetjmp on interrupts
>>> cpu-exec: remove outermost infinite loop
>>> cpu-exec: unify icount_decr and tcg_exit_req
>>> cpu-exec: centralize exiting to the main loop
>>>
>>> cpu-exec.c | 153 +++++++++++++++++++++-------------------------
>>> include/exec/exec-all.h | 1 +
>>> include/exec/gen-icount.h | 53 ++++++++--------
>>> include/qom/cpu.h | 15 +++--
>>> qom/cpu.c | 2 +-
>>> tcg/tcg.h | 1 -
>>> translate-all.c | 2 +-
>>> translate-common.c | 13 ++--
>>> 8 files changed, 109 insertions(+), 131 deletions(-)
>>>
>
^ permalink raw reply [flat|nested] 16+ messages in thread