From: "Alex Bennée" <alex.bennee@linaro.org>
To: mttcg@listserver.greensocs.com, qemu-devel@nongnu.org,
fred.konrad@greensocs.com, a.rigo@virtualopensystems.com,
serge.fdrv@gmail.com, cota@braap.org, bobby.prani@gmail.com
Cc: mark.burton@greensocs.com, pbonzini@redhat.com,
jan.kiszka@siemens.com, rth@twiddle.net,
peter.maydell@linaro.org, claudio.fontana@huawei.com,
"Alex Bennée" <alex.bennee@linaro.org>,
"Peter Crosthwaite" <crosthwaite.peter@gmail.com>,
"Eduardo Habkost" <ehabkost@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>
Subject: [Qemu-devel] [RFC v3 15/19] tcg: drop global lock during TCG code execution
Date: Fri, 3 Jun 2016 21:40:24 +0100 [thread overview]
Message-ID: <1464986428-6739-16-git-send-email-alex.bennee@linaro.org> (raw)
In-Reply-To: <1464986428-6739-1-git-send-email-alex.bennee@linaro.org>
This finally allows TCG to benefit from the iothread introduction: Drop
the global mutex while running pure TCG CPU code. Reacquire the lock
when entering MMIO or PIO emulation, or when leaving the TCG loop.
We have to revert a few optimization for the current TCG threading
model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
kicking it in qemu_cpu_kick. We also need to disable RAM block
reordering until we have a more efficient locking mechanism at hand.
Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
These numbers demonstrate where we gain something:
20338 jan 20 0 331m 75m 6904 R 99 0.9 0:50.95 qemu-system-arm
20337 jan 20 0 331m 75m 6904 S 20 0.9 0:26.50 qemu-system-arm
The guest CPU was fully loaded, but the iothread could still run mostly
independent on a second core. Without the patch we don't get beyond
32206 jan 20 0 330m 73m 7036 R 82 0.9 1:06.00 qemu-system-arm
32204 jan 20 0 330m 73m 7036 S 21 0.9 0:17.03 qemu-system-arm
We don't benefit significantly, though, when the guest is not fully
loading a host CPU.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
[FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[EGC: fixed iothread lock for cpu-exec IRQ handling]
Signed-off-by: Emilio G. Cota <cota@braap.org>
[AJB: -smp single-threaded fix, rm old info from commit msg]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
v3 (ajb, base-patches):
- stale iothread_unlocks removed (cpu_exit/resume_from_signal deals
with it in the longjmp).
- fix re-base conflicts
v2 (ajb):
- merge with tcg: grab iothread lock in cpu-exec interrupt handling
- use existing fns for tracking lock state
- lock iothread for mem_region
- add assert on mem region modification
- ensure smm_helper holds iothread
- Add JK s-o-b
- Fix-up FK s-o-b annotation
v1 (ajb, base-patches):
- SMP failure now fixed by previous commit
Changes from Fred Konrad (mttcg-v7 via paolo):
* Rebase on the current HEAD.
* Fixes a deadlock in qemu_devices_reset().
* Remove the mutex in address_space_*
---
cpu-exec.c | 11 +++++++++++
cpus.c | 26 +++-----------------------
cputlb.c | 1 +
exec.c | 12 +++++++++---
hw/i386/kvmvapic.c | 4 ++--
memory.c | 2 ++
softmmu_template.h | 17 +++++++++++++++++
target-i386/smm_helper.c | 7 +++++++
translate-all.c | 9 +++++++--
9 files changed, 59 insertions(+), 30 deletions(-)
diff --git a/cpu-exec.c b/cpu-exec.c
index 1613c63..e1fb9ca 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -29,6 +29,7 @@
#include "qemu/rcu.h"
#include "exec/tb-hash.h"
#include "exec/log.h"
+#include "qemu/main-loop.h"
#if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
#include "hw/i386/apic.h"
#endif
@@ -460,6 +461,8 @@ static inline void cpu_handle_interrupt(CPUState *cpu,
int interrupt_request = cpu->interrupt_request;
if (unlikely(interrupt_request)) {
+ qemu_mutex_lock_iothread();
+
if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
/* Mask out external interrupts for this step. */
interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
@@ -514,6 +517,10 @@ static inline void cpu_handle_interrupt(CPUState *cpu,
the program flow was changed */
*last_tb = NULL;
}
+
+ /* If we exit via cpu_loop_exit/longjmp it is reset in cpu_exec */
+ qemu_mutex_unlock_iothread();
+
}
if (unlikely(cpu->exit_request || replay_has_interrupt())) {
cpu->exit_request = 0;
@@ -642,8 +649,12 @@ int cpu_exec(CPUState *cpu)
g_assert(cpu == current_cpu);
g_assert(cc == CPU_GET_CLASS(cpu));
#endif /* buggy compiler */
+
cpu->can_do_io = 1;
tb_lock_reset();
+ if (qemu_mutex_iothread_locked()) {
+ qemu_mutex_unlock_iothread();
+ }
}
} /* for(;;) */
diff --git a/cpus.c b/cpus.c
index a84f02c..35374fd 100644
--- a/cpus.c
+++ b/cpus.c
@@ -915,8 +915,6 @@ static void qemu_kvm_init_cpu_signals(CPUState *cpu)
#endif /* _WIN32 */
static QemuMutex qemu_global_mutex;
-static QemuCond qemu_io_proceeded_cond;
-static unsigned iothread_requesting_mutex;
static QemuThread io_thread;
@@ -932,7 +930,6 @@ void qemu_init_cpu_loop(void)
qemu_cond_init(&qemu_cpu_cond);
qemu_cond_init(&qemu_pause_cond);
qemu_cond_init(&qemu_work_cond);
- qemu_cond_init(&qemu_io_proceeded_cond);
qemu_mutex_init(&qemu_global_mutex);
qemu_thread_get_self(&io_thread);
@@ -1045,10 +1042,6 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
}
- while (iothread_requesting_mutex) {
- qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex);
- }
-
CPU_FOREACH(cpu) {
qemu_wait_io_event_common(cpu);
}
@@ -1205,7 +1198,9 @@ static int tcg_cpu_exec(CPUState *cpu)
cpu->icount_decr.u16.low = decr;
cpu->icount_extra = count;
}
+ qemu_mutex_unlock_iothread();
ret = cpu_exec(cpu);
+ qemu_mutex_lock_iothread();
#ifdef CONFIG_PROFILER
tcg_time += profile_getclock() - ti;
#endif
@@ -1392,22 +1387,7 @@ bool qemu_mutex_iothread_locked(void)
void qemu_mutex_lock_iothread(void)
{
- atomic_inc(&iothread_requesting_mutex);
- /* In the simple case there is no need to bump the VCPU thread out of
- * TCG code execution.
- */
- if (!tcg_enabled() || qemu_in_vcpu_thread() ||
- !first_cpu || !first_cpu->created) {
- qemu_mutex_lock(&qemu_global_mutex);
- atomic_dec(&iothread_requesting_mutex);
- } else {
- if (qemu_mutex_trylock(&qemu_global_mutex)) {
- qemu_cpu_kick_rr_cpu();
- qemu_mutex_lock(&qemu_global_mutex);
- }
- atomic_dec(&iothread_requesting_mutex);
- qemu_cond_broadcast(&qemu_io_proceeded_cond);
- }
+ qemu_mutex_lock(&qemu_global_mutex);
iothread_locked = true;
}
diff --git a/cputlb.c b/cputlb.c
index 1ff6354..78e9879 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -18,6 +18,7 @@
*/
#include "qemu/osdep.h"
+#include "qemu/main-loop.h"
#include "cpu.h"
#include "exec/exec-all.h"
#include "exec/memory.h"
diff --git a/exec.c b/exec.c
index e23039c..b7744b9 100644
--- a/exec.c
+++ b/exec.c
@@ -2149,9 +2149,9 @@ static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int flags)
}
cpu->watchpoint_hit = wp;
- /* The tb_lock will be reset when cpu_loop_exit or
- * cpu_resume_from_signal longjmp back into the cpu_exec
- * main loop.
+ /* Both tb_lock and iothread_mutex will be reset when
+ * cpu_loop_exit or cpu_resume_from_signal longjmp
+ * back into the cpu_exec main loop.
*/
tb_lock();
tb_check_watchpoint(cpu);
@@ -2387,8 +2387,14 @@ static void io_mem_init(void)
memory_region_init_io(&io_mem_rom, NULL, &unassigned_mem_ops, NULL, NULL, UINT64_MAX);
memory_region_init_io(&io_mem_unassigned, NULL, &unassigned_mem_ops, NULL,
NULL, UINT64_MAX);
+
+ /* io_mem_notdirty calls tb_invalidate_phys_page_fast,
+ * which must be called without the iothread mutex.
+ */
memory_region_init_io(&io_mem_notdirty, NULL, ¬dirty_mem_ops, NULL,
NULL, UINT64_MAX);
+ memory_region_clear_global_locking(&io_mem_notdirty);
+
memory_region_init_io(&io_mem_watch, NULL, &watch_mem_ops, NULL,
NULL, UINT64_MAX);
}
diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index d98fe2a..12ed58e 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -450,8 +450,8 @@ static void patch_instruction(VAPICROMState *s, X86CPU *cpu, target_ulong ip)
resume_all_vcpus();
if (!kvm_enabled()) {
- /* tb_lock will be reset when cpu_resume_from_signal longjmps
- * back into the cpu_exec loop. */
+ /* Both tb_lock and iothread_mutex will be reset when
+ * cpu_resume_from_signal longjmps back into the cpu_exec loop. */
tb_lock();
tb_gen_code(cs, current_pc, current_cs_base, current_flags, 1);
cpu_resume_from_signal(cs, NULL);
diff --git a/memory.c b/memory.c
index 4e3cda8..2a318d7 100644
--- a/memory.c
+++ b/memory.c
@@ -914,6 +914,8 @@ void memory_region_transaction_commit(void)
AddressSpace *as;
assert(memory_region_transaction_depth);
+ assert(qemu_mutex_iothread_locked());
+
--memory_region_transaction_depth;
if (!memory_region_transaction_depth) {
if (memory_region_update_pending) {
diff --git a/softmmu_template.h b/softmmu_template.h
index 208f808..0b6c609 100644
--- a/softmmu_template.h
+++ b/softmmu_template.h
@@ -151,6 +151,7 @@ static inline DATA_TYPE glue(io_read, SUFFIX)(CPUArchState *env,
CPUState *cpu = ENV_GET_CPU(env);
hwaddr physaddr = iotlbentry->addr;
MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
+ bool locked = false;
physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
cpu->mem_io_pc = retaddr;
@@ -159,8 +160,15 @@ static inline DATA_TYPE glue(io_read, SUFFIX)(CPUArchState *env,
}
cpu->mem_io_vaddr = addr;
+ if (mr->global_locking) {
+ qemu_mutex_lock_iothread();
+ locked = true;
+ }
memory_region_dispatch_read(mr, physaddr, &val, 1 << SHIFT,
iotlbentry->attrs);
+ if (locked) {
+ qemu_mutex_unlock_iothread();
+ }
return val;
}
#endif
@@ -358,6 +366,7 @@ static inline void glue(io_write, SUFFIX)(CPUArchState *env,
CPUState *cpu = ENV_GET_CPU(env);
hwaddr physaddr = iotlbentry->addr;
MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
+ bool locked = false;
physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
if (mr != &io_mem_rom && mr != &io_mem_notdirty && !cpu->can_do_io) {
@@ -366,8 +375,16 @@ static inline void glue(io_write, SUFFIX)(CPUArchState *env,
cpu->mem_io_vaddr = addr;
cpu->mem_io_pc = retaddr;
+
+ if (mr->global_locking) {
+ qemu_mutex_lock_iothread();
+ locked = true;
+ }
memory_region_dispatch_write(mr, physaddr, val, 1 << SHIFT,
iotlbentry->attrs);
+ if (locked) {
+ qemu_mutex_unlock_iothread();
+ }
}
void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
diff --git a/target-i386/smm_helper.c b/target-i386/smm_helper.c
index 4dd6a2c..6a5489b 100644
--- a/target-i386/smm_helper.c
+++ b/target-i386/smm_helper.c
@@ -18,6 +18,7 @@
*/
#include "qemu/osdep.h"
+#include "qemu/main-loop.h"
#include "cpu.h"
#include "exec/helper-proto.h"
#include "exec/log.h"
@@ -42,11 +43,14 @@ void helper_rsm(CPUX86State *env)
#define SMM_REVISION_ID 0x00020000
#endif
+/* Called we iothread lock taken */
void cpu_smm_update(X86CPU *cpu)
{
CPUX86State *env = &cpu->env;
bool smm_enabled = (env->hflags & HF_SMM_MASK);
+ g_assert(qemu_mutex_iothread_locked());
+
if (cpu->smram) {
memory_region_set_enabled(cpu->smram, smm_enabled);
}
@@ -333,7 +337,10 @@ void helper_rsm(CPUX86State *env)
}
env->hflags2 &= ~HF2_SMM_INSIDE_NMI_MASK;
env->hflags &= ~HF_SMM_MASK;
+
+ qemu_mutex_lock_iothread();
cpu_smm_update(cpu);
+ qemu_mutex_unlock_iothread();
qemu_log_mask(CPU_LOG_INT, "SMM: after RSM\n");
log_cpu_state_mask(CPU_LOG_INT, CPU(cpu), CPU_DUMP_CCOP);
diff --git a/translate-all.c b/translate-all.c
index f54ab3e..818520e 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -1476,7 +1476,9 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
}
#ifdef CONFIG_SOFTMMU
-/* len must be <= 8 and start must be a multiple of len */
+/* len must be <= 8 and start must be a multiple of len.
+ * Called via softmmu_template.h, with iothread mutex not held.
+ */
void tb_invalidate_phys_page_fast(tb_page_addr_t start, int len)
{
PageDesc *p;
@@ -1670,7 +1672,10 @@ void tb_check_watchpoint(CPUState *cpu)
#ifndef CONFIG_USER_ONLY
/* in deterministic execution mode, instructions doing device I/Os
- must be at the end of the TB */
+ * must be at the end of the TB.
+ *
+ * Called by softmmu_template.h, with iothread mutex not held.
+ */
void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
{
#if defined(TARGET_MIPS) || defined(TARGET_SH4)
--
2.7.4
next prev parent reply other threads:[~2016-06-03 20:40 UTC|newest]
Thread overview: 62+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-03 20:40 [Qemu-devel] [RFC v3 00/19] Base enabling patches for MTTCG Alex Bennée
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 01/19] cpus: make all_vcpus_paused() return bool Alex Bennée
2016-06-07 15:05 ` Richard Henderson
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 02/19] translate_all: DEBUG_FLUSH -> DEBUG_TB_FLUSH Alex Bennée
2016-06-07 14:54 ` Richard Henderson
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 03/19] translate-all: add DEBUG_LOCKING asserts Alex Bennée
2016-06-23 14:34 ` Sergey Fedorov
2016-06-23 17:14 ` Alex Bennée
2016-06-23 18:43 ` Sergey Fedorov
2016-07-01 23:21 ` Richard Henderson
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 04/19] docs: new design document multi-thread-tcg.txt (DRAFTING) Alex Bennée
2016-06-23 21:33 ` Sergey Fedorov
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 05/19] exec: add assert_debug_safe and notes on debug structures Alex Bennée
2016-06-24 15:28 ` Sergey Fedorov
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 06/19] tcg: comment on which functions have to be called with tb_lock held Alex Bennée
2016-06-24 15:38 ` Sergey Fedorov
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 07/19] translate-all: Add assert_memory_lock annotations Alex Bennée
2016-06-24 15:48 ` Sergey Fedorov
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 08/19] tcg: protect TBContext with tb_lock Alex Bennée
2016-07-01 23:40 ` Richard Henderson
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 09/19] target-arm/arm-powerctl: wake up sleeping CPUs Alex Bennée
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 10/19] tcg: cpus rm tcg_exec_all() Alex Bennée
2016-06-24 17:09 ` Sergey Fedorov
2016-07-01 23:50 ` Richard Henderson
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 11/19] tcg: add options for enabling MTTCG Alex Bennée
2016-06-27 21:07 ` Sergey Fedorov
2016-07-22 16:17 ` Alex Bennée
2016-07-01 23:53 ` Richard Henderson
2016-07-02 7:11 ` Alex Bennée
2016-07-02 7:38 ` Sergey Fedorov
2016-07-04 10:10 ` Paolo Bonzini
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 12/19] tcg: add kick timer for single-threaded vCPU emulation Alex Bennée
2016-06-27 21:20 ` Sergey Fedorov
2016-07-02 0:17 ` Richard Henderson
2016-07-02 7:36 ` Sergey Fedorov
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 13/19] tcg: rename tcg_current_cpu to tcg_current_rr_cpu Alex Bennée
2016-06-06 15:30 ` Paolo Bonzini
2016-06-06 16:05 ` Alex Bennée
2016-06-06 17:05 ` Paolo Bonzini
2016-06-06 17:26 ` Alex Bennée
2016-06-06 18:25 ` Paolo Bonzini
2016-06-07 12:59 ` Alex Bennée
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 14/19] tcg: remove global exit_request Alex Bennée
2016-06-28 16:20 ` Sergey Fedorov
2016-08-03 11:42 ` Alex Bennée
2016-06-03 20:40 ` Alex Bennée [this message]
2016-06-28 16:54 ` [Qemu-devel] [RFC v3 15/19] tcg: drop global lock during TCG code execution Sergey Fedorov
2016-08-10 13:51 ` Alex Bennée
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 16/19] tcg: move locking for tb_invalidate_phys_page_range up Alex Bennée
2016-06-28 19:43 ` Sergey Fedorov
2016-06-28 19:51 ` Sergey Fedorov
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 17/19] tcg: enable thread-per-vCPU Alex Bennée
2016-06-29 14:09 ` Sergey Fedorov
2016-08-10 14:44 ` Alex Bennée
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 18/19] tcg: Ensure safe TB lookup out of 'tb_lock' Alex Bennée
2016-06-03 20:40 ` [Qemu-devel] [RFC v3 19/19] cpu-exec: remove tb_lock from the hot-path Alex Bennée
2016-06-29 14:35 ` Sergey Fedorov
2016-06-29 14:47 ` Alex Bennée
2016-06-29 14:52 ` Sergey Fedorov
2016-06-29 16:08 ` Alex Bennée
2016-06-30 9:24 ` Sergey Fedorov
2016-06-04 14:40 ` [Qemu-devel] [RFC v3 00/19] Base enabling patches for MTTCG Pranith Kumar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1464986428-6739-16-git-send-email-alex.bennee@linaro.org \
--to=alex.bennee@linaro.org \
--cc=a.rigo@virtualopensystems.com \
--cc=bobby.prani@gmail.com \
--cc=claudio.fontana@huawei.com \
--cc=cota@braap.org \
--cc=crosthwaite.peter@gmail.com \
--cc=ehabkost@redhat.com \
--cc=fred.konrad@greensocs.com \
--cc=jan.kiszka@siemens.com \
--cc=mark.burton@greensocs.com \
--cc=mst@redhat.com \
--cc=mttcg@listserver.greensocs.com \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=rth@twiddle.net \
--cc=serge.fdrv@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).