qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Alex Bennée" <alex.bennee@linaro.org>
To: mttcg@listserver.greensocs.com, fred.konrad@greensocs.com,
	a.rigo@virtualopensystems.com, serge.fdrv@gmail.com,
	cota@braap.org
Cc: "Eduardo Habkost" <ehabkost@redhat.com>,
	"Peter Crosthwaite" <crosthwaite.peter@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	mark.burton@greensocs.com, qemu-devel@nongnu.org,
	pbonzini@redhat.com, "Alex Bennée" <alex.bennee@linaro.org>,
	"Richard Henderson" <rth@twiddle.net>
Subject: [Qemu-devel] [RFC v1 09/11] tcg: drop global lock during TCG code execution
Date: Fri, 18 Mar 2016 16:18:50 +0000	[thread overview]
Message-ID: <1458317932-1875-10-git-send-email-alex.bennee@linaro.org> (raw)
In-Reply-To: <1458317932-1875-1-git-send-email-alex.bennee@linaro.org>

From: KONRAD Frederic <fred.konrad@greensocs.com>

This finally allows TCG to benefit from the iothread introduction: Drop
the global mutex while running pure TCG CPU code. Reacquire the lock
when entering MMIO or PIO emulation, or when leaving the TCG loop.

We have to revert a few optimization for the current TCG threading
model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
kicking it in qemu_cpu_kick. We also need to disable RAM block
reordering until we have a more efficient locking mechanism at hand.

Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
These numbers demonstrate where we gain something:

20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm

The guest CPU was fully loaded, but the iothread could still run mostly
independent on a second core. Without the patch we don't get beyond

32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm

We don't benefit significantly, though, when the guest is not fully
loading a host CPU.

Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[AJB: -smp single-threaded fix, rm old info from commit msg]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v1 (ajb):
  - SMP failure now fixed by previous commit
Changes from Fred Konrad (mttcg-v7 via paolo):
  * Rebase on the current HEAD.
  * Fixes a deadlock in qemu_devices_reset().
  * Remove the mutex in address_space_*
---
 cpu-exec.c         | 10 ++++++++++
 cpus.c             | 26 +++-----------------------
 cputlb.c           |  4 ++++
 exec.c             | 12 ++++++++++++
 hw/i386/kvmvapic.c |  3 +++
 softmmu_template.h | 17 +++++++++++++++++
 translate-all.c    | 11 +++++++++--
 7 files changed, 58 insertions(+), 25 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 3572256..76891fd 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -28,6 +28,7 @@
 #include "qemu/rcu.h"
 #include "exec/tb-hash.h"
 #include "exec/log.h"
+#include "qemu/main-loop.h"
 #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
 #include "hw/i386/apic.h"
 #endif
@@ -449,6 +450,13 @@ int cpu_exec(CPUState *cpu)
             for(;;) {
                 interrupt_request = cpu->interrupt_request;
                 if (unlikely(interrupt_request)) {
+                    /* FIXME: this needs to take the iothread lock.
+                     * For this we need to find all places in
+                     * cc->cpu_exec_interrupt that can call cpu_loop_exit,
+                     * and call qemu_unlock_iothread_mutex() there.  Else,
+                     * add a flag telling cpu_loop_exit() to unlock it.
+                     */
+
                     if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
                         /* Mask out external interrupts for this step. */
                         interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
@@ -501,6 +509,7 @@ int cpu_exec(CPUState *cpu)
                            the program flow was changed */
                         next_tb = 0;
                     }
+
                 }
                 if (unlikely(cpu->exit_request
                              || replay_has_interrupt())) {
@@ -618,6 +627,7 @@ int cpu_exec(CPUState *cpu)
             g_assert(env == &x86_cpu->env);
 #endif
 #endif /* buggy compiler */
+
             cpu->can_do_io = 1;
             tb_lock_reset();
         }
diff --git a/cpus.c b/cpus.c
index a87fbf9..0e3d5f9 100644
--- a/cpus.c
+++ b/cpus.c
@@ -911,8 +911,6 @@ static void qemu_kvm_init_cpu_signals(CPUState *cpu)
 #endif /* _WIN32 */
 
 static QemuMutex qemu_global_mutex;
-static QemuCond qemu_io_proceeded_cond;
-static unsigned iothread_requesting_mutex;
 
 static QemuThread io_thread;
 
@@ -928,7 +926,6 @@ void qemu_init_cpu_loop(void)
     qemu_cond_init(&qemu_cpu_cond);
     qemu_cond_init(&qemu_pause_cond);
     qemu_cond_init(&qemu_work_cond);
-    qemu_cond_init(&qemu_io_proceeded_cond);
     qemu_mutex_init(&qemu_global_mutex);
 
     qemu_thread_get_self(&io_thread);
@@ -1043,10 +1040,6 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
         qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
     }
 
-    while (iothread_requesting_mutex) {
-        qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex);
-    }
-
     CPU_FOREACH(cpu) {
         qemu_wait_io_event_common(cpu);
     }
@@ -1314,22 +1307,7 @@ bool qemu_mutex_iothread_locked(void)
 
 void qemu_mutex_lock_iothread(void)
 {
-    atomic_inc(&iothread_requesting_mutex);
-    /* In the simple case there is no need to bump the VCPU thread out of
-     * TCG code execution.
-     */
-    if (!tcg_enabled() || qemu_in_vcpu_thread() ||
-        !first_cpu || !first_cpu->created) {
-        qemu_mutex_lock(&qemu_global_mutex);
-        atomic_dec(&iothread_requesting_mutex);
-    } else {
-        if (qemu_mutex_trylock(&qemu_global_mutex)) {
-            qemu_cpu_kick_no_halt();
-            qemu_mutex_lock(&qemu_global_mutex);
-        }
-        atomic_dec(&iothread_requesting_mutex);
-        qemu_cond_broadcast(&qemu_io_proceeded_cond);
-    }
+    qemu_mutex_lock(&qemu_global_mutex);
     iothread_locked = true;
 }
 
@@ -1575,7 +1553,9 @@ static int tcg_cpu_exec(CPUState *cpu)
         cpu->icount_decr.u16.low = decr;
         cpu->icount_extra = count;
     }
+    qemu_mutex_unlock_iothread();
     ret = cpu_exec(cpu);
+    qemu_mutex_lock_iothread();
 #ifdef CONFIG_PROFILER
     tcg_time += profile_getclock() - ti;
 #endif
diff --git a/cputlb.c b/cputlb.c
index 2f7a166..d607471 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -30,6 +30,10 @@
 #include "exec/ram_addr.h"
 #include "tcg/tcg.h"
 
+#ifdef CONFIG_SOFTMMU
+#include "qemu/main-loop.h"
+#endif
+
 //#define DEBUG_TLB
 //#define DEBUG_TLB_CHECK
 
diff --git a/exec.c b/exec.c
index 402b751..9986d59 100644
--- a/exec.c
+++ b/exec.c
@@ -2113,6 +2113,12 @@ static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int flags)
                 }
                 cpu->watchpoint_hit = wp;
 
+                /*
+                 * Unlock iothread mutex before calling cpu_loop_exit
+                 * or cpu_resume_from_signal.
+                 */
+                qemu_mutex_unlock_iothread();
+
                 /* Unlocked by cpu_loop_exit or cpu_resume_from_signal.  */
                 tb_lock();
                 tb_check_watchpoint(cpu);
@@ -2348,8 +2354,14 @@ static void io_mem_init(void)
     memory_region_init_io(&io_mem_rom, NULL, &unassigned_mem_ops, NULL, NULL, UINT64_MAX);
     memory_region_init_io(&io_mem_unassigned, NULL, &unassigned_mem_ops, NULL,
                           NULL, UINT64_MAX);
+
+    /* io_mem_notdirty calls tb_invalidate_phys_page_fast,
+     * which must be called without the iothread mutex.
+     */
     memory_region_init_io(&io_mem_notdirty, NULL, &notdirty_mem_ops, NULL,
                           NULL, UINT64_MAX);
+    memory_region_clear_global_locking(&io_mem_notdirty);
+
     memory_region_init_io(&io_mem_watch, NULL, &watch_mem_ops, NULL,
                           NULL, UINT64_MAX);
 }
diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index 7c0d542..a0439a8 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -447,6 +447,9 @@ static void patch_instruction(VAPICROMState *s, X86CPU *cpu, target_ulong ip)
     resume_all_vcpus();
 
     if (!kvm_enabled()) {
+        /* Unlock iothread mutex before calling cpu_resume_from_signal.  */
+        qemu_mutex_unlock_iothread();
+
         /* Unlocked by cpu_resume_from_signal.  */
         tb_lock();
         cs->current_tb = NULL;
diff --git a/softmmu_template.h b/softmmu_template.h
index 208f808..0b6c609 100644
--- a/softmmu_template.h
+++ b/softmmu_template.h
@@ -151,6 +151,7 @@ static inline DATA_TYPE glue(io_read, SUFFIX)(CPUArchState *env,
     CPUState *cpu = ENV_GET_CPU(env);
     hwaddr physaddr = iotlbentry->addr;
     MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
+    bool locked = false;
 
     physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
     cpu->mem_io_pc = retaddr;
@@ -159,8 +160,15 @@ static inline DATA_TYPE glue(io_read, SUFFIX)(CPUArchState *env,
     }
 
     cpu->mem_io_vaddr = addr;
+    if (mr->global_locking) {
+        qemu_mutex_lock_iothread();
+        locked = true;
+    }
     memory_region_dispatch_read(mr, physaddr, &val, 1 << SHIFT,
                                 iotlbentry->attrs);
+    if (locked) {
+        qemu_mutex_unlock_iothread();
+    }
     return val;
 }
 #endif
@@ -358,6 +366,7 @@ static inline void glue(io_write, SUFFIX)(CPUArchState *env,
     CPUState *cpu = ENV_GET_CPU(env);
     hwaddr physaddr = iotlbentry->addr;
     MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
+    bool locked = false;
 
     physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
     if (mr != &io_mem_rom && mr != &io_mem_notdirty && !cpu->can_do_io) {
@@ -366,8 +375,16 @@ static inline void glue(io_write, SUFFIX)(CPUArchState *env,
 
     cpu->mem_io_vaddr = addr;
     cpu->mem_io_pc = retaddr;
+
+    if (mr->global_locking) {
+        qemu_mutex_lock_iothread();
+        locked = true;
+    }
     memory_region_dispatch_write(mr, physaddr, val, 1 << SHIFT,
                                  iotlbentry->attrs);
+    if (locked) {
+        qemu_mutex_unlock_iothread();
+    }
 }
 
 void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
diff --git a/translate-all.c b/translate-all.c
index 1aab243..fe1392a 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -1221,6 +1221,7 @@ void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
  * this TB.
  *
  * Called with mmap_lock held for user-mode emulation
+ * If called from generated code, iothread mutex must not be held.
  */
 void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
                                    int is_cpu_write_access)
@@ -1333,7 +1334,9 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
 }
 
 #ifdef CONFIG_SOFTMMU
-/* len must be <= 8 and start must be a multiple of len */
+/* len must be <= 8 and start must be a multiple of len.
+ * Called via softmmu_template.h, with iothread mutex not held.
+ */
 void tb_invalidate_phys_page_fast(tb_page_addr_t start, int len)
 {
     PageDesc *p;
@@ -1591,6 +1594,7 @@ static TranslationBlock *tb_find_pc(uintptr_t tc_ptr)
 }
 
 #if !defined(CONFIG_USER_ONLY)
+/* If called from generated code, iothread mutex must not be held.  */
 void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr)
 {
     ram_addr_t ram_addr;
@@ -1637,7 +1641,10 @@ void tb_check_watchpoint(CPUState *cpu)
 
 #ifndef CONFIG_USER_ONLY
 /* in deterministic execution mode, instructions doing device I/Os
-   must be at the end of the TB */
+ * must be at the end of the TB.
+ *
+ * Called by softmmu_template.h, with iothread mutex not held.
+ */
 void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
 {
 #if defined(TARGET_MIPS) || defined(TARGET_SH4)
-- 
2.7.3

  parent reply	other threads:[~2016-03-18 16:19 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-18 16:18 [Qemu-devel] [RFC v1 00/11] Base enabling patches for MTTCG Alex Bennée
2016-03-18 16:18 ` [Qemu-devel] [RFC v1 01/11] tcg: move tb_find_fast outside the tb_lock critical section Alex Bennée
2016-03-18 16:54   ` Paolo Bonzini
2016-03-21 21:50   ` Emilio G. Cota
2016-03-21 22:08     ` Peter Maydell
2016-03-21 23:59       ` Emilio G. Cota
2016-03-22  8:29         ` Paolo Bonzini
2016-03-22 11:59         ` Alex Bennée
2016-03-22 11:55     ` Alex Bennée
2016-03-18 16:18 ` [Qemu-devel] [RFC v1 02/11] cpu-exec: elide more icount code if CONFIG_USER_ONLY Alex Bennée
2016-03-18 16:18 ` [Qemu-devel] [RFC v1 03/11] tcg: comment on which functions have to be called with tb_lock held Alex Bennée
2016-03-18 16:59   ` Paolo Bonzini
2016-03-21 21:50     ` Emilio G. Cota
2016-03-21 22:12       ` Paolo Bonzini
2016-03-18 16:18 ` [Qemu-devel] [RFC v1 04/11] tcg: protect TBContext with tb_lock Alex Bennée
2016-03-18 16:18 ` [Qemu-devel] [RFC v1 05/11] target-arm/psci.c: wake up sleeping CPUs Alex Bennée
2016-03-18 16:18 ` [Qemu-devel] [RFC v1 06/11] tcg: cpus rm tcg_exec_all() Alex Bennée
2016-03-18 16:18 ` [Qemu-devel] [RFC v1 07/11] tcg: add options for enabling MTTCG Alex Bennée
2016-03-18 16:18 ` [Qemu-devel] [RFC v1 08/11] tcg: add kick timer for single-threaded vCPU emulation Alex Bennée
2016-03-18 16:18 ` Alex Bennée [this message]
2016-03-18 16:49   ` [Qemu-devel] [RFC v1 09/11] tcg: drop global lock during TCG code execution Paolo Bonzini
2016-03-23  9:19   ` KONRAD Frederic
2016-03-23 16:27     ` Alex Bennée
2016-03-23 20:36       ` Jan Kiszka
2016-03-18 16:18 ` [Qemu-devel] [RFC v1 10/11] tcg: grab iothread lock in cpu-exec interrupt handling Alex Bennée
2016-03-18 16:48   ` Paolo Bonzini
2016-03-22 12:03     ` Alex Bennée
2016-03-18 16:18 ` [Qemu-devel] [RFC v1 11/11] tcg: enable thread-per-vCPU Alex Bennée

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1458317932-1875-10-git-send-email-alex.bennee@linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=a.rigo@virtualopensystems.com \
    --cc=cota@braap.org \
    --cc=crosthwaite.peter@gmail.com \
    --cc=ehabkost@redhat.com \
    --cc=fred.konrad@greensocs.com \
    --cc=mark.burton@greensocs.com \
    --cc=mst@redhat.com \
    --cc=mttcg@listserver.greensocs.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    --cc=serge.fdrv@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).