From: Sergey Fedorov
Date: Tue, 28 Jun 2016 19:54:18 +0300
Message-ID: <5772ABBA.8070903@gmail.com>
In-Reply-To: <1464986428-6739-16-git-send-email-alex.bennee@linaro.org>
References: <1464986428-6739-1-git-send-email-alex.bennee@linaro.org>
 <1464986428-6739-16-git-send-email-alex.bennee@linaro.org>
Subject: Re: [Qemu-devel] [RFC v3 15/19] tcg: drop global lock during TCG code execution
To: Alex Bennée, mttcg@listserver.greensocs.com, qemu-devel@nongnu.org,
 fred.konrad@greensocs.com, a.rigo@virtualopensystems.com, cota@braap.org,
 bobby.prani@gmail.com
Cc: mark.burton@greensocs.com, pbonzini@redhat.com, jan.kiszka@siemens.com,
 rth@twiddle.net, peter.maydell@linaro.org, claudio.fontana@huawei.com,
 Peter Crosthwaite, Eduardo Habkost, "Michael S. Tsirkin"

On 03/06/16 23:40, Alex Bennée wrote:
> From: Jan Kiszka

(See http://thread.gmane.org/gmane.comp.emulators.qemu/402092/focus=403090)

> This finally allows TCG to benefit from the iothread introduction: Drop
> the global mutex while running pure TCG CPU code.
> Reacquire the lock
> when entering MMIO or PIO emulation, or when leaving the TCG loop.
>
> We have to revert a few optimization for the current TCG threading
> model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
> kicking it in qemu_cpu_kick. We also need to disable RAM block
> reordering until we have a more efficient locking mechanism at hand.
>
> Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
> These numbers demonstrate where we gain something:
>
> 20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
> 20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm
>
> The guest CPU was fully loaded, but the iothread could still run mostly
> independent on a second core. Without the patch we don't get beyond
>
> 32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
> 32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm
>
> We don't benefit significantly, though, when the guest is not fully
> loading a host CPU.
>
> Signed-off-by: Jan Kiszka
> Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com>
> [FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
> Signed-off-by: KONRAD Frederic
> [EGC: fixed iothread lock for cpu-exec IRQ handling]
> Signed-off-by: Emilio G.
> Cota
> [AJB: -smp single-threaded fix, rm old info from commit msg]
> Signed-off-by: Alex Bennée

(snip)

> diff --git a/cpu-exec.c b/cpu-exec.c
> index 1613c63..e1fb9ca 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -29,6 +29,7 @@
>  #include "qemu/rcu.h"
>  #include "exec/tb-hash.h"
>  #include "exec/log.h"
> +#include "qemu/main-loop.h"
>  #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
>  #include "hw/i386/apic.h"
>  #endif
> @@ -460,6 +461,8 @@ static inline void cpu_handle_interrupt(CPUState *cpu,
>      int interrupt_request = cpu->interrupt_request;
>
>      if (unlikely(interrupt_request)) {
> +        qemu_mutex_lock_iothread();
> +

cpu_handle_halt() for target-i386 also needs to protect
'cpu->interrupt_request' with the global lock.

>          if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
>              /* Mask out external interrupts for this step. */
>              interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
> @@ -514,6 +517,10 @@ static inline void cpu_handle_interrupt(CPUState *cpu,
>                 the program flow was changed */
>              *last_tb = NULL;
>          }
> +
> +        /* If we exit via cpu_loop_exit/longjmp it is reset in cpu_exec */
> +        qemu_mutex_unlock_iothread();
> +
>      }
>      if (unlikely(cpu->exit_request || replay_has_interrupt())) {
>          cpu->exit_request = 0;

(snip)

> diff --git a/exec.c b/exec.c
> index e23039c..b7744b9 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2149,9 +2149,9 @@ static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int flags)
>          }
>          cpu->watchpoint_hit = wp;
>
> -        /* The tb_lock will be reset when cpu_loop_exit or
> -         * cpu_resume_from_signal longjmp back into the cpu_exec
> -         * main loop.
> +        /* Both tb_lock and iothread_mutex will be reset when
> +         * cpu_loop_exit or cpu_resume_from_signal longjmp
> +         * back into the cpu_exec main loop.
> +         */
>          tb_lock();
>          tb_check_watchpoint(cpu);
> @@ -2387,8 +2387,14 @@ static void io_mem_init(void)
>      memory_region_init_io(&io_mem_rom, NULL, &unassigned_mem_ops, NULL, NULL, UINT64_MAX);
>      memory_region_init_io(&io_mem_unassigned, NULL, &unassigned_mem_ops, NULL,
>                            NULL, UINT64_MAX);
> +
> +    /* io_mem_notdirty calls tb_invalidate_phys_page_fast,
> +     * which must be called without the iothread mutex.

"must" or "can"?

> +     */
>      memory_region_init_io(&io_mem_notdirty, NULL, &notdirty_mem_ops, NULL,
>                            NULL, UINT64_MAX);
> +    memory_region_clear_global_locking(&io_mem_notdirty);
> +
>      memory_region_init_io(&io_mem_watch, NULL, &watch_mem_ops, NULL,
>                            NULL, UINT64_MAX);
>  }

(snip)

> diff --git a/target-i386/smm_helper.c b/target-i386/smm_helper.c
> index 4dd6a2c..6a5489b 100644
> --- a/target-i386/smm_helper.c
> +++ b/target-i386/smm_helper.c
> @@ -18,6 +18,7 @@
>   */
>
>  #include "qemu/osdep.h"
> +#include "qemu/main-loop.h"
>  #include "cpu.h"
>  #include "exec/helper-proto.h"
>  #include "exec/log.h"
> @@ -42,11 +43,14 @@ void helper_rsm(CPUX86State *env)
>  #define SMM_REVISION_ID 0x00020000
>  #endif
>
> +/* Called we iothread lock taken */

s/we/with/

>  void cpu_smm_update(X86CPU *cpu)
>  {
>      CPUX86State *env = &cpu->env;
>      bool smm_enabled = (env->hflags & HF_SMM_MASK);
>
> +    g_assert(qemu_mutex_iothread_locked());
> +
>      if (cpu->smram) {
>          memory_region_set_enabled(cpu->smram, smm_enabled);
>      }
> @@ -333,7 +337,10 @@ void helper_rsm(CPUX86State *env)
>      }
>      env->hflags2 &= ~HF2_SMM_INSIDE_NMI_MASK;
>      env->hflags &= ~HF_SMM_MASK;
> +
> +    qemu_mutex_lock_iothread();
>      cpu_smm_update(cpu);
> +    qemu_mutex_unlock_iothread();

I'm wondering if there are some other similar places to take the global
lock.

>
>      qemu_log_mask(CPU_LOG_INT, "SMM: after RSM\n");
>      log_cpu_state_mask(CPU_LOG_INT, CPU(cpu), CPU_DUMP_CCOP);

(snip)

Kind regards,
Sergey