Date: Tue, 28 Apr 2015 13:49:14 -0400
From: "Emilio G. Cota"
To: Paolo Bonzini
Cc: mttcg@greensocs.com, Peter Maydell, Jan Kiszka, Mark Burton,
	QEMU Developers, Alexander Graf, Alex Bennée, Frederic Konrad
Subject: Re: [Qemu-devel] [RFC 00/10] MultiThread TCG.
Message-ID: <20150428174914.GA4586@flamenco>
In-Reply-To: <553F4D9D.4040901@redhat.com>

On Tue, Apr 28, 2015 at 11:06:37 +0200, Paolo Bonzini wrote:
> On 27/04/2015 19:06, Emilio G. Cota wrote:
> > Note that I'm running with -smp 1. My guess is that the iothread
> > is starved, since patch 472f4003 "Drop global lock during TCG code execution"
> > removes from the iothread the ability to kick CPU threads.
>
> In theory that shouldn't be necessary anymore. The CPU thread should
> only hold the global lock for very small periods of time, similar to KVM.

You're right. I added printouts around the qemu_global_mutex lock/unlock
calls, and around the cond_wait's that take the BQL. The vcpu goes quiet
after a while:

[...]
softmmu_template.h:io_writel:387 UNLO tid 17633
qemu/cputlb.c:tlb_protect_code:196 LOCK tid 17633
cputlb.c:tlb_protect_code:199 UNLO tid 17633
cputlb.c:tlb_protect_code:196 LOCK tid 17633
cputlb.c:tlb_protect_code:199 UNLO tid 17633
cputlb.c:tlb_protect_code:196 LOCK tid 17633
cputlb.c:tlb_protect_code:199 UNLO tid 17633
softmmu_template.h:io_readl:160 LOCK tid 17633
softmmu_template.h:io_readl:165 UNLO tid 17633
main-loop.c:os_host_main_loop_wait:242 LOCK tid 17630
main-loop.c:os_host_main_loop_wait:234 UNLO tid 17630
..

From this point on, that last LOCK/UNLO pair (the main loop, tid 17630)
repeats indefinitely; nothing more is heard from the vcpu thread (tid 17633).
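(FWIW the printouts are nothing fancy. A minimal sketch of what I hand-added
around the lock/unlock and cond_wait call sites is below; trace_bql() and the
TRACE_BQL_* macros are made-up names for this mail, not code in the tree --
they just reproduce the "file:function:line LOCK|UNLO tid N" format of the
log above.)

/* Hypothetical helper reproducing the trace format quoted above. */
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

static inline void trace_bql(const char *file, const char *func, int line,
                             const char *op)
{
    /* e.g. "cputlb.c:tlb_protect_code:196 LOCK tid 17633" */
    fprintf(stderr, "%s:%s:%d %s tid %ld\n",
            file, func, line, op, (long)syscall(SYS_gettid));
}

#define TRACE_BQL_LOCK()   trace_bql(__FILE__, __func__, __LINE__, "LOCK")
#define TRACE_BQL_UNLOCK() trace_bql(__FILE__, __func__, __LINE__, "UNLO")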
> Can you post a backtrace?

$ sudo gdb --pid=8919
(gdb) info threads
  Id   Target Id         Frame
  3    Thread 0x7ffff596b700 (LWP 16204) "qemu-system-arm" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  2    Thread 0x7ffff0f69700 (LWP 16206) "qemu-system-arm" 0x00007ffff33179fe in ?? ()
* 1    Thread 0x7ffff7fe4a80 (LWP 16203) "qemu-system-arm" 0x00007ffff5e9b1ef in __GI_ppoll (fds=0x5555569b8f70, nfds=4, timeout=<optimized out>, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:56
(gdb) bt
#0  0x00007ffff5e9b1ef in __GI_ppoll (fds=0x5555569b8f70, nfds=4, timeout=<optimized out>,
    sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:56
#1  0x00005555559a9e26 in qemu_poll_ns (fds=0x5555569b8f70, nfds=4, timeout=9689027)
    at qemu-timer.c:326
#2  0x00005555559a8abb in os_host_main_loop_wait (timeout=9689027) at main-loop.c:239
#3  0x00005555559a8bef in main_loop_wait (nonblocking=0) at main-loop.c:494
#4  0x000055555578c8c5 in main_loop () at vl.c:1803
#5  0x0000555555794634 in main (argc=16, argv=0x7fffffffe828, envp=0x7fffffffe8b0) at vl.c:4371
(gdb) thread 3
[Switching to thread 3 (Thread 0x7ffff596b700 (LWP 16204))]
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
38      ../sysdeps/unix/sysv/linux/x86_64/syscall.S: No such file or directory.
(gdb) bt
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x0000555555a3a061 in futex_wait (ev=0x555556392724 <rcu_call_ready_event>, val=4294967295)
    at util/qemu-thread-posix.c:305
#2  0x0000555555a3a20b in qemu_event_wait (ev=0x555556392724 <rcu_call_ready_event>)
    at util/qemu-thread-posix.c:401
#3  0x0000555555a5011d in call_rcu_thread (opaque=0x0) at util/rcu.c:231
#4  0x00007ffff617b182 in start_thread (arg=0x7ffff596b700) at pthread_create.c:312
#5  0x00007ffff5ea847d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) thread 2
[Switching to thread 2 (Thread 0x7ffff0f69700 (LWP 16206))]
#0  0x00007ffff33179fe in ?? ()
(gdb) bt
#0  0x00007ffff33179fe in ?? ()
#1  0x0000555555f0c200 in ?? ()
#2  0x00007fffd40029a0 in ?? ()
#3  0x00007fffd40029c0 in ?? ()
#4  0x13b33d1714a74c00 in ?? ()
#5  0x00007ffff0f685c0 in ?? ()
#6  0x00005555555f9da7 in tcg_out_reloc (s=<optimized out>, code_ptr=<optimized out>,
    type=<optimized out>, label_index=<optimized out>, addend=<optimized out>)
    at /local/home/cota/src/qemu/tcg/tcg.c:224
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) q

So it seems that the vcpu thread never comes out of the execution loop from
which that last io_readl was performed.

		Emilio
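P.S. To make the "kick" point concrete: what I had in mind when citing
472f4003 is an iothread-side kick roughly like the sketch below. This is
only a sketch (kick_all_vcpus() is a made-up name, not the code that patch
touched); CPU_FOREACH() and cpu_exit() are the existing helpers.

#include "qom/cpu.h"    /* CPUState, CPU_FOREACH(), cpu_exit() */

/* Sketch: force every TCG vcpu out of its execution loop from the iothread.
 * cpu_exit() sets cpu->exit_request (and tcg_exit_req), so cpu_exec()
 * returns at the next TB boundary instead of spinning while the iothread
 * waits for it.
 */
static void kick_all_vcpus(void)
{
    CPUState *cpu;

    CPU_FOREACH(cpu) {
        cpu_exit(cpu);
    }
}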