From: Paolo Bonzini
Date: Thu, 23 Jul 2015 21:25:57 +0200
Subject: Re: [Qemu-devel] [PULL v2 for-2.4 v2 5/7] AioContext: fix broken ctx->dispatching optimization
To: Christian Borntraeger, Cornelia Huck
Cc: Kevin Wolf, Peter Maydell, qemu-devel@nongnu.org, Stefan Hajnoczi
Message-ID: <55B13FC5.1@redhat.com>
In-Reply-To: <55B13E46.2070205@de.ibm.com>
References: <1437565425-29861-1-git-send-email-stefanha@redhat.com>
 <1437565425-29861-6-git-send-email-stefanha@redhat.com>
 <20150723161413.15ec718a.cornelia.huck@de.ibm.com>
 <55B12274.2050005@redhat.com>
 <55B13046.2060205@redhat.com>
 <55B13E46.2070205@de.ibm.com>

On 23/07/2015 21:19, Christian Borntraeger wrote:
> Am 23.07.2015 um 20:19 schrieb Paolo Bonzini:
>>
>> On 23/07/2015 19:20, Paolo Bonzini wrote:
>>>
>>> On 23/07/2015 16:14, Cornelia Huck wrote:
>>>> (gdb) bt
>>>> #0  0x000003fffc5871b4 in pthread_cond_wait@@GLIBC_2.3.2 ()
>>>>     from /lib64/libpthread.so.0
>>>> #1  0x000000008024cfca in qemu_cond_wait (cond=cond@entry=0x9717d950,
>>>>     mutex=mutex@entry=0x9717d920)
>>>>     at /data/git/yyy/qemu/util/qemu-thread-posix.c:132
>>>> #2  0x000000008025e83a in rfifolock_lock (r=0x9717d920)
>>>>     at /data/git/yyy/qemu/util/rfifolock.c:59
>>>> #3  0x00000000801b78fa in aio_context_acquire (ctx=)
>>>>     at /data/git/yyy/qemu/async.c:331
>>>> #4  0x000000008007ceb4 in virtio_blk_data_plane_start (s=0x9717d710)
>>>>     at /data/git/yyy/qemu/hw/block/dataplane/virtio-blk.c:285
>>>> #5  0x000000008007c64a in virtio_blk_handle_output (vdev=,
>>>>     vq=) at /data/git/yyy/qemu/hw/block/virtio-blk.c:599
>>>> #6  0x00000000801c56dc in qemu_iohandler_poll (pollfds=0x97142800,
>>>>     ret=ret@entry=1) at /data/git/yyy/qemu/iohandler.c:126
>>>> #7  0x00000000801c5178 in main_loop_wait (nonblocking=)
>>>>     at /data/git/yyy/qemu/main-loop.c:494
>>>> #8  0x0000000080013ee2 in main_loop () at /data/git/yyy/qemu/vl.c:1902
>>>> #9  main (argc=, argv=, envp=)
>>>>     at /data/git/yyy/qemu/vl.c:4653
>>>>
>>>> I've stripped down the setup to the following commandline:
>>>>
>>>> /data/git/yyy/qemu/build/s390x-softmmu/qemu-system-s390x -machine
>>>> s390-ccw-virtio-2.4,accel=kvm,usb=off -m 1024 -smp
>>>> 4,sockets=4,cores=1,threads=1 -nographic -drive
>>>> file=/dev/sda,if=none,id=drive-virtio-disk0,format=raw,serial=ccwzfcp1,cache=none,aio=native
>>>> -device
>>>> virtio-blk-ccw,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,x-data-plane=on
>>>
>>> What's the backtrace like for the other threads?  This is almost
>>> definitely a latent bug somewhere else.
>>
>> BTW, I can reproduce this---I'm asking because I cannot even attach gdb
>> to the hung process.
>>
>> The simplest workaround is to reintroduce commit a0710f7995 (iothread:
>> release iothread around aio_poll, 2015-02-20), though it also comes with
>> some risk.  It avoids the bug because it limits the contention on the
>> RFifoLock.
>
> I can reproduce this with the following backtrace (with --enable-debug
> info added; qemu is the tag v2.4.0-rc2).

Can you check that cherry-picking a0710f7995 works for you?  (A rough
sketch of what that commit changes is appended after the backtrace.)

Paolo

> Thread 4 (Thread 0x3fffb4ce910 (LWP 57750)):
> #0  0x000003fffc185e8e in syscall () from /lib64/libc.so.6
> #1  0x00000000803578ee in futex_wait (ev=0x8098486c, val=4294967295) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:301
> #2  0x0000000080357ae8 in qemu_event_wait (ev=0x8098486c) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:399
> #3  0x000000008036e202 in call_rcu_thread (opaque=0x0) at /home/cborntra/REPOS/qemu/util/rcu.c:233
> #4  0x000003fffc2374e6 in start_thread () from /lib64/libpthread.so.0
> #5  0x000003fffc18b0fa in thread_start () from /lib64/libc.so.6
>
> Thread 3 (Thread 0x3fffacce910 (LWP 57751)):
> #0  0x000003fffc17f5d6 in ppoll () from /lib64/libc.so.6
> #1  0x0000000080294c2e in qemu_poll_ns (fds=0x3fff40008c0, nfds=1, timeout=-1) at /home/cborntra/REPOS/qemu/qemu-timer.c:310
> #2  0x0000000080296788 in aio_poll (ctx=0x809c2830, blocking=true) at /home/cborntra/REPOS/qemu/aio-posix.c:271
> #3  0x0000000080137f58 in iothread_run (opaque=0x809c2450) at /home/cborntra/REPOS/qemu/iothread.c:42
> #4  0x000003fffc2374e6 in start_thread () from /lib64/libpthread.so.0
> #5  0x000003fffc18b0fa in thread_start () from /lib64/libc.so.6
>
> Thread 2 (Thread 0x3fff9411910 (LWP 57754)):
> #0  0x000003fffc23e662 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x000003fffc2394a4 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #2  0x000000008035706e in qemu_mutex_lock (mutex=0x80559288) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:73
> #3  0x000000008005ae46 in qemu_mutex_lock_iothread () at /home/cborntra/REPOS/qemu/cpus.c:1164
> #4  0x000000008012f44e in kvm_arch_handle_exit (cs=0x82141930, run=0x3fffd31a000) at /home/cborntra/REPOS/qemu/target-s390x/kvm.c:2010
> #5  0x00000000800782f8 in kvm_cpu_exec (cpu=0x82141930) at /home/cborntra/REPOS/qemu/kvm-all.c:1901
> #6  0x000000008005a73c in qemu_kvm_cpu_thread_fn (arg=0x82141930) at /home/cborntra/REPOS/qemu/cpus.c:977
> #7  0x000003fffc2374e6 in start_thread () from /lib64/libpthread.so.0
> #8  0x000003fffc18b0fa in thread_start () from /lib64/libc.so.6
>
> Thread 1 (Thread 0x3fffb4d0bd0 (LWP 57736)):
> #0  0x000003fffc23b57c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> #1  0x00000000803572fa in qemu_cond_wait (cond=0x809c28c0, mutex=0x809c2890) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:132
> #2  0x000000008036dc3e in rfifolock_lock (r=0x809c2890) at /home/cborntra/REPOS/qemu/util/rfifolock.c:59
> #3  0x0000000080281162 in aio_context_acquire (ctx=0x809c2830) at /home/cborntra/REPOS/qemu/async.c:331
> #4  0x00000000800a2f08 in virtio_blk_data_plane_start (s=0x80a0d6f0) at /home/cborntra/REPOS/qemu/hw/block/dataplane/virtio-blk.c:285
> #5  0x00000000800a0bfe in virtio_blk_handle_output (vdev=0x809b5e18, vq=0x80a70940) at /home/cborntra/REPOS/qemu/hw/block/virtio-blk.c:599
> #6  0x00000000800d065c in virtio_queue_notify_vq (vq=0x80a70940) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:921
> #7  0x00000000800d29b4 in virtio_queue_host_notifier_read (n=0x80a70988) at /home/cborntra/REPOS/qemu/hw/virtio/virtio.c:1480
> #8  0x0000000080293b76 in qemu_iohandler_poll (pollfds=0x809b3600, ret=1) at /home/cborntra/REPOS/qemu/iohandler.c:126
> #9  0x0000000080293714 in main_loop_wait (nonblocking=0) at /home/cborntra/REPOS/qemu/main-loop.c:494
> #10 0x000000008014dc1c in main_loop () at /home/cborntra/REPOS/qemu/vl.c:1902
> #11 0x000000008015627c in main (argc=44, argv=0x3ffffcd66e8, envp=0x3ffffcd6850) at /home/cborntra/REPOS/qemu/vl.c:4653
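
For reference, this is roughly what reinstating a0710f7995 does.  It is a
from-memory sketch of the two shapes of iothread_run() in iothread.c, not
the literal patch: the real commit also touches aio-posix.c/aio-win32.c so
that aio_poll() takes the AioContext only around dispatch and drops it
around the blocking ppoll.  The _locked/_unlocked names below are made up
for the comparison; in the tree both versions are just iothread_run().

/*
 * Sketch only, not the literal a0710f7995 diff.
 *
 * Shape of iothread_run() in the hanging v2.4.0-rc2 tree: the AioContext
 * (an RFifoLock underneath) is held across the blocking aio_poll(), so
 * while thread 3 above sleeps in ppoll(), thread 1's aio_context_acquire()
 * in virtio_blk_data_plane_start() can only proceed once the iothread
 * wakes up and releases the lock.
 */
static void *iothread_run_locked(void *opaque)
{
    IOThread *iothread = opaque;

    while (!iothread->stopping) {
        bool blocking = true;

        aio_context_acquire(iothread->ctx);
        while (!iothread->stopping && aio_poll(iothread->ctx, blocking)) {
            /* Progress was made, keep going.  */
            blocking = false;
        }
        aio_context_release(iothread->ctx);
    }
    return NULL;
}

/*
 * With a0710f7995 cherry-picked, the iothread no longer wraps aio_poll()
 * in acquire/release; aio_poll() itself takes the lock only while it
 * dispatches handlers and drops it around the blocking poll, so the main
 * loop's aio_context_acquire() can slip in between iterations.  That is
 * what "limits the contention on the RFifoLock" above refers to.
 */
static void *iothread_run_unlocked(void *opaque)
{
    IOThread *iothread = opaque;

    while (!iothread->stopping) {
        aio_poll(iothread->ctx, true);
    }
    return NULL;
}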