Message-ID: <53E0E9AF.8030103@beyond.pl>
Date: Tue, 05 Aug 2014 16:26:55 +0200
From: Marcin Gibuła
References: <1407167793-20425-1-git-send-email-stefanha@redhat.com>
In-Reply-To: <1407167793-20425-1-git-send-email-stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] linux-aio: avoid deadlock in nested aio_poll() calls
To: Stefan Hajnoczi, qemu-devel@nongnu.org
Cc: Kevin Wolf, Paolo Bonzini, Ming Lei

On 04.08.2014 17:56, Stefan Hajnoczi wrote:
> If two Linux AIO request completions are fetched in the same
> io_getevents() call, QEMU will deadlock if request A's callback waits
> for request B to complete using an aio_poll() loop. This was reported
> to happen with the mirror blockjob.
>
> This patch moves completion processing into a BH and makes it resumable.
> Nested event loops can resume completion processing so that request B
> will complete and the deadlock will not occur.
>
> Cc: Kevin Wolf
> Cc: Paolo Bonzini
> Cc: Ming Lei
> Cc: Marcin Gibuła
> Reported-by: Marcin Gibuła
> Signed-off-by: Stefan Hajnoczi

Still hangs...

Backtrace still looks like this:

Thread 1 (Thread 0x7f3d5313a900 (LWP 17440)):
#0  0x00007f3d4f38f286 in ppoll () from /lib64/libc.so.6
#1  0x00007f3d5347465b in ppoll (__ss=0x0, __timeout=0x0, __nfds=, __fds=)
    at /usr/include/bits/poll2.h:77
#2  qemu_poll_ns (fds=, nfds=, timeout=)
    at /var/tmp/portage/app-emulation/qemu-2.1.0/work/qemu-2.1.0/qemu-timer.c:314
#3  0x00007f3d53475970 in aio_poll (ctx=ctx@entry=0x7f3d54270c00, blocking=blocking@entry=true)
    at /var/tmp/portage/app-emulation/qemu-2.1.0/work/qemu-2.1.0/aio-posix.c:250
#4  0x00007f3d534695e7 in bdrv_drain_all ()
    at /var/tmp/portage/app-emulation/qemu-2.1.0/work/qemu-2.1.0/block.c:1924
#5  0x00007f3d5346fe1f in bdrv_close (bs=bs@entry=0x7f3d5579b340)
    at /var/tmp/portage/app-emulation/qemu-2.1.0/work/qemu-2.1.0/block.c:1820
#6  0x00007f3d53470047 in bdrv_delete (bs=0x7f3d5579b340)
    at /var/tmp/portage/app-emulation/qemu-2.1.0/work/qemu-2.1.0/block.c:2094
#7  bdrv_unref (bs=0x7f3d5579b340)
    at /var/tmp/portage/app-emulation/qemu-2.1.0/work/qemu-2.1.0/block.c:5376
#8  0x00007f3d5347030b in bdrv_drop_intermediate (active=active@entry=0x7f3d54635e20, top=top@entry=0x7f3d5579b340, base=base@entry=0x7f3d54d956b0, backing_file_str=0x7f3d54d95700 "/mnt/nfs/volumes/7c13c27f-0c48-4676-b075-6e8a3325383e/3785abe6-d2df-49da-9cba-e15cfce8e2af.qcow2")
    at /var/tmp/portage/app-emulation/qemu-2.1.0/work/qemu-2.1.0/block.c:2643
#9  0x00007f3d5335121a in commit_run (opaque=0x7f3d545cdac0)
    at /var/tmp/portage/app-emulation/qemu-2.1.0/work/qemu-2.1.0/block/commit.c:145
#10 0x00007f3d5347ebca in coroutine_trampoline (i0=, i1=)
    at /var/tmp/portage/app-emulation/qemu-2.1.0/work/qemu-2.1.0/coroutine-ucontext.c:118
#11 0x00007f3d4f2f49f0 in ?? () from /lib64/libc.so.6
#12 0x00007fff27d5ef50 in ?? ()
#13 0x0000000000000000 in ?? ()

-- 
mg