From: Stefan Hajnoczi <stefanha@redhat.com>
To: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
Stefan Hajnoczi <stefanha@gmail.com>,
"Shergill, Gurinder" <gurinder.shergill@hp.com>,
qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
"Vinod, Chegu" <chegu_vinod@hp.com>
Subject: Re: [Qemu-devel] [PATCH 00/22] dataplane: use QEMU block layer
Date: Tue, 6 May 2014 10:39:53 +0200 [thread overview]
Message-ID: <20140506083953.GD8923@stefanha-thinkpad.redhat.com> (raw)
In-Reply-To: <53678811.7030403@de.ibm.com>
On Mon, May 05, 2014 at 02:46:09PM +0200, Christian Borntraeger wrote:
> On 05/05/14 14:05, Stefan Hajnoczi wrote:
> > On Mon, May 05, 2014 at 11:17:44AM +0200, Christian Borntraeger wrote:
> >> On 01/05/14 16:54, Stefan Hajnoczi wrote:
> >>> This patch series switches virtio-blk data-plane from a custom Linux AIO
> >>> request queue to the QEMU block layer. The previous "raw files only"
> >>> limitation is lifted. All image formats and protocols can now be used with
> >>> virtio-blk data-plane.
> >>
> >> Nice. Is there a git branch somewhere, so that we can test this on s390?
> >
> > Hi Christian,
> > I'm getting to work on v2 but you can grab this v1 series from git in
> > the meantime:
> >
> > https://github.com/stefanha/qemu.git bdrv_set_aio_context
> >
> > Stefan
> >
>
> In general the main path seems to work fine.
>
> With lots of devices (one qcow2, 23 raw scsi disks)
> I get a hang on shutdown. kvm_stat claims that nothing is going on any more, but somehow threads are stuck in ppoll.
>
> gdb tells me that
>
> all cpus have
> #0 0x000003fffcde0ba0 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1 0x000003fffcde3c0c in __pthread_mutex_cond_lock () from /lib64/libpthread.so.0
> #2 0x000003fffcddc99a in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> #3 0x00000000801f183a in qemu_cond_wait (cond=<optimized out>, mutex=mutex@entry=0x8072ba30 <qemu_global_mutex>) at /home/cborntra/REPOS/qemu/util/qemu-thread-posix.c:135
> #4 0x00000000801512f2 in qemu_kvm_wait_io_event (cpu=<optimized out>) at /home/cborntra/REPOS/qemu/cpus.c:842
> #5 qemu_kvm_cpu_thread_fn (arg=0x80a53e10) at /home/cborntra/REPOS/qemu/cpus.c:878
>
> all iothreads have
> #0 0x000003fffbc348e0 in ppoll () from /lib64/libc.so.6
> #1 0x00000000800fcce6 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
> #2 qemu_poll_ns (fds=fds@entry=0x3fff4001b00, nfds=nfds@entry=3, timeout=-1) at /home/cborntra/REPOS/qemu/qemu-timer.c:311
> #3 0x000000008001ae4c in aio_poll (ctx=0x807dd610, blocking=blocking@entry=true) at /home/cborntra/REPOS/qemu/aio-posix.c:221
> #4 0x00000000800b2f6c in iothread_run (opaque=0x807dd4c8) at /home/cborntra/REPOS/qemu/iothread.c:41
> #5 0x000003fffcdd8412 in start_thread () from /lib64/libpthread.so.0
> #6 0x000003fffbc3f0ae in thread_start () from /lib64/libc.so.6
>
> the main thread has
> Thread 1 (Thread 0x3fff9e5c9b0 (LWP 33684)):
> #0 0x000003fffbc348e0 in ppoll () from /lib64/libc.so.6
> #1 0x00000000800fcce6 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
> #2 qemu_poll_ns (fds=fds@entry=0x80ae8030, nfds=nfds@entry=4, timeout=-1) at /home/cborntra/REPOS/qemu/qemu-timer.c:311
> #3 0x000000008001ae4c in aio_poll (ctx=ctx@entry=0x809a7ea0, blocking=blocking@entry=true) at /home/cborntra/REPOS/qemu/aio-posix.c:221
> #4 0x0000000080030c46 in bdrv_flush (bs=bs@entry=0x807e5900) at /home/cborntra/REPOS/qemu/block.c:4904
> #5 0x0000000080030ce8 in bdrv_flush_all () at /home/cborntra/REPOS/qemu/block.c:3723
> #6 0x0000000080152fe8 in do_vm_stop (state=<optimized out>) at /home/cborntra/REPOS/qemu/cpus.c:538
> #7 vm_stop (state=<optimized out>) at /home/cborntra/REPOS/qemu/cpus.c:1219
> #8 0x0000000000000000 in ?? ()
>
>
> How are the ppoll calls supposed to return if there is nothing going on?
The AioContext event loop includes an event notifier in its ppoll() fd set
precisely so that the AioContext can be kicked. Setting that notifier from
another thread is how you make aio_poll() return even when no I/O is pending.
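For illustration only (this is not code from the series), the wakeup boils
down to the snippet below; the helper name is made up, but aio_notify() is
the real entry point:

  #include "block/aio.h"

  /* Called from a thread other than the one blocked in aio_poll(ctx, true). */
  void kick_aio_context(AioContext *ctx)
  {
      aio_notify(ctx);   /* sets the AioContext's EventNotifier; its fd is
                          * part of the ppoll() set, so ppoll() returns and
                          * the event loop re-checks handlers and bottom
                          * halves */
  }

Bottom halves scheduled with aio_bh_new()/qemu_bh_schedule() call
aio_notify() internally, which is how work normally gets dispatched into an
IOThread's event loop.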
> PS: I think I have seen this before recently during managedsave, so it might have been introduced with the iothread rework instead of this one.
I suspect this is due to a race condition in bdrv_flush_all(). In this
series I added AioContext acquire/release for bdrv_close_all() so that
vl.c:main() shutdown works. It's probably a similar issue.
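To illustrate the kind of change I have in mind (a sketch only, not the
actual patch I'll send), bdrv_flush_all() could take each BlockDriverState's
AioContext around the synchronous flush, the same way bdrv_close_all() does
in this series:

  #include "block/aio.h"
  #include "block/block.h"

  int bdrv_flush_all(void)
  {
      BlockDriverState *bs;
      int result = 0;

      for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
          AioContext *ctx = bdrv_get_aio_context(bs);
          int ret;

          aio_context_acquire(ctx);   /* own the context while flushing */
          ret = bdrv_flush(bs);       /* nested aio_poll() now runs with
                                       * the context held */
          aio_context_release(ctx);

          if (ret < 0 && !result) {
              result = ret;           /* report the first error */
          }
      }
      return result;
  }

With the context held, the main loop and the IOThread serialize on the same
AioContext instead of both polling it at once.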
Thanks for raising this issue; I'll investigate and send a fix. I suspect
this is not the same issue you saw during managedsave.
Stefan
Thread overview: 52+ messages
2014-05-01 14:54 [Qemu-devel] [PATCH 00/22] dataplane: use QEMU block layer Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 01/22] block: use BlockDriverState AioContext Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 02/22] block: acquire AioContext in bdrv_close_all() Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 03/22] block: add bdrv_set_aio_context() Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 04/22] blkdebug: use BlockDriverState's AioContext Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 05/22] blkverify: implement .bdrv_detach/attach_aio_context() Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 06/22] curl: " Stefan Hajnoczi
2014-05-04 11:00 ` Fam Zheng
2014-05-05 11:52 ` Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 07/22] gluster: use BlockDriverState's AioContext Stefan Hajnoczi
2014-05-05 8:39 ` Bharata B Rao
2014-05-01 14:54 ` [Qemu-devel] [PATCH 08/22] iscsi: implement .bdrv_detach/attach_aio_context() Stefan Hajnoczi
2014-05-01 22:39 ` Peter Lieven
2014-05-07 10:07 ` Stefan Hajnoczi
2014-05-07 10:29 ` Paolo Bonzini
2014-05-07 14:09 ` Peter Lieven
2014-05-08 11:33 ` Stefan Hajnoczi
2014-05-08 14:52 ` ronnie sahlberg
2014-05-08 15:45 ` Peter Lieven
2014-05-01 14:54 ` [Qemu-devel] [PATCH 09/22] nbd: " Stefan Hajnoczi
2014-05-02 7:40 ` Paolo Bonzini
2014-05-01 14:54 ` [Qemu-devel] [PATCH 10/22] nfs: " Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 11/22] qed: use BlockDriverState's AioContext Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 12/22] quorum: implement .bdrv_detach/attach_aio_context() Stefan Hajnoczi
2014-05-05 15:46 ` Benoît Canet
2014-05-01 14:54 ` [Qemu-devel] [PATCH 13/22] block/raw-posix: " Stefan Hajnoczi
2014-05-02 7:39 ` Paolo Bonzini
2014-05-02 11:45 ` Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 14/22] block/linux-aio: fix memory and fd leak Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 15/22] rbd: use BlockDriverState's AioContext Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 16/22] sheepdog: implement .bdrv_detach/attach_aio_context() Stefan Hajnoczi
2014-05-05 8:10 ` Liu Yuan
2014-05-01 14:54 ` [Qemu-devel] [PATCH 17/22] ssh: use BlockDriverState's AioContext Stefan Hajnoczi
2014-05-01 15:03 ` Richard W.M. Jones
2014-05-01 15:13 ` Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 18/22] vmdk: implement .bdrv_detach/attach_aio_context() Stefan Hajnoczi
2014-05-04 9:50 ` Fam Zheng
2014-05-04 10:17 ` Fam Zheng
2014-05-05 12:03 ` Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 19/22] dataplane: use the QEMU block layer for I/O Stefan Hajnoczi
2014-05-04 11:51 ` Fam Zheng
2014-05-05 12:03 ` Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 20/22] dataplane: delete IOQueue since it is no longer used Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 21/22] dataplane: implement async flush Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 22/22] raw-posix: drop raw_get_aio_fd() since it is no longer used Stefan Hajnoczi
2014-05-02 7:42 ` [Qemu-devel] [PATCH 00/22] dataplane: use QEMU block layer Paolo Bonzini
2014-05-02 11:59 ` Stefan Hajnoczi
2014-05-05 9:17 ` Christian Borntraeger
2014-05-05 12:05 ` Stefan Hajnoczi
2014-05-05 12:46 ` Christian Borntraeger
2014-05-06 8:39 ` Stefan Hajnoczi [this message]
2014-05-06 13:30 ` Stefan Hajnoczi