From: Stefan Hajnoczi <stefanha@gmail.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
Anthony Liguori <aliguori@us.ibm.com>,
Ping Fan Liu <pingfank@linux.vnet.ibm.com>,
qemu-devel@nongnu.org, Avi Kivity <avi@redhat.com>
Subject: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts""
Date: Tue, 9 Oct 2012 11:08:11 +0200
Message-ID: <20121009090811.GB13775@stefanha-thinkpad.redhat.com>
In-Reply-To: <5072CE54.8020208@redhat.com>
On Mon, Oct 08, 2012 at 03:00:04PM +0200, Paolo Bonzini wrote:
> Il 08/10/2012 13:39, Stefan Hajnoczi ha scritto:
> > 2. Thread pool for dispatching I/O requests outside the QEMU global mutex.
>
> I looked at this in the past and it feels like a dead end to me. I had
> a lot of special code in the thread-pool to mimic yield/enter of
> threadpool work-items. It was needed mostly for I/O throttling, but
> also because it feels unsafe to swap a CoMutex with a Mutex---the
> waiting I/O operations can starve the threadpool.
>
> I now think it is simpler to keep a cooperative coroutine-based
> multitasking in the general case. At the same time you can ensure that
> AIO formats (both Linux and posix-aio-compat) get a suitable
> no-coroutine fast path in the common case of no copy-on-read, no
> throttling, etc. -- which can be done in the current code too.

You're right. Initially we can keep coroutines and add aio fastpaths. There's
no need to make invasive block layer changes (coroutines -> threads) yet.

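A minimal sketch of the fastpath decision, to make the idea concrete. The struct, its fields, and both function names are hypothetical and are not QEMU's real API; the conditions are only the ones named above (a driver with native AIO, no copy-on-read, no throttling), and the "etc." in the quoted text means the real list is longer.

```c
#include <stdbool.h>

/* Hypothetical per-device flags; these are NOT QEMU's real field names. */
typedef struct {
    bool driver_has_native_aio; /* e.g. a raw-posix style AIO driver */
    bool copy_on_read;          /* copy-on-read is enabled */
    bool throttling_enabled;    /* I/O throttling is enabled */
} ToyBlockState;

typedef enum {
    PATH_FAST_AIO,   /* submit directly to the AIO driver */
    PATH_COROUTINE   /* go through the general coroutine path */
} SubmitPath;

/* True when no feature that needs the coroutine path is active. */
static bool can_use_aio_fastpath(const ToyBlockState *bs)
{
    return bs->driver_has_native_aio &&
           !bs->copy_on_read &&
           !bs->throttling_enabled;
}

/* Pick the submission path for one request. */
static SubmitPath choose_submit_path(const ToyBlockState *bs)
{
    return can_use_aio_fastpath(bs) ? PATH_FAST_AIO : PATH_COROUTINE;
}
```

The check is evaluated per request in this sketch, so turning on throttling or copy-on-read later simply routes subsequent requests back to the coroutine path.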
> Another important step would be to add bdrv_drain. Kevin pointed out to
> me that only ->file and ->backing_hd need to be drained. Well, there
> may be other BlockDriverStates for vmdk extents or similar cases
> (Benoit's quorum device for example)... these need to be handled the
> same way for bdrv_flush, bdrv_reopen, bdrv_drain so perhaps it is useful
> to add a common way to get them.
>
> And you need a lock to the AioContext, too. Then the block device can
> use the AioContext lock in order to synchronize multiple threads working
> on the block device. The lock will effectively block the ioeventfd
> thread, so that bdrv_lock+bdrv_drain+...+bdrv_unlock is a replacement
> for the current usage of bdrv_drain_all within the QEMU lock.
>
> > I'm starting to work on these steps and will send RFCs. This series
> > looks good to me.
>
> Thanks! A lot of the next steps can be done in parallel and more
> importantly none of them blocks each other (roughly)... so I'm eager to
> look at your stuff! :)

Some notes on moving virtio-blk processing out of the QEMU global mutex:

1. Dedicated thread for non-QEMU mutex virtio ioeventfd processing.
   The point of this thread is to process without the QEMU global mutex, using
   only fine-grained locks.  (In the future this thread can be integrated back
   into the QEMU iothread when the global mutex has been eliminated.)

   The dedicated thread must hold a reference to the virtio-blk device so it
   will not be destroyed.  Hot unplug requires asking the ioeventfd processing
   threads to release their reference.

2. Versions of virtqueue_pop() and virtqueue_push() that execute outside the
   global QEMU mutex.  Look at memory API and threaded device dispatch.

   The virtio device itself must have a lock so its vring-related state
   can be modified safely.
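
A rough sketch of the reference and lock rules in the two notes above, assuming a pthread mutex as the per-device lock. ToyDevice and every toy_* function are hypothetical names for illustration, not QEMU code. A real worker would run toy_worker_iteration() in a loop in its own thread and would need a wakeup mechanism so that an idle thread notices the unplug request; both are left out here.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a virtio-blk device. */
typedef struct {
    pthread_mutex_t lock;      /* fine-grained lock for the fields below */
    int refcount;              /* owner plus one per processing thread */
    bool unplug_requested;     /* set by hot unplug */
    bool destroyed;            /* set when the last reference is dropped */
    unsigned requests_handled; /* stands in for vring-related state */
} ToyDevice;

static void toy_device_init(ToyDevice *d)
{
    pthread_mutex_init(&d->lock, NULL);
    d->refcount = 1;           /* the owner's reference */
    d->unplug_requested = false;
    d->destroyed = false;
    d->requests_handled = 0;
}

static void toy_device_ref(ToyDevice *d)
{
    pthread_mutex_lock(&d->lock);
    d->refcount++;
    pthread_mutex_unlock(&d->lock);
}

static void toy_device_unref(ToyDevice *d)
{
    pthread_mutex_lock(&d->lock);
    if (--d->refcount == 0) {
        d->destroyed = true;   /* real code would free resources here */
    }
    pthread_mutex_unlock(&d->lock);
}

/* Hot unplug: ask the processing threads to release their reference. */
static void toy_device_request_unplug(ToyDevice *d)
{
    pthread_mutex_lock(&d->lock);
    d->unplug_requested = true;
    pthread_mutex_unlock(&d->lock);
}

/*
 * One iteration of the processing thread's loop.  Returns false once the
 * thread has dropped its reference and must not touch the device again.
 */
static bool toy_worker_iteration(ToyDevice *d)
{
    bool stop;

    pthread_mutex_lock(&d->lock);
    stop = d->unplug_requested;
    if (!stop) {
        d->requests_handled++; /* state change made under the device lock */
    }
    pthread_mutex_unlock(&d->lock);

    if (stop) {
        toy_device_unref(d);
        return false;
    }
    return true;
}
```

The device is only torn down after both the owner and the worker have dropped their references, which is the property hot unplug depends on.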

Here are the steps that have been mentioned:

1. aio fastpath - for raw-posix and other aio block drivers, can we reduce I/O
   request latency by skipping block layer coroutines?  This can be
   prototyped (hacked) easily to scope out how much benefit we get.  It's
   completely independent from the global mutex related work.

2. BlockDriverState <-> AioContext attach.  Allows I/O requests to be processed
   by event loops other than the QEMU iothread.

3. bdrv_drain() and BlockDriverState synchronization.  Make it safe to use
   BlockDriverState outside the QEMU mutex and ensure that bdrv_drain() works.

4. Unlocked event loop thread.  This is similar to QEMU's iothread except it
   doesn't take the big lock.  In theory we could have several of these threads
   processing at the same time.  virtio-blk ioeventfd processing will be done
   in this thread.

5. virtqueue_pop()/virtqueue_push() without QEMU global mutex.  Before this is
   implemented we could temporarily acquire/release the QEMU global mutex.
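
The interim approach in step 5 could look roughly like the sketch below: the handler runs in the unlocked event loop thread and takes the big lock only around the pop and push calls. Everything here is a toy stand-in under stated assumptions: toy_global_mutex plays the role of the QEMU global mutex, and ToyQueue, toy_pop() and toy_push() are hypothetical replacements for the virtqueue and for virtqueue_pop()/virtqueue_push().

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_SIZE 4

/* Stand-in for the QEMU global mutex. */
static pthread_mutex_t toy_global_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Toy queue: requests are plain ints, no real vring layout. */
typedef struct {
    int avail[TOY_QUEUE_SIZE];
    size_t avail_head;   /* next request to pop */
    size_t avail_tail;   /* one past the last queued request */
    int used[TOY_QUEUE_SIZE];
    size_t used_count;   /* number of completed requests */
} ToyQueue;

/* Stand-in for virtqueue_pop(); caller must hold toy_global_mutex. */
static bool toy_pop(ToyQueue *q, int *req)
{
    if (q->avail_head == q->avail_tail) {
        return false;
    }
    *req = q->avail[q->avail_head++];
    return true;
}

/* Stand-in for virtqueue_push(); caller must hold toy_global_mutex. */
static void toy_push(ToyQueue *q, int req)
{
    if (q->used_count < TOY_QUEUE_SIZE) {
        q->used[q->used_count++] = req;
    }
}

/* Notify handler run by the unlocked event loop thread. */
static size_t toy_handle_notify(ToyQueue *q)
{
    size_t completed = 0;

    for (;;) {
        int req;
        bool ok;

        pthread_mutex_lock(&toy_global_mutex);   /* big lock: pop only */
        ok = toy_pop(q, &req);
        pthread_mutex_unlock(&toy_global_mutex);
        if (!ok) {
            break;
        }

        /* The actual I/O would be submitted here without the big lock. */

        pthread_mutex_lock(&toy_global_mutex);   /* big lock: push only */
        toy_push(q, req);
        pthread_mutex_unlock(&toy_global_mutex);
        completed++;
    }
    return completed;
}
```

The request is completed inline to keep the example short; with real asynchronous I/O the push would happen in the completion callback, again holding the lock only for that call.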
Let me run benchmarks on the aio fastpath. I'm curious how much difference it
makes.
I'm also curious about virtqueue_pop()/virtqueue_push() outside the QEMU mutex
although that might be blocked by the current work around MMIO/PIO dispatch
outside the global mutex.
Stefan