From: Stefan Hajnoczi <stefanha@redhat.com>
To: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
Anthony Liguori <aliguori@us.ibm.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Blue Swirl <blauwirbel@gmail.com>,
khoa@us.ibm.com, Stefan Hajnoczi <stefanha@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
asias@redhat.com
Subject: [Qemu-devel] [PATCH v5 00/11] virtio: virtio-blk data plane
Date: Wed, 5 Dec 2012 21:46:59 +0100
Message-ID: <1354740430-22452-1-git-send-email-stefanha@redhat.com>
This series adds the -device virtio-blk-pci,x-data-plane=on property that
enables a high performance I/O codepath. A dedicated thread is used to process
virtio-blk requests outside the global mutex and without going through the QEMU
block layer.
Khoa Huynh <khoa@us.ibm.com> reported an increase from 140,000 IOPS to 600,000
IOPS for a single VM using virtio-blk-data-plane in July:
http://comments.gmane.org/gmane.comp.emulators.kvm.devel/94580
The virtio-blk-data-plane approach was originally presented at Linux Plumbers
Conference 2010. The following slides contain a brief overview:
http://linuxplumbersconf.org/2010/ocw/system/presentations/651/original/Optimizing_the_QEMU_Storage_Stack.pdf
The basic approach is:
1. Each virtio-blk device has a thread dedicated to handling ioeventfd
signalling when the guest kicks the virtqueue.
2. Requests are processed without going through the QEMU block layer using
Linux AIO directly.
3. Completion interrupts are injected via irqfd from the dedicated thread.
To try it out:
  qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=... \
       -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on
Limitations:
* Only format=raw is supported
* Live migration is not supported
* Block jobs, hot unplug, and other operations fail with -EBUSY
* I/O throttling limits are ignored
* Only Linux hosts are supported due to Linux AIO usage
The code has reached a stage where I feel it is ready to merge. Users have
been playing with it for some time and want the significant performance boost.
We are refactoring QEMU to get rid of the global mutex. I believe that
virtio-blk-data-plane can eventually become the default mode of operation.
Instead of waiting for global mutex removal efforts to finish, I want to use
virtio-blk-data-plane as an example device for AioContext and threaded hw
dispatch refactoring. This means:
1. When the block layer can bind to an AioContext and execute I/O outside the
global mutex, virtio-blk-data-plane can use this (and gain image format
support).
2. When hw dispatch no longer needs the global mutex we can use hw/virtio.c
again and perhaps run a pool of iothreads instead of dedicated data plane
threads.
But in the meantime, I have cleaned up the virtio-blk-data-plane code so that
it can be merged as an experimental feature.
v5:
* Omit memory regions with dirty logging enabled from hostmem [Michael]
* Add doc comment about quiescing requests across memory hot unplug [Michael]
* Clarify which Linux vhost version the vring code originates from [Michael]
* Break up indirect vring buffer into 1 hostmem_lookup() per descriptor [Michael]
* Barriers in hw/dataplane/vring.c to force fields to be loaded [Michael]
* Split vring_set_notification() into enable/disable [Paolo]
* Barriers in vring.c instead of virtio-blk.c [Michael]
* Move setup code from hw/virtio-blk.c into hw/dataplane/virtio-blk.c [Michael]
* Note I did not get rid of the mutex+condvar approach to draining requests.
I've had good feedback on the performance of the patch series so I'm not
worried about eliminating the lock (it's very rarely contended). Hope
Michael and Paolo are okay with this approach.
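For readers unfamiliar with the mutex+condvar draining idiom mentioned in the last item, here is a minimal sketch of the general technique. It is not the code in hw/dataplane/virtio-blk.c; the class name InflightTracker and its methods are hypothetical. The lock is only taken when a request starts or completes and when someone drains, which is why contention is rare.

```python
import threading


class InflightTracker:
    """Counts in-flight requests and lets a caller wait until none remain."""

    def __init__(self):
        # A Condition owns its own lock; every method takes it briefly.
        self._idle = threading.Condition()
        self._inflight = 0

    def start_request(self):
        with self._idle:
            self._inflight += 1

    def complete_request(self):
        with self._idle:
            self._inflight -= 1
            if self._inflight == 0:
                # Wake anyone blocked in drain().
                self._idle.notify_all()

    def drain(self, timeout=None):
        """Block until no requests are in flight; False if the timeout hit."""
        with self._idle:
            return self._idle.wait_for(lambda: self._inflight == 0, timeout)
```

A caller that must quiesce the device, for example before the guest is stopped, would call drain() and proceed only once it returns True.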
v4:
* Add qemu_iovec_concat_iov() [Paolo]
* Use QEMUIOVector to copy out virtio_blk_inhdr [Michael, Paolo]
v3:
* Don't assume iovec layout [Michael]
* Better naming for hostmem.c MemoryListener callbacks [Don]
* More vring quarantining if commands are bogus instead of exiting [Blue]
v2:
* Use MemoryListener for thread-safe memory mapping [Paolo, Anthony, and everyone else pointed this out ;-)]
* Quarantine invalid vring instead of exiting [Blue]
* Replace __u16 kernel types with uint16_t [Blue]
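The host memory mapping that several of the changelog items refer to can be pictured as a sorted table of guest memory regions that is searched for each descriptor. The sketch below is an assumption about the general shape of such a lookup, not a transcription of hw/dataplane/hostmem.c: Region, HostMem and lookup() are hypothetical names, addresses are plain integers, and the MemoryListener updates and locking are left out. It does model two points from the changelog: regions with dirty logging enabled are omitted, and a lookup only succeeds if the whole range fits in one region, which is why an indirect buffer needs one lookup per descriptor.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    guest_addr: int             # start of the region in guest physical memory
    size: int                   # length in bytes
    host_addr: int              # where the region is mapped in the host
    dirty_logging: bool = False


class HostMem:
    """Translate guest physical ranges to host addresses (illustrative)."""

    def __init__(self, regions):
        # Regions with dirty logging enabled are left out of the table.
        usable = [r for r in regions if not r.dirty_logging]
        self._regions = sorted(usable, key=lambda r: r.guest_addr)
        self._starts = [r.guest_addr for r in self._regions]

    def lookup(self, guest_addr, length):
        """Return the host address, or None if the range is not fully mapped."""
        # Find the last region that starts at or before guest_addr.
        index = bisect.bisect_right(self._starts, guest_addr) - 1
        if index < 0:
            return None
        region = self._regions[index]
        offset = guest_addr - region.guest_addr
        if offset + length > region.size:
            return None  # in a hole, or spilling past the end of the region
        return region.host_addr + offset
```

Returning None instead of raising mirrors the "quarantine instead of exiting" items above: the caller can treat a failed lookup as a bogus request from the guest.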
Changes from the RFC v9:
* Add x-data-plane=on|off option and coexist with regular virtio-blk code
* Create thread from BH so it inherits iothread cpusets
* Drain requests on vm_stop() so stopped guest does not access image file
* Add migration blocker
* Add bdrv_in_use() to prevent block jobs and other operations that can interfere
* Drop IOQueue request merging for simplicity
* Drop ioctl interrupt injection and always use irqfd for simplicity
* Major cleanup to split up source files
* Rebase from qemu-kvm.git onto qemu.git
* Address Michael Tsirkin's review comments
Stefan Hajnoczi (11):
raw-posix: add raw_get_aio_fd() for virtio-blk-data-plane
configure: add CONFIG_VIRTIO_BLK_DATA_PLANE
dataplane: add host memory mapping code
dataplane: add virtqueue vring code
dataplane: add event loop
dataplane: add Linux AIO request queue
iov: add iov_discard() to remove data
test-iov: add iov_discard() testcase
iov: add qemu_iovec_concat_iov()
dataplane: add virtio-blk data plane code
virtio-blk: add x-data-plane=on|off performance feature
block.h | 9 +
block/raw-posix.c | 34 ++++
configure | 21 ++
hw/Makefile.objs | 2 +-
hw/dataplane/Makefile.objs | 3 +
hw/dataplane/event-poll.c | 109 +++++++++++
hw/dataplane/event-poll.h | 40 ++++
hw/dataplane/hostmem.c | 173 +++++++++++++++++
hw/dataplane/hostmem.h | 57 ++++++
hw/dataplane/ioq.c | 118 ++++++++++++
hw/dataplane/ioq.h | 57 ++++++
hw/dataplane/virtio-blk.c | 463 +++++++++++++++++++++++++++++++++++++++++++++
hw/dataplane/virtio-blk.h | 43 +++++
hw/dataplane/vring.c | 361 +++++++++++++++++++++++++++++++++++
hw/dataplane/vring.h | 63 ++++++
hw/virtio-blk.c | 28 ++-
hw/virtio-blk.h | 1 +
hw/virtio-pci.c | 3 +
iov.c | 80 ++++++--
iov.h | 13 ++
qemu-common.h | 3 +
tests/test-iov.c | 129 +++++++++++++
trace-events | 9 +
23 files changed, 1805 insertions(+), 14 deletions(-)
create mode 100644 hw/dataplane/Makefile.objs
create mode 100644 hw/dataplane/event-poll.c
create mode 100644 hw/dataplane/event-poll.h
create mode 100644 hw/dataplane/hostmem.c
create mode 100644 hw/dataplane/hostmem.h
create mode 100644 hw/dataplane/ioq.c
create mode 100644 hw/dataplane/ioq.h
create mode 100644 hw/dataplane/virtio-blk.c
create mode 100644 hw/dataplane/virtio-blk.h
create mode 100644 hw/dataplane/vring.c
create mode 100644 hw/dataplane/vring.h
--
1.8.0.1