qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@gmail.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	qemu-devel@nongnu.org,
	Christian Borntraeger <borntraeger@de.ibm.com>
Subject: Re: [Qemu-devel] [PATCH v3 00/25] dataplane: use QEMU block layer
Date: Tue, 3 Jun 2014 15:33:31 +0200	[thread overview]
Message-ID: <20140603133331.GC723@stefanha-thinkpad.muc.redhat.com> (raw)
In-Reply-To: <1399559698-31900-1-git-send-email-stefanha@redhat.com>

On Thu, May 08, 2014 at 04:34:33PM +0200, Stefan Hajnoczi wrote:
> v3:
>  * Add assertion if win32_aio_cleanup() is called without detaching [Paolo]
>  * Stash away NFSClient pointer for future use instead of AioContext pointer [Peter Lieven]
>  * Use NFSClient->aio_context where possible instead of bdrv_get_aio_context() [Peter Lieven]
>  * Stash away IscsiLun pointer for future use instead of AioContext pointer [Peter Lieven]
>  * Use IScsiLun->aio_context where possible instead of bdrv_get_aio_context()
>  * Add Benoit's Reviewed-by: for unchanged quorum patch from v1
> 
> v2:
>  * Currently unaddressed: libiscsi nop timer re-entrancy, is it a problem? [Peter Lieven]
>  * Protect bdrv_*_all() with AioContext acquire/release [Christian Borntraeger]
>  * Convert raw-win32 and aio-win32 [Paolo]
>  * Rebase on latest curl.c changes [Fam]
>  * Add curl_attach_aio_context() assert since it shouldn't be called twice [Fam]
>  * Drop curl comment that doesn't make sense in the new code [Fam]
>  * Replace g_slice_new()+memset() with g_slice_new0() in dataplane/virtio-blk.c [Fam]
> 
> This patch series switches virtio-blk data-plane from a custom Linux AIO
> request queue to the QEMU block layer.  The previous "raw files only"
> limitation is lifted.  All image formats and protocols can now be used with
> virtio-blk data-plane.
> 
> How to review this series
> -------------------------
> I CCed the maintainer of each block driver that I modified.  You probably don't
> need to review the entire series, just your patch.
> 
> From now on fd handlers, timers, BHs, and event loop wait must explicitly use
> BlockDriverState's AioContext instead of the main loop.  Use
> bdrv_get_aio_context(bs) to get the AioContext.  The following function calls
> need to be converted:
> 
>  * qemu_aio_set_fd_handler() -> aio_set_fd_handler()
>  * timer_new*() -> aio_timer_new()
>  * qemu_bh_new() -> aio_bh_new()
>  * qemu_aio_wait() -> aio_poll(aio_context, true)
> 
> For simple block drivers this modification suffices and it is now safe to use
> outside the QEMU global mutex.
> 
> Block drivers that keep fd handlers, timers, or BHs registered when requests
> have been drained need a little bit more work.  Examples of this are network
> block drivers with keepalive timers, like iSCSI.
> 
> This series adds a new bdrv_set_aio_context(bs, aio_context) function that
> moves a BlockDriverState into a new AioContext.  This function calls the block
> driver's optional .bdrv_detach_aio_context() and .bdrv_attach_aio_context()
> functions.  Implement detach/attach to move the fd handlers, timers, or BHs to
> the new AioContext.
> 
> Finally, block drivers that manage their own child nodes also need to
> implement detach/attach because the generic block layer doesn't know about
> their children.  Both ->file and ->backing_hd are automatically taken care of
> but blkverify, quorum, and VMDK need to manually propagate detach/attach to
> their children.
> 
> I have audited and modified all block drivers.  Block driver maintainers,
> please check I did it correctly and didn't break your code.
> 
> Background
> ----------
> The block layer is currently tied to the QEMU main loop for fd handlers, timer
> callbacks, and BHs.  This means that even on hosts with many cores, parts of
> block I/O processing happen in one thread and depend on the QEMU global mutex.
> 
> virtio-blk data-plane has shown that 1,000,000 IOPS is achievable if we use
> additional threads that are not under the QEMU global mutex.
> 
> It is necessary to make the QEMU block layer aware that there may be more than
> one event loop.  This way BlockDriverState can be used from a thread without
> contention on the QEMU global mutex.
> 
> This series builds on the aio_context_acquire/release() interface that allows a
> thread to temporarily grab an AioContext.  We add bdrv_set_aio_context(bs,
> aio_context) for changing which AioContext a BlockDriverState uses.
> 
> The final patches convert virtio-blk data-plane to use the QEMU block layer and
> let the BlockDriverState run in the IOThread AioContext.
> 
> What's next?
> ------------
> I have already made block I/O throttling work in another AioContext and will
> send the series out next week.
> 
> In order to keep this series reviewable, I'm holding back those patches for
> now.  One could say, "throttling" them.
> 
> Thank you, thank you, I'll be here all night!
> 
> Stefan Hajnoczi (25):
>   block: use BlockDriverState AioContext
>   block: acquire AioContext in bdrv_*_all()
>   block: acquire AioContext in bdrv_drain_all()
>   block: add bdrv_set_aio_context()
>   blkdebug: use BlockDriverState's AioContext
>   blkverify: implement .bdrv_detach/attach_aio_context()
>   curl: implement .bdrv_detach/attach_aio_context()
>   gluster: use BlockDriverState's AioContext
>   iscsi: implement .bdrv_detach/attach_aio_context()
>   nbd: implement .bdrv_detach/attach_aio_context()
>   nfs: implement .bdrv_detach/attach_aio_context()
>   qed: use BlockDriverState's AioContext
>   quorum: implement .bdrv_detach/attach_aio_context()
>   block/raw-posix: implement .bdrv_detach/attach_aio_context()
>   block/linux-aio: fix memory and fd leak
>   block/raw-win32: create one QEMUWin32AIOState per BDRVRawState
>   block/raw-win32: implement .bdrv_detach/attach_aio_context()
>   rbd: use BlockDriverState's AioContext
>   sheepdog: implement .bdrv_detach/attach_aio_context()
>   ssh: use BlockDriverState's AioContext
>   vmdk: implement .bdrv_detach/attach_aio_context()
>   dataplane: use the QEMU block layer for I/O
>   dataplane: delete IOQueue since it is no longer used
>   dataplane: implement async flush
>   raw-posix: drop raw_get_aio_fd() since it is no longer used
> 
>  block.c                          | 133 +++++++++++++++++-----
>  block/blkdebug.c                 |   2 +-
>  block/blkverify.c                |  47 +++++---
>  block/curl.c                     | 192 +++++++++++++++++++-------------
>  block/gluster.c                  |   7 +-
>  block/iscsi.c                    |  80 ++++++++++----
>  block/linux-aio.c                |  24 +++-
>  block/nbd-client.c               |  24 +++-
>  block/nbd-client.h               |   4 +
>  block/nbd.c                      |  87 +++++++++------
>  block/nfs.c                      |  81 ++++++++++----
>  block/qed-table.c                |   8 +-
>  block/qed.c                      |  35 +++++-
>  block/quorum.c                   |  48 ++++++--
>  block/raw-aio.h                  |   8 ++
>  block/raw-posix.c                |  82 ++++++++------
>  block/raw-win32.c                |  54 ++++++---
>  block/rbd.c                      |   5 +-
>  block/sheepdog.c                 | 118 +++++++++++++-------
>  block/ssh.c                      |  36 +++---
>  block/vmdk.c                     |  23 ++++
>  block/win32-aio.c                |  27 ++++-
>  hw/block/dataplane/Makefile.objs |   2 +-
>  hw/block/dataplane/ioq.c         | 117 --------------------
>  hw/block/dataplane/ioq.h         |  57 ----------
>  hw/block/dataplane/virtio-blk.c  | 232 +++++++++++++++------------------------
>  include/block/block.h            |  20 ++--
>  include/block/block_int.h        |  36 ++++++
>  28 files changed, 929 insertions(+), 660 deletions(-)
>  delete mode 100644 hw/block/dataplane/ioq.c
>  delete mode 100644 hw/block/dataplane/ioq.h

Applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

      parent reply	other threads:[~2014-06-03 13:33 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-08 14:34 [Qemu-devel] [PATCH v3 00/25] dataplane: use QEMU block layer Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 01/25] block: use BlockDriverState AioContext Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 02/25] block: acquire AioContext in bdrv_*_all() Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 03/25] block: acquire AioContext in bdrv_drain_all() Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 04/25] block: add bdrv_set_aio_context() Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 05/25] blkdebug: use BlockDriverState's AioContext Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 06/25] blkverify: implement .bdrv_detach/attach_aio_context() Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 07/25] curl: " Stefan Hajnoczi
2014-05-13  2:06   ` Fam Zheng
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 08/25] gluster: use BlockDriverState's AioContext Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 09/25] iscsi: implement .bdrv_detach/attach_aio_context() Stefan Hajnoczi
2014-05-08 15:47   ` Peter Lieven
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 10/25] nbd: " Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 11/25] nfs: " Stefan Hajnoczi
2014-05-08 15:46   ` Peter Lieven
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 12/25] qed: use BlockDriverState's AioContext Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 13/25] quorum: implement .bdrv_detach/attach_aio_context() Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 14/25] block/raw-posix: " Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 15/25] block/linux-aio: fix memory and fd leak Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 16/25] block/raw-win32: create one QEMUWin32AIOState per BDRVRawState Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 17/25] block/raw-win32: implement .bdrv_detach/attach_aio_context() Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 18/25] rbd: use BlockDriverState's AioContext Stefan Hajnoczi
2014-05-09  0:04   ` Josh Durgin
2014-05-09  8:16     ` Paolo Bonzini
2014-05-09  8:22     ` Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 19/25] sheepdog: implement .bdrv_detach/attach_aio_context() Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 20/25] ssh: use BlockDriverState's AioContext Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 21/25] vmdk: implement .bdrv_detach/attach_aio_context() Stefan Hajnoczi
2014-05-15  1:34   ` Fam Zheng
2014-05-15  7:33     ` Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 22/25] dataplane: use the QEMU block layer for I/O Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 23/25] dataplane: delete IOQueue since it is no longer used Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 24/25] dataplane: implement async flush Stefan Hajnoczi
2014-05-08 14:34 ` [Qemu-devel] [PATCH v3 25/25] raw-posix: drop raw_get_aio_fd() since it is no longer used Stefan Hajnoczi
2014-06-03 13:33 ` Stefan Hajnoczi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140603133331.GC723@stefanha-thinkpad.muc.redhat.com \
    --to=stefanha@gmail.com \
    --cc=borntraeger@de.ibm.com \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).