From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <53675738.7040704@de.ibm.com>
Date: Mon, 05 May 2014 11:17:44 +0200
From: Christian Borntraeger
MIME-Version: 1.0
References: <1398956086-20171-1-git-send-email-stefanha@redhat.com>
In-Reply-To: <1398956086-20171-1-git-send-email-stefanha@redhat.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 00/22] dataplane: use QEMU block layer
To: Stefan Hajnoczi, qemu-devel@nongnu.org
Cc: Kevin Wolf, Paolo Bonzini, "Shergill, Gurinder", "Vinod, Chegu"

On 01/05/14 16:54, Stefan Hajnoczi wrote:
> This patch series switches virtio-blk data-plane from a custom Linux AIO
> request queue to the QEMU block layer. The previous "raw files only"
> limitation is lifted. All image formats and protocols can now be used with
> virtio-blk data-plane.

Nice. Is there a git branch somewhere, so that we can test this on s390?

Christian

>
> How to review this series
> -------------------------
> I CCed the maintainer of each block driver that I modified. You probably don't
> need to review the entire series, just your patch.
>
> From now on fd handlers, timers, BHs, and event loop wait must explicitly use
> BlockDriverState's AioContext instead of the main loop. Use
> bdrv_get_aio_context(bs) to get the AioContext. The following function calls
> need to be converted:
>
>  * qemu_aio_set_fd_handler() -> aio_set_fd_handler()
>  * timer_new*() -> aio_timer_new()
>  * qemu_bh_new() -> aio_bh_new()
>  * qemu_aio_wait() -> aio_poll(aio_context, true)
>
> For simple block drivers this modification suffices and it is now safe to use
> outside the QEMU global mutex.
>
> Block drivers that keep fd handlers, timers, or BHs registered when requests
> have been drained need a little bit more work. Examples of this are network
> block drivers with keepalive timers, like iSCSI.
>
> This series adds a new bdrv_set_aio_context(bs, aio_context) function that
> moves a BlockDriverState into a new AioContext. This function calls the block
> driver's optional .bdrv_detach_aio_context() and .bdrv_attach_aio_context()
> functions. Implement detach/attach to move the fd handlers, timers, or BHs to
> the new AioContext.
>
> Finally, block drivers that manage their own child nodes also need to
> implement detach/attach because the generic block layer doesn't know about
> their children. Both ->file and ->backing_hd are automatically taken care of
> but blkverify, quorum, and VMDK need to manually propagate detach/attach to
> their children.
>
> I have audited and modified all block drivers. Block driver maintainers,
> please check I did it correctly and didn't break your code.
>
> Background
> ----------
> The block layer is currently tied to the QEMU main loop for fd handlers, timer
> callbacks, and BHs. This means that even on hosts with many cores, parts of
> block I/O processing happen in one thread and depend on the QEMU global mutex.
>
> virtio-blk data-plane has shown that 1,000,000 IOPS is achievable if we use
> additional threads that are not under the QEMU global mutex.
>
> It is necessary to make the QEMU block layer aware that there may be more than
> one event loop. This way BlockDriverState can be used from a thread without
> contention on the QEMU global mutex.
>
> This series builds on the aio_context_acquire/release() interface that allows a
> thread to temporarily grab an AioContext. We add bdrv_set_aio_context(bs,
> aio_context) for changing which AioContext a BlockDriverState uses.
>
> The final patches convert virtio-blk data-plane to use the QEMU block layer and
> let the BlockDriverState run in the IOThread AioContext.
>
> What's next?
> ------------
> I have already made block I/O throttling work in another AioContext and will
> send the series out next week.
>
> In order to keep this series reviewable, I'm holding back those patches for
> now. One could say, "throttling" them.
>
> Thank you, thank you, I'll be here all night!
>
> Stefan Hajnoczi (22):
>   block: use BlockDriverState AioContext
>   block: acquire AioContext in bdrv_close_all()
>   block: add bdrv_set_aio_context()
>   blkdebug: use BlockDriverState's AioContext
>   blkverify: implement .bdrv_detach/attach_aio_context()
>   curl: implement .bdrv_detach/attach_aio_context()
>   gluster: use BlockDriverState's AioContext
>   iscsi: implement .bdrv_detach/attach_aio_context()
>   nbd: implement .bdrv_detach/attach_aio_context()
>   nfs: implement .bdrv_detach/attach_aio_context()
>   qed: use BlockDriverState's AioContext
>   quorum: implement .bdrv_detach/attach_aio_context()
>   block/raw-posix: implement .bdrv_detach/attach_aio_context()
>   block/linux-aio: fix memory and fd leak
>   rbd: use BlockDriverState's AioContext
>   sheepdog: implement .bdrv_detach/attach_aio_context()
>   ssh: use BlockDriverState's AioContext
>   vmdk: implement .bdrv_detach/attach_aio_context()
>   dataplane: use the QEMU block layer for I/O
>   dataplane: delete IOQueue since it is no longer used
>   dataplane: implement async flush
>   raw-posix: drop raw_get_aio_fd() since it is no longer used
>
>  block.c                          |  88 +++++++++++++--
>  block/blkdebug.c                 |   2 +-
>  block/blkverify.c                |  47 +++++---
>  block/curl.c                     | 194 +++++++++++++++++++-------------
>  block/gluster.c                  |   7 +-
>  block/iscsi.c                    |  79 +++++++++----
>  block/linux-aio.c                |  24 +++-
>  block/nbd-client.c               |  24 +++-
>  block/nbd-client.h               |   4 +
>  block/nbd.c                      |  87 +++++++++------
>  block/nfs.c                      |  80 ++++++++++----
>  block/qed-table.c                |   8 +-
>  block/qed.c                      |  35 +++++-
>  block/quorum.c                   |  48 ++++++--
>  block/raw-aio.h                  |   3 +
>  block/raw-posix.c                |  82 ++++++++------
>  block/rbd.c                      |   5 +-
>  block/sheepdog.c                 | 118 +++++++++++++-------
>  block/ssh.c                      |  36 +++---
>  block/vmdk.c                     |  23 ++++
>  hw/block/dataplane/Makefile.objs |   2 +-
>  hw/block/dataplane/ioq.c         | 117 --------------------
>  hw/block/dataplane/ioq.h         |  57 ----------
>  hw/block/dataplane/virtio-blk.c  | 233 +++++++++++++++------------------------
>  include/block/block.h            |  20 ++--
>  include/block/block_int.h        |  36 ++++++
>  26 files changed, 829 insertions(+), 630 deletions(-)
>  delete mode 100644 hw/block/dataplane/ioq.c
>  delete mode 100644 hw/block/dataplane/ioq.h
>
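
For readers following the quoted cover letter, the detach/attach idea it describes (a node drops whatever it has registered with its old event loop, re-registers with the new one, and passes the move on to children the generic layer does not know about) can be sketched in a few lines of C. This is a toy model and not QEMU code: ToyAioContext, ToyNode and every toy_* function are hypothetical names made up for illustration, and the sketch deliberately does not call bdrv_set_aio_context() or the .bdrv_detach/attach_aio_context() callbacks, whose signatures are not shown in the mail. The choice to detach children before the parent is arbitrary here.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CHILDREN 4

/* Hypothetical stand-in for an event loop: it only counts the
 * handlers (fd handlers, timers, BHs) currently registered with it. */
typedef struct ToyAioContext {
    int nr_handlers;
} ToyAioContext;

/* Hypothetical stand-in for a block node that may own child nodes. */
typedef struct ToyNode ToyNode;
struct ToyNode {
    ToyAioContext *ctx;      /* event loop this node currently uses */
    int has_keepalive;       /* nonzero: keeps one handler registered while idle */
    ToyNode *children[TOY_MAX_CHILDREN];
    size_t nr_children;
};

/* Create a node in ctx; a keepalive node registers its handler right away. */
void toy_node_init(ToyNode *n, ToyAioContext *ctx, int has_keepalive)
{
    n->ctx = ctx;
    n->has_keepalive = has_keepalive;
    n->nr_children = 0;
    if (has_keepalive) {
        ctx->nr_handlers++;
    }
}

/* Record a child that only the parent knows about. */
void toy_add_child(ToyNode *parent, ToyNode *child)
{
    assert(parent->nr_children < TOY_MAX_CHILDREN);
    assert(parent->ctx == child->ctx);
    parent->children[parent->nr_children++] = child;
}

/* Detach: propagate to children, then unregister this node's own handler. */
static void toy_detach(ToyNode *n)
{
    size_t i;

    for (i = 0; i < n->nr_children; i++) {
        toy_detach(n->children[i]);
    }
    if (n->has_keepalive) {
        n->ctx->nr_handlers--;
    }
    n->ctx = NULL;
}

/* Attach: register this node's handler in new_ctx, then propagate. */
static void toy_attach(ToyNode *n, ToyAioContext *new_ctx)
{
    size_t i;

    n->ctx = new_ctx;
    if (n->has_keepalive) {
        new_ctx->nr_handlers++;
    }
    for (i = 0; i < n->nr_children; i++) {
        toy_attach(n->children[i], new_ctx);
    }
}

/* Move a node and all of its children from their current context to new_ctx. */
void toy_set_aio_context(ToyNode *n, ToyAioContext *new_ctx)
{
    toy_detach(n);
    toy_attach(n, new_ctx);
}
```

In this model, a parent without a keepalive handler and one child with a keepalive handler both start in a "main loop" context; after toy_set_aio_context(&parent, &iothread) the old context has no handlers left and the child's handler is counted in the new one. If the parent did not propagate the call, the child would stay registered with the old loop, which is the situation the cover letter warns about for blkverify, quorum, and VMDK.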