Date: Fri, 7 Dec 2012 07:12:55 +0100
From: Stefan Hajnoczi
To: "Michael S. Tsirkin"
Cc: Kevin Wolf, Anthony Liguori, qemu-devel@nongnu.org, Blue Swirl, khoa@us.ibm.com, Paolo Bonzini, asias@redhat.com
Subject: Re: [Qemu-devel] [PATCH v5 00/11] virtio: virtio-blk data plane
Message-ID: <20121207061255.GA7504@stefanha-thinkpad.redhat.com>
In-Reply-To: <20121206113828.GN10837@redhat.com>
References: <1354740430-22452-1-git-send-email-stefanha@redhat.com> <20121206113828.GN10837@redhat.com>

On Thu, Dec 06, 2012 at 01:38:28PM +0200, Michael S. Tsirkin wrote:
> On Wed, Dec 05, 2012 at 09:46:59PM +0100, Stefan Hajnoczi wrote:
> > This series adds the -device virtio-blk-pci,x-data-plane=on property that
> > enables a high performance I/O codepath. A dedicated thread is used to process
> > virtio-blk requests outside the global mutex and without going through the QEMU
> > block layer.
> >
> > Khoa Huynh reported an increase from 140,000 IOPS to 600,000
> > IOPS for a single VM using virtio-blk-data-plane in July:
> >
> >   http://comments.gmane.org/gmane.comp.emulators.kvm.devel/94580
> >
> > The virtio-blk-data-plane approach was originally presented at Linux Plumbers
> > Conference 2010.
> > The following slides contain a brief overview:
> >
> >   http://linuxplumbersconf.org/2010/ocw/system/presentations/651/original/Optimizing_the_QEMU_Storage_Stack.pdf
> >
> > The basic approach is:
> > 1. Each virtio-blk device has a thread dedicated to handling ioeventfd
> >    signalling when the guest kicks the virtqueue.
> > 2. Requests are processed without going through the QEMU block layer using
> >    Linux AIO directly.
> > 3. Completion interrupts are injected via irqfd from the dedicated thread.
> >
> > To try it out:
> >
> >   qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=...
> >        -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on
> >
> > Limitations:
> >  * Only format=raw is supported
> >  * Live migration is not supported
> >  * Block jobs, hot unplug, and other operations fail with -EBUSY
> >  * I/O throttling limits are ignored
> >  * Only Linux hosts are supported due to Linux AIO usage
> >
> > The code has reached a stage where I feel it is ready to merge. Users have
> > been playing with it for some time and want the significant performance boost.
> >
> > We are refactoring QEMU to get rid of the global mutex. I believe that
> > virtio-blk-data-plane can eventually become the default mode of operation.
> >
> > Instead of waiting for global mutex removal efforts to finish, I want to use
> > virtio-blk-data-plane as an example device for AioContext and threaded hw
> > dispatch refactoring. This means:
> >
> > 1. When the block layer can bind to an AioContext and execute I/O outside the
> >    global mutex, virtio-blk-data-plane can use this (and gain image format
> >    support).
> >
> > 2. When hw dispatch no longer needs the global mutex we can use hw/virtio.c
> >    again and perhaps run a pool of iothreads instead of dedicated data plane
> >    threads.
> >
> > But in the meantime, I have cleaned up the virtio-blk-data-plane code so that
> > it can be merged as an experimental feature.
>
> I mostly looked at the virtio side of the patchset.
> I don't see any bugs here. I sent some improvement suggestions but
> we can do them in tree as well.

Thanks Michael. I'll send follow-up patches to split the iov_discard()
function and to address config-wce.

Stefan
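
As a rough illustration of the three-step approach quoted above (kick
notification, I/O from a dedicated thread, completion notification), here is a
toy user-space sketch in Python. It is not QEMU code and makes several
assumptions: threading.Event stands in for both ioeventfd and irqfd,
queue.Queue stands in for the virtqueue, and a synchronous os.pread() stands in
for Linux AIO (which is asynchronous in the real code path). The class name
DataPlaneSketch, its methods, and the 512-byte sector size are hypothetical.

```python
import os
import queue
import tempfile
import threading

SECTOR_SIZE = 512  # assumption: 512-byte sectors, chosen only for the demo


class DataPlaneSketch:
    """Toy model of one dedicated thread per device (all names hypothetical)."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDONLY)
        self.vring = queue.Queue()      # stand-in for the virtqueue
        self.kick = threading.Event()   # stand-in for ioeventfd
        self.irq = threading.Event()    # stand-in for irqfd
        self.completed = []             # (sector, data) pairs, in completion order
        self.stopping = False
        self.thread = threading.Thread(target=self._run)
        self.thread.start()

    def submit_read(self, sector):
        """'Guest' side: queue a request, then kick the dedicated thread."""
        self.vring.put(sector)
        self.kick.set()

    def _run(self):
        while True:
            # Step 1: sleep until the guest kicks the queue.
            self.kick.wait()
            self.kick.clear()
            # Sample the stop flag before draining, so that every request
            # queued before stop() is still processed.
            stop_requested = self.stopping
            while True:
                try:
                    sector = self.vring.get_nowait()
                except queue.Empty:
                    break
                # Step 2: do the I/O directly on the raw file, without any
                # block layer (synchronous pread here, unlike Linux AIO).
                data = os.pread(self.fd, SECTOR_SIZE, sector * SECTOR_SIZE)
                self.completed.append((sector, data))
                # Step 3: signal completion from the dedicated thread.
                self.irq.set()
            if stop_requested:
                break

    def stop(self):
        self.stopping = True
        self.kick.set()  # wake the thread so it notices the stop flag
        self.thread.join()
        os.close(self.fd)


def demo():
    """Create a two-sector raw 'image', read both sectors, return the results."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"A" * SECTOR_SIZE + b"B" * SECTOR_SIZE)
        path = f.name
    try:
        dp = DataPlaneSketch(path)
        dp.submit_read(1)
        dp.submit_read(0)
        dp.stop()
        return dp.completed, dp.irq.is_set()
    finally:
        os.unlink(path)
```

Calling demo() reads sector 1 and then sector 0 from a temporary raw file on
the worker thread. The point of the sketch is only the shape of the loop: the
main thread never touches the file, and no lock is shared between the submitter
and the I/O thread beyond the queue itself.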