From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:45881)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1S10j3-00083D-Ej for qemu-devel@nongnu.org;
	Fri, 24 Feb 2012 14:22:43 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1S10j1-00013Y-Mm for qemu-devel@nongnu.org;
	Fri, 24 Feb 2012 14:22:41 -0500
Received: from mail-pz0-f45.google.com ([209.85.210.45]:43899)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1S10j1-00013M-Gh for qemu-devel@nongnu.org;
	Fri, 24 Feb 2012 14:22:39 -0500
Received: by dadp14 with SMTP id p14so3370337dad.4 for ;
	Fri, 24 Feb 2012 11:22:38 -0800 (PST)
Message-ID: <4F47E37A.6000702@codemonkey.ws>
Date: Fri, 24 Feb 2012 13:22:34 -0600
From: Anthony Liguori
MIME-Version: 1.0
References: <4F47DEB1.7080009@redhat.com>
In-Reply-To: <4F47DEB1.7080009@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] converting the block layer from coroutines to threads
To: Paolo Bonzini
Cc: "Michael S. Tsirkin", qemu-devel, Stefan Hajnoczi

On 02/24/2012 01:02 PM, Paolo Bonzini wrote:
> Hi all,
>
> a few weeks ago Stefan Hajnoczi pointed me to his work on virtio-blk
> performance.
>
> Stefan's work had two sides. First, he captured very nice performance
> data of the block layer at
> http://www.linux-kvm.org/page/Virtio/Block/Latency; second, in order to
> measure peak performance, he basically implemented "vhost-blk" in
> userspace.

I don't think the improvements here have anything to do with the block
layer. We've done the same thing with virtio-net and saw impressive
performance results as a consequence. Conversely, we see a similar
improvement by applying the same technique to vhost-net.

Virtio really wants each virtqueue to be processed in a separate thread.
On a multicore system, there's considerable improvement doing this.

I think that's where we ought to start. We really just need the block
layer to be re-entrant; we don't actually need qcow2 or anything else
that uses coroutines to use full threads. Or at least, as far as I know,
we don't have any performance data to suggest that we do.

Regards,

Anthony Liguori