Date: Mon, 11 Jan 2010 14:42:48 +0100
From: Christoph Hellwig
To: Avi Kivity
Cc: Dor Laor, Vadim Rozenfeld, qemu-devel
Subject: Re: [Qemu-devel] Re: [RFC][PATCH] performance improvement for windows guests, running on top of virtio block device
Message-ID: <20100111134248.GA25622@lst.de>
In-Reply-To: <4B4AE1BD.4000400@redhat.com>
References: <1263195647.2005.44.camel@localhost> <4B4AE1BD.4000400@redhat.com>

On Mon, Jan 11, 2010 at 10:30:53AM +0200, Avi Kivity wrote:
> The patch has potential to reduce performance on volumes with multiple
> spindles.  Consider two processes issuing sequential reads into a RAID
> array.  With this patch, the reads will be executed sequentially rather
> than in parallel, so I think a follow-on patch to make the minimum depth
> a parameter (set by the guest? the host?) would be helpful.

Let's think about the life cycle of I/O requests a bit.

We have an idle virtqueue (aka one virtio-blk device).  The first (read)
request comes in and we get the virtio notify from the guest, which calls
into virtio_blk_handle_output.  With the new code we now disable the
notify once we start processing the first request.

If the second request hits the queue before we call into
virtio_blk_get_request the second time, we're fine even with the new
code, as we keep picking it up.  If, however, it hits after we leave
virtio_blk_handle_output but before we complete the first request, we do
indeed introduce additional latency.

So instead of disabling notify while requests are active, we might want
to disable it only while we are inside virtio_blk_handle_output.
Something like the following minimally tested patch:


Index: qemu/hw/virtio-blk.c
===================================================================
--- qemu.orig/hw/virtio-blk.c	2010-01-11 14:28:42.896010503 +0100
+++ qemu/hw/virtio-blk.c	2010-01-11 14:40:13.535256353 +0100
@@ -328,7 +328,15 @@ static void virtio_blk_handle_output(Vir
     int num_writes = 0;
     BlockDriverState *old_bs = NULL;
 
+    /*
+     * While we are processing requests there is no need to get further
+     * notifications from the guest - it'll just burn cpu cycles doing
+     * useless context switches into the host.
+     */
+    virtio_queue_set_notification(s->vq, 0);
+
     while ((req = virtio_blk_get_request(s))) {
+handle_request:
         if (req->elem.out_num < 1 || req->elem.in_num < 1) {
             fprintf(stderr, "virtio-blk missing headers\n");
             exit(1);
@@ -358,6 +366,18 @@ static void virtio_blk_handle_output(Vir
         }
     }
 
+    /*
+     * Once we're done processing all pending requests re-enable the queue
+     * notification.  If there's an entry pending after we enabled
+     * notification again we hit a race and need to process it before
+     * returning.
+     */
+    virtio_queue_set_notification(s->vq, 1);
+    req = virtio_blk_get_request(s);
+    if (req) {
+        goto handle_request;
+    }
+
     if (num_writes > 0) {
         do_multiwrite(old_bs, blkreq, num_writes);
     }
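
As an aside, the disable / drain / re-enable / re-check dance is not
virtio-specific, so here is a minimal standalone sketch of it in plain C.
This is not QEMU code: the ring type and the ring_push / ring_pop /
process_request / handle_notify helpers are invented purely for
illustration.  The re-check after re-enabling notifications is what closes
the window in which a concurrent producer (the guest, in our case) queues
an entry after our last pop but before it sees notifications turned back
on; in this single-threaded demo that window never actually opens.

/*
 * Standalone sketch of the notify-suppression pattern above -- not QEMU
 * code.  The ring type and the helpers are made up for illustration only.
 */
#include <stdbool.h>
#include <stdio.h>

#define RING_SIZE 64    /* power of two, so index wrap-around stays consistent */

struct ring {
    int      buf[RING_SIZE];
    unsigned head;              /* consumer position */
    unsigned tail;              /* producer position */
    bool     notify_enabled;    /* should the producer kick the consumer? */
};

static bool ring_pop(struct ring *r, int *req)
{
    if (r->head == r->tail)
        return false;
    *req = r->buf[r->head++ % RING_SIZE];
    return true;
}

static void process_request(int req)
{
    printf("processing request %d\n", req);
}

/* Called when the producer notifies us that the ring went non-empty. */
static void handle_notify(struct ring *r)
{
    int req;

    /* While we are draining anyway, further kicks are pure overhead. */
    r->notify_enabled = false;

    while (ring_pop(r, &req)) {
handle_request:
        process_request(req);
    }

    /*
     * Re-enable notifications, then look once more: an entry queued by a
     * concurrent producer between the last (failed) pop and this point
     * would otherwise be neither seen here nor signalled to us.
     */
    r->notify_enabled = true;
    if (ring_pop(r, &req))
        goto handle_request;
}

/* Producer side: queue an entry and kick the consumer if it asked for it. */
static void ring_push(struct ring *r, int req)
{
    r->buf[r->tail++ % RING_SIZE] = req;    /* no full-ring check; sketch only */
    if (r->notify_enabled)
        handle_notify(r);
}

int main(void)
{
    struct ring r = { .notify_enabled = true };

    ring_push(&r, 1);   /* kicks the consumer, which drains the ring */
    ring_push(&r, 2);   /* notify was re-enabled, so this kicks again */
    return 0;
}

The goto just mirrors the structure of the patch above; a do/while-style
re-check would work equally well.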