All of lore.kernel.org
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: "Vincent, Pradeep" <pradeepv@amazon.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Jan Beulich <JBeulich@novell.com>,
	Daniel Stodden <daniel.stodden@citrix.com>
Subject: Re: [PATCH] blkback: Fix block I/O latency issue
Date: Mon, 16 May 2011 11:22:24 -0400	[thread overview]
Message-ID: <20110516152224.GA7195@dumpdata.com> (raw)
In-Reply-To: <20110513025132.GA4652@dumpdata.com>

[-- Attachment #1: Type: text/plain, Size: 2879 bytes --]

On Thu, May 12, 2011 at 10:51:32PM -0400, Konrad Rzeszutek Wilk wrote:
> > >>what were the numbers when it came to high bandwidth numbers
> > 
> > Under high I/O workload, where the blkfront would fill up the queue as
> > blkback works the queue, the I/O latency problem in question doesn't
> > manifest itself and as a result this patch doesn't make much of a
> > difference in terms of interrupt rate. My benchmarks didn't show any
> > significant effect.
> 
> I have to rerun my benchmarks. Under high load (so 64Kb, four threads
> writting as much as they can to a iSCSI disk), the IRQ rate for each
> blkif went from 2-3/sec to ~5K/sec. But I did not do a good
> job on capturing the submission latency to see if the I/Os get the
> response back as fast (or the same) as without your patch.
> 
> And the iSCSI disk on the target side was an RAMdisk, so latency
> was quite small which is not fair to your problem.
> 
> Do you have a program to measure the latency for the workload you
> had encountered? I would like to run those numbers myself.

Ran some more benchmarks over this week. This time I tried to run it on:

 - iSCSI target (1GB, and on the "other side" it wakes up every 1msec, so the
   latency is set to 1msec).
 - scsi_debug delay=0 (no delay and as fast possible. Comes out to be about
   4 microseconds completion with queue depth of one with 32K I/Os).
 - local SATAI 80GB ST3808110AS. Still running as it is quite slow.

With only one PV guest doing a round (three times) of two threads randomly
writting I/Os with a queue depth of 256. Then a different round of four
threads writting/reading (80/20) 512bytes up to 64K randomly over the disk.

I used the attached patch against #master (git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git)
to gauge how well we are doing (and what the interrupt generation rate is).

These workloads I think would be considered 'high I/O' and I was expecting
your patch to not have any influence on the numbers.

But to my surprise the case where the I/O latency is high, the interrupt generation
was quite small. But where the I/O latency was very very small (4 microseconds)
the interrupt generation was on average about 20K/s. And this is with a queue depth
of 256 with four threads. I was expecting the opposite. Hence quite curious
to see your use case.

What do you consider a middle I/O and low I/O cases? Do you use 'fio' for your
testing?

With the high I/O load, the numbers came out to give us about 1% benefit with your
patch. However, I am worried (maybe unneccassarily?) about the 20K interrupt generation
when the iometer tests kicked in (this was only when using the unrealistic 'scsi_debug'
drive).

The picture of this using iSCSI target:
http://darnok.org/xen/amazon/iscsi_target/iometer-bw.png

And when done on top of local RAMdisk:
http://darnok.org/xen/amazon/scsi_debug/iometer-bw.png


[-- Attachment #2: amazon-debug.patch --]
[-- Type: text/x-diff, Size: 3282 bytes --]

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index dba55e3..83c24ed 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -60,8 +60,11 @@ static int xen_blkif_reqs = 64;
 module_param_named(reqs, xen_blkif_reqs, int, 0);
 MODULE_PARM_DESC(reqs, "Number of blkback requests to allocate");
 
+static int xen_kick_front = 1;
+module_param(xen_kick_front, int, 0644);
+
 /* Run-time switchable: /sys/module/blkback/parameters/ */
-static unsigned int log_stats;
+static unsigned int log_stats = 1;
 module_param(log_stats, int, 0644);
 
 /*
@@ -255,10 +258,21 @@ static void print_stats(struct xen_blkif *blkif)
 	pr_info("xen-blkback (%s): oo %3d  |  rd %4d  |  wr %4d  |  f %4d\n",
 		 current->comm, blkif->st_oo_req,
 		 blkif->st_rd_req, blkif->st_wr_req, blkif->st_f_req);
+
+	if (blkif->st_reqs_avail) {
+		pr_info("xen-blkback (%s):  bk %4d fk %4d | avail %4d finished %4d\n",
+			current->comm, blkif->st_back_kick, blkif->st_front_kick,
+		 	blkif->st_reqs_avail, blkif->st_reqs_finished);
+	}
+
 	blkif->st_print = jiffies + msecs_to_jiffies(10 * 1000);
 	blkif->st_rd_req = 0;
 	blkif->st_wr_req = 0;
 	blkif->st_oo_req = 0;
+	blkif->st_back_kick = 0;
+	blkif->st_front_kick = 0;
+	blkif->st_reqs_avail = 0;
+	blkif->st_reqs_finished = 0;
 }
 
 int xen_blkif_schedule(void *arg)
@@ -459,6 +473,7 @@ static int do_block_io_op(struct xen_blkif *blkif)
 	struct pending_req *pending_req;
 	RING_IDX rc, rp;
 	int more_to_do = 0;
+	unsigned long flags;
 
 	rc = blk_rings->common.req_cons;
 	rp = blk_rings->common.sring->req_prod;
@@ -505,7 +520,13 @@ static int do_block_io_op(struct xen_blkif *blkif)
 		/* Yield point for this unbounded loop. */
 		cond_resched();
 	}
-
+	if (!more_to_do && xen_kick_front) {
+		spin_lock_irqsave(&blkif->blk_ring_lock, flags);
+		RING_FINAL_CHECK_FOR_REQUESTS(&blk_rings->common, more_to_do);
+		if (more_to_do)
+			blkif->st_reqs_avail ++;
+		spin_unlock_irqrestore(&blkif->blk_ring_lock, flags);
+	}
 	return more_to_do;
 }
 
@@ -727,6 +748,7 @@ static void make_response(struct xen_blkif *blkif, u64 id,
 	blk_rings->common.rsp_prod_pvt++;
 	RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&blk_rings->common, notify);
 	if (blk_rings->common.rsp_prod_pvt == blk_rings->common.req_cons) {
+		blkif->st_reqs_finished ++;
 		/*
 		 * Tail check for pending requests. Allows frontend to avoid
 		 * notifications if requests are already in flight (lower
@@ -740,10 +762,14 @@ static void make_response(struct xen_blkif *blkif, u64 id,
 
 	spin_unlock_irqrestore(&blkif->blk_ring_lock, flags);
 
-	if (more_to_do)
+	if (more_to_do) {
+		blkif->st_back_kick++;
 		blkif_notify_work(blkif);
-	if (notify)
+	}
+	if (notify) {
+		blkif->st_front_kick ++;
 		notify_remote_via_irq(blkif->irq);
+	}
 }
 
 static int __init xen_blkif_init(void)
diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index 9e40b28..ccb72e2 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -161,6 +161,10 @@ struct xen_blkif {
 	int			st_f_req;
 	int			st_rd_sect;
 	int			st_wr_sect;
+	int 			st_reqs_finished;
+	int			st_reqs_avail;
+	int			st_front_kick;
+	int			st_back_kick;
 
 	wait_queue_head_t	waiting_to_free;
 

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

  reply	other threads:[~2011-05-16 15:22 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-02  7:04 [PATCH] blkback: Fix block I/O latency issue Vincent, Pradeep
2011-05-02  8:13 ` Jan Beulich
2011-05-03  1:10   ` Vincent, Pradeep
2011-05-03 14:55     ` Konrad Rzeszutek Wilk
2011-05-03 17:16       ` Vincent, Pradeep
2011-05-03 17:51         ` Daniel Stodden
2011-05-03 23:41           ` Vincent, Pradeep
2011-05-03 17:52     ` Daniel Stodden
2011-05-04  1:54       ` Vincent, Pradeep
2011-05-09 20:24         ` Konrad Rzeszutek Wilk
2011-05-13  0:40           ` Vincent, Pradeep
2011-05-13  2:51             ` Konrad Rzeszutek Wilk
2011-05-16 15:22               ` Konrad Rzeszutek Wilk [this message]
2011-05-20  6:12                 ` Vincent, Pradeep
2011-05-24 16:02                   ` Konrad Rzeszutek Wilk
2011-05-24 22:40                     ` Vincent, Pradeep
2011-05-28 20:12 ` [RE-PATCH] " Daniel Stodden
2011-05-28 20:21   ` [PATCH] xen/blkback: Don't let in-flight requests defer pending ones Daniel Stodden
2011-05-29  8:09     ` Vincent, Pradeep
2011-05-29 11:34       ` Daniel Stodden
2011-06-01  8:02         ` Vincent, Pradeep
2011-06-01  8:24           ` Jan Beulich
2011-06-01 17:49           ` Daniel Stodden
2011-06-01 18:07             ` Daniel Stodden
2011-06-27 14:03             ` Konrad Rzeszutek Wilk
2011-06-27 18:42               ` Daniel Stodden
2011-06-27 19:13                 ` Konrad Rzeszutek Wilk
2011-06-28  0:31                   ` Daniel Stodden
2011-06-28 13:19                     ` Konrad Rzeszutek Wilk
2011-05-31 13:44       ` Fix wrong help message for parameter nestedhvm Dong, Eddie
2011-05-31 16:23         ` Ian Campbell
2011-05-31 16:08     ` [PATCH] xen/blkback: Don't let in-flight requests defer pending ones Konrad Rzeszutek Wilk
2011-05-31 16:30       ` Daniel Stodden

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110516152224.GA7195@dumpdata.com \
    --to=konrad.wilk@oracle.com \
    --cc=JBeulich@novell.com \
    --cc=daniel.stodden@citrix.com \
    --cc=jeremy@goop.org \
    --cc=pradeepv@amazon.com \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.