From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: [PATCH] blkback: Fix block I/O latency issue
Date: Mon, 16 May 2011 11:22:24 -0400
Message-ID: <20110516152224.GA7195@dumpdata.com>
References: <20110509202403.GA27755@dumpdata.com> <20110513025132.GA4652@dumpdata.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="ZPt4rx8FFjLCG7dd"
Return-path:
Content-Disposition: inline
In-Reply-To: <20110513025132.GA4652@dumpdata.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: "Vincent, Pradeep"
Cc: Jeremy Fitzhardinge, "xen-devel@lists.xensource.com", Jan Beulich, Daniel Stodden
List-Id: xen-devel@lists.xenproject.org

--ZPt4rx8FFjLCG7dd
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, May 12, 2011 at 10:51:32PM -0400, Konrad Rzeszutek Wilk wrote:
> > >>what were the numbers when it came to high bandwidth numbers
> >
> > Under high I/O workload, where the blkfront would fill up the queue as
> > blkback works the queue, the I/O latency problem in question doesn't
> > manifest itself and as a result this patch doesn't make much of a
> > difference in terms of interrupt rate. My benchmarks didn't show any
> > significant effect.
>
> I have to rerun my benchmarks. Under high load (so 64Kb, four threads
> writting as much as they can to a iSCSI disk), the IRQ rate for each
> blkif went from 2-3/sec to ~5K/sec. But I did not do a good
> job on capturing the submission latency to see if the I/Os get the
> response back as fast (or the same) as without your patch.
>
> And the iSCSI disk on the target side was an RAMdisk, so latency
> was quite small which is not fair to your problem.
>
> Do you have a program to measure the latency for the workload you
> had encountered? I would like to run those numbers myself.

Ran some more benchmarks over this week.
This time I tried to run it on:

 - iSCSI target (1GB; the "other side" wakes up every 1msec, so the
   latency is set to 1msec).
 - scsi_debug with delay=0 (no delay, as fast as possible; this comes out
   to about 4 microseconds per completion with a queue depth of one
   with 32K I/Os).
 - local SATAI 80GB ST3808110AS. Still running, as it is quite slow.

Only one PV guest was running. It first did a round (three times) of two
threads randomly writing I/Os with a queue depth of 256, and then a
different round of four threads writing/reading (80/20) 512 bytes up to
64K randomly over the disk.

I used the attached patch against #master
(git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git)
to gauge how well we are doing (and what the interrupt generation rate is).

I think these workloads would be considered 'high I/O', and I was
expecting your patch to not have any influence on the numbers.

But to my surprise, in the case where the I/O latency is high, the
interrupt generation was quite small. Where the I/O latency was very,
very small (4 microseconds), the interrupt generation was on average
about 20K/s. And this is with a queue depth of 256 with four threads.
I was expecting the opposite. Hence I am quite curious to see your
use case.

What do you consider middle I/O and low I/O cases? Do you use 'fio' for
your testing?

With the high I/O load, the numbers came out to give us about a 1% benefit
with your patch. However, I am worried (maybe unnecessarily?) about the
20K/s interrupt generation when the iometer tests kicked in (this was only
when using the unrealistic 'scsi_debug' drive).
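For reference, a rough sketch of how a per-second interrupt rate can be
derived from the counters the attached debug patch logs. It assumes the
"bk ... fk ... | avail ... finished ..." pr_info line from that patch and
its 10-second print-and-reset window; the helper names (parse_kick_line,
per_second) and the struct are made up for illustration and are not part
of blkback.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: the debug patch prints and resets its counters every 10s. */
#define STATS_WINDOW_SECS 10

/* Hypothetical container for the four counters in one log line. */
struct kick_stats {
	int back_kick;      /* "bk": times blkback re-kicked itself */
	int front_kick;     /* "fk": notifications sent to the frontend */
	int reqs_avail;     /* "avail" */
	int reqs_finished;  /* "finished" */
};

/*
 * Parse one line of the form
 *   xen-blkback (<comm>): bk %4d fk %4d | avail %4d finished %4d
 * Returns 0 on success, -1 if the line is not a kick-statistics line.
 */
static int parse_kick_line(const char *line, struct kick_stats *out)
{
	/* Anchor on "): bk" so the "blkback" prefix cannot match. */
	const char *p = strstr(line, "): bk");

	if (!p)
		return -1;
	p += 3; /* skip "): " and land on "bk" */

	if (sscanf(p, "bk %d fk %d | avail %d finished %d",
		   &out->back_kick, &out->front_kick,
		   &out->reqs_avail, &out->reqs_finished) != 4)
		return -1;
	return 0;
}

/* Convert a per-window counter into an average per-second rate. */
static double per_second(int count)
{
	return (double)count / STATS_WINDOW_SECS;
}
```

Feeding each matching dmesg line through parse_kick_line() and then
calling per_second() on front_kick gives the average frontend interrupt
rate for that window; a front_kick of 200000 would correspond to the
~20K/s mentioned above.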

The picture of this using iSCSI target:
http://darnok.org/xen/amazon/iscsi_target/iometer-bw.png

And when done on top of local RAMdisk:
http://darnok.org/xen/amazon/scsi_debug/iometer-bw.png

--ZPt4rx8FFjLCG7dd
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="amazon-debug.patch"

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index dba55e3..83c24ed 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -60,8 +60,11 @@ static int xen_blkif_reqs = 64;
 module_param_named(reqs, xen_blkif_reqs, int, 0);
 MODULE_PARM_DESC(reqs, "Number of blkback requests to allocate");
 
+static int xen_kick_front = 1;
+module_param(xen_kick_front, int, 0644);
+
 /* Run-time switchable: /sys/module/blkback/parameters/ */
-static unsigned int log_stats;
+static unsigned int log_stats = 1;
 module_param(log_stats, int, 0644);
 
 /*
@@ -255,10 +258,21 @@ static void print_stats(struct xen_blkif *blkif)
 	pr_info("xen-blkback (%s): oo %3d | rd %4d | wr %4d | f %4d\n",
 		 current->comm, blkif->st_oo_req,
 		 blkif->st_rd_req, blkif->st_wr_req, blkif->st_f_req);
+
+	if (blkif->st_reqs_avail) {
+		pr_info("xen-blkback (%s): bk %4d fk %4d | avail %4d finished %4d\n",
+			current->comm, blkif->st_back_kick, blkif->st_front_kick,
+			blkif->st_reqs_avail, blkif->st_reqs_finished);
+	}
+
 	blkif->st_print = jiffies + msecs_to_jiffies(10 * 1000);
 	blkif->st_rd_req = 0;
 	blkif->st_wr_req = 0;
 	blkif->st_oo_req = 0;
+	blkif->st_back_kick = 0;
+	blkif->st_front_kick = 0;
+	blkif->st_reqs_avail = 0;
+	blkif->st_reqs_finished = 0;
 }
 
 int xen_blkif_schedule(void *arg)
@@ -459,6 +473,7 @@ static int do_block_io_op(struct xen_blkif *blkif)
 	struct pending_req *pending_req;
 	RING_IDX rc, rp;
 	int more_to_do = 0;
+	unsigned long flags;
 
 	rc = blk_rings->common.req_cons;
 	rp = blk_rings->common.sring->req_prod;
@@ -505,7 +520,13 @@ static int do_block_io_op(struct xen_blkif *blkif)
 		/* Yield point for this unbounded loop. */
 		cond_resched();
 	}
-
+	if (!more_to_do && xen_kick_front) {
+		spin_lock_irqsave(&blkif->blk_ring_lock, flags);
+		RING_FINAL_CHECK_FOR_REQUESTS(&blk_rings->common, more_to_do);
+		if (more_to_do)
+			blkif->st_reqs_avail ++;
+		spin_unlock_irqrestore(&blkif->blk_ring_lock, flags);
+	}
 	return more_to_do;
 }
 
@@ -727,6 +748,7 @@ static void make_response(struct xen_blkif *blkif, u64 id,
 	blk_rings->common.rsp_prod_pvt++;
 	RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&blk_rings->common, notify);
 	if (blk_rings->common.rsp_prod_pvt == blk_rings->common.req_cons) {
+		blkif->st_reqs_finished ++;
 		/*
 		 * Tail check for pending requests. Allows frontend to avoid
 		 * notifications if requests are already in flight (lower
@@ -740,10 +762,14 @@ static void make_response(struct xen_blkif *blkif, u64 id,
 
 	spin_unlock_irqrestore(&blkif->blk_ring_lock, flags);
 
-	if (more_to_do)
+	if (more_to_do) {
+		blkif->st_back_kick++;
 		blkif_notify_work(blkif);
-	if (notify)
+	}
+	if (notify) {
+		blkif->st_front_kick ++;
 		notify_remote_via_irq(blkif->irq);
+	}
 }
 
 static int __init xen_blkif_init(void)
diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index 9e40b28..ccb72e2 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -161,6 +161,10 @@ struct xen_blkif {
 	int st_f_req;
 	int st_rd_sect;
 	int st_wr_sect;
+	int st_reqs_finished;
+	int st_reqs_avail;
+	int st_front_kick;
+	int st_back_kick;
 
 	wait_queue_head_t waiting_to_free;
 

--ZPt4rx8FFjLCG7dd
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--ZPt4rx8FFjLCG7dd--