public inbox for linux-kernel@vger.kernel.org
From: Vivek Goyal <vgoyal@redhat.com>
To: linux-kernel@vger.kernel.org, axboe@kernel.dk
Cc: nauman@google.com, dpshah@google.com, guijianfeng@cn.fujitsu.com,
	jmoyer@redhat.com, czoccolo@gmail.com, vgoyal@redhat.com
Subject: [PATCH 1/3] cfq-iosched: Implement IOPS mode
Date: Wed, 21 Jul 2010 15:06:19 -0400	[thread overview]
Message-ID: <1279739181-24482-2-git-send-email-vgoyal@redhat.com> (raw)
In-Reply-To: <1279739181-24482-1-git-send-email-vgoyal@redhat.com>

o Implement another CFQ mode where we charge a queue/group in terms of the
  number of requests dispatched instead of the time consumed. Measuring
  time accurately is not possible when we are driving deeper queue depths
  and requests from multiple cfq queues sit in the request queue at once.

o This mode currently gets activated if one sets slice_idle=0 and the
  associated disk supports NCQ. The idea is that on an NCQ disk with idling
  disabled, most queues will dispatch one or more requests and then cfq
  queue expiry happens before any meaningful time can be measured. So start
  providing fairness in terms of IOPS instead.

o Currently this is primarily beneficial with cfq group scheduling, where
  one can disable slice idling so that we don't idle on queues and can
  drive deeper request queue depths (achieving better throughput), while
  group idle stays enabled so one still gets service differentiation
  among groups.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 block/cfq-iosched.c |   37 ++++++++++++++++++++++++++++++-------
 1 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 7982b83..4671c51 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -378,6 +378,21 @@ CFQ_CFQQ_FNS(wait_busy);
 			&cfqg->service_trees[i][j]: NULL) \
 
 
+static inline bool iops_mode(struct cfq_data *cfqd)
+{
+	/*
+	 * If we are not idling on queues and it is a NCQ drive, parallel
+	 * execution of requests is on and measuring time is not possible
+	 * in most of the cases until and unless we drive shallower queue
+	 * depths and that becomes a performance bottleneck. In such cases
+	 * switch to start providing fairness in terms of number of IOs.
+	 */
+	if (!cfqd->cfq_slice_idle && cfqd->hw_tag)
+		return true;
+	else
+		return false;
+}
+
 static inline enum wl_prio_t cfqq_prio(struct cfq_queue *cfqq)
 {
 	if (cfq_class_idle(cfqq))
@@ -905,7 +920,6 @@ static inline unsigned int cfq_cfqq_slice_usage(struct cfq_queue *cfqq)
 			slice_used = cfqq->allocated_slice;
 	}
 
-	cfq_log_cfqq(cfqq->cfqd, cfqq, "sl_used=%u", slice_used);
 	return slice_used;
 }
 
@@ -913,19 +927,21 @@ static void cfq_group_served(struct cfq_data *cfqd, struct cfq_group *cfqg,
 				struct cfq_queue *cfqq)
 {
 	struct cfq_rb_root *st = &cfqd->grp_service_tree;
-	unsigned int used_sl, charge_sl;
+	unsigned int used_sl, charge;
 	int nr_sync = cfqg->nr_cfqq - cfqg_busy_async_queues(cfqd, cfqg)
 			- cfqg->service_tree_idle.count;
 
 	BUG_ON(nr_sync < 0);
-	used_sl = charge_sl = cfq_cfqq_slice_usage(cfqq);
+	used_sl = charge = cfq_cfqq_slice_usage(cfqq);
 
-	if (!cfq_cfqq_sync(cfqq) && !nr_sync)
-		charge_sl = cfqq->allocated_slice;
+	if (iops_mode(cfqd))
+		charge = cfqq->slice_dispatch;
+	else if (!cfq_cfqq_sync(cfqq) && !nr_sync)
+		charge = cfqq->allocated_slice;
 
 	/* Can't update vdisktime while group is on service tree */
 	cfq_rb_erase(&cfqg->rb_node, st);
-	cfqg->vdisktime += cfq_scale_slice(charge_sl, cfqg);
+	cfqg->vdisktime += cfq_scale_slice(charge, cfqg);
 	__cfq_group_service_tree_add(st, cfqg);
 
 	/* This group is being expired. Save the context */
@@ -939,6 +955,8 @@ static void cfq_group_served(struct cfq_data *cfqd, struct cfq_group *cfqg,
 
 	cfq_log_cfqg(cfqd, cfqg, "served: vt=%llu min_vt=%llu", cfqg->vdisktime,
 					st->min_vdisktime);
+	cfq_log_cfqq(cfqq->cfqd, cfqq, "sl_used=%u disp=%u charge=%u iops=%u",
+			used_sl, cfqq->slice_dispatch, charge, iops_mode(cfqd));
 	cfq_blkiocg_update_timeslice_used(&cfqg->blkg, used_sl);
 	cfq_blkiocg_set_start_empty_time(&cfqg->blkg);
 }
@@ -1625,8 +1643,13 @@ __cfq_slice_expired(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 
 	/*
 	 * store what was left of this slice, if the queue idled/timed out
+	 * Currently in IOPS mode I am not getting into the business of
+	 * saving remaining slice/number of requests because I think it does
+	 * not help much in most of the cases. We can fix it later, if that's
+	 * not the case. IOPS mode is primarily more useful for group
+	 * scheduling.
 	 */
-	if (timed_out && !cfq_cfqq_slice_new(cfqq)) {
+	if (timed_out && !cfq_cfqq_slice_new(cfqq) && !iops_mode(cfqd)) {
 		cfqq->slice_resid = cfqq->slice_end - jiffies;
 		cfq_log_cfqq(cfqd, cfqq, "resid=%ld", cfqq->slice_resid);
 	}
-- 
1.7.1.1



Thread overview: 25+ messages
2010-07-21 19:06 [RFC PATCH] cfq-iosced: Implement IOPS mode and group_idle tunable V3 Vivek Goyal
2010-07-21 19:06 ` Vivek Goyal [this message]
2010-07-21 20:33   ` [PATCH 1/3] cfq-iosched: Implement IOPS mode Jeff Moyer
2010-07-21 20:57     ` Vivek Goyal
2010-07-21 19:06 ` [PATCH 2/3] cfq-iosched: Implement a tunable group_idle Vivek Goyal
2010-07-21 19:40   ` Jeff Moyer
2010-07-21 20:13     ` Vivek Goyal
2010-07-21 20:54       ` Jeff Moyer
2010-07-21 19:06 ` [PATCH 3/3] cfq-iosched: Print number of sectors dispatched per cfqq slice Vivek Goyal
2010-07-22  5:56 ` [RFC PATCH] cfq-iosced: Implement IOPS mode and group_idle tunable V3 Christoph Hellwig
2010-07-22 14:00   ` Vivek Goyal
2010-07-24  8:51     ` Christoph Hellwig
2010-07-24  9:07       ` Corrado Zoccolo
2010-07-26 14:30         ` Vivek Goyal
2010-07-26 21:21           ` Tuning IO scheduler (Was: Re: [RFC PATCH] cfq-iosced: Implement IOPS mode and group_idle tunable V3) Vivek Goyal
2010-07-26 14:33         ` [RFC PATCH] cfq-iosced: Implement IOPS mode and group_idle tunable V3 Vivek Goyal
2010-07-29 19:57           ` Corrado Zoccolo
2010-07-26 13:51       ` Vivek Goyal
2010-07-22 20:54   ` Vivek Goyal
2010-07-22  7:08 ` Gui Jianfeng
2010-07-22 14:49   ` Vivek Goyal
2010-07-22 23:53     ` Gui Jianfeng
2010-07-26  6:58 ` Gui Jianfeng
2010-07-26 14:10   ` Vivek Goyal
2010-07-27  8:33     ` Gui Jianfeng
