public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/15] CFQ IO scheduler patch series
@ 2007-04-24  8:15 Jens Axboe
  2007-04-24  8:15 ` [PATCH 1/15] cfq-iosched: improve preemption for cooperating tasks Jens Axboe
                   ` (15 more replies)
  0 siblings, 16 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel

Hi,

I have a series of patches for the CFQ IO scheduler that I'd like to get
some more testing on. The patch series is also scheduled to enter the
next -mm, but I'd like people to consciously give it a spin on its
own as well. The patches are also available from the 'cfq' branch of the
block layer tree:

git://git.kernel.dk/data/git/linux-2.6-block.git

and I've uploaded a rolled up version here as well:

http://brick.kernel.dk/snaps/cfq-update-20070424

The patch series is essentially a series of cleanups and smaller
optimizations, but there's also a larger change in there (patches 4 to
7) that completely reworks how CFQ selects which queue to process. It's
an experimental approach similar to the CFS CPU scheduler, in which the
queue management lists are converted to a single rbtree.
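For readers unfamiliar with the CFS analogy: the idea is to key each
queue by a service time and always dispatch the "leftmost" (smallest
key) queue in the tree. Below is a minimal user-space sketch of that
selection rule; all names are invented for illustration, and a flat
array stands in for the rbtree.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch (not the kernel code): each queue carries a
 * service key saying when it should next be serviced.  The real
 * patches keep these in an rbtree and pick the leftmost node; a
 * linear scan over an array demonstrates the same selection rule
 * without the rbtree machinery.
 */
struct sketch_queue {
	unsigned long service_key;	/* lower key == serviced sooner */
	int id;
};

static struct sketch_queue *pick_leftmost(struct sketch_queue *qs, size_t n)
{
	struct sketch_queue *best = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (!best || qs[i].service_key < best->service_key)
			best = &qs[i];
	return best;
}
```

The rbtree buys O(log n) insertion and a cheap leftmost lookup; the
linear scan here is only for clarity.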

So give it a spin if you have the time, and let me know how it performs
and/or feels for your workload and hardware.

 cfq-iosched.c |  676 ++++++++++++++++++++++++++------------------------
 1 file changed, 357 insertions(+), 319 deletions(-)

-- 
Jens Axboe




^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/15] cfq-iosched: improve preemption for cooperating tasks
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-24  8:15 ` [PATCH 2/15] cfq-iosched: development update Jens Axboe
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

When testing the syslet async io approach, I discovered that CFQ
sometimes didn't perform as well as expected. cfq_should_preempt()
needs to better check for cooperating tasks, so fix that by allowing
preemption of an equal priority queue if the recently queued request
is as good a candidate for IO as the one we are currently waiting for.
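The check described above boils down to an absolute sector distance
compared against the submitting task's mean seek distance. A small
stand-alone sketch of that logic (plain C with a local sector_t
stand-in; function names are invented for illustration):

```c
#include <assert.h>

typedef unsigned long long sector_t;	/* stand-in for the kernel type */

/*
 * Sketch of the preemption heuristic: compute the absolute distance in
 * sectors between the last completed request and the newly queued one,
 * and allow preemption when that distance is within the task's mean
 * seek distance.
 */
static sector_t seek_dist(sector_t last_sector, sector_t rq_sector)
{
	if (rq_sector > last_sector)
		return rq_sector - last_sector;
	return last_sector - rq_sector;
}

static int close_enough_to_preempt(sector_t last_sector, sector_t rq_sector,
				   sector_t seek_mean)
{
	return seek_dist(last_sector, rq_sector) <= seek_mean;
}
```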

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |   26 ++++++++++++++++++++------
 1 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 9e37971..a683d00 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -861,15 +861,11 @@ static int cfq_arm_slice_timer(struct cfq_data *cfqd)
 
 static void cfq_dispatch_insert(request_queue_t *q, struct request *rq)
 {
-	struct cfq_data *cfqd = q->elevator->elevator_data;
 	struct cfq_queue *cfqq = RQ_CFQQ(rq);
 
 	cfq_remove_request(rq);
 	cfqq->on_dispatch[rq_is_sync(rq)]++;
 	elv_dispatch_sort(q, rq);
-
-	rq = list_entry(q->queue_head.prev, struct request, queuelist);
-	cfqd->last_sector = rq->sector + rq->nr_sectors;
 }
 
 /*
@@ -1579,6 +1575,7 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
 		   struct request *rq)
 {
 	struct cfq_queue *cfqq = cfqd->active_queue;
+	sector_t dist;
 
 	if (cfq_class_idle(new_cfqq))
 		return 0;
@@ -1588,14 +1585,14 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
 
 	if (cfq_class_idle(cfqq))
 		return 1;
-	if (!cfq_cfqq_wait_request(new_cfqq))
-		return 0;
+
 	/*
 	 * if the new request is sync, but the currently running queue is
 	 * not, let the sync request have priority.
 	 */
 	if (rq_is_sync(rq) && !cfq_cfqq_sync(cfqq))
 		return 1;
+
 	/*
 	 * So both queues are sync. Let the new request get disk time if
 	 * it's a metadata request and the current queue is doing regular IO.
@@ -1603,6 +1600,21 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
 	if (rq_is_meta(rq) && !cfqq->meta_pending)
 		return 1;
 
+	if (!cfqd->active_cic || !cfq_cfqq_wait_request(cfqq))
+		return 0;
+
+	/*
+	 * if this request is as-good as one we would expect from the
+	 * current cfqq, let it preempt
+	 */
+	if (rq->sector > cfqd->last_sector)
+		dist = rq->sector - cfqd->last_sector;
+	else
+		dist = cfqd->last_sector - rq->sector;
+
+	if (dist <= cfqd->active_cic->seek_mean)
+		return 1;
+
 	return 0;
 }
 
@@ -1719,6 +1731,8 @@ static void cfq_completed_request(request_queue_t *q, struct request *rq)
 	cfqq->on_dispatch[sync]--;
 	cfqq->service_last = now;
 
+	cfqd->last_sector = rq->hard_sector + rq->hard_nr_sectors;
+
 	if (!cfq_class_idle(cfqq))
 		cfqd->last_end_request = now;
 
-- 
1.5.1.1.190.g74474



* [PATCH 2/15] cfq-iosched: development update
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
  2007-04-24  8:15 ` [PATCH 1/15] cfq-iosched: improve preemption for cooperating tasks Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-24  8:15 ` [PATCH 3/15] cfq-iosched: minor updates Jens Axboe
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

- Implement logic for detecting cooperating processes, so we
  choose the best available queue whenever possible.

- Improve residual slice time accounting.

- Remove dead code: we no longer see async requests coming in on
  sync queues. That part was removed a long time ago. That means
  that we can also remove the difference between cfq_cfqq_sync()
  and cfq_cfqq_class_sync(); they are now identical. And we can
  kill the on_dispatch array, just make it a counter.

- Allow a process to go into the current list, if it hasn't been
  serviced in this scheduler tick yet.

Possible future improvements include caching the cfqq lookup
in cfq_close_cooperator(), so we don't have to look it up twice;
cfq_get_best_queue() could then reuse that decision instead of
doing it again.
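The cooperating-process detection amounts to scanning the other sync
queues for one whose next request lies within the active task's mean
seek distance of the current head position. A hypothetical user-space
sketch of that scan (struct and function names invented, not the
kernel code):

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long sector_t;	/* stand-in for the kernel type */

/*
 * Walk the candidate queues and report one whose next request is
 * within seek_mean sectors of the current head position.  Async
 * queues and the currently active queue itself are skipped, as in
 * the patch.
 */
struct coop_queue {
	sector_t next_rq_sector;
	int sync;
};

static int find_close_cooperator(const struct coop_queue *qs, size_t n,
				 sector_t last_pos, sector_t seek_mean,
				 size_t self)
{
	size_t i;

	for (i = 0; i < n; i++) {
		sector_t dist;

		if (i == self || !qs[i].sync)
			continue;

		dist = qs[i].next_rq_sector > last_pos ?
			qs[i].next_rq_sector - last_pos :
			last_pos - qs[i].next_rq_sector;
		if (dist <= seek_mean)
			return (int)i;
	}
	return -1;	/* no close cooperator */
}
```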

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |  381 +++++++++++++++++++++++++++++++++++----------------
 1 files changed, 261 insertions(+), 120 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index a683d00..3883ba8 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -56,13 +56,7 @@ static struct completion *ioc_gone;
 #define ASYNC			(0)
 #define SYNC			(1)
 
-#define cfq_cfqq_dispatched(cfqq)	\
-	((cfqq)->on_dispatch[ASYNC] + (cfqq)->on_dispatch[SYNC])
-
-#define cfq_cfqq_class_sync(cfqq)	((cfqq)->key != CFQ_KEY_ASYNC)
-
-#define cfq_cfqq_sync(cfqq)		\
-	(cfq_cfqq_class_sync(cfqq) || (cfqq)->on_dispatch[SYNC])
+#define cfq_cfqq_sync(cfqq)	((cfqq)->key != CFQ_KEY_ASYNC)
 
 #define sample_valid(samples)	((samples) > 80)
 
@@ -79,6 +73,7 @@ struct cfq_data {
 	struct list_head busy_rr;
 	struct list_head cur_rr;
 	struct list_head idle_rr;
+	unsigned long cur_rr_tick;
 	unsigned int busy_queues;
 
 	/*
@@ -98,11 +93,12 @@ struct cfq_data {
 	struct cfq_queue *active_queue;
 	struct cfq_io_context *active_cic;
 	int cur_prio, cur_end_prio;
+	unsigned long prio_time;
 	unsigned int dispatch_slice;
 
 	struct timer_list idle_class_timer;
 
-	sector_t last_sector;
+	sector_t last_position;
 	unsigned long last_end_request;
 
 	/*
@@ -117,6 +113,9 @@ struct cfq_data {
 	unsigned int cfq_slice_idle;
 
 	struct list_head cic_list;
+
+	sector_t new_seek_mean;
+	u64 new_seek_total;
 };
 
 /*
@@ -133,6 +132,8 @@ struct cfq_queue {
 	unsigned int key;
 	/* member of the rr/busy/cur/idle cfqd list */
 	struct list_head cfq_list;
+	/* in what tick we were last serviced */
+	unsigned long rr_tick;
 	/* sorted list of pending requests */
 	struct rb_root sort_list;
 	/* if fifo isn't expired, next request to serve */
@@ -148,10 +149,11 @@ struct cfq_queue {
 
 	unsigned long slice_end;
 	unsigned long service_last;
+	unsigned long slice_start;
 	long slice_resid;
 
-	/* number of requests that are on the dispatch list */
-	int on_dispatch[2];
+	/* number of requests that are on the dispatch list or inside driver */
+	int dispatched;
 
 	/* io prio of this group */
 	unsigned short ioprio, org_ioprio;
@@ -159,6 +161,8 @@ struct cfq_queue {
 
 	/* various state flags, see below */
 	unsigned int flags;
+
+	sector_t last_request_pos;
 };
 
 enum cfqq_state_flags {
@@ -259,6 +263,8 @@ cfq_set_prio_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 	 * easily introduce oscillations.
 	 */
 	cfqq->slice_resid = 0;
+
+	cfqq->slice_start = jiffies;
 }
 
 /*
@@ -307,7 +313,7 @@ cfq_choose_req(struct cfq_data *cfqd, struct request *rq1, struct request *rq2)
 	s1 = rq1->sector;
 	s2 = rq2->sector;
 
-	last = cfqd->last_sector;
+	last = cfqd->last_position;
 
 	/*
 	 * by definition, 1KiB is 2 sectors
@@ -398,39 +404,42 @@ cfq_find_next_rq(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 	return cfq_choose_req(cfqd, next, prev);
 }
 
-static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
+/*
+ * This function finds out where to insert a BE queue in the service hierarchy
+ */
+static void cfq_resort_be_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq,
+				int preempted)
 {
-	struct cfq_data *cfqd = cfqq->cfqd;
 	struct list_head *list, *n;
 	struct cfq_queue *__cfqq;
+	int add_tail = 0;
 
 	/*
-	 * Resorting requires the cfqq to be on the RR list already.
+	 * if cfqq has requests in flight, don't allow it to be
+	 * found in cfq_set_active_queue before it has finished them.
+	 * this is done to increase fairness between a process that
+	 * has lots of io pending vs one that only generates one
+	 * sporadically or synchronously
 	 */
-	if (!cfq_cfqq_on_rr(cfqq))
-		return;
-
-	list_del(&cfqq->cfq_list);
-
-	if (cfq_class_rt(cfqq))
+	if (cfqq->dispatched)
+		list = &cfqd->busy_rr;
+	else if (cfqq->ioprio == (cfqd->cur_prio + 1) &&
+		 cfq_cfqq_sync(cfqq) &&
+		 (time_before(cfqd->prio_time, cfqq->service_last) ||
+		  cfq_cfqq_queue_new(cfqq) || preempted)) {
 		list = &cfqd->cur_rr;
-	else if (cfq_class_idle(cfqq))
-		list = &cfqd->idle_rr;
-	else {
+		add_tail = 1;
+	} else
+		list = &cfqd->rr_list[cfqq->ioprio];
+
+	if (!cfq_cfqq_sync(cfqq) || add_tail) {
 		/*
-		 * if cfqq has requests in flight, don't allow it to be
-		 * found in cfq_set_active_queue before it has finished them.
-		 * this is done to increase fairness between a process that
-		 * has lots of io pending vs one that only generates one
-		 * sporadically or synchronously
+		 * async queue always goes to the end. this wont be overly
+		 * unfair to writes, as the sort of the sync queue wont be
+		 * allowed to pass the async queue again.
 		 */
-		if (cfq_cfqq_dispatched(cfqq))
-			list = &cfqd->busy_rr;
-		else
-			list = &cfqd->rr_list[cfqq->ioprio];
-	}
-
-	if (preempted || cfq_cfqq_queue_new(cfqq)) {
+		list_add_tail(&cfqq->cfq_list, list);
+	} else if (preempted || cfq_cfqq_queue_new(cfqq)) {
 		/*
 		 * If this queue was preempted or is new (never been serviced),
 		 * let it be added first for fairness but beind other new
@@ -444,14 +453,7 @@ static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
 
 			n = n->next;
 		}
-		list_add_tail(&cfqq->cfq_list, n);
-	} else if (!cfq_cfqq_class_sync(cfqq)) {
-		/*
-		 * async queue always goes to the end. this wont be overly
-		 * unfair to writes, as the sort of the sync queue wont be
-		 * allowed to pass the async queue again.
-		 */
-		list_add_tail(&cfqq->cfq_list, list);
+		list_add(&cfqq->cfq_list, n);
 	} else {
 		/*
 		 * sort by last service, but don't cross a new or async
@@ -461,17 +463,54 @@ static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
 		 */
 		n = list;
 		while ((n = n->prev) != list) {
-			struct cfq_queue *__cfqq = list_entry_cfqq(n);
+			struct cfq_queue *__c = list_entry_cfqq(n);
 
-			if (!cfq_cfqq_class_sync(cfqq) || !__cfqq->service_last)
+			if (!cfq_cfqq_sync(__c) || !__c->service_last)
 				break;
-			if (time_before(__cfqq->service_last, cfqq->service_last))
+			if (time_before(__c->service_last, cfqq->service_last))
 				break;
 		}
 		list_add(&cfqq->cfq_list, n);
 	}
 }
 
+static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
+{
+	struct cfq_data *cfqd = cfqq->cfqd;
+	struct list_head *n;
+
+	/*
+	 * Resorting requires the cfqq to be on the RR list already.
+	 */
+	if (!cfq_cfqq_on_rr(cfqq))
+		return;
+
+	list_del(&cfqq->cfq_list);
+
+	if (cfq_class_rt(cfqq)) {
+		/*
+		 * Add to the front of the current list, but behind other
+		 * RT queues.
+		 */
+		n = &cfqd->cur_rr;
+		while (n->next != &cfqd->cur_rr &&
+		       cfq_class_rt(list_entry_cfqq(n->next)))
+			n = n->next;
+
+		list_add(&cfqq->cfq_list, n);
+	} else if (cfq_class_idle(cfqq)) {
+		/*
+		 * IDLE goes to the tail of the idle list
+		 */
+		list_add_tail(&cfqq->cfq_list, &cfqd->idle_rr);
+	} else {
+		/*
+		 * So we get here, ergo the queue is a regular best-effort queue
+		 */
+		cfq_resort_be_queue(cfqd, cfqq, preempted);
+	}
+}
+
 /*
  * add to busy list of queues for service, trying to be fair in ordering
  * the pending list according to last request service
@@ -573,6 +612,8 @@ static void cfq_activate_request(request_queue_t *q, struct request *rq)
 	 */
 	if (!cfqd->hw_tag && cfqd->rq_in_driver > 4)
 		cfqd->hw_tag = 1;
+
+	cfqd->last_position = rq->hard_sector + rq->hard_nr_sectors;
 }
 
 static void cfq_deactivate_request(request_queue_t *q, struct request *rq)
@@ -678,6 +719,7 @@ __cfq_set_active_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 		cfq_clear_cfqq_must_alloc_slice(cfqq);
 		cfq_clear_cfqq_fifo_expire(cfqq);
 		cfq_mark_cfqq_slice_new(cfqq);
+		cfqq->rr_tick = cfqd->cur_rr_tick;
 	}
 
 	cfqd->active_queue = cfqq;
@@ -780,10 +822,46 @@ static int cfq_get_next_prio_level(struct cfq_data *cfqd)
 		cfqd->cur_end_prio = 0;
 	}
 
+	cfqd->cur_rr_tick++;
+	cfqd->prio_time = jiffies;
 	return prio;
 }
 
-static struct cfq_queue *cfq_set_active_queue(struct cfq_data *cfqd)
+static inline sector_t cfq_dist_from_last(struct cfq_data *cfqd,
+					  struct request *rq)
+{
+	if (rq->sector >= cfqd->last_position)
+		return rq->sector - cfqd->last_position;
+	else
+		return cfqd->last_position - rq->sector;
+}
+
+static struct cfq_queue *cfq_get_best_queue(struct cfq_data *cfqd)
+{
+	struct cfq_queue *cfqq = NULL, *__cfqq;
+	sector_t best = -1, dist;
+
+	list_for_each_entry(__cfqq, &cfqd->cur_rr, cfq_list) {
+		if (!__cfqq->next_rq || !cfq_cfqq_sync(__cfqq))
+			continue;
+
+		dist = cfq_dist_from_last(cfqd, __cfqq->next_rq);
+		if (dist < best) {
+			best = dist;
+			cfqq = __cfqq;
+		}
+	}
+
+	/*
+	 * Only async queue(s) available, grab first entry
+	 */
+	if (!cfqq)
+		cfqq = list_entry_cfqq(cfqd->cur_rr.next);
+
+	return cfqq;
+}
+
+static struct cfq_queue *cfq_get_next_queue(struct cfq_data *cfqd)
 {
 	struct cfq_queue *cfqq = NULL;
 
@@ -793,7 +871,7 @@ static struct cfq_queue *cfq_set_active_queue(struct cfq_data *cfqd)
 		 * empty, get next prio level and grab first entry then if any
 		 * are spliced
 		 */
-		cfqq = list_entry_cfqq(cfqd->cur_rr.next);
+		cfqq = cfq_get_best_queue(cfqd);
 	} else if (!list_empty(&cfqd->busy_rr)) {
 		/*
 		 * If no new queues are available, check if the busy list has
@@ -814,49 +892,128 @@ static struct cfq_queue *cfq_set_active_queue(struct cfq_data *cfqd)
 			mod_timer(&cfqd->idle_class_timer, end);
 	}
 
+	return cfqq;
+}
+
+static struct cfq_queue *cfq_set_active_queue(struct cfq_data *cfqd)
+{
+	struct cfq_queue *cfqq;
+
+	do {
+		long prio;
+
+		cfqq = cfq_get_next_queue(cfqd);
+		if (!cfqq)
+			break;
+
+		prio = cfq_prio_to_slice(cfqd, cfqq);
+		if (cfqq->slice_resid > -prio)
+			break;
+
+		cfqq->slice_resid += prio;
+		list_del_init(&cfqq->cfq_list);
+		list_add_tail(&cfqq->cfq_list, &cfqd->rr_list[cfqq->ioprio]);
+		cfqq = NULL;
+	} while (1);
+
 	__cfq_set_active_queue(cfqd, cfqq);
 	return cfqq;
 }
 
-#define CIC_SEEKY(cic) ((cic)->seek_mean > (128 * 1024))
+static inline int cfq_rq_close(struct cfq_data *cfqd, struct request *rq)
+{
+	struct cfq_io_context *cic = cfqd->active_cic;
+
+	if (!sample_valid(cic->seek_samples))
+		return 0;
+
+	return cfq_dist_from_last(cfqd, rq) <= cic->seek_mean;
+}
+
+static struct cfq_queue *__cfq_close_cooperator(struct cfq_data *cfqd,
+						struct cfq_queue *cur_cfqq,
+						struct list_head *list)
+{
+	struct cfq_queue *cfqq;
+
+	list_for_each_entry(cfqq, list, cfq_list) {
+		if (cfqq == cur_cfqq || !cfq_cfqq_sync(cfqq))
+			continue;
+
+		BUG_ON(!cfqq->next_rq);
+
+		if (cfq_rq_close(cfqd, cfqq->next_rq))
+			return cfqq;
+	}
+
+	return NULL;
+}
+
+static int cfq_close_cooperator(struct cfq_data *cfqd,
+				struct cfq_queue *cur_cfqq)
+{
+	struct cfq_queue *cfqq;
+
+	if (!cfqd->busy_queues)
+		return 0;
+
+	/*
+	 * check cur_rr and same-prio rr_list for candidates
+	 */
+	cfqq = __cfq_close_cooperator(cfqd, cur_cfqq, &cfqd->cur_rr);
+	if (cfqq)
+		return 1;
+
+	cfqq = __cfq_close_cooperator(cfqd, cur_cfqq, &cfqd->rr_list[cur_cfqq->ioprio]);
+	if (cfqq && (cfqq->rr_tick == cfqd->cur_rr_tick))
+		cfqq = NULL;
+
+	return cfqq != NULL;
+}
+
+#define CIC_SEEKY(cic) ((cic)->seek_mean > (8 * 1024))
 
-static int cfq_arm_slice_timer(struct cfq_data *cfqd)
+static void cfq_arm_slice_timer(struct cfq_data *cfqd)
 {
 	struct cfq_queue *cfqq = cfqd->active_queue;
 	struct cfq_io_context *cic;
 	unsigned long sl;
 
 	WARN_ON(!RB_EMPTY_ROOT(&cfqq->sort_list));
+	WARN_ON(cfq_cfqq_slice_new(cfqq));
 
 	/*
 	 * idle is disabled, either manually or by past process history
 	 */
-	if (!cfqd->cfq_slice_idle)
-		return 0;
-	if (!cfq_cfqq_idle_window(cfqq))
-		return 0;
+	if (!cfqd->cfq_slice_idle || !cfq_cfqq_idle_window(cfqq))
+		return;
+
 	/*
 	 * task has exited, don't wait
 	 */
 	cic = cfqd->active_cic;
 	if (!cic || !cic->ioc->task)
-		return 0;
+		return;
+
+	/*
+	 * See if this prio level has a good candidate
+	 */
+	if (cfq_close_cooperator(cfqd, cfqq))
+		return;
 
 	cfq_mark_cfqq_must_dispatch(cfqq);
 	cfq_mark_cfqq_wait_request(cfqq);
 
-	sl = min(cfqq->slice_end - 1, (unsigned long) cfqd->cfq_slice_idle);
-
 	/*
 	 * we don't want to idle for seeks, but we do want to allow
 	 * fair distribution of slice time for a process doing back-to-back
 	 * seeks. so allow a little bit of time for him to submit a new rq
 	 */
+	sl = cfqd->cfq_slice_idle;
 	if (sample_valid(cic->seek_samples) && CIC_SEEKY(cic))
 		sl = min(sl, msecs_to_jiffies(2));
 
 	mod_timer(&cfqd->idle_slice_timer, jiffies + sl);
-	return 1;
 }
 
 static void cfq_dispatch_insert(request_queue_t *q, struct request *rq)
@@ -864,7 +1021,7 @@ static void cfq_dispatch_insert(request_queue_t *q, struct request *rq)
 	struct cfq_queue *cfqq = RQ_CFQQ(rq);
 
 	cfq_remove_request(rq);
-	cfqq->on_dispatch[rq_is_sync(rq)]++;
+	cfqq->dispatched++;
 	elv_dispatch_sort(q, rq);
 }
 
@@ -885,13 +1042,13 @@ static inline struct request *cfq_check_fifo(struct cfq_queue *cfqq)
 	if (list_empty(&cfqq->fifo))
 		return NULL;
 
-	fifo = cfq_cfqq_class_sync(cfqq);
+	fifo = cfq_cfqq_sync(cfqq);
 	rq = rq_entry_fifo(cfqq->fifo.next);
 
-	if (time_after(jiffies, rq->start_time + cfqd->cfq_fifo_expire[fifo]))
-		return rq;
+	if (time_before(jiffies, rq->start_time + cfqd->cfq_fifo_expire[fifo]))
+		return NULL;
 
-	return NULL;
+	return rq;
 }
 
 static inline int
@@ -916,23 +1073,26 @@ static struct cfq_queue *cfq_select_queue(struct cfq_data *cfqd)
 		goto new_queue;
 
 	/*
-	 * slice has expired
+	 * The active queue has run out of time, expire it and select new.
 	 */
-	if (!cfq_cfqq_must_dispatch(cfqq) && cfq_slice_used(cfqq))
+	if (cfq_slice_used(cfqq))
 		goto expire;
 
 	/*
-	 * if queue has requests, dispatch one. if not, check if
-	 * enough slice is left to wait for one
+	 * The active queue has requests and isn't expired, allow it to
+	 * dispatch.
 	 */
 	if (!RB_EMPTY_ROOT(&cfqq->sort_list))
 		goto keep_queue;
-	else if (cfq_cfqq_slice_new(cfqq) || cfq_cfqq_dispatched(cfqq)) {
+
+	/*
+	 * No requests pending. If the active queue still has requests in
+	 * flight or is idling for a new request, allow either of these
+	 * conditions to happen (or time out) before selecting a new queue.
+	 */
+	if (cfqq->dispatched || timer_pending(&cfqd->idle_slice_timer)) {
 		cfqq = NULL;
 		goto keep_queue;
-	} else if (cfq_cfqq_class_sync(cfqq)) {
-		if (cfq_arm_slice_timer(cfqd))
-			return NULL;
 	}
 
 expire:
@@ -1033,7 +1193,7 @@ static int
 cfq_dispatch_requests(request_queue_t *q, int force)
 {
 	struct cfq_data *cfqd = q->elevator->elevator_data;
-	struct cfq_queue *cfqq, *prev_cfqq;
+	struct cfq_queue *cfqq;
 	int dispatched;
 
 	if (!cfqd->busy_queues)
@@ -1043,23 +1203,19 @@ cfq_dispatch_requests(request_queue_t *q, int force)
 		return cfq_forced_dispatch(cfqd);
 
 	dispatched = 0;
-	prev_cfqq = NULL;
 	while ((cfqq = cfq_select_queue(cfqd)) != NULL) {
 		int max_dispatch;
 
 		if (cfqd->busy_queues > 1) {
 			/*
-			 * Don't repeat dispatch from the previous queue.
-			 */
-			if (prev_cfqq == cfqq)
-				break;
-
-			/*
 			 * So we have dispatched before in this round, if the
 			 * next queue has idling enabled (must be sync), don't
-			 * allow it service until the previous have continued.
+			 * allow it service until the previous have completed.
 			 */
-			if (cfqd->rq_in_driver && cfq_cfqq_idle_window(cfqq))
+			if (cfqd->rq_in_driver && cfq_cfqq_idle_window(cfqq) &&
+			    dispatched)
+				break;
+			if (cfqq->dispatched >= cfqd->cfq_quantum)
 				break;
 		}
 
@@ -1072,7 +1228,6 @@ cfq_dispatch_requests(request_queue_t *q, int force)
 			max_dispatch = 1;
 
 		dispatched += __cfq_dispatch_requests(cfqd, cfqq, max_dispatch);
-		prev_cfqq = cfqq;
 	}
 
 	return dispatched;
@@ -1514,7 +1669,8 @@ cfq_update_io_thinktime(struct cfq_data *cfqd, struct cfq_io_context *cic)
 }
 
 static void
-cfq_update_io_seektime(struct cfq_io_context *cic, struct request *rq)
+cfq_update_io_seektime(struct cfq_data *cfqd, struct cfq_io_context *cic,
+		       struct request *rq)
 {
 	sector_t sdist;
 	u64 total;
@@ -1524,6 +1680,11 @@ cfq_update_io_seektime(struct cfq_io_context *cic, struct request *rq)
 	else
 		sdist = cic->last_request_pos - rq->sector;
 
+	if (!cic->seek_samples) {
+		cfqd->new_seek_total = (7*cic->seek_total + (u64)256*sdist) / 8;
+		cfqd->new_seek_mean = cfqd->new_seek_total / 256;
+	}
+
 	/*
 	 * Don't allow the seek distance to get too large from the
 	 * odd fragment, pagein, etc
@@ -1574,13 +1735,16 @@ static int
 cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
 		   struct request *rq)
 {
-	struct cfq_queue *cfqq = cfqd->active_queue;
-	sector_t dist;
+	struct cfq_queue *cfqq;
 
-	if (cfq_class_idle(new_cfqq))
+	cfqq = cfqd->active_queue;
+	if (!cfqq)
 		return 0;
 
-	if (!cfqq)
+	if (cfq_slice_used(cfqq))
+		return 1;
+
+	if (cfq_class_idle(new_cfqq))
 		return 0;
 
 	if (cfq_class_idle(cfqq))
@@ -1607,12 +1771,7 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
 	 * if this request is as-good as one we would expect from the
 	 * current cfqq, let it preempt
 	 */
-	if (rq->sector > cfqd->last_sector)
-		dist = rq->sector - cfqd->last_sector;
-	else
-		dist = cfqd->last_sector - rq->sector;
-
-	if (dist <= cfqd->active_cic->seek_mean)
+	if (cfq_rq_close(cfqd, rq))
 		return 1;
 
 	return 0;
@@ -1656,28 +1815,12 @@ cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 	cfqq->next_rq = cfq_choose_req(cfqd, cfqq->next_rq, rq);
 	BUG_ON(!cfqq->next_rq);
 
-	/*
-	 * we never wait for an async request and we don't allow preemption
-	 * of an async request. so just return early
-	 */
-	if (!rq_is_sync(rq)) {
-		/*
-		 * sync process issued an async request, if it's waiting
-		 * then expire it and kick rq handling.
-		 */
-		if (cic == cfqd->active_cic &&
-		    del_timer(&cfqd->idle_slice_timer)) {
-			cfq_slice_expired(cfqd, 0, 0);
-			blk_start_queueing(cfqd->queue);
-		}
-		return;
-	}
-
 	cfq_update_io_thinktime(cfqd, cic);
-	cfq_update_io_seektime(cic, rq);
+	cfq_update_io_seektime(cfqd, cic, rq);
 	cfq_update_idle_window(cfqd, cfqq, cic);
 
 	cic->last_request_pos = rq->sector + rq->nr_sectors;
+	cfqq->last_request_pos = cic->last_request_pos;
 
 	if (cfqq == cfqd->active_queue) {
 		/*
@@ -1726,13 +1869,11 @@ static void cfq_completed_request(request_queue_t *q, struct request *rq)
 	now = jiffies;
 
 	WARN_ON(!cfqd->rq_in_driver);
-	WARN_ON(!cfqq->on_dispatch[sync]);
+	WARN_ON(!cfqq->dispatched);
 	cfqd->rq_in_driver--;
-	cfqq->on_dispatch[sync]--;
+	cfqq->dispatched--;
 	cfqq->service_last = now;
 
-	cfqd->last_sector = rq->hard_sector + rq->hard_nr_sectors;
-
 	if (!cfq_class_idle(cfqq))
 		cfqd->last_end_request = now;
 
@@ -1752,11 +1893,12 @@ static void cfq_completed_request(request_queue_t *q, struct request *rq)
 		}
 		if (cfq_slice_used(cfqq))
 			cfq_slice_expired(cfqd, 0, 1);
-		else if (sync && RB_EMPTY_ROOT(&cfqq->sort_list)) {
-			if (!cfq_arm_slice_timer(cfqd))
-				cfq_schedule_dispatch(cfqd);
-		}
+		else if (sync && RB_EMPTY_ROOT(&cfqq->sort_list))
+			cfq_arm_slice_timer(cfqd);
 	}
+
+	if (!cfqd->rq_in_driver)
+		cfq_schedule_dispatch(cfqd);
 }
 
 /*
@@ -2101,7 +2243,6 @@ fail:
 /*
  * sysfs parts below -->
  */
-
 static ssize_t
 cfq_var_show(unsigned int var, char *page)
 {
-- 
1.5.1.1.190.g74474



* [PATCH 3/15] cfq-iosched: minor updates
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
  2007-04-24  8:15 ` [PATCH 1/15] cfq-iosched: improve preemption for cooperating tasks Jens Axboe
  2007-04-24  8:15 ` [PATCH 2/15] cfq-iosched: development update Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-24  8:15 ` [PATCH 4/15] cfq-iosched: rework the whole round-robin list concept Jens Axboe
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

- Move the queue_new flag clear to when the queue is selected
- Only select the non-first queue in cfq_get_best_queue() if there's
  a substantial difference between the best and the first.
- Get rid of ->busy_rr
- Only select a close cooperator if the current queue is known to take
  a while to "think".
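The second bullet can be sketched as a simple ratio test: only abandon
FIFO order in favor of the closest queue when it is substantially
closer than the first candidate. The factor of four below mirrors the
"(first / best) < 4" test in the patch; the function name is invented:

```c
#include <assert.h>

typedef unsigned long long sector_t;	/* stand-in for the kernel type */

/*
 * Sketch of the fairness rule: prefer the closest queue over the
 * first queue in line only when it is at least four times closer;
 * a perfectly sequential candidate (distance 0) always wins.
 */
static int prefer_best_over_first(sector_t first_dist, sector_t best_dist)
{
	if (best_dist == 0)
		return 1;
	return first_dist / best_dist >= 4;
}
```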

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |   81 +++++++++++---------------------------------------
 1 files changed, 18 insertions(+), 63 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 3883ba8..04fea76 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -70,7 +70,6 @@ struct cfq_data {
 	 * rr list of queues with requests and the count of them
 	 */
 	struct list_head rr_list[CFQ_PRIO_LISTS];
-	struct list_head busy_rr;
 	struct list_head cur_rr;
 	struct list_head idle_rr;
 	unsigned long cur_rr_tick;
@@ -410,59 +409,18 @@ cfq_find_next_rq(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 static void cfq_resort_be_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 				int preempted)
 {
-	struct list_head *list, *n;
-	struct cfq_queue *__cfqq;
-	int add_tail = 0;
-
-	/*
-	 * if cfqq has requests in flight, don't allow it to be
-	 * found in cfq_set_active_queue before it has finished them.
-	 * this is done to increase fairness between a process that
-	 * has lots of io pending vs one that only generates one
-	 * sporadically or synchronously
-	 */
-	if (cfqq->dispatched)
-		list = &cfqd->busy_rr;
-	else if (cfqq->ioprio == (cfqd->cur_prio + 1) &&
-		 cfq_cfqq_sync(cfqq) &&
-		 (time_before(cfqd->prio_time, cfqq->service_last) ||
-		  cfq_cfqq_queue_new(cfqq) || preempted)) {
-		list = &cfqd->cur_rr;
-		add_tail = 1;
-	} else
-		list = &cfqd->rr_list[cfqq->ioprio];
-
-	if (!cfq_cfqq_sync(cfqq) || add_tail) {
-		/*
-		 * async queue always goes to the end. this wont be overly
-		 * unfair to writes, as the sort of the sync queue wont be
-		 * allowed to pass the async queue again.
-		 */
-		list_add_tail(&cfqq->cfq_list, list);
-	} else if (preempted || cfq_cfqq_queue_new(cfqq)) {
-		/*
-		 * If this queue was preempted or is new (never been serviced),
-		 * let it be added first for fairness but beind other new
-		 * queues.
-		 */
-		n = list;
-		while (n->next != list) {
-			__cfqq = list_entry_cfqq(n->next);
-			if (!cfq_cfqq_queue_new(__cfqq))
-				break;
+	if (!cfq_cfqq_sync(cfqq))
+		list_add_tail(&cfqq->cfq_list, &cfqd->rr_list[cfqq->ioprio]);
+	else {
+		struct list_head *n = &cfqd->rr_list[cfqq->ioprio];
 
-			n = n->next;
-		}
-		list_add(&cfqq->cfq_list, n);
-	} else {
 		/*
 		 * sort by last service, but don't cross a new or async
-		 * queue. we don't cross a new queue because it hasn't been
-		 * service before, and we don't cross an async queue because
-		 * it gets added to the end on expire.
+		 * queue. we don't cross a new queue because it hasn't
+		 * been service before, and we don't cross an async
+		 * queue because it gets added to the end on expire.
 		 */
-		n = list;
-		while ((n = n->prev) != list) {
+		while ((n = n->prev) != &cfqd->rr_list[cfqq->ioprio]) {
 			struct cfq_queue *__c = list_entry_cfqq(n);
 
 			if (!cfq_cfqq_sync(__c) || !__c->service_last)
@@ -719,6 +677,7 @@ __cfq_set_active_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 		cfq_clear_cfqq_must_alloc_slice(cfqq);
 		cfq_clear_cfqq_fifo_expire(cfqq);
 		cfq_mark_cfqq_slice_new(cfqq);
+		cfq_clear_cfqq_queue_new(cfqq);
 		cfqq->rr_tick = cfqd->cur_rr_tick;
 	}
 
@@ -737,7 +696,6 @@ __cfq_slice_expired(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 
 	cfq_clear_cfqq_must_dispatch(cfqq);
 	cfq_clear_cfqq_wait_request(cfqq);
-	cfq_clear_cfqq_queue_new(cfqq);
 
 	/*
 	 * store what was left of this slice, if the queue idled out
@@ -839,13 +797,15 @@ static inline sector_t cfq_dist_from_last(struct cfq_data *cfqd,
 static struct cfq_queue *cfq_get_best_queue(struct cfq_data *cfqd)
 {
 	struct cfq_queue *cfqq = NULL, *__cfqq;
-	sector_t best = -1, dist;
+	sector_t best = -1, first = -1, dist;
 
 	list_for_each_entry(__cfqq, &cfqd->cur_rr, cfq_list) {
 		if (!__cfqq->next_rq || !cfq_cfqq_sync(__cfqq))
 			continue;
 
 		dist = cfq_dist_from_last(cfqd, __cfqq->next_rq);
+		if (first == -1)
+			first = dist;
 		if (dist < best) {
 			best = dist;
 			cfqq = __cfqq;
@@ -853,9 +813,11 @@ static struct cfq_queue *cfq_get_best_queue(struct cfq_data *cfqd)
 	}
 
 	/*
-	 * Only async queue(s) available, grab first entry
+	 * Only async queue(s) available, grab first entry. Do the same
+	 * if the best queue isn't at least four times closer than the
+	 * first, to obey fairness.
 	 */
-	if (!cfqq)
+	if (!cfqq || (best && first != best && ((first / best) < 4)))
 		cfqq = list_entry_cfqq(cfqd->cur_rr.next);
 
 	return cfqq;
@@ -872,12 +834,6 @@ static struct cfq_queue *cfq_get_next_queue(struct cfq_data *cfqd)
 		 * are spliced
 		 */
 		cfqq = cfq_get_best_queue(cfqd);
-	} else if (!list_empty(&cfqd->busy_rr)) {
-		/*
-		 * If no new queues are available, check if the busy list has
-		 * some before falling back to idle io.
-		 */
-		cfqq = list_entry_cfqq(cfqd->busy_rr.next);
 	} else if (!list_empty(&cfqd->idle_rr)) {
 		/*
 		 * if we have idle queues and no rt or be queues had pending
@@ -998,7 +954,8 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd)
 	/*
 	 * See if this prio level has a good candidate
 	 */
-	if (cfq_close_cooperator(cfqd, cfqq))
+	if (cfq_close_cooperator(cfqd, cfqq) &&
+	    (sample_valid(cic->ttime_samples) && cic->ttime_mean > 2))
 		return;
 
 	cfq_mark_cfqq_must_dispatch(cfqq);
@@ -1178,7 +1135,6 @@ cfq_forced_dispatch(struct cfq_data *cfqd)
 	for (i = 0; i < CFQ_PRIO_LISTS; i++)
 		dispatched += cfq_forced_dispatch_cfqqs(&cfqd->rr_list[i]);
 
-	dispatched += cfq_forced_dispatch_cfqqs(&cfqd->busy_rr);
 	dispatched += cfq_forced_dispatch_cfqqs(&cfqd->cur_rr);
 	dispatched += cfq_forced_dispatch_cfqqs(&cfqd->idle_rr);
 
@@ -2174,7 +2130,6 @@ static void *cfq_init_queue(request_queue_t *q)
 	for (i = 0; i < CFQ_PRIO_LISTS; i++)
 		INIT_LIST_HEAD(&cfqd->rr_list[i]);
 
-	INIT_LIST_HEAD(&cfqd->busy_rr);
 	INIT_LIST_HEAD(&cfqd->cur_rr);
 	INIT_LIST_HEAD(&cfqd->idle_rr);
 	INIT_LIST_HEAD(&cfqd->cic_list);
-- 
1.5.1.1.190.g74474


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/15] cfq-iosched: rework the whole round-robin list concept
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
                   ` (2 preceding siblings ...)
  2007-04-24  8:15 ` [PATCH 3/15] cfq-iosched: minor updates Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-24  8:15 ` [PATCH 5/15] cfq-iosched: speed up rbtree handling Jens Axboe
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

Drawing on some inspiration from the CFS CPU scheduler design, overhaul
the management of the pending cfq_queue lists. Currently CFQ uses a
doubly linked list per priority level for sorting and service selection.
Kill those lists and instead maintain a single rbtree of cfq_queues,
sorted by when to service them.

This unfortunately means that the ionice levels aren't as strong
anymore; we will work on improving that later. We only scale the slice
time now, not the number of times a queue is serviced. This means that
latency is better (for all priority levels), but that the distinction
between the highest and lowest levels isn't as big.

The diffstat speaks for itself.

 cfq-iosched.c |  363 +++++++++++++++++---------------------------------
 1 file changed, 125 insertions(+), 238 deletions(-)

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |  361 +++++++++++++++++---------------------------------
 1 files changed, 123 insertions(+), 238 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 04fea76..ad29a99 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -26,7 +26,16 @@ static int cfq_slice_async = HZ / 25;
 static const int cfq_slice_async_rq = 2;
 static int cfq_slice_idle = HZ / 125;
 
+/*
+ * grace period before allowing idle class to get disk access
+ */
 #define CFQ_IDLE_GRACE		(HZ / 10)
+
+/*
+ * below this threshold, we consider thinktime immediate
+ */
+#define CFQ_MIN_TT		(2)
+
 #define CFQ_SLICE_SCALE		(5)
 
 #define CFQ_KEY_ASYNC		(0)
@@ -69,10 +78,9 @@ struct cfq_data {
 	/*
 	 * rr list of queues with requests and the count of them
 	 */
-	struct list_head rr_list[CFQ_PRIO_LISTS];
+	struct rb_root service_tree;
 	struct list_head cur_rr;
 	struct list_head idle_rr;
-	unsigned long cur_rr_tick;
 	unsigned int busy_queues;
 
 	/*
@@ -91,8 +99,6 @@ struct cfq_data {
 
 	struct cfq_queue *active_queue;
 	struct cfq_io_context *active_cic;
-	int cur_prio, cur_end_prio;
-	unsigned long prio_time;
 	unsigned int dispatch_slice;
 
 	struct timer_list idle_class_timer;
@@ -131,8 +137,10 @@ struct cfq_queue {
 	unsigned int key;
 	/* member of the rr/busy/cur/idle cfqd list */
 	struct list_head cfq_list;
-	/* in what tick we were last serviced */
-	unsigned long rr_tick;
+	/* service_tree member */
+	struct rb_node rb_node;
+	/* service_tree key */
+	unsigned long rb_key;
 	/* sorted list of pending requests */
 	struct rb_root sort_list;
 	/* if fifo isn't expired, next request to serve */
@@ -147,8 +155,6 @@ struct cfq_queue {
 	struct list_head fifo;
 
 	unsigned long slice_end;
-	unsigned long service_last;
-	unsigned long slice_start;
 	long slice_resid;
 
 	/* number of requests that are on the dispatch list or inside driver */
@@ -240,30 +246,26 @@ static inline pid_t cfq_queue_pid(struct task_struct *task, int rw, int is_sync)
  * if a queue is marked sync and has sync io queued. A sync queue with async
  * io only, should not get full sync slice length.
  */
-static inline int
-cfq_prio_to_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
+static inline int cfq_prio_slice(struct cfq_data *cfqd, int sync,
+				 unsigned short prio)
 {
-	const int base_slice = cfqd->cfq_slice[cfq_cfqq_sync(cfqq)];
+	const int base_slice = cfqd->cfq_slice[sync];
 
-	WARN_ON(cfqq->ioprio >= IOPRIO_BE_NR);
+	WARN_ON(prio >= IOPRIO_BE_NR);
+
+	return base_slice + (base_slice/CFQ_SLICE_SCALE * (4 - prio));
+}
 
-	return base_slice + (base_slice/CFQ_SLICE_SCALE * (4 - cfqq->ioprio));
+static inline int
+cfq_prio_to_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
+{
+	return cfq_prio_slice(cfqd, cfq_cfqq_sync(cfqq), cfqq->ioprio);
 }
 
 static inline void
 cfq_set_prio_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 {
 	cfqq->slice_end = cfq_prio_to_slice(cfqd, cfqq) + jiffies;
-	cfqq->slice_end += cfqq->slice_resid;
-
-	/*
-	 * Don't carry over residual for more than one slice, we only want
-	 * to slightly correct the fairness. Carrying over forever would
-	 * easily introduce oscillations.
-	 */
-	cfqq->slice_resid = 0;
-
-	cfqq->slice_start = jiffies;
 }
 
 /*
@@ -403,33 +405,50 @@ cfq_find_next_rq(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 	return cfq_choose_req(cfqd, next, prev);
 }
 
-/*
- * This function finds out where to insert a BE queue in the service hierarchy
- */
-static void cfq_resort_be_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq,
-				int preempted)
+static unsigned long cfq_slice_offset(struct cfq_data *cfqd,
+				      struct cfq_queue *cfqq)
 {
-	if (!cfq_cfqq_sync(cfqq))
-		list_add_tail(&cfqq->cfq_list, &cfqd->rr_list[cfqq->ioprio]);
-	else {
-		struct list_head *n = &cfqd->rr_list[cfqq->ioprio];
+	/*
+	 * just an approximation, should be ok.
+	 */
+	return ((cfqd->busy_queues - 1) * cfq_prio_slice(cfqd, 1, 0));
+}
+
+static void cfq_service_tree_add(struct cfq_data *cfqd,
+				    struct cfq_queue *cfqq)
+{
+	struct rb_node **p = &cfqd->service_tree.rb_node;
+	struct rb_node *parent = NULL;
+	struct cfq_queue *__cfqq;
+	unsigned long rb_key;
+
+	rb_key = cfq_slice_offset(cfqd, cfqq) + jiffies;
+	rb_key += cfqq->slice_resid;
+	cfqq->slice_resid = 0;
 
+	if (!RB_EMPTY_NODE(&cfqq->rb_node)) {
 		/*
-		 * sort by last service, but don't cross a new or async
-		 * queue. we don't cross a new queue because it hasn't
-		 * been service before, and we don't cross an async
-		 * queue because it gets added to the end on expire.
+		 * same position, nothing more to do
 		 */
-		while ((n = n->prev) != &cfqd->rr_list[cfqq->ioprio]) {
-			struct cfq_queue *__c = list_entry_cfqq(n);
+		if (rb_key == cfqq->rb_key)
+			return;
 
-			if (!cfq_cfqq_sync(__c) || !__c->service_last)
-				break;
-			if (time_before(__c->service_last, cfqq->service_last))
-				break;
-		}
-		list_add(&cfqq->cfq_list, n);
+		rb_erase(&cfqq->rb_node, &cfqd->service_tree);
 	}
+
+	while (*p) {
+		parent = *p;
+		__cfqq = rb_entry(parent, struct cfq_queue, rb_node);
+
+		if (rb_key < __cfqq->rb_key)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+
+	cfqq->rb_key = rb_key;
+	rb_link_node(&cfqq->rb_node, parent, p);
+	rb_insert_color(&cfqq->rb_node, &cfqd->service_tree);
 }
 
 static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
@@ -443,7 +462,7 @@ static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
 	if (!cfq_cfqq_on_rr(cfqq))
 		return;
 
-	list_del(&cfqq->cfq_list);
+	list_del_init(&cfqq->cfq_list);
 
 	if (cfq_class_rt(cfqq)) {
 		/*
@@ -465,7 +484,7 @@ static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
 		/*
 		 * So we get here, ergo the queue is a regular best-effort queue
 		 */
-		cfq_resort_be_queue(cfqd, cfqq, preempted);
+		cfq_service_tree_add(cfqd, cfqq);
 	}
 }
 
@@ -490,6 +509,11 @@ cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 	cfq_clear_cfqq_on_rr(cfqq);
 	list_del_init(&cfqq->cfq_list);
 
+	if (!RB_EMPTY_NODE(&cfqq->rb_node)) {
+		rb_erase(&cfqq->rb_node, &cfqd->service_tree);
+		RB_CLEAR_NODE(&cfqq->rb_node);
+	}
+
 	BUG_ON(!cfqd->busy_queues);
 	cfqd->busy_queues--;
 }
@@ -678,7 +702,6 @@ __cfq_set_active_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 		cfq_clear_cfqq_fifo_expire(cfqq);
 		cfq_mark_cfqq_slice_new(cfqq);
 		cfq_clear_cfqq_queue_new(cfqq);
-		cfqq->rr_tick = cfqd->cur_rr_tick;
 	}
 
 	cfqd->active_queue = cfqq;
@@ -726,114 +749,19 @@ static inline void cfq_slice_expired(struct cfq_data *cfqd, int preempted,
 		__cfq_slice_expired(cfqd, cfqq, preempted, timed_out);
 }
 
-/*
- * 0
- * 0,1
- * 0,1,2
- * 0,1,2,3
- * 0,1,2,3,4
- * 0,1,2,3,4,5
- * 0,1,2,3,4,5,6
- * 0,1,2,3,4,5,6,7
- */
-static int cfq_get_next_prio_level(struct cfq_data *cfqd)
-{
-	int prio, wrap;
-
-	prio = -1;
-	wrap = 0;
-	do {
-		int p;
-
-		for (p = cfqd->cur_prio; p <= cfqd->cur_end_prio; p++) {
-			if (!list_empty(&cfqd->rr_list[p])) {
-				prio = p;
-				break;
-			}
-		}
-
-		if (prio != -1)
-			break;
-		cfqd->cur_prio = 0;
-		if (++cfqd->cur_end_prio == CFQ_PRIO_LISTS) {
-			cfqd->cur_end_prio = 0;
-			if (wrap)
-				break;
-			wrap = 1;
-		}
-	} while (1);
-
-	if (unlikely(prio == -1))
-		return -1;
-
-	BUG_ON(prio >= CFQ_PRIO_LISTS);
-
-	list_splice_init(&cfqd->rr_list[prio], &cfqd->cur_rr);
-
-	cfqd->cur_prio = prio + 1;
-	if (cfqd->cur_prio > cfqd->cur_end_prio) {
-		cfqd->cur_end_prio = cfqd->cur_prio;
-		cfqd->cur_prio = 0;
-	}
-	if (cfqd->cur_end_prio == CFQ_PRIO_LISTS) {
-		cfqd->cur_prio = 0;
-		cfqd->cur_end_prio = 0;
-	}
-
-	cfqd->cur_rr_tick++;
-	cfqd->prio_time = jiffies;
-	return prio;
-}
-
-static inline sector_t cfq_dist_from_last(struct cfq_data *cfqd,
-					  struct request *rq)
-{
-	if (rq->sector >= cfqd->last_position)
-		return rq->sector - cfqd->last_position;
-	else
-		return cfqd->last_position - rq->sector;
-}
-
-static struct cfq_queue *cfq_get_best_queue(struct cfq_data *cfqd)
-{
-	struct cfq_queue *cfqq = NULL, *__cfqq;
-	sector_t best = -1, first = -1, dist;
-
-	list_for_each_entry(__cfqq, &cfqd->cur_rr, cfq_list) {
-		if (!__cfqq->next_rq || !cfq_cfqq_sync(__cfqq))
-			continue;
-
-		dist = cfq_dist_from_last(cfqd, __cfqq->next_rq);
-		if (first == -1)
-			first = dist;
-		if (dist < best) {
-			best = dist;
-			cfqq = __cfqq;
-		}
-	}
-
-	/*
-	 * Only async queue(s) available, grab first entry. Do the same
-	 * if the difference between the first and best isn't more than
-	 * twice, to obey fairness.
-	 */
-	if (!cfqq || (best && first != best && ((first / best) < 4)))
-		cfqq = list_entry_cfqq(cfqd->cur_rr.next);
-
-	return cfqq;
-}
-
 static struct cfq_queue *cfq_get_next_queue(struct cfq_data *cfqd)
 {
 	struct cfq_queue *cfqq = NULL;
 
-	if (!list_empty(&cfqd->cur_rr) || cfq_get_next_prio_level(cfqd) != -1) {
+	if (!list_empty(&cfqd->cur_rr)) {
 		/*
-		 * if current list is non-empty, grab first entry. if it is
-		 * empty, get next prio level and grab first entry then if any
-		 * are spliced
+		 * if current list is non-empty, grab first entry.
 		 */
-		cfqq = cfq_get_best_queue(cfqd);
+		cfqq = list_entry_cfqq(cfqd->cur_rr.next);
+	} else if (!RB_EMPTY_ROOT(&cfqd->service_tree)) {
+		struct rb_node *n = rb_first(&cfqd->service_tree);
+
+		cfqq = rb_entry(n, struct cfq_queue, rb_node);
 	} else if (!list_empty(&cfqd->idle_rr)) {
 		/*
 		 * if we have idle queues and no rt or be queues had pending
@@ -855,27 +783,20 @@ static struct cfq_queue *cfq_set_active_queue(struct cfq_data *cfqd)
 {
 	struct cfq_queue *cfqq;
 
-	do {
-		long prio;
-
-		cfqq = cfq_get_next_queue(cfqd);
-		if (!cfqq)
-			break;
-
-		prio = cfq_prio_to_slice(cfqd, cfqq);
-		if (cfqq->slice_resid > -prio)
-			break;
-
-		cfqq->slice_resid += prio;
-		list_del_init(&cfqq->cfq_list);
-		list_add_tail(&cfqq->cfq_list, &cfqd->rr_list[cfqq->ioprio]);
-		cfqq = NULL;
-	} while (1);
-
+	cfqq = cfq_get_next_queue(cfqd);
 	__cfq_set_active_queue(cfqd, cfqq);
 	return cfqq;
 }
 
+static inline sector_t cfq_dist_from_last(struct cfq_data *cfqd,
+					  struct request *rq)
+{
+	if (rq->sector >= cfqd->last_position)
+		return rq->sector - cfqd->last_position;
+	else
+		return cfqd->last_position - rq->sector;
+}
+
 static inline int cfq_rq_close(struct cfq_data *cfqd, struct request *rq)
 {
 	struct cfq_io_context *cic = cfqd->active_cic;
@@ -886,45 +807,15 @@ static inline int cfq_rq_close(struct cfq_data *cfqd, struct request *rq)
 	return cfq_dist_from_last(cfqd, rq) <= cic->seek_mean;
 }
 
-static struct cfq_queue *__cfq_close_cooperator(struct cfq_data *cfqd,
-						struct cfq_queue *cur_cfqq,
-						struct list_head *list)
-{
-	struct cfq_queue *cfqq;
-
-	list_for_each_entry(cfqq, list, cfq_list) {
-		if (cfqq == cur_cfqq || !cfq_cfqq_sync(cfqq))
-			continue;
-
-		BUG_ON(!cfqq->next_rq);
-
-		if (cfq_rq_close(cfqd, cfqq->next_rq))
-			return cfqq;
-	}
-
-	return NULL;
-}
-
-static int cfq_close_cooperator(struct cfq_data *cfqd,
-				struct cfq_queue *cur_cfqq)
+static int cfq_close_cooperator(struct cfq_data *cfq_data,
+				struct cfq_queue *cfqq)
 {
-	struct cfq_queue *cfqq;
-
-	if (!cfqd->busy_queues)
-		return 0;
-
 	/*
-	 * check cur_rr and same-prio rr_list for candidates
+	 * We should notice if some of the queues are cooperating, eg
+	 * working closely on the same area of the disk. In that case,
+	 * we can group them together and don't waste time idling.
 	 */
-	cfqq = __cfq_close_cooperator(cfqd, cur_cfqq, &cfqd->cur_rr);
-	if (cfqq)
-		return 1;
-
-	cfqq = __cfq_close_cooperator(cfqd, cur_cfqq, &cfqd->rr_list[cur_cfqq->ioprio]);
-	if (cfqq && (cfqq->rr_tick == cfqd->cur_rr_tick))
-		cfqq = NULL;
-
-	return cfqq != NULL;
+	return 0;
 }
 
 #define CIC_SEEKY(cic) ((cic)->seek_mean > (8 * 1024))
@@ -968,7 +859,7 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd)
 	 */
 	sl = cfqd->cfq_slice_idle;
 	if (sample_valid(cic->seek_samples) && CIC_SEEKY(cic))
-		sl = min(sl, msecs_to_jiffies(2));
+		sl = min(sl, msecs_to_jiffies(CFQ_MIN_TT));
 
 	mod_timer(&cfqd->idle_slice_timer, jiffies + sl);
 }
@@ -1109,31 +1000,41 @@ __cfq_dispatch_requests(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 	return dispatched;
 }
 
-static int
-cfq_forced_dispatch_cfqqs(struct list_head *list)
+static inline int __cfq_forced_dispatch_cfqq(struct cfq_queue *cfqq)
+{
+	int dispatched = 0;
+
+	while (cfqq->next_rq) {
+		cfq_dispatch_insert(cfqq->cfqd->queue, cfqq->next_rq);
+		dispatched++;
+	}
+
+	BUG_ON(!list_empty(&cfqq->fifo));
+	return dispatched;
+}
+
+static int cfq_forced_dispatch_cfqqs(struct list_head *list)
 {
 	struct cfq_queue *cfqq, *next;
 	int dispatched;
 
 	dispatched = 0;
-	list_for_each_entry_safe(cfqq, next, list, cfq_list) {
-		while (cfqq->next_rq) {
-			cfq_dispatch_insert(cfqq->cfqd->queue, cfqq->next_rq);
-			dispatched++;
-		}
-		BUG_ON(!list_empty(&cfqq->fifo));
-	}
+	list_for_each_entry_safe(cfqq, next, list, cfq_list)
+		dispatched += __cfq_forced_dispatch_cfqq(cfqq);
 
 	return dispatched;
 }
 
-static int
-cfq_forced_dispatch(struct cfq_data *cfqd)
+static int cfq_forced_dispatch(struct cfq_data *cfqd)
 {
-	int i, dispatched = 0;
+	int dispatched = 0;
+	struct rb_node *n;
+
+	while ((n = rb_first(&cfqd->service_tree)) != NULL) {
+		struct cfq_queue *cfqq = rb_entry(n, struct cfq_queue, rb_node);
 
-	for (i = 0; i < CFQ_PRIO_LISTS; i++)
-		dispatched += cfq_forced_dispatch_cfqqs(&cfqd->rr_list[i]);
+		dispatched += __cfq_forced_dispatch_cfqq(cfqq);
+	}
 
 	dispatched += cfq_forced_dispatch_cfqqs(&cfqd->cur_rr);
 	dispatched += cfq_forced_dispatch_cfqqs(&cfqd->idle_rr);
@@ -1145,8 +1046,7 @@ cfq_forced_dispatch(struct cfq_data *cfqd)
 	return dispatched;
 }
 
-static int
-cfq_dispatch_requests(request_queue_t *q, int force)
+static int cfq_dispatch_requests(request_queue_t *q, int force)
 {
 	struct cfq_data *cfqd = q->elevator->elevator_data;
 	struct cfq_queue *cfqq;
@@ -1216,7 +1116,6 @@ static void cfq_put_queue(struct cfq_queue *cfqq)
 	/*
 	 * it's on the empty list and still hashed
 	 */
-	list_del(&cfqq->cfq_list);
 	hlist_del(&cfqq->cfq_hash);
 	kmem_cache_free(cfq_pool, cfqq);
 }
@@ -1385,8 +1284,6 @@ static void cfq_init_prio_data(struct cfq_queue *cfqq)
 	 */
 	cfqq->org_ioprio = cfqq->ioprio;
 	cfqq->org_ioprio_class = cfqq->ioprio_class;
-
-	cfq_resort_rr_list(cfqq, 0);
 	cfq_clear_cfqq_prio_changed(cfqq);
 }
 
@@ -1472,6 +1369,7 @@ retry:
 
 		INIT_HLIST_NODE(&cfqq->cfq_hash);
 		INIT_LIST_HEAD(&cfqq->cfq_list);
+		RB_CLEAR_NODE(&cfqq->rb_node);
 		INIT_LIST_HEAD(&cfqq->fifo);
 
 		cfqq->key = key;
@@ -1746,7 +1644,8 @@ static void cfq_preempt_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 	 * so we know that it will be selected next.
 	 */
 	BUG_ON(!cfq_cfqq_on_rr(cfqq));
-	list_move(&cfqq->cfq_list, &cfqd->cur_rr);
+	list_del_init(&cfqq->cfq_list);
+	list_add(&cfqq->cfq_list, &cfqd->cur_rr);
 
 	cfqq->slice_end = 0;
 	cfq_mark_cfqq_slice_new(cfqq);
@@ -1828,13 +1727,10 @@ static void cfq_completed_request(request_queue_t *q, struct request *rq)
 	WARN_ON(!cfqq->dispatched);
 	cfqd->rq_in_driver--;
 	cfqq->dispatched--;
-	cfqq->service_last = now;
 
 	if (!cfq_class_idle(cfqq))
 		cfqd->last_end_request = now;
 
-	cfq_resort_rr_list(cfqq, 0);
-
 	if (sync)
 		RQ_CIC(rq)->last_end_request = now;
 
@@ -1863,9 +1759,6 @@ static void cfq_completed_request(request_queue_t *q, struct request *rq)
  */
 static void cfq_prio_boost(struct cfq_queue *cfqq)
 {
-	const int ioprio_class = cfqq->ioprio_class;
-	const int ioprio = cfqq->ioprio;
-
 	if (has_fs_excl()) {
 		/*
 		 * boost idle prio on transactions that would lock out other
@@ -1884,12 +1777,6 @@ static void cfq_prio_boost(struct cfq_queue *cfqq)
 		if (cfqq->ioprio != cfqq->org_ioprio)
 			cfqq->ioprio = cfqq->org_ioprio;
 	}
-
-	/*
-	 * refile between round-robin lists if we moved the priority class
-	 */
-	if ((ioprio_class != cfqq->ioprio_class || ioprio != cfqq->ioprio))
-		cfq_resort_rr_list(cfqq, 0);
 }
 
 static inline int __cfq_may_queue(struct cfq_queue *cfqq)
@@ -2127,9 +2014,7 @@ static void *cfq_init_queue(request_queue_t *q)
 
 	memset(cfqd, 0, sizeof(*cfqd));
 
-	for (i = 0; i < CFQ_PRIO_LISTS; i++)
-		INIT_LIST_HEAD(&cfqd->rr_list[i]);
-
+	cfqd->service_tree = RB_ROOT;
 	INIT_LIST_HEAD(&cfqd->cur_rr);
 	INIT_LIST_HEAD(&cfqd->idle_rr);
 	INIT_LIST_HEAD(&cfqd->cic_list);
-- 
1.5.1.1.190.g74474


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 5/15] cfq-iosched: speed up rbtree handling
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
                   ` (3 preceding siblings ...)
  2007-04-24  8:15 ` [PATCH 4/15] cfq-iosched: rework the whole round-robin list concept Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-25 15:59   ` Alan D. Brunelle
  2007-04-24  8:15 ` [PATCH 6/15] cfq-iosched: sort RT queues into the rbtree Jens Axboe
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

For cases where the rbtree is mainly used for sorting and min retrieval,
a nice speedup of the rbtree code is to maintain a cache of the leftmost
node in the tree.

Also spotted in the CFS CPU scheduler code.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |   62 +++++++++++++++++++++++++++++++++++++++-----------
 1 files changed, 48 insertions(+), 14 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index ad29a99..7f964ee 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -70,6 +70,18 @@ static struct completion *ioc_gone;
 #define sample_valid(samples)	((samples) > 80)
 
 /*
+ * Most of our rbtree usage is for sorting with min extraction, so
+ * if we cache the leftmost node we don't have to walk down the tree
+ * to find it. Idea borrowed from Ingo Molnars CFS scheduler. We should
+ * move this into the elevator for the rq sorting as well.
+ */
+struct cfq_rb_root {
+	struct rb_root rb;
+	struct rb_node *left;
+};
+#define CFQ_RB_ROOT	(struct cfq_rb_root) { RB_ROOT, NULL, }
+
+/*
  * Per block device queue structure
  */
 struct cfq_data {
@@ -78,7 +90,7 @@ struct cfq_data {
 	/*
 	 * rr list of queues with requests and the count of them
 	 */
-	struct rb_root service_tree;
+	struct cfq_rb_root service_tree;
 	struct list_head cur_rr;
 	struct list_head idle_rr;
 	unsigned int busy_queues;
@@ -378,6 +390,23 @@ cfq_choose_req(struct cfq_data *cfqd, struct request *rq1, struct request *rq2)
 	}
 }
 
+static struct rb_node *cfq_rb_first(struct cfq_rb_root *root)
+{
+	if (root->left)
+		return root->left;
+
+	return rb_first(&root->rb);
+}
+
+static void cfq_rb_erase(struct rb_node *n, struct cfq_rb_root *root)
+{
+	if (root->left == n)
+		root->left = NULL;
+
+	rb_erase(n, &root->rb);
+	RB_CLEAR_NODE(n);
+}
+
 /*
  * would be nice to take fifo expire time into account as well
  */
@@ -417,10 +446,10 @@ static unsigned long cfq_slice_offset(struct cfq_data *cfqd,
 static void cfq_service_tree_add(struct cfq_data *cfqd,
 				    struct cfq_queue *cfqq)
 {
-	struct rb_node **p = &cfqd->service_tree.rb_node;
+	struct rb_node **p = &cfqd->service_tree.rb.rb_node;
 	struct rb_node *parent = NULL;
-	struct cfq_queue *__cfqq;
 	unsigned long rb_key;
+	int left = 1;
 
 	rb_key = cfq_slice_offset(cfqd, cfqq) + jiffies;
 	rb_key += cfqq->slice_resid;
@@ -433,22 +462,29 @@ static void cfq_service_tree_add(struct cfq_data *cfqd,
 		if (rb_key == cfqq->rb_key)
 			return;
 
-		rb_erase(&cfqq->rb_node, &cfqd->service_tree);
+		cfq_rb_erase(&cfqq->rb_node, &cfqd->service_tree);
 	}
 
 	while (*p) {
+		struct cfq_queue *__cfqq;
+
 		parent = *p;
 		__cfqq = rb_entry(parent, struct cfq_queue, rb_node);
 
 		if (rb_key < __cfqq->rb_key)
 			p = &(*p)->rb_left;
-		else
+		else {
 			p = &(*p)->rb_right;
+			left = 0;
+		}
 	}
 
+	if (left)
+		cfqd->service_tree.left = &cfqq->rb_node;
+
 	cfqq->rb_key = rb_key;
 	rb_link_node(&cfqq->rb_node, parent, p);
-	rb_insert_color(&cfqq->rb_node, &cfqd->service_tree);
+	rb_insert_color(&cfqq->rb_node, &cfqd->service_tree.rb);
 }
 
 static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
@@ -509,10 +545,8 @@ cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 	cfq_clear_cfqq_on_rr(cfqq);
 	list_del_init(&cfqq->cfq_list);
 
-	if (!RB_EMPTY_NODE(&cfqq->rb_node)) {
-		rb_erase(&cfqq->rb_node, &cfqd->service_tree);
-		RB_CLEAR_NODE(&cfqq->rb_node);
-	}
+	if (!RB_EMPTY_NODE(&cfqq->rb_node))
+		cfq_rb_erase(&cfqq->rb_node, &cfqd->service_tree);
 
 	BUG_ON(!cfqd->busy_queues);
 	cfqd->busy_queues--;
@@ -758,8 +792,8 @@ static struct cfq_queue *cfq_get_next_queue(struct cfq_data *cfqd)
 		 * if current list is non-empty, grab first entry.
 		 */
 		cfqq = list_entry_cfqq(cfqd->cur_rr.next);
-	} else if (!RB_EMPTY_ROOT(&cfqd->service_tree)) {
-		struct rb_node *n = rb_first(&cfqd->service_tree);
+	} else if (!RB_EMPTY_ROOT(&cfqd->service_tree.rb)) {
+		struct rb_node *n = cfq_rb_first(&cfqd->service_tree);
 
 		cfqq = rb_entry(n, struct cfq_queue, rb_node);
 	} else if (!list_empty(&cfqd->idle_rr)) {
@@ -1030,7 +1064,7 @@ static int cfq_forced_dispatch(struct cfq_data *cfqd)
 	int dispatched = 0;
 	struct rb_node *n;
 
-	while ((n = rb_first(&cfqd->service_tree)) != NULL) {
+	while ((n = cfq_rb_first(&cfqd->service_tree)) != NULL) {
 		struct cfq_queue *cfqq = rb_entry(n, struct cfq_queue, rb_node);
 
 		dispatched += __cfq_forced_dispatch_cfqq(cfqq);
@@ -2014,7 +2048,7 @@ static void *cfq_init_queue(request_queue_t *q)
 
 	memset(cfqd, 0, sizeof(*cfqd));
 
-	cfqd->service_tree = RB_ROOT;
+	cfqd->service_tree = CFQ_RB_ROOT;
 	INIT_LIST_HEAD(&cfqd->cur_rr);
 	INIT_LIST_HEAD(&cfqd->idle_rr);
 	INIT_LIST_HEAD(&cfqd->cic_list);
-- 
1.5.1.1.190.g74474


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 6/15] cfq-iosched: sort RT queues into the rbtree
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
                   ` (4 preceding siblings ...)
  2007-04-24  8:15 ` [PATCH 5/15] cfq-iosched: speed up rbtree handling Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-24  8:15 ` [PATCH 7/15] cfq-iosched: sort IDLE " Jens Axboe
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

Currently CFQ does a linked-list insert into the current round-robin
list for RT queues. We can simply factor the class into the rbtree
insertion key, and then we no longer have to treat RT queues
specially. It's faster, too.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |   27 ++++++++++++---------------
 1 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 7f964ee..38ac492 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -471,7 +471,16 @@ static void cfq_service_tree_add(struct cfq_data *cfqd,
 		parent = *p;
 		__cfqq = rb_entry(parent, struct cfq_queue, rb_node);
 
-		if (rb_key < __cfqq->rb_key)
+		/*
+		 * sort RT queues first, we always want to give
+		 * preference to them. after that, sort on the next
+		 * service time.
+		 */
+		if (cfq_class_rt(cfqq) > cfq_class_rt(__cfqq))
+			p = &(*p)->rb_left;
+		else if (cfq_class_rt(cfqq) < cfq_class_rt(__cfqq))
+			p = &(*p)->rb_right;
+		else if (rb_key < __cfqq->rb_key)
 			p = &(*p)->rb_left;
 		else {
 			p = &(*p)->rb_right;
@@ -490,7 +499,6 @@ static void cfq_service_tree_add(struct cfq_data *cfqd,
 static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
 {
 	struct cfq_data *cfqd = cfqq->cfqd;
-	struct list_head *n;
 
 	/*
 	 * Resorting requires the cfqq to be on the RR list already.
@@ -500,25 +508,14 @@ static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
 
 	list_del_init(&cfqq->cfq_list);
 
-	if (cfq_class_rt(cfqq)) {
-		/*
-		 * At to the front of the current list, but behind other
-		 * RT queues.
-		 */
-		n = &cfqd->cur_rr;
-		while (n->next != &cfqd->cur_rr)
-			if (!cfq_class_rt(cfqq))
-				break;
-
-		list_add(&cfqq->cfq_list, n);
-	} else if (cfq_class_idle(cfqq)) {
+	if (cfq_class_idle(cfqq)) {
 		/*
 		 * IDLE goes to the tail of the idle list
 		 */
 		list_add_tail(&cfqq->cfq_list, &cfqd->idle_rr);
 	} else {
 		/*
-		 * So we get here, ergo the queue is a regular best-effort queue
+		 * RT and BE queues, sort into the rbtree
 		 */
 		cfq_service_tree_add(cfqd, cfqq);
 	}
-- 
1.5.1.1.190.g74474


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 7/15] cfq-iosched: sort IDLE queues into the rbtree
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
                   ` (5 preceding siblings ...)
  2007-04-24  8:15 ` [PATCH 6/15] cfq-iosched: sort RT queues into the rbtree Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-24  8:15 ` [PATCH 8/15] cfq-iosched: style cleanups and comments Jens Axboe
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

Give IDLE queues the same treatment as the RT conversion: sort them
into the rbtree too, but place the idle branch at the back of the tree.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |   67 +++++++++++++++++++++++---------------------------
 1 files changed, 31 insertions(+), 36 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 38ac492..e6cc77f 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -92,7 +92,6 @@ struct cfq_data {
 	 */
 	struct cfq_rb_root service_tree;
 	struct list_head cur_rr;
-	struct list_head idle_rr;
 	unsigned int busy_queues;
 
 	/*
@@ -467,25 +466,33 @@ static void cfq_service_tree_add(struct cfq_data *cfqd,
 
 	while (*p) {
 		struct cfq_queue *__cfqq;
+		struct rb_node **n;
 
 		parent = *p;
 		__cfqq = rb_entry(parent, struct cfq_queue, rb_node);
 
 		/*
 		 * sort RT queues first, we always want to give
-		 * preference to them. after that, sort on the next
-		 * service time.
+		 * preference to them. IDLE queues goes to the back.
+		 * after that, sort on the next service time.
 		 */
 		if (cfq_class_rt(cfqq) > cfq_class_rt(__cfqq))
-			p = &(*p)->rb_left;
+			n = &(*p)->rb_left;
 		else if (cfq_class_rt(cfqq) < cfq_class_rt(__cfqq))
-			p = &(*p)->rb_right;
+			n = &(*p)->rb_right;
+		else if (cfq_class_idle(cfqq) < cfq_class_idle(__cfqq))
+			n = &(*p)->rb_left;
+		else if (cfq_class_idle(cfqq) > cfq_class_idle(__cfqq))
+			n = &(*p)->rb_right;
 		else if (rb_key < __cfqq->rb_key)
-			p = &(*p)->rb_left;
-		else {
-			p = &(*p)->rb_right;
+			n = &(*p)->rb_left;
+		else
+			n = &(*p)->rb_right;
+
+		if (n == &(*p)->rb_right)
 			left = 0;
-		}
+
+		p = n;
 	}
 
 	if (left)
@@ -506,19 +513,7 @@ static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
 	if (!cfq_cfqq_on_rr(cfqq))
 		return;
 
-	list_del_init(&cfqq->cfq_list);
-
-	if (cfq_class_idle(cfqq)) {
-		/*
-		 * IDLE goes to the tail of the idle list
-		 */
-		list_add_tail(&cfqq->cfq_list, &cfqd->idle_rr);
-	} else {
-		/*
-		 * RT and BE queues, sort into the rbtree
-		 */
-		cfq_service_tree_add(cfqd, cfqq);
-	}
+	cfq_service_tree_add(cfqd, cfqq);
 }
 
 /*
@@ -791,20 +786,22 @@ static struct cfq_queue *cfq_get_next_queue(struct cfq_data *cfqd)
 		cfqq = list_entry_cfqq(cfqd->cur_rr.next);
 	} else if (!RB_EMPTY_ROOT(&cfqd->service_tree.rb)) {
 		struct rb_node *n = cfq_rb_first(&cfqd->service_tree);
+		unsigned long end;
 
 		cfqq = rb_entry(n, struct cfq_queue, rb_node);
-	} else if (!list_empty(&cfqd->idle_rr)) {
-		/*
-		 * if we have idle queues and no rt or be queues had pending
-		 * requests, either allow immediate service if the grace period
-		 * has passed or arm the idle grace timer
-		 */
-		unsigned long end = cfqd->last_end_request + CFQ_IDLE_GRACE;
-
-		if (time_after_eq(jiffies, end))
-			cfqq = list_entry_cfqq(cfqd->idle_rr.next);
-		else
-			mod_timer(&cfqd->idle_class_timer, end);
+		if (cfq_class_idle(cfqq)) {
+			/*
+			 * if we have idle queues and no rt or be queues had
+			 * pending requests, either allow immediate service if
+			 * the grace period has passed or arm the idle grace
+			 * timer
+			 */
+			end = cfqd->last_end_request + CFQ_IDLE_GRACE;
+			if (time_before(jiffies, end)) {
+				mod_timer(&cfqd->idle_class_timer, end);
+				cfqq = NULL;
+			}
+		}
 	}
 
 	return cfqq;
@@ -1068,7 +1065,6 @@ static int cfq_forced_dispatch(struct cfq_data *cfqd)
 	}
 
 	dispatched += cfq_forced_dispatch_cfqqs(&cfqd->cur_rr);
-	dispatched += cfq_forced_dispatch_cfqqs(&cfqd->idle_rr);
 
 	cfq_slice_expired(cfqd, 0, 0);
 
@@ -2047,7 +2043,6 @@ static void *cfq_init_queue(request_queue_t *q)
 
 	cfqd->service_tree = CFQ_RB_ROOT;
 	INIT_LIST_HEAD(&cfqd->cur_rr);
-	INIT_LIST_HEAD(&cfqd->idle_rr);
 	INIT_LIST_HEAD(&cfqd->cic_list);
 
 	cfqd->cfq_hash = kmalloc_node(sizeof(struct hlist_head) * CFQ_QHASH_ENTRIES, GFP_KERNEL, q->node);
-- 
1.5.1.1.190.g74474


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 8/15] cfq-iosched: style cleanups and comments
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
                   ` (6 preceding siblings ...)
  2007-04-24  8:15 ` [PATCH 7/15] cfq-iosched: sort IDLE " Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-24  8:15 ` [PATCH 9/15] cfq-iosched: slice offset should take ioprio into account Jens Axboe
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |   66 ++++++++++++++++++++++++++++++++++++++------------
 1 files changed, 50 insertions(+), 16 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index e6cc77f..f86ff4d 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -222,7 +222,7 @@ CFQ_CFQQ_FNS(slice_new);
 
 static struct cfq_queue *cfq_find_cfq_hash(struct cfq_data *, unsigned int, unsigned short);
 static void cfq_dispatch_insert(request_queue_t *, struct request *);
-static struct cfq_queue *cfq_get_queue(struct cfq_data *cfqd, unsigned int key, struct task_struct *tsk, gfp_t gfp_mask);
+static struct cfq_queue *cfq_get_queue(struct cfq_data *, unsigned int, struct task_struct *, gfp_t);
 
 /*
  * scheduler run of queue, if there are requests pending and no one in the
@@ -389,6 +389,9 @@ cfq_choose_req(struct cfq_data *cfqd, struct request *rq1, struct request *rq2)
 	}
 }
 
+/*
+ * The below is leftmost cache rbtree addon
+ */
 static struct rb_node *cfq_rb_first(struct cfq_rb_root *root)
 {
 	if (root->left)
@@ -442,13 +445,18 @@ static unsigned long cfq_slice_offset(struct cfq_data *cfqd,
 	return ((cfqd->busy_queues - 1) * cfq_prio_slice(cfqd, 1, 0));
 }
 
+/*
+ * The cfqd->service_tree holds all pending cfq_queue's that have
+ * requests waiting to be processed. It is sorted in the order that
+ * we will service the queues.
+ */
 static void cfq_service_tree_add(struct cfq_data *cfqd,
 				    struct cfq_queue *cfqq)
 {
 	struct rb_node **p = &cfqd->service_tree.rb.rb_node;
 	struct rb_node *parent = NULL;
 	unsigned long rb_key;
-	int left = 1;
+	int left;
 
 	rb_key = cfq_slice_offset(cfqd, cfqq) + jiffies;
 	rb_key += cfqq->slice_resid;
@@ -464,6 +472,7 @@ static void cfq_service_tree_add(struct cfq_data *cfqd,
 		cfq_rb_erase(&cfqq->rb_node, &cfqd->service_tree);
 	}
 
+	left = 1;
 	while (*p) {
 		struct cfq_queue *__cfqq;
 		struct rb_node **n;
@@ -503,17 +512,16 @@ static void cfq_service_tree_add(struct cfq_data *cfqd,
 	rb_insert_color(&cfqq->rb_node, &cfqd->service_tree.rb);
 }
 
+/*
+ * Update cfqq's position in the service tree.
+ */
 static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
 {
-	struct cfq_data *cfqd = cfqq->cfqd;
-
 	/*
 	 * Resorting requires the cfqq to be on the RR list already.
 	 */
-	if (!cfq_cfqq_on_rr(cfqq))
-		return;
-
-	cfq_service_tree_add(cfqd, cfqq);
+	if (cfq_cfqq_on_rr(cfqq))
+		cfq_service_tree_add(cfqq->cfqd, cfqq);
 }
 
 /*
@@ -530,6 +538,10 @@ cfq_add_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 	cfq_resort_rr_list(cfqq, 0);
 }
 
+/*
+ * Called when the cfqq no longer has requests pending, remove it from
+ * the service tree.
+ */
 static inline void
 cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 {
@@ -648,8 +660,7 @@ static void cfq_remove_request(struct request *rq)
 	}
 }
 
-static int
-cfq_merge(request_queue_t *q, struct request **req, struct bio *bio)
+static int cfq_merge(request_queue_t *q, struct request **req, struct bio *bio)
 {
 	struct cfq_data *cfqd = q->elevator->elevator_data;
 	struct request *__rq;
@@ -775,6 +786,10 @@ static inline void cfq_slice_expired(struct cfq_data *cfqd, int preempted,
 		__cfq_slice_expired(cfqd, cfqq, preempted, timed_out);
 }
 
+/*
+ * Get next queue for service. Unless we have a queue preemption,
+ * we'll simply select the first cfqq in the service tree.
+ */
 static struct cfq_queue *cfq_get_next_queue(struct cfq_data *cfqd)
 {
 	struct cfq_queue *cfqq = NULL;
@@ -786,10 +801,11 @@ static struct cfq_queue *cfq_get_next_queue(struct cfq_data *cfqd)
 		cfqq = list_entry_cfqq(cfqd->cur_rr.next);
 	} else if (!RB_EMPTY_ROOT(&cfqd->service_tree.rb)) {
 		struct rb_node *n = cfq_rb_first(&cfqd->service_tree);
-		unsigned long end;
 
 		cfqq = rb_entry(n, struct cfq_queue, rb_node);
 		if (cfq_class_idle(cfqq)) {
+			unsigned long end;
+
 			/*
 			 * if we have idle queues and no rt or be queues had
 			 * pending requests, either allow immediate service if
@@ -807,6 +823,9 @@ static struct cfq_queue *cfq_get_next_queue(struct cfq_data *cfqd)
 	return cfqq;
 }
 
+/*
+ * Get and set a new active queue for service.
+ */
 static struct cfq_queue *cfq_set_active_queue(struct cfq_data *cfqd)
 {
 	struct cfq_queue *cfqq;
@@ -892,6 +911,9 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd)
 	mod_timer(&cfqd->idle_slice_timer, jiffies + sl);
 }
 
+/*
+ * Move request from internal lists to the request queue dispatch list.
+ */
 static void cfq_dispatch_insert(request_queue_t *q, struct request *rq)
 {
 	struct cfq_queue *cfqq = RQ_CFQQ(rq);
@@ -938,7 +960,8 @@ cfq_prio_to_maxrq(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 }
 
 /*
- * get next queue for service
+ * Select a queue for service. If we have a current active queue,
+ * check whether to continue servicing it, or retrieve and set a new one.
  */
 static struct cfq_queue *cfq_select_queue(struct cfq_data *cfqd)
 {
@@ -979,6 +1002,10 @@ keep_queue:
 	return cfqq;
 }
 
+/*
+ * Dispatch some requests from cfqq, moving them to the request queue
+ * dispatch list.
+ */
 static int
 __cfq_dispatch_requests(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 			int max_dispatch)
@@ -1053,6 +1080,10 @@ static int cfq_forced_dispatch_cfqqs(struct list_head *list)
 	return dispatched;
 }
 
+/*
+ * Drain our current requests. Used for barriers and when switching
+ * io schedulers on-the-fly.
+ */
 static int cfq_forced_dispatch(struct cfq_data *cfqd)
 {
 	int dispatched = 0;
@@ -1218,10 +1249,6 @@ static void __cfq_exit_single_io_context(struct cfq_data *cfqd,
 	}
 }
 
-
-/*
- * Called with interrupts disabled
- */
 static void cfq_exit_single_io_context(struct cfq_io_context *cic)
 {
 	struct cfq_data *cfqd = cic->key;
@@ -1235,6 +1262,10 @@ static void cfq_exit_single_io_context(struct cfq_io_context *cic)
 	}
 }
 
+/*
+ * The process that ioc belongs to has exited, we need to clean up
+ * and put the internal structures we have that belongs to that process.
+ */
 static void cfq_exit_io_context(struct io_context *ioc)
 {
 	struct cfq_io_context *__cic;
@@ -1421,6 +1452,9 @@ out:
 	return cfqq;
 }
 
+/*
+ * We drop cfq io contexts lazily, so we may find a dead one.
+ */
 static void
 cfq_drop_dead_cic(struct io_context *ioc, struct cfq_io_context *cic)
 {
-- 
1.5.1.1.190.g74474


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 9/15] cfq-iosched: slice offset should take ioprio into account
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
                   ` (7 preceding siblings ...)
  2007-04-24  8:15 ` [PATCH 8/15] cfq-iosched: style cleanups and comments Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-24  8:15 ` [PATCH 10/15] cfq-iosched: get rid of ->cur_rr and ->cfq_list Jens Axboe
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

Use max_slice - cur_slice as the multiplier for the insertion offset.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index f86ff4d..251131a 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -442,7 +442,8 @@ static unsigned long cfq_slice_offset(struct cfq_data *cfqd,
 	/*
 	 * just an approximation, should be ok.
 	 */
-	return ((cfqd->busy_queues - 1) * cfq_prio_slice(cfqd, 1, 0));
+	return (cfqd->busy_queues - 1) * (cfq_prio_slice(cfqd, 1, 0) -
+		       cfq_prio_slice(cfqd, cfq_cfqq_sync(cfqq), cfqq->ioprio));
 }
 
 /*
-- 
1.5.1.1.190.g74474


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 10/15] cfq-iosched: get rid of ->cur_rr and ->cfq_list
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
                   ` (8 preceding siblings ...)
  2007-04-24  8:15 ` [PATCH 9/15] cfq-iosched: slice offset should take ioprio into account Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-24  8:15 ` [PATCH 11/15] cfq-iosched: don't pass unused preemption variable around Jens Axboe
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

It's only used for preemption now that the IDLE and RT queues also
use the rbtree. If we pass an 'add_front' variable to
cfq_service_tree_add(), we can set ->rb_key to 0 to force insertion
at the front of the tree.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |   87 +++++++++++++++++++--------------------------------
 1 files changed, 32 insertions(+), 55 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 251131a..2d0e9c5 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -45,9 +45,6 @@ static int cfq_slice_idle = HZ / 125;
  */
 #define CFQ_QHASH_SHIFT		6
 #define CFQ_QHASH_ENTRIES	(1 << CFQ_QHASH_SHIFT)
-#define list_entry_qhash(entry)	hlist_entry((entry), struct cfq_queue, cfq_hash)
-
-#define list_entry_cfqq(ptr)	list_entry((ptr), struct cfq_queue, cfq_list)
 
 #define RQ_CIC(rq)		((struct cfq_io_context*)(rq)->elevator_private)
 #define RQ_CFQQ(rq)		((rq)->elevator_private2)
@@ -91,7 +88,6 @@ struct cfq_data {
 	 * rr list of queues with requests and the count of them
 	 */
 	struct cfq_rb_root service_tree;
-	struct list_head cur_rr;
 	unsigned int busy_queues;
 
 	/*
@@ -146,8 +142,6 @@ struct cfq_queue {
 	struct hlist_node cfq_hash;
 	/* hash key */
 	unsigned int key;
-	/* member of the rr/busy/cur/idle cfqd list */
-	struct list_head cfq_list;
 	/* service_tree member */
 	struct rb_node rb_node;
 	/* service_tree key */
@@ -452,16 +446,19 @@ static unsigned long cfq_slice_offset(struct cfq_data *cfqd,
  * we will service the queues.
  */
 static void cfq_service_tree_add(struct cfq_data *cfqd,
-				    struct cfq_queue *cfqq)
+				    struct cfq_queue *cfqq, int add_front)
 {
 	struct rb_node **p = &cfqd->service_tree.rb.rb_node;
 	struct rb_node *parent = NULL;
 	unsigned long rb_key;
 	int left;
 
-	rb_key = cfq_slice_offset(cfqd, cfqq) + jiffies;
-	rb_key += cfqq->slice_resid;
-	cfqq->slice_resid = 0;
+	if (!add_front) {
+		rb_key = cfq_slice_offset(cfqd, cfqq) + jiffies;
+		rb_key += cfqq->slice_resid;
+		cfqq->slice_resid = 0;
+	} else
+		rb_key = 0;
 
 	if (!RB_EMPTY_NODE(&cfqq->rb_node)) {
 		/*
@@ -516,13 +513,13 @@ static void cfq_service_tree_add(struct cfq_data *cfqd,
 /*
  * Update cfqq's position in the service tree.
  */
-static void cfq_resort_rr_list(struct cfq_queue *cfqq, int preempted)
+static void cfq_resort_rr_list(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 {
 	/*
 	 * Resorting requires the cfqq to be on the RR list already.
 	 */
 	if (cfq_cfqq_on_rr(cfqq))
-		cfq_service_tree_add(cfqq->cfqd, cfqq);
+		cfq_service_tree_add(cfqd, cfqq, 0);
 }
 
 /*
@@ -536,7 +533,7 @@ cfq_add_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 	cfq_mark_cfqq_on_rr(cfqq);
 	cfqd->busy_queues++;
 
-	cfq_resort_rr_list(cfqq, 0);
+	cfq_resort_rr_list(cfqd, cfqq);
 }
 
 /*
@@ -548,7 +545,6 @@ cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 {
 	BUG_ON(!cfq_cfqq_on_rr(cfqq));
 	cfq_clear_cfqq_on_rr(cfqq);
-	list_del_init(&cfqq->cfq_list);
 
 	if (!RB_EMPTY_NODE(&cfqq->rb_node))
 		cfq_rb_erase(&cfqq->rb_node, &cfqd->service_tree);
@@ -765,7 +761,7 @@ __cfq_slice_expired(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 	if (timed_out && !cfq_cfqq_slice_new(cfqq))
 		cfqq->slice_resid = cfqq->slice_end - jiffies;
 
-	cfq_resort_rr_list(cfqq, preempted);
+	cfq_resort_rr_list(cfqd, cfqq);
 
 	if (cfqq == cfqd->active_queue)
 		cfqd->active_queue = NULL;
@@ -793,31 +789,28 @@ static inline void cfq_slice_expired(struct cfq_data *cfqd, int preempted,
  */
 static struct cfq_queue *cfq_get_next_queue(struct cfq_data *cfqd)
 {
-	struct cfq_queue *cfqq = NULL;
+	struct cfq_queue *cfqq;
+	struct rb_node *n;
 
-	if (!list_empty(&cfqd->cur_rr)) {
-		/*
-		 * if current list is non-empty, grab first entry.
-		 */
-		cfqq = list_entry_cfqq(cfqd->cur_rr.next);
-	} else if (!RB_EMPTY_ROOT(&cfqd->service_tree.rb)) {
-		struct rb_node *n = cfq_rb_first(&cfqd->service_tree);
+	if (RB_EMPTY_ROOT(&cfqd->service_tree.rb))
+		return NULL;
 
-		cfqq = rb_entry(n, struct cfq_queue, rb_node);
-		if (cfq_class_idle(cfqq)) {
-			unsigned long end;
+	n = cfq_rb_first(&cfqd->service_tree);
+	cfqq = rb_entry(n, struct cfq_queue, rb_node);
 
-			/*
-			 * if we have idle queues and no rt or be queues had
-			 * pending requests, either allow immediate service if
-			 * the grace period has passed or arm the idle grace
-			 * timer
-			 */
-			end = cfqd->last_end_request + CFQ_IDLE_GRACE;
-			if (time_before(jiffies, end)) {
-				mod_timer(&cfqd->idle_class_timer, end);
-				cfqq = NULL;
-			}
+	if (cfq_class_idle(cfqq)) {
+		unsigned long end;
+
+		/*
+		 * if we have idle queues and no rt or be queues had
+		 * pending requests, either allow immediate service if
+		 * the grace period has passed or arm the idle grace
+		 * timer
+		 */
+		end = cfqd->last_end_request + CFQ_IDLE_GRACE;
+		if (time_before(jiffies, end)) {
+			mod_timer(&cfqd->idle_class_timer, end);
+			cfqq = NULL;
 		}
 	}
 
@@ -1069,18 +1062,6 @@ static inline int __cfq_forced_dispatch_cfqq(struct cfq_queue *cfqq)
 	return dispatched;
 }
 
-static int cfq_forced_dispatch_cfqqs(struct list_head *list)
-{
-	struct cfq_queue *cfqq, *next;
-	int dispatched;
-
-	dispatched = 0;
-	list_for_each_entry_safe(cfqq, next, list, cfq_list)
-		dispatched += __cfq_forced_dispatch_cfqq(cfqq);
-
-	return dispatched;
-}
-
 /*
  * Drain our current requests. Used for barriers and when switching
  * io schedulers on-the-fly.
@@ -1096,8 +1077,6 @@ static int cfq_forced_dispatch(struct cfq_data *cfqd)
 		dispatched += __cfq_forced_dispatch_cfqq(cfqq);
 	}
 
-	dispatched += cfq_forced_dispatch_cfqqs(&cfqd->cur_rr);
-
 	cfq_slice_expired(cfqd, 0, 0);
 
 	BUG_ON(cfqd->busy_queues);
@@ -1427,7 +1406,6 @@ retry:
 		memset(cfqq, 0, sizeof(*cfqq));
 
 		INIT_HLIST_NODE(&cfqq->cfq_hash);
-		INIT_LIST_HEAD(&cfqq->cfq_list);
 		RB_CLEAR_NODE(&cfqq->rb_node);
 		INIT_LIST_HEAD(&cfqq->fifo);
 
@@ -1706,8 +1684,8 @@ static void cfq_preempt_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 	 * so we know that it will be selected next.
 	 */
 	BUG_ON(!cfq_cfqq_on_rr(cfqq));
-	list_del_init(&cfqq->cfq_list);
-	list_add(&cfqq->cfq_list, &cfqd->cur_rr);
+
+	cfq_service_tree_add(cfqd, cfqq, 1);
 
 	cfqq->slice_end = 0;
 	cfq_mark_cfqq_slice_new(cfqq);
@@ -2077,7 +2055,6 @@ static void *cfq_init_queue(request_queue_t *q)
 	memset(cfqd, 0, sizeof(*cfqd));
 
 	cfqd->service_tree = CFQ_RB_ROOT;
-	INIT_LIST_HEAD(&cfqd->cur_rr);
 	INIT_LIST_HEAD(&cfqd->cic_list);
 
 	cfqd->cfq_hash = kmalloc_node(sizeof(struct hlist_head) * CFQ_QHASH_ENTRIES, GFP_KERNEL, q->node);
-- 
1.5.1.1.190.g74474


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 11/15] cfq-iosched: don't pass unused preemption variable around
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
                   ` (9 preceding siblings ...)
  2007-04-24  8:15 ` [PATCH 10/15] cfq-iosched: get rid of ->cur_rr and ->cfq_list Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-24  8:15 ` [PATCH 12/15] cfq-iosched: get rid of ->dispatch_slice Jens Axboe
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

We don't use it anymore in the slice expiry handling.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |   28 +++++++++++++---------------
 1 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 2d0e9c5..b680002 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -746,7 +746,7 @@ __cfq_set_active_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
  */
 static void
 __cfq_slice_expired(struct cfq_data *cfqd, struct cfq_queue *cfqq,
-		    int preempted, int timed_out)
+		    int timed_out)
 {
 	if (cfq_cfqq_wait_request(cfqq))
 		del_timer(&cfqd->idle_slice_timer);
@@ -755,8 +755,7 @@ __cfq_slice_expired(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 	cfq_clear_cfqq_wait_request(cfqq);
 
 	/*
-	 * store what was left of this slice, if the queue idled out
-	 * or was preempted
+	 * store what was left of this slice, if the queue idled/timed out
 	 */
 	if (timed_out && !cfq_cfqq_slice_new(cfqq))
 		cfqq->slice_resid = cfqq->slice_end - jiffies;
@@ -774,13 +773,12 @@ __cfq_slice_expired(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 	cfqd->dispatch_slice = 0;
 }
 
-static inline void cfq_slice_expired(struct cfq_data *cfqd, int preempted,
-				     int timed_out)
+static inline void cfq_slice_expired(struct cfq_data *cfqd, int timed_out)
 {
 	struct cfq_queue *cfqq = cfqd->active_queue;
 
 	if (cfqq)
-		__cfq_slice_expired(cfqd, cfqq, preempted, timed_out);
+		__cfq_slice_expired(cfqd, cfqq, timed_out);
 }
 
 /*
@@ -989,7 +987,7 @@ static struct cfq_queue *cfq_select_queue(struct cfq_data *cfqd)
 	}
 
 expire:
-	cfq_slice_expired(cfqd, 0, 0);
+	cfq_slice_expired(cfqd, 0);
 new_queue:
 	cfqq = cfq_set_active_queue(cfqd);
 keep_queue:
@@ -1043,7 +1041,7 @@ __cfq_dispatch_requests(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 	    cfqd->dispatch_slice >= cfq_prio_to_maxrq(cfqd, cfqq)) ||
 	    cfq_class_idle(cfqq))) {
 		cfqq->slice_end = jiffies + 1;
-		cfq_slice_expired(cfqd, 0, 0);
+		cfq_slice_expired(cfqd, 0);
 	}
 
 	return dispatched;
@@ -1077,7 +1075,7 @@ static int cfq_forced_dispatch(struct cfq_data *cfqd)
 		dispatched += __cfq_forced_dispatch_cfqq(cfqq);
 	}
 
-	cfq_slice_expired(cfqd, 0, 0);
+	cfq_slice_expired(cfqd, 0);
 
 	BUG_ON(cfqd->busy_queues);
 
@@ -1147,7 +1145,7 @@ static void cfq_put_queue(struct cfq_queue *cfqq)
 	BUG_ON(cfq_cfqq_on_rr(cfqq));
 
 	if (unlikely(cfqd->active_queue == cfqq)) {
-		__cfq_slice_expired(cfqd, cfqq, 0, 0);
+		__cfq_slice_expired(cfqd, cfqq, 0);
 		cfq_schedule_dispatch(cfqd);
 	}
 
@@ -1204,7 +1202,7 @@ static void cfq_free_io_context(struct io_context *ioc)
 static void cfq_exit_cfqq(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 {
 	if (unlikely(cfqq == cfqd->active_queue)) {
-		__cfq_slice_expired(cfqd, cfqq, 0, 0);
+		__cfq_slice_expired(cfqd, cfqq, 0);
 		cfq_schedule_dispatch(cfqd);
 	}
 
@@ -1677,7 +1675,7 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
  */
 static void cfq_preempt_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 {
-	cfq_slice_expired(cfqd, 1, 1);
+	cfq_slice_expired(cfqd, 1);
 
 	/*
 	 * Put the new queue at the front of the of the current list,
@@ -1784,7 +1782,7 @@ static void cfq_completed_request(request_queue_t *q, struct request *rq)
 			cfq_clear_cfqq_slice_new(cfqq);
 		}
 		if (cfq_slice_used(cfqq))
-			cfq_slice_expired(cfqd, 0, 1);
+			cfq_slice_expired(cfqd, 1);
 		else if (sync && RB_EMPTY_ROOT(&cfqq->sort_list))
 			cfq_arm_slice_timer(cfqd);
 	}
@@ -1979,7 +1977,7 @@ static void cfq_idle_slice_timer(unsigned long data)
 		}
 	}
 expire:
-	cfq_slice_expired(cfqd, 0, timed_out);
+	cfq_slice_expired(cfqd, timed_out);
 out_kick:
 	cfq_schedule_dispatch(cfqd);
 out_cont:
@@ -2025,7 +2023,7 @@ static void cfq_exit_queue(elevator_t *e)
 	spin_lock_irq(q->queue_lock);
 
 	if (cfqd->active_queue)
-		__cfq_slice_expired(cfqd, cfqd->active_queue, 0, 0);
+		__cfq_slice_expired(cfqd, cfqd->active_queue, 0);
 
 	while (!list_empty(&cfqd->cic_list)) {
 		struct cfq_io_context *cic = list_entry(cfqd->cic_list.next,
-- 
1.5.1.1.190.g74474


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 12/15] cfq-iosched: get rid of ->dispatch_slice
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
                   ` (10 preceding siblings ...)
  2007-04-24  8:15 ` [PATCH 11/15] cfq-iosched: don't pass unused preemption variable around Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-24  8:15 ` [PATCH 13/15] cfq-iosched: never allow an async queue idling Jens Axboe
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

We can track it fairly accurately locally; let the slice handling
take care of the rest.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index b680002..8f76aed 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -106,7 +106,6 @@ struct cfq_data {
 
 	struct cfq_queue *active_queue;
 	struct cfq_io_context *active_cic;
-	unsigned int dispatch_slice;
 
 	struct timer_list idle_class_timer;
 
@@ -769,8 +768,6 @@ __cfq_slice_expired(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 		put_io_context(cfqd->active_cic->ioc);
 		cfqd->active_cic = NULL;
 	}
-
-	cfqd->dispatch_slice = 0;
 }
 
 static inline void cfq_slice_expired(struct cfq_data *cfqd, int timed_out)
@@ -1020,7 +1017,6 @@ __cfq_dispatch_requests(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 		 */
 		cfq_dispatch_insert(cfqd->queue, rq);
 
-		cfqd->dispatch_slice++;
 		dispatched++;
 
 		if (!cfqd->active_cic) {
@@ -1038,7 +1034,7 @@ __cfq_dispatch_requests(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 	 * queue always expire after 1 dispatch round.
 	 */
 	if (cfqd->busy_queues > 1 && ((!cfq_cfqq_sync(cfqq) &&
-	    cfqd->dispatch_slice >= cfq_prio_to_maxrq(cfqd, cfqq)) ||
+	    dispatched >= cfq_prio_to_maxrq(cfqd, cfqq)) ||
 	    cfq_class_idle(cfqq))) {
 		cfqq->slice_end = jiffies + 1;
 		cfq_slice_expired(cfqd, 0);
-- 
1.5.1.1.190.g74474


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 13/15] cfq-iosched: never allow an async queue idling
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
                   ` (11 preceding siblings ...)
  2007-04-24  8:15 ` [PATCH 12/15] cfq-iosched: get rid of ->dispatch_slice Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-24  8:15 ` [PATCH 14/15] cfq-iosched: improve sync vs async workloads Jens Axboe
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

We don't enable it by default, so don't let it get enabled at
runtime either.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 8f76aed..f920527 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1597,7 +1597,12 @@ static void
 cfq_update_idle_window(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 		       struct cfq_io_context *cic)
 {
-	int enable_idle = cfq_cfqq_idle_window(cfqq);
+	int enable_idle;
+
+	if (!cfq_cfqq_sync(cfqq))
+		return;
+
+	enable_idle = cfq_cfqq_idle_window(cfqq);
 
 	if (!cic->ioc->task || !cfqd->cfq_slice_idle ||
 	    (cfqd->hw_tag && CIC_SEEKY(cic)))
-- 
1.5.1.1.190.g74474


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 14/15] cfq-iosched: improve sync vs async workloads
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
                   ` (12 preceding siblings ...)
  2007-04-24  8:15 ` [PATCH 13/15] cfq-iosched: never allow an async queue idling Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-24  8:15 ` [PATCH 15/15] cfq-iosched: tighten queue request overlap condition Jens Axboe
  2007-04-25 14:42 ` [PATCH 0/15] CFQ IO scheduler patch series Alan D. Brunelle
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |   31 ++++++++++++++++++-------------
 1 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index f920527..772df89 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -96,6 +96,7 @@ struct cfq_data {
 	struct hlist_head *cfq_hash;
 
 	int rq_in_driver;
+	int sync_flight;
 	int hw_tag;
 
 	/*
@@ -905,11 +906,15 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd)
  */
 static void cfq_dispatch_insert(request_queue_t *q, struct request *rq)
 {
+	struct cfq_data *cfqd = q->elevator->elevator_data;
 	struct cfq_queue *cfqq = RQ_CFQQ(rq);
 
 	cfq_remove_request(rq);
 	cfqq->dispatched++;
 	elv_dispatch_sort(q, rq);
+
+	if (cfq_cfqq_sync(cfqq))
+		cfqd->sync_flight++;
 }
 
 /*
@@ -1094,27 +1099,24 @@ static int cfq_dispatch_requests(request_queue_t *q, int force)
 	while ((cfqq = cfq_select_queue(cfqd)) != NULL) {
 		int max_dispatch;
 
-		if (cfqd->busy_queues > 1) {
-			/*
-			 * So we have dispatched before in this round, if the
-			 * next queue has idling enabled (must be sync), don't
-			 * allow it service until the previous have completed.
-			 */
-			if (cfqd->rq_in_driver && cfq_cfqq_idle_window(cfqq) &&
-			    dispatched)
+		max_dispatch = cfqd->cfq_quantum;
+		if (cfq_class_idle(cfqq))
+			max_dispatch = 1;
+
+		if (cfqq->dispatched >= max_dispatch) {
+			if (cfqd->busy_queues > 1)
 				break;
-			if (cfqq->dispatched >= cfqd->cfq_quantum)
+			if (cfqq->dispatched >= 4 * max_dispatch)
 				break;
 		}
 
+		if (cfqd->sync_flight && !cfq_cfqq_sync(cfqq))
+			break;
+
 		cfq_clear_cfqq_must_dispatch(cfqq);
 		cfq_clear_cfqq_wait_request(cfqq);
 		del_timer(&cfqd->idle_slice_timer);
 
-		max_dispatch = cfqd->cfq_quantum;
-		if (cfq_class_idle(cfqq))
-			max_dispatch = 1;
-
 		dispatched += __cfq_dispatch_requests(cfqd, cfqq, max_dispatch);
 	}
 
@@ -1767,6 +1769,9 @@ static void cfq_completed_request(request_queue_t *q, struct request *rq)
 	cfqd->rq_in_driver--;
 	cfqq->dispatched--;
 
+	if (cfq_cfqq_sync(cfqq))
+		cfqd->sync_flight--;
+
 	if (!cfq_class_idle(cfqq))
 		cfqd->last_end_request = now;
 
-- 
1.5.1.1.190.g74474


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 15/15] cfq-iosched: tighten queue request overlap condition
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
                   ` (13 preceding siblings ...)
  2007-04-24  8:15 ` [PATCH 14/15] cfq-iosched: improve sync vs async workloads Jens Axboe
@ 2007-04-24  8:15 ` Jens Axboe
  2007-04-25 14:42 ` [PATCH 0/15] CFQ IO scheduler patch series Alan D. Brunelle
  15 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-24  8:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

For tagged devices, allow overlap of requests if the idle window
isn't enabled on the current active queue.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
 block/cfq-iosched.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 772df89..8093733 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -983,7 +983,8 @@ static struct cfq_queue *cfq_select_queue(struct cfq_data *cfqd)
 	 * flight or is idling for a new request, allow either of these
 	 * conditions to happen (or time out) before selecting a new queue.
 	 */
-	if (cfqq->dispatched || timer_pending(&cfqd->idle_slice_timer)) {
+	if (timer_pending(&cfqd->idle_slice_timer) ||
+	    (cfqq->dispatched && cfq_cfqq_idle_window(cfqq))) {
 		cfqq = NULL;
 		goto keep_queue;
 	}
-- 
1.5.1.1.190.g74474


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/15] CFQ IO scheduler patch series
  2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
                   ` (14 preceding siblings ...)
  2007-04-24  8:15 ` [PATCH 15/15] cfq-iosched: tighten queue request overlap condition Jens Axboe
@ 2007-04-25 14:42 ` Alan D. Brunelle
  15 siblings, 0 replies; 23+ messages in thread
From: Alan D. Brunelle @ 2007-04-25 14:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

Using the patches posted yesterday 
(http://marc.info/?l=linux-kernel&m=117740312628325&w=2), here are some 
quick read results (as measured by iostat over a 5-minute period, taken 
in 6 second intervals) on a 4-way IA64 box with 42 disks (24 FC and 18 
U320), 42 processes (1 per disk) with 256 AIOs (16KB) outstanding at all 
times per device:

2.6.21-rc7:                        1,006.023 MB/second
2.6.21-rc7 + new CFQ IO scheduler: 1,030.767 MB/second

showing about a 2.46% performance improvement with a 2.43% increase in 
%system used (3.738% -> 3.829%).

Interestingly enough this patch also seems to remove some noise during 
the run - see the chart at http://free.linux.hp.com/~adb/cfq/rkb_s.png

Alan D. Brunelle
HP / Open Source and Linux Organization / Scalability and Performance Group


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/15] cfq-iosched: speed up rbtree handling
  2007-04-24  8:15 ` [PATCH 5/15] cfq-iosched: speed up rbtree handling Jens Axboe
@ 2007-04-25 15:59   ` Alan D. Brunelle
  2007-04-25 17:15     ` Jens Axboe
  0 siblings, 1 reply; 23+ messages in thread
From: Alan D. Brunelle @ 2007-04-25 15:59 UTC (permalink / raw)
  To: Jens Axboe, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 198 bytes --]

Hi Jens -

The attached patch speeds it up even more - I'm finding a >9% reduction 
in %system with no loss in IO performance. This just sets the cached 
element when the first is looked for.

Alan

[-- Attachment #2: fix-left --]
[-- Type: text/plain, Size: 776 bytes --]

From: Alan D. Brunelle <Alan.Brunelle@hp.com>

Update cached leftmost every time it is found.

Signed-off-by: Alan D. Brunelle <Alan.Brunelle@hp.com>
---

 block/cfq-iosched.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 8093733..a86a7c3 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -388,10 +388,10 @@ cfq_choose_req(struct cfq_data *cfqd, struct request *rq1, struct request *rq2)
  */
 static struct rb_node *cfq_rb_first(struct cfq_rb_root *root)
 {
-	if (root->left)
-		return root->left;
+	if (!root->left)
+		root->left = rb_first(&root->rb);
 
-	return rb_first(&root->rb);
+	return root->left;
 }
 
 static void cfq_rb_erase(struct rb_node *n, struct cfq_rb_root *root)

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/15] cfq-iosched: speed up rbtree handling
  2007-04-25 15:59   ` Alan D. Brunelle
@ 2007-04-25 17:15     ` Jens Axboe
  2007-04-25 17:50       ` Jens Axboe
  0 siblings, 1 reply; 23+ messages in thread
From: Jens Axboe @ 2007-04-25 17:15 UTC (permalink / raw)
  To: Alan.Brunelle; +Cc: linux-kernel

On Wed, Apr 25 2007, Alan D. Brunelle wrote:
> Hi Jens -
> 
> The attached patch speeds it up even more - I'm finding a >9% reduction 
> in %system with no loss in IO performance. This just sets the cached 
> element when the first is looked for.

Interesting, good thinking. It should not change the IO pattern, as the
end result should be the same. Thanks Alan, will commit!

I'll give elevator.c the same treatment, should be even more beneficial.
Stay tuned for a test patch.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/15] cfq-iosched: speed up rbtree handling
  2007-04-25 17:15     ` Jens Axboe
@ 2007-04-25 17:50       ` Jens Axboe
  2007-04-25 18:08         ` Jens Axboe
  0 siblings, 1 reply; 23+ messages in thread
From: Jens Axboe @ 2007-04-25 17:50 UTC (permalink / raw)
  To: Alan.Brunelle; +Cc: linux-kernel

On Wed, Apr 25 2007, Jens Axboe wrote:
> On Wed, Apr 25 2007, Alan D. Brunelle wrote:
> > Hi Jens -
> > 
> > The attached patch speeds it up even more - I'm finding a >9% reduction 
> > in %system with no loss in IO performance. This just sets the cached 
> > element when the first is looked for.
> 
> Interesting, good thinking. It should not change the IO pattern, as the
> end result should be the same. Thanks Alan, will commit!
> 
> I'll give elevator.c the same treatment, should be even more beneficial.
> Stay tuned for a test patch.

Something like this, totally untested (it compiles). I initially wanted
to fold the cfq addon into the elevator.h provided implementation, but
that requires more extensive changes. Given how little code it is, I
think I'll keep them separate.

diff --git a/block/as-iosched.c b/block/as-iosched.c
index ef12627..37c6fb3 100644
--- a/block/as-iosched.c
+++ b/block/as-iosched.c
@@ -89,7 +89,7 @@ struct as_data {
 	/*
 	 * requests (as_rq s) are present on both sort_list and fifo_list
 	 */
-	struct rb_root sort_list[2];
+	struct blk_rb_root sort_list[2];
 	struct list_head fifo_list[2];
 
 	struct request *next_rq[2];	/* next in sort order */
@@ -369,9 +369,9 @@ as_find_next_rq(struct as_data *ad, struct request *last)
 	else {
 		const int data_dir = rq_is_sync(last);
 
-		rbnext = rb_first(&ad->sort_list[data_dir]);
-		if (rbnext && rbnext != &last->rb_node)
-			next = rb_entry_rq(rbnext);
+		next = elv_rb_first(&ad->sort_list[data_dir]);
+		if (next == last)
+			next = NULL;
 	}
 
 	return as_choose_req(ad, next, prev);
@@ -1057,7 +1057,7 @@ static int as_dispatch_request(request_queue_t *q, int force)
 	 */
 
 	if (reads) {
-		BUG_ON(RB_EMPTY_ROOT(&ad->sort_list[REQ_SYNC]));
+		BUG_ON(BLK_RB_EMPTY_ROOT(&ad->sort_list[REQ_SYNC]));
 
 		if (writes && ad->batch_data_dir == REQ_SYNC)
 			/*
@@ -1081,7 +1081,7 @@ static int as_dispatch_request(request_queue_t *q, int force)
 
 	if (writes) {
 dispatch_writes:
-		BUG_ON(RB_EMPTY_ROOT(&ad->sort_list[REQ_ASYNC]));
+		BUG_ON(BLK_RB_EMPTY_ROOT(&ad->sort_list[REQ_ASYNC]));
 
 		if (ad->batch_data_dir == REQ_SYNC) {
 			ad->changed_batch = 1;
@@ -1337,8 +1337,8 @@ static void *as_init_queue(request_queue_t *q)
 
 	INIT_LIST_HEAD(&ad->fifo_list[REQ_SYNC]);
 	INIT_LIST_HEAD(&ad->fifo_list[REQ_ASYNC]);
-	ad->sort_list[REQ_SYNC] = RB_ROOT;
-	ad->sort_list[REQ_ASYNC] = RB_ROOT;
+	ad->sort_list[REQ_SYNC] = BLK_RB_ROOT;
+	ad->sort_list[REQ_ASYNC] = BLK_RB_ROOT;
 	ad->fifo_expire[REQ_SYNC] = default_read_expire;
 	ad->fifo_expire[REQ_ASYNC] = default_write_expire;
 	ad->antic_expire = default_antic_expire;
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 7a18f31..f3020aa 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -127,7 +127,7 @@ struct cfq_queue {
 	/* service_tree key */
 	unsigned long rb_key;
 	/* sorted list of pending requests */
-	struct rb_root sort_list;
+	struct blk_rb_root sort_list;
 	/* if fifo isn't expired, next request to serve */
 	struct request *next_rq;
 	/* requests queued in sort_list */
@@ -419,9 +419,9 @@ cfq_find_next_rq(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 	if (rbnext)
 		next = rb_entry_rq(rbnext);
 	else {
-		rbnext = rb_first(&cfqq->sort_list);
-		if (rbnext && rbnext != &last->rb_node)
-			next = rb_entry_rq(rbnext);
+		next = elv_rb_first(&cfqq->sort_list);
+		if (next == last)
+			next = NULL;
 	}
 
 	return cfq_choose_req(cfqd, next, prev);
@@ -564,7 +564,7 @@ static inline void cfq_del_rq_rb(struct request *rq)
 
 	elv_rb_del(&cfqq->sort_list, rq);
 
-	if (cfq_cfqq_on_rr(cfqq) && RB_EMPTY_ROOT(&cfqq->sort_list))
+	if (cfq_cfqq_on_rr(cfqq) && BLK_RB_EMPTY_ROOT(&cfqq->sort_list))
 		cfq_del_cfqq_rr(cfqd, cfqq);
 }
 
@@ -871,7 +871,7 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd)
 	struct cfq_io_context *cic;
 	unsigned long sl;
 
-	WARN_ON(!RB_EMPTY_ROOT(&cfqq->sort_list));
+	WARN_ON(!BLK_RB_EMPTY_ROOT(&cfqq->sort_list));
 	WARN_ON(cfq_cfqq_slice_new(cfqq));
 
 	/*
@@ -983,7 +983,7 @@ static struct cfq_queue *cfq_select_queue(struct cfq_data *cfqd)
 	 * The active queue has requests and isn't expired, allow it to
 	 * dispatch.
 	 */
-	if (!RB_EMPTY_ROOT(&cfqq->sort_list))
+	if (!BLK_RB_EMPTY_ROOT(&cfqq->sort_list))
 		goto keep_queue;
 
 	/*
@@ -1015,7 +1015,7 @@ __cfq_dispatch_requests(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 {
 	int dispatched = 0;
 
-	BUG_ON(RB_EMPTY_ROOT(&cfqq->sort_list));
+	BUG_ON(BLK_RB_EMPTY_ROOT(&cfqq->sort_list));
 
 	do {
 		struct request *rq;
@@ -1038,7 +1038,7 @@ __cfq_dispatch_requests(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 			cfqd->active_cic = RQ_CIC(rq);
 		}
 
-		if (RB_EMPTY_ROOT(&cfqq->sort_list))
+		if (BLK_RB_EMPTY_ROOT(&cfqq->sort_list))
 			break;
 
 	} while (dispatched < max_dispatch);
@@ -1147,7 +1147,7 @@ static void cfq_put_queue(struct cfq_queue *cfqq)
 	if (!atomic_dec_and_test(&cfqq->ref))
 		return;
 
-	BUG_ON(rb_first(&cfqq->sort_list));
+	BUG_ON(elv_rb_first(&cfqq->sort_list));
 	BUG_ON(cfqq->allocated[READ] + cfqq->allocated[WRITE]);
 	BUG_ON(cfq_cfqq_on_rr(cfqq));
 
@@ -1385,6 +1385,7 @@ retry:
 
 		memset(cfqq, 0, sizeof(*cfqq));
 
+		cfqq->sort_list = BLK_RB_ROOT;
 		RB_CLEAR_NODE(&cfqq->rb_node);
 		INIT_LIST_HEAD(&cfqq->fifo);
 
@@ -1783,7 +1784,7 @@ static void cfq_completed_request(request_queue_t *q, struct request *rq)
 		}
 		if (cfq_slice_used(cfqq))
 			cfq_slice_expired(cfqd, 1);
-		else if (sync && RB_EMPTY_ROOT(&cfqq->sort_list))
+		else if (sync && BLK_RB_EMPTY_ROOT(&cfqq->sort_list))
 			cfq_arm_slice_timer(cfqd);
 	}
 
@@ -1973,7 +1974,7 @@ static void cfq_idle_slice_timer(unsigned long data)
 		/*
 		 * not expired and it has a request pending, let it dispatch
 		 */
-		if (!RB_EMPTY_ROOT(&cfqq->sort_list)) {
+		if (!BLK_RB_EMPTY_ROOT(&cfqq->sort_list)) {
 			cfq_mark_cfqq_must_dispatch(cfqq);
 			goto out_kick;
 		}
diff --git a/block/deadline-iosched.c b/block/deadline-iosched.c
index 6d673e9..8db8709 100644
--- a/block/deadline-iosched.c
+++ b/block/deadline-iosched.c
@@ -31,7 +31,7 @@ struct deadline_data {
 	/*
 	 * requests (deadline_rq s) are present on both sort_list and fifo_list
 	 */
-	struct rb_root sort_list[2];	
+	struct blk_rb_root sort_list[2];	
 	struct list_head fifo_list[2];
 	
 	/*
@@ -58,11 +58,10 @@ static void deadline_move_request(struct deadline_data *, struct request *);
 static void
 deadline_add_rq_rb(struct deadline_data *dd, struct request *rq)
 {
-	struct rb_root *root = RQ_RB_ROOT(dd, rq);
 	struct request *__alias;
 
 retry:
-	__alias = elv_rb_add(root, rq);
+	__alias = elv_rb_add(RQ_RB_ROOT(dd, rq), rq);
 	if (unlikely(__alias)) {
 		deadline_move_request(dd, __alias);
 		goto retry;
@@ -270,7 +269,7 @@ static int deadline_dispatch_requests(request_queue_t *q, int force)
 	 */
 
 	if (reads) {
-		BUG_ON(RB_EMPTY_ROOT(&dd->sort_list[READ]));
+		BUG_ON(BLK_RB_EMPTY_ROOT(&dd->sort_list[READ]));
 
 		if (writes && (dd->starved++ >= dd->writes_starved))
 			goto dispatch_writes;
@@ -286,7 +285,7 @@ static int deadline_dispatch_requests(request_queue_t *q, int force)
 
 	if (writes) {
 dispatch_writes:
-		BUG_ON(RB_EMPTY_ROOT(&dd->sort_list[WRITE]));
+		BUG_ON(BLK_RB_EMPTY_ROOT(&dd->sort_list[WRITE]));
 
 		dd->starved = 0;
 
@@ -313,16 +312,13 @@ dispatch_find_request:
 		 */
 		rq = dd->next_rq[data_dir];
 	} else {
-		struct rb_node *node;
 		/*
 		 * The last req was the other direction or we have run out of
 		 * higher-sectored requests. Go back to the lowest sectored
 		 * request (1 way elevator) and start a new batch.
 		 */
 		dd->batching = 0;
-		node = rb_first(&dd->sort_list[data_dir]);
-		if (node)
-			rq = rb_entry_rq(node);
+		rq = elv_rb_first(&dd->sort_list[data_dir]);
 	}
 
 dispatch_request:
@@ -367,8 +363,8 @@ static void *deadline_init_queue(request_queue_t *q)
 
 	INIT_LIST_HEAD(&dd->fifo_list[READ]);
 	INIT_LIST_HEAD(&dd->fifo_list[WRITE]);
-	dd->sort_list[READ] = RB_ROOT;
-	dd->sort_list[WRITE] = RB_ROOT;
+	dd->sort_list[READ] = BLK_RB_ROOT;
+	dd->sort_list[WRITE] = BLK_RB_ROOT;
 	dd->fifo_expire[READ] = read_expire;
 	dd->fifo_expire[WRITE] = write_expire;
 	dd->writes_starved = writes_starved;
diff --git a/block/elevator.c b/block/elevator.c
index 96a00c8..8f1c287 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -336,11 +336,12 @@ static struct request *elv_rqhash_find(request_queue_t *q, sector_t offset)
  * RB-tree support functions for inserting/lookup/removal of requests
  * in a sorted RB tree.
  */
-struct request *elv_rb_add(struct rb_root *root, struct request *rq)
+struct request *elv_rb_add(struct blk_rb_root *root, struct request *rq)
 {
-	struct rb_node **p = &root->rb_node;
+	struct rb_node **p = &root->tree.rb_node;
 	struct rb_node *parent = NULL;
 	struct request *__rq;
+	int left = 1;
 
 	while (*p) {
 		parent = *p;
@@ -348,31 +349,38 @@ struct request *elv_rb_add(struct rb_root *root, struct request *rq)
 
 		if (rq->sector < __rq->sector)
 			p = &(*p)->rb_left;
-		else if (rq->sector > __rq->sector)
+		else if (rq->sector > __rq->sector) {
 			p = &(*p)->rb_right;
-		else
+			left = 0;
+		} else
 			return __rq;
 	}
 
+	if (left)
+		root->left = &rq->rb_node;
+
 	rb_link_node(&rq->rb_node, parent, p);
-	rb_insert_color(&rq->rb_node, root);
+	rb_insert_color(&rq->rb_node, &root->tree);
 	return NULL;
 }
 
 EXPORT_SYMBOL(elv_rb_add);
 
-void elv_rb_del(struct rb_root *root, struct request *rq)
+void elv_rb_del(struct blk_rb_root *root, struct request *rq)
 {
+	if (root->left == &rq->rb_node)
+		root->left = NULL;
+
 	BUG_ON(RB_EMPTY_NODE(&rq->rb_node));
-	rb_erase(&rq->rb_node, root);
+	rb_erase(&rq->rb_node, &root->tree);
 	RB_CLEAR_NODE(&rq->rb_node);
 }
 
 EXPORT_SYMBOL(elv_rb_del);
 
-struct request *elv_rb_find(struct rb_root *root, sector_t sector)
+struct request *elv_rb_find(struct blk_rb_root *root, sector_t sector)
 {
-	struct rb_node *n = root->rb_node;
+	struct rb_node *n = root->tree.rb_node;
 	struct request *rq;
 
 	while (n) {
@@ -391,6 +399,18 @@ struct request *elv_rb_find(struct rb_root *root, sector_t sector)
 
 EXPORT_SYMBOL(elv_rb_find);
 
+struct request *elv_rb_first(struct blk_rb_root *root)
+{
+	if (!root->left)
+		root->left = rb_first(&root->tree);
+	if (root->left)
+		return rb_entry_rq(root->left);
+
+	return NULL;
+}
+
+EXPORT_SYMBOL(elv_rb_first);
+
 /*
  * Insert rq into dispatch queue of q.  Queue lock must be held on
  * entry.  rq is sort insted into the dispatch queue. To be used by
diff --git a/include/linux/elevator.h b/include/linux/elevator.h
index e88fcbc..28cc010 100644
--- a/include/linux/elevator.h
+++ b/include/linux/elevator.h
@@ -139,11 +139,22 @@ extern struct request *elv_rb_former_request(request_queue_t *, struct request *
 extern struct request *elv_rb_latter_request(request_queue_t *, struct request *);
 
 /*
- * rb support functions.
+ * rb support functions below. we have a private rb_root setup where we
+ * cache the leftmost node in the tree, for fastest min element retrieval.
+ * this is the main purpose of the rbtree (sort and min extraction).
  */
-extern struct request *elv_rb_add(struct rb_root *, struct request *);
-extern void elv_rb_del(struct rb_root *, struct request *);
-extern struct request *elv_rb_find(struct rb_root *, sector_t);
+struct blk_rb_root {
+	struct rb_root tree;
+	struct rb_node *left;
+};
+
+#define BLK_RB_ROOT	(struct blk_rb_root) { RB_ROOT, NULL, }
+#define BLK_RB_EMPTY_ROOT(root)	RB_EMPTY_ROOT(&((root)->tree))
+
+extern struct request *elv_rb_add(struct blk_rb_root *, struct request *);
+extern void elv_rb_del(struct blk_rb_root *, struct request *);
+extern struct request *elv_rb_find(struct blk_rb_root *, sector_t);
+extern struct request *elv_rb_first(struct blk_rb_root *);
 
 /*
  * Return values from elevator merger

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/15] cfq-iosched: speed up rbtree handling
  2007-04-25 17:50       ` Jens Axboe
@ 2007-04-25 18:08         ` Jens Axboe
  2007-04-26 14:28           ` Alan D. Brunelle
  0 siblings, 1 reply; 23+ messages in thread
From: Jens Axboe @ 2007-04-25 18:08 UTC (permalink / raw)
  To: Alan.Brunelle; +Cc: linux-kernel

On Wed, Apr 25 2007, Jens Axboe wrote:
> On Wed, Apr 25 2007, Jens Axboe wrote:
> > On Wed, Apr 25 2007, Alan D. Brunelle wrote:
> > > Hi Jens -
> > > 
> > > The attached patch speeds it up even more - I'm finding a >9% reduction 
> > > in %system with no loss in IO performance. This just sets the cached 
> > > element when the first is looked for.
> > 
> > Interesting, good thinking. It should not change the IO pattern, as the
> > end result should be the same. Thanks Alan, will commit!
> > 
> > I'll give elevator.c the same treatment, should be even more beneficial.
> > Stay tuned for a test patch.
> 
> Something like this, totally untested (it compiles). I initially wanted
> to fold the cfq addon into the elevator.h provided implementation, but
> that requires more extensive changes. Given how little code it is, I
> think I'll keep them separate.

Booted, seems to work fine for me. In a null ended IO test, I get about
a 1-2% speedup for a single queue of depth 64 using libaio. So it's
definitely worth it, will commit.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/15] cfq-iosched: speed up rbtree handling
  2007-04-25 18:08         ` Jens Axboe
@ 2007-04-26 14:28           ` Alan D. Brunelle
  2007-04-26 15:46             ` Jens Axboe
  0 siblings, 1 reply; 23+ messages in thread
From: Alan D. Brunelle @ 2007-04-26 14:28 UTC (permalink / raw)
  To: linux-kernel; +Cc: Jens Axboe

Jens Axboe wrote:
> On Wed, Apr 25 2007, Jens Axboe wrote:
>> On Wed, Apr 25 2007, Jens Axboe wrote:
>>> On Wed, Apr 25 2007, Alan D. Brunelle wrote:
>>>> Hi Jens -
>>>>
>>>> The attached patch speeds it up even more - I'm finding a >9% reduction 
>>>> in %system with no loss in IO performance. This just sets the cached 
>>>> element when the first is looked for.
>>> Interesting, good thinking. It should not change the IO pattern, as the
>>> end result should be the same. Thanks Alan, will commit!
>>>
>>> I'll give elevator.c the same treatment, should be even more beneficial.
>>> Stay tuned for a test patch.
>> Something like this, totally untested (it compiles). I initially wanted
>> to fold the cfq addon into the elevator.h provided implementation, but
>> that requires more extensive changes. Given how little code it is, I
>> think I'll keep them separate.
>
> Booted, seems to work fine for me. In a null ended IO test, I get about
> a 1-2% speedup for a single queue of depth 64 using libaio. So it's
> definitely worth it, will commit.
>
After longer runs last night, I think the patched elevator code /does/ 
help (albeit ever so slightly - about 0.6% performance improvement at a 
1.1% %system overhead).

  rkB_s    %system  Kernel
---------  -------  ----------------------------------------------------
1022942.2     3.69  Original patch + fix to cfq_rb_first
1029087.0     3.73  This patch stream (including fixes to elevator code)

Alan




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/15] cfq-iosched: speed up rbtree handling
  2007-04-26 14:28           ` Alan D. Brunelle
@ 2007-04-26 15:46             ` Jens Axboe
  0 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2007-04-26 15:46 UTC (permalink / raw)
  To: Alan.Brunelle; +Cc: linux-kernel

On Thu, Apr 26 2007, Alan D. Brunelle wrote:
> Jens Axboe wrote:
> >On Wed, Apr 25 2007, Jens Axboe wrote:
> >>On Wed, Apr 25 2007, Jens Axboe wrote:
> >>>On Wed, Apr 25 2007, Alan D. Brunelle wrote:
> >>>>Hi Jens -
> >>>>
> >>>>The attached patch speeds it up even more - I'm finding a >9% reduction 
> >>>>in %system with no loss in IO performance. This just sets the cached 
> >>>>element when the first is looked for.
> >>>Interesting, good thinking. It should not change the IO pattern, as the
> >>>end result should be the same. Thanks Alan, will commit!
> >>>
> >>>I'll give elevator.c the same treatment, should be even more beneficial.
> >>>Stay tuned for a test patch.
> >>Something like this, totally untested (it compiles). I initially wanted
> >>to fold the cfq addon into the elevator.h provided implementation, but
> >>that requires more extensive changes. Given how little code it is, I
> >>think I'll keep them separate.
> >
> >Booted, seems to work fine for me. In a null ended IO test, I get about
> >a 1-2% speedup for a single queue of depth 64 using libaio. So it's
> >definitely worth it, will commit.
> >
> After longer runs last night, I think the patched elevator code /does/ 
> help (albeit ever so slightly - about 0.6% performance improvement at a 
> 1.1% %system overhead).
> 
>  rkB_s    %system  Kernel
> ---------  -------  ----------------------------------------------------
> 1022942.2     3.69  Original patch + fix to cfq_rb_first
> 1029087.0     3.73  This patch stream (including fixes to elevator code)

Ah good, thanks for testing! It's all in the cfq branch now.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2007-04-26 15:50 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-04-24  8:15 [PATCH 0/15] CFQ IO scheduler patch series Jens Axboe
2007-04-24  8:15 ` [PATCH 1/15] cfq-iosched: improve preemption for cooperating tasks Jens Axboe
2007-04-24  8:15 ` [PATCH 2/15] cfq-iosched: development update Jens Axboe
2007-04-24  8:15 ` [PATCH 3/15] cfq-iosched: minor updates Jens Axboe
2007-04-24  8:15 ` [PATCH 4/15] cfq-iosched: rework the whole round-robin list concept Jens Axboe
2007-04-24  8:15 ` [PATCH 5/15] cfq-iosched: speed up rbtree handling Jens Axboe
2007-04-25 15:59   ` Alan D. Brunelle
2007-04-25 17:15     ` Jens Axboe
2007-04-25 17:50       ` Jens Axboe
2007-04-25 18:08         ` Jens Axboe
2007-04-26 14:28           ` Alan D. Brunelle
2007-04-26 15:46             ` Jens Axboe
2007-04-24  8:15 ` [PATCH 6/15] cfq-iosched: sort RT queues into the rbtree Jens Axboe
2007-04-24  8:15 ` [PATCH 7/15] cfq-iosched: sort IDLE " Jens Axboe
2007-04-24  8:15 ` [PATCH 8/15] cfq-iosched: style cleanups and comments Jens Axboe
2007-04-24  8:15 ` [PATCH 9/15] cfq-iosched: slice offset should take ioprio into account Jens Axboe
2007-04-24  8:15 ` [PATCH 10/15] cfq-iosched: get rid of ->cur_rr and ->cfq_list Jens Axboe
2007-04-24  8:15 ` [PATCH 11/15] cfq-iosched: don't pass unused preemption variable around Jens Axboe
2007-04-24  8:15 ` [PATCH 12/15] cfq-iosched: get rid of ->dispatch_slice Jens Axboe
2007-04-24  8:15 ` [PATCH 13/15] cfq-iosched: never allow an async queue idling Jens Axboe
2007-04-24  8:15 ` [PATCH 14/15] cfq-iosched: improve sync vs async workloads Jens Axboe
2007-04-24  8:15 ` [PATCH 15/15] cfq-iosched: tighten queue request overlap condition Jens Axboe
2007-04-25 14:42 ` [PATCH 0/15] CFQ IO scheduler patch series Alan D. Brunelle
