* elevator priorities vs. full request queues
From: Werner Almesberger @ 2004-06-22 4:25 UTC (permalink / raw)
To: linux-fsdevel
Another issue regarding priorities: let's suppose we have "high"
and "low" priority requests. Those priorities are indicated by
some means, and do not depend on CPU scheduler priorities. Let's
say that the elevator always handles high priority requests
first, unless barriers dictate otherwise.
So far so good. Next, let's suppose that the request queue is
filling up. Now all the processes doing IO will compete for
request slots, and the outcome of this competition is
determined by the CPU scheduler - which is blissfully unaware
of those request priorities.
Has anybody thought of a solution for this problem ?
I can think of a few possible approaches, but they're all
relatively intrusive or plain ugly. A few of them:
1) temporarily raise the CPU priority of processes waiting
for request queue slots
2) implement some priority scheme in ll_rw_blk.c
3) give elv_may_queue more information about what the request
will look like, so that it can decide to accept "high"
priority requests even if the queue is officially full
4) allow elv_add_request to refuse requests, e.g. to keep
some reserve for "high" priority requests
1) and 2) would have the advantage that they also work if
requests are delayed due to a lack of memory. However, in this
case, all bets are off anyway, so I wouldn't worry too much
about it.
3) and 4) have the nice property of keeping all the priority
logic entirely in the elevator. For 4), one would also have to
force an interface change (besides giving elv_add_request a
return value), to make sure there are no enqueuers left in the
kernel or elsewhere which don't know that elv_add_request may
now fail.
Opinions ?
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
* Re: elevator priorities vs. full request queues
From: Jens Axboe @ 2004-06-22 7:48 UTC (permalink / raw)
To: Werner Almesberger; +Cc: linux-fsdevel
On Tue, Jun 22 2004, Werner Almesberger wrote:
> Another issue regarding priorities: let's suppose we have "high"
> and "low" priority requests. Those priorities are indicated by
> some means, and do not depend on CPU scheduler priorities. Let's
> say that the elevator always handles high priority requests
> first, unless barriers dictate it to do otherwise.
>
> So far so good. Next, let's suppose that the request queue is
> filling up. Now all the processes doing IO will compete for
> request slots, and the outcome of this competition is
> determined by the CPU scheduler - which is blissfully unaware
> of those request priorities.
>
> Has anybody thought of a solution for this problem ?
Well yes, different parties have worked on io priorities in the past.
Request allocation is an important part of that.
> I can think of a few possible approaches, but they're all
> relatively intrusive or plain ugly. A few of them:
>
> 1) temporarily raise the CPU priority of processes waiting
> for request queue slots
> 2) implement some priority scheme in ll_rw_blk.c
> 3) give elv_may_queue more information about what the request
> will look like, so that it can decide to accept "high"
> priority requests even if the queue is officially full
> 4) allow elv_add_request to refuse requests, e.g. to keep
> some reserve for "high" priority requests
>
> 1) and 2) would have the advantage that they also work if
> requests are delayed due to a lack of memory. However, in this
> case, all bets are off anyway, so I wouldn't worry too much
> about it.
>
> 3) and 4) have the nice property of keeping all the priority
> logic entirely in the elevator. For 4), one would also have to
> force an interface change (besides giving elv_add_request a
> return value), to make sure there are no enqueuers left in the
> kernel or elsewhere which don't know that elv_add_request may
> now fail.
#4 doesn't work for various reasons, the obvious being that the request
is already allocated at this point. Or the request could be coming from
outside get_request().
CFQ with priorities used a combination of 2+3. It seems you haven't
looked any of this up, maybe that would be a good starting point. You
can do this internally in your io scheduler in several ways. CFQ used
per-process (-group) priorities, so it would get the priority from
there. A bio has room for priority info, so you could also pass it in
from submit_bio() -> __make_request() -> get_request() -> may_queue().
> Opinions ?
#1 can complement a solution of 2+3, it's not a solution in itself imo.
You absolutely need some way of reserving requests for some processes,
or denying requests for others, and priorities alone do not accomplish
that.
#4 is a definite no-go. Maybe you meant to specify a different function;
as written, I don't think your suggestion makes sense.
--
Jens Axboe
* Re: elevator priorities vs. full request queues
From: Werner Almesberger @ 2004-06-22 8:26 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel
Jens Axboe wrote:
> Well yes, different parties have worked on io priorities in the past.
> Request allocation is an important part of that.
I only found your "cfq + io priorities" posting from November
2003. Are there any others that are publicly available ?
> #4 doesn't work for various reasons, the obvious being that the request
> is already allocated at this point.
The idea would be to refuse/delay low-priority requests before
reaching the queue limit. But yes, having a possibly large
number of allocated but not enqueuable requests sit around,
or being forced to discard requests after building them, may
not be so great.
> CFQ with priorities used a combination of 2+3. It seems you haven't
> looked any of this up, maybe that would be a good starting point.
I had to have a second look :-) I didn't quite realize that
your changes actually solved that. Nice and simple. Are there
any newer versions of it ? What are your plans for IO priorities
anyway ?
> CFQ used per-process (-group) priorities,
In my case, they're per open file or per memory page. The latter
works as follows: pages are explicitly prefetched at the
application's declared read rate. When doing this, the prefetcher
sets the page's priority, which the elevator retrieves. Sounds
complicated, but actually doesn't cause all that much overhead. As
a welcome side-effect, prefetching also limits the high-priority
IO activity an application can generate.
> A bio has room for priority info, so you could also pass it in
> from submit_bio() -> __make_request() -> get_request() -> may_queue().
Hmm yes, that's the large set of little changes I was hoping to
avoid. But I guess there's no way around this.
Thanks,
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
* Re: elevator priorities vs. full request queues
From: Jens Axboe @ 2004-06-22 10:14 UTC (permalink / raw)
To: Werner Almesberger; +Cc: linux-fsdevel
On Tue, Jun 22 2004, Werner Almesberger wrote:
> Jens Axboe wrote:
> > Well yes, different parties have worked on io priorities in the past.
> > Request allocation is an important part of that.
>
> I only found your "cfq + io priorities" posting from November
> 2003. Are there any others that are publicly available ?
Don't think I posted any newer ones, I'll see if I can get something
posted today/tomorrow (or at least before going on vacation thursday).
> > #4 doesn't work for various reasons, the obvious being that the request
> > is already allocated at this point.
>
> The idea would be to refuse/delay low-priority requests before
> reaching the queue limit. But yes, having a possibly large
> number of allocated but not enqueuable requests sit around,
> or being forced to discard requests after building them, may
> not be so great.
That's pushing the problem somewhere it doesn't belong, imho. Let the
process block in getting a request instead, it doesn't make sense to let
it allocate a request and then refuse to queue it. Plus it requires all
sorts of handling in the lower layers to cope with this.
> > CFQ with priorities used a combination of 2+3. It seems you haven't
> > looked any of this up, maybe that would be a good starting point.
>
> I had to have a second look :-) I didn't quite realize that
> your changes actually solved that. Nice and simple. Are there
> any newer versions of it ? What are your plans for IO priorities
> anyway ?
It's basically just deciding whether a process can allocate a request in
elv_may_queue(). Christoffer Hall implemented split allocation lists as
well.
My general plans for IO priorities basically involve rewriting lots of
ll_rw_blk, I'm afraid. Not fundamentally, but there are some areas that
don't work well if you assign different rights to different processes.
One being the queue congested bit - the queue may be congested for
process A, but not for B. And I'd like to pass in priority through the
bio, as described further down.
> > CFQ used per-process (-group) priorities,
>
> In my case, they're per open file or per memory page. The latter
> works as follows: pages are explicitly prefetched at the
> application's declared read rate. When doing this, the prefetcher
> sets the page's priority, which the elevator retrieves. Sounds
> complicated, but actually doesn't cause all that much overhead. As
> a welcome side-effect, prefetching also limits the high-priority
> IO activity an application can generate.
It doesn't sound too complicated :)
> > A bio has room for priority info, so you could also pass it in
> > from submit_bio() -> __make_request() -> get_request() -> may_queue().
>
> Hmm yes, that's the large set of little changes I was hoping to
> avoid. But I guess there's no way around this.
I'd much rather do it this way, since I think it's the right way. It
might make sense to just do this change right away and merge it, just
using a static io priority. Makes it easier to build and work on an
external scheduler using priorities without throwing lots of rejects in
the core. It's not really a lot of work, either...
--
Jens Axboe
* Re: elevator priorities vs. full request queues
From: Werner Almesberger @ 2004-06-22 19:08 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel
Jens Axboe wrote:
> Don't think I posted any newer ones, I'll see if I can get something
> posted today/tomorrow (or at least before going on vacation thursday).
Seems that I picked a good moment for starting to ask around :-)
> And I'd like to pass in priority through the
> bio, like described further down.
Good, so that's the way to go then.
> It doesn't sound too complicated :)
You haven't seen the tiny little per-page trees hanging off
each request yet ;-) I need them to "upgrade" requests when
someone with high priority decides to request a page that's
already in the queue. That should be a comparably rare event,
but handling that certainly adds complexity :-(
Thanks,
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
* Re: elevator priorities vs. full request queues
From: Jens Axboe @ 2004-06-23 10:14 UTC (permalink / raw)
To: Werner Almesberger; +Cc: linux-fsdevel
On Tue, Jun 22 2004, Werner Almesberger wrote:
> Jens Axboe wrote:
> > Don't think I posted any newer ones, I'll see if I can get something
> > posted today/tomorrow (or at least before going on vacation thursday).
>
> Seems that I picked a good moment for starting to ask around :-)
Something like this (probably a little half-assed, and definitely very
untested :-). That should basically allow you to do whatever you need in
your io scheduler ->may_queue_fn. I know for CFQ I had to make another
little change, since AS changed the semantics of may_queue return values
a little to allow for a third option:
#define ELEVATOR_NO_QUEUE 0
#define ELEVATOR_MAY_QUEUE 1
#define ELEVATOR_MUST_QUEUE 2
since AS needs the must_queue option.
> > And I'd like to pass in priority through the
> > bio, like described further down.
>
> Good, so that's the way to go then.
>
> > It doesn't sound too complicated :)
>
> You haven't seen the tiny little per-page trees hanging off
> each request yet ;-) I need them to "upgrade" requests when
> someone with high priority decides to request a page that's
> already in the queue. That should be a comparably rare event,
> but handling that certainly adds complexity :-(
Ok, sounds a little worse now :-)
===== drivers/block/elevator.c 1.57 vs edited =====
--- 1.57/drivers/block/elevator.c 2004-06-18 17:08:26 +02:00
+++ edited/drivers/block/elevator.c 2004-06-23 11:48:30 +02:00
@@ -339,12 +339,12 @@
e->elevator_put_req_fn(q, rq);
}
-int elv_may_queue(request_queue_t *q, int rw)
+int elv_may_queue(request_queue_t *q, int rw, int prio)
{
elevator_t *e = &q->elevator;
if (e->elevator_may_queue_fn)
- return e->elevator_may_queue_fn(q, rw);
+ return e->elevator_may_queue_fn(q, rw, prio);
return 0;
}
===== drivers/block/ll_rw_blk.c 1.258 vs edited =====
--- 1.258/drivers/block/ll_rw_blk.c 2004-06-22 04:20:21 +02:00
+++ edited/drivers/block/ll_rw_blk.c 2004-06-23 11:47:17 +02:00
@@ -1584,7 +1584,8 @@
/*
* Get a free request, queue_lock must not be held
*/
-static struct request *get_request(request_queue_t *q, int rw, int gfp_mask)
+static struct request *get_request(request_queue_t *q, int rw, int prio,
+ int gfp_mask)
{
struct request *rq = NULL;
struct request_list *rl = &q->rq;
@@ -1605,7 +1606,7 @@
}
if (blk_queue_full(q, rw)
- && !ioc_batching(ioc) && !elv_may_queue(q, rw)) {
+ && !ioc_batching(ioc) && !elv_may_queue(q, rw, prio)) {
/*
* The queue is full and the allocating process is not a
* "batcher", and not exempted by the IO scheduler
@@ -1667,7 +1668,7 @@
* No available requests for this queue, unplug the device and wait for some
* requests to become available.
*/
-static struct request *get_request_wait(request_queue_t *q, int rw)
+static struct request *get_request_wait(request_queue_t *q, int rw, int prio)
{
DEFINE_WAIT(wait);
struct request *rq;
@@ -1679,7 +1680,7 @@
prepare_to_wait_exclusive(&rl->wait[rw], &wait,
TASK_UNINTERRUPTIBLE);
- rq = get_request(q, rw, GFP_NOIO);
+ rq = get_request(q, rw, prio, GFP_NOIO);
if (!rq) {
struct io_context *ioc;
@@ -1705,13 +1706,14 @@
struct request *blk_get_request(request_queue_t *q, int rw, int gfp_mask)
{
struct request *rq;
+ int prio = 0; /* needs to be passed in as well, or just use "NORMAL" */
BUG_ON(rw != READ && rw != WRITE);
if (gfp_mask & __GFP_WAIT)
- rq = get_request_wait(q, rw);
+ rq = get_request_wait(q, rw, prio);
else
- rq = get_request(q, rw, gfp_mask);
+ rq = get_request(q, rw, prio, gfp_mask);
return rq;
}
@@ -2193,7 +2195,7 @@
static int __make_request(request_queue_t *q, struct bio *bio)
{
struct request *req, *freereq = NULL;
- int el_ret, rw, nr_sectors, cur_nr_sectors, barrier, ra;
+ int el_ret, rw, nr_sectors, cur_nr_sectors, barrier, ra, prio;
sector_t sector;
sector = bio->bi_sector;
@@ -2202,6 +2204,8 @@
rw = bio_data_dir(bio);
+ prio = bio_prio(bio);
+
/*
* low level driver can indicate that it wants pages above a
* certain limit bounced to low memory (ie for highmem, or even
@@ -2289,14 +2293,14 @@
freereq = NULL;
} else {
spin_unlock_irq(q->queue_lock);
- if ((freereq = get_request(q, rw, GFP_ATOMIC)) == NULL) {
+ if ((freereq = get_request(q, rw, prio, GFP_ATOMIC)) == NULL) {
/*
* READA bit set
*/
if (ra)
goto end_io;
- freereq = get_request_wait(q, rw);
+ freereq = get_request_wait(q, rw, prio);
}
goto again;
}
===== include/linux/bio.h 1.40 vs edited =====
--- 1.40/include/linux/bio.h 2004-06-21 03:23:40 +02:00
+++ edited/include/linux/bio.h 2004-06-23 12:09:13 +02:00
@@ -146,6 +146,18 @@
#define BIO_RW_SYNC 4
/*
+ * upper bits of bi_rw define the io priority of this bio
+ */
+#define BIO_PRIO_BITS 5
+#define BIO_PRIO_SHIFT (8 * sizeof(unsigned long) - BIO_PRIO_BITS)
+#define bio_prio(bio) ((bio)->bi_rw >> BIO_PRIO_SHIFT)
+#define bio_set_prio(bio, prio) do { \
+ WARN_ON(prio >= (1 << BIO_PRIO_BITS)); \
+ (bio)->bi_rw &= ((1 << BIO_PRIO_SHIFT) - 1); \
+ (bio)->bi_rw |= (prio << BIO_PRIO_SHIFT); \
+} while (0)
+
+/*
* various member access, note that bio_data should of course not be used
* on highmem page vectors
*/
===== include/linux/elevator.h 1.31 vs edited =====
--- 1.31/include/linux/elevator.h 2004-04-12 19:55:20 +02:00
+++ edited/include/linux/elevator.h 2004-06-23 11:47:36 +02:00
@@ -16,7 +16,7 @@
typedef void (elevator_requeue_req_fn) (request_queue_t *, struct request *);
typedef struct request *(elevator_request_list_fn) (request_queue_t *, struct request *);
typedef void (elevator_completed_req_fn) (request_queue_t *, struct request *);
-typedef int (elevator_may_queue_fn) (request_queue_t *, int);
+typedef int (elevator_may_queue_fn) (request_queue_t *, int, int);
typedef int (elevator_set_req_fn) (request_queue_t *, struct request *, int);
typedef void (elevator_put_req_fn) (request_queue_t *, struct request *);
@@ -73,7 +73,7 @@
extern struct request *elv_latter_request(request_queue_t *, struct request *);
extern int elv_register_queue(request_queue_t *q);
extern void elv_unregister_queue(request_queue_t *q);
-extern int elv_may_queue(request_queue_t *, int);
+extern int elv_may_queue(request_queue_t *, int, int);
extern void elv_completed_request(request_queue_t *, struct request *);
extern int elv_set_request(request_queue_t *, struct request *, int);
extern void elv_put_request(request_queue_t *, struct request *);
--
Jens Axboe
* Re: elevator priorities vs. full request queues
From: Werner Almesberger @ 2004-06-23 12:46 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel
Jens Axboe wrote:
> Something like this
Hmm, this means that everything all the way up to submit_bh needs
instrumentation to pass the priority to the bio. What would
you think of simply passing the bio ? That might also give
us more flexibility to play with other priority schemes.
Thanks,
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
* Re: elevator priorities vs. full request queues
From: Jens Axboe @ 2004-06-23 16:46 UTC (permalink / raw)
To: Werner Almesberger; +Cc: linux-fsdevel
On Wed, Jun 23 2004, Werner Almesberger wrote:
> Jens Axboe wrote:
> > Something like this
>
> Hmm, this means that everything all the way up to submit_bh needs
> instrumentation to pass the priority to the bio. What would
> you think of simply passing the bio ? That might also give
> us more flexibility to play with other priority schemes.
Not sure I quite understand what you mean here, pass the bio in where?
--
Jens Axboe
* Re: elevator priorities vs. full request queues
From: Werner Almesberger @ 2004-06-23 16:57 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel
Jens Axboe wrote:
> Not sure I quite understand what you mean here, pass the bio in where?
To elv_may_queue (and its callers), just like your patch
passes "prio". So the call to bio_prio would move into
*_may_queue. (And maybe we'd have to allow NULL, for
requests fabricated in a special way.)
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
* Re: elevator priorities vs. full request queues
From: Jens Axboe @ 2004-06-23 17:00 UTC (permalink / raw)
To: Werner Almesberger; +Cc: linux-fsdevel
On Wed, Jun 23 2004, Werner Almesberger wrote:
> Jens Axboe wrote:
> > Not sure I quite understand what you mean here, pass the bio in where?
>
> To elv_may_queue (and its callers), just like your patch
> passes "prio". So the call to bio_prio would move into
> *_may_queue. (And maybe we'd have to allow NULL, for
> requests fabricated in a special way.)
Don't like that approach. I don't see why you'd need more than the
priority passed in anyways?
--
Jens Axboe
* Re: elevator priorities vs. full request queues
From: Werner Almesberger @ 2004-06-23 23:02 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel
Jens Axboe wrote:
> Don't like that approach. I don't see why you'd need more than the
> priority passed in anyways?
I don't need more, but my priorities originally sit on open
files, and may then propagate to pages, from where the
elevator picks them up. Your bio-centric approach breaks
this.
Well, maybe that's just an opportunity to do something
better :-) How do you think the priorities should get into
the bio, i.e. who calls bio_set_prio ?
The properties I'm looking for are:
- priorities are per open file (whether a process is allowed
to raise the priority of a file may depend on a number of
factors, so this would be compatible with your "io
priorities", except that the process has to ask explicitly
before an open file gets a non-default priority)
- the kernel can grant priorities either a) for the entire
file, or b) depending on whether the data rate is below a
certain limit. (The latter is for really high priorities.
I'm combining this with a special readahead mechanism, so
I'm simply boosting the priority of pages that will be
read ahead. The read ahead is done by yet another kernel
thread.)
b) is actually the easy part, because there all action is
initiated by my code. a) is harder, because there, the open
file just has a priority that somehow needs to propagate to
the bio.
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
* Re: elevator priorities vs. full request queues
From: Werner Almesberger @ 2004-07-12 23:52 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel
Jens Axboe wrote:
> Something like this (probably a little half-assed, and definitely very
> untested :-).
Nevertheless, it seems to work well enough :-) The only bug I've
noticed is that calculations related to bi_rw need to be unsigned
long, for 64 bit compatibility, i.e.
+#define bio_set_prio(bio, prio) do { \
+ WARN_ON(prio >= (1 << BIO_PRIO_BITS)); \
+ (bio)->bi_rw &= ((1UL << BIO_PRIO_SHIFT) - 1); \
+ (bio)->bi_rw |= ((unsigned long) (prio) << BIO_PRIO_SHIFT); \
+} while (0)
I've adapted your per-process IO priority idea, and used it as
follows:
--- linux-2.6.7-orig/include/linux/sched.h Wed Jun 16 02:18:57 2004
+++ linux-2.6.7/include/linux/sched.h Sun Jul 11 15:00:31 2004
@@ -505,6 +505,7 @@ struct task_struct {
struct backing_dev_info *backing_dev_info;
struct io_context *io_context;
+ int ioprio;
unsigned long ptrace_message;
siginfo_t *last_siginfo; /* For ptrace use. */
--- linux-2.6.7-orig/fs/buffer.c Wed Jun 16 02:19:36 2004
+++ linux-2.6.7/fs/buffer.c Mon Jul 12 08:25:41 2004
@@ -2789,6 +2789,8 @@ void submit_bh(int rw, struct buffer_hea
bio->bi_end_io = end_bio_bh_io_sync;
bio->bi_private = bh;
+ bio_set_prio(bio, current->ioprio);
+
submit_bio(rw, bio);
}
--- linux-2.6.7-orig/drivers/block/ll_rw_blk.c Sun Jul 11 14:20:07 2004
+++ linux-2.6.7/drivers/block/ll_rw_blk.c Mon Jul 12 08:20:41 2004
@@ -2320,7 +2320,7 @@ static inline void blk_partition_remap(s
if (bdev != bdev->bd_contains) {
struct hd_struct *p = bdev->bd_part;
- switch (bio->bi_rw) {
+ switch (bio->bi_rw & BIO_RW) {
case READ:
p->read_sectors += bio_sectors(bio);
p->reads++;
@@ -2451,7 +2451,7 @@ void submit_bio(int rw, struct bio *bio)
BIO_BUG_ON(!bio->bi_size);
BIO_BUG_ON(!bio->bi_io_vec);
- bio->bi_rw = rw;
+ bio->bi_rw |= rw;
if (rw & WRITE)
mod_page_state(pgpgout, count);
else
Because I'm lazy, I'm using a default priority of zero, so I
don't need any explicit initialization.
I've been playing with this for a few hours, and even a
request-happy load with random accesses through AIO, which
normally basically kills the machine, doesn't impress my
high-priority reader anymore. I haven't looked into fairness
issues, though.
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
* Re: elevator priorities vs. full request queues
From: Jens Axboe @ 2004-07-13 5:37 UTC (permalink / raw)
To: Werner Almesberger; +Cc: linux-fsdevel
On Mon, Jul 12 2004, Werner Almesberger wrote:
> Jens Axboe wrote:
> > Something like this (probably a little half-assed, and definitely very
> > untested :-).
>
> Nevertheless, it seems to work well enough :-) The only bug I've
> noticed is that calculations related to bi_rw need to be unsigned
> long, for 64 bit compatibility, i.e.
>
> +#define bio_set_prio(bio, prio) do { \
> + WARN_ON(prio >= (1 << BIO_PRIO_BITS)); \
> + (bio)->bi_rw &= ((1UL << BIO_PRIO_SHIFT) - 1); \
> + (bio)->bi_rw |= ((unsigned long) (prio) << BIO_PRIO_SHIFT); \
> +} while (0)
Looks fine.
> --- linux-2.6.7-orig/include/linux/sched.h Wed Jun 16 02:18:57 2004
> +++ linux-2.6.7/include/linux/sched.h Sun Jul 11 15:00:31 2004
> @@ -505,6 +505,7 @@ struct task_struct {
> struct backing_dev_info *backing_dev_info;
>
> struct io_context *io_context;
> + int ioprio;
>
> unsigned long ptrace_message;
> siginfo_t *last_siginfo; /* For ptrace use. */
> --- linux-2.6.7-orig/fs/buffer.c Wed Jun 16 02:19:36 2004
> +++ linux-2.6.7/fs/buffer.c Mon Jul 12 08:25:41 2004
> @@ -2789,6 +2789,8 @@ void submit_bh(int rw, struct buffer_hea
> bio->bi_end_io = end_bio_bh_io_sync;
> bio->bi_private = bh;
>
> + bio_set_prio(bio, current->ioprio);
> +
> submit_bio(rw, bio);
> }
>
> --- linux-2.6.7-orig/drivers/block/ll_rw_blk.c Sun Jul 11 14:20:07 2004
> +++ linux-2.6.7/drivers/block/ll_rw_blk.c Mon Jul 12 08:20:41 2004
> @@ -2320,7 +2320,7 @@ static inline void blk_partition_remap(s
> if (bdev != bdev->bd_contains) {
> struct hd_struct *p = bdev->bd_part;
>
> - switch (bio->bi_rw) {
> + switch (bio->bi_rw & BIO_RW) {
> case READ:
> p->read_sectors += bio_sectors(bio);
> p->reads++;
That's buggy (was before, not your fault). It should read:
switch (bio_data_dir(bio)) {
...
> @@ -2451,7 +2451,7 @@ void submit_bio(int rw, struct bio *bio)
>
> BIO_BUG_ON(!bio->bi_size);
> BIO_BUG_ON(!bio->bi_io_vec);
> - bio->bi_rw = rw;
> + bio->bi_rw |= rw;
> if (rw & WRITE)
> mod_page_state(pgpgout, count);
> else
Should be ok, everyone should have called bio_init() first.
> Because I'm lazy, I'm using a default priority of zero, so I
> don't need any explicit initialization.
Maybe it would be more logical to assign 'default' priority in
bio_init(), in fact?
> I've been playing with this for a few hours, and even a
> request-happy load with random accesses through AIO, which
> normally basically kills the machine, doesn't impress my
> high-priority reader anymore. I haven't looked into fairness
> issues, though.
Sounds like you are making good progress.
--
Jens Axboe
* Re: elevator priorities vs. full request queues
From: Werner Almesberger @ 2004-07-13 12:29 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel
Jens Axboe wrote:
> switch (bio_data_dir(bio)) {
Ah yes, much nicer.
> Maybe it would be more logical to assign 'default' priority in
> bio_init(), in fact?
What made me shy away from this is that bio_init and bio_alloc
seem to get called from contexts where "current" wouldn't
necessarily have anything to do with the process that has
actually initiated the IO.
For example, I'm not sure if the calls from ide-scsi.c, when
it builds requests piece by piece, are safe in this regard.
The callers of submit_bio all seem to be more of the
"classical" variety.
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
* Re: elevator priorities vs. full request queues
From: Jens Axboe @ 2004-07-13 12:35 UTC (permalink / raw)
To: Werner Almesberger; +Cc: linux-fsdevel
On Tue, Jul 13 2004, Werner Almesberger wrote:
> > Maybe it would be more logical to assign 'default' priority in
> > bio_init(), in fact?
>
> What made me shy away from this is that bio_init and bio_alloc
> seem to get called from contexts where "current" wouldn't
> necessarily have anything to do with the process that has
> actually initiated the IO.
>
> For example, I'm not sure if the calls from ide-scsi.c, when
> it builds requests piece by piece, are safe in this regard.
That's not really what I meant - 'default' priority isn't tied to the
process. It's just that if default io prio was something other than 0,
then it would make sense to set it there. Say if the range was 0-15, and
you wanted 7 to be 'in the middle' default priority.
But I suppose you just want io submission to set it, in which case it
doesn't matter what bio_init() sets it to.
--
Jens Axboe
* Re: elevator priorities vs. full request queues
From: Werner Almesberger @ 2004-07-13 16:36 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel
Jens Axboe wrote:
> It's just that if default io prio was something other than 0, then it
> would make sense to set it there.
Oh, I see. Yes, of course. It actually does this, implicitly,
since the default value of zero just happens to be what bio_init
sets already ;-)
> But I suppose you just want io submission to set it, in which case it
> doesn't matter what bio_init() sets it to.
Except for those bios that get submitted without ever touching
submit_bh.
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
* Re: elevator priorities vs. full request queues
From: Jens Axboe @ 2004-07-13 16:59 UTC (permalink / raw)
To: Werner Almesberger; +Cc: linux-fsdevel
On Tue, Jul 13 2004, Werner Almesberger wrote:
> Jens Axboe wrote:
> > It's just that if default io prio was something other than 0, then it
> > would make sense to set it there.
>
> Oh, I see. Yes, of course. It actually does this, implicitly,
> since the default value of zero just happens to be what bio_init
> sets already ;-)
Only if you define 0 to be the default value :-)
> > But I suppose you just want io submission to set it, in which case it
> > doesn't matter what bio_init() sets it to.
>
> Except for those bios that get submitted without ever touching
> submit_bh.
You mean things like bio_map_user/bio_copy_user and (at the other end,
I doubt these would count) bios submitted privately to the drive itself
(ala ide-scsi)? The latter can be disregarded imho, nothing to worry
about. The former allocate requests on their own, so you must pass in the
priority there. So instead of assigning bio priority and using that for
request allocation, you are just a step or two further down in the call
chain and specify the priority there directly.
In short, I don't see a problem there.
--
Jens Axboe