From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jens Axboe Subject: Re: elevator priorities vs. full request queues Date: Tue, 13 Jul 2004 07:37:52 +0200 Sender: linux-fsdevel-owner@vger.kernel.org Message-ID: <20040713053749.GA14759@suse.de> References: <20040622012502.B1325@almesberger.net> <20040622074852.GW12881@suse.de> <20040622052644.D1325@almesberger.net> <20040622101434.GB12881@suse.de> <20040622160859.I1325@almesberger.net> <20040623101430.GI1120@suse.de> <20040712205227.A12285@almesberger.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: linux-fsdevel@vger.kernel.org Return-path: Received: from ns.virtualhost.dk ([195.184.98.160]:14796 "EHLO virtualhost.dk") by vger.kernel.org with ESMTP id S264595AbUGMFig (ORCPT ); Tue, 13 Jul 2004 01:38:36 -0400 To: Werner Almesberger Content-Disposition: inline In-Reply-To: <20040712205227.A12285@almesberger.net> List-Id: linux-fsdevel.vger.kernel.org On Mon, Jul 12 2004, Werner Almesberger wrote: > Jens Axboe wrote: > > Something like this (probably a little half-assed, and definitely very > > untested :-). > > Nevertheless, it seems to work well enough :-) The only bug I've > noticed is that calculations related to bi_rw need to be unsigned > long, for 64 bit compatibility, i.e. > > +#define bio_set_prio(bio, prio) do { \ > + WARN_ON(prio >= (1 << BIO_PRIO_BITS)); \ > + (bio)->bi_rw &= ((1UL << BIO_PRIO_SHIFT) - 1); \ > + (bio)->bi_rw |= ((unsigned long) (prio) << BIO_PRIO_SHIFT); \ > +} while (0) Looks fine. > --- linux-2.6.7-orig/include/linux/sched.h Wed Jun 16 02:18:57 2004 > +++ linux-2.6.7/include/linux/sched.h Sun Jul 11 15:00:31 2004 > @@ -505,6 +505,7 @@ struct task_struct { > struct backing_dev_info *backing_dev_info; > > struct io_context *io_context; > + int ioprio; > > unsigned long ptrace_message; > siginfo_t *last_siginfo; /* For ptrace use. */ > --- linux-2.6.7-orig/fs/buffer.c Wed Jun 16 02:19:36 2004 > +++ linux-2.6.7/fs/buffer.c Mon Jul 12 08:25:41 2004 > @@ -2789,6 +2789,8 @@ void submit_bh(int rw, struct buffer_hea > bio->bi_end_io = end_bio_bh_io_sync; > bio->bi_private = bh; > > + bio_set_prio(bio, current->ioprio); > + > submit_bio(rw, bio); > } > > --- linux-2.6.7-orig/drivers/block/ll_rw_blk.c Sun Jul 11 14:20:07 2004 > +++ linux-2.6.7/drivers/block/ll_rw_blk.c Mon Jul 12 08:20:41 2004 > @@ -2320,7 +2320,7 @@ static inline void blk_partition_remap(s > if (bdev != bdev->bd_contains) { > struct hd_struct *p = bdev->bd_part; > > - switch (bio->bi_rw) { > + switch (bio->bi_rw & BIO_RW) { > case READ: > p->read_sectors += bio_sectors(bio); > p->reads++; That's buggy (was before, not your fault. Should read: switch (bio_data_dir(bio)) { ... > @@ -2451,7 +2451,7 @@ void submit_bio(int rw, struct bio *bio) > > BIO_BUG_ON(!bio->bi_size); > BIO_BUG_ON(!bio->bi_io_vec); > - bio->bi_rw = rw; > + bio->bi_rw |= rw; > if (rw & WRITE) > mod_page_state(pgpgout, count); > else Should be ok, everyone should have called bio_init() first. > Because I'm lazy, I'm using a default priority of zero, so I > don't need any explicit initialization. Maybe it would be more logical to assign 'default' priority in bio_init(), in fact? > I've been playing with this for a few hours, and even a > request-happy load with random accesses through AIO, which > normally basically kills the machine, doesn't impress my > high-priority reader anymore. I haven't looked into fairness > issues, though. Sounds like you are making good progress. -- Jens Axboe