public inbox for linux-kernel@vger.kernel.org
From: Jens Axboe <axboe@suse.de>
To: Marcelo Tosatti <marcelo@conectiva.com.br>
Cc: Andrea Arcangeli <andrea@suse.de>, Chris Mason <mason@suse.com>,
	lkml <linux-kernel@vger.kernel.org>,
	"Stephen C. Tweedie" <sct@redhat.com>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Jeff Garzik <jgarzik@pobox.com>, Andrew Morton <akpm@digeo.com>,
	Alexander Viro <viro@math.psu.edu>
Subject: Re: RFC on io-stalls patch
Date: Mon, 14 Jul 2003 21:51:39 +0200
Message-ID: <20030714195138.GX833@suse.de>
In-Reply-To: <20030714131206.GJ833@suse.de>

On Mon, Jul 14 2003, Jens Axboe wrote:
> On Mon, Jul 14 2003, Marcelo Tosatti wrote:
> > 
> > 
> > On Mon, 14 Jul 2003, Jens Axboe wrote:
> > 
> > > On Sun, Jul 13 2003, Andrea Arcangeli wrote:
> > > > On Sun, Jul 13, 2003 at 11:01:16AM +0200, Jens Axboe wrote:
> > > > > No I don't have anything specific, it just seems like a bad heuristic to
> > > > > get rid of. I can try and do some testing tomorrow. I do feel strongly
> > > >
> > > > well, it's not a heuristic, it's a simplification, and it certainly
> > > > won't provide any benefit (besides saving some hundred kbytes of RAM
> > > > per hard disk, which is a minor benefit).
> > >
> > > You are missing my point - I don't care about losing the extra request
> > > list, I never said anything about that in this thread. I care about
> > > losing the reserved requests for reads. And we can do that just fine
> > > by just holding back a handful of requests.
> > >
> > > > > that we should at least make sure to reserve a few requests for reads
> > > > > exclusively, even if you don't agree with the oversized check. Anything
> > > > > else really contradicts all the io testing we have done the past years
> > > > > that shows how important it is to get a read in ASAP. And doing that in
> > > >
> > > > Important for latency or throughput? Do you know which benchmarks
> > > > returned better results with the two queues, and what's the theory
> > > > behind this?
> > >
> > > Forget the two queues, no one has said anything about that. The
> > > reserved reads are important for latency reasons, not throughput.
> > 
> > So Jens,
> > 
> > Please bench (as you said you would), and send us the results.
> > 
> > Its very important.
> 
> Yes, of course I'll send the results; at the current rate (they are
> running on the box) it probably won't be before tomorrow, though.

Some initial results with the attached patch; I'll try to do some more
fine-grained testing tomorrow. The base kernel was 2.4.22-pre5
(obviously), the drive tested is a SCSI drive (on aic7xxx, TCQ fixed at
4), and the fs is ext3. I would have done IDE testing as well, but the
drive in that machine appears to have gone dead. I'll pop in a new one
tomorrow and test on that too.

no_load:
Kernel            [runs]        Time    CPU%    Loads   LCPU%   Ratio
2.4.22-pre5            2        134     196.3   0.0     0.0     1.00
2.4.22-pre5-axboe      3        133     196.2   0.0     0.0     1.00
ctar_load:
Kernel            [runs]        Time    CPU%    Loads   LCPU%   Ratio
2.4.22-pre5            3        235     114.0   25.0    22.1    1.75
2.4.22-pre5-axboe      3        194     138.1   19.7    20.6    1.46
xtar_load:
Kernel            [runs]        Time    CPU%    Loads   LCPU%   Ratio
2.4.22-pre5            3        309     86.4    15.0    14.9    2.31
2.4.22-pre5-axboe      3        249     107.2   11.3    14.1    1.87
io_load:
Kernel            [runs]        Time    CPU%    Loads   LCPU%   Ratio
2.4.22-pre5            3        637     42.5    120.2   18.5    4.75
2.4.22-pre5-axboe      3        540     50.0    103.0   18.1    4.06
io_other:
Kernel            [runs]        Time    CPU%    Loads   LCPU%   Ratio
2.4.22-pre5            3        576     47.2    107.7   19.8    4.30
2.4.22-pre5-axboe      3        452     59.7    85.3    19.5    3.40
read_load:
Kernel            [runs]        Time    CPU%    Loads   LCPU%   Ratio
2.4.22-pre5            3        150     181.3   8.1     9.3     1.12
2.4.22-pre5-axboe      3        152     178.9   8.2     9.9     1.14


===== drivers/block/ll_rw_blk.c 1.47 vs edited =====
--- 1.47/drivers/block/ll_rw_blk.c	Fri Jul 11 10:30:54 2003
+++ edited/drivers/block/ll_rw_blk.c	Mon Jul 14 20:42:36 2003
@@ -549,10 +549,12 @@
 static struct request *get_request(request_queue_t *q, int rw)
 {
 	struct request *rq = NULL;
-	struct request_list *rl;
+	struct request_list *rl = &q->rq;
 
-	rl = &q->rq;
-	if (!list_empty(&rl->free) && !blk_oversized_queue(q)) {
+	if ((rw == WRITE) && (blk_oversized_queue(q) || (rl->count < 4)))
+		return NULL;
+
+	if (!list_empty(&rl->free)) {
 		rq = blkdev_free_rq(&rl->free);
 		list_del(&rq->queue);
 		rl->count--;
@@ -947,7 +949,7 @@
 static int __make_request(request_queue_t * q, int rw,
 				  struct buffer_head * bh)
 {
-	unsigned int sector, count;
+	unsigned int sector, count, sync;
 	int max_segments = MAX_SEGMENTS;
 	struct request * req, *freereq = NULL;
 	int rw_ahead, max_sectors, el_ret;
@@ -958,6 +960,7 @@
 
 	count = bh->b_size >> 9;
 	sector = bh->b_rsector;
+	sync = test_and_clear_bit(BH_Sync, &bh->b_state);
 
 	rw_ahead = 0;	/* normal case; gets changed below for READA */
 	switch (rw) {
@@ -1124,6 +1127,8 @@
 		blkdev_release_request(freereq);
 	if (should_wake)
 		get_request_wait_wakeup(q, rw);
+	if (sync)
+		__generic_unplug_device(q);
 	spin_unlock_irq(&io_request_lock);
 	return 0;
 end_io:
===== fs/buffer.c 1.89 vs edited =====
--- 1.89/fs/buffer.c	Tue Jun 24 23:11:29 2003
+++ edited/fs/buffer.c	Mon Jul 14 16:59:59 2003
@@ -1142,6 +1142,7 @@
 	bh = getblk(dev, block, size);
 	if (buffer_uptodate(bh))
 		return bh;
+	set_bit(BH_Sync, &bh->b_state);
 	ll_rw_block(READ, 1, &bh);
 	wait_on_buffer(bh);
 	if (buffer_uptodate(bh))
===== include/linux/fs.h 1.82 vs edited =====
--- 1.82/include/linux/fs.h	Wed Jul  9 20:42:38 2003
+++ edited/include/linux/fs.h	Mon Jul 14 16:59:47 2003
@@ -221,6 +221,7 @@
 	BH_Launder,	/* 1 if we can throttle on this buffer */
 	BH_Attached,	/* 1 if b_inode_buffers is linked into a list */
 	BH_JBD,		/* 1 if it has an attached journal_head */
+	BH_Sync,	/* 1 if a sync read, unplug the queue at once */
 
 	BH_PrivateStart,/* not a state bit, but the first bit available
 			 * for private allocation by other entities

-- 
Jens Axboe

