public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/5] Adaptive readahead updates 2
       [not found] <20060609080801.741901069@localhost.localdomain>
@ 2006-06-09  8:08 ` Wu Fengguang
       [not found] ` <20060609081119.231751179@localhost.localdomain>
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Wu Fengguang @ 2006-06-09  8:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

Andrew,

Here are 5 small readahead patches collected in the past week.
They apply cleanly to linux-2.6.17-rc6-mm1.

- against readahead-state-based-method-routines.patch
	[PATCH 1/5] readahead: no RA_FLAG_EOF on single page file

- against readahead-initial-method-guiding-sizes.patch
	[PATCH 2/5] readahead: aggressive initial sizes

- against readahead-call-scheme.patch
	[PATCH 3/5] readahead: call scheme - no fastcall for readahead_cache_hit()

- standalone patches
	[PATCH 4/5] readahead: backoff on I/O error
	[PATCH 5/5] readahead: remove size limit on read_ahead_kb

Thanks,
Wu Fengguang
-
Dept. Automation                University of Science and Technology of China

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/5] readahead: no RA_FLAG_EOF on single page file
       [not found] ` <20060609081119.231751179@localhost.localdomain>
@ 2006-06-09  8:08   ` Wu Fengguang
  0 siblings, 0 replies; 12+ messages in thread
From: Wu Fengguang @ 2006-06-09  8:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Wu Fengguang

[-- Attachment #1: readahead-reduce-close-call.patch --]
[-- Type: text/plain, Size: 802 bytes --]

Do not set RA_FLAG_EOF on single-page files.

readahead_close() is called on file close when RA_FLAG_EOF is set.
It detects readahead hits/misses and adjusts ra_expected_bytes
accordingly. Single-page files are uninteresting to it. Since nearly
40% of desktop files are <= 4k, this patch avoids many useless
readahead_close() invocations.

Signed-off-by: Wu Fengguang <wfg@mail.ustc.edu.cn>
---


--- linux-2.6.17-rc5-mm3.orig/mm/readahead.c
+++ linux-2.6.17-rc5-mm3/mm/readahead.c
@@ -1037,7 +1037,8 @@ static int ra_dispatch(struct file_ra_st
 		ra->readahead_index = eof_index;
 		if (ra->lookahead_index > eof_index)
 			ra->lookahead_index = eof_index;
-		ra->flags |= RA_FLAG_EOF;
+		if (eof_index > 1)
+			ra->flags |= RA_FLAG_EOF;
 	}
 
 	/* Disable look-ahead for loopback file. */

--


* [PATCH 3/5] readahead: call scheme - no fastcall for readahead_cache_hit()
       [not found] ` <20060609081120.054527393@localhost.localdomain>
@ 2006-06-09  8:08   ` Wu Fengguang
  0 siblings, 0 replies; 12+ messages in thread
From: Wu Fengguang @ 2006-06-09  8:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Wu Fengguang

[-- Attachment #1: readahead-cache-hit-remove-fastcall.patch --]
[-- Type: text/plain, Size: 2493 bytes --]

Remove the 'fastcall' directive from readahead_cache_hit().

The directive leads to unfavorable performance in the following
micro-benchmark on i386 with CONFIG_REGPARM=n:

Command:
        time cp cold /dev/null

Summary (averages of the runs below):
              user(s)  sys(s)  cpu(%)  total(s)
no-fastcall    1.24    24.88    90.9    28.57
fastcall       1.16    25.69    91.5    29.23

Details:
without fastcall:
cp cold /dev/null  1.27s user 24.63s system 91% cpu 28.348 total
cp cold /dev/null  1.17s user 25.09s system 91% cpu 28.653 total
cp cold /dev/null  1.24s user 24.75s system 91% cpu 28.448 total
cp cold /dev/null  1.20s user 25.04s system 91% cpu 28.614 total
cp cold /dev/null  1.31s user 24.67s system 91% cpu 28.499 total
cp cold /dev/null  1.30s user 24.87s system 91% cpu 28.530 total
cp cold /dev/null  1.26s user 24.84s system 91% cpu 28.542 total
cp cold /dev/null  1.16s user 25.15s system 90% cpu 28.925 total

with fastcall:
cp cold /dev/null  1.16s user 26.39s system 91% cpu 30.061 total
cp cold /dev/null  1.25s user 26.53s system 91% cpu 30.378 total
cp cold /dev/null  1.10s user 25.32s system 92% cpu 28.679 total
cp cold /dev/null  1.15s user 25.20s system 91% cpu 28.747 total
cp cold /dev/null  1.19s user 25.38s system 92% cpu 28.841 total
cp cold /dev/null  1.11s user 25.75s system 92% cpu 29.126 total
cp cold /dev/null  1.17s user 25.49s system 91% cpu 29.042 total
cp cold /dev/null  1.17s user 25.49s system 92% cpu 28.970 total

Signed-off-by: Wu Fengguang <wfg@mail.ustc.edu.cn>
---


--- linux-2.6.17-rc5-mm2.orig/include/linux/mm.h
+++ linux-2.6.17-rc5-mm2/include/linux/mm.h
@@ -1074,7 +1074,7 @@ page_cache_readahead_adaptive(struct add
 			struct file_ra_state *ra, struct file *filp,
 			struct page *prev_page, struct page *page,
 			pgoff_t first_index, pgoff_t index, pgoff_t last_index);
-void fastcall readahead_cache_hit(struct file_ra_state *ra, struct page *page);
+void readahead_cache_hit(struct file_ra_state *ra, struct page *page);

 #ifdef CONFIG_ADAPTIVE_READAHEAD
 extern int readahead_ratio;
--- linux-2.6.17-rc5-mm2.orig/mm/readahead.c
+++ linux-2.6.17-rc5-mm2/mm/readahead.c
@@ -1916,7 +1916,7 @@ readit:
  * readahead_cache_hit() is the feedback route of the adaptive read-ahead
  * logic. It must be called on every access on the read-ahead pages.
  */
-void fastcall readahead_cache_hit(struct file_ra_state *ra, struct page *page)
+void readahead_cache_hit(struct file_ra_state *ra, struct page *page)
 {
 	if (PageActive(page) || PageReferenced(page))
 		return;

--


* [PATCH 4/5] readahead: backoff on I/O error
       [not found] ` <20060609081120.531013274@localhost.localdomain>
@ 2006-06-09  8:08   ` Wu Fengguang
  2006-06-10 18:33     ` Ingo Oeser
  0 siblings, 1 reply; 12+ messages in thread
From: Wu Fengguang @ 2006-06-09  8:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Michael Tokarev, Jens Axboe

[-- Attachment #1: readahead-eio-case.patch --]
[-- Type: text/plain, Size: 3496 bytes --]

Backoff readahead size exponentially on I/O error.

------
Michael Tokarev <mjt@tls.msk.ru> described the problem as:

[QUOTE]
Suppose there's a CD-ROM with a scratch/etc, one sector is unreadable.
In order to "fix" it, one has to read it and write it to another CD-ROM,
or something.. or just ignore the error (if it's just a skip in a video
stream).  Let's assume the unreadable block is number U.

But the current behavior is just insane.  An application requests block
number N, which is before U. The kernel tries to read-ahead blocks N..U.
The CD-ROM drive tries to read it, re-reads it.. for some time.  Finally,
when all the N..U-1 blocks are read, the kernel returns block number N
(as requested) to the application, successfully.

Now the app requests block number N+1, and the kernel tries to read
blocks N+1..U+1.  Retrying again as in the previous step.

And so on, up to when the app requests block number U-1.  And when,
finally, it requests block U, it receives a read error.

So, the kernel currently tries to re-read the same failing block as
many times as the current readahead value (256 times by default).

This whole process already killed my CD-ROM drive (I posted about it
to LKML several months ago) - literally, the drive has fried, and
does not work anymore.  Of course that problem was a bug in the firmware
(or whatever) of the drive *too*, but.. the main problem here is the
current readahead logic as described above.
[/QUOTE]

Which was confirmed by Jens Axboe <axboe@suse.de>:

[QUOTE]
For ide-cd, it tends to only end the first part of the request on a
medium error. So you may see a lot of repeats :/
[/QUOTE]

With this patch, retries are expected to be reduced from, say, 256, to 5.

Signed-off-by: Wu Fengguang <wfg@mail.ustc.edu.cn>
---


--- linux-2.6.17-rc4-mm3.orig/mm/filemap.c
+++ linux-2.6.17-rc4-mm3/mm/filemap.c
@@ -809,6 +809,33 @@ grab_cache_page_nowait(struct address_sp
 EXPORT_SYMBOL(grab_cache_page_nowait);
 
 /*
+ * CD/DVDs are error prone. When a medium error occurs, the driver may fail
+ * a _large_ part of the i/o request. Imagine the worst scenario:
+ *
+ *      ---R__________________________________________B__________
+ *         ^ reading here                             ^ bad block(assume 4k)
+ *
+ * read(R) => miss => readahead(R...B) => media error => frustrating retries
+ * => failing the whole request => read(R) => read(R+1) =>
+ * readahead(R+1...B+1) => bang => read(R+2) => read(R+3) =>
+ * readahead(R+3...B+2) => bang => read(R+3) => read(R+4) =>
+ * readahead(R+4...B+3) => bang => read(R+4) => read(R+5) => ......
+ *
+ * It is going insane. Fix it by quickly scaling down the readahead size.
+ */
+static void shrink_readahead_size_eio(struct file *filp,
+					struct file_ra_state *ra)
+{
+	if (!ra->ra_pages)
+		return;
+
+	ra->ra_pages /= 4;
+	printk(KERN_WARNING "Retracting readahead size of %s to %luK\n",
+			filp->f_dentry->d_iname,
+			ra->ra_pages << (PAGE_CACHE_SHIFT - 10));
+}
+
+/*
  * This is a generic file read routine, and uses the
  * mapping->a_ops->readpage() function for the actual low-level
  * stuff.
@@ -983,6 +1010,7 @@ readpage:
 				}
 				unlock_page(page);
 				error = -EIO;
+				shrink_readahead_size_eio(filp, &ra);
 				goto readpage_error;
 			}
 			unlock_page(page);
@@ -1535,6 +1563,7 @@ page_not_uptodate:
 	 * Things didn't work out. Return zero to tell the
 	 * mm layer so, possibly freeing the page cache page first.
 	 */
+	shrink_readahead_size_eio(file, ra);
 	page_cache_release(page);
 	return NULL;
 }

--


* [PATCH 5/5] readahead: remove size limit on read_ahead_kb
       [not found] ` <20060609081121.002572642@localhost.localdomain>
@ 2006-06-09  8:08   ` Wu Fengguang
  0 siblings, 0 replies; 12+ messages in thread
From: Wu Fengguang @ 2006-06-09  8:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Wu Fengguang

[-- Attachment #1: readahead-max-kb-sysfs-simplify.patch --]
[-- Type: text/plain, Size: 1389 bytes --]

Remove the unnecessary size limit on setting read_ahead_kb.

Also make possibly large values harmless: the stock readahead path is
now protected by always consulting the available memory before
applying this number. The other readahead paths already do so.

read_ahead_kb used to be capped by the queue's max_sectors, which can
be too rigid because some devices set max_sectors to values as small
as 64kb. That led to many user complaints.

Signed-off-by: Wu Fengguang <wfg@mail.ustc.edu.cn>
---


 block/ll_rw_blk.c |    5 -----
 1 files changed, 5 deletions(-)

--- linux-2.6.17-rc6-mm1.orig/block/ll_rw_blk.c
+++ linux-2.6.17-rc6-mm1/block/ll_rw_blk.c
@@ -3810,12 +3810,7 @@ queue_ra_store(struct request_queue *q, 
 	unsigned long ra_kb;
 	ssize_t ret = queue_var_store(&ra_kb, page, count);
 
-	spin_lock_irq(q->queue_lock);
-	if (ra_kb > (q->max_sectors >> 1))
-		ra_kb = (q->max_sectors >> 1);
-
 	q->backing_dev_info.ra_pages = ra_kb >> (PAGE_CACHE_SHIFT - 10);
-	spin_unlock_irq(q->queue_lock);
 
 	return ret;
 }
--- linux-2.6.17-rc6-mm1.orig/mm/readahead.c
+++ linux-2.6.17-rc6-mm1/mm/readahead.c
@@ -156,7 +156,7 @@ EXPORT_SYMBOL_GPL(file_ra_state_init);
  */
 static inline unsigned long get_max_readahead(struct file_ra_state *ra)
 {
-	return ra->ra_pages;
+	return max_sane_readahead(ra->ra_pages);
 }
 
 static inline unsigned long get_min_readahead(struct file_ra_state *ra)

--


* Re: [PATCH 4/5] readahead: backoff on I/O error
  2006-06-09  8:08   ` [PATCH 4/5] readahead: backoff on I/O error Wu Fengguang
@ 2006-06-10 18:33     ` Ingo Oeser
  2006-06-10 19:48       ` Michael Tokarev
  0 siblings, 1 reply; 12+ messages in thread
From: Ingo Oeser @ 2006-06-10 18:33 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Andrew Morton, linux-kernel, Michael Tokarev, Jens Axboe

Hi Fengguang,

On Friday, 9. June 2006 10:08, Wu Fengguang wrote:
> Backoff readahead size exponentially on I/O error.
 
> With this patch, retries are expected to be reduced from, say, 256, to 5.
 
1. Would you mind pushing this patch to -stable?

Reason: If drive-killing was hit in the field, this should be critical
	enough.

2. Could you disable (at least optionally) readahead completely
   on the first I/O error?

Reason: In a data recovery situation (hitting EIO quite often,
	but not really sequentially) readahead is counterproductive,
	e.g. trying to save an old CD with the cdparanoia software.


Regards

Ingo Oeser


* Re: [PATCH 4/5] readahead: backoff on I/O error
  2006-06-10 18:33     ` Ingo Oeser
@ 2006-06-10 19:48       ` Michael Tokarev
       [not found]         ` <20060612011245.GA5214@mail.ustc.edu.cn>
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Tokarev @ 2006-06-10 19:48 UTC (permalink / raw)
  To: Ingo Oeser; +Cc: Wu Fengguang, Andrew Morton, linux-kernel, Jens Axboe

Ingo Oeser wrote:
> Hi Fengguang,
> 
> On Friday, 9. June 2006 10:08, Wu Fengguang wrote:
>> Backoff readahead size exponentially on I/O error.
>  
>> With this patch, retries are expected to be reduced from, say, 256, to 5.
>  
> 1. Would you mind pushing this patch to -stable?
> 
> Reason: If drive-killing was hit in the field, this should be critical
> 	enough.

Well, it looks like I'm the only one in this world whose drive was
fried this way... ;)  Note it's a combination of two issues: it's
definitely buggy firmware (or hardware) - it should not fry itself,
no matter how you hit it from software - and the readahead+
EIO logic.  So.. well, I don't really want to try another drive,
but it *seems* like it will eventually "recover".  Not that it's
still possible to watch a DVD with a single scratch (one unreadable
block) anyway with current code (it will freeze and freeze and freeze
again and again), but well, it's not really THAT critical.  My drive
is already dead, nothing worse can happen to it anyway ;)

> 2. Could you disable (at least optionally) readahead completely
>   on the first I/O error?
> 
> Reason: In a data recovery situation (hitting EIO quite often,
> 	but not really sequentially) readahead is counterproductive,
> 	e.g. trying to save an old CD with the cdparanoia software.

I'm thinking about all this again.. well.  Read-ahead is definitely
very useful on a CD (I'm referring to all optical media here, be it
DVD, or BlueRay, or whatever; floppies too, but there it's less useful
due to speed of the whole thing) - I mean, say, DVDs are played more
smoothly if readahead is enabled; a "live CD" distro will be more
responsive if readahead is enabled, and so on - the effect of RA is
trivially visible.

But still, for a scratched CD, it might be a good idea to turn RA off
while playing it, completely - by means of, eg, blockdev --setra 0,
or something like that.  Yes not many (end)users know this tool, yes
it's privileged (oh well), but it helps.

Why I recall --setra is: when kernel will start reducing RA by its
own, next question will be "why my CD is too slow?" -- after playing
a scratched CD, you insert another one, and RA is *still* zero...

So I'm not really sure how simple the solution should be.

I *think* here we have some logic error, either fundamental or something
simple.

I mean the following.

Let's say an application requests block number N, which is NOT in cache.
The kernel, with readahead set, tries to read blocks N..N+R (R = readahead
value).  In.. hmm, one request?  So the question is whether this one
big request, if block number N+R fails, will be completed partially or
will fail entirely.  I think the right way is to complete that request
partially, filling the buffer cache with blocks N..N+R-1.  Now, when an app
tries to access block N+1, it's already in the cache, and all goes well
up to block number N+R, which is unreadable.  So the kernel tries to request
blocks N+R..N+2R, fails, and returns an error to the application.. which will
next (possibly) request block number N+R+1; the kernel will try to read
blocks N+R+1..N+2R+1, will succeed, and everything will go just fine,
with the same readahead value and with the bad block being retried only twice.

So the question really is whether that large request fails as a whole
or partially succeeds.  As Jens Axboe explained, it depends on the driver
(or hardware?  Will, say, a hard drive behave sanely in this case, returning
partial data instead of failing the whole request?  Is it even possible
for a drive to return such a "partially successful" result?)

If it's really difficult/impossible to "fix" this "fail the whole thing"
case, then yes, the only way to go is to disable (or reduce) readahead,
because it's the only fix left...  But is it impossible?

Speaking of the patch, it seems to work fine - *a lot* better than the
current kernel code.  I tried one slightly scratched CD - a process
reading it "unstalls" in more-or-less sane time (still long, but not
half an hour as before ;)

/mjt


* Re: [PATCH 4/5] readahead: backoff on I/O error
       [not found]         ` <20060612011245.GA5214@mail.ustc.edu.cn>
@ 2006-06-12  1:12           ` Wu Fengguang
  2006-06-12  3:43             ` Andrew Morton
  2006-06-12 11:06             ` Michael Tokarev
  0 siblings, 2 replies; 12+ messages in thread
From: Wu Fengguang @ 2006-06-12  1:12 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: Ingo Oeser, Andrew Morton, linux-kernel, Jens Axboe

Hi Ingo and Michael,

On Sat, Jun 10, 2006 at 11:48:45PM +0400, Michael Tokarev wrote:
> Ingo Oeser wrote:
> >> Backoff readahead size exponentially on I/O error.
> >  
> >> With this patch, retries are expected to be reduced from, say, 256, to 5.
> >  
> > 1. Would you mind pushing this patch to -stable?
> > 
> > Reason: If drive-killing was hit in the field, this should be critical
> > 	enough.
> 

Andrew:
I was a bit afraid about that because I have no CDROM to try it out.
But since Michael has tested it OK, it should be OK for the stable kernel.

> > 2. Could you disable (at least optionally) readahead completely
> >   on the first I/O error?
> > 
> > Reason: In a data recovery situation (hitting EIO quite often,
> > 	but not really sequentially) readahead is counterproductive,
> > 	e.g. trying to save an old CD with the cdparanoia software.

Should be OK. I have no experience with it, so Michael and users with
a taste for it have the last word :)

It might also be helpful to generate a uevent for it. Userland
tools could then run blockdev --setra in a configurable way.

> I'm thinking about all this again.. well.  Read-ahead is definitely
> very useful on a CD (I'm referring to all optical media here, be it
> DVD, or BlueRay, or whatever; floppies too, but there it's less useful
> due to speed of the whole thing) - I mean, say, DVDs are played more
> smoothly if readahead is enabled; a "live CD" distro will be more
> responsive if readahead is enabled, and so on - the effect of RA is
> trivially visible.
> 
> But still, for a scratched CD, it might be a good idea to turn RA off
> while playing it, completely - by means of, eg, blockdev --setra 0,
> or something like that.  Yes not many (end)users know this tool, yes
> it's privileged (oh well), but it helps.
> 
> Why I recall --setra is: when kernel will start reducing RA by its
> own, next question will be "why my CD is too slow?" -- after playing
> a scratched CD, you insert another one, and RA is *still* zero...
> 
> So I'm not really sure how simple the solution should be.

The patch operates on a per-fd basis, so the readahead size of other
files and CDs will not be affected. The readahead size is also restored
on reopening the file :)

Thanks,
Wu


* Re: [PATCH 4/5] readahead: backoff on I/O error
  2006-06-12  1:12           ` Wu Fengguang
@ 2006-06-12  3:43             ` Andrew Morton
  2006-06-12 11:06             ` Michael Tokarev
  1 sibling, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2006-06-12  3:43 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: mjt, ioe-lkml, linux-kernel, axboe

On Mon, 12 Jun 2006 09:12:45 +0800
Wu Fengguang <wfg@mail.ustc.edu.cn> wrote:

> Andrew:
> I was a bit afraid about that because I have no CDROM to try it out.
> But since Michael has tested it OK, it should be OK for the stable kernel.

hm.  It's a problem, but not a terribly huge one.



It's nice to print the filename, but we shouldn't use d_iname because long
filenames aren't stored there.  And if the file was unlinked after open
then I don't think we'll oops, but the filename isn't very meaningful.


So...


--- devel/mm/filemap.c~readahead-backoff-on-i-o-error-tweaks	2006-06-11 20:40:52.000000000 -0700
+++ devel-akpm/mm/filemap.c	2006-06-11 20:41:05.000000000 -0700
@@ -804,8 +804,7 @@ static void shrink_readahead_size_eio(st
 		return;
 
 	ra->ra_pages /= 4;
-	printk(KERN_WARNING "Retracting readahead size of %s to %luK\n",
-			filp->f_dentry->d_iname,
+	printk(KERN_WARNING "Reducing readahead size to %luK\n",
 			ra->ra_pages << (PAGE_CACHE_SHIFT - 10));
 }
 
_



* Re: [PATCH 4/5] readahead: backoff on I/O error
  2006-06-12  1:12           ` Wu Fengguang
  2006-06-12  3:43             ` Andrew Morton
@ 2006-06-12 11:06             ` Michael Tokarev
  2006-06-17  6:38               ` Michael Tokarev
  1 sibling, 1 reply; 12+ messages in thread
From: Michael Tokarev @ 2006-06-12 11:06 UTC (permalink / raw)
  To: Wu Fengguang, Michael Tokarev, Ingo Oeser, Andrew Morton,
	linux-kernel, Jens Axboe

Wu Fengguang wrote:
[]
> Andrew:
> I was a bit afraid about that because I have no CDROM to try it out.
> But since Michael has tested it OK, it should be OK for the stable kernel.

Hmm.. I haven't "tested it OK" yet. I just ran a quick-n-dirty
check.  Tomorrow I'll be in our office where I have more adequate
"test environment" for this stuff (different CD/DVD drives and
disks, some scratched), and I'll do some real testing.

Thanks.

/mjt



* Re: [PATCH 4/5] readahead: backoff on I/O error
  2006-06-12 11:06             ` Michael Tokarev
@ 2006-06-17  6:38               ` Michael Tokarev
  2006-06-17  7:00                 ` Michael Tokarev
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Tokarev @ 2006-06-17  6:38 UTC (permalink / raw)
  To: Michael Tokarev
  Cc: Wu Fengguang, Ingo Oeser, Andrew Morton, linux-kernel, Jens Axboe

Michael Tokarev wrote:
> Wu Fengguang wrote:
> []
>> Andrew:
>> I was a bit afraid about that because I have no CDROM to try it out.
>> But since Michael has tested it OK, it should be OK for the stable kernel.
> 
> Hmm.. I haven't "tested it OK" yet. I just ran a quick-n-dirty
> check.  Tomorrow I'll be in our office where I have more adequate
> "test environment" for this stuff (different CD/DVD drives and
> disks, some scratched), and I'll do some real testing.

Ok, almost a week has passed, and... oh well.

The patch itself works, reducing the readahead quickly, *when
I use dd with default bs=512* (see below), but it looks somewhat
strange:

hdc: command error: status=0x51 { DriveReady SeekComplete Error }
hdc: command error: error=0x51 { IllegalLengthIndication LastFailedSense=0x05 }
ide: failed opcode was: unknown
end_request: I/O error, dev hdc, sector 706696
Buffer I/O error on device hdc, logical block 176674
Retracting readahead size of hdc to 32K
Retracting readahead size of hdc to 8K
Retracting readahead size of hdc to 0K

The strangeness is that the message "Retracting readahead size" is
repeated, without intermediate "hdc: command error: ..." lines;
i.e., from the dmesg output it looks like we reduce the RA size 3
times in a row without any reason (after only one error instead
of 3) (all the above messages are recorded within the same second).

But the interesting thing is the 2nd line above -- error=0x51
{ IllegalLengthIndication LastFailedSense=0x05 } -- this
IllegalLengthIndication bit.  In the SBC-2 standard this bit is
called "Incorrect Length Indication (ILI)", not "Illegal"
(this probably needs to be fixed in drivers/ide/ide-lib.c).

The meaning of this bit is that the drive completed PART of
the original request, ie, it has read SOME data, probably
up to the pre-error block, so it should be safe to fill in
the cache with the read data and continue, without changing
anything.

It looks like *ALL* cd-rom drives I have here return this sort
of error on problematic reads.  At least I wasn't able to find
any which does not (I tried - hence the delay in replying).

So... it looks like it should be possible to fix the original
issue in the right place.

Ok, here's the most interesting (IMHO) part.

All the above works if I'm using dd with default bs=512.  BUT.
Once I add bs=2k to the dd line..  it (the patch) stops working,
and shows exactly the behaviour I've seen originally (which
killed my drive):

10:18:41 hdc: command error: status=0x51 { DriveReady SeekComplete Error }
10:18:41 hdc: command error: error=0x51 { IllegalLengthIndication LastFailedSense=0x05 }
10:18:41 ide: failed opcode was: unknown
10:18:41 end_request: I/O error, dev hdc, sector 706240
10:18:41 Buffer I/O error on device hdc, logical block 88280
10:18:47 hdc: command error: status=0x51 { DriveReady SeekComplete Error }
10:18:47 hdc: command error: error=0x51 { IllegalLengthIndication LastFailedSense=0x05 }
10:18:47 ide: failed opcode was: unknown
10:18:47 end_request: I/O error, dev hdc, sector 706248
10:18:47 Buffer I/O error on device hdc, logical block 88281
10:18:53 hdc: command error: status=0x51 { DriveReady SeekComplete Error }
10:18:53 hdc: command error: error=0x51 { IllegalLengthIndication LastFailedSense=0x05 }
10:18:53 ide: failed opcode was: unknown
10:18:53 end_request: I/O error, dev hdc, sector 706256
10:18:53 Buffer I/O error on device hdc, logical block 88282

....a lot of similar errors skipped, with consecutive block numbers.
   At this point (10:18), I hit Ctrl+C in the xterm where dd was running...

10:23:44 hdc: command error: status=0x51 { DriveReady SeekComplete Error }
10:23:44 hdc: command error: error=0x51 { IllegalLengthIndication LastFailedSense=0x05 }
10:23:44 ide: failed opcode was: unknown
10:23:44 end_request: I/O error, dev hdc, sector 706680
10:23:44 Buffer I/O error on device hdc, logical block 88335
10:23:54 hdc: command error: status=0x51 { DriveReady SeekComplete Error }
10:23:54 hdc: command error: error=0x51 { IllegalLengthIndication LastFailedSense=0x05 }
10:23:54 ide: failed opcode was: unknown
10:23:54 end_request: I/O error, dev hdc, sector 706688
10:23:54 Buffer I/O error on device hdc, logical block 88336
10:24:32 hdc: command error: status=0x51 { DriveReady SeekComplete Error }
10:24:32 hdc: command error: error=0x51 { IllegalLengthIndication LastFailedSense=0x05 }
10:24:32 ide: failed opcode was: unknown
10:24:32 end_request: I/O error, dev hdc, sector 706696
10:24:32 Buffer I/O error on device hdc, logical block 88337
10:24:32 Retracting readahead size of hdc to 32K

And finally here we go: the new logic triggered.  And the dd command
unfroze as well (reacted to the interrupt).

So now I see why the first strangeness above (3 times "retracting RA
size" after only one error): the hw sector size is 2048 bytes, but
dd requested 512-byte blocks, so one failed hw sector = 4 failed
reads.

So finally.. it looks like the whole problem is somewhere else still.
The last batch of messages shows exactly the previous behaviour
(numerous attempts to read quite a lot of failing sectors, which
takes quite some time too - a lot, depending on the CD-ROM drive),
BEFORE the new RA-reducing logic gets triggered.

I wonder...

 a) where all those read attempts come from, and
 b) whether it's possible to use this Incorrect Length Indication
    (ILI) bit and all the returned data.

And btw.. why "ide: failed opcode was: unknown" ?

Thanks

/mjt


* Re: [PATCH 4/5] readahead: backoff on I/O error
  2006-06-17  6:38               ` Michael Tokarev
@ 2006-06-17  7:00                 ` Michael Tokarev
  0 siblings, 0 replies; 12+ messages in thread
From: Michael Tokarev @ 2006-06-17  7:00 UTC (permalink / raw)
  To: Michael Tokarev
  Cc: Wu Fengguang, Ingo Oeser, Andrew Morton, linux-kernel, Jens Axboe

Michael Tokarev wrote:
> Michael Tokarev wrote:
[]
> .....a lot of similar errors skipped, with consecutive block numbers.
>    At this point (10:18), I hit Ctrl+C in the xterm where dd was running...
[]
> And finally here we go: the new logic triggered.  And the dd command
> unfroze as well (reacted to the interrupt).
[]
> So finally.. it looks like the whole problem is somewhere else still.
> The last batch of messages shows exactly the previous behaviour
> (numerous attempts to read quite a lot of failing sectors, which
> takes quite some time too - a lot, depending on the CD-ROM drive),
> BEFORE the new RA-reducing logic gets triggered.

I forgot to add: when turning RA off for the device before reading
the disk, read errors are propagated to the application quickly,
without those long delays and attempts to read all the bad blocks.
Of course (with conv=noerror) dd will request the next blocks anyway,
but it still stays responsive and reports each error quickly.

The patch still helps: after the first batch of read attempts,
RA gets reduced, then follows the next (much smaller) batch, and
finally RA is set to 0 and the above 'quick route' (as if
RA had been turned off initially) takes effect.

/mjt


end of thread, other threads:[~2006-06-17  7:00 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20060609080801.741901069@localhost.localdomain>
2006-06-09  8:08 ` [PATCH 0/5] Adaptive readahead updates 2 Wu Fengguang
     [not found] ` <20060609081119.231751179@localhost.localdomain>
2006-06-09  8:08   ` [PATCH 1/5] readahead: no RA_FLAG_EOF on single page file Wu Fengguang
     [not found] ` <20060609081120.054527393@localhost.localdomain>
2006-06-09  8:08   ` [PATCH 3/5] readahead: call scheme - no fastcall for readahead_cache_hit() Wu Fengguang
     [not found] ` <20060609081120.531013274@localhost.localdomain>
2006-06-09  8:08   ` [PATCH 4/5] readahead: backoff on I/O error Wu Fengguang
2006-06-10 18:33     ` Ingo Oeser
2006-06-10 19:48       ` Michael Tokarev
     [not found]         ` <20060612011245.GA5214@mail.ustc.edu.cn>
2006-06-12  1:12           ` Wu Fengguang
2006-06-12  3:43             ` Andrew Morton
2006-06-12 11:06             ` Michael Tokarev
2006-06-17  6:38               ` Michael Tokarev
2006-06-17  7:00                 ` Michael Tokarev
     [not found] ` <20060609081121.002572642@localhost.localdomain>
2006-06-09  8:08   ` [PATCH 5/5] readahead: remove size limit on read_ahead_kb Wu Fengguang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox