From: Wu Fengguang <wfg@mail.ustc.edu.cn>
To: Jens Axboe <axboe@suse.de>
Cc: Michael Tokarev <mjt@tls.msk.ru>,
linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>
Subject: Re: [PATCH 00/32] Adaptive readahead V14
Date: Tue, 30 May 2006 22:34:18 +0800
Message-ID: <348999644.00549@ustc.edu.cn>
Message-ID: <20060530143417.GA9126@mail.ustc.edu.cn>
In-Reply-To: <20060530122934.GT4199@suse.de>
On Tue, May 30, 2006 at 02:29:34PM +0200, Jens Axboe wrote:
> On Tue, May 30 2006, Wu Fengguang wrote:
> > On Tue, May 30, 2006 at 11:23:10AM +0200, Jens Axboe wrote:
> > > On Mon, May 29 2006, Wu Fengguang wrote:
> > > > On Sun, May 28, 2006 at 11:23:33PM +0400, Michael Tokarev wrote:
> > > > > Wu Fengguang wrote:
> > > > > >
> > > > > > It's not quite reasonable for readahead to worry about media errors.
> > > > > > If the media fails, fix it. Or it will hurt read sooner or later.
> > > > >
> > > > > Well... In reality, it is just the opposite.
> > > > >
> > > > > Suppose there's a CD-rom with a scratch/etc, one sector is unreadable.
> > > > > In order to "fix" it, one has to read it and write to another CD-rom,
> > > > > or something.. or just ignore the error (if it's just a skip in a video
> > > > > stream). Let's assume the unreadable block is number U.
> > > > >
> > > > > But current behavior is just insane. An application requests block
> > > > > number N, which is before U. Kernel tries to read-ahead blocks N..U.
> > > > > Cdrom drive tries to read it, re-read it.. for some time. Finally,
> > > > > when all the N..U-1 blocks are read, kernel returns block number N
> > > > > (as requested) to an application, successfully.
> > > > >
> > > > > Now an app requests block number N+1, and kernel tries to read
> > > > > blocks N+1..U+1. Retrying again as in previous step.
> > > > >
> > > > > And so on, up to when an app requests block number U-1. And when,
> > > > > finally, it requests block U, it receives read error.
> > > > >
> > > > > So, kernel currently tries to re-read the same failing block as
> > > > > many times as the current readahead value (256 (times?) by default).
> > > >
> > > > Good insight... But I'm not sure about it.
> > > >
> > > > Jens, will a bad sector cause the _whole_ request to fail?
> > > > Or only the page that contains the bad sector?
> > >
> > > Depends entirely on the driver, and at that point we've typically lost the
> > > fact that this is a read-ahead request and could just be tossed. In
> > > fact, the entire request may consist of read-ahead as well as normal
> > > read entries.
> > >
> > > For ide-cd, it tends to only end the first part of the request on a
> > > medium error. So you may see a lot of repeats :/
> >
> > Another question about it:
> > If the block layer issued a request, which happened to contain
> > R ranges of B bad blocks, i.e. 3 ranges of 9 bad-blocks:
> > ___b_____bb___________bbbbbb____
> > How many retries will it incur? 1, 3, 9, or something else?
> > If it is 3 or more, then we have even worse luck :(
>
> Again, this is driver specific. But for ide-cd, if it's using PIO the
> right thing should happen since we do each chunk individually. For dma
> it looks much worse, since we only get an EIO back from the hardware for
> the entire range. It won't do the right thing at all, only for the very
> last thing when we get past the last bbbbbb block.
>
> > Will it be suitable to _automatically_ apply the following retracting
> > policy on I/O error? Please comment if there are better ways:
>
> Probably it should be even more aggressively scaling down. The real
> problem is the drivers of course, we should spend some time fixing them
> up too.
nod, it's so frustrating...
Updated the patch, please comment if necessary.
With this patch, retries are reduced from, say, 256, to 5.
Wu
---
--- linux.orig/mm/filemap.c
+++ linux/mm/filemap.c
@@ -809,6 +809,32 @@ grab_cache_page_nowait(struct address_sp
 EXPORT_SYMBOL(grab_cache_page_nowait);
 
 /*
+ * CD/DVDs are error prone. When a medium error occurs, the driver may fail
+ * a _large_ part of the i/o request. Imagine the worst scenario:
+ *
+ * ---R__________________________________________B__________
+ *    ^ reading here                             ^ bad block(assume 4k)
+ *
+ * read(R) => miss => readahead(R...B) => media error => frustrating retries
+ * => failing the whole request => read(R) => read(R+1) =>
+ * readahead(R+1...B+1) => bang => read(R+2) => read(R+3) =>
+ * readahead(R+3...B+2) => bang => read(R+3) => read(R+4) =>
+ * readahead(R+4...B+3) => bang => read(R+4) => read(R+5) => ......
+ *
+ * It is going insane. Fix it by quickly scaling down the readahead size.
+ */
+static void shrink_readahead_size_eio(struct file *filp,
+					struct file_ra_state *ra)
+{
+	if (!ra->ra_pages)
+		return;
+
+	ra->ra_pages /= 4;
+	printk(KERN_WARNING "Retracting readahead size of %s to %lu\n",
+			filp->f_dentry->d_iname, ra->ra_pages);
+}
+
+/*
  * This is a generic file read routine, and uses the
  * mapping->a_ops->readpage() function for the actual low-level
  * stuff.
@@ -983,6 +1009,7 @@ readpage:
 				}
 				unlock_page(page);
 				error = -EIO;
+				shrink_readahead_size_eio(filp, &ra);
 				goto readpage_error;
 			}
 			unlock_page(page);
@@ -1535,6 +1562,7 @@ page_not_uptodate:
 	 * Things didn't work out. Return zero to tell the
 	 * mm layer so, possibly freeing the page cache page first.
 	 */
+	shrink_readahead_size_eio(file, ra);
 	page_cache_release(page);
 	return NULL;
 }