Re: [PATCH 00/32] Adaptive readahead V14

From: Wu Fengguang <wfg@mail.ustc.edu.cn>
To: Jens Axboe <axboe@suse.de>
Cc: Michael Tokarev <mjt@tls.msk.ru>,
	linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>
Subject: Re: [PATCH 00/32] Adaptive readahead V14
Date: Tue, 30 May 2006 22:34:18 +0800
Message-ID: <348999644.00549@ustc.edu.cn>
Message-ID: <20060530143417.GA9126@mail.ustc.edu.cn>
In-Reply-To: <20060530122934.GT4199@suse.de>

On Tue, May 30, 2006 at 02:29:34PM +0200, Jens Axboe wrote:
> On Tue, May 30 2006, Wu Fengguang wrote:
> > On Tue, May 30, 2006 at 11:23:10AM +0200, Jens Axboe wrote:
> > > On Mon, May 29 2006, Wu Fengguang wrote:
> > > > On Sun, May 28, 2006 at 11:23:33PM +0400, Michael Tokarev wrote:
> > > > > Wu Fengguang wrote:
> > > > > > 
> > > > > > It's not quite reasonable for readahead to worry about media errors.
> > > > > > If the media fails, fix it. Or it will hurt read sooner or later.
> > > > > 
> > > > > Well... In reality, it is just the opposite.
> > > > > 
> > > > > Suppose there's a CD-rom with a scratch/etc, one sector is unreadable.
> > > > > In order to "fix" it, one has to read it and write to another CD-rom,
> > > > > or something.. or just ignore the error (if it's just a skip in a video
> > > > > stream).  Let's assume the unreadable block is number U.
> > > > > 
> > > > > But current behavior is just insane.  An application requests block
> > > > > number N, which is before U. Kernel tries to read-ahead blocks N..U.
> > > > > Cdrom drive tries to read it, re-read it.. for some time.  Finally,
> > > > > when all the N..U-1 blocks are read, kernel returns block number N
> > > > > (as requested) to an application, successfully.
> > > > > 
> > > > > Now an app requests block number N+1, and kernel tries to read
> > > > > blocks N+1..U+1.  Retrying again as in previous step.
> > > > > 
> > > > > And so on, up to when an app requests block number U-1.  And when,
> > > > > finally, it requests block U, it receives read error.
> > > > > 
> > > > > So, kernel currently tries to re-read the same failing block as
> > > > > many times as the current readahead value (256 (times?) by default).
> > > > 
> > > > Good insight... But I'm not sure about it.
> > > > 
> > > > Jens, will a bad sector cause the _whole_ request to fail?
> > > > Or only the page that contains the bad sector?
> > > 
> > > Depends entirely on the driver, and at that point we've typically lost the
> > > fact that this is a read-ahead request and could just be tossed. In
> > > fact, the entire request may consist of read-ahead as well as normal
> > > read entries.
> > > 
> > > For ide-cd, it tends to only end the first part of the request on a
> > > medium error. So you may see a lot of repeats :/
> > 
> > Another question about it:
> >         If the block layer issued a request, which happened to contain
> >         R ranges of B bad blocks, i.e. 3 ranges of 9 bad-blocks:
> >                 ___b_____bb___________bbbbbb____
> >         How many retries will be incurred? 1, 3, 9, or something else?
> >         If it is 3 or more, then we are in even worse luck :(
> 
> Again, this is driver specific. But for ide-cd, if it's using PIO the
> right thing should happen since we do each chunk individually. For dma
> it looks much worse, since we only get an EIO back from the hardware for
> the entire range. It won't do the right thing at all, only for the very
> last thing when we get past the last bbbbbb block.
> 
> > Will it be suitable to _automatically_ apply the following retracting
> > policy on I/O error? Please comment if there's better ways:
> 
> Probably it should scale down even more aggressively. The real
> problem is the drivers of course, we should spend some time fixing them
> up too.

nod, it's so frustrating...

Updated the patch, please comment if necessary.

With this patch, the number of retries on a bad block drops from, say, 256 to 5.
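
To show where the 5 comes from, here is a minimal userspace sketch (not
kernel code). It assumes the readahead size starts at 256 pages and that
every failed request divides it by 4, as the patch does. The helper name
shrink_steps() is made up for illustration.

```c
/*
 * Userspace sketch, not kernel code: count how many times a readahead
 * size can be divided by 4 before it reaches zero. Each division stands
 * for one failed request that triggered the shrink.
 */
unsigned int shrink_steps(unsigned long ra_pages)
{
	unsigned int steps = 0;

	while (ra_pages) {
		ra_pages /= 4;	/* same scaling as shrink_readahead_size_eio() */
		steps++;
	}
	return steps;
}
```

Starting from 256 the size goes 64, 16, 4, 1, 0, so shrink_steps(256)
returns 5. After that ra_pages is zero and the early return in
shrink_readahead_size_eio() makes further calls a no-op.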

Wu
---

--- linux.orig/mm/filemap.c
+++ linux/mm/filemap.c
@@ -809,6 +809,32 @@ grab_cache_page_nowait(struct address_sp
 EXPORT_SYMBOL(grab_cache_page_nowait);
 
 /*
+ * CD/DVDs are error prone. When a medium error occurs, the driver may fail
+ * a _large_ part of the i/o request. Imagine the worst scenario:
+ *
+ *      ---R__________________________________________B__________
+ *         ^ reading here                             ^ bad block(assume 4k)
+ *
+ * read(R) => miss => readahead(R...B) => media error => frustrating retries
+ * => failing the whole request => read(R) => read(R+1) =>
+ * readahead(R+1...B+1) => bang => read(R+2) => read(R+3) =>
+ * readahead(R+3...B+2) => bang => read(R+3) => read(R+4) =>
+ * readahead(R+4...B+3) => bang => read(R+4) => read(R+5) => ......
+ *
+ * It is going insane. Fix it by quickly scaling down the readahead size.
+ */
+static void shrink_readahead_size_eio(struct file *filp,
+					struct file_ra_state *ra)
+{
+	if (!ra->ra_pages)
+		return;
+
+	ra->ra_pages /= 4;
+	printk(KERN_WARNING "Retracting readahead size of %s to %lu\n",
+			filp->f_dentry->d_iname, ra->ra_pages);
+}
+
+/*
  * This is a generic file read routine, and uses the
  * mapping->a_ops->readpage() function for the actual low-level
  * stuff.
@@ -983,6 +1009,7 @@ readpage:
 				}
 				unlock_page(page);
 				error = -EIO;
+				shrink_readahead_size_eio(filp, &ra);
 				goto readpage_error;
 			}
 			unlock_page(page);
@@ -1535,6 +1562,7 @@ page_not_uptodate:
 	 * Things didn't work out. Return zero to tell the
 	 * mm layer so, possibly freeing the page cache page first.
 	 */
+	shrink_readahead_size_eio(file, ra);
 	page_cache_release(page);
 	return NULL;
 }
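
For anyone who wants to play with the numbers, below is a toy userspace
model of the worst case described in the comment above. It is only a
sketch under stated assumptions, not a description of any driver: a request
that covers the bad page fails as a whole, only the page the reader asked
for is salvaged, and a zero readahead size means single-page reads. The
function name count_bad_page_hits() and its parameters are hypothetical.

```c
#include <stdbool.h>

/*
 * Toy model: a reader walks sequentially from page 0 up to and including
 * page `bad`. Returns how many requests covered the bad page.
 *
 * Assumptions (illustrative only): a request that covers the bad page
 * fails entirely, and only the page that was asked for is salvaged.
 */
unsigned long count_bad_page_hits(unsigned long bad, unsigned long ra_pages,
				  bool shrink_on_error)
{
	unsigned long hits = 0;
	unsigned long cached_until = 0;	/* pages below this index are cached */
	unsigned long p;

	for (p = 0; p <= bad; p++) {
		/* ra_pages == 0 means no readahead: read one page at a time */
		unsigned long window = ra_pages ? ra_pages : 1;

		if (p < cached_until)
			continue;		/* cache hit, no I/O */

		if (p + window - 1 < bad) {
			cached_until = p + window;	/* request succeeded */
			continue;
		}

		hits++;				/* request covered the bad page */
		if (shrink_on_error)
			ra_pages /= 4;		/* the policy from the patch */
		cached_until = p + 1;		/* only page p was salvaged */
	}
	return hits;
}
```

With the bad page at index 1000 and a 256-page window, this model reports
233 hits without the shrink and 5 with it. The exact figure without the
shrink depends on how the windows happen to align with the bad page; with
the shrink it cannot exceed 5 for a 256-page start.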
