All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@zip.com.au>
To: Andries.Brouwer@cwi.nl
Cc: linux-kernel@vger.kernel.org
Subject: Re: readahead
Date: Tue, 16 Apr 2002 12:23:45 -0700	[thread overview]
Message-ID: <3CBC7A41.384FD032@zip.com.au> (raw)
In-Reply-To: <UTC200204161910.g3GJAx009370.aeb@smtp.cwi.nl>

Andries.Brouwer@cwi.nl wrote:
> 
> ...
>     Do these cards not have a request queue?
> 
> The kernel views them as SCSI disks.
> So yes, I can do
> 
>    blockdev --setra 0 /dev/sdc
> 
> Unfortunately that does not help in the least.
> Indeed, the only user of the readahead info is
> readahead.c: get_max_readahead() and it does
> 
>         blk_ra_kbytes = blk_get_readahead(inode->i_dev) / 2;
>         if (blk_ra_kbytes < VM_MIN_READAHEAD)
>                 blk_ra_kbytes = VM_MAX_READAHEAD;
> 
> We need to distinguish between undefined, and explicily zero.

Yup.  The below (untested) patch should fix it up.  Assuming
that all drivers use blk_init_queue(), and heaven knows if
that's the case.  If not, the readahead window will have to be
set from userspace for those devices.

> ...
> 
>     Yes, but things should be OK as-is.  If the readahead attempt
>     gets an I/O error, do_generic_file_read will notice the non-uptodate
>     page and will issue a single-page read.  So everything up to
>     a page's distance from the bad block should be recoverable.
>     That's the theory; can't say that I've tested it.
> 
> It is really important to be able to tell the kernel to read and
> write only the blocks it has been asked to read and write and
> not to touch anything else.

That is really hard when there is a filesystem mounted.
Even with readahead set to zero, the kernel will go and
read PAGE_CACHE_SIZE chunks.  That's not worth changing 
to address this problem.  In the last resort, the
user would need to perform a sector-by-sector read of
the dud device into a regular file, recover from that.


--- 2.5.8/mm/readahead.c~readahead-fixes	Tue Apr 16 12:07:42 2002
+++ 2.5.8-akpm/mm/readahead.c	Tue Apr 16 12:13:52 2002
@@ -25,9 +25,6 @@
  * has a zero value of ra_sectors.
  */
 
-#define VM_MAX_READAHEAD	128	/* kbytes */
-#define VM_MIN_READAHEAD	16	/* kbytes (includes current page) */
-
 /*
  * Return max readahead size for this inode in number-of-pages.
  */
@@ -36,9 +33,6 @@ static int get_max_readahead(struct inod
 	unsigned blk_ra_kbytes = 0;
 
 	blk_ra_kbytes = blk_get_readahead(inode->i_dev) / 2;
-	if (blk_ra_kbytes < VM_MIN_READAHEAD)
-		blk_ra_kbytes = VM_MAX_READAHEAD;
-
 	return blk_ra_kbytes >> (PAGE_CACHE_SHIFT - 10);
 }
 
@@ -96,10 +90,10 @@ static int get_min_readahead(struct inod
  * second page) then the window will build more slowly.
  *
  * On a readahead miss (the application seeked away) the readahead window is shrunk
- * by 25%.  We don't want to drop it too aggressively, because it's a good assumption
- * that an application which has built a good readahead window will continue to
- * perform linear reads.  Either at the new file position, or at the old one after
- * another seek.
+ * by 25%.  We don't want to drop it too aggressively, because it is a good
+ * assumption that an application which has built a good readahead window will
+ * continue to perform linear reads.  Either at the new file position, or at the
+ * old one after another seek.
  *
  * There is a special-case: if the first page which the application tries to read
  * happens to be the first page of the file, it is assumed that a linear read is
@@ -109,9 +103,9 @@ static int get_min_readahead(struct inod
  * sequential file reading.
  *
  * This function is to be called for every page which is read, rather than when
- * it is time to perform readahead.  This is so the readahead algorithm can centrally
- * work out the access patterns.  This could be costly with many tiny read()s, so
- * we specifically optimise for that case with prev_page.
+ * it is time to perform readahead.  This is so the readahead algorithm can
+ * centrally work out the access patterns.  This could be costly with many tiny
+ * read()s, so we specifically optimise for that case with prev_page.
  */
 
 /*
@@ -208,8 +202,10 @@ void page_cache_readahead(struct file *f
 			goto out;
 	}
 
-	min = get_min_readahead(inode);
 	max = get_max_readahead(inode);
+	if (max == 0)
+		goto out;	/* No readahead */
+	min = get_min_readahead(inode);
 
 	if (ra->next_size == 0 && offset == 0) {
 		/*
@@ -310,7 +306,8 @@ void page_cache_readaround(struct file *
 {
 	unsigned long target;
 	unsigned long backward;
-	const int min = get_min_readahead(file->f_dentry->d_inode->i_mapping->host) * 2;
+	const int min =
+		get_min_readahead(file->f_dentry->d_inode->i_mapping->host) * 2;
 
 	if (file->f_ra.next_size < min)
 		file->f_ra.next_size = min;
--- 2.5.8/include/linux/mm.h~readahead-fixes	Tue Apr 16 12:11:39 2002
+++ 2.5.8-akpm/include/linux/mm.h	Tue Apr 16 12:12:14 2002
@@ -539,6 +539,8 @@ extern int filemap_sync(struct vm_area_s
 extern struct page *filemap_nopage(struct vm_area_struct *, unsigned long, int);
 
 /* readahead.c */
+#define VM_MAX_READAHEAD	128	/* kbytes */
+#define VM_MIN_READAHEAD	16	/* kbytes (includes current page) */
 void do_page_cache_readahead(struct file *file,
 			unsigned long offset, unsigned long nr_to_read);
 void page_cache_readahead(struct file *file, unsigned long offset);
--- 2.5.8/drivers/block/ll_rw_blk.c~readahead-fixes	Tue Apr 16 12:12:19 2002
+++ 2.5.8-akpm/drivers/block/ll_rw_blk.c	Tue Apr 16 12:13:43 2002
@@ -851,7 +851,7 @@ int blk_init_queue(request_queue_t *q, r
 	q->plug_tq.data		= q;
 	q->queue_flags		= (1 << QUEUE_FLAG_CLUSTER);
 	q->queue_lock		= lock;
-	q->ra_sectors		= 0;		/* Use VM default */
+	q->ra_sectors		= VM_MAX_READAHEAD << (PAGE_CACHE_SHIFT - 9);
 
 	blk_queue_segment_boundary(q, 0xffffffff);
 

-

  reply	other threads:[~2002-04-16 19:23 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-04-16 19:10 readahead Andries.Brouwer
2002-04-16 19:23 ` Andrew Morton [this message]
2002-04-16 19:33   ` readahead Jens Axboe
  -- strict thread matches above, loose matches on Subject: below --
2005-09-27  2:38 Readahead Alan Stern
2005-09-27  3:06 ` Readahead Randy.Dunlap
2005-09-27  4:24 ` Readahead Andrew Morton
2005-09-28 18:40   ` Readahead Alan Stern
2003-11-01 17:22 READAHEAD Voluspa
2003-10-30 19:23 READAHEAD age
2003-10-30 21:44 ` READAHEAD Andrew Morton
2003-10-31  7:43   ` READAHEAD Nuno Silva
2003-10-31  8:03     ` READAHEAD Andrew Morton
2003-10-31 12:20   ` READAHEAD age
2003-10-31  9:28     ` READAHEAD Andrew Morton
2003-10-31  9:29       ` READAHEAD Andrew Morton
2003-11-01  9:15         ` READAHEAD age
2003-11-03  0:15         ` READAHEAD Derek Foreman
2002-04-16 20:21 readahead Andries.Brouwer
2002-04-16 13:54 readahead Andries.Brouwer
2002-04-16 16:08 ` readahead Steven Cole
2002-04-16 18:25 ` readahead Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3CBC7A41.384FD032@zip.com.au \
    --to=akpm@zip.com.au \
    --cc=Andries.Brouwer@cwi.nl \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.