From: Matthew Wilcox <willy@linux.intel.com>
To: linux-mm@kvack.org
Subject: Setting of the PageReadahed bit
Date: Fri, 3 Jun 2011 07:55:19 -0400 [thread overview]
Message-ID: <20110603115519.GI4061@linux.intel.com> (raw)
The exact definition of PageReadahead doesn't seem to be documented
anywhere. I'm assuming it means "This page was not directly requested;
it is being read for prefetching purposes", exactly like the READA
semantics.
If my interpretation is correct, then the implementation in
__do_page_cache_readahead is wrong:
if (page_idx == nr_to_read - lookahead_size)
SetPageReadahead(page);
It'll only set the PageReadahead bit on one page. The patch below fixes
this ... if my understanding is correct.
If my understanding is wrong, then how are readpage/readpages
implementations supposed to know that the VM is only prefetching these
pages, and they're not as important as metadata (dependent) reads?
commit 7b1a00ae0ecc327bab9ce467dcd7bd7fe2a31fab
Author: Matthew Wilcox <matthew.r.wilcox@intel.com>
Date: Fri Jun 3 03:49:45 2011 -0400
mm: Fix PageReadahead flag setting in readahead code
The current code sets the PageReadahead bit on at most one page in each
batch. Worse, it can set the PageReadahead bit on the exact page that
was requested. What it should be doing is setting the PageReadahead
bit on every page except the one which is really demanded.
Passing the page offset to the __do_page_cache_readahead() function
lets it know which page to not set the Readahead bit on. That implies
passing the offset into ra_submit as well.
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9670f71..3e32a17 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1480,6 +1480,7 @@ void page_cache_async_readahead(struct address_space *mapping,
unsigned long max_sane_readahead(unsigned long nr);
unsigned long ra_submit(struct file_ra_state *ra,
+ pgoff_t offset,
struct address_space *mapping,
struct file *filp);
diff --git a/mm/filemap.c b/mm/filemap.c
index d7b1057..1a1ab1b 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1594,7 +1594,7 @@ static void do_sync_mmap_readahead(struct vm_area_struct *vma,
ra->start = max_t(long, 0, offset - ra_pages / 2);
ra->size = ra_pages;
ra->async_size = ra_pages / 4;
- ra_submit(ra, mapping, file);
+ ra_submit(ra, offset, mapping, file);
}
/*
diff --git a/mm/readahead.c b/mm/readahead.c
index 867f9dd..9202533 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -150,7 +150,7 @@ out:
static int
__do_page_cache_readahead(struct address_space *mapping, struct file *filp,
pgoff_t offset, unsigned long nr_to_read,
- unsigned long lookahead_size)
+ pgoff_t wanted)
{
struct inode *inode = mapping->host;
struct page *page;
@@ -185,7 +185,7 @@ __do_page_cache_readahead(struct address_space *mapping, struct file *filp,
break;
page->index = page_offset;
list_add(&page->lru, &page_pool);
- if (page_idx == nr_to_read - lookahead_size)
+ if (page_offset != wanted)
SetPageReadahead(page);
ret++;
}
@@ -210,6 +210,7 @@ int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
pgoff_t offset, unsigned long nr_to_read)
{
int ret = 0;
+ pgoff_t wanted = offset;
if (unlikely(!mapping->a_ops->readpage && !mapping->a_ops->readpages))
return -EINVAL;
@@ -223,7 +224,7 @@ int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
if (this_chunk > nr_to_read)
this_chunk = nr_to_read;
err = __do_page_cache_readahead(mapping, filp,
- offset, this_chunk, 0);
+ offset, this_chunk, wanted);
if (err < 0) {
ret = err;
break;
@@ -248,13 +249,13 @@ unsigned long max_sane_readahead(unsigned long nr)
/*
* Submit IO for the read-ahead request in file_ra_state.
*/
-unsigned long ra_submit(struct file_ra_state *ra,
+unsigned long ra_submit(struct file_ra_state *ra, pgoff_t offset,
struct address_space *mapping, struct file *filp)
{
int actual;
actual = __do_page_cache_readahead(mapping, filp,
- ra->start, ra->size, ra->async_size);
+ ra->start, ra->size, offset);
return actual;
}
@@ -465,7 +466,8 @@ ondemand_readahead(struct address_space *mapping,
* standalone, small random read
* Read as is, and do not pollute the readahead state.
*/
- return __do_page_cache_readahead(mapping, filp, offset, req_size, 0);
+ return __do_page_cache_readahead(mapping, filp, offset, req_size,
+ offset);
initial_readahead:
ra->start = offset;
@@ -483,7 +485,7 @@ readit:
ra->size += ra->async_size;
}
- return ra_submit(ra, mapping, filp);
+ return ra_submit(ra, offset, mapping, filp);
}
/**
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next reply other threads:[~2011-06-03 11:55 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-06-03 11:55 Matthew Wilcox [this message]
2011-06-04 3:15 ` Setting of the PageReadahed bit Hugh Dickins
2011-06-05 7:54 ` Wu Fengguang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110603115519.GI4061@linux.intel.com \
--to=willy@linux.intel.com \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).