* [PATCH 0/3] reduce readahead overheads on tmpfs mmap page faults
@ 2011-04-26  9:43 ` Wu Fengguang
  0 siblings, 0 replies; 14+ messages in thread
From: Wu Fengguang @ 2011-04-26  9:43 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Tim Chen, Li Shaohua, Wu Fengguang, LKML,
	Linux Memory Management List

Andrew,

This series kills the unnecessary ra->mmap_miss and ra->prev_pos updates on every
page fault when readahead is disabled.

The patches fix the cache line bouncing problem in the mosbench exim benchmark,
which does multi-threaded page faults on a shared struct file on tmpfs.

Thanks,
Fengguang


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] readahead: return early when readahead is disabled
  2011-04-26  9:43 ` Wu Fengguang
@ 2011-04-26  9:43   ` Wu Fengguang
  -1 siblings, 0 replies; 14+ messages in thread
From: Wu Fengguang @ 2011-04-26  9:43 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Tim Chen, Wu Fengguang, Li Shaohua, LKML,
	Linux Memory Management List

[-- Attachment #1: readahead-early-abort-mmap-around.patch --]
[-- Type: text/plain, Size: 1253 bytes --]

Reduce readahead overheads by returning early in
do_sync_mmap_readahead().

tmpfs has ra_pages=0 and it can page fault really fast
(not constrained by IO if not swapping).

Tested-by: Tim Chen <tim.c.chen@intel.com>
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/filemap.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- linux-next.orig/mm/filemap.c	2011-04-23 08:56:59.000000000 +0800
+++ linux-next/mm/filemap.c	2011-04-23 09:01:44.000000000 +0800
@@ -1528,6 +1528,8 @@ static void do_sync_mmap_readahead(struc
 	/* If we don't want any read-ahead, don't bother */
 	if (VM_RandomReadHint(vma))
 		return;
+	if (!ra->ra_pages)
+		return;
 
 	if (VM_SequentialReadHint(vma) ||
 			offset - 1 == (ra->prev_pos >> PAGE_CACHE_SHIFT)) {
@@ -1550,12 +1552,10 @@ static void do_sync_mmap_readahead(struc
 	 * mmap read-around
 	 */
 	ra_pages = max_sane_readahead(ra->ra_pages);
-	if (ra_pages) {
-		ra->start = max_t(long, 0, offset - ra_pages/2);
-		ra->size = ra_pages;
-		ra->async_size = 0;
-		ra_submit(ra, mapping, file);
-	}
+	ra->start = max_t(long, 0, offset - ra_pages / 2);
+	ra->size = ra_pages;
+	ra->async_size = 0;
+	ra_submit(ra, mapping, file);
 }
 
 /*
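
A minimal user-space sketch of the control flow after this patch, for readers
who want to see why the early return avoids touching the shared state.
struct ra_model and mmap_read_around_model() are hypothetical stand-ins, not
kernel code; the sequential-hint branch, the mmap_miss accounting and
max_sane_readahead() are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct file_ra_state (only a few fields). */
struct ra_model {
	unsigned long start;
	unsigned int size;
	unsigned int async_size;
	unsigned int ra_pages;	/* 0 means readahead is disabled (tmpfs) */
};

/*
 * Returns true if a read-around window was set up, false if the function
 * bailed out early.  On an early return no field of *ra is written, so a
 * struct shared by many faulting threads stays clean in every CPU's cache.
 */
static bool mmap_read_around_model(struct ra_model *ra, unsigned long offset,
				   bool random_hint)
{
	unsigned long half;

	if (random_hint)
		return false;
	if (!ra->ra_pages)		/* the check added by this patch */
		return false;

	/* Centre the window on the faulting page, clamped at page 0. */
	half = ra->ra_pages / 2;
	ra->start = offset > half ? offset - half : 0;
	ra->size = ra->ra_pages;
	ra->async_size = 0;
	return true;
}
```

With ra_pages == 0 the function returns before any store, which is the
property the patch relies on; with a non-zero ra_pages the window is set up
unconditionally, matching the removal of the inner if (ra_pages) test.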



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/3] readahead: reduce unnecessary mmap_miss increases
  2011-04-26  9:43 ` Wu Fengguang
@ 2011-04-26  9:43   ` Wu Fengguang
  -1 siblings, 0 replies; 14+ messages in thread
From: Wu Fengguang @ 2011-04-26  9:43 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Tim Chen, Andi Kleen, Wu Fengguang, Li Shaohua, LKML,
	Linux Memory Management List

[-- Attachment #1: readahead-reduce-mmap_miss-increases.patch --]
[-- Type: text/plain, Size: 792 bytes --]

From: Andi Kleen <ak@linux.intel.com>

The original INT_MAX limit is too large; reduce it to

- avoid unnecessarily dirtying/bouncing the cache line
- restore mmap read-around faster on changed access pattern

Tested-by: Tim Chen <tim.c.chen@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/filemap.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- linux-next.orig/mm/filemap.c	2011-04-23 09:01:44.000000000 +0800
+++ linux-next/mm/filemap.c	2011-04-23 09:17:21.000000000 +0800
@@ -1538,7 +1538,8 @@ static void do_sync_mmap_readahead(struc
 		return;
 	}
 
-	if (ra->mmap_miss < INT_MAX)
+	/* Avoid banging the cache line if not needed */
+	if (ra->mmap_miss < MMAP_LOTSAMISS * 10)
 		ra->mmap_miss++;
 
 	/*
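
A small user-space sketch of the saturating counter, to illustrate both
bullet points of the changelog. Everything here is an assumption made for
illustration: MODEL_LOTSAMISS = 100 mirrors what MMAP_LOTSAMISS is believed to
be in mm/filemap.c, and the "decrement on hit" and "suppress read-around above
the threshold" rules are modelled from the surrounding kernel code rather than
from anything quoted in this thread.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_LOTSAMISS 100	/* assumed value of MMAP_LOTSAMISS */

/* Hypothetical model: the counter plus a tally of stores caused by misses. */
struct miss_model {
	unsigned int mmap_miss;
	unsigned long stores;
};

/* A miss only writes the counter while it is below the cap. */
static void record_miss(struct miss_model *m, unsigned int cap)
{
	if (m->mmap_miss < cap) {
		m->mmap_miss++;
		m->stores++;
	}
}

/* Assumed behaviour on a page cache hit: count back down towards zero. */
static void record_hit(struct miss_model *m)
{
	if (m->mmap_miss > 0)
		m->mmap_miss--;
}

/* Assumed rule: read-around is skipped while there are "lots of misses". */
static bool read_around_suppressed(const struct miss_model *m)
{
	return m->mmap_miss > MODEL_LOTSAMISS;
}
```

With the cap at MODEL_LOTSAMISS * 10, a long run of misses stops dirtying the
counter after 1000 stores, and 900 hits are enough to bring it back to the
threshold. With a cap of INT_MAX both numbers would be far larger.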



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/3] readahead: trigger mmap sequential readahead on PG_readahead
  2011-04-26  9:43 ` Wu Fengguang
@ 2011-04-26  9:43   ` Wu Fengguang
  -1 siblings, 0 replies; 14+ messages in thread
From: Wu Fengguang @ 2011-04-26  9:43 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Tim Chen, Wu Fengguang, Li Shaohua, LKML,
	Linux Memory Management List

[-- Attachment #1: readahead-no-mmap-prev_pos.patch --]
[-- Type: text/plain, Size: 1676 bytes --]

Previously, mmap sequential readahead was triggered by updating
ra->prev_pos on each page fault and comparing it with the current page offset.

In the mosbench exim benchmark, which does multi-threaded page faults on a
shared struct file, this was found to cause excessive cache line bouncing
on tmpfs, which does not need readahead at all.

So remove the ra->prev_pos recording, and instead tag PG_readahead to
trigger the possible sequential readahead. This is not only simpler,
but also works more reliably for concurrent reads on a shared struct file.

Tested-by: Tim Chen <tim.c.chen@intel.com>
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/filemap.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- linux-next.orig/mm/filemap.c	2011-04-23 16:52:21.000000000 +0800
+++ linux-next/mm/filemap.c	2011-04-24 09:59:08.000000000 +0800
@@ -1531,8 +1531,7 @@ static void do_sync_mmap_readahead(struc
 	if (!ra->ra_pages)
 		return;
 
-	if (VM_SequentialReadHint(vma) ||
-			offset - 1 == (ra->prev_pos >> PAGE_CACHE_SHIFT)) {
+	if (VM_SequentialReadHint(vma)) {
 		page_cache_sync_readahead(mapping, ra, file, offset,
 					  ra->ra_pages);
 		return;
@@ -1555,7 +1554,7 @@ static void do_sync_mmap_readahead(struc
 	ra_pages = max_sane_readahead(ra->ra_pages);
 	ra->start = max_t(long, 0, offset - ra_pages / 2);
 	ra->size = ra_pages;
-	ra->async_size = 0;
+	ra->async_size = ra_pages / 4;
 	ra_submit(ra, mapping, file);
 }
 
@@ -1661,7 +1660,6 @@ retry_find:
 		return VM_FAULT_SIGBUS;
 	}
 
-	ra->prev_pos = (loff_t)offset << PAGE_CACHE_SHIFT;
 	vmf->page = page;
 	return ret | VM_FAULT_LOCKED;
 



^ permalink raw reply	[flat|nested] 14+ messages in thread
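
A rough user-space sketch of where the PG_readahead tag would land in the
read-around window set up by patch 3/3. The names struct window,
read_around_window() and readahead_marker() are hypothetical. The placement
rule (tag the page at start + size - async_size, and tag nothing when
async_size is 0) is an assumption about how the readahead core treats
async_size; it is not shown in this thread.

```c
#include <assert.h>

/* Hypothetical model of the window fields set before ra_submit(). */
struct window {
	unsigned long start;
	unsigned long size;
	unsigned long async_size;
};

/* Window as computed after patch 3/3: async_size is a quarter of the window. */
static struct window read_around_window(unsigned long offset,
					unsigned long ra_pages)
{
	struct window w;
	unsigned long half = ra_pages / 2;

	w.start = offset > half ? offset - half : 0;
	w.size = ra_pages;
	w.async_size = ra_pages / 4;
	return w;
}

/*
 * Returns 1 and stores the page index that would carry the readahead tag,
 * or 0 if no page in the window would be tagged.
 */
static int readahead_marker(const struct window *w, unsigned long *index)
{
	if (w->async_size == 0 || w->async_size > w->size)
		return 0;
	*index = w->start + w->size - w->async_size;
	return 1;
}
```

For a fault at page 100 with ra_pages = 32, the window covers pages 84..115
and the tag sits on page 108. A reader that keeps moving forward hits the
tagged page and can trigger the next readahead without any per-fault write to
ra->prev_pos. With the old async_size = 0 no page is tagged.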

* [PATCH 3/3 with new changelog] readahead: trigger mmap sequential readahead on PG_readahead
  2011-04-26  9:43   ` Wu Fengguang
@ 2011-04-26 14:36     ` Wu Fengguang
  -1 siblings, 0 replies; 14+ messages in thread
From: Wu Fengguang @ 2011-04-26 14:36 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Chen, Tim C, Li, Shaohua, LKML, Linux Memory Management List

Previously, mmap sequential readahead was triggered by updating
ra->prev_pos on each page fault and comparing it with the current page offset.

In the mosbench exim benchmark, which does multi-threaded page faults on a
shared struct file, the ra->mmap_miss and ra->prev_pos updates were found
to cause excessive cache line bouncing on tmpfs, which actually has
readahead disabled entirely (shmem_backing_dev_info.ra_pages == 0).

So remove the ra->prev_pos recording, and instead tag PG_readahead to
trigger the possible sequential readahead. This is not only simpler,
but also works more reliably for concurrent reads on a shared struct file.

Tested-by: Tim Chen <tim.c.chen@intel.com>
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/filemap.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- linux-next.orig/mm/filemap.c	2011-04-23 16:52:21.000000000 +0800
+++ linux-next/mm/filemap.c	2011-04-24 09:59:08.000000000 +0800
@@ -1531,8 +1531,7 @@ static void do_sync_mmap_readahead(struc
 	if (!ra->ra_pages)
 		return;
 
-	if (VM_SequentialReadHint(vma) ||
-			offset - 1 == (ra->prev_pos >> PAGE_CACHE_SHIFT)) {
+	if (VM_SequentialReadHint(vma)) {
 		page_cache_sync_readahead(mapping, ra, file, offset,
 					  ra->ra_pages);
 		return;
@@ -1555,7 +1554,7 @@ static void do_sync_mmap_readahead(struc
 	ra_pages = max_sane_readahead(ra->ra_pages);
 	ra->start = max_t(long, 0, offset - ra_pages / 2);
 	ra->size = ra_pages;
-	ra->async_size = 0;
+	ra->async_size = ra_pages / 4;
 	ra_submit(ra, mapping, file);
 }
 
@@ -1661,7 +1660,6 @@ retry_find:
 		return VM_FAULT_SIGBUS;
 	}
 
-	ra->prev_pos = (loff_t)offset << PAGE_CACHE_SHIFT;
 	vmf->page = page;
 	return ret | VM_FAULT_LOCKED;
 

^ permalink raw reply	[flat|nested] 14+ messages in thread
