linux-mm.kvack.org archive mirror
* [PATCH 0/3] reduce readahead overheads on tmpfs mmap page faults
@ 2011-04-26  9:43 Wu Fengguang
  2011-04-26  9:43 ` [PATCH 1/3] readahead: return early when readahead is disabled Wu Fengguang
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Wu Fengguang @ 2011-04-26  9:43 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Tim Chen, Li Shaohua, Wu Fengguang, LKML,
	Linux Memory Management List

Andrew,

This series kills unnecessary ra->mmap_miss and ra->prev_pos updates on every
page fault when readahead is disabled.

The patches fix the cache line bouncing problem seen in the mosbench exim
benchmark, which does multi-threaded page faults on a shared struct file on
tmpfs.

Thanks,
Fengguang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] readahead: return early when readahead is disabled
  2011-04-26  9:43 [PATCH 0/3] reduce readahead overheads on tmpfs mmap page faults Wu Fengguang
  2011-04-26  9:43 ` [PATCH 1/3] readahead: return early when readahead is disabled Wu Fengguang
@ 2011-04-26  9:43 ` Wu Fengguang
  2011-04-26  9:43 ` [PATCH 2/3] readahead: reduce unnecessary mmap_miss increases Wu Fengguang
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Wu Fengguang @ 2011-04-26  9:43 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Tim Chen, Wu Fengguang, Li Shaohua, LKML,
	Linux Memory Management List

[-- Attachment #1: readahead-early-abort-mmap-around.patch --]
[-- Type: text/plain, Size: 1556 bytes --]

Reduce readahead overheads by returning early in
do_sync_mmap_readahead().

tmpfs has ra_pages=0, and it can page fault really fast
(not constrained by IO unless it is swapping).

Tested-by: Tim Chen <tim.c.chen@intel.com>
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/filemap.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- linux-next.orig/mm/filemap.c	2011-04-23 08:56:59.000000000 +0800
+++ linux-next/mm/filemap.c	2011-04-23 09:01:44.000000000 +0800
@@ -1528,6 +1528,8 @@ static void do_sync_mmap_readahead(struc
 	/* If we don't want any read-ahead, don't bother */
 	if (VM_RandomReadHint(vma))
 		return;
+	if (!ra->ra_pages)
+		return;
 
 	if (VM_SequentialReadHint(vma) ||
 			offset - 1 == (ra->prev_pos >> PAGE_CACHE_SHIFT)) {
@@ -1550,12 +1552,10 @@ static void do_sync_mmap_readahead(struc
 	 * mmap read-around
 	 */
 	ra_pages = max_sane_readahead(ra->ra_pages);
-	if (ra_pages) {
-		ra->start = max_t(long, 0, offset - ra_pages/2);
-		ra->size = ra_pages;
-		ra->async_size = 0;
-		ra_submit(ra, mapping, file);
-	}
+	ra->start = max_t(long, 0, offset - ra_pages / 2);
+	ra->size = ra_pages;
+	ra->async_size = 0;
+	ra_submit(ra, mapping, file);
 }
 
 /*



^ permalink raw reply	[flat|nested] 8+ messages in thread


* [PATCH 2/3] readahead: reduce unnecessary mmap_miss increases
  2011-04-26  9:43 [PATCH 0/3] reduce readahead overheads on tmpfs mmap page faults Wu Fengguang
  2011-04-26  9:43 ` [PATCH 1/3] readahead: return early when readahead is disabled Wu Fengguang
  2011-04-26  9:43 ` Wu Fengguang
@ 2011-04-26  9:43 ` Wu Fengguang
  2011-04-26  9:43 ` Wu Fengguang
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Wu Fengguang @ 2011-04-26  9:43 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Tim Chen, Andi Kleen, Wu Fengguang, Li Shaohua, LKML,
	Linux Memory Management List

[-- Attachment #1: readahead-reduce-mmap_miss-increases.patch --]
[-- Type: text/plain, Size: 1095 bytes --]

From: Andi Kleen <ak@linux.intel.com>

The original INT_MAX cap is too large; reduce it in order to

- avoid unnecessarily dirtying/bouncing the cache line
- restore mmap read-around faster on changed access pattern

Tested-by: Tim Chen <tim.c.chen@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/filemap.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- linux-next.orig/mm/filemap.c	2011-04-23 09:01:44.000000000 +0800
+++ linux-next/mm/filemap.c	2011-04-23 09:17:21.000000000 +0800
@@ -1538,7 +1538,8 @@ static void do_sync_mmap_readahead(struc
 		return;
 	}
 
-	if (ra->mmap_miss < INT_MAX)
+	/* Avoid banging the cache line if not needed */
+	if (ra->mmap_miss < MMAP_LOTSAMISS * 10)
 		ra->mmap_miss++;
 
 	/*



^ permalink raw reply	[flat|nested] 8+ messages in thread


* [PATCH 3/3] readahead: trigger mmap sequential readahead on PG_readahead
  2011-04-26  9:43 [PATCH 0/3] reduce readahead overheads on tmpfs mmap page faults Wu Fengguang
                   ` (4 preceding siblings ...)
  2011-04-26  9:43 ` [PATCH 3/3] readahead: trigger mmap sequential readahead on PG_readahead Wu Fengguang
@ 2011-04-26  9:43 ` Wu Fengguang
  2011-04-26 14:36   ` [PATCH 3/3 with new changelog] " Wu Fengguang
  5 siblings, 1 reply; 8+ messages in thread
From: Wu Fengguang @ 2011-04-26  9:43 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Tim Chen, Wu Fengguang, Li Shaohua, LKML,
	Linux Memory Management List

[-- Attachment #1: readahead-no-mmap-prev_pos.patch --]
[-- Type: text/plain, Size: 1979 bytes --]

Previously, mmap sequential readahead was triggered by updating
ra->prev_pos on each page fault and comparing it with the current page offset.

In the mosbench exim benchmark, which does multi-threaded page faults on a
shared struct file, this was found to cause excessive cache line bouncing on
tmpfs, which does not need readahead at all.

So remove the ra->prev_pos recording and instead tag the page with
PG_readahead to trigger possible sequential readahead. This is not only
simpler, but will also work more reliably with concurrent reads on a shared
struct file.

Tested-by: Tim Chen <tim.c.chen@intel.com>
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/filemap.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- linux-next.orig/mm/filemap.c	2011-04-23 16:52:21.000000000 +0800
+++ linux-next/mm/filemap.c	2011-04-24 09:59:08.000000000 +0800
@@ -1531,8 +1531,7 @@ static void do_sync_mmap_readahead(struc
 	if (!ra->ra_pages)
 		return;
 
-	if (VM_SequentialReadHint(vma) ||
-			offset - 1 == (ra->prev_pos >> PAGE_CACHE_SHIFT)) {
+	if (VM_SequentialReadHint(vma)) {
 		page_cache_sync_readahead(mapping, ra, file, offset,
 					  ra->ra_pages);
 		return;
@@ -1555,7 +1554,7 @@ static void do_sync_mmap_readahead(struc
 	ra_pages = max_sane_readahead(ra->ra_pages);
 	ra->start = max_t(long, 0, offset - ra_pages / 2);
 	ra->size = ra_pages;
-	ra->async_size = 0;
+	ra->async_size = ra_pages / 4;
 	ra_submit(ra, mapping, file);
 }
 
@@ -1661,7 +1660,6 @@ retry_find:
 		return VM_FAULT_SIGBUS;
 	}
 
-	ra->prev_pos = (loff_t)offset << PAGE_CACHE_SHIFT;
 	vmf->page = page;
 	return ret | VM_FAULT_LOCKED;
 



^ permalink raw reply	[flat|nested] 8+ messages in thread


* [PATCH 3/3 with new changelog] readahead: trigger mmap sequential readahead on PG_readahead
  2011-04-26  9:43 ` Wu Fengguang
@ 2011-04-26 14:36   ` Wu Fengguang
  0 siblings, 0 replies; 8+ messages in thread
From: Wu Fengguang @ 2011-04-26 14:36 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Chen, Tim C, Li, Shaohua, LKML, Linux Memory Management List

Previously, mmap sequential readahead was triggered by updating
ra->prev_pos on each page fault and comparing it with the current page offset.

In the mosbench exim benchmark, which does multi-threaded page faults on a
shared struct file, the ra->mmap_miss and ra->prev_pos updates were found to
cause excessive cache line bouncing on tmpfs, where readahead is in fact
disabled entirely (shmem_backing_dev_info.ra_pages == 0).

So remove the ra->prev_pos recording and instead tag the page with
PG_readahead to trigger possible sequential readahead. This is not only
simpler, but will also work more reliably with concurrent reads on a shared
struct file.

Tested-by: Tim Chen <tim.c.chen@intel.com>
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/filemap.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- linux-next.orig/mm/filemap.c	2011-04-23 16:52:21.000000000 +0800
+++ linux-next/mm/filemap.c	2011-04-24 09:59:08.000000000 +0800
@@ -1531,8 +1531,7 @@ static void do_sync_mmap_readahead(struc
 	if (!ra->ra_pages)
 		return;
 
-	if (VM_SequentialReadHint(vma) ||
-			offset - 1 == (ra->prev_pos >> PAGE_CACHE_SHIFT)) {
+	if (VM_SequentialReadHint(vma)) {
 		page_cache_sync_readahead(mapping, ra, file, offset,
 					  ra->ra_pages);
 		return;
@@ -1555,7 +1554,7 @@ static void do_sync_mmap_readahead(struc
 	ra_pages = max_sane_readahead(ra->ra_pages);
 	ra->start = max_t(long, 0, offset - ra_pages / 2);
 	ra->size = ra_pages;
-	ra->async_size = 0;
+	ra->async_size = ra_pages / 4;
 	ra_submit(ra, mapping, file);
 }
 
@@ -1661,7 +1660,6 @@ retry_find:
 		return VM_FAULT_SIGBUS;
 	}
 
-	ra->prev_pos = (loff_t)offset << PAGE_CACHE_SHIFT;
 	vmf->page = page;
 	return ret | VM_FAULT_LOCKED;
 


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2011-04-26 14:36 UTC | newest]

Thread overview: 8+ messages
-- links below jump to the message on this page --
2011-04-26  9:43 [PATCH 0/3] reduce readahead overheads on tmpfs mmap page faults Wu Fengguang
2011-04-26  9:43 ` [PATCH 1/3] readahead: return early when readahead is disabled Wu Fengguang
2011-04-26  9:43 ` Wu Fengguang
2011-04-26  9:43 ` [PATCH 2/3] readahead: reduce unnecessary mmap_miss increases Wu Fengguang
2011-04-26  9:43 ` Wu Fengguang
2011-04-26  9:43 ` [PATCH 3/3] readahead: trigger mmap sequential readahead on PG_readahead Wu Fengguang
2011-04-26  9:43 ` Wu Fengguang
2011-04-26 14:36   ` [PATCH 3/3 with new changelog] " Wu Fengguang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).