linux-mm.kvack.org archive mirror
* [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-15 10:12 Mel Gorman
  2011-04-15 13:23 ` Rik van Riel
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Mel Gorman @ 2011-04-15 10:12 UTC
  To: akpm
  Cc: Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro, lkml,
	linux-mm, stable

With transparent hugepage support, handle_mm_fault() has to be careful
that a normal PMD has been established before handling a PTE fault. To
achieve this, it uses __pte_alloc() directly instead of pte_alloc_map(),
as pte_alloc_map() is unsafe to run against a huge PMD. pte_offset_map()
is called once it is known the PMD is safe.

pte_alloc_map() is smart enough to check if a PTE is already present
before calling __pte_alloc(), but this check was lost when
handle_mm_fault() switched to calling __pte_alloc() directly. As a
consequence, PTEs may be allocated unnecessarily and the page table
lock taken. This useless PTE does get cleaned up, but it's a
performance hit which is visible in page_test from aim9.

This patch simply re-adds the check normally done by pte_alloc_map() to
see whether the PTE needs to be allocated before taking the page table
lock. The effect is noticeable in page_test from aim9.
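
The difference between the two call patterns can be sketched in user
space. This is only an illustrative model, not kernel code: struct slot,
slot_none(), slot_alloc() and count_lock_acquisitions() are hypothetical
stand-ins for the PMD entry, pmd_none(), __pte_alloc() and a loop of
repeated faults on the same address. It assumes a single thread and
counts how often the lock is taken.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for one PMD entry; not kernel code. */
struct slot {
	void *table;			/* NULL plays the role of pmd_none() */
	pthread_mutex_t lock;		/* plays the role of the page table lock */
	unsigned long lock_taken;	/* number of lock acquisitions */
};

static int slot_none(const struct slot *s)
{
	return s->table == NULL;
}

/*
 * Models an unconditional __pte_alloc(): allocate first, take the lock,
 * install the new table only if the slot is still empty, otherwise
 * discard it. Returns 0 on success and -1 if the allocation failed.
 */
static int slot_alloc(struct slot *s)
{
	void *new = malloc(64);

	if (!new)
		return -1;

	pthread_mutex_lock(&s->lock);
	s->lock_taken++;
	if (slot_none(s)) {
		s->table = new;
		new = NULL;
	}
	pthread_mutex_unlock(&s->lock);

	free(new);	/* the useless allocation is cleaned up here */
	return 0;
}

/*
 * Simulates 'faults' page faults against the same slot and reports how
 * many times the lock was taken. With checked != 0 the slot is tested
 * before allocating, mirroring the pmd_none() test added by the patch.
 */
static unsigned long count_lock_acquisitions(int checked, int faults)
{
	struct slot s = { .table = NULL, .lock = PTHREAD_MUTEX_INITIALIZER };
	unsigned long taken;
	int i;

	for (i = 0; i < faults; i++) {
		if (checked) {
			if (slot_none(&s) && slot_alloc(&s))
				break;	/* would be VM_FAULT_OOM */
		} else {
			if (slot_alloc(&s))
				break;	/* would be VM_FAULT_OOM */
		}
	}

	taken = s.lock_taken;
	free(s.table);
	return taken;
}
```

In this model, the unchecked variant allocates and takes the lock on
every fault, while the checked variant does so only for the first fault
that finds the slot empty.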

AIM9
               2.6.38-vanilla    2.6.38-checkptenone
creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
exec_test      382.00 ( 0.00%)   456.90 (16.39%)
fork_test       60.11 ( 0.00%)    67.79 (11.34%)
MMTests Statistics: duration
Total Elapsed Time (seconds)                611.90    612.22

(While this affects 2.6.38, it is a performance rather than a
functional bug and normally outside the rules for -stable. While the big
performance differences are in a microbenchmark, the difference in fork
and exec performance may be significant enough that -stable wants to
consider the patch.)

Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
-- 
 mm/memory.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 5823698..1659574 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * run pte_offset_map on the pmd, if an huge pmd could
 	 * materialize from under us from a different thread.
 	 */
-	if (unlikely(__pte_alloc(mm, vma, pmd, address)))
+	if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
 		return VM_FAULT_OOM;
 	/* if an huge pmd materialized from under us just retry later */
 	if (unlikely(pmd_trans_huge(*pmd)))



end of thread, other threads:[~2011-04-27 13:51 UTC | newest]

Thread overview: 15+ messages
2011-04-15 10:12 [PATCH] mm: Check if PTE is already allocated during page fault Mel Gorman
2011-04-15 13:23 ` Rik van Riel
2011-04-15 14:39 ` Andrea Arcangeli
2011-04-15 15:06   ` Andrea Arcangeli
2011-04-18  7:21     ` raz ben yehuda
2011-04-18 10:23     ` Mel Gorman
2011-04-21  6:59 ` Minchan Kim
2011-04-21 11:08   ` Mel Gorman
2011-04-21 14:26     ` Minchan Kim
2011-04-21 16:00       ` Mel Gorman
2011-04-21 16:14         ` Andrea Arcangeli
2011-04-22  0:54           ` Minchan Kim
2011-04-26 13:00             ` Andrea Arcangeli
2011-04-22  1:01         ` Minchan Kim
2011-04-27 13:50 ` Johannes Weiner
