* [RFC PATCH] Increase locking granularity in THP page fault code
@ 2013-08-30 16:58 Alex Thorlton
From: Alex Thorlton @ 2013-08-30 16:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Kirill A. Shutemov, Mel Gorman, Xiao Guangrong,
	Andrea Arcangeli, Hugh Dickins, Rik van Riel, Peter Zijlstra,
	Robin Holt, Alex Thorlton, linux-mm

We have a threaded page fault scaling test that is performing very
poorly due to the use of the page_table_lock in the THP fault code
path.  By replacing the page_table_lock with the ptl on the pud_page
(CONFIG_SPLIT_PTLOCKS is required for this to work), we make the locking
here 512 times finer-grained: instead of locking all 256TB of addressable
memory, we only lock the 512GB that is handled by a single pud_page.
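
To illustrate the idea (this is only a sketch, not the patch itself;
huge_pmd_lockptr() is a hypothetical helper name), the change boils down
to picking the lock from the pud page rather than from the mm:

static inline spinlock_t *huge_pmd_lockptr(struct mm_struct *mm, pud_t *pud)
{
#if USE_SPLIT_PTLOCKS
	/* one lock per pud page, i.e. per 512GB of address space */
	return &pud_page(*pud)->ptl;
#else
	/* fall back to the mm-wide lock when split locks are disabled */
	return &mm->page_table_lock;
#endif
}

Callers in the THP fault path would then take this lock where they
currently take mm->page_table_lock.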

The test I'm running creates 512 threads, pins each thread to a CPU, has
each thread allocate 512MB of memory, and then has each thread touch the
first byte of every 4k chunk of its allocation.  Here are the timings from
this test on a clean 3.11-rc7 kernel, THP on:

real	22m50.904s
user	15m26.072s
sys	11430m19.120s

And here are the timings with my modified kernel, THP on:

real	0m37.018s
user	21m39.164s
sys	155m9.132s
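
For reference, each test thread does roughly the following (a simplified
sketch, not the actual benchmark source; error handling is omitted):

/* Simplified sketch of the test: each thread pins itself to a CPU,
 * maps 512MB of anonymous memory, and touches the first byte of every
 * 4k chunk so that each touch takes a page fault. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <sys/mman.h>

#define NTHREADS	512
#define ALLOC_SIZE	(512UL << 20)	/* 512MB per thread */
#define CHUNK		4096UL		/* touch one byte per 4k */

static void *worker(void *arg)
{
	long cpu = (long)arg;
	cpu_set_t set;
	unsigned long off;
	char *buf;

	/* pin this thread to its own CPU */
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

	buf = mmap(NULL, ALLOC_SIZE, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return NULL;

	/* first-touch each 4k chunk; with THP on, these faults go
	 * through the huge page fault path */
	for (off = 0; off < ALLOC_SIZE; off += CHUNK)
		buf[off] = 1;

	return NULL;
}

int main(void)
{
	pthread_t tids[NTHREADS];
	long i;

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&tids[i], NULL, worker, (void *)i);
	for (i = 0; i < NTHREADS; i++)
		pthread_join(tids[i], NULL);
	return 0;
}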

As the timings show, we get a huge performance boost by locking a more
targeted chunk of memory instead of taking the one lock that covers all
of the process's page tables.  At
this point, I'm comfortable saying that there are obvious benefits to
increasing the granularity of the locking, but I'm not sure that I've
done this in the best way possible.  Mainly, I'm not positive that using
the pud_page lock actually protects everything that we're concerned
about locking here.  I'm hoping that everyone can provide some input
on whether or not this seems like a reasonable move to make and, if so,
confirm that I've locked things appropriately.

As a side note, we still have some pretty significant scaling issues
with this test, both with THP on, and off.  I'm cleaning up the locking
here first as this is causing the biggest performance hit.

Alex Thorlton (1):
  Change THP code to use pud_page(pud)->ptl lock instead of
    page_table_lock

 mm/huge_memory.c | 4 ++--
 mm/memory.c      | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

-- 
1.7.12.4
