* [PATCH 00 of 66] Transparent Hugepage Support #32
From: Andrea Arcangeli @ 2010-11-03 15:27 UTC
To: linux-mm, Linus Torvalds, Andrew Morton, linux-kernel
Cc: Marcelo Tosatti, Adam Litke, Avi Kivity, Hugh Dickins,
Rik van Riel, Mel Gorman, Dave Hansen, Benjamin Herrenschmidt,
Ingo Molnar, Mike Travis, KAMEZAWA Hiroyuki, Christoph Lameter,
Chris Wright, bpicco, KOSAKI Motohiro, Balbir Singh,
Michael S. Tsirkin, Peter Zijlstra, Johannes Weiner,
Daisuke Nishimura, Chris Mason, Borislav Petkov
Some of the relevant users of the project:
KVM Virtualization
GCC (kernel build included, requires a few-line patch to enable)
JVM
VMware Workstation
HPC
It would be great if it could go in -mm.
http://git.kernel.org/?p=linux/kernel/git/andrea/aa.git;a=blob;f=Documentation/vm/transhuge.txt
http://www.linux-kvm.org/wiki/images/9/9e/2010-forum-thp.pdf
http://git.kernel.org/?p=linux/kernel/git/andrea/aa.git;a=shortlog
first: git clone git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git
or first: git clone --reference linux-2.6 git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git
later: git fetch; git checkout -f origin/master
The tree is rebased and git pull won't work.
http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.37-rc1/transparent_hugepage-32/
http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.37-rc1/transparent_hugepage-32.gz
Diff #31 -> #32:
b/clear_copy_huge_page | 59 ++++++++--------
hugetlbfs.c copy_huge_page was renamed to copy_user_huge_page.
b/kvm_transparent_hugepage | 38 +++++-----
Adjust hva_to_pfn interface change.
b/lumpy-compaction | 48 +++++++++++++
Disable lumpy reclaim when CONFIG_COMPACTION is enabled. Agreed with Mel @
Kernel Summit. Mel wants to defer the full removal of lumpy reclaim for a few
releases. Disabling lumpy reclaim is needed to prevent THP from rendering the
system totally unusable when reclaim starts.
b/pmd_trans_huge_migrate | 31 +++-----
Migrate can now migrate hugetlb pages. This has a chance to break on THP, but
it seems all the "magic hugetlbfs code paths" are activated by the
"destination" page being huge. That never happens with THP, which in fact
would split the page, making the source not huge either. So it seems the
current code may co-exist with THP too without further changes.
This update also fixes a false positive BUG_ON in remove_migration_pte that
could materialize after handling the CPU errata showing that the CPU doesn't
like simultaneous 4k and 2M TLB entries. To implement the workaround without
increasing the TLB flush cost of split_huge_page I had to set the hugepmd as
non-present during the TLB flush (so opening a micro-window for a false
positive in the BUG_ON check). The BUG_ON can simply be removed safely now, in
turn solving the false positive.
remove-lumpy_reclaim | 131 -------------------------------------
Lumpy reclaim is not removed anymore, but it is disabled at runtime whenever
the kernel was built with CONFIG_COMPACTION=y (and setting
CONFIG_TRANSPARENT_HUGEPAGE=y implicitly selects CONFIG_COMPACTION=y of course).
Full diffstat:
Documentation/vm/transhuge.txt | 283 ++++
arch/alpha/include/asm/mman.h | 2
arch/mips/include/asm/mman.h | 2
arch/parisc/include/asm/mman.h | 2
arch/powerpc/mm/gup.c | 12
arch/x86/include/asm/kvm_host.h | 1
arch/x86/include/asm/paravirt.h | 23
arch/x86/include/asm/paravirt_types.h | 6
arch/x86/include/asm/pgtable-2level.h | 9
arch/x86/include/asm/pgtable-3level.h | 23
arch/x86/include/asm/pgtable.h | 149 ++
arch/x86/include/asm/pgtable_64.h | 28
arch/x86/include/asm/pgtable_types.h | 3
arch/x86/kernel/paravirt.c | 3
arch/x86/kernel/tboot.c | 2
arch/x86/kernel/vm86_32.c | 1
arch/x86/kvm/mmu.c | 60
arch/x86/kvm/paging_tmpl.h | 4
arch/x86/mm/gup.c | 28
arch/x86/mm/pgtable.c | 66
arch/xtensa/include/asm/mman.h | 2
drivers/base/node.c | 21
fs/Kconfig | 2
fs/exec.c | 44
fs/proc/meminfo.c | 14
fs/proc/page.c | 14
include/asm-generic/mman-common.h | 2
include/asm-generic/pgtable.h | 130 +
include/linux/compaction.h | 13
include/linux/gfp.h | 14
include/linux/huge_mm.h | 170 ++
include/linux/khugepaged.h | 66
include/linux/kvm_host.h | 4
include/linux/memory_hotplug.h | 14
include/linux/mm.h | 114 +
include/linux/mm_inline.h | 19
include/linux/mm_types.h | 3
include/linux/mmu_notifier.h | 66
include/linux/mmzone.h | 1
include/linux/page-flags.h | 36
include/linux/sched.h | 1
include/linux/swap.h | 2
kernel/fork.c | 12
kernel/futex.c | 55
mm/Kconfig | 38
mm/Makefile | 1
mm/compaction.c | 48
mm/huge_memory.c | 2290 ++++++++++++++++++++++++++++++++++
mm/hugetlb.c | 69 -
mm/ksm.c | 52
mm/madvise.c | 8
mm/memcontrol.c | 138 +-
mm/memory-failure.c | 2
mm/memory.c | 199 ++
mm/memory_hotplug.c | 14
mm/mempolicy.c | 14
mm/migrate.c | 7
mm/mincore.c | 7
mm/mmap.c | 7
mm/mmu_notifier.c | 20
mm/mprotect.c | 20
mm/mremap.c | 9
mm/page_alloc.c | 31
mm/pagewalk.c | 1
mm/rmap.c | 115 -
mm/sparse.c | 4
mm/swap.c | 117 +
mm/swap_state.c | 6
mm/swapfile.c | 2
mm/vmscan.c | 43
mm/vmstat.c | 1
virt/kvm/iommu.c | 2
virt/kvm/kvm_main.c | 56
73 files changed, 4485 insertions(+), 362 deletions(-)
FAQ:
Q: When will 1G pages be supported? (by far the most frequently asked question
in the last two days)
A: Not any time soon but it's not entirely impossible... The benefit of going
from 2M to 1G is likely much lower than the benefit of going from 4k to 2M
so it's unlikely to be a worthwhile effort for a while.
Q: When will this work on file-backed pages? (pagecache/swapcache/tmpfs)
A: Not until it's merged in mainline. It's already feature complete for many
usages and the moment we expand into pagecache the patch would grow
significantly.
Q: When will KSM scan inside Transparent Hugepages?
A: Working on that, this should materialize soon enough.
Q: Where is the next place to remove split_huge_page_pmd()?
A: mremap. JVM uses mremap in the garbage collector so the ~18% boost (no virt)
has further margin for optimizations.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [PATCH 00 of 66] Transparent Hugepage Support #32
From: caiqian @ 2010-11-04 3:10 UTC
To: Andrea Arcangeli
Cc: Marcelo Tosatti, Adam Litke, Avi Kivity, Hugh Dickins,
Rik van Riel, Mel Gorman, Dave Hansen, Benjamin Herrenschmidt,
Ingo Molnar, Mike Travis, KAMEZAWA Hiroyuki, Christoph Lameter,
Chris Wright, bpicco, KOSAKI Motohiro, Balbir Singh,
Michael S. Tsirkin, Peter Zijlstra, Johannes Weiner,
Daisuke Nishimura, Chris Mason, Borislav Petkov, linux-mm,
Linus Torvalds, Andrew Morton, linux-kernel
There were some changes in behaviour with THP and KSM statistics, demonstrated by this program: http://people.redhat.com/qcai/ksm01.c.
There are 3 programs (A, B, C), each allocating 128M of memory using KSM.
A has memory content = 'c'.
B has memory content = 'a'.
C has memory content = 'a'.
Then without THP,
pages_shared = 2
pages_sharing = 98285
pages_sharing = 98292
pages_unshared = 0
pages_volatile = 17
pages_to_scan = 98304
sleep_millisecs = 0
with THP,
pages_shared = 2
pages_sharing = 18422
pages_unshared = 0
pages_volatile = 8
Later,
A has memory content = 'c'
B has memory content = 'b'
C has memory content = 'a'.
Then without THP,
pages_shared = 3
pages_sharing = 98296
pages_unshared = 0
pages_volatile = 5
with THP,
pages_shared = 3
pages_sharing = 16358
pages_unshared = 0
pages_volatile = 23
Later,
A has memory content = 'd'
B has memory content = 'd'
C has memory content = 'd'
Then without THP,
pages_shared = 1
pages_sharing = 98274
pages_unshared = 0
pages_volatile = 29
with THP,
pages_shared = 1
pages_sharing = 8668
pages_unshared = 0
pages_volatile = 35
Finally,
A changes one page to 'e'
Then without THP,
pages_shared = 1
pages_sharing = 98274
pages_unshared = 1
pages_volatile = 28
with THP,
pages_shared = 1
pages_sharing = 8163
pages_unshared = 1
pages_volatile = 27
Are those differences in pages_sharing with and without THP expected?
CAI Qian
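
The "without THP" counters above can be cross-checked with a little arithmetic. The sketch below is only an illustration and rests on assumptions the message does not state: a 4 KiB base page, and "128M" meaning 128 MiB. Under those assumptions the three programs register 98304 pages, which matches the reported pages_to_scan, and the four counters add up to that same total at every step. The first listing shows two pages_sharing values; 98285 is the one that makes the sum come out.

```python
# Assumptions (not stated in the thread): 4 KiB base pages, "128M" = 128 MiB.
PAGE_SIZE = 4 * 1024
PROCESSES = 3
BYTES_PER_PROCESS = 128 * 1024 * 1024


def expected_pages(processes=PROCESSES, size=BYTES_PER_PROCESS, page=PAGE_SIZE):
    """Number of base pages the programs hand to KSM in total."""
    return processes * size // page


def accounted(shared, sharing, unshared, volatile):
    """Pages KSM has accounted for: the four counters added together."""
    return shared + sharing + unshared + volatile


# (pages_shared, pages_sharing, pages_unshared, pages_volatile) without THP,
# in the order the four steps appear in the message.
WITHOUT_THP = [
    (2, 98285, 0, 17),
    (3, 98296, 0, 5),
    (1, 98274, 0, 29),
    (1, 98274, 1, 28),
]

if __name__ == "__main__":
    total = expected_pages()
    print("expected pages:", total)
    for counters in WITHOUT_THP:
        # Each row should account for every page the programs allocated.
        print(counters, "sum =", accounted(*counters))
```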
* Re: [PATCH 00 of 66] Transparent Hugepage Support #32
From: CAI Qian @ 2010-11-04 8:41 UTC
To: Andrea Arcangeli
Cc: Marcelo Tosatti, Adam Litke, Avi Kivity, Hugh Dickins,
Rik van Riel, Mel Gorman, Dave Hansen, Benjamin Herrenschmidt,
Ingo Molnar, Mike Travis, KAMEZAWA Hiroyuki, Christoph Lameter,
Chris Wright, bpicco, KOSAKI Motohiro, Balbir Singh,
Michael S. Tsirkin, Peter Zijlstra, Johannes Weiner,
Daisuke Nishimura, Chris Mason, Borislav Petkov, linux-mm,
Linus Torvalds, Andrew Morton, linux-kernel
Thanks to Andrea for pointing out to me that there is ongoing work on KSM and THP integration. Sorry for the noise.
CAI Qian
* Re: [PATCH 00 of 66] Transparent Hugepage Support #32
From: Andrea Arcangeli @ 2010-11-04 16:30 UTC
To: CAI Qian
Cc: Marcelo Tosatti, Adam Litke, Avi Kivity, Hugh Dickins,
Rik van Riel, Mel Gorman, Dave Hansen, Benjamin Herrenschmidt,
Ingo Molnar, Mike Travis, KAMEZAWA Hiroyuki, Christoph Lameter,
Chris Wright, bpicco, KOSAKI Motohiro, Balbir Singh,
Michael S. Tsirkin, Peter Zijlstra, Johannes Weiner,
Daisuke Nishimura, Chris Mason, Borislav Petkov, linux-mm,
Linus Torvalds, Andrew Morton, linux-kernel
Hi Qian,
On Thu, Nov 04, 2010 at 04:41:12AM -0400, CAI Qian wrote:
> Thanks to Andrea for pointing out to me that there is ongoing work on
> KSM and THP integration. Sorry for the noise.
No problem, thanks for your feedback!
The longer answer is: PageKsm pages will already co-exist fine with
PageTransHuge pages in the same vma with regular pages. So 3 types of
pages in the same vma. But before the KSM scan can see the content of
the hugepages there has to be some memory pressure... So it's not ideal
and we will make KSM able to scan inside hugepages before they're
split (and to split them when it finds a match and then merge them
in the stable tree).
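
The "with THP" figures reported earlier in the thread are consistent with this picture. The sketch below is an illustration under assumptions not stated in the thread: a 4 KiB base page and a 2 MiB hugepage (512 base pages), with the 98304 total taken from the reported pages_to_scan. Under those assumptions, the pages missing from the KSM counters at each step come out as a whole number of hugepages, which is what one would expect if KSM only sees pages that are not currently part of a hugepage. This is one reading of the numbers, not a measurement.

```python
TOTAL_PAGES = 98304        # pages_to_scan as reported in the thread
PAGES_PER_HUGEPAGE = 512   # assumption: 2 MiB hugepage / 4 KiB base page

# (pages_shared, pages_sharing, pages_unshared, pages_volatile) with THP,
# in the order the four steps appear in the report.
WITH_THP = [
    (2, 18422, 0, 8),
    (3, 16358, 0, 23),
    (1, 8668, 0, 35),
    (1, 8163, 1, 27),
]


def hidden_hugepages(counters, total=TOTAL_PAGES, per_huge=PAGES_PER_HUGEPAGE):
    """Return (hugepages, leftover) for the pages KSM did not account for."""
    visible = sum(counters)      # pages KSM has accounted for
    missing = total - visible    # pages KSM has not seen
    return divmod(missing, per_huge)


if __name__ == "__main__":
    for counters in WITH_THP:
        huge, leftover = hidden_hugepages(counters)
        print(counters, "->", huge, "hugepages, leftover", leftover)
```

A leftover of zero at every step is what suggests the unseen memory sits in whole hugepages.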
So it's perfectly normal that KSM becomes less effective when THP is
enabled. But in the mainline version we have MADV_HUGEPAGE, so if you
need KSM in full effect, you can simply "echo madvise
>/sys/kernel/mm/transparent_hugepage/enabled", and then you can decide
whether to mark a mapping with madvise(MADV_HUGEPAGE) or
madvise(MADV_MERGEABLE).
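
The per-mapping choice described above can be sketched from userspace. The following is a minimal illustration, not taken from the patch series: it uses Python's mmap module (whose madvise method and MADV_* constants exist only on recent Python versions and supporting platforms), and the helper names active_mode, private_region and advise are hypothetical. It assumes the sysfs file lists the modes with the active one in square brackets, e.g. "always [madvise] never".

```python
import mmap
import os

THP_ENABLED = "/sys/kernel/mm/transparent_hugepage/enabled"


def active_mode(text):
    """Return the bracketed word, e.g. 'madvise' from 'always [madvise] never'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def private_region(length):
    """Anonymous private mapping; KSM only merges private anonymous pages."""
    return mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)


def advise(region, name):
    """Apply the named madvise flag; return False if it is unsupported here."""
    flag = getattr(mmap, name, None)
    if flag is None or not hasattr(region, "madvise"):
        return False
    try:
        region.madvise(flag)
    except OSError:
        # The running kernel lacks KSM or THP support, or rejected the request.
        return False
    return True


if __name__ == "__main__":
    if os.path.exists(THP_ENABLED):
        with open(THP_ENABLED) as f:
            print("THP mode:", active_mode(f.read()))
    dedup = private_region(8 * 1024 * 1024)   # candidate for KSM merging
    big = private_region(8 * 1024 * 1024)     # candidate for hugepages
    print("MADV_MERGEABLE applied:", advise(dedup, "MADV_MERGEABLE"))
    print("MADV_HUGEPAGE applied:", advise(big, "MADV_HUGEPAGE"))
    dedup.close()
    big.close()
```

With the mode set to madvise, only the second region would be eligible for hugepages, leaving the first fully visible to KSM.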
Thanks,
Andrea
* Re: [PATCH 00 of 66] Transparent Hugepage Support #32
From: Mel Gorman @ 2010-11-18 16:39 UTC
To: Andrea Arcangeli
Cc: linux-mm, Linus Torvalds, Andrew Morton, linux-kernel,
Marcelo Tosatti, Adam Litke, Avi Kivity, Hugh Dickins,
Rik van Riel, Dave Hansen, Benjamin Herrenschmidt, Ingo Molnar,
Mike Travis, KAMEZAWA Hiroyuki, Christoph Lameter, Chris Wright,
bpicco, KOSAKI Motohiro, Balbir Singh, Michael S. Tsirkin,
Peter Zijlstra, Johannes Weiner, Daisuke Nishimura, Chris Mason,
Borislav Petkov
On Wed, Nov 03, 2010 at 04:27:35PM +0100, Andrea Arcangeli wrote:
> Some of the relevant users of the project:
>
> KVM Virtualization
> GCC (kernel build included, requires a few-line patch to enable)
> JVM
> VMware Workstation
> HPC
>
> It would be great if it could go in -mm.
>
FWIW, I saw nothing of major concern while reading through the series other
than the delete-lumpy-reclaim portion. I've posted a series that should
address the lumpy reclaim concerns while also improving the performance
of compaction in general.
For the rest of the series, the vast majority of my comments were nits as
my questions were covered (and addressed) in previous revisions. I'll try
and set time aside to convert some of the libhugetlbfs-orientated to run
against these patches but I'm not anticipating any problems and it'd be
nice to see merged at some point.
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab