* [PATCH] madvise.2: add MADV_HUGEPAGE and MADV_NOHUGEPAGE
From: Doug Goldstein @ 2011-07-27 20:14 UTC
To: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w; +Cc: linux-man-u79uwXL29TY76Z2rM5mHXA
Document the MADV_HUGEPAGE and MADV_NOHUGEPAGE flags added to the
madvise() syscall in Linux kernels 2.6.38 and newer.
Signed-off-by: Doug Goldstein <cardoe-VPKZcK2rSRzQT0dZR+AlfA@public.gmane.org>
---
man2/madvise.2 | 34 ++++++++++++++++++++++++++++++++++
1 files changed, 34 insertions(+), 0 deletions(-)
diff --git a/man2/madvise.2 b/man2/madvise.2
index 6a449c5..e099e94 100644
--- a/man2/madvise.2
+++ b/man2/madvise.2
@@ -209,6 +209,40 @@ KSM unmerges whatever pages it had merged in the
address range specified by
.IR addr
and
.IR length .
+.TP
+.BR MADV_HUGEPAGE " (since Linux 2.6.38)"
+Enables Transparent Huge Pages (THP) for pages in the range specified by
+.I addr
+and
+.IR length .
+Currently Transparent Huge Pages only work with private anonymous pages (see
+.BR mmap (2)).
+The kernel will regularly scan the areas marked as huge page candidates
+to replace them with huge pages.
+The kernel will also allocate huge pages directly when the region is
+naturally aligned to the huge page size (see
+.BR posix_memalign (3)).
+This feature is primarily aimed at applications that use large mappings of
+data and access large regions of that memory at a time (e.g. virtualization
+systems such as qemu).
+It can very easily waste memory (e.g. a 2MB mapping that only ever accesses
+1 byte will result in 2MB of wired memory instead of one 4KB page).
+See the kernel source file
+.I Documentation/vm/transhuge.txt
+for more details.
+The
+.BR MADV_HUGEPAGE
+and
+.BR MADV_NOHUGEPAGE
+operations are only available if the kernel was configured with
+.BR CONFIG_TRANSPARENT_HUGEPAGE .
+.TP
+.BR MADV_NOHUGEPAGE " (since Linux 2.6.38)"
+Ensures that memory in the address range specified by
+.IR addr
+and
+.IR length
+will not be collapsed into huge pages.
.SH "RETURN VALUE"
On success
.BR madvise ()
--
1.7.6
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
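As an illustration of the MADV_HUGEPAGE text in the patch above, here is a
minimal sketch of how an application might request THP for a large buffer.
It is not taken from the patch or from any library: the helper name
alloc_thp_hint() and the HPAGE_SIZE constant are hypothetical, and the 2 MiB
huge page size is an assumption that holds on typical x86-64 systems but is
architecture dependent. The madvise() call is only a hint, so the sketch
treats its failure (for example EINVAL on a kernel built without
CONFIG_TRANSPARENT_HUGEPAGE) as non-fatal.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* exposes madvise() and MADV_* in <sys/mman.h> */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

/* ASSUMPTION: 2 MiB huge pages (typical on x86-64, not universal). */
#define HPAGE_SIZE (2UL * 1024 * 1024)

/*
 * Hypothetical helper, for illustration only.
 * Allocates at least `len` bytes, aligned to HPAGE_SIZE and rounded up to
 * a multiple of it, then hints the kernel with MADV_HUGEPAGE.
 * Returns NULL (with errno set) if the allocation fails.
 * *advised is set to 1 if the kernel accepted the hint, 0 otherwise.
 * The caller releases the buffer with free().
 */
void *alloc_thp_hint(size_t len, int *advised)
{
    assert(advised != NULL);
    *advised = 0;

    if (len == 0 || len > SIZE_MAX - HPAGE_SIZE) {
        errno = EINVAL;
        return NULL;
    }

    /* Round up so the advised range covers whole huge pages we own. */
    size_t rounded = (len + HPAGE_SIZE - 1) & ~(size_t)(HPAGE_SIZE - 1);

    void *buf = NULL;
    int rc = posix_memalign(&buf, HPAGE_SIZE, rounded);
    if (rc != 0) {
        errno = rc;     /* posix_memalign() returns the error number */
        return NULL;
    }

#ifdef MADV_HUGEPAGE
    /* Advice only: the buffer stays usable even if this call fails. */
    if (madvise(buf, rounded, MADV_HUGEPAGE) == 0)
        *advised = 1;
#endif
    return buf;
}
```

A caller would typically invoke alloc_thp_hint(size, &advised) once for a
large, densely accessed buffer and ignore a zero `advised` value, since the
memory behaves the same apart from performance.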
* Re: [PATCH] madvise.2: add MADV_HUGEPAGE and MADV_NOHUGEPAGE
From: Michael Kerrisk @ 2011-09-19 5:26 UTC
To: Doug Goldstein; +Cc: linux-man-u79uwXL29TY76Z2rM5mHXA, Andrea Arcangeli

Hello Doug,

On Wed, Jul 27, 2011 at 10:14 PM, Doug Goldstein <cardoe-VPKZcK2rSRzQT0dZR+AlfA@public.gmane.org> wrote:
> Document the MADV_HUGEPAGE and MADV_NOHUGEPAGE flags added to the
> madvise() syscall in Linux kernels 2.6.38 and newer.

Thanks. I've applied this for man-pages-3.34.

Andrea, is there anything you think necessary to add/change?

Cheers,

Michael

> Signed-off-by: Doug Goldstein <cardoe-VPKZcK2rSRzQT0dZR+AlfA@public.gmane.org>
> ---
> man2/madvise.2 | 34 ++++++++++++++++++++++++++++++++++
> 1 files changed, 34 insertions(+), 0 deletions(-)
>
> diff --git a/man2/madvise.2 b/man2/madvise.2
> index 6a449c5..e099e94 100644
> --- a/man2/madvise.2
> +++ b/man2/madvise.2
> @@ -209,6 +209,40 @@ KSM unmerges whatever pages it had merged in the
> address range specified by
> .IR addr
> and
> .IR length .
> +.TP
> +.BR MADV_HUGEPAGE " (since Linux 2.6.38)"
> +Enables Transparent Huge Pages (THP) for pages in the range specified by
> +.I addr
> +and
> +.IR length .
> +Currently Transparent Huge Pages only work with private anonymous pages (see
> +.BR mmap (2)).
> +The kernel will regularly scan the areas marked as huge page candidates
> +to replace them with huge pages.
> +The kernel will also allocate huge pages directly when the region is
> +naturally aligned to the huge page size (see
> +.BR posix_memalign (3)).
> +This feature is primarily aimed at applications that use large mappings of
> +data and access large regions of that memory at a time (e.g. virtualization
> +systems such as qemu).
> +It can very easily waste memory (e.g. a 2MB mapping that only ever accesses
> +1 byte will result in 2MB of wired memory instead of one 4KB page).
> +See the kernel source file
> +.I Documentation/vm/transhuge.txt
> +for more details.
> +The
> +.BR MADV_HUGEPAGE
> +and
> +.BR MADV_NOHUGEPAGE
> +operations are only available if the kernel was configured with
> +.BR CONFIG_TRANSPARENT_HUGEPAGE .
> +.TP
> +.BR MADV_NOHUGEPAGE " (since Linux 2.6.38)"
> +Ensures that memory in the address range specified by
> +.IR addr
> +and
> +.IR length
> +will not be collapsed into huge pages.
> .SH "RETURN VALUE"
> On success
> .BR madvise ()
> --
> 1.7.6
>

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface"; http://man7.org/tlpi/
* Re: [PATCH] madvise.2: add MADV_HUGEPAGE and MADV_NOHUGEPAGE
From: Andrea Arcangeli @ 2011-09-28 1:59 UTC
To: Michael Kerrisk; +Cc: Doug Goldstein, linux-man-u79uwXL29TY76Z2rM5mHXA

Hello,

On Mon, Sep 19, 2011 at 07:26:49AM +0200, Michael Kerrisk wrote:
> Hello Doug,
>
> On Wed, Jul 27, 2011 at 10:14 PM, Doug Goldstein <cardoe-VPKZcK2rSRzQT0dZR+AlfA@public.gmane.org> wrote:
> > Document the MADV_HUGEPAGE and MADV_NOHUGEPAGE flags added to the
> > madvise() syscall in Linux kernels 2.6.38 and newer.
>
> Thanks. I've applied this for man-pages-3.34.
>
> Andrea, is there anything you think necessary to add/change?

Looking good!

> > Signed-off-by: Doug Goldstein <cardoe-VPKZcK2rSRzQT0dZR+AlfA@public.gmane.org>
> > ---
> > man2/madvise.2 | 34 ++++++++++++++++++++++++++++++++++
> > 1 files changed, 34 insertions(+), 0 deletions(-)
> >
> > diff --git a/man2/madvise.2 b/man2/madvise.2
> > index 6a449c5..e099e94 100644
> > --- a/man2/madvise.2
> > +++ b/man2/madvise.2
> > @@ -209,6 +209,40 @@ KSM unmerges whatever pages it had merged in the
> > address range specified by
> > .IR addr
> > and
> > .IR length .
> > +.TP
> > +.BR MADV_HUGEPAGE " (since Linux 2.6.38)"
> > +Enables Transparent Huge Pages (THP) for pages in the range specified by
> > +.I addr
> > +and
> > +.IR length .

Maybe it should also be specified that most common kernel configurations
already behave like MADV_HUGEPAGE by default, so MADV_HUGEPAGE is normally
not necessary. It is mostly meant for embedded systems, where the kernel
may not enable the MADV_HUGEPAGE behavior by default; there it can be used
to enable THP selectively (only in some regions).

Whenever MADV_HUGEPAGE is used, it should always be applied to regions of
memory whose access pattern the developer knows in advance will not risk
increasing the memory footprint of the application when transparent
hugepages are enabled.

> > +.BR MADV_NOHUGEPAGE " (since Linux 2.6.38)"
> > +Ensures that memory in the address range specified by
> > +.IR addr
> > +and
> > +.IR length
> > +will not be collapsed into huge pages.

Maybe it's clearer as "will not be backed by transparent hugepages". The
collapse is done only by khugepaged, but if MADV_NOHUGEPAGE isn't used,
transparent hugepages may also be allocated natively during the page
fault, without waiting for them to be collapsed later.

This can be used to selectively disable THP for any app that does
scattered memory accesses that could increase its memory footprint too
much with THP enabled.

Generally, these two MADV_*HUGEPAGE madvise values are useful for dealing
with any memory footprint issue that may arise depending on the kernel
default. For example, the NPTL thread stacks virtual area could be a good
candidate for MADV_NOHUGEPAGE, but that's not implemented yet, I think.
By contrast, qemu-kvm should use MADV_HUGEPAGE by default: if somebody
runs KVM on an embedded system, enabling THP for the guest physical memory
wastes no memory in KVM (once the guest reaches peak load it has touched
all of its RAM, which happens eventually), so KVM will just run faster
with no risk of an increased memory footprint.

It's not so easy to explain clearly :) but if we manage to express these
concepts too, it will avoid the risk of people polluting apps with these
madvise calls when they're not needed 99% of the time (with a few
exceptions like qemu-kvm and maybe NPTL for the user thread stacks; the
latter has yet to be checked, but for KVM I'm positive it'll be fine).

But hey, your previous patch is already looking good.

Thanks a lot for helping document this!
Andrea
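To illustrate the selective-disable case discussed in this message, here is
a minimal sketch of an application opting one mapping out of THP. It is an
assumption-laden example, not code from the thread: the helper name
map_no_thp() is hypothetical, and the mapping is a plain private anonymous
mmap(), the only kind of memory the patch says THP works with. As with
MADV_HUGEPAGE, the advice may be rejected (for example on kernels without
CONFIG_TRANSPARENT_HUGEPAGE), so the sketch reports that instead of failing.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* exposes madvise() and MADV_* in <sys/mman.h> */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, for illustration only.
 * Maps `len` bytes of private anonymous memory and asks the kernel not to
 * back the range with transparent huge pages, which may suit sparse or
 * scattered access patterns.
 * Returns MAP_FAILED if mmap() fails.
 * *advised is set to 1 if the kernel accepted the advice, 0 otherwise.
 * The caller releases the mapping with munmap(p, len).
 */
void *map_no_thp(size_t len, int *advised)
{
    assert(len > 0 && advised != NULL);
    *advised = 0;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return MAP_FAILED;

#ifdef MADV_NOHUGEPAGE
    /* Advice only: the mapping is still valid if this call fails. */
    if (madvise(p, len, MADV_NOHUGEPAGE) == 0)
        *advised = 1;
#endif
    return p;
}
```

Whether a given region actually benefits from this depends on the kernel
default and on the access pattern, which is the point made above: most
applications should not need either madvise value.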