From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Barry Song <21cnbao@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@redhat.com>, Zi Yan <ziy@nvidia.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Nico Pache <npache@redhat.com>,
Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
Jonathan Corbet <corbet@lwn.net>,
linux-mm@kvack.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH] docs: update THP documentation to clarify sysfs "never" setting
Date: Tue, 22 Jul 2025 06:29:15 +0100
Message-ID: <c071c478-c6c3-47c4-a504-b1fa650d528f@lucifer.local>
In-Reply-To: <CAGsJ_4wvWZwG6agXOzDoYBD_vnN6k4TRJjhFfR5dw5pQrk2mwQ@mail.gmail.com>
+cc Hugh since we're mentioning him here, and not trimming for context.
TL;DR: I am updating the docs to reflect that the sysfs "never" setting
doesn't actually mean never for THP.
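For anyone wanting to check what their own system currently reports, here is
a minimal sketch (not part of the patch) that reads the anonymous-memory
"enabled" controls and reports the bracketed, currently selected value of each.
The helper names selected_mode(), thp_enabled_modes() and all_never() are
hypothetical, and the sketch assumes the usual sysfs layout with a top-level
"enabled" file next to the per-size hugepages-<size>kB/enabled files. It does
not resolve "inherit" and does not look at the shmem controls.

```python
import glob
import re

# Assumed location of the THP sysfs controls on a typical Linux system.
THP_ROOT = "/sys/kernel/mm/transparent_hugepage"


def selected_mode(contents):
    """Return the bracketed (currently selected) value of a control string.

    The kernel prints every option and wraps the active one in brackets,
    e.g. "always inherit madvise [never]".
    """
    match = re.search(r"\[([^\]]+)\]", contents)
    return match.group(1) if match else None


def thp_enabled_modes(root=THP_ROOT):
    """Map each readable anonymous-THP 'enabled' file to its selected mode."""
    paths = [f"{root}/enabled"]
    paths += sorted(glob.glob(f"{root}/hugepages-*kB/enabled"))
    modes = {}
    for path in paths:
        try:
            with open(path) as f:
                modes[path] = selected_mode(f.read())
        except OSError:
            # File missing or unreadable (no THP support, older kernel, ...).
            continue
    return modes


def all_never(modes):
    """True only if at least one control was read and every one is 'never'."""
    return bool(modes) and all(mode == "never" for mode in modes.values())


if __name__ == "__main__":
    modes = thp_enabled_modes()
    for path, mode in modes.items():
        print(f"{path}: {mode}")
    if all_never(modes):
        # Per this thread, this still does not mean THP is globally off.
        print("All controls read 'never'; madvise(..., MADV_COLLAPSE) "
              "can still produce huge pages.")
```

Even when all_never() returns True, that only describes what sysfs reports; as
discussed in this thread, MADV_COLLAPSE ignores these settings.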
On Tue, Jul 22, 2025 at 11:37:07AM +0800, Barry Song wrote:
> On Tue, Jul 22, 2025 at 10:33 AM Baolin Wang
> <baolin.wang@linux.alibaba.com> wrote:
> >
> >
> >
> > On 2025/7/22 10:23, Barry Song wrote:
> > > On Tue, Jul 22, 2025 at 9:30 AM Baolin Wang
> > > <baolin.wang@linux.alibaba.com> wrote:
> > >>
> > >>
> > >>
> > >> On 2025/7/21 23:55, Lorenzo Stoakes wrote:
> > >>> Rather confusingly, setting all Transparent Huge Page sysfs settings to
> > >>> "never" does not in fact result in THP being globally disabled.
> > >>>
> > >>> Rather, it results in khugepaged being disabled, but one can still obtain
> > >>> THP pages using madvise(..., MADV_COLLAPSE).
> > >>>
> > >>> This is something that has remained poorly documented for some time, and it
> > >>> is likely the received wisdom of most users of THP that never does, in
> > >>> fact, mean never.
> > >>>
> > >>> It is therefore important to highlight, very clearly, that this is not the
> > >>> case.
> > >>>
> > >>> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > >>> ---
> > >>> Documentation/admin-guide/mm/transhuge.rst | 11 +++++++++--
> > >>> 1 file changed, 9 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> > >>> index dff8d5985f0f..182519197ef7 100644
> > >>> --- a/Documentation/admin-guide/mm/transhuge.rst
> > >>> +++ b/Documentation/admin-guide/mm/transhuge.rst
> > >>> @@ -107,7 +107,7 @@ sysfs
> > >>> Global THP controls
> > >>> -------------------
> > >>>
> > >>> -Transparent Hugepage Support for anonymous memory can be entirely disabled
> > >>> +Transparent Hugepage Support for anonymous memory can be disabled
> > >>> (mostly for debugging purposes) or only enabled inside MADV_HUGEPAGE
> > >>> regions (to avoid the risk of consuming more memory resources) or enabled
> > >>> system wide. This can be achieved per-supported-THP-size with one of::
> > >>> @@ -119,6 +119,11 @@ system wide. This can be achieved per-supported-THP-size with one of::
> > >>> where <size> is the hugepage size being addressed, the available sizes
> > >>> for which vary by system.
> > >>>
> > >>> +.. note:: Setting "never" in all sysfs THP controls does **not** disable
> > >>> + Transparent Huge Pages globally. This is because ``madvise(...,
> > >>> + MADV_COLLAPSE)`` ignores these settings and collapses ranges to
> > >>> + PMD-sized huge pages unconditionally.
> > >>> +
> > >>> For example::
> > >>>
> > >>> echo always >/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
> > >>> @@ -187,7 +192,9 @@ madvise
> > >>> behaviour.
> > >>>
> > >>> never
> > >>> - should be self-explanatory.
> > >>> + should be self-explanatory. Note that ``madvise(...,
> > >>> + MADV_COLLAPSE)`` can still cause transparent huge pages to be
> > >>> + obtained even if this mode is specified everywhere.
> > >>
> > >> I hope this part of the explanation is also copy-pasted into the
> > >> 'Hugepages in tmpfs/shmem' section. Otherwise look good to me. Thanks.
> > >
> > > Apologies if this is a silly question, but regarding this patchset:
> > > https://lore.kernel.org/linux-mm/cover.1750815384.git.baolin.wang@linux.alibaba.com/
> > >
> > > It looks like the intention is to disable hugepages even for
> > > `MADV_COLLAPSE` when the user has set the policy to 'never'. However,
> > > based on Lorenzo's documentation update, it seems we still want to allow
> > > hugepages for `MADV_COLLAPSE` even if 'never' is set?
> > >
> > > Could you clarify what the intended behavior is? It seems we've decided
> > > to keep the existing behavior unchanged—am I understanding that
> > > correctly?
> >
> > Yes, Hugh has already explicitly opposed the current changes to the
> > MADV_COLLAPSE logic[1], although there are still some disagreements that
> > cannot be resolved.
> >
> > At least we reached the consensus to update the documentation to reflect
> > the current sysfs THP control logic first, to avoid the misunderstanding
> > that 'sysfs THP controls can disable Transparent Huge Pages globally'.
>
> Nice, thanks! Personally, I prefer this approach as well. Updating the
> man page feels a bit odd, since it's something people are already
> familiar with and may have memorized.
Indeed, Hugh's input was important here and gave pause for thought. This
was not an easy decision, and I ended up changing my mind from initially
supporting this change... :)
We may return to it later, but for the time being this is the rather
conservative approach we've decided upon.
Re: man page - I _do_ intend to update the man page as I find it far too
vague on this topic currently, so that patch will be coming soon.
I will cc the THP folks on that patch when I send it.
>
> >
> > [1]
> > https://lore.kernel.org/linux-mm/75c02dbf-4189-958d-515e-fa80bb2187fc@google.com/
>
> Best regards
> Barry
Cheers, Lorenzo