linux-mm.kvack.org archive mirror
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Barry Song <21cnbao@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>, Zi Yan <ziy@nvidia.com>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Nico Pache <npache@redhat.com>,
	Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	Jonathan Corbet <corbet@lwn.net>,
	linux-mm@kvack.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH] docs: update THP documentation to clarify sysfs "never" setting
Date: Tue, 22 Jul 2025 06:29:15 +0100
Message-ID: <c071c478-c6c3-47c4-a504-b1fa650d528f@lucifer.local>
In-Reply-To: <CAGsJ_4wvWZwG6agXOzDoYBD_vnN6k4TRJjhFfR5dw5pQrk2mwQ@mail.gmail.com>

+cc Hugh since we're mentioning him here, and not trimming, for context.
TL;DR: I am updating the docs to reflect the fact that the sysfs "never"
setting doesn't actually mean never for THP.
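
For illustration only (this is not part of the patch): a minimal Python
sketch, assuming the usual sysfs layout where each THP "enabled" file lists
its options with the active one in square brackets, e.g. "always madvise
[never]". The helper names (active_setting, thp_enabled_settings, all_never)
are made up for this example. It only inspects the anonymous-memory "enabled"
files, not the shmem controls.

```python
import glob
import re

# Assumed sysfs root; the per-size files live under hugepages-<size>kB/.
THP_ROOT = "/sys/kernel/mm/transparent_hugepage"


def active_setting(contents):
    """Return the bracketed (currently selected) option, or None if absent."""
    match = re.search(r"\[([^\]]+)\]", contents)
    return match.group(1) if match else None


def thp_enabled_settings(root=THP_ROOT):
    """Map each readable 'enabled' file under root to its active option."""
    paths = [f"{root}/enabled"] + sorted(glob.glob(f"{root}/hugepages-*kB/enabled"))
    settings = {}
    for path in paths:
        try:
            with open(path) as f:
                settings[path] = active_setting(f.read())
        except OSError:
            # File missing or unreadable (e.g. THP not built in): skip it.
            continue
    return settings


def all_never(settings):
    """True only if at least one control was found and all are 'never'."""
    return bool(settings) and all(v == "never" for v in settings.values())
```

Even when all_never(thp_enabled_settings()) returns True, that only says
khugepaged and fault-time THP are off; per the patch description,
madvise(..., MADV_COLLAPSE) can still produce huge pages.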

On Tue, Jul 22, 2025 at 11:37:07AM +0800, Barry Song wrote:
> On Tue, Jul 22, 2025 at 10:33 AM Baolin Wang
> <baolin.wang@linux.alibaba.com> wrote:
> >
> >
> >
> > On 2025/7/22 10:23, Barry Song wrote:
> > > On Tue, Jul 22, 2025 at 9:30 AM Baolin Wang
> > > <baolin.wang@linux.alibaba.com> wrote:
> > >>
> > >>
> > >>
> > >> On 2025/7/21 23:55, Lorenzo Stoakes wrote:
> > >>> Rather confusingly, setting all Transparent Huge Page sysfs settings to
> > >>> "never" does not in fact result in THP being globally disabled.
> > >>>
> > >>> Rather, it results in khugepaged being disabled, but one can still obtain
> > >>> THP pages using madvise(..., MADV_COLLAPSE).
> > >>>
> > >>> This is something that has remained poorly documented for some time, and it
> > >>> is likely the received wisdom of most users of THP that never does, in
> > >>> fact, mean never.
> > >>>
> > >>> It is therefore important to highlight, very clearly, that this is not the
> > >>> case.
> > >>>
> > >>> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > >>> ---
> > >>>    Documentation/admin-guide/mm/transhuge.rst | 11 +++++++++--
> > >>>    1 file changed, 9 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> > >>> index dff8d5985f0f..182519197ef7 100644
> > >>> --- a/Documentation/admin-guide/mm/transhuge.rst
> > >>> +++ b/Documentation/admin-guide/mm/transhuge.rst
> > >>> @@ -107,7 +107,7 @@ sysfs
> > >>>    Global THP controls
> > >>>    -------------------
> > >>>
> > >>> -Transparent Hugepage Support for anonymous memory can be entirely disabled
> > >>> +Transparent Hugepage Support for anonymous memory can be disabled
> > >>>    (mostly for debugging purposes) or only enabled inside MADV_HUGEPAGE
> > >>>    regions (to avoid the risk of consuming more memory resources) or enabled
> > >>>    system wide. This can be achieved per-supported-THP-size with one of::
> > >>> @@ -119,6 +119,11 @@ system wide. This can be achieved per-supported-THP-size with one of::
> > >>>    where <size> is the hugepage size being addressed, the available sizes
> > >>>    for which vary by system.
> > >>>
> > >>> +.. note:: Setting "never" in all sysfs THP controls does **not** disable
> > >>> +          Transparent Huge Pages globally. This is because ``madvise(...,
> > >>> +          MADV_COLLAPSE)`` ignores these settings and collapses ranges to
> > >>> +          PMD-sized huge pages unconditionally.
> > >>> +
> > >>>    For example::
> > >>>
> > >>>        echo always >/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
> > >>> @@ -187,7 +192,9 @@ madvise
> > >>>        behaviour.
> > >>>
> > >>>    never
> > >>> -     should be self-explanatory.
> > >>> +     should be self-explanatory. Note that ``madvise(...,
> > >>> +     MADV_COLLAPSE)`` can still cause transparent huge pages to be
> > >>> +     obtained even if this mode is specified everywhere.
> > >>
> > >> I hope this part of the explanation is also copy-pasted into the
> > >> 'Hugepages in tmpfs/shmem' section. Otherwise look good to me. Thanks.
> > >
> > > Apologies if this is a silly question, but regarding this patchset:
> > > https://lore.kernel.org/linux-mm/cover.1750815384.git.baolin.wang@linux.alibaba.com/
> > >
> > > It looks like the intention is to disable hugepages even for
> > > `MADV_COLLAPSE` when the user has set the policy to 'never'. However,
> > > based on Lorenzo's documentation update, it seems we still want to allow
> > > hugepages for `MADV_COLLAPSE` even if 'never' is set?
> > >
> > > Could you clarify what the intended behavior is? It seems we've decided
> > > to keep the existing behavior unchanged—am I understanding that
> > > correctly?
> >
> > Yes, Hugh has already explicitly opposed the current changes to the
> > MADV_COLLAPSE logic[1], although there are still some disagreements that
> > cannot be resolved.
> >
> > At least we reached the consensus to update the documentation to reflect
> > the current sysfs THP control logic first, to avoid the misunderstanding
> > that 'sysfs THP controls can disable Transparent Huge Pages globally'.
>
> Nice, thanks! Personally, I prefer this approach as well. Updating the
> man page feels a bit odd, since it's something people are already
> familiar with and may have memorized.

Indeed, Hugh's input was important here and gave pause for thought. This
was not an easy decision, and I ended up changing my mind from initially
supporting this change... :)

We may return to it later, but for the time being this is the rather
conservative approach we've decided upon.

Re: man page - I _do_ intend to update the man page as I find it far too
vague on this topic currently, so that patch will be coming soon.

I will cc the THP folks on that patch when I send it.

>
> >
> > [1]
> > https://lore.kernel.org/linux-mm/75c02dbf-4189-958d-515e-fa80bb2187fc@google.com/
>
> Best regards
> Barry

Cheers, Lorenzo


Thread overview: 16+ messages
2025-07-21 15:55 [PATCH] docs: update THP documentation to clarify sysfs "never" setting Lorenzo Stoakes
2025-07-21 16:27 ` SeongJae Park
2025-07-22  1:13 ` Zi Yan
2025-07-22  1:30 ` Baolin Wang
2025-07-22  2:23   ` Barry Song
2025-07-22  2:33     ` Baolin Wang
2025-07-22  3:37       ` Barry Song
2025-07-22  5:29         ` Lorenzo Stoakes [this message]
2025-07-23 15:41           ` Lorenzo Stoakes
2025-07-22  5:24     ` Lorenzo Stoakes
2025-07-22  5:25   ` Lorenzo Stoakes
2025-07-22  5:34 ` Lorenzo Stoakes
2025-07-22  5:59   ` Baolin Wang
2025-07-22  8:19   ` Barry Song
2025-07-22  7:20 ` David Hildenbrand
2025-07-22  7:59   ` David Hildenbrand
