From: Lorenzo Stoakes <ljs@kernel.org>
To: Nico Pache <npache@redhat.com>
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	 linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
	aarcange@redhat.com,  akpm@linux-foundation.org,
	anshuman.khandual@arm.com, apopple@nvidia.com, baohua@kernel.org,
	 baolin.wang@linux.alibaba.com, byungchul@sk.com,
	catalin.marinas@arm.com, cl@gentwo.org,  corbet@lwn.net,
	dave.hansen@linux.intel.com, david@kernel.org, dev.jain@arm.com,
	 gourry@gourry.net, hannes@cmpxchg.org, hughd@google.com,
	jack@suse.cz,  jackmanb@google.com, jannh@google.com,
	jglisse@google.com, joshua.hahnjy@gmail.com,  kas@kernel.org,
	lance.yang@linux.dev, liam@infradead.org,
	 mathieu.desnoyers@efficios.com, matthew.brost@intel.com,
	mhiramat@kernel.org, mhocko@suse.com,  peterx@redhat.com,
	pfalcato@suse.de, rakie.kim@sk.com, raquini@redhat.com,
	 rdunlap@infradead.org, richard.weiyang@gmail.com,
	rientjes@google.com,  rostedt@goodmis.org, rppt@kernel.org,
	ryan.roberts@arm.com, shivankg@amd.com,  sunnanyong@huawei.com,
	surenb@google.com, thomas.hellstrom@linux.intel.com,
	 tiwai@suse.de, usamaarif642@gmail.com, vbabka@suse.cz,
	vishal.moola@gmail.com,  wangkefeng.wang@huawei.com,
	will@kernel.org, willy@infradead.org,
	 yang@os.amperecomputing.com, ying.huang@linux.alibaba.com,
	ziy@nvidia.com, zokeefe@google.com,
	 Bagas Sanjaya <bagasdotme@gmail.com>
Subject: Re: [PATCH mm-unstable v19 14/14] Documentation: mm: update the admin guide for mTHP collapse
Date: Fri, 5 Jun 2026 19:20:39 +0100
Message-ID: <aiMTFl2vvns1_dn3@lucifer>
In-Reply-To: <20260605161422.213817-15-npache@redhat.com>

On Fri, Jun 05, 2026 at 10:14:21AM -0600, Nico Pache wrote:
> Now that we can collapse to mTHPs lets update the admin guide to
> reflect these changes and provide proper guidance on how to utilize it.
>
> Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
> Signed-off-by: Nico Pache <npache@redhat.com>

This is completely fine, with no blockers, but there are a couple of tiny
things below that Claude brought up, for a possible trivial follow-up.

> ---
>  Documentation/admin-guide/mm/transhuge.rst | 49 ++++++++++++++--------
>  1 file changed, 32 insertions(+), 17 deletions(-)
>
> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> index b98e18c80185..23f8d13c2629 100644
> --- a/Documentation/admin-guide/mm/transhuge.rst
> +++ b/Documentation/admin-guide/mm/transhuge.rst
> @@ -63,7 +63,8 @@ often.
>  THP can be enabled system wide or restricted to certain tasks or even
>  memory ranges inside task's address space. Unless THP is completely
>  disabled, there is ``khugepaged`` daemon that scans memory and
> -collapses sequences of basic pages into PMD-sized huge pages.
> +collapses sequences of basic pages into huge pages of either PMD size
> +or mTHP sizes, if the system is configured to do so.
>
>  The THP behaviour is controlled via :ref:`sysfs <thp_sysfs>`
>  interface and using madvise(2) and prctl(2) system calls.
> @@ -219,10 +220,10 @@ this behaviour by writing 0 to shrink_underused, and enable it by writing
>  	echo 0 > /sys/kernel/mm/transparent_hugepage/shrink_underused
>  	echo 1 > /sys/kernel/mm/transparent_hugepage/shrink_underused
>
> -khugepaged will be automatically started when PMD-sized THP is enabled
> +khugepaged will be automatically started when any THP size is enabled
>  (either of the per-size anon control or the top-level control are set
>  to "always" or "madvise"), and it'll be automatically shutdown when
> -PMD-sized THP is disabled (when both the per-size anon control and the
> +all THP sizes are disabled (when both the per-size anon control and the
>  top-level control are "never")

Claude was very pedantic and said we need a full stop here :P

This is not a blocker, obviously...!

>
>  process THP controls
> @@ -265,8 +266,8 @@ Khugepaged controls
>  -------------------
>
>  .. note::
> -   khugepaged currently only searches for opportunities to collapse to
> -   PMD-sized THP and no attempt is made to collapse to other THP
> +   khugepaged currently only searches for opportunities to collapse file/shmem
> +   to PMD-sized THP. Only anonymous memory will attempt to collapse to other THP
>     sizes.
>
>  khugepaged runs usually at low frequency so while one may not want to
> @@ -296,11 +297,11 @@ allocation failure to throttle the next allocation attempt::
>  The khugepaged progress can be seen in the number of pages collapsed (note
>  that this counter may not be an exact count of the number of pages
>  collapsed, since "collapsed" could mean multiple things: (1) A PTE mapping
> -being replaced by a PMD mapping, or (2) All 4K physical pages replaced by
> -one 2M hugepage. Each may happen independently, or together, depending on
> -the type of memory and the failures that occur. As such, this value should
> -be interpreted roughly as a sign of progress, and counters in /proc/vmstat
> -consulted for more accurate accounting)::
> +being replaced by a PMD mapping, or (2) physical pages replaced by one
> +hugepage of various sizes (PMD-sized or mTHP). Each may happen independently,
> +or together, depending on the type of memory and the failures that occur.
> +As such, this value should be interpreted roughly as a sign of progress,
> +and counters in /proc/vmstat consulted for more accurate accounting)::

Claude also pointed out that the per-mTHP counters are only actually exposed
through /sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/stats/, so
maybe that's worth mentioning here too?
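
For illustration, a minimal sketch of how one might dump those per-size
counters from userspace. It assumes only the directory layout named above.
The helper name read_mthp_stats() is made up for this example, and it makes
no assumption about which counter files exist; it just reads whatever is
there:

```python
from pathlib import Path

# Top-level THP sysfs directory, as referenced in the admin guide.
THP_SYSFS = Path("/sys/kernel/mm/transparent_hugepage")


def read_mthp_stats(root=THP_SYSFS):
    """Return {"hugepages-<size>kB": {counter_name: value}}.

    Hypothetical helper: walks every hugepages-<size>kB/stats/ directory
    under `root` and parses each file as a single integer.
    """
    stats = {}
    for size_dir in sorted(Path(root).glob("hugepages-*kB")):
        stats_dir = size_dir / "stats"
        if not stats_dir.is_dir():
            continue
        counters = {}
        for entry in sorted(stats_dir.iterdir()):
            try:
                counters[entry.name] = int(entry.read_text().strip())
            except (OSError, ValueError):
                # Skip anything unreadable or not a plain integer.
                continue
        stats[size_dir.name] = counters
    return stats


if __name__ == "__main__":
    for size, counters in read_mthp_stats().items():
        for name, value in counters.items():
            print(f"{size}/{name}: {value}")
```

If no hugepages-<size>kB directories are present, the glob matches nothing
and the result is simply empty.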

>
>  	/sys/kernel/mm/transparent_hugepage/khugepaged/pages_collapsed
>
> @@ -308,16 +309,21 @@ for each pass::
>
>  	/sys/kernel/mm/transparent_hugepage/khugepaged/full_scans
>
> -``max_ptes_none`` specifies how many extra small pages (that are
> -not already mapped) can be allocated when collapsing a group
> -of small pages into one large page::
> +``max_ptes_none`` specifies how many empty (none/zero) pages are allowed
> +when collapsing a group of small pages into one large page::
>
>  	/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none
>
> -A higher value leads to use additional memory for programs.
> -A lower value leads to gain less thp performance. Value of
> -max_ptes_none can waste cpu time very little, you can
> -ignore it.
> +For PMD-sized THP collapse, this directly limits the number of empty pages
> +allowed in the 2MB region.
> +
> +For mTHP collapse, only 0 or (HPAGE_PMD_NR - 1) are supported. At
> +HPAGE_PMD_NR - 1, we collapse to the highest possible order. Any intermediate
> +value will emit a warning and mTHP collapse will default to max_ptes_none=0.
> +
> +A higher value allows more empty pages, potentially leading to more memory
> +usage but better THP performance. A lower value is more conservative and
> +may result in fewer THP collapses.
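
As I read the hunk above, the mTHP handling of this knob could be modelled
roughly as below. This is purely a userspace sketch of the documented rule,
not the kernel code: effective_mthp_max_ptes_none() is a hypothetical name,
and HPAGE_PMD_NR = 512 assumes 4K base pages with a 2M PMD.

```python
import warnings

# Assumption: 4K base pages and 2M PMD-sized THP, so 512 PTEs per PMD.
HPAGE_PMD_ORDER = 9
HPAGE_PMD_NR = 1 << HPAGE_PMD_ORDER


def effective_mthp_max_ptes_none(configured):
    """Return the max_ptes_none setting that mTHP collapse would honour.

    Models only the rule stated in the documentation text:
    0 and HPAGE_PMD_NR - 1 are supported as-is; any intermediate value
    warns and is treated as 0 for mTHP collapse.
    """
    if not 0 <= configured <= HPAGE_PMD_NR - 1:
        # Illustrative range check for the sketch.
        raise ValueError("max_ptes_none out of range")
    if configured in (0, HPAGE_PMD_NR - 1):
        return configured
    warnings.warn(
        f"max_ptes_none={configured} is not supported for mTHP collapse; "
        "treating it as 0"
    )
    return 0
```

So under these assumptions 0 and 511 pass through unchanged, while something
like 64 warns and behaves as 0 for mTHP collapse. PMD-sized collapse is not
modelled here; per the text it uses the configured value directly.
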
>
>  ``max_ptes_swap`` specifies how many pages can be brought in from
>  swap when collapsing a group of pages into a transparent huge page::
> @@ -337,6 +343,15 @@ that THP is shared. Exceeding the number would block the collapse::
>
>  A higher value may increase memory footprint for some workloads.
>
> +.. note::
> +   For mTHP collapse, khugepaged does not support collapsing regions that
> +   contain shared or swapped out pages, as this could lead to continuous
> +   promotion to higher orders. The collapse will fail if any shared or
> +   swapped PTEs are encountered during the scan.
> +
> +   Currently, madvise_collapse only supports collapsing to PMD-sized THPs
> +   and does not attempt mTHP collapses.
> +
>  Boot parameters
>  ===============
>
> --
> 2.54.0
>

Cheers, Lorenzo
