From: Nico Pache <npache@redhat.com>
To: linux-mm@kvack.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Cc: david@redhat.com, ziy@nvidia.com, baolin.wang@linux.alibaba.com,
lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
ryan.roberts@arm.com, dev.jain@arm.com, corbet@lwn.net,
rostedt@goodmis.org, mhiramat@kernel.org,
mathieu.desnoyers@efficios.com, akpm@linux-foundation.org,
baohua@kernel.org, willy@infradead.org, peterx@redhat.com,
wangkefeng.wang@huawei.com, usamaarif642@gmail.com,
sunnanyong@huawei.com, vishal.moola@gmail.com,
thomas.hellstrom@linux.intel.com, yang@os.amperecomputing.com,
kirill.shutemov@linux.intel.com, aarcange@redhat.com,
raquini@redhat.com, anshuman.khandual@arm.com,
catalin.marinas@arm.com, tiwai@suse.de, will@kernel.org,
dave.hansen@linux.intel.com, jack@suse.cz, cl@gentwo.org,
jglisse@google.com, surenb@google.com, zokeefe@google.com,
hannes@cmpxchg.org, rientjes@google.com, mhocko@suse.com,
rdunlap@infradead.org, Bagas Sanjaya <bagasdotme@gmail.com>
Subject: [PATCH v8 15/15] Documentation: mm: update the admin guide for mTHP collapse
Date: Tue, 1 Jul 2025 23:57:42 -0600
Message-ID: <20250702055742.102808-16-npache@redhat.com>
In-Reply-To: <20250702055742.102808-1-npache@redhat.com>
Now that we can collapse to mTHPs, let's update the admin guide to
reflect these changes and provide proper guidance on how to use them.
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Nico Pache <npache@redhat.com>
---
Documentation/admin-guide/mm/transhuge.rst | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
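A note on the recommended max_ptes_none bound documented in the hunk
below. This is a minimal userspace sketch, not the kernel code: it
assumes the per-order limit is max_ptes_none right-shifted by
(HPAGE_PMD_ORDER - order), that HPAGE_PMD_ORDER is 9 (4K pages), and
that a region may be collapsed when its count of empty PTEs does not
exceed that scaled limit. The helper names are hypothetical. It models
the worst case for "creep": one half of an order+1 region has just been
collapsed (fully populated) and the other half is still empty.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4K base pages, so a PMD maps 2^9 = 512 PTEs. */
#define HPAGE_PMD_ORDER 9
#define HPAGE_PMD_NR (1u << HPAGE_PMD_ORDER)

/*
 * Hypothetical helper: scale the PMD-level max_ptes_none tunable down
 * to a smaller collapse order by shifting out the order difference.
 */
static unsigned int scaled_max_ptes_none(unsigned int max_ptes_none,
					 unsigned int order)
{
	assert(order <= HPAGE_PMD_ORDER);
	return max_ptes_none >> (HPAGE_PMD_ORDER - order);
}

/*
 * Hypothetical helper: after an order-sized region has been collapsed,
 * the enclosing order+1 region has at most 2^order empty PTEs. Return
 * true if that alone is enough to pass the scaled check on the next
 * scan, i.e. the collapse "creeps" up one order.
 */
static bool creeps_to_next_order(unsigned int max_ptes_none,
				 unsigned int order)
{
	unsigned int none_in_parent = 1u << order;

	assert(order < HPAGE_PMD_ORDER);
	return none_in_parent <= scaled_max_ptes_none(max_ptes_none,
						      order + 1);
}
```

Under these assumptions, creeps_to_next_order(256, 4) is true while
creeps_to_next_order(255, 4) is false, which is why the text suggests
staying below HPAGE_PMD_NR / 2.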
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index dff8d5985f0f..878796b4d7d3 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -63,7 +63,7 @@ often.
THP can be enabled system wide or restricted to certain tasks or even
memory ranges inside task's address space. Unless THP is completely
disabled, there is ``khugepaged`` daemon that scans memory and
-collapses sequences of basic pages into PMD-sized huge pages.
+collapses sequences of basic pages into huge pages.
The THP behaviour is controlled via :ref:`sysfs <thp_sysfs>`
interface and using madvise(2) and prctl(2) system calls.
@@ -144,6 +144,18 @@ hugepage sizes have enabled="never". If enabling multiple hugepage
sizes, the kernel will select the most appropriate enabled size for a
given allocation.
+khugepaged uses max_ptes_none scaled to the order of the enabled mTHP size
+to determine collapses. When using mTHPs it's recommended to set
+max_ptes_none low, ideally less than HPAGE_PMD_NR / 2 (at most 255 with 4K
+pages). This will prevent undesired "creep" behavior that leads to
+continuously collapsing to the largest mTHP size: when we collapse, we are
+bringing in new non-zero pages that will, on a subsequent scan, cause the
+max_ptes_none check of the +1 order to always be satisfied. By limiting
+this to less than half the PTEs of the current order, we avoid this
+feedback loop. max_ptes_shared and max_ptes_swap have no effect when
+collapsing to an mTHP, and mTHP collapse will fail on shared or swapped out
+pages.
+
It's also possible to limit defrag efforts in the VM to generate
anonymous hugepages in case they're not immediately free to madvise
regions or to never try to defrag memory and simply fallback to regular
@@ -221,11 +233,6 @@ top-level control are "never")
Khugepaged controls
-------------------
-.. note::
- khugepaged currently only searches for opportunities to collapse to
- PMD-sized THP and no attempt is made to collapse to other THP
- sizes.
-
khugepaged runs usually at low frequency so while one may not want to
invoke defrag algorithms synchronously during the page faults, it
should be worth invoking defrag at least in khugepaged. However it's
--
2.49.0