From: Lance Yang <lance.yang@linux.dev>
To: luizcap@redhat.com, baolin.wang@linux.alibaba.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
david@kernel.org, ziy@nvidia.com, lance.yang@linux.dev,
corbet@lwn.net, tsbogend@alpha.franken.de, maddy@linux.ibm.com,
mpe@ellerman.id.au, agordeev@linux.ibm.com,
gerald.schaefer@linux.ibm.com, hca@linux.ibm.com,
gor@linux.ibm.com, x86@kernel.org, tglx@kernel.org,
mingo@redhat.com, bp@alien8.de, ira.weiny@intel.com,
hughd@google.com, dave.hansen@linux.intel.com, djbw@kernel.org,
vishal.l.verma@intel.com, dave.jiang@intel.com,
akpm@linux-foundation.org, yintirui@huawei.com, ljs@kernel.org
Subject: Re: [PATCH v5 14/14] mm: thp: always enable mTHP support
Date: Wed, 3 Jun 2026 14:47:18 +0800
Message-ID: <20260603064718.81699-1-lance.yang@linux.dev>
In-Reply-To: <8781a9a0f115705ee11884ed3184b65a1ce39923.1780066530.git.luizcap@redhat.com>
Hi Luiz,
SHMEM_HUGE_FORCE still assumes PMD order in a few places.
Is that expected?
shmem_init() only sets the default inherit mask when PMD leaves are
available.

	if (!shmem_orders_configured && pgtable_has_pmd_leaves())
		huge_shmem_orders_inherit = BIT(HPAGE_PMD_ORDER);

But shmem_parse_huge() rejects "force" unless the mask is exactly PMD
order.

	if (huge == SHMEM_HUGE_FORCE &&
	    huge_shmem_orders_inherit != BIT(HPAGE_PMD_ORDER))
		return -EINVAL;
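
To make the interaction between the two snippets above concrete, here is a
minimal userspace sketch, not kernel code. All model_* and MODEL_* names are
hypothetical stand-ins, the PMD order of 9 assumes 4K base pages, and the
other checks in shmem_parse_huge() are deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_BIT(n)     (1UL << (n))
#define MODEL_PMD_ORDER  9U	/* assumption: 4K base pages, 2M PMD */

/* Hypothetical stand-ins for the kernel's SHMEM_HUGE_* values. */
enum model_huge { MODEL_HUGE_NEVER, MODEL_HUGE_ALWAYS, MODEL_HUGE_FORCE };

/*
 * Models the default chosen in shmem_init() with the patch applied:
 * the inherit mask is only set to PMD order when nothing was configured
 * and PMD leaves are available; otherwise it is left as it was.
 */
static unsigned long model_init_inherit(unsigned long inherit,
					bool orders_configured,
					bool has_pmd_leaves)
{
	if (!orders_configured && has_pmd_leaves)
		inherit = MODEL_BIT(MODEL_PMD_ORDER);
	return inherit;
}

/*
 * Models only the "force" check quoted from shmem_parse_huge():
 * "force" is rejected unless the inherit mask is exactly PMD order.
 */
static int model_parse_huge(enum model_huge huge, unsigned long inherit)
{
	assert(huge <= MODEL_HUGE_FORCE);

	if (huge == MODEL_HUGE_FORCE &&
	    inherit != MODEL_BIT(MODEL_PMD_ORDER))
		return -EINVAL;
	return (int)huge;
}
```

In this model, an unconfigured system without PMD leaves keeps an empty
inherit mask, so "force" is rejected with -EINVAL, while the same request
is accepted when PMD leaves are available.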

Even if "force" is selected, shmem_huge_global_enabled() still returns
only PMD order.

	if (shmem_huge_force || shmem_huge == SHMEM_HUGE_FORCE)
		return maybe_pmd_order;

And shmem_allowable_huge_orders() then masks it out.

	if (!pgtable_has_pmd_leaves())
		disabled_orders = BIT(PMD_ORDER);

	if (!vma || !vma_is_anon_shmem(vma))
		return global_orders & ~disabled_orders;

For anon shmem, it can also return 0 for the same reason.

	if (shmem_huge == SHMEM_HUGE_FORCE)
		return READ_ONCE(huge_shmem_orders_inherit) & ~disabled_orders;

Should SHMEM_HUGE_FORCE use the available mTHP orders below PMD when
pgtable_has_pmd_leaves() is false?
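
The "force" paths quoted above can be modelled in userspace as follows. This
is only a sketch under stated assumptions: the model_* and MODEL_* names are
hypothetical, PMD order is taken to be 9 (4K base pages), and the global
"force" result is simplified to BIT(PMD_ORDER) as in the snippet from
shmem_huge_global_enabled().

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_BIT(n)     (1UL << (n))
#define MODEL_PMD_ORDER  9U	/* assumption: 4K base pages, 2M PMD */

/* Simplified "force" result: only PMD order is ever returned. */
static unsigned long model_global_enabled_force(void)
{
	return MODEL_BIT(MODEL_PMD_ORDER);
}

/*
 * Models the two SHMEM_HUGE_FORCE return paths of
 * shmem_allowable_huge_orders() with the patch applied:
 *  - tmpfs (not anon shmem): global orders minus the disabled orders
 *  - anon shmem: inherit mask minus the disabled orders
 */
static unsigned long model_allowable_force(bool has_pmd_leaves,
					   bool anon_shmem,
					   unsigned long inherit)
{
	unsigned long disabled_orders = 0;

	assert(MODEL_PMD_ORDER < sizeof(unsigned long) * 8);

	if (!has_pmd_leaves)
		disabled_orders = MODEL_BIT(MODEL_PMD_ORDER);

	if (!anon_shmem)
		return model_global_enabled_force() & ~disabled_orders;

	return inherit & ~disabled_orders;
}
```

With has_pmd_leaves false, the tmpfs path yields 0 in this model, and the
anon shmem path yields 0 as well whenever the inherit mask holds only PMD
order. A smaller order survives only if it is already present in the
inherit mask.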

Cheers,
Lance

On Fri, May 29, 2026 at 10:55:32AM -0400, Luiz Capitulino wrote:
>If PMD-sized pages are not supported on an architecture (i.e. the
>arch implements arch_has_pmd_leaves() and it returns false) then the
>current code disables all THP, including mTHP.
>
>This commit fixes this by allowing mTHP to be always enabled for all
>archs. When PMD-sized pages are not supported, their sysfs entry won't
>be created and mapping them will be disallowed at page-fault time.
>
>Similarly, this commit implements the following changes for shmem in
>shmem_allowable_huge_orders():
>
> - Drop the pgtable_has_pmd_leaves() check so that mTHP sizes are
> considered
> - Filter out PMD and PUD orders from allowable orders when
> PMD-sized pages are not supported by the CPU
>
>Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
>---
> mm/huge_memory.c | 25 ++++++++++++++++++++-----
> mm/shmem.c | 14 +++++++++-----
> 2 files changed, 29 insertions(+), 10 deletions(-)
>
>diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>index 32254febe097..059901a8c6cb 100644
>--- a/mm/huge_memory.c
>+++ b/mm/huge_memory.c
>@@ -126,6 +126,15 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
> 	else
> 		supported_orders = THP_ORDERS_ALL_FILE_DEFAULT;
>
>+	if (!pgtable_has_pmd_leaves()) {
>+		/*
>+		 * If the CPU does not support PMD leaves, assume for
>+		 * now that it does not support PUD leaves and disable
>+		 * both folio orders.
>+		 */
>+		supported_orders &= ~(BIT(PMD_ORDER) | BIT(PUD_ORDER));
>+	}
>+
> 	orders &= supported_orders;
> 	if (!orders)
> 		return 0;
>@@ -133,7 +142,7 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
> 	if (!vma->vm_mm) /* vdso */
> 		return 0;
>
>-	if (!pgtable_has_pmd_leaves() || vma_thp_disabled(vma, vm_flags, forced_collapse))
>+	if (vma_thp_disabled(vma, vm_flags, forced_collapse))
> 		return 0;
>
> 	/* khugepaged doesn't collapse DAX vma, but page fault is fine. */
>@@ -848,7 +857,7 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj)
> 	 * disable all other sizes. powerpc's PMD_ORDER isn't a compile-time
> 	 * constant so we have to do this here.
> 	 */
>-	if (!anon_orders_configured)
>+	if (!anon_orders_configured && pgtable_has_pmd_leaves())
> 		huge_anon_orders_inherit = BIT(PMD_ORDER);
>
> 	*hugepage_kobj = kobject_create_and_add("transparent_hugepage", mm_kobj);
>@@ -870,6 +879,15 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj)
> 	}
>
> 	orders = THP_ORDERS_ALL_ANON | THP_ORDERS_ALL_FILE_DEFAULT;
>+	if (!pgtable_has_pmd_leaves()) {
>+		/*
>+		 * If the CPU does not support PMD leaves, assume for
>+		 * now that it does not support PUD leaves and disable
>+		 * both folio orders.
>+		 */
>+		orders &= ~(BIT(PMD_ORDER) | BIT(PUD_ORDER));
>+	}
>+
> 	order = highest_order(orders);
> 	while (orders) {
> 		thpsize = thpsize_create(order, *hugepage_kobj);
>@@ -969,9 +987,6 @@ static int __init hugepage_init(void)
> 	int err;
> 	struct kobject *hugepage_kobj;
>
>-	if (!pgtable_has_pmd_leaves())
>-		return -EINVAL;
>-
> 	/*
> 	 * hugepages can't be allocated by the buddy allocator
> 	 */
>diff --git a/mm/shmem.c b/mm/shmem.c
>index 079e299ea789..c15dffd0eb41 100644
>--- a/mm/shmem.c
>+++ b/mm/shmem.c
>@@ -1844,16 +1844,19 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
> 	unsigned long mask = READ_ONCE(huge_shmem_orders_always);
> 	unsigned long within_size_orders = READ_ONCE(huge_shmem_orders_within_size);
> 	vm_flags_t vm_flags = vma ? vma->vm_flags : 0;
>-	unsigned int global_orders;
>+	unsigned int global_orders, disabled_orders = 0;
>
>-	if (!pgtable_has_pmd_leaves() || (vma && vma_thp_disabled(vma, vm_flags, shmem_huge_force)))
>+	if (vma && vma_thp_disabled(vma, vm_flags, shmem_huge_force))
> 		return 0;
>
>+	if (!pgtable_has_pmd_leaves())
>+		disabled_orders = BIT(PMD_ORDER);
>+
> 	global_orders = shmem_huge_global_enabled(inode, index, write_end,
> 						  shmem_huge_force, vma, vm_flags);
> 	/* Tmpfs huge pages allocation */
> 	if (!vma || !vma_is_anon_shmem(vma))
>-		return global_orders;
>+		return global_orders & ~disabled_orders;
>
> 	/*
> 	 * Following the 'deny' semantics of the top level, force the huge
>@@ -1867,7 +1870,7 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
> 	 * means non-PMD sized THP can not override 'huge' mount option now.
> 	 */
> 	if (shmem_huge == SHMEM_HUGE_FORCE)
>-		return READ_ONCE(huge_shmem_orders_inherit);
>+		return READ_ONCE(huge_shmem_orders_inherit) & ~disabled_orders;
>
> 	/* Allow mTHP that will be fully within i_size. */
> 	mask |= shmem_get_orders_within_size(inode, within_size_orders, index, 0);
>@@ -1878,6 +1881,7 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
> 	if (global_orders > 0)
> 		mask |= READ_ONCE(huge_shmem_orders_inherit);
>
>+	mask &= ~disabled_orders;
> 	return THP_ORDERS_ALL_FILE_DEFAULT & mask;
> }
>
>@@ -5461,7 +5465,7 @@ void __init shmem_init(void)
> 	 * Default to setting PMD-sized THP to inherit the global setting and
> 	 * disable all other multi-size THPs.
> 	 */
>-	if (!shmem_orders_configured)
>+	if (!shmem_orders_configured && pgtable_has_pmd_leaves())
> 		huge_shmem_orders_inherit = BIT(HPAGE_PMD_ORDER);
> #endif
> 	return;
>--
>2.54.0
>
>
Thread overview: 28+ messages
2026-05-29 14:55 [PATCH v5 00/14] mm: thp: always enable mTHP support Luiz Capitulino
2026-05-29 14:55 ` [PATCH v5 01/14] docs: tmpfs: remove implementation detail reference Luiz Capitulino
2026-06-02 10:16 ` Dev Jain
2026-05-29 14:55 ` [PATCH v5 02/14] mm: shmem: shmem_getattr(): set blksize to highest supported THP order Luiz Capitulino
2026-06-02 15:13 ` Zi Yan
2026-05-29 14:55 ` [PATCH v5 03/14] mm: introduce pgtable_has_pmd_leaves() Luiz Capitulino
2026-05-29 14:55 ` [PATCH v5 04/14] drivers: dax: use pgtable_has_pmd_leaves() Luiz Capitulino
2026-05-29 14:55 ` [PATCH v5 05/14] drivers: nvdimm: " Luiz Capitulino
2026-05-29 14:55 ` [PATCH v5 06/14] mm: debug_vm_pgtable: " Luiz Capitulino
2026-05-29 14:55 ` [PATCH v5 07/14] mm: shmem: allow THP support determination at folio allocation time Luiz Capitulino
2026-06-01 12:09 ` Baolin Wang
2026-06-01 12:20 ` Luiz Capitulino
2026-06-02 2:55 ` Baolin Wang
2026-05-29 14:55 ` [PATCH v5 08/14] s390: move has_transparent_hugepage() out of THP guard Luiz Capitulino
2026-05-29 14:55 ` [PATCH v5 09/14] powerpc: " Luiz Capitulino
2026-05-29 14:55 ` [PATCH v5 10/14] mips: " Luiz Capitulino
2026-05-29 14:55 ` [PATCH v5 11/14] x86: " Luiz Capitulino
2026-05-29 14:55 ` [PATCH v5 12/14] treewide: introduce arch_has_pmd_leaves() Luiz Capitulino
2026-05-29 14:55 ` [PATCH v5 13/14] mm: replace thp_disabled_by_hw() with pgtable_has_pmd_leaves() Luiz Capitulino
2026-05-29 14:55 ` [PATCH v5 14/14] mm: thp: always enable mTHP support Luiz Capitulino
2026-06-01 12:02 ` Baolin Wang
2026-06-02 3:02 ` Baolin Wang
2026-06-03 6:47 ` Lance Yang [this message]
2026-06-03 7:23 ` Baolin Wang
2026-06-03 7:40 ` Lance Yang
2026-06-03 8:12 ` Lance Yang
2026-06-03 17:40 ` Luiz Capitulino
2026-06-04 2:04 ` Lance Yang