* Re: [PATCH v4 0/9] mm: thp: always enable mTHP support
[not found] ` <20260503080236.4ea7f3fec0e5788d50113599@linux-foundation.org>
@ 2026-05-04 19:11 ` Luiz Capitulino
0 siblings, 0 replies; 12+ messages in thread
From: Luiz Capitulino @ 2026-05-04 19:11 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, linux-mm, david, baolin.wang, ziy, lance.yang,
corbet, tsbogend, maddy, mpe, agordeev, gerald.schaefer, hca, gor,
x86, dave.hansen, djbw, vishal.l.verma, dave.jiang,
lorenzo.stoakes
On 2026-05-03 11:02, Andrew Morton wrote:
> On Fri, 1 May 2026 15:18:42 -0400 Luiz Capitulino <luizcap@redhat.com> wrote:
>
>> Today, if an architecture implements has_transparent_hugepage() and the CPU
>> lacks support for PMD-sized pages, the THP code disables all THP, including
>> mTHP. In addition, the kernel lacks a well defined API to check for
>> PMD-sized page support. It currently relies on has_transparent_hugepage()
>> and thp_disabled_by_hw(), but they are not well defined and are tied to
>> THP support.
>>
>> This series addresses both issues by introducing a new well defined API
>> to query PMD-sized page support: pgtable_has_pmd_leaves(). Using this
>> new helper, we ensure that mTHP remains enabled even when the
>> architecture or CPU doesn't support PMD-sized pages.
>>
>> Thanks to David Hildenbrand for suggesting this improvement and for
>> providing guidance (all bugs and misconceptions are mine).
>>
>> This applies to Linus tree 08d0d3466664 ("Merge tag 'net-7.1-rc2'
>> of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
>>
>> NOTE: I used Claude Code Opus 4.6 to *review* the series before
>> posting. It did find one issue where a pgtable_has_pmd_leaves()
>> check was missing when assigning huge_shmem_orders_inherit in
>> shmem_init().
>
> Thanks.
>
> Sashiko found a few other things to ask about:
> https://sashiko.dev/#/patchset/cover.1777663129.git.luizcap@redhat.com
Thanks, I'll go over those soon.
* Re: [PATCH v4 9/9] mm: thp: always enable mTHP support
[not found] ` <f67da00a825da9097b5faf2f390ad344450b88be.1777663129.git.luizcap@redhat.com>
@ 2026-05-06 5:46 ` Baolin Wang
2026-05-06 18:34 ` (sashiko review) " Luiz Capitulino
2026-05-13 15:58 ` David Hildenbrand (Arm)
2 siblings, 0 replies; 12+ messages in thread
From: Baolin Wang @ 2026-05-06 5:46 UTC (permalink / raw)
To: Luiz Capitulino, linux-kernel, linux-mm, david, ziy, lance.yang
Cc: corbet, tsbogend, maddy, mpe, agordeev, gerald.schaefer, hca, gor,
x86, dave.hansen, djbw, vishal.l.verma, dave.jiang, akpm,
lorenzo.stoakes
On 5/2/26 3:18 AM, Luiz Capitulino wrote:
> If PMD-sized pages are not supported on an architecture (ie. the
> arch implements arch_has_pmd_leaves() and it returns false) then the
> current code disables all THP, including mTHP.
>
> This commit fixes this by allowing mTHP to be always enabled for all
> archs. When PMD-sized pages are not supported, their sysfs entry won't
> be created and mapping them will be disallowed at page-fault time.
>
> Similarly, this commit implements the following changes for shmem in
> shmem_allowable_huge_orders():
>
> - Drop the pgtable_has_pmd_leaves() check so that mTHP sizes are
> considered
> - Filter out PMD and PUD orders from allowable orders when
> PMD-sized pages are not supported by the CPU
>
> Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
> ---
Nothing else caught my eye. Thanks. Feel free to add:
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
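For illustration only (not code from the patch): a minimal userspace sketch of the order filtering the commit message describes for shmem_allowable_huge_orders(). The helper name, the MODEL_* macros and the order values (4 KiB base pages, 2 MiB PMD, 1 GiB PUD) are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration: 4 KiB base pages, 2 MiB PMD, 1 GiB PUD. */
#define MODEL_PMD_ORDER 9
#define MODEL_PUD_ORDER 18
#define MODEL_BIT(n)    (1UL << (n))

/*
 * Hypothetical helper: "orders" is a bitmask where bit N set means
 * order-N folios are allowed. Without PMD leaves, drop the PMD and PUD
 * orders and keep the smaller (mTHP) ones.
 */
unsigned long model_filter_orders(unsigned long orders, bool has_pmd_leaves)
{
	if (!has_pmd_leaves)
		orders &= ~(MODEL_BIT(MODEL_PMD_ORDER) | MODEL_BIT(MODEL_PUD_ORDER));
	return orders;
}

/* Usage example: mTHP orders 2 and 4 survive, the PMD order does not. */
void model_filter_orders_example(void)
{
	unsigned long req = MODEL_BIT(2) | MODEL_BIT(4) | MODEL_BIT(MODEL_PMD_ORDER);

	assert(model_filter_orders(req, false) == (MODEL_BIT(2) | MODEL_BIT(4)));
	assert(model_filter_orders(req, true) == req);
}
```

The point of the sketch is that the hardware check masks out individual orders rather than rejecting large folios wholesale.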
* (sashiko review) Re: [PATCH v4 2/9] mm: introduce pgtable_has_pmd_leaves()
[not found] ` <2a0bae00cdd2b6ef6b962610b523ebfc97806ba7.1777663129.git.luizcap@redhat.com>
@ 2026-05-06 17:50 ` Luiz Capitulino
2026-05-13 15:30 ` David Hildenbrand (Arm)
2026-05-13 15:36 ` David Hildenbrand (Arm)
2 siblings, 0 replies; 12+ messages in thread
From: Luiz Capitulino @ 2026-05-06 17:50 UTC (permalink / raw)
To: linux-kernel, linux-mm, david, baolin.wang, ziy, lance.yang
Cc: corbet, tsbogend, maddy, mpe, agordeev, gerald.schaefer, hca, gor,
x86, dave.hansen, djbw, vishal.l.verma, dave.jiang, akpm,
lorenzo.stoakes
On 2026-05-01 15:18, Luiz Capitulino wrote:
> Currently, we have two helpers that check for PMD-sized pages but have
> different names and slightly different semantics:
>
> - has_transparent_hugepage(): the name suggests it checks if THP is
> enabled, but when CONFIG_TRANSPARENT_HUGEPAGE=y and the architecture
> implements this helper, it actually checks if the CPU supports
> PMD-sized pages
>
> - thp_disabled_by_hw(): the name suggests it checks if THP is disabled
> by the hardware, but it just returns a cached value acquired with
> has_transparent_hugepage(). This helper is used in fast paths
>
> This commit introduces a new helper called pgtable_has_pmd_leaves()
> which is intended to replace both has_transparent_hugepage() and
> thp_disabled_by_hw(). pgtable_has_pmd_leaves() has very clear semantics:
> it returns true if the CPU supports PMD-sized pages and false otherwise.
> It always returns a cached value, so it can be used in fast paths.
>
> The new helper requires an initialization step which is performed by
> init_arch_has_pmd_leaves(). We call init_arch_has_pmd_leaves() early
> during boot in start_kernel() right after parse_early_param() but before
> parse_args(). This allows early_param() handlers to change CPU flags if
> needed (eg. parse_memopt() in x86-32) while also allowing users to use
> the API from __setup() handlers.
>
> The next commits will convert users of both has_transparent_hugepage()
> and thp_disabled_by_hw() to pgtable_has_pmd_leaves().
>
> Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
> ---
> include/linux/pgtable.h | 15 +++++++++++++++
> init/main.c | 1 +
> mm/memory.c | 9 +++++++++
> 3 files changed, 25 insertions(+)
>
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index cdd68ed3ae1a..b365be3516bf 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -2243,6 +2243,21 @@ static inline const char *pgtable_level_to_str(enum pgtable_level level)
> }
> }
>
> +#ifdef CONFIG_MMU
> +DECLARE_STATIC_KEY_TRUE(__arch_has_pmd_leaves_key);
"""
This adds DECLARE_STATIC_KEY_TRUE and static_branch_likely but doesn't
explicitly include linux/jump_label.h. Can this cause build breakages in
compilation units that include linux/pgtable.h but do not have
linux/jump_label.h in their dependency chain?
"""
I've built this series for a few archs and I'm sure the kernel bot built
it too, so I'd assume this is not an issue. But since adding the header
doesn't hurt, I can do it.
> +static inline bool pgtable_has_pmd_leaves(void)
> +{
> + return static_branch_likely(&__arch_has_pmd_leaves_key);
> +}
> +void __init init_arch_has_pmd_leaves(void);
> +#else
> +static inline bool pgtable_has_pmd_leaves(void)
> +{
> + return false;
> +}
> +static inline void __init init_arch_has_pmd_leaves(void) { }
> +#endif
> +
> #endif /* !__ASSEMBLY__ */
>
> #if !defined(MAX_POSSIBLE_PHYSMEM_BITS) && !defined(CONFIG_64BIT)
> diff --git a/init/main.c b/init/main.c
> index 96f93bb06c49..eea7c5bdddf7 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -1053,6 +1053,7 @@ void start_kernel(void)
> print_kernel_cmdline(saved_command_line);
> /* parameters may set static keys */
> parse_early_param();
> + init_arch_has_pmd_leaves();
> after_dashes = parse_args("Booting kernel",
> static_command_line, __start___param,
> __stop___param - __start___param,
> diff --git a/mm/memory.c b/mm/memory.c
> index ea6568571131..90b2d9e84320 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -164,6 +164,15 @@ __setup("norandmaps", disable_randmaps);
>
> unsigned long highest_memmap_pfn __read_mostly;
>
> +DEFINE_STATIC_KEY_TRUE(__arch_has_pmd_leaves_key);
> +EXPORT_SYMBOL(__arch_has_pmd_leaves_key);
> +
> +void __init init_arch_has_pmd_leaves(void)
> +{
> + if (!has_transparent_hugepage())
> + static_branch_disable(&__arch_has_pmd_leaves_key);
> +}
> +
"""
Will this unconditionally disable the static key when
CONFIG_TRANSPARENT_HUGEPAGE is not enabled, since has_transparent_hugepage()
evaluates to 0 in that configuration?
This seems to contradict the commit message's goal of decoupling the hardware
capability check from THP enablement.
I see this is fixed later in the patch series in commit 4782375d13da
("treewide: introduce arch_has_pmd_leaves()"), but could this cause
regressions for non-THP PMD users if they test at this point in the series?
"""
There are no users of the new API in this patch, and all users converted
in the following patches are protected by #ifdef
CONFIG_TRANSPARENT_HUGEPAGE, so it's not an issue.
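For illustration only (not kernel code): a minimal userspace sketch of the probe-once, read-cached pattern discussed here, for an architecture that implements the hook. A plain bool stands in for the static key, and all model_* names are hypothetical. It assumes, as the review states, that has_transparent_hugepage() evaluates to 0 when THP is compiled out.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the static key; starts out true, as with
 * DEFINE_STATIC_KEY_TRUE() in the patch. */
static bool model_pmd_leaves_key = true;

/*
 * Hypothetical stand-in for has_transparent_hugepage() at this point in
 * the series: it reports 0 when THP is compiled out, whatever the CPU
 * supports.
 */
bool model_has_transparent_hugepage(bool thp_built_in, bool cpu_has_pmd_leaves)
{
	return thp_built_in && cpu_has_pmd_leaves;
}

/* Run once at "boot": probe and cache the answer. */
void model_init_arch_has_pmd_leaves(bool thp_built_in, bool cpu_has_pmd_leaves)
{
	model_pmd_leaves_key = true;
	if (!model_has_transparent_hugepage(thp_built_in, cpu_has_pmd_leaves))
		model_pmd_leaves_key = false;
}

/* Fast-path query: only reads the cached value. */
bool model_pgtable_has_pmd_leaves(void)
{
	return model_pmd_leaves_key;
}

/* Usage example covering the three interesting configurations. */
void model_pmd_leaves_example(void)
{
	model_init_arch_has_pmd_leaves(true, true);
	assert(model_pgtable_has_pmd_leaves());

	model_init_arch_has_pmd_leaves(true, false);
	assert(!model_pgtable_has_pmd_leaves());

	/* THP compiled out: the key is disabled even on a capable CPU. */
	model_init_arch_has_pmd_leaves(false, true);
	assert(!model_pgtable_has_pmd_leaves());
}
```

The last case is the one the review asks about. In this model it is harmless only as long as every caller is itself compiled out when THP is disabled.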
> void mm_trace_rss_stat(struct mm_struct *mm, int member)
> {
> trace_rss_stat(mm, member);
* (sashiko review) Re: [PATCH v4 6/9] mm: shmem: drop has_transparent_hugepage() usage
[not found] ` <d2d1d0deaa5732d82bc2daea5033a1578781d641.1777663129.git.luizcap@redhat.com>
@ 2026-05-06 18:12 ` Luiz Capitulino
0 siblings, 0 replies; 12+ messages in thread
From: Luiz Capitulino @ 2026-05-06 18:12 UTC (permalink / raw)
To: linux-kernel, linux-mm, david, baolin.wang, ziy, lance.yang
Cc: corbet, tsbogend, maddy, mpe, agordeev, gerald.schaefer, hca, gor,
x86, dave.hansen, djbw, vishal.l.verma, dave.jiang, akpm,
lorenzo.stoakes
On 2026-05-01 15:18, Luiz Capitulino wrote:
> Shmem uses has_transparent_hugepage() in the following ways:
>
> - shmem_parse_one() and shmem_parse_huge(): Check if THP is built-in and
> if the CPU supports PMD-sized pages
>
> - shmem_init(): Since the CONFIG_TRANSPARENT_HUGEPAGE guard is outside
> the code block calling has_transparent_hugepage(), the
> has_transparent_hugepage() call is exclusively checking if the CPU
> supports PMD-sized pages
>
> While it's necessary to check if CONFIG_TRANSPARENT_HUGEPAGE is enabled
> in all cases, shmem can determine mTHP size support at folio allocation
> time. Therefore, drop has_transparent_hugepage() usage while keeping the
> CONFIG_TRANSPARENT_HUGEPAGE checks.
>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Reviewed-by: Lance Yang <lance.yang@linux.dev>
> Acked-by: Zi Yan <ziy@nvidia.com>
> Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
> ---
> mm/shmem.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 3b5dc21b323c..1948d73fb1e3 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -689,7 +689,7 @@ static int shmem_parse_huge(const char *str)
> else
> return -EINVAL;
>
> - if (!has_transparent_hugepage() &&
> + if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> huge != SHMEM_HUGE_NEVER && huge != SHMEM_HUGE_DENY)
> return -EINVAL;
>
> @@ -4656,8 +4656,7 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
> case Opt_huge:
> ctx->huge = result.uint_32;
> if (ctx->huge != SHMEM_HUGE_NEVER &&
> - !(IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> - has_transparent_hugepage()))
> + !IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
> goto unsupported_parameter;
> ctx->seen |= SHMEM_SEEN_HUGE;
> break;
"""
By dropping the has_transparent_hugepage() check, will mount -t tmpfs
-o huge=always now succeed on hardware lacking PMD support?
If so, since hugepage_init() still sets the TRANSPARENT_HUGEPAGE_UNSUPPORTED
flag, thp_disabled_by_hw() will unconditionally block all large folio
allocations in shmem_allowable_huge_orders().
Does this create an intermediate state where the mount silently succeeds
but no huge pages of any size can actually be allocated?
I see this is resolved later in the series by commit cd27430097e8
("mm: replace thp_disabled_by_hw() with pgtable_has_pmd_leaves()") and
commit 641a20ae032f ("mm: thp: always enable mTHP support").
"""
Yes, mount -t tmpfs -o huge=always can succeed on hardware without PMD
support at this point in the series. But this seems very minor: the only
impact I can see is someone bisecting, landing on this patch, and having
a reproducer that depends on mounting tmpfs with -o huge=always on
hardware without PMD-sized page support. I can fix it if others feel
strongly about this.
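For illustration only (not kernel code): a minimal userspace sketch of the intermediate state in question. The model_* names are hypothetical, and the allocation-side behaviour is modelled as the review describes it, namely that a hardware-disabled THP blocks every large order at this point in the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Mount-time check after this patch: only the build-time THP setting matters. */
bool model_mount_accepts_huge_always(bool thp_built_in)
{
	return thp_built_in;
}

/*
 * Allocation-time check at this point in the series, per the review:
 * "orders" is a bitmask of requested folio orders, and all of them are
 * rejected when the hardware lacks PMD-sized pages.
 */
unsigned long model_allowable_orders(unsigned long orders, bool thp_disabled_by_hw)
{
	return thp_disabled_by_hw ? 0UL : orders;
}

/* Usage example: THP built in, CPU without PMD-sized pages. */
void model_mount_example(void)
{
	unsigned long requested = (1UL << 2) | (1UL << 4);

	assert(model_mount_accepts_huge_always(true));          /* mount succeeds */
	assert(model_allowable_orders(requested, true) == 0UL); /* nothing allocatable */
}
```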
> @@ -5449,7 +5448,7 @@ void __init shmem_init(void)
> #endif
>
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> - if (has_transparent_hugepage() && shmem_huge > SHMEM_HUGE_DENY)
> + if (shmem_huge > SHMEM_HUGE_DENY)
> SHMEM_SB(shm_mnt->mnt_sb)->huge = shmem_huge;
> else
> shmem_huge = SHMEM_HUGE_NEVER; /* just in case it was patched */
"""
Also, by allowing shmem_huge to be set to SHMEM_HUGE_ALWAYS on systems
without PMD support, does this incorrectly affect shmem_getattr()?
shmem_getattr() relies on shmem_huge_global_enabled(), which only checks
the software configuration and not hardware PMD support. Consequently,
shmem_getattr() will erroneously report stat->blksize = HPAGE_PMD_SIZE
to userspace.
Since subsequent patches in the series do not appear to update
shmem_getattr(), could this misleading block size cause userspace tools
to over-allocate IO buffers on hardware where PMD-sized pages are
structurally impossible?
"""
This is a real issue (albeit a small one). The problem is this check in
shmem_getattr():
	if (shmem_huge_global_enabled(inode, 0, 0, false, NULL, 0))
		stat->blksize = HPAGE_PMD_SIZE;
So, we may report HPAGE_PMD_SIZE even when PMD-sized pages are not
supported. It also looks like we may over-report today for the
SHMEM_HUGE_WITHIN_SIZE case. In any case, I'll fix this.
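For illustration only: a minimal userspace sketch of one possible shape for such a fix, where the reported block size is gated on both the software setting and hardware support. This is an assumption about how the check could look, not the fix that will be posted. The function name and the MODEL_* sizes (4 KiB pages, 2 MiB PMD) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE      4096UL               /* assumed base page size */
#define MODEL_HPAGE_PMD_SIZE (2UL * 1024 * 1024)  /* assumed PMD page size */

/*
 * Hypothetical model of the blksize decision: report the PMD size only
 * when huge pages are enabled in software AND the page tables can
 * actually hold PMD leaves.
 */
unsigned long model_shmem_blksize(bool huge_globally_enabled, bool has_pmd_leaves)
{
	if (huge_globally_enabled && has_pmd_leaves)
		return MODEL_HPAGE_PMD_SIZE;
	return MODEL_PAGE_SIZE;
}

/* Usage example: huge=always on hardware without PMD-sized pages. */
void model_blksize_example(void)
{
	assert(model_shmem_blksize(true, false) == MODEL_PAGE_SIZE);
	assert(model_shmem_blksize(true, true) == MODEL_HPAGE_PMD_SIZE);
}
```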
* (sashiko review) Re: [PATCH v4 7/9] treewide: introduce arch_has_pmd_leaves()
[not found] ` <85674af7934cd363ceadd1d4f2b784d833a183cc.1777663129.git.luizcap@redhat.com>
@ 2026-05-06 18:22 ` Luiz Capitulino
2026-05-06 18:30 ` Luiz Capitulino
0 siblings, 1 reply; 12+ messages in thread
From: Luiz Capitulino @ 2026-05-06 18:22 UTC (permalink / raw)
To: linux-kernel, linux-mm, david, baolin.wang, ziy, lance.yang
Cc: corbet, tsbogend, maddy, mpe, agordeev, gerald.schaefer, hca, gor,
x86, dave.hansen, djbw, vishal.l.verma, dave.jiang, akpm,
lorenzo.stoakes
On 2026-05-01 15:18, Luiz Capitulino wrote:
> Now that all the has_transparent_hugepage() callers have been converted
> to pgtable_has_pmd_leaves(), this commit does two things:
>
> 1. Rename has_transparent_hugepage() arch implementations to
> arch_has_pmd_leaves(), since that's what the helper checks for
>
> 2. Introduce the default implementation of arch_has_pmd_leaves() as
> IS_ENABLED(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE). This means that if
> the arch doesn't implement arch_has_pmd_leaves() we default to checking
> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE as a way to determine if
> PMD-sized pages are supported
>
> Note that arch_has_pmd_leaves() is supposed to be called only by
> init_arch_has_pmd_leaves(). The remaining exception is hugepage_init()
> which will be converted in a future commit.
>
> Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
> ---
> arch/mips/include/asm/pgtable.h | 4 ++--
> arch/mips/mm/tlb-r4k.c | 4 ++--
> arch/powerpc/include/asm/book3s/64/hash-4k.h | 2 +-
> arch/powerpc/include/asm/book3s/64/hash-64k.h | 2 +-
> arch/powerpc/include/asm/book3s/64/pgtable.h | 10 +++++-----
> arch/powerpc/include/asm/book3s/64/radix.h | 2 +-
> arch/powerpc/mm/book3s64/hash_pgtable.c | 4 ++--
> arch/s390/include/asm/pgtable.h | 4 ++--
> arch/x86/include/asm/pgtable.h | 4 ++--
> include/linux/pgtable.h | 4 ++--
> mm/huge_memory.c | 2 +-
> mm/memory.c | 2 +-
> 12 files changed, 22 insertions(+), 22 deletions(-)
>
> diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
> index fa7b935f947c..a97b788315e2 100644
> --- a/arch/mips/include/asm/pgtable.h
> +++ b/arch/mips/include/asm/pgtable.h
> @@ -615,8 +615,8 @@ unsigned long io_remap_pfn_range_pfn(unsigned long pfn, unsigned long size);
> /* We don't have hardware dirty/accessed bits, generic_pmdp_establish is fine.*/
> #define pmdp_establish generic_pmdp_establish
>
> -#define has_transparent_hugepage has_transparent_hugepage
> -extern int has_transparent_hugepage(void);
> +#define arch_has_pmd_leaves arch_has_pmd_leaves
> +extern int arch_has_pmd_leaves(void);
>
> static inline int pmd_trans_huge(pmd_t pmd)
> {
> diff --git a/arch/mips/mm/tlb-r4k.c b/arch/mips/mm/tlb-r4k.c
> index 24fe85fa169d..c423b5784337 100644
> --- a/arch/mips/mm/tlb-r4k.c
> +++ b/arch/mips/mm/tlb-r4k.c
> @@ -434,7 +434,7 @@ void add_wired_entry(unsigned long entrylo0, unsigned long entrylo1,
>
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>
> -int has_transparent_hugepage(void)
> +int arch_has_pmd_leaves(void)
> {
> static unsigned int mask = -1;
>
> @@ -450,7 +450,7 @@ int has_transparent_hugepage(void)
> }
> return mask == PM_HUGE_MASK;
> }
> -EXPORT_SYMBOL(has_transparent_hugepage);
> +EXPORT_SYMBOL(arch_has_pmd_leaves);
>
> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> index 8e5bd9902bed..6744c2287199 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> @@ -165,7 +165,7 @@ extern void hash__pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
> extern pgtable_t hash__pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
> extern pmd_t hash__pmdp_huge_get_and_clear(struct mm_struct *mm,
> unsigned long addr, pmd_t *pmdp);
> -extern int hash__has_transparent_hugepage(void);
> +extern int hash__arch_has_pmd_leaves(void);
> #endif
>
> #endif /* !__ASSEMBLER__ */
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index 7deb3a66890b..9392aba5e5dc 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -278,7 +278,7 @@ extern void hash__pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
> extern pgtable_t hash__pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
> extern pmd_t hash__pmdp_huge_get_and_clear(struct mm_struct *mm,
> unsigned long addr, pmd_t *pmdp);
> -extern int hash__has_transparent_hugepage(void);
> +extern int hash__arch_has_pmd_leaves(void);
> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>
> #endif /* __ASSEMBLER__ */
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index e67e64ac6e8c..b6629c041e75 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -1121,14 +1121,14 @@ static inline void update_mmu_cache_pud(struct vm_area_struct *vma,
> {
> }
>
> -extern int hash__has_transparent_hugepage(void);
> -static inline int has_transparent_hugepage(void)
> +extern int hash__arch_has_pmd_leaves(void);
> +static inline int arch_has_pmd_leaves(void)
> {
> if (radix_enabled())
> - return radix__has_transparent_hugepage();
> - return hash__has_transparent_hugepage();
> + return radix__arch_has_pmd_leaves();
> + return hash__arch_has_pmd_leaves();
> }
> -#define has_transparent_hugepage has_transparent_hugepage
> +#define arch_has_pmd_leaves arch_has_pmd_leaves
>
> static inline int has_transparent_pud_hugepage(void)
> {
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
> index da954e779744..c884a119cbd9 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -298,7 +298,7 @@ extern pmd_t radix__pmdp_huge_get_and_clear(struct mm_struct *mm,
> pud_t radix__pudp_huge_get_and_clear(struct mm_struct *mm,
> unsigned long addr, pud_t *pudp);
>
> -static inline int radix__has_transparent_hugepage(void)
> +static inline int radix__arch_has_pmd_leaves(void)
> {
> /* For radix 2M at PMD level means thp */
> if (mmu_psize_defs[MMU_PAGE_2M].shift == PMD_SHIFT)
> diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
> index d9b5b751d7b7..88a4a2eab513 100644
> --- a/arch/powerpc/mm/book3s64/hash_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
> @@ -391,7 +391,7 @@ pmd_t hash__pmdp_huge_get_and_clear(struct mm_struct *mm,
> return old_pmd;
> }
>
> -int hash__has_transparent_hugepage(void)
> +int hash__arch_has_pmd_leaves(void)
> {
>
> if (!mmu_has_feature(MMU_FTR_16M_PAGE))
> @@ -420,7 +420,7 @@ int hash__has_transparent_hugepage(void)
>
> return 1;
> }
> -EXPORT_SYMBOL_GPL(hash__has_transparent_hugepage);
> +EXPORT_SYMBOL_GPL(hash__arch_has_pmd_leaves);
>
> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>
> diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
> index 2c6cee8241e0..33b165dbf3db 100644
> --- a/arch/s390/include/asm/pgtable.h
> +++ b/arch/s390/include/asm/pgtable.h
> @@ -1799,8 +1799,8 @@ static inline int pmd_trans_huge(pmd_t pmd)
> return pmd_leaf(pmd);
> }
>
> -#define has_transparent_hugepage has_transparent_hugepage
> -static inline int has_transparent_hugepage(void)
> +#define arch_has_pmd_leaves arch_has_pmd_leaves
> +static inline int arch_has_pmd_leaves(void)
> {
> return cpu_has_edat1() ? 1 : 0;
> }
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 2187e9cfcefa..2edd6c9d789c 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -306,8 +306,8 @@ static inline int pud_trans_huge(pud_t pud)
> }
> #endif
>
> -#define has_transparent_hugepage has_transparent_hugepage
> -static inline int has_transparent_hugepage(void)
> +#define arch_has_pmd_leaves arch_has_pmd_leaves
> +static inline int arch_has_pmd_leaves(void)
> {
> return boot_cpu_has(X86_FEATURE_PSE);
> }
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index b365be3516bf..3d7eeb50c183 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -2273,8 +2273,8 @@ static inline void __init init_arch_has_pmd_leaves(void) { }
> #endif
> #endif
>
> -#ifndef has_transparent_hugepage
> -#define has_transparent_hugepage() IS_BUILTIN(CONFIG_TRANSPARENT_HUGEPAGE)
> +#ifndef arch_has_pmd_leaves
> +#define arch_has_pmd_leaves() IS_ENABLED(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE)
> #endif
>
> #ifndef has_transparent_pud_hugepage
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 970e077019b7..4da10e94bbb6 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -969,7 +969,7 @@ static int __init hugepage_init(void)
> int err;
> struct kobject *hugepage_kobj;
>
> - if (!has_transparent_hugepage()) {
> + if (!arch_has_pmd_leaves()) {
> transparent_hugepage_flags = 1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED;
> return -EINVAL;
> }
> diff --git a/mm/memory.c b/mm/memory.c
> index 90b2d9e84320..c62fce83b8d0 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -169,7 +169,7 @@ EXPORT_SYMBOL(__arch_has_pmd_leaves_key);
>
> void __init init_arch_has_pmd_leaves(void)
> {
> - if (!has_transparent_hugepage())
> + if (!arch_has_pmd_leaves())
> static_branch_disable(&__arch_has_pmd_leaves_key);
> }
>
* Re: (sashiko review) Re: [PATCH v4 7/9] treewide: introduce arch_has_pmd_leaves()
2026-05-06 18:22 ` (sashiko review) Re: [PATCH v4 7/9] treewide: introduce arch_has_pmd_leaves() Luiz Capitulino
@ 2026-05-06 18:30 ` Luiz Capitulino
0 siblings, 0 replies; 12+ messages in thread
From: Luiz Capitulino @ 2026-05-06 18:30 UTC (permalink / raw)
To: linux-kernel, linux-mm, david, baolin.wang, ziy, lance.yang
Cc: corbet, tsbogend, maddy, mpe, agordeev, gerald.schaefer, hca, gor,
x86, dave.hansen, djbw, vishal.l.verma, dave.jiang, akpm,
lorenzo.stoakes
On 2026-05-06 14:22, Luiz Capitulino wrote:
> On 2026-05-01 15:18, Luiz Capitulino wrote:
>> Now that all the has_transparent_hugepage() callers have been converted
>> to pgtable_has_pmd_leaves(), this commit does two things:
>>
>> 1. Rename has_transparent_hugepage() arch implementations to
>> arch_has_pmd_leaves(), since that's what the helper checks for
>>
>> 2. Introduce the default implementation of arch_has_pmd_leaves() as
>> IS_ENABLED(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE). This means that if
>> the arch doesn't implement arch_has_pmd_leaves() we default to checking
>> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE as a way to determine if
>> PMD-sized pages are supported
>>
>> Note that arch_has_pmd_leaves() is supposed to be called only by
>> init_arch_has_pmd_leaves(). The remaining exception is hugepage_init()
>> which will be converted in a future commit.
>>
>> Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
>> ---
>> arch/mips/include/asm/pgtable.h | 4 ++--
>> arch/mips/mm/tlb-r4k.c | 4 ++--
>> arch/powerpc/include/asm/book3s/64/hash-4k.h | 2 +-
>> arch/powerpc/include/asm/book3s/64/hash-64k.h | 2 +-
>> arch/powerpc/include/asm/book3s/64/pgtable.h | 10 +++++-----
>> arch/powerpc/include/asm/book3s/64/radix.h | 2 +-
>> arch/powerpc/mm/book3s64/hash_pgtable.c | 4 ++--
>> arch/s390/include/asm/pgtable.h | 4 ++--
>> arch/x86/include/asm/pgtable.h | 4 ++--
>> include/linux/pgtable.h | 4 ++--
>> mm/huge_memory.c | 2 +-
>> mm/memory.c | 2 +-
>> 12 files changed, 22 insertions(+), 22 deletions(-)
>>
>> diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
>> index fa7b935f947c..a97b788315e2 100644
>> --- a/arch/mips/include/asm/pgtable.h
>> +++ b/arch/mips/include/asm/pgtable.h
>> @@ -615,8 +615,8 @@ unsigned long io_remap_pfn_range_pfn(unsigned long pfn, unsigned long size);
>> /* We don't have hardware dirty/accessed bits, generic_pmdp_establish is fine.*/
>> #define pmdp_establish generic_pmdp_establish
>> -#define has_transparent_hugepage has_transparent_hugepage
>> -extern int has_transparent_hugepage(void);
>> +#define arch_has_pmd_leaves arch_has_pmd_leaves
>> +extern int arch_has_pmd_leaves(void);
>> static inline int pmd_trans_huge(pmd_t pmd)
>> {
>> diff --git a/arch/mips/mm/tlb-r4k.c b/arch/mips/mm/tlb-r4k.c
>> index 24fe85fa169d..c423b5784337 100644
>> --- a/arch/mips/mm/tlb-r4k.c
>> +++ b/arch/mips/mm/tlb-r4k.c
>> @@ -434,7 +434,7 @@ void add_wired_entry(unsigned long entrylo0, unsigned long entrylo1,
>> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> -int has_transparent_hugepage(void)
>> +int arch_has_pmd_leaves(void)
>> {
>> static unsigned int mask = -1;
>> @@ -450,7 +450,7 @@ int has_transparent_hugepage(void)
>> }
>> return mask == PM_HUGE_MASK;
>> }
>> -EXPORT_SYMBOL(has_transparent_hugepage);
>> +EXPORT_SYMBOL(arch_has_pmd_leaves);
>> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>> diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
>> index 8e5bd9902bed..6744c2287199 100644
>> --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
>> +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
>> @@ -165,7 +165,7 @@ extern void hash__pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
>> extern pgtable_t hash__pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
>> extern pmd_t hash__pmdp_huge_get_and_clear(struct mm_struct *mm,
>> unsigned long addr, pmd_t *pmdp);
>> -extern int hash__has_transparent_hugepage(void);
>> +extern int hash__arch_has_pmd_leaves(void);
>> #endif
>> #endif /* !__ASSEMBLER__ */
>> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> index 7deb3a66890b..9392aba5e5dc 100644
>> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> @@ -278,7 +278,7 @@ extern void hash__pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
>> extern pgtable_t hash__pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
>> extern pmd_t hash__pmdp_huge_get_and_clear(struct mm_struct *mm,
>> unsigned long addr, pmd_t *pmdp);
>> -extern int hash__has_transparent_hugepage(void);
>> +extern int hash__arch_has_pmd_leaves(void);
>> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>> #endif /* __ASSEMBLER__ */
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> index e67e64ac6e8c..b6629c041e75 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -1121,14 +1121,14 @@ static inline void update_mmu_cache_pud(struct vm_area_struct *vma,
>> {
>> }
>> -extern int hash__has_transparent_hugepage(void);
>> -static inline int has_transparent_hugepage(void)
>> +extern int hash__arch_has_pmd_leaves(void);
>> +static inline int arch_has_pmd_leaves(void)
>> {
>> if (radix_enabled())
>> - return radix__has_transparent_hugepage();
>> - return hash__has_transparent_hugepage();
>> + return radix__arch_has_pmd_leaves();
>> + return hash__arch_has_pmd_leaves();
>> }
>> -#define has_transparent_hugepage has_transparent_hugepage
>> +#define arch_has_pmd_leaves arch_has_pmd_leaves
>> static inline int has_transparent_pud_hugepage(void)
>> {
>> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
>> index da954e779744..c884a119cbd9 100644
>> --- a/arch/powerpc/include/asm/book3s/64/radix.h
>> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
>> @@ -298,7 +298,7 @@ extern pmd_t radix__pmdp_huge_get_and_clear(struct mm_struct *mm,
>> pud_t radix__pudp_huge_get_and_clear(struct mm_struct *mm,
>> unsigned long addr, pud_t *pudp);
>> -static inline int radix__has_transparent_hugepage(void)
>> +static inline int radix__arch_has_pmd_leaves(void)
>> {
>> /* For radix 2M at PMD level means thp */
>> if (mmu_psize_defs[MMU_PAGE_2M].shift == PMD_SHIFT)
>> diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
>> index d9b5b751d7b7..88a4a2eab513 100644
>> --- a/arch/powerpc/mm/book3s64/hash_pgtable.c
>> +++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
>> @@ -391,7 +391,7 @@ pmd_t hash__pmdp_huge_get_and_clear(struct mm_struct *mm,
>> return old_pmd;
>> }
>> -int hash__has_transparent_hugepage(void)
>> +int hash__arch_has_pmd_leaves(void)
>> {
>> if (!mmu_has_feature(MMU_FTR_16M_PAGE))
>> @@ -420,7 +420,7 @@ int hash__has_transparent_hugepage(void)
>> return 1;
>> }
>> -EXPORT_SYMBOL_GPL(hash__has_transparent_hugepage);
>> +EXPORT_SYMBOL_GPL(hash__arch_has_pmd_leaves);
>> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>> diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
>> index 2c6cee8241e0..33b165dbf3db 100644
>> --- a/arch/s390/include/asm/pgtable.h
>> +++ b/arch/s390/include/asm/pgtable.h
>> @@ -1799,8 +1799,8 @@ static inline int pmd_trans_huge(pmd_t pmd)
>> return pmd_leaf(pmd);
>> }
>> -#define has_transparent_hugepage has_transparent_hugepage
>> -static inline int has_transparent_hugepage(void)
>> +#define arch_has_pmd_leaves arch_has_pmd_leaves
>> +static inline int arch_has_pmd_leaves(void)
>> {
>> return cpu_has_edat1() ? 1 : 0;
>> }
>> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
>> index 2187e9cfcefa..2edd6c9d789c 100644
>> --- a/arch/x86/include/asm/pgtable.h
>> +++ b/arch/x86/include/asm/pgtable.h
>> @@ -306,8 +306,8 @@ static inline int pud_trans_huge(pud_t pud)
>> }
>> #endif
>> -#define has_transparent_hugepage has_transparent_hugepage
>> -static inline int has_transparent_hugepage(void)
>> +#define arch_has_pmd_leaves arch_has_pmd_leaves
>> +static inline int arch_has_pmd_leaves(void)
>> {
>> return boot_cpu_has(X86_FEATURE_PSE);
>> }
>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>> index b365be3516bf..3d7eeb50c183 100644
>> --- a/include/linux/pgtable.h
>> +++ b/include/linux/pgtable.h
>> @@ -2273,8 +2273,8 @@ static inline void __init init_arch_has_pmd_leaves(void) { }
>> #endif
>> #endif
>> -#ifndef has_transparent_hugepage
>> -#define has_transparent_hugepage() IS_BUILTIN(CONFIG_TRANSPARENT_HUGEPAGE)
>> +#ifndef arch_has_pmd_leaves
>> +#define arch_has_pmd_leaves() IS_ENABLED(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE)
>> #endif
[Forgot to include the review, doing it now]
"""
When CONFIG_TRANSPARENT_HUGEPAGE is disabled, do we bypass crucial runtime
hardware feature checks on multiple architectures?
Because the architecture-specific declarations are often placed inside
CONFIG_TRANSPARENT_HUGEPAGE blocks (as seen above, and similarly in
arch/x86/include/asm/pgtable.h and arch/s390/include/asm/pgtable.h), they
will not be exposed to the generic headers when THP is disabled.
The preprocessor will then fall back to the default implementation here. Since
architectures like x86, s390, and powerpc select
HAVE_ARCH_TRANSPARENT_HUGEPAGE in their Kconfig independent of the THP
setting, this fallback evaluates to true.
Will this cause init_arch_has_pmd_leaves() to keep the static key enabled,
making pgtable_has_pmd_leaves() unconditionally return true at runtime even
when THP is disabled?
If this happens on older hardware without these features, could subsystems
like DAX that rely on pgtable_has_pmd_leaves() attempt to map PMD-sized
leaves and cause page table corruption or hardware exceptions?
"""
Although a bit of a corner case, this is a real issue.
On x86, the arch implementation of arch_has_pmd_leaves() lives inside a
CONFIG_TRANSPARENT_HUGEPAGE block, so it isn't built when
CONFIG_TRANSPARENT_HUGEPAGE=n. In that case arch_has_pmd_leaves() falls
back to the generic default, which evaluates to true without performing
the hardware check. I'll check the other archs and get this fixed.
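To make the failure mode concrete, here is a minimal user-space sketch of
the override-or-fallback pattern. It is not kernel code: SIM_THP,
SIM_HAVE_ARCH_THP and sim_cpu_has_pse() are made-up stand-ins for
CONFIG_TRANSPARENT_HUGEPAGE, CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE and the
x86 PSE check. With SIM_THP left undefined, the arch helper is never
declared, so the generic fallback is used and the hardware probe is
never consulted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for Kconfig symbols (assumptions, not kernel names). */
#define SIM_HAVE_ARCH_THP 1	/* arch selects HAVE_ARCH_TRANSPARENT_HUGEPAGE */
/* SIM_THP is deliberately left undefined: models CONFIG_TRANSPARENT_HUGEPAGE=n */

/* Hypothetical hardware probe: models an old CPU without PMD leaf support. */
static inline bool sim_cpu_has_pse(void)
{
	return false;
}

#ifdef SIM_THP
/* Arch override, only visible when the THP option is enabled. */
#define arch_has_pmd_leaves arch_has_pmd_leaves
static inline int arch_has_pmd_leaves(void)
{
	return sim_cpu_has_pse();
}
#endif

#ifndef arch_has_pmd_leaves
/* Generic fallback: no runtime hardware check at all. */
#define arch_has_pmd_leaves() SIM_HAVE_ARCH_THP
#endif

/* Models the value init_arch_has_pmd_leaves() would cache at boot. */
static inline bool sim_cached_pmd_leaves(void)
{
	return arch_has_pmd_leaves() != 0;
}

/* The probe says "no", yet the cached answer is "yes". */
static inline void sim_self_check(void)
{
	assert(!sim_cpu_has_pse());
	assert(sim_cached_pmd_leaves());
}
```

Defining SIM_THP before the #ifdef makes the override visible again, and
the cached value then follows the probe.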
>> #ifndef has_transparent_pud_hugepage
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 970e077019b7..4da10e94bbb6 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -969,7 +969,7 @@ static int __init hugepage_init(void)
>> int err;
>> struct kobject *hugepage_kobj;
>> - if (!has_transparent_hugepage()) {
>> + if (!arch_has_pmd_leaves()) {
>> transparent_hugepage_flags = 1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED;
>> return -EINVAL;
>> }
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 90b2d9e84320..c62fce83b8d0 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -169,7 +169,7 @@ EXPORT_SYMBOL(__arch_has_pmd_leaves_key);
>> void __init init_arch_has_pmd_leaves(void)
>> {
>> - if (!has_transparent_hugepage())
>> + if (!arch_has_pmd_leaves())
>> static_branch_disable(&__arch_has_pmd_leaves_key);
>> }
>
* (sashiko review) Re: [PATCH v4 9/9] mm: thp: always enable mTHP support
[not found] ` <f67da00a825da9097b5faf2f390ad344450b88be.1777663129.git.luizcap@redhat.com>
2026-05-06 5:46 ` [PATCH v4 9/9] " Baolin Wang
@ 2026-05-06 18:34 ` Luiz Capitulino
2026-05-13 15:58 ` David Hildenbrand (Arm)
2 siblings, 0 replies; 12+ messages in thread
From: Luiz Capitulino @ 2026-05-06 18:34 UTC (permalink / raw)
To: linux-kernel, linux-mm, david, baolin.wang, ziy, lance.yang
Cc: corbet, tsbogend, maddy, mpe, agordeev, gerald.schaefer, hca, gor,
x86, dave.hansen, djbw, vishal.l.verma, dave.jiang, akpm,
lorenzo.stoakes
On 2026-05-01 15:18, Luiz Capitulino wrote:
> If PMD-sized pages are not supported on an architecture (ie. the
> arch implements arch_has_pmd_leaves() and it returns false) then the
> current code disables all THP, including mTHP.
>
> This commit fixes this by allowing mTHP to be always enabled for all
> archs. When PMD-sized pages are not supported, its sysfs entry won't be
> created and their mapping will be disallowed at page-fault time.
>
> Similarly, this commit implements the following changes for shmem in
> shmem_allowable_huge_orders():
>
> - Drop the pgtable_has_pmd_leaves() check so that mTHP sizes are
> considered
> - Filter out PMD and PUD orders from allowable orders when
> PMD-sized pages are not supported by the CPU
>
> Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
> ---
> mm/huge_memory.c | 23 ++++++++++++++++++-----
> mm/shmem.c | 14 +++++++++-----
> 2 files changed, 27 insertions(+), 10 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 32254febe097..c1765c8e3dc6 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -126,6 +126,14 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
> else
> supported_orders = THP_ORDERS_ALL_FILE_DEFAULT;
>
> + if (!pgtable_has_pmd_leaves()) {
> + /*
> + * The CPU doesn't support PMD-sized pages, assume it
> + * doesn't support PUD-sized pages either.
> + */
> + supported_orders &= ~(BIT(PMD_ORDER) | BIT(PUD_ORDER));
> + }
> +
> orders &= supported_orders;
> if (!orders)
> return 0;
> @@ -133,7 +141,7 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
> if (!vma->vm_mm) /* vdso */
> return 0;
>
> - if (!pgtable_has_pmd_leaves() || vma_thp_disabled(vma, vm_flags, forced_collapse))
> + if (vma_thp_disabled(vma, vm_flags, forced_collapse))
> return 0;
>
> /* khugepaged doesn't collapse DAX vma, but page fault is fine. */
> @@ -848,7 +856,7 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj)
> * disable all other sizes. powerpc's PMD_ORDER isn't a compile-time
> * constant so we have to do this here.
> */
> - if (!anon_orders_configured)
> + if (!anon_orders_configured && pgtable_has_pmd_leaves())
> huge_anon_orders_inherit = BIT(PMD_ORDER);
>
> *hugepage_kobj = kobject_create_and_add("transparent_hugepage", mm_kobj);
> @@ -870,6 +878,14 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj)
> }
>
> orders = THP_ORDERS_ALL_ANON | THP_ORDERS_ALL_FILE_DEFAULT;
> + if (!pgtable_has_pmd_leaves()) {
> + /*
> + * The CPU doesn't support PMD-sized pages, assume it
> + * doesn't support PUD-sized pages either.
> + */
> + orders &= ~(BIT(PMD_ORDER) | BIT(PUD_ORDER));
> + }
> +
> order = highest_order(orders);
> while (orders) {
> thpsize = thpsize_create(order, *hugepage_kobj);
> @@ -969,9 +985,6 @@ static int __init hugepage_init(void)
> int err;
> struct kobject *hugepage_kobj;
>
> - if (!pgtable_has_pmd_leaves())
> - return -EINVAL;
> -
> /*
> * hugepages can't be allocated by the buddy allocator
> */
> diff --git a/mm/shmem.c b/mm/shmem.c
> index a48f034830cd..23893c2bc2dd 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1840,16 +1840,19 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
> unsigned long mask = READ_ONCE(huge_shmem_orders_always);
> unsigned long within_size_orders = READ_ONCE(huge_shmem_orders_within_size);
> vm_flags_t vm_flags = vma ? vma->vm_flags : 0;
> - unsigned int global_orders;
> + unsigned int global_orders, filter_orders = 0;
>
> - if (!pgtable_has_pmd_leaves() || (vma && vma_thp_disabled(vma, vm_flags, shmem_huge_force)))
> + if (vma && vma_thp_disabled(vma, vm_flags, shmem_huge_force))
> return 0;
>
> + if (!pgtable_has_pmd_leaves())
> + filter_orders = BIT(PMD_ORDER) | BIT(PUD_ORDER);
> +
> global_orders = shmem_huge_global_enabled(inode, index, write_end,
> shmem_huge_force, vma, vm_flags);
> /* Tmpfs huge pages allocation */
> if (!vma || !vma_is_anon_shmem(vma))
> - return global_orders;
> + return global_orders & ~filter_orders;
"""
Could this lead to unintended truncation of the order masks?
Because filter_orders is declared as an unsigned int, assigning the result
of BIT() causes an implicit downcast from unsigned long.
"""
If I'm checking this right, the highest bit number we can reach is 22,
on arm64, so we're safe. That said, filter_orders and global_orders
should both use the same type as the function's return value
(unsigned long). I'll change it.
NOTE: I'm skipping the other comment from Sashiko, which is about the
same issue.
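For reference, a small user-space sketch of the narrowing in question.
BIT() is redefined locally to mirror the kernel macro, 22 is the arm64
figure quoted above, and 40 is a purely hypothetical order picked to show
where an unsigned int would start losing bits on a target with 64-bit
unsigned long.

```c
#include <assert.h>

/* Local copy of the kernel-style macro; the result has type unsigned long. */
#define BIT(nr) (1UL << (nr))

/* Narrow a mask the way an 'unsigned int filter_orders' assignment would. */
static inline unsigned int narrow_mask(unsigned long mask)
{
	return (unsigned int)mask;
}

/* Returns nonzero if the bit for 'order' survives the narrowing intact. */
static inline int order_survives_narrowing(unsigned int order)
{
	return narrow_mask(BIT(order)) == BIT(order);
}

static inline void sim_self_check(void)
{
	/* Order 22 fits comfortably in 32 bits. */
	assert(order_survives_narrowing(22));

	/* Hypothetical order 40: only meaningful with 64-bit unsigned long. */
	if (sizeof(unsigned long) >= 8)
		assert(!order_survives_narrowing(40));
}
```

Declaring the mask as unsigned long avoids the question entirely, which
is why matching the return type is the simpler fix.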
>
> /*
> * Following the 'deny' semantics of the top level, force the huge
> @@ -1863,7 +1866,7 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
> * means non-PMD sized THP can not override 'huge' mount option now.
> */
> if (shmem_huge == SHMEM_HUGE_FORCE)
> - return READ_ONCE(huge_shmem_orders_inherit);
> + return READ_ONCE(huge_shmem_orders_inherit) & ~filter_orders;
>
> /* Allow mTHP that will be fully within i_size. */
> mask |= shmem_get_orders_within_size(inode, within_size_orders, index, 0);
> @@ -1874,6 +1877,7 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
> if (global_orders > 0)
> mask |= READ_ONCE(huge_shmem_orders_inherit);
>
> + mask &= ~filter_orders;
> return THP_ORDERS_ALL_FILE_DEFAULT & mask;
> }
>
> @@ -5457,7 +5461,7 @@ void __init shmem_init(void)
> * Default to setting PMD-sized THP to inherit the global setting and
> * disable all other multi-size THPs.
> */
> - if (!shmem_orders_configured)
> + if (!shmem_orders_configured && pgtable_has_pmd_leaves())
> huge_shmem_orders_inherit = BIT(HPAGE_PMD_ORDER);
> #endif
> return;
* Re: [PATCH v4 2/9] mm: introduce pgtable_has_pmd_leaves()
[not found] ` <2a0bae00cdd2b6ef6b962610b523ebfc97806ba7.1777663129.git.luizcap@redhat.com>
2026-05-06 17:50 ` (sashiko review) Re: [PATCH v4 2/9] mm: introduce pgtable_has_pmd_leaves() Luiz Capitulino
@ 2026-05-13 15:30 ` David Hildenbrand (Arm)
2026-05-13 15:36 ` David Hildenbrand (Arm)
2 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand (Arm) @ 2026-05-13 15:30 UTC (permalink / raw)
To: Luiz Capitulino, linux-kernel, linux-mm, baolin.wang, ziy,
lance.yang
Cc: corbet, tsbogend, maddy, mpe, agordeev, gerald.schaefer, hca, gor,
x86, dave.hansen, djbw, vishal.l.verma, dave.jiang, akpm,
lorenzo.stoakes
On 5/1/26 21:18, Luiz Capitulino wrote:
> Currently, we have two helpers that check for PMD-sized pages but have
> different names and slightly different semantics:
>
> - has_transparent_hugepage(): the name suggests it checks if THP is
> enabled, but when CONFIG_TRANSPARENT_HUGEPAGE=y and the architecture
> implements this helper, it actually checks if the CPU supports
> PMD-sized pages
>
> - thp_disabled_by_hw(): the name suggests it checks if THP is disabled
> by the hardware, but it just returns a cached value acquired with
> has_transparent_hugepage(). This helper is used in fast paths
>
> This commit introduces a new helper called pgtable_has_pmd_leaves()
> which is intended to replace both has_transparent_hugepage() and
> thp_disabled_by_hw(). pgtable_has_pmd_leaves() has very clear semantics:
> it returns true if the CPU supports PMD-sized pages and false otherwise.
> It always returns a cached value, so it can be used in fast paths.
>
> The new helper requires an initialization step which is performed by
> init_arch_has_pmd_leaves(). We call init_arch_has_pmd_leaves() early
> during boot in start_kernel() right after parse_early_param() but before
> parse_args(). This allows early_param() handlers to change CPU flags if
> needed (eg. parse_memopt() in x86-32) while also allowing users to use
> the API from __setup() handlers.
>
> The next commits will convert users of both has_transparent_hugepage()
> and thp_disabled_by_hw() to pgtable_has_pmd_leaves().
>
> Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
> ---
> include/linux/pgtable.h | 15 +++++++++++++++
> init/main.c | 1 +
> mm/memory.c | 9 +++++++++
> 3 files changed, 25 insertions(+)
>
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index cdd68ed3ae1a..b365be3516bf 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -2243,6 +2243,21 @@ static inline const char *pgtable_level_to_str(enum pgtable_level level)
> }
> }
>
> +#ifdef CONFIG_MMU
> +DECLARE_STATIC_KEY_TRUE(__arch_has_pmd_leaves_key);
> +static inline bool pgtable_has_pmd_leaves(void)
> +{
> + return static_branch_likely(&__arch_has_pmd_leaves_key);
> +}
> +void __init init_arch_has_pmd_leaves(void);
> +#else
> +static inline bool pgtable_has_pmd_leaves(void)
> +{
> + return false;
> +}
> +static inline void __init init_arch_has_pmd_leaves(void) { }
> +#endif
> +
> #endif /* !__ASSEMBLY__ */
>
> #if !defined(MAX_POSSIBLE_PHYSMEM_BITS) && !defined(CONFIG_64BIT)
> diff --git a/init/main.c b/init/main.c
> index 96f93bb06c49..eea7c5bdddf7 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -1053,6 +1053,7 @@ void start_kernel(void)
> print_kernel_cmdline(saved_command_line);
> /* parameters may set static keys */
> parse_early_param();
> + init_arch_has_pmd_leaves();
Can't we do this a bit later from some mm code?
This feels like something that can just go somewhere into mm_core_init()?
There, we should probably call this something like XXX_init(), and prepare it
for detecting support for PUD leaves as well.
Maybe just
pgtable_init() ?
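One possible shape for this, as a user-space sketch only. Every name
below (sim_pgtable_init(), the sim_arch_*() probes, the two cached flags)
is hypothetical and does not exist in the series; the plain booleans stand
in for the static keys, and the probe results are hard-coded assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical arch probes; a real kernel would query CPU features. */
static bool sim_arch_has_pmd_leaves(void)
{
	return true;
}

static bool sim_arch_has_pud_leaves(void)
{
	return false;
}

/* Cached results, default true like DECLARE_STATIC_KEY_TRUE in the patch. */
static bool sim_pmd_leaves = true;
static bool sim_pud_leaves = true;

/* Hypothetical single init hook covering both page table levels. */
static inline void sim_pgtable_init(void)
{
	if (!sim_arch_has_pmd_leaves())
		sim_pmd_leaves = false;
	if (!sim_arch_has_pud_leaves())
		sim_pud_leaves = false;
}

/* Fast-path readers: they only return the cached value. */
static inline bool sim_pgtable_has_pmd_leaves(void)
{
	return sim_pmd_leaves;
}

static inline bool sim_pgtable_has_pud_leaves(void)
{
	return sim_pud_leaves;
}

static inline void sim_self_check(void)
{
	sim_pgtable_init();
	assert(sim_pgtable_has_pmd_leaves());
	assert(!sim_pgtable_has_pud_leaves());
}
```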
--
Cheers,
David
* Re: [PATCH v4 2/9] mm: introduce pgtable_has_pmd_leaves()
[not found] ` <2a0bae00cdd2b6ef6b962610b523ebfc97806ba7.1777663129.git.luizcap@redhat.com>
2026-05-06 17:50 ` (sashiko review) Re: [PATCH v4 2/9] mm: introduce pgtable_has_pmd_leaves() Luiz Capitulino
2026-05-13 15:30 ` David Hildenbrand (Arm)
@ 2026-05-13 15:36 ` David Hildenbrand (Arm)
2 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand (Arm) @ 2026-05-13 15:36 UTC (permalink / raw)
To: Luiz Capitulino, linux-kernel, linux-mm, baolin.wang, ziy,
lance.yang
Cc: corbet, tsbogend, maddy, mpe, agordeev, gerald.schaefer, hca, gor,
x86, dave.hansen, djbw, vishal.l.verma, dave.jiang, akpm,
lorenzo.stoakes
On 5/1/26 21:18, Luiz Capitulino wrote:
> Currently, we have two helpers that check for PMD-sized pages but have
> different names and slightly different semantics:
>
> - has_transparent_hugepage(): the name suggests it checks if THP is
> enabled, but when CONFIG_TRANSPARENT_HUGEPAGE=y and the architecture
> implements this helper, it actually checks if the CPU supports
> PMD-sized pages
>
> - thp_disabled_by_hw(): the name suggests it checks if THP is disabled
> by the hardware, but it just returns a cached value acquired with
> has_transparent_hugepage(). This helper is used in fast paths
>
> This commit introduces a new helper called pgtable_has_pmd_leaves()
> which is intended to replace both has_transparent_hugepage() and
> thp_disabled_by_hw(). pgtable_has_pmd_leaves() has very clear semantics:
> it returns true if the CPU supports PMD-sized pages and false otherwise.
> It always returns a cached value, so it can be used in fast paths.
Oh, one more thing: what will be the semantics regarding
CONFIG_TRANSPARENT_HUGEPAGE?
I assume it will only return true if CONFIG_TRANSPARENT_HUGEPAGE is enabled,
correct?
That is, for example, relevant for patch #2.
We could later change these semantics, but for now we should be very clear about
what it means.
--
Cheers,
David
* Re: [PATCH v4 3/9] drivers: dax: use pgtable_has_pmd_leaves()
[not found] ` <cb854ba26a4b8036706c6da5f6844f10f386087f.1777663129.git.luizcap@redhat.com>
@ 2026-05-13 15:40 ` David Hildenbrand (Arm)
0 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand (Arm) @ 2026-05-13 15:40 UTC (permalink / raw)
To: Luiz Capitulino, linux-kernel, linux-mm, baolin.wang, ziy,
lance.yang
Cc: corbet, tsbogend, maddy, mpe, agordeev, gerald.schaefer, hca, gor,
x86, dave.hansen, djbw, vishal.l.verma, dave.jiang, akpm,
lorenzo.stoakes
On 5/1/26 21:18, Luiz Capitulino wrote:
> dax_align_valid() uses has_transparent_hugepage() to check if PMD-sized
> pages are supported, use pgtable_has_pmd_leaves() instead.
>
> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
> Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
> ---
> drivers/dax/dax-private.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
> index 81e4af49e39c..35744ff6592a 100644
> --- a/drivers/dax/dax-private.h
> +++ b/drivers/dax/dax-private.h
> @@ -123,7 +123,7 @@ static inline bool dax_align_valid(unsigned long align)
> {
> if (align == PUD_SIZE && IS_ENABLED(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD))
> return true;
> - if (align == PMD_SIZE && has_transparent_hugepage())
> + if (align == PMD_SIZE && pgtable_has_pmd_leaves())
> return true;
I think this code really depends on the implied CONFIG_TRANSPARENT_HUGEPAGE check.
For now, should we just keep saying that pgtable_has_pmd_leaves() implies
CONFIG_TRANSPARENT_HUGEPAGE support?
That would also, e.g., make patch #4 easier.
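To illustrate the semantics being proposed, a user-space model of the
check. struct sim_config and the SIM_* sizes are assumptions made up for
this sketch (4 KiB pages, 2 MiB PMD, 1 GiB PUD), and the page-size branch
is assumed from the rest of dax_align_valid(), which the hunk above does
not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed sizes for illustration only. */
#define SIM_PAGE_SIZE (1UL << 12)
#define SIM_PMD_SIZE  (1UL << 21)
#define SIM_PUD_SIZE  (1UL << 30)

/* Hypothetical bundle of build-time and hardware facts. */
struct sim_config {
	bool thp_enabled;		/* models CONFIG_TRANSPARENT_HUGEPAGE */
	bool cpu_has_pmd_leaves;	/* models the hardware capability */
	bool arch_has_pud_thp;		/* models ..._TRANSPARENT_HUGEPAGE_PUD */
};

/* Proposed semantics: true only if THP is built in AND the CPU supports it. */
static inline bool sim_pgtable_has_pmd_leaves(const struct sim_config *c)
{
	return c->thp_enabled && c->cpu_has_pmd_leaves;
}

static inline bool sim_dax_align_valid(const struct sim_config *c,
				       unsigned long align)
{
	if (align == SIM_PUD_SIZE && c->arch_has_pud_thp)
		return true;
	if (align == SIM_PMD_SIZE && sim_pgtable_has_pmd_leaves(c))
		return true;
	if (align == SIM_PAGE_SIZE)
		return true;
	return false;
}

static inline void sim_self_check(void)
{
	struct sim_config thp_off = { false, true, false };
	struct sim_config thp_on = { true, true, false };

	/* Capable CPU, THP compiled out: PMD alignment must be rejected. */
	assert(!sim_dax_align_valid(&thp_off, SIM_PMD_SIZE));
	assert(sim_dax_align_valid(&thp_off, SIM_PAGE_SIZE));
	assert(sim_dax_align_valid(&thp_on, SIM_PMD_SIZE));
}
```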
--
Cheers,
David
* Re: [PATCH v4 8/9] mm: replace thp_disabled_by_hw() with pgtable_has_pmd_leaves()
[not found] ` <d4fbc41a51f8fe9ab55cc6792d60c6c2c504b4d4.1777663129.git.luizcap@redhat.com>
@ 2026-05-13 15:50 ` David Hildenbrand (Arm)
0 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand (Arm) @ 2026-05-13 15:50 UTC (permalink / raw)
To: Luiz Capitulino, linux-kernel, linux-mm, baolin.wang, ziy,
lance.yang
Cc: corbet, tsbogend, maddy, mpe, agordeev, gerald.schaefer, hca, gor,
x86, dave.hansen, djbw, vishal.l.verma, dave.jiang, akpm,
lorenzo.stoakes
On 5/1/26 21:18, Luiz Capitulino wrote:
> Despite its name, thp_disabled_by_hw() just checks whether the
> architecture supports PMD-sized pages. It returns true when
> TRANSPARENT_HUGEPAGE_UNSUPPORTED is set in transparent_hugepage_flags,
> this only occurs if the architecture implements arch_has_pmd_leaves()
> and that function returns false.
>
> Since pgtable_has_pmd_leaves() provides the same semantics, use it
> instead.
>
> Reviewed-by: Lance Yang <lance.yang@linux.dev>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Acked-by: Zi Yan <ziy@nvidia.com>
> Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
> ---
Makes sense, and the next patch will actually limit it only to PMD orders.
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
--
Cheers,
David
* Re: [PATCH v4 9/9] mm: thp: always enable mTHP support
[not found] ` <f67da00a825da9097b5faf2f390ad344450b88be.1777663129.git.luizcap@redhat.com>
2026-05-06 5:46 ` [PATCH v4 9/9] " Baolin Wang
2026-05-06 18:34 ` (sashiko review) " Luiz Capitulino
@ 2026-05-13 15:58 ` David Hildenbrand (Arm)
2 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand (Arm) @ 2026-05-13 15:58 UTC (permalink / raw)
To: Luiz Capitulino, linux-kernel, linux-mm, baolin.wang, ziy,
lance.yang
Cc: corbet, tsbogend, maddy, mpe, agordeev, gerald.schaefer, hca, gor,
x86, dave.hansen, djbw, vishal.l.verma, dave.jiang, akpm,
lorenzo.stoakes
On 5/1/26 21:18, Luiz Capitulino wrote:
> If PMD-sized pages are not supported on an architecture (ie. the
> arch implements arch_has_pmd_leaves() and it returns false) then the
> current code disables all THP, including mTHP.
>
> This commit fixes this by allowing mTHP to be always enabled for all
> archs. When PMD-sized pages are not supported, its sysfs entry won't be
> created and their mapping will be disallowed at page-fault time.
>
> Similarly, this commit implements the following changes for shmem in
> shmem_allowable_huge_orders():
>
> - Drop the pgtable_has_pmd_leaves() check so that mTHP sizes are
> considered
> - Filter out PMD and PUD orders from allowable orders when
> PMD-sized pages are not supported by the CPU
>
> Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
> ---
> mm/huge_memory.c | 23 ++++++++++++++++++-----
> mm/shmem.c | 14 +++++++++-----
> 2 files changed, 27 insertions(+), 10 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 32254febe097..c1765c8e3dc6 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -126,6 +126,14 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
> else
> supported_orders = THP_ORDERS_ALL_FILE_DEFAULT;
>
> + if (!pgtable_has_pmd_leaves()) {
> + /*
> + * The CPU doesn't support PMD-sized pages, assume it
> + * doesn't support PUD-sized pages either.
> + */
I'd say here "If the CPU does not support PMD leaves, assume for now that it
does not support PUD leaves and disable both folio orders."
> + supported_orders &= ~(BIT(PMD_ORDER) | BIT(PUD_ORDER));
> + }
> +
> orders &= supported_orders;
> if (!orders)
> return 0;
> @@ -133,7 +141,7 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
> if (!vma->vm_mm) /* vdso */
> return 0;
>
> - if (!pgtable_has_pmd_leaves() || vma_thp_disabled(vma, vm_flags, forced_collapse))
> + if (vma_thp_disabled(vma, vm_flags, forced_collapse))
> return 0;
>
> /* khugepaged doesn't collapse DAX vma, but page fault is fine. */
> @@ -848,7 +856,7 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj)
> * disable all other sizes. powerpc's PMD_ORDER isn't a compile-time
> * constant so we have to do this here.
> */
> - if (!anon_orders_configured)
> + if (!anon_orders_configured && pgtable_has_pmd_leaves())
> huge_anon_orders_inherit = BIT(PMD_ORDER);
>
> *hugepage_kobj = kobject_create_and_add("transparent_hugepage", mm_kobj);
> @@ -870,6 +878,14 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj)
> }
>
> orders = THP_ORDERS_ALL_ANON | THP_ORDERS_ALL_FILE_DEFAULT;
> + if (!pgtable_has_pmd_leaves()) {
> + /*
> + * The CPU doesn't support PMD-sized pages, assume it
> + * doesn't support PUD-sized pages either.
> + */
> + orders &= ~(BIT(PMD_ORDER) | BIT(PUD_ORDER));
> + }
> +
> order = highest_order(orders);
> while (orders) {
> thpsize = thpsize_create(order, *hugepage_kobj);
> @@ -969,9 +985,6 @@ static int __init hugepage_init(void)
> int err;
> struct kobject *hugepage_kobj;
>
> - if (!pgtable_has_pmd_leaves())
> - return -EINVAL;
> -
> /*
> * hugepages can't be allocated by the buddy allocator
> */
> diff --git a/mm/shmem.c b/mm/shmem.c
> index a48f034830cd..23893c2bc2dd 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1840,16 +1840,19 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
> unsigned long mask = READ_ONCE(huge_shmem_orders_always);
> unsigned long within_size_orders = READ_ONCE(huge_shmem_orders_within_size);
> vm_flags_t vm_flags = vma ? vma->vm_flags : 0;
> - unsigned int global_orders;
> + unsigned int global_orders, filter_orders = 0;
>
> - if (!pgtable_has_pmd_leaves() || (vma && vma_thp_disabled(vma, vm_flags, shmem_huge_force)))
> + if (vma && vma_thp_disabled(vma, vm_flags, shmem_huge_force))
> return 0;
>
> + if (!pgtable_has_pmd_leaves())
> + filter_orders = BIT(PMD_ORDER) | BIT(PUD_ORDER);
Would "disabled_orders" or "unavailable_orders" be more appropriate?
There is no need to disable PUD-orders, as shmem does not support PUD-orders
(unlike DAX). So you can keep it simpler here.
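Roughly what I have in mind, as a stand-alone sketch rather than a patch.
sim_shmem_filter_orders() and SIM_PMD_ORDER are hypothetical (order 9
assumes 4 KiB base pages with a 2 MiB PMD), and BIT() is a local copy of
the kernel macro. The mask is unsigned long throughout and only the PMD
order is cleared.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copy of the kernel-style macro. */
#define BIT(nr) (1UL << (nr))

/* Assumption for illustration: 4 KiB base pages, 2 MiB PMD. */
#define SIM_PMD_ORDER 9

/* Drop the PMD order from 'orders' when the CPU lacks PMD leaves. */
static inline unsigned long sim_shmem_filter_orders(unsigned long orders,
						    bool has_pmd_leaves)
{
	unsigned long disabled_orders = 0;

	if (!has_pmd_leaves)
		disabled_orders = BIT(SIM_PMD_ORDER);

	return orders & ~disabled_orders;
}

static inline void sim_self_check(void)
{
	unsigned long orders = BIT(4) | BIT(SIM_PMD_ORDER);

	/* Smaller (mTHP) orders stay available without PMD leaves. */
	assert(sim_shmem_filter_orders(orders, false) == BIT(4));
	/* Nothing is filtered when PMD leaves are supported. */
	assert(sim_shmem_filter_orders(orders, true) == orders);
}
```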
> +
> global_orders = shmem_huge_global_enabled(inode, index, write_end,
> shmem_huge_force, vma, vm_flags);
> /* Tmpfs huge pages allocation */
> if (!vma || !vma_is_anon_shmem(vma))
> - return global_orders;
> + return global_orders & ~filter_orders;
>
> /*
> * Following the 'deny' semantics of the top level, force the huge
> @@ -1863,7 +1866,7 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
> * means non-PMD sized THP can not override 'huge' mount option now.
> */
> if (shmem_huge == SHMEM_HUGE_FORCE)
> - return READ_ONCE(huge_shmem_orders_inherit);
> + return READ_ONCE(huge_shmem_orders_inherit) & ~filter_orders;
>
> /* Allow mTHP that will be fully within i_size. */
> mask |= shmem_get_orders_within_size(inode, within_size_orders, index, 0);
> @@ -1874,6 +1877,7 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
> if (global_orders > 0)
> mask |= READ_ONCE(huge_shmem_orders_inherit);
>
> + mask &= ~filter_orders;
> return THP_ORDERS_ALL_FILE_DEFAULT & mask;
> }
>
> @@ -5457,7 +5461,7 @@ void __init shmem_init(void)
> * Default to setting PMD-sized THP to inherit the global setting and
> * disable all other multi-size THPs.
> */
> - if (!shmem_orders_configured)
> + if (!shmem_orders_configured && pgtable_has_pmd_leaves())
> huge_shmem_orders_inherit = BIT(HPAGE_PMD_ORDER);
Do we really want to change that? We can just leave the defaults as is, no?
--
Cheers,
David
end of thread, other threads:[~2026-05-13 15:58 UTC | newest]
Thread overview: 12+ messages
[not found] <cover.1777663129.git.luizcap@redhat.com>
[not found] ` <20260503080236.4ea7f3fec0e5788d50113599@linux-foundation.org>
2026-05-04 19:11 ` [PATCH v4 0/9] mm: thp: always enable mTHP support Luiz Capitulino
[not found] ` <f67da00a825da9097b5faf2f390ad344450b88be.1777663129.git.luizcap@redhat.com>
2026-05-06 5:46 ` [PATCH v4 9/9] " Baolin Wang
2026-05-06 18:34 ` (sashiko review) " Luiz Capitulino
2026-05-13 15:58 ` David Hildenbrand (Arm)
[not found] ` <d2d1d0deaa5732d82bc2daea5033a1578781d641.1777663129.git.luizcap@redhat.com>
2026-05-06 18:12 ` (sashiko review) Re: [PATCH v4 6/9] mm: shmem: drop has_transparent_hugepage() usage Luiz Capitulino
[not found] ` <85674af7934cd363ceadd1d4f2b784d833a183cc.1777663129.git.luizcap@redhat.com>
2026-05-06 18:22 ` (sashiko review) Re: [PATCH v4 7/9] treewide: introduce arch_has_pmd_leaves() Luiz Capitulino
2026-05-06 18:30 ` Luiz Capitulino
[not found] ` <2a0bae00cdd2b6ef6b962610b523ebfc97806ba7.1777663129.git.luizcap@redhat.com>
2026-05-06 17:50 ` (sashiko review) Re: [PATCH v4 2/9] mm: introduce pgtable_has_pmd_leaves() Luiz Capitulino
2026-05-13 15:30 ` David Hildenbrand (Arm)
2026-05-13 15:36 ` David Hildenbrand (Arm)
[not found] ` <cb854ba26a4b8036706c6da5f6844f10f386087f.1777663129.git.luizcap@redhat.com>
2026-05-13 15:40 ` [PATCH v4 3/9] drivers: dax: use pgtable_has_pmd_leaves() David Hildenbrand (Arm)
[not found] ` <d4fbc41a51f8fe9ab55cc6792d60c6c2c504b4d4.1777663129.git.luizcap@redhat.com>
2026-05-13 15:50 ` [PATCH v4 8/9] mm: replace thp_disabled_by_hw() with pgtable_has_pmd_leaves() David Hildenbrand (Arm)