linuxppc-dev.lists.ozlabs.org archive mirror
* powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
@ 2025-10-29  5:49 Sourabh Jain
  2025-10-29  8:25 ` David Hildenbrand
  0 siblings, 1 reply; 21+ messages in thread
From: Sourabh Jain @ 2025-10-29  5:49 UTC (permalink / raw)
  To: Madhavan Srinivasan, Ritesh Harjani (IBM), linuxppc-dev
  Cc: Christophe Leroy, Donet Tom, David Hildenbrand, Andrew Morton

The kernel prints the following warning while booting:


WARNING: CPU: 0 PID: 1 at mm/hugetlb.c:4753 hugetlb_add_hstate+0xc0/0x180
Modules linked in:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.18.0-rc1-01400-ga297f72c4951 #6 NONE
Hardware name: QEMU ppce500 e5500 0x80240020 QEMU e500
NIP:  c000000001370800 LR: c000000001357740 CTR: 0000000000000005
REGS: c000000080183890 TRAP: 0700   Not tainted (6.18.0-rc1-01400-ga297f72c4951)
MSR:  0000000080029002 <CE,EE,ME>  CR: 48000242  XER: 20000000
IRQMASK: 0
GPR00: c000000001357740 c000000080183b30 c000000001352000 000000000000000e
GPR04: c0000000011d1c4f 0000000000000002 000000000000001a 0000000000000000
GPR08: 0000000000000000 0000000000000002 0000000000000001 0000000000000005
GPR12: c0000000013576a4 c0000000015ad000 c00000000000210c 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 c0000000015876e8 0000000000000002 c000000001587500
GPR28: c000000001587578 000000000000000e 0000000004000000 0000000000000170
NIP [c000000001370800] hugetlb_add_hstate+0xc0/0x180
LR [c000000001357740] hugetlbpage_init+0x9c/0xf0
Call Trace:
hugetlb_add_hstate+0x148/0x180 (unreliable)
hugetlbpage_init+0x9c/0xf0
do_one_initcall+0x84/0x308
kernel_init_freeable+0x2e4/0x380
kernel_init+0x30/0x15c
ret_from_kernel_user_thread+0x14/0x1c

Kernel commit causing this warning:
commit 7b4f21f5e0386dfe02c68c009294d8f26e3c1bad (HEAD)
Author: David Hildenbrand <david@redhat.com>
Date:   Mon Sep 1 17:03:29 2025 +0200

     mm/hugetlb: check for unreasonable folio sizes when registering hstate

     Let's check that no hstate that corresponds to an unreasonable folio size
     is registered by an architecture.  If we were to succeed registering, we
     could later try allocating an unsupported gigantic folio size.

...

         BUG_ON(order < order_base_2(__NR_USED_SUBPAGE));
+       WARN_ON(order > MAX_FOLIO_ORDER);
         h = &hstates[hugetlb_max_hstate++];

snip...


Command to create kernel config:
make ARCH=powerpc corenet64_smp_defconfig

Qemu command:
qemu-system-ppc64 -nographic -vga none -M ppce500 -smp 2 -m 4G -accel tcg \
    -kernel ./vmlinux -nic user -initrd ./ppc64-novsx-rootfs.cpio.gz \
    -cpu e5500 -append "noreboot"


Root cause:
The MAX_FOLIO_ORDER for the e500 platform is MAX_PAGE_ORDER, which is
simply CONFIG_ARCH_FORCE_MAX_ORDER, which in turn depends on the page size
(4K here). So the value of MAX_FOLIO_ORDER is 12 in this case.

As per arch/powerpc/mm/nohash/tlb.c, the following page sizes are supported
on the e500 platform:

struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = {
     [MMU_PAGE_4K] = {
         .shift    = 12,
     },
     [MMU_PAGE_2M] = {
         .shift    = 21,
     },
     [MMU_PAGE_4M] = {
         .shift    = 22,
     },
     [MMU_PAGE_16M] = {
         .shift    = 24,
     },
     [MMU_PAGE_64M] = {
         .shift    = 26,
     },
     [MMU_PAGE_256M] = {
         .shift    = 28,
     },
     [MMU_PAGE_1G] = {
         .shift    = 30,
     },
};

With the above MAX_FOLIO_ORDER and page sizes, hugetlbpage_init() in
arch/powerpc/mm/hugetlbpage.c tries to call hugetlb_add_hstate() with
an order higher than 12, causing the kernel to print the above warning.

Things I tried:
I enabled CONFIG_ARCH_HAS_GIGANTIC_PAGE for the e500 platform. With that,
MAX_FOLIO_ORDER was set to 16, but that was not sufficient for MMU_PAGE_1G.

This is because with CONFIG_ARCH_HAS_GIGANTIC_PAGE enabled,
MAX_FOLIO_ORDER was set to 16 = PUD_ORDER = (PMD_INDEX_SIZE (7) + PTE_INDEX_SIZE (9)),
while the order for MMU_PAGE_1G was 18.

Thanks,
Sourabh Jain



* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-10-29  5:49 powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate Sourabh Jain
@ 2025-10-29  8:25 ` David Hildenbrand
  2025-11-05 11:32   ` Christophe Leroy
  0 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand @ 2025-10-29  8:25 UTC (permalink / raw)
  To: Sourabh Jain, Madhavan Srinivasan, Ritesh Harjani (IBM),
	linuxppc-dev
  Cc: Christophe Leroy, Donet Tom, Andrew Morton

On 29.10.25 06:49, Sourabh Jain wrote:
> Kernel is printing below warning while booting:
> 
> 
> WARNING: CPU: 0 PID: 1 at mm/hugetlb.c:4753 hugetlb_add_hstate+0xc0/0x180
> Modules linked in:
> CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted
> 6.18.0-rc1-01400-ga297f72c4951 #6 NONE
> Hardware name: QEMU ppce500 e5500 0x80240020 QEMU e500
> NIP:  c000000001370800 LR: c000000001357740 CTR: 0000000000000005
> REGS: c000000080183890 TRAP: 0700   Not tainted
> (6.18.0-rc1-01400-ga297f72c4951)
> MSR:  0000000080029002 <CE,EE,ME>  CR: 48000242  XER: 20000000
> IRQMASK: 0
> GPR00: c000000001357740 c000000080183b30 c000000001352000 000000000000000e
> GPR04: c0000000011d1c4f 0000000000000002 000000000000001a 0000000000000000
> GPR08: 0000000000000000 0000000000000002 0000000000000001 0000000000000005
> GPR12: c0000000013576a4 c0000000015ad000 c00000000000210c 0000000000000000
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR24: 0000000000000000 c0000000015876e8 0000000000000002 c000000001587500
> GPR28: c000000001587578 000000000000000e 0000000004000000 0000000000000170
> NIP [c000000001370800] hugetlb_add_hstate+0xc0/0x180
> LR [c000000001357740] hugetlbpage_init+0x9c/0xf0
> Call Trace:
> hugetlb_add_hstate+0x148/0x180 (unreliable)
> hugetlbpage_init+0x9c/0xf0
> do_one_initcall+0x84/0x308
> kernel_init_freeable+0x2e4/0x380
> kernel_init+0x30/0x15c
> ret_from_kernel_user_thread+0x14/0x1c
> 
> Kernel commit causing these warning:
> commit 7b4f21f5e0386dfe02c68c009294d8f26e3c1bad (HEAD)
> Author: David Hildenbrand <david@redhat.com>
> Date:   Mon Sep 1 17:03:29 2025 +0200
> 
>       mm/hugetlb: check for unreasonable folio sizes when registering hstate
> 
>       Let's check that no hstate that corresponds to an unreasonable
> folio size
>       is registered by an architecture.  If we were to succeed
> registering, we
>       could later try allocating an unsupported gigantic folio size.
> 
> ...
> 
>           BUG_ON(order < order_base_2(__NR_USED_SUBPAGE));
> +       WARN_ON(order > MAX_FOLIO_ORDER);
>           h = &hstates[hugetlb_max_hstate++];
> 
> snip...
> 
> 
> Command to create kernel config:
> make ARCH=powerpc corenet64_smp_defconfig
> 
> Qemu command:
> qemu-system-ppc64 -nographic -vga none -M ppce500 -smp 2 -m 4G -accel
> tcg -kernel ./vmlinux -nic user -initrd ./ppc64-novsx-rootfs.cpio.gz
> -cpu e5500 -append "noreboot"
> 
> 
> Root cause:
> The MAX_FOLIO_ORDER  for e500 platform is MAX_PAGE_ORDER which is
> nothing but CONFIG_ARCH_FORCE_MAX_ORDER which dependent of page-size
> which was 4k. So value of MAX_FOLIO_ODER is 12 for this case.
> 
> As per arch/powerpc/mm/nohash/tlb.c the following page size are supported on
> e500 platform:
> 
> struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = {
>       [MMU_PAGE_4K] = {
>           .shift    = 12,
>       },
>       [MMU_PAGE_2M] = {
>           .shift    = 21,
>       },
>       [MMU_PAGE_4M] = {
>           .shift    = 22,
>       },
>       [MMU_PAGE_16M] = {
>           .shift    = 24,
>       },
>       [MMU_PAGE_64M] = {
>           .shift    = 26,
>       },
>       [MMU_PAGE_256M] = {
>           .shift    = 28,
>       },
>       [MMU_PAGE_1G] = {
>           .shift    = 30,
>       },
> };
> 
> With the above MAX_FOLIO_ORDER and page sizes, hugetlbpage_init() in
> arch/powerpc/mm/hugetlbpage.c tries to call hugetlb_add_hstate() with
> an order higher than 12, causing the kernel to print the above warning.
> 
> Things I tried:
> I enabled CONFIG_ARCH_HAS_GIGANTIC_PAGE for the e500 platform. With that,
> MAX_FOLIO_ORDER was set to 16, but that was not sufficient for MMU_PAGE_1G.
> 
> This is because with CONFIG_ARCH_HAS_GIGANTIC_PAGE enabled,
> MAX_FOLIO_ORDER was set to 16 = PUD_ORDER = (PMD_INDEX_SIZE (7) +
> PTE_INDEX_SIZE (9)),
> while the order for MMU_PAGE_1G was 18.

Yes, we discussed that in [1].

We'll have to set ARCH_HAS_GIGANTIC_PAGE on ppc and increase 
MAX_FOLIO_ORDER, because apparently, there might be ppc configs that 
have even larger hugetlb sizes than PUDs.

@Christophe, I was under the impression that you would send a fix. Do you 
want me to prepare something and send it out?

[1] https://lkml.kernel.org/r/4632e721-0ac8-4d72-a8ed-e6c928eee94d@csgroup.eu

-- 
Cheers

David / dhildenb




* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-10-29  8:25 ` David Hildenbrand
@ 2025-11-05 11:32   ` Christophe Leroy
  2025-11-06 15:02     ` David Hildenbrand (Red Hat)
  0 siblings, 1 reply; 21+ messages in thread
From: Christophe Leroy @ 2025-11-05 11:32 UTC (permalink / raw)
  To: David Hildenbrand, Sourabh Jain, Madhavan Srinivasan,
	Ritesh Harjani (IBM), linuxppc-dev
  Cc: Donet Tom, Andrew Morton

Hi David,

On 29/10/2025 at 09:25, David Hildenbrand wrote:
> On 29.10.25 06:49, Sourabh Jain wrote:
>> Kernel is printing below warning while booting:
>>
>>
>> WARNING: CPU: 0 PID: 1 at mm/hugetlb.c:4753 hugetlb_add_hstate+0xc0/0x180
>> Modules linked in:
>> CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted
>> 6.18.0-rc1-01400-ga297f72c4951 #6 NONE
>> Hardware name: QEMU ppce500 e5500 0x80240020 QEMU e500
>> NIP:  c000000001370800 LR: c000000001357740 CTR: 0000000000000005
>> REGS: c000000080183890 TRAP: 0700   Not tainted
>> (6.18.0-rc1-01400-ga297f72c4951)
>> MSR:  0000000080029002 <CE,EE,ME>  CR: 48000242  XER: 20000000
>> IRQMASK: 0
>> GPR00: c000000001357740 c000000080183b30 c000000001352000 000000000000000e
>> GPR04: c0000000011d1c4f 0000000000000002 000000000000001a 0000000000000000
>> GPR08: 0000000000000000 0000000000000002 0000000000000001 0000000000000005
>> GPR12: c0000000013576a4 c0000000015ad000 c00000000000210c 0000000000000000
>> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> GPR24: 0000000000000000 c0000000015876e8 0000000000000002 c000000001587500
>> GPR28: c000000001587578 000000000000000e 0000000004000000 0000000000000170
>> NIP [c000000001370800] hugetlb_add_hstate+0xc0/0x180
>> LR [c000000001357740] hugetlbpage_init+0x9c/0xf0
>> Call Trace:
>> hugetlb_add_hstate+0x148/0x180 (unreliable)
>> hugetlbpage_init+0x9c/0xf0
>> do_one_initcall+0x84/0x308
>> kernel_init_freeable+0x2e4/0x380
>> kernel_init+0x30/0x15c
>> ret_from_kernel_user_thread+0x14/0x1c
>>
>> Kernel commit causing these warning:
>> commit 7b4f21f5e0386dfe02c68c009294d8f26e3c1bad (HEAD)
>> Author: David Hildenbrand <david@redhat.com>
>> Date:   Mon Sep 1 17:03:29 2025 +0200
>>
>>       mm/hugetlb: check for unreasonable folio sizes when registering 
>> hstate
>>
>>       Let's check that no hstate that corresponds to an unreasonable
>> folio size
>>       is registered by an architecture.  If we were to succeed
>> registering, we
>>       could later try allocating an unsupported gigantic folio size.
>>
>> ...
>>
>>           BUG_ON(order < order_base_2(__NR_USED_SUBPAGE));
>> +       WARN_ON(order > MAX_FOLIO_ORDER);
>>           h = &hstates[hugetlb_max_hstate++];
>>
>> snip...
>>
>>
>> Command to create kernel config:
>> make ARCH=powerpc corenet64_smp_defconfig
>>
>> Qemu command:
>> qemu-system-ppc64 -nographic -vga none -M ppce500 -smp 2 -m 4G -accel
>> tcg -kernel ./vmlinux -nic user -initrd ./ppc64-novsx-rootfs.cpio.gz
>> -cpu e5500 -append "noreboot"
>>
>>
>> Root cause:
>> The MAX_FOLIO_ORDER  for e500 platform is MAX_PAGE_ORDER which is
>> nothing but CONFIG_ARCH_FORCE_MAX_ORDER which dependent of page-size
>> which was 4k. So value of MAX_FOLIO_ODER is 12 for this case.
>>
>> As per arch/powerpc/mm/nohash/tlb.c the following page size are 
>> supported on
>> e500 platform:
>>
>> struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = {
>>       [MMU_PAGE_4K] = {
>>           .shift    = 12,
>>       },
>>       [MMU_PAGE_2M] = {
>>           .shift    = 21,
>>       },
>>       [MMU_PAGE_4M] = {
>>           .shift    = 22,
>>       },
>>       [MMU_PAGE_16M] = {
>>           .shift    = 24,
>>       },
>>       [MMU_PAGE_64M] = {
>>           .shift    = 26,
>>       },
>>       [MMU_PAGE_256M] = {
>>           .shift    = 28,
>>       },
>>       [MMU_PAGE_1G] = {
>>           .shift    = 30,
>>       },
>> };
>>
>> With the above MAX_FOLIO_ORDER and page sizes, hugetlbpage_init() in
>> arch/powerpc/mm/hugetlbpage.c tries to call hugetlb_add_hstate() with
>> an order higher than 12, causing the kernel to print the above warning.
>>
>> Things I tried:
>> I enabled CONFIG_ARCH_HAS_GIGANTIC_PAGE for the e500 platform. With that,
>> MAX_FOLIO_ORDER was set to 16, but that was not sufficient for 
>> MMU_PAGE_1G.
>>
>> This is because with CONFIG_ARCH_HAS_GIGANTIC_PAGE enabled,
>> MAX_FOLIO_ORDER was set to 16 = PUD_ORDER = (PMD_INDEX_SIZE (7) +
>> PTE_INDEX_SIZE (9)),
>> while the order for MMU_PAGE_1G was 18.
> 
> Yes, we discussed that in [1].
> 
> We'll have to set ARCH_HAS_GIGANTIC_PAGE on ppc and increase 
> MAX_FOLIO_ORDER, because apparently, there might be ppc configs that 
> have even larger hugetlb sizes than PUDs.
> 
> @Cristophe, I was under the impression that you would send a fix. Do you 
> want me to prepare something and send it out?

Indeed I would have liked to better understand the implications of all 
this, but I didn't have the time.

By the way, you would describe the fix better than I could, so yes, if you
can prepare and send a fix, please do.

Christophe



* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-05 11:32   ` Christophe Leroy
@ 2025-11-06 15:02     ` David Hildenbrand (Red Hat)
  2025-11-06 16:19       ` Christophe Leroy
  2025-11-07  8:00       ` Sourabh Jain
  0 siblings, 2 replies; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-06 15:02 UTC (permalink / raw)
  To: Christophe Leroy, Sourabh Jain, Madhavan Srinivasan,
	Ritesh Harjani (IBM), linuxppc-dev
  Cc: Donet Tom, Andrew Morton

>> Yes, we discussed that in [1].
>>
>> We'll have to set ARCH_HAS_GIGANTIC_PAGE on ppc and increase
>> MAX_FOLIO_ORDER, because apparently, there might be ppc configs that
>> have even larger hugetlb sizes than PUDs.
>>
>> @Cristophe, I was under the impression that you would send a fix. Do you
>> want me to prepare something and send it out?
> 
> Indeed I would have liked to better understand the implications of all
> this, but I didn't have the time.

Indeed, it took me longer than it should have to understand this and make up
my mind as well.

> 
> By the way, you would describe the fix better than me so yes if you can
> prepare and send a fix please do.

I just crafted the following. I still have to test it more; some early
feedback and testing would be appreciated!

 From 274928854644c49c92515f8675c090dba15a0db6 Mon Sep 17 00:00:00 2001
From: "David Hildenbrand (Red Hat)" <david@kernel.org>
Date: Thu, 6 Nov 2025 11:31:45 +0100
Subject: [PATCH] mm: fix MAX_FOLIO_ORDER on some ppc64 configs with hugetlb

In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support
runtime allocation of gigantic hugetlb folios. In the meantime it evolved
into a generic way for the architecture to state that it supports
gigantic hugetlb folios.

In commit fae7d834c43c ("mm: add __dump_folio()") we started using
CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could
have folios larger than what the buddy can handle. In the context of
that commit, we started using MAX_FOLIO_ORDER to detect page corruptions
when dumping tail pages of folios. Before that commit, we assumed that
we cannot have folios larger than the highest buddy order, which was
obviously wrong.

In commit 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
when registering hstate"), we used MAX_FOLIO_ORDER to detect
inconsistencies, and in fact, we found some now.

Powerpc allows for configs that can allocate gigantic folios during boot
(not at runtime), that do not set CONFIG_ARCH_HAS_GIGANTIC_PAGE and can
exceed PUD_ORDER.

To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE for
all 64bit configs, and increase the maximum folio size to P4D_ORDER.

Ideally, we'd have a better way to obtain a maximum value. But this should
be good enough for now to fix the issue; the limit now mostly states "no real
folio size limit".

While at it, handle gigantic DAX folios more clearly: DAX can only
end up creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.

Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases
clearer. In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with
HUGETLB_PAGE.

Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on PPC64, we will now
also allow for runtime allocations of folios in some more powerpc configs.
I don't think this is a problem, but if it is we could handle it through
__HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.

While __dump_page()/__dump_folio() was also problematic (not handling dumping
of tail pages of such gigantic folios correctly), that doesn't seem
critical enough to mark it as a fix.

Fixes: 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes when registering hstate")
Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org>
---
  arch/powerpc/Kconfig                   | 1 +
  arch/powerpc/platforms/Kconfig.cputype | 1 -
  include/linux/mm.h                     | 4 ++--
  include/linux/pgtable.h                | 1 +
  mm/Kconfig                             | 7 +++++++
  5 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e24f4d88885ae..55c3626c86273 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -137,6 +137,7 @@ config PPC
  	select ARCH_HAS_DMA_OPS			if PPC64
  	select ARCH_HAS_FORTIFY_SOURCE
  	select ARCH_HAS_GCOV_PROFILE_ALL
+	select ARCH_HAS_GIGANTIC_PAGE		if PPC64
  	select ARCH_HAS_KCOV
  	select ARCH_HAS_KERNEL_FPU_SUPPORT	if PPC64 && PPC_FPU
  	select ARCH_HAS_MEMBARRIER_CALLBACKS
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 7b527d18aa5ee..4c321a8ea8965 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU
  config PPC_RADIX_MMU
  	bool "Radix MMU Support"
  	depends on PPC_BOOK3S_64
-	select ARCH_HAS_GIGANTIC_PAGE
  	default y
  	help
  	  Enable support for the Power ISA 3.0 Radix style MMU. Currently this
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d16b33bacc32b..4842edc875185 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2074,7 +2074,7 @@ static inline unsigned long folio_nr_pages(const struct folio *folio)
  	return folio_large_nr_pages(folio);
  }
  
-#if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE)
+#if !defined(CONFIG_HAVE_GIGANTIC_FOLIOS)
  /*
   * We don't expect any folios that exceed buddy sizes (and consequently
   * memory sections).
@@ -2092,7 +2092,7 @@ static inline unsigned long folio_nr_pages(const struct folio *folio)
   * There is no real limit on the folio size. We limit them to the maximum we
   * currently expect (e.g., hugetlb, dax).
   */
-#define MAX_FOLIO_ORDER		PUD_ORDER
+#define MAX_FOLIO_ORDER		P4D_ORDER
  #endif
  
  #define MAX_FOLIO_NR_PAGES	(1UL << MAX_FOLIO_ORDER)
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 32e8457ad5352..09fc3c2ba39e2 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -7,6 +7,7 @@
  
  #define PMD_ORDER	(PMD_SHIFT - PAGE_SHIFT)
  #define PUD_ORDER	(PUD_SHIFT - PAGE_SHIFT)
+#define P4D_ORDER	(P4D_SHIFT - PAGE_SHIFT)
  
  #ifndef __ASSEMBLY__
  #ifdef CONFIG_MMU
diff --git a/mm/Kconfig b/mm/Kconfig
index 0e26f4fc8717b..ca3f146bc7053 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -908,6 +908,13 @@ config PAGE_MAPCOUNT
  config PGTABLE_HAS_HUGE_LEAVES
  	def_bool TRANSPARENT_HUGEPAGE || HUGETLB_PAGE
  
+#
+# We can end up creating gigantic folio.
+#
+config HAVE_GIGANTIC_FOLIOS
+	def_bool (HUGETLB_PAGE && ARCH_HAS_GIGANTIC_PAGE) || \
+		 (ZONE_DEVICE && HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
+
  # TODO: Allow to be enabled without THP
  config ARCH_SUPPORTS_HUGE_PFNMAP
  	def_bool n
-- 
2.51.0


-- 
Cheers

David



* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-06 15:02     ` David Hildenbrand (Red Hat)
@ 2025-11-06 16:19       ` Christophe Leroy
  2025-11-07 14:37         ` Ritesh Harjani
  2025-11-10 11:27         ` David Hildenbrand (Red Hat)
  2025-11-07  8:00       ` Sourabh Jain
  1 sibling, 2 replies; 21+ messages in thread
From: Christophe Leroy @ 2025-11-06 16:19 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat), Sourabh Jain, Madhavan Srinivasan,
	Ritesh Harjani (IBM), linuxppc-dev
  Cc: Donet Tom, Andrew Morton



On 06/11/2025 at 16:02, David Hildenbrand (Red Hat) wrote:
>>> Yes, we discussed that in [1].
>>>
>>> We'll have to set ARCH_HAS_GIGANTIC_PAGE on ppc and increase
>>> MAX_FOLIO_ORDER, because apparently, there might be ppc configs that
>>> have even larger hugetlb sizes than PUDs.
>>>
>>> @Cristophe, I was under the impression that you would send a fix. Do you
>>> want me to prepare something and send it out?
>>
>> Indeed I would have liked to better understand the implications of all
>> this, but I didn't have the time.
> 
> Indeed, too me longer than it should to understand and make up my mind 
> as well.
> 
>>
>> By the way, you would describe the fix better than me so yes if you can
>> prepare and send a fix please do.
> 
> I just crafted the following. I yet have to test it more, some early
> feedback+testing would be appreciated!
> 
>  From 274928854644c49c92515f8675c090dba15a0db6 Mon Sep 17 00:00:00 2001
> From: "David Hildenbrand (Red Hat)" <david@kernel.org>
> Date: Thu, 6 Nov 2025 11:31:45 +0100
> Subject: [PATCH] mm: fix MAX_FOLIO_ORDER on some ppc64 configs with hugetlb
> 
> In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support
> runtime allocation of gigantic hugetlb folios. In the meantime it evolved
> into a generic way for the architecture to state that it supports
> gigantic hugetlb folios.
> 
> In commit fae7d834c43c ("mm: add __dump_folio()") we started using
> CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could
> have folios larger than what the buddy can handle. In the context of
> that commit, we started using MAX_FOLIO_ORDER to detect page corruptions
> when dumping tail pages of folios. Before that commit, we assumed that
> we cannot have folios larger than the highest buddy order, which was
> obviously wrong.
> 
> In commit 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
> when registering hstate"), we used MAX_FOLIO_ORDER to detect
> inconsistencies, and in fact, we found some now.
> 
> Powerpc allows for configs that can allocate gigantic folio during boot
> (not at runtime), that do not set CONFIG_ARCH_HAS_GIGANTIC_PAGE and can
> exceed PUD_ORDER.
> 
> To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE for
> all 64bit configs, and increase the maximum folio size to P4D_ORDER.
> 
> Ideally, we'd have a better way to obtain a maximum value. But this should
> be good enough for now fix the issue and now mostly states "no real folio
> size limit".
> 
> While at it, handle gigantic DAX folios more clearly: DAX can only
> end up creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
> 
> Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases
> clearer. In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with
> HUGETLB_PAGE.
> 
> Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on PPC64, we will now
> also allow for runtime allocations of folios in some more powerpc configs.
> I don't think this is a problem, but if it is we could handle it through
> __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.
> 
> While __dump_page()/__dump_folio was also problematic (not handling dumping
> of tail pages of such gigantic folios correctly), it doesn't relevant
> critical enough to mark it as a fix.
> 
> Fixes: 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes 
> when registering hstate")
> Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org>
> ---
>   arch/powerpc/Kconfig                   | 1 +
>   arch/powerpc/platforms/Kconfig.cputype | 1 -
>   include/linux/mm.h                     | 4 ++--
>   include/linux/pgtable.h                | 1 +
>   mm/Kconfig                             | 7 +++++++
>   5 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index e24f4d88885ae..55c3626c86273 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -137,6 +137,7 @@ config PPC
>       select ARCH_HAS_DMA_OPS            if PPC64
>       select ARCH_HAS_FORTIFY_SOURCE
>       select ARCH_HAS_GCOV_PROFILE_ALL
> +    select ARCH_HAS_GIGANTIC_PAGE        if PPC64

The problem is not only on PPC64 but on PPC32 as well; for instance,
corenet32_smp_defconfig is affected too.

On the other hand for book3s/64 it is already handled, see 
arch/powerpc/platforms/Kconfig.cputype:

config PPC_RADIX_MMU
	bool "Radix MMU Support"
	depends on PPC_BOOK3S_64
	select ARCH_HAS_GIGANTIC_PAGE
	default y


So I think what you want instead is:

diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 7b527d18aa5ee..1f5a1e587740c 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -276,6 +276,7 @@ config PPC_E500
         select FSL_EMB_PERFMON
         bool
         select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
+       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
         select PPC_SMP_MUXED_IPI
         select PPC_DOORBELL
         select PPC_KUEP



>       select ARCH_HAS_KCOV
>       select ARCH_HAS_KERNEL_FPU_SUPPORT    if PPC64 && PPC_FPU
>       select ARCH_HAS_MEMBARRIER_CALLBACKS
> diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/ 
> platforms/Kconfig.cputype
> index 7b527d18aa5ee..4c321a8ea8965 100644
> --- a/arch/powerpc/platforms/Kconfig.cputype
> +++ b/arch/powerpc/platforms/Kconfig.cputype
> @@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU
>   config PPC_RADIX_MMU
>       bool "Radix MMU Support"
>       depends on PPC_BOOK3S_64
> -    select ARCH_HAS_GIGANTIC_PAGE

This should remain, I think.

>       default y
>       help
>         Enable support for the Power ISA 3.0 Radix style MMU. Currently 
> this
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index d16b33bacc32b..4842edc875185 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2074,7 +2074,7 @@ static inline unsigned long folio_nr_pages(const 
> struct folio *folio)
>       return folio_large_nr_pages(folio);
>   }
> 
> -#if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE)
> +#if !defined(CONFIG_HAVE_GIGANTIC_FOLIOS)
>   /*
>    * We don't expect any folios that exceed buddy sizes (and consequently
>    * memory sections).
> @@ -2092,7 +2092,7 @@ static inline unsigned long folio_nr_pages(const 
> struct folio *folio)
>    * There is no real limit on the folio size. We limit them to the 
> maximum we
>    * currently expect (e.g., hugetlb, dax).
>    */
> -#define MAX_FOLIO_ORDER        PUD_ORDER
> +#define MAX_FOLIO_ORDER        P4D_ORDER
>   #endif
> 
>   #define MAX_FOLIO_NR_PAGES    (1UL << MAX_FOLIO_ORDER)
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 32e8457ad5352..09fc3c2ba39e2 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -7,6 +7,7 @@
> 
>   #define PMD_ORDER    (PMD_SHIFT - PAGE_SHIFT)
>   #define PUD_ORDER    (PUD_SHIFT - PAGE_SHIFT)
> +#define P4D_ORDER    (P4D_SHIFT - PAGE_SHIFT)
> 
>   #ifndef __ASSEMBLY__
>   #ifdef CONFIG_MMU
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 0e26f4fc8717b..ca3f146bc7053 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -908,6 +908,13 @@ config PAGE_MAPCOUNT
>   config PGTABLE_HAS_HUGE_LEAVES
>       def_bool TRANSPARENT_HUGEPAGE || HUGETLB_PAGE
> 
> +#
> +# We can end up creating gigantic folio.
> +#
> +config HAVE_GIGANTIC_FOLIOS
> +    def_bool (HUGETLB_PAGE && ARCH_HAS_GIGANTIC_PAGE) || \
> +         (ZONE_DEVICE && HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
> +
>   # TODO: Allow to be enabled without THP
>   config ARCH_SUPPORTS_HUGE_PFNMAP
>       def_bool n




* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-06 15:02     ` David Hildenbrand (Red Hat)
  2025-11-06 16:19       ` Christophe Leroy
@ 2025-11-07  8:00       ` Sourabh Jain
  2025-11-07  9:02         ` David Hildenbrand (Red Hat)
  1 sibling, 1 reply; 21+ messages in thread
From: Sourabh Jain @ 2025-11-07  8:00 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat), Christophe Leroy,
	Madhavan Srinivasan, Ritesh Harjani (IBM), linuxppc-dev
  Cc: Donet Tom, Andrew Morton



On 06/11/25 20:32, David Hildenbrand (Red Hat) wrote:
>>> Yes, we discussed that in [1].
>>>
>>> We'll have to set ARCH_HAS_GIGANTIC_PAGE on ppc and increase
>>> MAX_FOLIO_ORDER, because apparently, there might be ppc configs that
>>> have even larger hugetlb sizes than PUDs.
>>>
>>> @Cristophe, I was under the impression that you would send a fix. Do 
>>> you
>>> want me to prepare something and send it out?
>>
>> Indeed I would have liked to better understand the implications of all
>> this, but I didn't have the time.
>
> Indeed, too me longer than it should to understand and make up my mind 
> as well.
>
>>
>> By the way, you would describe the fix better than me so yes if you can
>> prepare and send a fix please do.
>
> I just crafted the following. I yet have to test it more, some early
> feedback+testing would be appreciated!
>
> From 274928854644c49c92515f8675c090dba15a0db6 Mon Sep 17 00:00:00 2001
> From: "David Hildenbrand (Red Hat)" <david@kernel.org>
> Date: Thu, 6 Nov 2025 11:31:45 +0100
> Subject: [PATCH] mm: fix MAX_FOLIO_ORDER on some ppc64 configs with 
> hugetlb

b4 did not detect this patch, and the patch text copied manually from this
reply did not apply cleanly to upstream master or to linuxppc master/next
either.

- Sourabh Jain

>
> In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support
> runtime allocation of gigantic hugetlb folios. In the meantime it evolved
> into a generic way for the architecture to state that it supports
> gigantic hugetlb folios.
>
> In commit fae7d834c43c ("mm: add __dump_folio()") we started using
> CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could
> have folios larger than what the buddy can handle. In the context of
> that commit, we started using MAX_FOLIO_ORDER to detect page corruptions
> when dumping tail pages of folios. Before that commit, we assumed that
> we cannot have folios larger than the highest buddy order, which was
> obviously wrong.
>
> In commit 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
> when registering hstate"), we used MAX_FOLIO_ORDER to detect
> inconsistencies, and in fact, we found some now.
>
> Powerpc allows for configs that can allocate gigantic folio during boot
> (not at runtime), that do not set CONFIG_ARCH_HAS_GIGANTIC_PAGE and can
> exceed PUD_ORDER.
>
> To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE for
> all 64bit configs, and increase the maximum folio size to P4D_ORDER.
>
> Ideally, we'd have a better way to obtain a maximum value. But this should
> be good enough for now to fix the issue and now mostly states "no real folio
> size limit".
>
> While at it, handle gigantic DAX folios more clearly: DAX can only
> end up creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
>
> Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases
> clearer. In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with
> HUGETLB_PAGE.
>
> Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on PPC64, we will now
> also allow for runtime allocations of folios in some more powerpc 
> configs.
> I don't think this is a problem, but if it is we could handle it through
> __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.
>
> While __dump_page()/__dump_folio() was also problematic (not handling dumping
> of tail pages of such gigantic folios correctly), it doesn't seem
> critical enough to mark it as a fix.
>
> Fixes: 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes 
> when registering hstate")
> Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org>
> ---
>  arch/powerpc/Kconfig                   | 1 +
>  arch/powerpc/platforms/Kconfig.cputype | 1 -
>  include/linux/mm.h                     | 4 ++--
>  include/linux/pgtable.h                | 1 +
>  mm/Kconfig                             | 7 +++++++
>  5 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index e24f4d88885ae..55c3626c86273 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -137,6 +137,7 @@ config PPC
>      select ARCH_HAS_DMA_OPS            if PPC64
>      select ARCH_HAS_FORTIFY_SOURCE
>      select ARCH_HAS_GCOV_PROFILE_ALL
> +    select ARCH_HAS_GIGANTIC_PAGE        if PPC64
>      select ARCH_HAS_KCOV
>      select ARCH_HAS_KERNEL_FPU_SUPPORT    if PPC64 && PPC_FPU
>      select ARCH_HAS_MEMBARRIER_CALLBACKS
> diff --git a/arch/powerpc/platforms/Kconfig.cputype 
> b/arch/powerpc/platforms/Kconfig.cputype
> index 7b527d18aa5ee..4c321a8ea8965 100644
> --- a/arch/powerpc/platforms/Kconfig.cputype
> +++ b/arch/powerpc/platforms/Kconfig.cputype
> @@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU
>  config PPC_RADIX_MMU
>      bool "Radix MMU Support"
>      depends on PPC_BOOK3S_64
> -    select ARCH_HAS_GIGANTIC_PAGE
>      default y
>      help
>        Enable support for the Power ISA 3.0 Radix style MMU. Currently 
> this
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index d16b33bacc32b..4842edc875185 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2074,7 +2074,7 @@ static inline unsigned long folio_nr_pages(const 
> struct folio *folio)
>      return folio_large_nr_pages(folio);
>  }
>
> -#if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE)
> +#if !defined(CONFIG_HAVE_GIGANTIC_FOLIOS)
>  /*
>   * We don't expect any folios that exceed buddy sizes (and consequently
>   * memory sections).
> @@ -2092,7 +2092,7 @@ static inline unsigned long folio_nr_pages(const 
> struct folio *folio)
>   * There is no real limit on the folio size. We limit them to the 
> maximum we
>   * currently expect (e.g., hugetlb, dax).
>   */
> -#define MAX_FOLIO_ORDER        PUD_ORDER
> +#define MAX_FOLIO_ORDER        P4D_ORDER
>  #endif
>
>  #define MAX_FOLIO_NR_PAGES    (1UL << MAX_FOLIO_ORDER)
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 32e8457ad5352..09fc3c2ba39e2 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -7,6 +7,7 @@
>
>  #define PMD_ORDER    (PMD_SHIFT - PAGE_SHIFT)
>  #define PUD_ORDER    (PUD_SHIFT - PAGE_SHIFT)
> +#define P4D_ORDER    (P4D_SHIFT - PAGE_SHIFT)
>
>  #ifndef __ASSEMBLY__
>  #ifdef CONFIG_MMU
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 0e26f4fc8717b..ca3f146bc7053 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -908,6 +908,13 @@ config PAGE_MAPCOUNT
>  config PGTABLE_HAS_HUGE_LEAVES
>      def_bool TRANSPARENT_HUGEPAGE || HUGETLB_PAGE
>
> +#
> +# We can end up creating gigantic folio.
> +#
> +config HAVE_GIGANTIC_FOLIOS
> +    def_bool (HUGETLB_PAGE && ARCH_HAS_GIGANTIC_PAGE) || \
> +         (ZONE_DEVICE && HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
> +
>  # TODO: Allow to be enabled without THP
>  config ARCH_SUPPORTS_HUGE_PFNMAP
>      def_bool n



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-07  8:00       ` Sourabh Jain
@ 2025-11-07  9:02         ` David Hildenbrand (Red Hat)
  2025-11-07 12:35           ` Sourabh Jain
  0 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-07  9:02 UTC (permalink / raw)
  To: Sourabh Jain, Christophe Leroy, Madhavan Srinivasan,
	Ritesh Harjani (IBM), linuxppc-dev
  Cc: Donet Tom, Andrew Morton

On 07.11.25 09:00, Sourabh Jain wrote:
> 
> 
> On 06/11/25 20:32, David Hildenbrand (Red Hat) wrote:
>>>> Yes, we discussed that in [1].
>>>>
>>>> We'll have to set ARCH_HAS_GIGANTIC_PAGE on ppc and increase
>>>> MAX_FOLIO_ORDER, because apparently, there might be ppc configs that
>>>> have even larger hugetlb sizes than PUDs.
>>>>
>>>> @Christophe, I was under the impression that you would send a fix. Do you
>>>> want me to prepare something and send it out?
>>>
>>> Indeed I would have liked to better understand the implications of all
>>> this, but I didn't have the time.
>>
>> Indeed, took me longer than it should have to understand and make up my
>> mind as well.
>>
>>>
>>> By the way, you would describe the fix better than me so yes if you can
>>> prepare and send a fix please do.
>>
>> I just crafted the following. I yet have to test it more, some early
>> feedback+testing would be appreciated!
>>
>>  From 274928854644c49c92515f8675c090dba15a0db6 Mon Sep 17 00:00:00 2001
>> From: "David Hildenbrand (Red Hat)" <david@kernel.org>
>> Date: Thu, 6 Nov 2025 11:31:45 +0100
>> Subject: [PATCH] mm: fix MAX_FOLIO_ORDER on some ppc64 configs with
>> hugetlb
> 
> b4 did not detect this patch, and manually copying the patch text from this
> reply also did not apply cleanly on upstream master and linuxppc
> master/next.

I have it on a branch here:

https://github.com/davidhildenbrand/linux/commit/274928854644c49c92515f8675c090dba15a0db6

https://github.com/davidhildenbrand/linux.git max_folio_order

-- 
Cheers

David


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-07  9:02         ` David Hildenbrand (Red Hat)
@ 2025-11-07 12:35           ` Sourabh Jain
  2025-11-07 14:18             ` David Hildenbrand (Red Hat)
  0 siblings, 1 reply; 21+ messages in thread
From: Sourabh Jain @ 2025-11-07 12:35 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat), Christophe Leroy,
	Madhavan Srinivasan, Ritesh Harjani (IBM), linuxppc-dev
  Cc: Donet Tom, Andrew Morton


On 07/11/25 14:32, David Hildenbrand (Red Hat) wrote:
> On 07.11.25 09:00, Sourabh Jain wrote:
>>
>>
>> On 06/11/25 20:32, David Hildenbrand (Red Hat) wrote:
>>>>> Yes, we discussed that in [1].
>>>>>
>>>>> We'll have to set ARCH_HAS_GIGANTIC_PAGE on ppc and increase
>>>>> MAX_FOLIO_ORDER, because apparently, there might be ppc configs that
>>>>> have even larger hugetlb sizes than PUDs.
>>>>>
>>>>> @Christophe, I was under the impression that you would send a fix. Do you
>>>>> want me to prepare something and send it out?
>>>>
>>>> Indeed I would have liked to better understand the implications of all
>>>> this, but I didn't have the time.
>>>
>>> Indeed, took me longer than it should have to understand and make up my
>>> mind as well.
>>>
>>>>
>>>> By the way, you would describe the fix better than me so yes if you 
>>>> can
>>>> prepare and send a fix please do.
>>>
>>> I just crafted the following. I yet have to test it more, some early
>>> feedback+testing would be appreciated!
>>>
>>>  From 274928854644c49c92515f8675c090dba15a0db6 Mon Sep 17 00:00:00 2001
>>> From: "David Hildenbrand (Red Hat)" <david@kernel.org>
>>> Date: Thu, 6 Nov 2025 11:31:45 +0100
>>> Subject: [PATCH] mm: fix MAX_FOLIO_ORDER on some ppc64 configs with
>>> hugetlb
>>
>> b4 did not detect this patch, and manually copying the patch text 
>> from this
>> reply also did not apply cleanly on upstream master and linuxppc
>> master/next.
>
> I have it on a branch here:
>
> https://github.com/davidhildenbrand/linux/commit/274928854644c49c92515f8675c090dba15a0db6 
>
>
> https://github.com/davidhildenbrand/linux.git max_folio_order
>

The above patch resolves the issue reported in this thread.

Thanks for the fix David.

Thanks,
Sourabh Jain


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-07 12:35           ` Sourabh Jain
@ 2025-11-07 14:18             ` David Hildenbrand (Red Hat)
  0 siblings, 0 replies; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-07 14:18 UTC (permalink / raw)
  To: Sourabh Jain, Christophe Leroy, Madhavan Srinivasan,
	Ritesh Harjani (IBM), linuxppc-dev
  Cc: Donet Tom, Andrew Morton

On 07.11.25 13:35, Sourabh Jain wrote:
> 
> On 07/11/25 14:32, David Hildenbrand (Red Hat) wrote:
>> On 07.11.25 09:00, Sourabh Jain wrote:
>>>
>>>
>>> On 06/11/25 20:32, David Hildenbrand (Red Hat) wrote:
>>>>>> Yes, we discussed that in [1].
>>>>>>
>>>>>> We'll have to set ARCH_HAS_GIGANTIC_PAGE on ppc and increase
>>>>>> MAX_FOLIO_ORDER, because apparently, there might be ppc configs that
>>>>>> have even larger hugetlb sizes than PUDs.
>>>>>>
>>>>>> @Christophe, I was under the impression that you would send a fix. Do you
>>>>>> want me to prepare something and send it out?
>>>>>
>>>>> Indeed I would have liked to better understand the implications of all
>>>>> this, but I didn't have the time.
>>>>
>>>> Indeed, took me longer than it should have to understand and make up my
>>>> mind as well.
>>>>
>>>>>
>>>>> By the way, you would describe the fix better than me so yes if you
>>>>> can
>>>>> prepare and send a fix please do.
>>>>
>>>> I just crafted the following. I yet have to test it more, some early
>>>> feedback+testing would be appreciated!
>>>>
>>>>   From 274928854644c49c92515f8675c090dba15a0db6 Mon Sep 17 00:00:00 2001
>>>> From: "David Hildenbrand (Red Hat)" <david@kernel.org>
>>>> Date: Thu, 6 Nov 2025 11:31:45 +0100
>>>> Subject: [PATCH] mm: fix MAX_FOLIO_ORDER on some ppc64 configs with
>>>> hugetlb
>>>
>>> b4 did not detect this patch, and manually copying the patch text
>>> from this
>>> reply also did not apply cleanly on upstream master and linuxppc
>>> master/next.
>>
>> I have it on a branch here:
>>
>> https://github.com/davidhildenbrand/linux/commit/274928854644c49c92515f8675c090dba15a0db6
>>
>>
>> https://github.com/davidhildenbrand/linux.git max_folio_order
>>
> 
> The above patch resolves the issue reported in this thread.
> 
> Thanks for the fix David.

Okay, I'll have to do some more testing (and I've been failing for days 
to get a ppc64 machine internally provisioned automatically). Will send 
it out early next week, thanks!

-- 
Cheers

David


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-06 16:19       ` Christophe Leroy
@ 2025-11-07 14:37         ` Ritesh Harjani
  2025-11-07 16:11           ` Christophe Leroy
  2025-11-10 10:10           ` David Hildenbrand (Red Hat)
  2025-11-10 11:27         ` David Hildenbrand (Red Hat)
  1 sibling, 2 replies; 21+ messages in thread
From: Ritesh Harjani @ 2025-11-07 14:37 UTC (permalink / raw)
  To: Christophe Leroy, David Hildenbrand (Red Hat), Sourabh Jain,
	Madhavan Srinivasan, linuxppc-dev
  Cc: Donet Tom, Andrew Morton

Christophe Leroy <christophe.leroy@csgroup.eu> writes:

> Le 06/11/2025 à 16:02, David Hildenbrand (Red Hat) a écrit :
>>>> Yes, we discussed that in [1].
>>>>
>>>> We'll have to set ARCH_HAS_GIGANTIC_PAGE on ppc and increase
>>>> MAX_FOLIO_ORDER, because apparently, there might be ppc configs that
>>>> have even larger hugetlb sizes than PUDs.
>>>>
>>>> @Christophe, I was under the impression that you would send a fix. Do you
>>>> want me to prepare something and send it out?
>>>
>>> Indeed I would have liked to better understand the implications of all
>>> this, but I didn't have the time.
>> 
>> Indeed, took me longer than it should have to understand and make up my
>> mind as well.
>> 
>>>
>>> By the way, you would describe the fix better than me so yes if you can
>>> prepare and send a fix please do.
>> 
>> I just crafted the following. I yet have to test it more, some early
>> feedback+testing would be appreciated!
>> 
>>  From 274928854644c49c92515f8675c090dba15a0db6 Mon Sep 17 00:00:00 2001
>> From: "David Hildenbrand (Red Hat)" <david@kernel.org>
>> Date: Thu, 6 Nov 2025 11:31:45 +0100
>> Subject: [PATCH] mm: fix MAX_FOLIO_ORDER on some ppc64 configs with hugetlb
>> 
>> In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support
>> runtime allocation of gigantic hugetlb folios. In the meantime it evolved
>> into a generic way for the architecture to state that it supports
>> gigantic hugetlb folios.
>> 
>> In commit fae7d834c43c ("mm: add __dump_folio()") we started using
>> CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could
>> have folios larger than what the buddy can handle. In the context of
>> that commit, we started using MAX_FOLIO_ORDER to detect page corruptions
>> when dumping tail pages of folios. Before that commit, we assumed that
>> we cannot have folios larger than the highest buddy order, which was
>> obviously wrong.
>> 
>> In commit 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
>> when registering hstate"), we used MAX_FOLIO_ORDER to detect
>> inconsistencies, and in fact, we found some now.
>> 
>> Powerpc allows for configs that can allocate gigantic folios during boot
>> (not at runtime), that do not set CONFIG_ARCH_HAS_GIGANTIC_PAGE and can
>> exceed PUD_ORDER.
>> 
>> To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE for
>> all 64bit configs, and increase the maximum folio size to P4D_ORDER.
>> 
>> Ideally, we'd have a better way to obtain a maximum value. But this should
>> be good enough for now to fix the issue and now mostly states "no real folio
>> size limit".
>> 
>> While at it, handle gigantic DAX folios more clearly: DAX can only
>> end up creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
>> 
>> Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases
>> clearer. In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with
>> HUGETLB_PAGE.
>> 
>> Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on PPC64, we will now
>> also allow for runtime allocations of folios in some more powerpc configs.
>> I don't think this is a problem, but if it is we could handle it through
>> __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.
>> 
>> While __dump_page()/__dump_folio() was also problematic (not handling dumping
>> of tail pages of such gigantic folios correctly), it doesn't seem
>> critical enough to mark it as a fix.
>> 
>> Fixes: 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes 
>> when registering hstate")
>> Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org>
>> ---
>>   arch/powerpc/Kconfig                   | 1 +
>>   arch/powerpc/platforms/Kconfig.cputype | 1 -
>>   include/linux/mm.h                     | 4 ++--
>>   include/linux/pgtable.h                | 1 +
>>   mm/Kconfig                             | 7 +++++++
>>   5 files changed, 11 insertions(+), 3 deletions(-)
>> 
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index e24f4d88885ae..55c3626c86273 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -137,6 +137,7 @@ config PPC
>>       select ARCH_HAS_DMA_OPS            if PPC64
>>       select ARCH_HAS_FORTIFY_SOURCE
>>       select ARCH_HAS_GCOV_PROFILE_ALL
>> +    select ARCH_HAS_GIGANTIC_PAGE        if PPC64


The patch looks good from a PPC64 perspective; it also fixes the problem
reported on corenet64_smp_defconfig.

>
> Problem is not only on PPC64, it is on PPC32 as well, for instance 
> corenet32_smp_defconfig has the problem as well.
>

However, looking deeper into it, I agree with Christophe that this
problem still exists on PPC32.

I tried the patch on corenet32_smp_defconfig and I can see the WARN_ON
still triggering. The logs are here:

https://github.com/riteshharjani/linux-ci/actions/runs/19169468405/job/54799498288


>
> So I think what you want instead is:
>
> diff --git a/arch/powerpc/platforms/Kconfig.cputype 
> b/arch/powerpc/platforms/Kconfig.cputype
> index 7b527d18aa5ee..1f5a1e587740c 100644
> --- a/arch/powerpc/platforms/Kconfig.cputype
> +++ b/arch/powerpc/platforms/Kconfig.cputype
> @@ -276,6 +276,7 @@ config PPC_E500
>          select FSL_EMB_PERFMON
>          bool
>          select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
> +       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>          select PPC_SMP_MUXED_IPI
>          select PPC_DOORBELL
>          select PPC_KUEP
>
>
>

@Christophe, 

I don't think even the above diff will fix the warning on PPC32. 
The patch defines MAX_FOLIO_ORDER as P4D_ORDER...

+#define MAX_FOLIO_ORDER        P4D_ORDER
+#define P4D_ORDER              (P4D_SHIFT - PAGE_SHIFT)

and for ppc32 in.. 
include/asm-generic/pgtable-nop4d.h
    #define P4D_SHIFT		PGDIR_SHIFT

Then in.. 
arch/powerpc/include/asm/nohash/32/pgtable.h
    #define PGDIR_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)
    #define PTE_INDEX_SIZE	PTE_SHIFT

in...
arch/powerpc/include/asm/page_32.h
    #define PTE_SHIFT	(PAGE_SHIFT - PTE_T_LOG2)	/* full page */

    #define PTE_T_LOG2	(__builtin_ffs(sizeof(pte_t)) - 1)


So, following the chain above, P4D_ORDER comes down to PTE_INDEX_SIZE.

IIUC, that makes MAX_FOLIO_ORDER 9 on the e500mc machine type, right?

Can you please confirm whether the above analysis looks correct to you?
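
A minimal sketch of the macro chain above, in case it helps to check the
arithmetic outside the kernel. It is not kernel code: the helper names are
made up for illustration, PAGE_SHIFT is assumed to be 12 (4K pages), and
sizeof(pte_t) is passed in as a plain byte count (8 is an assumption for a
config with 64-bit PTEs, 4 for 32-bit PTEs).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4K base pages. */
#define PAGE_SHIFT 12

/* Mirrors PTE_T_LOG2: log2 of sizeof(pte_t), given here as a byte count. */
static int pte_t_log2(size_t pte_size)
{
	return __builtin_ffs((int)pte_size) - 1;
}

/* Mirrors PTE_SHIFT (and PTE_INDEX_SIZE): PAGE_SHIFT - PTE_T_LOG2. */
static int pte_shift(size_t pte_size)
{
	return PAGE_SHIFT - pte_t_log2(pte_size);
}

/* Mirrors PGDIR_SHIFT; with pgtable-nop4d.h, P4D_SHIFT is the same value. */
static int p4d_shift(size_t pte_size)
{
	return PAGE_SHIFT + pte_shift(pte_size);
}

/* Mirrors P4D_ORDER: P4D_SHIFT - PAGE_SHIFT, i.e. PTE_INDEX_SIZE. */
static int p4d_order(size_t pte_size)
{
	return p4d_shift(pte_size) - PAGE_SHIFT;
}

/* Folio order of a power-of-two huge page size given in bytes. */
static int hugepage_order(uint64_t size)
{
	return __builtin_ctzll(size) - PAGE_SHIFT;
}
```

With an 8-byte pte_t, p4d_order(8) evaluates to 9. A 64M huge page, as an
example size, is hugepage_order(64ULL << 20), i.e. order 14, so it would
still exceed a MAX_FOLIO_ORDER of 9.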

-ritesh



>>       select ARCH_HAS_KCOV
>>       select ARCH_HAS_KERNEL_FPU_SUPPORT    if PPC64 && PPC_FPU
>>       select ARCH_HAS_MEMBARRIER_CALLBACKS
>> diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/ 
>> platforms/Kconfig.cputype
>> index 7b527d18aa5ee..4c321a8ea8965 100644
>> --- a/arch/powerpc/platforms/Kconfig.cputype
>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>> @@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU
>>   config PPC_RADIX_MMU
>>       bool "Radix MMU Support"
>>       depends on PPC_BOOK3S_64
>> -    select ARCH_HAS_GIGANTIC_PAGE
>
> Should remain I think.
>
>>       default y
>>       help
>>         Enable support for the Power ISA 3.0 Radix style MMU. Currently 
>> this
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index d16b33bacc32b..4842edc875185 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -2074,7 +2074,7 @@ static inline unsigned long folio_nr_pages(const 
>> struct folio *folio)
>>       return folio_large_nr_pages(folio);
>>   }
>> 
>> -#if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE)
>> +#if !defined(CONFIG_HAVE_GIGANTIC_FOLIOS)
>>   /*
>>    * We don't expect any folios that exceed buddy sizes (and consequently
>>    * memory sections).
>> @@ -2092,7 +2092,7 @@ static inline unsigned long folio_nr_pages(const 
>> struct folio *folio)
>>    * There is no real limit on the folio size. We limit them to the 
>> maximum we
>>    * currently expect (e.g., hugetlb, dax).
>>    */
>> -#define MAX_FOLIO_ORDER        PUD_ORDER
>> +#define MAX_FOLIO_ORDER        P4D_ORDER
>>   #endif
>> 
>>   #define MAX_FOLIO_NR_PAGES    (1UL << MAX_FOLIO_ORDER)
>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>> index 32e8457ad5352..09fc3c2ba39e2 100644
>> --- a/include/linux/pgtable.h
>> +++ b/include/linux/pgtable.h
>> @@ -7,6 +7,7 @@
>> 
>>   #define PMD_ORDER    (PMD_SHIFT - PAGE_SHIFT)
>>   #define PUD_ORDER    (PUD_SHIFT - PAGE_SHIFT)
>> +#define P4D_ORDER    (P4D_SHIFT - PAGE_SHIFT)
>> 
>>   #ifndef __ASSEMBLY__
>>   #ifdef CONFIG_MMU
>> diff --git a/mm/Kconfig b/mm/Kconfig
>> index 0e26f4fc8717b..ca3f146bc7053 100644
>> --- a/mm/Kconfig
>> +++ b/mm/Kconfig
>> @@ -908,6 +908,13 @@ config PAGE_MAPCOUNT
>>   config PGTABLE_HAS_HUGE_LEAVES
>>       def_bool TRANSPARENT_HUGEPAGE || HUGETLB_PAGE
>> 
>> +#
>> +# We can end up creating gigantic folio.
>> +#
>> +config HAVE_GIGANTIC_FOLIOS
>> +    def_bool (HUGETLB_PAGE && ARCH_HAS_GIGANTIC_PAGE) || \
>> +         (ZONE_DEVICE && HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
>> +
>>   # TODO: Allow to be enabled without THP
>>   config ARCH_SUPPORTS_HUGE_PFNMAP
>>       def_bool n


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-07 14:37         ` Ritesh Harjani
@ 2025-11-07 16:11           ` Christophe Leroy
  2025-11-10 10:10           ` David Hildenbrand (Red Hat)
  1 sibling, 0 replies; 21+ messages in thread
From: Christophe Leroy @ 2025-11-07 16:11 UTC (permalink / raw)
  To: Ritesh Harjani (IBM), David Hildenbrand (Red Hat), Sourabh Jain,
	Madhavan Srinivasan, linuxppc-dev
  Cc: Donet Tom, Andrew Morton



Le 07/11/2025 à 15:37, Ritesh Harjani a écrit :
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> 
>> Le 06/11/2025 à 16:02, David Hildenbrand (Red Hat) a écrit :
>>>>> Yes, we discussed that in [1].
>>>>>
>>>>> We'll have to set ARCH_HAS_GIGANTIC_PAGE on ppc and increase
>>>>> MAX_FOLIO_ORDER, because apparently, there might be ppc configs that
>>>>> have even larger hugetlb sizes than PUDs.
>>>>>
>>>>> @Christophe, I was under the impression that you would send a fix. Do you
>>>>> want me to prepare something and send it out?
>>>>
>>>> Indeed I would have liked to better understand the implications of all
>>>> this, but I didn't have the time.
>>>
>>> Indeed, took me longer than it should have to understand and make up my
>>> mind as well.
>>>
>>>>
>>>> By the way, you would describe the fix better than me so yes if you can
>>>> prepare and send a fix please do.
>>>
>>> I just crafted the following. I yet have to test it more, some early
>>> feedback+testing would be appreciated!
>>>
>>>   From 274928854644c49c92515f8675c090dba15a0db6 Mon Sep 17 00:00:00 2001
>>> From: "David Hildenbrand (Red Hat)" <david@kernel.org>
>>> Date: Thu, 6 Nov 2025 11:31:45 +0100
>>> Subject: [PATCH] mm: fix MAX_FOLIO_ORDER on some ppc64 configs with hugetlb
>>>
>>> In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support
>>> runtime allocation of gigantic hugetlb folios. In the meantime it evolved
>>> into a generic way for the architecture to state that it supports
>>> gigantic hugetlb folios.
>>>
>>> In commit fae7d834c43c ("mm: add __dump_folio()") we started using
>>> CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could
>>> have folios larger than what the buddy can handle. In the context of
>>> that commit, we started using MAX_FOLIO_ORDER to detect page corruptions
>>> when dumping tail pages of folios. Before that commit, we assumed that
>>> we cannot have folios larger than the highest buddy order, which was
>>> obviously wrong.
>>>
>>> In commit 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
>>> when registering hstate"), we used MAX_FOLIO_ORDER to detect
>>> inconsistencies, and in fact, we found some now.
>>>
>>> Powerpc allows for configs that can allocate gigantic folios during boot
>>> (not at runtime), that do not set CONFIG_ARCH_HAS_GIGANTIC_PAGE and can
>>> exceed PUD_ORDER.
>>>
>>> To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE for
>>> all 64bit configs, and increase the maximum folio size to P4D_ORDER.
>>>
>>> Ideally, we'd have a better way to obtain a maximum value. But this should
>>> be good enough for now to fix the issue and now mostly states "no real folio
>>> size limit".
>>>
>>> While at it, handle gigantic DAX folios more clearly: DAX can only
>>> end up creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
>>>
>>> Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases
>>> clearer. In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with
>>> HUGETLB_PAGE.
>>>
>>> Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on PPC64, we will now
>>> also allow for runtime allocations of folios in some more powerpc configs.
>>> I don't think this is a problem, but if it is we could handle it through
>>> __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.
>>>
>>> While __dump_page()/__dump_folio() was also problematic (not handling dumping
>>> of tail pages of such gigantic folios correctly), it doesn't seem
>>> critical enough to mark it as a fix.
>>>
>>> Fixes: 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
>>> when registering hstate")
>>> Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org>
>>> ---
>>>    arch/powerpc/Kconfig                   | 1 +
>>>    arch/powerpc/platforms/Kconfig.cputype | 1 -
>>>    include/linux/mm.h                     | 4 ++--
>>>    include/linux/pgtable.h                | 1 +
>>>    mm/Kconfig                             | 7 +++++++
>>>    5 files changed, 11 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>>> index e24f4d88885ae..55c3626c86273 100644
>>> --- a/arch/powerpc/Kconfig
>>> +++ b/arch/powerpc/Kconfig
>>> @@ -137,6 +137,7 @@ config PPC
>>>        select ARCH_HAS_DMA_OPS            if PPC64
>>>        select ARCH_HAS_FORTIFY_SOURCE
>>>        select ARCH_HAS_GCOV_PROFILE_ALL
>>> +    select ARCH_HAS_GIGANTIC_PAGE        if PPC64
> 
> 
> The patch looks good from PPC64 perspective, it also fixes the problem
> reported on corenet64_smp_defconfig...
> 
>>
>> Problem is not only on PPC64, it is on PPC32 as well, for instance
>> corenet32_smp_defconfig has the problem as well.
>>
> 
> However on looking deeper into it - I agree with Christophe that this
> problem might still exist on PPC32.
> 
> I did try the patch on corenet32_smp_defconfig and I can see the WARN_ON
> still triggering. You can check the logs here..
> 
> https://github.com/riteshharjani/linux-ci/actions/runs/19169468405/job/54799498288
> 
> 
>>
>> So I think what you want instead is:
>>
>> diff --git a/arch/powerpc/platforms/Kconfig.cputype
>> b/arch/powerpc/platforms/Kconfig.cputype
>> index 7b527d18aa5ee..1f5a1e587740c 100644
>> --- a/arch/powerpc/platforms/Kconfig.cputype
>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>> @@ -276,6 +276,7 @@ config PPC_E500
>>           select FSL_EMB_PERFMON
>>           bool
>>           select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
>> +       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>>           select PPC_SMP_MUXED_IPI
>>           select PPC_DOORBELL
>>           select PPC_KUEP
>>
>>
>>
> 
> @Christophe,
> 
> I don't think even the above diff will fix the warning on PPC32.
> The patch defines MAX_FOLIO_ORDER as P4D_ORDER...
> 
> +#define MAX_FOLIO_ORDER        P4D_ORDER
> +#define P4D_ORDER              (P4D_SHIFT - PAGE_SHIFT)
> 
> and for ppc32 in..
> include/asm-generic/pgtable-nop4d.h
>      #define P4D_SHIFT		PGDIR_SHIFT
> 
> Then in..
> arch/powerpc/include/asm/nohash/32/pgtable.h
>      #define PGDIR_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)
>      #define PTE_INDEX_SIZE	PTE_SHIFT
> 
> in...
> arch/powerpc/include/asm/page_32.h
>      #define PTE_SHIFT	(PAGE_SHIFT - PTE_T_LOG2)	/* full page */
> 
>      #define PTE_T_LOG2	(__builtin_ffs(sizeof(pte_t)) - 1)
> 
> 
> So if you see from above P4D_ORDER is coming down to PTE_INDEX_SIZE
> 
> IIUC, that will cause MAX_FOLIO_ORDER to be 9 in case of e500mc machine type right?
> 
> Can you please confirm if the above analysis looks correct to you?
> 

Ah, you are right, that's not enough. I was thinking that PGDIR_ORDER was
the highest possible value ever, but in fact it is not. PGDIR_SIZE is 4 Mbytes,
so any huge page larger than that still triggers the warning. Here are the
warnings I get on QEMU with corenet32_smp_defconfig:

random: crng init done
Hash pointers mode set to never.
Memory CAM mapping: 16/16/16/16/64/64/64/256/256 Mb, residual: 256Mb
Activating Kernel Userspace Access Protection
Activating Kernel Userspace Execution Prevention
Linux version 6.18.0-rc4-00026-g274928854644-dirty 
(chleroy@PO20335.IDSI0.si.c-s.fr) (powerpc64-linux-gcc (GCC) 8.5.0, GNU 
ld (GNU Binutils) 2.36.1) #1706 SMP Fri Nov  7 17:01:26 CET 2025
OF: reserved mem: Reserved memory: No reserved-memory node in the DT
Found initrd at 0xc4000000:0xc41d1a3b
Hardware name: QEMU ppce500 e5500 0x80240020 QEMU e500
printk: legacy bootconsole [udbg0] enabled
CPU maps initialized for 1 thread per core
-----------------------------------------------------
phys_mem_size     = 0x40000000
dcache_bsize      = 0x40
icache_bsize      = 0x40
cpu_features      = 0x0000000000000194
   possible        = 0x000000000001039c
   always          = 0x0000000000000100
cpu_user_features = 0x8c008000 0x08000000
mmu_features      = 0x000a0210
-----------------------------------------------------
qemu_e500_setup_arch()
barrier-nospec: using isync; sync as speculation barrier
Zone ranges:
   Normal   [mem 0x0000000000000000-0x000000002fffffff]
   HighMem  [mem 0x0000000030000000-0x000000003fffffff]
Movable zone start for each node
Early memory node ranges
   node   0: [mem 0x0000000000000000-0x000000003fffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x000000003fffffff]
MMU: Allocated 1088 bytes of context maps for 255 contexts
percpu: Embedded 17 pages/cpu s39436 r8192 d22004 u69632
Kernel command line: hugepagesz=1g hugepages=1 hugepagesz=64m 
hugepages=1 hugepagesz=256m hugepages=1 noreboot no_hash_pointers
Unknown kernel command line parameters "noreboot", will be passed to 
user space.
printk: log buffer data + meta data: 16384 + 51200 = 67584 bytes
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at mm/hugetlb.c:4753 hugetlb_add_hstate+0x170/0x178
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 
6.18.0-rc4-00026-g274928854644-dirty #1706 NONE
Hardware name: QEMU ppce500 e5500 0x80240020 QEMU e500
NIP:  c2027d88 LR: c2028548 CTR: 00000003
REGS: c24b5e10 TRAP: 0700   Not tainted 
(6.18.0-rc4-00026-g274928854644-dirty)
MSR:  00021002 <CE,ME>  CR: 24000424  XER: 20000000

GPR00: c2028548 c24b5f00 c23b5580 00000012 40000000 c24b5ee0 c24c0000 
00000000
GPR08: c24f7934 00001000 c23b74cc 00000000 c24f5070 02089cc0 00000000 
00000000
GPR16: 00000000 00000000 00000000 00000000 c0000000 00000000 ef7e7fc0 
c23b5008
GPR24: 00000000 c10937d4 0000000a c205b000 00000000 c24f7934 40000000 
c24f7934
NIP [c2027d88] hugetlb_add_hstate+0x170/0x178
LR [c2028548] hugepagesz_setup+0xb0/0x16c
Call Trace:
[c24b5f00] [c0d3dc6c] memparse+0x2c/0x104 (unreliable)
[c24b5f30] [c2028548] hugepagesz_setup+0xb0/0x16c
[c24b5f50] [c2028d18] hugetlb_bootmem_alloc+0x7c/0x194
[c24b5f80] [c2021628] mm_core_init+0x30/0x368
[c24b5fa0] [c2000ec0] start_kernel+0x2f4/0x7bc
[c24b5ff0] [c0000478] set_ivor+0x150/0x18c
Code: 60000000 60000000 2f8a0000 41deff28 83810020 83a10024 83c10028 
83e1002c 38210030 4e800020 0fe00000 0fe00000 <0fe00000> 4bffff24 
3d20c24d 9421ff60
---[ end trace 0000000000000000 ]---
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at mm/hugetlb.c:4753 hugetlb_add_hstate+0x170/0x178
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper Tainted: G        W 
6.18.0-rc4-00026-g274928854644-dirty #1706 NONE
Tainted: [W]=WARN
Hardware name: QEMU ppce500 e5500 0x80240020 QEMU e500
NIP:  c2027d88 LR: c2028548 CTR: 00000005
REGS: c24b5e10 TRAP: 0700   Tainted: G        W 
(6.18.0-rc4-00026-g274928854644-dirty)
MSR:  00021002 <CE,ME>  CR: 24000224  XER: 20000000

GPR00: c2028548 c24b5f00 c23b5580 0000000e 04000000 c24b5ee0 c24c0000 
00000001
GPR08: 00001000 40000000 c24f79b4 00000000 24000424 02089cc0 00000000 
00000000
GPR16: 00000000 00000000 00000000 00000000 c0000000 00000000 ef7e7fc0 
c23b5008
GPR24: 00000000 c10937d4 0000000a c205b000 00000080 c24f7934 04000000 
c24f79b4
NIP [c2027d88] hugetlb_add_hstate+0x170/0x178
LR [c2028548] hugepagesz_setup+0xb0/0x16c
Call Trace:
[c24b5f00] [c0d3dc6c] memparse+0x2c/0x104 (unreliable)
[c24b5f30] [c2028548] hugepagesz_setup+0xb0/0x16c
[c24b5f50] [c2028d18] hugetlb_bootmem_alloc+0x7c/0x194
[c24b5f80] [c2021628] mm_core_init+0x30/0x368
[c24b5fa0] [c2000ec0] start_kernel+0x2f4/0x7bc
[c24b5ff0] [c0000478] set_ivor+0x150/0x18c
Code: 60000000 60000000 2f8a0000 41deff28 83810020 83a10024 83c10028 
83e1002c 38210030 4e800020 0fe00000 0fe00000 <0fe00000> 4bffff24 
3d20c24d 9421ff60
---[ end trace 0000000000000000 ]---
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at mm/hugetlb.c:4753 hugetlb_add_hstate+0x170/0x178
Modules linked in:
CPU: 0 UID: 0 PID: 0 Comm: swapper Tainted: G        W 
6.18.0-rc4-00026-g274928854644-dirty #1706 NONE
Tainted: [W]=WARN
Hardware name: QEMU ppce500 e5500 0x80240020 QEMU e500
NIP:  c2027d88 LR: c2028548 CTR: 00000004
REGS: c24b5e10 TRAP: 0700   Tainted: G        W 
(6.18.0-rc4-00026-g274928854644-dirty)
MSR:  00021002 <CE,ME>  CR: 24000224  XER: 20000000

GPR00: c2028548 c24b5f00 c23b5580 00000010 10000000 c24b5ee0 c24c0000 
00000002
GPR08: 00001000 04000000 c24f7a34 00000000 24000224 02089cc0 00000000 
00000000
GPR16: 00000000 00000000 00000000 00000000 c0000000 00000000 ef7e7fc0 
c23b5008
GPR24: 00000000 c10937d4 0000000a c205b000 00000100 c24f7934 10000000 
c24f7a34
NIP [c2027d88] hugetlb_add_hstate+0x170/0x178
LR [c2028548] hugepagesz_setup+0xb0/0x16c
Call Trace:
[c24b5f00] [c0d3dc6c] memparse+0x2c/0x104 (unreliable)
[c24b5f30] [c2028548] hugepagesz_setup+0xb0/0x16c
[c24b5f50] [c2028d18] hugetlb_bootmem_alloc+0x7c/0x194
[c24b5f80] [c2021628] mm_core_init+0x30/0x368
[c24b5fa0] [c2000ec0] start_kernel+0x2f4/0x7bc
[c24b5ff0] [c0000478] set_ivor+0x150/0x18c
Code: 60000000 60000000 2f8a0000 41deff28 83810020 83a10024 83c10028 
83e1002c 38210030 4e800020 0fe00000 0fe00000 <0fe00000> 4bffff24 
3d20c24d 9421ff60
---[ end trace 0000000000000000 ]---
HugeTLB: allocating 1 of page size 1.00 GiB failed.  Only allocated 0 hugepages.
Built 1 zonelists, mobility grouping on.  Total pages: 262144
mem auto-init: stack:off, heap alloc:off, heap free:off
**********************************************************
**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
**                                                      **
** This system shows unhashed kernel memory addresses   **
** via the console, logs, and other interfaces. This    **
** might reduce the security of your system.            **
**                                                      **
** If you see this message and you are not debugging    **
** the kernel, report this immediately to your system   **
** administrator!                                       **
**                                                      **
** Use hash_pointers=always to force this mode off      **
**                                                      **
**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
**********************************************************
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
rcu: Hierarchical RCU implementation.
rcu: 	RCU event tracing is enabled.
rcu: 	RCU restricting CPUs from NR_CPUS=24 to nr_cpu_ids=2.
	Tracing variant of Tasks RCU enabled.
rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
RCU Tasks Trace: Setting shift to 1 and lim to 1 rcu_task_cb_adjust=1 
rcu_task_cpu_ids=2.
NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
mpic: Setting up MPIC " OpenPIC  " version 1.2 at fe0040000, max 2 CPUs
mpic: ISU size: 256, shift: 8, mask: ff
mpic: Initializing for 256 sources
rcu: srcu_init: Setting srcu_struct sizes based on contention.
clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 
0x5c4093a7d1, max_idle_ns: 440795210635 ns
clocksource: timebase mult[2800000] shift[24] registered
Console: colour dummy device 80x25
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
e500 family performance monitor hardware support registered
rcu: Hierarchical SRCU implementation.
rcu: 	Max phase no-delay instances is 1000.
Timer migration: 1 hierarchy levels; 8 children per group; 1 crossnode level
smp: Bringing up secondary CPUs ...
Activating Kernel Userspace Access Protection
smp: Brought up 1 node, 2 CPUs
Memory: 668416K/1048576K available (13796K kernel code, 1160K rwdata, 
18972K rodata, 3796K init, 245K bss, 377736K reserved, 0K cma-reserved, 
262144K highmem)
devtmpfs: initialized
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, 
max_idle_ns: 7645041785100000 ns
posixtimers hash table entries: 1024 (order: 1, 8192 bytes, linear)
futex hash table entries: 512 (32768 bytes on 1 NUMA nodes, total 32 
KiB, linear).
Machine: QEMU ppce500
SoC family: QorIQ
SoC ID: svr:0x00000000, Revision: 0.0
NET: Registered PF_NETLINK/PF_ROUTE protocol family
audit: initializing netlink subsys (disabled)
audit: type=2000 audit(0.224:1): state=initialized audit_enabled=0 res=1

------------[ cut here ]------------
WARNING: CPU: 1 PID: 1 at mm/hugetlb.c:4753 hugetlb_add_hstate+0x170/0x178
Modules linked in:
CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Tainted: G        W 
6.18.0-rc4-00026-g274928854644-dirty #1706 NONE
Tainted: [W]=WARN
Hardware name: QEMU ppce500 e5500 0x80240020 QEMU e500
NIP:  c2027d88 LR: c200bab4 CTR: 00000008
REGS: c50cdce0 TRAP: 0700   Tainted: G        W 
(6.18.0-rc4-00026-g274928854644-dirty)
MSR:  00029002 <CE,EE,ME>  CR: 48000824  XER: 20000000

GPR00: c200bab4 c50cddd0 c50f0000 0000000a 00029002 0000002c c24c0000 
00000003
GPR08: 00001000 10000000 c24f7ab4 00000004 44000828 00000000 c00043b8 
00000000
GPR16: 00000000 00000000 00000000 00000000 00000000 c24be774 c205c024 
c20000c4
GPR24: c5138a80 00000000 c23b74dc 00000000 00000180 c24f7934 00400000 
c24f7ab4
NIP [c2027d88] hugetlb_add_hstate+0x170/0x178
LR [c200bab4] hugetlbpage_init+0x98/0x118
Call Trace:
[c50cddd0] [c0017de0] udbg_uart_putc+0x48/0x94 (unreliable)
[c50cde00] [c200bab4] hugetlbpage_init+0x98/0x118
[c50cde30] [c0004130] do_one_initcall+0x58/0x228
[c50cdea0] [c200162c] kernel_init_freeable+0x224/0x3c8
[c50cdee0] [c00043d8] kernel_init+0x20/0x148
[c50cdf00] [c0015224] ret_from_kernel_user_thread+0x10/0x18
---- interrupt: 0 at 0x0
Code: 60000000 60000000 2f8a0000 41deff28 83810020 83a10024 83c10028 
83e1002c 38210030 4e800020 0fe00000 0fe00000 <0fe00000> 4bffff24 
3d20c24d 9421ff60
---[ end trace 0000000000000000 ]---
------------[ cut here ]------------
WARNING: CPU: 1 PID: 1 at mm/hugetlb.c:4753 hugetlb_add_hstate+0x170/0x178
Modules linked in:
CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Tainted: G        W 
6.18.0-rc4-00026-g274928854644-dirty #1706 NONE
Tainted: [W]=WARN
Hardware name: QEMU ppce500 e5500 0x80240020 QEMU e500
NIP:  c2027d88 LR: c200bab4 CTR: 00000006
REGS: c50cdce0 TRAP: 0700   Tainted: G        W 
(6.18.0-rc4-00026-g274928854644-dirty)
MSR:  00029002 <CE,EE,ME>  CR: 44000224  XER: 20000000

GPR00: c200bab4 c50cddd0 c50f0000 0000000c c100e27a 00000002 c24c0000 
00000004
GPR08: 00001000 00400000 c24f7b34 ffffffff 48000824 00000000 c00043b8 
00000000
GPR16: 00000000 00000000 00000000 00000000 00000000 c24be774 c205c024 
c20000c4
GPR24: c5138a80 00000001 c23b74dc 00000000 00000200 c24f7934 01000000 
c24f7b34
NIP [c2027d88] hugetlb_add_hstate+0x170/0x178
LR [c200bab4] hugetlbpage_init+0x98/0x118
Call Trace:
[c50cddd0] [c2027d2c] hugetlb_add_hstate+0x114/0x178 (unreliable)
[c50cde00] [c200bab4] hugetlbpage_init+0x98/0x118
[c50cde30] [c0004130] do_one_initcall+0x58/0x228
[c50cdea0] [c200162c] kernel_init_freeable+0x224/0x3c8
[c50cdee0] [c00043d8] kernel_init+0x20/0x148
[c50cdf00] [c0015224] ret_from_kernel_user_thread+0x10/0x18
---- interrupt: 0 at 0x0
Code: 60000000 60000000 2f8a0000 41deff28 83810020 83a10024 83c10028 
83e1002c 38210030 4e800020 0fe00000 0fe00000 <0fe00000> 4bffff24 
3d20c24d 9421ff60
---[ end trace 0000000000000000 ]---
Found FSL PCI host bridge at 0x0000000fe0008000. Firmware bus number: 0->255
PCI host bridge /pci@fe0008000 (primary) ranges:
  MEM 0x0000000c00000000..0x0000000c1fffffff -> 0x00000000e0000000
   IO 0x0000000fe1000000..0x0000000fe100ffff -> 0x0000000000000000
/pci@fe0008000: PCICSRBAR @ 0xdff00000
setup_pci_atmu: end of DRAM 40000000
fsl-pamu: fsl_pamu_init: could not find a PAMU node
PCI: Probing PCI hardware
fsl-pci fe0008000.pci: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
pci_bus 0000:00: root bus resource [mem 0xc00000000-0xc1fffffff] (bus 
address [0xe0000000-0xffffffff])
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: busn_res: [bus 00-ff] end is updated to ff
pci 0000:00:00.0: [1957:0030] type 00 class 0x0b2000 conventional PCI 
endpoint
pci 0000:00:00.0: BAR 0 [mem 0xdff00000-0xdfffffff]
pci 0000:00:01.0: [8086:100e] type 00 class 0x020000 conventional PCI 
endpoint
pci 0000:00:01.0: BAR 0 [mem 0x00000000-0x0001ffff]
pci 0000:00:01.0: BAR 1 [io  0x0000-0x003f]
pci 0000:00:01.0: ROM [mem 0x00000000-0x0003ffff pref]
pci_bus 0000:00: busn_res: [bus 00-ff] end is updated to 00
pci 0000:00:01.0: ROM [mem 0xc00000000-0xc0003ffff pref]: assigned
pci 0000:00:01.0: BAR 0 [mem 0xc00040000-0xc0005ffff]: assigned
pci 0000:00:01.0: BAR 1 [io  0x1000-0x103f]: assigned
pci_bus 0000:00: resource 4 [io  0x0000-0xffff]
pci_bus 0000:00: resource 5 [mem 0xc00000000-0xc1fffffff]
HugeTLB: registered 1.00 GiB page size, pre-allocated 0 pages
HugeTLB: 0 KiB vmemmap can be freed for a 1.00 GiB page
HugeTLB: registered 64.0 MiB page size, pre-allocated 1 pages
HugeTLB: 0 KiB vmemmap can be freed for a 64.0 MiB page
HugeTLB: registered 256 MiB page size, pre-allocated 1 pages
HugeTLB: 0 KiB vmemmap can be freed for a 256 MiB page
HugeTLB: registered 4.00 MiB page size, pre-allocated 0 pages
HugeTLB: 0 KiB vmemmap can be freed for a 4.00 MiB page
HugeTLB: registered 16.0 MiB page size, pre-allocated 0 pages
HugeTLB: 0 KiB vmemmap can be freed for a 16.0 MiB page


Christophe


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-07 14:37         ` Ritesh Harjani
  2025-11-07 16:11           ` Christophe Leroy
@ 2025-11-10 10:10           ` David Hildenbrand (Red Hat)
  2025-11-10 10:33             ` Christophe Leroy
  1 sibling, 1 reply; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-10 10:10 UTC (permalink / raw)
  To: Ritesh Harjani (IBM), Christophe Leroy, Sourabh Jain,
	Madhavan Srinivasan, linuxppc-dev
  Cc: Donet Tom, Andrew Morton

[fighting with mail transitioning, for some reason I did not receive
the mails from Christophe, so replying here]

>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>>> index e24f4d88885ae..55c3626c86273 100644
>>> --- a/arch/powerpc/Kconfig
>>> +++ b/arch/powerpc/Kconfig
>>> @@ -137,6 +137,7 @@ config PPC
>>>        select ARCH_HAS_DMA_OPS            if PPC64
>>>        select ARCH_HAS_FORTIFY_SOURCE
>>>        select ARCH_HAS_GCOV_PROFILE_ALL
>>> +    select ARCH_HAS_GIGANTIC_PAGE        if PPC64
> 
> 
> The patch looks good from PPC64 perspective, it also fixes the problem
> reported on corenet64_smp_defconfig...
> 
>>
>> Problem is not only on PPC64, it is on PPC32 as well, for instance
>> corenet32_smp_defconfig has the problem as well.
>>
> 
> However on looking deeper into it - I agree with Christophe that this
> problem might still exist on PPC32.

Ah, I missed that. I thought it would be a ppc64 thing. :(

> 
> I did try the patch on corenet32_smp_defconfig and I can see the WARN_ON
> still triggering. You can check the logs here..
> 
> https://github.com/riteshharjani/linux-ci/actions/runs/19169468405/job/54799498288
> 
> 
>>
>> So I think what you want instead is:
>>
>> diff --git a/arch/powerpc/platforms/Kconfig.cputype
>> b/arch/powerpc/platforms/Kconfig.cputype
>> index 7b527d18aa5ee..1f5a1e587740c 100644
>> --- a/arch/powerpc/platforms/Kconfig.cputype
>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>> @@ -276,6 +276,7 @@ config PPC_E500
>>           select FSL_EMB_PERFMON
>>           bool
>>           select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
>> +       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>>           select PPC_SMP_MUXED_IPI
>>           select PPC_DOORBELL
>>           select PPC_KUEP
>>
>>
>>
> 
> @Christophe,
> 
> I don't think even the above diff will fix the warning on PPC32.
> The patch defines MAX_FOLIO_ORDER as P4D_ORDER...
> 
> +#define MAX_FOLIO_ORDER        P4D_ORDER
> +#define P4D_ORDER              (P4D_SHIFT - PAGE_SHIFT)
> 
> and for ppc32 in..
> include/asm-generic/pgtable-nop4d.h
>      #define P4D_SHIFT		PGDIR_SHIFT
> 
> Then in..
> arch/powerpc/include/asm/nohash/32/pgtable.h
>      #define PGDIR_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)
>      #define PTE_INDEX_SIZE	PTE_SHIFT
> 
> in...
> arch/powerpc/include/asm/page_32.h
>      #define PTE_SHIFT	(PAGE_SHIFT - PTE_T_LOG2)	/* full page */
> 
>      #define PTE_T_LOG2	(__builtin_ffs(sizeof(pte_t)) - 1)
> 
> 
> So if you see from above P4D_ORDER is coming down to PTE_INDEX_SIZE
> 
> IIUC, that will cause MAX_FOLIO_ORDER to be 9 in case of e500mc machine type right?
> 
> Can you please confirm if the above analysis looks correct to you?

Christophe wrote:

"
Ah you are right, that's not enough. I was thinking that PGDIR_ORDER was
the highest possible value ever but in fact not. PGDIR_SIZE is 4Mbytes
so any page larger than that still triggers the warning. Here are the
warnings I get on QEMU with corenet32_smp_defconfig
"

And then we get

HugeTLB: registered 1.00 GiB page size, pre-allocated 0 pages
HugeTLB: 0 KiB vmemmap can be freed for a 1.00 GiB page
HugeTLB: registered 64.0 MiB page size, pre-allocated 1 pages
HugeTLB: 0 KiB vmemmap can be freed for a 64.0 MiB page
HugeTLB: registered 256 MiB page size, pre-allocated 1 pages
HugeTLB: 0 KiB vmemmap can be freed for a 256 MiB page
HugeTLB: registered 4.00 MiB page size, pre-allocated 0 pages
HugeTLB: 0 KiB vmemmap can be freed for a 4.00 MiB page
HugeTLB: registered 16.0 MiB page size, pre-allocated 0 pages
HugeTLB: 0 KiB vmemmap can be freed for a 16.0 MiB page


How could any of these larger sizes possibly ever get mapped into a page 
table on 32bit? I'm probably missing something important :)

-- 
Cheers

David


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-10 10:10           ` David Hildenbrand (Red Hat)
@ 2025-11-10 10:33             ` Christophe Leroy
  2025-11-10 11:04               ` David Hildenbrand (Red Hat)
  0 siblings, 1 reply; 21+ messages in thread
From: Christophe Leroy @ 2025-11-10 10:33 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat), Ritesh Harjani (IBM), Sourabh Jain,
	Madhavan Srinivasan, linuxppc-dev, David Hildenbrand
  Cc: Donet Tom, Andrew Morton



On 10/11/2025 at 11:10, David Hildenbrand (Red Hat) wrote:
> [fighting with mail transitioning, for some reason I did not receive
> the mails from Christophe, so replying here]
> 
>>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>>>> index e24f4d88885ae..55c3626c86273 100644
>>>> --- a/arch/powerpc/Kconfig
>>>> +++ b/arch/powerpc/Kconfig
>>>> @@ -137,6 +137,7 @@ config PPC
>>>>        select ARCH_HAS_DMA_OPS            if PPC64
>>>>        select ARCH_HAS_FORTIFY_SOURCE
>>>>        select ARCH_HAS_GCOV_PROFILE_ALL
>>>> +    select ARCH_HAS_GIGANTIC_PAGE        if PPC64
>>
>>
>> The patch looks good from PPC64 perspective, it also fixes the problem
>> reported on corenet64_smp_defconfig...
>>
>>>
>>> Problem is not only on PPC64, it is on PPC32 as well, for instance
>>> corenet32_smp_defconfig has the problem as well.
>>>
>>
>> However on looking deeper into it - I agree with Christophe that this
>> problem might still exist on PPC32.
> 
> Ah, I missed that. I thought it would be a ppc64 thing. :(
> 
>>
>> I did try the patch on corenet32_smp_defconfig and I can see the WARN_ON
>> still triggering. You can check the logs here..
>>
>> https://github.com/riteshharjani/linux-ci/actions/runs/19169468405/job/54799498288
>>
>>
>>>
>>> So I think what you want instead is:
>>>
>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype
>>> b/arch/powerpc/platforms/Kconfig.cputype
>>> index 7b527d18aa5ee..1f5a1e587740c 100644
>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>> @@ -276,6 +276,7 @@ config PPC_E500
>>>           select FSL_EMB_PERFMON
>>>           bool
>>>           select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
>>> +       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>>>           select PPC_SMP_MUXED_IPI
>>>           select PPC_DOORBELL
>>>           select PPC_KUEP
>>>
>>>
>>>
>>
>> @Christophe,
>>
>> I don't think even the above diff will fix the warning on PPC32.
>> The patch defines MAX_FOLIO_ORDER as P4D_ORDER...
>>
>> +#define MAX_FOLIO_ORDER        P4D_ORDER
>> +#define P4D_ORDER              (P4D_SHIFT - PAGE_SHIFT)
>>
>> and for ppc32 in..
>> include/asm-generic/pgtable-nop4d.h
>>      #define P4D_SHIFT        PGDIR_SHIFT
>>
>> Then in..
>> arch/powerpc/include/asm/nohash/32/pgtable.h
>>      #define PGDIR_SHIFT    (PAGE_SHIFT + PTE_INDEX_SIZE)
>>      #define PTE_INDEX_SIZE    PTE_SHIFT
>>
>> in...
>> arch/powerpc/include/asm/page_32.h
>>      #define PTE_SHIFT    (PAGE_SHIFT - PTE_T_LOG2)    /* full page */
>>
>>      #define PTE_T_LOG2    (__builtin_ffs(sizeof(pte_t)) - 1)
>>
>>
>> So if you see from above P4D_ORDER is coming down to PTE_INDEX_SIZE
>>
>> IIUC, that will cause MAX_FOLIO_ORDER to be 9 in case of e500mc 
>> machine type right?
>>
>> Can you please confirm if the above analysis looks correct to you?
> 
> Christophe wrote:
> 
> "
> Ah you are right, that's not enough. I was thinking that PGDIR_ORDER was
> the highest possible value ever but in fact not. PGDIR_SIZE is 4Mbytes
> so any page larger than that still triggers the warning. Here are the
> warnings I get on QEMU with corenet32_smp_defconfig
> "
> 
> And then we get
> 
> HugeTLB: registered 1.00 GiB page size, pre-allocated 0 pages
> HugeTLB: 0 KiB vmemmap can be freed for a 1.00 GiB page
> HugeTLB: registered 64.0 MiB page size, pre-allocated 1 pages
> HugeTLB: 0 KiB vmemmap can be freed for a 64.0 MiB page
> HugeTLB: registered 256 MiB page size, pre-allocated 1 pages
> HugeTLB: 0 KiB vmemmap can be freed for a 256 MiB page
> HugeTLB: registered 4.00 MiB page size, pre-allocated 0 pages
> HugeTLB: 0 KiB vmemmap can be freed for a 4.00 MiB page
> HugeTLB: registered 16.0 MiB page size, pre-allocated 0 pages
> HugeTLB: 0 KiB vmemmap can be freed for a 16.0 MiB page
> 
> 
> How could any of these larger sizes possibly ever get mapped into a page 
> table on 32bit? I'm probably missing something important :)
> 

Using contiguous entries in a table to describe larger pages.

See commit 7c44202e3609 ("powerpc/e500: use contiguous PMD instead of 
hugepd")

That's similar to what ARM64 does as far as I understand, see commit 
66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit")

Christophe


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-10 10:33             ` Christophe Leroy
@ 2025-11-10 11:04               ` David Hildenbrand (Red Hat)
  0 siblings, 0 replies; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-10 11:04 UTC (permalink / raw)
  To: Christophe Leroy, Ritesh Harjani (IBM), Sourabh Jain,
	Madhavan Srinivasan, linuxppc-dev
  Cc: Donet Tom, Andrew Morton

On 10.11.25 11:33, Christophe Leroy wrote:
> 
> 
> On 10/11/2025 at 11:10, David Hildenbrand (Red Hat) wrote:
>> [fighting with mail transitioning, for some reason I did not receive
>> the mails from Christophe, so replying here]
>>
>>>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>>>>> index e24f4d88885ae..55c3626c86273 100644
>>>>> --- a/arch/powerpc/Kconfig
>>>>> +++ b/arch/powerpc/Kconfig
>>>>> @@ -137,6 +137,7 @@ config PPC
>>>>>         select ARCH_HAS_DMA_OPS            if PPC64
>>>>>         select ARCH_HAS_FORTIFY_SOURCE
>>>>>         select ARCH_HAS_GCOV_PROFILE_ALL
>>>>> +    select ARCH_HAS_GIGANTIC_PAGE        if PPC64
>>>
>>>
>>> The patch looks good from PPC64 perspective, it also fixes the problem
>>> reported on corenet64_smp_defconfig...
>>>
>>>>
>>>> Problem is not only on PPC64, it is on PPC32 as well, for instance
>>>> corenet32_smp_defconfig has the problem as well.
>>>>
>>>
>>> However on looking deeper into it - I agree with Christophe that this
>>> problem might still exist on PPC32.
>>
>> Ah, I missed that. I thought it would be a ppc64 thing. :(
>>
>>>
>>> I did try the patch on corenet32_smp_defconfig and I can see the WARN_ON
>>> still triggering. You can check the logs here..
>>>
>>> https://github.com/riteshharjani/linux-ci/actions/runs/19169468405/job/54799498288
>>>
>>>
>>>>
>>>> So I think what you want instead is:
>>>>
>>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype
>>>> b/arch/powerpc/platforms/Kconfig.cputype
>>>> index 7b527d18aa5ee..1f5a1e587740c 100644
>>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>>> @@ -276,6 +276,7 @@ config PPC_E500
>>>>            select FSL_EMB_PERFMON
>>>>            bool
>>>>            select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
>>>> +       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>>>>            select PPC_SMP_MUXED_IPI
>>>>            select PPC_DOORBELL
>>>>            select PPC_KUEP
>>>>
>>>>
>>>>
>>>
>>> @Christophe,
>>>
>>> I don't think even the above diff will fix the warning on PPC32.
>>> The patch defines MAX_FOLIO_ORDER as P4D_ORDER...
>>>
>>> +#define MAX_FOLIO_ORDER        P4D_ORDER
>>> +#define P4D_ORDER              (P4D_SHIFT - PAGE_SHIFT)
>>>
>>> and for ppc32 in..
>>> include/asm-generic/pgtable-nop4d.h
>>>       #define P4D_SHIFT        PGDIR_SHIFT
>>>
>>> Then in..
>>> arch/powerpc/include/asm/nohash/32/pgtable.h
>>>       #define PGDIR_SHIFT    (PAGE_SHIFT + PTE_INDEX_SIZE)
>>>       #define PTE_INDEX_SIZE    PTE_SHIFT
>>>
>>> in...
>>> arch/powerpc/include/asm/page_32.h
>>>       #define PTE_SHIFT    (PAGE_SHIFT - PTE_T_LOG2)    /* full page */
>>>
>>>       #define PTE_T_LOG2    (__builtin_ffs(sizeof(pte_t)) - 1)
>>>
>>>
>>> So if you see from above P4D_ORDER is coming down to PTE_INDEX_SIZE
>>>
>>> IIUC, that will cause MAX_FOLIO_ORDER to be 9 in case of e500mc
>>> machine type right?
>>>
>>> Can you please confirm if the above analysis looks correct to you?
>>
>> Christophe wrote:
>>
>> "
>> Ah you are right, that's not enough. I was thinking that PGDIR_ORDER was
>> the highest possible value ever but in fact not. PGDIR_SIZE is 4Mbytes
>> so any page larger than that still triggers the warning. Here are the
>> warnings I get on QEMU with corenet32_smp_defconfig
>> "
>>
>> And then we get
>>
>> HugeTLB: registered 1.00 GiB page size, pre-allocated 0 pages
>> HugeTLB: 0 KiB vmemmap can be freed for a 1.00 GiB page
>> HugeTLB: registered 64.0 MiB page size, pre-allocated 1 pages
>> HugeTLB: 0 KiB vmemmap can be freed for a 64.0 MiB page
>> HugeTLB: registered 256 MiB page size, pre-allocated 1 pages
>> HugeTLB: 0 KiB vmemmap can be freed for a 256 MiB page
>> HugeTLB: registered 4.00 MiB page size, pre-allocated 0 pages
>> HugeTLB: 0 KiB vmemmap can be freed for a 4.00 MiB page
>> HugeTLB: registered 16.0 MiB page size, pre-allocated 0 pages
>> HugeTLB: 0 KiB vmemmap can be freed for a 16.0 MiB page
>>
>>
>> How could any of these larger sizes possibly ever get mapped into a page
>> table on 32bit? I'm probably missing something important :)
>>
> 
> Using contiguous entries in a table to describe larger pages.

Thanks, that makes sense.

Alright, let me think whether we should just have a generic "unlimited" 
thing here (e.g., max_order = 31).

-- 
Cheers

David


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-06 16:19       ` Christophe Leroy
  2025-11-07 14:37         ` Ritesh Harjani
@ 2025-11-10 11:27         ` David Hildenbrand (Red Hat)
  2025-11-10 18:31           ` Christophe Leroy
  1 sibling, 1 reply; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-10 11:27 UTC (permalink / raw)
  To: Christophe Leroy, Sourabh Jain, Madhavan Srinivasan,
	Ritesh Harjani (IBM), linuxppc-dev
  Cc: Donet Tom, Andrew Morton

Thanks for the review!

> 
> So I think what you want instead is:
> 
> diff --git a/arch/powerpc/platforms/Kconfig.cputype
> b/arch/powerpc/platforms/Kconfig.cputype
> index 7b527d18aa5ee..1f5a1e587740c 100644
> --- a/arch/powerpc/platforms/Kconfig.cputype
> +++ b/arch/powerpc/platforms/Kconfig.cputype
> @@ -276,6 +276,7 @@ config PPC_E500
>           select FSL_EMB_PERFMON
>           bool
>           select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
> +       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>           select PPC_SMP_MUXED_IPI
>           select PPC_DOORBELL
>           select PPC_KUEP
> 
> 
> 
>>        select ARCH_HAS_KCOV
>>        select ARCH_HAS_KERNEL_FPU_SUPPORT    if PPC64 && PPC_FPU
>>        select ARCH_HAS_MEMBARRIER_CALLBACKS
>> diff --git a/arch/powerpc/platforms/Kconfig.cputype
>> b/arch/powerpc/platforms/Kconfig.cputype
>> index 7b527d18aa5ee..4c321a8ea8965 100644
>> --- a/arch/powerpc/platforms/Kconfig.cputype
>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>> @@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU
>>    config PPC_RADIX_MMU
>>        bool "Radix MMU Support"
>>        depends on PPC_BOOK3S_64
>> -    select ARCH_HAS_GIGANTIC_PAGE
> 
> Should remain I think.
> 
>>        default y
>>        help
>>          Enable support for the Power ISA 3.0 Radix style MMU. Currently


We also have PPC_8xx do a

	select ARCH_SUPPORTS_HUGETLBFS

And of course !PPC_RADIX_MMU (e.g., PPC_64S_HASH_MMU) through PPC_BOOK3S_64.

Are we sure they cannot end up with gigantic folios through hugetlb?

-- 
Cheers

David


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-10 11:27         ` David Hildenbrand (Red Hat)
@ 2025-11-10 18:31           ` Christophe Leroy
  2025-11-11  8:29             ` David Hildenbrand (Red Hat)
  2025-11-12 10:41             ` Ritesh Harjani
  0 siblings, 2 replies; 21+ messages in thread
From: Christophe Leroy @ 2025-11-10 18:31 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat), Sourabh Jain, Madhavan Srinivasan,
	Ritesh Harjani (IBM), linuxppc-dev
  Cc: Donet Tom, Andrew Morton



On 10/11/2025 at 12:27, David Hildenbrand (Red Hat) wrote:
> Thanks for the review!
> 
>>
>> So I think what you want instead is:
>>
>> diff --git a/arch/powerpc/platforms/Kconfig.cputype
>> b/arch/powerpc/platforms/Kconfig.cputype
>> index 7b527d18aa5ee..1f5a1e587740c 100644
>> --- a/arch/powerpc/platforms/Kconfig.cputype
>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>> @@ -276,6 +276,7 @@ config PPC_E500
>>           select FSL_EMB_PERFMON
>>           bool
>>           select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
>> +       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>>           select PPC_SMP_MUXED_IPI
>>           select PPC_DOORBELL
>>           select PPC_KUEP
>>
>>
>>
>>>        select ARCH_HAS_KCOV
>>>        select ARCH_HAS_KERNEL_FPU_SUPPORT    if PPC64 && PPC_FPU
>>>        select ARCH_HAS_MEMBARRIER_CALLBACKS
>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype
>>> b/arch/powerpc/platforms/Kconfig.cputype
>>> index 7b527d18aa5ee..4c321a8ea8965 100644
>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>> @@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU
>>>    config PPC_RADIX_MMU
>>>        bool "Radix MMU Support"
>>>        depends on PPC_BOOK3S_64
>>> -    select ARCH_HAS_GIGANTIC_PAGE
>>
>> Should remain I think.
>>
>>>        default y
>>>        help
>>>          Enable support for the Power ISA 3.0 Radix style MMU. Currently
> 
> 
> We also have PPC_8xx do a
> 
>      select ARCH_SUPPORTS_HUGETLBFS
> 
> And of course !PPC_RADIX_MMU (e.g., PPC_64S_HASH_MMU) through 
> PPC_BOOK3S_64.
> 
> Are we sure they cannot end up with gigantic folios through hugetlb?
> 

Yes indeed. My PPC_8xx is OK because I set CONFIG_ARCH_FORCE_MAX_ORDER=9 
(largest hugepage is 8M) but I do get the warning with the default value 
which is 8 (with 16k pages).

For PPC_64S_HASH_MMU, max page size is 16M, we get no warning with 
CONFIG_ARCH_FORCE_MAX_ORDER=8 which is the default value but get the 
warning with CONFIG_ARCH_FORCE_MAX_ORDER=7

Should CONFIG_ARCH_HAS_GIGANTIC_PAGE be set unconditionally as soon as
hugepages are selected, or should it depend on
CONFIG_ARCH_FORCE_MAX_ORDER? What is the cost of selecting
CONFIG_ARCH_HAS_GIGANTIC_PAGE?

Christophe


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-10 18:31           ` Christophe Leroy
@ 2025-11-11  8:29             ` David Hildenbrand (Red Hat)
  2025-11-11 11:21               ` David Hildenbrand (Red Hat)
  2025-11-12 10:41             ` Ritesh Harjani
  1 sibling, 1 reply; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-11  8:29 UTC (permalink / raw)
  To: Christophe Leroy, Sourabh Jain, Madhavan Srinivasan,
	Ritesh Harjani (IBM), linuxppc-dev
  Cc: Donet Tom, Andrew Morton

On 10.11.25 19:31, Christophe Leroy wrote:
> 
> 
> On 10/11/2025 at 12:27, David Hildenbrand (Red Hat) wrote:
>> Thanks for the review!
>>
>>>
>>> So I think what you want instead is:
>>>
>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype
>>> b/arch/powerpc/platforms/Kconfig.cputype
>>> index 7b527d18aa5ee..1f5a1e587740c 100644
>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>> @@ -276,6 +276,7 @@ config PPC_E500
>>>            select FSL_EMB_PERFMON
>>>            bool
>>>            select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
>>> +       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>>>            select PPC_SMP_MUXED_IPI
>>>            select PPC_DOORBELL
>>>            select PPC_KUEP
>>>
>>>
>>>
>>>>         select ARCH_HAS_KCOV
>>>>         select ARCH_HAS_KERNEL_FPU_SUPPORT    if PPC64 && PPC_FPU
>>>>         select ARCH_HAS_MEMBARRIER_CALLBACKS
>>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/
>>>> platforms/Kconfig.cputype
>>>> index 7b527d18aa5ee..4c321a8ea8965 100644
>>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>>> @@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU
>>>>     config PPC_RADIX_MMU
>>>>         bool "Radix MMU Support"
>>>>         depends on PPC_BOOK3S_64
>>>> -    select ARCH_HAS_GIGANTIC_PAGE
>>>
>>> Should remain I think.
>>>
>>>>         default y
>>>>         help
>>>>           Enable support for the Power ISA 3.0 Radix style MMU. Currently
>>
>>
>> We also have PPC_8xx do a
>>
>>       select ARCH_SUPPORTS_HUGETLBFS
>>
>> And of course !PPC_RADIX_MMU (e.g., PPC_64S_HASH_MMU) through
>> PPC_BOOK3S_64.
>>
>> Are we sure they cannot end up with gigantic folios through hugetlb?
>>
> 
> Yes indeed. My PPC_8xx is OK because I set CONFIG_ARCH_FORCE_MAX_ORDER=9
> (largest hugepage is 8M) but I do get the warning with the default value
> which is 8 (with 16k pages).
> 
> For PPC_64S_HASH_MMU, max page size is 16M, we get no warning with
> CONFIG_ARCH_FORCE_MAX_ORDER=8 which is the default value but get the
> warning with CONFIG_ARCH_FORCE_MAX_ORDER=7

Right, the dependency on CONFIG_ARCH_FORCE_MAX_ORDER is nasty. In the future,
likely the arch should just tell us the biggest possible hugetlb size and we
can then determine this ourselves.

... or we'll simply remove the gigantic vs. !gigantic handling completely and
simply assume that "if there is hugetlb, we might have gigantic folios".

> Should CONFIG_ARCH_HAS_GIGANTIC_PAGE be set unconditionally as soon as
> hugepages are selected, or should it depend on
> CONFIG_ARCH_FORCE_MAX_ORDER ? What is the cost of selecting
> CONFIG_ARCH_HAS_GIGANTIC_PAGE ?

There is no real cost, we just try to keep the value small so __dump_folio()
can better detect inconsistencies.

To fix it for now, likely the following is good enough (pushed to the
previously mentioned branch):


 From 7abf0f52e59d96533aa8c96194878e9453aa8be0 Mon Sep 17 00:00:00 2001
From: "David Hildenbrand (Red Hat)" <david@kernel.org>
Date: Thu, 6 Nov 2025 11:31:45 +0100
Subject: [PATCH] mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlb

In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support
runtime allocation of gigantic hugetlb folios. In the meantime it evolved
into a generic way for the architecture to state that it supports
gigantic hugetlb folios.

In commit fae7d834c43c ("mm: add __dump_folio()") we started using
CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could
have folios larger than what the buddy can handle. In the context of
that commit, we started using MAX_FOLIO_ORDER to detect page corruptions
when dumping tail pages of folios. Before that commit, we assumed that
we cannot have folios larger than the highest buddy order, which was
obviously wrong.

In commit 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
when registering hstate"), we used MAX_FOLIO_ORDER to detect
inconsistencies, and in fact, we found some now.

Powerpc allows for configs that can allocate gigantic folios during boot
(not at runtime), that do not set CONFIG_ARCH_HAS_GIGANTIC_PAGE and can
exceed PUD_ORDER.

To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE with
hugetlb on powerpc, and increase the maximum folio size with hugetlb to 16
GiB (possible on arm64 and powerpc). Note that on some powerpc
configurations, whether we actually have gigantic pages
depends on the setting of CONFIG_ARCH_FORCE_MAX_ORDER, but there is
nothing really problematic about setting it unconditionally: we just try to
keep the value small so we can better detect problems in __dump_folio()
and inconsistencies around the expected largest folio in the system.

Ideally, we'd have a better way to obtain the maximum hugetlb folio size
and detect ourselves whether we really end up with gigantic folios. Let's
defer bigger changes and fix the warnings first.

While at it, handle gigantic DAX folios more clearly: DAX can only
end up creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.

Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases
clearer. In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with
HUGETLB_PAGE.

Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on powerpc, we will now
also allow for runtime allocations of folios in some more powerpc configs.
I don't think this is a problem, but if it is we could handle it through
__HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.

While __dump_page()/__dump_folio() was also problematic (not handling dumping
of tail pages of such gigantic folios correctly), it doesn't seem
critical enough to mark it as a fix.

Fixes: 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes when registering hstate")
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Closes: https://lore.kernel.org/r/3e043453-3f27-48ad-b987-cc39f523060a@csgroup.eu/
Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Closes: https://lore.kernel.org/r/94377f5c-d4f0-4c0f-b0f6-5bf1cd7305b1@linux.ibm.com/
Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org>
---
  arch/powerpc/Kconfig |  1 +
  include/linux/mm.h   | 12 +++++++++---
  mm/Kconfig           |  7 +++++++
  3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e24f4d88885ae..9537a61ebae02 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -137,6 +137,7 @@ config PPC
  	select ARCH_HAS_DMA_OPS			if PPC64
  	select ARCH_HAS_FORTIFY_SOURCE
  	select ARCH_HAS_GCOV_PROFILE_ALL
+	select ARCH_HAS_GIGANTIC_PAGE		if ARCH_SUPPORTS_HUGETLBFS
  	select ARCH_HAS_KCOV
  	select ARCH_HAS_KERNEL_FPU_SUPPORT	if PPC64 && PPC_FPU
  	select ARCH_HAS_MEMBARRIER_CALLBACKS
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d16b33bacc32b..2646ba7c96a49 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2074,7 +2074,7 @@ static inline unsigned long folio_nr_pages(const struct folio *folio)
  	return folio_large_nr_pages(folio);
  }
  
-#if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE)
+#if !defined(CONFIG_HAVE_GIGANTIC_FOLIOS)
  /*
   * We don't expect any folios that exceed buddy sizes (and consequently
   * memory sections).
@@ -2087,10 +2087,16 @@ static inline unsigned long folio_nr_pages(const struct folio *folio)
   * pages are guaranteed to be contiguous.
   */
  #define MAX_FOLIO_ORDER		PFN_SECTION_SHIFT
-#else
+#elif defined(CONFIG_HUGETLB_PAGE)
  /*
   * There is no real limit on the folio size. We limit them to the maximum we
- * currently expect (e.g., hugetlb, dax).
+ * currently expect: with hugetlb, we expect no folios larger than 16 GiB.
+ */
+#define MAX_FOLIO_ORDER		(16 * GIGA / PAGE_SIZE)
+#else
+/*
+ * Without hugetlb, gigantic folios that are bigger than a single PUD are
+ * currently impossible.
   */
  #define MAX_FOLIO_ORDER		PUD_ORDER
  #endif
diff --git a/mm/Kconfig b/mm/Kconfig
index 0e26f4fc8717b..ca3f146bc7053 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -908,6 +908,13 @@ config PAGE_MAPCOUNT
  config PGTABLE_HAS_HUGE_LEAVES
  	def_bool TRANSPARENT_HUGEPAGE || HUGETLB_PAGE
  
+#
+# We can end up creating gigantic folios.
+#
+config HAVE_GIGANTIC_FOLIOS
+	def_bool (HUGETLB_PAGE && ARCH_HAS_GIGANTIC_PAGE) || \
+		 (ZONE_DEVICE && HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
+
  # TODO: Allow to be enabled without THP
  config ARCH_SUPPORTS_HUGE_PFNMAP
  	def_bool n
-- 
2.51.0
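
For readers following the patch, the three-way choice it sets up can be mirrored in plain C. This is only an illustrative sketch: the boolean parameters stand in for the Kconfig symbols named in the patch, and the enum and function names are hypothetical. The first case covers everything under "#if !defined(CONFIG_HAVE_GIGANTIC_FOLIOS)", whose exact limits are not fully visible in the hunks above.

```c
#include <stdbool.h>

/* Which MAX_FOLIO_ORDER definition the preprocessor would end up using. */
enum max_folio_order_case {
	LIMIT_NO_GIGANTIC,	/* !HAVE_GIGANTIC_FOLIOS: buddy/section sized */
	LIMIT_HUGETLB_16G,	/* HAVE_GIGANTIC_FOLIOS && HUGETLB_PAGE */
	LIMIT_PUD,		/* HAVE_GIGANTIC_FOLIOS without hugetlb */
};

/* Mirrors the def_bool expression of the new HAVE_GIGANTIC_FOLIOS option. */
static inline bool have_gigantic_folios(bool hugetlb_page,
					bool arch_has_gigantic_page,
					bool zone_device,
					bool thp_pud)
{
	return (hugetlb_page && arch_has_gigantic_page) ||
	       (zone_device && thp_pud);
}

/* Mirrors the #if / #elif / #else chain in include/linux/mm.h. */
static inline enum max_folio_order_case
pick_limit(bool hugetlb_page, bool arch_has_gigantic_page,
	   bool zone_device, bool thp_pud)
{
	if (!have_gigantic_folios(hugetlb_page, arch_has_gigantic_page,
				  zone_device, thp_pud))
		return LIMIT_NO_GIGANTIC;
	if (hugetlb_page)
		return LIMIT_HUGETLB_16G;
	return LIMIT_PUD;
}
```

One consequence worth noting: a config with hugetlb but without ARCH_HAS_GIGANTIC_PAGE still lands in the 16 GiB case if ZONE_DEVICE and PUD-sized THP are both enabled, because the #elif only looks at HUGETLB_PAGE.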



-- 
Cheers

David



* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-11  8:29             ` David Hildenbrand (Red Hat)
@ 2025-11-11 11:21               ` David Hildenbrand (Red Hat)
  2025-11-11 11:42                 ` Christophe Leroy
  0 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-11 11:21 UTC (permalink / raw)
  To: Christophe Leroy, Sourabh Jain, Madhavan Srinivasan,
	Ritesh Harjani (IBM), linuxppc-dev
  Cc: Donet Tom, Andrew Morton

On 11.11.25 09:29, David Hildenbrand (Red Hat) wrote:
> On 10.11.25 19:31, Christophe Leroy wrote:
>>
>>
>> Le 10/11/2025 à 12:27, David Hildenbrand (Red Hat) a écrit :
>>> Thanks for the review!
>>>
>>>>
>>>> So I think what you want instead is:
>>>>
>>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype
>>>> b/arch/powerpc/platforms/Kconfig.cputype
>>>> index 7b527d18aa5ee..1f5a1e587740c 100644
>>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>>> @@ -276,6 +276,7 @@ config PPC_E500
>>>>             select FSL_EMB_PERFMON
>>>>             bool
>>>>             select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
>>>> +       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>>>>             select PPC_SMP_MUXED_IPI
>>>>             select PPC_DOORBELL
>>>>             select PPC_KUEP
>>>>
>>>>
>>>>
>>>>>          select ARCH_HAS_KCOV
>>>>>          select ARCH_HAS_KERNEL_FPU_SUPPORT    if PPC64 && PPC_FPU
>>>>>          select ARCH_HAS_MEMBARRIER_CALLBACKS
>>>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/
>>>>> platforms/Kconfig.cputype
>>>>> index 7b527d18aa5ee..4c321a8ea8965 100644
>>>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>>>> @@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU
>>>>>      config PPC_RADIX_MMU
>>>>>          bool "Radix MMU Support"
>>>>>          depends on PPC_BOOK3S_64
>>>>> -    select ARCH_HAS_GIGANTIC_PAGE
>>>>
>>>> Should remain I think.
>>>>
>>>>>          default y
>>>>>          help
>>>>>            Enable support for the Power ISA 3.0 Radix style MMU. Currently
>>>
>>>
>>> We also have PPC_8xx do a
>>>
>>>        select ARCH_SUPPORTS_HUGETLBFS
>>>
>>> And of course !PPC_RADIX_MMU (e.g., PPC_64S_HASH_MMU) through
>>> PPC_BOOK3S_64.
>>>
>>> Are we sure they cannot end up with gigantic folios through hugetlb?
>>>
>>
>> Yes indeed. My PPC_8xx is OK because I set CONFIG_ARCH_FORCE_MAX_ORDER=9
>> (largest hugepage is 8M) but I do get the warning with the default value
>> which is 8 (with 16k pages).
>>
>> For PPC_64S_HASH_MMU, max page size is 16M, we get no warning with
>> CONFIG_ARCH_FORCE_MAX_ORDER=8 which is the default value but get the
>> warning with CONFIG_ARCH_FORCE_MAX_ORDER=7
> 
> Right, the dependency on CONFIG_ARCH_FORCE_MAX_ORDER is nasty. In the future,
> likely the arch should just tell us the biggest possible hugetlb size and we
> can then determine this ourselves.
> 
> ... or we'll simply remove the gigantic vs. !gigantic handling completely and
> simply assume that "if there is hugetlb, we might have gigantic folios".
> 
>> Should CONFIG_ARCH_HAS_GIGANTIC_PAGE be set unconditionally as soon as
>> hugepages are selected, or should it depend on
>> CONFIG_ARCH_FORCE_MAX_ORDER ? What is the cost of selecting
>> CONFIG_ARCH_HAS_GIGANTIC_PAGE ?
> 
> There is no real cost, we just try to keep the value small so __dump_folio()
> can better detect inconsistencies.
> 
> To fix it for now, likely the following is good enough (pushed to the
> previously mentioned branch):
> 
> 
>   From 7abf0f52e59d96533aa8c96194878e9453aa8be0 Mon Sep 17 00:00:00 2001
> From: "David Hildenbrand (Red Hat)" <david@kernel.org>
> Date: Thu, 6 Nov 2025 11:31:45 +0100
> Subject: [PATCH] mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlb
> 
> In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support
> runtime allocation of gigantic hugetlb folios. In the meantime it evolved
> into a generic way for the architecture to state that it supports
> gigantic hugetlb folios.
> 
> In commit fae7d834c43c ("mm: add __dump_folio()") we started using
> CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could
> have folios larger than what the buddy can handle. In the context of
> that commit, we started using MAX_FOLIO_ORDER to detect page corruptions
> when dumping tail pages of folios. Before that commit, we assumed that
> we cannot have folios larger than the highest buddy order, which was
> obviously wrong.
> 
> In commit 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
> when registering hstate"), we used MAX_FOLIO_ORDER to detect
> inconsistencies, and in fact, we found some now.
> 
> Powerpc allows for configs that can allocate gigantic folios during boot
> (not at runtime), that do not set CONFIG_ARCH_HAS_GIGANTIC_PAGE and can
> exceed PUD_ORDER.
> 
> To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE with
> hugetlb on powerpc, and increase the maximum folio size with hugetlb to 16
> GiB (possible on arm64 and powerpc). Note that on some powerpc
> configurations, whether we actually have gigantic pages
> depends on the setting of CONFIG_ARCH_FORCE_MAX_ORDER, but there is
> nothing really problematic about setting it unconditionally: we just try to
> keep the value small so we can better detect problems in __dump_folio()
> and inconsistencies around the expected largest folio in the system.
> 
> Ideally, we'd have a better way to obtain the maximum hugetlb folio size
> and detect ourselves whether we really end up with gigantic folios. Let's
> defer bigger changes and fix the warnings first.
> 
> While at it, handle gigantic DAX folios more clearly: DAX can only
> end up creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
> 
> Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases
> clearer. In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with
> HUGETLB_PAGE.
> 
> Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on powerpc, we will now
> also allow for runtime allocations of folios in some more powerpc configs.
> I don't think this is a problem, but if it is we could handle it through
> __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.
> 
> While __dump_page()/__dump_folio() was also problematic (not handling dumping
> of tail pages of such gigantic folios correctly), it doesn't seem
> critical enough to mark it as a fix.
> 
> Fixes: 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes when registering hstate")
> Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> Closes: https://lore.kernel.org/r/3e043453-3f27-48ad-b987-cc39f523060a@csgroup.eu/
> Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> Closes: https://lore.kernel.org/r/94377f5c-d4f0-4c0f-b0f6-5bf1cd7305b1@linux.ibm.com/
> Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org>
> ---
>    arch/powerpc/Kconfig |  1 +
>    include/linux/mm.h   | 12 +++++++++---
>    mm/Kconfig           |  7 +++++++
>    3 files changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index e24f4d88885ae..9537a61ebae02 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -137,6 +137,7 @@ config PPC
>    	select ARCH_HAS_DMA_OPS			if PPC64
>    	select ARCH_HAS_FORTIFY_SOURCE
>    	select ARCH_HAS_GCOV_PROFILE_ALL
> +	select ARCH_HAS_GIGANTIC_PAGE		if ARCH_SUPPORTS_HUGETLBFS
>    	select ARCH_HAS_KCOV
>    	select ARCH_HAS_KERNEL_FPU_SUPPORT	if PPC64 && PPC_FPU
>    	select ARCH_HAS_MEMBARRIER_CALLBACKS
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index d16b33bacc32b..2646ba7c96a49 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2074,7 +2074,7 @@ static inline unsigned long folio_nr_pages(const struct folio *folio)
>    	return folio_large_nr_pages(folio);
>    }
>    
> -#if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE)
> +#if !defined(CONFIG_HAVE_GIGANTIC_FOLIOS)
>    /*
>     * We don't expect any folios that exceed buddy sizes (and consequently
>     * memory sections).
> @@ -2087,10 +2087,16 @@ static inline unsigned long folio_nr_pages(const struct folio *folio)
>     * pages are guaranteed to be contiguous.
>     */
>    #define MAX_FOLIO_ORDER		PFN_SECTION_SHIFT
> -#else
> +#elif defined(CONFIG_HUGETLB_PAGE)
>    /*
>     * There is no real limit on the folio size. We limit them to the maximum we
> - * currently expect (e.g., hugetlb, dax).
> + * currently expect: with hugetlb, we expect no folios larger than 16 GiB.
> + */
> +#define MAX_FOLIO_ORDER		(16 * GIGA / PAGE_SIZE)

Forgot to commit the ilog2(), so this should be

#define MAX_FOLIO_ORDER                ilog2(16 * GIGA / PAGE_SIZE)

And we might need to include linux/units.h to make some cross compiles happy.

Still testing ...
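
To spell out why the ilog2() matters: without it the macro evaluates to a page count, not an order. A minimal user-space sketch, treating 16 GiB as 2^34 bytes (an assumption about the intended value); nr_base_pages() and order_of_pages() are hypothetical names, not kernel helpers:

```c
#include <stdint.h>

/* Number of base pages covered by a folio of the given size. */
static inline uint64_t nr_base_pages(uint64_t folio_bytes, uint64_t page_bytes)
{
	return folio_bytes / page_bytes;
}

/* Floor of log2 of a non-zero page count, i.e. what ilog2() would give. */
static inline unsigned int order_of_pages(uint64_t nr_pages)
{
	unsigned int order = 0;

	while (nr_pages >>= 1)
		order++;
	return order;
}
```

With 4k pages, nr_base_pages(1ULL << 34, 4096) is 4194304 while the order is 22; with 64k pages the count is 262144 and the order is 18. Using the page count as MAX_FOLIO_ORDER would therefore make the limit far too large to catch anything.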

-- 
Cheers

David



* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-11 11:21               ` David Hildenbrand (Red Hat)
@ 2025-11-11 11:42                 ` Christophe Leroy
  2025-11-11 12:20                   ` David Hildenbrand (Red Hat)
  0 siblings, 1 reply; 21+ messages in thread
From: Christophe Leroy @ 2025-11-11 11:42 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat), Sourabh Jain, Madhavan Srinivasan,
	Ritesh Harjani (IBM), linuxppc-dev, David Hildenbrand
  Cc: Donet Tom, Andrew Morton



Le 11/11/2025 à 12:21, David Hildenbrand (Red Hat) a écrit :
> On 11.11.25 09:29, David Hildenbrand (Red Hat) wrote:
>> On 10.11.25 19:31, Christophe Leroy wrote:
>>>
>>>
>>> Le 10/11/2025 à 12:27, David Hildenbrand (Red Hat) a écrit :
>>>> Thanks for the review!
>>>>
>>>>>
>>>>> So I think what you want instead is:
>>>>>
>>>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype
>>>>> b/arch/powerpc/platforms/Kconfig.cputype
>>>>> index 7b527d18aa5ee..1f5a1e587740c 100644
>>>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>>>> @@ -276,6 +276,7 @@ config PPC_E500
>>>>>             select FSL_EMB_PERFMON
>>>>>             bool
>>>>>             select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
>>>>> +       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>>>>>             select PPC_SMP_MUXED_IPI
>>>>>             select PPC_DOORBELL
>>>>>             select PPC_KUEP
>>>>>
>>>>>
>>>>>
>>>>>>          select ARCH_HAS_KCOV
>>>>>>          select ARCH_HAS_KERNEL_FPU_SUPPORT    if PPC64 && PPC_FPU
>>>>>>          select ARCH_HAS_MEMBARRIER_CALLBACKS
>>>>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/
>>>>>> platforms/Kconfig.cputype
>>>>>> index 7b527d18aa5ee..4c321a8ea8965 100644
>>>>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>>>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>>>>> @@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU
>>>>>>      config PPC_RADIX_MMU
>>>>>>          bool "Radix MMU Support"
>>>>>>          depends on PPC_BOOK3S_64
>>>>>> -    select ARCH_HAS_GIGANTIC_PAGE
>>>>>
>>>>> Should remain I think.
>>>>>
>>>>>>          default y
>>>>>>          help
>>>>>>            Enable support for the Power ISA 3.0 Radix style MMU. 
>>>>>> Currently
>>>>
>>>>
>>>> We also have PPC_8xx do a
>>>>
>>>>        select ARCH_SUPPORTS_HUGETLBFS
>>>>
>>>> And of course !PPC_RADIX_MMU (e.g., PPC_64S_HASH_MMU) through
>>>> PPC_BOOK3S_64.
>>>>
>>>> Are we sure they cannot end up with gigantic folios through hugetlb?
>>>>
>>>
>>> Yes indeed. My PPC_8xx is OK because I set CONFIG_ARCH_FORCE_MAX_ORDER=9
>>> (largest hugepage is 8M) but I do get the warning with the default value
>>> which is 8 (with 16k pages).
>>>
>>> For PPC_64S_HASH_MMU, max page size is 16M, we get no warning with
>>> CONFIG_ARCH_FORCE_MAX_ORDER=8 which is the default value but get the
>>> warning with CONFIG_ARCH_FORCE_MAX_ORDER=7
>>
>> Right, the dependency on CONFIG_ARCH_FORCE_MAX_ORDER is nasty. In the 
>> future,
>> likely the arch should just tell us the biggest possible hugetlb size 
>> and we
>> can then determine this ourselves.
>>
>> ... or we'll simply remove the gigantic vs. !gigantic handling 
>> completely and
>> simply assume that "if there is hugetlb, we might have gigantic folios".
>>
>>> Should CONFIG_ARCH_HAS_GIGANTIC_PAGE be set unconditionally as soon as
>>> hugepages are selected, or should it depend on
>>> CONFIG_ARCH_FORCE_MAX_ORDER ? What is the cost of selecting
>>> CONFIG_ARCH_HAS_GIGANTIC_PAGE ?
>>
>> There is no real cost, we just try to keep the value small so 
>> __dump_folio()
>> can better detect inconsistencies.
>>
>> To fix it for now, likely the following is good enough (pushed to the
>> previously mentioned branch):
>>
>>
>>   From 7abf0f52e59d96533aa8c96194878e9453aa8be0 Mon Sep 17 00:00:00 2001
>> From: "David Hildenbrand (Red Hat)" <david@kernel.org>
>> Date: Thu, 6 Nov 2025 11:31:45 +0100
>> Subject: [PATCH] mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlb
>>
>> In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support
>> runtime allocation of gigantic hugetlb folios. In the meantime it evolved
>> into a generic way for the architecture to state that it supports
>> gigantic hugetlb folios.
>>
>> In commit fae7d834c43c ("mm: add __dump_folio()") we started using
>> CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could
>> have folios larger than what the buddy can handle. In the context of
>> that commit, we started using MAX_FOLIO_ORDER to detect page corruptions
>> when dumping tail pages of folios. Before that commit, we assumed that
>> we cannot have folios larger than the highest buddy order, which was
>> obviously wrong.
>>
>> In commit 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
>> when registering hstate"), we used MAX_FOLIO_ORDER to detect
>> inconsistencies, and in fact, we found some now.
>>
>> Powerpc allows for configs that can allocate gigantic folios during boot
>> (not at runtime), that do not set CONFIG_ARCH_HAS_GIGANTIC_PAGE and can
>> exceed PUD_ORDER.
>>
>> To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE with
>> hugetlb on powerpc, and increase the maximum folio size with hugetlb 
>> to 16
>> GiB (possible on arm64 and powerpc). Note that on some powerpc
>> configurations, whether we actually have gigantic pages
>> depends on the setting of CONFIG_ARCH_FORCE_MAX_ORDER, but there is
>> nothing really problematic about setting it unconditionally: we just 
>> try to
>> keep the value small so we can better detect problems in __dump_folio()
>> and inconsistencies around the expected largest folio in the system.
>>
>> Ideally, we'd have a better way to obtain the maximum hugetlb folio size
>> and detect ourselves whether we really end up with gigantic folios. Let's
>> defer bigger changes and fix the warnings first.
>>
>> While at it, handle gigantic DAX folios more clearly: DAX can only
>> end up creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
>>
>> Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases
>> clearer. In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with
>> HUGETLB_PAGE.
>>
>> Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on powerpc, we will now
>> also allow for runtime allocations of folios in some more powerpc 
>> configs.
>> I don't think this is a problem, but if it is we could handle it through
>> __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.
>>
>> While __dump_page()/__dump_folio() was also problematic (not handling dumping
>> of tail pages of such gigantic folios correctly), it doesn't seem
>> critical enough to mark it as a fix.
>>
>> Fixes: 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes 
>> when registering hstate")
>> Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>> Closes: https://lore.kernel.org/r/3e043453-3f27-48ad-b987-cc39f523060a@csgroup.eu/
>> Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> Closes: https://lore.kernel.org/r/94377f5c-d4f0-4c0f-b0f6-5bf1cd7305b1@linux.ibm.com/
>> Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org>
>> ---
>>    arch/powerpc/Kconfig |  1 +
>>    include/linux/mm.h   | 12 +++++++++---
>>    mm/Kconfig           |  7 +++++++
>>    3 files changed, 17 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index e24f4d88885ae..9537a61ebae02 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -137,6 +137,7 @@ config PPC
>>        select ARCH_HAS_DMA_OPS            if PPC64
>>        select ARCH_HAS_FORTIFY_SOURCE
>>        select ARCH_HAS_GCOV_PROFILE_ALL
>> +    select ARCH_HAS_GIGANTIC_PAGE        if ARCH_SUPPORTS_HUGETLBFS
>>        select ARCH_HAS_KCOV
>>        select ARCH_HAS_KERNEL_FPU_SUPPORT    if PPC64 && PPC_FPU
>>        select ARCH_HAS_MEMBARRIER_CALLBACKS
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index d16b33bacc32b..2646ba7c96a49 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -2074,7 +2074,7 @@ static inline unsigned long folio_nr_pages(const 
>> struct folio *folio)
>>        return folio_large_nr_pages(folio);
>>    }
>> -#if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE)
>> +#if !defined(CONFIG_HAVE_GIGANTIC_FOLIOS)
>>    /*
>>     * We don't expect any folios that exceed buddy sizes (and 
>> consequently
>>     * memory sections).
>> @@ -2087,10 +2087,16 @@ static inline unsigned long 
>> folio_nr_pages(const struct folio *folio)
>>     * pages are guaranteed to be contiguous.
>>     */
>>    #define MAX_FOLIO_ORDER        PFN_SECTION_SHIFT
>> -#else
>> +#elif defined(CONFIG_HUGETLB_PAGE)
>>    /*
>>     * There is no real limit on the folio size. We limit them to the 
>> maximum we
>> - * currently expect (e.g., hugetlb, dax).
>> + * currently expect: with hugetlb, we expect no folios larger than 16 
>> GiB.
>> + */
>> +#define MAX_FOLIO_ORDER        (16 * GIGA / PAGE_SIZE)
> 
> Forgot to commit the ilog2(), so this should be
> 
> #define MAX_FOLIO_ORDER                ilog2(16 * GIGA / PAGE_SIZE)

I would have used SZ_16G.

But could we use get_order() instead? (from include/asm-generic/getorder.h)

> 
> And we might need to include linux/units.h to make some cross compiles happy.

That would be linux/sizes.h, by the way, if we use SZ_16G instead.
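
To illustrate the difference, here is a small user-space sketch of the two approaches. It rests on assumptions worth double-checking against the tree: that GIGA in linux/units.h is the decimal SI prefix (10^9) while SZ_16G is the binary 2^34, and that get_order() takes a size in bytes and rounds up to the next order. SKETCH_GIGA, SKETCH_SZ_16G, floor_log2() and order_for_size() are hypothetical stand-ins, not the kernel definitions.

```c
#include <stdint.h>

#define SKETCH_GIGA	1000000000ULL	/* assumed decimal, like GIGA */
#define SKETCH_SZ_16G	0x400000000ULL	/* 2^34 bytes, like SZ_16G */

/* Floor of log2 of a non-zero value, modelled on ilog2(). */
static inline unsigned int floor_log2(uint64_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/*
 * Smallest order whose span of (1 << page_shift)-byte pages covers
 * 'size' bytes; modelled on get_order(), which rounds up.
 */
static inline unsigned int order_for_size(uint64_t size, unsigned int page_shift)
{
	uint64_t pages = (size - 1) >> page_shift;
	unsigned int order = 0;

	while (pages) {
		order++;
		pages >>= 1;
	}
	return order;
}
```

Under those assumptions and 4k pages, floor_log2(16 * SKETCH_GIGA / 4096) gives 21, one short of the 22 that both floor_log2(SKETCH_SZ_16G / 4096) and order_for_size(SKETCH_SZ_16G, 12) give. So a binary size constant combined with a get_order()-style helper avoids both the missing ilog2() and the decimal prefix.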

> 
> Still testing ...
> 




* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-11 11:42                 ` Christophe Leroy
@ 2025-11-11 12:20                   ` David Hildenbrand (Red Hat)
  0 siblings, 0 replies; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-11 12:20 UTC (permalink / raw)
  To: Christophe Leroy, Sourabh Jain, Madhavan Srinivasan,
	Ritesh Harjani (IBM), linuxppc-dev
  Cc: Donet Tom, Andrew Morton

On 11.11.25 12:42, Christophe Leroy wrote:
> 
> 
> Le 11/11/2025 à 12:21, David Hildenbrand (Red Hat) a écrit :
>> On 11.11.25 09:29, David Hildenbrand (Red Hat) wrote:
>>> On 10.11.25 19:31, Christophe Leroy wrote:
>>>>
>>>>
>>>> Le 10/11/2025 à 12:27, David Hildenbrand (Red Hat) a écrit :
>>>>> Thanks for the review!
>>>>>
>>>>>>
>>>>>> So I think what you want instead is:
>>>>>>
>>>>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype
>>>>>> b/arch/powerpc/platforms/Kconfig.cputype
>>>>>> index 7b527d18aa5ee..1f5a1e587740c 100644
>>>>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>>>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>>>>> @@ -276,6 +276,7 @@ config PPC_E500
>>>>>>              select FSL_EMB_PERFMON
>>>>>>              bool
>>>>>>              select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
>>>>>> +       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>>>>>>              select PPC_SMP_MUXED_IPI
>>>>>>              select PPC_DOORBELL
>>>>>>              select PPC_KUEP
>>>>>>
>>>>>>
>>>>>>
>>>>>>>           select ARCH_HAS_KCOV
>>>>>>>           select ARCH_HAS_KERNEL_FPU_SUPPORT    if PPC64 && PPC_FPU
>>>>>>>           select ARCH_HAS_MEMBARRIER_CALLBACKS
>>>>>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/
>>>>>>> platforms/Kconfig.cputype
>>>>>>> index 7b527d18aa5ee..4c321a8ea8965 100644
>>>>>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>>>>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>>>>>> @@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU
>>>>>>>       config PPC_RADIX_MMU
>>>>>>>           bool "Radix MMU Support"
>>>>>>>           depends on PPC_BOOK3S_64
>>>>>>> -    select ARCH_HAS_GIGANTIC_PAGE
>>>>>>
>>>>>> Should remain I think.
>>>>>>
>>>>>>>           default y
>>>>>>>           help
>>>>>>>             Enable support for the Power ISA 3.0 Radix style MMU.
>>>>>>> Currently
>>>>>
>>>>>
>>>>> We also have PPC_8xx do a
>>>>>
>>>>>         select ARCH_SUPPORTS_HUGETLBFS
>>>>>
>>>>> And of course !PPC_RADIX_MMU (e.g., PPC_64S_HASH_MMU) through
>>>>> PPC_BOOK3S_64.
>>>>>
>>>>> Are we sure they cannot end up with gigantic folios through hugetlb?
>>>>>
>>>>
>>>> Yes indeed. My PPC_8xx is OK because I set CONFIG_ARCH_FORCE_MAX_ORDER=9
>>>> (largest hugepage is 8M) but I do get the warning with the default value
>>>> which is 8 (with 16k pages).
>>>>
>>>> For PPC_64S_HASH_MMU, max page size is 16M, we get no warning with
>>>> CONFIG_ARCH_FORCE_MAX_ORDER=8 which is the default value but get the
>>>> warning with CONFIG_ARCH_FORCE_MAX_ORDER=7
>>>
>>> Right, the dependency on CONFIG_ARCH_FORCE_MAX_ORDER is nasty. In the
>>> future,
>>> likely the arch should just tell us the biggest possible hugetlb size
>>> and we
>>> can then determine this ourselves.
>>>
>>> ... or we'll simply remove the gigantic vs. !gigantic handling
>>> completely and
>>> simply assume that "if there is hugetlb, we might have gigantic folios".
>>>
>>>> Should CONFIG_ARCH_HAS_GIGANTIC_PAGE be set unconditionally as soon as
>>>> hugepages are selected, or should it depend on
>>>> CONFIG_ARCH_FORCE_MAX_ORDER ? What is the cost of selecting
>>>> CONFIG_ARCH_HAS_GIGANTIC_PAGE ?
>>>
>>> There is no real cost, we just try to keep the value small so
>>> __dump_folio()
>>> can better detect inconsistencies.
>>>
>>> To fix it for now, likely the following is good enough (pushed to the
>>> previously mentioned branch):
>>>
>>>
>>>    From 7abf0f52e59d96533aa8c96194878e9453aa8be0 Mon Sep 17 00:00:00 2001
>>> From: "David Hildenbrand (Red Hat)" <david@kernel.org>
>>> Date: Thu, 6 Nov 2025 11:31:45 +0100
>>> Subject: [PATCH] mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlb
>>>
>>> In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support
>>> runtime allocation of gigantic hugetlb folios. In the meantime it evolved
>>> into a generic way for the architecture to state that it supports
>>> gigantic hugetlb folios.
>>>
>>> In commit fae7d834c43c ("mm: add __dump_folio()") we started using
>>> CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could
>>> have folios larger than what the buddy can handle. In the context of
>>> that commit, we started using MAX_FOLIO_ORDER to detect page corruptions
>>> when dumping tail pages of folios. Before that commit, we assumed that
>>> we cannot have folios larger than the highest buddy order, which was
>>> obviously wrong.
>>>
>>> In commit 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
>>> when registering hstate"), we used MAX_FOLIO_ORDER to detect
>>> inconsistencies, and in fact, we found some now.
>>>
>>> Powerpc allows for configs that can allocate gigantic folios during boot
>>> (not at runtime), that do not set CONFIG_ARCH_HAS_GIGANTIC_PAGE and can
>>> exceed PUD_ORDER.
>>>
>>> To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE with
>>> hugetlb on powerpc, and increase the maximum folio size with hugetlb
>>> to 16
>>> GiB (possible on arm64 and powerpc). Note that on some powerpc
>>> configurations, whether we actually have gigantic pages
>>> depends on the setting of CONFIG_ARCH_FORCE_MAX_ORDER, but there is
>>> nothing really problematic about setting it unconditionally: we just
>>> try to
>>> keep the value small so we can better detect problems in __dump_folio()
>>> and inconsistencies around the expected largest folio in the system.
>>>
>>> Ideally, we'd have a better way to obtain the maximum hugetlb folio size
>>> and detect ourselves whether we really end up with gigantic folios. Let's
>>> defer bigger changes and fix the warnings first.
>>>
>>> While at it, handle gigantic DAX folios more clearly: DAX can only
>>> end up creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
>>>
>>> Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases
>>> clearer. In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with
>>> HUGETLB_PAGE.
>>>
>>> Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on powerpc, we will now
>>> also allow for runtime allocations of folios in some more powerpc
>>> configs.
>>> I don't think this is a problem, but if it is we could handle it through
>>> __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.
>>>
>>> While __dump_page()/__dump_folio() was also problematic (not handling
>>> dumping of tail pages of such gigantic folios correctly), it doesn't
>>> seem critical enough to mark it as a fix.
>>>
>>> Fixes: 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
>>> when registering hstate")
>>> Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>>> Closes: https://lore.kernel.org/r/3e043453-3f27-48ad-b987-cc39f523060a@csgroup.eu/
>>> Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>>> Closes: https://lore.kernel.org/r/94377f5c-d4f0-4c0f-b0f6-5bf1cd7305b1@linux.ibm.com/
>>> Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org>
>>> ---
>>>     arch/powerpc/Kconfig |  1 +
>>>     include/linux/mm.h   | 12 +++++++++---
>>>     mm/Kconfig           |  7 +++++++
>>>     3 files changed, 17 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>>> index e24f4d88885ae..9537a61ebae02 100644
>>> --- a/arch/powerpc/Kconfig
>>> +++ b/arch/powerpc/Kconfig
>>> @@ -137,6 +137,7 @@ config PPC
>>>         select ARCH_HAS_DMA_OPS            if PPC64
>>>         select ARCH_HAS_FORTIFY_SOURCE
>>>         select ARCH_HAS_GCOV_PROFILE_ALL
>>> +    select ARCH_HAS_GIGANTIC_PAGE        if ARCH_SUPPORTS_HUGETLBFS
>>>         select ARCH_HAS_KCOV
>>>         select ARCH_HAS_KERNEL_FPU_SUPPORT    if PPC64 && PPC_FPU
>>>         select ARCH_HAS_MEMBARRIER_CALLBACKS
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index d16b33bacc32b..2646ba7c96a49 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -2074,7 +2074,7 @@ static inline unsigned long folio_nr_pages(const
>>> struct folio *folio)
>>>         return folio_large_nr_pages(folio);
>>>     }
>>> -#if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE)
>>> +#if !defined(CONFIG_HAVE_GIGANTIC_FOLIOS)
>>>     /*
>>>      * We don't expect any folios that exceed buddy sizes (and
>>> consequently
>>>      * memory sections).
>>> @@ -2087,10 +2087,16 @@ static inline unsigned long
>>> folio_nr_pages(const struct folio *folio)
>>>      * pages are guaranteed to be contiguous.
>>>      */
>>>     #define MAX_FOLIO_ORDER        PFN_SECTION_SHIFT
>>> -#else
>>> +#elif defined(CONFIG_HUGETLB_PAGE)
>>>     /*
>>>      * There is no real limit on the folio size. We limit them to the
>>> maximum we
>>> - * currently expect (e.g., hugetlb, dax).
>>> + * currently expect: with hugetlb, we expect no folios larger than 16
>>> GiB.
>>> + */
>>> +#define MAX_FOLIO_ORDER        (16 * GIGA / PAGE_SIZE)
>>
>> Forgot to commit the ilog2(), so this should be
>>
>> #define MAX_FOLIO_ORDER                ilog2(16 * GIGA / PAGE_SIZE)
> 
> I would have used SZ_16G.
> 

Yeah, much better.

> But could we use get_order() instead ? (From include/asm-generic/getorder.h)

I think so, the compiler should just convert it to a compile-time constant.
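
A minimal userspace sketch of why SZ_16G with get_order() looks preferable
to the ilog2() form. It assumes GIGA in linux/units.h is the decimal 10^9
(worth double-checking against the tree) and that SZ_16G is 16 GiB. The
sketch_* names are hypothetical stand-ins, not the kernel helpers.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_GIGA   1000000000ULL  /* assumption: decimal, like GIGA in units.h */
#define SKETCH_SZ_16G (16ULL << 30)  /* 16 GiB, stand-in for SZ_16G */

/* floor(log2(n)) for n > 0: userspace stand-in for ilog2(). */
static unsigned int sketch_ilog2(uint64_t n)
{
	unsigned int r = 0;

	assert(n > 0);
	while (n >>= 1)
		r++;
	return r;
}

/* Smallest order with (page_size << order) >= size: stand-in for get_order(). */
static unsigned int sketch_get_order(uint64_t size, uint64_t page_size)
{
	unsigned int order = 0;

	assert(size > 0 && page_size > 0);
	while ((page_size << order) < size)
		order++;
	return order;
}
```

Under those assumptions and with 4 KiB pages, sketch_get_order(SKETCH_SZ_16G,
4096) and sketch_ilog2(SKETCH_SZ_16G / 4096) both give 22, while
sketch_ilog2(16 * SKETCH_GIGA / 4096) gives 21, one order too small.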

> 
>>
>> And we might need units.h to make some cross compiles happy.
> 
> size.h by the way if we use SZ_16G instead.

sizes.h is even already included in mm.h.

Thanks, let me cross-compile and send out something official.

-- 
Cheers

David



* Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
  2025-11-10 18:31           ` Christophe Leroy
  2025-11-11  8:29             ` David Hildenbrand (Red Hat)
@ 2025-11-12 10:41             ` Ritesh Harjani
  1 sibling, 0 replies; 21+ messages in thread
From: Ritesh Harjani @ 2025-11-12 10:41 UTC (permalink / raw)
  To: Christophe Leroy, David Hildenbrand (Red Hat), Sourabh Jain,
	Madhavan Srinivasan, linuxppc-dev
  Cc: Donet Tom, Andrew Morton

Christophe Leroy <christophe.leroy@csgroup.eu> writes:

> Le 10/11/2025 à 12:27, David Hildenbrand (Red Hat) a écrit :
>> Thanks for the review!
>> 
>>>
>>> So I think what you want instead is:
>>>
>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype
>>> b/arch/powerpc/platforms/Kconfig.cputype
>>> index 7b527d18aa5ee..1f5a1e587740c 100644
>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>> @@ -276,6 +276,7 @@ config PPC_E500
>>>           select FSL_EMB_PERFMON
>>>           bool
>>>           select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
>>> +       select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>>>           select PPC_SMP_MUXED_IPI
>>>           select PPC_DOORBELL
>>>           select PPC_KUEP
>>>
>>>
>>>
>>>>        select ARCH_HAS_KCOV
>>>>        select ARCH_HAS_KERNEL_FPU_SUPPORT    if PPC64 && PPC_FPU
>>>>        select ARCH_HAS_MEMBARRIER_CALLBACKS
>>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/
>>>> platforms/Kconfig.cputype
>>>> index 7b527d18aa5ee..4c321a8ea8965 100644
>>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>>> @@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU
>>>>    config PPC_RADIX_MMU
>>>>        bool "Radix MMU Support"
>>>>        depends on PPC_BOOK3S_64
>>>> -    select ARCH_HAS_GIGANTIC_PAGE
>>>
>>> Should remain I think.
>>>
>>>>        default y
>>>>        help
>>>>          Enable support for the Power ISA 3.0 Radix style MMU. Currently
>> 
>> 
>> We also have PPC_8xx do a
>> 
>>      select ARCH_SUPPORTS_HUGETLBFS
>> 
>> And of course !PPC_RADIX_MMU (e.g., PPC_64S_HASH_MMU) through 
>> PPC_BOOK3S_64.
>> 
>> Are we sure they cannot end up with gigantic folios through hugetlb?
>> 
>
> Yes indeed. My PPC_8xx is OK because I set CONFIG_ARCH_FORCE_MAX_ORDER=9 
> (largest hugepage is 8M) but I do get the warning with the default value 
> which is 8 (with 16k pages).
>
> For PPC_64S_HASH_MMU, max page size is 16M, we get no warning with 
> CONFIG_ARCH_FORCE_MAX_ORDER=8 which is the default value but get the 
> warning with CONFIG_ARCH_FORCE_MAX_ORDER=7
>

This got me thinking. Currently we can also get the warning on book3s64
when CONFIG_PPC_RADIX_MMU=n is selected, because the max page size in
the HASH case can be 16G. I guess this was not getting tested in regular
CI because it requires us to disable the RADIX config during the build.

On Hash we end up in this path, where MAX_PAGE_ORDER is
CONFIG_ARCH_FORCE_MAX_ORDER, which is 8. This is because we have
ARCH_HAS_GIGANTIC_PAGE=n when only HASH is enabled.

As shown below, MAX_FOLIO_ORDER on !PPC_RADIX_MMU (HASH) becomes 8, i.e.:

    #if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE)
    /*
    * We don't expect any folios that exceed buddy sizes (and consequently
    * memory sections).
    */
    #define MAX_FOLIO_ORDER		MAX_PAGE_ORDER

... and thus we get a similar warning, because (order=18 for 16G) >
MAX_FOLIO_ORDER (8) in hugetlb_add_hstate().
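
The order arithmetic above can be sketched in userspace as follows. This is
only an illustration: hugepage_order() and order_exceeds_limit() are
hypothetical names, the comparison is a simplified model of the check in
hugetlb_add_hstate(), and a 64 KiB base page is assumed for book3s64 (which
is what order=18 for 16G implies).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Order of a power-of-two hugepage size relative to the base page size. */
static unsigned int hugepage_order(uint64_t huge_size, uint64_t page_size)
{
	uint64_t pages;
	unsigned int order = 0;

	assert(page_size > 0 && huge_size >= page_size);
	pages = huge_size / page_size;
	while (pages > 1) {
		pages >>= 1;
		order++;
	}
	return order;
}

/* Simplified model of the condition that triggers the warning. */
static bool order_exceeds_limit(uint64_t huge_size, uint64_t page_size,
				unsigned int max_folio_order)
{
	return hugepage_order(huge_size, page_size) > max_folio_order;
}
```

With these helpers, 16G on 64K pages is order 18 and exceeds a limit of 8.
16M on 64K pages is order 8, so it passes with a limit of 8 but not with 7.
8M on 16K pages (the PPC_8xx case) is order 9, so it passes with a limit of
9 but not with 8, matching the cases reported earlier in the thread.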

[    0.000000] Kernel command line: console=hvc0 console=hvc1 systemd.unit=emergency.target root=/dev/vda1 noreboot disable_radix=1 hugepagesz=16M hugepages=1 hugepagesz=16G hugepages=1 default_hugepagesz=16G
<...>
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: CPU: 0 PID: 0 at mm/hugetlb.c:4753 hugetlb_add_hstate+0xf4/0x228
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.18.0-rc3-00138-g1e87cdb8702c #26 NONE
[    0.000000] Hardware name: IBM PowerNV (emulated by qemu) POWER10 0x801200 opal:v7.1-106-g785a5e307 PowerNV
[    0.000000] NIP:  c00000000204ef4c LR: c00000000204f1b0 CTR: c00000000204ee68
[    0.000000] REGS: c000000002857ad0 TRAP: 0700   Not tainted  (6.18.0-rc3-00138-g1e87cdb8702c)
[    0.000000] MSR:  9000000002021033 <SF,HV,VEC,ME,IR,DR,RI,LE>  CR: 28000448  XER: 00000000
[    0.000000] CFAR: c00000000204eed8 IRQMASK: 3
<...>
[    0.000000] NIP [c00000000204ef4c] hugetlb_add_hstate+0xf4/0x228
[    0.000000] LR [c00000000204f1b0] hugepagesz_setup+0x130/0x16c
[    0.000000] Call Trace:
[    0.000000] [c000000002857d70] [c0000000020ee564] hstate_cmdline_buf+0x4/0x800 (unreliable)
[    0.000000] [c000000002857e10] [c00000000204f1b0] hugepagesz_setup+0x130/0x16c
[    0.000000] [c000000002857e80] [c0000000020505a8] hugetlb_bootmem_alloc+0xd8/0x1d0
[    0.000000] [c000000002857ec0] [c000000002046828] mm_core_init+0x2c/0x254
[    0.000000] [c000000002857f30] [c0000000020012ac] start_kernel+0x404/0xae0
[    0.000000] [c000000002857fe0] [c00000000000e934] start_here_common+0x1c/0x20
<...>
[    2.557050] HugeTLB: allocation took 7ms with hugepage_allocation_threads=1
[    2.562263] ------------[ cut here ]------------
[    2.564482] WARNING: CPU: 0 PID: 1 at mm/internal.h:758 gather_bootmem_prealloc_parallel+0x454/0x4d8
[    2.568266] Modules linked in:
[    2.570204] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G        W           6.18.0-rc3-00138-g1e87cdb8702c #26 NONE
[    2.574570] Tainted: [W]=WARN
[    2.576009] Hardware name: IBM PowerNV (emulated by qemu) POWER10 0x801200 opal:v7.1-106-g785a5e307 PowerNV
[    2.579979] NIP:  c00000000204f9b0 LR: c00000000204f870 CTR: c00000000204f55c
[    2.582763] REGS: c000000004a0f5a0 TRAP: 0700   Tainted: G        W            (6.18.0-rc3-00138-g1e87cdb8702c)
[    2.586670] MSR:  9000000002029033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 44002288  XER: 20040000
[    2.590234] CFAR: c00000000204f880 IRQMASK: 0
<...>
[    2.616926] NIP [c00000000204f9b0] gather_bootmem_prealloc_parallel+0x454/0x4d8
[    2.619928] LR [c00000000204f870] gather_bootmem_prealloc_parallel+0x314/0x4d8
[    2.622799] Call Trace:
[    2.624068] [c000000004a0f840] [c00000000204f85c] gather_bootmem_prealloc_parallel+0x300/0x4d8 (unreliable)
[    2.627847] [c000000004a0f930] [c000000002041018] padata_do_multithreaded+0x470/0x518
[    2.631141] [c000000004a0fad0] [c00000000204fce8] hugetlb_init+0x2b4/0x904
[    2.633914] [c000000004a0fc10] [c000000000010d74] do_one_initcall+0xac/0x438
[    2.636761] [c000000004a0fcf0] [c000000002001dfc] kernel_init_freeable+0x3cc/0x720
[    2.639764] [c000000004a0fde0] [c000000000011344] kernel_init+0x34/0x260
[    2.642688] [c000000004a0fe50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
[    2.646020] ---- interrupt: 0 at 0x0
[    2.647943] Code: eba100d8 ebc100e0 ebe100e8 e9410058 e92d0c70 7d4a4a79 39200000 40820044 382100f0 eaa1ffa8 4e800020 60420000 <0fe00000> 4bfffed0 3ba00000 7ee4bb78
[    2.654240] irq event stamp: 50400
[    2.655991] hardirqs last  enabled at (50399): [<c00000000002ed84>] interrupt_exit_kernel_prepare+0xd8/0x224
[    2.659759] hardirqs last disabled at (50400): [<c00000000002bdb8>] program_check_exception+0x60/0x78
[    2.663293] softirqs last  enabled at (50320): [<c00000000017aa0c>] handle_softirqs+0x5a8/0x5c0
[    2.666819] softirqs last disabled at (50315): [<c0000000000165e4>] do_softirq_own_stack+0x40/0x54
[    2.670569] ---[ end trace 0000000000000000 ]---
[    2.697258] HugeTLB: registered 16.0 MiB page size, pre-allocated 1 pages
[    2.700831] HugeTLB: 0 KiB vmemmap can be freed for a 16.0 MiB page
[    2.703917] HugeTLB: registered 16.0 GiB page size, pre-allocated 1 pages
[    2.707073] HugeTLB: 0 KiB vmemmap can be freed for a 16.0 GiB page


So I guess making PPC select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
is true should help us resolve this warning w.r.t. the order.
And I guess the runtime allocation of gigantic pages is controlled anyway
via __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.

Feel free to correct me here if I missed anything. There seems to be a
lot of history related to hugetlb / gigantic pages.

-ritesh



end of thread, other threads:[~2025-11-12 12:54 UTC | newest]

Thread overview: 21+ messages
2025-10-29  5:49 powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate Sourabh Jain
2025-10-29  8:25 ` David Hildenbrand
2025-11-05 11:32   ` Christophe Leroy
2025-11-06 15:02     ` David Hildenbrand (Red Hat)
2025-11-06 16:19       ` Christophe Leroy
2025-11-07 14:37         ` Ritesh Harjani
2025-11-07 16:11           ` Christophe Leroy
2025-11-10 10:10           ` David Hildenbrand (Red Hat)
2025-11-10 10:33             ` Christophe Leroy
2025-11-10 11:04               ` David Hildenbrand (Red Hat)
2025-11-10 11:27         ` David Hildenbrand (Red Hat)
2025-11-10 18:31           ` Christophe Leroy
2025-11-11  8:29             ` David Hildenbrand (Red Hat)
2025-11-11 11:21               ` David Hildenbrand (Red Hat)
2025-11-11 11:42                 ` Christophe Leroy
2025-11-11 12:20                   ` David Hildenbrand (Red Hat)
2025-11-12 10:41             ` Ritesh Harjani
2025-11-07  8:00       ` Sourabh Jain
2025-11-07  9:02         ` David Hildenbrand (Red Hat)
2025-11-07 12:35           ` Sourabh Jain
2025-11-07 14:18             ` David Hildenbrand (Red Hat)
