From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Christophe Leroy <christophe.leroy@csgroup.eu>,
Sourabh Jain <sourabhjain@linux.ibm.com>,
Madhavan Srinivasan <maddy@linux.ibm.com>,
"Ritesh Harjani (IBM)" <ritesh.list@gmail.com>,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Cc: Donet Tom <donettom@linux.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: powerpc/e500: WARNING: at mm/hugetlb.c:4755 hugetlb_add_hstate
Date: Tue, 11 Nov 2025 13:20:03 +0100
Message-ID: <b2a79874-4718-47dd-ad79-99d7fb49246e@kernel.org>
In-Reply-To: <a31e6d70-9275-4277-991b-9de1aea03cd7@csgroup.eu>
On 11.11.25 12:42, Christophe Leroy wrote:
>
>
> Le 11/11/2025 à 12:21, David Hildenbrand (Red Hat) a écrit :
>> On 11.11.25 09:29, David Hildenbrand (Red Hat) wrote:
>>> On 10.11.25 19:31, Christophe Leroy wrote:
>>>>
>>>>
>>>> Le 10/11/2025 à 12:27, David Hildenbrand (Red Hat) a écrit :
>>>>> Thanks for the review!
>>>>>
>>>>>>
>>>>>> So I think what you want instead is:
>>>>>>
>>>>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype
>>>>>> b/arch/powerpc/platforms/Kconfig.cputype
>>>>>> index 7b527d18aa5ee..1f5a1e587740c 100644
>>>>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>>>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>>>>> @@ -276,6 +276,7 @@ config PPC_E500
>>>>>> select FSL_EMB_PERFMON
>>>>>> bool
>>>>>> select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
>>>>>> + select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>>>>>> select PPC_SMP_MUXED_IPI
>>>>>> select PPC_DOORBELL
>>>>>> select PPC_KUEP
>>>>>>
>>>>>>
>>>>>>
>>>>>>> select ARCH_HAS_KCOV
>>>>>>> select ARCH_HAS_KERNEL_FPU_SUPPORT if PPC64 && PPC_FPU
>>>>>>> select ARCH_HAS_MEMBARRIER_CALLBACKS
>>>>>>> diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/
>>>>>>> platforms/Kconfig.cputype
>>>>>>> index 7b527d18aa5ee..4c321a8ea8965 100644
>>>>>>> --- a/arch/powerpc/platforms/Kconfig.cputype
>>>>>>> +++ b/arch/powerpc/platforms/Kconfig.cputype
>>>>>>> @@ -423,7 +423,6 @@ config PPC_64S_HASH_MMU
>>>>>>> config PPC_RADIX_MMU
>>>>>>> bool "Radix MMU Support"
>>>>>>> depends on PPC_BOOK3S_64
>>>>>>> - select ARCH_HAS_GIGANTIC_PAGE
>>>>>>
>>>>>> Should remain I think.
>>>>>>
>>>>>>> default y
>>>>>>> help
>>>>>>> Enable support for the Power ISA 3.0 Radix style MMU.
>>>>>>> Currently
>>>>>
>>>>>
>>>>> We also have PPC_8xx do a
>>>>>
>>>>> select ARCH_SUPPORTS_HUGETLBFS
>>>>>
>>>>> And of course !PPC_RADIX_MMU (e.g., PPC_64S_HASH_MMU) through
>>>>> PPC_BOOK3S_64.
>>>>>
>>>>> Are we sure they cannot end up with gigantic folios through hugetlb?
>>>>>
>>>>
>>>> Yes indeed. My PPC_8xx is OK because I set CONFIG_ARCH_FORCE_MAX_ORDER=9
>>>> (largest hugepage is 8M) but I do get the warning with the default value
>>>> which is 8 (with 16k pages).
>>>>
>>>> For PPC_64S_HASH_MMU, max page size is 16M, we get no warning with
>>>> CONFIG_ARCH_FORCE_MAX_ORDER=8 which is the default value but get the
>>>> warning with CONFIG_ARCH_FORCE_MAX_ORDER=7
>>>
>>> Right, the dependency on CONFIG_ARCH_FORCE_MAX_ORDER is nasty. In the
>>> future,
>>> likely the arch should just tell us the biggest possible hugetlb size
>>> and we
>>> can then determine this ourselves.
>>>
>>> ... or we'll simply remove the gigantic vs. !gigantic handling
>>> completely and
>>> simply assume that "if there is hugetlb, we might have gigantic folios".
>>>
>>>> Should CONFIG_ARCH_HAS_GIGANTIC_PAGE be set unconditionally as soon as
>>>> hugepages are selected, or should it depend on
>>>> CONFIG_ARCH_FORCE_MAX_ORDER ? What is the cost of selecting
>>>> CONFIG_ARCH_HAS_GIGANTIC_PAGE ?
>>>
>>> There is no real cost, we just try to keep the value small so
>>> __dump_folio()
>>> can better detect inconsistencies.
>>>
>>> To fix it for now, likely the following is good enough (pushed to the
>>> previously mentioned branch):
>>>
>>>
>>> From 7abf0f52e59d96533aa8c96194878e9453aa8be0 Mon Sep 17 00:00:00 2001
>>> From: "David Hildenbrand (Red Hat)" <david@kernel.org>
>>> Date: Thu, 6 Nov 2025 11:31:45 +0100
>>> Subject: [PATCH] mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlb
>>>
>>> In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support
>>> runtime allocation of gigantic hugetlb folios. In the meantime it evolved
>>> into a generic way for the architecture to state that it supports
>>> gigantic hugetlb folios.
>>>
>>> In commit fae7d834c43c ("mm: add __dump_folio()") we started using
>>> CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could
>>> have folios larger than what the buddy can handle. In the context of
>>> that commit, we started using MAX_FOLIO_ORDER to detect page corruptions
>>> when dumping tail pages of folios. Before that commit, we assumed that
>>> we cannot have folios larger than the highest buddy order, which was
>>> obviously wrong.
>>>
>>> In commit 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
>>> when registering hstate"), we used MAX_FOLIO_ORDER to detect
>>> inconsistencies, and in fact, we found some now.
>>>
>>> Powerpc allows for configs that can allocate gigantic folios during boot
>>> (not at runtime), that do not set CONFIG_ARCH_HAS_GIGANTIC_PAGE and can
>>> exceed PUD_ORDER.
>>>
>>> To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE with
>>> hugetlb, and increase the maximum folio size with hugetlb to 16 GiB
>>> (possible on arm64 and powerpc). Note that on some powerpc
>>> configurations, whether we actually have gigantic pages
>>> depends on the setting of CONFIG_ARCH_FORCE_MAX_ORDER, but there is
>>> nothing really problematic about setting it unconditionally: we just
>>> try to
>>> keep the value small so we can better detect problems in __dump_folio()
>>> and inconsistencies around the expected largest folio in the system.
>>>
>>> Ideally, we'd have a better way to obtain the maximum hugetlb folio size
>>> and detect ourselves whether we really end up with gigantic folios. Let's
>>> defer bigger changes and fix the warnings first.
>>>
>>> While at it, handle gigantic DAX folios more clearly: DAX can only
>>> end up creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
>>>
>>> Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases
>>> clearer. In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with
>>> HUGETLB_PAGE.
>>>
>>> Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on powerpc, we will now
>>> also allow for runtime allocations of folios in some more powerpc
>>> configs.
>>> I don't think this is a problem, but if it is we could handle it through
>>> __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED.
>>>
>>> While __dump_page()/__dump_folio() was also problematic (not handling
>>> dumping of tail pages of such gigantic folios correctly), it doesn't
>>> seem critical enough to mark it as a fix.
>>>
>>> Fixes: 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes
>>> when registering hstate")
>>> Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>>> Closes: https://lore.kernel.org/r/3e043453-3f27-48ad-b987-cc39f523060a@csgroup.eu/
>>> Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>>> Closes: https://lore.kernel.org/r/94377f5c-d4f0-4c0f-b0f6-5bf1cd7305b1@linux.ibm.com/
>>> Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org>
>>> ---
>>> arch/powerpc/Kconfig | 1 +
>>> include/linux/mm.h | 12 +++++++++---
>>> mm/Kconfig | 7 +++++++
>>> 3 files changed, 17 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>>> index e24f4d88885ae..9537a61ebae02 100644
>>> --- a/arch/powerpc/Kconfig
>>> +++ b/arch/powerpc/Kconfig
>>> @@ -137,6 +137,7 @@ config PPC
>>> select ARCH_HAS_DMA_OPS if PPC64
>>> select ARCH_HAS_FORTIFY_SOURCE
>>> select ARCH_HAS_GCOV_PROFILE_ALL
>>> + select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>>> select ARCH_HAS_KCOV
>>> select ARCH_HAS_KERNEL_FPU_SUPPORT if PPC64 && PPC_FPU
>>> select ARCH_HAS_MEMBARRIER_CALLBACKS
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index d16b33bacc32b..2646ba7c96a49 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -2074,7 +2074,7 @@ static inline unsigned long folio_nr_pages(const
>>> struct folio *folio)
>>> return folio_large_nr_pages(folio);
>>> }
>>> -#if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE)
>>> +#if !defined(CONFIG_HAVE_GIGANTIC_FOLIOS)
>>> /*
>>> * We don't expect any folios that exceed buddy sizes (and
>>> consequently
>>> * memory sections).
>>> @@ -2087,10 +2087,16 @@ static inline unsigned long
>>> folio_nr_pages(const struct folio *folio)
>>> * pages are guaranteed to be contiguous.
>>> */
>>> #define MAX_FOLIO_ORDER PFN_SECTION_SHIFT
>>> -#else
>>> +#elif defined(CONFIG_HUGETLB_PAGE)
>>> /*
>>> * There is no real limit on the folio size. We limit them to the
>>> maximum we
>>> - * currently expect (e.g., hugetlb, dax).
>>> + * currently expect: with hugetlb, we expect no folios larger than 16
>>> GiB.
>>> + */
>>> +#define MAX_FOLIO_ORDER (16 * GIGA / PAGE_SIZE)
>>
>> Forgot to commit the ilog2(), so this should be
>>
>> #define MAX_FOLIO_ORDER ilog2(16 * GIGA / PAGE_SIZE)
>
> I would have used SZ_16G.
>
Yeah, much better.
> But could we use get_order() instead ? (From include/asm-generic/getorder.h)
I think so, the compiler should just convert it to a compile-time constant.
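
For illustration only, here is a rough userspace sketch of why the two spellings should agree for a power-of-two size such as 16 GiB. It does not use the kernel's ilog2() or get_order(); sketch_ilog2(), sketch_get_order() and SKETCH_SZ_16G are made-up stand-ins, and the page size is passed in explicitly as an assumption instead of coming from PAGE_SIZE.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for SZ_16G; not the kernel definition. */
#define SKETCH_SZ_16G (UINT64_C(16) << 30)

/* floor(log2(v)) for v > 0, mimicking what ilog2() computes. */
static unsigned int sketch_ilog2(uint64_t v)
{
	unsigned int r = 0;

	assert(v != 0);
	while (v >>= 1)
		r++;
	return r;
}

/*
 * Smallest order such that (page_size << order) >= size, mimicking what
 * get_order() computes. page_size is a parameter here (an assumption of
 * this sketch); in the kernel it would come from PAGE_SHIFT.
 */
static unsigned int sketch_get_order(uint64_t size, uint64_t page_size)
{
	unsigned int order = 0;

	assert(size != 0 && page_size != 0);
	while ((page_size << order) < size)
		order++;
	return order;
}
```

With 4 KiB pages both sketch_ilog2(SKETCH_SZ_16G / 4096) and sketch_get_order(SKETCH_SZ_16G, 4096) come out as 22; with 64 KiB pages both give 18. The two only differ for sizes that are not a power of two, where the get_order()-style variant rounds up and the ilog2()-style variant rounds down.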
>
>>
>> And we might need unit.h to make some cross compiles happy.
>
> size.h by the way if we use SZ_16G instead.
sizes.h is even already included in mm.h.
Thanks, let me cross-compile and send out something official.
--
Cheers
David