linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Sourabh Jain <sourabhjain@linux.ibm.com>
To: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>,
	linuxppc-dev@lists.ozlabs.org
Cc: Hari Bathini <hbathini@linux.ibm.com>,
	Madhavan Srinivasan <maddy@linux.ibm.com>,
	Mahesh Salgaonkar <mahesh@linux.ibm.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Christophe Leroy <christophe.leroy@csgroup.eu>
Subject: Re: [PATCH v4] powerpc/hugetlb: Disable gigantic hugepages if fadump is active
Date: Wed, 5 Mar 2025 11:18:06 +0530	[thread overview]
Message-ID: <96f3bbe1-821c-4792-a22f-3ecc3aecb6a4@linux.ibm.com> (raw)
In-Reply-To: <87frjttmt7.fsf@gmail.com>

Hello Ritesh,


On 04/03/25 10:27, Ritesh Harjani (IBM) wrote:
> Sourabh Jain <sourabhjain@linux.ibm.com> writes:
>
>> Hello Ritesh,
>>
>> Thanks for the review.
>>
>> On 02/03/25 12:05, Ritesh Harjani (IBM) wrote:
>>> Sourabh Jain <sourabhjain@linux.ibm.com> writes:
>>>
>>>> The fadump kernel boots with limited memory solely to collect the kernel
>>>> core dump. Having gigantic hugepages in the fadump kernel is of no use.
>>> Sure got it.
>>>
>>>> Many times, the fadump kernel encounters OOM (Out of Memory) issues if
>>>> gigantic hugepages are allocated.
>>>>
>>>> To address this, disable gigantic hugepages if fadump is active by
>>>> returning early from arch_hugetlb_valid_size() using
>>>> hugepages_supported(). When fadump is active, the global variable
>>>> hugetlb_disabled is set to true, which is later used by the
>>>> PowerPC-specific hugepages_supported() function to determine hugepage
>>>> support.
>>>>
>>>> Returning early from arch_hugetlb_vali_size() not only disables
>>>> gigantic hugepages but also avoids unnecessary hstate initialization for
>>>> every hugepage size supported by the platform.
>>>>
>>>> kernel logs related to hugepages with this patch included:
>>>> kernel argument passed: hugepagesz=1G hugepages=1
>>>>
>>>> First kernel: gigantic hugepage got allocated
>>>> ==============================================
>>>>
>>>> dmesg | grep -i "hugetlb"
>>>> -------------------------
>>>> HugeTLB: registered 1.00 GiB page size, pre-allocated 1 pages
>>>> HugeTLB: 0 KiB vmemmap can be freed for a 1.00 GiB page
>>>> HugeTLB: registered 2.00 MiB page size, pre-allocated 0 pages
>>>> HugeTLB: 0 KiB vmemmap can be freed for a 2.00 MiB page
>>>>
>>>> $ cat /proc/meminfo | grep -i "hugetlb"
>>>> -------------------------------------
>>>> Hugetlb:         1048576 kB
>>> Was this tested with patch [1] in your local tree?
>>>
>>> [1]: https://web.git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id=d629d7a8efc33
>>>
>>> IIUC, this patch [1] disables the boot time allocation of hugepages.
>>> Isn't it also disabling the boot time allocation for gigantic huge pages
>>> passed by the cmdline params like hugepagesz=1G and hugepages=2 ?
>> Yes, I had the patch [1] in my tree.
>>
>> My understanding is that gigantic pages are allocated before normal huge
>> pages.
>>
>> In hugepages_setup() in hugetlb.c, we have:
>>
>>       if (hugetlb_max_hstate && hstate_is_gigantic(parsed_hstate))
>>           hugetlb_hstate_alloc_pages(parsed_hstate);
>>
>> I believe the above code allocates memory for gigantic pages, and
>> hugetlb_init() is
>> called later because it is a subsys_initcall.
>>
>> So, by the time the kernel reaches hugetlb_init(), the gigantic pages
>> are already
>> allocated. Isn't that right?
>>
>> Please let me know your opinion.
> Yes, you are right. We are allocating hugepages from memblock, however
> this isn't getting advertized anywhere. i.e. there is no way one can
> know from any user interface on whether hugepages were allocated or not.
> i.e. for fadump kernel when hugepagesz= and hugepages= params are
> passed, though it will allocate gigantic pages, it won't advertize this
> in meminfo or anywhere else. This was adding the confusion when I tested
> this (which wasn't clear from the commit msg either).

Yeah I should have added that in my commit message.

>
> And I guess this is happening during fadump kernel because of our patch
> [1], which added a check to see whether hugetlb_disabled is true in-
> hugepages_supported(). Due to this hugetlb_init() is now not doing the
> rest of the initialization for those gigantic pages which were allocated
> due to cmdline options from hugepages_setup().

Yeah patch [1] has disabled the hugetlb initialization.

>
> [1]: https://lore.kernel.org/linuxppc-dev/20241202054310.928610-1-sourabhjain@linux.ibm.com/
>
> Now as we know from below that fadump can set hugetlb_disabled call in early_setup().
> i.e. fadump can mark hugetlb_disabled to true in
> early_setup() -> early_init_devtree() -> fadump_reserve_mem()
>
> And hugepages_setup() and hugepagesz_setup() gets called late in
> start_kernel() -> parse_args()
>
>
> And we already check for hugepages_supported() in all necessary calls in
> mm/hugetlb.c. So IMO, this check should go in mm/hugetlb.c in
> hugepagesz_setup() and hugepages_setup(). Because otherwise every arch
> implementation will end up duplicating this by adding
> hugepages_supported() check in their arch implementation of
> arch_hugetlb_valid_size().
>
> e.g. references of hugepages_supported() checks in mm/hugetlb.c
>
> mm/hugetlb.c hugetlb_show_meminfo_node 4959 if (!hugepages_supported())
> mm/hugetlb.c hugetlb_report_node_meminfo 4943 if (!hugepages_supported())
> mm/hugetlb.c hugetlb_report_meminfo 4914 if (!hugepages_supported())
> mm/hugetlb.c hugetlb_overcommit_handler 4848 if (!hugepages_supported())
> mm/hugetlb.c hugetlb_sysctl_handler_common 4809 if (!hugepages_supported())
> mm/hugetlb.c hugetlb_init 4461 if (!hugepages_supported()) {
> mm/hugetlb.c dissolve_free_hugetlb_folios 2211 if (!hugepages_supported())
> fs/hugetlbfs/inode.c init_hugetlbfs_fs 1604 if (!hugepages_supported()) {

v1 proposed to do it in generic code:
https://lore.kernel.org/all/20250121150419.1342794-1-sourabhjain@linux.ibm.com/

But it was suggested to try it in arch specific code.


>
>
> Let me also see the history on why this wasn't done earlier though...
>
> ... Oh actually there is more history to this. See [2]. We already had
> hugepages_supported() check in hugepages_setup() and other places
> earlier which was removed to fix some other problem in ppc ;)
>
> [2]: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c2833a5bf75b3657c4dd20b3709c8c702754cb1f
>

IIUC, I think it was done because the HPAGE_SHIFT was not initialized 
early enough
on powrepc. But it is no longer the case after the below commit:

2354ad252b66 ("powerpc/mm: Update default hugetlb size early")

> Hence I believe this needs a wider cleanup than just fixing it for our
> arch. I see there is a patch series already fixing these code paths,
> which is also cleaning up the path of gigantic hugepage allocation in
> hugepages_setup(). I think it is in mm-unstable branch too. Can we
> please review & test that to make sure that the fadump usecase of
> disabling hugepages & gigantic are getting covered properly?
>
> [3]: https://lore.kernel.org/all/20250228182928.2645936-1-fvdl@google.com/
>
>
> Thoughts?

Agree, if there is cleanup happening already we should try to handle things
there itself. I will review the patch series [3].

Thanks, Ritesh, for the detailed review! Appreciate your time and effort.

Thanks,
Sourabh Jain


>
> -ritesh
>
>> Thanks,
>> Sourabh Jain
>>
>>
>>>
>>>> HugeTLB: registered 1.00 GiB page size, pre-allocated 1 pages
>>> This print comes from report_hugepages(). The only place from where
>>> report_hugepages() gets called is hugetlb_init(). hugetlb_init() is what
>>> is responsible for hugepages & gigantic hugepage allocations of the
>>> passed kernel cmdline params.
>>>
>>> But hugetlb_init() already checks for hugepages_supported() in the very
>>> beginning. So I am not sure whether we need this extra patch to disable
>>> gigantic hugepages allocation by the kernel cmdline params like
>>> hugepagesz=1G and hugepages=2 type of options.
>>>
>>> Hence I was wondering if you had this patch [1] in your tree when you were
>>> testing this?
>>>
>>> But I may be missing something. Could you please help clarify on whether
>>> we really need this patch to disable gigantic hugetlb page allocations?
>>>
>>>> Fadump kernel: gigantic hugepage not allocated
>>>> ===============================================
>>>>
>>>> dmesg | grep -i "hugetlb"
>>>> -------------------------
>>>> [    0.000000] HugeTLB: unsupported hugepagesz=1G
>>>> [    0.000000] HugeTLB: hugepages=1 does not follow a valid hugepagesz, ignoring
>>>> [    0.706375] HugeTLB support is disabled!
>>>> [    0.773530] hugetlbfs: disabling because there are no supported hugepage sizes
>>>>
>>>> $ cat /proc/meminfo | grep -i "hugetlb"
>>>> ----------------------------------
>>>> <Nothing>
> What I meant was, when we pass hugepagesz= and hugepages= and fadump=on,
> then during the fadump kernel, we still will see the above prints and
> nothing in meminfo even w/o this patch (because the user interfaces will
> be disabled since hugetlb_supported() is false).
> This observation should have been mentioned in the commit msg to avoid
> the confusion :)
>
>
>>>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>>>> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
>>>> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>>>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>>>> Cc: Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
>>> I guess the extra " in the above was not adding me in the cc list.
>>> Hence I missed to see this patch early.
>> Thanks for point it out. I will fix it.
>>
>>
>>> -ritesh
>>>
>>>
>>>> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>>>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>>>> ---
>>>> Changelog:
>>>>
>>>> v1:
>>>> https://lore.kernel.org/all/20250121150419.1342794-1-sourabhjain@linux.ibm.com/
>>>>
>>>> v2:
>>>> https://lore.kernel.org/all/20250124103220.111303-1-sourabhjain@linux.ibm.com/
>>>>    - disable gigantic hugepage in arch code, arch_hugetlb_valid_size()
>>>>
>>>> v3:
>>>> https://lore.kernel.org/all/20250125104928.88881-1-sourabhjain@linux.ibm.com/
>>>>    - Do not modify the initialization of the shift variable
>>>>
>>>> v4:
>>>> - Update commit message to include how hugepages_supported() detects
>>>>     hugepages support when fadump is active
>>>> - Add Reviewed-by tag
>>>> - NO functional change
>>>>
>>>> ---
>>>>    arch/powerpc/mm/hugetlbpage.c | 3 +++
>>>>    1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
>>>> index 6b043180220a..88cfd182db4e 100644
>>>> --- a/arch/powerpc/mm/hugetlbpage.c
>>>> +++ b/arch/powerpc/mm/hugetlbpage.c
>>>> @@ -138,6 +138,9 @@ bool __init arch_hugetlb_valid_size(unsigned long size)
>>>>    	int shift = __ffs(size);
>>>>    	int mmu_psize;
>>>>
>>>> +	if (!hugepages_supported())
>>>> +		return false;
>>>> +
>>>>    	/* Check that it is a page size supported by the hardware and
>>>>    	 * that it fits within pagetable and slice limits. */
>>>>    	if (size <= PAGE_SIZE || !is_power_of_2(size))
>>>> --
>>>> 2.48.1



  reply	other threads:[~2025-03-05  5:48 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-28  4:33 [PATCH v4] powerpc/hugetlb: Disable gigantic hugepages if fadump is active Sourabh Jain
2025-02-24  4:04 ` Sourabh Jain
2025-03-02  6:35 ` Ritesh Harjani
2025-03-03  6:22   ` Sourabh Jain
2025-03-04  4:57     ` Ritesh Harjani
2025-03-05  5:48       ` Sourabh Jain [this message]
2025-03-05  9:38       ` Sourabh Jain
2025-03-05 19:17         ` Ritesh Harjani
2025-03-06  8:30           ` Sourabh Jain
2025-03-03  6:34   ` Sourabh Jain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=96f3bbe1-821c-4792-a22f-3ecc3aecb6a4@linux.ibm.com \
    --to=sourabhjain@linux.ibm.com \
    --cc=christophe.leroy@csgroup.eu \
    --cc=hbathini@linux.ibm.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=maddy@linux.ibm.com \
    --cc=mahesh@linux.ibm.com \
    --cc=mpe@ellerman.id.au \
    --cc=ritesh.list@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).