Re: [RFC PATCH 12/20] mm/hugetlb: make bootmem allocation work with KHO

From: sashiko-bot@kernel.org
To: "Jork Loeser" <jloeser@linux.microsoft.com>
Cc: linux-hyperv@vger.kernel.org
Subject: Re: [RFC PATCH 12/20] mm/hugetlb: make bootmem allocation work with KHO
Date: Thu, 28 May 2026 01:06:59 +0000
Message-ID: <20260528010659.E3A1C1F000E9@smtp.kernel.org>
In-Reply-To: <20260528004204.1484584-13-jloeser@linux.microsoft.com>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Incorrect NUMA node extraction for huge page allocation fallback due to inverted `node_exact` condition, breaking HugeTLB Vmemmap Optimization (HVO).
- [Medium] Silent memory leak in `memblock_alloc_nid_user` error path due to inherently failing `memblock_phys_free`.
--

commit 8f512f66429ea3e879e3104adb050f66df1c4fb3
Author: Pratyush Yadav (Google) <pratyush@kernel.org>

mm/hugetlb: make bootmem allocation work with KHO

This patch addresses KHO scratch size accounting issues by introducing
memblock_alloc_nid_user(). It ensures that huge pages are not allocated
from scratch areas and removes the MEMBLOCK_RSRV_KERN flag from these
allocations, preventing them from artificially inflating scratch space
requirements.

> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 571212b80835e..46f2b1bd5abe8 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3033,26 +3033,19 @@ static __init void *alloc_bootmem(struct hstate *h, int nid, bool node_exact)
[ ... ]
> -		if (node_exact)
> -			m = memblock_alloc_exact_nid_raw(huge_page_size(h),
> -				huge_page_size(h), 0,
> -				MEMBLOCK_ALLOC_ACCESSIBLE, nid);
> -		else {
> -			m = memblock_alloc_try_nid_raw(huge_page_size(h),
> -				huge_page_size(h), 0,
> -				MEMBLOCK_ALLOC_ACCESSIBLE, nid);
> +		m = memblock_alloc_nid_user(huge_page_size(h), huge_page_size(h),
> +					    nid, node_exact);
> +		if (m) {
>  			/*
>  			 * For pre-HVO to work correctly, pages need to be on
>  			 * the list for the node they were actually allocated
>  			 * from. That node may be different in the case of
> -			 * fallback by memblock_alloc_try_nid_raw. So,
> -			 * extract the actual node first.
> +			 * fallback by memblock_alloc_try_nid_raw. So, extract
> +			 * the actual node first.
>  			 */
> -			if (m)
> +			if (node_exact)
>  				listnode = early_pfn_to_nid(PHYS_PFN(__pa(m)));

[Severity: High]
Does this extract the node under the wrong condition?

The original code extracted the node only in the fallback path (!node_exact),
since an exact allocation is guaranteed to come from the requested node.

With this change, the lookup runs when node_exact is true, where it is
redundant, and is skipped when node_exact is false, which is the only case
where a fallback to another node could have happened.

Can this place fallback-allocated huge pages onto the wrong node's list,
breaking NUMA locality and pre-HVO assumptions?
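
To make the concern concrete, here is a minimal userspace sketch of the two
conditions. It is not kernel code: struct toy_alloc, listnode_as_posted() and
listnode_pre_patch() are hypothetical names, and the actual_nid field only
stands in for the result of early_pfn_to_nid(PHYS_PFN(__pa(m))).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a successful bootmem allocation. */
struct toy_alloc {
	int actual_nid;	/* node the memory really came from */
};

/* Condition as posted in the patch: look up the node only if node_exact. */
static int listnode_as_posted(int nid, bool node_exact,
			      const struct toy_alloc *m)
{
	int listnode = nid;

	assert(m != NULL);
	if (node_exact)
		listnode = m->actual_nid;
	return listnode;
}

/* Condition before the patch: look up the node only on the fallback path. */
static int listnode_pre_patch(int nid, bool node_exact,
			      const struct toy_alloc *m)
{
	int listnode = nid;

	assert(m != NULL);
	if (!node_exact)
		listnode = m->actual_nid;
	return listnode;
}
```

In this model, a request for node 0 with node_exact == false that falls back
to node 1 is listed on node 0 by the posted condition and on node 1 by the
pre-patch condition. With node_exact == true both variants agree.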

[ ... ]
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 6f76a6bb96d61..8cd52d34ad6e4 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
[ ... ]
> @@ -1784,6 +1817,47 @@ void * __init memblock_alloc_try_nid_raw(
> +void * __init memblock_alloc_nid_user(phys_addr_t size, phys_addr_t align,
> +				      int nid, bool exact_nid)
> +{
[ ... ]
> +	alloc = __memblock_alloc_range_nid(size, align, 0, MEMBLOCK_ALLOC_ACCESSIBLE,
> +					   nid, exact_nid, flags);
> +	if (!alloc)
> +		return NULL;
> +
> +	/* User memory should not be marked with RSRV_KERN. */
> +	if (memblock_reserved_clear_kern(alloc, size)) {
> +		memblock_phys_free(alloc, size);
> +		return NULL;
> +	}

[Severity: Medium]
Will this error path silently leak memory?

If memblock_reserved_clear_kern() fails, it is because
memblock_isolate_range() encountered an out-of-memory scenario when trying
to split a region and memblock_double_array() failed.

When we then call memblock_phys_free(), it eventually calls
memblock_remove_range(), which relies on the exact same
memblock_isolate_range() call to split the region before removing it.

Since the array is still full, this isolation will deterministically fail
again, and memblock_phys_free() will return without actually freeing the
memory, leaving it permanently reserved with MEMBLOCK_RSRV_KERN set.
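
The failure mode described above can be illustrated with a toy userspace
model. This is only a sketch under stated assumptions, not the memblock
implementation: every toy_* name is hypothetical, the region array is fixed at
two entries to stand in for a reserved array that cannot be doubled, and both
the flag-clearing helper and the free helper depend on the same isolate step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_REGIONS 2	/* hypothetical: array is full and cannot grow */

struct toy_region { unsigned long base, size; bool kern; };
struct toy_reserved { struct toy_region r[TOY_MAX_REGIONS]; size_t cnt; };

/* Split the region containing addr so that addr becomes a boundary. */
static int toy_split_at(struct toy_reserved *t, unsigned long addr)
{
	for (size_t i = 0; i < t->cnt; i++) {
		unsigned long end = t->r[i].base + t->r[i].size;

		if (addr <= t->r[i].base || addr >= end)
			continue;
		if (t->cnt == TOY_MAX_REGIONS)
			return -1;	/* stands in for a failed array doubling */
		for (size_t j = t->cnt; j > i + 1; j--)
			t->r[j] = t->r[j - 1];
		t->r[i + 1] = (struct toy_region){ addr, end - addr, t->r[i].kern };
		t->r[i].size = addr - t->r[i].base;
		t->cnt++;
		return 0;
	}
	return 0;	/* addr is already a boundary or lies in a hole */
}

/* Make [base, base + size) coincide with whole regions, or fail. */
static int toy_isolate(struct toy_reserved *t, unsigned long base,
		       unsigned long size)
{
	assert(t != NULL && size > 0);
	if (toy_split_at(t, base))
		return -1;
	return toy_split_at(t, base + size);
}

/* Clear the kern flag on the range; needs a successful isolate first. */
static int toy_clear_kern(struct toy_reserved *t, unsigned long base,
			  unsigned long size)
{
	if (toy_isolate(t, base, size))
		return -1;
	for (size_t i = 0; i < t->cnt; i++)
		if (t->r[i].base >= base &&
		    t->r[i].base + t->r[i].size <= base + size)
			t->r[i].kern = false;
	return 0;
}

/* Drop the range from the reserved set; needs the same isolate step. */
static int toy_free(struct toy_reserved *t, unsigned long base,
		    unsigned long size)
{
	size_t kept = 0;

	if (toy_isolate(t, base, size))
		return -1;	/* nothing is freed */
	for (size_t i = 0; i < t->cnt; i++) {
		bool inside = t->r[i].base >= base &&
			      t->r[i].base + t->r[i].size <= base + size;

		if (!inside)
			t->r[kept++] = t->r[i];
	}
	t->cnt = kept;
	return 0;
}

/* Total bytes still recorded as reserved. */
static unsigned long toy_reserved_bytes(const struct toy_reserved *t)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < t->cnt; i++)
		sum += t->r[i].size;
	return sum;
}
```

In this model, when the new allocation has been merged into a neighbouring
region and the array is already full, toy_clear_kern() fails on the split and
the follow-up toy_free() fails for the same reason, so the reserved byte count
does not change. If the array has a spare slot, the same toy_free() call
succeeds.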

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260528004204.1484584-1-jloeser@linux.microsoft.com?part=12
