[PATCH 1/2] kho: make sure preservations do not span multiple NUMA nodes

public inbox for linux-mm@kvack.org

* [PATCH 1/2] kho: make sure preservations do not span multiple NUMA nodes
@ 2026-03-09 12:34 Pratyush Yadav
  2026-03-09 12:34 ` [PATCH 2/2] kho: drop restriction on maximum page order Pratyush Yadav
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Pratyush Yadav @ 2026-03-09 12:34 UTC
  To: Alexander Graf, Mike Rapoport, Pasha Tatashin, Pratyush Yadav,
	Andrew Morton
  Cc: kexec, linux-mm, linux-kernel

From: "Pratyush Yadav (Google)" <pratyush@kernel.org>

The KHO restoration machinery is not capable of dealing with
preservations that span multiple NUMA nodes. kho_preserve_folio()
guarantees the preservation will only span one NUMA node since folios
can't span multiple nodes.

This leaves kho_preserve_pages(). While semantically
kho_preserve_pages() only deals with 0-order pages, so all preservations
should be single page only, in practice it combines preservations to
higher orders for efficiency. This can result in a preservation spanning
multiple nodes. Break up the preservations into a smaller order if that
happens.

Suggested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
---

Notes:
    Ref: https://lore.kernel.org/linux-mm/CA+CK2bDvaGmfkCPCMWM6gPcd4FfUyD6e5yWE+kNcma1vT3Jw3g@mail.gmail.com/

 kernel/liveupdate/kexec_handover.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
index cc68a3692905..bc9bd18294ee 100644
--- a/kernel/liveupdate/kexec_handover.c
+++ b/kernel/liveupdate/kexec_handover.c
@@ -869,9 +869,17 @@ int kho_preserve_pages(struct page *page, unsigned long nr_pages)
 	}
 
 	while (pfn < end_pfn) {
-		const unsigned int order =
+		unsigned int order =
 			min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn));
 
+		/*
+		 * Make sure all the pages in a single preservation are in the
+		 * same NUMA node. The restore machinery can not cope with a
+		 * preservation spanning multiple NUMA nodes.
+		 */
+		while (pfn_to_nid(pfn) != pfn_to_nid(pfn + (1UL << order) - 1))
+			order--;
+
 		err = __kho_preserve_order(track, pfn, order);
 		if (err) {
 			failed_pfn = pfn;
-- 
2.53.0.473.g4a7958ca14-goog
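
The order selection and NUMA split in the hunk above can be sketched in userspace as follows. This is only an illustration, not the kernel code: mock_pfn_to_nid(), NODE1_START_PFN, floor_log2(), trailing_zeros(), pick_order() and count_chunks() are hypothetical names, and the two-node layout with a boundary at PFN 0x1800 is an assumption made for the example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical layout: node 0 owns PFNs below this boundary, node 1 the rest. */
#define NODE1_START_PFN 0x1800UL

/* Stand-in for the kernel's pfn_to_nid(); assumption for this sketch only. */
static int mock_pfn_to_nid(unsigned long pfn)
{
	return pfn >= NODE1_START_PFN ? 1 : 0;
}

/* floor(log2(n)) for n > 0. */
static unsigned int floor_log2(unsigned long n)
{
	unsigned int r = 0;

	while (n >>= 1)
		r++;
	return r;
}

/* Number of trailing zero bits; this sketch treats 0 as maximally aligned. */
static unsigned int trailing_zeros(unsigned long n)
{
	unsigned int r = 0;

	if (n == 0)
		return sizeof(n) * CHAR_BIT;
	while (!(n & 1UL)) {
		n >>= 1;
		r++;
	}
	return r;
}

/* Largest order that is aligned at pfn, fits before end_pfn, and stays on one node. */
static unsigned int pick_order(unsigned long pfn, unsigned long end_pfn)
{
	unsigned int align = trailing_zeros(pfn);
	unsigned int fit = floor_log2(end_pfn - pfn);
	unsigned int order = align < fit ? align : fit;

	/* Stops at order 0 at the latest, where first and last PFN coincide. */
	while (mock_pfn_to_nid(pfn) != mock_pfn_to_nid(pfn + (1UL << order) - 1))
		order--;
	return order;
}

/* Walk [pfn, end_pfn) and return how many single-node chunks it breaks into. */
static unsigned int count_chunks(unsigned long pfn, unsigned long end_pfn)
{
	unsigned int chunks = 0;

	while (pfn < end_pfn) {
		unsigned int order = pick_order(pfn, end_pfn);

		/* Every chunk starts and ends on the same (mock) node. */
		assert(mock_pfn_to_nid(pfn) ==
		       mock_pfn_to_nid(pfn + (1UL << order) - 1));
		pfn += 1UL << order;
		chunks++;
	}
	return chunks;
}
```

With the assumed boundary, the range [0x1000, 0x2000) would otherwise be a single order-12 chunk. The inner loop lowers it to order 11, so the walk produces two order-11 chunks, one per node.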




* [PATCH 2/2] kho: drop restriction on maximum page order
  2026-03-09 12:34 [PATCH 1/2] kho: make sure preservations do not span multiple NUMA nodes Pratyush Yadav
@ 2026-03-09 12:34 ` Pratyush Yadav
  2026-03-10 10:33   ` Mike Rapoport
  2026-03-09 15:59 ` [PATCH 1/2] kho: make sure preservations do not span multiple NUMA nodes Samiullah Khawaja
  2026-03-10 10:32 ` Mike Rapoport
  2 siblings, 1 reply; 8+ messages in thread
From: Pratyush Yadav @ 2026-03-09 12:34 UTC
  To: Alexander Graf, Mike Rapoport, Pasha Tatashin, Pratyush Yadav,
	Andrew Morton
  Cc: kexec, linux-mm, linux-kernel

KHO currently restricts the maximum order of a restored page to the
maximum order supported by the buddy allocator. While this works fine
for much of the data passed across kexec, it is possible to have pages
larger than MAX_PAGE_ORDER.

For one, it is possible to get a larger order when using
kho_preserve_pages() if the number of pages is large enough, since it
tries to combine multiple aligned 0-order preservations into one higher
order preservation.

For another, upcoming support for hugepages can have gigantic hugepages
being preserved over KHO.

There is no real reason for this limit. The KHO preservation machinery
can handle any page order. Remove this artificial restriction on max
page order.

Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
---

Notes:
    This patch was first sent with this RFC series [0]. I am sending it
    separately since it is an independent patch that is useful even without
    hugepage preservation. No changes since the RFC.
    
    [0] https://lore.kernel.org/linux-mm/20251206230222.853493-1-pratyush@kernel.org/T/#u

 kernel/liveupdate/kexec_handover.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
index bc9bd18294ee..1038e41ff9f9 100644
--- a/kernel/liveupdate/kexec_handover.c
+++ b/kernel/liveupdate/kexec_handover.c
@@ -253,7 +253,7 @@ static struct page *kho_restore_page(phys_addr_t phys, bool is_folio)
 	 * check also implicitly makes sure phys is order-aligned since for
 	 * non-order-aligned phys addresses, magic will never be set.
 	 */
-	if (WARN_ON_ONCE(info.magic != KHO_PAGE_MAGIC || info.order > MAX_PAGE_ORDER))
+	if (WARN_ON_ONCE(info.magic != KHO_PAGE_MAGIC))
 		return NULL;
 	nr_pages = (1 << info.order);
 
-- 
2.53.0.473.g4a7958ca14-goog
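
As a rough illustration of why the removed check matters, the sketch below models the old order limit in userspace. The values are assumptions for the example only: MAX_PAGE_ORDER is taken to be 10 and the page size 4 KiB, both of which depend on architecture and configuration. old_check_rejects() and order_to_bytes() are hypothetical helpers, not KHO functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration; the real ones are config dependent. */
#define ASSUMED_MAX_PAGE_ORDER 10U
#define ASSUMED_PAGE_SHIFT 12U /* 4 KiB base pages */

/* Models only the order half of the condition dropped by this patch. */
static bool old_check_rejects(unsigned int order)
{
	return order > ASSUMED_MAX_PAGE_ORDER;
}

/* Size in bytes of a preservation of the given order. */
static unsigned long long order_to_bytes(unsigned int order)
{
	return 1ULL << (order + ASSUMED_PAGE_SHIFT);
}
```

Under these assumptions a 2 MiB preservation (order 9) passes the old check, while a 1 GiB gigantic hugepage (order 18) or a 2048-page aligned run combined by kho_preserve_pages() (order 11) would have been rejected.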




* Re: [PATCH 1/2] kho: make sure preservations do not span multiple NUMA nodes
  2026-03-09 12:34 [PATCH 1/2] kho: make sure preservations do not span multiple NUMA nodes Pratyush Yadav
  2026-03-09 12:34 ` [PATCH 2/2] kho: drop restriction on maximum page order Pratyush Yadav
@ 2026-03-09 15:59 ` Samiullah Khawaja
  2026-03-10 10:32 ` Mike Rapoport
  2 siblings, 0 replies; 8+ messages in thread
From: Samiullah Khawaja @ 2026-03-09 15:59 UTC
  To: Pratyush Yadav
  Cc: Alexander Graf, Mike Rapoport, Pasha Tatashin, Andrew Morton,
	kexec, linux-mm, linux-kernel

On Mon, Mar 09, 2026 at 12:34:06PM +0000, Pratyush Yadav wrote:
>From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
>
>The KHO restoration machinery is not capable of dealing with
>preservations that span multiple NUMA nodes. kho_preserve_folio()
>guarantees the preservation will only span one NUMA node since folios
>can't span multiple nodes.
>
>This leaves kho_preserve_pages(). While semantically
>kho_preserve_pages() only deals with 0-order pages, so all preservations
>should be single page only, in practice it combines preservations to
>higher orders for efficiency. This can result in a preservation spanning
>multiple nodes. Break up the preservations into a smaller order if that
>happens.
>
>Suggested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
>Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
>---
>
>Notes:
>    Ref: https://lore.kernel.org/linux-mm/CA+CK2bDvaGmfkCPCMWM6gPcd4FfUyD6e5yWE+kNcma1vT3Jw3g@mail.gmail.com/
>
> kernel/liveupdate/kexec_handover.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
>diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
>index cc68a3692905..bc9bd18294ee 100644
>--- a/kernel/liveupdate/kexec_handover.c
>+++ b/kernel/liveupdate/kexec_handover.c
>@@ -869,9 +869,17 @@ int kho_preserve_pages(struct page *page, unsigned long nr_pages)
> 	}
>
> 	while (pfn < end_pfn) {
>-		const unsigned int order =
>+		unsigned int order =
> 			min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn));
>
>+		/*
>+		 * Make sure all the pages in a single preservation are in the
>+		 * same NUMA node. The restore machinery can not cope with a
>+		 * preservation spanning multiple NUMA nodes.
>+		 */
>+		while (pfn_to_nid(pfn) != pfn_to_nid(pfn + (1UL << order) - 1))
>+			order--;
>+
> 		err = __kho_preserve_order(track, pfn, order);
> 		if (err) {
> 			failed_pfn = pfn;
>-- 
>2.53.0.473.g4a7958ca14-goog
>
>

Reviewed-by: Samiullah Khawaja <skhawaja@google.com>



* Re: [PATCH 1/2] kho: make sure preservations do not span multiple NUMA nodes
  2026-03-09 12:34 [PATCH 1/2] kho: make sure preservations do not span multiple NUMA nodes Pratyush Yadav
  2026-03-09 12:34 ` [PATCH 2/2] kho: drop restriction on maximum page order Pratyush Yadav
  2026-03-09 15:59 ` [PATCH 1/2] kho: make sure preservations do not span multiple NUMA nodes Samiullah Khawaja
@ 2026-03-10 10:32 ` Mike Rapoport
  2 siblings, 0 replies; 8+ messages in thread
From: Mike Rapoport @ 2026-03-10 10:32 UTC
  To: Pratyush Yadav
  Cc: Alexander Graf, Pasha Tatashin, Andrew Morton, kexec, linux-mm,
	linux-kernel

On Mon, Mar 09, 2026 at 12:34:06PM +0000, Pratyush Yadav wrote:
> From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
> 
> The KHO restoration machinery is not capable of dealing with
> preservations that span multiple NUMA nodes. kho_preserve_folio()
> guarantees the preservation will only span one NUMA node since folios
> can't span multiple nodes.
> 
> This leaves kho_preserve_pages(). While semantically
> kho_preserve_pages() only deals with 0-order pages, so all preservations
> should be single page only, in practice it combines preservations to
> higher orders for efficiency. This can result in a preservation spanning
> multiple nodes. Break up the preservations into a smaller order if that
> happens.
> 
> Suggested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

> ---
> 
> Notes:
>     Ref: https://lore.kernel.org/linux-mm/CA+CK2bDvaGmfkCPCMWM6gPcd4FfUyD6e5yWE+kNcma1vT3Jw3g@mail.gmail.com/
> 
>  kernel/liveupdate/kexec_handover.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index cc68a3692905..bc9bd18294ee 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
> @@ -869,9 +869,17 @@ int kho_preserve_pages(struct page *page, unsigned long nr_pages)
>  	}
>  
>  	while (pfn < end_pfn) {
> -		const unsigned int order =
> +		unsigned int order =
>  			min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn));
>  
> +		/*
> +		 * Make sure all the pages in a single preservation are in the
> +		 * same NUMA node. The restore machinery can not cope with a
> +		 * preservation spanning multiple NUMA nodes.
> +		 */
> +		while (pfn_to_nid(pfn) != pfn_to_nid(pfn + (1UL << order) - 1))
> +			order--;
> +
>  		err = __kho_preserve_order(track, pfn, order);
>  		if (err) {
>  			failed_pfn = pfn;
> -- 
> 2.53.0.473.g4a7958ca14-goog
> 

-- 
Sincerely yours,
Mike.



* Re: [PATCH 2/2] kho: drop restriction on maximum page order
  2026-03-09 12:34 ` [PATCH 2/2] kho: drop restriction on maximum page order Pratyush Yadav
@ 2026-03-10 10:33   ` Mike Rapoport
  2026-03-17  9:12     ` Pratyush Yadav
  0 siblings, 1 reply; 8+ messages in thread
From: Mike Rapoport @ 2026-03-10 10:33 UTC
  To: Pratyush Yadav
  Cc: Alexander Graf, Pasha Tatashin, Andrew Morton, kexec, linux-mm,
	linux-kernel

On Mon, Mar 09, 2026 at 12:34:07PM +0000, Pratyush Yadav wrote:
> KHO currently restricts the maximum order of a restored page to the
> maximum order supported by the buddy allocator. While this works fine
> for much of the data passed across kexec, it is possible to have pages
> larger than MAX_PAGE_ORDER.
> 
> For one, it is possible to get a larger order when using
> kho_preserve_pages() if the number of pages is large enough, since it
> tries to combine multiple aligned 0-order preservations into one higher
> order preservation.
> 
> For another, upcoming support for hugepages can have gigantic hugepages
> being preserved over KHO.
> 
> There is no real reason for this limit. The KHO preservation machinery
> can handle any page order. Remove this artificial restriction on max
> page order.
> 
> Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>

One SOB should be enough ;-)

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

> ---
> 
> Notes:
>     This patch was first sent with this RFC series [0]. I am sending it
>     separately since it is an independent patch that is useful even without
>     hugepage preservation. No changes since the RFC.
>     
>     [0] https://lore.kernel.org/linux-mm/20251206230222.853493-1-pratyush@kernel.org/T/#u
> 
>  kernel/liveupdate/kexec_handover.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> index bc9bd18294ee..1038e41ff9f9 100644
> --- a/kernel/liveupdate/kexec_handover.c
> +++ b/kernel/liveupdate/kexec_handover.c
> @@ -253,7 +253,7 @@ static struct page *kho_restore_page(phys_addr_t phys, bool is_folio)
>  	 * check also implicitly makes sure phys is order-aligned since for
>  	 * non-order-aligned phys addresses, magic will never be set.
>  	 */
> -	if (WARN_ON_ONCE(info.magic != KHO_PAGE_MAGIC || info.order > MAX_PAGE_ORDER))
> +	if (WARN_ON_ONCE(info.magic != KHO_PAGE_MAGIC))
>  		return NULL;
>  	nr_pages = (1 << info.order);
>  
> -- 
> 2.53.0.473.g4a7958ca14-goog
> 

-- 
Sincerely yours,
Mike.



* Re: [PATCH 2/2] kho: drop restriction on maximum page order
  2026-03-10 10:33   ` Mike Rapoport
@ 2026-03-17  9:12     ` Pratyush Yadav
  2026-03-17 11:04       ` Mike Rapoport
  0 siblings, 1 reply; 8+ messages in thread
From: Pratyush Yadav @ 2026-03-17  9:12 UTC
  To: Mike Rapoport
  Cc: Pratyush Yadav, Alexander Graf, Pasha Tatashin, Andrew Morton,
	kexec, linux-mm, linux-kernel

On Tue, Mar 10 2026, Mike Rapoport wrote:

> On Mon, Mar 09, 2026 at 12:34:07PM +0000, Pratyush Yadav wrote:
>> KHO currently restricts the maximum order of a restored page to the
>> maximum order supported by the buddy allocator. While this works fine
>> for much of the data passed across kexec, it is possible to have pages
>> larger than MAX_PAGE_ORDER.
>> 
>> For one, it is possible to get a larger order when using
>> kho_preserve_pages() if the number of pages is large enough, since it
>> tries to combine multiple aligned 0-order preservations into one higher
>> order preservation.
>> 
>> For another, upcoming support for hugepages can have gigantic hugepages
>> being preserved over KHO.
>> 
>> There is no real reason for this limit. The KHO preservation machinery
>> can handle any page order. Remove this artificial restriction on max
>> page order.
>> 
>> Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
>> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
>
> One SOB should be enough ;-)

Hmm, I figured the unemployed me (who originally wrote the patch) and
the employed-by-google me (who is doing this new version) would count as
two separate entities and there should be a S-o-b for both.

Anyway, I am fine with dropping either one of the two.

>
> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

Thanks!

[...]

-- 
Regards,
Pratyush Yadav



* Re: [PATCH 2/2] kho: drop restriction on maximum page order
  2026-03-17  9:12     ` Pratyush Yadav
@ 2026-03-17 11:04       ` Mike Rapoport
  2026-03-20 10:24         ` Pratyush Yadav
  0 siblings, 1 reply; 8+ messages in thread
From: Mike Rapoport @ 2026-03-17 11:04 UTC
  To: Pratyush Yadav
  Cc: Alexander Graf, Pasha Tatashin, Andrew Morton, kexec, linux-mm,
	linux-kernel

On Tue, Mar 17, 2026 at 09:12:19AM +0000, Pratyush Yadav wrote:
> On Tue, Mar 10 2026, Mike Rapoport wrote:
> 
> > On Mon, Mar 09, 2026 at 12:34:07PM +0000, Pratyush Yadav wrote:
> >> KHO currently restricts the maximum order of a restored page to the
> >> maximum order supported by the buddy allocator. While this works fine
> >> for much of the data passed across kexec, it is possible to have pages
> >> larger than MAX_PAGE_ORDER.
> >> 
> >> For one, it is possible to get a larger order when using
> >> kho_preserve_pages() if the number of pages is large enough, since it
> >> tries to combine multiple aligned 0-order preservations into one higher
> >> order preservation.
> >> 
> >> For another, upcoming support for hugepages can have gigantic hugepages
> >> being preserved over KHO.
> >> 
> >> There is no real reason for this limit. The KHO preservation machinery
> >> can handle any page order. Remove this artificial restriction on max
> >> page order.
> >> 
> >> Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
> >> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
> >
> > One SOB should be enough ;-)
> 
> Hmm, I figured the unemployed me (who originally wrote the patch) and
> the employed-by-google me (who is doing this new version) would count as
> two separate entities and there should be a S-o-b for both.

Maybe then something like:

Signed-off-by: Pratyush Yadav (Hobbyist) <pratyush@kernel.org>
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
 
;-)

> Anyway, I am fine with dropping either one of the two.
> 
> >
> > Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> 
> Thanks!
> 
> [...]
> 
> -- 
> Regards,
> Pratyush Yadav

-- 
Sincerely yours,
Mike.



* Re: [PATCH 2/2] kho: drop restriction on maximum page order
  2026-03-17 11:04       ` Mike Rapoport
@ 2026-03-20 10:24         ` Pratyush Yadav
  0 siblings, 0 replies; 8+ messages in thread
From: Pratyush Yadav @ 2026-03-20 10:24 UTC
  To: Mike Rapoport
  Cc: Pratyush Yadav, Alexander Graf, Pasha Tatashin, Andrew Morton,
	kexec, linux-mm, linux-kernel

On Tue, Mar 17 2026, Mike Rapoport wrote:

> On Tue, Mar 17, 2026 at 09:12:19AM +0000, Pratyush Yadav wrote:
>> On Tue, Mar 10 2026, Mike Rapoport wrote:
>> 
>> > On Mon, Mar 09, 2026 at 12:34:07PM +0000, Pratyush Yadav wrote:
>> >> KHO currently restricts the maximum order of a restored page to the
>> >> maximum order supported by the buddy allocator. While this works fine
>> >> for much of the data passed across kexec, it is possible to have pages
>> >> larger than MAX_PAGE_ORDER.
>> >> 
>> >> For one, it is possible to get a larger order when using
>> >> kho_preserve_pages() if the number of pages is large enough, since it
>> >> tries to combine multiple aligned 0-order preservations into one higher
>> >> order preservation.
>> >> 
>> >> For another, upcoming support for hugepages can have gigantic hugepages
>> >> being preserved over KHO.
>> >> 
>> >> There is no real reason for this limit. The KHO preservation machinery
>> >> can handle any page order. Remove this artificial restriction on max
>> >> page order.
>> >> 
>> >> Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
>> >> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
>> >
>> > One SOB should be enough ;-)
>> 
>> Hmm, I figured the unemployed me (who originally wrote the patch) and
>> the employed-by-google me (who is doing this new version) would count as
>> two separate entities and there should be a S-o-b for both.
>
> Maybe then something like:
>
> Signed-off-by: Pratyush Yadav (Hobbyist) <pratyush@kernel.org>
> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
>  
> ;-)
>
Yeah, that works too :-)

I'll use that when I redo some of the HugeTLB patches too.

-- 
Regards,
Pratyush Yadav


