Linux-mm Archive on lore.kernel.org
* [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range
@ 2026-05-20  9:34 Yuan Liu
  2026-05-20  9:34 ` [PATCH v5 1/5] mm: move mirrored memory overlap checking to the outer loop Yuan Liu
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Yuan Liu @ 2026-05-20  9:34 UTC (permalink / raw)
  To: David Hildenbrand, Oscar Salvador, Mike Rapoport, Wei Yang
  Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
	Yu C Chen, Pan Deng, Tianyou Li, Chen Zhang, Jason Zeng,
	linux-kernel

This series introduces a pages_with_online_memmap member into struct zone
to avoid pageblock-by-pageblock scans across the entire zone and improve
memory hotplug performance.

For VM hotplug performance data, please refer to Patch 4.
This series also benefits CXL hotplug. Performance results are available at:
https://lore.kernel.org/all/20260409023552.GA2807@AE/

Patches 1 and 2 avoid pages_with_online_memmap overcounting when
kernelcore=mirror is enabled.

Patch 3 avoids incorrect accounting of pages_with_online_memmap when
subsection holes exist in early sections.

pfn_valid() says early sections always have a full memmap, so even invalid
subsections have a memmap. pfn_to_online_page() says an invalid subsection
cannot be online and its content must be stale. for_each_valid_pfn() follows
pfn_valid() semantics, and we use it to initialize memmap that is not going
to be online and account it as pages_with_online_memmap, which is wrong.

The cleanest approach is to avoid allocating memmap for invalid subsections,
which also removes the special early-section handling from pfn_valid() and
for_each_valid_pfn().

The price is another test_bit(idx, usage->subsection_map) check for early
sections. If that ever becomes a real problem, a future optimization could
cache "full valid section" in a section flag.
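
The semantic change can be sketched in user space as follows. This is only
an illustrative model, not the kernel code: the struct, the helper names
and the constants (a 128 MiB section made of 2 MiB subsections with 4 KiB
pages) are assumptions picked for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes: 32768 pages per section, 512 pages per subsection. */
#define SKETCH_PAGES_PER_SECTION	32768UL
#define SKETCH_PAGES_PER_SUBSECTION	512UL

/* Hypothetical stand-in for a memory section and its subsection bitmap. */
struct sketch_section {
	bool early;			/* section existed at boot */
	uint64_t subsection_map;	/* bit i set: subsection i is valid */
};

/* Test the subsection bit that covers @pfn. */
static bool sketch_subsection_valid(const struct sketch_section *ms,
				    unsigned long pfn)
{
	unsigned long idx = (pfn % SKETCH_PAGES_PER_SECTION) /
			    SKETCH_PAGES_PER_SUBSECTION;

	return (ms->subsection_map >> idx) & 1;
}

/* Old semantics: an early section is valid across its whole span. */
static bool sketch_pfn_valid_old(const struct sketch_section *ms,
				 unsigned long pfn)
{
	return ms->early || sketch_subsection_valid(ms, pfn);
}

/* New semantics: the bitmap is always consulted, early section or not. */
static bool sketch_pfn_valid_new(const struct sketch_section *ms,
				 unsigned long pfn)
{
	return sketch_subsection_valid(ms, pfn);
}
```

For an early section whose bitmap marks only subsection 0 as valid, the old
variant reports every PFN of the section as valid, while the new variant
reports only the PFNs of subsection 0.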

Patch 4 introduces pages_with_online_memmap to replace pageblock-by-pageblock
scans across the entire zone for zone contiguity checks.

Patch 5 avoids incorrectly shrinking the zone span during memory unplug when
a memory or hole boundary falls in the middle of a subsection.

v3:
    https://lore.kernel.org/linux-mm/20260408031615.1831922-1-yuan1.liu@intel.com/

v4:
    https://lore.kernel.org/linux-mm/20260421125508.2317429-1-yuan1.liu@intel.com/

v4 changes:
  1. Address the pages_with_online_memmap overcounting when kernelcore=mirror
     is enabled.

v5 changes:
  1. Rebase onto v7.1-rc3.
  2. Address incorrect pages_with_online_memmap accounting in several cases,
     including kernelcore=mirror, holes in early sections, and incorrect zone
     shrinking.

Yuan Liu (5):
  mm: move mirrored memory overlap checking to the outer loop
  mm: skip non-mirrored ZONE_NORMAL memory map init when
    kernelcore=mirror
  mm: remove the special early-section handling from pfn_valid() and
    for_each_valid_pfn()
  mm/memory_hotplug: optimize zone contiguous check when changing pfn
    range
  mm/memory_hotplug: improve shrink_zone_span() subsection boundary
    checks

 Documentation/mm/physical_memory.rst | 13 +++++
 drivers/base/memory.c                |  6 ++
 include/linux/mmzone.h               | 60 +++++++++++++++++---
 mm/internal.h                        |  8 +--
 mm/memory_hotplug.c                  | 54 +++++++++---------
 mm/mm_init.c                         | 83 +++++++++++-----------------
 mm/sparse-vmemmap.c                  |  4 +-
 7 files changed, 136 insertions(+), 92 deletions(-)

-- 
2.47.3




* [PATCH v5 1/5] mm: move mirrored memory overlap checking to the outer loop
  2026-05-20  9:34 [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Yuan Liu
@ 2026-05-20  9:34 ` Yuan Liu
  2026-05-21  3:33   ` Wei Yang
  2026-05-20  9:34 ` [PATCH v5 2/5] mm: skip non-mirrored ZONE_NORMAL memory map init when kernelcore=mirror Yuan Liu
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Yuan Liu @ 2026-05-20  9:34 UTC (permalink / raw)
  To: David Hildenbrand, Oscar Salvador, Mike Rapoport, Wei Yang
  Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
	Yu C Chen, Pan Deng, Tianyou Li, Chen Zhang, Jason Zeng,
	linux-kernel

Move the overlap memmap initialization check from memmap_init_range()
to memmap_init(), and replace the per-PFN check with a memblock-based
check.
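
A minimal user-space sketch of the idea, assuming a simplified region
descriptor: the skip decision is made once per memory region instead of
once per PFN. The names ov_region, ov_zone and ov_count_init_pfns() are
hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

enum ov_zone { OV_ZONE_NORMAL, OV_ZONE_MOVABLE };

/* Hypothetical stand-in for a memblock region. */
struct ov_region {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
	bool mirrored;
};

/*
 * Count the PFNs whose memmap would be initialized for @zone. A mirrored
 * region is skipped as a whole for the movable zone, so no per-PFN lookup
 * of the owning region is needed.
 */
static unsigned long ov_count_init_pfns(const struct ov_region *regions,
					size_t nr, enum ov_zone zone,
					bool mirrored_kernelcore)
{
	unsigned long count = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		const struct ov_region *r = &regions[i];

		if (mirrored_kernelcore && zone == OV_ZONE_MOVABLE &&
		    r->mirrored)
			continue;

		count += r->end_pfn - r->start_pfn;
	}

	return count;
}
```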

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Jason Zeng <jason.zeng@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
---
 mm/mm_init.c | 29 +++++------------------------
 1 file changed, 5 insertions(+), 24 deletions(-)

diff --git a/mm/mm_init.c b/mm/mm_init.c
index f9f8e1af921c..24e103a402b0 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -783,28 +783,6 @@ void __meminit init_deferred_page(unsigned long pfn, int nid)
 	__init_deferred_page(pfn, nid);
 }
 
-/* If zone is ZONE_MOVABLE but memory is mirrored, it is an overlapped init */
-static bool __meminit
-overlap_memmap_init(unsigned long zone, unsigned long *pfn)
-{
-	static struct memblock_region *r __meminitdata;
-
-	if (mirrored_kernelcore && zone == ZONE_MOVABLE) {
-		if (!r || *pfn >= memblock_region_memory_end_pfn(r)) {
-			for_each_mem_region(r) {
-				if (*pfn < memblock_region_memory_end_pfn(r))
-					break;
-			}
-		}
-		if (*pfn >= memblock_region_memory_base_pfn(r) &&
-		    memblock_is_mirror(r)) {
-			*pfn = memblock_region_memory_end_pfn(r);
-			return true;
-		}
-	}
-	return false;
-}
-
 /*
  * Only struct pages that correspond to ranges defined by memblock.memory
  * are zeroed and initialized by going through __init_single_page() during
@@ -891,8 +869,6 @@ void __meminit memmap_init_range(unsigned long size, int nid, unsigned long zone
 		 * function.  They do not exist on hotplugged memory.
 		 */
 		if (context == MEMINIT_EARLY) {
-			if (overlap_memmap_init(zone, &pfn))
-				continue;
 			if (defer_init(nid, pfn, zone_end_pfn)) {
 				deferred_struct_pages = true;
 				break;
@@ -956,6 +932,7 @@ static void __init memmap_init(void)
 	int i, j, zone_id = 0, nid;
 
 	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
+		struct memblock_region *r = &memblock.memory.regions[i];
 		struct pglist_data *node = NODE_DATA(nid);
 
 		for (j = 0; j < MAX_NR_ZONES; j++) {
@@ -964,6 +941,10 @@ static void __init memmap_init(void)
 			if (!populated_zone(zone))
 				continue;
 
+			if (mirrored_kernelcore && j == ZONE_MOVABLE &&
+			    memblock_is_mirror(r))
+				continue;
+
 			memmap_init_zone_range(zone, start_pfn, end_pfn,
 					       &hole_pfn);
 			zone_id = j;
-- 
2.47.3




* [PATCH v5 2/5] mm: skip non-mirrored ZONE_NORMAL memory map init when kernelcore=mirror
  2026-05-20  9:34 [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Yuan Liu
  2026-05-20  9:34 ` [PATCH v5 1/5] mm: move mirrored memory overlap checking to the outer loop Yuan Liu
@ 2026-05-20  9:34 ` Yuan Liu
  2026-05-20  9:34 ` [PATCH v5 3/5] mm: remove the special early-section handling from pfn_valid() and for_each_valid_pfn() Yuan Liu
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Yuan Liu @ 2026-05-20  9:34 UTC (permalink / raw)
  To: David Hildenbrand, Oscar Salvador, Mike Rapoport, Wei Yang
  Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
	Yu C Chen, Pan Deng, Tianyou Li, Chen Zhang, Jason Zeng,
	linux-kernel

Mirrored regions are already skipped when initializing ZONE_MOVABLE, but
overlapping PFNs can still be initialized from the ZONE_NORMAL path when
ZONE_MOVABLE is present on the node.

When zone_movable_pfn[nid] is set, skip ZONE_NORMAL initialization for
non-mirrored regions and keep skipping mirrored regions for ZONE_MOVABLE.
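
The resulting skip rule can be written as a small predicate. The sketch
below is a user-space illustration only; sk_zone and
sk_skip_memmap_init() are hypothetical names, and the zone_movable_pfn
argument stands in for zone_movable_pfn[nid].

```c
#include <stdbool.h>

enum sk_zone { SK_ZONE_NORMAL, SK_ZONE_MOVABLE };

/*
 * Return true if memmap initialization of a region should be skipped
 * for @zone when kernelcore=mirror is in effect.
 */
static bool sk_skip_memmap_init(bool mirrored_kernelcore, enum sk_zone zone,
				bool region_is_mirror,
				unsigned long zone_movable_pfn)
{
	if (!mirrored_kernelcore)
		return false;

	/* Non-mirrored memory belongs to Movable if the node has one. */
	if (zone == SK_ZONE_NORMAL && !region_is_mirror && zone_movable_pfn)
		return true;

	/* Mirrored memory never belongs to Movable. */
	if (zone == SK_ZONE_MOVABLE && region_is_mirror)
		return true;

	return false;
}
```

With these two rules, a region that overlaps both zone spans is
initialized from exactly one of the two paths.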

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Jason Zeng <jason.zeng@intel.com>
Co-developed-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
---
 mm/mm_init.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/mm/mm_init.c b/mm/mm_init.c
index 24e103a402b0..2a5ac175d5dd 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -941,9 +941,18 @@ static void __init memmap_init(void)
 			if (!populated_zone(zone))
 				continue;
 
-			if (mirrored_kernelcore && j == ZONE_MOVABLE &&
-			    memblock_is_mirror(r))
-				continue;
+			if (mirrored_kernelcore) {
+				/*
+				 * Avoid double initialization of PFNs that overlap
+				 * between Normal and Movable zones.
+				 */
+				if (j == ZONE_NORMAL && !memblock_is_mirror(r) &&
+				    zone_movable_pfn[nid])
+					continue;
+
+				if (j == ZONE_MOVABLE && memblock_is_mirror(r))
+					continue;
+			}
 
 			memmap_init_zone_range(zone, start_pfn, end_pfn,
 					       &hole_pfn);
-- 
2.47.3




* [PATCH v5 3/5] mm: remove the special early-section handling from pfn_valid() and for_each_valid_pfn()
  2026-05-20  9:34 [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Yuan Liu
  2026-05-20  9:34 ` [PATCH v5 1/5] mm: move mirrored memory overlap checking to the outer loop Yuan Liu
  2026-05-20  9:34 ` [PATCH v5 2/5] mm: skip non-mirrored ZONE_NORMAL memory map init when kernelcore=mirror Yuan Liu
@ 2026-05-20  9:34 ` Yuan Liu
  2026-05-20  9:34 ` [PATCH v5 4/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Yuan Liu
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Yuan Liu @ 2026-05-20  9:34 UTC (permalink / raw)
  To: David Hildenbrand, Oscar Salvador, Mike Rapoport, Wei Yang
  Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
	Yu C Chen, Pan Deng, Tianyou Li, Chen Zhang, Jason Zeng,
	linux-kernel

Make pfn_valid() return 0 for PFNs that fall into invalid subsections
in early sections. Make for_each_valid_pfn() skip PFNs that fall into
invalid subsections in early sections.

This change is in preparation for optimizing zone contiguity checks
based on pages_with_online_memmap.

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Jason Zeng <jason.zeng@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
---
 include/linux/mmzone.h | 13 ++++++-------
 mm/sparse-vmemmap.c    |  4 ++--
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 9adb2ad21da5..783084f8bbfe 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -2259,6 +2259,10 @@ void sparse_init_early_section(int nid, struct page *map, unsigned long pnum,
  * there is actual usable memory at that @pfn. The struct page may
  * represent a hole or an unusable page frame.
  *
+ * Note that this function returns 0 for PFNs that fall into
+ * invalid subsections as part of early sections, even though there would
+ * currently be a memmap allocated (that should not be touched).
+ *
  * Return: 1 for PFNs that have memory map entries and 0 otherwise
  */
 static inline int pfn_valid(unsigned long pfn)
@@ -2283,11 +2287,7 @@ static inline int pfn_valid(unsigned long pfn)
 		rcu_read_unlock_sched();
 		return 0;
 	}
-	/*
-	 * Traditionally early sections always returned pfn_valid() for
-	 * the entire section-sized span.
-	 */
-	ret = early_section(ms) || pfn_section_valid(ms, pfn);
+	ret = pfn_section_valid(ms, pfn);
 	rcu_read_unlock_sched();
 
 	return ret;
@@ -2303,8 +2303,7 @@ static inline unsigned long first_valid_pfn(unsigned long pfn, unsigned long end
 	while (nr <= __highest_present_section_nr && pfn < end_pfn) {
 		struct mem_section *ms = __pfn_to_section(pfn);
 
-		if (valid_section(ms) &&
-		    (early_section(ms) || pfn_section_first_valid(ms, &pfn))) {
+		if (valid_section(ms) && pfn_section_first_valid(ms, &pfn)) {
 			rcu_read_unlock_sched();
 			return pfn;
 		}
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 6eadb9d116e4..c6eefbb6013f 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -771,8 +771,8 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
 	}
 
 	/*
-	 * The memmap of early sections is always fully populated. See
-	 * section_activate() and pfn_valid() .
+	 * The memmap of early sections is currently always fully populated. See
+	 * section_activate().
 	 */
 	if (!section_is_early) {
 		memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)));
-- 
2.47.3




* [PATCH v5 4/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range
  2026-05-20  9:34 [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Yuan Liu
                   ` (2 preceding siblings ...)
  2026-05-20  9:34 ` [PATCH v5 3/5] mm: remove the special early-section handling from pfn_valid() and for_each_valid_pfn() Yuan Liu
@ 2026-05-20  9:34 ` Yuan Liu
  2026-05-20  9:34 ` [PATCH v5 5/5] mm/memory_hotplug: improve shrink_zone_span() subsection boundary checks Yuan Liu
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Yuan Liu @ 2026-05-20  9:34 UTC (permalink / raw)
  To: David Hildenbrand, Oscar Salvador, Mike Rapoport, Wei Yang
  Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
	Yu C Chen, Pan Deng, Tianyou Li, Chen Zhang, Jason Zeng,
	linux-kernel

When move_pfn_range_to_zone() or remove_pfn_range_from_zone() updates a
zone, set_zone_contiguous() rescans the entire zone pageblock-by-pageblock
to rebuild zone->contiguous. For large zones this is a significant cost
during memory hotplug and hot-unplug.

Add a new zone member pages_with_online_memmap that tracks the number of
pages within the zone span that have an online memory map (including
present pages and memory holes whose memory map has been initialized).
When spanned_pages == pages_with_online_memmap the zone is contiguous and
pfn_to_page() can be called on any PFN in the zone span without further
pfn_valid() checks.

Only pages that fall within the current zone span are accounted towards
pages_with_online_memmap. A "too small" value is safe: it merely prevents
detecting a contiguous zone.
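
The counter-based check can be modelled in user space as shown below. This
is a sketch under simplifying assumptions, not the kernel implementation:
zone_model and zm_online_range() are hypothetical, and locking as well as
zone shrinking are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant fields of struct zone. */
struct zone_model {
	unsigned long spanned_pages;
	unsigned long pages_with_online_memmap;
	bool contiguous;
	bool is_device;		/* models zone_is_zone_device() */
};

static void zm_clear_zone_contiguous(struct zone_model *z)
{
	z->contiguous = false;
}

/* O(1) check: compare two counters instead of scanning pageblocks. */
static void zm_set_zone_contiguous(struct zone_model *z)
{
	if (z->is_device)
		return;
	if (z->spanned_pages == z->pages_with_online_memmap)
		z->contiguous = true;
}

/*
 * Online @nr_pages pages, growing the zone span by @span_growth pages.
 * A span growth larger than @nr_pages models a hole inside the span.
 */
static void zm_online_range(struct zone_model *z, unsigned long span_growth,
			    unsigned long nr_pages)
{
	zm_clear_zone_contiguous(z);
	z->spanned_pages += span_growth;
	z->pages_with_online_memmap += nr_pages;
	zm_set_zone_contiguous(z);
}
```

In this model, onlining a range that leaves a hole in the span keeps the
zone non-contiguous until later ranges fill the hole and both counters
match again.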

The following test cases of memory hotplug for a VM [1], tested in the
environment [2], show that this optimization can significantly reduce the
memory hotplug time [3].

+----------------+------+---------------+--------------+----------------+
|                | Size | Time (before) | Time (after) | Time Reduction |
|                +------+---------------+--------------+----------------+
| Plug Memory    | 256G |      10s      |      3s      |       70%      |
|                +------+---------------+--------------+----------------+
|                | 512G |      36s      |      7s      |       81%      |
+----------------+------+---------------+--------------+----------------+

+----------------+------+---------------+--------------+----------------+
|                | Size | Time (before) | Time (after) | Time Reduction |
|                +------+---------------+--------------+----------------+
| Unplug Memory  | 256G |      11s      |      4s      |       64%      |
|                +------+---------------+--------------+----------------+
|                | 512G |      36s      |      9s      |       75%      |
+----------------+------+---------------+--------------+----------------+

[1] QEMU commands to hotplug 256G/512G memory for a VM:
    object_add memory-backend-ram,id=hotmem0,size=256G/512G,share=on
    device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
    qom-set vmem1 requested-size 256G/512G (Plug Memory)
    qom-set vmem1 requested-size 0G (Unplug Memory)

[2] Hardware     : Intel Icelake server
    Guest Kernel : v7.0-rc4
    QEMU         : v9.0.0

    Launch VM    :
    qemu-system-x86_64 -accel kvm -cpu host \
    -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
    -drive file=./seed.img,format=raw,if=virtio \
    -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
    -m 2G,slots=10,maxmem=2052472M \
    -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
    -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
    -nographic -machine q35 \
    -nic user,hostfwd=tcp::3000-:22

    Guest kernel auto-onlines newly added memory blocks:
    echo online > /sys/devices/system/memory/auto_online_blocks

[3] The time from typing the QEMU commands in [1] to when the output of
    'grep MemTotal /proc/meminfo' on the guest shows that all hotplugged
    memory is recognized.

Reported-by: Nanhai Zou <nanhai.zou@intel.com>
Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
Tested-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
Reviewed-by: Pan Deng <pan.deng@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Co-developed-by: Tianyou Li <tianyou.li@intel.com>
Signed-off-by: Tianyou Li <tianyou.li@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
---
 Documentation/mm/physical_memory.rst | 13 ++++++++
 drivers/base/memory.c                |  6 ++++
 include/linux/mmzone.h               | 47 ++++++++++++++++++++++++++++
 mm/internal.h                        |  8 +----
 mm/memory_hotplug.c                  | 12 ++-----
 mm/mm_init.c                         | 45 +++++++++++---------------
 6 files changed, 87 insertions(+), 44 deletions(-)

diff --git a/Documentation/mm/physical_memory.rst b/Documentation/mm/physical_memory.rst
index b76183545e5b..0aa65e6b5499 100644
--- a/Documentation/mm/physical_memory.rst
+++ b/Documentation/mm/physical_memory.rst
@@ -483,6 +483,19 @@ General
   ``present_pages`` should use ``get_online_mems()`` to get a stable value. It
   is initialized by ``calculate_node_totalpages()``.
 
+``pages_with_online_memmap``
+  Tracks pages within the zone that have an online memory map (present pages
+  and memory holes whose memory map has been initialized). When
+  ``spanned_pages`` == ``pages_with_online_memmap``, ``pfn_to_page()`` can be
+  performed without further checks on any PFN within the zone span.
+
+  Note: this counter may temporarily undercount when pages with an online
+  memory map exist outside the current zone span. This can only happen during
+  boot, when initializing the memory map of pages that do not fall into any
+  zone span. Growing the zone to cover such pages and later shrinking it back
+  may result in a "too small" value. This is safe: it merely prevents
+  detecting a contiguous zone.
+
 ``present_early_pages``
   The present pages existing within the zone located on memory available since
   early boot, excluding hotplugged memory. Defined only when
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index f806a683b767..e029699d89a6 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -246,6 +246,7 @@ static int memory_block_online(struct memory_block *mem)
 		nr_vmemmap_pages = mem->altmap->free;
 
 	mem_hotplug_begin();
+	clear_zone_contiguous(zone);
 	if (nr_vmemmap_pages) {
 		ret = mhp_init_memmap_on_memory(start_pfn, nr_vmemmap_pages, zone);
 		if (ret)
@@ -270,6 +271,7 @@ static int memory_block_online(struct memory_block *mem)
 
 	mem->zone = zone;
 out:
+	set_zone_contiguous(zone);
 	mem_hotplug_done();
 	return ret;
 }
@@ -282,6 +284,7 @@ static int memory_block_offline(struct memory_block *mem)
 	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
 	unsigned long nr_vmemmap_pages = 0;
+	struct zone *zone;
 	int ret;
 
 	if (!mem->zone)
@@ -294,7 +297,9 @@ static int memory_block_offline(struct memory_block *mem)
 	if (mem->altmap)
 		nr_vmemmap_pages = mem->altmap->free;
 
+	zone = mem->zone;
 	mem_hotplug_begin();
+	clear_zone_contiguous(zone);
 	if (nr_vmemmap_pages)
 		adjust_present_page_count(pfn_to_page(start_pfn), mem->group,
 					  -nr_vmemmap_pages);
@@ -314,6 +319,7 @@ static int memory_block_offline(struct memory_block *mem)
 
 	mem->zone = NULL;
 out:
+	set_zone_contiguous(zone);
 	mem_hotplug_done();
 	return ret;
 }
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 783084f8bbfe..374e73ec1356 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1031,6 +1031,20 @@ struct zone {
 	 * cma pages is present pages that are assigned for CMA use
 	 * (MIGRATE_CMA).
 	 *
+	 * pages_with_online_memmap tracks pages within the zone that have
+	 * an online memory map (present pages and memory holes whose memory
+	 * map has been initialized). When spanned_pages ==
+	 * pages_with_online_memmap, pfn_to_page() can be performed without
+	 * further checks on any PFN within the zone span.
+	 *
+	 * Note: this counter may temporarily undercount when pages with an
+	 * online memory map exist outside the current zone span. This can
+	 * only happen during boot, when initializing the memory map of
+	 * pages that do not fall into any zone span. Growing the zone to
+	 * cover such pages and later shrinking it back may result in a
+	 * "too small" value. This is safe: it merely prevents detecting a
+	 * contiguous zone.
+	 *
 	 * So present_pages may be used by memory hotplug or memory power
 	 * management logic to figure out unmanaged pages by checking
 	 * (present_pages - managed_pages). And managed_pages should be used
@@ -1055,6 +1069,7 @@ struct zone {
 	atomic_long_t		managed_pages;
 	unsigned long		spanned_pages;
 	unsigned long		present_pages;
+	unsigned long		pages_with_online_memmap;
 #if defined(CONFIG_MEMORY_HOTPLUG)
 	unsigned long		present_early_pages;
 #endif
@@ -1692,6 +1707,38 @@ static inline bool zone_is_zone_device(const struct zone *zone)
 }
 #endif
 
+/**
+ * zone_is_contiguous - test whether a zone is contiguous
+ * @zone: the zone to test.
+ *
+ * In a contiguous zone, it is valid to call pfn_to_page() on any PFN in the
+ * spanned zone without requiring pfn_valid() or pfn_to_online_page() checks.
+ *
+ * Note that missing synchronization with memory offlining makes any PFN
+ * traversal prone to races.
+ *
+ * ZONE_DEVICE zones are always marked non-contiguous.
+ *
+ * Return: true if contiguous, otherwise false.
+ */
+static inline bool zone_is_contiguous(const struct zone *zone)
+{
+	return zone->contiguous;
+}
+
+static inline void set_zone_contiguous(struct zone *zone)
+{
+	if (zone_is_zone_device(zone))
+		return;
+	if (zone->spanned_pages == zone->pages_with_online_memmap)
+		zone->contiguous = true;
+}
+
+static inline void clear_zone_contiguous(struct zone *zone)
+{
+	zone->contiguous = false;
+}
+
 /*
  * Returns true if a zone has pages managed by the buddy allocator.
  * All the reclaim decisions have to use this function rather than
diff --git a/mm/internal.h b/mm/internal.h
index 5a2ddcf68e0b..a047c7caef6f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -814,21 +814,15 @@ extern struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
 static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
 				unsigned long end_pfn, struct zone *zone)
 {
-	if (zone->contiguous)
+	if (zone_is_contiguous(zone))
 		return pfn_to_page(start_pfn);
 
 	return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
 }
 
-void set_zone_contiguous(struct zone *zone);
 bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
 			   unsigned long nr_pages);
 
-static inline void clear_zone_contiguous(struct zone *zone)
-{
-	zone->contiguous = false;
-}
-
 extern int __isolate_free_page(struct page *page, unsigned int order);
 extern void __putback_isolated_page(struct page *page, unsigned int order,
 				    int mt);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 2a943ec57c85..fbe863441761 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -557,18 +557,13 @@ void remove_pfn_range_from_zone(struct zone *zone,
 
 	/*
 	 * Zone shrinking code cannot properly deal with ZONE_DEVICE. So
-	 * we will not try to shrink the zones - which is okay as
-	 * set_zone_contiguous() cannot deal with ZONE_DEVICE either way.
+	 * we will not try to shrink it.
 	 */
 	if (zone_is_zone_device(zone))
 		return;
 
-	clear_zone_contiguous(zone);
-
 	shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
 	update_pgdat_span(pgdat);
-
-	set_zone_contiguous(zone);
 }
 
 /**
@@ -745,8 +740,6 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 	struct pglist_data *pgdat = zone->zone_pgdat;
 	int nid = pgdat->node_id;
 
-	clear_zone_contiguous(zone);
-
 	if (zone_is_empty(zone))
 		init_currently_empty_zone(zone, start_pfn, nr_pages);
 	resize_zone_range(zone, start_pfn, nr_pages);
@@ -774,8 +767,6 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 	memmap_init_range(nr_pages, nid, zone_idx(zone), start_pfn, 0,
 			 MEMINIT_HOTPLUG, altmap, migratetype,
 			 isolate_pageblock);
-
-	set_zone_contiguous(zone);
 }
 
 struct auto_movable_stats {
@@ -1071,6 +1062,7 @@ void adjust_present_page_count(struct page *page, struct memory_group *group,
 	if (early_section(__pfn_to_section(page_to_pfn(page))))
 		zone->present_early_pages += nr_pages;
 	zone->present_pages += nr_pages;
+	zone->pages_with_online_memmap += nr_pages;
 	zone->zone_pgdat->node_present_pages += nr_pages;
 
 	if (group && movable)
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 2a5ac175d5dd..05c616c857ec 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -806,9 +806,9 @@ void __meminit init_deferred_page(unsigned long pfn, int nid)
  *   zone/node above the hole except for the trailing pages in the last
  *   section that will be appended to the zone/node below.
  */
-static void __init init_unavailable_range(unsigned long spfn,
-					  unsigned long epfn,
-					  int zone, int node)
+static unsigned long __init init_unavailable_range(unsigned long spfn,
+						   unsigned long epfn,
+						   int zone, int node)
 {
 	unsigned long pfn;
 	u64 pgcnt = 0;
@@ -822,6 +822,7 @@ static void __init init_unavailable_range(unsigned long spfn,
 	if (pgcnt)
 		pr_info("On node %d, zone %s: %lld pages in unavailable ranges\n",
 			node, zone_names[zone], pgcnt);
+	return pgcnt;
 }
 
 /*
@@ -918,9 +919,21 @@ static void __init memmap_init_zone_range(struct zone *zone,
 	memmap_init_range(end_pfn - start_pfn, nid, zone_id, start_pfn,
 			  zone_end_pfn, MEMINIT_EARLY, NULL, MIGRATE_MOVABLE,
 			  false);
+	zone->pages_with_online_memmap += end_pfn - start_pfn;
 
-	if (*hole_pfn < start_pfn)
-		init_unavailable_range(*hole_pfn, start_pfn, zone_id, nid);
+	if (*hole_pfn < start_pfn) {
+		unsigned long hole_start_pfn = *hole_pfn;
+		unsigned long pgcnt;
+
+		if (hole_start_pfn < zone_start_pfn) {
+			init_unavailable_range(hole_start_pfn, zone_start_pfn,
+					       zone_id, nid);
+			hole_start_pfn = zone_start_pfn;
+		}
+		pgcnt = init_unavailable_range(hole_start_pfn, start_pfn,
+					       zone_id, nid);
+		zone->pages_with_online_memmap += pgcnt;
+	}
 
 	*hole_pfn = end_pfn;
 }
@@ -2237,28 +2250,6 @@ void __init init_cma_pageblock(struct page *page)
 }
 #endif
 
-void set_zone_contiguous(struct zone *zone)
-{
-	unsigned long block_start_pfn = zone->zone_start_pfn;
-	unsigned long block_end_pfn;
-
-	block_end_pfn = pageblock_end_pfn(block_start_pfn);
-	for (; block_start_pfn < zone_end_pfn(zone);
-			block_start_pfn = block_end_pfn,
-			 block_end_pfn += pageblock_nr_pages) {
-
-		block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
-
-		if (!__pageblock_pfn_to_page(block_start_pfn,
-					     block_end_pfn, zone))
-			return;
-		cond_resched();
-	}
-
-	/* We confirm that there is no hole */
-	zone->contiguous = true;
-}
-
 /*
  * Check if a PFN range intersects multiple zones on one or more
  * NUMA nodes. Specify the @nid argument if it is known that this
-- 
2.47.3




* [PATCH v5 5/5] mm/memory_hotplug: improve shrink_zone_span() subsection boundary checks
  2026-05-20  9:34 [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Yuan Liu
                   ` (3 preceding siblings ...)
  2026-05-20  9:34 ` [PATCH v5 4/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Yuan Liu
@ 2026-05-20  9:34 ` Yuan Liu
  2026-05-21  2:59 ` [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Wei Yang
  2026-06-01  7:17 ` Mike Rapoport
  6 siblings, 0 replies; 12+ messages in thread
From: Yuan Liu @ 2026-05-20  9:34 UTC (permalink / raw)
  To: David Hildenbrand, Oscar Salvador, Mike Rapoport, Wei Yang
  Cc: linux-mm, Yong Hu, Nanhai Zou, Yuan Liu, Tim Chen, Qiuxu Zhuo,
	Yu C Chen, Pan Deng, Tianyou Li, Chen Zhang, Jason Zeng,
	linux-kernel

When shrinking a zone span after removing a PFN range,
find_smallest_section_pfn() and find_biggest_section_pfn()
only checked one edge PFN in each subsection for nid/zone matching.

If a memory or hole boundary falls in the middle of a subsection,
that edge PFN may belong to a different nid/zone, causing the helpers
to miss a valid PFN within that subsection.

Fix this by checking both subsection edge PFNs for nid/zone matching.
Keep a single pfn_to_online_page() check per subsection, since online
state is the same for all PFNs in a subsection.
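
The difference between checking one edge and both edges can be
illustrated with a flat array model. The sketch below is an assumption-laden
simplification: every PFN is treated as online, a single owner[] value
stands in for the nid/zone pair, and the subsection size is a tiny
hypothetical value.

```c
#include <stdbool.h>

#define SKETCH_SUBSECTION_PAGES	4UL	/* hypothetical, for illustration */

/*
 * Return the start PFN of the first subsection in [start_pfn, end_pfn)
 * that contains a PFN owned by @want, or 0 if there is none. Only the
 * head PFN is tested unless @check_both_edges is set, in which case the
 * tail PFN is tested as well. Both bounds must be subsection-aligned.
 */
static unsigned long sketch_find_smallest(const int *owner,
					  unsigned long start_pfn,
					  unsigned long end_pfn, int want,
					  bool check_both_edges)
{
	unsigned long pfn, next_pfn;

	for (pfn = start_pfn; pfn < end_pfn; pfn = next_pfn) {
		unsigned long tail_pfn;

		next_pfn = pfn + SKETCH_SUBSECTION_PAGES;
		tail_pfn = next_pfn - 1;

		if (owner[pfn] == want)
			return pfn;

		if (check_both_edges && owner[tail_pfn] == want)
			return pfn;
	}

	return 0;
}
```

With owner[] = {0,0,0,0, 1,1,2,2, 2,2,2,2}, a search for owner 2 over
PFNs 4 to 12 returns 8 when only the head PFN is tested, but 4 when both
edges are tested, because the boundary falls inside the subsection that
starts at PFN 4.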

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Jason Zeng <jason.zeng@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
---
 mm/memory_hotplug.c | 42 +++++++++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index fbe863441761..20b61f70cd81 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -427,17 +427,24 @@ static unsigned long find_smallest_section_pfn(int nid, struct zone *zone,
 				     unsigned long start_pfn,
 				     unsigned long end_pfn)
 {
-	for (; start_pfn < end_pfn; start_pfn += PAGES_PER_SUBSECTION) {
-		if (unlikely(!pfn_to_online_page(start_pfn)))
-			continue;
+	unsigned long next_pfn;
 
-		if (unlikely(pfn_to_nid(start_pfn) != nid))
-			continue;
+	for (; start_pfn < end_pfn; start_pfn = next_pfn) {
+		unsigned long tail_pfn;
 
-		if (zone != page_zone(pfn_to_page(start_pfn)))
+		next_pfn = start_pfn + PAGES_PER_SUBSECTION;
+		tail_pfn = next_pfn - 1;
+
+		if (unlikely(!pfn_to_online_page(start_pfn)))
 			continue;
 
-		return start_pfn;
+		if (likely(pfn_to_nid(start_pfn) == nid) &&
+		    zone == page_zone(pfn_to_page(start_pfn)))
+			return start_pfn;
+
+		if (likely(pfn_to_nid(tail_pfn) == nid) &&
+		    zone == page_zone(pfn_to_page(tail_pfn)))
+			return start_pfn;
 	}
 
 	return 0;
@@ -448,21 +455,26 @@ static unsigned long find_biggest_section_pfn(int nid, struct zone *zone,
 				    unsigned long start_pfn,
 				    unsigned long end_pfn)
 {
-	unsigned long pfn;
+	unsigned long pfn, prev_pfn;
 
 	/* pfn is the end pfn of a memory section. */
 	pfn = end_pfn - 1;
-	for (; pfn >= start_pfn; pfn -= PAGES_PER_SUBSECTION) {
-		if (unlikely(!pfn_to_online_page(pfn)))
-			continue;
+	for (; pfn >= start_pfn; pfn = prev_pfn) {
+		unsigned long head_pfn;
 
-		if (unlikely(pfn_to_nid(pfn) != nid))
-			continue;
+		prev_pfn = pfn - PAGES_PER_SUBSECTION;
+		head_pfn = prev_pfn + 1;
 
-		if (zone != page_zone(pfn_to_page(pfn)))
+		if (unlikely(!pfn_to_online_page(pfn)))
 			continue;
 
-		return pfn;
+		if (likely(pfn_to_nid(pfn) == nid) &&
+		    zone == page_zone(pfn_to_page(pfn)))
+			return pfn;
+
+		if (likely(pfn_to_nid(head_pfn) == nid) &&
+		    zone == page_zone(pfn_to_page(head_pfn)))
+			return pfn;
 	}
 
 	return 0;
-- 
2.47.3




* Re: [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range
  2026-05-20  9:34 [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Yuan Liu
                   ` (4 preceding siblings ...)
  2026-05-20  9:34 ` [PATCH v5 5/5] mm/memory_hotplug: improve shrink_zone_span() subsection boundary checks Yuan Liu
@ 2026-05-21  2:59 ` Wei Yang
  2026-05-21  5:16   ` Liu, Yuan1
  2026-06-01  7:17 ` Mike Rapoport
  6 siblings, 1 reply; 12+ messages in thread
From: Wei Yang @ 2026-05-21  2:59 UTC (permalink / raw)
  To: Yuan Liu
  Cc: David Hildenbrand, Oscar Salvador, Mike Rapoport, Wei Yang,
	linux-mm, Yong Hu, Nanhai Zou, Tim Chen, Qiuxu Zhuo, Yu C Chen,
	Pan Deng, Tianyou Li, Chen Zhang, Jason Zeng, linux-kernel

Hi, Yuan

IIRC I haven't reviewed any of the patches below. We have discussed
them but have not settled yet, so it is not proper to add my RB yourself.

-- 
Wei Yang
Help you, Help me



* Re: [PATCH v5 1/5] mm: move mirrored memory overlap checking to the outer loop
  2026-05-20  9:34 ` [PATCH v5 1/5] mm: move mirrored memory overlap checking to the outer loop Yuan Liu
@ 2026-05-21  3:33   ` Wei Yang
  2026-05-22  7:43     ` Liu, Yuan1
  0 siblings, 1 reply; 12+ messages in thread
From: Wei Yang @ 2026-05-21  3:33 UTC (permalink / raw)
  To: Yuan Liu
  Cc: David Hildenbrand, Oscar Salvador, Mike Rapoport, Wei Yang,
	linux-mm, Yong Hu, Nanhai Zou, Tim Chen, Qiuxu Zhuo, Yu C Chen,
	Pan Deng, Tianyou Li, Chen Zhang, Jason Zeng, linux-kernel

On Wed, May 20, 2026 at 05:34:53AM -0400, Yuan Liu wrote:
>Move the overlap memmap initialization check from memmap_init_range()
>to memmap_init(), and replace the per-PFN check with a memblock-based
>check.

The description is a bit too brief.

Even though I know the purpose, I was confused at first glance.

>
>Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
>Reviewed-by: Jason Zeng <jason.zeng@intel.com>
>Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
>---
> mm/mm_init.c | 29 +++++------------------------
> 1 file changed, 5 insertions(+), 24 deletions(-)
>
>diff --git a/mm/mm_init.c b/mm/mm_init.c
>index f9f8e1af921c..24e103a402b0 100644
>--- a/mm/mm_init.c
>+++ b/mm/mm_init.c
>@@ -783,28 +783,6 @@ void __meminit init_deferred_page(unsigned long pfn, int nid)
> 	__init_deferred_page(pfn, nid);
> }
> 
>-/* If zone is ZONE_MOVABLE but memory is mirrored, it is an overlapped init */
>-static bool __meminit
>-overlap_memmap_init(unsigned long zone, unsigned long *pfn)
>-{
>-	static struct memblock_region *r __meminitdata;
>-
>-	if (mirrored_kernelcore && zone == ZONE_MOVABLE) {
>-		if (!r || *pfn >= memblock_region_memory_end_pfn(r)) {
>-			for_each_mem_region(r) {
>-				if (*pfn < memblock_region_memory_end_pfn(r))
>-					break;
>-			}
>-		}
>-		if (*pfn >= memblock_region_memory_base_pfn(r) &&
>-		    memblock_is_mirror(r)) {
>-			*pfn = memblock_region_memory_end_pfn(r);
>-			return true;
>-		}
>-	}
>-	return false;
>-}
>-
> /*
>  * Only struct pages that correspond to ranges defined by memblock.memory
>  * are zeroed and initialized by going through __init_single_page() during
>@@ -891,8 +869,6 @@ void __meminit memmap_init_range(unsigned long size, int nid, unsigned long zone
> 		 * function.  They do not exist on hotplugged memory.
> 		 */
> 		if (context == MEMINIT_EARLY) {
>-			if (overlap_memmap_init(zone, &pfn))
>-				continue;
> 			if (defer_init(nid, pfn, zone_end_pfn)) {
> 				deferred_struct_pages = true;
> 				break;
>@@ -956,6 +932,7 @@ static void __init memmap_init(void)
> 	int i, j, zone_id = 0, nid;
> 
> 	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
>+		struct memblock_region *r = &memblock.memory.regions[i];
> 		struct pglist_data *node = NODE_DATA(nid);
> 
> 		for (j = 0; j < MAX_NR_ZONES; j++) {
>@@ -964,6 +941,10 @@ static void __init memmap_init(void)
> 			if (!populated_zone(zone))
> 				continue;
> 
>+			if (mirrored_kernelcore && j == ZONE_MOVABLE &&
>+			    memblock_is_mirror(r))
>+				continue;
>+

So you have figured out the memory layout of mirror memory?

Would you mind elaborating?

> 			memmap_init_zone_range(zone, start_pfn, end_pfn,
> 					       &hole_pfn);
> 			zone_id = j;
>-- 
>2.47.3

-- 
Wei Yang
Help you, Help me



* RE: [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range
  2026-05-21  2:59 ` [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Wei Yang
@ 2026-05-21  5:16   ` Liu, Yuan1
  0 siblings, 0 replies; 12+ messages in thread
From: Liu, Yuan1 @ 2026-05-21  5:16 UTC (permalink / raw)
  To: Wei Yang
  Cc: David Hildenbrand, Oscar Salvador, Mike Rapoport,
	linux-mm@kvack.org, Hu, Yong, Zou, Nanhai, Tim Chen, Zhuo, Qiuxu,
	Chen, Yu C, Deng, Pan, Li, Tianyou, Chen Zhang, Zeng, Jason,
	linux-kernel@vger.kernel.org

> Hi, Yuan
> 
> IIRC I haven't reviewed any of the patches below. We have discussed
> them but have not settled yet, so it is not proper to add my RB yourself.

Hi Wei

I'm very sorry for adding those tags before receiving your RB.
I will remove them and make sure this does not happen again.

> --
> Wei Yang
> Help you, Help me



* RE: [PATCH v5 1/5] mm: move mirrored memory overlap checking to the outer loop
  2026-05-21  3:33   ` Wei Yang
@ 2026-05-22  7:43     ` Liu, Yuan1
  2026-05-25  8:35       ` Wei Yang
  0 siblings, 1 reply; 12+ messages in thread
From: Liu, Yuan1 @ 2026-05-22  7:43 UTC (permalink / raw)
  To: Wei Yang
  Cc: David Hildenbrand, Oscar Salvador, Mike Rapoport,
	linux-mm@kvack.org, Hu, Yong, Zou, Nanhai, Tim Chen, Zhuo, Qiuxu,
	Chen, Yu C, Deng, Pan, Li, Tianyou, Chen Zhang, Zeng, Jason,
	linux-kernel@vger.kernel.org

> Subject: Re: [PATCH v5 1/5] mm: move mirrored memory overlap checking to
> the outer loop
> 
> On Wed, May 20, 2026 at 05:34:53AM -0400, Yuan Liu wrote:
> >Move the overlap memmap initialization check from memmap_init_range()
> >to memmap_init(), and replace the per-PFN check with a memblock-based
> >check.
> 
> The description is a bit too brief.
> 
> Even though I know the purpose, I was confused at first glance.

Thanks for the review. 
I will try to rephrase it and provide a clearer description in the next version.
 
> >
> >Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
> >Reviewed-by: Jason Zeng <jason.zeng@intel.com>
> >Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
> >---
> > mm/mm_init.c | 29 +++++------------------------
> > 1 file changed, 5 insertions(+), 24 deletions(-)
> >
> >diff --git a/mm/mm_init.c b/mm/mm_init.c
> >index f9f8e1af921c..24e103a402b0 100644
> >--- a/mm/mm_init.c
> >+++ b/mm/mm_init.c
> >@@ -783,28 +783,6 @@ void __meminit init_deferred_page(unsigned long pfn,
> int nid)
> > 	__init_deferred_page(pfn, nid);
> > }
> >
> >-/* If zone is ZONE_MOVABLE but memory is mirrored, it is an overlapped
> init */
> >-static bool __meminit
> >-overlap_memmap_init(unsigned long zone, unsigned long *pfn)
> >-{
> >-	static struct memblock_region *r __meminitdata;
> >-
> >-	if (mirrored_kernelcore && zone == ZONE_MOVABLE) {
> >-		if (!r || *pfn >= memblock_region_memory_end_pfn(r)) {
> >-			for_each_mem_region(r) {
> >-				if (*pfn < memblock_region_memory_end_pfn(r))
> >-					break;
> >-			}
> >-		}
> >-		if (*pfn >= memblock_region_memory_base_pfn(r) &&
> >-		    memblock_is_mirror(r)) {
> >-			*pfn = memblock_region_memory_end_pfn(r);
> >-			return true;
> >-		}
> >-	}
> >-	return false;
> >-}
> >-
> > /*
> >  * Only struct pages that correspond to ranges defined by
> memblock.memory
> >  * are zeroed and initialized by going through __init_single_page()
> during
> >@@ -891,8 +869,6 @@ void __meminit memmap_init_range(unsigned long size,
> int nid, unsigned long zone
> > 		 * function.  They do not exist on hotplugged memory.
> > 		 */
> > 		if (context == MEMINIT_EARLY) {
> >-			if (overlap_memmap_init(zone, &pfn))
> >-				continue;
> > 			if (defer_init(nid, pfn, zone_end_pfn)) {
> > 				deferred_struct_pages = true;
> > 				break;
> >@@ -956,6 +932,7 @@ static void __init memmap_init(void)
> > 	int i, j, zone_id = 0, nid;
> >
> > 	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
> {
> >+		struct memblock_region *r = &memblock.memory.regions[i];
> > 		struct pglist_data *node = NODE_DATA(nid);
> >
> > 		for (j = 0; j < MAX_NR_ZONES; j++) {
> >@@ -964,6 +941,10 @@ static void __init memmap_init(void)
> > 			if (!populated_zone(zone))
> > 				continue;
> >
> >+			if (mirrored_kernelcore && j == ZONE_MOVABLE &&
> >+			    memblock_is_mirror(r))
> >+				continue;
> >+
> 
> So you have figured out the memory layout of mirror memory?
> 
> Would you mind elaborating?

I collected the mirror memory layout information below on a Xeon server, using the following debug changes:

int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
 {
-       if (!mirrored_kernelcore)
+       int ret;
+       phys_addr_t end = base + size - 1;
+
+       pr_info("memblock_mark_mirror: base=%pa, size=%pa, mirrored_kernelcore=%d\n",
+               &base, &size, mirrored_kernelcore);
+
+       if (!mirrored_kernelcore) {
+               pr_info("memblock_mark_mirror: mirrored_kernelcore not enabled, skipping\n");
                return 0;
+       }

        system_has_some_mirror = true;

-       return memblock_setclr_flag(&memblock.memory, base, size, 1, MEMBLOCK_MIRROR);
+       ret = memblock_setclr_flag(&memblock.memory, base, size, 1, MEMBLOCK_MIRROR);
+       pr_info("memblock_mark_mirror: marked [%pa-%pa] as MIRROR, ret=%d\n",
+               &base, &end, ret);
+       return ret;
 }

Here is the detailed layout information:

Case 1: efibootmgr -m t -M 0
Enable mirror memory below 4GB, and no mirror memory above 4G

=== zoneinfo summary ===
node   zone       start_pfn    end_pfn      start_addr         end_addr    
------ ---------- ------------ ------------ ------------------ -------------
0      DMA        0x1          0xfff        0x1000             0xffffff
0      DMA32      0x1000       0xfffff      0x1000000          0xffffffff
0      Normal     0x100000     0x7fbffff    0x100000000        0x7fbfffffff
0      Movable    0x140000     0x7fbffff    0x140000000        0x7fbfffffff
1      Normal     0x7fc0000    0xff7ffff    0x7fc0000000       0xff7fffffff
1      Movable    0x8000000    0xff7ffff    0x8000000000       0xff7fffffff

node  start                end                  size              flags      mirror   pfn_range
----  -------------------- -------------------- ------------ --------------- -------- ------------------ 
0     0x0000000000001000   0x000000000009dfff   0x000000000009d000 0x2        yes      0x1-0x9e
0     0x000000000009f000   0x000000000009ffff   0x0000000000001000 0x2        yes      0x9f-0xa0
0     0x0000000000100000   0x0000000066416fff   0x0000000066317000 0x2        yes      0x100-0x66417
0     0x00000000777ff000   0x00000000777fffff   0x0000000000001000 0x2        yes      0x777ff-0x77800
0     0x0000000100000000   0x000000013fffffff   0x0000000040000000 0x2        yes      0x100000-0x140000
0     0x0000000140000000   0x0000007fbfffffff   0x0000007e80000000 0x0        no       0x140000-0x7fc0000
1     0x0000007fc0000000   0x0000007fffffffff   0x0000000040000000 0x2        yes      0x7fc0000-0x8000000
1     0x0000008000000000   0x000000ff7fffffff   0x0000007f80000000 0x0        no       0x8000000-0xff80000

Case 2: efibootmgr -m t -M 25
Enable mirror memory below 4GB, and put 25% percentage memory to mirror above 4GB

=== zoneinfo summary ===
node   zone       start_pfn    end_pfn      start_addr         end_addr
------ ---------- ------------ ------------ ------------------ -----------
0      DMA        0x1          0xfff        0x1000             0xffffff
0      DMA32      0x1000       0xfffff      0x1000000          0xffffffff
0      Normal     0x100000     0x603ffff    0x100000000        0x603fffffff
0      Movable    0x20c0000    0x603ffff    0x20c0000000       0x603fffffff
1      Normal     0x6040000    0xc03ffff    0x6040000000       0xc03fffffff
1      Movable    0x8040000    0xc03ffff    0x8040000000       0xc03fffffff

node  start                end                  size              flags      mirror   pfn_range
----  -------------------- -------------------- ------------ ---------- -------- ------------------
0     0x0000000000001000   0x000000000009dfff   0x000000000009d000 0x2        yes      0x1-0x9e
0     0x000000000009f000   0x000000000009ffff   0x0000000000001000 0x2        yes      0x9f-0xa0
0     0x0000000000100000   0x0000000066416fff   0x0000000066317000 0x2        yes      0x100-0x66417
0     0x00000000777ff000   0x00000000777fffff   0x0000000000001000 0x2        yes      0x777ff-0x77800
0     0x0000000100000000   0x00000020bfffffff   0x0000001fc0000000 0x2        yes      0x100000-0x20c0000
0     0x00000020c0000000   0x000000603fffffff   0x0000003f80000000 0x0        no       0x20c0000-0x6040000
1     0x0000006040000000   0x000000803fffffff   0x0000002000000000 0x2        yes      0x6040000-0x8040000
1     0x0000008040000000   0x000000c03fffffff   0x0000004000000000 0x0        no       0x8040000-0xc040000

Case 3: efibootmgr -m f -M 25
Disable mirror memory below 4GB, and put 25% percentage memory to mirror above 4GB

=== zoneinfo summary ===
node   zone       start_pfn    end_pfn      start_addr         end_addr
------ ---------- ------------ ------------ ------------------ ------------
0      DMA        0x1          0xfff        0x1000             0xffffff
0      DMA32      0x1000       0xfffff      0x1000000          0xffffffff
0      Normal     0x100000     0x60bffff    0x100000000        0x60bfffffff
0      Movable    0x20c0000    0x60bffff    0x20c0000000       0x60bfffffff
1      Normal     0x60c0000    0xc0bffff    0x60c0000000       0xc0bfffffff
1      Movable    0x80c0000    0xc0bffff    0x80c0000000       0xc0bfffffff

node  start                end                  size              flags      mirror   pfn_range
----  -------------------- -------------------- ------------ ---------- -------- ------------------
0     0x0000000000001000   0x000000000009dfff   0x000000000009d000 0x0        no       0x1-0x9e
0     0x000000000009f000   0x000000000009ffff   0x0000000000001000 0x0        no       0x9f-0xa0
0     0x0000000000100000   0x0000000066416fff   0x0000000066317000 0x0        no       0x100-0x66417
0     0x00000000777ff000   0x00000000777fffff   0x0000000000001000 0x0        no       0x777ff-0x77800
0     0x0000000100000000   0x00000020bfffffff   0x0000001fc0000000 0x2        yes      0x100000-0x20c0000
0     0x00000020c0000000   0x00000060bfffffff   0x0000004000000000 0x0        no       0x20c0000-0x60c0000
1     0x00000060c0000000   0x00000080bfffffff   0x0000002000000000 0x2        yes      0x60c0000-0x80c0000
1     0x00000080c0000000   0x000000c0bfffffff   0x0000004000000000 0x0        no       0x80c0000-0xc0c0000

> > 			memmap_init_zone_range(zone, start_pfn, end_pfn,
> > 					       &hole_pfn);
> > 			zone_id = j;
> >--
> >2.47.3
> 
> --
> Wei Yang
> Help you, Help me



* Re: [PATCH v5 1/5] mm: move mirrored memory overlap checking to the outer loop
  2026-05-22  7:43     ` Liu, Yuan1
@ 2026-05-25  8:35       ` Wei Yang
  0 siblings, 0 replies; 12+ messages in thread
From: Wei Yang @ 2026-05-25  8:35 UTC (permalink / raw)
  To: Liu, Yuan1
  Cc: Wei Yang, David Hildenbrand, Oscar Salvador, Mike Rapoport,
	linux-mm@kvack.org, Hu, Yong, Zou, Nanhai, Tim Chen, Zhuo, Qiuxu,
	Chen, Yu C, Deng, Pan, Li, Tianyou, Chen Zhang, Zeng, Jason,
	linux-kernel@vger.kernel.org

On Fri, May 22, 2026 at 07:43:38AM +0000, Liu, Yuan1 wrote:
>> Subject: Re: [PATCH v5 1/5] mm: move mirrored memory overlap checking to
>> the outer loop
>> 
>> On Wed, May 20, 2026 at 05:34:53AM -0400, Yuan Liu wrote:
>> >Move the overlap memmap initialization check from memmap_init_range()
>> >to memmap_init(), and replace the per-PFN check with a memblock-based
>> >check.
>> 
>> The description is a bit too brief.
>> 
>> Even though I know the purpose, I was confused at first glance.
>
>Thanks for the review. 
>I will try to rephrase it and provide a clearer description in the next version.
> 
>> >
>> >Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
>> >Reviewed-by: Jason Zeng <jason.zeng@intel.com>
>> >Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
>> >---
>> > mm/mm_init.c | 29 +++++------------------------
>> > 1 file changed, 5 insertions(+), 24 deletions(-)
>> >
>> >diff --git a/mm/mm_init.c b/mm/mm_init.c
>> >index f9f8e1af921c..24e103a402b0 100644
>> >--- a/mm/mm_init.c
>> >+++ b/mm/mm_init.c
>> >@@ -783,28 +783,6 @@ void __meminit init_deferred_page(unsigned long pfn,
>> int nid)
>> > 	__init_deferred_page(pfn, nid);
>> > }
>> >
>> >-/* If zone is ZONE_MOVABLE but memory is mirrored, it is an overlapped
>> init */
>> >-static bool __meminit
>> >-overlap_memmap_init(unsigned long zone, unsigned long *pfn)
>> >-{
>> >-	static struct memblock_region *r __meminitdata;
>> >-
>> >-	if (mirrored_kernelcore && zone == ZONE_MOVABLE) {
>> >-		if (!r || *pfn >= memblock_region_memory_end_pfn(r)) {
>> >-			for_each_mem_region(r) {
>> >-				if (*pfn < memblock_region_memory_end_pfn(r))
>> >-					break;
>> >-			}
>> >-		}
>> >-		if (*pfn >= memblock_region_memory_base_pfn(r) &&
>> >-		    memblock_is_mirror(r)) {
>> >-			*pfn = memblock_region_memory_end_pfn(r);
>> >-			return true;
>> >-		}
>> >-	}
>> >-	return false;
>> >-}
>> >-
>> > /*
>> >  * Only struct pages that correspond to ranges defined by
>> memblock.memory
>> >  * are zeroed and initialized by going through __init_single_page()
>> during
>> >@@ -891,8 +869,6 @@ void __meminit memmap_init_range(unsigned long size,
>> int nid, unsigned long zone
>> > 		 * function.  They do not exist on hotplugged memory.
>> > 		 */
>> > 		if (context == MEMINIT_EARLY) {
>> >-			if (overlap_memmap_init(zone, &pfn))
>> >-				continue;
>> > 			if (defer_init(nid, pfn, zone_end_pfn)) {
>> > 				deferred_struct_pages = true;
>> > 				break;
>> >@@ -956,6 +932,7 @@ static void __init memmap_init(void)
>> > 	int i, j, zone_id = 0, nid;
>> >
>> > 	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
>> {
>> >+		struct memblock_region *r = &memblock.memory.regions[i];
>> > 		struct pglist_data *node = NODE_DATA(nid);
>> >
>> > 		for (j = 0; j < MAX_NR_ZONES; j++) {
>> >@@ -964,6 +941,10 @@ static void __init memmap_init(void)
>> > 			if (!populated_zone(zone))
>> > 				continue;
>> >
>> >+			if (mirrored_kernelcore && j == ZONE_MOVABLE &&
>> >+			    memblock_is_mirror(r))
>> >+				continue;
>> >+
>> 
>> So you have figured out the memory layout of mirror memory?
>> 
>> Would you mind elaborating?
>
>I collected the mirror memory layout information below on a Xeon server, using the following debug changes:
>
>int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
> {
>-       if (!mirrored_kernelcore)
>+       int ret;
>+       phys_addr_t end = base + size - 1;
>+
>+       pr_info("memblock_mark_mirror: base=%pa, size=%pa, mirrored_kernelcore=%d\n",
>+               &base, &size, mirrored_kernelcore);
>+
>+       if (!mirrored_kernelcore) {
>+               pr_info("memblock_mark_mirror: mirrored_kernelcore not enabled, skipping\n");
>                return 0;
>+       }
>
>        system_has_some_mirror = true;
>
>-       return memblock_setclr_flag(&memblock.memory, base, size, 1, MEMBLOCK_MIRROR);
>+       ret = memblock_setclr_flag(&memblock.memory, base, size, 1, MEMBLOCK_MIRROR);
>+       pr_info("memblock_mark_mirror: marked [%pa-%pa] as MIRROR, ret=%d\n",
>+               &base, &end, ret);
>+       return ret;
> }
>
>Here is the detailed layout information:
>

Thanks for the detailed output.

After some investigation, I have two things to discuss here:

  * the current mirror memory handling disables deferred memmap init
  * confirming the mirror memory layout, which may let us simplify the handling

Before the discussion, let me mark the key points in the output below.

>Case 1: efibootmgr -m t -M 0
>Enable mirror memory below 4GB, and no mirror memory above 4G
>
>=== zoneinfo summary ===
>node   zone       start_pfn    end_pfn      start_addr         end_addr    
>------ ---------- ------------ ------------ ------------------ -------------
>0      DMA        0x1          0xfff        0x1000             0xffffff
>0      DMA32      0x1000       0xfffff      0x1000000          0xffffffff
>0      Normal     0x100000     0x7fbffff    0x100000000        0x7fbfffffff
>0      Movable    0x140000     0x7fbffff    0x140000000        0x7fbfffffff

(1) The Normal and Movable zones end at the same address, so their spans overlap.

>1      Normal     0x7fc0000    0xff7ffff    0x7fc0000000       0xff7fffffff
>1      Movable    0x8000000    0xff7ffff    0x8000000000       0xff7fffffff

The same as (1).

>
>node  start                end                  size              flags      mirror   pfn_range
>----  -------------------- -------------------- ------------ --------------- -------- ------------------ 
>0     0x0000000000001000   0x000000000009dfff   0x000000000009d000 0x2        yes      0x1-0x9e
>0     0x000000000009f000   0x000000000009ffff   0x0000000000001000 0x2        yes      0x9f-0xa0
>0     0x0000000000100000   0x0000000066416fff   0x0000000066317000 0x2        yes      0x100-0x66417
>0     0x00000000777ff000   0x00000000777fffff   0x0000000000001000 0x2        yes      0x777ff-0x77800
>0     0x0000000100000000   0x000000013fffffff   0x0000000040000000 0x2        yes      0x100000-0x140000

(2) You mentioned there is no mirror memory above 4G. If my understanding is
correct, 4G corresponds to address 0x100000000. So why is this range marked as
mirror? Is there some limitation behind this?

>0     0x0000000140000000   0x0000007fbfffffff   0x0000007e80000000 0x0        no       0x140000-0x7fc0000
>1     0x0000007fc0000000   0x0000007fffffffff   0x0000000040000000 0x2        yes      0x7fc0000-0x8000000
>1     0x0000008000000000   0x000000ff7fffffff   0x0000007f80000000 0x0        no       0x8000000-0xff80000
>
>Case 2: efibootmgr -m t -M 25
>Enable mirror memory below 4GB, and put 25% percentage memory to mirror above 4GB
>
>=== zoneinfo summary ===
>node   zone       start_pfn    end_pfn      start_addr         end_addr
>------ ---------- ------------ ------------ ------------------ -----------
>0      DMA        0x1          0xfff        0x1000             0xffffff
>0      DMA32      0x1000       0xfffff      0x1000000          0xffffffff
>0      Normal     0x100000     0x603ffff    0x100000000        0x603fffffff
>0      Movable    0x20c0000    0x603ffff    0x20c0000000       0x603fffffff

The same as (1).

>1      Normal     0x6040000    0xc03ffff    0x6040000000       0xc03fffffff
>1      Movable    0x8040000    0xc03ffff    0x8040000000       0xc03fffffff

The same as (1).

>
>node  start                end                  size              flags      mirror   pfn_range
>----  -------------------- -------------------- ------------ ---------- -------- ------------------
>0     0x0000000000001000   0x000000000009dfff   0x000000000009d000 0x2        yes      0x1-0x9e
>0     0x000000000009f000   0x000000000009ffff   0x0000000000001000 0x2        yes      0x9f-0xa0
>0     0x0000000000100000   0x0000000066416fff   0x0000000066317000 0x2        yes      0x100-0x66417
>0     0x00000000777ff000   0x00000000777fffff   0x0000000000001000 0x2        yes      0x777ff-0x77800
>0     0x0000000100000000   0x00000020bfffffff   0x0000001fc0000000 0x2        yes      0x100000-0x20c0000
>0     0x00000020c0000000   0x000000603fffffff   0x0000003f80000000 0x0        no       0x20c0000-0x6040000
>1     0x0000006040000000   0x000000803fffffff   0x0000002000000000 0x2        yes      0x6040000-0x8040000
>1     0x0000008040000000   0x000000c03fffffff   0x0000004000000000 0x0        no       0x8040000-0xc040000
>
>Case 3: efibootmgr -m f -M 25
>Disable mirror memory below 4GB, and put 25% percentage memory to mirror above 4GB
>
>=== zoneinfo summary ===
>node   zone       start_pfn    end_pfn      start_addr         end_addr
>------ ---------- ------------ ------------ ------------------ ------------
>0      DMA        0x1          0xfff        0x1000             0xffffff
>0      DMA32      0x1000       0xfffff      0x1000000          0xffffffff
>0      Normal     0x100000     0x60bffff    0x100000000        0x60bfffffff
>0      Movable    0x20c0000    0x60bffff    0x20c0000000       0x60bfffffff
>1      Normal     0x60c0000    0xc0bffff    0x60c0000000       0xc0bfffffff
>1      Movable    0x80c0000    0xc0bffff    0x80c0000000       0xc0bfffffff
>
>node  start                end                  size              flags      mirror   pfn_range
>----  -------------------- -------------------- ------------ ---------- -------- ------------------
>0     0x0000000000001000   0x000000000009dfff   0x000000000009d000 0x0        no       0x1-0x9e
>0     0x000000000009f000   0x000000000009ffff   0x0000000000001000 0x0        no       0x9f-0xa0
>0     0x0000000000100000   0x0000000066416fff   0x0000000066317000 0x0        no       0x100-0x66417
>0     0x00000000777ff000   0x00000000777fffff   0x0000000000001000 0x0        no       0x777ff-0x77800
>0     0x0000000100000000   0x00000020bfffffff   0x0000001fc0000000 0x2        yes      0x100000-0x20c0000
>0     0x00000020c0000000   0x00000060bfffffff   0x0000004000000000 0x0        no       0x20c0000-0x60c0000
>1     0x00000060c0000000   0x00000080bfffffff   0x0000002000000000 0x2        yes      0x60c0000-0x80c0000
>1     0x00000080c0000000   0x000000c0bfffffff   0x0000004000000000 0x0        no       0x80c0000-0xc0c0000

#1 mirror memory disables deferred memmap init

  I hacked my kernel to behave like Case 2 here, and found that deferred init
  is skipped.

  The reason comes from (1): both the Normal and Movable zones end at the same
  address.

  But what actually happens is tricky:

    * calculate_node_totalpages() sums the spans of the overlapping zones, so
      node_spanned_pages is overcounted
    * defer_init() skips low zones by comparing end_pfn with pgdat_end_pfn(),
      which is then far beyond the real value

  The node_spanned_pages in calculate_node_totalpages() is easy to fix:

  diff --git a/mm/mm_init.c b/mm/mm_init.c
  index db5568cf36e1..8f353d8dde3b 100644
  --- a/mm/mm_init.c
  +++ b/mm/mm_init.c
  @@ -1334,7 +1334,7 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
   						unsigned long node_start_pfn,
   						unsigned long node_end_pfn)
   {
  -	unsigned long realtotalpages = 0, totalpages = 0;
  +	unsigned long realtotalpages = 0;
   	enum zone_type i;
   
   	for (i = 0; i < MAX_NR_ZONES; i++) {
  @@ -1364,11 +1364,10 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
   		zone->present_early_pages = real_size;
   #endif
   
  -		totalpages += spanned;
   		realtotalpages += real_size;
   	}
   
  -	pgdat->node_spanned_pages = totalpages;
  +	pgdat->node_spanned_pages = node_end_pfn - node_start_pfn;
   	pgdat->node_present_pages = realtotalpages;
   	pr_debug("On node %d totalpages: %lu\n", pgdat->node_id, realtotalpages);
   }
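
  To make the effect concrete, here is a small userspace sketch (not kernel
  code) that uses the Case 2, node 0 zone ranges quoted above. It assumes the
  end_pfn column of the zoneinfo summary is inclusive, so every exclusive end
  is end_pfn + 1. struct zone_span, sum_of_zone_spans() and is_last_zone() are
  hypothetical names, and is_last_zone() models only the end_pfn versus
  pgdat_end_pfn() comparison described above, nothing else in defer_init().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a zone span; end_pfn is exclusive. */
struct zone_span {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* Case 2, node 0: DMA, DMA32, Normal, Movable (table end_pfn + 1). */
static const struct zone_span case2_node0[] = {
	{ 0x1,       0x1000    },
	{ 0x1000,    0x100000  },
	{ 0x100000,  0x6040000 },
	{ 0x20c0000, 0x6040000 },
};
#define CASE2_NODE0_NR_ZONES (sizeof(case2_node0) / sizeof(case2_node0[0]))

/* Old accounting: add up the span of every zone, overlapping or not. */
static unsigned long sum_of_zone_spans(const struct zone_span *z, size_t nr)
{
	unsigned long total = 0;
	size_t i;

	assert(z != NULL);
	for (i = 0; i < nr; i++)
		total += z[i].end_pfn - z[i].start_pfn;
	return total;
}

/* Models only the "does this zone reach the end of the node" comparison. */
static int is_last_zone(unsigned long zone_end_pfn,
			unsigned long node_start_pfn,
			unsigned long node_spanned_pages)
{
	return zone_end_pfn >= node_start_pfn + node_spanned_pages;
}
```

  With the summed span (0x9fbffff pages) no zone reaches the computed node end,
  so nothing is treated as the last zone. With the real span (0x603ffff pages)
  both ZONE_NORMAL and ZONE_MOVABLE satisfy the comparison, because they end at
  the same pfn.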

  But after this, defer_init() defers too much. It starts deferring from
  ZONE_NORMAL, which is not the last zone.

  To fix this, see below.

#2 confirm the mirror memory layout, which may simplify the handling

  In [1], I listed three possible cases for the mirror memory layout. According
  to Yuan's test and the output here, case C does not seem possible.

  [1]: https://lkml.org/2026/4/24/90

  But case A does not show up in the output. I expected Case 1 here to behave
  like case A, but as I mentioned at (2), I don't know why memory above 4G is
  still mirror memory when the description says "no mirror memory above 4G".

  At least, we can conclude:

     ZONE_MOVABLE only sits inside ZONE_NORMAL, without interleaving

  With this knowledge, we may simplify the current handling:

     Remove the possible overlap between ZONE_NORMAL and ZONE_MOVABLE

  As ZONE_MOVABLE only sits inside ZONE_NORMAL without interleaving, the memory
  of these two zones does not actually overlap. Their spans still overlap, as
  seen in (1) above, because we don't adjust ZONE_NORMAL when
  mirrored_kernelcore is set.

  So we can simply do the following, and the two zones no longer overlap.

  diff --git a/mm/mm_init.c b/mm/mm_init.c
  index 8f353d8dde3b..16cc42c3ad12 100644
  --- a/mm/mm_init.c
  +++ b/mm/mm_init.c
  @@ -1170,8 +1170,7 @@ static void __init adjust_zone_range_for_zone_movable(int nid,
   				arch_zone_highest_possible_pfn[movable_zone]);
   
   		/* Adjust for ZONE_MOVABLE starting within this range */
  -		} else if (!mirrored_kernelcore &&
  -			*zone_start_pfn < zone_movable_pfn[nid] &&
  +		} else if (*zone_start_pfn < zone_movable_pfn[nid] &&
   			*zone_end_pfn > zone_movable_pfn[nid]) {
   			*zone_end_pfn = zone_movable_pfn[nid];
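
  The effect of dropping the !mirrored_kernelcore condition can be sketched in
  userspace with the Case 2, node 0 numbers. This is a simplified model, not
  the kernel function: clamp_zone_end_to_movable() is a hypothetical helper
  that reproduces only the else-if branch shown in the diff, and the pfn
  values assume exclusive zone ends derived from the zoneinfo summary.

```c
#include <assert.h>

/*
 * Hypothetical model of the branch above: if ZONE_MOVABLE starts strictly
 * inside [*zone_start_pfn, *zone_end_pfn), end the zone where ZONE_MOVABLE
 * begins. Returns 1 if the range was adjusted, 0 otherwise.
 */
static int clamp_zone_end_to_movable(unsigned long *zone_start_pfn,
				     unsigned long *zone_end_pfn,
				     unsigned long zone_movable_pfn)
{
	assert(zone_start_pfn && zone_end_pfn);

	if (*zone_start_pfn < zone_movable_pfn &&
	    *zone_end_pfn > zone_movable_pfn) {
		*zone_end_pfn = zone_movable_pfn;
		return 1;
	}
	return 0;
}
```

  For node 0 in Case 2, ZONE_NORMAL [0x100000, 0x6040000) would shrink to
  [0x100000, 0x20c0000), which is exactly where ZONE_MOVABLE starts, while
  DMA32 [0x1000, 0x100000) is left untouched.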

  With this change, the defer_init() problem left above is solved: only the
  last zone does deferred init.

  It also means other special handling for mirror memory could be removed,
  including the absent pages calculation and the overlap_memmap_init() in this
  patch.

  I have boot-tested the two changes below, and they look good.

  Remove the absent pages calculation:

  diff --git a/mm/mm_init.c b/mm/mm_init.c
  index 16cc42c3ad12..a154053a37db 100644
  --- a/mm/mm_init.c
  +++ b/mm/mm_init.c
  @@ -1219,40 +1219,11 @@ static unsigned long __init zone_absent_pages_in_node(int nid,
   					unsigned long zone_start_pfn,
   					unsigned long zone_end_pfn)
   {
  -	unsigned long nr_absent;
  -
   	/* zone is empty, we don't have any absent pages */
   	if (zone_start_pfn == zone_end_pfn)
   		return 0;
   
  -	nr_absent = __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn);
  -
  -	/*
  -	 * ZONE_MOVABLE handling.
  -	 * Treat pages to be ZONE_MOVABLE in ZONE_NORMAL as absent pages
  -	 * and vice versa.
  -	 */
  -	if (mirrored_kernelcore && zone_movable_pfn[nid]) {
  -		unsigned long start_pfn, end_pfn;
  -		struct memblock_region *r;
  -
  -		for_each_mem_region(r) {
  -			start_pfn = clamp(memblock_region_memory_base_pfn(r),
  -					  zone_start_pfn, zone_end_pfn);
  -			end_pfn = clamp(memblock_region_memory_end_pfn(r),
  -					zone_start_pfn, zone_end_pfn);
  -
  -			if (zone_type == ZONE_MOVABLE &&
  -			    memblock_is_mirror(r))
  -				nr_absent += end_pfn - start_pfn;
  -
  -			if (zone_type == ZONE_NORMAL &&
  -			    !memblock_is_mirror(r))
  -				nr_absent += end_pfn - start_pfn;
  -		}
  -	}
  -
  -	return nr_absent;
  +	return __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn);
   }
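
  As a sanity check of why the removed block becomes redundant, the userspace
  sketch below replays its accounting on three of the Case 2, node 0 regions.
  It is only a model under stated assumptions: struct mem_region,
  clamp_pfn() and extra_absent() are hypothetical stand-ins for the memblock
  helpers, region ends are exclusive, and the result only holds for layouts
  like the ones shown, where mirrored memory lies entirely below the start of
  ZONE_MOVABLE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memblock region; end_pfn is exclusive. */
struct mem_region {
	unsigned long base_pfn;
	unsigned long end_pfn;
	int mirror;
};

/* Three of the Case 2, node 0 regions from the table above. */
static const struct mem_region case2_regions[] = {
	{ 0x100,     0x66417,   1 },
	{ 0x100000,  0x20c0000, 1 },
	{ 0x20c0000, 0x6040000, 0 },
};
#define CASE2_NR_REGIONS (sizeof(case2_regions) / sizeof(case2_regions[0]))

static unsigned long clamp_pfn(unsigned long pfn, unsigned long lo,
			       unsigned long hi)
{
	if (pfn < lo)
		return lo;
	return pfn > hi ? hi : pfn;
}

/*
 * Models the removed loop: pages inside [zone_start, zone_end) that the
 * old code counted as absent because they belong to the other zone.
 */
static unsigned long extra_absent(const struct mem_region *r, size_t nr,
				  unsigned long zone_start,
				  unsigned long zone_end, int zone_is_movable)
{
	unsigned long extra = 0;
	size_t i;

	assert(r != NULL && zone_start <= zone_end);
	for (i = 0; i < nr; i++) {
		unsigned long s = clamp_pfn(r[i].base_pfn, zone_start, zone_end);
		unsigned long e = clamp_pfn(r[i].end_pfn, zone_start, zone_end);

		/* Mirrored pages are foreign to Movable, the rest to Normal. */
		if (zone_is_movable ? r[i].mirror : !r[i].mirror)
			extra += e - s;
	}
	return extra;
}
```

  With the overlapping ZONE_NORMAL span [0x100000, 0x6040000), the old code
  subtracts 0x3f80000 foreign pages and arrives at 0x1fc0000 pages. Once
  ZONE_NORMAL ends at 0x20c0000, its span is already 0x1fc0000 and the extra
  term is zero for both zones, so the plain __absent_pages_in_range() result
  is enough in this layout.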

  And remove overlap_memmap_init() directly:

  diff --git a/mm/mm_init.c b/mm/mm_init.c
  index a154053a37db..01c6920fefa1 100644
  --- a/mm/mm_init.c
  +++ b/mm/mm_init.c
  @@ -905,8 +905,8 @@ void __meminit memmap_init_range(unsigned long size, int nid, unsigned long zone
   		 * function.  They do not exist on hotplugged memory.
   		 */
   		if (context == MEMINIT_EARLY) {
  -			if (overlap_memmap_init(zone, &pfn))
  -				continue;
   			if (defer_init(nid, pfn, zone_end_pfn)) {
   				deferred_struct_pages = true;
   				break;

  So if mirror memory only ever has the layouts Yuan shows, we may be able to
  simplify the code to some extent.

  I hope I have not misunderstood the case. Looking forward to insights from
  others.

-- 
Wei Yang
Help you, Help me



* Re: [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range
  2026-05-20  9:34 [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Yuan Liu
                   ` (5 preceding siblings ...)
  2026-05-21  2:59 ` [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Wei Yang
@ 2026-06-01  7:17 ` Mike Rapoport
  6 siblings, 0 replies; 12+ messages in thread
From: Mike Rapoport @ 2026-06-01  7:17 UTC (permalink / raw)
  To: Yuan Liu
  Cc: David Hildenbrand, Oscar Salvador, Wei Yang, linux-mm, Yong Hu,
	Nanhai Zou, Tim Chen, Qiuxu Zhuo, Yu C Chen, Pan Deng, Tianyou Li,
	Chen Zhang, Jason Zeng, linux-kernel

Hi,

On Wed, May 20, 2026 at 05:34:52AM -0400, Yuan Liu wrote:
> This series introduces a pages_with_online_memmap member into struct zone
> to avoid pageblock-by-pageblock scans across the entire zone and improve
> memory hotplug performance.
> 
> For VM hotplug performance data, please refer to Patch 4.
> This series also benefits CXL hotplug. Performance results are as follows
> https://lore.kernel.org/all/20260409023552.GA2807@AE/
> 
> Yuan Liu (5):
>   mm: move mirrored memory overlap checking to the outer loop
>   mm: skip non-mirrored ZONE_NORMAL memory map init when
>     kernelcore=mirror
>   mm: remove the special early-section handling from pfn_valid() and
>     for_each_valid_pfn()
>   mm/memory_hotplug: optimize zone contiguous check when changing pfn
>     range
>   mm/memory_hotplug: improve shrink_zone_span() subsection boundary
>     checks
> 
>  Documentation/mm/physical_memory.rst | 13 +++++
>  drivers/base/memory.c                |  6 ++
>  include/linux/mmzone.h               | 60 +++++++++++++++++---
>  mm/internal.h                        |  8 +--
>  mm/memory_hotplug.c                  | 54 +++++++++---------
>  mm/mm_init.c                         | 83 +++++++++++-----------------
>  mm/sparse-vmemmap.c                  |  4 +-
>  7 files changed, 136 insertions(+), 92 deletions(-)

Sashiko has a few comments:
https://sashiko.dev/#/patchset/20260520093457.3719960-3-yuan1.liu@intel.com

Can you please check them? 

-- 
Sincerely yours,
Mike.


end of thread, other threads:[~2026-06-01  7:18 UTC | newest]

Thread overview: 12+ messages
2026-05-20  9:34 [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Yuan Liu
2026-05-20  9:34 ` [PATCH v5 1/5] mm: move mirrored memory overlap checking to the outer loop Yuan Liu
2026-05-21  3:33   ` Wei Yang
2026-05-22  7:43     ` Liu, Yuan1
2026-05-25  8:35       ` Wei Yang
2026-05-20  9:34 ` [PATCH v5 2/5] mm: skip non-mirrored ZONE_NORMAL memory map init when kernelcore=mirror Yuan Liu
2026-05-20  9:34 ` [PATCH v5 3/5] mm: remove the special early-section handling from pfn_valid() and for_each_valid_pfn() Yuan Liu
2026-05-20  9:34 ` [PATCH v5 4/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Yuan Liu
2026-05-20  9:34 ` [PATCH v5 5/5] mm/memory_hotplug: improve shrink_zone_span() subsection boundary checks Yuan Liu
2026-05-21  2:59 ` [PATCH v5 0/5] mm/memory_hotplug: optimize zone contiguous check when changing pfn range Wei Yang
2026-05-21  5:16   ` Liu, Yuan1
2026-06-01  7:17 ` Mike Rapoport