public inbox for linux-mm@kvack.org
From: Wei Yang <richard.weiyang@gmail.com>
To: "Liu, Yuan1" <yuan1.liu@intel.com>
Cc: Wei Yang <richard.weiyang@gmail.com>,
	David Hildenbrand <david@kernel.org>,
	Oscar Salvador <osalvador@suse.de>,
	Mike Rapoport <rppt@kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"Hu, Yong" <yong.hu@intel.com>,
	"Zou, Nanhai" <nanhai.zou@intel.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	"Zhuo, Qiuxu" <qiuxu.zhuo@intel.com>,
	"Chen, Yu C" <yu.c.chen@intel.com>,
	"Deng, Pan" <pan.deng@intel.com>,
	"Li, Tianyou" <tianyou.li@intel.com>,
	Chen Zhang <zhangchen.kidd@jd.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 1/2] mm: move overlap memory map init check to memmap_init()
Date: Fri, 24 Apr 2026 01:05:35 +0000
Message-ID: <20260424010535.54sh5z6nkqt3j6du@master>
In-Reply-To: <MW4PR11MB6936682EB5EF1993DE7BDAEEA32D2@MW4PR11MB6936.namprd11.prod.outlook.com>

On Wed, Apr 22, 2026 at 09:28:52AM +0000, Liu, Yuan1 wrote:
>> -----Original Message-----
>> From: Wei Yang <richard.weiyang@gmail.com>
>> Sent: Wednesday, April 22, 2026 11:27 AM
>> To: Wei Yang <richard.weiyang@gmail.com>
>> Cc: Liu, Yuan1 <yuan1.liu@intel.com>; David Hildenbrand
>> <david@kernel.org>; Oscar Salvador <osalvador@suse.de>; Mike Rapoport
>> <rppt@kernel.org>; linux-mm@kvack.org; Hu, Yong <yong.hu@intel.com>; Zou,
>> Nanhai <nanhai.zou@intel.com>; Tim Chen <tim.c.chen@linux.intel.com>;
>> Zhuo, Qiuxu <qiuxu.zhuo@intel.com>; Chen, Yu C <yu.c.chen@intel.com>;
>> Deng, Pan <pan.deng@intel.com>; Li, Tianyou <tianyou.li@intel.com>; Chen
>> Zhang <zhangchen.kidd@jd.com>; linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH v4 1/2] mm: move overlap memory map init check to
>> memmap_init()
>> 
>> On Wed, Apr 22, 2026 at 01:11:26AM +0000, Wei Yang wrote:
>> >On Tue, Apr 21, 2026 at 08:55:07AM -0400, Yuan Liu wrote:
>> >>Move the overlap memmap init check from memmap_init_range() into
>> >>memmap_init().
>> >>
>> >>When mirrored kernelcore is enabled, avoid memory map initialization
>> >>for overlap regions. There are two cases that may overlap: a mirror
>> >>memory region assigned to movable zone, or a non-mirror memory region
>> >>assigned to a non-movable zone but falling within the movable zone
>> >>range.
>> >>
>> >>Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
>> >>---
>> >> mm/mm_init.c | 37 +++++++++++++------------------------
>> >> 1 file changed, 13 insertions(+), 24 deletions(-)
>> >>
>> >>diff --git a/mm/mm_init.c b/mm/mm_init.c
>> >>index df34797691bd..2b5233060504 100644
>> >>--- a/mm/mm_init.c
>> >>+++ b/mm/mm_init.c
>> >>@@ -797,28 +797,6 @@ void __meminit reserve_bootmem_region(phys_addr_t
>> start,
>> >> 	}
>> >> }
>> >>
>> >>-/* If zone is ZONE_MOVABLE but memory is mirrored, it is an overlapped
>> init */
>> >>-static bool __meminit
>> >>-overlap_memmap_init(unsigned long zone, unsigned long *pfn)
>> >>-{
>> >>-	static struct memblock_region *r;
>> >>-
>> >>-	if (mirrored_kernelcore && zone == ZONE_MOVABLE) {
>> >>-		if (!r || *pfn >= memblock_region_memory_end_pfn(r)) {
>> >>-			for_each_mem_region(r) {
>> >>-				if (*pfn < memblock_region_memory_end_pfn(r))
>> >>-					break;
>> >>-			}
>> >>-		}
>> >>-		if (*pfn >= memblock_region_memory_base_pfn(r) &&
>> >>-		    memblock_is_mirror(r)) {
>> >>-			*pfn = memblock_region_memory_end_pfn(r);
>> >>-			return true;
>> >>-		}
>> >>-	}
>> >>-	return false;
>> >>-}
>> >>-
>> >> /*
>> >>  * Only struct pages that correspond to ranges defined by
>> memblock.memory
>> >>  * are zeroed and initialized by going through __init_single_page()
>> during
>> >>@@ -905,8 +883,6 @@ void __meminit memmap_init_range(unsigned long size,
>> int nid, unsigned long zone
>> >> 		 * function.  They do not exist on hotplugged memory.
>> >> 		 */
>> >> 		if (context == MEMINIT_EARLY) {
>> >>-			if (overlap_memmap_init(zone, &pfn))
>> >>-				continue;
>> >> 			if (defer_init(nid, pfn, zone_end_pfn)) {
>> >> 				deferred_struct_pages = true;
>> >> 				break;
>> >>@@ -971,6 +947,7 @@ static void __init memmap_init(void)
>> >>
>> >> 	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
>> {
>> >> 		struct pglist_data *node = NODE_DATA(nid);
>> >>+		struct memblock_region *r = &memblock.memory.regions[i];
>> >>
>> >> 		for (j = 0; j < MAX_NR_ZONES; j++) {
>> >> 			struct zone *zone = node->node_zones + j;
>> >>@@ -978,6 +955,18 @@ static void __init memmap_init(void)
>> >> 			if (!populated_zone(zone))
>> >> 				continue;
>> >>
>> >>+			if (mirrored_kernelcore) {
>> >>+				const bool is_mirror = memblock_is_mirror(r);
>> >>+				const bool is_movable_zone = (j == ZONE_MOVABLE);
>> >>+
>> >>+				if (is_mirror && is_movable_zone)
>> >>+					continue;
>> >>+
>> >>+				if (!is_mirror && !is_movable_zone &&
>> >>+				    start_pfn >= zone_movable_pfn[nid])
>> >>+					continue;
>> >
>> >IIUC, when mirrored_kernelcore is set but !memblock_has_mirror() or
>> >is_kdump_kernel(), zone_movable_pfn[nid] is kept to be 0.
>> >
>> >This means it will skip all memory regions.
>> >
>> 
>> Did some tests. When mirrored_kernelcore && !memblock_has_mirror(), which
>> means there is no is_mirror memblock. This will leave
>> zone_movable_pfn[nid] 0.
>> 
>> So for all memory regions, the above logic will skip them.
>> 
>> Adjust the code as below, my local test could pass and kernel bootup as
>> expected.
>> 
>> From 6351ac79a17edbfd830510fba2959ddc47b17258 Mon Sep 17 00:00:00 2001
>> From: Wei Yang <richard.weiyang@gmail.com>
>> Date: Wed, 22 Apr 2026 09:13:24 +0800
>> Subject: [PATCH] skip overlap region higher level
>> 
>> ---
>>  mm/mm_init.c | 29 ++++++++++++++++++++++-------
>>  1 file changed, 22 insertions(+), 7 deletions(-)
>> 
>> diff --git a/mm/mm_init.c b/mm/mm_init.c
>> index 79f93f2a90cf..7a85ba58e87f 100644
>> --- a/mm/mm_init.c
>> +++ b/mm/mm_init.c
>> @@ -916,8 +916,8 @@ void __meminit memmap_init_range(unsigned long size,
>> int nid, unsigned long zone
>>  		 * function.  They do not exist on hotplugged memory.
>>  		 */
>>  		if (context == MEMINIT_EARLY) {
>> -			if (overlap_memmap_init(zone, &pfn))
>> -				continue;
>> +			// if (overlap_memmap_init(zone, &pfn))
>> +			// 	continue;
>>  			if (defer_init(nid, pfn, zone_end_pfn)) {
>>  				deferred_struct_pages = true;
>>  				break;
>> @@ -974,6 +974,17 @@ static void __init memmap_init_zone_range(struct zone
>> *zone,
>>  	*hole_pfn = end_pfn;
>>  }
>> 
>> +static bool __init region_overlapped(struct memblock_region *rgn,
>> unsigned long zone_type)
>> +{
>> +	if (zone_type == ZONE_MOVABLE && memblock_is_mirror(rgn))
>> +		return true;
>> +
>> +	if (zone_type == ZONE_NORMAL && !memblock_is_mirror(rgn))
>> +		return true;
>> +
>> +	return false;
>> +}
>> +
>>  static void __init memmap_init(void)
>>  {
>>  	unsigned long start_pfn, end_pfn;
>> @@ -985,10 +996,15 @@ static void __init memmap_init(void)
>> 
>>  		for (j = 0; j < MAX_NR_ZONES; j++) {
>>  			struct zone *zone = node->node_zones + j;
>> +			struct memblock_region *r = &memblock.memory.regions[i];
>> 
>>  			if (!populated_zone(zone))
>>  				continue;
>> 
>> +			if (mirrored_kernelcore && zone_movable_pfn[nid] &&
>> +			    region_overlapped(r, j))
>> +				continue;
>> +
>>  			memmap_init_zone_range(zone, start_pfn, end_pfn,
>>  					       &hole_pfn);
>>  			zone_id = j;
>> @@ -1257,13 +1273,12 @@ static unsigned long __init
>> zone_absent_pages_in_node(int nid,
>>  			end_pfn = clamp(memblock_region_memory_end_pfn(r),
>>  					zone_start_pfn, zone_end_pfn);
>> 
>> -			if (zone_type == ZONE_MOVABLE &&
>> -			    memblock_is_mirror(r))
>> -				nr_absent += end_pfn - start_pfn;
>> +			if (start_pfn == end_pfn)
>> +				continue;
>> 
>> -			if (zone_type == ZONE_NORMAL &&
>> -			    !memblock_is_mirror(r))
>> +			if (region_overlapped(r, zone_type))
>>  				nr_absent += end_pfn - start_pfn;
>> +
>>  		}
>>  	}
>
>Hi Wei Yang
>
>I ran some tests based on this patch and didn't observe any issues. 
>Thanks for the patch.
>

You are welcome.

Well, maybe we need to do something more. Let me explain what I see.

My assumption about where mirror memory can be placed is:

   When there is mirror memory in the system, all memory in the low zones
   should be mirror memory. Non-mirror memory can only be in the range of
   ZONE_NORMAL.

   And in the range of ZONE_NORMAL:

     * there could be no mirror memory at all
     * the mirror memory could be at the head or in the middle of ZONE_NORMAL

Take my test machine as an example:

    MEMBLOCK configuration:
     memory size = 0x000000017ff7dc00 reserved size = 0x0000000005a9a9c2
     memory.cnt  = 0x3
     memory[0x0]     [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0 flags: 0x0
     memory[0x1]     [0x0000000000100000-0x00000000bffdefff], 0x00000000bfedf000 bytes on node 0 flags: 0x0
     memory[0x2]     [0x0000000100000000-0x00000001bfffffff], 0x00000000c0000000 bytes on node 1 flags: 0x0

The first two memblock regions span ZONE_DMA and ZONE_DMA32. The third one
spans ZONE_NORMAL (when kernelcore is not specified).

So I tested with the code change below:

@@ -147,6 +148,14 @@ static int __init numa_register_nodes(void)
        }
 
        /* Dump memblock with node info and return. */
+
+       /* Mark mirror by hand */
+       for_each_mem_region(r) {
+               if (i++ < 2)
+                       memblock_mark_mirror(r->base, r->size);
+       }
+

This marks the first two memblock regions as mirror. Then I use either

        memblock_mark_mirror(0x100000000, 0x40000000);
or
        memblock_mark_mirror(0x140000000, 0x40000000);

to mark the first 1G or the second 1G of the third memblock region as mirror,
to mimic the overlap case.

So I manually created 3 cases:

A: all ZONE_NORMAL is non-mirror
  memory[0x0]     [0x0000000000001000-0x000000000009efff], node 0 flags: 0x2 mirror
  memory[0x1]     [0x0000000000100000-0x00000000bffdefff], node 0 flags: 0x2 mirror
  memory[0x2]     [0x0000000100000000-0x00000001bfffffff], node 1 flags: 0x0 non-mirror

B: head 1G of ZONE_NORMAL is mirror
  memory[0x0]     [0x0000000000001000-0x000000000009efff], node 0 flags: 0x2 mirror
  memory[0x1]     [0x0000000000100000-0x00000000bffdefff], node 0 flags: 0x2 mirror
  memory[0x2]     [0x0000000100000000-0x000000013fffffff], node 1 flags: 0x2 mirror
  memory[0x3]     [0x0000000140000000-0x00000001bfffffff], node 1 flags: 0x0 non-mirror

C: second 1G of ZONE_NORMAL is mirror
  memory[0x0]     [0x0000000000001000-0x000000000009efff], node 0 flags: 0x2 mirror
  memory[0x1]     [0x0000000000100000-0x00000000bffdefff], node 0 flags: 0x2 mirror
  memory[0x2]     [0x0000000100000000-0x000000013fffffff], node 1 flags: 0x0 non-mirror
  memory[0x3]     [0x0000000140000000-0x000000017fffffff], node 1 flags: 0x2 mirror
  memory[0x4]     [0x0000000180000000-0x00000001bfffffff], node 1 flags: 0x0 non-mirror
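
To make the per-region decision concrete, here is a minimal userspace sketch of
the region_overlapped() predicate applied to the node 1 regions of case C. It
is only a model: the model_* names and types are hypothetical stand-ins, not
kernel code, and it ignores populated_zone() and the zone_movable_pfn[nid]
check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; all names are hypothetical. */
enum model_zone { MODEL_ZONE_NORMAL, MODEL_ZONE_MOVABLE };

struct model_region {
	unsigned long long base;	/* first byte of the range */
	unsigned long long size;	/* length in bytes */
	bool mirror;			/* models memblock_is_mirror() */
};

/* Case C: the node 1 ranges from the memblock dump above. */
static const struct model_region case_c_node1[] = {
	{ 0x100000000ULL, 0x40000000ULL, false },
	{ 0x140000000ULL, 0x40000000ULL, true  },
	{ 0x180000000ULL, 0x40000000ULL, false },
};

/* Same predicate as the proposed region_overlapped() helper. */
static bool model_region_overlapped(const struct model_region *rgn,
				    enum model_zone zone)
{
	if (zone == MODEL_ZONE_MOVABLE && rgn->mirror)
		return true;
	if (zone == MODEL_ZONE_NORMAL && !rgn->mirror)
		return true;
	return false;
}

/* Count the regions that are NOT skipped for the given zone. */
static size_t model_regions_kept(const struct model_region *rgns, size_t n,
				 enum model_zone zone)
{
	size_t kept = 0;

	assert(rgns != NULL);
	for (size_t i = 0; i < n; i++)
		if (!model_region_overlapped(&rgns[i], zone))
			kept++;
	return kept;
}
```

In this model only the mirror region is kept for ZONE_NORMAL, and the two
non-mirror regions are kept for ZONE_MOVABLE.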

The change I proposed works fine for A and B, but for C the pages in
[0x140000000-0x17fffffff] are misplaced.

    Node 1, zone  Normal
            spanned  0
            present  0           <-- missing
            managed  0
    Node 1, zone  Movable
            spanned  786432
            present  524288
            managed  773552      <-- but put in here
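
As a sanity check on these numbers, the byte ranges of case C can be converted
to page counts. This is a small userspace sketch assuming 4 KiB pages; the
model_* names are hypothetical.

```c
#include <assert.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Number of pages in the inclusive byte range [start, last]. */
static unsigned long long model_pages_in_range(unsigned long long start,
					       unsigned long long last)
{
	assert(last >= start);
	return (last - start + 1) >> MODEL_PAGE_SHIFT;
}
```

Under that assumption, the mirror range [0x140000000-0x17fffffff] is 262144
pages, and the whole node 1 range [0x100000000-0x1bfffffff] is 786432 pages,
which leaves 524288 non-mirror pages. So ZONE_NORMAL would be expected to show
262144 present pages rather than 0.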

The reason is that adjust_zone_range_for_zone_movable() truncates ZONE_NORMAL
to an empty range, since zone_movable_pfn[nid] equals ZONE_NORMAL's start. So
this range is skipped, and then by "accident" it is initialized by
init_unavailable_range() as ZONE_MOVABLE. It is then freed to ZONE_MOVABLE in
__free_pages_core().

After removing this truncation, the zone stats look good:

Node 1, zone   Normal
        spanned  786432
        present  262144
        managed  249310
Node 1, zone  Movable
        spanned  786432
        present  524288
        managed  517223

@@ -1204,10 +1204,7 @@ static void __init adjust_zone_range_for_zone_movable(int nid,
                        *zone_start_pfn < zone_movable_pfn[nid] &&
                        *zone_end_pfn > zone_movable_pfn[nid]) {
                        *zone_end_pfn = zone_movable_pfn[nid];
-
-               /* Check if this whole range is within ZONE_MOVABLE */
-               } else if (*zone_start_pfn >= zone_movable_pfn[nid])
-                       *zone_start_pfn = *zone_end_pfn;
+               }
        }
 }
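
The effect of dropping that branch can be sketched in userspace as below. This
is a simplified model of only the non-movable path with mirrored kernelcore,
not the kernel function itself; the model_* name and the keep_truncation flag
are hypothetical, and movable_pfn stands in for zone_movable_pfn[nid].

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return the number of pages spanned by a non-movable zone
 * [start_pfn, end_pfn) after the adjustment.
 *
 * keep_truncation == true models the current code, which empties the zone
 * when it starts at or above movable_pfn; false models the change above.
 */
static unsigned long model_normal_spanned(unsigned long start_pfn,
					  unsigned long end_pfn,
					  unsigned long movable_pfn,
					  bool keep_truncation)
{
	assert(start_pfn <= end_pfn);

	/* No ZONE_MOVABLE on this node: nothing to adjust. */
	if (!movable_pfn)
		return end_pfn - start_pfn;

	/* Whole range is within ZONE_MOVABLE: collapse it to empty. */
	if (keep_truncation && start_pfn >= movable_pfn)
		start_pfn = end_pfn;

	return end_pfn - start_pfn;
}
```

With ZONE_NORMAL on node 1 covering pfn [0x100000, 0x1c0000) and movable_pfn
equal to 0x100000, the model gives a span of 0 pages with the truncation and
786432 pages without it, which matches the "spanned" values in the two zone
dumps.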

All of the above analysis is based on my assumption about where mirror memory
can be placed in a system. If that assumption does not hold, the analysis may
not hold either.

-- 
Wei Yang
Help you, Help me


Thread overview: 16+ messages
2026-04-21 12:55 [PATCH v4 0/2] mm/memory hotplug/unplug: Optimize zone contiguous check when changing pfn range Yuan Liu
2026-04-21 12:55 ` [PATCH v4 1/2] mm: move overlap memory map init check to memmap_init() Yuan Liu
2026-04-22  1:11   ` Wei Yang
2026-04-22  3:26     ` Wei Yang
2026-04-22  9:28       ` Liu, Yuan1
2026-04-24  1:05         ` Wei Yang [this message]
2026-04-24  7:49           ` Liu, Yuan1
2026-04-22  7:08     ` Liu, Yuan1
2026-04-25  9:01   ` Mike Rapoport
2026-04-26  4:00     ` Wei Yang
2026-04-27  0:31     ` Liu, Yuan1
2026-04-21 12:55 ` [PATCH v4 2/2] mm/memory hotplug/unplug: Optimize zone contiguous check when changing pfn range Yuan Liu
2026-04-22  7:46 ` [PATCH v4 0/2] " David Hildenbrand (Arm)
2026-04-22  7:56   ` Liu, Yuan1
2026-04-22 19:13     ` David Hildenbrand (Arm)
2026-04-23  3:17       ` Liu, Yuan1
