From: Mike Rapoport <rppt@kernel.org>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Yuan Liu <yuan1.liu@intel.com>,
	Oscar Salvador <osalvador@suse.de>,
	Wei Yang <richard.weiyang@gmail.com>,
	linux-mm@kvack.org, Yong Hu <yong.hu@intel.com>,
	Nanhai Zou <nanhai.zou@intel.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Qiuxu Zhuo <qiuxu.zhuo@intel.com>,
	Yu C Chen <yu.c.chen@intel.com>, Pan Deng <pan.deng@intel.com>,
	Tianyou Li <tianyou.li@intel.com>,
	Chen Zhang <zhangchen.kidd@jd.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] mm/memory hotplug/unplug: Optimize zone contiguous check when changing pfn range
Date: Thu, 9 Apr 2026 17:40:05 +0300
Message-ID: <ade6RVeXxDt7ImP4@kernel.org>
In-Reply-To: <17b821b6-0176-43d5-92f7-fe2a0c4f70cf@kernel.org>

On Wed, Apr 08, 2026 at 09:36:14AM +0200, David Hildenbrand (Arm) wrote:
> On 4/8/26 05:16, Yuan Liu wrote:
> > When move_pfn_range_to_zone() or remove_pfn_range_from_zone() updates a
> > zone, set_zone_contiguous() rescans the entire zone pageblock-by-pageblock
> > to rebuild zone->contiguous. For large zones this is a significant cost
> > during memory hotplug and hot-unplug.
> > 
> > Add a new zone member pages_with_online_memmap that tracks the number of
> > pages within the zone span that have an online memory map (including present
> > pages and memory holes whose memory map has been initialized). When
> > spanned_pages == pages_with_online_memmap the zone is contiguous and
> > pfn_to_page() can be called on any PFN in the zone span without further
> > pfn_valid() checks.
> > 
> > Only pages that fall within the current zone span are accounted towards
> > pages_with_online_memmap. A "too small" value is safe, it merely prevents
> > detecting a contiguous zone.
> > 
> > The following test cases of memory hotplug for a VM [1], tested in the
> > environment [2], show that this optimization can significantly reduce the
> > memory hotplug time [3].
> > 
> > +----------------+------+---------------+--------------+----------------+
> > |                | Size | Time (before) | Time (after) | Time Reduction |
> > |                +------+---------------+--------------+----------------+
> > | Plug Memory    | 256G |      10s      |      3s      |       70%      |
> > |                +------+---------------+--------------+----------------+
> > |                | 512G |      36s      |      7s      |       81%      |
> > +----------------+------+---------------+--------------+----------------+
> > 
> > +----------------+------+---------------+--------------+----------------+
> > |                | Size | Time (before) | Time (after) | Time Reduction |
> > |                +------+---------------+--------------+----------------+
> > | Unplug Memory  | 256G |      11s      |      4s      |       64%      |
> > |                +------+---------------+--------------+----------------+
> > |                | 512G |      36s      |      9s      |       75%      |
> > +----------------+------+---------------+--------------+----------------+
> > 
> > [1] Qemu commands to hotplug 256G/512G memory for a VM:
> >     object_add memory-backend-ram,id=hotmem0,size=256G/512G,share=on
> >     device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
> >     qom-set vmem1 requested-size 256G/512G (Plug Memory)
> >     qom-set vmem1 requested-size 0G (Unplug Memory)
> > 
> > [2] Hardware     : Intel Icelake server
> >     Guest Kernel : v7.0-rc4
> >     Qemu         : v9.0.0
> > 
> >     Launch VM    :
> >     qemu-system-x86_64 -accel kvm -cpu host \
> >     -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
> >     -drive file=./seed.img,format=raw,if=virtio \
> >     -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
> >     -m 2G,slots=10,maxmem=2052472M \
> >     -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
> >     -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
> >     -nographic -machine q35 \
> >     -nic user,hostfwd=tcp::3000-:22
> > 
> >     Guest kernel auto-onlines newly added memory blocks:
> >     echo online > /sys/devices/system/memory/auto_online_blocks
> > 
> > [3] The time from typing the QEMU commands in [1] to when the output of
> >     'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
> >     memory is recognized.
> > 
> > Reported-by: Nanhai Zou <nanhai.zou@intel.com>
> > Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
> > Tested-by: Yuan Liu <yuan1.liu@intel.com>
> > Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> > Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> > Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
> > Reviewed-by: Pan Deng <pan.deng@intel.com>
> > Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
> > Co-developed-by: Tianyou Li <tianyou.li@intel.com>
> > Signed-off-by: Tianyou Li <tianyou.li@intel.com>
> > Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
> > Acked-by: David Hildenbrand (Arm) <david@kernel.org>
> > ---
> 
> [...]
> 
> > @@ -842,7 +842,7 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn)
> >   *   zone/node above the hole except for the trailing pages in the last
> >   *   section that will be appended to the zone/node below.
> >   */
> > -static void __init init_unavailable_range(unsigned long spfn,
> > +static unsigned long __init init_unavailable_range(unsigned long spfn,
> >  					  unsigned long epfn,
> >  					  int zone, int node)
> >  {
> > @@ -858,6 +858,7 @@ static void __init init_unavailable_range(unsigned long spfn,
> >  	if (pgcnt)
> >  		pr_info("On node %d, zone %s: %lld pages in unavailable ranges\n",
> >  			node, zone_names[zone], pgcnt);
> > +	return pgcnt;
> >  }
> >  
> >  /*
> > @@ -956,9 +957,22 @@ static void __init memmap_init_zone_range(struct zone *zone,
> >  	memmap_init_range(end_pfn - start_pfn, nid, zone_id, start_pfn,
> >  			  zone_end_pfn, MEMINIT_EARLY, NULL, MIGRATE_MOVABLE,
> >  			  false);
> > +	zone->pages_with_online_memmap += end_pfn - start_pfn;
> >  
> > -	if (*hole_pfn < start_pfn)
> > -		init_unavailable_range(*hole_pfn, start_pfn, zone_id, nid);
> > +	if (*hole_pfn < start_pfn) {
> > +		unsigned long pgcnt;
> > +
> > +		if (*hole_pfn < zone_start_pfn) {
> > +			init_unavailable_range(*hole_pfn, zone_start_pfn,
> > +					       zone_id, nid);
> > +			pgcnt = init_unavailable_range(zone_start_pfn,
> > +					start_pfn, zone_id, nid);
> 
> Indentation of parameters.
> 
> > +		} else {
> > +			pgcnt = init_unavailable_range(*hole_pfn, start_pfn,
> > +					zone_id, nid);
> 
> 
> Same here.
> 
> > +		}
> > +		zone->pages_with_online_memmap += pgcnt;
> > +	}
> 
> 
> Maybe something like the following could make it nicer to read, just a
> thought.
> 
> 
> unsigned long hole_start_pfn = *hole_pfn;
> 
> if (hole_start_pfn < zone_start_pfn) {
> 	init_unavailable_range(hole_start_pfn, zone_start_pfn,
> 			       zone_id, nid);
> 	hole_start_pfn = zone_start_pfn;
> }
> pgcnt = init_unavailable_range(hole_start_pfn, start_pfn,
> 			       zone_id, nid);
> 

Yeah, this looks better :)
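
To illustrate the suggestion above, here is a minimal userspace sketch of
the clamped accounting. It is not the kernel code: struct toy_zone and the
toy_* helpers are hypothetical stand-ins, and the stub for
init_unavailable_range() assumes every PFN in the range gets its memmap
initialized, which the real function does not guarantee.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct zone; only the two counters discussed here. */
struct toy_zone {
	unsigned long spanned_pages;
	unsigned long pages_with_online_memmap;
};

/*
 * Hypothetical stub for init_unavailable_range(): assumes every PFN in
 * [spfn, epfn) gets its memmap initialized and returns that count.
 */
static unsigned long toy_init_unavailable_range(unsigned long spfn,
						unsigned long epfn)
{
	return epfn > spfn ? epfn - spfn : 0;
}

/* Account [start_pfn, end_pfn) plus the hole in front of it, clamped to the zone. */
static void toy_account_range(struct toy_zone *zone,
			      unsigned long zone_start_pfn,
			      unsigned long hole_pfn,
			      unsigned long start_pfn,
			      unsigned long end_pfn)
{
	unsigned long hole_start_pfn = hole_pfn;

	zone->pages_with_online_memmap += end_pfn - start_pfn;

	if (hole_start_pfn < start_pfn) {
		if (hole_start_pfn < zone_start_pfn) {
			/* Below the zone span: initialized but not counted. */
			toy_init_unavailable_range(hole_start_pfn,
						   zone_start_pfn);
			hole_start_pfn = zone_start_pfn;
		}
		zone->pages_with_online_memmap +=
			toy_init_unavailable_range(hole_start_pfn, start_pfn);
	}

	/* A too-small counter is safe; a too-large one would be a bug. */
	assert(zone->pages_with_online_memmap <= zone->spanned_pages);
}

/* The zone is contiguous when the whole span has an online memmap. */
static bool toy_zone_is_contiguous(const struct toy_zone *zone)
{
	return zone->spanned_pages == zone->pages_with_online_memmap;
}
```

With a zone spanning PFNs [100, 150), a hole starting at PFN 90 and memory
at [120, 150), only the 20 hole pages inside the span are counted, so the
counter reaches 50 and matches spanned_pages.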

Sashiko had several comments:
https://sashiko.dev/#/patchset/20260408031615.1831922-1-yuan1.liu%40intel.com

I skipped the ones related to hotplug, but in the mm_init part the comment
about zones that can have overlapping physical spans when mirrored
kernelcore is enabled seems valid.
 
> -- 
> Cheers,
> David

-- 
Sincerely yours,
Mike.

