All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oscar Salvador <osalvador@suse.de>
To: Tianyou Li <tianyou.li@intel.com>
Cc: David Hildenbrand <david@redhat.com>,
	Mike Rapoport <rppt@kernel.org>,
	Wei Yang <richard.weiyang@gmail.com>,
	linux-mm@kvack.org, Yong Hu <yong.hu@intel.com>,
	Nanhai Zou <nanhai.zou@intel.com>, Yuan Liu <yuan1.liu@intel.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Qiuxu Zhuo <qiuxu.zhuo@intel.com>,
	Yu C Chen <yu.c.chen@intel.com>, Pan Deng <pan.deng@intel.com>,
	Chen Zhang <zhangchen.kidd@jd.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 1/2] mm/memory hotplug/unplug: Optimize zone->contiguous update when changes pfn range
Date: Thu, 11 Dec 2025 06:07:45 +0100	[thread overview]
Message-ID: <aTpRoUt0WvEULeKE@localhost.localdomain> (raw)
In-Reply-To: <20251208152544.1150732-2-tianyou.li@intel.com>

On Mon, Dec 08, 2025 at 11:25:43PM +0800, Tianyou Li wrote:
> When invoke move_pfn_range_to_zone or remove_pfn_range_from_zone, it will
> update the zone->contiguous by checking the new zone's pfn range from the
> beginning to the end, regardless the previous state of the old zone. When
> the zone's pfn range is large, the cost of traversing the pfn range to
> update the zone->contiguous could be significant.
> 
> Add fast paths to quickly detect cases where zone is definitely not
> contiguous without scanning the new zone. The cases are: when the new range
> did not overlap with previous range, the contiguous should be false; if the
> new range adjacent with the previous range, just need to check the new
> range; if the new added pages could not fill the hole of previous zone, the
> contiguous should be false.
> 
> The following test cases of memory hotplug for a VM [1], tested in the
> environment [2], show that this optimization can significantly reduce the
> memory hotplug time [3].
> 
> +----------------+------+---------------+--------------+----------------+
> |                | Size | Time (before) | Time (after) | Time Reduction |
> |                +------+---------------+--------------+----------------+
> | Plug Memory    | 256G |      10s      |      2s      |       80%      |
> |                +------+---------------+--------------+----------------+
> |                | 512G |      33s      |      6s      |       81%      |
> +----------------+------+---------------+--------------+----------------+
> 
> +----------------+------+---------------+--------------+----------------+
> |                | Size | Time (before) | Time (after) | Time Reduction |
> |                +------+---------------+--------------+----------------+
> | Unplug Memory  | 256G |      10s      |      2s      |       80%      |
> |                +------+---------------+--------------+----------------+
> |                | 512G |      34s      |      6s      |       82%      |
> +----------------+------+---------------+--------------+----------------+
> 
> [1] Qemu commands to hotplug 256G/512G memory for a VM:
>     object_add memory-backend-ram,id=hotmem0,size=256G/512G,share=on
>     device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>     qom-set vmem1 requested-size 256G/512G (Plug Memory)
>     qom-set vmem1 requested-size 0G (Unplug Memory)
> 
> [2] Hardware     : Intel Icelake server
>     Guest Kernel : v6.18-rc2
>     Qemu         : v9.0.0
> 
>     Launch VM    :
>     qemu-system-x86_64 -accel kvm -cpu host \
>     -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>     -drive file=./seed.img,format=raw,if=virtio \
>     -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>     -m 2G,slots=10,maxmem=2052472M \
>     -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>     -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>     -nographic -machine q35 \
>     -nic user,hostfwd=tcp::3000-:22
> 
>     Guest kernel auto-onlines newly added memory blocks:
>     echo online > /sys/devices/system/memory/auto_online_blocks
> 
> [3] The time from typing the QEMU commands in [1] to when the output of
>     'grep MemTotal /proc/meminfo' on Guest reflects that all hotplugged
>     memory is recognized.
> 
> Reported-by: Nanhai Zou <nanhai.zou@intel.com>
> Reported-by: Chen Zhang <zhangchen.kidd@jd.com>
> Tested-by: Yuan Liu <yuan1.liu@intel.com>
> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Reviewed-by: Yu C Chen <yu.c.chen@intel.com>
> Reviewed-by: Pan Deng <pan.deng@intel.com>
> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
> Reviewed-by: Yuan Liu <yuan1.liu@intel.com>
> Signed-off-by: Tianyou Li <tianyou.li@intel.com>

Overall this looks good to me, thanks Tianyou Li for working on this.
Just some minor comments below:

> ---
>  mm/internal.h       |  8 +++++-
>  mm/memory_hotplug.c | 64 ++++++++++++++++++++++++++++++++++++++++++---
>  mm/mm_init.c        | 13 +++++++--
>  3 files changed, 79 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/internal.h b/mm/internal.h
> index 1561fc2ff5b8..1b5bba6526d4 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -730,7 +730,13 @@ static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
>  	return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
>  }
>  
> -void set_zone_contiguous(struct zone *zone);
> +enum zone_contig_state {
> +	ZONE_CONTIG_YES,
> +	ZONE_CONTIG_NO,
> +	ZONE_CONTIG_MAYBE,
> +};
> +
> +void set_zone_contiguous(struct zone *zone, enum zone_contig_state state);
>  bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
>  			   unsigned long nr_pages);
>  
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0be83039c3b5..d711f6e2c87f 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -544,6 +544,28 @@ static void update_pgdat_span(struct pglist_data *pgdat)
>  	pgdat->node_spanned_pages = node_end_pfn - node_start_pfn;
>  }
>  
> +static enum zone_contig_state __meminit zone_contig_state_after_shrinking(
> +		struct zone *zone, unsigned long start_pfn, unsigned long nr_pages)

Why do we need the __meminit? These functions are only used from memory-hotplug
code so we should not need it?

> +{
> +	const unsigned long end_pfn = start_pfn + nr_pages;
> +
> +	/*
> +	 * If the removed pfn range inside the original zone span, the contiguous
> +	 * property is surely false.
> +	 */
> +	if (start_pfn > zone->zone_start_pfn && end_pfn < zone_end_pfn(zone))
> +		return ZONE_CONTIG_NO;
> +
> +	/* If the removed pfn range is at the beginning or end of the
> +	 * original zone span, the contiguous property is preserved when
> +	 * the original zone is contiguous.
> +	 */
> +	if (start_pfn == zone->zone_start_pfn || end_pfn == zone_end_pfn(zone))
> +		return zone->contiguous ? ZONE_CONTIG_YES : ZONE_CONTIG_MAYBE;
> +
> +	return ZONE_CONTIG_MAYBE;
> +}
> +
>  void remove_pfn_range_from_zone(struct zone *zone,
>  				      unsigned long start_pfn,
>  				      unsigned long nr_pages)
> @@ -551,6 +573,7 @@ void remove_pfn_range_from_zone(struct zone *zone,
>  	const unsigned long end_pfn = start_pfn + nr_pages;
>  	struct pglist_data *pgdat = zone->zone_pgdat;
>  	unsigned long pfn, cur_nr_pages;
> +	enum zone_contig_state contiguous_state = ZONE_CONTIG_MAYBE;

I think that new_contiguous_state is clearer, but I do not have a strong
opinion here.

>  	/* Poison struct pages because they are now uninitialized again. */
>  	for (pfn = start_pfn; pfn < end_pfn; pfn += cur_nr_pages) {
> @@ -571,12 +594,13 @@ void remove_pfn_range_from_zone(struct zone *zone,
>  	if (zone_is_zone_device(zone))
>  		return;
>  
> +	contiguous_state = zone_contig_state_after_shrinking(zone, start_pfn, nr_pages);
>  	clear_zone_contiguous(zone);
>  
>  	shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
>  	update_pgdat_span(pgdat);
>  
> -	set_zone_contiguous(zone);
> +	set_zone_contiguous(zone, contiguous_state);
>  }
>  
...
> @@ -752,7 +809,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>  {
>  	struct pglist_data *pgdat = zone->zone_pgdat;
>  	int nid = pgdat->node_id;
> -
> +	const enum zone_contig_state contiguous_state =
> +		zone_contig_state_after_growing(zone, start_pfn, nr_pages);

Same comment from remove_pfn_range_from_zone.

>  	clear_zone_contiguous(zone);
>  
>  	if (zone_is_empty(zone))
> @@ -783,7 +841,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>  			 MEMINIT_HOTPLUG, altmap, migratetype,
>  			 isolate_pageblock);
>  
> -	set_zone_contiguous(zone);
> +	set_zone_contiguous(zone, contiguous_state);
>  }
>  
>  struct auto_movable_stats {
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 7712d887b696..e296bd9fac9e 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -2263,11 +2263,19 @@ void __init init_cma_pageblock(struct page *page)
>  }
>  #endif
>  
> -void set_zone_contiguous(struct zone *zone)
> +void set_zone_contiguous(struct zone *zone, enum zone_contig_state state)
>  {
>  	unsigned long block_start_pfn = zone->zone_start_pfn;
>  	unsigned long block_end_pfn;
>  
> +	if (state == ZONE_CONTIG_YES) {
> +		zone->contiguous = true;
> +		return;
> +	}
> +
> +	if (state == ZONE_CONTIG_NO)
> +		return;
> +
>  	block_end_pfn = pageblock_end_pfn(block_start_pfn);
>  	for (; block_start_pfn < zone_end_pfn(zone);
>  			block_start_pfn = block_end_pfn,
> @@ -2283,6 +2291,7 @@ void set_zone_contiguous(struct zone *zone)
>  
>  	/* We confirm that there is no hole */
>  	zone->contiguous = true;
> +

Not needed?

>  }
 

-- 
Oscar Salvador
SUSE Labs


  reply	other threads:[~2025-12-11  5:07 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-08 15:25 [PATCH v5 0/2] Optimize zone->contiguous update and issue fix Tianyou Li
2025-12-08 15:25 ` [PATCH v5 1/2] mm/memory hotplug/unplug: Optimize zone->contiguous update when changes pfn range Tianyou Li
2025-12-11  5:07   ` Oscar Salvador [this message]
2025-12-12  5:27     ` Li, Tianyou
2025-12-19  1:31       ` Li, Tianyou
2025-12-08 15:25 ` [PATCH v5 2/2] mm/memory hotplug: fix zone->contiguous always false when hotplug Tianyou Li
2025-12-11  5:16   ` Oscar Salvador
2025-12-12  5:35     ` Li, Tianyou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aTpRoUt0WvEULeKE@localhost.localdomain \
    --to=osalvador@suse.de \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nanhai.zou@intel.com \
    --cc=pan.deng@intel.com \
    --cc=qiuxu.zhuo@intel.com \
    --cc=richard.weiyang@gmail.com \
    --cc=rppt@kernel.org \
    --cc=tianyou.li@intel.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=yong.hu@intel.com \
    --cc=yu.c.chen@intel.com \
    --cc=yuan1.liu@intel.com \
    --cc=zhangchen.kidd@jd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.