From: Mel Gorman <mgorman@suse.de>
To: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Tejun Heo <tj@kernel.org>, Len Brown <lenb@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	"H. Peter Anvin" <hpa@zytor.com>, Toshi Kani <toshi.kani@hp.com>,
	Wanpeng Li <liwanp@linux.vnet.ibm.com>,
	Thomas Renninger <trenn@suse.de>, Yinghai Lu <yinghai@kernel.org>,
	Jiang Liu <jiang.liu@huawei.com>,
	Wen Congyang <wency@cn.fujitsu.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>,
	Taku Izumi <izumi.taku@jp.fujitsu.com>,
	Minchan Kim <minchan@kernel.org>,
	"mina86@mina86.com" <mina86@mina86.com>,
	"gong.chen@linux.intel.com" <gong.chen@linux.intel.com>,
	Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>,
	"lwoodman@redhat.com" <lwoodman@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	"jweiner@redhat.com" <jweiner@redhat.com>,
	Prarit Bhargava <prarit@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>, Chen Tang <imtangchen@gmail.com>,
	Tang Chen <tangchen@cn.fujitsu.com>,
	Zhang Yanfei <zhangyanfei.yes@gmail.com>
Subject: Re: [PATCH RESEND part2 v2 8/8] x86, numa, acpi, memory-hotplug: Make movable_node have higher priority
Date: Thu, 16 Jan 2014 17:03:19 +0000
Message-ID: <20140116170253.GA24740@suse.de>
In-Reply-To: <529D423F.3030200@cn.fujitsu.com>

On Tue, Dec 03, 2013 at 10:30:23AM +0800, Zhang Yanfei wrote:
> From: Tang Chen <tangchen@cn.fujitsu.com>
> 
> If users specify the original movablecore=nn@ss boot option, the kernel will
> arrange [ss, ss+nn) as ZONE_MOVABLE. The kernelcore=nn@ss boot option is similar
> except it specifies ZONE_NORMAL ranges.
> 
> Now, if users specify "movable_node" in kernel commandline, the kernel will
> arrange hotpluggable memory in SRAT as ZONE_MOVABLE. And if users do this, all
> the other movablecore=nn@ss and kernelcore=nn@ss options should be ignored.
> 
> For those who don't want this, just specify nothing. The kernel will act as
> before.
> 
> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
> Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
> Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
> ---
>  mm/page_alloc.c |   28 ++++++++++++++++++++++++++--
>  1 files changed, 26 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index dd886fa..768ea0e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5021,9 +5021,33 @@ static void __init find_zone_movable_pfns_for_nodes(void)
>  	nodemask_t saved_node_state = node_states[N_MEMORY];
>  	unsigned long totalpages = early_calculate_totalpages();
>  	int usable_nodes = nodes_weight(node_states[N_MEMORY]);
> +	struct memblock_type *type = &memblock.memory;
> +
> +	/* Need to find movable_zone earlier when movable_node is specified. */
> +	find_usable_zone_for_movable();
> +
> +	/*
> +	 * If movable_node is specified, ignore kernelcore and movablecore
> +	 * options.
> +	 */
> +	if (movable_node_is_enabled()) {
> +		for (i = 0; i < type->cnt; i++) {
> +			if (!memblock_is_hotpluggable(&type->regions[i]))
> +				continue;
> +
> +			nid = type->regions[i].nid;
> +
> +			usable_startpfn = PFN_DOWN(type->regions[i].base);
> +			zone_movable_pfn[nid] = zone_movable_pfn[nid] ?
> +				min(usable_startpfn, zone_movable_pfn[nid]) :
> +				usable_startpfn;
> +		}
> +
> +		goto out2;

out2 is not the most descriptive label that ever existed. out_align?

There is an assumption here that the hot-pluggable regions of memory
are always at the upper end of the physical address space for that NUMA
node. What prevents the hardware having something like

node0:	0-4G	Not removable
node0:	4-8G	Removable
node0:	8-12G	Not removable

?

By the looks of things, the current code would make the whole 4-12G
range of memory ZONE_MOVABLE even though the 8-12G region cannot be
hot-removed. That would compound any problems related to lowmem-like
pressure as the 8-12G region cannot be used for kernel allocations like
inodes.
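
To make the concern concrete, here is a minimal user-space sketch, not
kernel code. It mirrors the min() logic of the loop in the patch and
assumes ZONE_MOVABLE runs from the computed start PFN to the end of the
node. struct region, movable_start_pfn(), trapped_bytes(),
example_layout and the 4K page size are all hypothetical names and
values introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GiB		(1ULL << 30)
#define PAGE_SHIFT	12	/* assumption: 4K pages */
#define PFN_DOWN(x)	((x) >> PAGE_SHIFT)

/* Hypothetical stand-in for a memblock region, not the kernel struct. */
struct region {
	uint64_t base;
	uint64_t size;
	int nid;
	bool hotpluggable;
};

/*
 * Same idea as the loop in the patch: the movable zone of a node starts
 * at the lowest start PFN of any hotpluggable region on that node.
 * Returns 0 if the node has no hotpluggable region.
 */
static uint64_t movable_start_pfn(const struct region *r, size_t cnt, int nid)
{
	uint64_t start = 0;
	size_t i;

	assert(r != NULL || cnt == 0);
	for (i = 0; i < cnt; i++) {
		uint64_t pfn;

		if (!r[i].hotpluggable || r[i].nid != nid)
			continue;
		pfn = PFN_DOWN(r[i].base);
		start = start ? (pfn < start ? pfn : start) : pfn;
	}
	return start;
}

/*
 * Bytes of non-hotpluggable memory on the node that lie at or above the
 * movable start, i.e. memory that would land in ZONE_MOVABLE although
 * it can never be removed.
 */
static uint64_t trapped_bytes(const struct region *r, size_t cnt, int nid)
{
	uint64_t start = movable_start_pfn(r, cnt, nid) << PAGE_SHIFT;
	uint64_t total = 0;
	size_t i;

	if (!start)
		return 0;
	for (i = 0; i < cnt; i++) {
		uint64_t end = r[i].base + r[i].size;

		if (r[i].hotpluggable || r[i].nid != nid || end <= start)
			continue;
		total += end - (r[i].base > start ? r[i].base : start);
	}
	return total;
}

/* The layout from the example above. */
static const struct region example_layout[] = {
	{ 0 * GiB, 4 * GiB, 0, false },	/* node0: 0-4G  not removable */
	{ 4 * GiB, 4 * GiB, 0, true  },	/* node0: 4-8G  removable     */
	{ 8 * GiB, 4 * GiB, 0, false },	/* node0: 8-12G not removable */
};
```

Calling movable_start_pfn(example_layout, 3, 0) gives the PFN of the 4G
boundary, and trapped_bytes(example_layout, 3, 0) reports the 4G of
non-removable memory at 8-12G that would sit inside the movable range.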

-- 
Mel Gorman
SUSE Labs


