From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
To: Tang Chen <tangchen@cn.fujitsu.com>
Cc: rob@landley.net, tglx@linutronix.de, mingo@redhat.com,
hpa@zytor.com, yinghai@kernel.org, akpm@linux-foundation.org,
wency@cn.fujitsu.com, trenn@suse.de, liwanp@linux.vnet.ibm.com,
mgorman@suse.de, walken@google.com, riel@redhat.com,
khlebnikov@openvz.org, tj@kernel.org, minchan@kernel.org,
m.szyprowski@samsung.com, mina86@mina86.com,
laijs@cn.fujitsu.com, linfeng@cn.fujitsu.com,
kosaki.motohiro@jp.fujitsu.com, jiang.liu@huawei.com,
guz.fnst@cn.fujitsu.com, x86@kernel.org,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH 00/11] Introduce movablemem_map=acpi boot option.
Date: Tue, 9 Apr 2013 14:14:39 +0900
Message-ID: <5163A3BF.3030900@jp.fujitsu.com>
In-Reply-To: <1365154801-473-1-git-send-email-tangchen@cn.fujitsu.com>

Hi Tang,

The patch works well on my x86_64 box.
I confirmed that the hotpluggable node is allocated as ZONE_MOVABLE.
So feel free to add:

Tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

Nitpick below.

2013/04/05 18:39, Tang Chen wrote:
> Before this patch-set, we introduced the movablemem_map boot option, which allowed
> users to specify physical address ranges to set memory as movable. This is not
> user friendly enough for normal users.
>
> So now we introduce just movablemem_map=acpi, which lets users enable or disable
> the kernel's use of the Hot Pluggable bit in SRAT to determine which memory ranges
> are hotpluggable and to set them as ZONE_MOVABLE.
>
> This patch-set is based on Yinghai's patch-set:
> v1: https://lkml.org/lkml/2013/3/7/642
> v2: https://lkml.org/lkml/2013/3/10/47
>
> So it supports allocating pagetable pages on local nodes.
>
> We also split the large patch-set into smaller ones, which should make it easier to review.
>
>
> ========================================================================
> [What we are doing]
> This patchset introduces a boot option for users to specify ZONE_MOVABLE
> memory map for each node in the system. Users can use it as follows:
>
> 1. movablemem_map=acpi
> In this way, the kernel will use the Hot Pluggable bit in SRAT to determine
> ZONE_MOVABLE for each node. All the ranges the user has specified will be
> ignored.
>
>
> [Why we do this]
> If we hot remove a memory device, it cannot have kernel memory,
> because Linux cannot migrate kernel memory currently. Therefore,
> we have to guarantee that the hot removed memory has only movable
> memory.
> (Here is an exception: When we implement the node hotplug functionality,
> for kernel memory whose life cycle is the same as the node, such as
> pagetables, vmemmap and so on, although the kernel cannot migrate it,
> we can still put it on the local node because we can free it before we
> hot-remove the node. This is not completely implemented yet.)
>
> Linux has two boot options, kernelcore= and movablecore=, for
> creating movable memory. These boot options can specify the amount
> of memory to use as kernel or movable memory. Using them, we can
> create ZONE_MOVABLE which has only movable memory.
> (NOTE: doing this will degrade NUMA performance because the kernel won't
> be able to distribute kernel memory evenly to each node.)
>
> But it does not fulfill a requirement of memory hot remove, because
> even if we specify the boot options, movable memory is distributed
> evenly across the nodes. So when we want to hot remove memory whose
> range is 0x80000000-0xc0000000, we have no way to specify
> that memory as movable memory.
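
For illustration only, the limitation described above can be modeled in a few lines of Python. This is a deliberately simplified sketch, not the kernel's actual algorithm: it assumes one contiguous physical range per node, an equal movable share per node, and that each share is carved from the top of its node. The function name `split_movable_evenly` and the two-node layout are hypothetical.

```python
def split_movable_evenly(nodes, movable_bytes):
    """Model an even split of movable memory across nodes.

    nodes: list of (start, end) physical ranges, one per node (assumption).
    Returns one (start, end) movable slice per node, taken from the top
    of that node.
    """
    share = movable_bytes // len(nodes)
    slices = []
    for start, end in nodes:
        # A node cannot contribute more movable memory than it has.
        size = min(share, end - start)
        slices.append((end - size, end))
    return slices


# Hypothetical layout: node 1 is the range we would like to hot remove.
nodes = [(0x0, 0x80000000), (0x80000000, 0xC0000000)]
slices = split_movable_evenly(nodes, 0x40000000)

for (start, end), (m_start, m_end) in zip(nodes, slices):
    fully_movable = (m_start, m_end) == (start, end)
    print(f"node {start:#x}-{end:#x}: movable {m_start:#x}-{m_end:#x}, "
          f"fully movable: {fully_movable}")
```

In this model no amount-based option value can make exactly the second node movable without also changing the first, which is the gap a range-based or SRAT-based selection is meant to close.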
>
> Furthermore, even if we can use SRAT, users still need an interface
> to enable/disable this functionality if they don't want to lose their
> NUMA performance. So I think, a user interface is always needed.
>
> So we propose this new feature, which enables or disables the kernel setting
> hotpluggable memory as ZONE_MOVABLE.
>
>
> [Ways to do this]
> There may be 2 ways to specify movable memory.
> 1. use firmware information
> 2. use boot option
>
> 1. use firmware information
> According to ACPI spec 5.0, the SRAT table has a memory affinity structure
> and the structure has a Hot Pluggable field. See "5.2.16.2 Memory
> Affinity Structure". If we use the information, we might be able to
> specify movable memory by firmware. For example, if the Hot Pluggable
> field is enabled, Linux sets the memory as movable memory.
>
> 2. use boot option
> This is our proposal. A new boot option can specify the memory range to use
> as movable memory.
>
>
> [How we do this]
> We now propose a boot option, but support the first way above. A boot option
> is always needed because setting memory as movable will degrade NUMA
> performance. So at least, we need an interface to enable/disable it so that users
> who don't want to use memory hotplug functionality will also be happy.
>
>
> [How to use]
> Specify movablemem_map=acpi on the kernel command line:
> *
> * SRAT: |_____| |_____| |_________| |_________| ......
> * node id: 0 1 1 2
> * hotpluggable: n y y n
> * ZONE_MOVABLE: |_____| |_________|
> *
> NOTE: 1) Before parsing SRAT, memblock has already reserved some memory ranges
> for other purposes, such as for the kernel image. We cannot prevent
> the kernel from using this memory, so we need to exclude it
> even if it is hotpluggable.
> Furthermore, to ensure the kernel has enough memory to boot, we make
> all the memory on the node which the kernel resides in
> un-hotpluggable.
> 2) In this case, all the user specified memory ranges will be ignored.
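
For illustration only, the selection rule in the diagram and notes above can be sketched in Python. This is a simplified model under stated assumptions, not code from the patch set: the `SratEntry` type, the `movable_ranges` function, and the addresses are all hypothetical, and the exclusion of memblock-reserved ranges from note 1) is not modeled.

```python
from dataclasses import dataclass


@dataclass
class SratEntry:
    """One SRAT memory affinity range (hypothetical, simplified)."""
    start: int
    end: int
    node: int
    hotpluggable: bool


def movable_ranges(entries, kernel_nodes):
    """Return (start, end, node) for ranges that would become ZONE_MOVABLE.

    kernel_nodes: set of node ids the kernel resides in; every range on
    those nodes is treated as un-hotpluggable.
    """
    selected = []
    for e in entries:
        if e.hotpluggable and e.node not in kernel_nodes:
            selected.append((e.start, e.end, e.node))
    return selected


# Same shape as the diagram: node ids 0 1 1 2, hotpluggable n y y n.
entries = [
    SratEntry(0x000000000, 0x040000000, 0, False),
    SratEntry(0x040000000, 0x080000000, 1, True),
    SratEntry(0x080000000, 0x100000000, 1, True),
    SratEntry(0x100000000, 0x180000000, 2, False),
]

for start, end, node in movable_ranges(entries, kernel_nodes={0}):
    print(f"node {node}: {start:#x}-{end:#x} -> ZONE_MOVABLE")
```

With the kernel on node 0, only the two ranges on node 1 are selected, matching the ZONE_MOVABLE row of the diagram. If the kernel resided on node 1 instead, nothing would be selected.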
>
> We also need to consider the following points:
> 1) Using this boot option could degrade NUMA performance because the kernel
> memory will not be distributed on each node evenly. So users who don't
> want to lose their NUMA performance should just not use it.
> 2) If kernelcore or movablecore is also specified, movablemem_map will have
> higher priority to be satisfied.
> 3) This option has no conflict with memmap option.
>
> Tang Chen (10):
> acpi: Print hotplug info in SRAT.
> numa, acpi, memory-hotplug: Add movablemem_map=acpi boot option.
> x86, numa, acpi, memory-hotplug: Introduce hotplug info into struct
> numa_meminfo.
> x86, numa, acpi, memory-hotplug: Consider hotplug info when cleanup
> numa_meminfo.
> X86, numa, acpi, memory-hotplug: Add hotpluggable ranges to
> movablemem_map.
It has a whitespace error.
> x86, numa, acpi, memory-hotplug: Make any node which the kernel
> resides in un-hotpluggable.
> x86, numa, acpi, memory-hotplug: Introduce zone_movable_limit[] to
> store start pfn of ZONE_MOVABLE.
It has a whitespace error.
> x86, numa, acpi, memory-hotplug: Sanitize zone_movable_limit[].
> x86, numa, acpi, memory-hotplug: make movablemem_map have higher
> priority
> x86, numa, acpi, memory-hotplug: Memblock limit with movablemem_map
Thanks,
Yasuaki Ishimatsu
>
> Yasuaki Ishimatsu (1):
> x86: get pg_data_t's memory from other node
>
> Documentation/kernel-parameters.txt | 11 ++
> arch/x86/include/asm/numa.h | 3 +-
> arch/x86/kernel/apic/numaq_32.c | 2 +-
> arch/x86/mm/amdtopology.c | 3 +-
> arch/x86/mm/numa.c | 92 ++++++++++++++--
> arch/x86/mm/numa_internal.h | 1 +
> arch/x86/mm/srat.c | 28 ++++-
> include/linux/memblock.h | 2 +
> include/linux/mm.h | 19 +++
> mm/memblock.c | 50 ++++++++
> mm/page_alloc.c | 210 ++++++++++++++++++++++++++++++++++-
> 11 files changed, 399 insertions(+), 22 deletions(-)
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>