From: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
To: Zhang Yanfei <zhangyanfei.yes@gmail.com>
Cc: Tejun Heo <tj@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Rafael J . Wysocki" <rjw@sisk.pl>, Len Brown <lenb@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	"H. Peter Anvin" <hpa@zytor.com>, Toshi Kani <toshi.kani@hp.com>,
	Wanpeng Li <liwanp@linux.vnet.ibm.com>,
	Thomas Renninger <trenn@suse.de>, Yinghai Lu <yinghai@kernel.org>,
	Jiang Liu <jiang.liu@huawei.com>,
	Wen Congyang <wency@cn.fujitsu.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>,
	Taku Izumi <izumi.taku@jp.fujitsu.com>,
	Mel Gorman <mgorman@suse.de>, Minchan Kim <minchan@kernel.org>,
	"mina86@mina86.com" <mina86@mina86.com>,
	"gong.chen@linux.intel.com" <gong.chen@linux.intel.com>,
	Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.>
Subject: Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE
Date: Tue, 19 Nov 2013 17:56:43 +0800
Message-ID: <528B35DB.9050209@cn.fujitsu.com>
In-Reply-To: <528383BB.6060901@gmail.com>

ping...

On 11/13/2013 09:50 PM, Zhang Yanfei wrote:
> Hello guys,
> 
> Could anyone help review this part?
> 
> The first part has been merged into Linus's tree, and I've tried applying this part
> to today's tree:
> 
> commit 42a2d923cc349583ebf6fdd52a7d35e1c2f7e6bd
> Merge: 5cbb3d2 75ecab1
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Wed Nov 13 17:40:34 2013 +0900
> 
>     Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
> 
> No conflicts, no compile errors, and it works well.
> 
> Tejun, any comments?
> 
> On 10/12/2013 02:00 PM, Zhang Yanfei wrote:
>> Hello guys, this is part2 of our memory hotplug work. This part
>> is based on part1:
>>     "x86, memblock: Allocate memory near kernel image before SRAT parsed"
>> which is based on 3.12-rc4.
>>
>> You can find part1 at: https://lkml.org/lkml/2013/10/10/644
>>
>> Any comments are welcome! Thanks!
>>
>> [Problem]
>>
>> Linux currently cannot migrate pages used by the kernel because
>> of the kernel direct mapping. In Linux kernel space, va = pa + PAGE_OFFSET.
>> When the pa changes, we cannot simply update the page table and
>> keep the va unmodified. So kernel pages are not migratable.
>>
>> Other issues also make kernel pages non-migratable. For example,
>> a physical address may be cached somewhere and used later, and it is
>> not practical to update all such cached copies.
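
A minimal userspace sketch of the direct-mapping relation described above, to show why moving a page to a new physical address necessarily changes its kernel virtual address. The names `DEMO_PAGE_OFFSET`, `demo_phys_to_virt` and `demo_virt_to_phys` are hypothetical, and the offset value is only illustrative; the real PAGE_OFFSET depends on architecture and configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: illustrative offset only, not taken from the patch set. */
#define DEMO_PAGE_OFFSET UINT64_C(0xffff880000000000)

/* Direct mapping: va = pa + PAGE_OFFSET. */
static uint64_t demo_phys_to_virt(uint64_t pa)
{
    return pa + DEMO_PAGE_OFFSET;
}

/* Inverse of the direct mapping: pa = va - PAGE_OFFSET. */
static uint64_t demo_virt_to_phys(uint64_t va)
{
    /* Only addresses inside the direct mapping can be translated this way. */
    assert(va >= DEMO_PAGE_OFFSET);
    return va - DEMO_PAGE_OFFSET;
}
```

Because the va is a pure function of the pa, two different physical addresses always give two different virtual addresses, so any pointer held by the kernel would become stale after a migration.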
>>
>> When hot-removing memory in Linux, we first migrate all the pages on a
>> memory device elsewhere, and then remove the device. But if pages are
>> used by the kernel, they are not migratable. As a result, memory used by
>> the kernel cannot be hot-removed.
>>
>> Modifying the kernel direct mapping mechanism is too difficult, and
>> it may hurt kernel performance and stability. So we use the following
>> approach to memory hotplug.
>>
>>
>> [What we are doing]
>>
>> In Linux, memory in one NUMA node is divided into several zones. One of the
>> zones is ZONE_MOVABLE, which the kernel does not use for its own allocations.
>>
>> In order to implement memory hotplug in Linux, we are going to arrange all
>> hotpluggable memory in ZONE_MOVABLE so that the kernel won't use this memory.
>>
>> To do this, we need ACPI's help.
>>
>>
>> [How we do this]
>>
>> In ACPI, the SRAT (System Resource Affinity Table) contains NUMA info. The memory
>> affinity entries in SRAT record every memory range in the system, along with
>> flags specifying whether each range is hotpluggable.
>> (Please refer to ACPI spec 5.0, section 5.2.16.)
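
As a rough illustration of the flag check described above: in the ACPI memory affinity structure, bit 0 of the flags field means "enabled" and bit 1 means "hot pluggable". The sketch below is a simplified userspace model, not the kernel's SRAT parsing code; the struct layout and all `demo_`/`DEMO_` names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits of a memory affinity entry (bit positions per the ACPI spec). */
#define DEMO_SRAT_MEM_ENABLED       (1u << 0)
#define DEMO_SRAT_MEM_HOT_PLUGGABLE (1u << 1)

/* Assumption: simplified stand-in, not the real on-disk table layout. */
struct demo_mem_affinity {
    uint32_t proximity_domain; /* NUMA node the range belongs to */
    uint64_t base;             /* start of the physical range */
    uint64_t length;           /* size of the range in bytes */
    uint32_t flags;            /* DEMO_SRAT_MEM_* bits */
};

/* A range counts as hotpluggable only if the entry is enabled and flagged. */
static bool demo_range_is_hotpluggable(const struct demo_mem_affinity *ma)
{
    assert(ma != NULL);
    if (!(ma->flags & DEMO_SRAT_MEM_ENABLED))
        return false;
    return (ma->flags & DEMO_SRAT_MEM_HOT_PLUGGABLE) != 0;
}
```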
>>
>> With the help of SRAT, we have to do the following two things to achieve our
>> goal:
>>
>> 1. When doing memory hot-add, allow users to arrange hotpluggable memory as
>>    ZONE_MOVABLE.
>>    (This has been done by the MOVABLE_NODE functionality in Linux.)
>>
>> 2. When the system is booting, prevent the bootmem allocator from allocating
>>    hotpluggable memory for the kernel before memory initialization
>>    finishes.
>>    (This is what we are going to do. See below.)
>>
>>
>> [About this patch-set]
>>
>> In the previous part's patches, we made the kernel allocate memory near
>> the kernel image before SRAT is parsed, to avoid allocating hotpluggable memory
>> for the kernel. This patch-set does the following:
>>
>> 1. Improve memblock to support flags, which are used to indicate different
>>    memory types.
>>
>> 2. Mark all hotpluggable memory in memblock.memory[].
>>
>> 3. Make the default memblock allocator skip hotpluggable memory.
>>
>> 4. Make the "movable_node" boot option take priority over the movablecore
>>    and kernelcore boot options.
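
The idea in items 1 to 3 can be sketched in plain C as below: each region carries a flags word, hotpluggable regions are marked, and the search skips marked regions only when movable_node is in effect. This is a userspace model under stated assumptions, not the code from the patches: `demo_region`, `DEMO_MEMBLOCK_HOTPLUG` and `demo_find_region` are hypothetical names, and the simple first-fit scan is a simplification rather than how memblock actually searches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one flag bit marking a region as hotpluggable. */
#define DEMO_MEMBLOCK_HOTPLUG 0x1u

struct demo_region {
    uint64_t base;      /* start of the physical range */
    uint64_t size;      /* size of the range in bytes */
    unsigned int flags; /* DEMO_MEMBLOCK_* bits */
};

/*
 * Return the index of the first region that can hold `size` bytes, or -1.
 * Hotpluggable regions are skipped only when `movable_node` is true, so
 * the check costs nothing when the boot option is not given.
 */
static int demo_find_region(const struct demo_region *regions, size_t count,
                            uint64_t size, bool movable_node)
{
    assert(regions != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (movable_node && (regions[i].flags & DEMO_MEMBLOCK_HOTPLUG))
            continue; /* keep kernel allocations off removable memory */
        if (regions[i].size >= size)
            return (int)i;
    }
    return -1;
}
```

With movable_node off, the second branch of the condition is never evaluated, which mirrors the v2 bug fix listed in the change log below.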
>>
>> Change log v1 -> v2:
>> 1. Rebase this part on the v7 version of part1.
>> 2. Fix bug: if the movable_node boot option was not specified, memblock still
>>    checked for hotpluggable memory when allocating memory.
>>
>> Tang Chen (7):
>>   memblock, numa: Introduce flag into memblock
>>   memblock, mem_hotplug: Introduce MEMBLOCK_HOTPLUG flag to mark
>>     hotpluggable regions
>>   memblock: Make memblock_set_node() support different memblock_type
>>   acpi, numa, mem_hotplug: Mark hotpluggable memory in memblock
>>   acpi, numa, mem_hotplug: Mark all nodes the kernel resides
>>     un-hotpluggable
>>   memblock, mem_hotplug: Make memblock skip hotpluggable regions if
>>     needed
>>   x86, numa, acpi, memory-hotplug: Make movable_node have higher
>>     priority
>>
>> Yasuaki Ishimatsu (1):
>>   x86: get pg_data_t's memory from other node
>>
>>  arch/metag/mm/init.c      |    3 +-
>>  arch/metag/mm/numa.c      |    3 +-
>>  arch/microblaze/mm/init.c |    3 +-
>>  arch/powerpc/mm/mem.c     |    2 +-
>>  arch/powerpc/mm/numa.c    |    8 ++-
>>  arch/sh/kernel/setup.c    |    4 +-
>>  arch/sparc/mm/init_64.c   |    5 +-
>>  arch/x86/mm/init_32.c     |    2 +-
>>  arch/x86/mm/init_64.c     |    2 +-
>>  arch/x86/mm/numa.c        |   63 +++++++++++++++++++++--
>>  arch/x86/mm/srat.c        |    5 ++
>>  include/linux/memblock.h  |   39 ++++++++++++++-
>>  mm/memblock.c             |  123 ++++++++++++++++++++++++++++++++++++++-------
>>  mm/memory_hotplug.c       |    1 +
>>  mm/page_alloc.c           |   28 ++++++++++-
>>  15 files changed, 252 insertions(+), 39 deletions(-)
>>
> 
> 


-- 
Thanks.
Zhang Yanfei
