linux-mm.kvack.org archive mirror
From: Jiang Liu <liuj97@gmail.com>
To: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>,
	Jiang Liu <jiang.liu@huawei.com>,
	Tang Chen <tangchen@cn.fujitsu.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"rob@landley.net" <rob@landley.net>,
	"laijs@cn.fujitsu.com" <laijs@cn.fujitsu.com>,
	"wency@cn.fujitsu.com" <wency@cn.fujitsu.com>,
	"linfeng@cn.fujitsu.com" <linfeng@cn.fujitsu.com>,
	"yinghai@kernel.org" <yinghai@kernel.org>,
	"kosaki.motohiro@jp.fujitsu.com" <kosaki.motohiro@jp.fujitsu.com>,
	"minchan.kim@gmail.com" <minchan.kim@gmail.com>,
	"mgorman@suse.de" <mgorman@suse.de>,
	"rientjes@google.com" <rientjes@google.com>,
	"rusty@rustcorp.com.au" <rusty@rustcorp.com.au>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	Len Brown <lenb@kernel.org>, "Wang, Frank" <frank.wang@intel.com>
Subject: Re: [PATCH v2 0/5] Add movablecore_map boot option
Date: Thu, 29 Nov 2012 23:47:19 +0800
Message-ID: <50B78387.5070707@gmail.com>
In-Reply-To: <50B73B22.90500@jp.fujitsu.com>

On 11/29/2012 06:38 PM, Yasuaki Ishimatsu wrote:
> Hi Tony,
> 
> 2012/11/29 6:34, Luck, Tony wrote:
>>> 1. use firmware information
>>>    According to ACPI spec 5.0, SRAT table has memory affinity structure
>>>    and the structure has Hot Pluggable Field. See "5.2.16.2 Memory
>>>    Affinity Structure". If we use the information, we might be able to
>>>    specify movable memory by firmware. For example, if Hot Pluggable
>>>    Field is enabled, Linux sets the memory as movable memory.
>>>
>>> 2. use boot option
>>>    This is our proposal. New boot option can specify memory range to use
>>>    as movable memory.
>>
>> Isn't this just moving the work to the user? To pick good values for the
> 
> Yes.
> 
>> movable areas, they need to know how the memory lines up across
>> node boundaries ... because they need to make sure to allow some
>> non-movable memory allocations on each node so that the kernel can
>> take advantage of node locality.
> 
> There is no problem.
> Linux has already two boot options, kernelcore= and movablecore=.
> So if we use them, non-movable memory is divided into each node evenly.
> 
> But there is no way to specify a node used as movable currently. So
> we proposed the new boot option.
> 
>> So the user would have to read at least the SRAT table, and perhaps
>> more, to figure out what to provide as arguments.
>>
> 
>> Since this is going to be used on a dynamic system where nodes might
>> be added and removed - the right values for these arguments might
>> change from one boot to the next. So even if the user gets them right
>> on day 1, a month later when a new node has been added, or a broken
>> node removed the values would be stale.
> 
> I don't think so. Even if we hot add/remove node, the memory range of
> each memory device is not changed. So we don't need to change the boot
> option.
Hi Yasuaki,
	The addresses assigned to each memory device may change when the
hardware configuration changes.
	In my experience with some hotplug-capable Xeon and Itanium systems,
a typical algorithm used by the BIOS to support memory hotplug is:
1) For backward compatibility, the BIOS assigns contiguous addresses to the
memory devices present at boot time. In other words, there are no holes in the
memory address space except the hole just below 4G, which is reserved for MMIO
and other arch-specific usage.
2) To support memory hotplug, the BIOS reserves sufficiently large address
ranges at the high end of the address space.
 
	Let's take a typical four-socket system as an example. Say we have four
sockets, S0-S3, and each socket supports at most two memory devices (M0-M1).
Each memory device supports at most 128G of memory. At boot, every memory
slot is populated with a 4GB memory device. The address assignment then looks
like:
0-2G: 		S0.M0
2-4G: 		MMIO
4-8G: 		S0.M1
8-12G: 		S1.M0
12-16G: 	S1.M1
16-20G: 	S2.M0
20-24G:		S2.M1
24-28G: 	S3.M0
28-32G:		S3.M1
32-34G:		S0.M0 (memory recovered from the MMIO hole)
1024-1152G:	reserved for S0.M0
1152-1280G:	reserved for S0.M1
1280-1408G:	reserved for S1.M0
1408-1536G:	reserved for S1.M1
1536-1664G:	reserved for S2.M0
1664-1792G:	reserved for S2.M1
1792-1920G:	reserved for S3.M0
1920-2048G:	reserved for S3.M1

If we hot-remove S2.M0 and add back a bigger memory device with 8G of memory, it
will be assigned a new address range, 1536-1544G, taken from the range reserved
for S2.M0.
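
The assignment described above can be sketched in a few lines of Python. This
is only a toy model of the behaviour I described, not code from any real BIOS:
the function names (boot_layout, hotplug_window, hot_add_range) are made up for
illustration, and the constants (MMIO hole at 2-4G, reserved region starting at
1024G, 128G per slot) are simply the numbers from the example.

```python
# Illustrative model only; constants come from the example, not from real firmware.
MMIO_START, MMIO_END = 2, 4   # GiB: hole just below 4G
HOTPLUG_BASE = 1024           # GiB: start of the reserved high region (assumed)
SLOT_MAX = 128                # GiB: maximum capacity of one memory device

# Slot names in socket order: S0.M0, S0.M1, ..., S3.M1
SLOTS = [f"S{s}.M{m}" for s in range(4) for m in range(2)]


def boot_layout(devices):
    """Pack (name, size_gib) devices contiguously, skipping the MMIO hole.

    Memory displaced by the hole is remapped above the last device.
    Returns a list of (start, end, owner) tuples in GiB.
    """
    layout, displaced, cursor = [], [], 0
    for name, size in devices:
        if cursor < MMIO_START:
            # Only the part below the hole fits here.
            low = min(size, MMIO_START - cursor)
            layout.append((cursor, cursor + low, name))
            cursor += low
            size -= low
            if cursor == MMIO_START:
                layout.append((MMIO_START, MMIO_END, "MMIO"))
                cursor = MMIO_END
            if size:
                displaced.append((name, size))
            continue
        layout.append((cursor, cursor + size, name))
        cursor += size
    # Remap whatever the hole displaced after the last device.
    for name, size in displaced:
        layout.append((cursor, cursor + size, name))
        cursor += size
    return layout


def hotplug_window(slot):
    """Address window reserved at the high end for a given slot."""
    base = HOTPLUG_BASE + SLOTS.index(slot) * SLOT_MAX
    return (base, base + SLOT_MAX)


def hot_add_range(slot, size):
    """Range given to a device hot-added into `slot` after boot."""
    base, limit = hotplug_window(slot)
    if base + size > limit:
        raise ValueError("device larger than the reserved window")
    return (base, base + size)


layout = boot_layout([(name, 4) for name in SLOTS])
for start, end, owner in layout:
    print(f"{start}-{end}G:\t{owner}")
print(hot_add_range("S2.M0", 8))   # (1536, 1544)
```

The point of the model is that a device's boot-time address depends on which
devices sit in front of it, while its hot-add address depends only on its slot.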

Given the above algorithm, suppose we configure 16-24G (S2.M0 and S2.M1) as movable
memory. Then:
1) Memory on S3 will be configured as movable if S2 isn't present at boot time (the
   same effect as "movable_node" in the discussion at https://lkml.org/lkml/2012/11/27/154).
2) S2.M0 will be configured as non-movable and S3.M0 will be configured as movable
   if S1.M0 isn't present at boot.
3) And what happens if S1.M0 is replaced with an 8GB memory device?
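
To make the three cases concrete, here is a small Python sketch that recomputes
which devices end up inside a fixed 16-24G movable range for each boot
configuration. It is a simplified model under my own assumptions: devices are
packed contiguously in socket order, the first device is larger than the 2G
below the MMIO hole, and its displaced tail is remapped above the last device.
The helper names (boot_layout, movable_devices) are hypothetical.

```python
# Toy model; all values in GiB and taken from the example above.
MMIO_START, MMIO_END = 2, 4
MOVABLE = (16, 24)   # fixed range given on the kernel command line


def boot_layout(devices):
    """Pack devices contiguously around the MMIO hole (simplified).

    Assumes the first device is larger than MMIO_START; its tail is
    remapped above the last device.
    """
    (first, first_size), rest = devices[0], devices[1:]
    layout = [(0, MMIO_START, first)]
    cursor = MMIO_END
    for name, size in rest + [(first, first_size - MMIO_START)]:
        layout.append((cursor, cursor + size, name))
        cursor += size
    return layout


def movable_devices(devices):
    """Names of devices whose boot-time range overlaps MOVABLE."""
    lo, hi = MOVABLE
    return sorted({name for start, end, name in boot_layout(devices)
                   if start < hi and end > lo})


full = [(f"S{s}.M{m}", 4) for s in range(4) for m in range(2)]

baseline = movable_devices(full)
no_s2 = movable_devices([d for d in full if not d[0].startswith("S2")])
no_s1m0 = movable_devices([d for d in full if d[0] != "S1.M0"])
big_s1m0 = movable_devices([(n, 8 if n == "S1.M0" else sz) for n, sz in full])

print(baseline)   # ['S2.M0', 'S2.M1']
print(no_s2)      # ['S3.M0', 'S3.M1']
print(no_s1m0)    # ['S2.M1', 'S3.M0']
print(big_s1m0)   # ['S1.M1', 'S2.M0']
```

In this model the same 16-24G argument covers a different pair of devices in
every configuration, and in the last two cases it splits a socket between
movable and non-movable memory.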

To summarize, a kernel parameter that configures movable memory for hotplug will
easily become invalid when the hardware configuration changes, and that may confuse
administrators. I still think the most reliable way is to determine movable memory
for hotplug by parsing the hardware configuration information provided by the BIOS.

Regards!
Gerry
