From: Xishi Qiu <qiuxishi@huawei.com>
To: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
tony.luck@intel.com, kamezawa.hiroyu@jp.fujitsu.com,
mel@csn.ul.ie, akpm@linux-foundation.org,
Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Dave Hansen <dave.hansen@intel.com>, Mel Gorman <mgorman@suse.de>,
Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH][RFC] mm: Introduce kernelcore=reliable option
Date: Fri, 9 Oct 2015 14:46:52 +0800
Message-ID: <561762DC.3080608@huawei.com>
In-Reply-To: <1444402599-15274-1-git-send-email-izumi.taku@jp.fujitsu.com>
On 2015/10/9 22:56, Taku Izumi wrote:
> Xeon E7 v3 based systems supports Address Range Mirroring
> and UEFI BIOS complied with UEFI spec 2.5 can notify which
> ranges are reliable (mirrored) via EFI memory map.
> Now Linux kernel utilize its information and allocates
> boot time memory from reliable region.
>
> My requirement is:
> - allocate kernel memory from reliable region
> - allocate user memory from non-reliable region
>
> In order to meet my requirement, ZONE_MOVABLE is useful.
> By arranging non-reliable range into ZONE_MOVABLE,
> reliable memory is only used for kernel allocations.
>
Hi Taku,
You mean putting non-mirrored memory in the movable zone and
mirrored memory in the normal zone, right? So kernel allocations
will use mirrored memory in the normal zone, and user allocations
will use non-mirrored memory in the movable zone.
My questions are:
1) Do we need to change the fallback function?
2) The mirrored region should be located at the start of the
normal zone, right?
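To make question 2 concrete, here is a minimal userspace sketch of the
boundary computation as I read it in the quoted hunk. It is not kernel
code: struct sketch_region, the sketch_* functions and the SKETCH_*
constants are hypothetical stand-ins for struct memblock_region,
memblock_is_mirror(), PFN_DOWN() and MAX_NUMNODES, and the page size is
assumed to be 4 KiB. It only shows that the per-node boundary is the
lowest start PFN of any non-mirrored region, so a mirrored region that
sits above that boundary would land in the movable zone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_NODES  4    /* assumption: small fixed node count */
#define SKETCH_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Hypothetical stand-in for struct memblock_region. */
struct sketch_region {
	unsigned long long base;   /* physical start address */
	unsigned long long size;   /* length in bytes */
	int nid;                   /* NUMA node id */
	bool mirrored;             /* true if the range is mirrored */
};

/*
 * Same idea as the loop in the patch: for every non-mirrored region,
 * remember the lowest start PFN seen on its node. 0 means "not set".
 */
void sketch_find_movable_pfns(const struct sketch_region *regions,
			      size_t count,
			      unsigned long movable_pfn[SKETCH_MAX_NODES])
{
	for (size_t i = 0; i < SKETCH_MAX_NODES; i++)
		movable_pfn[i] = 0;

	for (size_t i = 0; i < count; i++) {
		const struct sketch_region *r = &regions[i];
		unsigned long start_pfn;

		if (r->mirrored)
			continue;

		assert(r->nid >= 0 && r->nid < SKETCH_MAX_NODES);
		start_pfn = (unsigned long)(r->base >> SKETCH_PAGE_SHIFT);
		if (movable_pfn[r->nid] == 0 || start_pfn < movable_pfn[r->nid])
			movable_pfn[r->nid] = start_pfn;
	}
}

/* True if any mirrored region extends above its node's movable boundary. */
bool sketch_mirror_in_movable(const struct sketch_region *regions,
			      size_t count,
			      const unsigned long movable_pfn[SKETCH_MAX_NODES])
{
	for (size_t i = 0; i < count; i++) {
		const struct sketch_region *r = &regions[i];
		unsigned long end_pfn;

		if (!r->mirrored || movable_pfn[r->nid] == 0)
			continue;

		end_pfn = (unsigned long)((r->base + r->size) >> SKETCH_PAGE_SHIFT);
		if (end_pfn > movable_pfn[r->nid])
			return true;
	}
	return false;
}
```

With a layout of mirrored, non-mirrored, mirrored ranges on one node,
the second function returns true, which is the case I am asking about.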
I remember Kame has already suggested this idea. In my opinion,
it is still better to add a new migratetype or a new zone, so
that both user and kernel allocations could use mirrored memory.
Thanks,
Xishi Qiu
> This patch extends existing "kernelcore" option and
> introduces kernelcore=reliable option. By specifying
> "reliable" instead of specifying the amount of memory,
> non-reliable region will be arranged into ZONE_MOVABLE.
>
> Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
> ---
> Documentation/kernel-parameters.txt | 9 ++++++++-
> mm/page_alloc.c | 26 ++++++++++++++++++++++++++
> 2 files changed, 34 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index 50fc09b..6791cbb 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -1669,7 +1669,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>
> keepinitrd [HW,ARM]
>
> - kernelcore=nn[KMG] [KNL,X86,IA-64,PPC] This parameter
> + kernelcore= Format: nn[KMG] | "reliable"
> + [KNL,X86,IA-64,PPC] This parameter
> specifies the amount of memory usable by the kernel
> for non-movable allocations. The requested amount is
> spread evenly throughout all nodes in the system. The
> @@ -1685,6 +1686,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
> use the HighMem zone if it exists, and the Normal
> zone if it does not.
>
> + Instead of specifying the amount of memory (nn[KMS]),
> + you can specify "reliable" option. In case "reliable"
> + option is specified, reliable memory is used for
> + non-movable allocations and remaining memory is used
> + for Movable pages.
> +
> kgdbdbgp= [KGDB,HW] kgdb over EHCI usb debug port.
> Format: <Controller#>[,poll interval]
> The controller # is the number of the ehci usb debug
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 48aaf7b..91d7556 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -242,6 +242,7 @@ static unsigned long __meminitdata arch_zone_highest_possible_pfn[MAX_NR_ZONES];
> static unsigned long __initdata required_kernelcore;
> static unsigned long __initdata required_movablecore;
> static unsigned long __meminitdata zone_movable_pfn[MAX_NUMNODES];
> +static bool reliable_kernelcore __initdata;
>
> /* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */
> int movable_zone;
> @@ -5652,6 +5653,25 @@ static void __init find_zone_movable_pfns_for_nodes(void)
> }
>
> /*
> + * If kernelcore=reliable is specified, ignore movablecore option
> + */
> + if (reliable_kernelcore) {
> + for_each_memblock(memory, r) {
> + if (memblock_is_mirror(r))
> + continue;
> +
> + nid = r->nid;
> +
> + usable_startpfn = PFN_DOWN(r->base);
> + zone_movable_pfn[nid] = zone_movable_pfn[nid] ?
> + min(usable_startpfn, zone_movable_pfn[nid]) :
> + usable_startpfn;
> + }
> +
> + goto out2;
> + }
> +
> + /*
> * If movablecore=nn[KMG] was specified, calculate what size of
> * kernelcore that corresponds so that memory usable for
> * any allocation type is evenly spread. If both kernelcore
> @@ -5907,6 +5927,12 @@ static int __init cmdline_parse_core(char *p, unsigned long *core)
> */
> static int __init cmdline_parse_kernelcore(char *p)
> {
> + /* parse kernelcore=reliable */
> + if (parse_option_str(p, "reliable")) {
> + reliable_kernelcore = true;
> + return 0;
> + }
> +
> return cmdline_parse_core(p, &required_kernelcore);
> }
>