From: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Xishi Qiu <qiuxishi@huawei.com>, Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	tony.luck@intel.com, mel@csn.ul.ie, akpm@linux-foundation.org,
	Dave Hansen <dave.hansen@intel.com>, Mel Gorman <mgorman@suse.de>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH][RFC] mm: Introduce kernelcore=reliable option
Date: Fri, 9 Oct 2015 18:24:42 +0900
Message-ID: <561787DA.4040809@jp.fujitsu.com>
In-Reply-To: <561762DC.3080608@huawei.com>

On 2015/10/09 15:46, Xishi Qiu wrote:
> On 2015/10/9 22:56, Taku Izumi wrote:
>
>> Xeon E7 v3 based systems supports Address Range Mirroring
>> and UEFI BIOS complied with UEFI spec 2.5 can notify which
>> ranges are reliable (mirrored) via EFI memory map.
>> Now Linux kernel utilize its information and allocates
>> boot time memory from reliable region.
>>
>> My requirement is:
>>    - allocate kernel memory from reliable region
>>    - allocate user memory from non-reliable region
>>
>> In order to meet my requirement, ZONE_MOVABLE is useful.
>> By arranging non-reliable range into ZONE_MOVABLE,
>> reliable memory is only used for kernel allocations.
>>
>
> Hi Taku,
>
> You mean set non-mirrored memory to movable zone, and set
> mirrored memory to normal zone, right? So kernel allocations
> will use mirrored memory in normal zone, and user allocations
> will use non-mirrored memory in movable zone.
>
> My question is:
> 1) do we need to change the fallback function?

For *our* requirement, it's not required. But if someone wants to prevent
user memory allocations from being served from ZONE_NORMAL, some change in
zonelist walking is needed.
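
To illustrate why zonelist walking matters here, below is a minimal
user-space sketch of the fallback behaviour. It is a toy model, not kernel
code: struct toy_zone and toy_alloc_page() are hypothetical names invented
for this example, and the real allocator walks a per-node zonelist with
watermark checks rather than a plain free-page counter.

```c
#include <stdbool.h>

/* Simplified zone ids, ordered like the kernel's enum zone_type. */
enum zone_id { ZONE_NORMAL, ZONE_MOVABLE, NR_ZONES };

/* Toy zone: only a free-page counter (hypothetical, not struct zone). */
struct toy_zone {
	unsigned long free_pages;
};

/*
 * Walk the zones from the highest one the request may use down to
 * ZONE_NORMAL, the way a zonelist walk falls back, and take one page
 * from the first zone that still has one.
 *
 * Returns the id of the zone used, or -1 if all candidates are empty.
 */
static int toy_alloc_page(struct toy_zone zones[NR_ZONES], bool movable)
{
	/* Movable (user) requests start at ZONE_MOVABLE, kernel ones do not. */
	int highest = movable ? ZONE_MOVABLE : ZONE_NORMAL;

	for (int z = highest; z >= ZONE_NORMAL; z--) {
		if (zones[z].free_pages > 0) {
			zones[z].free_pages--;
			return z;
		}
	}
	return -1;
}
```

In this model a kernel request never reaches ZONE_MOVABLE, but a user
request spills into ZONE_NORMAL (the mirrored range) once ZONE_MOVABLE is
empty. Stopping that spill is the part that would need a zonelist change.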

> 2) the mirrored region should locate at the start of normal
> zone, right?

To be precise, the "not-reliable" ranges of memory are handled by ZONE_MOVABLE.
This patch does only that.

>
> I remember Kame has already suggested this idea. In my opinion,
> I still think it's better to add a new migratetype or a new zone,
> so both user and kernel could use mirrored memory.

Hi, Xishi.

Izumi-san and I discussed the implementation at length and found that using
a "zone" is the better approach.

The biggest reason is that a zone is the unit of vmscan and of all statistics,
and it handles a range of memory for a particular purpose. By making use of
zones, we can reuse all of the vmscan and statistics code. Introducing another
structure would be messy.
His patch is very simple.

For your requirements, Izumi-san and I are discussing the following plan.

  - Add a flag to show whether a zone is reliable, then mark ZONE_MOVABLE as not-reliable.
  - Add __GFP_RELIABLE. This will allow alloc_pages() to skip not-reliable zones.
  - Add an madvise() MADV_RELIABLE flag and modify the page fault code's gfp flags based on it.
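
The three steps of the plan could fit together roughly as in the user-space
sketch below. Everything in it is an assumption made for illustration:
__GFP_RELIABLE and MADV_RELIABLE are only proposals at this point, and the
REL_* constants, struct rel_zone, rel_fault_gfp() and rel_alloc_page() are
hypothetical stand-ins, not existing kernel interfaces.

```c
#include <stdbool.h>

/* Toy gfp bits (hypothetical values, not the kernel's). */
#define REL_GFP_MOVABLE  0x1u	/* request may be placed in ZONE_MOVABLE */
#define REL_GFP_RELIABLE 0x2u	/* stand-in for the proposed __GFP_RELIABLE */

enum rel_zone_id { REL_ZONE_NORMAL, REL_ZONE_MOVABLE, REL_NR_ZONES };

/* Step 1: each zone carries a flag saying whether it is reliable. */
struct rel_zone {
	unsigned long free_pages;
	bool reliable;
};

/*
 * Step 3: the page fault path builds its gfp mask. A mapping marked with
 * the proposed MADV_RELIABLE (modelled as a bool) adds the reliable bit.
 */
static unsigned int rel_fault_gfp(bool vma_reliable)
{
	unsigned int gfp = REL_GFP_MOVABLE;

	if (vma_reliable)
		gfp |= REL_GFP_RELIABLE;
	return gfp;
}

/*
 * Step 2: the allocator walks the zones from highest to lowest and skips
 * not-reliable zones when the reliable bit is set.
 *
 * Returns the id of the zone used, or -1 if no suitable page is left.
 */
static int rel_alloc_page(struct rel_zone zones[REL_NR_ZONES], unsigned int gfp)
{
	int highest = (gfp & REL_GFP_MOVABLE) ? REL_ZONE_MOVABLE
					      : REL_ZONE_NORMAL;

	for (int z = highest; z >= REL_ZONE_NORMAL; z--) {
		if ((gfp & REL_GFP_RELIABLE) && !zones[z].reliable)
			continue;	/* skip non-mirrored memory */
		if (zones[z].free_pages > 0) {
			zones[z].free_pages--;
			return z;
		}
	}
	return -1;
}
```

With ZONE_MOVABLE marked not-reliable, an ordinary user fault in this model
still lands in ZONE_MOVABLE, while a fault on a mapping marked reliable is
served from ZONE_NORMAL, or fails if no mirrored page is free.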


Thanks,
-Kame