From: Xishi Qiu <qiuxishi@huawei.com>
To: Tony Luck <tony.luck@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Hanjun Guo <guohanjun@huawei.com>, Xiexiuqi <xiexiuqi@huawei.com>
Subject: Re: [PATCHv2 0/3] Find mirrored memory, use for boot time allocations
Date: Tue, 19 May 2015 14:37:42 +0800
Message-ID: <555ADA36.6060507@huawei.com>
In-Reply-To: <CA+8MBbKo=zgyftrrcLcB7D3T7npT7JvpBTj9txEr+ZumgsGuxQ@mail.gmail.com>

On 2015/5/19 12:48, Tony Luck wrote:

> On Mon, May 18, 2015 at 8:01 PM, Xishi Qiu <qiuxishi@huawei.com> wrote:
>> In part2, does it means the memory allocated from kernel should use mirrored memory?
> 
> Yes. I want to use mirrored memory for all (or as many as
> possible) kernel allocations.
> 
>> I have heard of this feature(address range mirroring) before, and I changed some
>> code to test it(implement memory allocations in specific physical areas).
>>
>> In my opinion, add a new zone(ZONE_MIRROR) to fill the mirrored memory is not a good
>> idea. If there are XX discontiguous mirrored areas in one numa node, there should be
>> XX ZONE_MIRROR zones in one pgdat, it is impossible, right?
> 
> With current h/w implementations XX is at most 2, and is possibly only 1
> on most nodes.  But we shouldn't depend on that.
> 
>> I think add a new migrate type(MIGRATE_MIRROR) will be better, the following print
>> is from my changed kernel.
> 
> This sounds interesting.
> 
>> [root@localhost ~]# cat /proc/pagetypeinfo
>> Page block order: 9
>> Pages per block:  512
>>
>> Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
> ...
>> Node    0, zone      DMA, type       Mirror      0      0      0      0      0      0      0      0      0      0      0
> ...
>> Node    0, zone    DMA32, type       Mirror      0      0      0      0      0      0      0      0      0      0      0
> 
> I see all zero counts here ... which is fine.  I expect that systems
> will mirror all memory below 4GB ... but we should probably
> ignore the attribute for this range because we want to make

Hi Tony,

I think 0-4G will all be mirrored, so I changed nothing for that range
and just ignore the mirror flag there. (E.g. on a 4-socket machine where
every socket has 32G of memory: node0: 0-4G and 4-8G mirrored, node1:
32-36G mirrored, node2: 64-68G mirrored, node3: 96-100G mirrored.)
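
To make the example layout concrete, here is a minimal user-space sketch
(not taken from any posted patch). It counts how much memory per node would
still be treated as mirrored once the mirror flag is ignored below 4G. All
names (sk_range, sk_example, sk_usable_mirror_bytes) are hypothetical, and
the unmirrored ranges are only inferred from "every socket has 32G".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SK_GIB (1ULL << 30)

/* Hypothetical description of one physical memory range. */
struct sk_range {
	uint64_t start;   /* inclusive */
	uint64_t end;     /* exclusive */
	int nid;          /* NUMA node id */
	bool mirrored;    /* firmware reports the range as mirrored */
};

/*
 * Layout from the example: 4 sockets, 32G each. The unmirrored
 * remainders are an assumption filled in to reach 32G per node.
 */
const struct sk_range sk_example[] = {
	{   0 * SK_GIB,   4 * SK_GIB, 0, true  },
	{   4 * SK_GIB,   8 * SK_GIB, 0, true  },
	{   8 * SK_GIB,  32 * SK_GIB, 0, false },
	{  32 * SK_GIB,  36 * SK_GIB, 1, true  },
	{  36 * SK_GIB,  64 * SK_GIB, 1, false },
	{  64 * SK_GIB,  68 * SK_GIB, 2, true  },
	{  68 * SK_GIB,  96 * SK_GIB, 2, false },
	{  96 * SK_GIB, 100 * SK_GIB, 3, true  },
	{ 100 * SK_GIB, 128 * SK_GIB, 3, false },
};
const size_t sk_example_len = sizeof(sk_example) / sizeof(sk_example[0]);

/*
 * Sum the mirrored bytes on node @nid, ignoring the mirror flag for
 * anything below 4G so that low memory stays available to everyone.
 */
uint64_t sk_usable_mirror_bytes(const struct sk_range *r, size_t n, int nid)
{
	const uint64_t low_limit = 4 * SK_GIB;
	uint64_t total = 0;

	assert(r != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		uint64_t start = r[i].start;

		if (!r[i].mirrored || r[i].nid != nid)
			continue;
		if (start < low_limit)
			start = low_limit;   /* clip off the part below 4G */
		if (start < r[i].end)
			total += r[i].end - start;
	}
	return total;
}
```

With this layout, calling sk_usable_mirror_bytes(sk_example, sk_example_len, nid)
gives 4G for every node, since node0's 0-4G range is clipped away entirely.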

> sure that the memory is still available for users that depend
> on getting memory that legacy devices can access. On systems
> that support address range mirror the <4GB area is <2% of even
> a small system (128GB seems to be the minimum rational configuration
> for a 4 socket machine ... you end up with that much if you populate
> every channel with just one 4GB DIMM). On a big system (in the TB
> range) <4GB area is a trivial rounding error.
> 
>> Also I add a new flag(GFP_MIRROR), then we can use the mirrored form both
>> kernel-space and user-space. If there is no mirrored memory, we will allocate
>> other types memory.
> 
> But I *think* I want all kernel and no users to allocate mirror
> memory.  I'd like to not have to touch every place that allocates
> memory to add/clear this flag.
> 

If we only want the kernel to use the mirrored memory, it is much easier.
I have some patches, but they are a little ugly and implement both the
user and kernel paths.
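
As a rough illustration of the kernel-side idea (a MIGRATE_MIRROR free list
selected from the gfp flags, with fallback to other memory when no mirrored
page is free), here is a self-contained user-space model. It is only a
sketch and is not the patch code: every sk_* name and bit value is
hypothetical, and treating "no movable bit" as "kernel allocation" is an
assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical migrate types; the mirror type is the proposed addition. */
enum sk_migratetype {
	SK_MIGRATE_UNMOVABLE,
	SK_MIGRATE_RECLAIMABLE,
	SK_MIGRATE_MOVABLE,
	SK_MIGRATE_MIRROR,
	SK_MIGRATE_TYPES
};

/* Hypothetical gfp bits, values chosen only for this model. */
#define SK_GFP_MOVABLE     0x1u
#define SK_GFP_RECLAIMABLE 0x2u
#define SK_GFP_MIRROR      0x4u

/* Free page counts per migrate type (one order only, for brevity). */
struct sk_free_lists {
	unsigned long nr_free[SK_MIGRATE_TYPES];
};

/* The migrate type a request would use if mirroring did not exist. */
enum sk_migratetype sk_base_migratetype(unsigned int gfp)
{
	if (gfp & SK_GFP_MOVABLE)
		return SK_MIGRATE_MOVABLE;
	if (gfp & SK_GFP_RECLAIMABLE)
		return SK_MIGRATE_RECLAIMABLE;
	return SK_MIGRATE_UNMOVABLE;
}

/*
 * Pick the mirror list when the caller asks for it explicitly, or when
 * the (modelled) vm.mirrorable switch is on and the request looks like
 * a kernel allocation (assumption: movable bit not set).
 */
enum sk_migratetype sk_gfp_to_migratetype(unsigned int gfp, bool mirrorable)
{
	bool kernel_alloc = !(gfp & SK_GFP_MOVABLE);

	if ((gfp & SK_GFP_MIRROR) || (mirrorable && kernel_alloc))
		return SK_MIGRATE_MIRROR;
	return sk_base_migratetype(gfp);
}

/*
 * Take one page. Returns the migrate type it came from, or -1 if both
 * the preferred list and the fallback list are empty.
 */
int sk_alloc_page(struct sk_free_lists *fl, unsigned int gfp, bool mirrorable)
{
	enum sk_migratetype want = sk_gfp_to_migratetype(gfp, mirrorable);

	assert(fl != 0);
	/* No mirrored page left: fall back to the ordinary list. */
	if (want == SK_MIGRATE_MIRROR && fl->nr_free[want] == 0)
		want = sk_base_migratetype(gfp);
	if (fl->nr_free[want] == 0)
		return -1;
	fl->nr_free[want]--;
	return (int)want;
}
```

The fallback step is what keeps kernel allocations working on machines with
little or no mirrored memory; a real implementation would also have to
decide whether other migrate types may steal from the mirror list.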

>> 1) kernel-space(pcp, page buddy, slab/slub ...):
>>         -> use mirrored memory(e.g. /proc/sys/vm/mirrorable)
>>                 -> __alloc_pages_nodemask()
>>                         ->gfpflags_to_migratetype()
>>                                 -> use MIGRATE_MIRROR list
> 
> I think you are telling me that we can do this, but I don't understand
> how the code would look.
> 
>> 2) user-space(syscall, madvise, mmap ...):
>>         -> add VM_MIRROR flag in the vma
>>                 -> add GFP_MIRROR when page fault in the vma
>>                         -> __alloc_pages_nodemask()
>>                                 -> use MIGRATE_MIRROR list
> 
> If we do let users have access to mirrored memory, then
> madvise/mmap seem a plausible way to allow it.  Not sure
> what access privileges are appropriate to allow it. I expect
> mirrored memory to be in short supply (the whole point of

I think allocations from some key processes (e.g. a database) are as
important as the kernel's, and in most cases the MCE handler just kills
them on a memory failure, so letting users access the mirrored memory
may be a good way to solve the problem.
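
The user-space path described in the quoted outline (mark the vma, then add
the mirror flag to the allocation at page-fault time) could be modelled
roughly as below. This is a user-space sketch only, not patch code: the sk_*
names and bit values are hypothetical, and the privilege check is just a
placeholder for the open question of which access rights are appropriate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, chosen only for this model. */
#define SK_VM_MIRROR   0x1ul
#define SK_GFP_MOVABLE 0x1u
#define SK_GFP_MIRROR  0x4u

/* Minimal stand-in for a virtual memory area. */
struct sk_vma {
	unsigned long start;   /* inclusive */
	unsigned long end;     /* exclusive */
	unsigned long flags;
};

/*
 * Model of an madvise/mmap-style request to back a vma with mirrored
 * memory. Returns 0 on success, -1 if the caller is not allowed.
 * The "privileged" test is a placeholder; the right policy is undecided.
 */
int sk_mark_mirror(struct sk_vma *vma, bool privileged)
{
	assert(vma != 0);
	if (!privileged)
		return -1;
	vma->flags |= SK_VM_MIRROR;
	return 0;
}

/*
 * Compute the gfp flags used to allocate the page for a fault at @addr.
 * User pages are movable; the mirror bit is added only for marked vmas.
 */
unsigned int sk_fault_gfp(const struct sk_vma *vma, unsigned long addr)
{
	unsigned int gfp = SK_GFP_MOVABLE;

	assert(vma != 0 && addr >= vma->start && addr < vma->end);
	if (vma->flags & SK_VM_MIRROR)
		gfp |= SK_GFP_MIRROR;
	return gfp;
}
```

A fault in an unmarked vma yields only the movable bit, so ordinary
processes never touch the mirror list; only a vma that passed
sk_mark_mirror() carries the mirror request down to the allocator.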

> address range mirror is to make do with a minimal amount
> of mirrored memory ... if you expect to want/have lots of
> mirrored memory, then just take the 50% hit in capacity
> and mirror everything and ignore all the s/w complexity).
> 
> Are your patches ready to be shared?

I'll rewrite and send them soon.

Thanks,
Xishi Qiu

> 
> -Tony
> 
> .
> 



