From: Mel Gorman <mgorman@suse.de>
To: Xishi Qiu <qiuxishi@huawei.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@kernel.org>,
"Luck, Tony" <tony.luck@intel.com>,
Hanjun Guo <guohanjun@huawei.com>, Xiexiuqi <xiexiuqi@huawei.com>,
leon@leon.nu, Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Dave Hansen <dave.hansen@intel.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Vlastimil Babka <vbabka@suse.cz>, Linux MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC v2 PATCH 0/8] mm: mirrored memory support for page buddy allocations
Date: Tue, 30 Jun 2015 10:41:50 +0100 [thread overview]
Message-ID: <20150630094149.GA6812@suse.de> (raw)
In-Reply-To: <558E084A.60900@huawei.com>
On Sat, Jun 27, 2015 at 10:19:54AM +0800, Xishi Qiu wrote:
> Intel Xeon processor E7 v3 product family-based platforms introduces support
> for partial memory mirroring called as 'Address Range Mirroring'. This feature
> allows BIOS to specify a subset of total available memory to be mirrored (and
> optionally also specify whether to mirror the range 0-4 GB). This capability
> allows user to make an appropriate tradeoff between non-mirrored memory range
> and mirrored memory range thus optimizing total available memory and still
> achieving highly reliable memory range for mission critical workloads and/or
> kernel space.
>
> Tony has already send a patchset to supprot this feature at boot time.
> https://lkml.org/lkml/2015/5/8/521
> This patchset is based on Tony's, it can support the feature after boot time.
> Use mirrored memory for all kernel allocations.
>
This is my first time glancing through the series so I'm not aware of any
past discussion. Hopefully there are no repeats. Broadly speaking though
I'm not comfortable with the series.
First and foremost, there is uncontrolled access to the memory because
it's any kernel request. This includes even short-lived ones that do not
need mirroring such as network buffers or caches. Network network traffic
can be retried, caches can be reconstructed from disk etc. Kernel page
tables, struct page corruption etc are much harder to recover from.
Who are the expected users of this memory and how are they meant to be
prioritised? What happens if they fail to be mirrored? What happens if the
mirrored memory is all used up and a high priority request arrives? Is there
any prioritisation of one subsystem over another? What about boot-memory
allocations, should they ever use mirrored memory? The expected users are
important and this series does not address it.
Callers do not specify the flag, you just assume that kernel allocations
must be mirrored. If the allocation request fails, then you assume it was
MIGRATE_RECLAIMABLE later in the series. This is wrong as it'll break
fragmentation avoidance on machines with mirrored memory. Even if you
were to use migrate types to handle mirrored memory, you need to treat
mirrored memory as a type of reserve or else as a first preference for
allocations requested.
The fact that this will be used by very few machines but affects the memory
footprint of the page allocator is a general concern. When active, it affects
the fast paths for all users whether they care about mirroring or not.
If all free memory is in the MIGRATE_MIRROR then all user-space requests
will be rejected but reclaim will not make any progress if the zone
is balanced. The system may go prematurely OOM as no progress is made.
Getting around this is tricky and affects a few fast paths. Generally, the
easiest approach would be zone-based but I recognise that it has problems
of its own.
Basically, overall I feel this series is the wrong approach but not knowing
who the users are making is much harder to judge. I strongly suspect that
if mirrored memory is to be properly used then it needs to be available
before the page allocator is even active. Once active, there needs to
be controlled access for allocation requests that are really critical
to mirror and not just all kernel allocations. None of that would use a
MIGRATE_TYPE approach. It would be alterations to the bootmem allocator and
access to an explicit reserve that is not accounted for as "free memory"
and accessed via an explicit GFP flag.
--
Mel Gorman
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2015-06-30 9:42 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-27 2:19 [RFC v2 PATCH 0/8] mm: mirrored memory support for page buddy allocations Xishi Qiu
2015-06-27 2:23 ` [RFC v2 PATCH 1/8] mm: add a new config to manage the code Xishi Qiu
2015-06-29 6:50 ` Kamezawa Hiroyuki
2015-06-30 2:52 ` Xishi Qiu
2015-06-27 2:24 ` [RFC v2 PATCH 2/8] mm: introduce MIGRATE_MIRROR to manage the mirrored pages Xishi Qiu
2015-06-29 7:32 ` Kamezawa Hiroyuki
2015-06-30 2:45 ` Xishi Qiu
2015-06-30 7:53 ` Kamezawa Hiroyuki
2015-06-30 9:22 ` Xishi Qiu
2015-06-27 2:24 ` [RFC v2 PATCH 3/8] mm: find mirrored memory in memblock Xishi Qiu
2015-06-27 2:25 ` [RFC v2 PATCH 4/8] mm: add mirrored memory to buddy system Xishi Qiu
2015-06-29 7:39 ` Kamezawa Hiroyuki
2015-06-27 2:26 ` [RFC v2 PATCH 5/8] mm: introduce a new zone_stat_item NR_FREE_MIRROR_PAGES Xishi Qiu
2015-06-27 2:27 ` [RFC v2 PATCH 6/8] mm: add free mirrored pages info Xishi Qiu
2015-06-27 2:27 ` [RFC v2 PATCH 7/8] mm: add the buddy system interface Xishi Qiu
2015-06-29 23:11 ` Luck, Tony
2015-06-30 1:01 ` Kamezawa Hiroyuki
2015-06-30 1:31 ` Xishi Qiu
2015-06-30 2:01 ` Kamezawa Hiroyuki
2015-06-27 2:28 ` [RFC v2 PATCH 8/8] mm: add the PCP interface Xishi Qiu
2015-06-29 15:19 ` [RFC v2 PATCH 0/8] mm: mirrored memory support for page buddy allocations Dave Hansen
2015-06-30 1:26 ` Xishi Qiu
2015-06-30 1:52 ` Dave Hansen
2015-06-30 2:48 ` Xishi Qiu
2015-06-30 9:41 ` Mel Gorman [this message]
2015-06-30 10:46 ` Ingo Molnar
2015-06-30 11:53 ` Mel Gorman
2015-06-30 18:12 ` Luck, Tony
2015-07-13 4:56 ` Xishi Qiu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150630094149.GA6812@suse.de \
--to=mgorman@suse.de \
--cc=akpm@linux-foundation.org \
--cc=dave.hansen@intel.com \
--cc=guohanjun@huawei.com \
--cc=hpa@zytor.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=leon@leon.nu \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@kernel.org \
--cc=n-horiguchi@ah.jp.nec.com \
--cc=qiuxishi@huawei.com \
--cc=tony.luck@intel.com \
--cc=vbabka@suse.cz \
--cc=xiexiuqi@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).