From: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
To: Tejun Heo <tj@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>,
Zhang Yanfei <zhangyanfei.yes@gmail.com>,
"H. Peter Anvin" <hpa@zytor.com>, Toshi Kani <toshi.kani@hp.com>,
Ingo Molnar <mingo@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE
Date: Tue, 15 Oct 2013 09:40:10 +0800
Message-ID: <525C9CFA.7070601@cn.fujitsu.com>
In-Reply-To: <20131014205540.GM4722@htj.dyndns.org>
Hello Tejun, Peter and Yinghai,
On 10/15/2013 04:55 AM, Tejun Heo wrote:
> Hello,
>
> On Mon, Oct 14, 2013 at 01:37:20PM -0700, Yinghai Lu wrote:
>> The problem is how to define "amount necessary". If we can parse srat early,
>> then we could just map RAM for all boot nodes one time, instead of try some
>> small and then after SRAT table, expand it cover non-boot nodes.
>
> Wouldn't that amount be fairly static and restricted? If you wanna
> chunk memory init anyway, there's no reason to init more than
> necessary until smp stage is reached. The more you do early, the more
> serialized you're, so wouldn't the goal naturally be initing the
> minimum possible?
>
>> To keep non-boot numa node hot-removable. we need to page table (and other
>> that we allocate during boot stage) on ram of non boot nodes, or their
>> local node ram. (share page table always should be on boot nodes).
>
> The above assumes the followings,
>
> * 4k page mappings. It'd be nice to keep everything working for 4k
> but just following SRAT isn't enough. What if the non-hotpluggable
> boot node doesn't stretch high enough and page table reaches down
> too far? This won't be an optional behavior, so it is actually
> *likely* to happen on certain setups.
>
> * Memory hotplug is at NUMA node granularity instead of device.
>
>>> Optimizing NUMA boot just requires moving the heavy lifting to
>>> appropriate NUMA nodes. It doesn't require that early boot phase
>>> should strictly follow NUMA node boundaries.
>>
>> At end of day, I like to see all numa system (ram/cpu/pci) could have
>> non boot nodes to be hot-removed logically. with any boot command
>> line.
>
> I suppose you mean "without any boot command line"? Sure, but, first
> of all, there is a clear performance trade-off, and, secondly, don't
> we want something finer grained? Why would we want to that per-NUMA
> node, which is extremely coarse?
>
Both approaches seem good enough *currently*. But what Tejun keeps
emphasizing is the trade-off, or the benefit/cost ratio.
Yinghai and Peter insist on the long-term plan. But it seems there are
currently no actual requirements or plans that *must* parse SRAT earlier
than the current approach in this patchset does, right?
Should we follow "Make it work first and optimize/beautify it later"?
If we run into a scenario that really does require parsing SRAT earlier,
I think Tejun will have no objection to it.
--
Thanks.
Zhang Yanfei