From: Tang Chen <tangchen@cn.fujitsu.com>
To: Tejun Heo <tj@kernel.org>
Cc: Tang Chen <imtangchen@gmail.com>,
"H. Peter Anvin" <hpa@zytor.com>,
robert.moore@intel.com, lv.zheng@intel.com, rjw@sisk.pl,
lenb@kernel.org, tglx@linutronix.de, mingo@elte.hu,
akpm@linux-foundation.org, trenn@suse.de, yinghai@kernel.org,
jiang.liu@huawei.com, wency@cn.fujitsu.com, laijs@cn.fujitsu.com,
isimatu.yasuaki@jp.fujitsu.com, izumi.taku@jp.fujitsu.com,
mgorman@suse.de, minchan@kernel.org, mina86@mina86.com,
gong.chen@linux.intel.com, vasilis.liaskovitis@profitbricks.com,
lwoodman@redhat.com, riel@redhat.com, jweiner@redhat.com,
prarit@redhat.com, zhangyanfei@cn.fujitsu.com,
yanghy@cn.fujitsu.com, x86@kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-acpi@vger.kernel.org,
"Luck, Tony (tony.luck@intel.com)" <tony.luck@intel.com>
Subject: Re: [PATCH part5 0/7] Arrange hotpluggable memory as ZONE_MOVABLE.
Date: Tue, 13 Aug 2013 17:56:46 +0800
Message-ID: <520A02DE.1010908@cn.fujitsu.com>
In-Reply-To: <5209CEC1.8070908@cn.fujitsu.com>
Hi tj,
While working on the "near kernel memory allocation", I found a few
things about memblock that I need you to confirm.
1. First of all, memblock is platform independent. Different platforms
have different ways to store the kernel image address, so I don't think
we can obtain the kernel image address on the memblock side, right?
If so, then we need to pass the kernel image address to memblock. But...
2. There are several places that call memblock_find_in_range_node() to
allocate memory before SRAT is parsed:
early_reserve_e820_mpc_new()
reserve_real_mode()
init_mem_mapping()
setup_log_buf()
relocate_initrd()
acpi_initrd_override()
reserve_crashkernel()
There may be more that I haven't found.
And in the future, someone may add more code that allocates memory
before SRAT is parsed. So I don't think we should pass the kernel image
address to each of them one by one. That would modify a lot of code.
So I think we need a generic way to tell memblock to allocate memory
from the kernel image end address toward higher memory.
My idea is:
1. Introduce memblock.current_limit_low to limit the lowest address
that memblock can use.
2. Make memblock able to allocate memory from low to high.
3. Get the kernel image end address on x86, and set
memblock.current_limit_low to it before SRAT is parsed. Then we achieve
the goal.
4. After SRAT is parsed, reset it to 0 and make memblock allocate memory
from high to low again.
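To make the four steps concrete, here is a rough userspace sketch of the search logic I have in mind. It is only a toy model, not the real memblock code: struct toy_memblock, toy_find() and the bottom_up flag are hypothetical names made up for illustration, and only current_limit_low corresponds to the field proposed above. The free list is assumed to be sorted and non-overlapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_REGIONS 8

/* One free physical range: [base, base + size). */
struct toy_region {
	uint64_t base;
	uint64_t size;
};

/* Hypothetical stand-in for struct memblock (illustration only). */
struct toy_memblock {
	struct toy_region free[TOY_MAX_REGIONS]; /* sorted by base */
	size_t nr_free;
	uint64_t current_limit_low; /* lowest address an allocation may use */
	bool bottom_up;             /* true: low to high, false: high to low */
};

static uint64_t toy_align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

static uint64_t toy_align_down(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

/*
 * Find (but do not reserve) an aligned range of 'size' bytes that lies at or
 * above current_limit_low. Returns the base address, or 0 if nothing fits.
 */
static uint64_t toy_find(const struct toy_memblock *mb,
			 uint64_t size, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0); /* power of two */
	if (size == 0)
		return 0;

	if (mb->bottom_up) {
		/* Walk free ranges from the lowest address upward. */
		for (size_t i = 0; i < mb->nr_free; i++) {
			uint64_t start = mb->free[i].base;
			uint64_t end = start + mb->free[i].size;

			if (start < mb->current_limit_low)
				start = mb->current_limit_low;
			start = toy_align_up(start, align);
			if (start < end && end - start >= size)
				return start;
		}
	} else {
		/* Walk free ranges from the highest address downward. */
		for (size_t i = mb->nr_free; i-- > 0; ) {
			uint64_t start = mb->free[i].base;
			uint64_t end = start + mb->free[i].size;
			uint64_t cand;

			if (start < mb->current_limit_low)
				start = mb->current_limit_low;
			if (end < start || end - start < size)
				continue;
			cand = toy_align_down(end - size, align);
			if (cand >= start)
				return cand;
		}
	}
	return 0;
}
```

Before SRAT is parsed, the x86 setup code would set current_limit_low to the kernel image end address and bottom_up to true, so every early caller gets memory just above the kernel image without being modified. After SRAT is parsed, it would set current_limit_low back to 0 and bottom_up back to false.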
What do you think of this? Or do you have a better idea?
Thanks for your patience and help. :)
On 08/13/2013 02:14 PM, Tang Chen wrote:
> On 08/13/2013 12:46 AM, Tejun Heo wrote:
> ......
>>
>> * Adding an option to tell the kernel to try to stay away from
>> hotpluggable nodes is fine. I have no problem with that at all.
>>
>> * The patchsets up to this point have been trying to reorder
>> operations somehow such that *no* memory allocation happens before
>> memblock is populated with hotplug information.
>>
>> * However, we already *know* that the memory the kernel image is
>> occupying won't be removable. It's highly likely that the amount
>> of memory allocation before NUMA / hotplug information is fully
>> populated is pretty small. Also, it's highly likely that small
>> amount of memory right after the kernel image is contained in the
>> same NUMA node, so if we allocate memory close to the kernel image,
>> it's likely that we don't contaminate hotpluggable node. We're
>> talking about few megs at most right after the kernel image. I
>> can't see how that would make any noticeable difference.
>>
>> * Once hotplug information is available, allocation can happen as
>> usual and the kernel can report the nodes which are actually
>> hotpluggable - marked as hotpluggable by the firmware && didn't get
>> contaminated during early alloc && didn't get overflow allocations
>> afterwards. Note that we need such mechanism no matter what as the
>> kernel image can be loaded into hotpluggable nodes and reporting
>> that to userland is the only thing the kernel can do for cases like
>> that short of denying memory unplug on such nodes.
>>
>
> Hi tj, hpa, luck, yinghai,
>
> So if all of you agree on the idea above from tj, I think
> we can do it in this way. Will update the patches to allocate
> memory near kernel image before SRAT is parsed.
>
> Thanks.
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>
Thread overview: 83+ messages
2013-08-08 10:16 [PATCH part5 0/7] Arrange hotpluggable memory as ZONE_MOVABLE Tang Chen
2013-08-08 10:16 ` [PATCH part5 1/7] x86: get pg_data_t's memory from other node Tang Chen
2013-08-12 14:39 ` Tejun Heo
2013-08-12 15:12 ` Tang Chen
2013-08-08 10:16 ` [PATCH part5 2/7] x86, numa, mem_hotplug: Skip all the regions the kernel resides in Tang Chen
2013-08-08 10:16 ` [PATCH part5 3/7] memblock, numa: Introduce flag into memblock Tang Chen
2013-08-08 10:16 ` [PATCH part5 4/7] memblock, mem_hotplug: Introduce MEMBLOCK_HOTPLUG flag to mark hotpluggable regions Tang Chen
2013-08-08 10:16 ` [PATCH part5 5/7] memblock, mem_hotplug: Make memblock skip hotpluggable regions by default Tang Chen
2013-08-14 21:54 ` Naoya Horiguchi
2013-08-15 5:15 ` Tang Chen
2013-08-08 10:16 ` [PATCH part5 6/7] mem-hotplug: Introduce movablenode boot option to {en|dis}able using SRAT Tang Chen
2013-08-08 10:16 ` [PATCH part5 7/7] x86, numa, acpi, memory-hotplug: Make movablenode have higher priority Tang Chen
2013-08-09 16:32 ` [PATCH part5 0/7] Arrange hotpluggable memory as ZONE_MOVABLE Tejun Heo
2013-08-12 6:33 ` Tang Chen
2013-08-12 8:54 ` Tang Chen
2013-08-12 14:50 ` Tejun Heo
2013-08-12 15:14 ` H. Peter Anvin
2013-08-12 15:23 ` Tejun Heo
2013-08-12 16:29 ` Tang Chen
2013-08-12 16:46 ` Tejun Heo
2013-08-12 18:23 ` Tang Chen
2013-08-12 20:20 ` Tejun Heo
2013-08-12 20:49 ` Luck, Tony
2013-08-12 20:54 ` Tejun Heo
2013-08-12 20:57 ` H. Peter Anvin
2013-08-12 21:06 ` Yinghai Lu
2013-08-12 21:08 ` Tejun Heo
2013-08-12 21:12 ` H. Peter Anvin
2013-08-12 21:14 ` Tejun Heo
2013-08-12 21:11 ` H. Peter Anvin
2013-08-12 21:11 ` Luck, Tony
2013-08-12 21:25 ` Yinghai Lu
2013-08-12 21:28 ` H. Peter Anvin
2013-08-13 5:14 ` H. Peter Anvin
2013-08-13 6:14 ` Tang Chen
2013-08-13 9:56 ` Tang Chen [this message]
2013-08-13 14:38 ` Tejun Heo
2013-08-13 22:33 ` Yinghai Lu
2013-08-14 1:22 ` Tang Chen
2013-08-15 19:06 ` Toshi Kani
2013-08-15 20:28 ` Yinghai Lu
2013-08-16 2:08 ` Tang Chen
2013-08-16 4:21 ` Yinghai Lu
2013-08-19 3:07 ` Tang Chen
2013-08-19 3:28 ` Yinghai Lu
2013-08-15 8:42 ` Tang Chen
2013-08-15 12:19 ` Tejun Heo
2013-08-15 12:44 ` Tang Chen
2013-08-15 12:49 ` Tejun Heo
2013-08-15 12:52 ` Tang Chen
2013-08-15 14:37 ` Yinghai Lu
2013-08-15 14:45 ` Tejun Heo
2013-08-15 15:05 ` Yinghai Lu
2013-08-15 15:10 ` Tejun Heo
2013-08-15 19:49 ` Toshi Kani
2013-08-15 19:08 ` Luck, Tony
2013-08-15 19:34 ` Yinghai Lu
2013-08-15 14:35 ` Yinghai Lu
2013-08-16 1:16 ` Tang Chen
2013-08-12 15:41 ` Tang Chen
2013-08-12 15:46 ` Tejun Heo
2013-08-12 16:19 ` Tang Chen
2013-08-12 16:22 ` Tejun Heo
2013-08-12 17:01 ` Tang Chen
2013-08-12 17:23 ` H. Peter Anvin
2013-08-14 18:22 ` KOSAKI Motohiro
2013-08-12 18:07 ` Tejun Heo
2013-08-14 18:15 ` KOSAKI Motohiro
2013-08-14 18:23 ` Tejun Heo
2013-08-14 19:40 ` KOSAKI Motohiro
2013-08-14 19:55 ` Tejun Heo
2013-08-14 20:29 ` KOSAKI Motohiro
2013-08-14 20:30 ` H. Peter Anvin
2013-08-14 20:35 ` Tejun Heo
2013-08-14 21:17 ` KOSAKI Motohiro
2013-08-14 21:36 ` Tejun Heo
2013-08-15 1:08 ` KOSAKI Motohiro
2013-08-15 1:21 ` Tejun Heo
2013-08-15 1:33 ` Tejun Heo
2013-08-15 1:44 ` KOSAKI Motohiro
2013-08-15 2:22 ` Tejun Heo
2013-08-15 1:38 ` KOSAKI Motohiro
2013-08-15 1:51 ` Tejun Heo