From: Tejun Heo <tj@kernel.org>
To: Zhang Yanfei <zhangyanfei.yes@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
Andrew Morton <akpm@linux-foundation.org>,
"Rafael J . Wysocki" <rjw@sisk.pl>,
lenb@kernel.org, Thomas Gleixner <tglx@linutronix.de>,
mingo@elte.hu, Toshi Kani <toshi.kani@hp.com>,
Wanpeng Li <liwanp@linux.vnet.ibm.com>,
Thomas Renninger <trenn@suse.de>, Yinghai Lu <yinghai@kernel.org>,
Jiang Liu <jiang.liu@huawei.com>,
Wen Congyang <wency@cn.fujitsu.com>,
Lai Jiangshan <laijs@cn.fujitsu.com>,
isimatu.yasuaki@jp.fujitsu.com, izumi.taku@jp.fujitsu.com,
Mel Gorman <mgorman@suse.de>, Minchan Kim <minchan@kernel.org>,
mina86@mina86.com, gong.chen@linux.intel.com,
vasilis.liaskovitis@profitbricks.com, lwoodman@redhat.com,
Rik van Riel <riel@redhat.com>,
jweiner@redhat.com, prarit@redhat.com,
"x86@kernel.org" <x86@kernel.org>,
linux-doc@vger.kernel.org,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>,
linux-acpi@vger.kernel.org, imtangchen@gmail.com,
Zhang Yanfei <zhangyanfei@cn.fujitsu.com>,
Tang Chen <tangchen@cn.fujitsu.com>
Subject: Re: [PATCH part1 v6 4/6] x86/mem-hotplug: Support initialize page tables in bottom-up
Date: Wed, 9 Oct 2013 15:20:40 -0400
Message-ID: <20131009192040.GA5592@mtj.dyndns.org>
In-Reply-To: <52558EEF.4050009@gmail.com>
Hello,
On Thu, Oct 10, 2013 at 01:14:23AM +0800, Zhang Yanfei wrote:
> >> You meant that the memory size is about few megs. But here, page tables
> >> seems to be large enough in big memory machines, so that page tables will
> >
> > Hmmm? Even with 4k mappings and, say, 16Gigs of memory, it's still
> > somewhere above 32MiB, right? And, these physical mappings don't
> > usually use 4k mappings to begin with. Unless we're worrying about
> > ISA DMA limit, I don't think it'd be problematic.
>
> I think Peter meant very huge memory machines, say 2T memory? In the worst
> case, this may need 2G memory for page tables, seems huge....
Realistically tho, why would people be using 4k mappings on 2T
machines? For the sake of argument, let's say 4k mappings are
required for some weird reason, even then, doing SRAT parsing early
doesn't necessarily solve the problem in itself. It'd still need
heuristics to avoid occupying too much of 32bit memory because it
isn't difficult to imagine specific NUMA settings which would drive
page table allocation into low addresses.
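
For reference, a rough sketch of the arithmetic behind the sizes being
discussed. It assumes x86_64 paging with 8-byte entries and counts only
the lowest-level entries (the upper levels add well under 1%). The
function name is made up for illustration and is not a kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: x86_64 paging, where every page-table entry is 8 bytes. */
#define ENTRY_BYTES 8ULL

/*
 * Bytes of lowest-level page-table entries needed to direct-map
 * mem_bytes of memory using mappings of map_bytes each
 * (4 KiB, 2 MiB or 1 GiB). Upper-level tables are ignored.
 */
uint64_t leaf_table_bytes(uint64_t mem_bytes, uint64_t map_bytes)
{
    assert(map_bytes != 0);

    /* Round up so a partial final mapping still gets an entry. */
    uint64_t entries = (mem_bytes + map_bytes - 1) / map_bytes;

    return entries * ENTRY_BYTES;
}
```

Under these assumptions 16 GiB with 4k mappings needs 32 MiB of
entries, and 2 TiB with 4k mappings comes out nearer 4 GiB than 2G.
With 2 MiB mappings the same 2 TiB needs about 8 MiB, and with 1 GiB
mappings about 16 KiB.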
No matter what we do, there's no way around the fact that this whole
effort is mostly an incomplete solution in its nature and that's why I
think we better keep things isolated and simple. It isn't a good idea
to make structural changes to accommodate something which isn't and
doesn't have much chance of becoming a full solution. In addition,
the problem itself is niche to begin with.
> And I am not familiar with the ISA DMA limit, does this mean the memory
> below 4G? Just as we have the ZONE_DMA32 in x86_64. (16MB limit seems not
> the case here)
Yeah, I was referring to the 16MB limit, which apparently ceased to
exist.
> 1. introduce bottom up allocation to allocate memory near the kernel before
> we parse SRAT.
> 2. Since peter have the serious concern about the pagetable setup in bottom-up
> and Ingo also said we'd better not to touch the current top-down pagetable
> setup. Could we just put acpi_initrd_override and numa_init related functions
> before init_mem_mapping()? After numa info is parsed (including SRAT), we
> reset the allocation direction back to top-down, so we needn't change the
> page table setup process. And before numa info parsed, we use the bottom-up
> allocation to make sure all memory allocated by memblock is near the kernel
> image.
>
> How do you think?
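
To make the two allocation directions in the proposal above concrete,
here is a toy model, not memblock's actual code. The struct and
function names are hypothetical, alignment is ignored, and the free
list is assumed to be sorted by address and non-overlapping. Bottom-up
picks the lowest fit at or above a floor (for example the end of the
kernel image); top-down picks the highest fit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free region [start, end); not a kernel type. */
struct range {
    uint64_t start;
    uint64_t end;
};

/* Lowest base >= floor that fits size, or UINT64_MAX if none fits. */
uint64_t find_bottom_up(const struct range *free_r, size_t n,
                        uint64_t floor, uint64_t size)
{
    assert(size != 0);

    for (size_t i = 0; i < n; i++) {
        /* Skip the part of this region that lies below the floor. */
        uint64_t base = free_r[i].start > floor ? free_r[i].start : floor;

        if (base < free_r[i].end && free_r[i].end - base >= size)
            return base;
    }
    return UINT64_MAX;
}

/* Highest base that fits size, or UINT64_MAX if none fits. */
uint64_t find_top_down(const struct range *free_r, size_t n, uint64_t size)
{
    assert(size != 0);

    /* Walk the regions from the highest address downwards. */
    for (size_t i = n; i-- > 0; ) {
        if (free_r[i].end - free_r[i].start >= size)
            return free_r[i].end - size;
    }
    return UINT64_MAX;
}
```

With free regions at 4 KiB..636 KiB, 1 MiB..1 GiB and 4 GiB..8 GiB and
a floor of 32 MiB, a 2 MiB request lands at 32 MiB bottom-up and just
below 8 GiB top-down.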
Let's wait to hear more about Peter's concern. Peter, the whole thing
is a very specialized, off-by-default thing which is more or less a
kludge no matter which implementation direction we choose and, as far
as the cost and risk go, I think the proposed series is pretty small
in its footprint. What do you think?
Thanks.
--
tejun