From: Yinghai Lu <yinghai@kernel.org>
To: Tejun Heo <tj@kernel.org>
Cc: x86@kernel.org, Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	linux-kernel@vger.kernel.org
Subject: Re: questions about init_memory_mapping_high()
Date: Thu, 24 Feb 2011 17:37:37 -0800
Message-ID: <4D6707E1.3060509@kernel.org>
In-Reply-To: <20110224091557.GD7840@htj.dyndns.org>

On 02/24/2011 01:15 AM, Tejun Heo wrote:
> Hey, again.
> 
> On Wed, Feb 23, 2011 at 02:17:34PM -0800, Yinghai Lu wrote:
>>> Hmmm... I'm not really following.  Can you elaborate?  The reason why
>>> smaller mapping is bad is because of increased TLB pressure.  What
>>> does using the existing entries have to do with it?
>>
>> assume 1G pages are used. the first node will actually have 512G mapped already.
>> so if the system only has 1024g, then the page table for the first 512g will be on node0 ram.
>> the page table for the second 512g will be on node4.
>>
>> when only 2M pages are used, it is on a 1G boundary. for a 1024g system:
>> page table (about 512k) for mem 0-128g is on node0.
>> page table (about 512k) for mem 128g-256g is on node1.
>> ...
>> Do you mean we need to put all those 512k pieces together to reduce TLB pressure?
> 
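The page-table sizes quoted above can be checked with a little arithmetic. The following is only a sketch, assuming x86-64 with 4KiB page-table pages holding 512 entries each and counting only the lowest-level tables; `leaf_table_bytes()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define PT_PAGE_SIZE      4096ULL  /* one page-table page */
#define ENTRIES_PER_TABLE 512ULL   /* 8-byte entries per page-table page */

/*
 * Bytes of lowest-level page-table pages needed to map `span` bytes
 * using mappings of `map_size` bytes. Upper-level tables are ignored,
 * which is why the figures in the thread are "about" 512k.
 */
static uint64_t leaf_table_bytes(uint64_t span, uint64_t map_size)
{
	uint64_t covered_per_table = map_size * ENTRIES_PER_TABLE;
	uint64_t tables;

	assert(map_size != 0);
	/* round up: a partially used table still costs a full page */
	tables = (span + covered_per_table - 1) / covered_per_table;
	return tables * PT_PAGE_SIZE;
}
```

With 2MiB mappings one table page covers 1GiB, so 128GiB needs 128 table pages, i.e. 512KiB. With 1GiB mappings one table page covers 512GiB.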
> Nope, let's say the machine supports 1GiB mapping, has 8GiB of memory
> where [0,4)GiB is node 0 and [4,8)GiB node1, and there's a hole of
> 128MiB right on top of 4GiB.  Before the change, the page mapping code
> wouldn't care about the hole and would just map the whole [0,8)GiB area
> with eight 1GiB mappings.  Now with your change, [4, 5)GiB will be
> mapped using 2MiB mappings to avoid mapping the 128MiB hole.
> 
> We end up unnecessarily using smaller size mappings (512 2MiB mappings
> instead of 1 1GiB mapping) thus increasing TLB pressure.  There is no
> reason to match the linear address mapping exactly to the physical
> memory map.  It is no accident that the original code didn't consider
> memory holes.  Using larger mappings over them is more beneficial than
> trying to punch holes with smaller mappings.
> 
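To make the mapping counts concrete, here is a rough sketch that counts how many mappings of each size a physical range needs when the largest aligned size is picked greedily. It is an illustration only: `count_mappings()` and `struct map_count` are hypothetical names, and the real kernel code splits ranges differently.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K (1ULL << 12)
#define SZ_2M (1ULL << 21)
#define SZ_1G (1ULL << 30)

struct map_count {
	uint64_t n_1g, n_2m, n_4k;
};

/*
 * Count the mappings needed for [start, end), always taking the
 * largest size that is allowed (<= max_size), aligned at the current
 * address, and still fits in the remaining range.
 */
static struct map_count count_mappings(uint64_t start, uint64_t end,
				       uint64_t max_size)
{
	static const uint64_t sizes[] = { SZ_1G, SZ_2M, SZ_4K };
	struct map_count c = { 0, 0, 0 };
	uint64_t addr = start;
	int i;

	assert(start % SZ_4K == 0 && end % SZ_4K == 0 && start <= end);
	assert(max_size >= SZ_4K);

	while (addr < end) {
		/* 4K always qualifies, so every pass makes progress */
		for (i = 0; i < 3; i++) {
			uint64_t sz = sizes[i];

			if (sz > max_size || addr % sz != 0 || end - addr < sz)
				continue;
			if (sz == SZ_1G)
				c.n_1g++;
			else if (sz == SZ_2M)
				c.n_2m++;
			else
				c.n_4k++;
			addr += sz;
			break;
		}
	}
	return c;
}
```

Covering [0,8)GiB with 1GiB mappings allowed gives eight 1GiB mappings, while covering [4,5)GiB with at most 2MiB mappings gives 512 of them.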
> This rather important change was made without any description or
> explanation, which I find somewhat disturbing.  Anyways, what we can
> do is just take the bottom and top addresses of occupied NUMA regions
> and round them down and up, respectively, to the largest page mapping
> size supported as long as the top address doesn't go over max_pfn
> instead of mapping exactly according to the memblocks.
> 
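The rounding suggested above might look roughly like the sketch below. It is not the code from the patches: `widen_range()` and `struct range` are hypothetical, and clamping the top to the max_pfn limit (rather than, say, leaving it unrounded) is just one reading of "as long as the top address doesn't go over max_pfn".

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

struct range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/*
 * Widen [start, end) to boundaries of map_size (the largest supported
 * mapping size, a power of two), without letting the top go past the
 * address that corresponds to max_pfn.
 */
static struct range widen_range(uint64_t start, uint64_t end,
				uint64_t map_size, uint64_t max_pfn)
{
	uint64_t limit = max_pfn << PAGE_SHIFT;
	struct range r;

	assert(map_size != 0 && (map_size & (map_size - 1)) == 0);
	assert(start < end && end <= limit);

	r.start = start & ~(map_size - 1);			/* round down */
	r.end = (end + map_size - 1) & ~(map_size - 1);	/* round up */
	if (r.end > limit)
		r.end = limit;	/* assumption: clamp instead of overshooting */
	return r;
}
```

For a node that starts 128MiB above 4GiB and ends at 8GiB, with 1GiB mappings and max_pfn at 8GiB, this gives [4,8)GiB, so the hole is simply mapped over.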

ok, please check the two patches that fix the problem.

thanks

Yinghai
