From: Yinghai Lu <yinghai@kernel.org>
To: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] x86,mm,64bit: Round up memory boundary for init_memory_mapping_high()
Date: Fri, 25 Feb 2011 12:18:44 -0800
Message-ID: <4D680EA4.6050402@kernel.org>
In-Reply-To: <20110225111606.GG24828@htj.dyndns.org>

On 02/25/2011 03:16 AM, Tejun Heo wrote:
> On Thu, Feb 24, 2011 at 10:20:35PM -0800, Yinghai Lu wrote:
>> tj pointed out:
>> 	when a node does not have a 1G-aligned boundary, e.g. one at 128M,
>> init_memory_mapping_high() could end up mapping 128M on one node and
>> 896M on the next node with 2M pages instead of one 1G page. That could
>> increase TLB pressure.
>>
>> So if GB pages are used, try to align the boundary to 1G before calling
>> init_memory_mapping_ext(), to make sure only one 1G entry is used for that
>> cross-node 1G range.
>> Need init_memory_mapping_ext() to take tbl_end, to make sure the pgtable is
>> on the previous node instead of the next node.
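The arithmetic in the description above can be illustrated with a small
userspace sketch. This is not kernel code: the helper names and the
4G + 128M boundary are assumptions chosen for illustration only. It
rounds a node boundary down and up to 1G and counts how many 2M entries
each side of the boundary would need if the range were not covered by a
single 1G entry.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1M (UINT64_C(1) << 20)
#define SZ_2M (UINT64_C(1) << 21)
#define SZ_1G (UINT64_C(1) << 30)

/* Round addr down to a power-of-two alignment (hypothetical helper). */
static uint64_t round_down_to(uint64_t addr, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/* Round addr up to a power-of-two alignment (hypothetical helper). */
static uint64_t round_up_to(uint64_t addr, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/* Number of 2M entries needed to map [start, end); both must be 2M aligned. */
static uint64_t nr_2m_entries(uint64_t start, uint64_t end)
{
	assert(start <= end);
	assert((start & (SZ_2M - 1)) == 0 && (end & (SZ_2M - 1)) == 0);
	return (end - start) / SZ_2M;
}
```

With a boundary at 4 * SZ_1G + 128 * SZ_1M, the enclosing 1G range is
[4G, 5G). Mapping each side separately takes
nr_2m_entries(4G, boundary) plus nr_2m_entries(boundary, 5G) entries of
2M, whereas rounding the boundary to 1G lets one 1G entry cover the
same range.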
> 
> I don't know, Yinghai.  The whole code seems overly complicated to me.
> Just ignore e820 map when building linear mapping.  It doesn't matter.
> Why not just do something like the following?  Also, can you please
> add some comments explaining how the NUMA affine allocation actually
> works for page tables?
Yes, that could be done in a separate patch.

>  Or better, can you please make that explicit?
> It currently depends on memories being registered in ascending address
> order, right?  The memblock code already is NUMA aware, I think it
> would be far better to make the node affine part explicit.

Yes, memblock is NUMA aware after memblock_x86_register_active_regions(),
and it relies on early_node_map[].

Do you mean letting init_memory_mapping() take a node id, like
setup_node_bootmem() does, so that find_early_table_space() could take a
nodeid instead of tbl_end?

> 
> Thanks.
> 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 46e684f..4fd0b59 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -966,6 +966,11 @@ void __init setup_arch(char **cmdline_p)
>  	memblock.current_limit = get_max_mapped();
>  
>  	/*
> +	 * Add whole lot of comment explaining what's going on and WHY
> +	 * because as it currently stands, it's frigging cryptic.
> +	 */
> +
> +	/*
>  	 * NOTE: On x86-32, only from this point on, fixmaps are ready for use.
>  	 */
>  
> diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
> index 7757d22..50ec03c 100644
> --- a/arch/x86/mm/numa_64.c
> +++ b/arch/x86/mm/numa_64.c
> @@ -536,8 +536,6 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
>  	if (!numa_meminfo_cover_memory(mi))
>  		return -EINVAL;
>  
> -	init_memory_mapping_high();
> -
>  	/* Finally register nodes. */
>  	for_each_node_mask(nid, node_possible_map) {
>  		u64 start = (u64)max_pfn << PAGE_SHIFT;
> @@ -550,8 +548,12 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
>  			end = max(mi->blk[i].end, end);
>  		}
>  
> -		if (start < end)
> +		if (start < end) {
> +			init_memory_mapping(
> +			  ALIGN_DOWN_TO_MAX_MAP_SIZE_AND_CONVERT_TO_PFN(start),
> +			  ALIGN_UP_SIMILARY_BUT_DONT_GO_OVER_MAX_PFN(end));
>  			setup_node_bootmem(nid, start, end);
> +		}
This will have a problem with cross-node configurations, e.g. 0-4g and 8-12g on node0, 4g-8g and 12g-16g on node1.

>  	}
>  
>  	return 0;
> 
> 
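To make the cross-node case concrete, here is a minimal userspace sketch,
not kernel code. The struct and function names are hypothetical; only the
block layout comes from the example above. It computes each node's
[min start, max end) span the same way the quoted loop accumulates start
and end, and checks whether the two spans overlap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GB(x) ((uint64_t)(x) << 30)

/* Hypothetical stand-in for one numa_meminfo block. */
struct mem_blk {
	uint64_t start;
	uint64_t end;
	int nid;
};

struct span {
	uint64_t start;
	uint64_t end;
};

/* Interleaved layout: 0-4g, 8-12g on node0; 4g-8g, 12g-16g on node1. */
static const struct mem_blk example_blks[] = {
	{ GB(0),  GB(4),  0 },
	{ GB(4),  GB(8),  1 },
	{ GB(8),  GB(12), 0 },
	{ GB(12), GB(16), 1 },
};

/* Lowest start and highest end over all blocks that belong to nid. */
static struct span node_span(const struct mem_blk *blk, size_t nr, int nid)
{
	struct span s = { UINT64_MAX, 0 };

	for (size_t i = 0; i < nr; i++) {
		if (blk[i].nid != nid)
			continue;
		if (blk[i].start < s.start)
			s.start = blk[i].start;
		if (blk[i].end > s.end)
			s.end = blk[i].end;
	}
	return s;
}

/* True if the half-open ranges a and b share at least one address. */
static bool spans_overlap(struct span a, struct span b)
{
	return a.start < b.end && b.start < a.end;
}
```

For this layout node0 spans 0-12g and node1 spans 4g-16g, so the spans
overlap on 4g-12g. Mapping per node by span would therefore cover that
range twice, and page tables for node1's 4g-8g memory could already have
been set up while handling node0.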

Thanks

Yinghai Lu
