From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Yinghai Lu <yinghai@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
"H. Peter Anvin" <hpa@zytor.com>,
Andrew Morton <akpm@linux-foundation.org>,
Tejun Heo <tj@kernel.org>, Thomas Renninger <trenn@suse.de>,
Tang Chen <tangchen@cn.fujitsu.com>,
linux-kernel@vger.kernel.org, Pekka Enberg <penberg@kernel.org>,
Jacob Shin <jacob.shin@amd.com>
Subject: Re: [PATCH v3 21/22] x86, mm: Make init_mem_mapping be able to be called several times
Date: Fri, 5 Apr 2013 09:38:59 -0400
Message-ID: <20130405133859.GH20093@phenom.dumpdata.com>
In-Reply-To: <1365119186-23487-22-git-send-email-yinghai@kernel.org>

On Thu, Apr 04, 2013 at 04:46:25PM -0700, Yinghai Lu wrote:
> Prepare to put page table on local nodes.
>
> Move calling of init_mem_mapping to early_initmem_init.
>
> Rework alloc_low_pages to alloc page table in following order:
> BRK, local node, low range
>
> Still only load_cr3 one time, otherwise we would break xen 64bit again.
Nope. It should be fixed now.
>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: Jacob Shin <jacob.shin@amd.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
> arch/x86/include/asm/pgtable.h | 2 +-
> arch/x86/kernel/setup.c | 1 -
> arch/x86/mm/init.c | 88 ++++++++++++++++++++++++++----------------
> arch/x86/mm/numa.c | 24 ++++++++++++
> 4 files changed, 79 insertions(+), 36 deletions(-)
>
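
For reference, the allocation order named in the changelog (BRK, then local
node, then low range) can be modelled in user space roughly as below. This is
only a sketch of the decision logic, not kernel code: the enum, the structs
and pick_pgt_source() are hypothetical names made up for illustration, and the
real alloc_low_pages() panics where this model returns PGT_NONE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum pgt_source { PGT_FROM_BRK, PGT_FROM_LOCAL, PGT_FROM_LOW, PGT_NONE };

struct pfn_range {
	unsigned long min_pfn;	/* first mapped pfn */
	unsigned long max_pfn;	/* one past the last mapped pfn */
};

struct pgt_state {
	unsigned long buf_end;	/* next free page in the BRK buffer */
	unsigned long buf_top;	/* end of the BRK buffer */
	bool can_use_brk;	/* models can_use_brk_pgt */
	struct pfn_range local;	/* models local_{min,max}_pfn_mapped */
	struct pfn_range low;	/* models low_{min,max}_pfn_mapped */
};

/* A range with min >= max has nothing mapped yet and cannot be used. */
static bool range_empty(const struct pfn_range *r)
{
	return r->min_pfn >= r->max_pfn;
}

/* Decide where a request for num page-table pages would be served from. */
static enum pgt_source pick_pgt_source(const struct pgt_state *s,
				       unsigned int num)
{
	assert(num > 0);

	/* 1. BRK buffer, while it is allowed and still has room. */
	if (s->can_use_brk && s->buf_end + num <= s->buf_top)
		return PGT_FROM_BRK;

	/* 2. Memory already mapped by the current (local) call. */
	if (!range_empty(&s->local))
		return PGT_FROM_LOCAL;

	/* 3. Memory mapped by the first (low) call. */
	if (!range_empty(&s->low))
		return PGT_FROM_LOW;

	/* The patch panics at this point ("ran out of memory"). */
	return PGT_NONE;
}
```

With a four-page BRK buffer and both ranges populated, a two-page request
comes from BRK and an eight-page request falls through to the local range;
the low range is only consulted once the local range is empty.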
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 1e67223..868687c 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -621,7 +621,7 @@ static inline int pgd_none(pgd_t pgd)
> #ifndef __ASSEMBLY__
>
> extern int direct_gbpages;
> -void init_mem_mapping(void);
> +void init_mem_mapping(unsigned long begin, unsigned long end);
> void early_alloc_pgt_buf(void);
>
> /* local pte updates need not use xchg for locking */
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 6ef3fa2..67ef4bc 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -1105,7 +1105,6 @@ void __init setup_arch(char **cmdline_p)
> acpi_boot_table_init();
> early_acpi_boot_init();
> early_initmem_init();
> - init_mem_mapping();
> memblock.current_limit = get_max_mapped();
> early_trap_pf_init();
>
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index 2754e45..8a03283 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -24,7 +24,10 @@ static unsigned long __initdata pgt_buf_start;
> static unsigned long __initdata pgt_buf_end;
> static unsigned long __initdata pgt_buf_top;
>
> -static unsigned long min_pfn_mapped;
> +static unsigned long low_min_pfn_mapped;
> +static unsigned long low_max_pfn_mapped;
> +static unsigned long local_min_pfn_mapped;
> +static unsigned long local_max_pfn_mapped;
>
> static bool __initdata can_use_brk_pgt = true;
>
> @@ -52,10 +55,17 @@ __ref void *alloc_low_pages(unsigned int num)
>
> if ((pgt_buf_end + num) > pgt_buf_top || !can_use_brk_pgt) {
> unsigned long ret;
> - if (min_pfn_mapped >= max_pfn_mapped)
> - panic("alloc_low_page: ran out of memory");
> - ret = memblock_find_in_range(min_pfn_mapped << PAGE_SHIFT,
> - max_pfn_mapped << PAGE_SHIFT,
> + if (local_min_pfn_mapped >= local_max_pfn_mapped) {
> + if (low_min_pfn_mapped >= low_max_pfn_mapped)
> + panic("alloc_low_page: ran out of memory");
> + ret = memblock_find_in_range(
> + low_min_pfn_mapped << PAGE_SHIFT,
> + low_max_pfn_mapped << PAGE_SHIFT,
> + PAGE_SIZE * num , PAGE_SIZE);
> + } else
> + ret = memblock_find_in_range(
> + local_min_pfn_mapped << PAGE_SHIFT,
> + local_max_pfn_mapped << PAGE_SHIFT,
> PAGE_SIZE * num , PAGE_SIZE);
> if (!ret)
> panic("alloc_low_page: can not alloc memory");
> @@ -402,60 +412,75 @@ static unsigned long __init get_new_step_size(unsigned long step_size)
> return step_size;
> }
>
> -void __init init_mem_mapping(void)
> +void __init init_mem_mapping(unsigned long begin, unsigned long end)
> {
> - unsigned long end, real_end, start, last_start;
> + unsigned long real_end, start, last_start;
> unsigned long step_size;
> unsigned long addr;
> unsigned long mapped_ram_size = 0;
> unsigned long new_mapped_ram_size;
> + bool is_low = false;
> +
> + if (!begin) {
> + probe_page_size_mask();
> + /* the ISA range is always mapped regardless of memory holes */
> + init_memory_mapping(0, ISA_END_ADDRESS);
> + begin = ISA_END_ADDRESS;
> + is_low = true;
> + }
>
> - probe_page_size_mask();
> -
> -#ifdef CONFIG_X86_64
> - end = max_pfn << PAGE_SHIFT;
> -#else
> - end = max_low_pfn << PAGE_SHIFT;
> -#endif
> -
> - /* the ISA range is always mapped regardless of memory holes */
> - init_memory_mapping(0, ISA_END_ADDRESS);
> + if (begin >= end)
> + return;
>
> /* xen has big range in reserved near end of ram, skip it at first.*/
> - addr = memblock_find_in_range(ISA_END_ADDRESS, end, PMD_SIZE, PMD_SIZE);
> + addr = memblock_find_in_range(begin, end, PMD_SIZE, PMD_SIZE);
> real_end = addr + PMD_SIZE;
>
> /* step_size need to be small so pgt_buf from BRK could cover it */
> step_size = PMD_SIZE;
> - max_pfn_mapped = 0; /* will get exact value next */
> - min_pfn_mapped = real_end >> PAGE_SHIFT;
> + local_max_pfn_mapped = begin >> PAGE_SHIFT;
> + local_min_pfn_mapped = real_end >> PAGE_SHIFT;
> last_start = start = real_end;
> - while (last_start > ISA_END_ADDRESS) {
> + while (last_start > begin) {
> if (last_start > step_size) {
> start = round_down(last_start - 1, step_size);
> - if (start < ISA_END_ADDRESS)
> - start = ISA_END_ADDRESS;
> + if (start < begin)
> + start = begin;
> } else
> - start = ISA_END_ADDRESS;
> + start = begin;
> new_mapped_ram_size = init_range_memory_mapping(start,
> last_start);
> + if ((last_start >> PAGE_SHIFT) > local_max_pfn_mapped)
> + local_max_pfn_mapped = last_start >> PAGE_SHIFT;
> + local_min_pfn_mapped = start >> PAGE_SHIFT;
> last_start = start;
> - min_pfn_mapped = last_start >> PAGE_SHIFT;
> /* only increase step_size after big range get mapped */
> if (new_mapped_ram_size > mapped_ram_size)
> step_size = get_new_step_size(step_size);
> mapped_ram_size += new_mapped_ram_size;
> }
>
> - if (real_end < end)
> + if (real_end < end) {
> init_range_memory_mapping(real_end, end);
> + if ((end >> PAGE_SHIFT) > local_max_pfn_mapped)
> + local_max_pfn_mapped = end >> PAGE_SHIFT;
> + }
>
> + if (is_low) {
> + low_min_pfn_mapped = local_min_pfn_mapped;
> + low_max_pfn_mapped = local_max_pfn_mapped;
> + }
> +}
> +
> +#ifndef CONFIG_NUMA
> +void __init early_initmem_init(void)
> +{
> #ifdef CONFIG_X86_64
> - if (max_pfn > max_low_pfn) {
> - /* can we preseve max_low_pfn ?*/
> + init_mem_mapping(0, max_pfn << PAGE_SHIFT);
> + if (max_pfn > max_low_pfn)
> max_low_pfn = max_pfn;
> - }
> #else
> + init_mem_mapping(0, max_low_pfn << PAGE_SHIFT);
> early_ioremap_page_table_range_init();
> #endif
>
> @@ -464,11 +489,6 @@ void __init init_mem_mapping(void)
>
> early_memtest(0, max_pfn_mapped << PAGE_SHIFT);
> }
> -
> -#ifndef CONFIG_NUMA
> -void __init early_initmem_init(void)
> -{
> -}
> #endif
>
> /*
> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> index c2d4653..d3eb0c9 100644
> --- a/arch/x86/mm/numa.c
> +++ b/arch/x86/mm/numa.c
> @@ -17,8 +17,10 @@
> #include <asm/dma.h>
> #include <asm/acpi.h>
> #include <asm/amd_nb.h>
> +#include <asm/tlbflush.h>
>
> #include "numa_internal.h"
> +#include "mm_internal.h"
>
> int __initdata numa_off;
> nodemask_t numa_nodes_parsed __initdata;
> @@ -668,9 +670,31 @@ static void __init early_x86_numa_init(void)
> numa_init(dummy_numa_init);
> }
>
> +#ifdef CONFIG_X86_64
> +static void __init early_x86_numa_init_mapping(void)
> +{
> + init_mem_mapping(0, max_pfn << PAGE_SHIFT);
> + if (max_pfn > max_low_pfn)
> + max_low_pfn = max_pfn;
> +}
> +#else
> +static void __init early_x86_numa_init_mapping(void)
> +{
> + init_mem_mapping(0, max_low_pfn << PAGE_SHIFT);
> + early_ioremap_page_table_range_init();
> +}
> +#endif
> +
> void __init early_initmem_init(void)
> {
> early_x86_numa_init();
> +
> + early_x86_numa_init_mapping();
> +
> + load_cr3(swapper_pg_dir);
> + __flush_tlb_all();
> +
> + early_memtest(0, max_pfn_mapped<<PAGE_SHIFT);
> }
>
> void __init x86_numa_init(void)
> --
> 1.8.1.4
>
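
To make the reworked loop in init_mem_mapping() easier to follow, here is a
small user-space sketch of how it walks from real_end down to begin. It is a
simplification under stated assumptions: walk_top_down(), round_down_pow2()
and MODEL_PMD_SIZE are hypothetical names, the step size is kept fixed (the
patch grows it through get_new_step_size()), and the chunk starts are only
recorded where the kernel would call init_range_memory_mapping().

```c
#include <assert.h>

/* Assumed 2 MiB step, standing in for PMD_SIZE on x86_64. */
#define MODEL_PMD_SIZE (2UL << 20)

/* Round x down to a multiple of align; align must be a power of two. */
static unsigned long round_down_pow2(unsigned long x, unsigned long align)
{
	return x & ~(align - 1);
}

/*
 * Walk [begin, real_end) from the top down in step_size chunks and store
 * the start address of each chunk in starts[]. Returns the chunk count.
 */
static int walk_top_down(unsigned long begin, unsigned long real_end,
			 unsigned long step_size,
			 unsigned long *starts, int max_chunks)
{
	unsigned long last_start = real_end;
	unsigned long start;
	int n = 0;

	assert(step_size && !(step_size & (step_size - 1)));

	while (last_start > begin && n < max_chunks) {
		if (last_start > step_size) {
			start = round_down_pow2(last_start - 1, step_size);
			/* Never map below the caller's lower bound. */
			if (start < begin)
				start = begin;
		} else {
			start = begin;
		}
		/* The kernel maps [start, last_start) here. */
		starts[n++] = start;
		last_start = start;
	}
	return n;
}
```

Called with begin at 1 MiB and real_end at 8 MiB, the model produces four
chunks starting at 6 MiB, 4 MiB, 2 MiB and 1 MiB, so the final chunk is
clamped to begin rather than to ISA_END_ADDRESS as before the patch.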