From: Yinghai Lu <yinghai@kernel.org>
To: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>,
Andrew Morton <akpm@linux-foundation.org>,
David Miller <davem@davemloft.net>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
Yinghai Lu <yinghai@kernel.org>
Subject: [PATCH 01/49] x86, numa: fix boot without RAM on node0 again
Date: Mon, 19 Jul 2010 16:56:10 -0700
Message-ID: <1279583818-14249-2-git-send-email-yinghai@kernel.org>
In-Reply-To: <1279583818-14249-1-git-send-email-yinghai@kernel.org>
|commit e534c7c5f8d6e9fc46f57fab067c7e48d8ceb172
|Author: Lee Schermerhorn <lee.schermerhorn@hp.com>
|Date: Wed May 26 14:44:58 2010 -0700
|
| numa: x86_64: use generic percpu var numa_node_id() implementation
|
| x86 arch specific changes to use generic numa_node_id() based on generic
| percpu variable infrastructure. Back out x86's custom version of
| numa_node_id()
broke NUMA systems that have no RAM on node 0 when MEMORY_HOTPLUG is enabled,
because cpu_up() calls cpu_to_node() before per_cpu(numa_node) is set up for
the APs.
When node 0 has no RAM, x86 already rounds its CPUs to the nearest node with
RAM in x86_cpu_to_node_map, but per_cpu(numa_node) is not set up for the APs
until c_init. So when cpu_up() later calls cpu_to_node(), it gets 0 again and
brings node 0 online even though there is no RAM on it. As a result none of
the APs can be booted, and the kernel panics later.
[ 1.611101] On node 0 totalpages: 0
.........
[ 2.608558] On node 0 totalpages: 0
[ 2.612065] Brought up 1 CPUs
[ 2.615199] Total of 1 processors activated (3990.31 BogoMIPS).
...
[ 93.225341] calling loop_init+0x0/0x1a4 @ 1
[ 93.229314] PERCPU: allocation failed, size=80 align=8, failed to populate
[ 93.246539] Pid: 1, comm: swapper Tainted: G W 2.6.35-rc4-tip-yh-04371-gd64e6c4-dirty #354
[ 93.264621] Call Trace:
[ 93.266533] [<ffffffff81125e43>] pcpu_alloc+0x83a/0x8e7
[ 93.270710] [<ffffffff81125f15>] __alloc_percpu+0x10/0x12
[ 93.285849] [<ffffffff8140786c>] alloc_disk_node+0x94/0x16d
[ 93.291811] [<ffffffff81407956>] alloc_disk+0x11/0x13
[ 93.306157] [<ffffffff81503e51>] loop_alloc+0xa7/0x180
[ 93.310538] [<ffffffff8277ef48>] loop_init+0x9b/0x1a4
[ 93.324909] [<ffffffff8277eead>] ? loop_init+0x0/0x1a4
[ 93.329650] [<ffffffff810001f2>] do_one_initcall+0x57/0x136
[ 93.345197] [<ffffffff827486d0>] kernel_init+0x184/0x20e
[ 93.348146] [<ffffffff81034954>] kernel_thread_helper+0x4/0x10
[ 93.365194] [<ffffffff81c7cc3c>] ? restore_args+0x0/0x30
[ 93.369305] [<ffffffff8274854c>] ? kernel_init+0x0/0x20e
[ 93.386011] [<ffffffff81034950>] ? kernel_thread_helper+0x0/0x10
[ 93.392047] loop: out of memory
...
Fix this by assigning per_cpu(numa_node) early, for the boot CPU and all APs.
Signed-off-by: Yinghai <yinghai@kernel.org>
---
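To make the ordering problem easier to see, here is a minimal userspace
sketch. It is not kernel code: the arrays and the sim_* helpers are
hypothetical stand-ins for x86_cpu_to_node_map, per_cpu(numa_node) and
cpu_to_node(), and it assumes four CPUs that have all been rounded from
memoryless node 0 to node 1.

```c
#include <assert.h>

#define NR_CPUS 4	/* hypothetical CPU count; cpu 0 is the boot CPU */

/* Stand-in for x86_cpu_to_node_map: filled in early, already rounded. */
static int early_map[NR_CPUS];
/* Stand-in for per_cpu(numa_node): the value cpu_to_node() reads. */
static int numa_node[NR_CPUS];

/* Model the state before setup: the early map is valid, numa_node is 0. */
static void sim_reset(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		early_map[cpu] = 1;	/* nearest node with RAM */
		numa_node[cpu] = 0;	/* not set up yet */
	}
}

/* Behaviour before the patch: only the boot CPU is fixed up. */
static void sim_setup_old(void)
{
	numa_node[0] = early_map[0];
}

/* Behaviour after the patch: the boot CPU and all APs are fixed up. */
static void sim_setup_new(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		numa_node[cpu] = early_map[cpu];
}

/* What a caller such as cpu_up() would see for a given CPU. */
static int sim_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return numa_node[cpu];
}

/* Count the APs that still resolve to the memoryless node 0. */
static int sim_aps_on_node0(void)
{
	int count = 0;

	for (int cpu = 1; cpu < NR_CPUS; cpu++)
		if (sim_cpu_to_node(cpu) == 0)
			count++;
	return count;
}
```

After sim_reset() and sim_setup_old(), sim_aps_on_node0() reports all three
APs on node 0, which is the case that brings the empty node online. After
sim_reset() and sim_setup_new() it reports none.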
arch/x86/kernel/setup_percpu.c | 17 +++++++++--------
1 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index de3b63a..b03959c 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -238,6 +238,15 @@ void __init setup_per_cpu_areas(void)
#ifdef CONFIG_NUMA
per_cpu(x86_cpu_to_node_map, cpu) =
early_per_cpu_map(x86_cpu_to_node_map, cpu);
+ /*
+ * Make sure the boot cpu's numa_node is right when the boot cpu
+ * is on a node that doesn't have memory installed.
+ * Also, cpu_up() will call cpu_to_node() for APs when
+ * MEMORY_HOTPLUG is defined, before per_cpu(numa_node) is set
+ * up later by c_init, aka intel_init/amd_init.
+ * So set them all here (boot cpu and all APs).
+ */
+ set_cpu_numa_node(cpu, early_cpu_to_node(cpu));
#endif
#endif
/*
@@ -257,14 +266,6 @@ void __init setup_per_cpu_areas(void)
early_per_cpu_ptr(x86_cpu_to_node_map) = NULL;
#endif
-#if defined(CONFIG_X86_64) && defined(CONFIG_NUMA)
- /*
- * make sure boot cpu numa_node is right, when boot cpu is on the
- * node that doesn't have mem installed
- */
- set_cpu_numa_node(boot_cpu_id, early_cpu_to_node(boot_cpu_id));
-#endif
-
/* Setup node to cpumask map */
setup_node_to_cpumask_map();
--
1.6.4.2