From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932285Ab0GTSgb (ORCPT ); Tue, 20 Jul 2010 14:36:31 -0400
Received: from rcsinet10.oracle.com ([148.87.113.121]:27631 "EHLO
	rcsinet10.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754253Ab0GTSg3 (ORCPT );
	Tue, 20 Jul 2010 14:36:29 -0400
Message-ID: <4C45EC59.3030304@kernel.org>
Date: Tue, 20 Jul 2010 11:35:05 -0700
From: Yinghai Lu
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.10) Gecko/20100520 SUSE/3.0.5 Thunderbird/3.0.5
MIME-Version: 1.0
To: "H. Peter Anvin"
CC: Thomas Gleixner, Ingo Molnar, Tejun Heo, Andrew Morton,
	Denys Vlasenko, Lee Schermerhorn, linux-kernel@vger.kernel.org
Subject: [PATCH] x86, numa: fix boot without RAM on node0 again
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Source-IP: acsmt354.oracle.com [141.146.40.154]
X-Auth-Type: Internal IP
X-CT-RefId: str=0001.0A090202.4C45EC66.00B8,ss=1,fgs=0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

|commit e534c7c5f8d6e9fc46f57fab067c7e48d8ceb172
|Author: Lee Schermerhorn
|Date:   Wed May 26 14:44:58 2010 -0700
|
|    numa: x86_64: use generic percpu var numa_node_id() implementation
|
|    x86 arch specific changes to use generic numa_node_id() based on generic
|    percpu variable infrastructure.  Back out x86's custom version of
|    numa_node_id()

broke NUMA systems that have no RAM on node0 when MEMORY_HOTPLUG is
enabled, because cpu_up() calls cpu_to_node() before per_cpu(numa_node)
is set up for the APs.

When node0 has no RAM, x86 has already rounded its cpus to the nearest
node with RAM in x86_cpu_to_node_map, but per_cpu(numa_node) is not set
up for the APs until c_init runs. So when cpu_up() later calls
cpu_to_node(), it gets 0 again and brings node0 online even though there
is no RAM on it. As a result, none of the APs can be booted, and the
kernel panics later.

[ 1.611101] On node 0 totalpages: 0
.........
[ 2.608558] On node 0 totalpages: 0
[ 2.612065] Brought up 1 CPUs
[ 2.615199] Total of 1 processors activated (3990.31 BogoMIPS).
...
[ 93.225341] calling loop_init+0x0/0x1a4 @ 1
[ 93.229314] PERCPU: allocation failed, size=80 align=8, failed to populate
[ 93.246539] Pid: 1, comm: swapper Tainted: G W 2.6.35-rc4-tip-yh-04371-gd64e6c4-dirty #354
[ 93.264621] Call Trace:
[ 93.266533] [] pcpu_alloc+0x83a/0x8e7
[ 93.270710] [] __alloc_percpu+0x10/0x12
[ 93.285849] [] alloc_disk_node+0x94/0x16d
[ 93.291811] [] alloc_disk+0x11/0x13
[ 93.306157] [] loop_alloc+0xa7/0x180
[ 93.310538] [] loop_init+0x9b/0x1a4
[ 93.324909] [] ? loop_init+0x0/0x1a4
[ 93.329650] [] do_one_initcall+0x57/0x136
[ 93.345197] [] kernel_init+0x184/0x20e
[ 93.348146] [] kernel_thread_helper+0x4/0x10
[ 93.365194] [] ? restore_args+0x0/0x30
[ 93.369305] [] ? kernel_init+0x0/0x20e
[ 93.386011] [] ? kernel_thread_helper+0x0/0x10
[ 93.392047] loop: out of memory
...

Fix this by assigning per_cpu(numa_node) early, in
setup_per_cpu_areas(), for the boot cpu and all APs.

Signed-off-by: Yinghai

---
 arch/x86/kernel/setup_percpu.c |   17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

Index: linux-2.6/arch/x86/kernel/setup_percpu.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/setup_percpu.c
+++ linux-2.6/arch/x86/kernel/setup_percpu.c
@@ -238,6 +238,15 @@ void __init setup_per_cpu_areas(void)
 #ifdef CONFIG_NUMA
 		per_cpu(x86_cpu_to_node_map, cpu) =
 			early_per_cpu_map(x86_cpu_to_node_map, cpu);
+		/*
+		 * make sure boot cpu numa_node is right, when boot cpu is on
+		 * the node that doesn't have mem installed
+		 * also cpu_up() will call cpu_to_node() for APs when
+		 * MEMORY_HOTPLUG is defined, before per_cpu(numa_node) is set
+		 * up later with c_init aka intel_init/amd_init
+		 * So set them all (boot cpu and all APs)
+		 */
+		set_cpu_numa_node(cpu, early_cpu_to_node(cpu));
 #endif
 #endif
 		/*
@@ -257,14 +266,6 @@ void __init setup_per_cpu_areas(void)
 	early_per_cpu_ptr(x86_cpu_to_node_map) = NULL;
 #endif
 
-#if defined(CONFIG_X86_64) && defined(CONFIG_NUMA)
-	/*
-	 * make sure boot cpu numa_node is right, when boot cpu is on the
-	 * node that doesn't have mem installed
-	 */
-	set_cpu_numa_node(boot_cpu_id, early_cpu_to_node(boot_cpu_id));
-#endif
-
 	/* Setup node to cpumask map */
 	setup_node_to_cpumask_map();
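
The ordering problem this patch addresses can be illustrated with a small userspace model. This is only a sketch and not kernel code: the array sizes, the node numbering, and the helpers setup_per_cpu_areas_model() and node_seen_by_cpu_up() are hypothetical stand-ins chosen for illustration, and set_cpu_numa_node() and cpu_to_node() here merely mimic the kernel helpers of the same name. It assumes a four-cpu machine where node0 has no RAM and every cpu has already been rounded to node 1 in the early map.

```c
/* Userspace model only; names mirror the patch but nothing here is kernel code. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Assumed early map: node0 has no RAM, so all cpus were rounded to node 1. */
static const int early_cpu_to_node_map[NR_CPUS] = { 1, 1, 1, 1 };

/* Stand-in for per_cpu(numa_node): reads as 0 until explicitly set. */
static int numa_node[NR_CPUS];

static void set_cpu_numa_node(int cpu, int node)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	numa_node[cpu] = node;
}

static int cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return numa_node[cpu];
}

/*
 * Hypothetical model of setup_per_cpu_areas(). With set_all_cpus == false
 * only the boot cpu (cpu 0) is fixed up, as before the patch. With
 * set_all_cpus == true every cpu is fixed up, as after the patch.
 */
static void setup_per_cpu_areas_model(bool set_all_cpus)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		numa_node[cpu] = 0;	/* fresh per-cpu area */
		if (set_all_cpus || cpu == 0)
			set_cpu_numa_node(cpu, early_cpu_to_node_map[cpu]);
	}
}

/* Hypothetical model of the lookup cpu_up() does before the AP has run c_init. */
static int node_seen_by_cpu_up(int cpu)
{
	return cpu_to_node(cpu);
}
```

Calling setup_per_cpu_areas_model(false) and then node_seen_by_cpu_up(1) returns 0 in this model, which corresponds to cpu_up() picking the RAM-less node0 for an AP. After setup_per_cpu_areas_model(true), the same lookup returns 1 for every cpu.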