linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] cpu-hotplug,memory-hotplug: bug fix for offlining node
@ 2012-11-26 10:20 Wen Congyang
  2012-11-26 10:20 ` [PATCH 1/5] cpu_hotplug: clear apicid to node when the cpu is hotremoved Wen Congyang
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Wen Congyang @ 2012-11-26 10:20 UTC (permalink / raw)
  To: linux-kernel, linux-mm, linux-pm, linux-acpi, x86
  Cc: Yasuaki Ishimatsu, David Rientjes, Jiang Liu, Minchan Kim,
	KOSAKI Motohiro, Andrew Morton, Mel Gorman, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Tang Chen,
	Rafael J. Wysocki, Len Brown, Lai Jiangshan, Wen Congyang

This patchset is based on the following patchset:
https://lkml.org/lkml/2012/11/1/93

The following patch in mm tree can be dropped now:
    cpu_hotplug-unmap-cpu2node-when-the-cpu-is-hotremoved.patch

Tang Chen (1):
  Do not use cpu_to_node() to find an offlined cpu's node.

Wen Congyang (4):
  cpu_hotplug: clear apicid to node when the cpu is hotremoved
  memory-hotplug: export the function try_offline_node()
  cpu-hotplug, memory-hotplug: try offline the node when hotremoving a
    cpu
  cpu-hotplug,memory-hotplug: clear cpu_to_node() when offlining the
    node

 arch/x86/kernel/acpi/boot.c     |  4 ++++
 drivers/acpi/processor_driver.c |  2 ++
 include/linux/memory_hotplug.h  |  2 ++
 kernel/sched/core.c             | 28 +++++++++++++++++++---------
 mm/memory_hotplug.c             | 33 +++++++++++++++++++++++++++++++--
 5 files changed, 58 insertions(+), 11 deletions(-)

-- 
1.8.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/5] cpu_hotplug: clear apicid to node when the cpu is hotremoved
  2012-11-26 10:20 [PATCH 0/5] cpu-hotplug,memory-hotplug: bug fix for offlining node Wen Congyang
@ 2012-11-26 10:20 ` Wen Congyang
  2012-11-26 10:20 ` [PATCH 2/5] memory-hotplug: export the function try_offline_node() Wen Congyang
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Wen Congyang @ 2012-11-26 10:20 UTC (permalink / raw)
  To: linux-kernel, linux-mm, linux-pm, linux-acpi, x86
  Cc: Yasuaki Ishimatsu, David Rientjes, Jiang Liu, Minchan Kim,
	KOSAKI Motohiro, Andrew Morton, Mel Gorman, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Tang Chen,
	Rafael J. Wysocki, Len Brown, Lai Jiangshan, Wen Congyang

When a cpu is hotpluged, we call acpi_map_cpu2node() in _acpi_map_lsapic()
to store the cpu's node and apicid's node. But we don't clear the cpu's node
in acpi_unmap_lsapic() when this cpu is hotremove. If the node is also
hotremoved, we will get the following messages:
[ 1646.771485] kernel BUG at include/linux/gfp.h:329!
[ 1646.828729] invalid opcode: 0000 [#1] SMP
[ 1646.877872] Modules linked in: ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge stp llc sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun uinput iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm crc32c_intel microcode pcspkr i2c_i801 i2c_core lpc_ich mfd_core ioatdma e1000e i7core_edac edac_core sg acpi_memhotplug igb dca sd_mod crc_t10dif megaraid_sas mptsas mptscsih mptbase scsi_transport_sas scsi_mod
[ 1647.588773] Pid: 3126, comm: init Not tainted 3.6.0-rc3-tangchen-hostbridge+ #13 FUJITSU-SV PRIMEQUEST 1800E/SB
[ 1647.711545] RIP: 0010:[<ffffffff811bc3fd>]  [<ffffffff811bc3fd>] allocate_slab+0x28d/0x300
[ 1647.810492] RSP: 0018:ffff88078a049cf8  EFLAGS: 00010246
[ 1647.874028] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[ 1647.959339] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000246
[ 1648.044659] RBP: ffff88078a049d38 R08: 00000000000040d0 R09: 0000000000000001
[ 1648.129953] R10: 0000000000000000 R11: 0000000000000b5f R12: 00000000000052d0
[ 1648.215259] R13: ffff8807c1417300 R14: 0000000000030038 R15: 0000000000000003
[ 1648.300572] FS:  00007fa9b1b44700(0000) GS:ffff8807c3800000(0000) knlGS:0000000000000000
[ 1648.397272] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1648.465985] CR2: 00007fa9b09acca0 CR3: 000000078b855000 CR4: 00000000000007e0
[ 1648.551265] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1648.636565] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1648.721838] Process init (pid: 3126, threadinfo ffff88078a048000, task ffff8807bb6f2650)
[ 1648.818534] Stack:
[ 1648.842548]  ffff8807c39d7fa0 ffffffff000040d0 00000000000000bb 00000000000080d0
[ 1648.931469]  ffff8807c1417300 ffff8807c39d7fa0 ffff8807c1417300 0000000000000001
[ 1649.020410]  ffff88078a049d88 ffffffff811bc4a0 ffff8807c1410c80 0000000000000000
[ 1649.109464] Call Trace:
[ 1649.138713]  [<ffffffff811bc4a0>] new_slab+0x30/0x1b0
[ 1649.199075]  [<ffffffff811bc978>] __slab_alloc+0x358/0x4c0
[ 1649.264683]  [<ffffffff810b71c0>] ? alloc_fair_sched_group+0xd0/0x1b0
[ 1649.341695]  [<ffffffff811be7d4>] kmem_cache_alloc_node_trace+0xb4/0x1e0
[ 1649.421824]  [<ffffffff8109d188>] ? hrtimer_init+0x48/0x100
[ 1649.488414]  [<ffffffff810b71c0>] ? alloc_fair_sched_group+0xd0/0x1b0
[ 1649.565402]  [<ffffffff810b71c0>] alloc_fair_sched_group+0xd0/0x1b0
[ 1649.640297]  [<ffffffff810a8bce>] sched_create_group+0x3e/0x110
[ 1649.711040]  [<ffffffff810bdbcd>] sched_autogroup_create_attach+0x4d/0x180
[ 1649.793260]  [<ffffffff81089614>] sys_setsid+0xd4/0xf0
[ 1649.854694]  [<ffffffff8167a029>] system_call_fastpath+0x16/0x1b
[ 1649.926483] Code: 89 c4 e9 73 fe ff ff 31 c0 89 de 48 c7 c7 45 de 9e 81 44 89 45 c8 e8 22 05 4b 00 85 db 44 8b 45 c8 0f 89 4f ff ff ff 0f 0b eb fe <0f> 0b 90 eb fd 0f 0b eb fe 89 de 48 c7 c7 45 de 9e 81 31 c0 44
[ 1650.161454] RIP  [<ffffffff811bc3fd>] allocate_slab+0x28d/0x300
[ 1650.232348]  RSP <ffff88078a049cf8>
[ 1650.274029] ---[ end trace adf84c90f3fea3e5 ]---

The reason is that: the cpu's node is not NUMA_NO_NODE, we will call
alloc_pages_exact_node() to alloc memory on the node, but the node
is offlined.

If the node is onlined, we still need cpu's node. For example:
a task on the cpu is sleeped when the cpu is hotremoved. We will
choose another cpu to run this task when it is waked up. If we
know the cpu's node, we will choose the cpu on the same node first.
So we should clear cpu-to-node mapping when the node is offlined.

This patch only clears apicid-to-node mapping when the cpu is hotremoved.

Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 arch/x86/kernel/acpi/boot.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index e651f7a..f4030fe 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -691,6 +691,10 @@ EXPORT_SYMBOL(acpi_map_lsapic);
 
 int acpi_unmap_lsapic(int cpu)
 {
+#ifdef CONFIG_ACPI_NUMA
+	set_apicid_to_node(per_cpu(x86_cpu_to_apicid, cpu), NUMA_NO_NODE);
+#endif
+
 	per_cpu(x86_cpu_to_apicid, cpu) = -1;
 	set_cpu_present(cpu, false);
 	num_processors--;
-- 
1.8.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/5] memory-hotplug: export the function try_offline_node()
  2012-11-26 10:20 [PATCH 0/5] cpu-hotplug,memory-hotplug: bug fix for offlining node Wen Congyang
  2012-11-26 10:20 ` [PATCH 1/5] cpu_hotplug: clear apicid to node when the cpu is hotremoved Wen Congyang
@ 2012-11-26 10:20 ` Wen Congyang
  2012-11-26 10:20 ` [PATCH 3/5] cpu-hotplug, memory-hotplug: try offline the node when hotremoving a cpu Wen Congyang
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Wen Congyang @ 2012-11-26 10:20 UTC (permalink / raw)
  To: linux-kernel, linux-mm, linux-pm, linux-acpi, x86
  Cc: Yasuaki Ishimatsu, David Rientjes, Jiang Liu, Minchan Kim,
	KOSAKI Motohiro, Andrew Morton, Mel Gorman, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Tang Chen,
	Rafael J. Wysocki, Len Brown, Lai Jiangshan, Wen Congyang

The node will be offlined when all memory/cpu on the node
have been hotremoved. So we need the function try_offline_node()
in cpu-hotplug path.

If the memory-hotplug is disabled, and cpu-hotplug is enabled
1. no memory no the node
   we don't online the node, and cpu's node is the nearest node.
2. the node contains some memory
   the node has been onlined, and cpu's node is still needed
   to migrate the sleep task on the cpu to the same node.
So we do nothing in try_offline_node() in this case.

Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 include/linux/memory_hotplug.h | 2 ++
 mm/memory_hotplug.c            | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index ad2dd17..48ece75 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -187,6 +187,7 @@ extern void get_page_bootmem(unsigned long ingo, struct page *page,
 
 void lock_memory_hotplug(void);
 void unlock_memory_hotplug(void);
+extern void try_offline_node(int nid);
 
 #else /* ! CONFIG_MEMORY_HOTPLUG */
 /*
@@ -221,6 +222,7 @@ static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
 
 static inline void lock_memory_hotplug(void) {}
 static inline void unlock_memory_hotplug(void) {}
+static inline void try_offline_node(int nid) {}
 
 #endif /* ! CONFIG_MEMORY_HOTPLUG */
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 52db031..b7c30bb 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1670,7 +1670,7 @@ static int check_cpu_on_node(void *data)
 }
 
 /* offline the node if all memory sections of this node are removed */
-static void try_offline_node(int nid)
+void try_offline_node(int nid)
 {
 	pg_data_t *pgdat = NODE_DATA(nid);
 	unsigned long start_pfn = NODE_DATA(nid)->node_start_pfn;
@@ -1720,6 +1720,7 @@ static void try_offline_node(int nid)
 	arch_refresh_nodedata(nid, NULL);
 	arch_free_nodedata(pgdat);
 }
+EXPORT_SYMBOL(try_offline_node);
 
 int __ref remove_memory(int nid, u64 start, u64 size)
 {
-- 
1.8.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/5] cpu-hotplug, memory-hotplug: try offline the node when hotremoving a cpu
  2012-11-26 10:20 [PATCH 0/5] cpu-hotplug,memory-hotplug: bug fix for offlining node Wen Congyang
  2012-11-26 10:20 ` [PATCH 1/5] cpu_hotplug: clear apicid to node when the cpu is hotremoved Wen Congyang
  2012-11-26 10:20 ` [PATCH 2/5] memory-hotplug: export the function try_offline_node() Wen Congyang
@ 2012-11-26 10:20 ` Wen Congyang
  2012-11-26 10:20 ` [PATCH 4/5] cpu-hotplug,memory-hotplug: clear cpu_to_node() when offlining the node Wen Congyang
  2012-11-26 10:20 ` [PATCH 5/5] Do not use cpu_to_node() to find an offlined cpu's node Wen Congyang
  4 siblings, 0 replies; 6+ messages in thread
From: Wen Congyang @ 2012-11-26 10:20 UTC (permalink / raw)
  To: linux-kernel, linux-mm, linux-pm, linux-acpi, x86
  Cc: Yasuaki Ishimatsu, David Rientjes, Jiang Liu, Minchan Kim,
	KOSAKI Motohiro, Andrew Morton, Mel Gorman, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Tang Chen,
	Rafael J. Wysocki, Len Brown, Lai Jiangshan, Wen Congyang

The node will be offlined when all memory/cpu on the node is hotremoved.
So we should try offline the node when hotremoving a cpu on the node.

Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 drivers/acpi/processor_driver.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index a4352b8..7fc728c7 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -44,6 +44,7 @@
 #include <linux/moduleparam.h>
 #include <linux/cpuidle.h>
 #include <linux/slab.h>
+#include <linux/memory_hotplug.h>
 
 #include <asm/io.h>
 #include <asm/cpu.h>
@@ -634,6 +635,7 @@ static int acpi_processor_remove(struct acpi_device *device, int type)
 
 	per_cpu(processors, pr->id) = NULL;
 	per_cpu(processor_device_array, pr->id) = NULL;
+	try_offline_node(cpu_to_node(pr->id));
 
 free:
 	free_cpumask_var(pr->throttling.shared_cpu_map);
-- 
1.8.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 4/5] cpu-hotplug,memory-hotplug: clear cpu_to_node() when offlining the node
  2012-11-26 10:20 [PATCH 0/5] cpu-hotplug,memory-hotplug: bug fix for offlining node Wen Congyang
                   ` (2 preceding siblings ...)
  2012-11-26 10:20 ` [PATCH 3/5] cpu-hotplug, memory-hotplug: try offline the node when hotremoving a cpu Wen Congyang
@ 2012-11-26 10:20 ` Wen Congyang
  2012-11-26 10:20 ` [PATCH 5/5] Do not use cpu_to_node() to find an offlined cpu's node Wen Congyang
  4 siblings, 0 replies; 6+ messages in thread
From: Wen Congyang @ 2012-11-26 10:20 UTC (permalink / raw)
  To: linux-kernel, linux-mm, linux-pm, linux-acpi, x86
  Cc: Yasuaki Ishimatsu, David Rientjes, Jiang Liu, Minchan Kim,
	KOSAKI Motohiro, Andrew Morton, Mel Gorman, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Tang Chen,
	Rafael J. Wysocki, Len Brown, Lai Jiangshan, Wen Congyang

When the node is offlined, there is no memory/cpu on the node. If a
sleep task runs on a cpu of this node, it will be migrated to the
cpu on the other node. So we can clear cpu-to-node mapping.

Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 mm/memory_hotplug.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b7c30bb..5ae86d7 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1669,6 +1669,34 @@ static int check_cpu_on_node(void *data)
 	return 0;
 }
 
+static void unmap_cpu_on_node(void *data)
+{
+#ifdef CONFIG_ACPI_NUMA
+	struct pglist_data *pgdat = data;
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		if (cpu_to_node(cpu) == pgdat->node_id)
+			numa_clear_node(cpu);
+#endif
+}
+
+static int check_and_unmap_cpu_on_node(void *data)
+{
+	int ret = check_cpu_on_node(data);
+
+	if (ret)
+		return ret;
+
+	/*
+	 * the node will be offlined when we come here, so we can clear
+	 * the cpu_to_node() now.
+	 */
+
+	unmap_cpu_on_node(data);
+	return 0;
+}
+
 /* offline the node if all memory sections of this node are removed */
 void try_offline_node(int nid)
 {
@@ -1695,7 +1723,7 @@ void try_offline_node(int nid)
 		return;
 	}
 
-	if (stop_machine(check_cpu_on_node, NODE_DATA(nid), NULL))
+	if (stop_machine(check_and_unmap_cpu_on_node, NODE_DATA(nid), NULL))
 		return;
 
 	/*
-- 
1.8.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 5/5] Do not use cpu_to_node() to find an offlined cpu's node.
  2012-11-26 10:20 [PATCH 0/5] cpu-hotplug,memory-hotplug: bug fix for offlining node Wen Congyang
                   ` (3 preceding siblings ...)
  2012-11-26 10:20 ` [PATCH 4/5] cpu-hotplug,memory-hotplug: clear cpu_to_node() when offlining the node Wen Congyang
@ 2012-11-26 10:20 ` Wen Congyang
  4 siblings, 0 replies; 6+ messages in thread
From: Wen Congyang @ 2012-11-26 10:20 UTC (permalink / raw)
  To: linux-kernel, linux-mm, linux-pm, linux-acpi, x86
  Cc: Yasuaki Ishimatsu, David Rientjes, Jiang Liu, Minchan Kim,
	KOSAKI Motohiro, Andrew Morton, Mel Gorman, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Tang Chen,
	Rafael J. Wysocki, Len Brown, Lai Jiangshan, Wen Congyang

From: Tang Chen <tangchen@cn.fujitsu.com>

If a cpu is offline, its nid will be set to -1, and cpu_to_node(cpu) will
return -1. As a result, cpumask_of_node(nid) will return NULL. In this case,
find_next_bit() in for_each_cpu will get a NULL pointer and cause panic.

Here is a call trace:
[  609.824017] Call Trace:
[  609.824017]  <IRQ>
[  609.824017]  [<ffffffff810b0721>] select_fallback_rq+0x71/0x190
[  609.824017]  [<ffffffff810b086e>] ? try_to_wake_up+0x2e/0x2f0
[  609.824017]  [<ffffffff810b0b0b>] try_to_wake_up+0x2cb/0x2f0
[  609.824017]  [<ffffffff8109da08>] ? __run_hrtimer+0x78/0x320
[  609.824017]  [<ffffffff810b0b85>] wake_up_process+0x15/0x20
[  609.824017]  [<ffffffff8109ce62>] hrtimer_wakeup+0x22/0x30
[  609.824017]  [<ffffffff8109da13>] __run_hrtimer+0x83/0x320
[  609.824017]  [<ffffffff8109ce40>] ? update_rmtp+0x80/0x80
[  609.824017]  [<ffffffff8109df56>] hrtimer_interrupt+0x106/0x280
[  609.824017]  [<ffffffff810a72c8>] ? sd_free_ctl_entry+0x68/0x70
[  609.824017]  [<ffffffff8167cf39>] smp_apic_timer_interrupt+0x69/0x99
[  609.824017]  [<ffffffff8167be2f>] apic_timer_interrupt+0x6f/0x80

There is a hrtimer process sleeping, whose cpu has already been offlined.
When it is waken up, it tries to find another cpu to run, and get a -1 nid.
As a result, cpumask_of_node(-1) returns NULL, and causes ernel panic.

This patch fixes this problem by judging if the nid is -1.
If nid is not -1, a cpu on the same node will be picked.
Else, a online cpu on another node will be picked.

Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 kernel/sched/core.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2d8927f..4e6404e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1106,18 +1106,28 @@ EXPORT_SYMBOL_GPL(kick_process);
  */
 static int select_fallback_rq(int cpu, struct task_struct *p)
 {
-	const struct cpumask *nodemask = cpumask_of_node(cpu_to_node(cpu));
+	int nid = cpu_to_node(cpu);
+	const struct cpumask *nodemask = NULL;
 	enum { cpuset, possible, fail } state = cpuset;
 	int dest_cpu;
 
-	/* Look for allowed, online CPU in same node. */
-	for_each_cpu(dest_cpu, nodemask) {
-		if (!cpu_online(dest_cpu))
-			continue;
-		if (!cpu_active(dest_cpu))
-			continue;
-		if (cpumask_test_cpu(dest_cpu, tsk_cpus_allowed(p)))
-			return dest_cpu;
+	/*
+	 * If the node that the cpu is on has been offlined, cpu_to_node()
+	 * will return -1. There is no cpu on the node, and we should
+	 * select the cpu on the other node.
+	 */
+	if (nid != -1) {
+		nodemask = cpumask_of_node(nid);
+
+		/* Look for allowed, online CPU in same node. */
+		for_each_cpu(dest_cpu, nodemask) {
+			if (!cpu_online(dest_cpu))
+				continue;
+			if (!cpu_active(dest_cpu))
+				continue;
+			if (cpumask_test_cpu(dest_cpu, tsk_cpus_allowed(p)))
+				return dest_cpu;
+		}
 	}
 
 	for (;;) {
-- 
1.8.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-11-26 10:49 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-11-26 10:20 [PATCH 0/5] cpu-hotplug,memory-hotplug: bug fix for offlining node Wen Congyang
2012-11-26 10:20 ` [PATCH 1/5] cpu_hotplug: clear apicid to node when the cpu is hotremoved Wen Congyang
2012-11-26 10:20 ` [PATCH 2/5] memory-hotplug: export the function try_offline_node() Wen Congyang
2012-11-26 10:20 ` [PATCH 3/5] cpu-hotplug, memory-hotplug: try offline the node when hotremoving a cpu Wen Congyang
2012-11-26 10:20 ` [PATCH 4/5] cpu-hotplug,memory-hotplug: clear cpu_to_node() when offlining the node Wen Congyang
2012-11-26 10:20 ` [PATCH 5/5] Do not use cpu_to_node() to find an offlined cpu's node Wen Congyang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).