* [0/3, v4] CPU Hotplug Emulatation
@ 2010-11-26 4:19 shaohui.zheng
2010-11-26 4:19 ` [1/3, v4] CPU Hotplug Emulator: Abstract cpu register functions shaohui.zheng
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: shaohui.zheng @ 2010-11-26 4:19 UTC (permalink / raw)
To: akpm, gregkh, linux-mm
Cc: haicheng.li, xiyou.wangcong, shaohui.zheng, rientjes
The discussion of the NUMA Hotplug Emulator produced many suggestions on
node/memory hotplug emulation, and David Rientjes provided a more flexible
solution for node/memory hotplug. We appreciate his patches and accept his
numa=possible=<N> parameter.

For CPU hotplug emulation, there were no objections from the community, and
the CPU probe/release interface was considered useful. It has no obvious
dependency on node/memory hotplug and works well on its own, so I am sending
the CPU hotplug emulation patchset standalone. This should make the patch
review process faster.
--
Thanks & Regards,
Shaohui
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [1/3, v4] CPU Hotplug Emulator: Abstract cpu register functions
2010-11-26 4:19 [0/3, v4] CPU Hotplug Emulatation shaohui.zheng
@ 2010-11-26 4:19 ` shaohui.zheng
2010-11-26 4:20 ` [2/3, v4] CPU Hotplug Emulator: support cpu probe/release in x86_64 shaohui.zheng
2010-11-26 4:20 ` [3/3, v4] CPU Hotplug Emulator: Fake CPU socket with logical CPU on x86 shaohui.zheng
2 siblings, 0 replies; 4+ messages in thread
From: shaohui.zheng @ 2010-11-26 4:19 UTC (permalink / raw)
To: akpm, gregkh, linux-mm
Cc: haicheng.li, xiyou.wangcong, shaohui.zheng, rientjes, Paul Mundt
[-- Attachment #1: 004-hotplug-emulator-x86-abstract-cpu-register-functions.patch --]
[-- Type: text/plain, Size: 3655 bytes --]
From: Shaohui Zheng <shaohui.zheng@intel.com>
Abstract the cpu register functions and provide a more flexible interface,
register_cpu_node(). The new interface makes it convenient to add a cpu to a
specified node, so we can use it to add a cpu to a fake node.
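
As an illustration only, the wrapper pattern described above can be sketched
in userspace C. The struct layout, NR_CPUS, CPUS_PER_NODE and the cpu_to_node()
stub below are simplified assumptions made for this sketch, not the kernel
definitions; only the relationship between register_cpu() and
register_cpu_node() mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS       8	/* hypothetical limit for this sketch */
#define CPUS_PER_NODE 4	/* hypothetical topology: 4 cpus per node */

/* Simplified stand-in for the kernel's struct cpu. */
struct cpu {
	int node_id;
	int id;
};

/* Stand-in for cpu_to_node(): derive the "real" node from the cpu number. */
static int cpu_to_node(int num)
{
	return num / CPUS_PER_NODE;
}

/* General form: the caller chooses the node explicitly. */
static int register_cpu_node(struct cpu *cpu, int num, int nid)
{
	assert(cpu != NULL);

	if (num < 0 || num >= NR_CPUS || nid < 0)
		return -1;

	cpu->node_id = nid;
	cpu->id = num;
	return 0;
}

/* Compatibility wrapper: existing callers keep the two-argument form. */
static inline int register_cpu(struct cpu *cpu, int num)
{
	return register_cpu_node(cpu, num, cpu_to_node(num));
}
```

Keeping register_cpu() as a thin inline wrapper means existing callers need no
changes, while the emulator can pass an arbitrary (possibly fake) node id.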
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Shaohui Zheng <shaohui.zheng@intel.com>
---
Index: linux-hpe4/arch/x86/include/asm/cpu.h
===================================================================
--- linux-hpe4.orig/arch/x86/include/asm/cpu.h 2010-11-17 09:00:59.742608402 +0800
+++ linux-hpe4/arch/x86/include/asm/cpu.h 2010-11-17 09:01:10.192838977 +0800
@@ -27,6 +27,7 @@
#ifdef CONFIG_HOTPLUG_CPU
extern int arch_register_cpu(int num);
+extern int arch_register_cpu_node(int num, int nid);
extern void arch_unregister_cpu(int);
#endif
Index: linux-hpe4/arch/x86/kernel/topology.c
===================================================================
--- linux-hpe4.orig/arch/x86/kernel/topology.c 2010-11-17 09:01:01.053461766 +0800
+++ linux-hpe4/arch/x86/kernel/topology.c 2010-11-17 10:05:32.934085248 +0800
@@ -52,6 +52,15 @@
}
EXPORT_SYMBOL(arch_register_cpu);
+int __ref arch_register_cpu_node(int num, int nid)
+{
+ if (num)
+ per_cpu(cpu_devices, num).cpu.hotpluggable = 1;
+
+ return register_cpu_node(&per_cpu(cpu_devices, num).cpu, num, nid);
+}
+EXPORT_SYMBOL(arch_register_cpu_node);
+
void arch_unregister_cpu(int num)
{
unregister_cpu(&per_cpu(cpu_devices, num).cpu);
Index: linux-hpe4/drivers/base/cpu.c
===================================================================
--- linux-hpe4.orig/drivers/base/cpu.c 2010-11-17 09:01:01.053461766 +0800
+++ linux-hpe4/drivers/base/cpu.c 2010-11-17 10:05:32.943465010 +0800
@@ -208,17 +208,18 @@
static SYSDEV_CLASS_ATTR(offline, 0444, print_cpus_offline, NULL);
/*
- * register_cpu - Setup a sysfs device for a CPU.
+ * register_cpu_node - Setup a sysfs device for a CPU.
* @cpu - cpu->hotpluggable field set to 1 will generate a control file in
* sysfs for this CPU.
* @num - CPU number to use when creating the device.
+ * @nid - Node ID to use, if any.
*
* Initialize and register the CPU device.
*/
-int __cpuinit register_cpu(struct cpu *cpu, int num)
+int __cpuinit register_cpu_node(struct cpu *cpu, int num, int nid)
{
int error;
- cpu->node_id = cpu_to_node(num);
+ cpu->node_id = nid;
cpu->sysdev.id = num;
cpu->sysdev.cls = &cpu_sysdev_class;
@@ -229,7 +230,7 @@
if (!error)
per_cpu(cpu_sys_devices, num) = &cpu->sysdev;
if (!error)
- register_cpu_under_node(num, cpu_to_node(num));
+ register_cpu_under_node(num, nid);
#ifdef CONFIG_KEXEC
if (!error)
Index: linux-hpe4/include/linux/cpu.h
===================================================================
--- linux-hpe4.orig/include/linux/cpu.h 2010-11-17 09:00:59.772898926 +0800
+++ linux-hpe4/include/linux/cpu.h 2010-11-17 10:05:32.954085309 +0800
@@ -30,7 +30,13 @@
struct sys_device sysdev;
};
-extern int register_cpu(struct cpu *cpu, int num);
+extern int register_cpu_node(struct cpu *cpu, int num, int nid);
+
+static inline int register_cpu(struct cpu *cpu, int num)
+{
+ return register_cpu_node(cpu, num, cpu_to_node(num));
+}
+
extern struct sys_device *get_cpu_sysdev(unsigned cpu);
extern int cpu_add_sysdev_attr(struct sysdev_attribute *attr);
--
Thanks & Regards,
Shaohui
* [2/3, v4] CPU Hotplug Emulator: support cpu probe/release in x86_64
2010-11-26 4:19 [0/3, v4] CPU Hotplug Emulatation shaohui.zheng
2010-11-26 4:19 ` [1/3, v4] CPU Hotplug Emulator: Abstract cpu register functions shaohui.zheng
@ 2010-11-26 4:20 ` shaohui.zheng
2010-11-26 4:20 ` [3/3, v4] CPU Hotplug Emulator: Fake CPU socket with logical CPU on x86 shaohui.zheng
2 siblings, 0 replies; 4+ messages in thread
From: shaohui.zheng @ 2010-11-26 4:20 UTC (permalink / raw)
To: akpm, gregkh, linux-mm
Cc: haicheng.li, xiyou.wangcong, shaohui.zheng, rientjes, Ingo Molnar,
Len Brown, Yinghai Lu, Haicheng Li
[-- Attachment #1: 005-hotplug-emulator-x86-support-cpu-probe-release-in-x86.patch --]
[-- Type: text/plain, Size: 10983 bytes --]
From: Shaohui Zheng <shaohui.zheng@intel.com>
CPU physical hot-add/hot-remove is supported on some hardware, and the
current Linux kernel already supports it. The CPU Hotplug Emulator provides
a mechanism to emulate the process in software. It can be used for testing
or debugging purposes.

CPU physical hotplug is different from logical CPU online/offline. Logical
online/offline is controlled by the interface
/sys/devices/system/cpu/cpuX/online. The CPU hotplug emulator uses the
probe/release interface, which makes it possible to automate and stress-test
CPU hotplug.

Add the cpu probe/release interface under sysfs for x86_64. Users can use this
interface to emulate the CPU hot-add and hot-remove process.

Directive:
*) Reserve CPUs through a grub parameter such as:
	maxcpus=4
   The remaining CPUs will not be initialized.

*) Probe a CPU
   Use the probe interface to hot-add a new CPU:
	echo nid > /sys/devices/system/cpu/probe

*) Release a CPU
	echo cpu > /sys/devices/system/cpu/release

A reserved CPU will be hot-added to the specified node.
1) nid == 0: the CPU will be added to the real node that the CPU
   should be in.
2) nid != 0: the CPU is added to node nid even though it is a fake node.
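
For test automation, the echo commands above could be driven from a small C
helper. This is a minimal sketch, not part of the patch: write_id(),
cpu_probe() and cpu_release() are hypothetical names introduced here, and the
sketch assumes the probe/release files exist and the caller has permission to
write to them.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define CPU_PROBE_PATH   "/sys/devices/system/cpu/probe"
#define CPU_RELEASE_PATH "/sys/devices/system/cpu/release"

/*
 * Write a decimal id followed by a newline to a sysfs-style control file,
 * the equivalent of `echo id > path`.
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_id(const char *path, int id)
{
	FILE *f;
	int err = 0;

	assert(path != NULL);

	f = fopen(path, "w");
	if (!f)
		return -errno;

	if (fprintf(f, "%d\n", id) < 0)
		err = -EIO;

	/* Buffered data reaches the file at fclose(), so check it too. */
	if (fclose(f) != 0 && err == 0)
		err = -errno;

	return err;
}

/* Hot-add one reserved CPU to node nid (hypothetical helper). */
static int cpu_probe(int nid)
{
	return write_id(CPU_PROBE_PATH, nid);
}

/* Hot-remove the given CPU (hypothetical helper). */
static int cpu_release(int cpu)
{
	return write_id(CPU_RELEASE_PATH, cpu);
}
```

The trailing newline matters: arch_cpu_probe() in this patch rejects writes
shorter than two bytes, so a bare "0" without a newline would be refused.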
CC: Ingo Molnar <mingo@elte.hu>
CC: Len Brown <len.brown@intel.com>
CC: Yinghai Lu <Yinghai.Lu@Sun.COM>
Signed-off-by: Shaohui Zheng <shaohui.zheng@intel.com>
Signed-off-by: Haicheng Li <haicheng.li@intel.com>
---
Index: linux-hpe4/arch/x86/kernel/acpi/boot.c
===================================================================
--- linux-hpe4.orig/arch/x86/kernel/acpi/boot.c 2010-11-17 09:00:59.742608402 +0800
+++ linux-hpe4/arch/x86/kernel/acpi/boot.c 2010-11-17 09:01:10.202837209 +0800
@@ -647,8 +647,44 @@
}
EXPORT_SYMBOL(acpi_map_lsapic);
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+static void acpi_map_cpu2node_emu(int cpu, int physid, int nid)
+{
+#ifdef CONFIG_ACPI_NUMA
+#ifdef CONFIG_X86_64
+ apicid_to_node[physid] = nid;
+ numa_set_node(cpu, nid);
+#else /* CONFIG_X86_32 */
+ apicid_2_node[physid] = nid;
+ cpu_to_node_map[cpu] = nid;
+#endif
+#endif
+}
+
+static u16 cpu_to_apicid_saved[CONFIG_NR_CPUS];
+int __ref acpi_map_lsapic_emu(int pcpu, int nid)
+{
+ /* backup cpu apicid to array cpu_to_apicid_saved */
+ if (cpu_to_apicid_saved[pcpu] == 0 &&
+ per_cpu(x86_cpu_to_apicid, pcpu) != BAD_APICID)
+ cpu_to_apicid_saved[pcpu] = per_cpu(x86_cpu_to_apicid, pcpu);
+
+ per_cpu(x86_cpu_to_apicid, pcpu) = cpu_to_apicid_saved[pcpu];
+ acpi_map_cpu2node_emu(pcpu, per_cpu(x86_cpu_to_apicid, pcpu), nid);
+
+ return pcpu;
+}
+EXPORT_SYMBOL(acpi_map_lsapic_emu);
+#endif
+
int acpi_unmap_lsapic(int cpu)
{
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+ /* backup cpu apicid to array cpu_to_apicid_saved */
+ if (cpu_to_apicid_saved[cpu] == 0 &&
+ per_cpu(x86_cpu_to_apicid, cpu) != BAD_APICID)
+ cpu_to_apicid_saved[cpu] = per_cpu(x86_cpu_to_apicid, cpu);
+#endif
per_cpu(x86_cpu_to_apicid, cpu) = -1;
set_cpu_present(cpu, false);
num_processors--;
Index: linux-hpe4/arch/x86/kernel/smpboot.c
===================================================================
--- linux-hpe4.orig/arch/x86/kernel/smpboot.c 2010-11-17 09:00:59.753464132 +0800
+++ linux-hpe4/arch/x86/kernel/smpboot.c 2010-11-17 10:05:26.913464702 +0800
@@ -107,8 +107,6 @@
mutex_unlock(&x86_cpu_hotplug_driver_mutex);
}
-ssize_t arch_cpu_probe(const char *buf, size_t count) { return -1; }
-ssize_t arch_cpu_release(const char *buf, size_t count) { return -1; }
#else
static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
#define get_idle_for_cpu(x) (idle_thread_array[(x)])
Index: linux-hpe4/arch/x86/kernel/topology.c
===================================================================
--- linux-hpe4.orig/arch/x86/kernel/topology.c 2010-11-17 09:01:10.192838977 +0800
+++ linux-hpe4/arch/x86/kernel/topology.c 2010-11-17 10:05:26.924085712 +0800
@@ -30,6 +30,9 @@
#include <linux/init.h>
#include <linux/smp.h>
#include <asm/cpu.h>
+#include <linux/cpu.h>
+#include <linux/topology.h>
+#include <linux/acpi.h>
static DEFINE_PER_CPU(struct x86_cpu, cpu_devices);
@@ -66,6 +69,74 @@
unregister_cpu(&per_cpu(cpu_devices, num).cpu);
}
EXPORT_SYMBOL(arch_unregister_cpu);
+
+ssize_t arch_cpu_probe(const char *buf, size_t count)
+{
+ int nid = 0;
+ int num = 0, selected = 0;
+
+ /* check parameters */
+ if (!buf || count < 2)
+ return -EPERM;
+
+ nid = simple_strtoul(buf, NULL, 0);
+ printk(KERN_DEBUG "Add a cpu to node : %d\n", nid);
+
+ if (nid < 0 || nid > nr_node_ids - 1) {
+ printk(KERN_ERR "Invalid NUMA node id: %d (0 <= nid < %d).\n",
+ nid, nr_node_ids);
+ return -EPERM;
+ }
+
+ if (!node_online(nid)) {
+ printk(KERN_ERR "NUMA node %d is not online, give up.\n", nid);
+ return -EPERM;
+ }
+
+ /* find first uninitialized cpu */
+ for_each_present_cpu(num) {
+ if (per_cpu(cpu_sys_devices, num) == NULL) {
+ selected = num;
+ break;
+ }
+ }
+
+ if (selected >= num_possible_cpus()) {
+ printk(KERN_ERR "No free cpu, give up cpu probing.\n");
+ return -EPERM;
+ }
+
+ /* register cpu */
+ arch_register_cpu_node(selected, nid);
+ acpi_map_lsapic_emu(selected, nid);
+
+ return count;
+}
+EXPORT_SYMBOL(arch_cpu_probe);
+
+ssize_t arch_cpu_release(const char *buf, size_t count)
+{
+ int cpu = 0;
+
+ cpu = simple_strtoul(buf, NULL, 0);
+ /* cpu 0 is not hotpluggable */
+ if (cpu == 0) {
+ printk(KERN_ERR "cannot release cpu 0.\n");
+ return -EPERM;
+ }
+
+ if (cpu_online(cpu)) {
+ printk(KERN_DEBUG "offline cpu %d.\n", cpu);
+ cpu_down(cpu);
+ }
+
+ arch_unregister_cpu(cpu);
+ acpi_unmap_lsapic(cpu);
+
+ return count;
+}
+EXPORT_SYMBOL(arch_cpu_release);
+
#else /* CONFIG_HOTPLUG_CPU */
static int __init arch_register_cpu(int num)
@@ -83,8 +154,14 @@
register_one_node(i);
#endif
- for_each_present_cpu(i)
- arch_register_cpu(i);
+ /*
+ * When cpu hotplug emulation is enabled, register only the online cpus;
+ * the rest are reserved for cpu probe.
+ */
+ for_each_present_cpu(i) {
+ if ((cpu_hpe_on && cpu_online(i)) || !cpu_hpe_on)
+ arch_register_cpu(i);
+ }
return 0;
}
Index: linux-hpe4/arch/x86/mm/numa_64.c
===================================================================
--- linux-hpe4.orig/arch/x86/mm/numa_64.c 2010-11-17 09:01:10.132837502 +0800
+++ linux-hpe4/arch/x86/mm/numa_64.c 2010-11-17 09:01:10.202837209 +0800
@@ -12,6 +12,7 @@
#include <linux/module.h>
#include <linux/nodemask.h>
#include <linux/sched.h>
+#include <linux/cpu.h>
#include <asm/e820.h>
#include <asm/proto.h>
@@ -915,6 +916,19 @@
}
#endif
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+static __init int cpu_hpe_setup(char *opt)
+{
+ if (!opt)
+ return -EINVAL;
+
+ if (!strncmp(opt, "on", 2) || !strncmp(opt, "1", 1))
+ cpu_hpe_on = 1;
+
+ return 0;
+}
+early_param("cpu_hpe", cpu_hpe_setup);
+#endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */
void __cpuinit numa_set_node(int cpu, int node)
{
Index: linux-hpe4/drivers/acpi/processor_driver.c
===================================================================
--- linux-hpe4.orig/drivers/acpi/processor_driver.c 2010-11-17 09:00:59.765335724 +0800
+++ linux-hpe4/drivers/acpi/processor_driver.c 2010-11-17 09:01:10.212839478 +0800
@@ -530,6 +530,14 @@
goto err_free_cpumask;
sysdev = get_cpu_sysdev(pr->id);
+ /*
+ * Reserve the cpu for hotplug emulation; a reserved cpu can be hot-added
+ * through the cpu probe interface. Return directly.
+ */
+ if (sysdev == NULL) {
+ goto out;
+ }
+
if (sysfs_create_link(&device->dev.kobj, &sysdev->kobj, "sysdev")) {
result = -EFAULT;
goto err_remove_fs;
@@ -570,6 +578,7 @@
goto err_remove_sysfs;
}
+out:
return 0;
err_remove_sysfs:
Index: linux-hpe4/drivers/base/cpu.c
===================================================================
--- linux-hpe4.orig/drivers/base/cpu.c 2010-11-17 09:01:10.192838977 +0800
+++ linux-hpe4/drivers/base/cpu.c 2010-11-17 09:01:10.212839478 +0800
@@ -22,9 +22,15 @@
};
EXPORT_SYMBOL(cpu_sysdev_class);
-static DEFINE_PER_CPU(struct sys_device *, cpu_sys_devices);
+DEFINE_PER_CPU(struct sys_device *, cpu_sys_devices);
#ifdef CONFIG_HOTPLUG_CPU
+/*
+ * cpu_hpe_on is a switch to enable/disable cpu hotplug emulation. It is
+ * disabled by default; enable it through the boot parameter cpu_hpe=on.
+ */
+int cpu_hpe_on;
+
static ssize_t show_online(struct sys_device *dev, struct sysdev_attribute *attr,
char *buf)
{
Index: linux-hpe4/include/linux/acpi.h
===================================================================
--- linux-hpe4.orig/include/linux/acpi.h 2010-11-17 09:00:59.772898926 +0800
+++ linux-hpe4/include/linux/acpi.h 2010-11-17 09:01:10.212839478 +0800
@@ -102,6 +102,7 @@
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/* Arch dependent functions for cpu hotplug support */
int acpi_map_lsapic(acpi_handle handle, int *pcpu);
+int acpi_map_lsapic_emu(int pcpu, int nid);
int acpi_unmap_lsapic(int cpu);
#endif /* CONFIG_ACPI_HOTPLUG_CPU */
Index: linux-hpe4/include/linux/cpu.h
===================================================================
--- linux-hpe4.orig/include/linux/cpu.h 2010-11-17 09:01:10.192838977 +0800
+++ linux-hpe4/include/linux/cpu.h 2010-11-17 09:01:10.212839478 +0800
@@ -30,6 +30,8 @@
struct sys_device sysdev;
};
+DECLARE_PER_CPU(struct sys_device *, cpu_sys_devices);
+
extern int register_cpu_node(struct cpu *cpu, int num, int nid);
static inline int register_cpu(struct cpu *cpu, int num)
@@ -149,6 +151,7 @@
#define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
#define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
int cpu_down(unsigned int cpu);
+extern int cpu_hpe_on;
#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
extern void cpu_hotplug_driver_lock(void);
@@ -171,6 +174,7 @@
/* These aren't inline functions due to a GCC bug. */
#define register_hotcpu_notifier(nb) ({ (void)(nb); 0; })
#define unregister_hotcpu_notifier(nb) ({ (void)(nb); })
+static int cpu_hpe_on;
#endif /* CONFIG_HOTPLUG_CPU */
#ifdef CONFIG_PM_SLEEP_SMP
Index: linux-hpe4/Documentation/x86/x86_64/boot-options.txt
===================================================================
--- linux-hpe4.orig/Documentation/x86/x86_64/boot-options.txt 2010-11-26 12:49:44.847725099 +0800
+++ linux-hpe4/Documentation/x86/x86_64/boot-options.txt 2010-11-26 12:55:50.527724999 +0800
@@ -316,3 +316,9 @@
Do not use GB pages for kernel direct mappings.
gbpages
Use GB pages for kernel direct mappings.
+ cpu_hpe=on/off
+ Enable/disable cpu hotplug emulation in software. When cpu_hpe=on,
+ sysfs provides a probe/release interface to hot-add/remove cpus dynamically.
+ Use maxcpus=<N> to reserve CPUs.
+ This option is disabled by default.
+
--
Thanks & Regards,
Shaohui
* [3/3, v4] CPU Hotplug Emulator: Fake CPU socket with logical CPU on x86
2010-11-26 4:19 [0/3, v4] CPU Hotplug Emulatation shaohui.zheng
2010-11-26 4:19 ` [1/3, v4] CPU Hotplug Emulator: Abstract cpu register functions shaohui.zheng
2010-11-26 4:20 ` [2/3, v4] CPU Hotplug Emulator: support cpu probe/release in x86_64 shaohui.zheng
@ 2010-11-26 4:20 ` shaohui.zheng
2 siblings, 0 replies; 4+ messages in thread
From: shaohui.zheng @ 2010-11-26 4:20 UTC (permalink / raw)
To: akpm, gregkh, linux-mm
Cc: haicheng.li, xiyou.wangcong, shaohui.zheng, rientjes,
Sam Ravnborg, Haicheng Li
[-- Attachment #1: 006-hotplug-emulator-fake_socket_with_logic_cpu_on_x86.patch --]
[-- Type: text/plain, Size: 7989 bytes --]
From: Shaohui Zheng <shaohui.zheng@intel.com>
When hotplugging a CPU with the emulator, we use a logical CPU to emulate the
CPU hotplug process. On CPUs that support SMT, some logical CPUs share the
same socket, but with the emulator they may end up in different NUMA nodes.
This misleads the scheduler into building an incorrect scheduling domain
hierarchy, and it causes the following call trace when rebalancing the
scheduling domains:
divide error: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu8/online
CPU 0
Modules linked in: fbcon tileblit font bitblit softcursor radeon ttm drm_kms_helper e1000e usbhid via_rhine mii drm i2c_algo_bit igb dca
Pid: 0, comm: swapper Not tainted 2.6.32hpe #78 X8DTN
RIP: 0010:[<ffffffff81051da5>] [<ffffffff81051da5>] find_busiest_group+0x6c5/0xa10
RSP: 0018:ffff880028203c30 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000015ac0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff880277e8cfa0 RDI: 0000000000000000
RBP: ffff880028203dc0 R08: ffff880277e8cfa0 R09: 0000000000000040
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f16cfc85770 CR3: 0000000001001000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff81822000, task ffffffff8184a600)
Stack:
ffff880028203d60 ffff880028203cd0 ffff8801c204ff08 ffff880028203e38
<0> 0101ffff81018c59 ffff880028203e44 00000001810806bd ffff8801c204fe00
<0> 0000000528200000 ffffffff00000000 0000000000000018 0000000000015ac0
Call Trace:
<IRQ>
[<ffffffff81088ee0>] ? tick_dev_program_event+0x40/0xd0
[<ffffffff81053b2c>] rebalance_domains+0x17c/0x570
[<ffffffff81018c89>] ? read_tsc+0x9/0x20
[<ffffffff81088ee0>] ? tick_dev_program_event+0x40/0xd0
[<ffffffff810569ed>] run_rebalance_domains+0xbd/0xf0
[<ffffffff8106471f>] __do_softirq+0xaf/0x1e0
[<ffffffff810b7d18>] ? handle_IRQ_event+0x58/0x160
[<ffffffff810130ac>] call_softirq+0x1c/0x30
[<ffffffff81014a85>] do_softirq+0x65/0xa0
[<ffffffff810645cd>] irq_exit+0x7d/0x90
[<ffffffff81013ff0>] do_IRQ+0x70/0xe0
[<ffffffff810128d3>] ret_from_intr+0x0/0x11
<EOI>
[<ffffffff8133387f>] ? acpi_idle_enter_bm+0x281/0x2b5
[<ffffffff81333878>] ? acpi_idle_enter_bm+0x27a/0x2b5
[<ffffffff8145dc8f>] ? cpuidle_idle_call+0x9f/0x130
[<ffffffff81010e2b>] ? cpu_idle+0xab/0x100
[<ffffffff8158aee6>] ? rest_init+0x66/0x70
[<ffffffff81905d90>] ? start_kernel+0x3e3/0x3ef
[<ffffffff8190533a>] ? x86_64_start_reservations+0x125/0x129
[<ffffffff81905438>] ? x86_64_start_kernel+0xfa/0x109
Code: 00 00 e9 4c fb ff ff 0f 1f 80 00 00 00 00 48 8b b5 d8 fe ff ff 48 8b 45 a8 4d 29 ef 8b 56 08 48 c1 e0 0a 49 89 f0 48 89 d7 31 d2 <48> f7 f7 31 d2 48 89 45 a0 8b 76 08 4c 89 f0 48 c1 e0 0a 48 f7
RIP [<ffffffff81051da5>] find_busiest_group+0x6c5/0xa10
RSP <ffff880028203c30>
Solution:
We put the logical CPU into a fake CPU socket and assign it a unique
phys_proc_id. Each fake socket contains exactly one logical CPU. This
method fixes the above bug.
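
The id selection can be sketched in userspace C as follows. This is an
illustration under simplifying assumptions: struct cpu_topo is a hypothetical
stand-in for the relevant fields of struct cpuinfo_x86, and a plain array
replaces the kernel's per-cpu data and for_each_present_cpu() iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the topology fields of struct cpuinfo_x86. */
struct cpu_topo {
	int present;
	int phys_proc_id;
	int cpu_core_id;
	int cpu_probe_on;
};

/*
 * Move logical cpu `cpu` into a fake socket of its own: take the largest
 * phys_proc_id among the present cpus, add one, and force core id 0.
 */
static void fake_cpu_socket_info(struct cpu_topo *cpus, size_t n, size_t cpu)
{
	size_t i;
	int max_id = 0;

	assert(cpus != NULL && cpu < n);

	for (i = 0; i < n; i++) {
		if (cpus[i].present && cpus[i].phys_proc_id > max_id)
			max_id = cpus[i].phys_proc_id;
	}

	cpus[cpu].phys_proc_id = max_id + 1;	/* unused socket id */
	cpus[cpu].cpu_core_id = 0;		/* single core in that socket */
	cpus[cpu].cpu_probe_on = 1;
}
```

Because each probed CPU gets an id one above the current maximum, two probed
CPUs never share a socket, so the scheduler does not treat them as siblings.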
CC: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Haicheng Li <haicheng.li@intel.com>
Signed-off-by: Shaohui Zheng <shaohui.zheng@intel.com>
---
Index: linux-hpe4/arch/x86/include/asm/processor.h
===================================================================
--- linux-hpe4.orig/arch/x86/include/asm/processor.h 2010-11-17 09:00:51.354100239 +0800
+++ linux-hpe4/arch/x86/include/asm/processor.h 2010-11-17 09:01:10.222837594 +0800
@@ -113,6 +113,15 @@
/* Index into per_cpu list: */
u16 cpu_index;
#endif
+
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+ /*
+ * Use a logical cpu to emulate a physical cpu's hotplug. We put the
+ * logical cpu into a fake socket, assign a fake physical id to it,
+ * and create a fake core.
+ */
+ __u8 cpu_probe_on; /* A flag to enable cpu probe/release */
+#endif
} __attribute__((__aligned__(SMP_CACHE_BYTES)));
#define X86_VENDOR_INTEL 0
Index: linux-hpe4/arch/x86/kernel/smpboot.c
===================================================================
--- linux-hpe4.orig/arch/x86/kernel/smpboot.c 2010-11-17 09:01:10.202837209 +0800
+++ linux-hpe4/arch/x86/kernel/smpboot.c 2010-11-17 09:01:10.222837594 +0800
@@ -97,6 +97,7 @@
*/
static DEFINE_MUTEX(x86_cpu_hotplug_driver_mutex);
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
void cpu_hotplug_driver_lock()
{
mutex_lock(&x86_cpu_hotplug_driver_mutex);
@@ -106,6 +107,7 @@
{
mutex_unlock(&x86_cpu_hotplug_driver_mutex);
}
+#endif
#else
static struct task_struct *idle_thread_array[NR_CPUS] __cpuinitdata ;
@@ -198,6 +200,8 @@
{
int cpuid, phys_id;
unsigned long timeout;
+ u8 cpu_probe_on = 0;
+ struct cpuinfo_x86 *c;
/*
* If waken up by an INIT in an 82489DX configuration
@@ -277,7 +281,20 @@
/*
* Save our processor parameters
*/
+ c = &cpu_data(cpuid);
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+ cpu_probe_on = c->cpu_probe_on;
+ phys_id = c->phys_proc_id;
+#endif
+
smp_store_cpu_info(cpuid);
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+ if (cpu_probe_on) {
+ c->phys_proc_id = phys_id; /* restore the fake phys_proc_id */
+ c->cpu_core_id = 0; /* force the logical cpu to core 0 */
+ c->cpu_probe_on = cpu_probe_on;
+ }
+#endif
notify_cpu_starting(cpuid);
@@ -400,6 +417,11 @@
{
int i;
struct cpuinfo_x86 *c = &cpu_data(cpu);
+ int cpu_probe_on = 0;
+
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+ cpu_probe_on = c->cpu_probe_on;
+#endif
cpumask_set_cpu(cpu, cpu_sibling_setup_mask);
@@ -431,7 +453,8 @@
for_each_cpu(i, cpu_sibling_setup_mask) {
if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
- per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
+ per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i) &&
+ cpu_probe_on == 0) {
cpumask_set_cpu(i, c->llc_shared_map);
cpumask_set_cpu(cpu, cpu_data(i).llc_shared_map);
}
Index: linux-hpe4/arch/x86/kernel/topology.c
===================================================================
--- linux-hpe4.orig/arch/x86/kernel/topology.c 2010-11-17 09:01:10.202837209 +0800
+++ linux-hpe4/arch/x86/kernel/topology.c 2010-11-17 09:01:10.222837594 +0800
@@ -70,6 +70,36 @@
}
EXPORT_SYMBOL(arch_unregister_cpu);
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+/*
+ * Put the logical cpu into a new socket, and encapsulate it into core 0.
+ */
+static void fake_cpu_socket_info(int cpu)
+{
+ struct cpuinfo_x86 *c = &cpu_data(cpu);
+ int i, phys_id = 0;
+
+ /* calculate the max phys_id */
+ for_each_present_cpu(i) {
+ struct cpuinfo_x86 *c = &cpu_data(i);
+ if (phys_id < c->phys_proc_id)
+ phys_id = c->phys_proc_id;
+ }
+
+ c->phys_proc_id = phys_id + 1; /* pick an unused phys_proc_id */
+ c->cpu_core_id = 0; /* always put the logical cpu to core 0 */
+ c->cpu_probe_on = 1;
+}
+
+static void clear_cpu_socket_info(int cpu)
+{
+ struct cpuinfo_x86 *c = &cpu_data(cpu);
+ c->phys_proc_id = 0;
+ c->cpu_core_id = 0;
+ c->cpu_probe_on = 0;
+}
+
+
ssize_t arch_cpu_probe(const char *buf, size_t count)
{
int nid = 0;
@@ -109,6 +139,7 @@
/* register cpu */
arch_register_cpu_node(selected, nid);
acpi_map_lsapic_emu(selected, nid);
+ fake_cpu_socket_info(selected);
return count;
}
@@ -132,10 +163,13 @@
arch_unregister_cpu(cpu);
acpi_unmap_lsapic(cpu);
+ clear_cpu_socket_info(cpu);
+ set_cpu_present(cpu, true);
return count;
}
EXPORT_SYMBOL(arch_cpu_release);
+#endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */
#else /* CONFIG_HOTPLUG_CPU */
--
Thanks & Regards,
Shaohui