* Re: [RFC,5/7] NUMA hotplug emulator
2010-05-13 11:56 [RFC,5/7] NUMA hotplug emulator Shaohui Zheng
@ 2010-05-07 14:11 ` Pavel Machek
2010-05-16 17:45 ` Paul E. McKenney
` (2 more replies)
2010-05-13 12:11 ` Jean Delvare
1 sibling, 3 replies; 11+ messages in thread
From: Pavel Machek @ 2010-05-07 14:11 UTC (permalink / raw)
To: akpm, linux-mm, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
Andi Kleen, Hidetoshi Seto, Len Brown, Rafael J. Wysocki,
Yinghai Lu, Thomas Renninger, David Rientjes, Mel Gorman,
Venkatesh Pallipadi, Alex Chiang, Tejun Heo, Christoph Lameter,
Greg Kroah-Hartman, Stephen Rothwell, Benjamin Herrenschmidt,
Shaohua Li, Jean Delvare, Hugh Dickins, James Bottomley,
Paul E. McKenney, linux-kernel, linux-pm, linux-acpi,
fengguang.wu, haicheng.li, shaohui.zheng
Hi!
> hotplug emulator: Abstract cpu register functions
>
> Abstract function arch_register_cpu and register_cpu, move the implementation
> details to a sub function with prefix "__".
>
> each of the sub function has an extra parameter nid, it can be used to register
> CPU under a fake NUMA node, it is a reserved interface for cpu hotplug emulation
> (CPU PROBE/RELEASE) in x86.
I don't get it. CPU hotplug can already be tested using echo 0/1 >
online, and that works on 386. How is this different?
It seems to add some numa magic. Why is it important?
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 11+ messages in thread
* [RFC,5/7] NUMA hotplug emulator
@ 2010-05-13 11:56 Shaohui Zheng
2010-05-07 14:11 ` Pavel Machek
2010-05-13 12:11 ` Jean Delvare
0 siblings, 2 replies; 11+ messages in thread
From: Shaohui Zheng @ 2010-05-13 11:56 UTC (permalink / raw)
To: akpm, linux-mm
Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Andi Kleen,
Hidetoshi Seto, Len Brown, Pavel Machek, Rafael J. Wysocki,
Yinghai Lu, Thomas Renninger, David Rientjes, Mel Gorman,
Venkatesh Pallipadi, Alex Chiang, Tejun Heo, Christoph Lameter,
Greg Kroah-Hartman, Stephen Rothwell, Benjamin Herrenschmidt,
Shaohua Li, Jean Delvare, Hugh Dickins, James Bottomley,
Paul E. McKenney, linux-kernel, linux-pm, linux-acpi,
fengguang.wu, haicheng.li, shaohui.zheng
[-- Attachment #1: Type: text/plain, Size: 3065 bytes --]
hotplug emulator: Abstract cpu register functions
Abstract the functions arch_register_cpu and register_cpu by moving their
implementation details into helper functions with the prefix "__".
Each helper takes an extra parameter, nid, which can be used to register a
CPU under a fake NUMA node. It is a reserved interface for CPU hotplug
emulation (CPU PROBE/RELEASE) on x86.
Signed-off-by: Shaohui Zheng <shaohui.zheng@intel.com>
Signed-off-by: Haicheng Li <haicheng.li@intel.com>
---
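Editorial sketch, not part of the patch: the split described above can be modelled in user space roughly as follows. The names struct cpu_model, model_cpu_to_node(), model_register_cpu_nid() and model_register_cpu() are hypothetical stand-ins for struct cpu, cpu_to_node(), __register_cpu() and register_cpu(). The two-CPUs-per-node mapping is an assumption made only so the example has something to compute.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the kernel's struct cpu. */
struct cpu_model {
	int node_id;	/* NUMA node the CPU is registered under */
	int id;		/* CPU number */
};

/* Hypothetical stand-in for cpu_to_node(): assume two CPUs per node. */
static int model_cpu_to_node(int num)
{
	return num / 2;
}

/*
 * Plays the role of __register_cpu(): does the actual work and lets the
 * caller decide which node the CPU is registered under.
 */
static int model_register_cpu_nid(struct cpu_model *cpu, int num, int nid)
{
	assert(cpu != NULL);
	cpu->node_id = nid;
	cpu->id = num;
	return 0;
}

/*
 * Plays the role of register_cpu(): keeps the old two-argument signature
 * and the old behaviour by passing the CPU's real node to the helper.
 */
int model_register_cpu(struct cpu_model *cpu, int num)
{
	return model_register_cpu_nid(cpu, num, model_cpu_to_node(num));
}
```

Existing callers keep using the two-argument function and see no change, while an emulation path can call the helper directly with a node of its choosing.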
diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c
index 7e45159..f716cd9 100644
--- a/arch/x86/kernel/topology.c
+++ b/arch/x86/kernel/topology.c
@@ -34,7 +34,11 @@
static DEFINE_PER_CPU(struct x86_cpu, cpu_devices);
#ifdef CONFIG_HOTPLUG_CPU
-int __ref arch_register_cpu(int num)
+/*
+ * Add nid (NUMA node id) as a parameter for CPU hotplug emulation. It allows
+ * a CPU to be registered to any node.
+ */
+static int __ref __arch_register_cpu(int num, int nid)
{
/*
* CPU0 cannot be offlined due to several
@@ -50,6 +54,11 @@ int __ref arch_register_cpu(int num)
return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
}
+
+int __ref arch_register_cpu(int num)
+{
+ return __arch_register_cpu(num, NUMA_NO_NODE);
+}
EXPORT_SYMBOL(arch_register_cpu);
void arch_unregister_cpu(int num)
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index f35719a..4aca9e3 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -208,17 +208,20 @@ static ssize_t print_cpus_offline(struct sysdev_class *class,
static SYSDEV_CLASS_ATTR(offline, 0444, print_cpus_offline, NULL);
/*
- * register_cpu - Setup a sysfs device for a CPU.
+ * __register_cpu - Initialize and register the CPU device.
+ *
* @cpu - cpu->hotpluggable field set to 1 will generate a control file in
* sysfs for this CPU.
* @num - CPU number to use when creating the device.
+ * @nid - numa node id
*
- * Initialize and register the CPU device.
+ * The nid is not calculated with cpu_to_node() here; it is passed in as a
+ * parameter instead. This is a reserved interface for CPU hotplug emulation.
*/
-int __cpuinit register_cpu(struct cpu *cpu, int num)
+static int __cpuinit __register_cpu(struct cpu *cpu, int num, int nid)
{
int error;
- cpu->node_id = cpu_to_node(num);
+ cpu->node_id = nid;
cpu->sysdev.id = num;
cpu->sysdev.cls = &cpu_sysdev_class;
@@ -229,7 +232,7 @@ int __cpuinit register_cpu(struct cpu *cpu, int num)
if (!error)
per_cpu(cpu_sys_devices, num) = &cpu->sysdev;
if (!error)
- register_cpu_under_node(num, cpu_to_node(num));
+ register_cpu_under_node(num, nid);
#ifdef CONFIG_KEXEC
if (!error)
@@ -238,6 +241,15 @@ int __cpuinit register_cpu(struct cpu *cpu, int num)
return error;
}
+/*
+ * register_cpu - Setup a sysfs device for a CPU.
+ * Initialize and register the CPU device.
+ */
+int __cpuinit register_cpu(struct cpu *cpu, int num)
+{
+ return __register_cpu(cpu, num, cpu_to_node(num));
+}
+
struct sys_device *get_cpu_sysdev(unsigned cpu)
{
if (cpu < nr_cpu_ids && cpu_possible(cpu))
--
Thanks & Regards,
Shaohui
[-- Attachment #2: 005-hotplug-emulator-x86-support-cpu-probe-release-in-x86.patch --]
[-- Type: text/x-diff, Size: 12546 bytes --]
hotplug emulator: support cpu probe/release in x86
Add a CPU probe/release interface under sysfs for x86. Users can use this
interface to emulate the CPU hot-add process; it is intended for CPU hotplug
testing. Add a kernel option, CONFIG_ARCH_CPU_PROBE_RELEASE, for this
feature.
This interface provides a mechanism to emulate CPU hotplug in software,
which makes CPU hotplug automation and stress testing possible.
Directive:
*) Reserve CPUs through a grub parameter such as:
	maxcpus=4
   The remaining CPUs will not be initialized.
*) Probe a CPU
   Use the probe interface to hot-add a new CPU:
	echo nid > /sys/devices/system/cpu/probe
*) Release a CPU
	echo cpu > /sys/devices/system/cpu/release
A reserved CPU will be hot-added to the specified node:
1) nid == 0: the CPU is added to the real node that it belongs to.
2) nid != 0: the CPU is added to node nid, even if it is a fake node.
Signed-off-by: Shaohui Zheng <shaohui.zheng@intel.com>
Signed-off-by: Haicheng Li <haicheng.li@intel.com>
---
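Editorial sketch, not part of the patch: a user-space model of the probe path (parse the node id, check that the node is online, pick the first present CPU that has no device yet). Reading the arch_cpu_probe() hunk below, selected starts at 0 and is only compared against num_possible_cpus(), so it looks as if a probe with no free CPU would fall through to CPU 0; the sketch uses -1 as a "nothing found" sentinel instead. struct probe_model, model_parse_nid(), model_cpu_probe() and the MODEL_* limits are all hypothetical. The error codes are also a choice made for the sketch; the patch returns -EPERM throughout.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_NR_CPUS	8	/* hypothetical limits for the model */
#define MODEL_NR_NODES	4

/* Hypothetical state: what the kernel tracks in cpumasks and per-CPU data. */
struct probe_model {
	bool present[MODEL_NR_CPUS];	/* CPU exists in the machine */
	bool registered[MODEL_NR_CPUS];	/* CPU already has a sysfs device */
	bool node_online[MODEL_NR_NODES];
	int cpu_node[MODEL_NR_CPUS];	/* node each registered CPU sits on */
};

/* Parse a decimal node id; -EINVAL on junk or out-of-range input. */
static int model_parse_nid(const char *buf)
{
	char *end;
	long val;

	assert(buf != NULL);
	errno = 0;
	val = strtol(buf, &end, 10);
	if (end == buf || errno == ERANGE)
		return -EINVAL;
	if (*end != '\0' && *end != '\n')	/* allow the newline echo adds */
		return -EINVAL;
	if (val < 0 || val >= MODEL_NR_NODES)
		return -EINVAL;
	return (int)val;
}

/* Returns the CPU number that was registered, or a negative errno value. */
int model_cpu_probe(struct probe_model *m, const char *buf)
{
	int nid = model_parse_nid(buf);
	int selected = -1;	/* sentinel: no free CPU found yet */
	int cpu;

	if (nid < 0)
		return nid;
	if (!m->node_online[nid])
		return -ENODEV;

	/* Find the first present CPU that has not been registered. */
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (m->present[cpu] && !m->registered[cpu]) {
			selected = cpu;
			break;
		}
	}
	if (selected < 0)
		return -ENOSPC;

	m->registered[selected] = true;
	m->cpu_node[selected] = nid;
	return selected;
}
```

With the sentinel, "no CPU left" is distinguishable from "CPU 0 selected", so a caller gets an error rather than a second registration of an existing CPU.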
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2c078c8..54ccb0d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1228,6 +1228,17 @@ config NODE_HOTPLUG_EMU
N is the number of hidden nodes, size is the memory size per
hidden node. This is only useful for debugging.
+config ARCH_CPU_PROBE_RELEASE
+ def_bool y
+ bool "CPU hotplug emulation"
+ depends on NUMA_HOTPLUG_EMU
+ ---help---
+ Enable CPU hotplug emulation. Reserve CPUs with the grub parameter
+ "maxcpus=N", where N is the initial number of CPUs; the remaining
+ physical CPUs will not be initialized. A probe/release interface
+ allows CPUs to be hot-added to or hot-removed from a specified node
+ in software. This is for debugging and testing purposes.
+
config NODES_SHIFT
int "Maximum NUMA Nodes (as a power of 2)" if !MAXSMP
range 1 10
@@ -1651,6 +1662,9 @@ config HOTPLUG_CPU
( Note: power management support will enable this option
automatically on SMP systems. )
Say N if you want to disable CPU hotplug.
+config ARCH_CPU_PROBE_RELEASE
+ def_bool y
+ depends on HOTPLUG_CPU
config COMPAT_VDSO
def_bool y
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index b185091..339ac2d 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -28,6 +28,9 @@ struct x86_cpu {
#ifdef CONFIG_HOTPLUG_CPU
extern int arch_register_cpu(int num);
extern void arch_unregister_cpu(int);
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+extern int arch_register_cpu_emu(int num, int nid);
+#endif
#endif
DECLARE_PER_CPU(int, cpu_state);
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index cd40aba..c3c7878 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -592,8 +592,44 @@ int __ref acpi_map_lsapic(acpi_handle handle, int *pcpu)
}
EXPORT_SYMBOL(acpi_map_lsapic);
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+static void acpi_map_cpu2node_emu(int cpu, int physid, int nid)
+{
+#ifdef CONFIG_ACPI_NUMA
+#ifdef CONFIG_X86_64
+ apicid_to_node[physid] = nid;
+ numa_set_node(cpu, nid);
+#else /* CONFIG_X86_32 */
+ apicid_2_node[physid] = nid;
+ cpu_to_node_map[cpu] = nid;
+#endif
+#endif
+}
+
+static u16 cpu_to_apicid_saved[CONFIG_NR_CPUS];
+int __ref acpi_map_lsapic_emu(int pcpu, int nid)
+{
+ /* backup cpu apicid to array cpu_to_apicid_saved */
+ if (cpu_to_apicid_saved[pcpu] == 0 &&
+ per_cpu(x86_cpu_to_apicid, pcpu) != BAD_APICID)
+ cpu_to_apicid_saved[pcpu] = per_cpu(x86_cpu_to_apicid, pcpu);
+
+ per_cpu(x86_cpu_to_apicid, pcpu) = cpu_to_apicid_saved[pcpu];
+ acpi_map_cpu2node_emu(pcpu, per_cpu(x86_cpu_to_apicid, pcpu), nid);
+
+ return pcpu;
+}
+EXPORT_SYMBOL(acpi_map_lsapic_emu);
+#endif
+
int acpi_unmap_lsapic(int cpu)
{
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+ /* backup cpu apicid to array cpu_to_apicid_saved */
+ if (cpu_to_apicid_saved[cpu] == 0 &&
+ per_cpu(x86_cpu_to_apicid, cpu) != BAD_APICID)
+ cpu_to_apicid_saved[cpu] = per_cpu(x86_cpu_to_apicid, cpu);
+#endif
per_cpu(x86_cpu_to_apicid, cpu) = -1;
set_cpu_present(cpu, false);
num_processors--;
diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c
index f716cd9..3a7b788 100644
--- a/arch/x86/kernel/topology.c
+++ b/arch/x86/kernel/topology.c
@@ -29,6 +29,9 @@
#include <linux/mmzone.h>
#include <linux/init.h>
#include <linux/smp.h>
+#include <linux/cpu.h>
+#include <linux/topology.h>
+#include <linux/acpi.h>
#include <asm/cpu.h>
static DEFINE_PER_CPU(struct x86_cpu, cpu_devices);
@@ -37,6 +40,11 @@ static DEFINE_PER_CPU(struct x86_cpu, cpu_devices);
/*
* Add nid (NUMA node id) as a parameter for CPU hotplug emulation. It allows
* a CPU to be registered to any node.
+ *
+ * nid is a special parameter, it has 2 different branches:
+ * 1) when nid == NUMA_NO_NODE, the CPU will be registered into the normal node
+ * which it should be in.
+ * 2) nid != NUMA_NO_NODE, it will be registered into the specified node.
*/
static int __ref __arch_register_cpu(int num, int nid)
{
@@ -52,9 +60,24 @@ static int __ref __arch_register_cpu(int num, int nid)
if (num)
per_cpu(cpu_devices, num).cpu.hotpluggable = 1;
- return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
+ if (nid == NUMA_NO_NODE)
+ return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
+ else
+ return register_cpu_emu(&per_cpu(cpu_devices, num).cpu, num, nid);
}
+/*
+ * Emulated version of function arch_register_cpu
+ * Parameter:
+ * num: cpu_id
+ * nid: emulated numa id
+ */
+int __ref arch_register_cpu_emu(int num, int nid)
+{
+ return __arch_register_cpu(num, nid);
+}
+EXPORT_SYMBOL(arch_register_cpu_emu);
+
int __ref arch_register_cpu(int num)
{
return __arch_register_cpu(num, NUMA_NO_NODE);
@@ -66,6 +89,84 @@ void arch_unregister_cpu(int num)
unregister_cpu(&per_cpu(cpu_devices, num).cpu);
}
EXPORT_SYMBOL(arch_unregister_cpu);
+
+ssize_t arch_cpu_probe(const char *buf, size_t count)
+{
+ int nid = 0;
+ int num = 0, selected = 0;
+
+ /* check parameters */
+ if (!buf || count < 2)
+ return -EPERM;
+
+ nid = simple_strtoul(buf, NULL, 0);
+ printk(KERN_DEBUG "Add a cpu to node : %d\n", nid);
+
+ if (nid < 0 || nid > nr_node_ids - 1) {
+ printk(KERN_ERR "Invalid NUMA node id: %d (0 <= nid < %d).\n",
+ nid, nr_node_ids);
+ return -EPERM;
+ }
+
+ if (!node_online(nid)) {
+ printk(KERN_ERR "NUMA node %d is not online, give up.\n", nid);
+ return -EPERM;
+ }
+
+ /* find first uninitialized cpu */
+ for_each_present_cpu(num) {
+ if (per_cpu(cpu_sys_devices, num) == NULL) {
+ selected = num;
+ break;
+ }
+ }
+
+ if (selected >= num_possible_cpus()) {
+ printk(KERN_ERR "No free cpu, give up cpu probing.\n");
+ return -EPERM;
+ }
+
+ /* register cpu */
+ arch_register_cpu_emu(selected, nid);
+ acpi_map_lsapic_emu(selected, nid);
+
+ return count;
+}
+EXPORT_SYMBOL(arch_cpu_probe);
+
+ssize_t arch_cpu_release(const char *buf, size_t count)
+{
+ int cpu = 0;
+
+ cpu = simple_strtoul(buf, NULL, 0);
+ /* cpu 0 is not hotplugable */
+ if (cpu == 0) {
+ printk(KERN_ERR "can not release cpu 0.\n");
+ return -EPERM;
+ }
+
+ if (cpu_online(cpu)) {
+ printk(KERN_DEBUG "offline cpu %d.\n", cpu);
+ cpu_down(cpu);
+ }
+
+ arch_unregister_cpu(cpu);
+ acpi_unmap_lsapic(cpu);
+
+ return count;
+}
+EXPORT_SYMBOL(arch_cpu_release);
+
+void cpu_hotplug_driver_unlock(void)
+{
+}
+EXPORT_SYMBOL(cpu_hotplug_driver_unlock);
+
+void cpu_hotplug_driver_lock(void)
+{
+}
+EXPORT_SYMBOL(cpu_hotplug_driver_lock);
+
#else /* CONFIG_HOTPLUG_CPU */
static int __init arch_register_cpu(int num)
@@ -83,8 +184,14 @@ static int __init topology_init(void)
register_one_node(i);
#endif
- for_each_present_cpu(i)
- arch_register_cpu(i);
+ /*
+ * When CPU hotplug emulation is enabled, register only the online CPUs;
+ * the rest are reserved for CPU probe.
+ */
+ for_each_present_cpu(i) {
+ if ((cpu_hpe_on && cpu_online(i)) || !cpu_hpe_on)
+ arch_register_cpu(i);
+ }
return 0;
}
diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
index 7c61208..3430ff2 100644
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -12,6 +12,7 @@
#include <linux/module.h>
#include <linux/nodemask.h>
#include <linux/sched.h>
+#include <linux/cpu.h>
#include <asm/e820.h>
#include <asm/proto.h>
@@ -889,6 +890,19 @@ void __init init_cpu_to_node(void)
}
#endif
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+static __init int cpu_hpe_setup(char *opt)
+{
+ if (!opt)
+ return -EINVAL;
+
+ if (!strncmp(opt, "on", 2) || !strncmp(opt, "1", 1))
+ cpu_hpe_on = 1;
+
+ return 0;
+}
+early_param("cpu_hpe", cpu_hpe_setup);
+#endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */
void __cpuinit numa_set_node(int cpu, int node)
{
diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index 5675d97..e024143 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -604,6 +604,14 @@ static int __cpuinit acpi_processor_add(struct acpi_device *device)
goto err_free_cpumask;
sysdev = get_cpu_sysdev(pr->id);
+ /*
+ * Reserve the CPU for hotplug emulation; a reserved CPU can be hot-added
+ * through the CPU probe interface. Return directly.
+ */
+ if (sysdev == NULL) {
+ goto out;
+ }
+
if (sysfs_create_link(&device->dev.kobj, &sysdev->kobj, "sysdev")) {
result = -EFAULT;
goto err_remove_fs;
@@ -643,6 +651,7 @@ static int __cpuinit acpi_processor_add(struct acpi_device *device)
goto err_remove_sysfs;
}
+out:
return 0;
err_remove_sysfs:
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index a1bc9c6..3225b32 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -22,9 +22,15 @@ struct sysdev_class cpu_sysdev_class = {
};
EXPORT_SYMBOL(cpu_sysdev_class);
-static DEFINE_PER_CPU(struct sys_device *, cpu_sys_devices);
+DEFINE_PER_CPU(struct sys_device *, cpu_sys_devices);
#ifdef CONFIG_HOTPLUG_CPU
+/*
+ * cpu_hpe_on is a switch to enable/disable CPU hotplug emulation. It is
+ * disabled by default; enable it with the grub parameter cpu_hpe=on.
+ */
+int cpu_hpe_on;
+
static ssize_t show_online(struct sys_device *dev, struct sysdev_attribute *attr,
char *buf)
{
@@ -80,6 +86,7 @@ void unregister_cpu(struct cpu *cpu)
}
#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+
static ssize_t cpu_probe_store(struct sysdev_class *class,
struct sysdev_class_attribute *attr,
const char *buf,
@@ -250,6 +257,18 @@ int __cpuinit register_cpu(struct cpu *cpu, int num)
return __register_cpu(cpu, num, cpu_to_node(num));
}
+/*
+ * Register cpu to the specified NUMA node
+ *
+ * Emulated version of register_cpu(), but more flexible: it takes an extra
+ * parameter, nid, so a CPU can be registered to any specified node through
+ * this function.
+ */
+int __cpuinit register_cpu_emu(struct cpu *cpu, int num, int nid)
+{
+ return __register_cpu(cpu, num, nid);
+}
+
struct sys_device *get_cpu_sysdev(unsigned cpu)
{
if (cpu < nr_cpu_ids && cpu_possible(cpu))
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index b926afe..c3bc5c7 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -102,6 +102,7 @@ void acpi_numa_arch_fixup(void);
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/* Arch dependent functions for cpu hotplug support */
int acpi_map_lsapic(acpi_handle handle, int *pcpu);
+int acpi_map_lsapic_emu(int pcpu, int nid);
int acpi_unmap_lsapic(int cpu);
#endif /* CONFIG_ACPI_HOTPLUG_CPU */
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index e287863..2d4df89 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -30,7 +30,10 @@ struct cpu {
struct sys_device sysdev;
};
+DECLARE_PER_CPU(struct sys_device *, cpu_sys_devices);
+
extern int register_cpu(struct cpu *cpu, int num);
+extern int register_cpu_emu(struct cpu *cpu, int num, int nid);
extern struct sys_device *get_cpu_sysdev(unsigned cpu);
extern int cpu_add_sysdev_attr(struct sysdev_attribute *attr);
@@ -116,6 +119,7 @@ extern void put_online_cpus(void);
#define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
#define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
int cpu_down(unsigned int cpu);
+extern int cpu_hpe_on;
#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
extern void cpu_hotplug_driver_lock(void);
@@ -138,6 +142,7 @@ static inline void cpu_hotplug_driver_unlock(void)
/* These aren't inline functions due to a GCC bug. */
#define register_hotcpu_notifier(nb) ({ (void)(nb); 0; })
#define unregister_hotcpu_notifier(nb) ({ (void)(nb); })
+static int cpu_hpe_on;
#endif /* CONFIG_HOTPLUG_CPU */
#ifdef CONFIG_PM_SLEEP_SMP
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [RFC,5/7] NUMA hotplug emulator
2010-05-13 11:56 [RFC,5/7] NUMA hotplug emulator Shaohui Zheng
2010-05-07 14:11 ` Pavel Machek
@ 2010-05-13 12:11 ` Jean Delvare
1 sibling, 0 replies; 11+ messages in thread
From: Jean Delvare @ 2010-05-13 12:11 UTC (permalink / raw)
To: Shaohui Zheng
Cc: akpm, linux-mm, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
Andi Kleen, Hidetoshi Seto, Len Brown, Pavel Machek,
Rafael J. Wysocki, Yinghai Lu, Thomas Renninger, David Rientjes,
Mel Gorman, Venkatesh Pallipadi, Alex Chiang, Tejun Heo,
Christoph Lameter, Greg Kroah-Hartman, Stephen Rothwell,
Benjamin Herrenschmidt, Shaohua Li, Hugh Dickins, James Bottomley,
Paul E. McKenney, linux-kernel, linux-pm, linux-acpi,
fengguang.wu, haicheng.li, shaohui.zheng
On Thu, 13 May 2010 19:56:25 +0800, Shaohui Zheng wrote:
>
> hotplug emulator: Abstract cpu register functions
>
> Abstract function arch_register_cpu and register_cpu, move the implementation
> details to a sub function with prefix "__".
>
> each of the sub function has an extra parameter nid, it can be used to register
> CPU under a fake NUMA node, it is a reserved interface for cpu hotplug emulation
> (CPU PROBE/RELEASE) in x86.
>
> Signed-off-by: Shaohui Zheng <shaohui.zheng@intel.com>
> Signed-off-by: Haicheng Li <haicheng.li@intel.com>
I don't know anything about this, please don't Cc me on these patches.
Given the very long Cc list, I'm certain many other developers you have
included are not interested. Please focus on the relevant lists next
time!
--
Jean Delvare
^ permalink raw reply [flat|nested] 11+ messages in thread
* [RFC,5/7] NUMA hotplug emulator
@ 2010-05-13 12:14 Shaohui Zheng
2010-05-14 5:49 ` Paul Mundt
0 siblings, 1 reply; 11+ messages in thread
From: Shaohui Zheng @ 2010-05-13 12:14 UTC (permalink / raw)
To: akpm, linux-mm; +Cc: linux-kernel, ak, fengguang.wu, haicheng.li, shaohui.zheng
[-- Attachment #1: Type: text/plain, Size: 12577 bytes --]
hotplug emulator: support cpu probe/release in x86
Add a CPU probe/release interface under sysfs for x86. Users can use this
interface to emulate the CPU hot-add process; it is intended for CPU hotplug
testing. Add a kernel option, CONFIG_ARCH_CPU_PROBE_RELEASE, for this
feature.
This interface provides a mechanism to emulate CPU hotplug in software,
which makes CPU hotplug automation and stress testing possible.
Directive:
*) Reserve CPUs through a grub parameter such as:
	maxcpus=4
   The remaining CPUs will not be initialized.
*) Probe a CPU
   Use the probe interface to hot-add a new CPU:
	echo nid > /sys/devices/system/cpu/probe
*) Release a CPU
	echo cpu > /sys/devices/system/cpu/release
A reserved CPU will be hot-added to the specified node:
1) nid == 0: the CPU is added to the real node that it belongs to.
2) nid != 0: the CPU is added to node nid, even if it is a fake node.
Signed-off-by: Shaohui Zheng <shaohui.zheng@intel.com>
Signed-off-by: Haicheng Li <haicheng.li@intel.com>
---
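Editorial sketch, not part of the patch: the probe and release steps in the description above are plain writes of a number to a sysfs file, so a test harness can drive them from C as well as from the shell. write_control_value() and CPU_PROBE_PATH are hypothetical names; the path itself comes from the description and only exists on a kernel carrying this RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Path taken from the patch description; absent on kernels without it. */
#define CPU_PROBE_PATH "/sys/devices/system/cpu/probe"

/*
 * Write a decimal value followed by a newline to a control file, which is
 * what "echo value > path" does. Returns 0 on success or a negative errno
 * value on failure.
 */
int write_control_value(const char *path, int value)
{
	FILE *f;
	int err = 0;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -errno;
	if (fprintf(f, "%d\n", value) < 0)
		err = -EIO;
	/* stdio buffers the write, so a rejected value may only show up here. */
	if (fclose(f) != 0 && err == 0)
		err = -errno;
	return err;
}
```

A harness would call write_control_value(CPU_PROBE_PATH, 1) to hot-add a reserved CPU to node 1, and check the return value to see whether the kernel accepted the request.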
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2c078c8..54ccb0d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1228,6 +1228,17 @@ config NODE_HOTPLUG_EMU
N is the number of hidden nodes, size is the memory size per
hidden node. This is only useful for debugging.
+config ARCH_CPU_PROBE_RELEASE
+ def_bool y
+ bool "CPU hotplug emulation"
+ depends on NUMA_HOTPLUG_EMU
+ ---help---
+ Enable CPU hotplug emulation. Reserve CPUs with the grub parameter
+ "maxcpus=N", where N is the initial number of CPUs; the remaining
+ physical CPUs will not be initialized. A probe/release interface
+ allows CPUs to be hot-added to or hot-removed from a specified node
+ in software. This is for debugging and testing purposes.
+
config NODES_SHIFT
int "Maximum NUMA Nodes (as a power of 2)" if !MAXSMP
range 1 10
@@ -1651,6 +1662,9 @@ config HOTPLUG_CPU
( Note: power management support will enable this option
automatically on SMP systems. )
Say N if you want to disable CPU hotplug.
+config ARCH_CPU_PROBE_RELEASE
+ def_bool y
+ depends on HOTPLUG_CPU
config COMPAT_VDSO
def_bool y
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index b185091..339ac2d 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -28,6 +28,9 @@ struct x86_cpu {
#ifdef CONFIG_HOTPLUG_CPU
extern int arch_register_cpu(int num);
extern void arch_unregister_cpu(int);
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+extern int arch_register_cpu_emu(int num, int nid);
+#endif
#endif
DECLARE_PER_CPU(int, cpu_state);
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index cd40aba..c3c7878 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -592,8 +592,44 @@ int __ref acpi_map_lsapic(acpi_handle handle, int *pcpu)
}
EXPORT_SYMBOL(acpi_map_lsapic);
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+static void acpi_map_cpu2node_emu(int cpu, int physid, int nid)
+{
+#ifdef CONFIG_ACPI_NUMA
+#ifdef CONFIG_X86_64
+ apicid_to_node[physid] = nid;
+ numa_set_node(cpu, nid);
+#else /* CONFIG_X86_32 */
+ apicid_2_node[physid] = nid;
+ cpu_to_node_map[cpu] = nid;
+#endif
+#endif
+}
+
+static u16 cpu_to_apicid_saved[CONFIG_NR_CPUS];
+int __ref acpi_map_lsapic_emu(int pcpu, int nid)
+{
+ /* backup cpu apicid to array cpu_to_apicid_saved */
+ if (cpu_to_apicid_saved[pcpu] == 0 &&
+ per_cpu(x86_cpu_to_apicid, pcpu) != BAD_APICID)
+ cpu_to_apicid_saved[pcpu] = per_cpu(x86_cpu_to_apicid, pcpu);
+
+ per_cpu(x86_cpu_to_apicid, pcpu) = cpu_to_apicid_saved[pcpu];
+ acpi_map_cpu2node_emu(pcpu, per_cpu(x86_cpu_to_apicid, pcpu), nid);
+
+ return pcpu;
+}
+EXPORT_SYMBOL(acpi_map_lsapic_emu);
+#endif
+
int acpi_unmap_lsapic(int cpu)
{
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+ /* backup cpu apicid to array cpu_to_apicid_saved */
+ if (cpu_to_apicid_saved[cpu] == 0 &&
+ per_cpu(x86_cpu_to_apicid, cpu) != BAD_APICID)
+ cpu_to_apicid_saved[cpu] = per_cpu(x86_cpu_to_apicid, cpu);
+#endif
per_cpu(x86_cpu_to_apicid, cpu) = -1;
set_cpu_present(cpu, false);
num_processors--;
diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c
index f716cd9..3a7b788 100644
--- a/arch/x86/kernel/topology.c
+++ b/arch/x86/kernel/topology.c
@@ -29,6 +29,9 @@
#include <linux/mmzone.h>
#include <linux/init.h>
#include <linux/smp.h>
+#include <linux/cpu.h>
+#include <linux/topology.h>
+#include <linux/acpi.h>
#include <asm/cpu.h>
static DEFINE_PER_CPU(struct x86_cpu, cpu_devices);
@@ -37,6 +40,11 @@ static DEFINE_PER_CPU(struct x86_cpu, cpu_devices);
/*
* Add nid (NUMA node id) as a parameter for CPU hotplug emulation. It allows
* a CPU to be registered to any node.
+ *
+ * nid is a special parameter, it has 2 different branches:
+ * 1) when nid == NUMA_NO_NODE, the CPU will be registered into the normal node
+ * which it should be in.
+ * 2) nid != NUMA_NO_NODE, it will be registered into the specified node.
*/
static int __ref __arch_register_cpu(int num, int nid)
{
@@ -52,9 +60,24 @@ static int __ref __arch_register_cpu(int num, int nid)
if (num)
per_cpu(cpu_devices, num).cpu.hotpluggable = 1;
- return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
+ if (nid == NUMA_NO_NODE)
+ return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
+ else
+ return register_cpu_emu(&per_cpu(cpu_devices, num).cpu, num, nid);
}
+/*
+ * Emulated version of function arch_register_cpu
+ * Parameter:
+ * num: cpu_id
+ * nid: emulated numa id
+ */
+int __ref arch_register_cpu_emu(int num, int nid)
+{
+ return __arch_register_cpu(num, nid);
+}
+EXPORT_SYMBOL(arch_register_cpu_emu);
+
int __ref arch_register_cpu(int num)
{
return __arch_register_cpu(num, NUMA_NO_NODE);
@@ -66,6 +89,84 @@ void arch_unregister_cpu(int num)
unregister_cpu(&per_cpu(cpu_devices, num).cpu);
}
EXPORT_SYMBOL(arch_unregister_cpu);
+
+ssize_t arch_cpu_probe(const char *buf, size_t count)
+{
+ int nid = 0;
+ int num = 0, selected = 0;
+
+ /* check parameters */
+ if (!buf || count < 2)
+ return -EPERM;
+
+ nid = simple_strtoul(buf, NULL, 0);
+ printk(KERN_DEBUG "Add a cpu to node : %d\n", nid);
+
+ if (nid < 0 || nid > nr_node_ids - 1) {
+ printk(KERN_ERR "Invalid NUMA node id: %d (0 <= nid < %d).\n",
+ nid, nr_node_ids);
+ return -EPERM;
+ }
+
+ if (!node_online(nid)) {
+ printk(KERN_ERR "NUMA node %d is not online, give up.\n", nid);
+ return -EPERM;
+ }
+
+ /* find first uninitialized cpu */
+ for_each_present_cpu(num) {
+ if (per_cpu(cpu_sys_devices, num) == NULL) {
+ selected = num;
+ break;
+ }
+ }
+
+ if (selected >= num_possible_cpus()) {
+ printk(KERN_ERR "No free cpu, give up cpu probing.\n");
+ return -EPERM;
+ }
+
+ /* register cpu */
+ arch_register_cpu_emu(selected, nid);
+ acpi_map_lsapic_emu(selected, nid);
+
+ return count;
+}
+EXPORT_SYMBOL(arch_cpu_probe);
+
+ssize_t arch_cpu_release(const char *buf, size_t count)
+{
+ int cpu = 0;
+
+ cpu = simple_strtoul(buf, NULL, 0);
+ /* cpu 0 is not hotplugable */
+ if (cpu == 0) {
+ printk(KERN_ERR "can not release cpu 0.\n");
+ return -EPERM;
+ }
+
+ if (cpu_online(cpu)) {
+ printk(KERN_DEBUG "offline cpu %d.\n", cpu);
+ cpu_down(cpu);
+ }
+
+ arch_unregister_cpu(cpu);
+ acpi_unmap_lsapic(cpu);
+
+ return count;
+}
+EXPORT_SYMBOL(arch_cpu_release);
+
+void cpu_hotplug_driver_unlock(void)
+{
+}
+EXPORT_SYMBOL(cpu_hotplug_driver_unlock);
+
+void cpu_hotplug_driver_lock(void)
+{
+}
+EXPORT_SYMBOL(cpu_hotplug_driver_lock);
+
#else /* CONFIG_HOTPLUG_CPU */
static int __init arch_register_cpu(int num)
@@ -83,8 +184,14 @@ static int __init topology_init(void)
register_one_node(i);
#endif
- for_each_present_cpu(i)
- arch_register_cpu(i);
+ /*
+ * When CPU hotplug emulation is enabled, register only the online CPUs;
+ * the rest are reserved for CPU probe.
+ */
+ for_each_present_cpu(i) {
+ if ((cpu_hpe_on && cpu_online(i)) || !cpu_hpe_on)
+ arch_register_cpu(i);
+ }
return 0;
}
diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
index 7c61208..3430ff2 100644
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -12,6 +12,7 @@
#include <linux/module.h>
#include <linux/nodemask.h>
#include <linux/sched.h>
+#include <linux/cpu.h>
#include <asm/e820.h>
#include <asm/proto.h>
@@ -889,6 +890,19 @@ void __init init_cpu_to_node(void)
}
#endif
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+static __init int cpu_hpe_setup(char *opt)
+{
+ if (!opt)
+ return -EINVAL;
+
+ if (!strncmp(opt, "on", 2) || !strncmp(opt, "1", 1))
+ cpu_hpe_on = 1;
+
+ return 0;
+}
+early_param("cpu_hpe", cpu_hpe_setup);
+#endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */
void __cpuinit numa_set_node(int cpu, int node)
{
diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index 5675d97..e024143 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -604,6 +604,14 @@ static int __cpuinit acpi_processor_add(struct acpi_device *device)
goto err_free_cpumask;
sysdev = get_cpu_sysdev(pr->id);
+ /*
+ * Reserve the CPU for hotplug emulation; a reserved CPU can be hot-added
+ * through the CPU probe interface. Return directly.
+ */
+ if (sysdev == NULL) {
+ goto out;
+ }
+
if (sysfs_create_link(&device->dev.kobj, &sysdev->kobj, "sysdev")) {
result = -EFAULT;
goto err_remove_fs;
@@ -643,6 +651,7 @@ static int __cpuinit acpi_processor_add(struct acpi_device *device)
goto err_remove_sysfs;
}
+out:
return 0;
err_remove_sysfs:
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index a1bc9c6..3225b32 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -22,9 +22,15 @@ struct sysdev_class cpu_sysdev_class = {
};
EXPORT_SYMBOL(cpu_sysdev_class);
-static DEFINE_PER_CPU(struct sys_device *, cpu_sys_devices);
+DEFINE_PER_CPU(struct sys_device *, cpu_sys_devices);
#ifdef CONFIG_HOTPLUG_CPU
+/*
+ * cpu_hpe_on is a switch to enable/disable CPU hotplug emulation. It is
+ * disabled by default; enable it with the grub parameter cpu_hpe=on.
+ */
+int cpu_hpe_on;
+
static ssize_t show_online(struct sys_device *dev, struct sysdev_attribute *attr,
char *buf)
{
@@ -80,6 +86,7 @@ void unregister_cpu(struct cpu *cpu)
}
#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+
static ssize_t cpu_probe_store(struct sysdev_class *class,
struct sysdev_class_attribute *attr,
const char *buf,
@@ -250,6 +257,18 @@ int __cpuinit register_cpu(struct cpu *cpu, int num)
return __register_cpu(cpu, num, cpu_to_node(num));
}
+/*
+ * Register cpu to the specified NUMA node
+ *
+ * Emulated version of register_cpu(), but more flexible: it takes an extra
+ * parameter, nid, so a CPU can be registered to any specified node through
+ * this function.
+ */
+int __cpuinit register_cpu_emu(struct cpu *cpu, int num, int nid)
+{
+ return __register_cpu(cpu, num, nid);
+}
+
struct sys_device *get_cpu_sysdev(unsigned cpu)
{
if (cpu < nr_cpu_ids && cpu_possible(cpu))
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index b926afe..c3bc5c7 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -102,6 +102,7 @@ void acpi_numa_arch_fixup(void);
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/* Arch dependent functions for cpu hotplug support */
int acpi_map_lsapic(acpi_handle handle, int *pcpu);
+int acpi_map_lsapic_emu(int pcpu, int nid);
int acpi_unmap_lsapic(int cpu);
#endif /* CONFIG_ACPI_HOTPLUG_CPU */
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index e287863..2d4df89 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -30,7 +30,10 @@ struct cpu {
struct sys_device sysdev;
};
+DECLARE_PER_CPU(struct sys_device *, cpu_sys_devices);
+
extern int register_cpu(struct cpu *cpu, int num);
+extern int register_cpu_emu(struct cpu *cpu, int num, int nid);
extern struct sys_device *get_cpu_sysdev(unsigned cpu);
extern int cpu_add_sysdev_attr(struct sysdev_attribute *attr);
@@ -116,6 +119,7 @@ extern void put_online_cpus(void);
#define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
#define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
int cpu_down(unsigned int cpu);
+extern int cpu_hpe_on;
#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
extern void cpu_hotplug_driver_lock(void);
@@ -138,6 +142,7 @@ static inline void cpu_hotplug_driver_unlock(void)
/* These aren't inline functions due to a GCC bug. */
#define register_hotcpu_notifier(nb) ({ (void)(nb); 0; })
#define unregister_hotcpu_notifier(nb) ({ (void)(nb); })
+static int cpu_hpe_on;
#endif /* CONFIG_HOTPLUG_CPU */
#ifdef CONFIG_PM_SLEEP_SMP
--
Thanks & Regards,
Shaohui
[-- Attachment #2: 005-hotplug-emulator-x86-support-cpu-probe-release-in-x86.patch --]
[-- Type: text/x-diff, Size: 12546 bytes --]
hotplug emulator: support cpu probe/release in x86
Add cpu interface probe/release under sysfs for x86. User can use this
interface to emulate the cpu hot-add process, it is for cpu hotplug
test purpose. Add a kernel option CONFIG_ARCH_CPU_PROBE_RELEASE for this
feature.
This interface provides a mechanism to emulate cpu hotplug with software
methods, it becomes possible to do cpu hotplug automation and stress
testing.
Directions:
*) Reserve CPUs through a grub parameter such as:
maxcpus=4
The remaining CPUs will not be initialized.
*) Probe a CPU
Use the probe interface to hot-add a new CPU:
echo nid > /sys/devices/system/cpu/probe
*) Release a CPU
echo cpu > /sys/devices/system/cpu/release
A reserved CPU will be hot-added to the specified node.
1) nid == 0: the CPU will be added to the real node that the CPU
should be in.
2) nid != 0: the CPU is added to node nid, even if it is a fake node.
Signed-off-by: Shaohui Zheng <shaohui.zheng@intel.com>
Signed-off-by: Haicheng Li <haicheng.li@intel.com>
---
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2c078c8..54ccb0d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1228,6 +1228,17 @@ config NODE_HOTPLUG_EMU
N is the number of hidden nodes, size is the memory size per
hidden node. This is only useful for debugging.
+config ARCH_CPU_PROBE_RELEASE
+ def_bool y
+ bool "CPU hotplug emulation"
+ depends on NUMA_HOTPLUG_EMU
+ ---help---
+ Enable CPU hotplug emulation. Reserve CPUs with the grub parameter
+ "maxcpus=N", where N is the initial number of CPUs; the remaining physical
+ CPUs will not be initialized. A probe/release interface is provided
+ for software CPU hot-add/hot-remove to a specified node.
+ This is for debugging and testing purposes.
+
config NODES_SHIFT
int "Maximum NUMA Nodes (as a power of 2)" if !MAXSMP
range 1 10
@@ -1651,6 +1662,9 @@ config HOTPLUG_CPU
( Note: power management support will enable this option
automatically on SMP systems. )
Say N if you want to disable CPU hotplug.
+config ARCH_CPU_PROBE_RELEASE
+ def_bool y
+ depends on HOTPLUG_CPU
config COMPAT_VDSO
def_bool y
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index b185091..339ac2d 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -28,6 +28,9 @@ struct x86_cpu {
#ifdef CONFIG_HOTPLUG_CPU
extern int arch_register_cpu(int num);
extern void arch_unregister_cpu(int);
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+extern int arch_register_cpu_emu(int num, int nid);
+#endif
#endif
DECLARE_PER_CPU(int, cpu_state);
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index cd40aba..c3c7878 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -592,8 +592,44 @@ int __ref acpi_map_lsapic(acpi_handle handle, int *pcpu)
}
EXPORT_SYMBOL(acpi_map_lsapic);
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+static void acpi_map_cpu2node_emu(int cpu, int physid, int nid)
+{
+#ifdef CONFIG_ACPI_NUMA
+#ifdef CONFIG_X86_64
+ apicid_to_node[physid] = nid;
+ numa_set_node(cpu, nid);
+#else /* CONFIG_X86_32 */
+ apicid_2_node[physid] = nid;
+ cpu_to_node_map[cpu] = nid;
+#endif
+#endif
+}
+
+static u16 cpu_to_apicid_saved[CONFIG_NR_CPUS];
+int __ref acpi_map_lsapic_emu(int pcpu, int nid)
+{
+ /* backup cpu apicid to array cpu_to_apicid_saved */
+ if (cpu_to_apicid_saved[pcpu] == 0 &&
+ per_cpu(x86_cpu_to_apicid, pcpu) != BAD_APICID)
+ cpu_to_apicid_saved[pcpu] = per_cpu(x86_cpu_to_apicid, pcpu);
+
+ per_cpu(x86_cpu_to_apicid, pcpu) = cpu_to_apicid_saved[pcpu];
+ acpi_map_cpu2node_emu(pcpu, per_cpu(x86_cpu_to_apicid, pcpu), nid);
+
+ return pcpu;
+}
+EXPORT_SYMBOL(acpi_map_lsapic_emu);
+#endif
+
int acpi_unmap_lsapic(int cpu)
{
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+ /* backup cpu apicid to array cpu_to_apicid_saved */
+ if (cpu_to_apicid_saved[cpu] == 0 &&
+ per_cpu(x86_cpu_to_apicid, cpu) != BAD_APICID)
+ cpu_to_apicid_saved[cpu] = per_cpu(x86_cpu_to_apicid, cpu);
+#endif
per_cpu(x86_cpu_to_apicid, cpu) = -1;
set_cpu_present(cpu, false);
num_processors--;
diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c
index f716cd9..3a7b788 100644
--- a/arch/x86/kernel/topology.c
+++ b/arch/x86/kernel/topology.c
@@ -29,6 +29,9 @@
#include <linux/mmzone.h>
#include <linux/init.h>
#include <linux/smp.h>
+#include <linux/cpu.h>
+#include <linux/topology.h>
+#include <linux/acpi.h>
#include <asm/cpu.h>
static DEFINE_PER_CPU(struct x86_cpu, cpu_devices);
@@ -37,6 +40,11 @@ static DEFINE_PER_CPU(struct x86_cpu, cpu_devices);
/*
* Add nid(NUMA node id) as parameter for cpu hotplug emulation. It supports
* to register a CPU to any nodes.
+ *
+ * nid is a special parameter, it has 2 different branches:
+ * 1) when nid == NUMA_NO_NODE, the CPU will be registered into the normal node
+ * which it should be in.
+ * 2) nid != NUMA_NO_NODE, it will be registered into the specified node.
*/
static int __ref __arch_register_cpu(int num, int nid)
{
@@ -52,9 +60,24 @@ static int __ref __arch_register_cpu(int num, int nid)
if (num)
per_cpu(cpu_devices, num).cpu.hotpluggable = 1;
- return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
+ if (nid == NUMA_NO_NODE)
+ return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
+ else
+ return register_cpu_emu(&per_cpu(cpu_devices, num).cpu, num, nid);
}
+/*
+ * Emulated version of function arch_register_cpu
+ * Parameter:
+ * num: cpu_id
+ * nid: emulated numa id
+ */
+int __ref arch_register_cpu_emu(int num, int nid)
+{
+ return __arch_register_cpu(num, nid);
+}
+EXPORT_SYMBOL(arch_register_cpu_emu);
+
int __ref arch_register_cpu(int num)
{
return __arch_register_cpu(num, NUMA_NO_NODE);
@@ -66,6 +89,84 @@ void arch_unregister_cpu(int num)
unregister_cpu(&per_cpu(cpu_devices, num).cpu);
}
EXPORT_SYMBOL(arch_unregister_cpu);
+
+ssize_t arch_cpu_probe(const char *buf, size_t count)
+{
+ int nid = 0;
+ int num = 0, selected = 0;
+
+ /* check parameters */
+ if (!buf || count < 2)
+ return -EPERM;
+
+ nid = simple_strtoul(buf, NULL, 0);
+ printk(KERN_DEBUG "Add a cpu to node : %d\n", nid);
+
+ if (nid < 0 || nid > nr_node_ids - 1) {
+ printk(KERN_ERR "Invalid NUMA node id: %d (0 <= nid < %d).\n",
+ nid, nr_node_ids);
+ return -EPERM;
+ }
+
+ if (!node_online(nid)) {
+ printk(KERN_ERR "NUMA node %d is not online, give up.\n", nid);
+ return -EPERM;
+ }
+
+ /* find first uninitialized cpu */
+ for_each_present_cpu(num) {
+ if (per_cpu(cpu_sys_devices, num) == NULL) {
+ selected = num;
+ break;
+ }
+ }
+
+ if (selected >= num_possible_cpus()) {
+ printk(KERN_ERR "No free cpu, give up cpu probing.\n");
+ return -EPERM;
+ }
+
+ /* register cpu */
+ arch_register_cpu_emu(selected, nid);
+ acpi_map_lsapic_emu(selected, nid);
+
+ return count;
+}
+EXPORT_SYMBOL(arch_cpu_probe);
+
+ssize_t arch_cpu_release(const char *buf, size_t count)
+{
+ int cpu = 0;
+
+ cpu = simple_strtoul(buf, NULL, 0);
+ /* cpu 0 is not hotpluggable */
+ if (cpu == 0) {
+ printk(KERN_ERR "can not release cpu 0.\n");
+ return -EPERM;
+ }
+
+ if (cpu_online(cpu)) {
+ printk(KERN_DEBUG "offline cpu %d.\n", cpu);
+ cpu_down(cpu);
+ }
+
+ arch_unregister_cpu(cpu);
+ acpi_unmap_lsapic(cpu);
+
+ return count;
+}
+EXPORT_SYMBOL(arch_cpu_release);
+
+void cpu_hotplug_driver_unlock(void)
+{
+}
+EXPORT_SYMBOL(cpu_hotplug_driver_unlock);
+
+void cpu_hotplug_driver_lock(void)
+{
+}
+EXPORT_SYMBOL(cpu_hotplug_driver_lock);
+
#else /* CONFIG_HOTPLUG_CPU */
static int __init arch_register_cpu(int num)
@@ -83,8 +184,14 @@ static int __init topology_init(void)
register_one_node(i);
#endif
- for_each_present_cpu(i)
- arch_register_cpu(i);
+ /*
+ * When CPU hotplug emulation is enabled, register only the online CPUs;
+ * the rest are reserved for CPU probe.
+ */
+ for_each_present_cpu(i) {
+ if ((cpu_hpe_on && cpu_online(i)) || !cpu_hpe_on)
+ arch_register_cpu(i);
+ }
return 0;
}
diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
index 7c61208..3430ff2 100644
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -12,6 +12,7 @@
#include <linux/module.h>
#include <linux/nodemask.h>
#include <linux/sched.h>
+#include <linux/cpu.h>
#include <asm/e820.h>
#include <asm/proto.h>
@@ -889,6 +890,19 @@ void __init init_cpu_to_node(void)
}
#endif
+#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+static __init int cpu_hpe_setup(char *opt)
+{
+ if (!opt)
+ return -EINVAL;
+
+ if (!strncmp(opt, "on", 2) || !strncmp(opt, "1", 1))
+ cpu_hpe_on = 1;
+
+ return 0;
+}
+early_param("cpu_hpe", cpu_hpe_setup);
+#endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */
void __cpuinit numa_set_node(int cpu, int node)
{
diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index 5675d97..e024143 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -604,6 +604,14 @@ static int __cpuinit acpi_processor_add(struct acpi_device *device)
goto err_free_cpumask;
sysdev = get_cpu_sysdev(pr->id);
+ /*
+ * Reserve the CPU for hotplug emulation; the reserved CPU can be hot-added
+ * through the CPU probe interface. Return directly.
+ */
+ if (sysdev == NULL) {
+ goto out;
+ }
+
if (sysfs_create_link(&device->dev.kobj, &sysdev->kobj, "sysdev")) {
result = -EFAULT;
goto err_remove_fs;
@@ -643,6 +651,7 @@ static int __cpuinit acpi_processor_add(struct acpi_device *device)
goto err_remove_sysfs;
}
+out:
return 0;
err_remove_sysfs:
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index a1bc9c6..3225b32 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -22,9 +22,15 @@ struct sysdev_class cpu_sysdev_class = {
};
EXPORT_SYMBOL(cpu_sysdev_class);
-static DEFINE_PER_CPU(struct sys_device *, cpu_sys_devices);
+DEFINE_PER_CPU(struct sys_device *, cpu_sys_devices);
#ifdef CONFIG_HOTPLUG_CPU
+/*
+ * cpu_hpe_on is a switch to enable/disable CPU hotplug emulation. It is
+ * disabled by default and can be enabled through the grub parameter cpu_hpe=on.
+ */
+int cpu_hpe_on;
+
static ssize_t show_online(struct sys_device *dev, struct sysdev_attribute *attr,
char *buf)
{
@@ -80,6 +86,7 @@ void unregister_cpu(struct cpu *cpu)
}
#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
+
static ssize_t cpu_probe_store(struct sysdev_class *class,
struct sysdev_class_attribute *attr,
const char *buf,
@@ -250,6 +257,18 @@ int __cpuinit register_cpu(struct cpu *cpu, int num)
return __register_cpu(cpu, num, cpu_to_node(num));
}
+/*
+ * Register cpu to the specified NUMA node
+ *
+ * Emulated version of register_cpu, but more flexible: it supports
+ * an extra parameter nid, so a CPU can be registered to any specified node
+ * through this function.
+ */
+int __cpuinit register_cpu_emu(struct cpu *cpu, int num, int nid)
+{
+ return __register_cpu(cpu, num, nid);
+}
+
struct sys_device *get_cpu_sysdev(unsigned cpu)
{
if (cpu < nr_cpu_ids && cpu_possible(cpu))
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index b926afe..c3bc5c7 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -102,6 +102,7 @@ void acpi_numa_arch_fixup(void);
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/* Arch dependent functions for cpu hotplug support */
int acpi_map_lsapic(acpi_handle handle, int *pcpu);
+int acpi_map_lsapic_emu(int pcpu, int nid);
int acpi_unmap_lsapic(int cpu);
#endif /* CONFIG_ACPI_HOTPLUG_CPU */
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index e287863..2d4df89 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -30,7 +30,10 @@ struct cpu {
struct sys_device sysdev;
};
+DECLARE_PER_CPU(struct sys_device *, cpu_sys_devices);
+
extern int register_cpu(struct cpu *cpu, int num);
+extern int register_cpu_emu(struct cpu *cpu, int num, int nid);
extern struct sys_device *get_cpu_sysdev(unsigned cpu);
extern int cpu_add_sysdev_attr(struct sysdev_attribute *attr);
@@ -116,6 +119,7 @@ extern void put_online_cpus(void);
#define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
#define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
int cpu_down(unsigned int cpu);
+extern int cpu_hpe_on;
#ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
extern void cpu_hotplug_driver_lock(void);
@@ -138,6 +142,7 @@ static inline void cpu_hotplug_driver_unlock(void)
/* These aren't inline functions due to a GCC bug. */
#define register_hotcpu_notifier(nb) ({ (void)(nb); 0; })
#define unregister_hotcpu_notifier(nb) ({ (void)(nb); })
+static int cpu_hpe_on;
#endif /* CONFIG_HOTPLUG_CPU */
#ifdef CONFIG_PM_SLEEP_SMP
* Re: [RFC,5/7] NUMA hotplug emulator
2010-05-13 12:14 Shaohui Zheng
@ 2010-05-14 5:49 ` Paul Mundt
2010-05-18 9:03 ` Shaohui Zheng
0 siblings, 1 reply; 11+ messages in thread
From: Paul Mundt @ 2010-05-14 5:49 UTC (permalink / raw)
To: akpm, linux-mm, linux-kernel, ak, fengguang.wu, haicheng.li,
shaohui.zheng
On Thu, May 13, 2010 at 08:14:57PM +0800, Shaohui Zheng wrote:
> hotplug emulator: support cpu probe/release in x86
>
> Add cpu interface probe/release under sysfs for x86. User can use this
> interface to emulate the cpu hot-add process, it is for cpu hotplug
> test purpose. Add a kernel option CONFIG_ARCH_CPU_PROBE_RELEASE for this
> feature.
>
> This interface provides a mechanism to emulate cpu hotplug with software
> methods, it becomes possible to do cpu hotplug automation and stress
> testing.
>
At a quick glance, is this really necessary? It seems like you could
easily replace most of this with a CPU notifier chain that takes care of
the node handling. See for example how ppc64 manages the CPU hotplug/numa
emulation case in arch/powerpc/mm/numa.c. arch_register_cpu() just looks
like some topology hack for ACPI, it would be nice not to perpetuate that
too much.
* Re: [RFC,5/7] NUMA hotplug emulator
2010-05-07 14:11 ` Pavel Machek
@ 2010-05-16 17:45 ` Paul E. McKenney
2010-05-17 2:42 ` Haicheng Li
2010-05-17 3:37 ` Shaohui Zheng
2010-05-17 9:38 ` Andi Kleen
2 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2010-05-16 17:45 UTC (permalink / raw)
To: Pavel Machek
Cc: akpm, linux-mm, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
Andi Kleen, Hidetoshi Seto, Len Brown, Rafael J. Wysocki,
Yinghai Lu, Thomas Renninger, David Rientjes, Mel Gorman,
Venkatesh Pallipadi, Alex Chiang, Tejun Heo, Christoph Lameter,
Greg Kroah-Hartman, Stephen Rothwell, Benjamin Herrenschmidt,
Shaohua Li, Jean Delvare, Hugh Dickins, James Bottomley,
linux-kernel, linux-pm, linux-acpi, fengguang.wu, haicheng.li,
shaohui.zheng
On Fri, May 07, 2010 at 04:11:42PM +0200, Pavel Machek wrote:
> Hi!
>
> > hotplug emulator: Abstract cpu register functions
> >
> > Abstract function arch_register_cpu and register_cpu, move the implementation
> > details to a sub function with prefix "__".
> >
> > each of the sub function has an extra parameter nid, it can be used to register
> > CPU under a fake NUMA node, it is a reserved interface for cpu hotplug emulation
> > (CPU PROBE/RELEASE) in x86.
>
> I don't get it. CPU hotplug can already be tested using echo 0/1 >
> online, and that works on 386. How is this different?
>
> It seems to add some numa magic. Why is it important?
My guess is that he wants to test the software surrounding NUMA on a
non-NUMA (or different-NUMA) machine, perhaps in order to shake out bugs
before the corresponding hardware is available.
Thanx, Paul
* Re: [RFC,5/7] NUMA hotplug emulator
2010-05-16 17:45 ` Paul E. McKenney
@ 2010-05-17 2:42 ` Haicheng Li
0 siblings, 0 replies; 11+ messages in thread
From: Haicheng Li @ 2010-05-17 2:42 UTC (permalink / raw)
To: paulmck
Cc: Pavel Machek, akpm, linux-mm, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, Andi Kleen, Hidetoshi Seto, Len Brown,
Rafael J. Wysocki, Yinghai Lu, Thomas Renninger, David Rientjes,
Mel Gorman, Venkatesh Pallipadi, Alex Chiang, Tejun Heo,
Christoph Lameter, Greg Kroah-Hartman, Stephen Rothwell,
Benjamin Herrenschmidt, Shaohua Li, Hugh Dickins, James Bottomley,
linux-kernel, linux-pm, linux-acpi, fengguang.wu, shaohui.zheng
Paul E. McKenney wrote:
> On Fri, May 07, 2010 at 04:11:42PM +0200, Pavel Machek wrote:
>> Hi!
>>
>>> hotplug emulator: Abstract cpu register functions
>>>
>>> Abstract function arch_register_cpu and register_cpu, move the implementation
>>> details to a sub function with prefix "__".
>>>
>>> each of the sub function has an extra parameter nid, it can be used to register
>>> CPU under a fake NUMA node, it is a reserved interface for cpu hotplug emulation
>>> (CPU PROBE/RELEASE) in x86.
>> I don't get it. CPU hotplug can already be tested using echo 0/1 >
>> online, and that works on 386. How is this different?
"echo 0/1 > online" is logical cpu online/offline.
The emulator is intended to emulate physical add/remove of CPUs.
The two cover different code paths.
Details of the terms are in $KERN_SRC/Documentation/cpu-hotplug.txt.
>> It seems to add some numa magic. Why is it important?
In the real world, NUMA affinity info for the CPUs is required for physical CPU hot-add/remove,
which in turn affects the related data structures and code paths. The emulator needs the ability
to emulate it.
> My guess is that he wants to test the software surrounding NUMA on a
> non-NUMA (or different-NUMA) machine, perhaps in order to shake out bugs
> before the corresponding hardware is available.
This is one of the purposes. Automated tests and debugging can both benefit from such emulation.
-haicheng
* Re: [RFC,5/7] NUMA hotplug emulator
2010-05-07 14:11 ` Pavel Machek
2010-05-16 17:45 ` Paul E. McKenney
@ 2010-05-17 3:37 ` Shaohui Zheng
2010-05-17 9:39 ` Andi Kleen
2010-05-17 9:38 ` Andi Kleen
2 siblings, 1 reply; 11+ messages in thread
From: Shaohui Zheng @ 2010-05-17 3:37 UTC (permalink / raw)
To: Pavel Machek, lethal, Nathan Fontenot, Paul E. McKenney
Cc: akpm, linux-mm, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
Andi Kleen, Hidetoshi Seto, Len Brown, Rafael J. Wysocki,
Yinghai Lu, Thomas Renninger, David Rientjes, Mel Gorman,
Venkatesh Pallipadi, Alex Chiang, Tejun Heo, Christoph Lameter,
Greg Kroah-Hartman, Stephen Rothwell, Benjamin Herrenschmidt,
Shaohua Li, Jean Delvare, Hugh Dickins, James Bottomley,
linux-kernel, linux-pm, linux-acpi, fengguang.wu, haicheng.li,
shaohui.zheng, linux-hotplug
On Fri, May 07, 2010 at 04:11:42PM +0200, Pavel Machek wrote:
> Hi!
>
> > hotplug emulator: Abstract cpu register functions
> >
> > Abstract function arch_register_cpu and register_cpu, move the implementation
> > details to a sub function with prefix "__".
> >
> > each of the sub function has an extra parameter nid, it can be used to register
> > CPU under a fake NUMA node, it is a reserved interface for cpu hotplug emulation
> > (CPU PROBE/RELEASE) in x86.
>
> I don't get it. CPU hotplug can already be tested using echo 0/1 >
> online, and that works on 386. How is this different?
>
> It seems to add some numa magic. Why is it important?
Pavel,
it is not easy to understand the full story if you have not worked on this project, so the
question is understandable. Let me give a simple introduction to the background.
There are two different concepts to understand in order to see why we developed
the hotplug emulator.
- CPU logical online/offline
This is the existing feature that you mentioned: we can online/offline CPUs through the sysfs
interface /sys/devices/system/cpu/cpuX/online (X is an integer that stands for the CPU number):
echo 0/1 > /sys/devices/system/cpu/cpuX/online
This is logical CPU online/offline. When we do such an operation, the CPU is already plugged
into the motherboard and the OS has initialized it: the data structures and the CPU entries in sysfs
have been created, and the CPU present mask and possible mask have been set. It does not involve any
physical hardware change. The CPU status goes from offline to online, and the CPU is ready for the
scheduler to run processes on it.
CPU online/offline is controlled by the kernel option CONFIG_HOTPLUG_CPU.
- CPU hot-add/hot-remove
This is physical CPU hot-add/hot-remove on the motherboard, without shutting down the machine. After
the hot-add operation, the new CPU is powered on and the OS recognizes the new CPU through SCI
interrupts; the OS then initializes the new CPU, creates the related CPU structures, and creates sysfs
entries for it. Once all of that is done, the CPU is ready for logical online.
The process to hot-add a CPU:
1) The physical CPU is hot-added to the motherboard after the machine is powered on
2) The BIOS sends an SCI interrupt to notify the OS
3) The Linux hotplug handler parses the data from the acpi_handle
4) The hotplug handler initializes the CPU structures according to the CPU's ACPI data
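To make the difference between the two concepts concrete, here is a minimal userspace sketch in C. It is only a toy model, not kernel code: the names toy_cpu, toy_probe, toy_set_online and toy_release are made up for illustration, and the assumption that CPU 0 cannot be released simply mirrors the check in arch_cpu_release in the patch. It shows that logical online/offline only works on a CPU that has already been registered by a (real or emulated) hot-add.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Toy per-CPU state; this is not a kernel structure. */
struct toy_cpu {
    bool present;   /* registered: structures and sysfs entry exist */
    bool online;    /* available to the scheduler */
    int  nid;       /* NUMA node chosen at hot-add time */
};

static struct toy_cpu cpus[NR_CPUS];

/*
 * Physical hot-add (probe): register the first unregistered CPU on
 * node nid. Returns the CPU number, or -1 if no CPU is free.
 */
static int toy_probe(int nid)
{
    for (int i = 0; i < NR_CPUS; i++) {
        if (!cpus[i].present) {
            cpus[i].present = true;
            cpus[i].online = false;
            cpus[i].nid = nid;
            return i;
        }
    }
    return -1;
}

/* Logical online/offline: only valid for an already registered CPU. */
static int toy_set_online(int cpu, bool online)
{
    if (cpu < 0 || cpu >= NR_CPUS || !cpus[cpu].present)
        return -1;
    cpus[cpu].online = online;
    return 0;
}

/* Physical hot-remove (release): take the CPU offline, then unregister it. */
static int toy_release(int cpu)
{
    /* CPU 0 is treated as not hot-removable in this model. */
    if (cpu <= 0 || cpu >= NR_CPUS || !cpus[cpu].present)
        return -1;
    cpus[cpu].online = false;
    cpus[cpu].present = false;
    return 0;
}
```

In this model, calling toy_set_online() on a CPU that was never probed fails, which is the situation "echo 1 > online" cannot cover on its own.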
Current situation:
1) Provide developers with an environment
Very little hardware supports CPU hot-add/hot-remove, so we need to create a working environment
in which developers can write and debug hotplug code even though they do not have such hardware on hand.
That is exactly what the NUMA hotplug emulator does. "Physical hotplug emulator" would be a better name.
We had two solutions to this problem, and this one was finally selected; if you want to know
more about the solutions, we can continue on this thread.
2) Offer an automated test interface for the Linux CPU hot-add/hot-remove code
The Linux hot-add/hot-remove code has obvious bugs, but we do not see any automated test suite for it,
even in the LTP project (LTP has a hotplug suite for logical CPU online/offline).
Testing the physical hot-add/hot-remove code in an automated way is known to be difficult, but the hotplug
emulator does a good job of it. We reproduced all the major hotplug bugs with the internal emulator
v2 and v3.
We are sharing it with the community in the hope that more expertise and talent will be drawn into it. We
want to show an example of software emulation and hope that more people benefit from it; that is the
purpose of this group of patches.
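As a rough idea of what such an automated test could look like, here is a small C sketch of a probe/release stress loop. It assumes the probe and release files described in the patch (under /sys/devices/system/cpu) and takes the directory as a parameter so it can be tried against any directory; the helper names write_int and hotplug_stress are made up for this example. The CPU number to release is supplied by the caller, since the probe file does not report which CPU was added.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a decimal integer followed by a newline to dir/name.
 * Returns 0 on success, -1 on failure.
 */
static int write_int(const char *dir, const char *name, int value)
{
    char path[512];
    FILE *f;
    int n, ok;

    n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    f = fopen(path, "w");
    if (!f)
        return -1;

    ok = fprintf(f, "%d\n", value) > 0;
    /* Buffered write errors only show up when the stream is closed. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Hot-add a CPU to node nid, then hot-remove CPU number cpu, and repeat.
 * Returns the number of rounds that completed without a write error.
 */
static int hotplug_stress(const char *cpu_dir, int nid, int cpu, int rounds)
{
    int done = 0;

    for (int i = 0; i < rounds; i++) {
        if (write_int(cpu_dir, "probe", nid) != 0)
            break;
        if (write_int(cpu_dir, "release", cpu) != 0)
            break;
        done++;
    }
    return done;
}
```

On a kernel with this RFC applied, something like hotplug_stress("/sys/devices/system/cpu", 1, 4, 100) run as root would be the intended use; the node and CPU numbers here are only placeholders.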
PowerPC support
For ppc, it was added about half a year ago by Nathan Fontenot, but x86 does not have such a feature.
Thanks to lethal for mentioning it; we have already done some research on it, and I will reply in another
thread.
commit 12633e803a2a556f6469e0933d08233d0844a2d9
Author: Nathan Fontenot <nfont@austin.ibm.com>
Date: Wed Nov 25 17:23:25 2009 +0000
commit 1a8061c46c46c960f715c597b9d279ea2ba42bd9
Author: Nathan Fontenot <nfont@austin.ibm.com>
Date: Tue Nov 24 21:13:32 2009 +0000
We inherit the naming style from ppc: CPU hot-add/hot-remove is called CPU probe/release in the kernel, and
it is controlled by the kernel option CONFIG_ARCH_CPU_PROBE_RELEASE.
--
Thanks & Regards,
Shaohui
* Re: [RFC,5/7] NUMA hotplug emulator
2010-05-07 14:11 ` Pavel Machek
2010-05-16 17:45 ` Paul E. McKenney
2010-05-17 3:37 ` Shaohui Zheng
@ 2010-05-17 9:38 ` Andi Kleen
2 siblings, 0 replies; 11+ messages in thread
From: Andi Kleen @ 2010-05-17 9:38 UTC (permalink / raw)
To: Pavel Machek
Cc: akpm, linux-mm, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
Hidetoshi Seto, Len Brown, Rafael J. Wysocki, Yinghai Lu,
Thomas Renninger, David Rientjes, Mel Gorman, Venkatesh Pallipadi,
Alex Chiang, Tejun Heo, Christoph Lameter, Greg Kroah-Hartman,
Stephen Rothwell, Benjamin Herrenschmidt, Shaohua Li,
Jean Delvare, Hugh Dickins, James Bottomley, Paul E. McKenney,
linux-kernel, linux-pm, linux-acpi, fengguang.wu, haicheng.li,
shaohui.zheng
Pavel Machek wrote:
> Hi!
>
>> hotplug emulator: Abstract cpu register functions
>>
>> Abstract function arch_register_cpu and register_cpu, move the implementation
>> details to a sub function with prefix "__".
>>
>> each of the sub function has an extra parameter nid, it can be used to register
>> CPU under a fake NUMA node, it is a reserved interface for cpu hotplug emulation
>> (CPU PROBE/RELEASE) in x86.
>
> I don't get it. CPU hotplug can already be tested using echo 0/1>
> online, and that works on 386. How is this different?
It tests a different code path.
> It seems to add some numa magic. Why is it important?
It tests memory and node hotadd too. Memory/node hotadd is a quite problematic
feature and needs all the testing support it can get.
-Andi
* Re: [RFC,5/7] NUMA hotplug emulator
2010-05-17 3:37 ` Shaohui Zheng
@ 2010-05-17 9:39 ` Andi Kleen
0 siblings, 0 replies; 11+ messages in thread
From: Andi Kleen @ 2010-05-17 9:39 UTC (permalink / raw)
To: Pavel Machek, lethal, Nathan Fontenot, Paul E. McKenney, akpm,
linux-mm, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
Hidetoshi Seto, Len Brown, Rafael J. Wysocki, Yinghai Lu,
Thomas Renninger, David Rientjes, Mel Gorman, Venkatesh Pallipadi,
Alex Chiang, Tejun Heo, Christoph Lameter, Greg Kroah-Hartman,
Stephen Rothwell, Benjamin Herrenschmidt, Shaohua Li,
Jean Delvare, Hugh Dickins, James Bottomley, linux-kernel,
linux-pm, linux-acpi, fengguang.wu, haicheng.li, shaohui.zheng,
linux-hotplug
>
> PowerPC support
> For ppc, it was added about half a year ago by Nathan Fontenot, but x86 does not have such a feature.
> Thanks to lethal for mentioning it; we have already done some research on it, and I will reply in another
> thread.
Again the probe interface only covers part of the code, not ACPI for example.
-Andi
* Re: [RFC,5/7] NUMA hotplug emulator
2010-05-14 5:49 ` Paul Mundt
@ 2010-05-18 9:03 ` Shaohui Zheng
0 siblings, 0 replies; 11+ messages in thread
From: Shaohui Zheng @ 2010-05-18 9:03 UTC (permalink / raw)
To: Paul Mundt
Cc: akpm, linux-mm, linux-kernel, ak, fengguang.wu, haicheng.li,
shaohui.zheng
On Fri, May 14, 2010 at 02:49:28PM +0900, Paul Mundt wrote:
> On Thu, May 13, 2010 at 08:14:57PM +0800, Shaohui Zheng wrote:
> > hotplug emulator: support cpu probe/release in x86
> >
> > Add cpu interface probe/release under sysfs for x86. User can use this
> > interface to emulate the cpu hot-add process, it is for cpu hotplug
> > test purpose. Add a kernel option CONFIG_ARCH_CPU_PROBE_RELEASE for this
> > feature.
> >
> > This interface provides a mechanism to emulate cpu hotplug with software
> > methods, it becomes possible to do cpu hotplug automation and stress
> > testing.
> >
> At a quick glance, is this really necessary? It seems like you could
> easily replace most of this with a CPU notifier chain that takes care of
> the node handling. See for example how ppc64 manages the CPU hotplug/numa
> emulation case in arch/powerpc/mm/numa.c. arch_register_cpu() just looks
> like some topology hack for ACPI, it would be nice not to perpetuate that
> too much.
Paul,
When we reviewed the possible solutions for the emulator, we did some research
into the ppc hotplug interface. I do not have a ppc background, so it is hard for me to understand
all the details, but we did get clues from it, which led to the emulator you see today.
We are *NOT* looking for a simple way to probe a CPU; we are trying to emulate the
behavior in software, and we expect the same result when we do the same operation on real
hardware and on the emulator. That is the reason we did not select a CPU notifier chain. You can
see that the CPU probe process is almost the same as CPU physical hot-add; the only difference
is that some functions are replaced with versions carrying an '_emu' suffix. These '_emu' functions
do the same job as the original ones, but they do not refer to any acpi_handle data, since the hot-add
event is fake.
For example:
register_cpu & register_cpu_emu
arch_register_cpu & arch_register_cpu_emu
acpi_map_lsapic & acpi_map_lsapic_emu
The nid and apic_id are parsed from the acpi_handle, but for a fake hot-add we do not
have such data, so we deleted the parser code and replaced it with a parameter.
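The pairing can be sketched in plain userspace C as follows. This is only an illustration of the pattern, not the kernel code: the toy_ names, the boot_node table (a stand-in for what cpu_to_node() would return) and the registered_node array are all made up for the example. The normal entry point looks the node up itself, the '_emu' entry point takes it as a parameter, and both share one common helper, which corresponds to the "__"-prefixed function in the patch.

```c
#include <assert.h>

#define NR_CPUS 8

/* Stand-in for the firmware-provided CPU-to-node mapping. */
static const int boot_node[NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Node under which each CPU was last registered in this toy model. */
static int registered_node[NR_CPUS];

/* Common helper shared by both entry points. */
static int toy_register_common(int num, int nid)
{
    if (num < 0 || num >= NR_CPUS || nid < 0)
        return -1;
    registered_node[num] = nid;
    return 0;
}

/* Normal path: the node is derived from the platform mapping. */
static int toy_register_cpu(int num)
{
    if (num < 0 || num >= NR_CPUS)
        return -1;
    return toy_register_common(num, boot_node[num]);
}

/* Emulated path: the caller supplies the node, which may be a fake one. */
static int toy_register_cpu_emu(int num, int nid)
{
    return toy_register_common(num, nid);
}
```

The point of keeping a shared helper is that the emulated path runs the same registration code as the real one and differs only in where the node id comes from.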
I believe your method can successfully probe a CPU, but it is obviously different from the CPU hot-add
process: it behaves differently from the real hardware, which is not what we expect. That would be a
failure of the emulation.
ppc does not care about the ACPI data, which is why it seems to be simple.
--
Thanks & Regards,
Shaohui