* [PATCH v3 14/52] powerpc, sysfs: Fix CPU hotplug callback registration
From: Srivatsa S. Bhat @ 2014-03-10 20:36 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, walken, linux, linux-pm, Olof Johansson,
linux-kernel, Wang Dongsheng, linuxppc-dev, Madhavan Srinivasan,
Paul Mackerras, Srivatsa S. Bhat, tj, paulmck, linuxppc-dev,
Ingo Molnar
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
	get_online_cpus();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	register_cpu_notifier(&foobar_cpu_notifier);
	put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Instead, the correct and race-free way of performing the callback
registration is:
	cpu_notifier_register_begin();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);
	cpu_notifier_register_done();
Fix the sysfs code in powerpc by using this latter form of callback
registration.
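
For readers who want to experiment with the pattern outside the kernel, here
is a minimal userspace sketch of the "begin / init / lockless register / done"
sequence described above. It is only a model under stated assumptions: every
model_* name is hypothetical, a boolean flag stands in for
cpu_add_remove_lock, and a fixed-size array stands in for the notifier chain.
It does not reproduce the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS		4	/* hypothetical number of online CPUs */
#define MODEL_MAX_NOTIFIERS	8	/* hypothetical chain capacity */

typedef int (*model_notifier_fn)(unsigned int cpu);

static bool model_lock_held;			/* stands in for cpu_add_remove_lock */
static bool model_cpu_ready[MODEL_NR_CPUS];	/* per-CPU initialization state */
static model_notifier_fn model_chain[MODEL_MAX_NOTIFIERS];
static size_t model_chain_len;

/* Models cpu_notifier_register_begin(): enter the critical section. */
static void model_register_begin(void)
{
	assert(!model_lock_held);	/* a real mutex would block here instead */
	model_lock_held = true;
}

/* Models cpu_notifier_register_done(): leave the critical section. */
static void model_register_done(void)
{
	assert(model_lock_held);
	model_lock_held = false;
}

/* Models the double-underscored API: the caller must already hold the lock. */
static int model_register_locked(model_notifier_fn fn)
{
	assert(model_lock_held);
	if (model_chain_len == MODEL_MAX_NOTIFIERS)
		return -1;
	model_chain[model_chain_len++] = fn;
	return 0;
}

/* Per-CPU initialization, also used as the hotplug callback. */
static int model_cpu_callback(unsigned int cpu)
{
	model_cpu_ready[cpu] = true;
	return 0;
}

/* Initialize every CPU and register the callback in one critical section. */
static int model_subsys_init(void)
{
	unsigned int cpu;
	int err;

	model_register_begin();
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_cpu_callback(cpu);
	err = model_register_locked(model_cpu_callback);
	model_register_done();
	return err;
}
```

Calling model_subsys_init() once leaves all model CPUs initialized and one
callback on the chain. The point of the single critical section is that no
(modeled) hotplug operation can slip in between the initialization loop and
the registration.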
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Olof Johansson <olof@lixom.net>
Cc: Wang Dongsheng <dongsheng.wang@freescale.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Acked-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
arch/powerpc/kernel/sysfs.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 97e1dc9..d90d4b7 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -975,7 +975,8 @@ static int __init topology_init(void)
int cpu;
register_nodes();
- register_cpu_notifier(&sysfs_cpu_nb);
+
+ cpu_notifier_register_begin();
for_each_possible_cpu(cpu) {
struct cpu *c = &per_cpu(cpu_devices, cpu);
@@ -999,6 +1000,11 @@ static int __init topology_init(void)
if (cpu_online(cpu))
register_cpu_online(cpu);
}
+
+ __register_cpu_notifier(&sysfs_cpu_nb);
+
+ cpu_notifier_register_done();
+
#ifdef CONFIG_PPC64
sysfs_create_dscr_default();
#endif /* CONFIG_PPC64 */
* [PATCH v3 13/52] sparc, sysfs: Fix CPU hotplug callback registration
From: Srivatsa S. Bhat @ 2014-03-10 20:36 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, sparclinux, walken, linux, linux-pm,
linux-kernel, David S. Miller, linuxppc-dev, Srivatsa S. Bhat, tj,
paulmck, Ingo Molnar
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
	get_online_cpus();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	register_cpu_notifier(&foobar_cpu_notifier);
	put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Instead, the correct and race-free way of performing the callback
registration is:
	cpu_notifier_register_begin();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);
	cpu_notifier_register_done();
Fix the sysfs code in sparc by using this latter form of callback
registration.
Cc: Ingo Molnar <mingo@kernel.org>
Cc: sparclinux@vger.kernel.org
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
arch/sparc/kernel/sysfs.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/sparc/kernel/sysfs.c b/arch/sparc/kernel/sysfs.c
index c21c673..a364000 100644
--- a/arch/sparc/kernel/sysfs.c
+++ b/arch/sparc/kernel/sysfs.c
@@ -300,7 +300,7 @@ static int __init topology_init(void)
check_mmu_stats();
- register_cpu_notifier(&sysfs_cpu_nb);
+ cpu_notifier_register_begin();
for_each_possible_cpu(cpu) {
struct cpu *c = &per_cpu(cpu_devices, cpu);
@@ -310,6 +310,10 @@ static int __init topology_init(void)
register_cpu_online(cpu);
}
+ __register_cpu_notifier(&sysfs_cpu_nb);
+
+ cpu_notifier_register_done();
+
return 0;
}
* [PATCH v3 12/52] s390, smp: Fix CPU hotplug callback registration
From: Srivatsa S. Bhat @ 2014-03-10 20:35 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, walken, linux, linux-pm, linux-s390,
Heiko Carstens, linux-kernel, Ingo Molnar, linuxppc-dev,
Srivatsa S. Bhat, tj, Thomas Gleixner, paulmck,
Martin Schwidefsky
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
	get_online_cpus();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	register_cpu_notifier(&foobar_cpu_notifier);
	put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Instead, the correct and race-free way of performing the callback
registration is:
	cpu_notifier_register_begin();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);
	cpu_notifier_register_done();
Fix the smp code in s390 by using this latter form of callback registration.
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
arch/s390/kernel/smp.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index a7125b6..e10be35 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -1057,19 +1057,24 @@ static DEVICE_ATTR(rescan, 0200, NULL, rescan_store);
static int __init s390_smp_init(void)
{
- int cpu, rc;
+ int cpu, rc = 0;
- hotcpu_notifier(smp_cpu_notify, 0);
#ifdef CONFIG_HOTPLUG_CPU
rc = device_create_file(cpu_subsys.dev_root, &dev_attr_rescan);
if (rc)
return rc;
#endif
+ cpu_notifier_register_begin();
for_each_present_cpu(cpu) {
rc = smp_add_present_cpu(cpu);
if (rc)
- return rc;
+ goto out;
}
- return 0;
+
+ __hotcpu_notifier(smp_cpu_notify, 0);
+
+out:
+ cpu_notifier_register_done();
+ return rc;
}
subsys_initcall(s390_smp_init);
* [PATCH v3 11/52] s390, cacheinfo: Fix CPU hotplug callback registration
From: Srivatsa S. Bhat @ 2014-03-10 20:35 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, walken, linux, linux-pm, linux-s390,
Heiko Carstens, linux-kernel, Ingo Molnar, linuxppc-dev,
Srivatsa S. Bhat, tj, paulmck, Martin Schwidefsky
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
	get_online_cpus();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	register_cpu_notifier(&foobar_cpu_notifier);
	put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Instead, the correct and race-free way of performing the callback
registration is:
	cpu_notifier_register_begin();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);
	cpu_notifier_register_done();
Fix the cacheinfo code in s390 by using this latter form of callback
registration.
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
arch/s390/kernel/cache.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/s390/kernel/cache.c b/arch/s390/kernel/cache.c
index 3a414c0..c0b03c2 100644
--- a/arch/s390/kernel/cache.c
+++ b/arch/s390/kernel/cache.c
@@ -378,9 +378,12 @@ static int __init cache_init(void)
if (!test_facility(34))
return 0;
cache_build_info();
+
+ cpu_notifier_register_begin();
for_each_online_cpu(cpu)
cache_add_cpu(cpu);
- hotcpu_notifier(cache_hotplug, 0);
+ __hotcpu_notifier(cache_hotplug, 0);
+ cpu_notifier_register_done();
return 0;
}
device_initcall(cache_init);
* [PATCH v3 10/52] arm, kvm: Fix CPU hotplug callback registration
From: Srivatsa S. Bhat @ 2014-03-10 20:35 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, walken, Russell King, kvm, linux-pm,
Gleb Natapov, linux-kernel, kvmarm, linuxppc-dev,
linux-arm-kernel, Srivatsa S. Bhat, tj, Paolo Bonzini, paulmck,
Ingo Molnar, Christoffer Dall
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
	get_online_cpus();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	register_cpu_notifier(&foobar_cpu_notifier);
	put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Instead, the correct and race-free way of performing the callback
registration is:
	cpu_notifier_register_begin();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);
	cpu_notifier_register_done();
Fix the kvm code in arm by using this latter form of callback registration.
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: kvmarm@lists.cs.columbia.edu
Cc: kvm@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
arch/arm/kvm/arm.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index bd18bb8..f0e50a0 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -1051,21 +1051,26 @@ int kvm_arch_init(void *opaque)
}
}
+ cpu_notifier_register_begin();
+
err = init_hyp_mode();
if (err)
goto out_err;
- err = register_cpu_notifier(&hyp_init_cpu_nb);
+ err = __register_cpu_notifier(&hyp_init_cpu_nb);
if (err) {
kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
goto out_err;
}
+ cpu_notifier_register_done();
+
hyp_cpu_pm_init();
kvm_coproc_table_init();
return 0;
out_err:
+ cpu_notifier_register_done();
return err;
}
* [PATCH v3 09/52] arm, hw-breakpoint: Fix CPU hotplug callback registration
From: Srivatsa S. Bhat @ 2014-03-10 20:35 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, walken, Russell King, linux-pm, Will Deacon,
linux-kernel, linuxppc-dev, Srivatsa S. Bhat, tj, paulmck,
Ingo Molnar, linux-arm-kernel
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
	get_online_cpus();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	register_cpu_notifier(&foobar_cpu_notifier);
	put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Instead, the correct and race-free way of performing the callback
registration is:
	cpu_notifier_register_begin();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);
	cpu_notifier_register_done();
Fix the hw-breakpoint code in arm by using this latter form of callback
registration.
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
arch/arm/kernel/hw_breakpoint.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
index 3d44660..3702de8 100644
--- a/arch/arm/kernel/hw_breakpoint.c
+++ b/arch/arm/kernel/hw_breakpoint.c
@@ -1072,6 +1072,8 @@ static int __init arch_hw_breakpoint_init(void)
core_num_brps = get_num_brps();
core_num_wrps = get_num_wrps();
+ cpu_notifier_register_begin();
+
/*
* We need to tread carefully here because DBGSWENABLE may be
* driven low on this core and there isn't an architected way to
@@ -1088,6 +1090,7 @@ static int __init arch_hw_breakpoint_init(void)
if (!cpumask_empty(&debug_err_mask)) {
core_num_brps = 0;
core_num_wrps = 0;
+ cpu_notifier_register_done();
return 0;
}
@@ -1107,7 +1110,10 @@ static int __init arch_hw_breakpoint_init(void)
TRAP_HWBKPT, "breakpoint debug exception");
/* Register hotplug and PM notifiers. */
- register_cpu_notifier(&dbg_reset_nb);
+ __register_cpu_notifier(&dbg_reset_nb);
+
+ cpu_notifier_register_done();
+
pm_init();
return 0;
}
* [PATCH v3 08/52] ia64, err-inject: Fix CPU hotplug callback registration
From: Srivatsa S. Bhat @ 2014-03-10 20:35 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, walken, linux, linux-pm, linux-kernel,
Fenghua Yu, linuxppc-dev, Tony Luck, Srivatsa S. Bhat, tj,
linux-ia64, paulmck, Ingo Molnar
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
	get_online_cpus();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	register_cpu_notifier(&foobar_cpu_notifier);
	put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Instead, the correct and race-free way of performing the callback
registration is:
	cpu_notifier_register_begin();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);
	cpu_notifier_register_done();
Fix the error injection code in ia64 by using this latter form of callback
registration.
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-ia64@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
arch/ia64/kernel/err_inject.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/arch/ia64/kernel/err_inject.c b/arch/ia64/kernel/err_inject.c
index f59c0b8..0c161ed 100644
--- a/arch/ia64/kernel/err_inject.c
+++ b/arch/ia64/kernel/err_inject.c
@@ -269,12 +269,17 @@ err_inject_init(void)
#ifdef ERR_INJ_DEBUG
printk(KERN_INFO "Enter error injection driver.\n");
#endif
+
+ cpu_notifier_register_begin();
+
for_each_online_cpu(i) {
err_inject_cpu_callback(&err_inject_cpu_notifier, CPU_ONLINE,
(void *)(long)i);
}
- register_hotcpu_notifier(&err_inject_cpu_notifier);
+ __register_hotcpu_notifier(&err_inject_cpu_notifier);
+
+ cpu_notifier_register_done();
return 0;
}
@@ -288,11 +293,17 @@ err_inject_exit(void)
#ifdef ERR_INJ_DEBUG
printk(KERN_INFO "Exit error injection driver.\n");
#endif
+
+ cpu_notifier_register_begin();
+
for_each_online_cpu(i) {
sys_dev = get_cpu_device(i);
sysfs_remove_group(&sys_dev->kobj, &err_inject_attr_group);
}
- unregister_hotcpu_notifier(&err_inject_cpu_notifier);
+
+ __unregister_hotcpu_notifier(&err_inject_cpu_notifier);
+
+ cpu_notifier_register_done();
}
module_init(err_inject_init);
* [PATCH v3 07/52] ia64, topology: Fix CPU hotplug callback registration
From: Srivatsa S. Bhat @ 2014-03-10 20:34 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, walken, linux, linux-pm, linux-kernel,
Fenghua Yu, linuxppc-dev, Tony Luck, Srivatsa S. Bhat, tj,
linux-ia64, paulmck, Ingo Molnar
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
	get_online_cpus();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	register_cpu_notifier(&foobar_cpu_notifier);
	put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Instead, the correct and race-free way of performing the callback
registration is:
	cpu_notifier_register_begin();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);
	cpu_notifier_register_done();
Fix the topology code in ia64 by using this latter form of callback
registration.
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-ia64@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
arch/ia64/kernel/topology.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/ia64/kernel/topology.c b/arch/ia64/kernel/topology.c
index ca69a5a..f295f9a 100644
--- a/arch/ia64/kernel/topology.c
+++ b/arch/ia64/kernel/topology.c
@@ -454,12 +454,16 @@ static int __init cache_sysfs_init(void)
{
int i;
+ cpu_notifier_register_begin();
+
for_each_online_cpu(i) {
struct device *sys_dev = get_cpu_device((unsigned int)i);
cache_add_dev(sys_dev);
}
- register_hotcpu_notifier(&cache_cpu_notifier);
+ __register_hotcpu_notifier(&cache_cpu_notifier);
+
+ cpu_notifier_register_done();
return 0;
}
* [PATCH v3 06/52] ia64, palinfo: Fix CPU hotplug callback registration
From: Srivatsa S. Bhat @ 2014-03-10 20:34 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, walken, linux, linux-pm, linux-kernel,
Fenghua Yu, linuxppc-dev, Tony Luck, Srivatsa S. Bhat, tj,
linux-ia64, paulmck, Ingo Molnar
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
	get_online_cpus();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	register_cpu_notifier(&foobar_cpu_notifier);
	put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Instead, the correct and race-free way of performing the callback
registration is:
	cpu_notifier_register_begin();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);
	cpu_notifier_register_done();
Fix the palinfo code in ia64 by using this latter form of callback
registration.
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-ia64@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
arch/ia64/kernel/palinfo.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/ia64/kernel/palinfo.c b/arch/ia64/kernel/palinfo.c
index ab33328..c39c3cd 100644
--- a/arch/ia64/kernel/palinfo.c
+++ b/arch/ia64/kernel/palinfo.c
@@ -996,13 +996,17 @@ palinfo_init(void)
if (!palinfo_dir)
return -ENOMEM;
+ cpu_notifier_register_begin();
+
/* Create palinfo dirs in /proc for all online cpus */
for_each_online_cpu(i) {
create_palinfo_proc_entries(i);
}
/* Register for future delivery via notify registration */
- register_hotcpu_notifier(&palinfo_cpu_notifier);
+ __register_hotcpu_notifier(&palinfo_cpu_notifier);
+
+ cpu_notifier_register_done();
return 0;
}
* [PATCH v3 05/52] ia64, salinfo: Fix hotplug callback registration
From: Srivatsa S. Bhat @ 2014-03-10 20:34 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, walken, linux, linux-pm, linux-kernel,
Fenghua Yu, linuxppc-dev, Tony Luck, Srivatsa S. Bhat, tj,
linux-ia64, paulmck, Ingo Molnar
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
	get_online_cpus();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	register_cpu_notifier(&foobar_cpu_notifier);
	put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Instead, the correct and race-free way of performing the callback
registration is:
	cpu_notifier_register_begin();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);
	cpu_notifier_register_done();
Fix the salinfo code in ia64 by using this latter form of callback
registration.
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-ia64@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
arch/ia64/kernel/salinfo.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/ia64/kernel/salinfo.c b/arch/ia64/kernel/salinfo.c
index 960a396..ee9719e 100644
--- a/arch/ia64/kernel/salinfo.c
+++ b/arch/ia64/kernel/salinfo.c
@@ -635,6 +635,8 @@ salinfo_init(void)
(void *)salinfo_entries[i].feature);
}
+ cpu_notifier_register_begin();
+
for (i = 0; i < ARRAY_SIZE(salinfo_log_name); i++) {
data = salinfo_data + i;
data->type = i;
@@ -669,7 +671,9 @@ salinfo_init(void)
salinfo_timer.function = &salinfo_timeout;
add_timer(&salinfo_timer);
- register_hotcpu_notifier(&salinfo_cpu_notifier);
+ __register_hotcpu_notifier(&salinfo_cpu_notifier);
+
+ cpu_notifier_register_done();
return 0;
}
* [PATCH v3 04/52] CPU hotplug, perf: Fix CPU hotplug callback registration
From: Srivatsa S. Bhat @ 2014-03-10 20:34 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, walken, linux, Arnaldo Carvalho de Melo,
linux-pm, Peter Zijlstra, linux-kernel, linuxppc-dev,
Paul Mackerras, Srivatsa S. Bhat, tj, paulmck, Ingo Molnar
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
	get_online_cpus();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	register_cpu_notifier(&foobar_cpu_notifier);
	put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Instead, the correct and race-free way of performing the callback
registration is:
	cpu_notifier_register_begin();
	for_each_online_cpu(cpu)
		init_cpu(cpu);
	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);
	cpu_notifier_register_done();
Fix the perf subsystem's hotplug notifier by using this latter form of
callback registration.
Also provide a bare-bones version of perf_cpu_notifier() that doesn't
invoke the notifiers for the already online CPUs. This would be useful
for subsystems that need to perform a different set of initialization
for the already online CPUs, or don't need the initialization altogether.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
include/linux/perf_event.h | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e56b07f..3356abc 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -835,6 +835,8 @@ do { \
{ .notifier_call = fn, .priority = CPU_PRI_PERF }; \
unsigned long cpu = smp_processor_id(); \
unsigned long flags; \
+ \
+ cpu_notifier_register_begin(); \
fn(&fn##_nb, (unsigned long)CPU_UP_PREPARE, \
(void *)(unsigned long)cpu); \
local_irq_save(flags); \
@@ -843,9 +845,21 @@ do { \
local_irq_restore(flags); \
fn(&fn##_nb, (unsigned long)CPU_ONLINE, \
(void *)(unsigned long)cpu); \
- register_cpu_notifier(&fn##_nb); \
+ __register_cpu_notifier(&fn##_nb); \
+ cpu_notifier_register_done(); \
} while (0)
+/*
+ * Bare-bones version of perf_cpu_notifier(), which doesn't invoke the
+ * callback for already online CPUs.
+ */
+#define __perf_cpu_notifier(fn) \
+do { \
+ static struct notifier_block fn##_nb = \
+ { .notifier_call = fn, .priority = CPU_PRI_PERF }; \
+ \
+ __register_cpu_notifier(&fn##_nb); \
+} while (0)
struct perf_pmu_events_attr {
struct device_attribute attr;
* [PATCH v3 01/52] CPU hotplug: Add lockdep annotations to get/put_online_cpus()
From: Srivatsa S. Bhat @ 2014-03-10 20:34 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, Gautham R. Shenoy, walken, linux, linux-pm,
linux-kernel, linuxppc-dev, Srivatsa S. Bhat, tj, paulmck,
Ingo Molnar
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
From: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Add lockdep annotations for get/put_online_cpus() and
cpu_hotplug_begin()/cpu_hotplug_done().
Cc: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
kernel/cpu.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index deff2e6..33caf5e 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -19,6 +19,7 @@
#include <linux/mutex.h>
#include <linux/gfp.h>
#include <linux/suspend.h>
+#include <linux/lockdep.h>
#include "smpboot.h"
@@ -57,17 +58,30 @@ static struct {
* an ongoing cpu hotplug operation.
*/
int refcount;
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ struct lockdep_map dep_map;
+#endif
} cpu_hotplug = {
.active_writer = NULL,
.lock = __MUTEX_INITIALIZER(cpu_hotplug.lock),
.refcount = 0,
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ .dep_map = {.name = "cpu_hotplug.lock" },
+#endif
};
+/* Lockdep annotations for get/put_online_cpus() and cpu_hotplug_begin/end() */
+#define cpuhp_lock_acquire_read() lock_map_acquire_read(&cpu_hotplug.dep_map)
+#define cpuhp_lock_acquire() lock_map_acquire(&cpu_hotplug.dep_map)
+#define cpuhp_lock_release() lock_map_release(&cpu_hotplug.dep_map)
+
void get_online_cpus(void)
{
might_sleep();
if (cpu_hotplug.active_writer == current)
return;
+ cpuhp_lock_acquire_read();
mutex_lock(&cpu_hotplug.lock);
cpu_hotplug.refcount++;
mutex_unlock(&cpu_hotplug.lock);
@@ -87,6 +101,7 @@ void put_online_cpus(void)
if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
wake_up_process(cpu_hotplug.active_writer);
mutex_unlock(&cpu_hotplug.lock);
+ cpuhp_lock_release();
}
EXPORT_SYMBOL_GPL(put_online_cpus);
@@ -117,6 +132,7 @@ void cpu_hotplug_begin(void)
{
cpu_hotplug.active_writer = current;
+ cpuhp_lock_acquire();
for (;;) {
mutex_lock(&cpu_hotplug.lock);
if (likely(!cpu_hotplug.refcount))
@@ -131,6 +147,7 @@ void cpu_hotplug_done(void)
{
cpu_hotplug.active_writer = NULL;
mutex_unlock(&cpu_hotplug.lock);
+ cpuhp_lock_release();
}
/*
* [PATCH v3 03/52] Doc/cpu-hotplug: Specify race-free way to register CPU hotplug callbacks
From: Srivatsa S. Bhat @ 2014-03-10 20:34 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, walken, linux, Rob Landley, linux-pm, linux-doc,
linux-kernel, linuxppc-dev, Srivatsa S. Bhat, tj, paulmck,
Ingo Molnar
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
Recommend the usage of the new CPU hotplug callback registration APIs
(__register_cpu_notifier() etc), when subsystems need to also perform
initialization for already online CPUs. Provide examples of correct
and race-free ways of achieving this, and point out the kinds of code
that are error-prone.
Cc: Rob Landley <rob@landley.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
Documentation/cpu-hotplug.txt | 45 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)
diff --git a/Documentation/cpu-hotplug.txt b/Documentation/cpu-hotplug.txt
index be675d2..a0b005d 100644
--- a/Documentation/cpu-hotplug.txt
+++ b/Documentation/cpu-hotplug.txt
@@ -312,12 +312,57 @@ things will happen if a notifier in path sent a BAD notify code.
Q: I don't see my action being called for all CPUs already up and running?
A: Yes, CPU notifiers are called only when new CPUs are on-lined or offlined.
If you need to perform some action for each cpu already in the system, then
+ do this:
for_each_online_cpu(i) {
foobar_cpu_callback(&foobar_cpu_notifier, CPU_UP_PREPARE, i);
foobar_cpu_callback(&foobar_cpu_notifier, CPU_ONLINE, i);
}
+ However, if you want to register a hotplug callback, as well as perform
+ some initialization for CPUs that are already online, then do this:
+
+ Version 1: (Correct)
+ ---------
+
+ cpu_notifier_register_begin();
+
+ for_each_online_cpu(i) {
+ foobar_cpu_callback(&foobar_cpu_notifier,
+ CPU_UP_PREPARE, i);
+ foobar_cpu_callback(&foobar_cpu_notifier,
+ CPU_ONLINE, i);
+ }
+
+ /* Note the use of the double underscored version of the API */
+ __register_cpu_notifier(&foobar_cpu_notifier);
+
+ cpu_notifier_register_done();
+
+ Note that the following code is *NOT* the right way to achieve this,
+ because it is prone to an ABBA deadlock between the cpu_add_remove_lock
+ and the cpu_hotplug.lock.
+
+ Version 2: (Wrong!)
+ ---------
+
+ get_online_cpus();
+
+ for_each_online_cpu(i) {
+ foobar_cpu_callback(&foobar_cpu_notifier,
+ CPU_UP_PREPARE, i);
+ foobar_cpu_callback(&foobar_cpu_notifier,
+ CPU_ONLINE, i);
+ }
+
+ register_cpu_notifier(&foobar_cpu_notifier);
+
+ put_online_cpus();
+
+ So always use the first version shown above when you want to register
+ callbacks as well as initialize the already online CPUs.
+
+
Q: If i would like to develop cpu hotplug support for a new architecture,
what do i need at a minimum?
A: The following are what is required for CPU hotplug infrastructure to work
* [PATCH v3 02/52] CPU hotplug: Provide lockless versions of callback registration functions
From: Srivatsa S. Bhat @ 2014-03-10 20:34 UTC
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, walken, linux, linux-pm, Peter Zijlstra,
Rafael J. Wysocki, linux-kernel, Ingo Molnar, linuxppc-dev,
Srivatsa S. Bhat, Oleg Nesterov, tj, Toshi Kani, Thomas Gleixner,
paulmck, Andrew Morton
In-Reply-To: <20140310203312.10746.310.stgit@srivatsabhat.in.ibm.com>
The following method of CPU hotplug callback registration is not safe
due to the possibility of an ABBA deadlock involving the cpu_add_remove_lock
and the cpu_hotplug.lock.
get_online_cpus();
for_each_online_cpu(cpu)
init_cpu(cpu);
register_cpu_notifier(&foobar_cpu_notifier);
put_online_cpus();
The deadlock is shown below:
CPU 0 CPU 1
----- -----
Acquire cpu_hotplug.lock
[via get_online_cpus()]
CPU online/offline operation
takes cpu_add_remove_lock
[via cpu_maps_update_begin()]
Try to acquire
cpu_add_remove_lock
[via register_cpu_notifier()]
CPU online/offline operation
tries to acquire cpu_hotplug.lock
[via cpu_hotplug_begin()]
*** DEADLOCK! ***
The problem here is that callback registration takes the locks in one order
whereas the CPU hotplug operations take the same locks in the opposite order.
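The ordering argument above can be checked mechanically. What follows is a minimal user-space sketch, not kernel code: it records which lock is taken inside which, in the spirit of a lock-order validator, and reports an inversion when both nestings have been seen. The names lock_order, record_nesting(), hotplug_operation(), unsafe_registration() and safe_registration() are hypothetical, and the model treats get_online_cpus() as holding the cpu_hotplug.lock for the whole section, as the diagram does.

```c
#include <assert.h>
#include <stdbool.h>

/* The two locks named in the commit message, as plain identifiers. */
enum lock_id { CPU_ADD_REMOVE_LOCK, CPU_HOTPLUG_LOCK, NR_LOCKS };

/* taken_inside[a][b] is true once some path took lock b while holding a. */
struct lock_order {
	bool taken_inside[NR_LOCKS][NR_LOCKS];
};

/*
 * Record "inner acquired while outer is held". Returns false when the
 * reverse nesting was recorded earlier, i.e. an ABBA inversion exists.
 */
static bool record_nesting(struct lock_order *lo, enum lock_id outer,
			   enum lock_id inner)
{
	assert(outer != inner);
	lo->taken_inside[outer][inner] = true;
	return !lo->taken_inside[inner][outer];
}

/* Hotplug path: cpu_maps_update_begin(), then cpu_hotplug_begin(). */
static bool hotplug_operation(struct lock_order *lo)
{
	return record_nesting(lo, CPU_ADD_REMOVE_LOCK, CPU_HOTPLUG_LOCK);
}

/* Old pattern: get_online_cpus(), then register_cpu_notifier(). */
static bool unsafe_registration(struct lock_order *lo)
{
	return record_nesting(lo, CPU_HOTPLUG_LOCK, CPU_ADD_REMOVE_LOCK);
}

/*
 * New pattern: only the cpu_add_remove_lock is taken around the whole
 * section and the lockless registration takes nothing further, so no
 * nesting is recorded at all.
 */
static bool safe_registration(struct lock_order *lo)
{
	(void)lo;
	return true;
}
```

In this model the old registration pattern is flagged as soon as a hotplug operation has been observed, while a registration that takes only the cpu_add_remove_lock never creates a second ordering.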
To avoid this issue and to provide a race-free method to register CPU hotplug
callbacks (along with initialization of already online CPUs), introduce new
variants of the callback registration APIs that simply register the callbacks
without holding the cpu_add_remove_lock during the registration. That way,
we can avoid the ABBA scenario. However, we will need to hold the
cpu_add_remove_lock throughout the entire critical section, to protect updates
to the callback/notifier chain.
This can be achieved by writing the callback registration code as follows:
cpu_maps_update_begin(); [ or cpu_notifier_register_begin(); see below ]
for_each_online_cpu(cpu)
init_cpu(cpu);
/* This doesn't take the cpu_add_remove_lock */
__register_cpu_notifier(&foobar_cpu_notifier);
cpu_maps_update_done(); [ or cpu_notifier_register_done(); see below ]
Note that we can't use get_online_cpus() here instead of cpu_maps_update_begin()
because the cpu_hotplug.lock is dropped during the invocation of CPU_POST_DEAD
notifiers, and hence get_online_cpus() cannot provide the necessary
synchronization to protect the callback/notifier chains against concurrent
reads and writes. On the other hand, since the cpu_add_remove_lock protects
the entire hotplug operation (including CPU_POST_DEAD), we can use
cpu_maps_update_begin/done() to guarantee proper synchronization.
Also, since cpu_maps_update_begin/done() is like a super-set of
get/put_online_cpus(), the former naturally protects the critical sections
from concurrent hotplug operations.
Since the names cpu_maps_update_begin/done() don't make much sense in CPU
hotplug callback registration scenarios, we'll introduce new APIs named
cpu_notifier_register_begin/done() and map them to cpu_maps_update_begin/done().
In summary, introduce the lockless variants of un/register_cpu_notifier() and
also export the cpu_notifier_register_begin/done() APIs for use by modules.
This way, we provide a race-free way to register hotplug callbacks as well as
perform initialization for the CPUs that are already online.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
include/linux/cpu.h | 47 +++++++++++++++++++++++++++++++++++++++++++++++
kernel/cpu.c | 21 +++++++++++++++++++--
2 files changed, 66 insertions(+), 2 deletions(-)
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 03e235ad..488d6eb 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -122,26 +122,46 @@ enum {
{ .notifier_call = fn, .priority = pri }; \
register_cpu_notifier(&fn##_nb); \
}
+
+#define __cpu_notifier(fn, pri) { \
+ static struct notifier_block fn##_nb = \
+ { .notifier_call = fn, .priority = pri }; \
+ __register_cpu_notifier(&fn##_nb); \
+}
#else /* #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
#define cpu_notifier(fn, pri) do { (void)(fn); } while (0)
+#define __cpu_notifier(fn, pri) do { (void)(fn); } while (0)
#endif /* #else #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
+
#ifdef CONFIG_HOTPLUG_CPU
extern int register_cpu_notifier(struct notifier_block *nb);
+extern int __register_cpu_notifier(struct notifier_block *nb);
extern void unregister_cpu_notifier(struct notifier_block *nb);
+extern void __unregister_cpu_notifier(struct notifier_block *nb);
#else
#ifndef MODULE
extern int register_cpu_notifier(struct notifier_block *nb);
+extern int __register_cpu_notifier(struct notifier_block *nb);
#else
static inline int register_cpu_notifier(struct notifier_block *nb)
{
return 0;
}
+
+static inline int __register_cpu_notifier(struct notifier_block *nb)
+{
+ return 0;
+}
#endif
static inline void unregister_cpu_notifier(struct notifier_block *nb)
{
}
+
+static inline void __unregister_cpu_notifier(struct notifier_block *nb)
+{
+}
#endif
int cpu_up(unsigned int cpu);
@@ -149,19 +169,32 @@ void notify_cpu_starting(unsigned int cpu);
extern void cpu_maps_update_begin(void);
extern void cpu_maps_update_done(void);
+#define cpu_notifier_register_begin cpu_maps_update_begin
+#define cpu_notifier_register_done cpu_maps_update_done
+
#else /* CONFIG_SMP */
#define cpu_notifier(fn, pri) do { (void)(fn); } while (0)
+#define __cpu_notifier(fn, pri) do { (void)(fn); } while (0)
static inline int register_cpu_notifier(struct notifier_block *nb)
{
return 0;
}
+static inline int __register_cpu_notifier(struct notifier_block *nb)
+{
+ return 0;
+}
+
static inline void unregister_cpu_notifier(struct notifier_block *nb)
{
}
+static inline void __unregister_cpu_notifier(struct notifier_block *nb)
+{
+}
+
static inline void cpu_maps_update_begin(void)
{
}
@@ -170,6 +203,14 @@ static inline void cpu_maps_update_done(void)
{
}
+static inline void cpu_notifier_register_begin(void)
+{
+}
+
+static inline void cpu_notifier_register_done(void)
+{
+}
+
#endif /* CONFIG_SMP */
extern struct bus_type cpu_subsys;
@@ -183,8 +224,11 @@ extern void put_online_cpus(void);
extern void cpu_hotplug_disable(void);
extern void cpu_hotplug_enable(void);
#define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri)
+#define __hotcpu_notifier(fn, pri) __cpu_notifier(fn, pri)
#define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
+#define __register_hotcpu_notifier(nb) __register_cpu_notifier(nb)
#define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
+#define __unregister_hotcpu_notifier(nb) __unregister_cpu_notifier(nb)
void clear_tasks_mm_cpumask(int cpu);
int cpu_down(unsigned int cpu);
@@ -197,9 +241,12 @@ static inline void cpu_hotplug_done(void) {}
#define cpu_hotplug_disable() do { } while (0)
#define cpu_hotplug_enable() do { } while (0)
#define hotcpu_notifier(fn, pri) do { (void)(fn); } while (0)
+#define __hotcpu_notifier(fn, pri) do { (void)(fn); } while (0)
/* These aren't inline functions due to a GCC bug. */
#define register_hotcpu_notifier(nb) ({ (void)(nb); 0; })
+#define __register_hotcpu_notifier(nb) ({ (void)(nb); 0; })
#define unregister_hotcpu_notifier(nb) ({ (void)(nb); })
+#define __unregister_hotcpu_notifier(nb) ({ (void)(nb); })
#endif /* CONFIG_HOTPLUG_CPU */
#ifdef CONFIG_PM_SLEEP_SMP
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 33caf5e..a9e710e 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -28,18 +28,23 @@
static DEFINE_MUTEX(cpu_add_remove_lock);
/*
- * The following two API's must be used when attempting
- * to serialize the updates to cpu_online_mask, cpu_present_mask.
+ * The following two APIs (cpu_maps_update_begin/done) must be used when
+ * attempting to serialize the updates to cpu_online_mask & cpu_present_mask.
+ * The APIs cpu_notifier_register_begin/done() must be used to protect CPU
+ * hotplug callback (un)registration performed using __register_cpu_notifier()
+ * or __unregister_cpu_notifier().
*/
void cpu_maps_update_begin(void)
{
mutex_lock(&cpu_add_remove_lock);
}
+EXPORT_SYMBOL(cpu_notifier_register_begin);
void cpu_maps_update_done(void)
{
mutex_unlock(&cpu_add_remove_lock);
}
+EXPORT_SYMBOL(cpu_notifier_register_done);
static RAW_NOTIFIER_HEAD(cpu_chain);
@@ -183,6 +188,11 @@ int __ref register_cpu_notifier(struct notifier_block *nb)
return ret;
}
+int __ref __register_cpu_notifier(struct notifier_block *nb)
+{
+ return raw_notifier_chain_register(&cpu_chain, nb);
+}
+
static int __cpu_notify(unsigned long val, void *v, int nr_to_call,
int *nr_calls)
{
@@ -206,6 +216,7 @@ static void cpu_notify_nofail(unsigned long val, void *v)
BUG_ON(cpu_notify(val, v));
}
EXPORT_SYMBOL(register_cpu_notifier);
+EXPORT_SYMBOL(__register_cpu_notifier);
void __ref unregister_cpu_notifier(struct notifier_block *nb)
{
@@ -215,6 +226,12 @@ void __ref unregister_cpu_notifier(struct notifier_block *nb)
}
EXPORT_SYMBOL(unregister_cpu_notifier);
+void __ref __unregister_cpu_notifier(struct notifier_block *nb)
+{
+ raw_notifier_chain_unregister(&cpu_chain, nb);
+}
+EXPORT_SYMBOL(__unregister_cpu_notifier);
+
/**
* clear_tasks_mm_cpumask - Safely clear tasks' mm_cpumask for a CPU
* @cpu: a CPU id
* [PATCH v3 00/52] CPU hotplug: Fix issues with callback registration
From: Srivatsa S. Bhat @ 2014-03-10 20:33 UTC (permalink / raw)
To: paulus, oleg, mingo, rjw, rusty, peterz, tglx, akpm
Cc: linux-arch, ego, walken, linux, linux-pm, linux-kernel,
linuxppc-dev, srivatsa.bhat, tj, paulmck
Hi,
Many subsystems and drivers have the need to register CPU hotplug callbacks
from their init routines and also perform initialization for the CPUs that are
already online. But unfortunately there is no race-free way to achieve this
today.
For example, consider this piece of code:
get_online_cpus();
for_each_online_cpu(cpu)
init_cpu(cpu);
register_cpu_notifier(&foobar_cpu_notifier);
put_online_cpus();
This is not safe because there is a possibility of an ABBA deadlock involving
the cpu_add_remove_lock and the cpu_hotplug.lock.
CPU 0 CPU 1
----- -----
Acquire cpu_hotplug.lock
[via get_online_cpus()]
CPU online/offline operation
takes cpu_add_remove_lock
[via cpu_maps_update_begin()]
Try to acquire
cpu_add_remove_lock
[via register_cpu_notifier()]
CPU online/offline operation
tries to acquire cpu_hotplug.lock
[via cpu_hotplug_begin()]
*** DEADLOCK! ***
Other combinations of callback registration also don't work correctly.
Examples:
register_cpu_notifier(&foobar_cpu_notifier);
get_online_cpus();
for_each_online_cpu(cpu)
init_cpu(cpu);
put_online_cpus();
This can lead to double initialization if a hotplug operation occurs after
registering the notifier and before invoking get_online_cpus().
On the other hand, the following piece of code can miss hotplug events
altogether:
get_online_cpus();
for_each_online_cpu(cpu)
init_cpu(cpu);
put_online_cpus();
^
| Race window; Can miss hotplug events here
v
register_cpu_notifier(&foobar_cpu_notifier);
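The two failure modes above can be replayed deterministically in a small user-space model. This is only a sketch under stated assumptions: struct model, hotplug_online() and the three scenario functions are hypothetical names, a "hotplug event" is simply a function call placed inside or outside the race window, and the third scenario assumes that holding the cpu_add_remove_lock keeps hotplug out of the whole section.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

struct model {
	bool online[NR_CPUS];
	int init_count[NR_CPUS];	/* how often init_cpu() ran per CPU */
	bool notifier_registered;
};

static void init_cpu(struct model *m, int cpu)
{
	m->init_count[cpu]++;
}

/* Stands in for the for_each_online_cpu() loop. */
static void init_online_cpus(struct model *m)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (m->online[cpu])
			init_cpu(m, cpu);
}

/* A CPU comes online; the callback runs only if it is registered. */
static void hotplug_online(struct model *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	m->online[cpu] = true;
	if (m->notifier_registered)
		init_cpu(m, cpu);
}

/* Register first, initialize afterwards: late_cpu is initialized twice. */
static int register_then_init(int late_cpu)
{
	struct model m = { .online = { true } };

	m.notifier_registered = true;
	hotplug_online(&m, late_cpu);	/* lands in the race window */
	init_online_cpus(&m);
	return m.init_count[late_cpu];
}

/* Initialize first, register afterwards: late_cpu is never initialized. */
static int init_then_register(int late_cpu)
{
	struct model m = { .online = { true } };

	init_online_cpus(&m);
	hotplug_online(&m, late_cpu);	/* lands in the race window */
	m.notifier_registered = true;
	return m.init_count[late_cpu];
}

/* Both steps under one lock: hotplug can only run before or after. */
static int locked_registration(int late_cpu)
{
	struct model m = { .online = { true } };

	/* cpu_notifier_register_begin() would be taken here */
	init_online_cpus(&m);
	m.notifier_registered = true;
	/* cpu_notifier_register_done() would be called here */
	hotplug_online(&m, late_cpu);
	return m.init_count[late_cpu];
}
```

Each scenario returns the number of times the late CPU was initialized, so a value other than one marks the broken orderings.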
To solve these issues and provide a race-free method to register CPU hotplug
callbacks, this patchset introduces new variants of the callback registration
APIs that don't hold the cpu_add_remove_lock, and exports the
cpu_add_remove_lock via 2 new APIs cpu_notifier_register_begin/done() for use
by various subsystems. With this in place, the following code snippet will
register a hotplug callback as well as initialize already online CPUs without
any race conditions.
cpu_notifier_register_begin();
for_each_online_cpu(cpu)
init_cpu(cpu);
/* This doesn't take the cpu_add_remove_lock */
__register_cpu_notifier(&foobar_cpu_notifier);
cpu_notifier_register_done();
In this patchset, patch 1 adds lockdep annotations to catch the above mentioned
deadlock scenario. Patch 2 introduces the new APIs and infrastructure necessary
for race-free callback registration. The remaining patches perform tree-wide
conversions (to use this model).
This patchset has been hosted in the below git tree. It applies cleanly on
v3.14-rc6.
git://github.com/srivatsabhat/linux.git cpuhp-registration-fixes-v3
Changes from v2:
* Collected more Acks from subsystem maintainers.
* Updated the xen-balloon patch and got Ack from Boris Ostrovsky.
Gautham R. Shenoy (1):
CPU hotplug: Add lockdep annotations to get/put_online_cpus()
Srivatsa S. Bhat (51):
CPU hotplug: Provide lockless versions of callback registration functions
Doc/cpu-hotplug: Specify race-free way to register CPU hotplug callbacks
CPU hotplug, perf: Fix CPU hotplug callback registration
ia64, salinfo: Fix hotplug callback registration
ia64, palinfo: Fix CPU hotplug callback registration
ia64, topology: Fix CPU hotplug callback registration
ia64, err-inject: Fix CPU hotplug callback registration
arm, hw-breakpoint: Fix CPU hotplug callback registration
arm, kvm: Fix CPU hotplug callback registration
s390, cacheinfo: Fix CPU hotplug callback registration
s390, smp: Fix CPU hotplug callback registration
sparc, sysfs: Fix CPU hotplug callback registration
powerpc, sysfs: Fix CPU hotplug callback registration
x86, msr: Fix CPU hotplug callback registration
x86, cpuid: Fix CPU hotplug callback registration
x86, vsyscall: Fix CPU hotplug callback registration
x86, intel, uncore: Fix CPU hotplug callback registration
x86, mce: Fix CPU hotplug callback registration
x86, therm_throt.c: Fix CPU hotplug callback registration
x86, therm_throt.c: Remove unused therm_cpu_lock
x86, amd, ibs: Fix CPU hotplug callback registration
x86, intel, cacheinfo: Fix CPU hotplug callback registration
x86, intel, rapl: Fix CPU hotplug callback registration
x86, amd, uncore: Fix CPU hotplug callback registration
x86, hpet: Fix CPU hotplug callback registration
x86, pci, amd-bus: Fix CPU hotplug callback registration
x86, oprofile, nmi: Fix CPU hotplug callback registration
x86, kvm: Fix CPU hotplug callback registration
arm64, hw_breakpoint.c: Fix CPU hotplug callback registration
arm64, debug-monitors: Fix CPU hotplug callback registration
powercap, intel-rapl: Fix CPU hotplug callback registration
scsi, bnx2i: Fix CPU hotplug callback registration
scsi, bnx2fc: Fix CPU hotplug callback registration
scsi, fcoe: Fix CPU hotplug callback registration
zsmalloc: Fix CPU hotplug callback registration
acpi-cpufreq: Fix CPU hotplug callback registration
drivers/base/topology.c: Fix CPU hotplug callback registration
clocksource, dummy-timer: Fix CPU hotplug callback registration
intel-idle: Fix CPU hotplug callback registration
oprofile, nmi-timer: Fix CPU hotplug callback registration
octeon, watchdog: Fix CPU hotplug callback registration
thermal, x86-pkg-temp: Fix CPU hotplug callback registration
hwmon, coretemp: Fix CPU hotplug callback registration
hwmon, via-cputemp: Fix CPU hotplug callback registration
xen, balloon: Fix CPU hotplug callback registration
trace, ring-buffer: Fix CPU hotplug callback registration
profile: Fix CPU hotplug callback registration
mm, vmstat: Fix CPU hotplug callback registration
mm, zswap: Fix CPU hotplug callback registration
net/core/flow.c: Fix CPU hotplug callback registration
net/iucv/iucv.c: Fix CPU hotplug callback registration
Documentation/cpu-hotplug.txt | 45 +++++++++
arch/arm/kernel/hw_breakpoint.c | 8 +-
arch/arm/kvm/arm.c | 7 +
arch/arm64/kernel/debug-monitors.c | 6 +
arch/arm64/kernel/hw_breakpoint.c | 7 +
arch/ia64/kernel/err_inject.c | 15 +++
arch/ia64/kernel/palinfo.c | 6 +
arch/ia64/kernel/salinfo.c | 6 +
arch/ia64/kernel/topology.c | 6 +
arch/powerpc/kernel/sysfs.c | 8 +-
arch/s390/kernel/cache.c | 5 +
arch/s390/kernel/smp.c | 13 ++-
arch/sparc/kernel/sysfs.c | 6 +
arch/x86/kernel/cpu/intel_cacheinfo.c | 13 ++-
arch/x86/kernel/cpu/mcheck/mce.c | 8 +-
arch/x86/kernel/cpu/mcheck/therm_throt.c | 18 +---
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 6 +
arch/x86/kernel/cpu/perf_event_amd_uncore.c | 7 +
arch/x86/kernel/cpu/perf_event_intel_rapl.c | 9 +-
arch/x86/kernel/cpu/perf_event_intel_uncore.c | 6 +
arch/x86/kernel/cpuid.c | 15 ++-
arch/x86/kernel/hpet.c | 4 +
arch/x86/kernel/msr.c | 16 ++-
arch/x86/kernel/vsyscall_64.c | 6 +
arch/x86/kvm/x86.c | 7 +
arch/x86/oprofile/nmi_int.c | 15 +++
arch/x86/pci/amd_bus.c | 5 +
drivers/base/topology.c | 12 ++
drivers/clocksource/dummy_timer.c | 11 ++
drivers/cpufreq/acpi-cpufreq.c | 7 +
drivers/hwmon/coretemp.c | 14 +--
drivers/hwmon/via-cputemp.c | 14 +--
drivers/idle/intel_idle.c | 12 ++
drivers/oprofile/nmi_timer_int.c | 23 +++--
drivers/powercap/intel_rapl.c | 10 ++
drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 12 ++
drivers/scsi/bnx2i/bnx2i_init.c | 12 ++
drivers/scsi/fcoe/fcoe.c | 15 +++
drivers/thermal/x86_pkg_temp_thermal.c | 14 +--
drivers/watchdog/octeon-wdt-main.c | 11 ++
drivers/xen/balloon.c | 36 +++++--
include/linux/cpu.h | 47 ++++++++++
include/linux/perf_event.h | 16 +++
kernel/cpu.c | 38 +++++++-
kernel/profile.c | 20 +++-
kernel/trace/ring_buffer.c | 19 ++--
mm/vmstat.c | 6 +
mm/zsmalloc.c | 17 +++-
mm/zswap.c | 8 +-
net/core/flow.c | 8 +-
net/iucv/iucv.c | 121 ++++++++++++-------------
51 files changed, 550 insertions(+), 226 deletions(-)
Regards,
Srivatsa S. Bhat
IBM Linux Technology Center
* Re: [PATCH RFC/RFT v3 6/9] powerpc: move cacheinfo sysfs to generic cacheinfo infrastructure
From: Sudeep Holla @ 2014-03-10 11:12 UTC (permalink / raw)
To: Anshuman Khandual
Cc: linuxppc-dev@lists.ozlabs.org, Paul Mackerras,
linux-kernel@vger.kernel.org, Sudeep Holla
In-Reply-To: <531963D9.5040701@linux.vnet.ibm.com>
Hi Anshuman,
On 07/03/14 06:14, Anshuman Khandual wrote:
> On 03/07/2014 09:36 AM, Anshuman Khandual wrote:
>> On 02/19/2014 09:36 PM, Sudeep Holla wrote:
>>> From: Sudeep Holla <sudeep.holla@arm.com>
>>>
>>> This patch removes the redundant sysfs cacheinfo code by making use of
>>> the newly introduced generic cacheinfo infrastructure.
>>>
>>> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
>>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>> Cc: Paul Mackerras <paulus@samba.org>
>>> Cc: linuxppc-dev@lists.ozlabs.org
>>> ---
>>> arch/powerpc/kernel/cacheinfo.c | 831 ++++++----------------------------------
>>> arch/powerpc/kernel/cacheinfo.h | 8 -
>>> arch/powerpc/kernel/sysfs.c | 4 -
>>> 3 files changed, 109 insertions(+), 734 deletions(-)
>>> delete mode 100644 arch/powerpc/kernel/cacheinfo.h
>>>
>>> diff --git a/arch/powerpc/kernel/cacheinfo.c b/arch/powerpc/kernel/cacheinfo.c
>>> index 2912b87..05b7580 100644
>>> --- a/arch/powerpc/kernel/cacheinfo.c
>>> +++ b/arch/powerpc/kernel/cacheinfo.c
>>> @@ -10,38 +10,10 @@
>>> * 2 as published by the Free Software Foundation.
>>> */
>>>
>>> +#include <linux/cacheinfo.h>
>>> #include <linux/cpu.h>
>>> -#include <linux/cpumask.h>
>>> #include <linux/kernel.h>
>>> -#include <linux/kobject.h>
>>> -#include <linux/list.h>
>>> -#include <linux/notifier.h>
>>> #include <linux/of.h>
>>> -#include <linux/percpu.h>
>>> -#include <linux/slab.h>
>>> -#include <asm/prom.h>
>>> -
>>> -#include "cacheinfo.h"
>>> -
>>> -/* per-cpu object for tracking:
>>> - * - a "cache" kobject for the top-level directory
>>> - * - a list of "index" objects representing the cpu's local cache hierarchy
>>> - */
>>> -struct cache_dir {
>>> -	struct kobject *kobj; /* bare (not embedded) kobject for cache
>>> -			 * directory */
>>> -	struct cache_index_dir *index; /* list of index objects */
>>> -};
>>> -
>>> -/* "index" object: each cpu's cache directory has an index
>>> - * subdirectory corresponding to a cache object associated with the
>>> - * cpu. This object's lifetime is managed via the embedded kobject.
>>> - */
>>> -struct cache_index_dir {
>>> -	struct kobject kobj;
>>> -	struct cache_index_dir *next; /* next index in parent directory */
>>> -	struct cache *cache;
>>> -};
>>>
>>> /* Template for determining which OF properties to query for a given
>>> * cache type */
>>> @@ -60,11 +32,6 @@ struct cache_type_info {
>>> 	const char *nr_sets_prop;
>>> };
>>>
>>> -/* These are used to index the cache_type_info array. */
>>> -#define CACHE_TYPE_UNIFIED 0
>>> -#define CACHE_TYPE_INSTRUCTION 1
>>> -#define CACHE_TYPE_DATA 2
>>> -
>>> static const struct cache_type_info cache_type_info[] = {
>>> 	{
>>> 		/* PowerPC Processor binding says the [di]-cache-*
>>> @@ -77,246 +44,115 @@ static const struct cache_type_info cache_type_info[] = {
>>> 		.nr_sets_prop = "d-cache-sets",
>>> 	},
>>> 	{
>>> -		.name = "Instruction",
>>> -		.size_prop = "i-cache-size",
>>> -		.line_size_props = { "i-cache-line-size",
>>> -				 "i-cache-block-size", },
>>> -		.nr_sets_prop = "i-cache-sets",
>>> -	},
>>> -	{
>>> 		.name = "Data",
>>> 		.size_prop = "d-cache-size",
>>> 		.line_size_props = { "d-cache-line-size",
>>> 				 "d-cache-block-size", },
>>> 		.nr_sets_prop = "d-cache-sets",
>>> 	},
>>> +	{
>>> +		.name = "Instruction",
>>> +		.size_prop = "i-cache-size",
>>> +		.line_size_props = { "i-cache-line-size",
>>> +				 "i-cache-block-size", },
>>> +		.nr_sets_prop = "i-cache-sets",
>>> +	},
>>> };
>>
>>
>> Hey Sudeep,
>>
>> After applying this patch, the cache_type_info array looks like this.
>>
>> static const struct cache_type_info cache_type_info[] = {
>> {
>> /*
>> * PowerPC Processor binding says the [di]-cache-*
>> * must be equal on unified caches, so just use
>> * d-cache properties.
>> */
>> .name = "Unified",
>> .size_prop = "d-cache-size",
>> .line_size_props = { "d-cache-line-size",
>> "d-cache-block-size", },
>> .nr_sets_prop = "d-cache-sets",
>> },
>> {
>> .name = "Data",
>> .size_prop = "d-cache-size",
>> .line_size_props = { "d-cache-line-size",
>> "d-cache-block-size", },
>> .nr_sets_prop = "d-cache-sets",
>> },
>> {
>> .name = "Instruction",
>> .size_prop = "i-cache-size",
>> .line_size_props = { "i-cache-line-size",
>> "i-cache-block-size", },
>> .nr_sets_prop = "i-cache-sets",
>> },
>> };
>>
>> and this function computes the array index for any given cache type
>> defined for PowerPC.
>>
>> static inline int get_cacheinfo_idx(enum cache_type type)
>> {
>> if (type == CACHE_TYPE_UNIFIED)
>> return 0;
>> else
>> return type;
>> }
>>
>> These types are defined in include/linux/cacheinfo.h as
>>
>> enum cache_type {
>> CACHE_TYPE_NOCACHE = 0,
>> CACHE_TYPE_INST = BIT(0),		---> 1
>> CACHE_TYPE_DATA = BIT(1),		---> 2
>> CACHE_TYPE_SEPARATE = CACHE_TYPE_INST | CACHE_TYPE_DATA,
>> CACHE_TYPE_UNIFIED = BIT(2),
>> };
>>
>> When it is UNIFIED we return index 0, which is correct. But the index
>> for instruction and data cache seems to be swapped, which is wrong. This
>> will fetch invalid properties for any given cache type.
>>
Ah, that's a silly mistake on my side; I will fix it.
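To make the mismatch concrete, here is a small stand-alone sketch. It copies the enum values and get_cacheinfo_idx() from the quoted text and reduces each table entry to its name. The arrays patched_order and matching_order and the helper cache_name() are hypothetical, and reordering the table is only one possible fix, not necessarily the one that will be posted.

```c
#include <assert.h>
#include <string.h>

#define BIT(n) (1 << (n))

enum cache_type {
	CACHE_TYPE_NOCACHE = 0,
	CACHE_TYPE_INST = BIT(0),
	CACHE_TYPE_DATA = BIT(1),
	CACHE_TYPE_SEPARATE = CACHE_TYPE_INST | CACHE_TYPE_DATA,
	CACHE_TYPE_UNIFIED = BIT(2),
};

#define NR_ENTRIES 3

/* Entry order as left by the quoted patch. */
static const char *const patched_order[NR_ENTRIES] = {
	"Unified", "Data", "Instruction"
};

/* Assumed fix: order the entries so that INST (1) and DATA (2) line up. */
static const char *const matching_order[NR_ENTRIES] = {
	"Unified", "Instruction", "Data"
};

/* Same logic as the function quoted above. */
static int get_cacheinfo_idx(enum cache_type type)
{
	if (type == CACHE_TYPE_UNIFIED)
		return 0;
	else
		return type;
}

/* Looks up the entry name; guards against indices outside the table. */
static const char *cache_name(const char *const table[], enum cache_type type)
{
	int idx = get_cacheinfo_idx(type);

	assert(idx >= 0 && idx < NR_ENTRIES);
	return table[idx];
}
```

With patched_order, a lookup for CACHE_TYPE_INST lands on the "Data" entry and vice versa. The bounds check also shows that CACHE_TYPE_SEPARATE (value 3) would index past a three-entry table.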
>> I have done some initial review and testing for this patch's impact on
>> PowerPC (ppc64 POWER specifically). I am trying to do some code clean-up
>> and re-arrangements. Will post out soon. Thanks !
Thanks for taking time for testing and reviewing these patches.
>
> It does not work correctly on POWER.
>
> The new patchset adds some more attributes for every cache entry apart from
> what we used to have on PowerPC before. From the ABI perspective, the old ones
> should reflect the correct value in the same manner as before. Looks like
> the generic code will report any attribute as "Unknown" if the arch code does
> not populate it in the respective callback.
>
Yes, this is on my list. I need to avoid populating the sysfs files with
"Unknown" as the value; I will do that in the next version.
> Here are some problems found on a POWER7 system
>
> (1) L1 instruction cache (cpu<N>/cache/index1/)
>
> 	====== Before patch ======
>
> 	coherency_line_size: 	128
> 	level:			1
> 	shared_cpu_map:		00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 			00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 				00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 				00000000,00000000,00000000,00000000,00000f00
> 	size:			32K
> 	type:			Instruction
>
> 	===== After patch ========
>
> 	coherency_line_size:	Unknown						----> Wrong
> 	level:			1
> 	shared_cpu_map:		00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 			00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 				00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,
> 				00000000,00000000,00000000,00000000,00ffffff	----> Wrong
> 	size:			0K						----> Wrong
> 	type:			Instruction
>
> (2) L3 cache (cpu<N>/cache/index3/)
>
> 	====== Before patch ======
>
> 	number_of_sets:		1
> 	size:			4096K
> 	ways_of_associativity:	0
>
> 	===== After patch ========
>
> 	number_of_sets:		1
> 	size:			4096K
> 	ways_of_associativity:	Unknown		----> Wrong
>
> Need to revisit this implementation on PowerPC and figure out the cause of these problems.
>
Yes, based on the logs you have provided, I will check for the root
cause of these issues. I will get back with questions if I need
clarifications.
Regards,
Sudeep
* [PATCH v2 4/6] powernv:cpufreq: Create pstate_id_to_freq() helper
From: Gautham R. Shenoy @ 2014-03-10 11:10 UTC (permalink / raw)
To: linuxppc-dev; +Cc: srivatsa.bhat, Gautham R. Shenoy
In-Reply-To: <1394449861-8688-1-git-send-email-ego@linux.vnet.ibm.com>
From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
Create a helper routine that can return the cpu-frequency for the
corresponding pstate_id.
Also, cache the values of the pstate_max, pstate_min and
pstate_nominal and nr_pstates in a static structure so that they can
be reused in the future to perform any validations.
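As a rough illustration of the lookup, the sketch below reproduces the index arithmetic (index = max id minus requested id) in user space. It assumes, as the WARN_ON in the patch suggests, that the pstate ids are stored contiguously in descending order. struct pstate_table, the example_* data and both helper names are hypothetical, the frequency values are made up, and invalid ids return an error value here instead of triggering BUG_ON.

```c
#include <assert.h>

struct pstate_table {
	int max_id;			/* pstate id stored at index 0 */
	int nr_pstates;
	const int *ids;			/* descending, contiguous ids */
	const unsigned int *freqs;	/* same indexing as ids */
};

/* Hypothetical example data, not taken from any real firmware. */
static const int example_ids[] = { 0, -1, -2, -3 };
static const unsigned int example_freqs[] = {
	4000000, 3800000, 3600000, 3400000
};
static const struct pstate_table example_table = {
	.max_id = 0,
	.nr_pstates = 4,
	.ids = example_ids,
	.freqs = example_freqs,
};

/* Returns the table index for pstate_id, or -1 if it is out of range. */
static int pstate_id_to_index(const struct pstate_table *t, int pstate_id)
{
	int i = t->max_id - pstate_id;

	if (i < 0 || i >= t->nr_pstates)
		return -1;
	/* Mirrors the WARN_ON in the patch: the ids must be contiguous. */
	assert(t->ids[i] == pstate_id);
	return i;
}

/* Returns the frequency for pstate_id, or 0 for an unknown id. */
static unsigned int pstate_id_to_freq(const struct pstate_table *t,
				      int pstate_id)
{
	int i = pstate_id_to_index(t, pstate_id);

	return i < 0 ? 0 : t->freqs[i];
}
```

Because the index is computed rather than searched for, the lookup is constant time, at the cost of relying on the contiguity assumption.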
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
drivers/cpufreq/powernv-cpufreq.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 4c2e8ca..0ecd163 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -39,6 +39,14 @@ static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
+struct powernv_pstate_info {
+ int pstate_min_id;
+ int pstate_max_id;
+ int pstate_nominal_id;
+ int nr_pstates;
+};
+static struct powernv_pstate_info powernv_pstate_info;
+
/*
* Initialize the freq table based on data obtained
* from the firmware passed via device-tree
@@ -112,9 +120,28 @@ static int init_powernv_pstates(void)
for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
+ powernv_pstate_info.pstate_min_id = pstate_min;
+ powernv_pstate_info.pstate_max_id = pstate_max;
+ powernv_pstate_info.pstate_nominal_id = pstate_nominal;
+ powernv_pstate_info.nr_pstates = nr_pstates;
+
return 0;
}
+/**
+ * Returns the cpu frequency corresponding to the pstate_id.
+ */
+static unsigned int pstate_id_to_freq(int pstate_id)
+{
+ int i;
+
+ i = powernv_pstate_info.pstate_max_id - pstate_id;
+
+ BUG_ON(i >= powernv_pstate_info.nr_pstates || i < 0);
+ WARN_ON(powernv_pstate_ids[i] != pstate_id);
+ return powernv_freqs[i].frequency;
+}
+
static struct freq_attr *powernv_cpu_freq_attr[] = {
&cpufreq_freq_attr_scaling_available_freqs,
NULL,
--
1.8.3.1
* [PATCH v2 6/6] powernv:cpufreq: Implement the driver->get() method
From: Gautham R. Shenoy @ 2014-03-10 11:11 UTC (permalink / raw)
To: linuxppc-dev; +Cc: srivatsa.bhat, Gautham R. Shenoy
In-Reply-To: <1394449861-8688-1-git-send-email-ego@linux.vnet.ibm.com>
From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
The current frequency of a cpu is reported through the sysfs file
cpuinfo_cur_freq. This requires the driver to implement a
"->get(unsigned int cpu)" method which will return the current
operating frequency.
Implement a function named powernv_cpufreq_get() which reads the local
pstate from the PMSR and returns the corresponding frequency.
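The sign handling is the subtle part of that read. The sketch below shows the extraction in isolation, using the same shift and mask as the patch (>> 48, & 0xFF) and an explicit sign extension instead of the s8 assignment. The function name pmsr_to_local_pstate_id() is hypothetical and the register values in the usage note are invented.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extracts the 8-bit field selected by (pmsr >> 48) & 0xFF and
 * interprets it as a signed two's-complement value, so that
 * negative pstate ids survive the conversion to int.
 */
static int pmsr_to_local_pstate_id(uint64_t pmsr)
{
	int raw = (int)((pmsr >> 48) & 0xFF);
	int id = raw >= 128 ? raw - 256 : raw;

	assert(id >= -128 && id <= 127);
	return id;
}
```

For example, a field value of 0xFB decodes to -5, whereas treating the field as unsigned would yield 251 and send the frequency lookup out of range.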
Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
drivers/cpufreq/powernv-cpufreq.c | 48 +++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 183bbc4..6f3b6e1 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -223,6 +223,53 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
BUG();
}
+/*
+ * Computes the current frequency on this cpu
+ * and stores the result in *ret_freq.
+ */
+static void powernv_read_cpu_freq(void *ret_freq)
+{
+ unsigned long pmspr_val;
+ s8 local_pstate_id;
+ int *cur_freq, freq, pstate_id;
+
+ cur_freq = (int *)ret_freq;
+ pmspr_val = get_pmspr(SPRN_PMSR);
+
+ /* The local pstate id corresponds to bits 48..55 in the PMSR.
+ * Note: Watch out for the sign! */
+ local_pstate_id = (pmspr_val >> 48) & 0xFF;
+ pstate_id = local_pstate_id;
+
+ freq = pstate_id_to_freq(pstate_id);
+ pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
+ smp_processor_id(), pmspr_val, pstate_id, freq);
+ *cur_freq = freq;
+}
+
+/*
+ * Returns the cpu frequency as reported by the firmware for 'cpu'.
+ * This value is reported through the sysfs file cpuinfo_cur_freq.
+ */
+unsigned int powernv_cpufreq_get(unsigned int cpu)
+{
+ int ret_freq;
+ cpumask_var_t sibling_mask;
+
+ if (unlikely(!zalloc_cpumask_var(&sibling_mask, GFP_KERNEL))) {
+ smp_call_function_single(cpu, powernv_read_cpu_freq,
+ &ret_freq, 1);
+ return ret_freq;
+ }
+
+ powernv_cpu_to_core_mask(cpu, sibling_mask);
+ smp_call_function_any(sibling_mask, powernv_read_cpu_freq,
+ &ret_freq, 1);
+
+ free_cpumask_var(sibling_mask);
+ return ret_freq;
+}
+
static void set_pstate(void *pstate)
{
unsigned long val;
@@ -309,6 +356,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
static struct cpufreq_driver powernv_cpufreq_driver = {
.verify = powernv_cpufreq_verify,
.target = powernv_cpufreq_target,
+ .get = powernv_cpufreq_get,
.init = powernv_cpufreq_cpu_init,
.exit = powernv_cpufreq_cpu_exit,
.name = "powernv-cpufreq",
--
1.8.3.1
* [PATCH v2 2/6] powernv:cpufreq: Create a powernv_cpu_to_core_mask() helper.
From: Gautham R. Shenoy @ 2014-03-10 11:10 UTC (permalink / raw)
To: linuxppc-dev; +Cc: srivatsa.bhat, Gautham R. Shenoy
In-Reply-To: <1394449861-8688-1-git-send-email-ego@linux.vnet.ibm.com>
From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Create a helper method that computes the cpumask corresponding to the
thread-siblings of a cpu. Use this for initializing the policy->cpus
mask for a given cpu.
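The mask computation can be sketched in user space as follows. This assumes that threads_per_core is a power of two and that the first thread sibling is obtained by clearing the low bits of the cpu number. A 64-bit integer stands in for the cpumask, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds cpu down to the first thread of its core. */
static unsigned int first_thread_sibling(unsigned int cpu,
					 unsigned int threads_per_core)
{
	/* The bit trick below is only valid for powers of two. */
	assert(threads_per_core != 0 &&
	       (threads_per_core & (threads_per_core - 1)) == 0);
	return cpu & ~(threads_per_core - 1);
}

/* Returns a bitmask with one bit set per thread sibling of cpu. */
static uint64_t cpu_to_core_mask(unsigned int cpu,
				 unsigned int threads_per_core)
{
	unsigned int base = first_thread_sibling(cpu, threads_per_core);
	uint64_t mask = 0;

	/* A uint64_t stands in for the cpumask, so stay below 64 CPUs. */
	assert(base + threads_per_core <= 64);
	for (unsigned int i = 0; i < threads_per_core; i++)
		mask |= UINT64_C(1) << (base + i);
	return mask;
}
```

With four threads per core, cpu 9 maps to the mask 0xf00, which covers cpus 8 to 11.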
(Original code written by Srivatsa S. Bhat. Gautham moved this to a
helper function!)
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
drivers/cpufreq/powernv-cpufreq.c | 24 ++++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index ab1551f..4cad727 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -115,6 +115,23 @@ static struct freq_attr *powernv_cpu_freq_attr[] = {
/* Helper routines */
+/**
+ * Sets the bits corresponding to the thread-siblings of cpu in its core
+ * in 'cpus'.
+ */
+static void powernv_cpu_to_core_mask(unsigned int cpu, cpumask_var_t cpus)
+{
+ int base, i;
+
+ base = cpu_first_thread_sibling(cpu);
+
+ for (i = 0; i < threads_per_core; i++) {
+ cpumask_set_cpu(base + i, cpus);
+ }
+
+ return;
+}
+
/* Access helpers to power mgt SPR */
static inline unsigned long get_pmspr(unsigned long sprn)
@@ -180,13 +197,8 @@ static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
{
- int base, i;
-
#ifdef CONFIG_SMP
- base = cpu_first_thread_sibling(policy->cpu);
-
- for (i = 0; i < threads_per_core; i++)
- cpumask_set_cpu(base + i, policy->cpus);
+ powernv_cpu_to_core_mask(policy->cpu, policy->cpus);
#endif
policy->cpuinfo.transition_latency = 25000;
--
1.8.3.1
* [PATCH v2 3/6] powernv, cpufreq:Add per-core locking to serialize frequency transitions
From: Gautham R. Shenoy @ 2014-03-10 11:10 UTC (permalink / raw)
To: linuxppc-dev; +Cc: srivatsa.bhat, Gautham R. Shenoy
In-Reply-To: <1394449861-8688-1-git-send-email-ego@linux.vnet.ibm.com>
From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
On POWER systems, the CPU frequency is controlled at a core-level and
hence we need to serialize so that only one of the threads in the core
switches the core's frequency at a time.
Using a global mutex lock would needlessly serialize _all_ frequency
transitions in the system (across all cores). So introduce per-core
locking to enable finer-grained synchronization and thereby enhance
the speed and responsiveness of the cpufreq driver to varying workload
demands.
The design of per-core locking is simple and straightforward: we
define a per-CPU lock and use the one that belongs to the first
thread sibling of the core.
The cpu_first_thread_sibling() macro is used to find the *common* lock for
all thread siblings belonging to a core.
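The locking scheme can be sketched in user space as follows. This is only
an illustration under stated assumptions: pthread mutexes replace kernel
mutexes, a plain array replaces the per-CPU variable, NR_CPUS and
THREADS_PER_CORE are hypothetical constants, and first_thread_sibling() is
assumed to clear the thread bits of the CPU number.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical topology: 16 CPUs, 4 threads per core (power of two). */
#define NR_CPUS 16u
#define THREADS_PER_CORE 4u

/* One lock slot per CPU; only the first sibling's slot is ever used. */
static pthread_mutex_t freq_switch_lock[NR_CPUS];

/* Assumed analogue of cpu_first_thread_sibling(). */
static unsigned int first_thread_sibling(unsigned int cpu)
{
	return cpu & ~(THREADS_PER_CORE - 1u);
}

/* Initialise every slot, mirroring the for_each_possible_cpu() loop. */
static void init_core_locks(void)
{
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		pthread_mutex_init(&freq_switch_lock[cpu], NULL);
}

/* All siblings of a core map to the lock of the core's first thread. */
static pthread_mutex_t *core_lock(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return &freq_switch_lock[first_thread_sibling(cpu)];
}

static void lock_core_freq(unsigned int cpu)
{
	pthread_mutex_lock(core_lock(cpu));
}

static void unlock_core_freq(unsigned int cpu)
{
	pthread_mutex_unlock(core_lock(cpu));
}
```

Threads on different cores pick different mutexes and therefore never
contend, while siblings within one core are serialized on the same mutex.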
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
drivers/cpufreq/powernv-cpufreq.c | 21 ++++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 4cad727..4c2e8ca 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -24,8 +24,15 @@
#include <linux/of.h>
#include <asm/cputhreads.h>
-/* FIXME: Make this per-core */
-static DEFINE_MUTEX(freq_switch_mutex);
+/* Per-Core locking for frequency transitions */
+static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
+
+#define lock_core_freq(cpu) \
+ mutex_lock(&per_cpu(freq_switch_lock,\
+ cpu_first_thread_sibling(cpu)))
+#define unlock_core_freq(cpu) \
+ mutex_unlock(&per_cpu(freq_switch_lock,\
+ cpu_first_thread_sibling(cpu)))
#define POWERNV_MAX_PSTATES 256
@@ -233,7 +240,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
freqs.new = powernv_freqs[new_index].frequency;
freqs.cpu = policy->cpu;
- mutex_lock(&freq_switch_mutex);
+ lock_core_freq(policy->cpu);
cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
@@ -245,7 +252,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
rc = powernv_set_freq(policy->cpus, new_index);
cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
- mutex_unlock(&freq_switch_mutex);
+ unlock_core_freq(policy->cpu);
return rc;
}
@@ -262,7 +269,7 @@ static struct cpufreq_driver powernv_cpufreq_driver = {
static int __init powernv_cpufreq_init(void)
{
- int rc = 0;
+ int cpu, rc = 0;
/* Discover pstates from device tree and init */
@@ -272,6 +279,10 @@ static int __init powernv_cpufreq_init(void)
pr_info("powernv-cpufreq disabled\n");
return rc;
}
+ /* Init per-core mutex */
+ for_each_possible_cpu(cpu) {
+ mutex_init(&per_cpu(freq_switch_lock, cpu));
+ }
rc = cpufreq_register_driver(&powernv_cpufreq_driver);
return rc;
--
1.8.3.1
* [PATCH v2 5/6] powernv:cpufreq: Export nominal frequency via sysfs.
From: Gautham R. Shenoy @ 2014-03-10 11:11 UTC (permalink / raw)
To: linuxppc-dev; +Cc: srivatsa.bhat, Gautham R. Shenoy
In-Reply-To: <1394449861-8688-1-git-send-email-ego@linux.vnet.ibm.com>
From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
Create a driver attribute named cpuinfo_nominal_freq which
creates a sysfs read-only file named cpuinfo_nominal_freq. Export
the frequency corresponding to the nominal_pstate through this
interface.
Nominal frequency is the highest non-turbo frequency for the
platform. This is generally used for setting governor policies from
user space for optimal energy efficiency.
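As a rough user-space illustration of what the attribute reports, the
sketch below looks up a PState id in a table and formats the frequency the
way a sysfs show routine would. The table contents are invented for the
example and do not come from any firmware; pstate_table, pstate_id_to_khz()
and show_nominal_freq() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical PState table; ids and frequencies are made up. */
struct pstate_entry {
	int id;
	unsigned int khz;
};

static const struct pstate_entry pstate_table[] = {
	{  0, 3724000 },
	{ -1, 3325000 },
	{ -2, 2926000 },
};

/* Return the frequency in kHz for 'id', or 0 if the id is unknown. */
static unsigned int pstate_id_to_khz(int id)
{
	size_t i;

	for (i = 0; i < sizeof(pstate_table) / sizeof(pstate_table[0]); i++)
		if (pstate_table[i].id == id)
			return pstate_table[i].khz;
	return 0;
}

/* Format the nominal frequency as "<kHz>\n"; returns the length written. */
static int show_nominal_freq(char *buf, size_t len, int nominal_id)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%u\n", pstate_id_to_khz(nominal_id));
}
```

The value is unsigned throughout so that it matches the "%u" conversion.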
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
drivers/cpufreq/powernv-cpufreq.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 0ecd163..183bbc4 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -142,8 +142,30 @@ static unsigned int pstate_id_to_freq(int pstate_id)
return powernv_freqs[i].frequency;
}
+/**
+ * show_cpuinfo_nominal_freq - Show the nominal CPU frequency as indicated by
+ * the firmware
+ */
+static ssize_t show_cpuinfo_nominal_freq(struct cpufreq_policy *policy,
+ char *buf)
+{
+ unsigned int nominal_freq;
+ nominal_freq = pstate_id_to_freq(powernv_pstate_info.pstate_nominal_id);
+ return sprintf(buf, "%u\n", nominal_freq);
+}
+
+
+struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq = {
+ .attr = { .name = "cpuinfo_nominal_freq",
+ .mode = 0444,
+ },
+ .show = show_cpuinfo_nominal_freq,
+};
+
+
static struct freq_attr *powernv_cpu_freq_attr[] = {
&cpufreq_freq_attr_scaling_available_freqs,
+ &cpufreq_freq_attr_cpuinfo_nominal_freq,
NULL,
};
--
1.8.3.1
* [PATCH v2 1/6] powernv: cpufreq driver for powernv platform
From: Gautham R. Shenoy @ 2014-03-10 11:10 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Anton Blanchard, srivatsa.bhat, Gautham R. Shenoy
In-Reply-To: <1394449861-8688-1-git-send-email-ego@linux.vnet.ibm.com>
From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Backend driver to dynamically set voltage and frequency on
IBM POWER non-virtualized platforms. Power management SPRs
are used to set the required PState.
This driver works in conjunction with cpufreq governors
like 'ondemand' to provide demand-based frequency and
voltage setting on IBM POWER non-virtualized platforms.
The PState table is obtained from OPAL v3 firmware through the device
tree.
The powernv_cpufreq back-end driver parses the relevant device-tree
nodes and initialises the cpufreq subsystem on the powernv platform.
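To make the PMCR update concrete, here is a small user-space sketch of the
bit manipulation done by set_pstate() and powernv_set_freq() in this patch.
It only computes the new register value and does not touch any SPR;
pmcr_with_pstate() is a hypothetical name, and the sketch assumes a 64-bit
register with the global PState in bits 56..63 and the local PState in bits
48..55, as the comment in set_pstate() states.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the PMCR value that requests 'pstate_id' as both the global
 * (bits 56..63) and the local (bits 48..55) PState, leaving the low
 * 48 bits of the current value untouched.
 */
static uint64_t pmcr_with_pstate(uint64_t pmcr, int pstate_id)
{
	/* Keep only the low 8 bits of the id, as powernv_set_freq() does. */
	uint64_t pstate = (uint64_t)pstate_id & 0xFFu;

	/* Clear the two PState fields before inserting the new value. */
	pmcr &= 0x0000ffffffffffffULL;

	return pmcr | (pstate << 56) | (pstate << 48);
}
```

Masking the id to 8 bits before shifting matters whenever the id is
negative: without it, sign extension would overwrite the low 48 bits that
the update is meant to preserve.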
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/reg.h | 4 +
arch/powerpc/platforms/powernv/Kconfig | 1 +
drivers/cpufreq/Kconfig | 1 +
drivers/cpufreq/Kconfig.powerpc | 13 ++
drivers/cpufreq/Makefile | 1 +
drivers/cpufreq/powernv-cpufreq.c | 277 +++++++++++++++++++++++++++++++++
6 files changed, 297 insertions(+)
create mode 100644 drivers/cpufreq/powernv-cpufreq.c
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 90c06ec..84f92ca 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -271,6 +271,10 @@
#define SPRN_HSRR1 0x13B /* Hypervisor Save/Restore 1 */
#define SPRN_IC 0x350 /* Virtual Instruction Count */
#define SPRN_VTB 0x351 /* Virtual Time Base */
+#define SPRN_PMICR 0x354 /* Power Management Idle Control Reg */
+#define SPRN_PMSR 0x355 /* Power Management Status Reg */
+#define SPRN_PMCR 0x374 /* Power Management Control Register */
+
/* HFSCR and FSCR bit numbers are the same */
#define FSCR_TAR_LG 8 /* Enable Target Address Register */
#define FSCR_EBB_LG 7 /* Enable Event Based Branching */
diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index 895e8a2..1fe12b1 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -11,6 +11,7 @@ config PPC_POWERNV
select PPC_UDBG_16550
select PPC_SCOM
select ARCH_RANDOM
+ select CPU_FREQ
default y
config PPC_POWERNV_RTAS
diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
index 4b029c0..4ba1632 100644
--- a/drivers/cpufreq/Kconfig
+++ b/drivers/cpufreq/Kconfig
@@ -48,6 +48,7 @@ config CPU_FREQ_STAT_DETAILS
choice
prompt "Default CPUFreq governor"
default CPU_FREQ_DEFAULT_GOV_USERSPACE if ARM_SA1100_CPUFREQ || ARM_SA1110_CPUFREQ
+ default CPU_FREQ_DEFAULT_GOV_ONDEMAND if POWERNV_CPUFREQ
default CPU_FREQ_DEFAULT_GOV_PERFORMANCE
help
This option sets which CPUFreq governor shall be loaded at
diff --git a/drivers/cpufreq/Kconfig.powerpc b/drivers/cpufreq/Kconfig.powerpc
index ca0021a..93f8689 100644
--- a/drivers/cpufreq/Kconfig.powerpc
+++ b/drivers/cpufreq/Kconfig.powerpc
@@ -54,3 +54,16 @@ config PPC_PASEMI_CPUFREQ
help
This adds the support for frequency switching on PA Semi
PWRficient processors.
+
+config POWERNV_CPUFREQ
+ tristate "CPU frequency scaling for IBM POWERNV platform"
+ depends on PPC_POWERNV
+ select CPU_FREQ_GOV_PERFORMANCE
+ select CPU_FREQ_GOV_POWERSAVE
+ select CPU_FREQ_GOV_USERSPACE
+ select CPU_FREQ_GOV_ONDEMAND
+ select CPU_FREQ_GOV_CONSERVATIVE
+ default y
+ help
+ This adds support for CPU frequency switching on IBM POWERNV
+ platform
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index 7494565..0dbb963 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -86,6 +86,7 @@ obj-$(CONFIG_PPC_CORENET_CPUFREQ) += ppc-corenet-cpufreq.o
obj-$(CONFIG_CPU_FREQ_PMAC) += pmac32-cpufreq.o
obj-$(CONFIG_CPU_FREQ_PMAC64) += pmac64-cpufreq.o
obj-$(CONFIG_PPC_PASEMI_CPUFREQ) += pasemi-cpufreq.o
+obj-$(CONFIG_POWERNV_CPUFREQ) += powernv-cpufreq.o
##################################################################################
# Other platform drivers
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
new file mode 100644
index 0000000..ab1551f
--- /dev/null
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -0,0 +1,277 @@
+/*
+ * POWERNV cpufreq driver for the IBM POWER processors
+ *
+ * (C) Copyright IBM 2014
+ *
+ * Author: Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#define pr_fmt(fmt) "powernv-cpufreq: " fmt
+
+#include <linux/module.h>
+#include <linux/cpufreq.h>
+#include <linux/of.h>
+#include <asm/cputhreads.h>
+
+/* FIXME: Make this per-core */
+static DEFINE_MUTEX(freq_switch_mutex);
+
+#define POWERNV_MAX_PSTATES 256
+
+static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
+static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
+
+/*
+ * Initialize the freq table based on data obtained
+ * from the firmware passed via device-tree
+ */
+
+static int init_powernv_pstates(void)
+{
+ struct device_node *power_mgt;
+ int nr_pstates = 0;
+ int pstate_min, pstate_max, pstate_nominal;
+ const __be32 *pstate_ids, *pstate_freqs;
+ int i;
+ u32 len_ids, len_freqs;
+
+ power_mgt = of_find_node_by_path("/ibm,opal/power-mgt");
+ if (!power_mgt) {
+ pr_warn("power-mgt node not found\n");
+ return -ENODEV;
+ }
+
+ if (of_property_read_u32(power_mgt, "ibm,pstate-min", &pstate_min)) {
+ pr_warn("ibm,pstate-min node not found\n");
+ return -ENODEV;
+ }
+
+ if (of_property_read_u32(power_mgt, "ibm,pstate-max", &pstate_max)) {
+ pr_warn("ibm,pstate-max node not found\n");
+ return -ENODEV;
+ }
+
+ if (of_property_read_u32(power_mgt, "ibm,pstate-nominal",
+ &pstate_nominal)) {
+ pr_warn("ibm,pstate-nominal not found\n");
+ return -ENODEV;
+ }
+ pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min,
+ pstate_nominal, pstate_max);
+
+ pstate_ids = of_get_property(power_mgt, "ibm,pstate-ids", &len_ids);
+ if (!pstate_ids) {
+ pr_warn("ibm,pstate-ids not found\n");
+ return -ENODEV;
+ }
+
+ pstate_freqs = of_get_property(power_mgt, "ibm,pstate-frequencies-mhz",
+ &len_freqs);
+ if (!pstate_freqs) {
+ pr_warn("ibm,pstate-frequencies-mhz not found\n");
+ return -ENODEV;
+ }
+
+ WARN_ON(len_ids != len_freqs);
+ nr_pstates = min(len_ids, len_freqs) / sizeof(u32);
+ WARN_ON(!nr_pstates);
+
+ pr_debug("NR PStates %d\n", nr_pstates);
+ for (i = 0; i < nr_pstates; i++) {
+ u32 id = be32_to_cpu(pstate_ids[i]);
+ u32 freq = be32_to_cpu(pstate_freqs[i]);
+
+ pr_debug("PState id %d freq %d MHz\n", id, freq);
+ powernv_freqs[i].driver_data = i;
+ powernv_freqs[i].frequency = freq * 1000; /* kHz */
+ powernv_pstate_ids[i] = id;
+ }
+ /* End of list marker entry */
+ powernv_freqs[i].driver_data = 0;
+ powernv_freqs[i].frequency = CPUFREQ_TABLE_END;
+
+ /* Print frequency table */
+ for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
+ pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
+
+ return 0;
+}
+
+static struct freq_attr *powernv_cpu_freq_attr[] = {
+ &cpufreq_freq_attr_scaling_available_freqs,
+ NULL,
+};
+
+/* Helper routines */
+
+/* Access helpers to power mgt SPR */
+
+static inline unsigned long get_pmspr(unsigned long sprn)
+{
+ switch (sprn) {
+ case SPRN_PMCR:
+ return mfspr(SPRN_PMCR);
+
+ case SPRN_PMICR:
+ return mfspr(SPRN_PMICR);
+
+ case SPRN_PMSR:
+ return mfspr(SPRN_PMSR);
+ }
+ BUG();
+}
+
+static inline void set_pmspr(unsigned long sprn, unsigned long val)
+{
+ switch (sprn) {
+ case SPRN_PMCR:
+ mtspr(SPRN_PMCR, val);
+ return;
+
+ case SPRN_PMICR:
+ mtspr(SPRN_PMICR, val);
+ return;
+
+ case SPRN_PMSR:
+ mtspr(SPRN_PMSR, val);
+ return;
+ }
+ BUG();
+}
+
+static void set_pstate(void *pstate)
+{
+ unsigned long val;
+ unsigned long pstate_ul = *(unsigned long *) pstate;
+
+ val = get_pmspr(SPRN_PMCR);
+ val = val & 0x0000ffffffffffffULL;
+ /* Set both global(bits 56..63) and local(bits 48..55) PStates */
+ val = val | (pstate_ul << 56) | (pstate_ul << 48);
+ pr_debug("Setting cpu %d pmcr to %016lX\n", smp_processor_id(), val);
+ set_pmspr(SPRN_PMCR, val);
+}
+
+static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
+{
+ unsigned long val = (unsigned long) powernv_pstate_ids[new_index];
+
+ /*
+ * Use smp_call_function to send IPI and execute the
+ * mtspr on target cpu. We could do that without IPI
+ * if current CPU is within policy->cpus (core)
+ */
+
+ val = val & 0xFF;
+ smp_call_function_any(cpus, set_pstate, &val, 1);
+ return 0;
+}
+
+static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
+{
+ int base, i;
+
+#ifdef CONFIG_SMP
+ base = cpu_first_thread_sibling(policy->cpu);
+
+ for (i = 0; i < threads_per_core; i++)
+ cpumask_set_cpu(base + i, policy->cpus);
+#endif
+ policy->cpuinfo.transition_latency = 25000;
+
+ policy->cur = powernv_freqs[0].frequency;
+ cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
+ return cpufreq_frequency_table_cpuinfo(policy, powernv_freqs);
+}
+
+static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
+{
+ cpufreq_frequency_table_put_attr(policy->cpu);
+ return 0;
+}
+
+static int powernv_cpufreq_verify(struct cpufreq_policy *policy)
+{
+ return cpufreq_frequency_table_verify(policy, powernv_freqs);
+}
+
+static int powernv_cpufreq_target(struct cpufreq_policy *policy,
+ unsigned int target_freq,
+ unsigned int relation)
+{
+ int rc;
+ struct cpufreq_freqs freqs;
+ unsigned int new_index;
+
+ cpufreq_frequency_table_target(policy, powernv_freqs, target_freq,
+ relation, &new_index);
+
+ freqs.old = policy->cur;
+ freqs.new = powernv_freqs[new_index].frequency;
+ freqs.cpu = policy->cpu;
+
+ mutex_lock(&freq_switch_mutex);
+ cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+
+ pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
+ policy->cpu,
+ powernv_freqs[new_index].frequency,
+ new_index,
+ powernv_pstate_ids[new_index]);
+
+ rc = powernv_set_freq(policy->cpus, new_index);
+
+ cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
+ mutex_unlock(&freq_switch_mutex);
+
+ return rc;
+}
+
+static struct cpufreq_driver powernv_cpufreq_driver = {
+ .verify = powernv_cpufreq_verify,
+ .target = powernv_cpufreq_target,
+ .init = powernv_cpufreq_cpu_init,
+ .exit = powernv_cpufreq_cpu_exit,
+ .name = "powernv-cpufreq",
+ .flags = CPUFREQ_CONST_LOOPS,
+ .attr = powernv_cpu_freq_attr,
+};
+
+static int __init powernv_cpufreq_init(void)
+{
+ int rc = 0;
+
+ /* Discover pstates from device tree and init */
+
+ rc = init_powernv_pstates();
+
+ if (rc) {
+ pr_info("powernv-cpufreq disabled\n");
+ return rc;
+ }
+
+ rc = cpufreq_register_driver(&powernv_cpufreq_driver);
+ return rc;
+}
+
+static void __exit powernv_cpufreq_exit(void)
+{
+ cpufreq_unregister_driver(&powernv_cpufreq_driver);
+}
+
+module_init(powernv_cpufreq_init);
+module_exit(powernv_cpufreq_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>");
--
1.8.3.1
* [PATCH v2 0/6] powernv:cpufreq: Dynamic cpu-frequency scaling
From: Gautham R. Shenoy @ 2014-03-10 11:10 UTC (permalink / raw)
To: linuxppc-dev; +Cc: srivatsa.bhat, Gautham R. Shenoy
From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
Hi,
This is v2 of the consolidated patchset consisting of
patches for enabling cpufreq on IBM POWERNV platforms
along with some enhancements.
The v1 of these patches was previously
submitted on linuxppc-dev [1][2].
- This patchset contains code for the platform driver to support CPU
frequency scaling on IBM POWERNV platforms.
- In addition to the standard control and status files exposed by the
cpufreq core, the patchset exposes the nominal frequency through the
file named "cpuinfo_nominal_freq".
The patchset is based against commit c3bebc71c4bcdafa24b506adf0c1de3c1f77e2e0
of the mainline tree.
[1]: https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-February/115244.html
[2]: https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-March/115703.html
Gautham R. Shenoy (3):
powernv:cpufreq: Create pstate_id_to_freq() helper
powernv:cpufreq: Export nominal frequency via sysfs.
powernv:cpufreq: Implement the driver->get() method
Srivatsa S. Bhat (2):
powernv:cpufreq: Create a powernv_cpu_to_core_mask() helper.
powernv,cpufreq:Add per-core locking to serialize frequency
transitions
Vaidyanathan Srinivasan (1):
powernv: cpufreq driver for powernv platform
arch/powerpc/include/asm/reg.h | 4 +
arch/powerpc/platforms/powernv/Kconfig | 1 +
drivers/cpufreq/Kconfig | 1 +
drivers/cpufreq/Kconfig.powerpc | 13 ++
drivers/cpufreq/Makefile | 1 +
drivers/cpufreq/powernv-cpufreq.c | 397 +++++++++++++++++++++++++++++++++
6 files changed, 417 insertions(+)
create mode 100644 drivers/cpufreq/powernv-cpufreq.c
--
1.8.3.1
* Re: [PATCH 1/2] Revert "KVM: PPC: Book3S HV: Add new state for transactional memory"
From: Paolo Bonzini @ 2014-03-10 10:51 UTC (permalink / raw)
To: Paul Mackerras, Aneesh Kumar K.V, Scott Wood
Cc: linuxppc-dev, agraf, kvm-ppc, kvm
In-Reply-To: <20140310105028.GA5934@iris.ozlabs.ibm.com>
On 10/03/2014 11:50, Paul Mackerras wrote:
> We can either do this revert, or apply a patch removing the extra
> hunk, but one or the other should go in for 3.14 since it's quite
> broken as it is (that is, HV-mode KVM on powerpc is broken).
>
> Paolo, do you have a preference about revert vs. fix? Are you happy
> to take what Aneesh sent (in which case please add my acked-by and
> perhaps edit the commentary to say how the problem arose), or do you
> want a freshly-prepared patch, and if so against which branch?
I prefer a fix.
Paolo
* Re: [PATCH 1/2] Revert "KVM: PPC: Book3S HV: Add new state for transactional memory"
From: Paul Mackerras @ 2014-03-10 10:50 UTC (permalink / raw)
To: Aneesh Kumar K.V, Paolo Bonzini, Scott Wood
Cc: linuxppc-dev, agraf, kvm-ppc, kvm
In-Reply-To: <1394102170-22126-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>
On Thu, Mar 06, 2014 at 04:06:09PM +0530, Aneesh Kumar K.V wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>
> This reverts commit 7b490411c37f7ab7965cbdfe5e3ec28eadb6db5b which cause
> the below crash in the host.
OK, I understand now what happened, which is this: when I sent out
that patch, I inadvertently included a hunk of extra code as a result
of not cleaning up a rebase properly. The next patch in the series
removed the extraneous hunk, but Alex didn't apply the next patch.
We can either do this revert, or apply a patch removing the extra
hunk, but one or the other should go in for 3.14 since it's quite
broken as it is (that is, HV-mode KVM on powerpc is broken).
Paolo, do you have a preference about revert vs. fix? Are you happy
to take what Aneesh sent (in which case please add my acked-by and
perhaps edit the commentary to say how the problem arose), or do you
want a freshly-prepared patch, and if so against which branch?
Thanks,
Paul.