* Re: [GIT PULL] ACPI and power management updates for v3.8-rc1
From: Witold Szczeponik @ 2012-12-11 16:25 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Linus Torvalds, Len Brown, Linux PM list, ACPI Devel Maling List,
LKML
In-Reply-To: <4460434.rhV1bpKLo5@vostro.rjw.lan>
Hi Rafael,
please consider the inclusion of the two patches from https://lkml.org/lkml/2012/7/29/87 and https://lkml.org/lkml/2012/7/29/86, as discussed in our e-mail conversation on Oct 19 and 20. The patches apply without modification against 3.7 as well. (Since there is no change since 3.5, I did not resend them.)
Thanks in advance! If a separate re-send of the patches is needed, please let me know.
--- Witold
^ permalink raw reply
* Re: [PATCH][RFC] smsc95xx: enable dynamic autosuspend (RFC)
From: Ming Lei @ 2012-12-11 15:58 UTC (permalink / raw)
To: Oliver Neukum
Cc: Steve Glendinning, Steve Glendinning, netdev, linux-usb,
Greg Kroah-Hartman, Linux PM List
In-Reply-To: <3108632.YdjKGB5H1R@linux-lqwf.site>
CC linux-power
On Tue, Dec 11, 2012 at 11:19 PM, Oliver Neukum <oliver@neukum.org> wrote:
> On Tuesday 11 December 2012 20:53:19 Ming Lei wrote:
>> In fact, I have test data which shows a large power saving
>> on an OMAP3-based BeagleBoard plus an ASIX usbnet device with
>> the periodic work. IMO, the power saving from introducing the periodic
>> timer depends on the arch or platform; there should be a large
>> saving if the CPU power consumption is very low. So how about letting
>> a module parameter switch the periodic work on/off?
>
> You could ask on linux-power and netdev. But there would be an
> obvious question: Why kernel space?
How does a user-space utility know that an interface doesn't support remote
wakeup for link change? And how would it handle that in user space? Or could we
persuade the user-space developers to do it? Or could older user-space
utilities support power saving on these devices?
At least, one advantage of doing it in kernel space is that we can let
current and older user-space utilities support power saving on these
devices when the link is off.
Also, suppose a user-space utility closes the interface automatically when
the link goes down: some configuration (such as the IP address) of the
network interface will be lost when it is brought up again once
the link comes back. That problem might break some applications.
Thanks,
--
Ming Lei
^ permalink raw reply
* [RFC PATCH v4 2/9] CPU hotplug: Convert preprocessor macros to static inline functions
From: Srivatsa S. Bhat @ 2012-12-11 14:04 UTC (permalink / raw)
To: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
vincent.guittot, tj, oleg
Cc: sbw, amit.kucheria, rostedt, rjw, srivatsa.bhat, wangyun,
xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121211140314.23621.64088.stgit@srivatsabhat.in.ibm.com>
On 12/05/2012 06:10 AM, Andrew Morton wrote:
"static inline C functions would be preferred if possible. Feel free to
fix up the wrong crufty surrounding code as well ;-)"
Convert the macros in the CPU hotplug code to static inline C functions.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
include/linux/cpu.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index cf24da1..eb79f47 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -198,10 +198,10 @@ static inline void cpu_hotplug_driver_unlock(void)
#else /* CONFIG_HOTPLUG_CPU */
-#define get_online_cpus() do { } while (0)
-#define put_online_cpus() do { } while (0)
-#define get_online_cpus_atomic() do { } while (0)
-#define put_online_cpus_atomic() do { } while (0)
+static inline void get_online_cpus(void) {}
+static inline void put_online_cpus(void) {}
+static inline void get_online_cpus_atomic(void) {}
+static inline void put_online_cpus_atomic(void) {}
#define hotcpu_notifier(fn, pri) do { (void)(fn); } while (0)
/* These aren't inline functions due to a GCC bug. */
#define register_hotcpu_notifier(nb) ({ (void)(nb); 0; })
^ permalink raw reply related
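One practical difference between the two stub styles in the patch above can be shown in a small user-space sketch. All names here (stub_macro_get, stub_inline_get, run_twice, stub_calls) are hypothetical and exist only for illustration; the counter is added so the effect is observable, whereas the kernel stubs are empty.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro stub: a statement, not a function. */
#define stub_macro_get()	do { } while (0)

static int stub_calls;	/* illustration only: makes the calls observable */

/* Hypothetical inline stub: a real function with a checked prototype. */
static inline void stub_inline_get(void)
{
	stub_calls++;	/* the kernel stubs have an empty body instead */
}

/*
 * A function stub has an address, so it can be passed as a callback.
 * The do { } while (0) macro cannot be used this way.
 */
static inline void run_twice(void (*fn)(void))
{
	assert(fn != NULL);
	fn();
	fn();
}
```

Calling run_twice(stub_inline_get) compiles and runs, while there is no equivalent for stub_macro_get, which can only appear as a statement.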
* [RFC PATCH v4 4/9] smp, cpu hotplug: Fix on_each_cpu_*() to prevent CPU offline properly
From: Srivatsa S. Bhat @ 2012-12-11 14:04 UTC (permalink / raw)
To: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
vincent.guittot, tj, oleg
Cc: sbw, amit.kucheria, rostedt, rjw, srivatsa.bhat, wangyun,
xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121211140314.23621.64088.stgit@srivatsabhat.in.ibm.com>
Once stop_machine() is gone from the CPU offline path, we won't be able to
depend on preempt_disable() to prevent CPUs from going offline from under us.
Use the get/put_online_cpus_atomic() APIs to prevent CPUs from going offline,
while invoking from atomic context.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
kernel/smp.c | 25 +++++++++++++++----------
1 file changed, 15 insertions(+), 10 deletions(-)
diff --git a/kernel/smp.c b/kernel/smp.c
index ce1a866..0031000 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -688,12 +688,12 @@ int on_each_cpu(void (*func) (void *info), void *info, int wait)
unsigned long flags;
int ret = 0;
- preempt_disable();
+ get_online_cpus_atomic();
ret = smp_call_function(func, info, wait);
local_irq_save(flags);
func(info);
local_irq_restore(flags);
- preempt_enable();
+ put_online_cpus_atomic();
return ret;
}
EXPORT_SYMBOL(on_each_cpu);
@@ -715,7 +715,11 @@ EXPORT_SYMBOL(on_each_cpu);
void on_each_cpu_mask(const struct cpumask *mask, smp_call_func_t func,
void *info, bool wait)
{
- int cpu = get_cpu();
+ int cpu;
+
+ get_online_cpus_atomic();
+
+ cpu = smp_processor_id();
smp_call_function_many(mask, func, info, wait);
if (cpumask_test_cpu(cpu, mask)) {
@@ -723,7 +727,7 @@ void on_each_cpu_mask(const struct cpumask *mask, smp_call_func_t func,
func(info);
local_irq_enable();
}
- put_cpu();
+ put_online_cpus_atomic();
}
EXPORT_SYMBOL(on_each_cpu_mask);
@@ -748,8 +752,9 @@ EXPORT_SYMBOL(on_each_cpu_mask);
* The function might sleep if the GFP flags indicates a non
* atomic allocation is allowed.
*
- * Preemption is disabled to protect against CPUs going offline but not online.
- * CPUs going online during the call will not be seen or sent an IPI.
+ * We use get/put_online_cpus_atomic() to prevent CPUs from going
+ * offline in-between our operation. CPUs coming online during the
+ * call will not be seen or sent an IPI.
*
* You must not call this function with disabled interrupts or
* from a hardware interrupt handler or from a bottom half handler.
@@ -764,26 +769,26 @@ void on_each_cpu_cond(bool (*cond_func)(int cpu, void *info),
might_sleep_if(gfp_flags & __GFP_WAIT);
if (likely(zalloc_cpumask_var(&cpus, (gfp_flags|__GFP_NOWARN)))) {
- preempt_disable();
+ get_online_cpus_atomic();
for_each_online_cpu(cpu)
if (cond_func(cpu, info))
cpumask_set_cpu(cpu, cpus);
on_each_cpu_mask(cpus, func, info, wait);
- preempt_enable();
+ put_online_cpus_atomic();
free_cpumask_var(cpus);
} else {
/*
* No free cpumask, bother. No matter, we'll
* just have to IPI them one by one.
*/
- preempt_disable();
+ get_online_cpus_atomic();
for_each_online_cpu(cpu)
if (cond_func(cpu, info)) {
ret = smp_call_function_single(cpu, func,
info, wait);
WARN_ON_ONCE(!ret);
}
- preempt_enable();
+ put_online_cpus_atomic();
}
}
EXPORT_SYMBOL(on_each_cpu_cond);
^ permalink raw reply related
* [RFC PATCH v4 3/9] smp, cpu hotplug: Fix smp_call_function_*() to prevent CPU offline properly
From: Srivatsa S. Bhat @ 2012-12-11 14:04 UTC (permalink / raw)
To: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
vincent.guittot, tj, oleg
Cc: sbw, amit.kucheria, rostedt, rjw, srivatsa.bhat, wangyun,
xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121211140314.23621.64088.stgit@srivatsabhat.in.ibm.com>
Once stop_machine() is gone from the CPU offline path, we won't be able to
depend on preempt_disable() to prevent CPUs from going offline from under us.
Use the get/put_online_cpus_atomic() APIs to prevent CPUs from going offline,
while invoking from atomic context.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
kernel/smp.c | 38 +++++++++++++++++++++++++-------------
1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/kernel/smp.c b/kernel/smp.c
index 29dd40a..ce1a866 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -310,7 +310,8 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
* prevent preemption and reschedule on another processor,
* as well as CPU removal
*/
- this_cpu = get_cpu();
+ get_online_cpus_atomic();
+ this_cpu = smp_processor_id();
/*
* Can deadlock when called with interrupts disabled.
@@ -342,7 +343,7 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
}
}
- put_cpu();
+ put_online_cpus_atomic();
return err;
}
@@ -371,8 +372,10 @@ int smp_call_function_any(const struct cpumask *mask,
const struct cpumask *nodemask;
int ret;
+ get_online_cpus_atomic();
/* Try for same CPU (cheapest) */
- cpu = get_cpu();
+ cpu = smp_processor_id();
+
if (cpumask_test_cpu(cpu, mask))
goto call;
@@ -388,7 +391,7 @@ int smp_call_function_any(const struct cpumask *mask,
cpu = cpumask_any_and(mask, cpu_online_mask);
call:
ret = smp_call_function_single(cpu, func, info, wait);
- put_cpu();
+ put_online_cpus_atomic();
return ret;
}
EXPORT_SYMBOL_GPL(smp_call_function_any);
@@ -409,14 +412,17 @@ void __smp_call_function_single(int cpu, struct call_single_data *data,
unsigned int this_cpu;
unsigned long flags;
- this_cpu = get_cpu();
+ get_online_cpus_atomic();
+
+ this_cpu = smp_processor_id();
+
/*
* Can deadlock when called with interrupts disabled.
* We allow cpu's that are not yet online though, as no one else can
* send smp call function interrupt to this cpu and as such deadlocks
* can't happen.
*/
- WARN_ON_ONCE(cpu_online(smp_processor_id()) && wait && irqs_disabled()
+ WARN_ON_ONCE(cpu_online(this_cpu) && wait && irqs_disabled()
&& !oops_in_progress);
if (cpu == this_cpu) {
@@ -427,7 +433,7 @@ void __smp_call_function_single(int cpu, struct call_single_data *data,
csd_lock(data);
generic_exec_single(cpu, data, wait);
}
- put_cpu();
+ put_online_cpus_atomic();
}
/**
@@ -451,6 +457,8 @@ void smp_call_function_many(const struct cpumask *mask,
unsigned long flags;
int refs, cpu, next_cpu, this_cpu = smp_processor_id();
+ get_online_cpus_atomic();
+
/*
* Can deadlock when called with interrupts disabled.
* We allow cpu's that are not yet online though, as no one else can
@@ -467,17 +475,18 @@ void smp_call_function_many(const struct cpumask *mask,
/* No online cpus? We're done. */
if (cpu >= nr_cpu_ids)
- return;
+ goto out_unlock;
/* Do we have another CPU which isn't us? */
next_cpu = cpumask_next_and(cpu, mask, cpu_online_mask);
if (next_cpu == this_cpu)
- next_cpu = cpumask_next_and(next_cpu, mask, cpu_online_mask);
+ next_cpu = cpumask_next_and(next_cpu, mask,
+ cpu_online_mask);
/* Fastpath: do that cpu by itself. */
if (next_cpu >= nr_cpu_ids) {
smp_call_function_single(cpu, func, info, wait);
- return;
+ goto out_unlock;
}
data = &__get_cpu_var(cfd_data);
@@ -523,7 +532,7 @@ void smp_call_function_many(const struct cpumask *mask,
/* Some callers race with other cpus changing the passed mask */
if (unlikely(!refs)) {
csd_unlock(&data->csd);
- return;
+ goto out_unlock;
}
raw_spin_lock_irqsave(&call_function.lock, flags);
@@ -554,6 +563,9 @@ void smp_call_function_many(const struct cpumask *mask,
/* Optionally wait for the CPUs to complete */
if (wait)
csd_lock_wait(&data->csd);
+
+out_unlock:
+ put_online_cpus_atomic();
}
EXPORT_SYMBOL(smp_call_function_many);
@@ -574,9 +586,9 @@ EXPORT_SYMBOL(smp_call_function_many);
*/
int smp_call_function(smp_call_func_t func, void *info, int wait)
{
- preempt_disable();
+ get_online_cpus_atomic();
smp_call_function_many(cpu_online_mask, func, info, wait);
- preempt_enable();
+ put_online_cpus_atomic();
return 0;
}
^ permalink raw reply related
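The smp_call_function_many() hunk above replaces each early "return" with "goto out_unlock" so that every exit path drops the reference taken on entry. A minimal user-space sketch of that pattern follows; get_ref(), put_ref(), call_many_sketch() and the depth counter are stand-ins assumed for illustration, not kernel APIs.

```c
#include <assert.h>

static int hotplug_read_depth;	/* stand-in for the reader-side reference */

static void get_ref(void)
{
	hotplug_read_depth++;
}

static void put_ref(void)
{
	assert(hotplug_read_depth > 0);
	hotplug_read_depth--;
}

/*
 * Returns the number of targets "called". Every exit path goes
 * through out_unlock, so get_ref() and put_ref() always balance.
 */
static int call_many_sketch(int nr_targets)
{
	int called = 0;

	get_ref();

	if (nr_targets <= 0)		/* nothing to do */
		goto out_unlock;

	if (nr_targets == 1) {		/* single-target fast path */
		called = 1;
		goto out_unlock;
	}

	called = nr_targets;		/* general path */

out_unlock:
	put_ref();
	return called;
}
```

Whatever branch is taken, the depth counter is back at zero on return, which is the property the converted function needs once the get/put pair replaces the caller's preempt_disable().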
* Re: [RFC PATCH v3 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Tejun Heo @ 2012-12-11 14:07 UTC (permalink / raw)
To: Srivatsa S. Bhat
Cc: Oleg Nesterov, tglx, peterz, paulmck, rusty, mingo, akpm,
namhyung, vincent.guittot, sbw, amit.kucheria, rostedt, rjw,
wangyun, xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <50C73CE5.1080201@linux.vnet.ibm.com>
Hello,
On Tue, Dec 11, 2012 at 07:32:13PM +0530, Srivatsa S. Bhat wrote:
> On 12/11/2012 07:17 PM, Tejun Heo wrote:
> > Hello, Srivatsa.
> >
> > On Tue, Dec 11, 2012 at 06:43:54PM +0530, Srivatsa S. Bhat wrote:
> >> This approach (of using synchronize_sched()) also looks good. It is simple,
> >> yet effective, but unfortunately inefficient at the writer side (because
> >> he'll have to wait for a full synchronize_sched()).
> >
> > While synchronize_sched() is heavier on the writer side than the
> > originally posted version, it doesn't stall the whole machine and
> > wouldn't introduce latencies to others. Shouldn't that be enough?
> >
>
> Short answer: Yes. But we can do better, with almost comparable code
> complexity. So I'm tempted to try that out.
>
> Long answer:
> Even in the synchronize_sched() approach, we still have to identify the
> readers who need to be converted to use the new get/put_online_cpus_atomic()
> APIs and convert them. Then, if we can come up with a scheme such that
> the writer has to wait only for those readers to complete, then why not?
>
> If such a scheme ends up becoming too complicated, then I agree, we
> can use synchronize_sched() itself. (That's what I meant by saying that
> we'll use this as a fallback).
>
> But even in this scheme which uses synchronize_sched(), we are
> already half-way through (we already use 2 types of sync schemes -
> counters and rwlocks). Just a little more logic can get rid of the
> unnecessary full-wait too.. So why not give it a shot?
It's not really about the code complexity but about making the reader side
as light as possible. Please keep in mind that the reader side is still
*way* hotter than the writer side. Before, the writer side was
heavy to the extent that it caused noticeable disruptions on the whole
system and I think that's what we're trying to hunt down here. If we
can shave off memory barriers from the reader side by using
synchronize_sched() on the writer side, that is the *better* result, not
worse.
Thanks.
--
tejun
^ permalink raw reply
* [RFC PATCH v4 6/9] kick_process(), cpu-hotplug: Prevent offlining of target CPU properly
From: Srivatsa S. Bhat @ 2012-12-11 14:05 UTC (permalink / raw)
To: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
vincent.guittot, tj, oleg
Cc: sbw, amit.kucheria, rostedt, rjw, srivatsa.bhat, wangyun,
xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121211140314.23621.64088.stgit@srivatsabhat.in.ibm.com>
Once stop_machine() is gone from the CPU offline path, we won't be able to
depend on preempt_disable() to prevent CPUs from going offline from under us.
Use the get/put_online_cpus_atomic() APIs to prevent CPUs from going offline,
while invoking from atomic context.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
kernel/sched/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f51e0aa..cff7656 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1091,11 +1091,11 @@ void kick_process(struct task_struct *p)
{
int cpu;
- preempt_disable();
+ get_online_cpus_atomic();
cpu = task_cpu(p);
if ((cpu != smp_processor_id()) && task_curr(p))
smp_send_reschedule(cpu);
- preempt_enable();
+ put_online_cpus_atomic();
}
EXPORT_SYMBOL_GPL(kick_process);
#endif /* CONFIG_SMP */
^ permalink raw reply related
* [RFC PATCH v4 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Srivatsa S. Bhat @ 2012-12-11 14:04 UTC (permalink / raw)
To: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
vincent.guittot, tj, oleg
Cc: sbw, amit.kucheria, rostedt, rjw, srivatsa.bhat, wangyun,
xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121211140314.23621.64088.stgit@srivatsabhat.in.ibm.com>
There are places where preempt_disable() is used to prevent any CPU from
going offline during the critical section. Let us call them "atomic
hotplug readers" ("atomic" because they run in atomic contexts).
Today, preempt_disable() works because the writer uses stop_machine().
But once stop_machine() is gone, the readers won't be able to prevent
CPUs from going offline using preempt_disable().
The intent of this patch is to provide synchronization APIs for such
atomic hotplug readers, to prevent (any) CPUs from going offline, without
depending on stop_machine() at the writer-side. The new APIs will look
something like this: get/put_online_cpus_atomic()
Some important design requirements and considerations:
-----------------------------------------------------
1. Scalable synchronization at the reader-side, especially in the fast-path
Any synchronization at the atomic hotplug reader side must be highly
scalable: avoid global single-holder locks/counters etc., because these
paths currently use the extremely fast preempt_disable(). Our replacement
for preempt_disable() should not become ridiculously costly, nor should
it serialize the readers among themselves needlessly.
At a minimum, the new APIs must be extremely fast at the reader side,
at least in the fast-path, when no CPU offline writers are active.
2. preempt_disable() was recursive. The replacement should also be recursive.
3. No (new) lock-ordering restrictions
preempt_disable() was super-flexible. It didn't impose any ordering
restrictions or rules for nesting. Our replacement should also be equally
flexible and usable.
4. No deadlock possibilities
Regular per-cpu locking is not the way to go if we want to have relaxed
rules for lock-ordering, because we can end up with circular-locking
dependencies, as explained in https://lkml.org/lkml/2012/12/6/290
So, avoid the usual per-cpu locking schemes (per-cpu locks/per-cpu atomic
counters with spin-on-contention etc) as much as possible.
Implementation of the design:
----------------------------
We use global rwlocks for synchronization, because then we won't get into
lock-ordering related problems (unlike per-cpu locks). However, global
rwlocks lead to unnecessary cache-line bouncing even when there are no
hotplug writers present, which can slow down the system needlessly.
Per-cpu counters can help solve the cache-line bouncing problem. So we
actually use the best of both: per-cpu counters (no-waiting) at the reader
side in the fast-path, and global rwlocks in the slowpath.
[ Fastpath = no writer is active; Slowpath = a writer is active ]
IOW, the hotplug readers just increment/decrement their per-cpu refcounts
when no writer is active. When a writer becomes active, he signals all
readers to switch to global rwlocks for the duration of the CPU offline
operation. The readers switch over when it is safe for them (ie., when they
are about to start a fresh, non-nested read-side critical section) and
start using (holding) the global rwlock for read in their subsequent critical
sections.
The hotplug writer waits for every reader to switch, and then acquires
the global rwlock for write and takes the CPU offline. Then the writer
signals all readers that the CPU offline is done, and that they can go back
to using their per-cpu refcounts again.
Note that the lock-safety (despite the per-cpu scheme) comes from the fact
that the readers can *choose* _when_ to switch to rwlocks upon the writer's
signal. And the readers don't wait on anybody based on the per-cpu counters.
The only true synchronization that involves waiting at the reader-side in this
scheme, is the one arising from the global rwlock, which is safe from the
circular locking dependency problems mentioned above (because it is global).
Reader-writer locks and per-cpu counters are recursive, so they can be
used in a nested fashion in the reader-path. Also, this design of switching
the synchronization scheme ensures that you can safely nest and call these
APIs in any way you want, just like preempt_disable()/enable.
Together, these satisfy all the requirements mentioned above.
I'm indebted to Michael Wang and Xiao Guangrong for their numerous thoughtful
suggestions and ideas, which inspired and influenced many of the decisions in
this as well as previous designs. Thanks a lot Michael and Xiao!
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
include/linux/cpu.h | 4 +
kernel/cpu.c | 204 ++++++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 205 insertions(+), 3 deletions(-)
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index ce7a074..cf24da1 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -175,6 +175,8 @@ extern struct bus_type cpu_subsys;
extern void get_online_cpus(void);
extern void put_online_cpus(void);
+extern void get_online_cpus_atomic(void);
+extern void put_online_cpus_atomic(void);
#define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri)
#define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
#define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
@@ -198,6 +200,8 @@ static inline void cpu_hotplug_driver_unlock(void)
#define get_online_cpus() do { } while (0)
#define put_online_cpus() do { } while (0)
+#define get_online_cpus_atomic() do { } while (0)
+#define put_online_cpus_atomic() do { } while (0)
#define hotcpu_notifier(fn, pri) do { (void)(fn); } while (0)
/* These aren't inline functions due to a GCC bug. */
#define register_hotcpu_notifier(nb) ({ (void)(nb); 0; })
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 42bd331..5a63296 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -133,6 +133,119 @@ static void cpu_hotplug_done(void)
mutex_unlock(&cpu_hotplug.lock);
}
+/*
+ * Reader-writer lock to synchronize between atomic hotplug readers
+ * and the CPU offline hotplug writer.
+ */
+static DEFINE_RWLOCK(hotplug_rwlock);
+
+static DEFINE_PER_CPU(int, reader_percpu_refcnt);
+static DEFINE_PER_CPU(bool, writer_signal);
+
+
+#define reader_uses_percpu_refcnt(cpu) \
+ (ACCESS_ONCE(per_cpu(reader_percpu_refcnt, cpu)))
+
+#define reader_nested_percpu() \
+ (__this_cpu_read(reader_percpu_refcnt) > 1)
+
+#define writer_active() \
+ (__this_cpu_read(writer_signal))
+
+
+
+/*
+ * Invoked by hotplug reader, to prevent CPUs from going offline.
+ *
+ * If there are no CPU offline writers active, just increment the
+ * per-cpu counter 'reader_percpu_refcnt' and proceed.
+ *
+ * If a CPU offline hotplug writer is active, we'll need to switch from
+ * per-cpu refcounts to the global rwlock, when the time is right.
+ *
+ * It is not safe to switch the synchronization scheme when we are
+ * already in a read-side critical section which uses per-cpu refcounts.
+ * Also, we don't want to allow heterogeneous readers to nest inside
+ * each other, to avoid complications in put_online_cpus_atomic().
+ *
+ * Once you switch, keep using the rwlocks for synchronization, until
+ * the writer signals the end of CPU offline.
+ *
+ * You can call this recursively, without fear of locking problems.
+ *
+ * Returns with preemption disabled.
+ */
+void get_online_cpus_atomic(void)
+{
+ unsigned long flags;
+
+ preempt_disable();
+
+ if (cpu_hotplug.active_writer == current)
+ return;
+
+ local_irq_save(flags);
+
+ /*
+ * Use the percpu refcounts by default. Switch over to rwlock (if
+ * necessary) later on. This helps avoid several race conditions
+ * as well.
+ */
+ __this_cpu_inc(reader_percpu_refcnt);
+
+ smp_rmb(); /* Paired with smp_mb() in announce_cpu_offline_begin(). */
+
+ /*
+ * We must not allow heterogeneous nesting of readers (ie., readers
+ * using percpu refcounts to nest with readers using rwlocks).
+ * So don't switch the synchronization scheme if we are currently
+ * using percpu refcounts.
+ */
+ if (!reader_nested_percpu() && unlikely(writer_active())) {
+
+ read_lock(&hotplug_rwlock);
+
+ /*
+ * We might have raced with a writer going inactive before we
+ * took the read-lock. So re-evaluate whether we still need to
+ * use the rwlock or if we can switch back to percpu refcounts.
+ * (This also helps avoid heterogeneous nesting of readers).
+ */
+ if (writer_active())
+ __this_cpu_dec(reader_percpu_refcnt);
+ else
+ read_unlock(&hotplug_rwlock);
+ }
+
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL_GPL(get_online_cpus_atomic);
+
+void put_online_cpus_atomic(void)
+{
+ unsigned long flags;
+
+ if (cpu_hotplug.active_writer == current)
+ goto out;
+
+ local_irq_save(flags);
+
+ /*
+ * We never allow heterogeneous nesting of readers. So it is trivial
+ * to find out the kind of reader we are, and undo the operation
+ * done by our corresponding get_online_cpus_atomic().
+ */
+ if (__this_cpu_read(reader_percpu_refcnt))
+ __this_cpu_dec(reader_percpu_refcnt);
+ else
+ read_unlock(&hotplug_rwlock);
+
+ local_irq_restore(flags);
+out:
+ preempt_enable();
+}
+EXPORT_SYMBOL_GPL(put_online_cpus_atomic);
+
#else /* #if CONFIG_HOTPLUG_CPU */
static void cpu_hotplug_begin(void) {}
static void cpu_hotplug_done(void) {}
@@ -237,6 +350,61 @@ static inline void check_for_tasks(int cpu)
write_unlock_irq(&tasklist_lock);
}
+static inline void raise_writer_signal(unsigned int cpu)
+{
+ per_cpu(writer_signal, cpu) = true;
+}
+
+static inline void drop_writer_signal(unsigned int cpu)
+{
+ per_cpu(writer_signal, cpu) = false;
+}
+
+static void announce_cpu_offline_begin(void)
+{
+ unsigned int cpu;
+
+ for_each_online_cpu(cpu)
+ raise_writer_signal(cpu);
+
+ smp_mb();
+}
+
+static void announce_cpu_offline_end(unsigned int dead_cpu)
+{
+ unsigned int cpu;
+
+ drop_writer_signal(dead_cpu);
+
+ for_each_online_cpu(cpu)
+ drop_writer_signal(cpu);
+
+ smp_mb();
+}
+
+/*
+ * Wait for the reader to see the writer's signal and switch from percpu
+ * refcounts to global rwlock.
+ *
+ * If the reader is still using percpu refcounts, wait for him to switch.
+ * Else, we can safely go ahead, because either the reader has already
+ * switched over, or the next atomic hotplug reader who comes along on this
+ * CPU will notice the writer's signal and will switch over to the rwlock.
+ */
+static inline void sync_atomic_reader(unsigned int cpu)
+{
+ while (reader_uses_percpu_refcnt(cpu))
+ cpu_relax();
+}
+
+static void sync_all_readers(void)
+{
+ unsigned int cpu;
+
+ for_each_online_cpu(cpu)
+ sync_atomic_reader(cpu);
+}
+
struct take_cpu_down_param {
unsigned long mod;
void *hcpu;
@@ -246,15 +414,45 @@ struct take_cpu_down_param {
static int __ref take_cpu_down(void *_param)
{
struct take_cpu_down_param *param = _param;
- int err;
+ unsigned long flags;
+ unsigned int cpu = (long)(param->hcpu);
+ int err = 0;
+
+
+ /*
+ * Inform all atomic readers that we are going to offline a CPU
+ * and that they need to switch from per-cpu refcounts to the
+ * global hotplug_rwlock.
+ */
+ announce_cpu_offline_begin();
+
+ /* Wait for every reader to notice the announcement and switch over */
+ sync_all_readers();
+
+ /*
+ * Now all the readers have switched to using the global hotplug_rwlock.
+ * So now is our chance, go bring down the CPU!
+ */
+
+ write_lock_irqsave(&hotplug_rwlock, flags);
/* Ensure this CPU doesn't handle any more interrupts. */
err = __cpu_disable();
if (err < 0)
- return err;
+ goto out;
cpu_notify(CPU_DYING | param->mod, param->hcpu);
- return 0;
+
+out:
+ /*
+ * Inform all atomic readers that we are done with the CPU offline
+ * operation, so that they can switch back to their per-cpu refcounts.
+ * (We don't need to wait for them to see it).
+ */
+ announce_cpu_offline_end(cpu);
+
+ write_unlock_irqrestore(&hotplug_rwlock, flags);
+ return err;
}
/* Requires cpu_add_remove_lock to be held */
^ permalink raw reply related
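The reader-side decision in get/put_online_cpus_atomic() above can be followed in a single-threaded user-space model. This is only a sketch under stated assumptions: the rwlock is reduced to a count of read-side holds, there is no preemption, interrupt masking or memory ordering, and every identifier (get_sketch, put_sketch, refcnt, writer_signal_sketch, rwlock_read_holds) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

static int  refcnt[SKETCH_NR_CPUS];			/* models reader_percpu_refcnt */
static bool writer_signal_sketch[SKETCH_NR_CPUS];	/* models writer_signal */
static int  rwlock_read_holds;				/* models read-side holds of the rwlock */

static void get_sketch(int cpu)
{
	/* Use the per-cpu refcount by default. */
	refcnt[cpu]++;

	/* Switch only if not nested inside a refcount-based reader. */
	if (refcnt[cpu] == 1 && writer_signal_sketch[cpu]) {
		rwlock_read_holds++;		/* models read_lock() */

		/* Re-check: the writer may have finished in the meantime. */
		if (writer_signal_sketch[cpu])
			refcnt[cpu]--;		/* stay on the rwlock */
		else
			rwlock_read_holds--;	/* models read_unlock() */
	}
}

static void put_sketch(int cpu)
{
	/* A non-zero refcount means this reader used the refcount scheme. */
	if (refcnt[cpu]) {
		refcnt[cpu]--;
	} else {
		assert(rwlock_read_holds > 0);
		rwlock_read_holds--;		/* models read_unlock() */
	}
}
```

In this model, a reader that started before the writer's signal keeps using the refcount even when it nests after the signal is raised, whereas a fresh reader that sees the signal ends up holding the rwlock with its refcount back at zero. That is how put_sketch() can tell the two kinds of reader apart without extra state.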
* [RFC PATCH v4 9/9] cpu: No more __stop_machine() in _cpu_down()
From: Srivatsa S. Bhat @ 2012-12-11 14:05 UTC (permalink / raw)
To: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
vincent.guittot, tj, oleg
Cc: sbw, amit.kucheria, rostedt, rjw, srivatsa.bhat, wangyun,
xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121211140314.23621.64088.stgit@srivatsabhat.in.ibm.com>
From: Paul E. McKenney <paul.mckenney@linaro.org>
The _cpu_down() function invoked as part of the CPU-hotplug offlining
process currently invokes __stop_machine(), which is slow and inflicts
substantial real-time latencies on the entire system. This patch
substitutes stop_one_cpu() for __stop_machine() in order to improve
both performance and real-time latency.
This is currently unsafe, because there are a number of uses of
preempt_disable() that are intended to block CPU-hotplug offlining.
These will be fixed by using get/put_online_cpus_atomic(), but in the
meantime, this commit is one way to help locate them.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ srivatsa.bhat@linux.vnet.ibm.com: Refer to the new sync primitives for
readers (in the changelog), and s/stop_cpus/stop_one_cpu ]
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
kernel/cpu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 5a63296..3f9498e 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -484,7 +484,7 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
}
smpboot_park_threads(cpu);
- err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));
+ err = stop_one_cpu(cpu, take_cpu_down, &tcd_param);
if (err) {
/* CPU didn't die: tell everyone. Can't complain. */
smpboot_unpark_threads(cpu);
^ permalink raw reply related
* [RFC PATCH v4 8/9] kvm, vmx: Add atomic synchronization with CPU Hotplug
From: Srivatsa S. Bhat @ 2012-12-11 14:05 UTC (permalink / raw)
To: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
vincent.guittot, tj, oleg
Cc: sbw, amit.kucheria, rostedt, rjw, srivatsa.bhat, wangyun,
xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121211140314.23621.64088.stgit@srivatsabhat.in.ibm.com>
preempt_disable() will no longer help prevent CPUs from going offline, once
stop_machine() gets removed from the CPU offline path. So use
get/put_online_cpus_atomic() in vmx_vcpu_load() to prevent CPUs from
going offline while clearing vmcs.
Reported-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Debugged-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
arch/x86/kvm/vmx.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f858159..d8a4cf1 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1519,10 +1519,14 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
- if (!vmm_exclusive)
+ if (!vmm_exclusive) {
kvm_cpu_vmxon(phys_addr);
- else if (vmx->loaded_vmcs->cpu != cpu)
+ } else if (vmx->loaded_vmcs->cpu != cpu) {
+ /* Prevent any CPU from going offline */
+ get_online_cpus_atomic();
loaded_vmcs_clear(vmx->loaded_vmcs);
+ put_online_cpus_atomic();
+ }
if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
^ permalink raw reply related
* [RFC PATCH v4 0/9] CPU hotplug: stop_machine()-free CPU hotplug
From: Srivatsa S. Bhat @ 2012-12-11 14:03 UTC (permalink / raw)
To: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
vincent.guittot, tj, oleg
Cc: sbw, amit.kucheria, rostedt, rjw, srivatsa.bhat, wangyun,
xiaoguangrong, nikunj, linux-pm, linux-kernel
Hi,
This patchset removes CPU hotplug's dependence on stop_machine() from the CPU
offline path and provides an alternative (set of APIs) to preempt_disable() to
prevent CPUs from going offline, which can be invoked from atomic context.
This is an RFC patchset with only a few call-sites of preempt_disable()
converted to the new APIs for now, and the main goal is to get feedback on the
design of the new atomic APIs and see if they serve as a viable replacement for
preempt_disable() in stop_machine()-free CPU hotplug. A brief description of
the algorithm is available in the "Changes in vN" section.
Overview of the patches:
-----------------------
Patch 1 introduces the new APIs that can be used from atomic context, to
prevent CPUs from going offline.
Patch 2 is a cleanup; it converts preprocessor macros to static inline
functions.
Patches 3 to 8 convert various call-sites to use the new APIs.
Patch 9 is the one which actually removes stop_machine() from the CPU
offline path.
Changes in v4:
--------------
The synchronization scheme has been simplified quite a bit, which makes it
look a lot less complex than before. Some highlights:
* Implicit ACKs:
The earlier design required the readers to explicitly ACK the writer's
signal. The new design uses implicit ACKs instead. The reader switching
over to rwlock implicitly tells the writer to stop waiting for that reader.
* No atomic operations:
Since we got rid of explicit ACKs, we no longer have the need for a reader
and a writer to update the same counter. So we can get rid of atomic ops
too.
Changes in v3:
--------------
* Dropped the _light() and _full() variants of the APIs. Provided a single
interface: get/put_online_cpus_atomic().
* Completely redesigned the synchronization mechanism again, to make it
fast and scalable at the reader-side in the fast-path (when no hotplug
writers are active). This new scheme also ensures that there is no
possibility of deadlocks due to circular locking dependency.
In summary, this provides the scalability and speed of per-cpu rwlocks
(without actually using them), while avoiding the downside (deadlock
possibilities) which is inherent in any per-cpu locking scheme that is
meant to compete with preempt_disable()/enable() in terms of flexibility.
The problem with using per-cpu locking to replace preempt_disable()/enable()
was explained here:
https://lkml.org/lkml/2012/12/6/290
Basically we use per-cpu counters (for scalability) when no writers are
active, and then switch to global rwlocks (for lock-safety) when a writer
becomes active. It is a slightly complex scheme, but it is based on
standard principles of distributed algorithms.
Changes in v2:
-------------
* Completely redesigned the synchronization scheme to avoid using any extra
cpumasks.
* Provided APIs for 2 types of atomic hotplug readers: "light" (for
light-weight) and "full". We wish to have more "light" readers than
the "full" ones, to avoid indirectly inducing the "stop_machine effect"
without even actually using stop_machine().
And the patches show that it _is_ generally true: 5 patches deal with
"light" readers, whereas only 1 patch deals with a "full" reader.
Also, the "light" readers happen to be in very hot paths. So it makes a
lot of sense to have such a distinction and a corresponding light-weight
API.
Links to previous versions:
v3: https://lkml.org/lkml/2012/12/7/287
v2: https://lkml.org/lkml/2012/12/5/322
v1: https://lkml.org/lkml/2012/12/4/88
Comments and suggestions welcome!
--
Paul E. McKenney (1):
cpu: No more __stop_machine() in _cpu_down()
Srivatsa S. Bhat (8):
CPU hotplug: Provide APIs to prevent CPU offline from atomic context
CPU hotplug: Convert preprocessor macros to static inline functions
smp, cpu hotplug: Fix smp_call_function_*() to prevent CPU offline properly
smp, cpu hotplug: Fix on_each_cpu_*() to prevent CPU offline properly
sched, cpu hotplug: Use stable online cpus in try_to_wake_up() & select_task_rq()
kick_process(), cpu-hotplug: Prevent offlining of target CPU properly
yield_to(), cpu-hotplug: Prevent offlining of other CPUs properly
kvm, vmx: Add atomic synchronization with CPU Hotplug
arch/x86/kvm/vmx.c | 8 +-
include/linux/cpu.h | 8 +-
kernel/cpu.c | 206 ++++++++++++++++++++++++++++++++++++++++++++++++++-
kernel/sched/core.c | 22 +++++
kernel/smp.c | 63 ++++++++++------
5 files changed, 273 insertions(+), 34 deletions(-)
Thanks,
Srivatsa S. Bhat
IBM Linux Technology Center
^ permalink raw reply
* [RFC PATCH v4 7/9] yield_to(), cpu-hotplug: Prevent offlining of other CPUs properly
From: Srivatsa S. Bhat @ 2012-12-11 14:05 UTC (permalink / raw)
To: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
vincent.guittot, tj, oleg
Cc: sbw, amit.kucheria, rostedt, rjw, srivatsa.bhat, wangyun,
xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121211140314.23621.64088.stgit@srivatsabhat.in.ibm.com>
Once stop_machine() is gone from the CPU offline path, we won't be able to
depend on local_irq_save() to prevent CPUs from going offline from under us.
Use the get/put_online_cpus_atomic() APIs, which can be invoked from atomic
context, to prevent CPUs from going offline.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
kernel/sched/core.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index cff7656..4b982bf 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4312,6 +4312,7 @@ bool __sched yield_to(struct task_struct *p, bool preempt)
unsigned long flags;
bool yielded = 0;
+ get_online_cpus_atomic();
local_irq_save(flags);
rq = this_rq();
@@ -4339,13 +4340,14 @@ again:
* Make p's CPU reschedule; pick_next_entity takes care of
* fairness.
*/
- if (preempt && rq != p_rq)
+ if (preempt && rq != p_rq && cpu_online(task_cpu(p)))
resched_task(p_rq->curr);
}
out:
double_rq_unlock(rq, p_rq);
local_irq_restore(flags);
+ put_online_cpus_atomic();
if (yielded)
schedule();
^ permalink raw reply related
* [RFC PATCH v4 5/9] sched, cpu hotplug: Use stable online cpus in try_to_wake_up() & select_task_rq()
From: Srivatsa S. Bhat @ 2012-12-11 14:05 UTC (permalink / raw)
To: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
vincent.guittot, tj, oleg
Cc: sbw, amit.kucheria, rostedt, rjw, srivatsa.bhat, wangyun,
xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121211140314.23621.64088.stgit@srivatsabhat.in.ibm.com>
Once stop_machine() is gone from the CPU offline path, we won't be able to
depend on preempt_disable() to prevent CPUs from going offline from under us.
Use the get/put_online_cpus_atomic() APIs, which can be invoked from atomic
context, to prevent CPUs from going offline.
Scheduler functions such as try_to_wake_up() and select_task_rq() (and even
select_fallback_rq()) deal with picking new CPUs to run tasks. So they need
to synchronize with CPU offline operations.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
kernel/sched/core.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2d8927f..f51e0aa 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1103,6 +1103,10 @@ EXPORT_SYMBOL_GPL(kick_process);
#ifdef CONFIG_SMP
/*
* ->cpus_allowed is protected by both rq->lock and p->pi_lock
+ *
+ * Must be called under get/put_online_cpus_atomic() or
+ * equivalent, to avoid CPUs from going offline from underneath
+ * us.
*/
static int select_fallback_rq(int cpu, struct task_struct *p)
{
@@ -1166,6 +1170,9 @@ out:
/*
* The caller (fork, wakeup) owns p->pi_lock, ->cpus_allowed is stable.
+ *
+ * Must be called under get/put_online_cpus_atomic(), to prevent
+ * CPUs from going offline from underneath us.
*/
static inline
int select_task_rq(struct task_struct *p, int sd_flags, int wake_flags)
@@ -1406,6 +1413,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
int cpu, success = 0;
smp_wmb();
+ get_online_cpus_atomic();
raw_spin_lock_irqsave(&p->pi_lock, flags);
if (!(p->state & state))
goto out;
@@ -1446,6 +1454,7 @@ stat:
ttwu_stat(p, cpu, wake_flags);
out:
raw_spin_unlock_irqrestore(&p->pi_lock, flags);
+ put_online_cpus_atomic();
return success;
}
@@ -1624,6 +1633,7 @@ void wake_up_new_task(struct task_struct *p)
unsigned long flags;
struct rq *rq;
+ get_online_cpus_atomic();
raw_spin_lock_irqsave(&p->pi_lock, flags);
#ifdef CONFIG_SMP
/*
@@ -1644,6 +1654,7 @@ void wake_up_new_task(struct task_struct *p)
p->sched_class->task_woken(rq, p);
#endif
task_rq_unlock(rq, p, &flags);
+ put_online_cpus_atomic();
}
#ifdef CONFIG_PREEMPT_NOTIFIERS
@@ -2541,6 +2552,7 @@ void sched_exec(void)
unsigned long flags;
int dest_cpu;
+ get_online_cpus_atomic();
raw_spin_lock_irqsave(&p->pi_lock, flags);
dest_cpu = p->sched_class->select_task_rq(p, SD_BALANCE_EXEC, 0);
if (dest_cpu == smp_processor_id())
@@ -2550,11 +2562,13 @@ void sched_exec(void)
struct migration_arg arg = { p, dest_cpu };
raw_spin_unlock_irqrestore(&p->pi_lock, flags);
+ put_online_cpus_atomic();
stop_one_cpu(task_cpu(p), migration_cpu_stop, &arg);
return;
}
unlock:
raw_spin_unlock_irqrestore(&p->pi_lock, flags);
+ put_online_cpus_atomic();
}
#endif
^ permalink raw reply related
* Re: [RFC PATCH v3 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Srivatsa S. Bhat @ 2012-12-11 14:02 UTC (permalink / raw)
To: Tejun Heo
Cc: Oleg Nesterov, tglx, peterz, paulmck, rusty, mingo, akpm,
namhyung, vincent.guittot, sbw, amit.kucheria, rostedt, rjw,
wangyun, xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121211134758.GA7084@htj.dyndns.org>
On 12/11/2012 07:17 PM, Tejun Heo wrote:
> Hello, Srivatsa.
>
> On Tue, Dec 11, 2012 at 06:43:54PM +0530, Srivatsa S. Bhat wrote:
>> This approach (of using synchronize_sched()) also looks good. It is simple,
>> yet effective, but unfortunately inefficient at the writer side (because
>> he'll have to wait for a full synchronize_sched()).
>
> While synchronize_sched() is heavier on the writer side than the
> originally posted version, it doesn't stall the whole machine and
> wouldn't introduce latencies to others. Shouldn't that be enough?
>
Short answer: Yes. But we can do better, with almost comparable code
complexity. So I'm tempted to try that out.
Long answer:
Even in the synchronize_sched() approach, we still have to identify the
readers who need to be converted to use the new get/put_online_cpus_atomic()
APIs and convert them. Then, if we can come up with a scheme such that
the writer has to wait only for those readers to complete, then why not?
If such a scheme ends up becoming too complicated, then I agree, we
can use synchronize_sched() itself. (That's what I meant by saying that
we'll use this as a fallback).
But even in this scheme which uses synchronize_sched(), we are
already half-way through (we already use 2 types of sync schemes -
counters and rwlocks). Just a little more logic can get rid of the
unnecessary full-wait too.. So why not give it a shot?
Regards,
Srivatsa S. Bhat
^ permalink raw reply
* Re: [RFC PATCH v3 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Tejun Heo @ 2012-12-11 13:47 UTC (permalink / raw)
To: Srivatsa S. Bhat
Cc: Oleg Nesterov, tglx, peterz, paulmck, rusty, mingo, akpm,
namhyung, vincent.guittot, sbw, amit.kucheria, rostedt, rjw,
wangyun, xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <50C73192.9010905@linux.vnet.ibm.com>
Hello, Srivatsa.
On Tue, Dec 11, 2012 at 06:43:54PM +0530, Srivatsa S. Bhat wrote:
> This approach (of using synchronize_sched()) also looks good. It is simple,
> yet effective, but unfortunately inefficient at the writer side (because
> he'll have to wait for a full synchronize_sched()).
While synchronize_sched() is heavier on the writer side than the
originally posted version, it doesn't stall the whole machine and
wouldn't introduce latencies to others. Shouldn't that be enough?
Thanks.
--
tejun
^ permalink raw reply
* Re: [RFC PATCH v3 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Srivatsa S. Bhat @ 2012-12-11 13:13 UTC (permalink / raw)
To: Oleg Nesterov
Cc: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
vincent.guittot, tj, sbw, amit.kucheria, rostedt, rjw, wangyun,
xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121210172410.GA28479@redhat.com>
On 12/10/2012 10:54 PM, Oleg Nesterov wrote:
> On 12/10, Srivatsa S. Bhat wrote:
>>
>> On 12/10/2012 01:52 AM, Oleg Nesterov wrote:
>>> On 12/10, Srivatsa S. Bhat wrote:
>>>>
>>>> On 12/10/2012 12:44 AM, Oleg Nesterov wrote:
>>>>
>>>>> But yes, it is easy to blame somebody else's code ;) And I can't suggest
>>>>> something better at least right now. If I understand correctly, we can not
>>>>> use, say, synchronize_sched() in _cpu_down() path
>>>>
>>>> We can't sleep in that code.. so that's a no-go.
>>>
>>> But we can?
>>>
>>> Note that I meant _cpu_down(), not get_online_cpus_atomic() or take_cpu_down().
>>>
>>
>> Maybe I'm missing something, but how would it help if we did a
>> synchronize_sched() so early (in _cpu_down())? Another bunch of preempt_disable()
>> sections could start immediately after our call to synchronize_sched() no?
>> How would we deal with that?
>
> Sorry for confusion. Of course synchronize_sched() alone is not enough.
> But we can use it to synchronize with preempt-disabled section and avoid
> the barriers/atomic in the fast-path.
>
> For example,
>
> bool writer_pending;
> DEFINE_RWLOCK(writer_rwlock);
> DEFINE_PER_CPU(int, reader_ctr);
>
> void get_online_cpus_atomic(void)
> {
> preempt_disable();
>
> if (likely(!writer_pending) || __this_cpu_read(reader_ctr)) {
> __this_cpu_inc(reader_ctr);
> return;
> }
>
> read_lock(&writer_rwlock);
> __this_cpu_inc(reader_ctr);
> read_unlock(&writer_rwlock);
> }
>
> // lacks release semantics, but we don't care
> void put_online_cpus_atomic(void)
> {
> __this_cpu_dec(reader_ctr);
> preempt_enable();
> }
>
> Now, _cpu_down() does
>
> writer_pending = true;
> synchronize_sched();
>
> before stop_one_cpu(). When synchronize_sched() returns, we know that
> every get_online_cpus_atomic() must see writer_pending == T. And, if
> any CPU incremented its reader_ctr we must see it is not zero.
>
> take_cpu_down() does
>
> write_lock(&writer_rwlock);
>
> for_each_online_cpu(cpu) {
> while (per_cpu(reader_ctr, cpu))
> cpu_relax();
> }
>
> and takes the lock.
>
> However. This can lead to the deadlock we already discussed. So
> take_cpu_down() should do
>
> retry:
> write_lock(&writer_rwlock);
>
> for_each_online_cpu(cpu) {
> if (per_cpu(reader_ctr, cpu)) {
> write_unlock(&writer_rwlock);
> goto retry;
> }
> }
>
> to take the lock. But this is livelockable. However, I do not think it
> is possible to avoid the livelock.
>
> Just in case, the code above is only for illustration, perhaps it is not
> 100% correct and perhaps we can do it better. cpu_hotplug.active_writer
> is ignored for simplicity, get/put should check current == active_writer.
>
This approach (of using synchronize_sched()) also looks good. It is simple,
yet effective, but unfortunately inefficient at the writer side (because
he'll have to wait for a full synchronize_sched()).
So I would say that we should keep this approach as a fallback, if we don't
come up with any better synchronization scheme (in terms of efficiency) at
a comparable level of simplicity/complexity.
I have come up with a v4 that simplifies several aspects of the synchronization
and makes it look a lot simpler than v3. (Fewer race windows to take
care of, implicit ACKs, no atomic ops etc.)
I'll post it soon. Let me know what you think...
Regards,
Srivatsa S. Bhat
^ permalink raw reply
* Re: [RFC PATCH v3 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Srivatsa S. Bhat @ 2012-12-11 13:05 UTC (permalink / raw)
To: Oleg Nesterov
Cc: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
vincent.guittot, tj, sbw, amit.kucheria, rostedt, rjw, wangyun,
xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121210172859.GB28479@redhat.com>
On 12/10/2012 10:58 PM, Oleg Nesterov wrote:
> On 12/10, Srivatsa S. Bhat wrote:
>>
>> On 12/10/2012 02:43 AM, Oleg Nesterov wrote:
>>> Damn, sorry for noise. I missed this part...
>>>
>>> On 12/10, Srivatsa S. Bhat wrote:
>>>>
>>>> On 12/10/2012 12:44 AM, Oleg Nesterov wrote:
>>>>> the latency. And I guess something like kick_all_cpus_sync() is "too heavy".
>>>>
>>>> I hadn't considered that. Thinking of it, I don't think it would help us..
>>>> It won't get rid of the currently running preempt_disable() sections no?
>>>
>>> Sure. But (again, this is only my feeling so far) given that get_online_cpus_atomic()
>>> does cli/sti,
>>
>> Ah, that one! Actually, the only reason I do that cli/sti is because, potentially
>> interrupt handlers can be hotplug readers too. So we need to protect the portion
>> of the code of get_online_cpus_atomic() which is not re-entrant.
>
> Yes, I understand.
>
>>> this can help to implement ensure-the-readers-must-see-the-pending-writer.
>>> IOW this might help to implement sync-with-readers.
>>>
>>
>> 2 problems:
>>
>> 1. It won't help with cases like this:
>>
>> preempt_disable()
>> ...
>> preempt_disable()
>> ...
>> <------- Here
>> ...
>> preempt_enable()
>> ...
>> preempt_enable()
>
> No, I meant that kick_all_cpus_sync() can be used to synchronize with
> cli/sti in get_online_cpus_atomic(), just like synchronize_sched() does
> in the code I posted a minute ago.
>
Ah, OK.
>> 2. Part of the reason we want to get rid of stop_machine() is to avoid the
>> latency it induces on _all_ CPUs just to take *one* CPU offline. If we use
>> kick_all_cpus_sync(), we get into that territory again : we unfairly interrupt
>> every CPU, _even when_ that CPU's existing preempt_disabled() sections might
>> not actually be hotplug readers! (ie., not bothered about CPU Hotplug).
>
> I agree, that is why I said it is "too heavy".
>
Got it :)
Regards,
Srivatsa S. Bhat
^ permalink raw reply
* Re: [RFC PATCH v3 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Srivatsa S. Bhat @ 2012-12-11 13:04 UTC (permalink / raw)
To: Oleg Nesterov
Cc: rostedt, tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
vincent.guittot, tj, sbw, amit.kucheria, rjw, wangyun,
xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121210181521.GA30684@redhat.com>
On 12/10/2012 11:45 PM, Oleg Nesterov wrote:
> On 12/10, Srivatsa S. Bhat wrote:
>>
>> On 12/10/2012 02:27 AM, Oleg Nesterov wrote:
>>> However. If this is true, then compared to preempt_disable/stop_machine
>>> livelock is possible. Probably this is fine, we have the same problem with
>>> get_online_cpus(). But if we can accept this fact I feel we can simplify
>>> this somehow... Can't prove, only feel ;)
>>
>> Not sure I follow..
>
> I meant that write_lock_irqsave(&hotplug_rwlock) in take_cpu_down()
> can spin "forever".
>
> Suppose that reader_acked() == T on every CPU, so that
> get_online_cpus_atomic() always takes read_lock(&hotplug_rwlock).
>
> It is possible that this lock will be never released by readers,
>
> CPU_0 CPU_1
>
> get_online_cpus_atomic()
> get_online_cpus_atomic()
> put_online_cpus_atomic()
>
> get_online_cpus_atomic()
> put_online_cpus_atomic()
>
> get_online_cpus_atomic()
> put_online_cpus_atomic()
>
> and so on.
>
Right, and we can't do anything about it :(
Regards,
Srivatsa S. Bhat
^ permalink raw reply
* [GIT PULL] ACPI and power management updates for v3.8-rc1
From: Rafael J. Wysocki @ 2012-12-11 11:54 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Len Brown, Linux PM list, ACPI Devel Maling List, LKML
Hi Linus,
Please pull from the git repository at
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git pm+acpi-for-3.8-rc1
to receive power management updates for v3.8 with top-most commit
f316fc56555a5c3bcf6350f3d5ac26dd2c55f4cb
Merge branch 'acpi-enumeration'
on top of commit 9489e9dcae718d5fde988e4a684a0f55b5f94d17
Linux 3.7-rc7
Highlights:
* Introduction of device PM QoS flags allowing kernel subsystems and user space
to constrain the selection of device low-power states by adding binary
requirements (like whether or not it is allowed to remove power from the
given device entirely).
* ACPI device power management update allowing subsystems other than PCI to
use ACPI device PM more easily.
* ACPI device enumeration rework allowing additional kinds of devices
(platform, SPI, I2C) to be enumerated via ACPI in analogy with the
enumeration based on Device Trees. From Mika Westerberg, Adrian Hunter,
Mathias Nyman, Andy Shevchenko, and yours truly.
* ACPICA update to version 20121018. Fixes bugs, adds some new ACPI 5
features, cleans up some things and removes some differences between the
kernel's ACPICA code and the upstream. From Bob Moore and Lv Zheng.
* ACPI memory hotplug update from Wen Congyang and Yasuaki Ishimatsu.
* Introduction of acpi_handle_<level>() messaging macros and ACPI-based CPU
hot-remove support from Toshi Kani.
* ACPI EC updates from Feng Tang.
* cpufreq updates from Viresh Kumar, Fabio Baltieri and others.
* cpuidle changes to quickly notice governor prediction failure from
Youquan Song.
* Support for using multiple cpuidle drivers at the same time and cpuidle
cleanups from Daniel Lezcano.
* devfreq updates from Nishanth Menon and others.
* cpupower update from Thomas Renninger (Thomas is going to maintain that
tool going forward).
* Fixes and small cleanups all over the place.
Thanks!
Documentation/ABI/testing/sysfs-class-devfreq | 44 +-
Documentation/ABI/testing/sysfs-devices-power | 31 +
Documentation/ABI/testing/sysfs-devices-sun | 14 +
Documentation/acpi/enumeration.txt | 227 +++++
.../devicetree/bindings/cpufreq/cpufreq-spear.txt | 42 +
Documentation/power/pm_qos_interface.txt | 2 +-
arch/arm/Kconfig | 1 +
arch/ia64/include/asm/device.h | 3 -
arch/ia64/kernel/acpi.c | 2 +
arch/powerpc/platforms/pseries/processor_idle.c | 4 +-
arch/x86/include/asm/device.h | 3 -
arch/x86/kernel/acpi/boot.c | 6 +
arch/x86/kernel/acpi/sleep.c | 2 +
drivers/acpi/Kconfig | 6 +
drivers/acpi/Makefile | 6 +-
drivers/acpi/acpi_i2c.c | 103 +++
drivers/acpi/acpi_memhotplug.c | 193 ++---
drivers/acpi/acpi_pad.c | 8 +-
drivers/acpi/acpi_platform.c | 104 +++
drivers/acpi/acpica/Makefile | 3 +
drivers/acpi/acpica/acdebug.h | 94 ++-
drivers/acpi/acpica/acdispat.h | 11 +-
drivers/acpi/acpica/acevents.h | 6 +-
drivers/acpi/acpica/acglobal.h | 73 +-
drivers/acpi/acpica/aclocal.h | 16 +-
drivers/acpi/acpica/acmacros.h | 163 ++--
drivers/acpi/acpica/acobject.h | 7 +-
drivers/acpi/acpica/acopcode.h | 6 +-
drivers/acpi/acpica/acparser.h | 3 +-
drivers/acpi/acpica/acpredef.h | 11 +-
drivers/acpi/acpica/acstruct.h | 2 +-
drivers/acpi/acpica/acutils.h | 58 +-
drivers/acpi/acpica/amlresrc.h | 1 -
drivers/acpi/acpica/dscontrol.c | 2 +-
drivers/acpi/acpica/dsfield.c | 2 +-
drivers/acpi/acpica/dsmethod.c | 6 +-
drivers/acpi/acpica/dsmthdat.c | 14 +-
drivers/acpi/acpica/dsobject.c | 6 +-
drivers/acpi/acpica/dsopcode.c | 3 +-
drivers/acpi/acpica/dsutils.c | 33 +-
drivers/acpi/acpica/dswexec.c | 10 +-
drivers/acpi/acpica/dswload2.c | 4 +-
drivers/acpi/acpica/dswstate.c | 26 +-
drivers/acpi/acpica/evgpe.c | 20 +-
drivers/acpi/acpica/evgpeblk.c | 3 +-
drivers/acpi/acpica/evgpeutil.c | 3 +-
drivers/acpi/acpica/evrgnini.c | 7 +-
drivers/acpi/acpica/evxface.c | 2 +-
drivers/acpi/acpica/evxfgpe.c | 13 +-
drivers/acpi/acpica/exconvrt.c | 4 +-
drivers/acpi/acpica/excreate.c | 9 +-
drivers/acpi/acpica/exdebug.c | 10 +-
drivers/acpi/acpica/exdump.c | 20 +-
drivers/acpi/acpica/exfield.c | 4 +-
drivers/acpi/acpica/exfldio.c | 15 +-
drivers/acpi/acpica/exmisc.c | 5 +-
drivers/acpi/acpica/exmutex.c | 9 +-
drivers/acpi/acpica/exnames.c | 9 +-
drivers/acpi/acpica/exoparg1.c | 11 +-
drivers/acpi/acpica/exoparg2.c | 2 +-
drivers/acpi/acpica/exoparg3.c | 3 +-
drivers/acpi/acpica/exoparg6.c | 5 +-
drivers/acpi/acpica/exprep.c | 13 +-
drivers/acpi/acpica/exregion.c | 3 +-
drivers/acpi/acpica/exresnte.c | 9 +-
drivers/acpi/acpica/exresolv.c | 3 +-
drivers/acpi/acpica/exresop.c | 8 +-
drivers/acpi/acpica/exstore.c | 4 +-
drivers/acpi/acpica/exstoren.c | 11 +-
drivers/acpi/acpica/exstorob.c | 5 +-
drivers/acpi/acpica/exsystem.c | 9 +-
drivers/acpi/acpica/exutils.c | 5 +-
drivers/acpi/acpica/hwacpi.c | 3 +-
drivers/acpi/acpica/hwgpe.c | 4 +-
drivers/acpi/acpica/hwpci.c | 4 +-
drivers/acpi/acpica/hwregs.c | 1 -
drivers/acpi/acpica/hwtimer.c | 6 +-
drivers/acpi/acpica/hwvalid.c | 1 -
drivers/acpi/acpica/hwxface.c | 1 -
drivers/acpi/acpica/hwxfsleep.c | 12 +-
drivers/acpi/acpica/nsaccess.c | 7 +-
drivers/acpi/acpica/nsalloc.c | 4 +-
drivers/acpi/acpica/nsdump.c | 10 +-
drivers/acpi/acpica/nsinit.c | 4 +-
drivers/acpi/acpica/nsload.c | 10 +-
drivers/acpi/acpica/nsnames.c | 2 +-
drivers/acpi/acpica/nsobject.c | 8 +-
drivers/acpi/acpica/nsparse.c | 8 +-
drivers/acpi/acpica/nssearch.c | 17 +-
drivers/acpi/acpica/nsutils.c | 18 +-
drivers/acpi/acpica/nswalk.c | 10 +-
drivers/acpi/acpica/nsxfeval.c | 20 +-
drivers/acpi/acpica/nsxfname.c | 66 +-
drivers/acpi/acpica/nsxfobj.c | 4 +-
drivers/acpi/acpica/psargs.c | 8 +-
drivers/acpi/acpica/psloop.c | 61 +-
drivers/acpi/acpica/psopcode.c | 29 +-
drivers/acpi/acpica/psparse.c | 13 +-
drivers/acpi/acpica/psutils.c | 4 +-
drivers/acpi/acpica/rscalc.c | 14 +-
drivers/acpi/acpica/rslist.c | 4 +-
drivers/acpi/acpica/tbfind.c | 2 +-
drivers/acpi/acpica/tbinstal.c | 2 +
drivers/acpi/acpica/tbutils.c | 2 +-
drivers/acpi/acpica/tbxface.c | 4 +-
drivers/acpi/acpica/tbxfload.c | 2 +-
drivers/acpi/acpica/tbxfroot.c | 3 +-
drivers/acpi/acpica/utcache.c | 323 ++++++++
drivers/acpi/acpica/utclib.c | 749 +++++++++++++++++
drivers/acpi/acpica/utdebug.c | 37 +-
drivers/acpi/acpica/utids.c | 104 ++-
drivers/acpi/acpica/utmath.c | 2 +-
drivers/acpi/acpica/utmisc.c | 150 +++-
drivers/acpi/acpica/utmutex.c | 14 +-
drivers/acpi/acpica/utobject.c | 8 +-
drivers/acpi/acpica/utstate.c | 2 +-
drivers/acpi/acpica/uttrack.c | 692 ++++++++++++++++
drivers/acpi/acpica/utxface.c | 5 +-
drivers/acpi/acpica/utxferror.c | 2 +-
drivers/acpi/apei/ghes.c | 2 +-
drivers/acpi/battery.c | 77 ++
drivers/acpi/bus.c | 21 +-
drivers/acpi/container.c | 27 +-
drivers/acpi/device_pm.c | 668 +++++++++++++++
drivers/acpi/dock.c | 56 +-
drivers/acpi/ec.c | 97 ++-
drivers/acpi/glue.c | 56 +-
drivers/acpi/hed.c | 2 +-
drivers/acpi/internal.h | 11 +-
drivers/acpi/osl.c | 22 +-
drivers/acpi/pci_irq.c | 15 +-
drivers/acpi/power.c | 2 +-
drivers/acpi/proc.c | 11 +-
drivers/acpi/processor_driver.c | 74 +-
drivers/acpi/processor_idle.c | 57 +-
drivers/acpi/resource.c | 526 ++++++++++++
drivers/acpi/scan.c | 154 +++-
drivers/acpi/sleep.c | 535 +++++-------
drivers/acpi/sysfs.c | 4 +-
drivers/acpi/thermal.c | 34 +
drivers/acpi/utils.c | 38 +
drivers/acpi/video.c | 14 +
drivers/acpi/video_detect.c | 8 +
drivers/base/core.c | 2 +-
drivers/base/platform.c | 26 +-
drivers/base/power/clock_ops.c | 6 +-
drivers/base/power/domain.c | 11 +-
drivers/base/power/opp.c | 44 +-
drivers/base/power/power.h | 6 +-
drivers/base/power/qos.c | 321 +++++--
drivers/base/power/sysfs.c | 94 ++-
drivers/cpufreq/Kconfig.arm | 7 +
drivers/cpufreq/Makefile | 5 +-
drivers/cpufreq/cpufreq-cpu0.c | 2 +-
drivers/cpufreq/cpufreq.c | 37 +-
drivers/cpufreq/cpufreq_conservative.c | 558 ++++---------
drivers/cpufreq/cpufreq_governor.c | 318 +++++++
drivers/cpufreq/cpufreq_governor.h | 176 ++++
drivers/cpufreq/cpufreq_ondemand.c | 731 +++++-----------
drivers/cpufreq/cpufreq_performance.c | 2 +
drivers/cpufreq/cpufreq_powersave.c | 2 +
drivers/cpufreq/cpufreq_stats.c | 4 +-
drivers/cpufreq/cpufreq_userspace.c | 2 +
drivers/cpufreq/exynos-cpufreq.c | 11 +-
drivers/cpufreq/freq_table.c | 2 +
drivers/cpufreq/longhaul.c | 4 +-
drivers/cpufreq/powernow-k8.c | 4 +-
drivers/cpufreq/spear-cpufreq.c | 291 +++++++
drivers/cpuidle/Kconfig | 9 +
drivers/cpuidle/cpuidle.c | 55 +-
drivers/cpuidle/cpuidle.h | 13 +-
drivers/cpuidle/driver.c | 209 ++++-
drivers/cpuidle/governors/menu.c | 168 +++-
drivers/cpuidle/sysfs.c | 201 ++++-
drivers/devfreq/Kconfig | 8 +-
drivers/devfreq/devfreq.c | 921 ++++++++++++++-------
drivers/devfreq/exynos4_bus.c | 45 +-
drivers/devfreq/governor.h | 17 +
drivers/devfreq/governor_performance.c | 38 +-
drivers/devfreq/governor_powersave.c | 38 +-
drivers/devfreq/governor_simpleondemand.c | 55 +-
drivers/devfreq/governor_userspace.c | 45 +-
drivers/gpio/Kconfig | 4 +
drivers/gpio/Makefile | 1 +
drivers/gpio/gpiolib-acpi.c | 54 ++
drivers/i2c/i2c-core.c | 6 +
drivers/idle/intel_idle.c | 14 +-
drivers/mmc/host/Kconfig | 12 +
drivers/mmc/host/Makefile | 1 +
drivers/mmc/host/sdhci-acpi.c | 312 +++++++
drivers/mtd/nand/sh_flctl.c | 4 +-
drivers/pci/pci-acpi.c | 79 +-
drivers/pnp/base.h | 2 +
drivers/pnp/pnpacpi/core.c | 9 +-
drivers/pnp/pnpacpi/rsparser.c | 296 +------
drivers/pnp/resource.c | 16 +
drivers/spi/spi.c | 103 ++-
include/acpi/acconfig.h | 1 +
include/acpi/acexcep.h | 2 +-
include/acpi/acnames.h | 1 +
include/acpi/acpi_bus.h | 78 +-
include/acpi/acpiosxf.h | 3 +-
include/acpi/acpixf.h | 18 +-
include/acpi/actbl3.h | 22 +-
include/acpi/actypes.h | 42 +-
include/linux/acpi.h | 135 ++-
include/linux/acpi_gpio.h | 19 +
include/linux/cpufreq.h | 5 +-
include/linux/cpuidle.h | 15 +-
include/linux/devfreq.h | 136 +--
include/linux/device.h | 18 +
include/linux/freezer.h | 1 +
include/linux/i2c.h | 9 +
include/linux/platform_device.h | 1 +
include/linux/pm.h | 3 +-
include/linux/pm_qos.h | 77 +-
include/linux/tick.h | 6 +
kernel/cpu.c | 8 +-
kernel/power/main.c | 2 +-
kernel/power/qos.c | 65 +-
kernel/power/swap.c | 2 +-
kernel/time/tick-sched.c | 4 +
tools/power/cpupower/.gitignore | 7 +
tools/power/cpupower/Makefile | 3 +-
tools/power/cpupower/debug/i386/Makefile | 5 +-
tools/power/cpupower/man/cpupower-monitor.1 | 15 +-
tools/power/cpupower/utils/helpers/cpuid.c | 2 +
tools/power/cpupower/utils/helpers/helpers.h | 18 +-
tools/power/cpupower/utils/helpers/sysfs.c | 19 -
tools/power/cpupower/utils/helpers/topology.c | 53 +-
.../cpupower/utils/idle_monitor/cpupower-monitor.c | 21 +-
.../cpupower/utils/idle_monitor/cpupower-monitor.h | 17 +
tools/power/cpupower/utils/idle_monitor/snb_idle.c | 10 +-
233 files changed, 9470 insertions(+), 3326 deletions(-)
---------------
Aaron Lu (1):
ACPI / PM: Introduce os_accessible flag for power_state
Adrian Hunter (4):
ACPI / PNP: skip ACPI device nodes associated with physical nodes already
ACPI: add SDHCI to ACPI platform devices
mmc: sdhci-acpi: add SDHCI ACPI driver
mmc: sdhci-acpi: enable runtime-pm for device HID INT33C6
Alan Cox (1):
pnpacpi: fix incorrect TEST_ALPHA() test
Andreas Schwab (1):
cpufreq: fix jiffies/cputime mixup in conservative/ondemand governors
Andy Shevchenko (3):
ACPI / x86: Export acpi_[un]register_gsi()
ACPI / platform: include missed header into acpi_platform.c
ACPI: remove unnecessary INIT_LIST_HEAD
Bill Pemberton (4):
cpufreq: remove use of __devexit_p
cpufreq: remove use of __devinit
cpufreq: remove use of __devexit
ACPI: remove use of __devexit
Bob Moore (18):
Cleanup of invalid ACPI name handling and repair
ACPICA: Audit/update for ACPICA return macros and debug depth counter
ACPICA: ACPICA core: Cleanup empty lines at file start and end
ACPICA: Fix some typos in comments
ACPICA: Update local C library module comments for ASCII table
ACPICA: Remove extra spaces after periods within comments
ACPICA: Remove extra spaces after periods in the Intel license
ACPICA: Add debug print message for mutex objects that are force-released
ACPICA: AcpiExec: Improve algorithm for tracking memory leaks
ACPICA: Add ACPI_MOVE_NAME macro to optimize 4-byte ACPI_NAME copies
ACPICA: Enhance error reporting for invalid opcodes and bad ACPI_NAMEs
ACPICA: Update support for ACPI 5 MPST table
ACPICA: Deploy ACPI_MOVE_NAME across ACPICA source base
ACPICA: Add starting offset parameter to common dump buffer routine
ACPICA: Fix externalize name to complete migration to ACPI_MOVE_NAME
ACPICA: Update for 64-bit generation of recent error message changes
ACPICA: AcpiGetObjectInfo: Add support for ACPI 5 _SUB method
ACPICA: Update version to 20121018
Cyril Roelandt (1):
ACPI: drop unnecessary local variable from acpi_system_write_wakeup_device()
Daniel Lezcano (8):
cpuidle / sysfs: change function parameter
cpuidle / sysfs: move kobj initialization in the syfs file
cpuidle / sysfs: move structure declaration into the sysfs.c file
cpuidle: fixup device.h header in cpuidle.h
cpuidle: move driver's refcount to cpuidle
cpuidle: move driver checking within the lock section
cpuidle: prepare the cpuidle core to handle multiple drivers
cpuidle: support multiple drivers
Daniel Walter (1):
PM / sysfs: replace strict_str* with kstrto*
David Rientjes (1):
ACPI / PM: Fix build problem related to acpi_target_system_state()
Davidlohr Bueso (1):
PM / Hibernate: use rb_entry
Deepak Sikri (1):
cpufreq: SPEAr: Add CPUFreq driver
Fabio Baltieri (2):
cpufreq: ondemand: fix wrong delay sampling rate
cpufreq: ondemand: update sampling rate only on right CPUs
Feng Tang (5):
ACPI / EC: Cleanup the member name for spinlock/mutex in struct
ACPI / EC: Add more debug info and trivial code cleanup
ACPI / EC: Don't count a SCI interrupt as a false one
ACPI / x86: Add quirk for "CheckPoint P-20-00" to not use bridge _CRS_ info
ACPICA: Resource Mgr: Small fix for buffer size calculation
Jingoo Han (1):
cpufreq: Remove unnecessary initialization of a local variable
Joe Perches (1):
ACPI: Fix logging when no pci_irq is allocated
Jonghwa Lee (1):
PM / devfreq: Add sysfs node for representing frequency transition information.
Josh (1):
ACPI: strict_strtoul() and printk() cleanup in acpi_pad
Julius Werner (1):
cpuidle: Measure idle state durations with monotonic clock
Kamil Iskra (1):
ACPI / battery: Correct battery capacity values on Thinkpads
Kristen Carlson Accardi (1):
ACPI / Sleep: add acpi_sleep=nonvs_s3 parameter
Lan Tianyu (3):
PM / QoS: Resume device before exposing/hiding PM QoS flags
ACPI / PM: Add Sony Vaio VPCEB1S1E to nonvs blacklist.
ACPI / video: Add "Asus UL30VT" to ACPI video detect blacklist
Lan,Tianyu (1):
PM / QoS: Fix a free error in the dev_pm_qos_constraints_destroy()
Li Haifeng (1):
PM / Freezer: Fixup compile error of try_to_freeze_nowarn()
Li Zhong (1):
cpuidle: fix a suspicious RCU usage in menu governor
Liam Girdwood (1):
PM / OPP: Export symbols for module usage.
LongX Zhang (1):
driver core / PM: move the calling to device_pm_remove behind the calling to bus_remove_device
Lv Zheng (9):
ACPI: Add _UID support for ACPI devices.
ACPI: Add user space interface for identification objects
ACPICA: Fix unmerged utility divergences.
ACPICA: Fix unmerged debugger divergences.
ACPICA: Fix divergences of definition conflicts.
ACPICA: Fix AcpiSrc caused divergences.
ACPICA: Fix indent caused divergences.
ACPICA: Fix unmerged acmacros.h divergences.
ACPI / PM: Add check preventing transitioning to non-D0 state from D3.
Mathias Nyman (1):
gpio / ACPI: add ACPI support
Mika Westerberg (9):
driver core / ACPI: Move ACPI support to core device and driver types
ACPI: Provide generic functions for matching ACPI device nodes
ACPI / ia64: Export acpi_[un]register_gsi()
ACPI: Add support for platform bus type
ACPI / platform: use ACPI device name instead of _HID._UID
i2c / ACPI: add ACPI enumeration support
spi / ACPI: add ACPI enumeration support
ACPI: add documentation about ACPI 5 enumeration
ACPI: add Haswell LPSS devices to acpi_platform_device_ids list
Murali Karicheri (1):
base: power - use clk_prepare_enable and clk_prepare_disable
MyungJoo Ham (3):
PM / devfreq: remove compiler error when a governor is module
PM / devfreq: missing rcu_read_lock() added for find_device_opp()
PM / devfreq: remove compiler error with module governors (2)
Nishanth Menon (13):
PM / devfreq: kernel-doc typo corrections
PM / devfreq: fix sscanf handling for writable sysfs entries
PM / devfreq: make devfreq_class static
PM / OPP: predictable fail results for opp_find* functions, v2
PM / devfreq: documentation cleanups for devfreq header
PM / devfreq: Add sysfs node to expose available frequencies
PM / devfreq: export update_devfreq
PM / devfreq: provide hooks for governors to be registered
PM / devfreq: register governors with devfreq framework
PM / devfreq: map devfreq drivers to governor using name
PM / devfreq: governors: add GPL module license and allow module build
PM / devfreq: allow sysfs governor node to switch governor
PM / devfreq: Add sysfs node to expose available governors
Palmer Cox (6):
cpupower tools: Remove brace expansion from clean target
cpupower tools: Update .gitignore for files created in the debug directories
cpupower tools: Fix minor warnings
cpupower tools: Fix issues with sysfs_topology_read_file
cpupower tools: Fix malloc of cpu_info structure
cpupower tools: Fix warning and a bug with the cpu package count
Rafael J. Wysocki (33):
PM / QoS: Prepare device structure for adding more constraint types
PM / QoS: Introduce request and constraint data types for PM QoS flags
PM / QoS: Prepare struct dev_pm_qos_request for more request types
PM / QoS: Introduce PM QoS device flags support
PM / QoS: Make it possible to expose PM QoS device flags to user space
PM / Domains: Check device PM QoS flags in pm_genpd_poweroff()
PM / ACPI: Take device PM QoS flags into account
PM / QoS: Fix the return value of dev_pm_qos_update_request()
PM / QoS: Document request manipulation requirement for flags
ACPI / PM: Fix device PM kernedoc comments and #ifdefs
ACPI / PM: Move routines for adding/removing device wakeup notifiers
ACPI / PM: Move device power state selection routine to device_pm.c
ACPI / PM: Move runtime remote wakeup setup routine to device_pm.c
ACPI / PM: Split device wakeup management routines
ACPI / PM: Provide device PM functions operating on struct acpi_device
ACPI / PM: Move device PM functions related to sleep states
ACPI / PM: Provide ACPI PM callback routines for subsystems
ACPI: Make seemingly useless check in osl.c more understandable
ACPI: Move device resources interpretation code from PNP to ACPI core
ACPI / platform: Use common ACPI device resource parsing routines
ACPI: Centralized processing of ACPI device resources
ACPI / PM: Fix build problem when CONFIG_ACPI or CONFIG_PM is not set
Revert "ACPI / x86: Add quirk for "CheckPoint P-20-00" to not use bridge _CRS_ info"
ACPI / resources: Use AE_CTRL_TERMINATE to terminate resources walks
ACPI: Allow ACPI handles of devices to be initialized in advance
ACPI / driver core: Introduce struct acpi_dev_node and related macros
ACPI / platform: Initialize ACPI handles of platform devices in advance
cpufreq: governors: Fix jiffies/cputime mixup (revisited)
PM / QoS: Handle device PM QoS flags while removing constraints
ACPI / PM: Allow attach/detach routines to change device power states
platform / ACPI: Attach/detach ACPI PM during probe/remove/shutdown
ACPI / PNP: Do not crash due to stale pointer use during system resume
ACPI / PM: Fix header of acpi_dev_pm_detach() in acpi.h
Rajagopal Venkat (3):
PM / devfreq: Core updates to support devices which can idle
PM / devfreq: Add suspend and resume apis
PM / devfreq: Add current freq callback in device profile
Randy Dunlap (1):
ACPI: add newline in power.c message
Robert Moore (1):
ACPICA: Fix for predefined name loop during ACPICA initialization
Sachin Kamat (3):
PM / devfreq: Use devm_* functions in exynos4_bus.c
PM / devfreq: Fix incorrect argument in error message
PM / devfreq: Fix return value in devfreq_remove_governor()
Sangho Yi (1):
PM / devfreq: exynos4_bus.c: Fixed an alignment of the func call args.
Tang Chen (1):
ACPI: Fix a hard coding style when determining if a device is a container, v3
Thomas Renninger (2):
cpupower: Provide -c param for cpupower monitor to schedule process on all cores
cpupower: IvyBridge (0x3a and 0x3e models) support
Tomasz Figa (1):
cpufreq: exynos: Broadcast frequency change notifications for all cores
Toshi Kani (10):
ACPI: dock: Remove redundant ACPI NS walk
ACPI: Fix stale pointer access to flags.lockable
ACPI: Remove unused lockable in acpi_device_flags
ACPI: Export functions for hot-remove
ACPI: Add ACPI CPU hot-remove support
ACPI: Add acpi_handle_<level>() interfaces
ACPI: Update CPU hotplug error messages
ACPI: Update Memory hotplug error messages
ACPI: Update Container hotplug error messages
ACPI: Update Dock hotplug error messages
Tushar Behera (1):
cpufreq: exynos: Use static for functions used in only this file
Vincent Guittot (1):
PM / OPP: RCU reclaim
Viresh Kumar (7):
cpufreq: Improve debug prints
cpufreq: return early from __cpufreq_driver_getavg()
cpufreq: governors: remove redundant code
cpufreq: Fix sparse warnings by updating cputime64_t to u64
cpufreq: Fix sparse warning by making local function static
cpufreq: Avoid calling cpufreq driver's target() routine if target_freq == policy->cur
cpufreq: Make sure target freq is within limits
Wei Yongjun (1):
PM / OPP: using kfree_rcu() to simplify the code
Wen Congyang (6):
ACPI / memory-hotplug: call acpi_bus_trim() to remove memory device
ACPI / memhotplug: deal with eject request in hotplug queue
ACPI / memhotplug: fix memory leak when memory device is unbound from acpi_memhotplug
ACPI / memhotplug: free memory device if acpi_memory_enable_device() failed
ACPI / memhotplug: don't allow to eject the memory device if it is being used
ACPI / memhotplug: bind the memory device when the driver is being loaded
Yasuaki Ishimatsu (3):
ACPI / processor: prevent cpu from becoming online
ACPI / memory-hotplug: add memory offline code to acpi_memory_device_remove()
ACPI: create _SUN sysfs file
Youquan Song (4):
cpuidle: Quickly notice prediction failure for repeat mode
cpuidle: Quickly notice prediction failure in general case
cpuidle: Set residency to 0 if target Cstate not enter
cpuidle: Get typical recent sleep interval
Yuanhan Liu (1):
ACPI: move acpi_no_s4_hw_signature() declaration into #ifdef CONFIG_HIBERNATION
Zhang Rui (4):
ACPI / thermal: _TMP and _CRT/_HOT/_PSV/_ACx dependency fix
ACPI: do acpisleep dmi check when CONFIG_ACPI_SLEEP is set
ACPI : do not use Lid and Sleep button for S5 wakeup
ACPI / video: ignore BIOS initial backlight value for HP Folio 13-2000
viresh kumar (3):
cpufreq / core: Fix typo in comment describing show_bios_limit()
cpufreq / core: Fix printing of governor and driver name
cpufreq: Move common part from governors to separate file, v2
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
^ permalink raw reply
* Re: [RFC] cpuidle - remove the power_specified field in the driver
From: Daniel Lezcano @ 2012-12-11 9:46 UTC (permalink / raw)
To: Julius Werner
Cc: Daniel Lezcano, Rafael J. Wysocki, Kevin Hilman, Deepthi Dharwar,
Trinabh Gupta, Lists Linaro-dev, len.brown, linux-pm,
linux-kernel, Andrew Morton, Sameer Nanda
In-Reply-To: <CAODwPW96ZTXUZjz4B_GET_6UAd4oTednRmrFLe-Jt=EF9n4O1g@mail.gmail.com>
On 12/10/2012 08:09 PM, Julius Werner wrote:
> Hi,
>
> What is the current status of this? Daniel, do you think you have got
> enough feedback to submit a definitive patch for this?
Yes, I have a definitive patch. I will resend it tomorrow.
Thanks
-- Daniel
^ permalink raw reply
* Re: [RESEND PATCH 0/5] cpufreq: db8500: Rename driver and update some parts
From: Linus Walleij @ 2012-12-11 9:15 UTC (permalink / raw)
To: Ulf Hansson
Cc: Rafael J. Wysocki, cpufreq, linux-pm, Mike Turquette,
Mike Turquette, linux-arm-kernel, Lee Jones, Rickard Andersson,
Jonas Aberg, Vincent Guittot, Philippe Begnic, Ulf Hansson
In-Reply-To: <1355153142-9534-1-git-send-email-ulf.hansson@stericsson.com>
On Mon, Dec 10, 2012 at 4:25 PM, Ulf Hansson <ulf.hansson@stericsson.com> wrote:
> From: Ulf Hansson <ulf.hansson@linaro.org>
>
> This patchset starts by renaming the db8500 cpufreq driver to a more generic
> name. There are new variants which rely on it too, so instead we give it a
> family name of dbx500.
>
> On top of that a fixup patch for initialization of the driver and some minor
> cleanup patches are included as well.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
For all patches.
Yours,
Linus Walleij
^ permalink raw reply
* Re: [PATCH v9 06/10] ata: zpodd: check zero power ready status
From: Tejun Heo @ 2012-12-11 5:10 UTC (permalink / raw)
To: Aaron Lu
Cc: James Bottomley, Rafael J. Wysocki, linux-pm, Jeff Garzik,
Alan Stern, Jeff Wu, linux-ide, linux-scsi, linux-acpi
In-Reply-To: <50C5564F.8030000@intel.com>
Hello, guys.
On Mon, Dec 10, 2012 at 11:26:07AM +0800, Aaron Lu wrote:
> >>> The problem here is there's no easy way to reach genhd from libata (or the
> >>> other way around) without going through sr. I think we're gonna have
> >>> to have something in sr one way or the other.
> >>
> >> Can't we do that via an event? It's a bit clunky because we need the
> >> callback in the layer that sees the sdev, which is libata-scsi, we just
> >> need an analogue of ata_scsi_media_change_notify, but ignoring and
> >> allowing polling is essentially event driven as well, so it should all
> >> work. We'll need a listener in genhd, which might be trickier.
I'm not really following what you mean. Can you please elaborate?
One way or the other, doesn't the notification have to bubble up
through SCSI?
> A colleague of mine reminded me that it's impolite to write something
> like this, and here is my apology. I didn't mean to be rude, I'm sorry
> for this if it made you feel uncomfortable.
Oh, no, it's not rude at all. I was just being distracted w/ other
stuff and lazy. Sorry about that.
Thanks.
--
tejun
^ permalink raw reply
* Re: [PATCH RFC] PM/Devfreq: Add Exynos5-bus devfreq driver for Exynos5250.
From: MyungJoo Ham @ 2012-12-11 1:20 UTC (permalink / raw)
To: Abhilash Kesavan
Cc: kyungmin.park, rjw, linux-kernel, linux-pm, kgene.kim,
jhbird.choi
In-Reply-To: <1354264973-11214-1-git-send-email-a.kesavan@samsung.com>
On Fri, Nov 30, 2012 at 5:42 PM, Abhilash Kesavan <a.kesavan@samsung.com> wrote:
> Exynos5-bus device devfreq driver monitors PPMU counters and
> adjusts operating frequencies and voltages with OPP. ASV should
> be used to provide appropriate voltages as per the speed group
> of the SoC rather than using a constant 1.025V.
>
> Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com>
> Cc: Jonghwan Choi <jhbird.choi@samsung.com>
> Cc: Kukjin Kim <kgene.kim@samsung.com>
I've got a few general comments and questions on your patch.
> ---
> This code is based on Jonghwan Choi's <jhbird.choi@samsung.com> devfreq work
> for Exynos5250. This requires corresponding machine specific changes which
> will be posted once the driver is reviewed.
>
> drivers/devfreq/Kconfig | 10 +
> drivers/devfreq/Makefile | 1 +
> drivers/devfreq/exynos5_bus.c | 595 ++++++++++++++++++++++++++++++++++++++++
> drivers/devfreq/exynos5_ppmu.c | 395 ++++++++++++++++++++++++++
> drivers/devfreq/exynos5_ppmu.h | 26 ++
> drivers/devfreq/exynos_ppmu.c | 56 ++++
> drivers/devfreq/exynos_ppmu.h | 79 ++++++
> 7 files changed, 1162 insertions(+), 0 deletions(-)
> create mode 100644 drivers/devfreq/exynos5_bus.c
> create mode 100644 drivers/devfreq/exynos5_ppmu.c
> create mode 100644 drivers/devfreq/exynos5_ppmu.h
> create mode 100644 drivers/devfreq/exynos_ppmu.c
> create mode 100644 drivers/devfreq/exynos_ppmu.h
I understand that the Exynos PPMU drivers do not seem to be widely used
(at least in mainline Linux), and that it would be convenient for a bus
driver to have the PPMU driver located in the same source directory.
However, I don't feel very comfortable with PPMU drivers landing
explicitly in the devfreq directory. Would it be possible to place them
somewhere else (in drivers/misc, arch/arm/mach-exynos, or somewhere
else appropriate)? If the PPMU drivers really have nowhere else to go,
they may be located along with their sole user (exynos5_bus.c) anyway.
> +
> +struct exynos5_bus_int_handle {
> + struct list_head node;
> + struct delayed_work work;
> + bool boost;
> + bool poll;
> + unsigned long min;
> +};
It appears that "boost" is something that may be handled by per-dev QoS.
It looks like you are reimplementing the pm-qos infrastructure in
the driver.
Could you please implement this with per-dev QoS?
Or explain why this is required instead of per-dev PM-QoS?
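To illustrate the concern: a per-driver list of handles, each carrying a minimum frequency, ends up doing what a constraint framework already does, which is to keep a set of requests and expose their aggregate. Below is a minimal userspace model of that aggregation. It is not the kernel's PM QoS API; the struct and function names are hypothetical and only meant to show the idea.

```c
#include <stddef.h>

/* Hypothetical model of one client's request for a minimum bus frequency. */
struct min_freq_request {
	unsigned long min_khz;	/* requested floor, in kHz */
	int active;		/* non-zero while the request is in force */
};

/* Aggregate all active requests: the effective floor is the largest one. */
static unsigned long effective_min(const struct min_freq_request *reqs,
				   size_t n)
{
	unsigned long floor = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (reqs[i].active && reqs[i].min_khz > floor)
			floor = reqs[i].min_khz;

	return floor;
}

/* Clamp the frequency chosen by the governor to the aggregated floor. */
static unsigned long pick_target(unsigned long governor_khz,
				 const struct min_freq_request *reqs,
				 size_t n)
{
	unsigned long floor = effective_min(reqs, n);

	return governor_khz < floor ? floor : governor_khz;
}
```

With requests of 100000 and 266000 kHz active, a governor choice of 160000 kHz would be raised to 266000 kHz, while a choice of 300000 kHz would pass through unchanged. If a shared framework keeps this bookkeeping, the driver only has to consume the aggregated value.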
Cheers,
MyungJoo
--
MyungJoo Ham, Ph.D.
Mobile Software Platform Lab, DMC Business, Samsung Electronics
^ permalink raw reply
* RE: [RFC PATCH] ARM: EXYNOS5: Support Exynos5-bus devfreq driver
From: Jonghwan Choi @ 2012-12-10 23:59 UTC (permalink / raw)
To: 'Abhilash Kesavan', linux-kernel, linux-pm, kgene.kim
Cc: myungjoo.ham, kyungmin.park, rjw
In-Reply-To: <1355141166-17205-1-git-send-email-a.kesavan@samsung.com>
Hi Abhilash Kesavan.
> + /* Change Divider - LEX */
> + tmp = __raw_readl(EXYNOS5_CLKDIV_LEX);
> +
> + tmp &= ~(EXYNOS5_CLKDIV_LEX_ATCLK_LEX_MASK |
> + EXYNOS5_CLKDIV_LEX_PCLK_LEX_MASK);
> +
> + tmp |= int_freq[div_index].clk_div_lex;
> +
> + __raw_writel(tmp, EXYNOS5_CLKDIV_LEX);
> +
As far as I know, only the ATCLK_LEX and PCLK_LEX divider values are in
the CLKDIV_LEX register. (The other bits are reserved and their value is 0.)
So, I think
"
tmp = int_freq[div_index].clk_div_lex;
__raw_writel(tmp, EXYNOS5_CLKDIV_LEX);
"
is enough.
> + tmp = __raw_readl(EXYNOS5_CLKDIV_LEX);
> +
> + tmp &= ~(EXYNOS5_CLKDIV_LEX_ATCLK_LEX_MASK |
> + EXYNOS5_CLKDIV_LEX_PCLK_LEX_MASK);
-> not needed.
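The point rests on the assumption that every bit outside the PCLK_LEX and ATCLK_LEX fields is reserved and reads as zero. Here is a small userspace sketch, with a plain value standing in for the register, that compares the two update styles. The field positions and the packing are taken from the quoted patch; the helper names are hypothetical.

```c
#include <stdint.h>

/* Field layout from the patch: PCLK_LEX at bit 4, ATCLK_LEX at bit 8. */
#define LEX_PCLK_MASK	(0x7u << 4)
#define LEX_ATCLK_MASK	(0x7u << 8)

/* Same packing as the .clk_div_lex member built by INT_FREQ(). */
static uint32_t lex_div(uint32_t c0, uint32_t c1)
{
	return (c0 << 4) | (c1 << 8);
}

/* Read-modify-write, as posted: bits outside the two fields survive. */
static uint32_t update_rmw(uint32_t reg, uint32_t div)
{
	reg &= ~(LEX_PCLK_MASK | LEX_ATCLK_MASK);
	return reg | div;
}

/* Direct write, as suggested: the whole register value is replaced. */
static uint32_t update_direct(uint32_t reg, uint32_t div)
{
	(void)reg;	/* the old value is deliberately ignored */
	return div;
}
```

For the table entries in the patch, lex_div(1, 0) gives 0x10. Starting from a value such as 0x230, both helpers return 0x10, so the direct write is equivalent as long as the reserved bits really are zero. The two only differ if a reserved bit were ever set (for example 0x80000230), in which case the read-modify-write keeps that bit and the direct write clears it.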
> + while (__raw_readl(EXYNOS5_CLKDIV_STAT_LEX) & 0x110)
> + cpu_relax();
> +
> + /* Change Divider - R0X */
> + tmp = __raw_readl(EXYNOS5_CLKDIV_R0X);
> +
> + tmp &= ~EXYNOS5_CLKDIV_R0X_PCLK_R0X_MASK;
> +
> + tmp |= int_freq[div_index].clk_div_r0x;
> +
> + __raw_writel(tmp, EXYNOS5_CLKDIV_R0X);
> +
Same here
> + while (__raw_readl(EXYNOS5_CLKDIV_STAT_R0X) & 0x10)
> + cpu_relax();
> +
> + /* Change Divider - R1X */
> + tmp = __raw_readl(EXYNOS5_CLKDIV_R1X);
> +
> + tmp &= ~EXYNOS5_CLKDIV_R1X_PCLK_R1X_MASK;
> +
> + tmp |= int_freq[div_index].clk_div_r1x;
> +
> + __raw_writel(tmp, EXYNOS5_CLKDIV_R1X);
> +
Same here
> + while (__raw_readl(EXYNOS5_CLKDIV_STAT_R1X) & 0x10)
> + cpu_relax();
What is your opinion?
Thanks
> -----Original Message-----
> From: Abhilash Kesavan [mailto:a.kesavan@samsung.com]
> Sent: Monday, December 10, 2012 9:06 PM
> To: linux-kernel@vger.kernel.org; linux-pm@vger.kernel.org;
> kgene.kim@samsung.com
> Cc: myungjoo.ham@samsung.com; kyungmin.park@samsung.com; rjw@sisk.pl;
> jhbird.choi@samsung.com; Abhilash Kesavan
> Subject: [RFC PATCH] ARM: EXYNOS5: Support Exynos5-bus devfreq driver
>
> - Setup the INT clock ops to control/vary INT frequency
> - Add mappings initially for the PPMU device
>
> Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com>
> ---
> Corresponding devfreq driver support for Exynos5 has been posted at:
> https://patchwork.kernel.org/patch/1823931/
>
> Tested after merging for-rafael branch of
> git://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq.git
> with for-next branch of
> git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung.git
>
> arch/arm/mach-exynos/clock-exynos5.c | 151 ++++++++++++++++++++++++
> arch/arm/mach-exynos/common.c | 25 ++++
> arch/arm/mach-exynos/include/mach/map.h | 6 +
> arch/arm/mach-exynos/include/mach/regs-clock.h | 48 ++++++++
> arch/arm/plat-samsung/include/plat/map-s5p.h | 6 +
> 5 files changed, 236 insertions(+), 0 deletions(-)
>
> diff --git a/arch/arm/mach-exynos/clock-exynos5.c b/arch/arm/mach-exynos/clock-exynos5.c
> index 5c63bc7..f00b259 100644
> --- a/arch/arm/mach-exynos/clock-exynos5.c
> +++ b/arch/arm/mach-exynos/clock-exynos5.c
> @@ -109,6 +109,11 @@ static struct clk exynos5_clk_sclk_usbphy = {
> .rate = 48000000,
> };
>
> +/* Virtual Bus INT clock */
> +static struct clk exynos5_int_clk = {
> + .name = "int_clk",
> +};
> +
> static int exynos5_clksrc_mask_top_ctrl(struct clk *clk, int enable)
> {
> return s5p_gatectrl(EXYNOS5_CLKSRC_MASK_TOP, clk, enable);
> @@ -1519,6 +1524,149 @@ static struct clk *exynos5_clks[] __initdata = {
> &clk_fout_cpll,
> &clk_fout_mpll_div2,
> &exynos5_clk_armclk,
> + &exynos5_int_clk,
> +};
> +
> +#define INT_FREQ(f, a0, a1, a2, a3, a4, a5, b0, b1, b2, b3, \
> + c0, c1, d0, e0) \
> + { \
> + .freq = (f) * 1000000, \
> + .clk_div_top0 = ((a0) << 0 | (a1) << 8 | (a2) << 12 | \
> + (a3) << 16 | (a4) << 20 | (a5) << 28), \
> + .clk_div_top1 = ((b0) << 12 | (b1) << 16 | (b2) << 20 | \
> + (b3) << 24), \
> + .clk_div_lex = ((c0) << 4 | (c1) << 8), \
> + .clk_div_r0x = ((d0) << 4), \
> + .clk_div_r1x = ((e0) << 4), \
> + }
> +
> +static struct {
> + unsigned long freq;
> + u32 clk_div_top0;
> + u32 clk_div_top1;
> + u32 clk_div_lex;
> + u32 clk_div_r0x;
> + u32 clk_div_r1x;
> +} int_freq[] = {
> + /*
> + * values:
> + * freq
> + * clock divider for ACLK66, ACLK166, ACLK200, ACLK266,
> + ACLK333, ACLK300_DISP1
> + * clock divider for ACLK300_GSCL, ACLK400_IOP, ACLK400_ISP, ACLK66_PRE
> + * clock divider for PCLK_LEX, ATCLK_LEX
> + * clock divider for ACLK_PR0X
> + * clock divider for ACLK_PR1X
> + */
> + INT_FREQ(266, 1, 1, 3, 2, 0, 0, 0, 1, 1, 5, 1, 0, 1, 1),
> + INT_FREQ(200, 1, 2, 4, 3, 1, 0, 0, 3, 2, 5, 1, 0, 1, 1),
> + INT_FREQ(160, 1, 3, 4, 4, 2, 0, 0, 3, 3, 5, 1, 0, 1, 1),
> + INT_FREQ(133, 1, 3, 5, 5, 2, 1, 1, 4, 4, 5, 1, 0, 1, 1),
> + INT_FREQ(100, 1, 7, 7, 7, 7, 3, 7, 7, 7, 5, 1, 0, 1, 1),
> +};
> +
> +static unsigned long exynos5_clk_int_get_rate(struct clk *clk)
> +{
> + return clk->rate;
> +}
> +
> +static void exynos5_int_set_clkdiv(unsigned int div_index)
> +{
> + unsigned int tmp;
> +
> + /* Change Divider - TOP0 */
> + tmp = __raw_readl(EXYNOS5_CLKDIV_TOP0);
> +
> + tmp &= ~(EXYNOS5_CLKDIV_TOP0_ACLK266_MASK |
> + EXYNOS5_CLKDIV_TOP0_ACLK200_MASK |
> + EXYNOS5_CLKDIV_TOP0_ACLK66_MASK |
> + EXYNOS5_CLKDIV_TOP0_ACLK333_MASK |
> + EXYNOS5_CLKDIV_TOP0_ACLK166_MASK |
> + EXYNOS5_CLKDIV_TOP0_ACLK300_DISP1_MASK);
> +
> + tmp |= int_freq[div_index].clk_div_top0;
> +
> + __raw_writel(tmp, EXYNOS5_CLKDIV_TOP0);
> +
> + while (__raw_readl(EXYNOS5_CLKDIV_STAT_TOP0) & 0x151101)
> + cpu_relax();
> +
> + /* Change Divider - TOP1 */
> + tmp = __raw_readl(EXYNOS5_CLKDIV_TOP1);
> +
> + tmp &= ~(EXYNOS5_CLKDIV_TOP1_ACLK400_ISP_MASK |
> + EXYNOS5_CLKDIV_TOP1_ACLK400_IOP_MASK |
> + EXYNOS5_CLKDIV_TOP1_ACLK66_PRE_MASK |
> + EXYNOS5_CLKDIV_TOP1_ACLK300_GSCL_MASK);
> +
> + tmp |= int_freq[div_index].clk_div_top1;
> +
> + __raw_writel(tmp, EXYNOS5_CLKDIV_TOP1);
> +
> + while ((__raw_readl(EXYNOS5_CLKDIV_STAT_TOP1) & 0x1110000) &&
> + (__raw_readl(EXYNOS5_CLKDIV_STAT_TOP0) & 0x80000))
> + cpu_relax();
> +
> + /* Change Divider - LEX */
> + tmp = __raw_readl(EXYNOS5_CLKDIV_LEX);
> +
> + tmp &= ~(EXYNOS5_CLKDIV_LEX_ATCLK_LEX_MASK |
> + EXYNOS5_CLKDIV_LEX_PCLK_LEX_MASK);
> +
> + tmp |= int_freq[div_index].clk_div_lex;
> +
> + __raw_writel(tmp, EXYNOS5_CLKDIV_LEX);
> +
> + while (__raw_readl(EXYNOS5_CLKDIV_STAT_LEX) & 0x110)
> + cpu_relax();
> +
> + /* Change Divider - R0X */
> + tmp = __raw_readl(EXYNOS5_CLKDIV_R0X);
> +
> + tmp &= ~EXYNOS5_CLKDIV_R0X_PCLK_R0X_MASK;
> +
> + tmp |= int_freq[div_index].clk_div_r0x;
> +
> + __raw_writel(tmp, EXYNOS5_CLKDIV_R0X);
> +
> + while (__raw_readl(EXYNOS5_CLKDIV_STAT_R0X) & 0x10)
> + cpu_relax();
> +
> + /* Change Divider - R1X */
> + tmp = __raw_readl(EXYNOS5_CLKDIV_R1X);
> +
> + tmp &= ~EXYNOS5_CLKDIV_R1X_PCLK_R1X_MASK;
> +
> + tmp |= int_freq[div_index].clk_div_r1x;
> +
> + __raw_writel(tmp, EXYNOS5_CLKDIV_R1X);
> +
> + while (__raw_readl(EXYNOS5_CLKDIV_STAT_R1X) & 0x10)
> + cpu_relax();
> +}
> +
> +static int exynos5_clk_int_set_rate(struct clk *clk, unsigned long rate)
> +{
> + int index;
> +
> + for (index = 0; index < ARRAY_SIZE(int_freq); index++)
> + if (int_freq[index].freq == rate)
> + break;
> +
> + if (index == ARRAY_SIZE(int_freq))
> + return -EINVAL;
> +
> + /* Change the system clock divider values */
> + exynos5_int_set_clkdiv(index);
> +
> + clk->rate = rate;
> +
> + return 0;
> +}
> +
> +static struct clk_ops exynos5_clk_int_ops = {
> + .get_rate = exynos5_clk_int_get_rate,
> + .set_rate = exynos5_clk_int_set_rate
> };
>
> static u32 epll_div[][6] = {
> @@ -1713,6 +1861,9 @@ void __init_or_cpufreq exynos5_setup_clocks(void)
>
> clk_fout_epll.ops = &exynos5_epll_ops;
>
> + exynos5_int_clk.ops = &exynos5_clk_int_ops;
> + exynos5_int_clk.rate = aclk_266;
> +
> if (clk_set_parent(&exynos5_clk_mout_epll.clk, &clk_fout_epll))
> printk(KERN_ERR "Unable to set parent %s of clock %s.\n",
> clk_fout_epll.name,
> exynos5_clk_mout_epll.clk.name);
> diff --git a/arch/arm/mach-exynos/common.c b/arch/arm/mach-exynos/common.c
> index 73b940f..6b7d4ee 100644
> --- a/arch/arm/mach-exynos/common.c
> +++ b/arch/arm/mach-exynos/common.c
> @@ -282,6 +282,31 @@ static struct map_desc exynos5_iodesc[] __initdata = {
> .pfn = __phys_to_pfn(EXYNOS5_PA_UART),
> .length = SZ_512K,
> .type = MT_DEVICE,
> + }, {
> + .virtual = (unsigned long)S5P_VA_PPMU_CPU,
> + .pfn = __phys_to_pfn(EXYNOS5_PA_PPMU_CPU),
> + .length = SZ_8K,
> + .type = MT_DEVICE,
> + }, {
> + .virtual = (unsigned long)S5P_VA_PPMU_DDR_C,
> + .pfn = __phys_to_pfn(EXYNOS5_PA_PPMU_DDR_C),
> + .length = SZ_8K,
> + .type = MT_DEVICE,
> + }, {
> + .virtual = (unsigned long)S5P_VA_PPMU_DDR_R1,
> + .pfn = __phys_to_pfn(EXYNOS5_PA_PPMU_DDR_R1),
> + .length = SZ_8K,
> + .type = MT_DEVICE,
> + }, {
> + .virtual = (unsigned long)S5P_VA_PPMU_DDR_L,
> + .pfn = __phys_to_pfn(EXYNOS5_PA_PPMU_DDR_L),
> + .length = SZ_8K,
> + .type = MT_DEVICE,
> + }, {
> + .virtual = (unsigned long)S5P_VA_PPMU_RIGHT,
> + .pfn = __phys_to_pfn(EXYNOS5_PA_PPMU_RIGHT),
> + .length = SZ_8K,
> + .type = MT_DEVICE,
> },
> };
>
> diff --git a/arch/arm/mach-exynos/include/mach/map.h b/arch/arm/mach-exynos/include/mach/map.h
> index cbb2852..8c8de91 100644
> --- a/arch/arm/mach-exynos/include/mach/map.h
> +++ b/arch/arm/mach-exynos/include/mach/map.h
> @@ -228,6 +228,12 @@
> #define EXYNOS4_PA_SDRAM 0x40000000
> #define EXYNOS5_PA_SDRAM 0x40000000
>
> +#define EXYNOS5_PA_PPMU_DDR_C 0x10C40000
> +#define EXYNOS5_PA_PPMU_DDR_R1 0x10C50000
> +#define EXYNOS5_PA_PPMU_CPU 0x10C60000
> +#define EXYNOS5_PA_PPMU_DDR_L 0x10CB0000
> +#define EXYNOS5_PA_PPMU_RIGHT 0x13660000
> +
> /* Compatibiltiy Defines */
>
> #define S3C_PA_HSMMC0 EXYNOS4_PA_HSMMC(0)
> diff --git a/arch/arm/mach-exynos/include/mach/regs-clock.h b/arch/arm/mach-exynos/include/mach/regs-clock.h
> index d36ad76..bad3cd3 100644
> --- a/arch/arm/mach-exynos/include/mach/regs-clock.h
> +++ b/arch/arm/mach-exynos/include/mach/regs-clock.h
> @@ -323,6 +323,9 @@
> #define EXYNOS5_CLKDIV_PERIC5 EXYNOS_CLKREG(0x1056C)
> #define EXYNOS5_SCLK_DIV_ISP EXYNOS_CLKREG(0x10580)
>
> +#define EXYNOS5_CLKDIV_STAT_TOP0 EXYNOS_CLKREG(0x10610)
> +#define EXYNOS5_CLKDIV_STAT_TOP1 EXYNOS_CLKREG(0x10614)
> +
> #define EXYNOS5_CLKGATE_IP_ACP EXYNOS_CLKREG(0x08800)
> #define EXYNOS5_CLKGATE_IP_ISP0 EXYNOS_CLKREG(0x0C800)
> #define EXYNOS5_CLKGATE_IP_ISP1 EXYNOS_CLKREG(0x0C804)
> @@ -337,6 +340,18 @@
> #define EXYNOS5_CLKGATE_IP_PERIS EXYNOS_CLKREG(0x10960)
> #define EXYNOS5_CLKGATE_BLOCK EXYNOS_CLKREG(0x10980)
>
> +#define EXYNOS5_CLKGATE_BUS_SYSLFT EXYNOS_CLKREG(0x08920)
> +
> +#define EXYNOS5_CLKOUT_CMU_TOP EXYNOS_CLKREG(0x10A00)
> +
> +#define EXYNOS5_CLKDIV_LEX EXYNOS_CLKREG(0x14500)
> +#define EXYNOS5_CLKDIV_STAT_LEX EXYNOS_CLKREG(0x14600)
> +
> +#define EXYNOS5_CLKDIV_R0X EXYNOS_CLKREG(0x18500)
> +#define EXYNOS5_CLKDIV_STAT_R0X EXYNOS_CLKREG(0x18600)
> +
> +#define EXYNOS5_CLKDIV_R1X EXYNOS_CLKREG(0x1C500)
> +#define EXYNOS5_CLKDIV_STAT_R1X EXYNOS_CLKREG(0x1C600)
> #define EXYNOS5_BPLL_CON0 EXYNOS_CLKREG(0x20110)
> #define EXYNOS5_CLKSRC_CDREX EXYNOS_CLKREG(0x20200)
> #define EXYNOS5_CLKDIV_CDREX EXYNOS_CLKREG(0x20500)
> @@ -347,6 +362,39 @@
>
> #define EXYNOS5_EPLLCON0_LOCKED_SHIFT (29)
>
> +#define EXYNOS5_CLKDIV_TOP0_ACLK300_DISP1_SHIFT (28)
> +#define EXYNOS5_CLKDIV_TOP0_ACLK300_DISP1_MASK (0x7 <<
> EXYNOS5_CLKDIV_TOP0_ACLK300_DISP1_SHIFT)
> +#define EXYNOS5_CLKDIV_TOP0_ACLK333_SHIFT (20)
> +#define EXYNOS5_CLKDIV_TOP0_ACLK333_MASK (0x7 <<
> EXYNOS5_CLKDIV_TOP0_ACLK333_SHIFT)
> +#define EXYNOS5_CLKDIV_TOP0_ACLK266_SHIFT (16)
> +#define EXYNOS5_CLKDIV_TOP0_ACLK266_MASK (0x7 <<
> EXYNOS5_CLKDIV_TOP0_ACLK266_SHIFT)
> +#define EXYNOS5_CLKDIV_TOP0_ACLK200_SHIFT (12)
> +#define EXYNOS5_CLKDIV_TOP0_ACLK200_MASK (0x7 <<
> EXYNOS5_CLKDIV_TOP0_ACLK200_SHIFT)
> +#define EXYNOS5_CLKDIV_TOP0_ACLK166_SHIFT (8)
> +#define EXYNOS5_CLKDIV_TOP0_ACLK166_MASK (0x7 <<
> EXYNOS5_CLKDIV_TOP0_ACLK166_SHIFT)
> +#define EXYNOS5_CLKDIV_TOP0_ACLK66_SHIFT (0)
> +#define EXYNOS5_CLKDIV_TOP0_ACLK66_MASK (0x7 <<
> EXYNOS5_CLKDIV_TOP0_ACLK66_SHIFT)
> +
> +#define EXYNOS5_CLKDIV_TOP1_ACLK66_PRE_SHIFT (24)
> +#define EXYNOS5_CLKDIV_TOP1_ACLK66_PRE_MASK (0x7 <<
> EXYNOS5_CLKDIV_TOP1_ACLK66_PRE_SHIFT)
> +#define EXYNOS5_CLKDIV_TOP1_ACLK400_ISP_SHIFT (20)
> +#define EXYNOS5_CLKDIV_TOP1_ACLK400_ISP_MASK (0x7 <<
> EXYNOS5_CLKDIV_TOP1_ACLK400_ISP_SHIFT)
> +#define EXYNOS5_CLKDIV_TOP1_ACLK400_IOP_SHIFT (16)
> +#define EXYNOS5_CLKDIV_TOP1_ACLK400_IOP_MASK (0x7 <<
> EXYNOS5_CLKDIV_TOP1_ACLK400_IOP_SHIFT)
> +#define EXYNOS5_CLKDIV_TOP1_ACLK300_GSCL_SHIFT (12)
> +#define EXYNOS5_CLKDIV_TOP1_ACLK300_GSCL_MASK (0x7 <<
> EXYNOS5_CLKDIV_TOP1_ACLK300_GSCL_SHIFT)
> +
> +#define EXYNOS5_CLKDIV_LEX_ATCLK_LEX_SHIFT (8)
> +#define EXYNOS5_CLKDIV_LEX_ATCLK_LEX_MASK (0x7 <<
> EXYNOS5_CLKDIV_LEX_ATCLK_LEX_SHIFT)
> +#define EXYNOS5_CLKDIV_LEX_PCLK_LEX_SHIFT (4)
> +#define EXYNOS5_CLKDIV_LEX_PCLK_LEX_MASK (0x7 <<
> EXYNOS5_CLKDIV_LEX_PCLK_LEX_SHIFT)
> +
> +#define EXYNOS5_CLKDIV_R0X_PCLK_R0X_SHIFT (4)
> +#define EXYNOS5_CLKDIV_R0X_PCLK_R0X_MASK (0x7 <<
> EXYNOS5_CLKDIV_R0X_PCLK_R0X_SHIFT)
> +
> +#define EXYNOS5_CLKDIV_R1X_PCLK_R1X_SHIFT (4)
> +#define EXYNOS5_CLKDIV_R1X_PCLK_R1X_MASK (0x7 <<
> EXYNOS5_CLKDIV_R1X_PCLK_R1X_SHIFT)
> +
> #define PWR_CTRL1_CORE2_DOWN_RATIO (7 << 28)
> #define PWR_CTRL1_CORE1_DOWN_RATIO (7 << 16)
> #define PWR_CTRL1_DIV2_DOWN_EN (1 << 9)
> diff --git a/arch/arm/plat-samsung/include/plat/map-s5p.h b/arch/arm/plat-samsung/include/plat/map-s5p.h
> index 038aa96..28bef98 100644
> --- a/arch/arm/plat-samsung/include/plat/map-s5p.h
> +++ b/arch/arm/plat-samsung/include/plat/map-s5p.h
> @@ -42,6 +42,12 @@
>
> #define S5P_VA_AUDSS S3C_ADDR(0x02830000)
>
> +#define S5P_VA_PPMU_CPU S3C_ADDR(0x02840000)
> +#define S5P_VA_PPMU_DDR_C S3C_ADDR(0x02842000)
> +#define S5P_VA_PPMU_DDR_R1 S3C_ADDR(0x02844000)
> +#define S5P_VA_PPMU_DDR_L S3C_ADDR(0x02846000)
> +#define S5P_VA_PPMU_RIGHT S3C_ADDR(0x02848000)
> +
> #define VA_VIC(x) (S3C_VA_IRQ + ((x) * 0x10000))
> #define VA_VIC0 VA_VIC(0)
> #define VA_VIC1 VA_VIC(1)
> --
> 1.7.8.6
^ permalink raw reply
* Re: [RFC] cpuidle - remove the power_specified field in the driver
From: Rafael J. Wysocki @ 2012-12-10 22:41 UTC (permalink / raw)
To: Julius Werner
Cc: Daniel Lezcano, Francesco Lavra, linux-pm, Kevin Hilman,
Deepthi Dharwar, Trinabh Gupta, Lists Linaro-dev, len.brown,
linux-kernel, Andrew Morton, Sameer Nanda, Len Brown
In-Reply-To: <CAODwPW96ZTXUZjz4B_GET_6UAd4oTednRmrFLe-Jt=EF9n4O1g@mail.gmail.com>
On Monday, December 10, 2012 11:09:58 AM Julius Werner wrote:
> Hi,
>
> What is the current status of this? Daniel, do you think you have got
> enough feedback to submit a definitive patch for this? Rafael, would
> you approve of such a change?
I need to talk to Len about that before I give you a reliable answer.
Thanks,
Rafael
> The bug with dynamically added C-states that is tied to this still
> hurts the battery life for some users across all distros every day, so
> I think it would be valuable to get a consistent solution into the
> mainline soon before everyone has to roll their own.
>
> On 11/12/2012 09:26 PM, Daniel Lezcano wrote:
> > This patch follows the discussion about reinitializing the power usage
> > when a C-state is added/removed.
> >
> > https://lkml.org/lkml/2012/10/16/518
> >
> > We realized the power usage field is never filled, and when it is
> > filled for tegra, the power_specified flag is not set, so all these
> > values are reset when the driver is initialized with the set_power_state
> > function.
> >
> > Julius and I feel this is over-engineered and the power_specified
> > flag could be simply removed and continue assuming the states are
> > backward sorted.
> >
> > The menu governor select function is simplified as the power is ordered.
> > Actually the condition is always true with the current code.
> >
> > The cpuidle_play_dead function is also simplified by doing a reverse lookup
> > on the array.
> >
> > The set_power_states function is removed as it does not make sense anymore.
> >
> > Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> > ---
> > drivers/cpuidle/cpuidle.c | 17 ++++-------------
> > drivers/cpuidle/driver.c | 25 -------------------------
> > drivers/cpuidle/governors/menu.c | 8 ++------
> > include/linux/cpuidle.h | 2 +-
> > 4 files changed, 7 insertions(+), 45 deletions(-)
> >
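For readers following the thread, the reverse lookup mentioned in the quoted description could look roughly like the userspace sketch below. It assumes, as the description does, that the states array is ordered so that deeper (lower-power) states sit at higher indices. The struct, the callback and the function names are illustrative only and are not the kernel's definitions.

```c
#include <stddef.h>

/* Illustrative stand-in for an idle state; not the kernel's struct. */
struct idle_state {
	const char *name;
	int (*enter_dead)(int index);	/* NULL if unusable for an offlined CPU */
};

/* Placeholder callback so the table below has something to point at. */
static int sample_enter_dead(int index)
{
	return index;
}

/*
 * Walk the array from the deepest state down and return the index of the
 * first state that provides enter_dead, or -1 if none does.
 */
static int find_deepest_dead_state(const struct idle_state *states, int count)
{
	int i;

	for (i = count - 1; i >= 0; i--)
		if (states[i].enter_dead)
			return i;

	return -1;
}
```

With states { C1: no callback, C2: callback, C3: no callback }, the walk skips C3 and returns index 1. Because the array order already encodes the power ranking, no per-state power value has to be compared.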
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
^ permalink raw reply