* [PATCH V4 0/3] cpufreq: Introduce cpufreq_freq_transition_{begin|end}()
@ 2014-03-21 5:34 Viresh Kumar
2014-03-21 5:34 ` [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized Viresh Kumar
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Viresh Kumar @ 2014-03-21 5:34 UTC (permalink / raw)
To: rjw
Cc: linaro-kernel, cpufreq, linux-pm, linux-kernel, srivatsa.bhat,
Viresh Kumar
Whenever we change the frequency of a CPU, we call the PRECHANGE and POSTCHANGE
notifiers. They must be serialized, i.e. PRECHANGE and POSTCHANGE notifiers
should strictly alternate, thereby preventing two different sets of PRECHANGE or
POSTCHANGE notifiers from interleaving arbitrarily.
The following examples illustrate why this is important:
Scenario 1:
-----------
A thread reading the value of cpuinfo_cur_freq will call
__cpufreq_cpu_get()->cpufreq_out_of_sync()->cpufreq_notify_transition()
The ondemand governor can decide to change the frequency of the CPU at the same
time and hence it can end up sending the notifications via ->target().
If the notifiers are not serialized, the following sequence can occur:
- PRECHANGE Notification for freq A (from cpuinfo_cur_freq)
- PRECHANGE Notification for freq B (from target())
- Freq changed by target() to B
- POSTCHANGE Notification for freq B
- POSTCHANGE Notification for freq A
We can see from the above that the last POSTCHANGE Notification happens for freq
A but the hardware is set to run at freq B.
Where would we break then? In adjust_jiffies() in cpufreq.c and in
cpufreq_callback() in arch/arm/kernel/smp.c (which also adjusts the jiffies):
all the loops_per_jiffy calculations will get messed up.
Scenario 2:
-----------
The governor calls __cpufreq_driver_target() to change the frequency. At the
same time, if we change scaling_{min|max}_freq from sysfs, it will end up
calling the governor's CPUFREQ_GOV_LIMITS notification, which will also call
__cpufreq_driver_target(). And hence we end up issuing concurrent calls to
->target().
Typically, platforms have the following logic in their ->target() routines
(e.g. cpufreq-cpu0, omap, exynos, etc.):
A. If new freq is more than old: Increase voltage
B. Change freq
C. If new freq is less than old: decrease voltage
Now, if the two concurrent calls to ->target() are X and Y, where X is trying to
increase the freq and Y is trying to decrease it, we get the following race
condition:
X.A: voltage gets increased for larger freq
Y.A: nothing happens
Y.B: freq gets decreased
Y.C: voltage gets decreased
X.B: freq gets increased
X.C: nothing happens
Thus we can end up setting a freq which is not supported by the voltage we have
set. That will probably make the CPU clock unstable, and the system might not
work properly anymore.
This patchset introduces a new pair of routines, cpufreq_freq_transition_begin()
and cpufreq_freq_transition_end(), which guarantee that calls to frequency
transition routines are serialized. The later patches convert the existing
drivers to use these new routines.
Srivatsa S. Bhat (1):
cpufreq: Make sure frequency transitions are serialized
Viresh Kumar (2):
cpufreq: Convert existing drivers to use
cpufreq_freq_transition_{begin|end}
cpufreq: Make cpufreq_notify_transition &
cpufreq_notify_post_transition static
drivers/cpufreq/cpufreq-nforce2.c | 4 +--
drivers/cpufreq/cpufreq.c | 52 +++++++++++++++++++++++++++++-------
drivers/cpufreq/exynos5440-cpufreq.c | 4 +--
drivers/cpufreq/gx-suspmod.c | 4 +--
drivers/cpufreq/integrator-cpufreq.c | 4 +--
drivers/cpufreq/longhaul.c | 4 +--
drivers/cpufreq/pcc-cpufreq.c | 4 +--
drivers/cpufreq/powernow-k6.c | 4 +--
drivers/cpufreq/powernow-k7.c | 4 +--
drivers/cpufreq/powernow-k8.c | 4 +--
drivers/cpufreq/s3c24xx-cpufreq.c | 4 +--
drivers/cpufreq/sh-cpufreq.c | 4 +--
drivers/cpufreq/unicore2-cpufreq.c | 4 +--
include/linux/cpufreq.h | 12 ++++++---
14 files changed, 76 insertions(+), 36 deletions(-)
--
1.7.12.rc2.18.g61b472e
* [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized
2014-03-21 5:34 [PATCH V4 0/3] cpufreq: Introduce cpufreq_freq_transition_{begin|end}() Viresh Kumar
@ 2014-03-21 5:34 ` Viresh Kumar
2014-03-21 7:46 ` Srivatsa S. Bhat
2014-03-21 5:34 ` [PATCH V4 2/3] cpufreq: Convert existing drivers to use cpufreq_freq_transition_{begin|end} Viresh Kumar
2014-03-21 5:34 ` [PATCH V4 3/3] cpufreq: Make cpufreq_notify_transition & cpufreq_notify_post_transition static Viresh Kumar
2 siblings, 1 reply; 18+ messages in thread
From: Viresh Kumar @ 2014-03-21 5:34 UTC (permalink / raw)
To: rjw
Cc: linaro-kernel, cpufreq, linux-pm, linux-kernel, srivatsa.bhat,
Viresh Kumar
From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Whenever we change the frequency of a CPU, we call the PRECHANGE and POSTCHANGE
notifiers. They must be serialized, i.e. PRECHANGE and POSTCHANGE notifiers
should strictly alternate, thereby preventing two different sets of PRECHANGE or
POSTCHANGE notifiers from interleaving arbitrarily.
The following examples illustrate why this is important:
Scenario 1:
-----------
A thread reading the value of cpuinfo_cur_freq will call
__cpufreq_cpu_get()->cpufreq_out_of_sync()->cpufreq_notify_transition()
The ondemand governor can decide to change the frequency of the CPU at the same
time and hence it can end up sending the notifications via ->target().
If the notifiers are not serialized, the following sequence can occur:
- PRECHANGE Notification for freq A (from cpuinfo_cur_freq)
- PRECHANGE Notification for freq B (from target())
- Freq changed by target() to B
- POSTCHANGE Notification for freq B
- POSTCHANGE Notification for freq A
We can see from the above that the last POSTCHANGE Notification happens for freq
A but the hardware is set to run at freq B.
Where would we break then? In adjust_jiffies() in cpufreq.c and in
cpufreq_callback() in arch/arm/kernel/smp.c (which also adjusts the jiffies):
all the loops_per_jiffy calculations will get messed up.
Scenario 2:
-----------
The governor calls __cpufreq_driver_target() to change the frequency. At the
same time, if we change scaling_{min|max}_freq from sysfs, it will end up
calling the governor's CPUFREQ_GOV_LIMITS notification, which will also call
__cpufreq_driver_target(). And hence we end up issuing concurrent calls to
->target().
Typically, platforms have the following logic in their ->target() routines
(e.g. cpufreq-cpu0, omap, exynos, etc.):
A. If new freq is more than old: Increase voltage
B. Change freq
C. If new freq is less than old: decrease voltage
Now, if the two concurrent calls to ->target() are X and Y, where X is trying to
increase the freq and Y is trying to decrease it, we get the following race
condition:
X.A: voltage gets increased for larger freq
Y.A: nothing happens
Y.B: freq gets decreased
Y.C: voltage gets decreased
X.B: freq gets increased
X.C: nothing happens
Thus we can end up setting a freq which is not supported by the voltage we have
set. That will probably make the CPU clock unstable, and the system might not
work properly anymore.
This patch introduces a set of synchronization primitives to serialize frequency
transitions, which are to be used as shown below:
cpufreq_freq_transition_begin();
//Perform the frequency change
cpufreq_freq_transition_end();
The _begin() call sends the PRECHANGE notification whereas the _end() call sends
the POSTCHANGE notification. Also, all the necessary synchronization is handled
within these calls. In particular, even drivers that set the ASYNC_NOTIFICATION
flag can use these APIs for performing frequency transitions (i.e., you can
call _begin() from one task and call the corresponding _end() from a different
task).
The actual synchronization underneath is not that complicated:
The key challenge is to allow drivers to begin the transition from one thread
and end it in a completely different thread (this is to enable drivers that do
asynchronous POSTCHANGE notification from bottom-halves, to also use the same
interface).
To achieve this, a 'transition_ongoing' flag, a 'transition_lock' spinlock and a
wait-queue are added per-policy. The flag and the wait-queue are used in
conjunction to create an "uninterrupted flow" from _begin() to _end(). The
spinlock is used to ensure that only one such "flow" is in flight at any given
time. Put together, this provides us with all the necessary synchronization.
Based-on-patch-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
I have kept your authorship for this patch as is and made a few modifications:
- removed the 'state' parameter from the begin/end routines.
- added a 'transition_failed' parameter to the end routine.
- replaced the mutex with a spinlock, as discussed earlier.
- added WARN_ON(), as discussed.
- exported these new routines.
- removed the locking from _end().
drivers/cpufreq/cpufreq.c | 37 +++++++++++++++++++++++++++++++++++++
include/linux/cpufreq.h | 10 ++++++++++
2 files changed, 47 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b349406..4279cc9 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -353,6 +353,41 @@ void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
}
EXPORT_SYMBOL_GPL(cpufreq_notify_post_transition);
+void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
+ struct cpufreq_freqs *freqs)
+{
+wait:
+ wait_event(policy->transition_wait, !policy->transition_ongoing);
+
+ spin_lock(&policy->transition_lock);
+
+ if (unlikely(policy->transition_ongoing)) {
+ spin_unlock(&policy->transition_lock);
+ goto wait;
+ }
+
+ policy->transition_ongoing = true;
+
+ spin_unlock(&policy->transition_lock);
+
+ cpufreq_notify_transition(policy, freqs, CPUFREQ_PRECHANGE);
+}
+EXPORT_SYMBOL_GPL(cpufreq_freq_transition_begin);
+
+void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
+ struct cpufreq_freqs *freqs, int transition_failed)
+{
+ if (unlikely(WARN_ON(!policy->transition_ongoing)))
+ return;
+
+ cpufreq_notify_post_transition(policy, freqs, transition_failed);
+
+ policy->transition_ongoing = false;
+
+ wake_up(&policy->transition_wait);
+}
+EXPORT_SYMBOL_GPL(cpufreq_freq_transition_end);
+
/*********************************************************************
* SYSFS INTERFACE *
@@ -982,6 +1017,8 @@ static struct cpufreq_policy *cpufreq_policy_alloc(void)
INIT_LIST_HEAD(&policy->policy_list);
init_rwsem(&policy->rwsem);
+ spin_lock_init(&policy->transition_lock);
+ init_waitqueue_head(&policy->transition_wait);
return policy;
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 70929bc..263173d 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -16,6 +16,7 @@
#include <linux/completion.h>
#include <linux/kobject.h>
#include <linux/notifier.h>
+#include <linux/spinlock.h>
#include <linux/sysfs.h>
/*********************************************************************
@@ -104,6 +105,11 @@ struct cpufreq_policy {
* __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
*/
struct rw_semaphore rwsem;
+
+ /* Synchronization for frequency transitions */
+ bool transition_ongoing; /* Tracks transition status */
+ spinlock_t transition_lock;
+ wait_queue_head_t transition_wait;
};
/* Only for ACPI */
@@ -336,6 +342,10 @@ void cpufreq_notify_transition(struct cpufreq_policy *policy,
struct cpufreq_freqs *freqs, unsigned int state);
void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
struct cpufreq_freqs *freqs, int transition_failed);
+void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
+ struct cpufreq_freqs *freqs);
+void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
+ struct cpufreq_freqs *freqs, int transition_failed);
#else /* CONFIG_CPU_FREQ */
static inline int cpufreq_register_notifier(struct notifier_block *nb,
--
1.7.12.rc2.18.g61b472e
* [PATCH V4 2/3] cpufreq: Convert existing drivers to use cpufreq_freq_transition_{begin|end}
2014-03-21 5:34 [PATCH V4 0/3] cpufreq: Introduce cpufreq_freq_transition_{begin|end}() Viresh Kumar
2014-03-21 5:34 ` [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized Viresh Kumar
@ 2014-03-21 5:34 ` Viresh Kumar
2014-03-21 7:48 ` Srivatsa S. Bhat
2014-03-21 5:34 ` [PATCH V4 3/3] cpufreq: Make cpufreq_notify_transition & cpufreq_notify_post_transition static Viresh Kumar
2 siblings, 1 reply; 18+ messages in thread
From: Viresh Kumar @ 2014-03-21 5:34 UTC (permalink / raw)
To: rjw
Cc: linaro-kernel, cpufreq, linux-pm, linux-kernel, srivatsa.bhat,
Viresh Kumar
The cpufreq core now has infrastructure that guarantees serialized calls to the
target() or target_index() callbacks: cpufreq_freq_transition_begin() and
cpufreq_freq_transition_end().
This patch converts the existing drivers to use these new routines.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
drivers/cpufreq/cpufreq-nforce2.c | 4 ++--
drivers/cpufreq/cpufreq.c | 9 ++++-----
drivers/cpufreq/exynos5440-cpufreq.c | 4 ++--
drivers/cpufreq/gx-suspmod.c | 4 ++--
drivers/cpufreq/integrator-cpufreq.c | 4 ++--
drivers/cpufreq/longhaul.c | 4 ++--
drivers/cpufreq/pcc-cpufreq.c | 4 ++--
drivers/cpufreq/powernow-k6.c | 4 ++--
drivers/cpufreq/powernow-k7.c | 4 ++--
drivers/cpufreq/powernow-k8.c | 4 ++--
drivers/cpufreq/s3c24xx-cpufreq.c | 4 ++--
drivers/cpufreq/sh-cpufreq.c | 4 ++--
drivers/cpufreq/unicore2-cpufreq.c | 4 ++--
13 files changed, 28 insertions(+), 29 deletions(-)
diff --git a/drivers/cpufreq/cpufreq-nforce2.c b/drivers/cpufreq/cpufreq-nforce2.c
index a05b876..379cc2c 100644
--- a/drivers/cpufreq/cpufreq-nforce2.c
+++ b/drivers/cpufreq/cpufreq-nforce2.c
@@ -270,7 +270,7 @@ static int nforce2_target(struct cpufreq_policy *policy,
pr_debug("Old CPU frequency %d kHz, new %d kHz\n",
freqs.old, freqs.new);
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs);
/* Disable IRQs */
/* local_irq_save(flags); */
@@ -285,7 +285,7 @@ static int nforce2_target(struct cpufreq_policy *policy,
/* Enable IRQs */
/* local_irq_restore(flags); */
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
+ cpufreq_freq_transition_end(policy, &freqs, false);
return 0;
}
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 4279cc9..b63e7e4 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1503,8 +1503,8 @@ static void cpufreq_out_of_sync(unsigned int cpu, unsigned int old_freq,
policy = per_cpu(cpufreq_cpu_data, cpu);
read_unlock_irqrestore(&cpufreq_driver_lock, flags);
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs);
+ cpufreq_freq_transition_end(policy, &freqs, false);
}
/**
@@ -1864,8 +1864,7 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
pr_debug("%s: cpu: %d, oldfreq: %u, new freq: %u\n",
__func__, policy->cpu, freqs.old, freqs.new);
- cpufreq_notify_transition(policy, &freqs,
- CPUFREQ_PRECHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs);
}
retval = cpufreq_driver->target_index(policy, index);
@@ -1874,7 +1873,7 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
__func__, retval);
if (notify)
- cpufreq_notify_post_transition(policy, &freqs, retval);
+ cpufreq_freq_transition_end(policy, &freqs, retval);
}
out:
diff --git a/drivers/cpufreq/exynos5440-cpufreq.c b/drivers/cpufreq/exynos5440-cpufreq.c
index 7f776aa..3655e7d 100644
--- a/drivers/cpufreq/exynos5440-cpufreq.c
+++ b/drivers/cpufreq/exynos5440-cpufreq.c
@@ -219,7 +219,7 @@ static int exynos_target(struct cpufreq_policy *policy, unsigned int index)
freqs.old = policy->cur;
freqs.new = freq_table[index].frequency;
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs);
/* Set the target frequency in all C0_3_PSTATE register */
for_each_cpu(i, policy->cpus) {
@@ -258,7 +258,7 @@ static void exynos_cpufreq_work(struct work_struct *work)
dev_crit(dvfs_info->dev, "New frequency out of range\n");
freqs.new = freqs.old;
}
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
+ cpufreq_freq_transition_end(policy, &freqs, false);
cpufreq_cpu_put(policy);
mutex_unlock(&cpufreq_lock);
diff --git a/drivers/cpufreq/gx-suspmod.c b/drivers/cpufreq/gx-suspmod.c
index d83e826..fe85673 100644
--- a/drivers/cpufreq/gx-suspmod.c
+++ b/drivers/cpufreq/gx-suspmod.c
@@ -265,7 +265,7 @@ static void gx_set_cpuspeed(struct cpufreq_policy *policy, unsigned int khz)
freqs.new = new_khz;
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs);
local_irq_save(flags);
if (new_khz != stock_freq) {
@@ -314,7 +314,7 @@ static void gx_set_cpuspeed(struct cpufreq_policy *policy, unsigned int khz)
gx_params->pci_suscfg = suscfg;
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
+ cpufreq_freq_transition_end(policy, &freqs, false);
pr_debug("suspend modulation w/ duration of ON:%d us, OFF:%d us\n",
gx_params->on_duration * 32, gx_params->off_duration * 32);
diff --git a/drivers/cpufreq/integrator-cpufreq.c b/drivers/cpufreq/integrator-cpufreq.c
index 0e27844..771f422 100644
--- a/drivers/cpufreq/integrator-cpufreq.c
+++ b/drivers/cpufreq/integrator-cpufreq.c
@@ -122,7 +122,7 @@ static int integrator_set_target(struct cpufreq_policy *policy,
return 0;
}
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs);
cm_osc = __raw_readl(cm_base + INTEGRATOR_HDR_OSC_OFFSET);
@@ -143,7 +143,7 @@ static int integrator_set_target(struct cpufreq_policy *policy,
*/
set_cpus_allowed(current, cpus_allowed);
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
+ cpufreq_freq_transition_end(policy, &freqs, false);
return 0;
}
diff --git a/drivers/cpufreq/longhaul.c b/drivers/cpufreq/longhaul.c
index 7b94da3..c599af9 100644
--- a/drivers/cpufreq/longhaul.c
+++ b/drivers/cpufreq/longhaul.c
@@ -269,7 +269,7 @@ static void longhaul_setstate(struct cpufreq_policy *policy,
freqs.old = calc_speed(longhaul_get_cpu_mult());
freqs.new = speed;
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs);
pr_debug("Setting to FSB:%dMHz Mult:%d.%dx (%s)\n",
fsb, mult/10, mult%10, print_speed(speed/1000));
@@ -386,7 +386,7 @@ retry_loop:
}
}
/* Report true CPU frequency */
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
+ cpufreq_freq_transition_end(policy, &freqs, false);
if (!bm_timeout)
printk(KERN_INFO PFX "Warning: Timeout while waiting for "
diff --git a/drivers/cpufreq/pcc-cpufreq.c b/drivers/cpufreq/pcc-cpufreq.c
index 1c0f106..728a2d8 100644
--- a/drivers/cpufreq/pcc-cpufreq.c
+++ b/drivers/cpufreq/pcc-cpufreq.c
@@ -215,7 +215,7 @@ static int pcc_cpufreq_target(struct cpufreq_policy *policy,
freqs.old = policy->cur;
freqs.new = target_freq;
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs);
input_buffer = 0x1 | (((target_freq * 100)
/ (ioread32(&pcch_hdr->nominal) * 1000)) << 8);
@@ -231,7 +231,7 @@ static int pcc_cpufreq_target(struct cpufreq_policy *policy,
status = ioread16(&pcch_hdr->status);
iowrite16(0, &pcch_hdr->status);
- cpufreq_notify_post_transition(policy, &freqs, status != CMD_COMPLETE);
+ cpufreq_freq_transition_end(policy, &freqs, status != CMD_COMPLETE);
spin_unlock(&pcc_lock);
if (status != CMD_COMPLETE) {
diff --git a/drivers/cpufreq/powernow-k6.c b/drivers/cpufreq/powernow-k6.c
index ce27e6c..882d673 100644
--- a/drivers/cpufreq/powernow-k6.c
+++ b/drivers/cpufreq/powernow-k6.c
@@ -148,11 +148,11 @@ static int powernow_k6_target(struct cpufreq_policy *policy,
freqs.old = busfreq * powernow_k6_get_cpu_multiplier();
freqs.new = busfreq * clock_ratio[best_i].driver_data;
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs);
powernow_k6_set_cpu_multiplier(best_i);
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
+ cpufreq_freq_transition_end(policy, &freqs, false);
return 0;
}
diff --git a/drivers/cpufreq/powernow-k7.c b/drivers/cpufreq/powernow-k7.c
index 0e68e02..30a6c48 100644
--- a/drivers/cpufreq/powernow-k7.c
+++ b/drivers/cpufreq/powernow-k7.c
@@ -269,7 +269,7 @@ static int powernow_target(struct cpufreq_policy *policy, unsigned int index)
freqs.new = powernow_table[index].frequency;
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs);
/* Now do the magic poking into the MSRs. */
@@ -290,7 +290,7 @@ static int powernow_target(struct cpufreq_policy *policy, unsigned int index)
if (have_a0 == 1)
local_irq_enable();
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
+ cpufreq_freq_transition_end(policy, &freqs, false);
return 0;
}
diff --git a/drivers/cpufreq/powernow-k8.c b/drivers/cpufreq/powernow-k8.c
index 27eb2be..770a9e1 100644
--- a/drivers/cpufreq/powernow-k8.c
+++ b/drivers/cpufreq/powernow-k8.c
@@ -963,9 +963,9 @@ static int transition_frequency_fidvid(struct powernow_k8_data *data,
policy = cpufreq_cpu_get(smp_processor_id());
cpufreq_cpu_put(policy);
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs);
res = transition_fid_vid(data, fid, vid);
- cpufreq_notify_post_transition(policy, &freqs, res);
+ cpufreq_freq_transition_end(policy, &freqs, res);
return res;
}
diff --git a/drivers/cpufreq/s3c24xx-cpufreq.c b/drivers/cpufreq/s3c24xx-cpufreq.c
index 2506974..d5ef905 100644
--- a/drivers/cpufreq/s3c24xx-cpufreq.c
+++ b/drivers/cpufreq/s3c24xx-cpufreq.c
@@ -217,7 +217,7 @@ static int s3c_cpufreq_settarget(struct cpufreq_policy *policy,
s3c_cpufreq_updateclk(clk_pclk, cpu_new.freq.pclk);
/* start the frequency change */
- cpufreq_notify_transition(policy, &freqs.freqs, CPUFREQ_PRECHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs.freqs);
/* If hclk is staying the same, then we do not need to
* re-write the IO or the refresh timings whilst we are changing
@@ -261,7 +261,7 @@ static int s3c_cpufreq_settarget(struct cpufreq_policy *policy,
local_irq_restore(flags);
/* notify everyone we've done this */
- cpufreq_notify_transition(policy, &freqs.freqs, CPUFREQ_POSTCHANGE);
+ cpufreq_freq_transition_end(policy, &freqs.freqs, false);
s3c_freq_dbg("%s: finished\n", __func__);
return 0;
diff --git a/drivers/cpufreq/sh-cpufreq.c b/drivers/cpufreq/sh-cpufreq.c
index 696170e..f63abf3 100644
--- a/drivers/cpufreq/sh-cpufreq.c
+++ b/drivers/cpufreq/sh-cpufreq.c
@@ -68,10 +68,10 @@ static int sh_cpufreq_target(struct cpufreq_policy *policy,
freqs.new = (freq + 500) / 1000;
freqs.flags = 0;
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs);
set_cpus_allowed_ptr(current, &cpus_allowed);
clk_set_rate(cpuclk, freq);
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
+ cpufreq_freq_transition_end(policy, &freqs, false);
dev_dbg(dev, "set frequency %lu Hz\n", freq);
diff --git a/drivers/cpufreq/unicore2-cpufreq.c b/drivers/cpufreq/unicore2-cpufreq.c
index 36cc330..13be802 100644
--- a/drivers/cpufreq/unicore2-cpufreq.c
+++ b/drivers/cpufreq/unicore2-cpufreq.c
@@ -44,9 +44,9 @@ static int ucv2_target(struct cpufreq_policy *policy,
freqs.old = policy->cur;
freqs.new = target_freq;
- cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+ cpufreq_freq_transition_begin(policy, &freqs);
ret = clk_set_rate(policy->mclk, target_freq * 1000);
- cpufreq_notify_post_transition(policy, &freqs, ret);
+ cpufreq_freq_transition_end(policy, &freqs, ret);
return ret;
}
--
1.7.12.rc2.18.g61b472e
* [PATCH V4 3/3] cpufreq: Make cpufreq_notify_transition & cpufreq_notify_post_transition static
2014-03-21 5:34 [PATCH V4 0/3] cpufreq: Introduce cpufreq_freq_transition_{begin|end}() Viresh Kumar
2014-03-21 5:34 ` [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized Viresh Kumar
2014-03-21 5:34 ` [PATCH V4 2/3] cpufreq: Convert existing drivers to use cpufreq_freq_transition_{begin|end} Viresh Kumar
@ 2014-03-21 5:34 ` Viresh Kumar
2014-03-21 7:51 ` Srivatsa S. Bhat
2 siblings, 1 reply; 18+ messages in thread
From: Viresh Kumar @ 2014-03-21 5:34 UTC (permalink / raw)
To: rjw
Cc: linaro-kernel, cpufreq, linux-pm, linux-kernel, srivatsa.bhat,
Viresh Kumar
cpufreq_notify_transition() and cpufreq_notify_post_transition() should no
longer be called directly by cpufreq drivers, so mark them static.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
drivers/cpufreq/cpufreq.c | 6 ++----
include/linux/cpufreq.h | 4 ----
2 files changed, 2 insertions(+), 8 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b63e7e4..7b1feff 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -331,16 +331,15 @@ static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
* function. It is called twice on all CPU frequency changes that have
* external effects.
*/
-void cpufreq_notify_transition(struct cpufreq_policy *policy,
+static void cpufreq_notify_transition(struct cpufreq_policy *policy,
struct cpufreq_freqs *freqs, unsigned int state)
{
for_each_cpu(freqs->cpu, policy->cpus)
__cpufreq_notify_transition(policy, freqs, state);
}
-EXPORT_SYMBOL_GPL(cpufreq_notify_transition);
/* Do post notifications when there are chances that transition has failed */
-void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
+static void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
struct cpufreq_freqs *freqs, int transition_failed)
{
cpufreq_notify_transition(policy, freqs, CPUFREQ_POSTCHANGE);
@@ -351,7 +350,6 @@ void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
cpufreq_notify_transition(policy, freqs, CPUFREQ_PRECHANGE);
cpufreq_notify_transition(policy, freqs, CPUFREQ_POSTCHANGE);
}
-EXPORT_SYMBOL_GPL(cpufreq_notify_post_transition);
void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
struct cpufreq_freqs *freqs)
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 263173d..826830b 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -338,10 +338,6 @@ static inline void cpufreq_resume(void) {}
int cpufreq_register_notifier(struct notifier_block *nb, unsigned int list);
int cpufreq_unregister_notifier(struct notifier_block *nb, unsigned int list);
-void cpufreq_notify_transition(struct cpufreq_policy *policy,
- struct cpufreq_freqs *freqs, unsigned int state);
-void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
- struct cpufreq_freqs *freqs, int transition_failed);
void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
struct cpufreq_freqs *freqs);
void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
--
1.7.12.rc2.18.g61b472e
* Re: [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized
2014-03-21 5:34 ` [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized Viresh Kumar
@ 2014-03-21 7:46 ` Srivatsa S. Bhat
2014-03-21 7:58 ` Viresh Kumar
0 siblings, 1 reply; 18+ messages in thread
From: Srivatsa S. Bhat @ 2014-03-21 7:46 UTC (permalink / raw)
To: Viresh Kumar
Cc: rjw, linaro-kernel, cpufreq, linux-pm, linux-kernel,
ego@linux.vnet.ibm.com
On 03/21/2014 11:04 AM, Viresh Kumar wrote:
> From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
>
> Whenever we change the frequency of a CPU, we call the PRECHANGE and POSTCHANGE
> notifiers. They must be serialized, i.e. PRECHANGE and POSTCHANGE notifiers
> should strictly alternate, thereby preventing two different sets of PRECHANGE or
> POSTCHANGE notifiers from interleaving arbitrarily.
>
> The following examples illustrate why this is important:
>
> Scenario 1:
> -----------
> A thread reading the value of cpuinfo_cur_freq, will call
> __cpufreq_cpu_get()->cpufreq_out_of_sync()->cpufreq_notify_transition()
>
> The ondemand governor can decide to change the frequency of the CPU at the same
> time and hence it can end up sending the notifications via ->target().
>
> If the notifiers are not serialized, the following sequence can occur:
> - PRECHANGE Notification for freq A (from cpuinfo_cur_freq)
> - PRECHANGE Notification for freq B (from target())
> - Freq changed by target() to B
> - POSTCHANGE Notification for freq B
> - POSTCHANGE Notification for freq A
>
> We can see from the above that the last POSTCHANGE Notification happens for freq
> A but the hardware is set to run at freq B.
>
> Where would we break then?: adjust_jiffies() in cpufreq.c & cpufreq_callback()
> in arch/arm/kernel/smp.c (which also adjusts the jiffies). All the
> loops_per_jiffy calculations will get messed up.
>
> Scenario 2:
> -----------
> The governor calls __cpufreq_driver_target() to change the frequency. At the
> same time, if we change scaling_{min|max}_freq from sysfs, it will end up
> calling the governor's CPUFREQ_GOV_LIMITS notification, which will also call
> __cpufreq_driver_target(). And hence we end up issuing concurrent calls to
> ->target().
>
> Typically, platforms have the following logic in their ->target() routines:
> (Eg: cpufreq-cpu0, omap, exynos, etc)
>
> A. If new freq is more than old: Increase voltage
> B. Change freq
> C. If new freq is less than old: decrease voltage
>
> Now, if the two concurrent calls to ->target() are X and Y, where X is trying to
> increase the freq and Y is trying to decrease it, we get the following race
> condition:
>
> X.A: voltage gets increased for larger freq
> Y.A: nothing happens
> Y.B: freq gets decreased
> Y.C: voltage gets decreased
> X.B: freq gets increased
> X.C: nothing happens
>
> Thus we can end up setting a freq which is not supported by the voltage we have
> set. That will probably make the clock to the CPU unstable and the system might
> not work properly anymore.
>
> This patch introduces a set of synchronization primitives to serialize frequency
> transitions, which are to be used as shown below:
>
> cpufreq_freq_transition_begin();
>
> //Perform the frequency change
>
> cpufreq_freq_transition_end();
>
> The _begin() call sends the PRECHANGE notification whereas the _end() call sends
> the POSTCHANGE notification. Also, all the necessary synchronization is handled
> within these calls. In particular, even drivers which set the ASYNC_NOTIFICATION
> flag can also use these APIs for performing frequency transitions (i.e., you can
> call _begin() from one task, and call the corresponding _end() from a different
> task).
>
> The actual synchronization underneath is not that complicated:
>
> The key challenge is to allow drivers to begin the transition from one thread
> and end it in a completely different thread (this is to enable drivers that do
> asynchronous POSTCHANGE notification from bottom-halves, to also use the same
> interface).
>
> To achieve this, a 'transition_ongoing' flag, a 'transition_lock' spinlock and a
> wait-queue are added per-policy. The flag and the wait-queue are used in
> conjunction to create an "uninterrupted flow" from _begin() to _end(). The
> spinlock is used to ensure that only one such "flow" is in flight at any given
> time. Put together, this provides us all the necessary synchronization.
>
> Based-on-patch-by: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>
> I have kept your Authorship for this patch as is and did few modifications:
> - removed 'state' parameter from begin/end routines.
> - added 'transition_failed' parameter to end routine.
> - changed mutex with spinlock as discussed earlier.
> - Added WARN_ON() as discussed.
> - Exported these new routines.
> - Removed locks from end.
>
Wonderful! I was going to do this today, but thanks a lot for taking
care of this for me!
The patch looks good, but I have one comment below.
> drivers/cpufreq/cpufreq.c | 37 +++++++++++++++++++++++++++++++++++++
> include/linux/cpufreq.h | 10 ++++++++++
> 2 files changed, 47 insertions(+)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index b349406..4279cc9 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -353,6 +353,41 @@ void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
> }
> EXPORT_SYMBOL_GPL(cpufreq_notify_post_transition);
>
> +void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
> + struct cpufreq_freqs *freqs)
> +{
> +wait:
> + wait_event(policy->transition_wait, !policy->transition_ongoing);
> +
> + spin_lock(&policy->transition_lock);
> +
> + if (unlikely(policy->transition_ongoing)) {
> + spin_unlock(&policy->transition_lock);
> + goto wait;
> + }
> +
> + policy->transition_ongoing = true;
> +
> + spin_unlock(&policy->transition_lock);
> +
> + cpufreq_notify_transition(policy, freqs, CPUFREQ_PRECHANGE);
> +}
> +EXPORT_SYMBOL_GPL(cpufreq_freq_transition_begin);
> +
> +void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
> + struct cpufreq_freqs *freqs, int transition_failed)
> +{
> + if (unlikely(WARN_ON(!policy->transition_ongoing)))
> + return;
> +
> + cpufreq_notify_post_transition(policy, freqs, transition_failed);
> +
> + policy->transition_ongoing = false;
We need this assignment to happen exactly at this point, that is, *after*
the call to post_transition() completes and before calling wake_up().
If the compiler or the CPU reorders the instructions and moves this
assignment to some other place, then we will be in trouble!
We might need memory barriers to ensure this doesn't get reordered.
Alternatively, we can simply hold the spin-lock around this assignment,
since locks automatically imply memory barriers. As an added advantage,
the code will then look more intuitive and easier to understand as well.
Thoughts?
Regards,
Srivatsa S. Bhat
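The concern can be made concrete with a small C11-atomics sketch (a userspace
analogue with illustrative names, not the kernel code): a release store on the
flag forbids the notifier's earlier writes from being reordered past it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-ins for the notifier's side effects and the per-policy flag. */
static int notified_freq;
static atomic_bool transition_ongoing = true;

static void post_transition(int new_freq)
{
	notified_freq = new_freq;	/* plain store inside the notifier */
}

void transition_end(int new_freq)
{
	post_transition(new_freq);
	/* Release store: the notifier's writes above cannot be reordered
	 * after this store, so a waiter that observes the flag as false
	 * (with acquire semantics) also sees the notifier's effects. */
	atomic_store_explicit(&transition_ongoing, false,
			      memory_order_release);
}
```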
> +
> + wake_up(&policy->transition_wait);
> +}
> +EXPORT_SYMBOL_GPL(cpufreq_freq_transition_end);
> +
>
> /*********************************************************************
> * SYSFS INTERFACE *
> @@ -982,6 +1017,8 @@ static struct cpufreq_policy *cpufreq_policy_alloc(void)
>
> INIT_LIST_HEAD(&policy->policy_list);
> init_rwsem(&policy->rwsem);
> + spin_lock_init(&policy->transition_lock);
> + init_waitqueue_head(&policy->transition_wait);
>
> return policy;
>
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 70929bc..263173d 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -16,6 +16,7 @@
> #include <linux/completion.h>
> #include <linux/kobject.h>
> #include <linux/notifier.h>
> +#include <linux/spinlock.h>
> #include <linux/sysfs.h>
>
> /*********************************************************************
> @@ -104,6 +105,11 @@ struct cpufreq_policy {
> * __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
> */
> struct rw_semaphore rwsem;
> +
> + /* Synchronization for frequency transitions */
> + bool transition_ongoing; /* Tracks transition status */
> + spinlock_t transition_lock;
> + wait_queue_head_t transition_wait;
> };
>
> /* Only for ACPI */
> @@ -336,6 +342,10 @@ void cpufreq_notify_transition(struct cpufreq_policy *policy,
> struct cpufreq_freqs *freqs, unsigned int state);
> void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
> struct cpufreq_freqs *freqs, int transition_failed);
> +void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
> + struct cpufreq_freqs *freqs);
> +void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
> + struct cpufreq_freqs *freqs, int transition_failed);
>
> #else /* CONFIG_CPU_FREQ */
> static inline int cpufreq_register_notifier(struct notifier_block *nb,
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH V4 2/3] cpufreq: Convert existing drivers to use cpufreq_freq_transition_{begin|end}
2014-03-21 5:34 ` [PATCH V4 2/3] cpufreq: Convert existing drivers to use cpufreq_freq_transition_{begin|end} Viresh Kumar
@ 2014-03-21 7:48 ` Srivatsa S. Bhat
2014-03-21 7:59 ` Viresh Kumar
0 siblings, 1 reply; 18+ messages in thread
From: Srivatsa S. Bhat @ 2014-03-21 7:48 UTC (permalink / raw)
To: Viresh Kumar
Cc: rjw, linaro-kernel, cpufreq, linux-pm, linux-kernel,
ego@linux.vnet.ibm.com
On 03/21/2014 11:04 AM, Viresh Kumar wrote:
> The cpufreq core has new infrastructure that guarantees serialized calls to
> the target() or target_index() callbacks: cpufreq_freq_transition_begin()
> and cpufreq_freq_transition_end().
>
> This patch converts the existing drivers to use this new set of routines.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Nitpick: Instead of using 'false' as an argument to _post_transition(),
you could use '0', since the argument is supposed to be an int. But that's
minor, I won't insist.
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Regards,
Srivatsa S. Bhat
> ---
> drivers/cpufreq/cpufreq-nforce2.c | 4 ++--
> drivers/cpufreq/cpufreq.c | 9 ++++-----
> drivers/cpufreq/exynos5440-cpufreq.c | 4 ++--
> drivers/cpufreq/gx-suspmod.c | 4 ++--
> drivers/cpufreq/integrator-cpufreq.c | 4 ++--
> drivers/cpufreq/longhaul.c | 4 ++--
> drivers/cpufreq/pcc-cpufreq.c | 4 ++--
> drivers/cpufreq/powernow-k6.c | 4 ++--
> drivers/cpufreq/powernow-k7.c | 4 ++--
> drivers/cpufreq/powernow-k8.c | 4 ++--
> drivers/cpufreq/s3c24xx-cpufreq.c | 4 ++--
> drivers/cpufreq/sh-cpufreq.c | 4 ++--
> drivers/cpufreq/unicore2-cpufreq.c | 4 ++--
> 13 files changed, 28 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq-nforce2.c b/drivers/cpufreq/cpufreq-nforce2.c
> index a05b876..379cc2c 100644
> --- a/drivers/cpufreq/cpufreq-nforce2.c
> +++ b/drivers/cpufreq/cpufreq-nforce2.c
> @@ -270,7 +270,7 @@ static int nforce2_target(struct cpufreq_policy *policy,
> pr_debug("Old CPU frequency %d kHz, new %d kHz\n",
> freqs.old, freqs.new);
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs);
>
> /* Disable IRQs */
> /* local_irq_save(flags); */
> @@ -285,7 +285,7 @@ static int nforce2_target(struct cpufreq_policy *policy,
> /* Enable IRQs */
> /* local_irq_restore(flags); */
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> + cpufreq_freq_transition_end(policy, &freqs, false);
>
> return 0;
> }
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 4279cc9..b63e7e4 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1503,8 +1503,8 @@ static void cpufreq_out_of_sync(unsigned int cpu, unsigned int old_freq,
> policy = per_cpu(cpufreq_cpu_data, cpu);
> read_unlock_irqrestore(&cpufreq_driver_lock, flags);
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs);
> + cpufreq_freq_transition_end(policy, &freqs, false);
> }
>
> /**
> @@ -1864,8 +1864,7 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
> pr_debug("%s: cpu: %d, oldfreq: %u, new freq: %u\n",
> __func__, policy->cpu, freqs.old, freqs.new);
>
> - cpufreq_notify_transition(policy, &freqs,
> - CPUFREQ_PRECHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs);
> }
>
> retval = cpufreq_driver->target_index(policy, index);
> @@ -1874,7 +1873,7 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
> __func__, retval);
>
> if (notify)
> - cpufreq_notify_post_transition(policy, &freqs, retval);
> + cpufreq_freq_transition_end(policy, &freqs, retval);
> }
>
> out:
> diff --git a/drivers/cpufreq/exynos5440-cpufreq.c b/drivers/cpufreq/exynos5440-cpufreq.c
> index 7f776aa..3655e7d 100644
> --- a/drivers/cpufreq/exynos5440-cpufreq.c
> +++ b/drivers/cpufreq/exynos5440-cpufreq.c
> @@ -219,7 +219,7 @@ static int exynos_target(struct cpufreq_policy *policy, unsigned int index)
> freqs.old = policy->cur;
> freqs.new = freq_table[index].frequency;
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs);
>
> /* Set the target frequency in all C0_3_PSTATE register */
> for_each_cpu(i, policy->cpus) {
> @@ -258,7 +258,7 @@ static void exynos_cpufreq_work(struct work_struct *work)
> dev_crit(dvfs_info->dev, "New frequency out of range\n");
> freqs.new = freqs.old;
> }
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> + cpufreq_freq_transition_end(policy, &freqs, false);
>
> cpufreq_cpu_put(policy);
> mutex_unlock(&cpufreq_lock);
> diff --git a/drivers/cpufreq/gx-suspmod.c b/drivers/cpufreq/gx-suspmod.c
> index d83e826..fe85673 100644
> --- a/drivers/cpufreq/gx-suspmod.c
> +++ b/drivers/cpufreq/gx-suspmod.c
> @@ -265,7 +265,7 @@ static void gx_set_cpuspeed(struct cpufreq_policy *policy, unsigned int khz)
>
> freqs.new = new_khz;
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs);
> local_irq_save(flags);
>
> if (new_khz != stock_freq) {
> @@ -314,7 +314,7 @@ static void gx_set_cpuspeed(struct cpufreq_policy *policy, unsigned int khz)
>
> gx_params->pci_suscfg = suscfg;
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> + cpufreq_freq_transition_end(policy, &freqs, false);
>
> pr_debug("suspend modulation w/ duration of ON:%d us, OFF:%d us\n",
> gx_params->on_duration * 32, gx_params->off_duration * 32);
> diff --git a/drivers/cpufreq/integrator-cpufreq.c b/drivers/cpufreq/integrator-cpufreq.c
> index 0e27844..771f422 100644
> --- a/drivers/cpufreq/integrator-cpufreq.c
> +++ b/drivers/cpufreq/integrator-cpufreq.c
> @@ -122,7 +122,7 @@ static int integrator_set_target(struct cpufreq_policy *policy,
> return 0;
> }
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs);
>
> cm_osc = __raw_readl(cm_base + INTEGRATOR_HDR_OSC_OFFSET);
>
> @@ -143,7 +143,7 @@ static int integrator_set_target(struct cpufreq_policy *policy,
> */
> set_cpus_allowed(current, cpus_allowed);
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> + cpufreq_freq_transition_end(policy, &freqs, false);
>
> return 0;
> }
> diff --git a/drivers/cpufreq/longhaul.c b/drivers/cpufreq/longhaul.c
> index 7b94da3..c599af9 100644
> --- a/drivers/cpufreq/longhaul.c
> +++ b/drivers/cpufreq/longhaul.c
> @@ -269,7 +269,7 @@ static void longhaul_setstate(struct cpufreq_policy *policy,
> freqs.old = calc_speed(longhaul_get_cpu_mult());
> freqs.new = speed;
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs);
>
> pr_debug("Setting to FSB:%dMHz Mult:%d.%dx (%s)\n",
> fsb, mult/10, mult%10, print_speed(speed/1000));
> @@ -386,7 +386,7 @@ retry_loop:
> }
> }
> /* Report true CPU frequency */
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> + cpufreq_freq_transition_end(policy, &freqs, false);
>
> if (!bm_timeout)
> printk(KERN_INFO PFX "Warning: Timeout while waiting for "
> diff --git a/drivers/cpufreq/pcc-cpufreq.c b/drivers/cpufreq/pcc-cpufreq.c
> index 1c0f106..728a2d8 100644
> --- a/drivers/cpufreq/pcc-cpufreq.c
> +++ b/drivers/cpufreq/pcc-cpufreq.c
> @@ -215,7 +215,7 @@ static int pcc_cpufreq_target(struct cpufreq_policy *policy,
>
> freqs.old = policy->cur;
> freqs.new = target_freq;
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs);
>
> input_buffer = 0x1 | (((target_freq * 100)
> / (ioread32(&pcch_hdr->nominal) * 1000)) << 8);
> @@ -231,7 +231,7 @@ static int pcc_cpufreq_target(struct cpufreq_policy *policy,
> status = ioread16(&pcch_hdr->status);
> iowrite16(0, &pcch_hdr->status);
>
> - cpufreq_notify_post_transition(policy, &freqs, status != CMD_COMPLETE);
> + cpufreq_freq_transition_end(policy, &freqs, status != CMD_COMPLETE);
> spin_unlock(&pcc_lock);
>
> if (status != CMD_COMPLETE) {
> diff --git a/drivers/cpufreq/powernow-k6.c b/drivers/cpufreq/powernow-k6.c
> index ce27e6c..882d673 100644
> --- a/drivers/cpufreq/powernow-k6.c
> +++ b/drivers/cpufreq/powernow-k6.c
> @@ -148,11 +148,11 @@ static int powernow_k6_target(struct cpufreq_policy *policy,
> freqs.old = busfreq * powernow_k6_get_cpu_multiplier();
> freqs.new = busfreq * clock_ratio[best_i].driver_data;
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs);
>
> powernow_k6_set_cpu_multiplier(best_i);
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> + cpufreq_freq_transition_end(policy, &freqs, false);
>
> return 0;
> }
> diff --git a/drivers/cpufreq/powernow-k7.c b/drivers/cpufreq/powernow-k7.c
> index 0e68e02..30a6c48 100644
> --- a/drivers/cpufreq/powernow-k7.c
> +++ b/drivers/cpufreq/powernow-k7.c
> @@ -269,7 +269,7 @@ static int powernow_target(struct cpufreq_policy *policy, unsigned int index)
>
> freqs.new = powernow_table[index].frequency;
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs);
>
> /* Now do the magic poking into the MSRs. */
>
> @@ -290,7 +290,7 @@ static int powernow_target(struct cpufreq_policy *policy, unsigned int index)
> if (have_a0 == 1)
> local_irq_enable();
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> + cpufreq_freq_transition_end(policy, &freqs, false);
>
> return 0;
> }
> diff --git a/drivers/cpufreq/powernow-k8.c b/drivers/cpufreq/powernow-k8.c
> index 27eb2be..770a9e1 100644
> --- a/drivers/cpufreq/powernow-k8.c
> +++ b/drivers/cpufreq/powernow-k8.c
> @@ -963,9 +963,9 @@ static int transition_frequency_fidvid(struct powernow_k8_data *data,
> policy = cpufreq_cpu_get(smp_processor_id());
> cpufreq_cpu_put(policy);
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs);
> res = transition_fid_vid(data, fid, vid);
> - cpufreq_notify_post_transition(policy, &freqs, res);
> + cpufreq_freq_transition_end(policy, &freqs, res);
>
> return res;
> }
> diff --git a/drivers/cpufreq/s3c24xx-cpufreq.c b/drivers/cpufreq/s3c24xx-cpufreq.c
> index 2506974..d5ef905 100644
> --- a/drivers/cpufreq/s3c24xx-cpufreq.c
> +++ b/drivers/cpufreq/s3c24xx-cpufreq.c
> @@ -217,7 +217,7 @@ static int s3c_cpufreq_settarget(struct cpufreq_policy *policy,
> s3c_cpufreq_updateclk(clk_pclk, cpu_new.freq.pclk);
>
> /* start the frequency change */
> - cpufreq_notify_transition(policy, &freqs.freqs, CPUFREQ_PRECHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs.freqs);
>
> /* If hclk is staying the same, then we do not need to
> * re-write the IO or the refresh timings whilst we are changing
> @@ -261,7 +261,7 @@ static int s3c_cpufreq_settarget(struct cpufreq_policy *policy,
> local_irq_restore(flags);
>
> /* notify everyone we've done this */
> - cpufreq_notify_transition(policy, &freqs.freqs, CPUFREQ_POSTCHANGE);
> + cpufreq_freq_transition_end(policy, &freqs.freqs, false);
>
> s3c_freq_dbg("%s: finished\n", __func__);
> return 0;
> diff --git a/drivers/cpufreq/sh-cpufreq.c b/drivers/cpufreq/sh-cpufreq.c
> index 696170e..f63abf3 100644
> --- a/drivers/cpufreq/sh-cpufreq.c
> +++ b/drivers/cpufreq/sh-cpufreq.c
> @@ -68,10 +68,10 @@ static int sh_cpufreq_target(struct cpufreq_policy *policy,
> freqs.new = (freq + 500) / 1000;
> freqs.flags = 0;
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs);
> set_cpus_allowed_ptr(current, &cpus_allowed);
> clk_set_rate(cpuclk, freq);
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> + cpufreq_freq_transition_end(policy, &freqs, false);
>
> dev_dbg(dev, "set frequency %lu Hz\n", freq);
>
> diff --git a/drivers/cpufreq/unicore2-cpufreq.c b/drivers/cpufreq/unicore2-cpufreq.c
> index 36cc330..13be802 100644
> --- a/drivers/cpufreq/unicore2-cpufreq.c
> +++ b/drivers/cpufreq/unicore2-cpufreq.c
> @@ -44,9 +44,9 @@ static int ucv2_target(struct cpufreq_policy *policy,
> freqs.old = policy->cur;
> freqs.new = target_freq;
>
> - cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> + cpufreq_freq_transition_begin(policy, &freqs);
> ret = clk_set_rate(policy->mclk, target_freq * 1000);
> - cpufreq_notify_post_transition(policy, &freqs, ret);
> + cpufreq_freq_transition_end(policy, &freqs, ret);
>
> return ret;
> }
>
* Re: [PATCH V4 3/3] cpufreq: Make cpufreq_notify_transition & cpufreq_notify_post_transition static
2014-03-21 5:34 ` [PATCH V4 3/3] cpufreq: Make cpufreq_notify_transition & cpufreq_notify_post_transition static Viresh Kumar
@ 2014-03-21 7:51 ` Srivatsa S. Bhat
0 siblings, 0 replies; 18+ messages in thread
From: Srivatsa S. Bhat @ 2014-03-21 7:51 UTC (permalink / raw)
To: Viresh Kumar
Cc: rjw, linaro-kernel, cpufreq, linux-pm,
linux-kernel@vger.kernel.org, ego@linux.vnet.ibm.com
On 03/21/2014 11:04 AM, Viresh Kumar wrote:
> cpufreq_notify_transition() and cpufreq_notify_post_transition() shouldn't be
> called directly by cpufreq drivers anymore and so these should be marked static.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Regards,
Srivatsa S. Bhat
> ---
> drivers/cpufreq/cpufreq.c | 6 ++----
> include/linux/cpufreq.h | 4 ----
> 2 files changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index b63e7e4..7b1feff 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -331,16 +331,15 @@ static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
> * function. It is called twice on all CPU frequency changes that have
> * external effects.
> */
> -void cpufreq_notify_transition(struct cpufreq_policy *policy,
> +static void cpufreq_notify_transition(struct cpufreq_policy *policy,
> struct cpufreq_freqs *freqs, unsigned int state)
> {
> for_each_cpu(freqs->cpu, policy->cpus)
> __cpufreq_notify_transition(policy, freqs, state);
> }
> -EXPORT_SYMBOL_GPL(cpufreq_notify_transition);
>
> /* Do post notifications when there are chances that transition has failed */
> -void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
> +static void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
> struct cpufreq_freqs *freqs, int transition_failed)
> {
> cpufreq_notify_transition(policy, freqs, CPUFREQ_POSTCHANGE);
> @@ -351,7 +350,6 @@ void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
> cpufreq_notify_transition(policy, freqs, CPUFREQ_PRECHANGE);
> cpufreq_notify_transition(policy, freqs, CPUFREQ_POSTCHANGE);
> }
> -EXPORT_SYMBOL_GPL(cpufreq_notify_post_transition);
>
> void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
> struct cpufreq_freqs *freqs)
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 263173d..826830b 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -338,10 +338,6 @@ static inline void cpufreq_resume(void) {}
> int cpufreq_register_notifier(struct notifier_block *nb, unsigned int list);
> int cpufreq_unregister_notifier(struct notifier_block *nb, unsigned int list);
>
> -void cpufreq_notify_transition(struct cpufreq_policy *policy,
> - struct cpufreq_freqs *freqs, unsigned int state);
> -void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
> - struct cpufreq_freqs *freqs, int transition_failed);
> void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
> struct cpufreq_freqs *freqs);
> void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
>
* Re: [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized
2014-03-21 7:46 ` Srivatsa S. Bhat
@ 2014-03-21 7:58 ` Viresh Kumar
2014-03-21 8:42 ` Srivatsa S. Bhat
0 siblings, 1 reply; 18+ messages in thread
From: Viresh Kumar @ 2014-03-21 7:58 UTC (permalink / raw)
To: Srivatsa S. Bhat
Cc: Rafael J. Wysocki, Lists linaro-kernel, cpufreq@vger.kernel.org,
linux-pm@vger.kernel.org, Linux Kernel Mailing List,
ego@linux.vnet.ibm.com
On 21 March 2014 13:16, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> Wonderful! I was going to do this today, but thanks a lot for taking
> care of this for me!
I just wanted to finish this long-lasting thread as soon as possible.
> We need this assignment to happen exactly at this point, that is, *after*
> the call to post_transition() completes and before calling wake_up().
>
> If the compiler or the CPU reorders the instructions and moves this
> assignment to some other place, then we will be in trouble!
>
> We might need memory barriers to ensure this doesn't get reordered.
> Alternatively, we can simply hold the spin-lock around this assignment,
> since locks automatically imply memory barriers. As an added advantage,
> the code will then look more intuitive and easier to understand as well.
>
> Thoughts?
I may be wrong, but this is how I understand locks: yes, spinlocks do imply
memory barriers, but that wouldn't guarantee what you are asking for in the
above explanation.
It would guarantee that transition_ongoing is updated after the lock is
taken, but the notification() could still happen after the lock is taken,
and even after the variable is modified.
You can find some information on this in
Documentation/memory-barriers.txt
I don't think the compiler or the CPU will reorder a call to a function and
an update of a variable, so this code might simply work. I expect there is
plenty of such code in the kernel.
* Re: [PATCH V4 2/3] cpufreq: Convert existing drivers to use cpufreq_freq_transition_{begin|end}
2014-03-21 7:48 ` Srivatsa S. Bhat
@ 2014-03-21 7:59 ` Viresh Kumar
0 siblings, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2014-03-21 7:59 UTC (permalink / raw)
To: Srivatsa S. Bhat
Cc: Rafael J. Wysocki, Lists linaro-kernel, cpufreq@vger.kernel.org,
linux-pm@vger.kernel.org, Linux Kernel Mailing List,
ego@linux.vnet.ibm.com
On 21 March 2014 13:18, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> Nitpick: Instead of using 'false' as an argument to _post_transition(),
> you could use '0', since the argument is supposed to be an int. But that's
> minor, I won't insist.
You should :)
I will actually update that with a macro, to make it more readable.
* Re: [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized
2014-03-21 7:58 ` Viresh Kumar
@ 2014-03-21 8:42 ` Srivatsa S. Bhat
2014-03-21 9:21 ` Viresh Kumar
0 siblings, 1 reply; 18+ messages in thread
From: Srivatsa S. Bhat @ 2014-03-21 8:42 UTC (permalink / raw)
To: Viresh Kumar
Cc: Rafael J. Wysocki, Lists linaro-kernel, cpufreq@vger.kernel.org,
linux-pm@vger.kernel.org, Linux Kernel Mailing List,
ego@linux.vnet.ibm.com
On 03/21/2014 01:28 PM, Viresh Kumar wrote:
> On 21 March 2014 13:16, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> We need this assignment to happen exactly at this point, that is, *after*
>> the call to post_transition() completes and before calling wake_up().
>>
>> If the compiler or the CPU reorders the instructions and moves this
>> assignment to some other place, then we will be in trouble!
>>
>> We might need memory barriers to ensure this doesn't get reordered.
>> Alternatively, we can simply hold the spin-lock around this assignment,
>> since locks automatically imply memory barriers. As an added advantage,
>> the code will then look more intuitive and easier to understand as well.
>>
>> Thoughts?
>
> I may be wrong but this is how I understand locks. Yes, spinlocks have
> memory barriers implemented but it wouldn't guarantee what you are
> asking for in the above explanation.
>
> It will guarantee that transition_ongoing will be updated after the lock
> is taken but the notification() can happen after the lock is taken and
> also after the variable is modified.
>
Actually, yes, that's true. The lock and unlock act as one-way barriers,
hence they ensure that the critical section doesn't seep outside of the
locks, but don't necessarily ensure that pieces of code outside the
critical section don't seep -into- the critical section. IOW, my reasoning
was not quite correct, but see below.
> You can find some information on this in
> Documentation/memory-barriers.txt
>
Yep, I know, I have read it several times, but I'm no expert ;-)
I found this interesting section on "SLEEP AND WAKE-UP FUNCTIONS". It
says that doing:
policy->transition_ongoing = false;
wake_up(&policy->transition_wait);
is safe (as long as some tasks are woken up). So we don't have to worry
about that part. So only the first part remains to be solved: ensuring
that the assignment occurs _after_ completing the invocation of the
POSTCHANGE notifiers.
For that, we can do:
cpufreq_notify_post_transition();
smp_mb();
policy->transition_ongoing = false;
That should take care of everything.
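In C11 terms, the smp_mb() proposed above corresponds to a release/acquire
pairing between the updater and the waiter. A hedged userspace sketch
(illustrative names only, not the kernel implementation):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static int loops_per_jiffy_copy;	/* data published by the notifier */
static atomic_bool ongoing = true;

/* Writer side: the release fence keeps the notifier's writes ordered
 * before the flag clear, which is what the smp_mb() above ensures. */
static void end_transition(int lpj)
{
	loops_per_jiffy_copy = lpj;	/* notifier side effect */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ongoing, false, memory_order_relaxed);
}

/* Reader side: the acquire load pairs with the release above, so once
 * the flag reads false, the notifier's writes are guaranteed visible. */
static int wait_for_transition(void)
{
	while (atomic_load_explicit(&ongoing, memory_order_acquire))
		;
	return loops_per_jiffy_copy;
}
```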
> I don't think compiler or CPU will reorder calls to a function and
> updates of a variable.
I'm not sure about that. I think it is free to do so if it finds
that there is no dependency that prevents it from reordering. In this
case the update to the flag has no "visible" dependency on the call
to post_transition().
> And so this code might simply work. And
> I hope there would be plenty of such code in kernel.
>
Sure, there are plenty of examples in the kernel where we call functions
and update variables. But in this particular case, our synchronization
_depends_ on those operations happening in a particular order. Hence
we need to ensure the ordering is right. Otherwise the synchronization
might get broken.
Here are some examples where memory barriers are inserted to avoid
reordering of variable updates and function calls:
kernel/rcu/torture.c: rcu_torture_barrier_cbs()
kernel/smp.c: kick_all_cpus_sync()
Regards,
Srivatsa S. Bhat
* Re: [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized
2014-03-21 8:42 ` Srivatsa S. Bhat
@ 2014-03-21 9:21 ` Viresh Kumar
2014-03-21 10:06 ` Viresh Kumar
2014-03-21 11:05 ` Catalin Marinas
0 siblings, 2 replies; 18+ messages in thread
From: Viresh Kumar @ 2014-03-21 9:21 UTC (permalink / raw)
To: Srivatsa S. Bhat, Catalin Marinas
Cc: Rafael J. Wysocki, Lists linaro-kernel, cpufreq@vger.kernel.org,
linux-pm@vger.kernel.org, Linux Kernel Mailing List,
ego@linux.vnet.ibm.com
On 21 March 2014 14:12, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> You can find some information on this in
>> Documentation/memory-barriers.txt
>
> Yep, I know, I have read it several times, but I'm no expert ;-)
Me neither :) .. That file has such complex stuff in it that it's difficult
to understand everything it says. I read it several times the last time I
went for an interview (almost 2 years back) and don't remember anything
now :)
> I found this interesting section on "SLEEP AND WAKE-UP FUNCTIONS". It
> says that doing:
>
> policy->transition_ongoing = false;
> wake_up(&policy->transition_wait);
>
> is safe (as long as some tasks are woken up). So we don't have to worry
> about that part.
Okay..
> So only the first part remains to be solved: ensuring
> that the assignment occurs _after_ completing the invocation of the
> POSTCHANGE notifiers.
>
> For that, we can do:
>
> cpufreq_notify_post_transition();
>
> smp_mb();
>
> policy->transition_ongoing = false;
>
> That should take care of everything.
>
>> I don't think compiler or CPU will reorder calls to a function and
>> updates of a variable.
>
> I'm not sure about that. I think it is free to do so if it finds
> that there is no dependency that prevents it from reordering. In this
> case the update to the flag has no "visible" dependency on the call
> to post_transition().
>
>> And so this code might simply work. And
>> I hope there would be plenty of such code in kernel.
>>
>
> Sure, there are plenty of examples in the kernel where we call functions
> and update variables. But in this particular case, our synchronization
> _depends_ on those operations happening in a particular order. Hence
> we need to ensure the ordering is right. Otherwise the synchronization
> might get broken.
I still don't buy that.. Let's call an expert :)
> Here are some examples where memory barriers are inserted to avoid
> reordering of variable updates and function calls:
>
> kernel/rcu/torture.c: rcu_torture_barrier_cbs()
rcutorture.c instead.
> kernel/smp.c: kick_all_cpus_sync()
These examples are a bit different than what we have here..
@Catalin: We have a problem here and need your expert advice. After changing
CPU frequency we need to call this code:
cpufreq_notify_post_transition();
policy->transition_ongoing = false;
And the sequence must be like this only. Is this guaranteed without any
memory barriers? cpufreq_notify_post_transition() isn't touching
transition_ongoing at all..
--
thanks..
* Re: [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized
2014-03-21 9:21 ` Viresh Kumar
@ 2014-03-21 10:06 ` Viresh Kumar
2014-03-21 11:05 ` Catalin Marinas
1 sibling, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2014-03-21 10:06 UTC (permalink / raw)
To: Srivatsa S. Bhat, Catalin Marinas
Cc: Rafael J. Wysocki, Lists linaro-kernel, cpufreq@vger.kernel.org,
linux-pm@vger.kernel.org, Linux Kernel Mailing List,
ego@linux.vnet.ibm.com, Russell King - ARM Linux
On 21 March 2014 14:51, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> @Catalin: We have a problem here and need your expert advice. After changing
> CPU frequency we need to call this code:
>
> cpufreq_notify_post_transition();
> policy->transition_ongoing = false;
>
> And the sequence must be like this only. Is this guaranteed without any
> memory barriers? cpufreq_notify_post_transition() isn't touching
> transition_ongoing at all..
For others this is what we discussed on IRC (rmk: Russell King)
<rmk> I'm no barrier expert, but the compiler can't reorder that assignment
across a function call which it knows nothing about (which it can't know
anything about because it calls other functions through function pointers)
<rmk> however, the CPU could re-order the effects with respect to other
agents (cpus/devices) when they look at the memory
<rmk> for the local CPU, the question is really: what does the C language
virtual machine say about this - that's what really matters. If the CPU
does speculative stuff, it still has to make the machine behaviour fit
that model.
* Re: [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized
2014-03-21 9:21 ` Viresh Kumar
2014-03-21 10:06 ` Viresh Kumar
@ 2014-03-21 11:05 ` Catalin Marinas
2014-03-21 11:24 ` Srivatsa S. Bhat
2014-03-24 6:19 ` Viresh Kumar
1 sibling, 2 replies; 18+ messages in thread
From: Catalin Marinas @ 2014-03-21 11:05 UTC (permalink / raw)
To: Viresh Kumar
Cc: Srivatsa S. Bhat, Rafael J. Wysocki, Lists linaro-kernel,
cpufreq@vger.kernel.org, linux-pm@vger.kernel.org,
Linux Kernel Mailing List, ego@linux.vnet.ibm.com
On Fri, Mar 21, 2014 at 09:21:02AM +0000, Viresh Kumar wrote:
> @Catalin: We have a problem here and need your expert advice. After changing
> CPU frequency we need to call this code:
>
> cpufreq_notify_post_transition();
> policy->transition_ongoing = false;
>
> And the sequence must be like this only. Is this guaranteed without any
> memory barriers? cpufreq_notify_post_transition() isn't touching
> transition_ongoing at all..
The above sequence doesn't say much. As rmk said, the compiler wouldn't
reorder the transition_ongoing write before the function call. I think
most architectures (not sure about Alpha) don't do speculative stores,
so hardware wouldn't reorder them either. However, other stores inside
the cpufreq_notify_post_transition() could be reordered after
transition_ongoing store. The same for memory accesses after the
transition_ongoing update, they could be reordered before.
So what we actually need to know is what are the other relevant memory
accesses that require strict ordering with transition_ongoing.
What I find strange in your patch is that
cpufreq_freq_transition_begin() uses spinlocks around transition_ongoing
update but cpufreq_freq_transition_end() doesn't.
--
Catalin
* Re: [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized
2014-03-21 11:05 ` Catalin Marinas
@ 2014-03-21 11:24 ` Srivatsa S. Bhat
2014-03-21 18:07 ` Catalin Marinas
2014-03-24 6:19 ` Viresh Kumar
1 sibling, 1 reply; 18+ messages in thread
From: Srivatsa S. Bhat @ 2014-03-21 11:24 UTC (permalink / raw)
To: Catalin Marinas
Cc: Viresh Kumar, Rafael J. Wysocki, Lists linaro-kernel,
cpufreq@vger.kernel.org, linux-pm@vger.kernel.org,
Linux Kernel Mailing List, ego@linux.vnet.ibm.com
On 03/21/2014 04:35 PM, Catalin Marinas wrote:
> On Fri, Mar 21, 2014 at 09:21:02AM +0000, Viresh Kumar wrote:
>> @Catalin: We have a problem here and need your expert advice. After changing
>> CPU frequency we need to call this code:
>>
>> cpufreq_notify_post_transition();
>> policy->transition_ongoing = false;
>>
>> And the sequence must be like this only. Is this guaranteed without any
>> memory barriers? cpufreq_notify_post_transition() isn't touching
>> transition_ongoing at all..
>
> The above sequence doesn't say much. As rmk said, the compiler wouldn't
> reorder the transition_ongoing write before the function call. I think
> most architectures (not sure about Alpha) don't do speculative stores,
> so hardware wouldn't reorder them either. However, other stores inside
> the cpufreq_notify_post_transition() could be reordered after
> transition_ongoing store. The same for memory accesses after the
> transition_ongoing update, they could be reordered before.
>
> So what we actually need to know is what are the other relevant memory
> accesses that require strict ordering with transition_ongoing.
>
Hmm.. The thing is, _everything_ inside the post_transition() function
should complete before writing to transition_ongoing. Because, setting the
flag to 'false' indicates the end of the critical section, and the next
contending task can enter the critical section.
So, I think we should use smp_mb() before setting transition_ongoing = false.
That way we'll be safe.
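A minimal userspace model of this suggestion (all names are illustrative, not the kernel API; a C11 fence stands in for smp_mb()) might look like this:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative model only: do_post_notifiers() and the variables here
 * stand in for the POSTCHANGE notifier chain and the state it
 * publishes; none of these names exist in the kernel. */
static int notifier_work_done;
static atomic_bool transition_ongoing = true;

static void do_post_notifiers(void)
{
	notifier_work_done = 1;	/* work the notifier chain publishes */
}

void transition_end_model(void)
{
	do_post_notifiers();
	/* Full-barrier analogue of smp_mb(): everything the notifiers
	 * stored above is ordered before the flag is cleared. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&transition_ongoing, false,
			      memory_order_relaxed);
}
```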
> What I find strange in your patch is that
> cpufreq_freq_transition_begin() uses spinlocks around transition_ongoing
> update but cpufreq_freq_transition_end() doesn't.
>
The reason is that, by the time we drop the spinlock, we would have set
the transition_ongoing flag to true, which prevents any other task from
entering the critical section. Hence, when we call the _end() function,
we are 100% sure that only one task is executing it. Hence locks are not
necessary around that second update. In fact, that very update marks the
end of the critical section (which acts much like a spin_unlock(&lock)
in a "regular" critical section).
I know the "critical section" and the synchronization used in this patch
are a bit unconventional, but that's because the scenario itself is
unconventional: we need to be able to start the critical section in one
task, and end it in another task! That's where all the complication
arises :-) It sounds weird, but in this cpufreq case, it's actually valid
and, surprisingly, makes sense too! :-)
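As a rough userspace sketch of this unconventional pattern (hypothetical names throughout; note the kernel patch deliberately clears the flag without taking the lock in _end(), which is exactly where the barrier question comes from):

```c
#include <pthread.h>
#include <stdbool.h>

/* Rough model of the pattern described above; all names are
 * illustrative, not the kernel API. Unlike a mutex, the "ongoing"
 * flag may be set in one thread and cleared in another. */
struct transition {
	pthread_mutex_t lock;	/* models policy->transition_lock */
	pthread_cond_t  wait;	/* models policy->transition_wait */
	bool ongoing;		/* models policy->transition_ongoing */
};

void transition_init(struct transition *t)
{
	pthread_mutex_init(&t->lock, NULL);
	pthread_cond_init(&t->wait, NULL);
	t->ongoing = false;
}

void transition_begin(struct transition *t)
{
	pthread_mutex_lock(&t->lock);
	while (t->ongoing)			/* wait_event() analogue */
		pthread_cond_wait(&t->wait, &t->lock);
	t->ongoing = true;			/* enter the critical section */
	pthread_mutex_unlock(&t->lock);
}

/* May legitimately be called from a different thread than the one
 * that called transition_begin(). */
void transition_end(struct transition *t)
{
	pthread_mutex_lock(&t->lock);
	t->ongoing = false;			/* leave the critical section */
	pthread_cond_broadcast(&t->wait);	/* wake_up() analogue */
	pthread_mutex_unlock(&t->lock);
}
```

This model takes the lock in transition_end() for simplicity; dropping it there, as the patch does, is what makes the ordering of the flag store a live question.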
Regards,
Srivatsa S. Bhat
* Re: [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized
2014-03-21 11:24 ` Srivatsa S. Bhat
@ 2014-03-21 18:07 ` Catalin Marinas
2014-03-22 3:48 ` Viresh Kumar
2014-03-24 6:48 ` Srivatsa S. Bhat
0 siblings, 2 replies; 18+ messages in thread
From: Catalin Marinas @ 2014-03-21 18:07 UTC (permalink / raw)
To: Srivatsa S. Bhat
Cc: Viresh Kumar, Rafael J. Wysocki, Lists linaro-kernel,
cpufreq@vger.kernel.org, linux-pm@vger.kernel.org,
Linux Kernel Mailing List, ego@linux.vnet.ibm.com
On Fri, Mar 21, 2014 at 11:24:16AM +0000, Srivatsa S. Bhat wrote:
> On 03/21/2014 04:35 PM, Catalin Marinas wrote:
> > On Fri, Mar 21, 2014 at 09:21:02AM +0000, Viresh Kumar wrote:
> >> @Catalin: We have a problem here and need your expert advice. After changing
> >> CPU frequency we need to call this code:
> >>
> >> cpufreq_notify_post_transition();
> >> policy->transition_ongoing = false;
> >>
> >> And the sequence must be like this only. Is this guaranteed without any
> >> memory barriers? cpufreq_notify_post_transition() isn't touching
> >> transition_ongoing at all..
> >
> > The above sequence doesn't say much. As rmk said, the compiler wouldn't
> > reorder the transition_ongoing write before the function call. I think
> > most architectures (not sure about Alpha) don't do speculative stores,
> > so hardware wouldn't reorder them either. However, other stores inside
> > the cpufreq_notify_post_transition() could be reordered after
> > transition_ongoing store. The same for memory accesses after the
> > transition_ongoing update, they could be reordered before.
> >
> > So what we actually need to know is what are the other relevant memory
> > accesses that require strict ordering with transition_ongoing.
>
> Hmm.. The thing is, _everything_ inside the post_transition() function
> should complete before writing to transition_ongoing. Because, setting the
> flag to 'false' indicates the end of the critical section, and the next
> contending task can enter the critical section.
smp_mb() is all about relative ordering. So if you want memory accesses
in post_transition() to be visible to other observers before
transition_ongoing = false, you also need to make sure that the readers
of transition_ongoing have a barrier before subsequent memory accesses.
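The pairing being described here can be illustrated with C11 release/acquire semantics (made-up names for the sketch, not the kernel code): a barrier on the writer side only helps if the reader side has a matching one.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative pairing: the writer's release store only guarantees
 * ordering if the reader uses an acquire load before touching the
 * data the writer published. Names are invented for this sketch. */
static int published;
static atomic_bool ongoing = true;

void writer(void)
{
	published = 1;	/* stands in for the notifier-chain stores */
	/* Release: orders the store above before the flag update,
	 * and pairs with the acquire load in reader_sees_work(). */
	atomic_store_explicit(&ongoing, false, memory_order_release);
}

bool reader_sees_work(void)
{
	/* Acquire: once the flag is observed clear, the data
	 * published before the release store is visible too. */
	if (!atomic_load_explicit(&ongoing, memory_order_acquire))
		return published == 1;
	return false;
}
```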
> > What I find strange in your patch is that
> > cpufreq_freq_transition_begin() uses spinlocks around transition_ongoing
> > update but cpufreq_freq_transition_end() doesn't.
>
> The reason is that, by the time we drop the spinlock, we would have set
> the transition_ongoing flag to true, which prevents any other task from
> entering the critical section. Hence, when we call the _end() function,
> we are 100% sure that only one task is executing it. Hence locks are not
> necessary around that second update. In fact, that very update marks the
> end of the critical section (which acts much like a spin_unlock(&lock)
> in a "regular" critical section).
OK, I start to get it. Is there a risk of missing a wake_up event? E.g.
one thread waking up earlier, noticing that transition is in progress
and waiting indefinitely?
--
Catalin
* Re: [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized
2014-03-21 18:07 ` Catalin Marinas
@ 2014-03-22 3:48 ` Viresh Kumar
2014-03-24 6:48 ` Srivatsa S. Bhat
1 sibling, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2014-03-22 3:48 UTC (permalink / raw)
To: Catalin Marinas
Cc: Srivatsa S. Bhat, Rafael J. Wysocki, Lists linaro-kernel,
cpufreq@vger.kernel.org, linux-pm@vger.kernel.org,
Linux Kernel Mailing List, ego@linux.vnet.ibm.com
On 21 March 2014 23:37, Catalin Marinas <catalin.marinas@arm.com> wrote:
> smp_mb() is all about relative ordering. So if you want memory accesses
> in post_transition() to be visible to other observers before
> transition_ongoing = false, you also need to make sure that the readers
> of transition_ongoing have a barrier before subsequent memory accesses.
I don't think this is a requirement in our case. We are just trying to
serialize frequency transitions here, and only want to make sure that the
second one starts after the first one is over. Hence this query.
> OK, I start to get it. Is there a risk of missing a wake_up event? E.g.
> one thread waking up earlier, noticing that transition is in progress
> and waiting indefinitely?
I don't think so. The only requirement is that the second thread wakes up
after this variable is set to false.
* Re: [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized
2014-03-21 11:05 ` Catalin Marinas
2014-03-21 11:24 ` Srivatsa S. Bhat
@ 2014-03-24 6:19 ` Viresh Kumar
1 sibling, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2014-03-24 6:19 UTC (permalink / raw)
To: Catalin Marinas
Cc: Srivatsa S. Bhat, Rafael J. Wysocki, Lists linaro-kernel,
cpufreq@vger.kernel.org, linux-pm@vger.kernel.org,
Linux Kernel Mailing List, ego@linux.vnet.ibm.com
On 21 March 2014 16:35, Catalin Marinas <catalin.marinas@arm.com> wrote:
> The above sequence doesn't say much. As rmk said, the compiler wouldn't
> reorder the transition_ongoing write before the function call. I think
> most architectures (not sure about Alpha) don't do speculative stores,
> so hardware wouldn't reorder them either. However, other stores inside
> the cpufreq_notify_post_transition() could be reordered after
> transition_ongoing store. The same for memory accesses after the
> transition_ongoing update, they could be reordered before.
I got confused again. If we look at what cpufreq_notify_post_transition()
does: it just calls a list of routines from a notifier chain. And going by
your statements above, this isn't going to be reordered across function
calls or branch instructions.
And even if for some reason there is a bit of reordering, it doesn't look
harmful at all to me.
We are more concerned about serialization of frequency transitions here,
and it still looks to me like we don't really need a barrier at all.
Probably we can keep it as is for now and add a barrier later if required.
* Re: [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized
2014-03-21 18:07 ` Catalin Marinas
2014-03-22 3:48 ` Viresh Kumar
@ 2014-03-24 6:48 ` Srivatsa S. Bhat
1 sibling, 0 replies; 18+ messages in thread
From: Srivatsa S. Bhat @ 2014-03-24 6:48 UTC (permalink / raw)
To: Catalin Marinas
Cc: Viresh Kumar, Rafael J. Wysocki, Lists linaro-kernel,
cpufreq@vger.kernel.org, linux-pm@vger.kernel.org,
Linux Kernel Mailing List, ego@linux.vnet.ibm.com
On 03/21/2014 11:37 PM, Catalin Marinas wrote:
> On Fri, Mar 21, 2014 at 11:24:16AM +0000, Srivatsa S. Bhat wrote:
>> On 03/21/2014 04:35 PM, Catalin Marinas wrote:
>>> On Fri, Mar 21, 2014 at 09:21:02AM +0000, Viresh Kumar wrote:
>>>> @Catalin: We have a problem here and need your expert advice. After changing
>>>> CPU frequency we need to call this code:
>>>>
>>>> cpufreq_notify_post_transition();
>>>> policy->transition_ongoing = false;
>>>>
>>>> And the sequence must be like this only. Is this guaranteed without any
>>>> memory barriers? cpufreq_notify_post_transition() isn't touching
>>>> transition_ongoing at all..
>>>
>>> The above sequence doesn't say much. As rmk said, the compiler wouldn't
>>> reorder the transition_ongoing write before the function call. I think
>>> most architectures (not sure about Alpha) don't do speculative stores,
>>> so hardware wouldn't reorder them either. However, other stores inside
>>> the cpufreq_notify_post_transition() could be reordered after
>>> transition_ongoing store. The same for memory accesses after the
>>> transition_ongoing update, they could be reordered before.
>>>
>>> So what we actually need to know is what are the other relevant memory
>>> accesses that require strict ordering with transition_ongoing.
>>
>> Hmm.. The thing is, _everything_ inside the post_transition() function
>> should complete before writing to transition_ongoing. Because, setting the
>> flag to 'false' indicates the end of the critical section, and the next
>> contending task can enter the critical section.
>
> smp_mb() is all about relative ordering. So if you want memory accesses
> in post_transition() to be visible to other observers before
> transition_ongoing = false, you also need to make sure that the readers
> of transition_ongoing have a barrier before subsequent memory accesses.
>
The reader takes a spinlock before reading the flag... won't that suffice?
+wait:
+ wait_event(policy->transition_wait, !policy->transition_ongoing);
+
+ spin_lock(&policy->transition_lock);
+
+ if (unlikely(policy->transition_ongoing)) {
+ spin_unlock(&policy->transition_lock);
+ goto wait;
+ }
>>> What I find strange in your patch is that
>>> cpufreq_freq_transition_begin() uses spinlocks around transition_ongoing
>>> update but cpufreq_freq_transition_end() doesn't.
>>
>> The reason is that, by the time we drop the spinlock, we would have set
>> the transition_ongoing flag to true, which prevents any other task from
>> entering the critical section. Hence, when we call the _end() function,
>> we are 100% sure that only one task is executing it. Hence locks are not
>> necessary around that second update. In fact, that very update marks the
>> end of the critical section (which acts much like a spin_unlock(&lock)
>> in a "regular" critical section).
>
> OK, I start to get it. Is there a risk of missing a wake_up event? E.g.
> one thread waking up earlier, noticing that transition is in progress
> and waiting indefinitely?
>
No, the only downside to having the CPU reorder the assignment to the
flag is that a new transition can begin while the old one is still
finishing up the frequency transition by calling the _post_transition()
notifiers.
Regards,
Srivatsa S. Bhat
end of thread, other threads:[~2014-03-24 6:50 UTC | newest]
Thread overview: 18+ messages
-- links below jump to the message on this page --
2014-03-21 5:34 [PATCH V4 0/3] cpufreq: Introduce cpufreq_freq_transition_{begin|end}() Viresh Kumar
2014-03-21 5:34 ` [PATCH V4 1/3] cpufreq: Make sure frequency transitions are serialized Viresh Kumar
2014-03-21 7:46 ` Srivatsa S. Bhat
2014-03-21 7:58 ` Viresh Kumar
2014-03-21 8:42 ` Srivatsa S. Bhat
2014-03-21 9:21 ` Viresh Kumar
2014-03-21 10:06 ` Viresh Kumar
2014-03-21 11:05 ` Catalin Marinas
2014-03-21 11:24 ` Srivatsa S. Bhat
2014-03-21 18:07 ` Catalin Marinas
2014-03-22 3:48 ` Viresh Kumar
2014-03-24 6:48 ` Srivatsa S. Bhat
2014-03-24 6:19 ` Viresh Kumar
2014-03-21 5:34 ` [PATCH V4 2/3] cpufreq: Convert existing drivers to use cpufreq_freq_transition_{begin|end} Viresh Kumar
2014-03-21 7:48 ` Srivatsa S. Bhat
2014-03-21 7:59 ` Viresh Kumar
2014-03-21 5:34 ` [PATCH V4 3/3] cpufreq: Make cpufreq_notify_transition & cpufreq_notify_post_transition static Viresh Kumar
2014-03-21 7:51 ` Srivatsa S. Bhat