Linux Power Management development
 help / color / mirror / Atom feed
* Re: [PATCH 3.7+stable] pm: fix wrong error-checking condition
From: Rafael J. Wysocki @ 2012-11-23 20:36 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: linux-pm
In-Reply-To: <Pine.LNX.4.64.1211231601040.14984@axis700.grange>

On Friday, November 23, 2012 04:02:56 PM Guennadi Liakhovetski wrote:
> dev_pm_qos_add_request() can return 0, 1, or a negative error code,
> therefore the correct error test is "if (error < 0)." Checking just for
> non-zero return code leads to erroneous setting of the req->dev pointer
> to NULL, which then leads to a repeated call to
> dev_pm_qos_add_ancestor_request() in st1232_ts_irq_handler(). This in turn
> leads to an Oops, when the I2C host adapter is unloaded and reloaded again
> because of the inconsistent state of its QoS request list.
> 
> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> ---
> 
> Hi Rafael, please push to 3.7 and to stable.

I will, thanks for the fix!

Rafael


>  drivers/base/power/qos.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c
> index 74a67e0..fbbd4ed 100644
> --- a/drivers/base/power/qos.c
> +++ b/drivers/base/power/qos.c
> @@ -451,7 +451,7 @@ int dev_pm_qos_add_ancestor_request(struct device *dev,
>  	if (ancestor)
>  		error = dev_pm_qos_add_request(ancestor, req, value);
>  
> -	if (error)
> +	if (error < 0)
>  		req->dev = NULL;
>  
>  	return error;
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* Re: [PATCH] cpufreq: ondemand: fix wrong delay sampling rate
From: Rafael J. Wysocki @ 2012-11-23 20:36 UTC (permalink / raw)
  To: Fabio Baltieri; +Cc: Viresh Kumar, cpufreq, linux-pm, linux-kernel
In-Reply-To: <20121123195700.GA29269@balto.lan>

On Friday, November 23, 2012 08:57:02 PM Fabio Baltieri wrote:
> On Fri, Nov 23, 2012 at 07:23:28PM +0530, Viresh Kumar wrote:
> > On 23 November 2012 18:42, Fabio Baltieri <fabio.baltieri@linaro.org> wrote:
> > > Restore the correct delay value for ondemand's od_dbs_timer, as it was
> > > changed erroneously in 83f0e55.
> > >
> > > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > > Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
> > > ---
> > >  drivers/cpufreq/cpufreq_ondemand.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
> > > index bdaab92..cca3e9f 100644
> > > --- a/drivers/cpufreq/cpufreq_ondemand.c
> > > +++ b/drivers/cpufreq/cpufreq_ondemand.c
> > > @@ -234,7 +234,8 @@ static void od_dbs_timer(struct work_struct *work)
> > >                         dbs_info->sample_type = OD_SUB_SAMPLE;
> > >                         delay = dbs_info->freq_hi_jiffies;
> > >                 } else {
> > > -                       delay = delay_for_sampling_rate(dbs_info->rate_mult);
> > > +                       delay = delay_for_sampling_rate(od_tuners.sampling_rate
> > > +                                               * dbs_info->rate_mult);
> > 
> > So sorry for my poor code :(
> 
> Actually I think that the new code is much better structured, and the
> patch was so big that I'll be surprised if this would be the only bug!
> 
> My problem is that I had to rewrite a patch based on the old code almost
> line-by-line but... these are the rules of the game!
> 
> > Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>

Applied to linux-pm.git/linux-next as v3.8 material.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* Re: [PATCH] cpufreq: ondemand: fix wrong delay sampling rate
From: Fabio Baltieri @ 2012-11-23 19:57 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Rafael J. Wysocki, cpufreq, linux-pm, linux-kernel
In-Reply-To: <CAKohpommGWb0WRnUDd_Do4yuadA5RcR4_ULFEqt+t9DNFnCg-Q@mail.gmail.com>

On Fri, Nov 23, 2012 at 07:23:28PM +0530, Viresh Kumar wrote:
> On 23 November 2012 18:42, Fabio Baltieri <fabio.baltieri@linaro.org> wrote:
> > Restore the correct delay value for ondemand's od_dbs_timer, as it was
> > changed erroneously in 83f0e55.
> >
> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
> > ---
> >  drivers/cpufreq/cpufreq_ondemand.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
> > index bdaab92..cca3e9f 100644
> > --- a/drivers/cpufreq/cpufreq_ondemand.c
> > +++ b/drivers/cpufreq/cpufreq_ondemand.c
> > @@ -234,7 +234,8 @@ static void od_dbs_timer(struct work_struct *work)
> >                         dbs_info->sample_type = OD_SUB_SAMPLE;
> >                         delay = dbs_info->freq_hi_jiffies;
> >                 } else {
> > -                       delay = delay_for_sampling_rate(dbs_info->rate_mult);
> > +                       delay = delay_for_sampling_rate(od_tuners.sampling_rate
> > +                                               * dbs_info->rate_mult);
> 
> So sorry for my poor code :(

Actually I think that the new code is much better structured, and the
patch was so big that I'll be surprised if this would be the only bug!

My problem is that I had to rewrite a patch based on the old code almost
line-by-line but... these are the rules of the game!

> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>

Thanks,
Fabio

-- 
Fabio Baltieri

^ permalink raw reply

* Re: [PATCH v4 1/2] cpufreq: ondemand: handle SW coordinated CPUs
From: Rafael J. Wysocki @ 2012-11-23 19:54 UTC (permalink / raw)
  To: Fabio Baltieri
  Cc: cpufreq, linux-pm, Rickard Andersson, Vincent Guittot,
	Linus Walleij, Lee Jones, linux-kernel
In-Reply-To: <20121123124221.GA7378@balto.lan>

On Friday, November 23, 2012 01:42:21 PM Fabio Baltieri wrote:
> Hi Rafael,
> 
> On Fri, Nov 23, 2012 at 10:31:56AM +0100, Fabio Baltieri wrote:
> >  drivers/cpufreq/cpufreq_ondemand.c | 152 ++++++++++++++++++++++++++++++++-----
> >  1 file changed, 132 insertions(+), 20 deletions(-)
> 
> I just noticed that the whole ondemand driver was refactored in your
> linux-pm.git/linux-next branch so my patch does not apply.  I'm
> preparing a rebased version of the set, so feel free to skip this one.

Thanks for the heads up, I'll wait for the new version.

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* [PATCH 3.7+stable] pm: fix wrong error-checking condition
From: Guennadi Liakhovetski @ 2012-11-23 15:02 UTC (permalink / raw)
  To: linux-pm; +Cc: Rafael J. Wysocki

dev_pm_qos_add_request() can return 0, 1, or a negative error code,
therefore the correct error test is "if (error < 0)." Checking just for
non-zero return code leads to erroneous setting of the req->dev pointer
to NULL, which then leads to a repeated call to
dev_pm_qos_add_ancestor_request() in st1232_ts_irq_handler(). This in turn
leads to an Oops, when the I2C host adapter is unloaded and reloaded again
because of the inconsistent state of its QoS request list.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
---

Hi Rafael, please push to 3.7 and to stable.

 drivers/base/power/qos.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c
index 74a67e0..fbbd4ed 100644
--- a/drivers/base/power/qos.c
+++ b/drivers/base/power/qos.c
@@ -451,7 +451,7 @@ int dev_pm_qos_add_ancestor_request(struct device *dev,
 	if (ancestor)
 		error = dev_pm_qos_add_request(ancestor, req, value);
 
-	if (error)
+	if (error < 0)
 		req->dev = NULL;
 
 	return error;
-- 
1.7.2.5


^ permalink raw reply related

* Re: [PATCH] cpufreq: ondemand: fix wrong delay sampling rate
From: Viresh Kumar @ 2012-11-23 13:53 UTC (permalink / raw)
  To: Fabio Baltieri; +Cc: Rafael J. Wysocki, cpufreq, linux-pm, linux-kernel
In-Reply-To: <1353676358-4385-1-git-send-email-fabio.baltieri@linaro.org>

On 23 November 2012 18:42, Fabio Baltieri <fabio.baltieri@linaro.org> wrote:
> Restore the correct delay value for ondemand's od_dbs_timer, as it was
> changed erroneously in 83f0e55.
>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
> ---
>  drivers/cpufreq/cpufreq_ondemand.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
> index bdaab92..cca3e9f 100644
> --- a/drivers/cpufreq/cpufreq_ondemand.c
> +++ b/drivers/cpufreq/cpufreq_ondemand.c
> @@ -234,7 +234,8 @@ static void od_dbs_timer(struct work_struct *work)
>                         dbs_info->sample_type = OD_SUB_SAMPLE;
>                         delay = dbs_info->freq_hi_jiffies;
>                 } else {
> -                       delay = delay_for_sampling_rate(dbs_info->rate_mult);
> +                       delay = delay_for_sampling_rate(od_tuners.sampling_rate
> +                                               * dbs_info->rate_mult);

So sorry for my poor code :(

Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>

^ permalink raw reply

* [PATCH] cpufreq: ondemand: fix wrong delay sampling rate
From: Fabio Baltieri @ 2012-11-23 13:12 UTC (permalink / raw)
  To: Rafael J. Wysocki, cpufreq, linux-pm
  Cc: linux-kernel, Fabio Baltieri, Viresh Kumar

Restore the correct delay value for ondemand's od_dbs_timer, as it was
changed erroneously in 83f0e55.

Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
---
 drivers/cpufreq/cpufreq_ondemand.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index bdaab92..cca3e9f 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -234,7 +234,8 @@ static void od_dbs_timer(struct work_struct *work)
 			dbs_info->sample_type = OD_SUB_SAMPLE;
 			delay = dbs_info->freq_hi_jiffies;
 		} else {
-			delay = delay_for_sampling_rate(dbs_info->rate_mult);
+			delay = delay_for_sampling_rate(od_tuners.sampling_rate
+						* dbs_info->rate_mult);
 		}
 	}
 
-- 
1.7.12.1


^ permalink raw reply related

* Re: [PATCH 2/2] cpufreq: governors: remove redundant code
From: Fabio Baltieri @ 2012-11-23 13:00 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: rjw, cpufreq, linux-pm, linux-kernel, linux-arm-kernel,
	linaro-dev, patches, pdsw-power-team, arvind.chauhan
In-Reply-To: <98478a02ae69afc01b3c7860e42a4d763b2045e8.1350677395.git.viresh.kumar@linaro.org>

Hello Viresh,

On Sat, Oct 20, 2012 at 01:42:06AM +0530, Viresh Kumar wrote:
> +static inline int delay_for_sampling_rate(unsigned int sampling_rate)
> +{
> +	int delay = usecs_to_jiffies(sampling_rate);
> +
> +	/* We want all CPUs to do sampling nearly on same jiffy */
> +	if (num_online_cpus() > 1)
> +		delay -= jiffies % delay;
> +
> +	return delay;
> +}
[...]
> +static void od_dbs_timer(struct work_struct *work)
> +{
> +	struct od_cpu_dbs_info_s *dbs_info =
> +		container_of(work, struct od_cpu_dbs_info_s, cdbs.work.work);
> +	unsigned int cpu = dbs_info->cdbs.cpu;
> +	int delay, sample_type = dbs_info->sample_type;
> +
> +	mutex_lock(&dbs_info->cdbs.timer_mutex);
>  
> -/* cpufreq_ondemand Governor Tunables */
> -#define show_one(file_name, object)					\
> -static ssize_t show_##file_name						\
> -(struct kobject *kobj, struct attribute *attr, char *buf)              \
> -{									\
> -	return sprintf(buf, "%u\n", dbs_tuners_ins.object);		\
> +	/* Common NORMAL_SAMPLE setup */
> +	dbs_info->sample_type = OD_NORMAL_SAMPLE;
> +	if (sample_type == OD_SUB_SAMPLE) {
> +		delay = dbs_info->freq_lo_jiffies;
> +		__cpufreq_driver_target(dbs_info->cdbs.cur_policy,
> +			dbs_info->freq_lo, CPUFREQ_RELATION_H);
> +	} else {
> +		dbs_check_cpu(&od_dbs_data, cpu);
> +		if (dbs_info->freq_lo) {
> +			/* Setup timer for SUB_SAMPLE */
> +			dbs_info->sample_type = OD_SUB_SAMPLE;
> +			delay = dbs_info->freq_hi_jiffies;
> +		} else {
> +			delay = delay_for_sampling_rate(dbs_info->rate_mult);
                                                        ^^^^^^^^^^^^^^^^^^^

I think this one should be:

delay = delay_for_sampling_rate(od_tuners.sampling_rate * dbs_info->rate_mult);

as currently the timer is firing every jiffy.  This was the old code:

> -static void do_dbs_timer(struct work_struct *work)
[...]
> -			delay = usecs_to_jiffies(dbs_tuners_ins.sampling_rate
> -				* dbs_info->rate_mult);
> -
> -			if (num_online_cpus() > 1)
> -				delay -= jiffies % delay;

I'm sending a patch for this one.

Fabio

-- 
Fabio Baltieri

^ permalink raw reply

* Re: [PATCH v4 1/2] cpufreq: ondemand: handle SW coordinated CPUs
From: Fabio Baltieri @ 2012-11-23 12:42 UTC (permalink / raw)
  To: Rafael J. Wysocki, cpufreq, linux-pm
  Cc: Rickard Andersson, Vincent Guittot, Linus Walleij, Lee Jones,
	linux-kernel
In-Reply-To: <1353663117-10074-1-git-send-email-fabio.baltieri@linaro.org>

Hi Rafael,

On Fri, Nov 23, 2012 at 10:31:56AM +0100, Fabio Baltieri wrote:
>  drivers/cpufreq/cpufreq_ondemand.c | 152 ++++++++++++++++++++++++++++++++-----
>  1 file changed, 132 insertions(+), 20 deletions(-)

I just noticed that the whole ondemand driver was refactored in your
linux-pm.git/linux-next branch so my patch does not apply.  I'm
preparing a rebased version of the set, so feel free to skip this one.

Thanks,
Fabio

-- 
Fabio Baltieri

^ permalink raw reply

* [PATCH v4 2/2] cpufreq: ondemand: use all CPUs in update_sampling_rate
From: Fabio Baltieri @ 2012-11-23  9:31 UTC (permalink / raw)
  To: Rafael J. Wysocki, cpufreq, linux-pm
  Cc: Rickard Andersson, Vincent Guittot, Linus Walleij, Lee Jones,
	linux-kernel, Fabio Baltieri
In-Reply-To: <1353663117-10074-1-git-send-email-fabio.baltieri@linaro.org>

Modify update_sampling_rate() to check, and eventually immediately
schedule, all CPU's do_dbs_timer delayed work.

This is required in case of software coordinated CPUs, as we now have a
separate delayed work for each CPU.

Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
---
 drivers/cpufreq/cpufreq_ondemand.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 849788b..14339f6 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -286,7 +286,7 @@ static void update_sampling_rate(unsigned int new_rate)
 		policy = cpufreq_cpu_get(cpu);
 		if (!policy)
 			continue;
-		dbs_info = &per_cpu(od_cpu_dbs_info, policy->cpu);
+		dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
 		cpufreq_cpu_put(policy);
 
 		mutex_lock(&dbs_info->timer_mutex);
@@ -306,7 +306,7 @@ static void update_sampling_rate(unsigned int new_rate)
 			cancel_delayed_work_sync(&dbs_info->work);
 			mutex_lock(&dbs_info->timer_mutex);
 
-			schedule_delayed_work_on(dbs_info->cpu, &dbs_info->work,
+			schedule_delayed_work_on(cpu, &dbs_info->work,
 						 usecs_to_jiffies(new_rate));
 
 		}
-- 
1.7.12.1


^ permalink raw reply related

* [PATCH v4 1/2] cpufreq: ondemand: handle SW coordinated CPUs
From: Fabio Baltieri @ 2012-11-23  9:31 UTC (permalink / raw)
  To: Rafael J. Wysocki, cpufreq, linux-pm
  Cc: Rickard Andersson, Vincent Guittot, Linus Walleij, Lee Jones,
	linux-kernel, Fabio Baltieri

From: Rickard Andersson <rickard.andersson@stericsson.com>

This patch fixes a bug that occured when we had load on a secondary CPU
and the primary CPU was sleeping. Only one sampling timer was spawned
and it was spawned as a deferred timer on the primary CPU, so when a
secondary CPU had a change in load this was not detected by the ondemand
governor.

This patch make sure that deferred timers are run on all CPUs in the
case of software controlled CPUs that run on the same frequency.

Signed-off-by: Rickard Andersson <rickard.andersson@stericsson.com>
Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
---
Changes from v3:
- moved update_sampling rate code on 2/2
- simplified dbs_sw_coordinated_cpus
- reworked timer handling for code readability, now:
  * do_dbs_timer() [-> dbs_timer_coordinated()] -> dbs_timer_update()
- simplified cpu_callback cases

 drivers/cpufreq/cpufreq_ondemand.c | 152 ++++++++++++++++++++++++++++++++-----
 1 file changed, 132 insertions(+), 20 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 396322f..849788b 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -93,6 +93,7 @@ struct cpu_dbs_info_s {
 	 * when user is changing the governor or limits.
 	 */
 	struct mutex timer_mutex;
+	ktime_t time_stamp;
 };
 static DEFINE_PER_CPU(struct cpu_dbs_info_s, od_cpu_dbs_info);
 
@@ -449,6 +450,13 @@ static struct attribute_group dbs_attr_group = {
 
 /************************** sysfs end ************************/
 
+static bool dbs_sw_coordinated_cpus(struct cpu_dbs_info_s *dbs_info)
+{
+	struct cpufreq_policy *policy = dbs_info->cur_policy;
+
+	return cpumask_weight(policy->cpus) > 1;
+}
+
 static void dbs_freq_increase(struct cpufreq_policy *p, unsigned int freq)
 {
 	if (dbs_tuners_ins.powersave_bias)
@@ -596,22 +604,18 @@ static void dbs_check_cpu(struct cpu_dbs_info_s *this_dbs_info)
 	}
 }
 
-static void do_dbs_timer(struct work_struct *work)
+static void dbs_timer_update(struct cpu_dbs_info_s *dbs_info, bool sample,
+			     struct delayed_work *dw)
 {
-	struct cpu_dbs_info_s *dbs_info =
-		container_of(work, struct cpu_dbs_info_s, work.work);
-	unsigned int cpu = dbs_info->cpu;
 	int sample_type = dbs_info->sample_type;
-
 	int delay;
 
-	mutex_lock(&dbs_info->timer_mutex);
-
 	/* Common NORMAL_SAMPLE setup */
 	dbs_info->sample_type = DBS_NORMAL_SAMPLE;
 	if (!dbs_tuners_ins.powersave_bias ||
 	    sample_type == DBS_NORMAL_SAMPLE) {
-		dbs_check_cpu(dbs_info);
+		if (sample)
+			dbs_check_cpu(dbs_info);
 		if (dbs_info->freq_lo) {
 			/* Setup timer for SUB_SAMPLE */
 			dbs_info->sample_type = DBS_SUB_SAMPLE;
@@ -627,32 +631,80 @@ static void do_dbs_timer(struct work_struct *work)
 				delay -= jiffies % delay;
 		}
 	} else {
-		__cpufreq_driver_target(dbs_info->cur_policy,
-			dbs_info->freq_lo, CPUFREQ_RELATION_H);
+		if (sample)
+			__cpufreq_driver_target(dbs_info->cur_policy,
+						dbs_info->freq_lo,
+						CPUFREQ_RELATION_H);
 		delay = dbs_info->freq_lo_jiffies;
 	}
-	schedule_delayed_work_on(cpu, &dbs_info->work, delay);
+	schedule_delayed_work_on(smp_processor_id(), dw, delay);
+}
+
+static void dbs_timer_coordinated(struct cpu_dbs_info_s *dbs_info_local,
+				  struct delayed_work *dw)
+{
+	struct cpu_dbs_info_s *dbs_info;
+	ktime_t time_now;
+	s64 delta_us;
+	bool sample = true;
+
+	/* use leader CPU's dbs_info */
+	dbs_info = &per_cpu(od_cpu_dbs_info, dbs_info_local->cpu);
+	mutex_lock(&dbs_info->timer_mutex);
+
+	time_now = ktime_get();
+	delta_us = ktime_us_delta(time_now, dbs_info->time_stamp);
+
+	/* Do nothing if we recently have sampled */
+	if (delta_us < (s64)(dbs_tuners_ins.sampling_rate / 2))
+		sample = false;
+	else
+		dbs_info->time_stamp = time_now;
+
+	dbs_timer_update(dbs_info, sample, dw);
 	mutex_unlock(&dbs_info->timer_mutex);
 }
 
-static inline void dbs_timer_init(struct cpu_dbs_info_s *dbs_info)
+static void do_dbs_timer(struct work_struct *work)
+{
+	struct delayed_work *dw = to_delayed_work(work);
+	struct cpu_dbs_info_s *dbs_info =
+		container_of(work, struct cpu_dbs_info_s, work.work);
+
+	if (dbs_sw_coordinated_cpus(dbs_info)) {
+		dbs_timer_coordinated(dbs_info, dw);
+	} else {
+		mutex_lock(&dbs_info->timer_mutex);
+		dbs_timer_update(dbs_info, true, dw);
+		mutex_unlock(&dbs_info->timer_mutex);
+	}
+}
+
+static inline void dbs_timer_init(struct cpu_dbs_info_s *dbs_info, int cpu)
 {
 	/* We want all CPUs to do sampling nearly on same jiffy */
 	int delay = usecs_to_jiffies(dbs_tuners_ins.sampling_rate);
+	struct cpu_dbs_info_s *dbs_info_local = &per_cpu(od_cpu_dbs_info, cpu);
 
 	if (num_online_cpus() > 1)
 		delay -= jiffies % delay;
 
+	cancel_delayed_work_sync(&dbs_info_local->work);
 	dbs_info->sample_type = DBS_NORMAL_SAMPLE;
-	INIT_DEFERRABLE_WORK(&dbs_info->work, do_dbs_timer);
-	schedule_delayed_work_on(dbs_info->cpu, &dbs_info->work, delay);
+	schedule_delayed_work_on(cpu, &dbs_info_local->work, delay);
 }
 
-static inline void dbs_timer_exit(struct cpu_dbs_info_s *dbs_info)
+static inline void dbs_timer_exit(int cpu)
 {
+	struct cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
 	cancel_delayed_work_sync(&dbs_info->work);
 }
 
+static void dbs_timer_exit_per_cpu(struct work_struct *dummy)
+{
+	dbs_timer_exit(smp_processor_id());
+}
+
 /*
  * Not all CPUs want IO time to be accounted as busy; this dependson how
  * efficient idling at a higher frequency/voltage is.
@@ -676,6 +728,41 @@ static int should_io_be_busy(void)
 	return 0;
 }
 
+static int __cpuinit cpu_callback(struct notifier_block *nfb,
+				  unsigned long action, void *hcpu)
+{
+	unsigned int cpu = (unsigned long)hcpu;
+	struct device *cpu_dev;
+	struct cpu_dbs_info_s *dbs_info;
+
+	dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
+
+	/* use leader CPU's dbs_info */
+	if (dbs_sw_coordinated_cpus(dbs_info))
+		dbs_info = &per_cpu(od_cpu_dbs_info, dbs_info->cpu);
+
+	cpu_dev = get_cpu_device(cpu);
+	if (cpu_dev) {
+		switch (action) {
+		case CPU_ONLINE:
+		case CPU_ONLINE_FROZEN:
+		case CPU_DOWN_FAILED:
+		case CPU_DOWN_FAILED_FROZEN:
+			dbs_timer_init(dbs_info, cpu);
+			break;
+		case CPU_DOWN_PREPARE:
+		case CPU_DOWN_PREPARE_FROZEN:
+			dbs_timer_exit(cpu);
+			break;
+		}
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block __refdata ondemand_cpu_notifier = {
+	.notifier_call = cpu_callback,
+};
+
 static int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 				   unsigned int event)
 {
@@ -704,9 +791,13 @@ static int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 			if (dbs_tuners_ins.ignore_nice)
 				j_dbs_info->prev_cpu_nice =
 						kcpustat_cpu(j).cpustat[CPUTIME_NICE];
+
+			mutex_init(&j_dbs_info->timer_mutex);
+			INIT_DEFERRABLE_WORK(&j_dbs_info->work, do_dbs_timer);
+
+			j_dbs_info->rate_mult = 1;
 		}
 		this_dbs_info->cpu = cpu;
-		this_dbs_info->rate_mult = 1;
 		ondemand_powersave_bias_init_cpu(cpu);
 		/*
 		 * Start the timerschedule work, when this governor
@@ -736,21 +827,42 @@ static int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 		}
 		mutex_unlock(&dbs_mutex);
 
-		mutex_init(&this_dbs_info->timer_mutex);
-		dbs_timer_init(this_dbs_info);
+		/* If SW coordinated CPUs then register notifier */
+		if (dbs_sw_coordinated_cpus(this_dbs_info)) {
+			register_hotcpu_notifier(&ondemand_cpu_notifier);
+
+			/* Initiate timer time stamp */
+			this_dbs_info->time_stamp = ktime_get();
+
+			for_each_cpu(j, policy->cpus) {
+				struct cpu_dbs_info_s *j_dbs_info;
+
+				j_dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
+				dbs_timer_init(j_dbs_info, j);
+			}
+		} else {
+			dbs_timer_init(this_dbs_info, cpu);
+		}
 		break;
 
 	case CPUFREQ_GOV_STOP:
-		dbs_timer_exit(this_dbs_info);
+		dbs_timer_exit(cpu);
 
 		mutex_lock(&dbs_mutex);
 		mutex_destroy(&this_dbs_info->timer_mutex);
 		dbs_enable--;
 		mutex_unlock(&dbs_mutex);
-		if (!dbs_enable)
+		if (!dbs_enable) {
 			sysfs_remove_group(cpufreq_global_kobject,
 					   &dbs_attr_group);
 
+			if (dbs_sw_coordinated_cpus(this_dbs_info)) {
+				/* Make sure all pending timers/works are
+				 * stopped. */
+				schedule_on_each_cpu(dbs_timer_exit_per_cpu);
+				unregister_hotcpu_notifier(&ondemand_cpu_notifier);
+			}
+		}
 		break;
 
 	case CPUFREQ_GOV_LIMITS:
-- 
1.7.12.1


^ permalink raw reply related

* Re: [PATCH 2/2] thermal: rcar: add .get_trip_type/temp and .notify support
From: Zhang Rui @ 2012-11-23  6:29 UTC (permalink / raw)
  To: Kuninori Morimoto; +Cc: Simon, Magnus, linux-pm
In-Reply-To: <87pq36uo5v.wl%kuninori.morimoto.gx@renesas.com>

On Wed, 2012-11-21 at 22:50 -0800, Kuninori Morimoto wrote:
> This patch adds .get_trip_type(), .get_trip_temp(), and .notify()
> on rcar_thermal_zone_ops.
> Driver will try platform power OFF if it reached to
> critical temperature.
> 
> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
> ---
>  drivers/thermal/rcar_thermal.c |   81 ++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 78 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
> index 454035c..16723ba 100644
> --- a/drivers/thermal/rcar_thermal.c
> +++ b/drivers/thermal/rcar_thermal.c
> @@ -22,10 +22,13 @@
>  #include <linux/io.h>
>  #include <linux/module.h>
>  #include <linux/platform_device.h>
> +#include <linux/reboot.h>
>  #include <linux/slab.h>
>  #include <linux/spinlock.h>
>  #include <linux/thermal.h>
>  
> +#define IDLE_INTERVAL	5000
> +
>  #define THSCR	0x2c
>  #define THSSR	0x30
>  
> @@ -172,11 +175,82 @@ static int rcar_thermal_get_temp(struct thermal_zone_device *zone,
>  	}
>  
>  	*temp = tmp;
> +
> +	return 0;
> +}
> +
> +static int rcar_thermal_get_trip_type(struct thermal_zone_device *zone,
> +				      int trip, enum thermal_trip_type *type)
> +{
> +	struct rcar_thermal_priv *priv = rcar_zone_to_priv(zone);
> +
> +	/* see rcar_thermal_get_temp() */
> +	switch (trip) {
> +	case 0: /* -45 <= temp < +45 */
> +		*type = THERMAL_TRIP_ACTIVE;
> +		break;
> +	case 1: /* +45 <= temp < +90 */
> +		*type = THERMAL_TRIP_HOT;
> +		break;
> +	case 2: /* +90 <= temp < +135 */
> +		*type = THERMAL_TRIP_CRITICAL;
> +		break;
> +	default:
> +		dev_err(priv->dev, "rcar driver trip error\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static int rcar_thermal_get_trip_temp(struct thermal_zone_device *zone,
> +				      int trip, unsigned long *temp)
> +{
> +	struct rcar_thermal_priv *priv = rcar_zone_to_priv(zone);
> +
> +	/* see rcar_thermal_get_temp() */
> +	switch (trip) {
> +	case 0: /* -45 <= temp < +45 */
> +		*temp = -45 - 1;
> +		break;

does this mean you expect the thermal driver to take some action when
the temperature is higher than -45C?

> +	case 1: /* +45 <= temp < +90 */
> +		*temp = 45 - 1;
> +		break;
what do you expect to happen when the temperature is higher than 45C but
lower than 90C?

> +	case 2: /* +90 <= temp < +135 */
> +		*temp = 90 - 1;
> +		break;
> +	default:
> +		dev_err(priv->dev, "rcar driver trip error\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static int rcar_thermal_notify(struct thermal_zone_device *zone,
> +			       int trip, enum thermal_trip_type type)
> +{
> +	struct rcar_thermal_priv *priv = rcar_zone_to_priv(zone);
> +
> +	switch (type) {
> +	case THERMAL_TRIP_CRITICAL:
> +		/* FIXME */
> +		dev_warn(priv->dev,
> +			 "Thermal reached to critical temperature\n");
> +		machine_power_off();
> +		break;
> +	default:
> +		break;
> +	}
> +
>  	return 0;
>  }
>  
>  static struct thermal_zone_device_ops rcar_thermal_zone_ops = {
> -	.get_temp = rcar_thermal_get_temp,
> +	.get_temp	= rcar_thermal_get_temp,
> +	.get_trip_type	= rcar_thermal_get_trip_type,
> +	.get_trip_temp	= rcar_thermal_get_trip_temp,
> +	.notify		= rcar_thermal_notify,
>  };
>  
how does the active trip point work if you do not bind any cooling
devices to it?

>  /*
> @@ -212,8 +286,9 @@ static int rcar_thermal_probe(struct platform_device *pdev)
>  		goto error_free_priv;
>  	}
>  
> -	zone = thermal_zone_device_register("rcar_thermal", 0, 0, priv,
> -					    &rcar_thermal_zone_ops, 0, 0);
> +	zone = thermal_zone_device_register("rcar_thermal", 4, 0, priv,
> +					    &rcar_thermal_zone_ops, 0,
> +					    IDLE_INTERVAL);

there is no interrupt for any thermal changes on this platform,
including critical overheat, right?

thanks,
rui


^ permalink raw reply

* Re: [PATCHv9 1/3] Runtime Interpreted Power Sequences
From: Mark Brown @ 2012-11-23  1:44 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Grant Likely, Alexandre Courbot, Anton Vorontsov, Stephen Warren,
	Thierry Reding, Mark Zhang, Rob Herring, David Woodhouse,
	Arnd Bergmann, linux-tegra, linux-arm-kernel, linux-kernel,
	linux-fbdev, devicetree-discuss, linux-pm, Alexandre Courbot,
	Ola LILJA2
In-Reply-To: <CACRpkdaNmBpfONDpt7zFqLaqfiGm+ELpO-v5gZmM0rEi_AzijQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 281 bytes --]

On Thu, Nov 22, 2012 at 09:57:22AM +0100, Linus Walleij wrote:

> Is it correct to assume that this library will be useful also for ALSA
> SoC type of devices?

ASoC has facilities for autogenerating the bulk of the sequences which
with modern devices is all that you really need.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH linux-next] cpuidle: fix a suspicious RCU usage in menu governor
From: Rafael J. Wysocki @ 2012-11-22 23:12 UTC (permalink / raw)
  To: Li Zhong; +Cc: linux-next list, linux-pm, riel, youquan.song, paulmck
In-Reply-To: <1353548752.3425.3.camel@ThinkPad-T5421.cn.ibm.com>

On 11/22/2012 2:45 AM, Li Zhong wrote:
> Adding linux-pm list, and ACKs.
>
> I saw this suspicious RCU usage on the next tree of 11/15
>
> [   67.123404] ===============================
> [   67.123413] [ INFO: suspicious RCU usage. ]
> [   67.123423] 3.7.0-rc5-next-20121115-dirty #1 Not tainted
> [   67.123434] -------------------------------
> [   67.123444] include/trace/events/timer.h:186 suspicious rcu_dereference_check() usage!
> [   67.123458]
> [   67.123458] other info that might help us debug this:
> [   67.123458]
> [   67.123474]
> [   67.123474] RCU used illegally from idle CPU!
> [   67.123474] rcu_scheduler_active = 1, debug_locks = 0
> [   67.123493] RCU used illegally from extended quiescent state!
> [   67.123507] 1 lock held by swapper/1/0:
> [   67.123516]  #0:  (&cpu_base->lock){-.-...}, at: [<c0000000000979b0>] .__hrtimer_start_range_ns+0x28c/0x524
> [   67.123555]
> [   67.123555] stack backtrace:
> [   67.123566] Call Trace:
> [   67.123576] [c0000001e2ccb920] [c00000000001275c] .show_stack+0x78/0x184 (unreliable)
> [   67.123599] [c0000001e2ccb9d0] [c0000000000c15a0] .lockdep_rcu_suspicious+0x120/0x148
> [   67.123619] [c0000001e2ccba70] [c00000000009601c] .enqueue_hrtimer+0x1c0/0x1c8
> [   67.123639] [c0000001e2ccbb00] [c000000000097aa0] .__hrtimer_start_range_ns+0x37c/0x524
> [   67.123660] [c0000001e2ccbc20] [c0000000005c9698] .menu_select+0x508/0x5bc
> [   67.123678] [c0000001e2ccbd20] [c0000000005c740c] .cpuidle_idle_call+0xa8/0x6e4
> [   67.123699] [c0000001e2ccbdd0] [c0000000000459a0] .pSeries_idle+0x10/0x34
> [   67.123717] [c0000001e2ccbe40] [c000000000014dc8] .cpu_idle+0x130/0x280
> [   67.123738] [c0000001e2ccbee0] [c0000000006ffa8c] .start_secondary+0x378/0x384
> [   67.123758] [c0000001e2ccbf90] [c00000000000936c] .start_secondary_prolog+0x10/0x14
>
> hrtimer_start was added in 198fd638 and ae515197. The patch below tries
> to use RCU_NONIDLE around it to avoid the above report.

Applied to the linux-next branch of the linux-pm.git tree as v3.8 material.

Thanks,
Rafael


> Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Reviewed-by: Rik van Riel <riel@redhat.com>
> ---
>   drivers/cpuidle/governors/menu.c |   10 ++++++----
>   1 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
> index 2efee27..bd40b94 100644
> --- a/drivers/cpuidle/governors/menu.c
> +++ b/drivers/cpuidle/governors/menu.c
> @@ -407,8 +407,9 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
>   		perfect_us = perfect_cstate_ms * 1000;
>   
>   		if (repeat && (4 * timer_us < data->expected_us)) {
> -			hrtimer_start(hrtmr, ns_to_ktime(1000 * timer_us),
> -				HRTIMER_MODE_REL_PINNED);
> +			RCU_NONIDLE(hrtimer_start(hrtmr,
> +				ns_to_ktime(1000 * timer_us),
> +				HRTIMER_MODE_REL_PINNED));
>   			/* In repeat case, menu hrtimer is started */
>   			per_cpu(hrtimer_status, cpu) = MENU_HRTIMER_REPEAT;
>   		} else if (perfect_us < data->expected_us) {
> @@ -418,8 +419,9 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
>   			 * In that case, it makes sense to re-enter
>   			 * into a deeper C-state after some time.
>   			 */
> -			hrtimer_start(hrtmr, ns_to_ktime(1000 * timer_us),
> -				HRTIMER_MODE_REL_PINNED);
> +			RCU_NONIDLE(hrtimer_start(hrtmr,
> +				ns_to_ktime(1000 * timer_us),
> +				HRTIMER_MODE_REL_PINNED));
>   			/* In general case, menu hrtimer is started */
>   			per_cpu(hrtimer_status, cpu) = MENU_HRTIMER_GENERAL;
>   		}

---------------------------------------------------------------------
Intel Technology Poland sp. z o.o.
z siedziba w Gdansku
ul. Slowackiego 173
80-298 Gdansk

Sad Rejonowy Gdansk Polnoc w Gdansku, 
VII Wydzial Gospodarczy Krajowego Rejestru Sadowego, 
numer KRS 101882

NIP 957-07-52-316
Kapital zakladowy 200.000 zl

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

^ permalink raw reply

* Re: [PATCHv9 1/3] Runtime Interpreted Power Sequences
From: Thierry Reding @ 2012-11-22 21:40 UTC (permalink / raw)
  To: Grant Likely
  Cc: Alexandre Courbot,
	linux-fbdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Stephen Warren, linux-pm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org,
	Mark Brown, Mark Zhang, Rob Herring,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Anton Vorontsov,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	David Woodhouse,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
In-Reply-To: <20121122133941.24B883E129E@localhost>


[-- Attachment #1.1: Type: text/plain, Size: 1040 bytes --]

On Thu, Nov 22, 2012 at 01:39:41PM +0000, Grant Likely wrote:
[...]
> I do think that each sequence should be contained within a single
> property, but I'm open to other suggestions.

IIRC a very early prototype did implement something like that. However
because of the resource issues this had to be string based, so that the
sequences looked somewhat like (Alex, correct me if I'm wrong):

	power-on = <"REGULATOR", "power", 1, "GPIO", "enable", 1>;

Instead we could possibly have something like:

	power-on = <0 &reg 1,
		    1 &gpio 42 0 1>;

Where the first cell in each entry defines the type (0 = regulator, 1 =
GPIO) and the rest would be a regular OF specifier for the given type of
resource along with some defined parameter such as enable/disable,
voltage, delay in ms, ... I don't know if that sounds any better. It
looks sort of cryptic but it is more "in the spirit of" DT, right Grant?

Writing this down, it seems to me like even that proposal was already
discussed at some point. Again, Alex may remember better.

Thierry

[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]

[-- Attachment #2: Type: text/plain, Size: 192 bytes --]

_______________________________________________
devicetree-discuss mailing list
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
https://lists.ozlabs.org/listinfo/devicetree-discuss

^ permalink raw reply

* Re: [PATCH v3] cpufreq: ondemand: handle SW coordinated CPUs
From: Rafael J. Wysocki @ 2012-11-22 21:27 UTC (permalink / raw)
  To: Fabio Baltieri
  Cc: cpufreq, linux-pm, Rickard Andersson, Vincent Guittot,
	Linus Walleij, Lee Jones, linux-kernel
In-Reply-To: <20121122170237.GA16948@gmail.com>

On Thursday, November 22, 2012 06:02:37 PM Fabio Baltieri wrote:
> Hello Rafael,
> 
> thanks for the review!  I only have one concern before sending a v4:
> 
> On Thu, Nov 22, 2012 at 01:10:15AM +0100, Rafael J. Wysocki wrote:
> > > @@ -627,32 +659,41 @@ static void do_dbs_timer(struct work_struct *work)
> > >  				delay -= jiffies % delay;
> > >  		}
> > >  	} else {
> > > -		__cpufreq_driver_target(dbs_info->cur_policy,
> > > -			dbs_info->freq_lo, CPUFREQ_RELATION_H);
> > > +		if (sample)
> > > +			__cpufreq_driver_target(dbs_info->cur_policy,
> > > +						dbs_info->freq_lo,
> > > +						CPUFREQ_RELATION_H);
> > >  		delay = dbs_info->freq_lo_jiffies;
> > >  	}
> > > -	schedule_delayed_work_on(cpu, &dbs_info->work, delay);
> > > +	schedule_delayed_work_on(smp_processor_id(), dw, delay);
> > 
> > We're not supposed to be using smp_processor_id() any more.
> > get_cpu()/put_cpu() should be used instead.
> 
> That's going to add preemption protection, do I need that?  The function
> is called from a kworker with the affinity set on a specific CPU, so it
> should not migrate to a different one during execution.

Yes, you're right, in that case it should be OK.

> I agree with you for all the other comments.

Cool. :-)

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* Re: [PATCH 2/2] thermal: rcar: add .get_trip_type/temp and .notify support
From: Magnus Damm @ 2012-11-22  7:05 UTC (permalink / raw)
  To: Kuninori Morimoto; +Cc: Zhang Rui, Simon, linux-pm
In-Reply-To: <87pq36uo5v.wl%kuninori.morimoto.gx@renesas.com>

Hello Morimoto-san,

On Thu, Nov 22, 2012 at 3:50 PM, Kuninori Morimoto
<kuninori.morimoto.gx@renesas.com> wrote:
> This patch adds .get_trip_type(), .get_trip_temp(), and .notify()
> on rcar_thermal_zone_ops.
> Driver will try platform power OFF if it reached to
> critical temperature.
>
> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
> ---
>  drivers/thermal/rcar_thermal.c |   81 ++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 78 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
> index 454035c..16723ba 100644
> --- a/drivers/thermal/rcar_thermal.c
> +++ b/drivers/thermal/rcar_thermal.c
> @@ -212,8 +286,9 @@ static int rcar_thermal_probe(struct platform_device *pdev)
>                 goto error_free_priv;
>         }
>
> -       zone = thermal_zone_device_register("rcar_thermal", 0, 0, priv,
> -                                           &rcar_thermal_zone_ops, 0, 0);
> +       zone = thermal_zone_device_register("rcar_thermal", 4, 0, priv,
> +                                           &rcar_thermal_zone_ops, 0,
> +                                           IDLE_INTERVAL);
>         if (IS_ERR(zone)) {
>                 dev_err(&pdev->dev, "thermal zone device is NULL\n");
>                 ret = PTR_ERR(zone);

Huh, so this IDLE_INTERVAL basically means that the driver currently is polled?

Good that this driver is for automotive systems and not mobile handsets. =)

Thanks,

/ magnus

^ permalink raw reply

* [PATCH 2/2] thermal: rcar: add .get_trip_type/temp and .notify support
From: Kuninori Morimoto @ 2012-11-22  6:50 UTC (permalink / raw)
  To: Zhang Rui; +Cc: Simon, Magnus, linux-pm, Kuninori Morimoto
In-Reply-To: <87sj82uo7o.wl%kuninori.morimoto.gx@renesas.com>

This patch adds .get_trip_type(), .get_trip_temp(), and .notify()
on rcar_thermal_zone_ops.
Driver will try platform power OFF if it reached to
critical temperature.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
---
 drivers/thermal/rcar_thermal.c |   81 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 78 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
index 454035c..16723ba 100644
--- a/drivers/thermal/rcar_thermal.c
+++ b/drivers/thermal/rcar_thermal.c
@@ -22,10 +22,13 @@
 #include <linux/io.h>
 #include <linux/module.h>
 #include <linux/platform_device.h>
+#include <linux/reboot.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
 #include <linux/thermal.h>
 
+#define IDLE_INTERVAL	5000
+
 #define THSCR	0x2c
 #define THSSR	0x30
 
@@ -172,11 +175,82 @@ static int rcar_thermal_get_temp(struct thermal_zone_device *zone,
 	}
 
 	*temp = tmp;
+
+	return 0;
+}
+
+static int rcar_thermal_get_trip_type(struct thermal_zone_device *zone,
+				      int trip, enum thermal_trip_type *type)
+{
+	struct rcar_thermal_priv *priv = rcar_zone_to_priv(zone);
+
+	/* see rcar_thermal_get_temp() */
+	switch (trip) {
+	case 0: /* -45 <= temp < +45 */
+		*type = THERMAL_TRIP_ACTIVE;
+		break;
+	case 1: /* +45 <= temp < +90 */
+		*type = THERMAL_TRIP_HOT;
+		break;
+	case 2: /* +90 <= temp < +135 */
+		*type = THERMAL_TRIP_CRITICAL;
+		break;
+	default:
+		dev_err(priv->dev, "rcar driver trip error\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int rcar_thermal_get_trip_temp(struct thermal_zone_device *zone,
+				      int trip, unsigned long *temp)
+{
+	struct rcar_thermal_priv *priv = rcar_zone_to_priv(zone);
+
+	/* see rcar_thermal_get_temp() */
+	switch (trip) {
+	case 0: /* -45 <= temp < +45 */
+		*temp = -45 - 1;
+		break;
+	case 1: /* +45 <= temp < +90 */
+		*temp = 45 - 1;
+		break;
+	case 2: /* +90 <= temp < +135 */
+		*temp = 90 - 1;
+		break;
+	default:
+		dev_err(priv->dev, "rcar driver trip error\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int rcar_thermal_notify(struct thermal_zone_device *zone,
+			       int trip, enum thermal_trip_type type)
+{
+	struct rcar_thermal_priv *priv = rcar_zone_to_priv(zone);
+
+	switch (type) {
+	case THERMAL_TRIP_CRITICAL:
+		/* FIXME */
+		dev_warn(priv->dev,
+			 "Thermal reached to critical temperature\n");
+		machine_power_off();
+		break;
+	default:
+		break;
+	}
+
 	return 0;
 }
 
 static struct thermal_zone_device_ops rcar_thermal_zone_ops = {
-	.get_temp = rcar_thermal_get_temp,
+	.get_temp	= rcar_thermal_get_temp,
+	.get_trip_type	= rcar_thermal_get_trip_type,
+	.get_trip_temp	= rcar_thermal_get_trip_temp,
+	.notify		= rcar_thermal_notify,
 };
 
 /*
@@ -212,8 +286,9 @@ static int rcar_thermal_probe(struct platform_device *pdev)
 		goto error_free_priv;
 	}
 
-	zone = thermal_zone_device_register("rcar_thermal", 0, 0, priv,
-					    &rcar_thermal_zone_ops, 0, 0);
+	zone = thermal_zone_device_register("rcar_thermal", 4, 0, priv,
+					    &rcar_thermal_zone_ops, 0,
+					    IDLE_INTERVAL);
 	if (IS_ERR(zone)) {
 		dev_err(&pdev->dev, "thermal zone device is NULL\n");
 		ret = PTR_ERR(zone);
-- 
1.7.9.5


^ permalink raw reply related

* [PATCH 0/2] thermal: rcar: add emergency power down
From: Kuninori Morimoto @ 2012-11-22  6:49 UTC (permalink / raw)
  To: Zhang Rui; +Cc: Simon, Magnus, linux-pm


Hi Zhang

These patches adds emergency shutdown on rcar_thermal.c

This "force shutdown" should be updated to use cooling device in future.
But it is enough for now.

Kuninori Morimoto (2):
      thermal: rcar: add rcar_zone_to_priv() macro
      thermal: rcar: add .get_trip_type/temp and .notify support

 drivers/thermal/rcar_thermal.c |   87 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 82 insertions(+), 5 deletions(-)

Best regards
---
Kuninori Morimoto

^ permalink raw reply

* [PATCH 1/2] thermal: rcar: add rcar_zone_to_priv() macro
From: Kuninori Morimoto @ 2012-11-22  6:50 UTC (permalink / raw)
  To: Zhang Rui; +Cc: Simon, Magnus, linux-pm, Kuninori Morimoto
In-Reply-To: <87sj82uo7o.wl%kuninori.morimoto.gx@renesas.com>

This patch adds rcar_zone_to_priv()
which is a helper macro for gettign private data.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
---
 drivers/thermal/rcar_thermal.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
index f7a1b57..454035c 100644
--- a/drivers/thermal/rcar_thermal.c
+++ b/drivers/thermal/rcar_thermal.c
@@ -43,6 +43,8 @@ struct rcar_thermal_priv {
 	u32 comp;
 };
 
+#define rcar_zone_to_priv(zone)		(zone->devdata)
+
 /*
  *		basic functions
  */
@@ -96,7 +98,7 @@ static void rcar_thermal_bset(struct rcar_thermal_priv *priv, u32 reg,
 static int rcar_thermal_get_temp(struct thermal_zone_device *zone,
 			   unsigned long *temp)
 {
-	struct rcar_thermal_priv *priv = zone->devdata;
+	struct rcar_thermal_priv *priv = rcar_zone_to_priv(zone);
 	int val, min, max, tmp;
 
 	tmp = -200; /* default */
@@ -235,7 +237,7 @@ error_free_priv:
 static int rcar_thermal_remove(struct platform_device *pdev)
 {
 	struct thermal_zone_device *zone = platform_get_drvdata(pdev);
-	struct rcar_thermal_priv *priv = zone->devdata;
+	struct rcar_thermal_priv *priv = rcar_zone_to_priv(zone);
 
 	thermal_zone_device_unregister(zone);
 	platform_set_drvdata(pdev, NULL);
-- 
1.7.9.5


^ permalink raw reply related

* How to use generic power domains (pm_genpd)
From: Rickard Andersson @ 2012-11-22 14:38 UTC (permalink / raw)
  To: linux-pm

Hi,

I am currently looking at genpd to see if it can be used in my case
(U8500) and I have some questions.

I want to model a powerdomain that is tied to the CPU (the CPU can not
be turned on without this power domain being on) so I would like to use
pm_genpd_attach_cpuidle(..). One problem with using this function is
that my powerdomain can be attached to several cpuidle sleep states and
genpd does not seem to handle that case? (genpd does only seem to handle
being connected to one cpuidle state.)

Another thing that I am trying to find out is if genpd works ok if a
driver in the power domain does pm_runtime_get_sync(..) in interrupt
context? When looking at pm_genpd_runtime_resume(..) it looks like the
function will never call __pm_genpd_poweron(..) in the interrupt context
case. If I understand it correctly this means that the cpuidle
connection code "genpd->cpu_data->idle_state->disabled = true" is never
executed?
Also in __pm_genpd_poweron(..) it seems non otimal from power
consumption point of view to call cpuidle_pause_and_lock(..) as soon as
the cpuidle connected power domain is turned on because
cpuidle_pause_and_lock(..) calls kick_all_cpus_sync(). This means that
all CPU's in sleep will be woken up just because another CPU enables
this power domain.

Thanks in advance
Rickard


^ permalink raw reply

* [RFC PATCH V2 3/3] step_wise: Unify the code for both throttle and dethrottle
From: Zhang Rui @ 2012-11-22  7:53 UTC (permalink / raw)
  To: Linux PM list; +Cc: Amit Kachhap, durga, Zhang, Rui

>From 9b298d2d7816074070567b9b20ba0c1eb5fb862b Mon Sep 17 00:00:00 2001
From: Zhang Rui <rui.zhang@intel.com>
Date: Thu, 22 Nov 2012 15:45:02 +0800
Subject: [PATCH 3/3] step_wise: Unify the code for both throttle and
 dethrottle

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
 drivers/thermal/step_wise.c |   70 ++++++++++++++++---------------------------
 1 file changed, 25 insertions(+), 45 deletions(-)

diff --git a/drivers/thermal/step_wise.c b/drivers/thermal/step_wise.c
index c3a0e87..dcc965d 100644
--- a/drivers/thermal/step_wise.c
+++ b/drivers/thermal/step_wise.c
@@ -99,52 +99,14 @@ static void update_passive_instance(struct thermal_zone_device *tz,
 		tz->passive += value;
 }
 
-static void update_instance_for_throttle(struct thermal_zone_device *tz,
-				int trip, enum thermal_trip_type trip_type,
-				enum thermal_trend trend)
-{
-	struct thermal_instance *instance;
-
-	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
-		if (instance->trip != trip)
-			continue;
-
-		instance->target = get_target_state(instance, trend, true);
-
-		/* Activate a passive thermal instance */
-		if (instance->target == THERMAL_NO_TARGET)
-			update_passive_instance(tz, trip_type, 1);
-
-		instance->cdev->updated = false; /* cdev needs update */
-	}
-}
-
-static void update_instance_for_dethrottle(struct thermal_zone_device *tz,
-				int trip, enum thermal_trip_type trip_type,
-				enum thermal_trend trend)
-{
-	struct thermal_instance *instance;
-
-	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
-		if (instance->trip != trip ||
-			instance->target == THERMAL_NO_TARGET)
-			continue;
-
-		instance->target = get_target_state(instance, trend, false);
-
-		/* Deactivate a passive thermal instance */
-		if (instance->target == THERMAL_NO_TARGET)
-			update_passive_instance(tz, trip_type, -1);
-
-		instance->cdev->updated = false; /* cdev needs update */
-	}
-}
-
 static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 {
 	long trip_temp;
 	enum thermal_trip_type trip_type;
 	enum thermal_trend trend;
+	struct thermal_instance *instance;
+	bool throttle = false;
+	int old_target;
 
 	if (trip == THERMAL_TRIPS_NONE) {
 		trip_temp = tz->forced_passive;
@@ -156,12 +118,30 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 
 	trend = get_tz_trend(tz, trip);
 
+	if (tz->temperature >= trip_temp)
+		throttle = true;
+
 	mutex_lock(&tz->lock);
 
-	if (tz->temperature >= trip_temp)
-		update_instance_for_throttle(tz, trip, trip_type, trend);
-	else
-		update_instance_for_dethrottle(tz, trip, trip_type, trend);
+	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
+		if (instance->trip != trip)
+			continue;
+
+		old_target = instance->target;
+		instance->target = get_target_state(instance, trend, throttle);
+
+		/* Activate a passive thermal instance */
+		if (old_target == THERMAL_NO_TARGET &&
+			instance->target != THERMAL_NO_TARGET)
+			update_passive_instance(tz, trip_type, 1);
+		/* Deactivate a passive thermal instance */
+		else if (old_target != THERMAL_NO_TARGET &&
+			instance->target == THERMAL_NO_TARGET)
+			update_passive_instance(tz, trip_type, -1);
+
+
+		instance->cdev->updated = false; /* cdev needs update */
+	}
 
 	mutex_unlock(&tz->lock);
 }
-- 
1.7.9.5




^ permalink raw reply related

* [PATCH  V2 2/3] Introduce THERMAL_TREND_RAISE/DROP_FULL support for step_wise governor
From: Zhang Rui @ 2012-11-22  7:52 UTC (permalink / raw)
  To: Linux PM list; +Cc: Amit Kachhap, durga, Zhang, Rui

>From ac2224a47b88ca64f4ae25da78fb032b31856624 Mon Sep 17 00:00:00 2001
From: Zhang Rui <rui.zhang@intel.com>
Date: Mon, 19 Nov 2012 16:10:20 +0800
Subject: [PATCH 2/3] Introduce THERMAL_TREND_RAISE/DROP_FULL support for
 step_wise governor

step_wise governor should set the device cooling state to
upper/lower limit directly when THERMAL_TREND_RAISE/DROP_FULL.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
 drivers/thermal/step_wise.c |   64 +++++++++++++++++++++++++++++++------------
 1 file changed, 46 insertions(+), 18 deletions(-)

diff --git a/drivers/thermal/step_wise.c b/drivers/thermal/step_wise.c
index 1242cff..c3a0e87 100644
--- a/drivers/thermal/step_wise.c
+++ b/drivers/thermal/step_wise.c
@@ -35,21 +35,54 @@
  *       state for this trip point
  *    b. if the trend is THERMAL_TREND_DROPPING, use lower cooling
  *       state for this trip point
+ *    c. if the trend is THERMAL_TREND_RAISE_FULL, use upper limit
+ *       for this trip point
+ *    d. if the trend is THERMAL_TREND_DROP_FULL, use lower limit
+ *       for this trip point
+ * If the temperature is lower than a trip point,
+ *    a. if the trend is THERMAL_TREND_RAISING, do nothing
+ *    b. if the trend is THERMAL_TREND_DROPPING, use lower cooling
+ *       state for this trip point, if the cooling state already
+ *       equals lower limit, deactivate the thermal instance
+ *    c. if the trend is THERMAL_TREND_RAISE_FULL, do nothing
+ *    d. if the trend is THERMAL_TREND_DROP_FULL, use lower limit,
+ *       if the cooling state already equals lower limit,
+ *       deactive the thermal instance
  */
 static unsigned long get_target_state(struct thermal_instance *instance,
-					enum thermal_trend trend)
+				enum thermal_trend trend, bool throttle)
 {
 	struct thermal_cooling_device *cdev = instance->cdev;
 	unsigned long cur_state;
 
 	cdev->ops->get_cur_state(cdev, &cur_state);
 
-	if (trend == THERMAL_TREND_RAISING) {
-		cur_state = cur_state < instance->upper ?
-			    (cur_state + 1) : instance->upper;
-	} else if (trend == THERMAL_TREND_DROPPING) {
-		cur_state = cur_state > instance->lower ?
-			    (cur_state - 1) : instance->lower;
+	switch (trend) {
+	case THERMAL_TREND_RAISING:
+		if (throttle)
+			cur_state = cur_state < instance->upper ?
+				    (cur_state + 1) : instance->upper;
+		break;
+	case THERMAL_TREND_RAISE_FULL:
+		if (throttle)
+			cur_state = instance->upper;
+		break;
+	case THERMAL_TREND_DROPPING:
+		if (cur_state == instance->lower) {
+			if (!throttle)
+				cur_state = -1;
+		} else
+			cur_state -= 1;
+		break;
+	case THERMAL_TREND_DROP_FULL:
+		if (cur_state == instance->lower) {
+			if (!throttle)
+				cur_state = -1;
+		} else
+			cur_state = instance->lower;
+		break;
+	default:
+		break;
 	}
 
 	return cur_state;
@@ -76,7 +109,7 @@ static void update_instance_for_throttle(struct thermal_zone_device *tz,
 		if (instance->trip != trip)
 			continue;
 
-		instance->target = get_target_state(instance, trend);
+		instance->target = get_target_state(instance, trend, true);
 
 		/* Activate a passive thermal instance */
 		if (instance->target == THERMAL_NO_TARGET)
@@ -87,28 +120,23 @@ static void update_instance_for_throttle(struct thermal_zone_device *tz,
 }
 
 static void update_instance_for_dethrottle(struct thermal_zone_device *tz,
-				int trip, enum thermal_trip_type trip_type)
+				int trip, enum thermal_trip_type trip_type,
+				enum thermal_trend trend)
 {
 	struct thermal_instance *instance;
-	struct thermal_cooling_device *cdev;
-	unsigned long cur_state;
 
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
 		if (instance->trip != trip ||
 			instance->target == THERMAL_NO_TARGET)
 			continue;
 
-		cdev = instance->cdev;
-		cdev->ops->get_cur_state(cdev, &cur_state);
-
-		instance->target = cur_state > instance->lower ?
-			    (cur_state - 1) : THERMAL_NO_TARGET;
+		instance->target = get_target_state(instance, trend, false);
 
 		/* Deactivate a passive thermal instance */
 		if (instance->target == THERMAL_NO_TARGET)
 			update_passive_instance(tz, trip_type, -1);
 
-		cdev->updated = false; /* cdev needs update */
+		instance->cdev->updated = false; /* cdev needs update */
 	}
 }
 
@@ -133,7 +161,7 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 	if (tz->temperature >= trip_temp)
 		update_instance_for_throttle(tz, trip, trip_type, trend);
 	else
-		update_instance_for_dethrottle(tz, trip, trip_type);
+		update_instance_for_dethrottle(tz, trip, trip_type, trend);
 
 	mutex_unlock(&tz->lock);
 }
-- 
1.7.9.5




^ permalink raw reply related

* [PATCH V2 1/3] Introduce THERMAL_TREND_RAISE_FULL and THERMAL_TREND_DROP_FULL
From: Zhang Rui @ 2012-11-22  7:51 UTC (permalink / raw)
  To: Linux PM list; +Cc: Amit Kachhap, durga, Zhang, Rui

>From cd05abc4929c21275e3674fb303ca6007f8415a0 Mon Sep 17 00:00:00 2001
From: Zhang Rui <rui.zhang@intel.com>
Date: Mon, 19 Nov 2012 15:33:51 +0800
Subject: [PATCH 1/3] Introduce THERMAL_TREND_RAISE_FULL and
 THERMAL_TREND_DROP_FULL

These two new thermal_trend types are used to tell the governor
that the temeprature is raising/dropping quickly.

Thermal cooling governors should handle this situation and make
proper decisions, e.g. set cooling state to upper/lower limit directly
instead of one step each time.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
 include/linux/thermal.h |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 807f214..dcaa400 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -68,6 +68,8 @@ enum thermal_trend {
 	THERMAL_TREND_STABLE, /* temperature is stable */
 	THERMAL_TREND_RAISING, /* temperature is raising */
 	THERMAL_TREND_DROPPING, /* temperature is dropping */
+	THERMAL_TREND_RAISE_FULL, /* apply highest cooling action */
+	THERMAL_TREND_DROP_FULL, /* apply lowest cooling action */
 };
 
 /* Events supported by Thermal Netlink */
-- 
1.7.9.5




^ permalink raw reply related

* Re: [PATCH] PM / devfreq: missing rcu_read_lock() added for find_device_opp()
From: Serge E. Hallyn @ 2012-11-22 17:45 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: MyungJoo Ham, myungjoo.ham, linux-pm, rjw, keescook, serge.hallyn,
	linux-kernel
In-Reply-To: <20121122140928.GA14567@mail.hallyn.com>

Quoting Serge E. Hallyn (serge@hallyn.com):
> Quoting MyungJoo Ham (myungjoo.ham@samsung.com):
> > opp_get_notifier() uses find_device_opp(), which requires to
> > held rcu_read_lock. In order to keep the notifier-header
> > valid, we have added rcu_read_lock().
> > 

Well, to be honest, the opp locking isn't 100% clear to me, but IIUC
(1) things can be added but never removed, (2) opp_get_notifier doesn't
pin a refcount on what it returns, but it returns something which won't
be deleted.

So IIUC what's below is fine, bc it doesn't need to pin the nh for the
nh to remain valid outside of the rcu_read_lock.  If I'm wrong about
that, then the below is not sufficient.

> Reviewed-by: Serge Hallyn <serge.hallyn@canonical.com>
> 
> > Reported-by: Kees Cook <keescook@chromium.org>
> > Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
> > ---
> >  drivers/devfreq/devfreq.c |   26 ++++++++++++++++++++------
> >  1 files changed, 20 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> > index 1388d46..5275883 100644
> > --- a/drivers/devfreq/devfreq.c
> > +++ b/drivers/devfreq/devfreq.c
> > @@ -1023,11 +1023,18 @@ struct opp *devfreq_recommended_opp(struct device *dev, unsigned long *freq,
> >   */
> >  int devfreq_register_opp_notifier(struct device *dev, struct devfreq *devfreq)
> >  {
> > -	struct srcu_notifier_head *nh = opp_get_notifier(dev);
> > +	struct srcu_notifier_head *nh;
> > +	int ret = 0;
> >  
> > +	rcu_read_lock();
> > +	nh = opp_get_notifier(dev);
> >  	if (IS_ERR(nh))
> > -		return PTR_ERR(nh);
> > -	return srcu_notifier_chain_register(nh, &devfreq->nb);
> > +		ret = PTR_ERR(nh);
> > +	rcu_read_unlock();
> > +	if (!ret)
> > +		ret = srcu_notifier_chain_register(nh, &devfreq->nb);
> > +
> > +	return ret;
> >  }
> >  
> >  /**
> > @@ -1042,11 +1049,18 @@ int devfreq_register_opp_notifier(struct device *dev, struct devfreq *devfreq)
> >   */
> >  int devfreq_unregister_opp_notifier(struct device *dev, struct devfreq *devfreq)
> >  {
> > -	struct srcu_notifier_head *nh = opp_get_notifier(dev);
> > +	struct srcu_notifier_head *nh;
> > +	int ret = 0;
> >  
> > +	rcu_read_lock();
> > +	nh = opp_get_notifier(dev);
> >  	if (IS_ERR(nh))
> > -		return PTR_ERR(nh);
> > -	return srcu_notifier_chain_unregister(nh, &devfreq->nb);
> > +		ret = PTR_ERR(nh);
> > +	rcu_read_unlock();
> > +	if (!ret)
> > +		ret = srcu_notifier_chain_unregister(nh, &devfreq->nb);
> > +
> > +	return ret;
> >  }
> >  
> >  MODULE_AUTHOR("MyungJoo Ham <myungjoo.ham@samsung.com>");
> > -- 
> > 1.7.5.4

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox