linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] thermal: avoid division by zero in power allocator
@ 2015-09-28 21:28 Andrea Arcangeli
  2015-09-28 21:28 ` Andrea Arcangeli
  0 siblings, 1 reply; 4+ messages in thread
From: Andrea Arcangeli @ 2015-09-28 21:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-pm, Javi Merino, Zhang Rui, Eduardo Valentin,
	Daniel Kurtz

Hello,

This is needed for my workstations or they Oops at boot since
v4.3-rc3.

Andrea Arcangeli (1):
  thermal: avoid division by zero in power allocator

 drivers/thermal/power_allocator.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Thanks,
Andrea

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH] thermal: avoid division by zero in power allocator
  2015-09-28 21:28 [PATCH] thermal: avoid division by zero in power allocator Andrea Arcangeli
@ 2015-09-28 21:28 ` Andrea Arcangeli
  2015-09-29 20:33   ` Andrew Morton
  0 siblings, 1 reply; 4+ messages in thread
From: Andrea Arcangeli @ 2015-09-28 21:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-pm, Javi Merino, Zhang Rui, Eduardo Valentin,
	Daniel Kurtz

During boot I get a div by zero Oops regression starting in v4.3-rc3.

Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 drivers/thermal/power_allocator.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index 7ff9627..e570ff0 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -144,6 +144,16 @@ static void estimate_pid_constants(struct thermal_zone_device *tz,
 		switch_on_temp = 0;
 
 	temperature_threshold = control_temp - switch_on_temp;
+	/*
+	 * estimate_pid_constants() tries to find appropriate default
+	 * values for thermal zones that don't provide them. If a
+	 * system integrator has configured a thermal zone with two
+	 * passive trip points at the same temperature, that person
+	 * hasn't put any effort to set up the thermal zone properly
+	 * so just give up.
+	 */
+	if (!temperature_threshold)
+		return;
 
 	if (!tz->tzp->k_po || force)
 		tz->tzp->k_po = int_to_frac(sustainable_power) /

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] thermal: avoid division by zero in power allocator
  2015-09-28 21:28 ` Andrea Arcangeli
@ 2015-09-29 20:33   ` Andrew Morton
  2015-10-01 10:17     ` Javi Merino
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2015-09-29 20:33 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: linux-kernel, linux-pm, Javi Merino, Zhang Rui, Eduardo Valentin,
	Daniel Kurtz

On Mon, 28 Sep 2015 23:28:34 +0200 Andrea Arcangeli <aarcange@redhat.com> wrote:

> During boot I get a div by zero Oops regression starting in v4.3-rc3.
> 
> ...
>
> --- a/drivers/thermal/power_allocator.c
> +++ b/drivers/thermal/power_allocator.c
> @@ -144,6 +144,16 @@ static void estimate_pid_constants(struct thermal_zone_device *tz,
>  		switch_on_temp = 0;
>  
>  	temperature_threshold = control_temp - switch_on_temp;
> +	/*
> +	 * estimate_pid_constants() tries to find appropriate default
> +	 * values for thermal zones that don't provide them. If a
> +	 * system integrator has configured a thermal zone with two
> +	 * passive trip points at the same temperature, that person
> +	 * hasn't put any effort to set up the thermal zone properly
> +	 * so just give up.
> +	 */
> +	if (!temperature_threshold)
> +		return;
>  
>  	if (!tz->tzp->k_po || force)
>  		tz->tzp->k_po = int_to_frac(sustainable_power) /

a) Are we sure this won't leave tz->tzp fields uninitialized?

b) I'm not understanding that code at all.  The "proportional" term
   in a PID controller is supposed to be proportional to the (desired -
   actual) difference (aka "the error").

   But estimate_pid_constants() appears to be setting the
   "proportional" term to be proportional to 1/error!

   Maybe a description of local `temperature_threshold' would help
   clue me in.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] thermal: avoid division by zero in power allocator
  2015-09-29 20:33   ` Andrew Morton
@ 2015-10-01 10:17     ` Javi Merino
  0 siblings, 0 replies; 4+ messages in thread
From: Javi Merino @ 2015-10-01 10:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andrea Arcangeli, linux-kernel@vger.kernel.org,
	linux-pm@vger.kernel.org, Zhang Rui, Eduardo Valentin,
	Daniel Kurtz

On Tue, Sep 29, 2015 at 09:33:30PM +0100, Andrew Morton wrote:
> On Mon, 28 Sep 2015 23:28:34 +0200 Andrea Arcangeli <aarcange@redhat.com> wrote:
> 
> > During boot I get a div by zero Oops regression starting in v4.3-rc3.
> > 
> > ...
> >
> > --- a/drivers/thermal/power_allocator.c
> > +++ b/drivers/thermal/power_allocator.c
> > @@ -144,6 +144,16 @@ static void estimate_pid_constants(struct thermal_zone_device *tz,
> >  		switch_on_temp = 0;
> >  
> >  	temperature_threshold = control_temp - switch_on_temp;
> > +	/*
> > +	 * estimate_pid_constants() tries to find appropriate default
> > +	 * values for thermal zones that don't provide them. If a
> > +	 * system integrator has configured a thermal zone with two
> > +	 * passive trip points at the same temperature, that person
> > +	 * hasn't put any effort to set up the thermal zone properly
> > +	 * so just give up.
> > +	 */
> > +	if (!temperature_threshold)
> > +		return;
> >  
> >  	if (!tz->tzp->k_po || force)
> >  		tz->tzp->k_po = int_to_frac(sustainable_power) /
> 
> a) Are we sure this won't leave tz->tzp fields uninitialized?

They will be all zeros.  That's good enough.

> b) I'm not understanding that code at all.  The "proportional" term
>    in a PID controller is supposed to be proportional to the (desired -
>    actual) difference (aka "the error").
> 
>    But estimate_pid_constants() appears to be setting the
>    "proportional" term to be proportional to 1/error!

estimate_pid_constants() calculate the constants that you use in the
PID algorithm.  Say:

k_p * error + k_i * integral_of_error + k_d * diff_of_error

This code is calculating a reasonable k_p, k_i and k_d when they are
not provided by the platform.

>    Maybe a description of local `temperature_threshold' would help
>    clue me in.

The `error' in the above definition is:

target_temperature - current_temperature

whereas `temperature_threshold' is:

`target_temperature' - `switch_on_temperature'

`switch_on_temperature' is the temperature above which the thermal
governor starts operating and throttling cpus (or whatever cooling
device is configured).

The `switch_on_temperature' and `target_temperature' are defined using
trip points.  A platform that sets two trip points to the same
temperature is not properly configured.  With Andrea's patch we
provide degraded behavior instead of crashing.  I agree with that
approach (hence my Reviewed-by, maybe it should be an Acked-by?).

Cheers,
Javi

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-10-01 10:17 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-28 21:28 [PATCH] thermal: avoid division by zero in power allocator Andrea Arcangeli
2015-09-28 21:28 ` Andrea Arcangeli
2015-09-29 20:33   ` Andrew Morton
2015-10-01 10:17     ` Javi Merino

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).