* [PATCH] thermal: avoid division by zero in power allocator
@ 2015-09-28 21:28 Andrea Arcangeli
2015-09-28 21:28 ` Andrea Arcangeli
0 siblings, 1 reply; 4+ messages in thread
From: Andrea Arcangeli @ 2015-09-28 21:28 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, linux-pm, Javi Merino, Zhang Rui, Eduardo Valentin,
Daniel Kurtz
Hello,
This is needed for my workstations or they Oops at boot since
v4.3-rc3.
Andrea Arcangeli (1):
thermal: avoid division by zero in power allocator
drivers/thermal/power_allocator.c | 10 ++++++++++
1 file changed, 10 insertions(+)
Thanks,
Andrea
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH] thermal: avoid division by zero in power allocator
2015-09-28 21:28 [PATCH] thermal: avoid division by zero in power allocator Andrea Arcangeli
@ 2015-09-28 21:28 ` Andrea Arcangeli
2015-09-29 20:33 ` Andrew Morton
0 siblings, 1 reply; 4+ messages in thread
From: Andrea Arcangeli @ 2015-09-28 21:28 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, linux-pm, Javi Merino, Zhang Rui, Eduardo Valentin,
Daniel Kurtz
During boot I get a div by zero Oops regression starting in v4.3-rc3.
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
drivers/thermal/power_allocator.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index 7ff9627..e570ff0 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -144,6 +144,16 @@ static void estimate_pid_constants(struct thermal_zone_device *tz,
switch_on_temp = 0;
temperature_threshold = control_temp - switch_on_temp;
+ /*
+ * estimate_pid_constants() tries to find appropriate default
+ * values for thermal zones that don't provide them. If a
+ * system integrator has configured a thermal zone with two
+ * passive trip points at the same temperature, that person
+ * hasn't put any effort to set up the thermal zone properly
+ * so just give up.
+ */
+ if (!temperature_threshold)
+ return;
if (!tz->tzp->k_po || force)
tz->tzp->k_po = int_to_frac(sustainable_power) /
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] thermal: avoid division by zero in power allocator
2015-09-28 21:28 ` Andrea Arcangeli
@ 2015-09-29 20:33 ` Andrew Morton
2015-10-01 10:17 ` Javi Merino
0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2015-09-29 20:33 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: linux-kernel, linux-pm, Javi Merino, Zhang Rui, Eduardo Valentin,
Daniel Kurtz
On Mon, 28 Sep 2015 23:28:34 +0200 Andrea Arcangeli <aarcange@redhat.com> wrote:
> During boot I get a div by zero Oops regression starting in v4.3-rc3.
>
> ...
>
> --- a/drivers/thermal/power_allocator.c
> +++ b/drivers/thermal/power_allocator.c
> @@ -144,6 +144,16 @@ static void estimate_pid_constants(struct thermal_zone_device *tz,
> switch_on_temp = 0;
>
> temperature_threshold = control_temp - switch_on_temp;
> + /*
> + * estimate_pid_constants() tries to find appropriate default
> + * values for thermal zones that don't provide them. If a
> + * system integrator has configured a thermal zone with two
> + * passive trip points at the same temperature, that person
> + * hasn't put any effort to set up the thermal zone properly
> + * so just give up.
> + */
> + if (!temperature_threshold)
> + return;
>
> if (!tz->tzp->k_po || force)
> tz->tzp->k_po = int_to_frac(sustainable_power) /
a) Are we sure this won't leave tz->tzp fields uninitialized?
b) I'm not understanding that code at all. The "proportional" term
in a PID controller is supposed to be proportional to the (desired -
actual) difference (aka "the error").
But estimate_pid_constants() appears to be setting the
"proportional" term to be proportional to 1/error!
Maybe a description of local `temperature_threshold' would help
clue me in.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] thermal: avoid division by zero in power allocator
2015-09-29 20:33 ` Andrew Morton
@ 2015-10-01 10:17 ` Javi Merino
0 siblings, 0 replies; 4+ messages in thread
From: Javi Merino @ 2015-10-01 10:17 UTC (permalink / raw)
To: Andrew Morton
Cc: Andrea Arcangeli, linux-kernel@vger.kernel.org,
linux-pm@vger.kernel.org, Zhang Rui, Eduardo Valentin,
Daniel Kurtz
On Tue, Sep 29, 2015 at 09:33:30PM +0100, Andrew Morton wrote:
> On Mon, 28 Sep 2015 23:28:34 +0200 Andrea Arcangeli <aarcange@redhat.com> wrote:
>
> > During boot I get a div by zero Oops regression starting in v4.3-rc3.
> >
> > ...
> >
> > --- a/drivers/thermal/power_allocator.c
> > +++ b/drivers/thermal/power_allocator.c
> > @@ -144,6 +144,16 @@ static void estimate_pid_constants(struct thermal_zone_device *tz,
> > switch_on_temp = 0;
> >
> > temperature_threshold = control_temp - switch_on_temp;
> > + /*
> > + * estimate_pid_constants() tries to find appropriate default
> > + * values for thermal zones that don't provide them. If a
> > + * system integrator has configured a thermal zone with two
> > + * passive trip points at the same temperature, that person
> > + * hasn't put any effort to set up the thermal zone properly
> > + * so just give up.
> > + */
> > + if (!temperature_threshold)
> > + return;
> >
> > if (!tz->tzp->k_po || force)
> > tz->tzp->k_po = int_to_frac(sustainable_power) /
>
> a) Are we sure this won't leave tz->tzp fields uninitialized?
They will be all zeros. That's good enough.
> b) I'm not understanding that code at all. The "proportional" term
> in a PID controller is supposed to be proportional to the (desired -
> actual) difference (aka "the error").
>
> But estimate_pid_constants() appears to be setting the
> "proportional" term to be proportional to 1/error!
estimate_pid_constants() calculate the constants that you use in the
PID algorithm. Say:
k_p * error + k_i * integral_of_error + k_d * diff_of_error
This code is calculating a reasonable k_p, k_i and k_d when they are
not provided by the platform.
> Maybe a description of local `temperature_threshold' would help
> clue me in.
The `error' in the above definition is:
target_temperature - current_temperature
whereas `temperature_threshold' is:
`target_temperature' - `switch_on_temperature'
`switch_on_temperature' is the temperature above which the thermal
governor starts operating and throttling cpus (or whatever cooling
device is configured).
The `switch_on_temperature' and `target_temperature' are defined using
trip points. A platform that sets two trip points to the same
temperature is not properly configured. With Andrea's patch we
provide degraded behavior instead of crashing. I agree with that
approach (hence my Reviewed-by, maybe it should be an Acked-by?).
Cheers,
Javi
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2015-10-01 10:17 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-28 21:28 [PATCH] thermal: avoid division by zero in power allocator Andrea Arcangeli
2015-09-28 21:28 ` Andrea Arcangeli
2015-09-29 20:33 ` Andrew Morton
2015-10-01 10:17 ` Javi Merino
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).