public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC] the generic thermal layer enhancement
@ 2012-05-30  8:49 Zhang Rui
  2012-05-30  8:51 ` [linux-pm] " Zhang Rui
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Zhang Rui @ 2012-05-30  8:49 UTC (permalink / raw)
  To: Matthew Garrett, Rafael J. Wysocki, Jean Delvare, Brown, Len
  Cc: linux-acpi@vger.kernel.org, linux-pm

Hi, all,

It is great to see more and more users of the generic thermal layer.
But as we know, the original design of the generic thermal layer comes
from ACPI thermal management, and some of its implementation seems to be
too ACPI specific nowadays.

Recently I'm thinking of enhance the generic thermal layer so that it
works well for more platforms.

Below are some thoughts of mine, after reading the patches from Amit
Daniel Kachhap, and ACPI 3.0 thermal model. Actually, I have started
coding some RFC patches. But I do really want to get feedback from you
before going on.

G1. supporting multiple cooling states for active cooling devices.

    The current active cooling device supports two cooling states only,
    please refer to the code below, in driver/thermal/thermal_sys.c
                case THERMAL_TRIP_ACTIVE:
                        ...
                        if (temp >= trip_temp)
                                cdev->ops->set_cur_state(cdev, 1);
                        else
                                cdev->ops->set_cur_state(cdev, 0);
                        break;

    This is an ACPI specific thing, as our ACPI FAN used to support
    ON/OFF only.
    I think it is reasonable to support multiple active cooling states
    as they are common on many platforms, and note that this is also
    true for ACPI 3.0 FAN device (_FPS).

G2. introduce cooling states range for a certain trip point

    This problem comes with the first one.
    If the cooling devices support multiple cooling states, and surely
    we may want only several cooling states for a certain trip point,
    and other cooling states for other active trip points.
    To do this, we should be able to describe the cooling device
    behavior for a certain trip point, rather than for the entire
    thermal zone.

G3. kernel thermal passive cooling algorithm

    Currently, tc1 and tc2 are hard requirements for kernel passive
    cooling. But non-ACPI platforms do not have this information
    (please correct me if I'm wrong).
    Say, for the patches here
    http://marc.info/?l=linux-acpi&m=133681581305341&w=2
    They just want to slow down the processor when current temperature
    is higher than the trip point and speed up the processor when the
    temperature is lower than the trip point.

    According to Matthew, the platform drivers are responsible to
    provide proper tc1 and tc2 values to use kernel passive cooling.
    But I'm just wondering if we can use something instead.
    Say, introduce .get_trend() in thermal_zone_device_ops.
    And we set cur_state++ or cur_state-- based on the value returned
    by .get_trend(), instead of using tc1 and tc2.

G4. Multiple passive trip points

    I get this idea also from the patches at
    http://marc.info/?l=linux-acpi&m=133681581305341&w=2

    IMO, they want to get an acceptable performance at a tolerable
    temperature.
    Say, a platform with four P-states. P3 is really low.
    And I'm okay with the temperature at 60C, but 80C? No.
    With G2 resolved, we can use processor P0~P2 for Passive trip point
    0 (50C), and P3 for Passive trip point 1 (70C). And then the
    temperature may be jumping at around 60C or even 65C, without
    entering P3.

    Further more, IMO, this also works for ACPI platforms.
    Say, we can easily change p-state to cool the system, but using
    t-state is definitely what we do not want to see. The current
    implementation does not expose this difference to the generic
    thermal layer, but if we can have two passive trip points, and use
    p-state for the first one only... (this works if we start polling
    after entering passive cooling mode, without hardware notification)

G5. unify active cooling and passive cooling code

    If G4 and G5 are resolved, a new problem to me is that there is no
    difference between passive cooling and active cooling except the
    cooling policy.
    Then we can share the same code for both active and passive cooling.
    maybe something like:

    case THERMAL_TRIP_ACTIVE:
    case THERMAL_TRIP_PASSIVE:
         ...
         tz->ops->get_trend();
         if (trend == HEATING)
                 cdev->ops->set_cur_state(cdev, cur_state++);
         else if (trend == COOLING)
                 cdev->ops->set_cur_state(cdev, cur_state--);
         break;

Here are the gaps in my point of view, I'd like to get your ideas about
which are reasonable and which are not.

Any comments are appreciated! Thanks!

-rui

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-pm] [RFC] the generic thermal layer enhancement
  2012-05-30  8:49 [RFC] the generic thermal layer enhancement Zhang Rui
@ 2012-05-30  8:51 ` Zhang Rui
  2012-05-30 10:30   ` Eduardo Valentin
  2012-05-30 10:44 ` [linux-pm] " R, Durgadoss
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 22+ messages in thread
From: Zhang Rui @ 2012-05-30  8:51 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Rafael J. Wysocki, Jean Delvare, Brown, Len,
	linux-acpi@vger.kernel.org, linux-pm, Amit Kachhap

On 三, 2012-05-30 at 16:49 +0800, Zhang Rui wrote:
> Hi, all,
> 
> It is great to see more and more users of the generic thermal layer.
> But as we know, the original design of the generic thermal layer comes
> from ACPI thermal management, and some of its implementation seems to be
> too ACPI specific nowadays.
> 
> Recently I'm thinking of enhance the generic thermal layer so that it
> works well for more platforms.
> 
> Below are some thoughts of mine, after reading the patches from Amit
> Daniel Kachhap, and ACPI 3.0 thermal model. Actually, I have started
> coding some RFC patches. But I do really want to get feedback from you
> before going on.
> 
> G1. supporting multiple cooling states for active cooling devices.
> 
>     The current active cooling device supports two cooling states only,
>     please refer to the code below, in driver/thermal/thermal_sys.c
>                 case THERMAL_TRIP_ACTIVE:
>                         ...
>                         if (temp >= trip_temp)
>                                 cdev->ops->set_cur_state(cdev, 1);
>                         else
>                                 cdev->ops->set_cur_state(cdev, 0);
>                         break;
> 
>     This is an ACPI specific thing, as our ACPI FAN used to support
>     ON/OFF only.
>     I think it is reasonable to support multiple active cooling states
>     as they are common on many platforms, and note that this is also
>     true for ACPI 3.0 FAN device (_FPS).
> 
> G2. introduce cooling states range for a certain trip point
> 
>     This problem comes with the first one.
>     If the cooling devices support multiple cooling states, and surely
>     we may want only several cooling states for a certain trip point,
>     and other cooling states for other active trip points.
>     To do this, we should be able to describe the cooling device
>     behavior for a certain trip point, rather than for the entire
>     thermal zone.
> 
> G3. kernel thermal passive cooling algorithm
> 
>     Currently, tc1 and tc2 are hard requirements for kernel passive
>     cooling. But non-ACPI platforms do not have this information
>     (please correct me if I'm wrong).
>     Say, for the patches here
>     http://marc.info/?l=linux-acpi&m=133681581305341&w=2

Sorry, forgot to cc Amit, the author of this patch set.

thanks,
rui
>     They just want to slow down the processor when current temperature
>     is higher than the trip point and speed up the processor when the
>     temperature is lower than the trip point.
> 
>     According to Matthew, the platform drivers are responsible to
>     provide proper tc1 and tc2 values to use kernel passive cooling.
>     But I'm just wondering if we can use something instead.
>     Say, introduce .get_trend() in thermal_zone_device_ops.
>     And we set cur_state++ or cur_state-- based on the value returned
>     by .get_trend(), instead of using tc1 and tc2.
> 
> G4. Multiple passive trip points
> 
>     I get this idea also from the patches at
>     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> 
>     IMO, they want to get an acceptable performance at a tolerable
>     temperature.
>     Say, a platform with four P-states. P3 is really low.
>     And I'm okay with the temperature at 60C, but 80C? No.
>     With G2 resolved, we can use processor P0~P2 for Passive trip point
>     0 (50C), and P3 for Passive trip point 1 (70C). And then the
>     temperature may be jumping at around 60C or even 65C, without
>     entering P3.
> 
>     Further more, IMO, this also works for ACPI platforms.
>     Say, we can easily change p-state to cool the system, but using
>     t-state is definitely what we do not want to see. The current
>     implementation does not expose this difference to the generic
>     thermal layer, but if we can have two passive trip points, and use
>     p-state for the first one only... (this works if we start polling
>     after entering passive cooling mode, without hardware notification)
> 
> G5. unify active cooling and passive cooling code
> 
>     If G4 and G5 are resolved, a new problem to me is that there is no
>     difference between passive cooling and active cooling except the
>     cooling policy.
>     Then we can share the same code for both active and passive cooling.
>     maybe something like:
> 
>     case THERMAL_TRIP_ACTIVE:
>     case THERMAL_TRIP_PASSIVE:
>          ...
>          tz->ops->get_trend();
>          if (trend == HEATING)
>                  cdev->ops->set_cur_state(cdev, cur_state++);
>          else if (trend == COOLING)
>                  cdev->ops->set_cur_state(cdev, cur_state--);
>          break;
> 
> Here are the gaps in my point of view, I'd like to get your ideas about
> which are reasonable and which are not.
> 
> Any comments are appreciated! Thanks!
> 
> -rui
> 
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/linux-pm


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-pm] [RFC] the generic thermal layer enhancement
  2012-05-30  8:51 ` [linux-pm] " Zhang Rui
@ 2012-05-30 10:30   ` Eduardo Valentin
  2012-05-30 11:05     ` R, Durgadoss
                       ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Eduardo Valentin @ 2012-05-30 10:30 UTC (permalink / raw)
  To: Zhang Rui
  Cc: Matthew Garrett, Brown, Len, amit.kachhap, Jean Delvare,
	linux-acpi@vger.kernel.org, linux-pm

Hello Rui,

Now I copied Amit, for real :-)

I like your proposal, some comments as follow.

On Wed, May 30, 2012 at 04:51:00PM +0800, Zhang Rui wrote:
> On 三, 2012-05-30 at 16:49 +0800, Zhang Rui wrote:
> > Hi, all,
> > 
> > It is great to see more and more users of the generic thermal layer.
> > But as we know, the original design of the generic thermal layer comes
> > from ACPI thermal management, and some of its implementation seems to be
> > too ACPI specific nowadays.


Good. We have also basic OMAP support, on top of Amit's work. I sent
recently a very basic support.  I will be pushing them, while they evolve.

> > 
> > Recently I'm thinking of enhance the generic thermal layer so that it
> > works well for more platforms.

As you said, for non-ACPI support, the "generic" layer needs some
extension and refactoring to be more generic :-).

> > 
> > Below are some thoughts of mine, after reading the patches from Amit
> > Daniel Kachhap, and ACPI 3.0 thermal model. Actually, I have started
> > coding some RFC patches. But I do really want to get feedback from you
> > before going on.

OK.

> > 
> > G1. supporting multiple cooling states for active cooling devices.
> > 
> >     The current active cooling device supports two cooling states only,
> >     please refer to the code below, in driver/thermal/thermal_sys.c
> >                 case THERMAL_TRIP_ACTIVE:
> >                         ...
> >                         if (temp >= trip_temp)
> >                                 cdev->ops->set_cur_state(cdev, 1);
> >                         else
> >                                 cdev->ops->set_cur_state(cdev, 0);
> >                         break;
> > 
> >     This is an ACPI specific thing, as our ACPI FAN used to support
> >     ON/OFF only.
> >     I think it is reasonable to support multiple active cooling states
> >     as they are common on many platforms, and note that this is also
> >     true for ACPI 3.0 FAN device (_FPS).
> > 
> > G2. introduce cooling states range for a certain trip point
> > 
> >     This problem comes with the first one.
> >     If the cooling devices support multiple cooling states, and surely
> >     we may want only several cooling states for a certain trip point,
> >     and other cooling states for other active trip points.
> >     To do this, we should be able to describe the cooling device
> >     behavior for a certain trip point, rather than for the entire
> >     thermal zone.

For G1+G2, I agree with your proposal. I had some discussion with Amit
regarding this. In his series of patches we increase / decrease the cooling
device state linearly and steadily.

But if we would have what you are saying, we could bind cooling device
set of states with trip points.

> > 
> > G3. kernel thermal passive cooling algorithm
> > 
> >     Currently, tc1 and tc2 are hard requirements for kernel passive
> >     cooling. But non-ACPI platforms do not have this information
> >     (please correct me if I'm wrong).
> >     Say, for the patches here
> >     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> 
> Sorry, forgot to cc Amit, the author of this patch set.
> 
> thanks,
> rui
> >     They just want to slow down the processor when current temperature
> >     is higher than the trip point and speed up the processor when the
> >     temperature is lower than the trip point.
> > 
> >     According to Matthew, the platform drivers are responsible to
> >     provide proper tc1 and tc2 values to use kernel passive cooling.
> >     But I'm just wondering if we can use something instead.
> >     Say, introduce .get_trend() in thermal_zone_device_ops.
> >     And we set cur_state++ or cur_state-- based on the value returned
> >     by .get_trend(), instead of using tc1 and tc2.

I fully support this option and could cook up something on this.
The TC1 and TC2 should go inside the .get_trend() callbacks for ACPI.
Should probably go away from the registration function that we have
currently.

We could have generic trending computation though. Based on timestamping
and temperature reads, and make it available for zones that want to used it.

> > 
> > G4. Multiple passive trip points
> > 
> >     I get this idea also from the patches at
> >     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> > 
> >     IMO, they want to get an acceptable performance at a tolerable
> >     temperature.
> >     Say, a platform with four P-states. P3 is really low.
> >     And I'm okay with the temperature at 60C, but 80C? No.
> >     With G2 resolved, we can use processor P0~P2 for Passive trip point
> >     0 (50C), and P3 for Passive trip point 1 (70C). And then the
> >     temperature may be jumping at around 60C or even 65C, without
> >     entering P3.

Yeah, I guess we need to solve G1+G2 first to allow this. But I also agree
that ideally, there should be possibility to have multiple passive trip points.

> > 
> >     Further more, IMO, this also works for ACPI platforms.
> >     Say, we can easily change p-state to cool the system, but using
> >     t-state is definitely what we do not want to see. The current
> >     implementation does not expose this difference to the generic
> >     thermal layer, but if we can have two passive trip points, and use
> >     p-state for the first one only... (this works if we start polling
> >     after entering passive cooling mode, without hardware notification)
> > 
> > G5. unify active cooling and passive cooling code
> > 
> >     If G4 and G5 are resolved, a new problem to me is that there is no
> >     difference between passive cooling and active cooling except the
> >     cooling policy.

OK...

> >     Then we can share the same code for both active and passive cooling.
> >     maybe something like:
> > 
> >     case THERMAL_TRIP_ACTIVE:
> >     case THERMAL_TRIP_PASSIVE:
> >          ...
> >          tz->ops->get_trend();

Would the get_trend take into account if we are cooling with active or passive
cooling device?

> >          if (trend == HEATING)
> >                  cdev->ops->set_cur_state(cdev, cur_state++);
> >          else if (trend == COOLING)
> >                  cdev->ops->set_cur_state(cdev, cur_state--);
> >          break;

I believe we should have something for temperature stabilization there as well.

Besides, if we go with this generic policy, then the zone update would be much
simpler no?

> > 
> > Here are the gaps in my point of view, I'd like to get your ideas about
> > which are reasonable and which are not.

Here are some other thoughts:
G6. Another point is, would it make sense to allow for policy extension? Meaning,
the zone update would call a callback to request for update from the zone
device driver?

G7. How do we solve cooling devices being shared between different thermal zones?
Should we have a better cooling device constraint management?

G8. On same topic as G7, how are we currently making sure that thermal constraints
don't get overwritten by, let's say, userspace APIs? I guess the generic CPU cooling
propose by Amit suffers of an issue. If user sets cpufreq governor to userspace
and sets the frequency to its maximum, say in a busy loop, the thermal cooling
could potentially be ruined.

G9. Is there any possibility to have multiple sensors per thermal zone?

G10. Do we want to represent other sensing stimuli other that temperature? Say,
current sensing?

G11. Do we want to allow for cross zoning policies? Sometimes a policy may use
temperature from different thermal zone in order to better represent what
is going on in its thermal zone.

> > 
> > Any comments are appreciated! Thanks!


Thanks to you for starting this up! The above are points that come to my mind now.
I will keep updating the list if something else come to my mind.

> > 
> > -rui
> > 
> > _______________________________________________
> > linux-pm mailing list
> > linux-pm@lists.linux-foundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/linux-pm
> 
> 
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/linux-pm

All best,

---
Eduardo Valentin
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [linux-pm] [RFC] the generic thermal layer enhancement
  2012-05-30  8:49 [RFC] the generic thermal layer enhancement Zhang Rui
  2012-05-30  8:51 ` [linux-pm] " Zhang Rui
@ 2012-05-30 10:44 ` R, Durgadoss
  2012-05-31  3:15   ` Zhang Rui
  2012-05-30 12:50 ` Matthew Garrett
  2012-05-31  4:59 ` Amit Kachhap
  3 siblings, 1 reply; 22+ messages in thread
From: R, Durgadoss @ 2012-05-30 10:44 UTC (permalink / raw)
  To: Zhang, Rui, Matthew Garrett, Rafael J. Wysocki, Jean Delvare,
	Brown, Len
  Cc: linux-acpi@vger.kernel.org, linux-pm

Hi Rui,

> G1. supporting multiple cooling states for active cooling devices.
> 
>     The current active cooling device supports two cooling states only,
>     please refer to the code below, in driver/thermal/thermal_sys.c
>                 case THERMAL_TRIP_ACTIVE:
>                         ...
>                         if (temp >= trip_temp)
>                                 cdev->ops->set_cur_state(cdev, 1);
>                         else
>                                 cdev->ops->set_cur_state(cdev, 0);
>                         break;
> 
>     This is an ACPI specific thing, as our ACPI FAN used to support
>     ON/OFF only.
>     I think it is reasonable to support multiple active cooling states
>     as they are common on many platforms, and note that this is also
>     true for ACPI 3.0 FAN device (_FPS).
> 
> G2. introduce cooling states range for a certain trip point
> 
>     This problem comes with the first one.
>     If the cooling devices support multiple cooling states, and surely
>     we may want only several cooling states for a certain trip point,
>     and other cooling states for other active trip points.
>     To do this, we should be able to describe the cooling device
>     behavior for a certain trip point, rather than for the entire
>     thermal zone.
> 
> G3. kernel thermal passive cooling algorithm
> 
>     Currently, tc1 and tc2 are hard requirements for kernel passive
>     cooling. But non-ACPI platforms do not have this information
>     (please correct me if I'm wrong).
>     Say, for the patches here
>     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
>     They just want to slow down the processor when current temperature
>     is higher than the trip point and speed up the processor when the
>     temperature is lower than the trip point.
> 
>     According to Matthew, the platform drivers are responsible to
>     provide proper tc1 and tc2 values to use kernel passive cooling.
>     But I'm just wondering if we can use something instead.
>     Say, introduce .get_trend() in thermal_zone_device_ops.
>     And we set cur_state++ or cur_state-- based on the value returned
>     by .get_trend(), instead of using tc1 and tc2.

Yes we should do that. I would also like to remove these values from the
registration API's. That makes the registration code way more simpler.
Right now, I do not see any driver using thermal_zone_device_register
with values other than 0's for these.

> 
> G4. Multiple passive trip points
> 
>     I get this idea also from the patches at
>     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> 
>     IMO, they want to get an acceptable performance at a tolerable
>     temperature.
>     Say, a platform with four P-states. P3 is really low.
>     And I'm okay with the temperature at 60C, but 80C? No.
>     With G2 resolved, we can use processor P0~P2 for Passive trip point
>     0 (50C), and P3 for Passive trip point 1 (70C). And then the
>     temperature may be jumping at around 60C or even 65C, without
>     entering P3.
> 
>     Further more, IMO, this also works for ACPI platforms.
>     Say, we can easily change p-state to cool the system, but using
>     t-state is definitely what we do not want to see. The current
>     implementation does not expose this difference to the generic
>     thermal layer, but if we can have two passive trip points, and use
>     p-state for the first one only... (this works if we start polling
>     after entering passive cooling mode, without hardware notification)
> 
> G5. unify active cooling and passive cooling code
> 
>     If G4 and G5 are resolved, a new problem to me is that there is no
>     difference between passive cooling and active cooling except the
>     cooling policy.

Correct...

>     Then we can share the same code for both active and passive cooling.
>     maybe something like:

Or just use TRIP_PASSIVE...

> 
>     case THERMAL_TRIP_ACTIVE:
>     case THERMAL_TRIP_PASSIVE:
>          ...
>          tz->ops->get_trend();
>          if (trend == HEATING)
>                  cdev->ops->set_cur_state(cdev, cur_state++);
>          else if (trend == COOLING)
>                  cdev->ops->set_cur_state(cdev, cur_state--);
>          break;

I agree with you here...

Thanks,
Durga


^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [linux-pm] [RFC] the generic thermal layer enhancement
  2012-05-30 10:30   ` Eduardo Valentin
@ 2012-05-30 11:05     ` R, Durgadoss
  2012-05-30 11:17       ` Eduardo Valentin
  2012-05-31  3:27       ` Zhang Rui
  2012-05-31  2:20     ` [linux-pm] " Zhang Rui
  2012-05-31  5:16     ` Amit Kachhap
  2 siblings, 2 replies; 22+ messages in thread
From: R, Durgadoss @ 2012-05-30 11:05 UTC (permalink / raw)
  To: eduardo.valentin@ti.com, Zhang, Rui
  Cc: Matthew Garrett, Brown, Len, amit.kachhap@linaro.org,
	Jean Delvare, linux-acpi@vger.kernel.org, linux-pm

Hi Eduardo,

> 
> For G1+G2, I agree with your proposal. I had some discussion with Amit
> regarding this. In his series of patches we increase / decrease the cooling
> device state linearly and steadily.
> 
> But if we would have what you are saying, we could bind cooling device
> set of states with trip points.

True, We want to bind the levels of cooling with the trips points a thermal zone has.
But we might not get a 1-1 mapping always.

> 
> I fully support this option and could cook up something on this.
> The TC1 and TC2 should go inside the .get_trend() callbacks for ACPI.
> Should probably go away from the registration function that we have
> currently.

I realize I just said the same thing :-)

> 
> We could have generic trending computation though. Based on timestamping
> and temperature reads, and make it available for zones that want to used it.

Agree, but I would like this go into the platform thermal drivers. And then when
those drivers notify the framework they can specify the trend also. This sort of
notification is not there, but that is what I am implementing these days..
Hope to submit this patch in a week's time..

> > >     case THERMAL_TRIP_ACTIVE:
> > >     case THERMAL_TRIP_PASSIVE:
> > >          ...
> > >          tz->ops->get_trend();
> 
> Would the get_trend take into account if we are cooling with active or passive
> cooling device?

To me, it does not matter. It is up to the framework to decide and throttle,
the respective cooling devices according to the trend.

> 
> > >          if (trend == HEATING)
> > >                  cdev->ops->set_cur_state(cdev, cur_state++);
> > >          else if (trend == COOLING)
> > >                  cdev->ops->set_cur_state(cdev, cur_state--);
> > >          break;
> 
> I believe we should have something for temperature stabilization there as well.
> 
> Besides, if we go with this generic policy, then the zone update would be much
> simpler no?

Yes, and that’s what we want too  :-)

> Here are some other thoughts:
> G6. Another point is, would it make sense to allow for policy extension? Meaning,
> the zone update would call a callback to request for update from the zone
> device driver?
> 
> G7. How do we solve cooling devices being shared between different thermal
> zones?
> Should we have a better cooling device constraint management?

This is another thing that was haunting me for quite some time.
And What I have in mind is a mapping kind of thing in the platform layer,
that will provide details about which cooling device is shared with whom.
The framework can then use this and figure out the association among various devices.
I am testing it out, and will submit once it comes to a good shape.

Thanks,
Durga


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] the generic thermal layer enhancement
  2012-05-30 11:05     ` R, Durgadoss
@ 2012-05-30 11:17       ` Eduardo Valentin
  2012-05-31  3:32         ` [linux-pm] " Zhang Rui
  2012-05-31  3:27       ` Zhang Rui
  1 sibling, 1 reply; 22+ messages in thread
From: Eduardo Valentin @ 2012-05-30 11:17 UTC (permalink / raw)
  To: R, Durgadoss
  Cc: Jean Delvare, Brown, Len, linux-acpi@vger.kernel.org, linux-pm

Hello Durga,

On Wed, May 30, 2012 at 11:05:18AM +0000, R, Durgadoss wrote:
> Hi Eduardo,
> 
> > 
> > For G1+G2, I agree with your proposal. I had some discussion with Amit
> > regarding this. In his series of patches we increase / decrease the cooling
> > device state linearly and steadily.
> > 
> > But if we would have what you are saying, we could bind cooling device
> > set of states with trip points.
> 
> True, We want to bind the levels of cooling with the trips points a thermal zone has.
> But we might not get a 1-1 mapping always.

Just to make sure we are all taking the same thing.

In this case a cooling device would have 1-N states. And this set could
be partitioned and each partition would be assigned to a specific trip point
of a thermal zone, right?



> 
> > 
> > I fully support this option and could cook up something on this.
> > The TC1 and TC2 should go inside the .get_trend() callbacks for ACPI.
> > Should probably go away from the registration function that we have
> > currently.
> 
> I realize I just said the same thing :-)

Cool :-)

> 
> > 
> > We could have generic trending computation though. Based on timestamping
> > and temperature reads, and make it available for zones that want to used it.
> 
> Agree, but I would like this go into the platform thermal drivers. And then when
> those drivers notify the framework they can specify the trend also. This sort of
> notification is not there, but that is what I am implementing these days..
> Hope to submit this patch in a week's time..

Nice, I actually have something being cooked for the same thing. We should probably
align to avoid work duplication...

> 
> > > >     case THERMAL_TRIP_ACTIVE:
> > > >     case THERMAL_TRIP_PASSIVE:
> > > >          ...
> > > >          tz->ops->get_trend();
> > 
> > Would the get_trend take into account if we are cooling with active or passive
> > cooling device?
> 
> To me, it does not matter. It is up to the framework to decide and throttle,
> the respective cooling devices according to the trend.

OK. For me it doesn't really matter as well. Having a simplified zone update is better.

> 
> > 
> > > >          if (trend == HEATING)
> > > >                  cdev->ops->set_cur_state(cdev, cur_state++);
> > > >          else if (trend == COOLING)
> > > >                  cdev->ops->set_cur_state(cdev, cur_state--);
> > > >          break;
> > 
> > I believe we should have something for temperature stabilization there as well.
> > 
> > Besides, if we go with this generic policy, then the zone update would be much
> > simpler no?
> 
> Yes, and that’s what we want too  :-)

Nice!

> 
> > Here are some other thoughts:
> > G6. Another point is, would it make sense to allow for policy extension? Meaning,
> > the zone update would call a callback to request for update from the zone
> > device driver?
> > 
> > G7. How do we solve cooling devices being shared between different thermal
> > zones?
> > Should we have a better cooling device constraint management?
> 
> This is another thing that was haunting me for quite some time.
> And What I have in mind is a mapping kind of thing in the platform layer,
> that will provide details about which cooling device is shared with whom.
> The framework can then use this and figure out the association among various devices.
> I am testing it out, and will submit once it comes to a good shape.

Right, I am not sure we want to go in this direction?

Maybe a better way would be to have sort of pm/thermal contraint framework, which
would map these per device, at LDM level?

I am copying Jean-Pihet, he has been working in this front. Jean, any thoughts?

> 
> Thanks,
> Durga
> 

All Best,

--
Eduardo
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] the generic thermal layer enhancement
  2012-05-30  8:49 [RFC] the generic thermal layer enhancement Zhang Rui
  2012-05-30  8:51 ` [linux-pm] " Zhang Rui
  2012-05-30 10:44 ` [linux-pm] " R, Durgadoss
@ 2012-05-30 12:50 ` Matthew Garrett
  2012-05-31  3:54   ` Zhang Rui
  2012-05-31  4:59 ` Amit Kachhap
  3 siblings, 1 reply; 22+ messages in thread
From: Matthew Garrett @ 2012-05-30 12:50 UTC (permalink / raw)
  To: Zhang Rui
  Cc: Rafael J. Wysocki, Jean Delvare, Brown, Len, linux-pm,
	linux-acpi@vger.kernel.org, amit.kachhap

On Wed, May 30, 2012 at 04:49:02PM +0800, Zhang Rui wrote:

> G1. supporting multiple cooling states for active cooling devices.

Sounds fine.

> G2. introduce cooling states range for a certain trip point

Again, no problem here.

> G3. kernel thermal passive cooling algorithm

>     Currently, tc1 and tc2 are hard requirements for kernel passive
>     cooling. But non-ACPI platforms do not have this information
>     (please correct me if I'm wrong).
>     Say, for the patches here
>     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
>     They just want to slow down the processor when current temperature
>     is higher than the trip point and speed up the processor when the
>     temperature is lower than the trip point.

Just slowing the CPU down above a certain temperature is a pretty awful 
approach to passive cooling. You want to maximise performance while 
staying within your thermal envelope, so you need a more advanced 
approach. The existing algorithm provides a generic mechanism for 
balancing performance and thermal output, with the only requirement 
being that the platform provide constants that represent the heating and 
cooling properties of the system.

I don't fundamentally object to adding support for platform-specific 
passive formulae, but I'd like an explanation for how they're better 
than the existing one.

> G4. Multiple passive trip points

It would be good to have an explanation of the use case here. If it's 
acceptable for the device to be at the lower passive trip point, why are 
we slowing it down at all?

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-pm] [RFC] the generic thermal layer enhancement
  2012-05-30 10:30   ` Eduardo Valentin
  2012-05-30 11:05     ` R, Durgadoss
@ 2012-05-31  2:20     ` Zhang Rui
  2012-05-31  5:16     ` Amit Kachhap
  2 siblings, 0 replies; 22+ messages in thread
From: Zhang Rui @ 2012-05-31  2:20 UTC (permalink / raw)
  To: eduardo.valentin
  Cc: Matthew Garrett, Brown, Len, amit.kachhap, Jean Delvare,
	linux-acpi@vger.kernel.org, linux-pm

Hi, Eduardo,

On 三, 2012-05-30 at 13:30 +0300, Eduardo Valentin wrote:
> > > 
> > > G1. supporting multiple cooling states for active cooling devices.
> > > 
> > >     The current active cooling device supports two cooling states only,
> > >     please refer to the code below, in driver/thermal/thermal_sys.c
> > >                 case THERMAL_TRIP_ACTIVE:
> > >                         ...
> > >                         if (temp >= trip_temp)
> > >                                 cdev->ops->set_cur_state(cdev, 1);
> > >                         else
> > >                                 cdev->ops->set_cur_state(cdev, 0);
> > >                         break;
> > > 
> > >     This is an ACPI specific thing, as our ACPI FAN used to support
> > >     ON/OFF only.
> > >     I think it is reasonable to support multiple active cooling states
> > >     as they are common on many platforms, and note that this is also
> > >     true for ACPI 3.0 FAN device (_FPS).
> > > 
> > > G2. introduce cooling states range for a certain trip point
> > > 
> > >     This problem comes with the first one.
> > >     If the cooling devices support multiple cooling states, and surely
> > >     we may want only several cooling states for a certain trip point,
> > >     and other cooling states for other active trip points.
> > >     To do this, we should be able to describe the cooling device
> > >     behavior for a certain trip point, rather than for the entire
> > >     thermal zone.
> 
> For G1+G2, I agree with your proposal. I had some discussion with Amit
> regarding this. In his series of patches we increase / decrease the cooling
> device state linearly and steadily.
> 
> But if we would have what you are saying, we could bind cooling device
> set of states with trip points.
> 
Great.
But further more, IMO, with G3/G4 solved, this patch set can use passive
trip points instead of active trip points for processors, right?

> > > 
> > > G3. kernel thermal passive cooling algorithm
> > > 
> > >     Currently, tc1 and tc2 are hard requirements for kernel passive
> > >     cooling. But non-ACPI platforms do not have this information
> > >     (please correct me if I'm wrong).
> > >     Say, for the patches here
> > >     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> > 
> > Sorry, forgot to cc Amit, the author of this patch set.
> > 
> > thanks,
> > rui
> > >     They just want to slow down the processor when current temperature
> > >     is higher than the trip point and speed up the processor when the
> > >     temperature is lower than the trip point.
> > > 
> > >     According to Matthew, the platform drivers are responsible to
> > >     provide proper tc1 and tc2 values to use kernel passive cooling.
> > >     But I'm just wondering if we can use something instead.
> > >     Say, introduce .get_trend() in thermal_zone_device_ops.
> > >     And we set cur_state++ or cur_state-- based on the value returned
> > >     by .get_trend(), instead of using tc1 and tc2.
> 
> I fully support this option and could cook up something on this.
> The TC1 and TC2 should go inside the .get_trend() callbacks for ACPI.
> Should probably go away from the registration function that we have
> currently.
> 
> We could have generic trending computation though. Based on timestamping
> and temperature reads, and make it available for zones that want to used it.
> 
Agreed.
I already have this patch in hand.

> > > 
> > > G4. Multiple passive trip points
> > > 
> > >     I get this idea also from the patches at
> > >     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> > > 
> > >     IMO, they want to get an acceptable performance at a tolerable
> > >     temperature.
> > >     Say, a platform with four P-states. P3 is really low.
> > >     And I'm okay with the temperature at 60C, but 80C? No.
> > >     With G2 resolved, we can use processor P0~P2 for Passive trip point
> > >     0 (50C), and P3 for Passive trip point 1 (70C). And then the
> > >     temperature may be jumping at around 60C or even 65C, without
> > >     entering P3.
> 
> Yeah, I guess we need to solve G1+G2 first to allow this.

yes.

>  But I also agree
> that ideally, there should be possibility to have multiple passive trip points.
> 

I think this is still an open question for Amit.

As I mentioned above, the idea comes from Amit's patch set, and what I'm
doing here is to try to make the passive cooling implementation works
well on exynos platforms.
As we know, in his patch set, MONITOR_ZONE and WARN_ZONE are mapped to
active trip points. But actually, with G2/G3/G4 solved, they SHOULD be
passive trip points as all the cooling devices are processors, right?

So, my question, and probably this is also the question from Matthew,
would be:
Why we need two passive trip points?
What if we use all p-state for MONITOR_ZONE and discard the WARN_ZONE?

I'd like to see Amit's explanation on this. :)

> > > 
> > >     Further more, IMO, this also works for ACPI platforms.
> > >     Say, we can easily change p-state to cool the system, but using
> > >     t-state is definitely what we do not want to see. The current
> > >     implementation does not expose this difference to the generic
> > >     thermal layer, but if we can have two passive trip points, and use
> > >     p-state for the first one only... (this works if we start polling
> > >     after entering passive cooling mode, without hardware notification)
> > > 
> > > G5. unify active cooling and passive cooling code
> > > 
> > >     If G4 and G5 are resolved, a new problem to me is that there is no
> > >     difference between passive cooling and active cooling except the
> > >     cooling policy.
> 
> OK...
> 
> > >     Then we can share the same code for both active and passive cooling.
> > >     maybe something like:
> > > 
> > >     case THERMAL_TRIP_ACTIVE:
> > >     case THERMAL_TRIP_PASSIVE:
> > >          ...
> > >          tz->ops->get_trend();
> 
> Would the get_trend take into account if we are cooling with active or passive
> cooling device?
> 
YES.
We need this as a hint for active/passive cooling management.
But how it actually works depends on the implementation of
the .get_trend callback in the platform driver.

Say, for active trip points,
.get_trend can always return HEATING, so that fan will start to run
faster and faster when the temperature keeps higher than the trip point.
Or it can return COOLING when cur_temperature < last_temperature, so
that the fan will slow down when the temperature is dropping, but still
higher than the trip point.

> > >          if (trend == HEATING)
> > >                  cdev->ops->set_cur_state(cdev, cur_state++);
> > >          else if (trend == COOLING)
> > >                  cdev->ops->set_cur_state(cdev, cur_state--);
> > >          break;
> 
> I believe we should have something for temperature stabilization there as well.
> 
> Besides, if we go with this generic policy, then the zone update would be much
> simpler no?
> 
Agree with Durga.

> > > 
> > > Here are the gaps in my point of view, I'd like to get your ideas about
> > > which are reasonable and which are not.
> 
> Here are some other thoughts:
> G6. Another point is, would it make sense to allow for policy extension? Meaning,
> the zone update would call a callback to request for update from the zone
> device driver?
> 
I'm not against this idea. But do we really have a request for such a
case?
And BTW, if we really have such a request, I think the first thing to do
is to investigate if this is a general request that should be
implemented in the generic thermal layer, rather than go to the zone
device driver directly?

> G7. How do we solve cooling devices being shared between different thermal zones?
> Should we have a better cooling device constraint management?
> 
Maybe we just need a simple Arbitrator in kernel?

When changing a cooling device state, it can only override the cur_state
value for the current cooling device instance!
And the code should parse all the cooling device instance of this
cooling device, and choose the deepest cooling state.

what do you think?

> G8. On same topic as G7, how are we currently making sure that thermal constraints
> don't get overwritten by, let's say, userspace APIs?

what kind of userspace APIs can change this?

> I guess the generic CPU cooling
> propose by Amit suffers of an issue. If user sets cpufreq governor to userspace
> and sets the frequency to its maximum, say in a busy loop, the thermal cooling
> could potentially be ruined.
> 
that's true, but I don't think that's the case we need to cover.

Setting cpufreq governor to "Userspace" means user needs to be
responsible for the p-state controlling. If the system overheats, that's
the thing that the user needs to worry about.

> G9. Is there any possibility to have multiple sensors per thermal zone?
> 
I do not think we need to consider this situation, until there is such
an requirement.

> G10. Do we want to represent other sensing stimuli other that temperature? Say,
> current sensing?
> 
what do you mean by "current sensing"?

> G11. Do we want to allow for cross zoning policies? Sometimes a policy may use
> temperature from different thermal zone in order to better represent what
> is going on in its thermal zone.
> 
Good question. This bothered me a lot actually.

Each zone device just represents a certain part of the overall system,
and these different zones may interact with each other.
But the current implementation does not take care of this.
Unfortunately I do not have any ideas about this yet. :)

> > > 
> > > Any comments are appreciated! Thanks!
> 
> 
> Thanks to you for starting this up! The above are points that come to my mind now.
> I will keep updating the list if something else come to my mind.
> 
They are all good points, thank you for your valuable feedback, Eduardo!

-rui

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] the generic thermal layer enhancement
  2012-05-30 10:44 ` [linux-pm] " R, Durgadoss
@ 2012-05-31  3:15   ` Zhang Rui
  0 siblings, 0 replies; 22+ messages in thread
From: Zhang Rui @ 2012-05-31  3:15 UTC (permalink / raw)
  To: R, Durgadoss
  Cc: Jean Delvare, Brown, Len, linux-acpi@vger.kernel.org, linux-pm

Hi, Durga,

On 三, 2012-05-30 at 04:44 -0600, R, Durgadoss wrote:
> Hi Rui,
> 
> > G1. supporting multiple cooling states for active cooling devices.
> > 
> >     The current active cooling device supports two cooling states only,
> >     please refer to the code below, in driver/thermal/thermal_sys.c
> >                 case THERMAL_TRIP_ACTIVE:
> >                         ...
> >                         if (temp >= trip_temp)
> >                                 cdev->ops->set_cur_state(cdev, 1);
> >                         else
> >                                 cdev->ops->set_cur_state(cdev, 0);
> >                         break;
> > 
> >     This is an ACPI specific thing, as our ACPI FAN used to support
> >     ON/OFF only.
> >     I think it is reasonable to support multiple active cooling states
> >     as they are common on many platforms, and note that this is also
> >     true for ACPI 3.0 FAN device (_FPS).
> > 
> > G2. introduce cooling states range for a certain trip point
> > 
> >     This problem comes with the first one.
> >     If the cooling devices support multiple cooling states, and surely
> >     we may want only several cooling states for a certain trip point,
> >     and other cooling states for other active trip points.
> >     To do this, we should be able to describe the cooling device
> >     behavior for a certain trip point, rather than for the entire
> >     thermal zone.
> > 
> > G3. kernel thermal passive cooling algorithm
> > 
> >     Currently, tc1 and tc2 are hard requirements for kernel passive
> >     cooling. But non-ACPI platforms do not have this information
> >     (please correct me if I'm wrong).
> >     Say, for the patches here
> >     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> >     They just want to slow down the processor when current temperature
> >     is higher than the trip point and speed up the processor when the
> >     temperature is lower than the trip point.
> > 
> >     According to Matthew, the platform drivers are responsible to
> >     provide proper tc1 and tc2 values to use kernel passive cooling.
> >     But I'm just wondering if we can use something instead.
> >     Say, introduce .get_trend() in thermal_zone_device_ops.
> >     And we set cur_state++ or cur_state-- based on the value returned
> >     by .get_trend(), instead of using tc1 and tc2.
> 
> Yes we should do that. I would also like to remove these values from the
> registration API's. That makes the registration code way more simpler.
> Right now, I do not see any driver using thermal_zone_device_register
> with values other than 0's for these.
> 
Agreed.

> > 
> > G4. Multiple passive trip points
> > 
> >     I get this idea also from the patches at
> >     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> > 
> >     IMO, they want to get an acceptable performance at a tolerable
> >     temperature.
> >     Say, a platform with four P-states. P3 is really low.
> >     And I'm okay with the temperature at 60C, but 80C? No.
> >     With G2 resolved, we can use processor P0~P2 for Passive trip point
> >     0 (50C), and P3 for Passive trip point 1 (70C). And then the
> >     temperature may be jumping at around 60C or even 65C, without
> >     entering P3.
> > 
> >     Further more, IMO, this also works for ACPI platforms.
> >     Say, we can easily change p-state to cool the system, but using
> >     t-state is definitely what we do not want to see. The current
> >     implementation does not expose this difference to the generic
> >     thermal layer, but if we can have two passive trip points, and use
> >     p-state for the first one only... (this works if we start polling
> >     after entering passive cooling mode, without hardware notification)
> > 
> > G5. unify active cooling and passive cooling code
> > 
> >     If G4 and G5 are resolved, a new problem to me is that there is no
> >     difference between passive cooling and active cooling except the
> >     cooling policy.
> 
> Correct...
> 
> >     Then we can share the same code for both active and passive cooling.
> >     maybe something like:
> 
> Or just use TRIP_PASSIVE...
> 
what do you mean? Remove THERMAL_TRIP_ACTIVE?

Maybe I'm not clear enough here.
We may disable the passive cooling or active cooling because of the
thermal management policy.
Say, if we are running on AC, we may want to use active cooling only to
avoid any performance loss. If we are running on Battery, we may want to
use passive cooling only to get longer battery life.

So the code would be:
   static void thermal_cooling()
   {
      ...
      tz->ops->get_trend();
      if (trend == HEATING)
         cdev->ops->set_cur_state(cdev, cur_state++);
      else if (trend == COOLING)
         cdev->ops->set_cur_state(cdev, cur_state--);
   }

   void thermal_zone_device_update()
   {
       ...
       case THERMAL_TRIP_ACTIVE:
            if (policy & ACTIVE_COOLING)
               thermal_cooling();
            break;
       case THERMAL_TRIP_PASSIVE:
            if (policy & PASSIVE_COOLING)
               thermal_cooling();
            break;
       ...
   }

thanks,
rui


_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] the generic thermal layer enhancement
  2012-05-30 11:05     ` R, Durgadoss
  2012-05-30 11:17       ` Eduardo Valentin
@ 2012-05-31  3:27       ` Zhang Rui
  1 sibling, 0 replies; 22+ messages in thread
From: Zhang Rui @ 2012-05-31  3:27 UTC (permalink / raw)
  To: R, Durgadoss
  Cc: Jean Delvare, Brown, Len, linux-acpi@vger.kernel.org, linux-pm

On 三, 2012-05-30 at 05:05 -0600, R, Durgadoss wrote:
> Hi Eduardo,
> 
> > 
> > For G1+G2, I agree with your proposal. I had some discussion with Amit
> > regarding this. In his series of patches we increase / decrease the cooling
> > device state linearly and steadily.
> > 
> > But if we would have what you are saying, we could bind cooling device
> > set of states with trip points.
> 
> True, We want to bind the levels of cooling with the trips points a thermal zone has.
> But we might not get a 1-1 mapping always.
> 
> > 
> > I fully support this option and could cook up something on this.
> > The TC1 and TC2 should go inside the .get_trend() callbacks for ACPI.
> > Should probably go away from the registration function that we have
> > currently.
> 
> I realize I just said the same thing :-)
> 
> > 
> > We could have generic trending computation though. Based on timestamping
> > and temperature reads, and make it available for zones that want to used it.
> 
> Agree, but I would like this go into the platform thermal drivers.

But at least we need a dummy function in thermal_sys.c, as a backup.

>  And then when
> those drivers notify the framework they can specify the trend also. This sort of
> notification is not there, but that is what I am implementing these days..
> Hope to submit this patch in a week's time..
> 
You are also working on this?
here is the prototype patch I made for this.
Surely we can change the logic in thermal_get_trend() in thermal_sys.c
It would be great if you can attach your patch as well.

---
 drivers/acpi/thermal.c        |   36 ++++++++++++++++++++++++++++++++++++
 drivers/thermal/thermal_sys.c |   24 ++++++++++++++++++------
 include/linux/thermal.h       |    9 +++++++++
 3 files changed, 63 insertions(+), 6 deletions(-)

Index: rtd3/drivers/acpi/thermal.c
===================================================================
--- rtd3.orig/drivers/acpi/thermal.c
+++ rtd3/drivers/acpi/thermal.c
@@ -706,6 +706,41 @@ static int thermal_get_crit_temp(struct
 		return -EINVAL;
 }
 
+static int thermal_get_trend(struct thermal_zone_device *thermal,
+			     int trip, enum thermal_trend *trend)
+{
+	struct acpi_thermal *tz = thermal->devdata;
+	enum thermal_trip_type type;
+	unsigned long trip_temp;
+	int i;
+
+	if (thermal_get_trip_type(thermal, trip, &type))
+		return -EINVAL;
+
+	/* Only ACTIVE and PASSIVE trip points need TREND */
+	if (type != THERMAL_TRIP_PASSIVE && type != THERMAL_TRIP_ACTIVE)
+		return -EINVAL;
+	/*  */
+	if (type == THERMAL_TRIP_ACTIVE) {
+		*trend = THERMAL_TREND_RAISING;
+		return 0;
+	}
+
+	if (thermal_get_trip_temp(thermal, trip, &trip_temp))
+		return -EINVAL;
+
+	/*
+	 * tz->temperature has already been updated by generic thermal layer,
+	 * before this callback being invoked
+	 */
+        i = (tz->trips.passive.tc1 * (tz->temperature - tz->last_temperature))
+		 + (tz->trips.passive.tc2 * (tz->temperature - trip_temp));
+
+	*trend = i > 0 ? THERMAL_TREND_RAISING :
+		(i < 0 ? THERMAL_TREND_DROPPING : THERMAL_TREND_NONE);
+	return 0;
+}
+
 static int thermal_notify(struct thermal_zone_device *thermal, int trip,
 			   enum thermal_trip_type trip_type)
 {
@@ -821,6 +856,7 @@ static const struct thermal_zone_device_
 	.get_trip_type = thermal_get_trip_type,
 	.get_trip_temp = thermal_get_trip_temp,
 	.get_crit_temp = thermal_get_crit_temp,
+	.get_trend = thermal_get_trend,
 	.notify = thermal_notify,
 };
 
Index: rtd3/drivers/thermal/thermal_sys.c
===================================================================
--- rtd3.orig/drivers/thermal/thermal_sys.c
+++ rtd3/drivers/thermal/thermal_sys.c
@@ -650,6 +650,19 @@ thermal_remove_hwmon_sysfs(struct therma
 }
 #endif
 
+static void thermal_get_trend(struct thermal_zone_device *tz,
+		enum thermal_trip_type type, enum thermal_trend *trend)
+{
+	if (!tz->ops->get_trend) {
+		trend = THERMAL_TREND_DEFAULT;
+		return;
+	}
+
+	if (tz->ops->get_trend(tz, type, trend))
+		trend = THERMAL_TREND_DEFAULT;
+	return;
+}
+
 static void thermal_zone_device_set_polling(struct thermal_zone_device *tz,
 					    int delay)
 {
@@ -669,10 +682,11 @@ static void thermal_zone_device_set_poll
 static void thermal_zone_device_passive(struct thermal_zone_device *tz,
 					int temp, int trip_temp, int trip)
 {
-	int trend = 0;
+	enum thermal_trend trend = THERMAL_TREND_DEFAULT;
 	struct thermal_cooling_device_instance *instance;
 	struct thermal_cooling_device *cdev;
 	long state, max_state;
+	int ret =
 
 	/*
 	 * Above Trip?
@@ -684,11 +698,9 @@ static void thermal_zone_device_passive(
 	if (temp >= trip_temp) {
 		tz->passive = true;
 
-		trend = (tz->tc1 * (temp - tz->last_temperature)) +
-			(tz->tc2 * (temp - trip_temp));
+		thermal_get_trend(tz, trip, &trend);
 
-		/* Heating up? */
-		if (trend > 0) {
+		if (trend == THERMAL_TREND_RAISING) {
 			list_for_each_entry(instance, &tz->cooling_devices,
 					    node) {
 				if (instance->trip != trip)
@@ -699,7 +711,7 @@ static void thermal_zone_device_passive(
 				if (state++ < max_state)
 					cdev->ops->set_cur_state(cdev, state);
 			}
-		} else if (trend < 0) { /* Cooling off? */
+		} else if (trend == THERMAL_TREND_DROPPING) {
 			list_for_each_entry(instance, &tz->cooling_devices,
 					    node) {
 				if (instance->trip != trip)
Index: rtd3/include/linux/thermal.h
===================================================================
--- rtd3.orig/include/linux/thermal.h
+++ rtd3/include/linux/thermal.h
@@ -44,6 +44,13 @@ enum thermal_trip_type {
 	THERMAL_TRIP_CRITICAL,
 };
 
+enum thermal_trend {
+	THERMAL_TREND_NONE,
+	THERMAL_TREND_RAISING,
+	THERMAL_TREND_DROPPING,
+	THERMAL_TREND_DEFAULT = THERMAL_TREND_RAISING, /* aggressive cooling */
+};
+
 struct thermal_zone_device_ops {
 	int (*bind) (struct thermal_zone_device *,
 		     struct thermal_cooling_device *);
@@ -59,6 +66,8 @@ struct thermal_zone_device_ops {
 	int (*get_trip_temp) (struct thermal_zone_device *, int,
 			      unsigned long *);
 	int (*get_crit_temp) (struct thermal_zone_device *, unsigned long *);
+	int (*get_trend) (struct thermal_zone_device *, int,
+			  enum thermal_trend *);
 	int (*notify) (struct thermal_zone_device *, int,
 		       enum thermal_trip_type);
 };


> > > >     case THERMAL_TRIP_ACTIVE:
> > > >     case THERMAL_TRIP_PASSIVE:
> > > >          ...
> > > >          tz->ops->get_trend();
> > 
> > Would the get_trend take into account if we are cooling with active or passive
> > cooling device?
> 
> To me, it does not matter. It is up to the framework to decide and throttle,
> the respective cooling devices according to the trend.
> 
> > 
> > > >          if (trend == HEATING)
> > > >                  cdev->ops->set_cur_state(cdev, cur_state++);
> > > >          else if (trend == COOLING)
> > > >                  cdev->ops->set_cur_state(cdev, cur_state--);
> > > >          break;
> > 
> > I believe we should have something for temperature stabilization there as well.
> > 
> > Besides, if we go with this generic policy, then the zone update would be much
> > simpler no?
> 
> Yes, and that’s what we want too  :-)
> 
> > Here are some other thoughts:
> > G6. Another point is, would it make sense to allow for policy extension? Meaning,
> > the zone update would call a callback to request for update from the zone
> > device driver?
> > 
> > G7. How do we solve cooling devices being shared between different thermal
> > zones?
> > Should we have a better cooling device constraint management?
> 
> This is another thing that was haunting me for quite some time.
> And What I have in mind is a mapping kind of thing in the platform layer,
> that will provide details about which cooling device is shared with whom.
> The framework can then use this and figure out the association among various devices.
> I am testing it out, and will submit once it comes to a good shape.
> 
I think this is your answer for G11 rather than G7, no?

thanks,
rui


_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-pm] [RFC] the generic thermal layer enhancement
  2012-05-30 11:17       ` Eduardo Valentin
@ 2012-05-31  3:32         ` Zhang Rui
  2012-05-31 11:06           ` Eduardo Valentin
  0 siblings, 1 reply; 22+ messages in thread
From: Zhang Rui @ 2012-05-31  3:32 UTC (permalink / raw)
  To: eduardo.valentin
  Cc: R, Durgadoss, Matthew Garrett, Brown, Len,
	amit.kachhap@linaro.org, jean.pihet, Jean Delvare,
	linux-acpi@vger.kernel.org, linux-pm

On 三, 2012-05-30 at 14:17 +0300, Eduardo Valentin wrote:
> Hello Durga,
> 
> On Wed, May 30, 2012 at 11:05:18AM +0000, R, Durgadoss wrote:
> > Hi Eduardo,
> > 
> > > 
> > > For G1+G2, I agree with your proposal. I had some discussion with Amit
> > > regarding this. In his series of patches we increase / decrease the cooling
> > > device state linearly and steadily.
> > > 
> > > But if we would have what you are saying, we could bind cooling device
> > > set of states with trip points.
> > 
> > True, We want to bind the levels of cooling with the trips points a thermal zone has.
> > But we might not get a 1-1 mapping always.
> 
> Just to make sure we are all taking the same thing.
> 
> In this case a cooling device would have 1-N states. And this set could
> be partitioned and each partition would be assigned to a specific trip point
> of a thermal zone, right?
> 
yep.
BTW, Overlaps should be possible and we should handle this as well.
> 
> 
> > 
> > > 
> > > I fully support this option and could cook up something on this.
> > > The TC1 and TC2 should go inside the .get_trend() callbacks for ACPI.
> > > Should probably go away from the registration function that we have
> > > currently.
> > 
> > I realize I just said the same thing :-)
> 
> Cool :-)
> 
> > 
> > > 
> > > We could have generic trending computation though. Based on timestamping
> > > and temperature reads, and make it available for zones that want to used it.
> > 
> > Agree, but I would like this go into the platform thermal drivers. And then when
> > those drivers notify the framework they can specify the trend also. This sort of
> > notification is not there, but that is what I am implementing these days..
> > Hope to submit this patch in a week's time..
> 
> Nice, I actually have something being cooked for the same thing. We should probably
> align to avoid work duplication...
> 
Hah, seems a lot of work is in progress in this area.

> > 
> > > > >     case THERMAL_TRIP_ACTIVE:
> > > > >     case THERMAL_TRIP_PASSIVE:
> > > > >          ...
> > > > >          tz->ops->get_trend();
> > > 
> > > Would the get_trend take into account if we are cooling with active or passive
> > > cooling device?
> > 
> > To me, it does not matter. It is up to the framework to decide and throttle,
> > the respective cooling devices according to the trend.
> 
> OK. For me it doesn't really matter as well. Having a simplified zone update is better.
> 
> > 
> > > 
> > > > >          if (trend == HEATING)
> > > > >                  cdev->ops->set_cur_state(cdev, cur_state++);
> > > > >          else if (trend == COOLING)
> > > > >                  cdev->ops->set_cur_state(cdev, cur_state--);
> > > > >          break;
> > > 
> > > I believe we should have something for temperature stabilization there as well.
> > > 
> > > Besides, if we go with this generic policy, then the zone update would be much
> > > simpler no?
> > 
> > Yes, and that’s what we want too  :-)
> 
> Nice!
> 
> > 
> > > Here are some other thoughts:
> > > G6. Another point is, would it make sense to allow for policy extension? Meaning,
> > > the zone update would call a callback to request for update from the zone
> > > device driver?
> > > 
> > > G7. How do we solve cooling devices being shared between different thermal
> > > zones?
> > > Should we have a better cooling device constraint management?
> > 
> > This is another thing that was haunting me for quite some time.
> > And What I have in mind is a mapping kind of thing in the platform layer,
> > that will provide details about which cooling device is shared with whom.
> > The framework can then use this and figure out the association among various devices.
> > I am testing it out, and will submit once it comes to a good shape.
> 
> Right, I am not sure we want to go in this direction?
> 
> Maybe a better way would be to have sort of pm/thermal contraint framework, which
> would map these per device, at LDM level?
> 
> I am copying Jean-Pihet, he has been working in this front. Jean, any thoughts?
> 
Durga and I are investigating how to introduce some concepts like
"influence/weight" to generic thermal layer. :)

thanks,
rui

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] the generic thermal layer enhancement
  2012-05-30 12:50 ` Matthew Garrett
@ 2012-05-31  3:54   ` Zhang Rui
  2012-05-31  3:58     ` Matthew Garrett
  0 siblings, 1 reply; 22+ messages in thread
From: Zhang Rui @ 2012-05-31  3:54 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Rafael J. Wysocki, Jean Delvare, Brown, Len, linux-pm,
	linux-acpi@vger.kernel.org, amit.kachhap

On 三, 2012-05-30 at 13:50 +0100, Matthew Garrett wrote:
> On Wed, May 30, 2012 at 04:49:02PM +0800, Zhang Rui wrote:
> 
> > G1. supporting multiple cooling states for active cooling devices.
> 
> Sounds fine.
> 
Great!

> > G2. introduce cooling states range for a certain trip point
> 
> Again, no problem here.
> 
Again, great!

> > G3. kernel thermal passive cooling algorithm
> 
> >     Currently, tc1 and tc2 are hard requirements for kernel passive
> >     cooling. But non-ACPI platforms do not have this information
> >     (please correct me if I'm wrong).
> >     Say, for the patches here
> >     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> >     They just want to slow down the processor when current temperature
> >     is higher than the trip point and speed up the processor when the
> >     temperature is lower than the trip point.
> 
> Just slowing the CPU down above a certain temperature is a pretty awful 
> approach to passive cooling. You want to maximise performance while 
> staying within your thermal envelope, so you need a more advanced 
> approach.

Agreed.

> The existing algorithm provides a generic mechanism for 
> balancing performance and thermal output, with the only requirement 
> being that the platform provide constants that represent the heating and 
> cooling properties of the system.
> 
I'm not sure if this could work on their platforms. So I'm just looking
for an easier way to handle this, i.e. make generic thermal layer
simple, and provide the flexibility for platform drivers to do their own
tricks.

> I don't fundamentally object to adding support for platform-specific 
> passive formulae, but I'd like an explanation for how they're better 
> than the existing one.
> 
Hmmm, I'd like to see if it is possible for them to make use of the
current passive cooling implementation.. if no...

how about this?

we can provide an API for the current algorithm in thermal_sys.c

thermal_passive_get_trend(int tc1, int tc2)
{
}
EXPORT_SYMBOL(...);

and in platform driver, they can choose to use this default algorithm if
they are able to.

.get_trend()
{
   return thermal_passive_get_trend(platform_tc1, platform_tc2);
}

> > G4. Multiple passive trip points
> 
> It would be good to have an explanation of the use case here. If it's 
> acceptable for the device to be at the lower passive trip point, why are 
> we slowing it down at all?
> 
acceptable does not equal comfortable?
Say, I'd like to use the computer at 30C skin temperature.
I'm okay with the temperature at 50C, but it would be nice if it can be
lower, even if the system would be slower, but not too slow (T-state).
If the temperature is higher than 60, it is not usable for me, I'll wait
for a while, the system can do everything they want do cool the system
down (but hibernate/shutdown would be not a good idea at this time
because it is hot enough for some hardware damage).

thanks,
rui

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] the generic thermal layer enhancement
  2012-05-31  3:54   ` Zhang Rui
@ 2012-05-31  3:58     ` Matthew Garrett
  2012-05-31  5:54       ` Zhang Rui
  0 siblings, 1 reply; 22+ messages in thread
From: Matthew Garrett @ 2012-05-31  3:58 UTC (permalink / raw)
  To: Zhang Rui; +Cc: Brown, Len, Jean Delvare, linux-acpi@vger.kernel.org, linux-pm

On Thu, May 31, 2012 at 11:54:51AM +0800, Zhang Rui wrote:
> On 三, 2012-05-30 at 13:50 +0100, Matthew Garrett wrote:
> > The existing algorithm provides a generic mechanism for 
> > balancing performance and thermal output, with the only requirement 
> > being that the platform provide constants that represent the heating and 
> > cooling properties of the system.
> > 
> I'm not sure if this could work on their platforms. So I'm just looking
> for an easier way to handle this, i.e. make generic thermal layer
> simple, and provide the flexibility for platform drivers to do their own
> tricks.

If it's not possible for a platform to use the existing generic approach 
then we should certainly provide a way for them to handle that, but 
first I'd like to see evidence that it's impossible for them to use the 
existing generic approach. This kind of conversation is better with real 
world examples :)

> > > G4. Multiple passive trip points
> > 
> > It would be good to have an explanation of the use case here. If it's 
> > acceptable for the device to be at the lower passive trip point, why are 
> > we slowing it down at all?
> > 
> acceptable does not equal comfortable?
> Say, I'd like to use the computer at 30C skin temperature.
> I'm okay with the temperature at 50C, but it would be nice if it can be
> lower, even if the system would be slower, but not too slow (T-state).
> If the temperature is higher than 60, it is not usable for me, I'll wait
> for a while, the system can do everything they want do cool the system
> down (but hibernate/shutdown would be not a good idea at this time
> because it is hot enough for some hardware damage).

Ok, that seems reasonable.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] the generic thermal layer enhancement
  2012-05-30  8:49 [RFC] the generic thermal layer enhancement Zhang Rui
                   ` (2 preceding siblings ...)
  2012-05-30 12:50 ` Matthew Garrett
@ 2012-05-31  4:59 ` Amit Kachhap
  2012-05-31  6:09   ` Zhang Rui
  3 siblings, 1 reply; 22+ messages in thread
From: Amit Kachhap @ 2012-05-31  4:59 UTC (permalink / raw)
  To: Zhang Rui; +Cc: Jean Delvare, Brown, Len, linux-acpi@vger.kernel.org, linux-pm

Hi Rui,

Thanks for starting a discussion on further finetuning the thermal layer.

On 30 May 2012 16:49, Zhang Rui <rui.zhang@intel.com> wrote:
> Hi, all,
>
> It is great to see more and more users of the generic thermal layer.
> But as we know, the original design of the generic thermal layer comes
> from ACPI thermal management, and some of its implementation seems to be
> too ACPI specific nowadays.
Totally agreed :)
>
> Recently I'm thinking of enhance the generic thermal layer so that it
> works well for more platforms.
>
> Below are some thoughts of mine, after reading the patches from Amit
> Daniel Kachhap, and ACPI 3.0 thermal model. Actually, I have started
> coding some RFC patches. But I do really want to get feedback from you
> before going on.
>
> G1. supporting multiple cooling states for active cooling devices.
>
>    The current active cooling device supports two cooling states only,
>    please refer to the code below, in driver/thermal/thermal_sys.c
>                case THERMAL_TRIP_ACTIVE:
>                        ...
>                        if (temp >= trip_temp)
>                                cdev->ops->set_cur_state(cdev, 1);
>                        else
>                                cdev->ops->set_cur_state(cdev, 0);
>                        break;
>
>    This is an ACPI specific thing, as our ACPI FAN used to support
>    ON/OFF only.
>    I think it is reasonable to support multiple active cooling states
>    as they are common on many platforms, and note that this is also
>    true for ACPI 3.0 FAN device (_FPS).
Yes I agree that ACTIVE trip type should support more than 1(ON)
state. and which state the the set_cur_state should call depends on
how the state is binded on the trip point. But again in doing these
there is so much logic put in handling this that the I dropped the
patch for new trip type ACTIVE_INSTANCE.
>
> G2. introduce cooling states range for a certain trip point
>
>    This problem comes with the first one.
>    If the cooling devices support multiple cooling states, and surely
>    we may want only several cooling states for a certain trip point,
>    and other cooling states for other active trip points.
>    To do this, we should be able to describe the cooling device
>    behavior for a certain trip point, rather than for the entire
>    thermal zone.
Agreed
>
> G3. kernel thermal passive cooling algorithm
>
>    Currently, tc1 and tc2 are hard requirements for kernel passive
>    cooling. But non-ACPI platforms do not have this information
>    (please correct me if I'm wrong).
>    Say, for the patches here
>    http://marc.info/?l=linux-acpi&m=133681581305341&w=2
>    They just want to slow down the processor when current temperature
>    is higher than the trip point and speed up the processor when the
>    temperature is lower than the trip point.
>
>    According to Matthew, the platform drivers are responsible to
>    provide proper tc1 and tc2 values to use kernel passive cooling.
>    But I'm just wondering if we can use something instead.
>    Say, introduce .get_trend() in thermal_zone_device_ops.
>    And we set cur_state++ or cur_state-- based on the value returned
>    by .get_trend(), instead of using tc1 and tc2.
OK this seems fine. But for exynos platform(also some other platforms)
this may be simply cur_temp - last_temp which is fine.
>
> G4. Multiple passive trip points
>
>    I get this idea also from the patches at
>    http://marc.info/?l=linux-acpi&m=133681581305341&w=2
>
>    IMO, they want to get an acceptable performance at a tolerable
>    temperature.
>    Say, a platform with four P-states. P3 is really low.
>    And I'm okay with the temperature at 60C, but 80C? No.
>    With G2 resolved, we can use processor P0~P2 for Passive trip point
>    0 (50C), and P3 for Passive trip point 1 (70C). And then the
>    temperature may be jumping at around 60C or even 65C, without
>    entering P3.
>
>    Further more, IMO, this also works for ACPI platforms.
>    Say, we can easily change p-state to cool the system, but using
>    t-state is definitely what we do not want to see. The current
>    implementation does not expose this difference to the generic
>    thermal layer, but if we can have two passive trip points, and use
>    p-state for the first one only... (this works if we start polling
>    after entering passive cooling mode, without hardware notification)
This seems cool and to answer Mathew doubt this is needed because in
the not so critical thermal envelop the platforms may not want to go
to the lowest opp level and some intermediate level is fine.

>
> G5. unify active cooling and passive cooling code
>
>    If G4 and G5 are resolved, a new problem to me is that there is no
>    difference between passive cooling and active cooling except the
>    cooling policy.
>    Then we can share the same code for both active and passive cooling.
>    maybe something like:
>
>    case THERMAL_TRIP_ACTIVE:
>    case THERMAL_TRIP_PASSIVE:
>         ...
>         tz->ops->get_trend();
>         if (trend == HEATING)
>                 cdev->ops->set_cur_state(cdev, cur_state++);
>         else if (trend == COOLING)
>                 cdev->ops->set_cur_state(cdev, cur_state--);
>         break;
Actually I still feels that ACTIVE and PASSIVE are needed because they
have different way of calling the set_cur_rate. ACTIVE trip type is
associated with 1 trip point so on reaching a trip point a specific
state is called in set_cur_state. But PASSIVE basically
increments/decrements the state depending on thermal trend.
>
> Here are the gaps in my point of view, I'd like to get your ideas about
> which are reasonable and which are not.
>
> Any comments are appreciated! Thanks!
>
> -rui
>
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-pm] [RFC] the generic thermal layer enhancement
  2012-05-30 10:30   ` Eduardo Valentin
  2012-05-30 11:05     ` R, Durgadoss
  2012-05-31  2:20     ` [linux-pm] " Zhang Rui
@ 2012-05-31  5:16     ` Amit Kachhap
  2012-05-31  6:13       ` Zhang Rui
  2012-05-31 11:13       ` Eduardo Valentin
  2 siblings, 2 replies; 22+ messages in thread
From: Amit Kachhap @ 2012-05-31  5:16 UTC (permalink / raw)
  To: eduardo.valentin
  Cc: Zhang Rui, Matthew Garrett, Brown, Len, Jean Delvare,
	linux-acpi@vger.kernel.org, linux-pm

On 30 May 2012 18:30, Eduardo Valentin <eduardo.valentin@ti.com> wrote:
> Hello Rui,
>
> Now I copied Amit, for real :-)
>
> I like your proposal, some comments as follow.
>
> On Wed, May 30, 2012 at 04:51:00PM +0800, Zhang Rui wrote:
>> On 三, 2012-05-30 at 16:49 +0800, Zhang Rui wrote:
>> > Hi, all,
>> >
>> > It is great to see more and more users of the generic thermal layer.
>> > But as we know, the original design of the generic thermal layer comes
>> > from ACPI thermal management, and some of its implementation seems to be
>> > too ACPI specific nowadays.
>
>
> Good. We have also basic OMAP support, on top of Amit's work. I sent
> recently a very basic support.  I will be pushing them, while they evolve.
>
>> >
>> > Recently I'm thinking of enhance the generic thermal layer so that it
>> > works well for more platforms.
>
> As you said, for non-ACPI support, the "generic" layer needs some
> extension and refactoring to be more generic :-).
>
>> >
>> > Below are some thoughts of mine, after reading the patches from Amit
>> > Daniel Kachhap, and ACPI 3.0 thermal model. Actually, I have started
>> > coding some RFC patches. But I do really want to get feedback from you
>> > before going on.
>
> OK.
>
>> >
>> > G1. supporting multiple cooling states for active cooling devices.
>> >
>> >     The current active cooling device supports two cooling states only,
>> >     please refer to the code below, in driver/thermal/thermal_sys.c
>> >                 case THERMAL_TRIP_ACTIVE:
>> >                         ...
>> >                         if (temp >= trip_temp)
>> >                                 cdev->ops->set_cur_state(cdev, 1);
>> >                         else
>> >                                 cdev->ops->set_cur_state(cdev, 0);
>> >                         break;
>> >
>> >     This is an ACPI specific thing, as our ACPI FAN used to support
>> >     ON/OFF only.
>> >     I think it is reasonable to support multiple active cooling states
>> >     as they are common on many platforms, and note that this is also
>> >     true for ACPI 3.0 FAN device (_FPS).
>> >
>> > G2. introduce cooling states range for a certain trip point
>> >
>> >     This problem comes with the first one.
>> >     If the cooling devices support multiple cooling states, and surely
>> >     we may want only several cooling states for a certain trip point,
>> >     and other cooling states for other active trip points.
>> >     To do this, we should be able to describe the cooling device
>> >     behavior for a certain trip point, rather than for the entire
>> >     thermal zone.
>
> For G1+G2, I agree with your proposal. I had some discussion with Amit
> regarding this. In his series of patches we increase / decrease the cooling
> device state linearly and steadily.
>
> But if we would have what you are saying, we could bind cooling device
> set of states with trip points.
>
>> >
>> > G3. kernel thermal passive cooling algorithm
>> >
>> >     Currently, tc1 and tc2 are hard requirements for kernel passive
>> >     cooling. But non-ACPI platforms do not have this information
>> >     (please correct me if I'm wrong).
>> >     Say, for the patches here
>> >     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
>>
>> Sorry, forgot to cc Amit, the author of this patch set.
>>
>> thanks,
>> rui
>> >     They just want to slow down the processor when current temperature
>> >     is higher than the trip point and speed up the processor when the
>> >     temperature is lower than the trip point.
>> >
>> >     According to Matthew, the platform drivers are responsible to
>> >     provide proper tc1 and tc2 values to use kernel passive cooling.
>> >     But I'm just wondering if we can use something instead.
>> >     Say, introduce .get_trend() in thermal_zone_device_ops.
>> >     And we set cur_state++ or cur_state-- based on the value returned
>> >     by .get_trend(), instead of using tc1 and tc2.
>
> I fully support this option and could cook up something on this.
> The TC1 and TC2 should go inside the .get_trend() callbacks for ACPI.
> Should probably go away from the registration function that we have
> currently.
>
> We could have generic trending computation though. Based on timestamping
> and temperature reads, and make it available for zones that want to used it.
>
>> >
>> > G4. Multiple passive trip points
>> >
>> >     I get this idea also from the patches at
>> >     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
>> >
>> >     IMO, they want to get an acceptable performance at a tolerable
>> >     temperature.
>> >     Say, a platform with four P-states. P3 is really low.
>> >     And I'm okay with the temperature at 60C, but 80C? No.
>> >     With G2 resolved, we can use processor P0~P2 for Passive trip point
>> >     0 (50C), and P3 for Passive trip point 1 (70C). And then the
>> >     temperature may be jumping at around 60C or even 65C, without
>> >     entering P3.
>
> Yeah, I guess we need to solve G1+G2 first to allow this. But I also agree
> that ideally, there should be possibility to have multiple passive trip points.
>
>> >
>> >     Further more, IMO, this also works for ACPI platforms.
>> >     Say, we can easily change p-state to cool the system, but using
>> >     t-state is definitely what we do not want to see. The current
>> >     implementation does not expose this difference to the generic
>> >     thermal layer, but if we can have two passive trip points, and use
>> >     p-state for the first one only... (this works if we start polling
>> >     after entering passive cooling mode, without hardware notification)
>> >
>> > G5. unify active cooling and passive cooling code
>> >
>> >     If G4 and G5 are resolved, a new problem to me is that there is no
>> >     difference between passive cooling and active cooling except the
>> >     cooling policy.
>
> OK...
>
>> >     Then we can share the same code for both active and passive cooling.
>> >     maybe something like:
>> >
>> >     case THERMAL_TRIP_ACTIVE:
>> >     case THERMAL_TRIP_PASSIVE:
>> >          ...
>> >          tz->ops->get_trend();
>
> Would the get_trend take into account if we are cooling with active or passive
> cooling device?
>
>> >          if (trend == HEATING)
>> >                  cdev->ops->set_cur_state(cdev, cur_state++);
>> >          else if (trend == COOLING)
>> >                  cdev->ops->set_cur_state(cdev, cur_state--);
>> >          break;
>
> I believe we should have something for temperature stabilization there as well.
I also agree that thermal stablization is important. I have observed
that too much state change is happening around the trip point. But yes
 the trend callback may take care of this and set the COOLING trend
different from HEATING trend.
>
> Besides, if we go with this generic policy, then the zone update would be much
> simpler no?
>
>> >
>> > Here are the gaps in my point of view, I'd like to get your ideas about
>> > which are reasonable and which are not.
>
> Here are some other thoughts:
> G6. Another point is, would it make sense to allow for policy extension? Meaning,
> the zone update would call a callback to request for update from the zone
> device driver?
This may a simple work with adding notifiers like done for CRITICAL,
HOT trip type.
>
> G7. How do we solve cooling devices being shared between different thermal zones?
> Should we have a better cooling device constraint management?
>
> G8. On same topic as G7, how are we currently making sure that thermal constraints
> don't get overwritten by, let's say, userspace APIs? I guess the generic CPU cooling
> propose by Amit suffers of an issue. If user sets cpufreq governor to userspace
> and sets the frequency to its maximum, say in a busy loop, the thermal cooling
> could potentially be ruined.
Yes I agree that is a problem in my implementation but I guess the
current cpufreq framework does not have anything to stop this. May be
with cpufreq_max pmqos patches this will be taken care.
>
> G9. Is there any possibility to have multiple sensors per thermal zone?
>
> G10. Do we want to represent other sensing stimuli other that temperature? Say,
> current sensing?
Yes this is good field to look into.
>
> G11. Do we want to allow for cross zoning policies? Sometimes a policy may use
> temperature from different thermal zone in order to better represent what
> is going on in its thermal zone.
>
>> >
>> > Any comments are appreciated! Thanks!
>
>
> Thanks to you for starting this up! The above are points that come to my mind now.
> I will keep updating the list if something else come to my mind.
>
>> >
>> > -rui
>> >
>> > _______________________________________________
>> > linux-pm mailing list
>> > linux-pm@lists.linux-foundation.org
>> > https://lists.linuxfoundation.org/mailman/listinfo/linux-pm
>>
>>
>> _______________________________________________
>> linux-pm mailing list
>> linux-pm@lists.linux-foundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/linux-pm
>
> All best,
>
> ---
> Eduardo Valentin
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] the generic thermal layer enhancement
  2012-05-31  3:58     ` Matthew Garrett
@ 2012-05-31  5:54       ` Zhang Rui
  0 siblings, 0 replies; 22+ messages in thread
From: Zhang Rui @ 2012-05-31  5:54 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Rafael J. Wysocki, Jean Delvare, Brown, Len, linux-pm,
	linux-acpi@vger.kernel.org, amit.kachhap

On 四, 2012-05-31 at 04:58 +0100, Matthew Garrett wrote:
> On Thu, May 31, 2012 at 11:54:51AM +0800, Zhang Rui wrote:
> > On 三, 2012-05-30 at 13:50 +0100, Matthew Garrett wrote:
> > > The existing algorithm provides a generic mechanism for 
> > > balancing performance and thermal output, with the only requirement 
> > > being that the platform provide constants that represent the heating and 
> > > cooling properties of the system.
> > > 
> > I'm not sure if this could work on their platforms. So I'm just looking
> > for an easier way to handle this, i.e. make generic thermal layer
> > simple, and provide the flexibility for platform drivers to do their own
> > tricks.
> 
> If it's not possible for a platform to use the existing generic approach 
> then we should certainly provide a way for them to handle that, but 
> first I'd like to see evidence that it's impossible for them to use the 
> existing generic approach. This kind of conversation is better with real 
> world examples :)

Agreed.

Amit, do you have any update on this? :)

> 
> > > > G4. Multiple passive trip points
> > > 
> > > It would be good to have an explanation of the use case here. If it's 
> > > acceptable for the device to be at the lower passive trip point, why are 
> > > we slowing it down at all?
> > > 
> > acceptable does not equal comfortable?
> > Say, I'd like to use the computer at 30C skin temperature.
> > I'm okay with the temperature at 50C, but it would be nice if it can be
> > lower, even if the system would be slower, but not too slow (T-state).
> > If the temperature is higher than 60, it is not usable for me, I'll wait
> > for a while, the system can do everything they want do cool the system
> > down (but hibernate/shutdown would be not a good idea at this time
> > because it is hot enough for some hardware damage).
> 
> Ok, that seems reasonable.
> 
Great!

thanks,
rui


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] the generic thermal layer enhancement
  2012-05-31  4:59 ` Amit Kachhap
@ 2012-05-31  6:09   ` Zhang Rui
  2012-05-31 10:59     ` Eduardo Valentin
  0 siblings, 1 reply; 22+ messages in thread
From: Zhang Rui @ 2012-05-31  6:09 UTC (permalink / raw)
  To: Amit Kachhap
  Cc: Jean Delvare, Brown, Len, linux-acpi@vger.kernel.org, linux-pm

Hi, Amit,

On 四, 2012-05-31 at 12:59 +0800, Amit Kachhap wrote:
> Hi Rui,
> 
> Thanks for starting a discussion on further finetuning the thermal layer.
> 
> On 30 May 2012 16:49, Zhang Rui <rui.zhang@intel.com> wrote:
> > Hi, all,
> >
> > It is great to see more and more users of the generic thermal layer.
> > But as we know, the original design of the generic thermal layer comes
> > from ACPI thermal management, and some of its implementation seems to be
> > too ACPI specific nowadays.
> Totally agreed :)
> >
> > Recently I'm thinking of enhance the generic thermal layer so that it
> > works well for more platforms.
> >
> > Below are some thoughts of mine, after reading the patches from Amit
> > Daniel Kachhap, and ACPI 3.0 thermal model. Actually, I have started
> > coding some RFC patches. But I do really want to get feedback from you
> > before going on.
> >
> > G1. supporting multiple cooling states for active cooling devices.
> >
> >    The current active cooling device supports two cooling states only,
> >    please refer to the code below, in driver/thermal/thermal_sys.c
> >                case THERMAL_TRIP_ACTIVE:
> >                        ...
> >                        if (temp >= trip_temp)
> >                                cdev->ops->set_cur_state(cdev, 1);
> >                        else
> >                                cdev->ops->set_cur_state(cdev, 0);
> >                        break;
> >
> >    This is an ACPI specific thing, as our ACPI FAN used to support
> >    ON/OFF only.
> >    I think it is reasonable to support multiple active cooling states
> >    as they are common on many platforms, and note that this is also
> >    true for ACPI 3.0 FAN device (_FPS).
> Yes I agree that ACTIVE trip type should support more than 1(ON)
> state. and which state the the set_cur_state should call depends on
> how the state is binded on the trip point. But again in doing these
> there is so much logic put in handling this that the I dropped the
> patch for new trip type ACTIVE_INSTANCE.

yes, I know.

> >
> > G2. introduce cooling states range for a certain trip point
> >
> >    This problem comes with the first one.
> >    If the cooling devices support multiple cooling states, and surely
> >    we may want only several cooling states for a certain trip point,
> >    and other cooling states for other active trip points.
> >    To do this, we should be able to describe the cooling device
> >    behavior for a certain trip point, rather than for the entire
> >    thermal zone.
> Agreed
> >
> > G3. kernel thermal passive cooling algorithm
> >
> >    Currently, tc1 and tc2 are hard requirements for kernel passive
> >    cooling. But non-ACPI platforms do not have this information
> >    (please correct me if I'm wrong).
> >    Say, for the patches here
> >    http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> >    They just want to slow down the processor when current temperature
> >    is higher than the trip point and speed up the processor when the
> >    temperature is lower than the trip point.
> >
> >    According to Matthew, the platform drivers are responsible to
> >    provide proper tc1 and tc2 values to use kernel passive cooling.
> >    But I'm just wondering if we can use something instead.
> >    Say, introduce .get_trend() in thermal_zone_device_ops.
> >    And we set cur_state++ or cur_state-- based on the value returned
> >    by .get_trend(), instead of using tc1 and tc2.
> OK this seems fine. But for exynos platform(also some other platforms)
> this may be simply cur_temp - last_temp which is fine.

I'm wondering if you can try to test some platform tc1/tc2 numbers and
see if it is better to use the current passive cooling implementation.

> >
> > G4. Multiple passive trip points
> >
> >    I get this idea also from the patches at
> >    http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> >
> >    IMO, they want to get an acceptable performance at a tolerable
> >    temperature.
> >    Say, a platform with four P-states. P3 is really low.
> >    And I'm okay with the temperature at 60C, but 80C? No.
> >    With G2 resolved, we can use processor P0~P2 for Passive trip point
> >    0 (50C), and P3 for Passive trip point 1 (70C). And then the
> >    temperature may be jumping at around 60C or even 65C, without
> >    entering P3.
> >
> >    Further more, IMO, this also works for ACPI platforms.
> >    Say, we can easily change p-state to cool the system, but using
> >    t-state is definitely what we do not want to see. The current
> >    implementation does not expose this difference to the generic
> >    thermal layer, but if we can have two passive trip points, and use
> >    p-state for the first one only... (this works if we start polling
> >    after entering passive cooling mode, without hardware notification)
> This seems cool and to answer Mathew doubt this is needed because in
> the not so critical thermal envelop the platforms may not want to go
> to the lowest opp level and some intermediate level is fine.
> 
> >
> > G5. unify active cooling and passive cooling code
> >
> >    If G4 and G5 are resolved, a new problem to me is that there is no
> >    difference between passive cooling and active cooling except the
> >    cooling policy.
> >    Then we can share the same code for both active and passive cooling.
> >    maybe something like:
> >
> >    case THERMAL_TRIP_ACTIVE:
> >    case THERMAL_TRIP_PASSIVE:
> >         ...
> >         tz->ops->get_trend();
> >         if (trend == HEATING)
> >                 cdev->ops->set_cur_state(cdev, cur_state++);
> >         else if (trend == COOLING)
> >                 cdev->ops->set_cur_state(cdev, cur_state--);
> >         break;
> Actually I still feels that ACTIVE and PASSIVE are needed because they
> have different way of calling the set_cur_rate. ACTIVE trip type is
> associated with 1 trip point so on reaching a trip point a specific
> state is called in set_cur_state. But PASSIVE basically
> increments/decrements the state depending on thermal trend.

Sorry, I do not understand.
for active trip points, we can either use trend or not.
say, we can always set the trend to HEATING, in our .get_trend()
callback if we always want to spin up the fan when the temperature is
higher than the trip point.
But if the temperature high but it is dropping, and we want to spin down
the fan a little to save more power, we can let .get_trend return
COOLING at this time.
what do you think?

BTW, what I'd like to know is that, with all this gaps solved, is it
much easier and clearer for the generic cpu cooling APIs and the exynos
platforms to follow the generic thermal layer? Say, two passive trip
points for exynos platforms, and certain processor P-states bind to each
trip point?

thanks,
rui

_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] the generic thermal layer enhancement
  2012-05-31  5:16     ` Amit Kachhap
@ 2012-05-31  6:13       ` Zhang Rui
  2012-05-31 11:13       ` Eduardo Valentin
  1 sibling, 0 replies; 22+ messages in thread
From: Zhang Rui @ 2012-05-31  6:13 UTC (permalink / raw)
  To: Amit Kachhap
  Cc: Jean Delvare, Brown, Len, linux-acpi@vger.kernel.org, linux-pm

> >> > G5. unify active cooling and passive cooling code
> >> >
> >> >     If G4 and G5 are resolved, a new problem to me is that there is no
> >> >     difference between passive cooling and active cooling except the
> >> >     cooling policy.
> >
> > OK...
> >
> >> >     Then we can share the same code for both active and passive cooling.
> >> >     maybe something like:
> >> >
> >> >     case THERMAL_TRIP_ACTIVE:
> >> >     case THERMAL_TRIP_PASSIVE:
> >> >          ...
> >> >          tz->ops->get_trend();
> >
> > Would the get_trend take into account if we are cooling with active or passive
> > cooling device?
> >
> >> >          if (trend == HEATING)
> >> >                  cdev->ops->set_cur_state(cdev, cur_state++);
> >> >          else if (trend == COOLING)
> >> >                  cdev->ops->set_cur_state(cdev, cur_state--);
> >> >          break;
> >
> > I believe we should have something for temperature stabilization there as well.
> I also agree that thermal stablization is important. I have observed
> that too much state change is happening around the trip point. But yes
>  the trend callback may take care of this and set the COOLING trend
> different from HEATING trend.

that's why Matthew and I are asking you to try to fit into the current
passive cooling implementation first, as the current algorithm is really
a good one, if you can make use of it.
.get_trend() is just a backup for this.

thanks,
rui

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] the generic thermal layer enhancement
  2012-05-31  6:09   ` Zhang Rui
@ 2012-05-31 10:59     ` Eduardo Valentin
  0 siblings, 0 replies; 22+ messages in thread
From: Eduardo Valentin @ 2012-05-31 10:59 UTC (permalink / raw)
  To: Zhang Rui; +Cc: Jean Delvare, Brown, Len, linux-acpi@vger.kernel.org, linux-pm

Hello,

On Thu, May 31, 2012 at 02:09:19PM +0800, Zhang Rui wrote:
> Hi, Amit,
> 
> On 四, 2012-05-31 at 12:59 +0800, Amit Kachhap wrote:
> > Hi Rui,
> > 
> > Thanks for starting a discussion on further finetuning the thermal layer.
> > 
> > On 30 May 2012 16:49, Zhang Rui <rui.zhang@intel.com> wrote:
> > > Hi, all,
> > >
> > > It is great to see more and more users of the generic thermal layer.
> > > But as we know, the original design of the generic thermal layer comes
> > > from ACPI thermal management, and some of its implementation seems to be
> > > too ACPI specific nowadays.
> > Totally agreed :)
> > >
> > > Recently I'm thinking of enhance the generic thermal layer so that it
> > > works well for more platforms.
> > >
> > > Below are some thoughts of mine, after reading the patches from Amit
> > > Daniel Kachhap, and ACPI 3.0 thermal model. Actually, I have started
> > > coding some RFC patches. But I do really want to get feedback from you
> > > before going on.
> > >
> > > G1. supporting multiple cooling states for active cooling devices.
> > >
> > >    The current active cooling device supports two cooling states only,
> > >    please refer to the code below, in driver/thermal/thermal_sys.c
> > >                case THERMAL_TRIP_ACTIVE:
> > >                        ...
> > >                        if (temp >= trip_temp)
> > >                                cdev->ops->set_cur_state(cdev, 1);
> > >                        else
> > >                                cdev->ops->set_cur_state(cdev, 0);
> > >                        break;
> > >
> > >    This is an ACPI specific thing, as our ACPI FAN used to support
> > >    ON/OFF only.
> > >    I think it is reasonable to support multiple active cooling states
> > >    as they are common on many platforms, and note that this is also
> > >    true for ACPI 3.0 FAN device (_FPS).
> > Yes I agree that ACTIVE trip type should support more than 1(ON)
> > state. and which state the the set_cur_state should call depends on
> > how the state is binded on the trip point. But again in doing these
> > there is so much logic put in handling this that the I dropped the
> > patch for new trip type ACTIVE_INSTANCE.
> 
> yes, I know.
> 
> > >
> > > G2. introduce cooling states range for a certain trip point
> > >
> > >    This problem comes with the first one.
> > >    If the cooling devices support multiple cooling states, and surely
> > >    we may want only several cooling states for a certain trip point,
> > >    and other cooling states for other active trip points.
> > >    To do this, we should be able to describe the cooling device
> > >    behavior for a certain trip point, rather than for the entire
> > >    thermal zone.
> > Agreed
> > >
> > > G3. kernel thermal passive cooling algorithm
> > >
> > >    Currently, tc1 and tc2 are hard requirements for kernel passive
> > >    cooling. But non-ACPI platforms do not have this information
> > >    (please correct me if I'm wrong).
> > >    Say, for the patches here
> > >    http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> > >    They just want to slow down the processor when current temperature
> > >    is higher than the trip point and speed up the processor when the
> > >    temperature is lower than the trip point.
> > >
> > >    According to Matthew, the platform drivers are responsible to
> > >    provide proper tc1 and tc2 values to use kernel passive cooling.
> > >    But I'm just wondering if we can use something instead.
> > >    Say, introduce .get_trend() in thermal_zone_device_ops.
> > >    And we set cur_state++ or cur_state-- based on the value returned
> > >    by .get_trend(), instead of using tc1 and tc2.
> > OK this seems fine. But for exynos platform(also some other platforms)
> > this may be simply cur_temp - last_temp which is fine.
> 
> I'm wondering if you can try to test some platform tc1/tc2 numbers and
> see if it is better to use the current passive cooling implementation.

Well, we can definitely try it out with TC1 and TC2, but what is the
right procedure to define them properly other than try and error?

> 
> > >
> > > G4. Multiple passive trip points
> > >
> > >    I get this idea also from the patches at
> > >    http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> > >
> > >    IMO, they want to get an acceptable performance at a tolerable
> > >    temperature.
> > >    Say, a platform with four P-states. P3 is really low.
> > >    And I'm okay with the temperature at 60C, but 80C? No.
> > >    With G2 resolved, we can use processor P0~P2 for Passive trip point
> > >    0 (50C), and P3 for Passive trip point 1 (70C). And then the
> > >    temperature may be jumping at around 60C or even 65C, without
> > >    entering P3.
> > >
> > >    Further more, IMO, this also works for ACPI platforms.
> > >    Say, we can easily change p-state to cool the system, but using
> > >    t-state is definitely what we do not want to see. The current
> > >    implementation does not expose this difference to the generic
> > >    thermal layer, but if we can have two passive trip points, and use
> > >    p-state for the first one only... (this works if we start polling
> > >    after entering passive cooling mode, without hardware notification)
> > This seems cool and to answer Mathew doubt this is needed because in
> > the not so critical thermal envelop the platforms may not want to go
> > to the lowest opp level and some intermediate level is fine.
> > 
> > >
> > > G5. unify active cooling and passive cooling code
> > >
> > >    If G4 and G5 are resolved, a new problem to me is that there is no
> > >    difference between passive cooling and active cooling except the
> > >    cooling policy.
> > >    Then we can share the same code for both active and passive cooling.
> > >    maybe something like:
> > >
> > >    case THERMAL_TRIP_ACTIVE:
> > >    case THERMAL_TRIP_PASSIVE:
> > >         ...
> > >         tz->ops->get_trend();
> > >         if (trend == HEATING)
> > >                 cdev->ops->set_cur_state(cdev, cur_state++);
> > >         else if (trend == COOLING)
> > >                 cdev->ops->set_cur_state(cdev, cur_state--);
> > >         break;
> > Actually I still feels that ACTIVE and PASSIVE are needed because they
> > have different way of calling the set_cur_rate. ACTIVE trip type is
> > associated with 1 trip point so on reaching a trip point a specific
> > state is called in set_cur_state. But PASSIVE basically
> > increments/decrements the state depending on thermal trend.
> 
> Sorry, I do not understand.
> for active trip points, we can either use trend or not.
> say, we can always set the trend to HEATING, in our .get_trend()
> callback if we always want to spin up the fan when the temperature is
> higher than the trip point.
> But if the temperature high but it is dropping, and we want to spin down
> the fan a little to save more power, we can let .get_trend return
> COOLING at this time.
> what do you think?
> 
> BTW, what I'd like to know is that, with all this gaps solved, is it
> much easier and clearer for the generic cpu cooling APIs and the exynos
> platforms to follow the generic thermal layer? Say, two passive trip
> points for exynos platforms, and certain processor P-states bind to each
> trip point?
> 
> thanks,
> rui
> 
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-pm] [RFC] the generic thermal layer enhancement
  2012-05-31  3:32         ` [linux-pm] " Zhang Rui
@ 2012-05-31 11:06           ` Eduardo Valentin
  2012-05-31 11:14             ` R, Durgadoss
  0 siblings, 1 reply; 22+ messages in thread
From: Eduardo Valentin @ 2012-05-31 11:06 UTC (permalink / raw)
  To: Zhang Rui
  Cc: eduardo.valentin, R, Durgadoss, Matthew Garrett, Brown, Len,
	amit.kachhap@linaro.org, jean.pihet, Jean Delvare,
	linux-acpi@vger.kernel.org, linux-pm

hello,

On Thu, May 31, 2012 at 11:32:41AM +0800, Zhang Rui wrote:
> On 三, 2012-05-30 at 14:17 +0300, Eduardo Valentin wrote:
> > Hello Durga,
> > 
> > On Wed, May 30, 2012 at 11:05:18AM +0000, R, Durgadoss wrote:
> > > Hi Eduardo,
> > > 
> > > > 
> > > > For G1+G2, I agree with your proposal. I had some discussion with Amit
> > > > regarding this. In his series of patches we increase / decrease the cooling
> > > > device state linearly and steadily.
> > > > 
> > > > But if we would have what you are saying, we could bind cooling device
> > > > set of states with trip points.
> > > 
> > > True, We want to bind the levels of cooling with the trips points a thermal zone has.
> > > But we might not get a 1-1 mapping always.
> > 
> > Just to make sure we are all taking the same thing.
> > 
> > In this case a cooling device would have 1-N states. And this set could
> > be partitioned and each partition would be assigned to a specific trip point
> > of a thermal zone, right?
> > 
> yep.
> BTW, Overlaps should be possible and we should handle this as well.
> > 
> > 
> > > 
> > > > 
> > > > I fully support this option and could cook up something on this.
> > > > The TC1 and TC2 should go inside the .get_trend() callbacks for ACPI.
> > > > Should probably go away from the registration function that we have
> > > > currently.
> > > 
> > > I realize I just said the same thing :-)
> > 
> > Cool :-)
> > 
> > > 
> > > > 
> > > > We could have generic trending computation though. Based on timestamping
> > > > and temperature reads, and make it available for zones that want to used it.
> > > 
> > > Agree, but I would like this go into the platform thermal drivers. And then when
> > > those drivers notify the framework they can specify the trend also. This sort of
> > > notification is not there, but that is what I am implementing these days..
> > > Hope to submit this patch in a week's time..
> > 
> > Nice, I actually have something being cooked for the same thing. We should probably
> > align to avoid work duplication...
> > 
> Hah, seems a lot of work is in progress in this area.
> 
> > > 
> > > > > >     case THERMAL_TRIP_ACTIVE:
> > > > > >     case THERMAL_TRIP_PASSIVE:
> > > > > >          ...
> > > > > >          tz->ops->get_trend();
> > > > 
> > > > Would the get_trend take into account if we are cooling with active or passive
> > > > cooling device?
> > > 
> > > To me, it does not matter. It is up to the framework to decide and throttle,
> > > the respective cooling devices according to the trend.
> > 
> > OK. For me it doesn't really matter as well. Having a simplified zone update is better.
> > 
> > > 
> > > > 
> > > > > >          if (trend == HEATING)
> > > > > >                  cdev->ops->set_cur_state(cdev, cur_state++);
> > > > > >          else if (trend == COOLING)
> > > > > >                  cdev->ops->set_cur_state(cdev, cur_state--);
> > > > > >          break;
> > > > 
> > > > I believe we should have something for temperature stabilization there as well.
> > > > 
> > > > Besides, if we go with this generic policy, then the zone update would be much
> > > > simpler no?
> > > 
> > > Yes, and that’s what we want too  :-)
> > 
> > Nice!
> > 
> > > 
> > > > Here are some other thoughts:
> > > > G6. Another point is, would it make sense to allow for policy extension? Meaning,
> > > > the zone update would call a callback to request for update from the zone
> > > > device driver?
> > > > 
> > > > G7. How do we solve cooling devices being shared between different thermal
> > > > zones?
> > > > Should we have a better cooling device constraint management?
> > > 
> > > This is another thing that was haunting me for quite some time.
> > > And What I have in mind is a mapping kind of thing in the platform layer,
> > > that will provide details about which cooling device is shared with whom.
> > > The framework can then use this and figure out the association among various devices.
> > > I am testing it out, and will submit once it comes to a good shape.
> > 
> > Right, I am not sure we want to go in this direction?
> > 
> > Maybe a better way would be to have sort of pm/thermal contraint framework, which
> > would map these per device, at LDM level?
> > 
> > I am copying Jean-Pihet, he has been working in this front. Jean, any thoughts?
> > 
> Durga and I are investigating how to introduce some concepts like
> "influence/weight" to generic thermal layer. :)

What do you mean here? Describing the cooling devices effectiveness on each zone and
derive algorithms to act accordingly?

> 
> thanks,
> rui
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] the generic thermal layer enhancement
  2012-05-31  5:16     ` Amit Kachhap
  2012-05-31  6:13       ` Zhang Rui
@ 2012-05-31 11:13       ` Eduardo Valentin
  1 sibling, 0 replies; 22+ messages in thread
From: Eduardo Valentin @ 2012-05-31 11:13 UTC (permalink / raw)
  To: Amit Kachhap
  Cc: Jean Delvare, Brown, Len, linux-acpi@vger.kernel.org, linux-pm

Hello,

On Thu, May 31, 2012 at 01:16:37PM +0800, Amit Kachhap wrote:
> On 30 May 2012 18:30, Eduardo Valentin <eduardo.valentin@ti.com> wrote:
> > Hello Rui,
> >
> > Now I copied Amit, for real :-)
> >
> > I like your proposal, some comments as follow.
> >
> > On Wed, May 30, 2012 at 04:51:00PM +0800, Zhang Rui wrote:
> >> On 三, 2012-05-30 at 16:49 +0800, Zhang Rui wrote:
> >> > Hi, all,
> >> >
> >> > It is great to see more and more users of the generic thermal layer.
> >> > But as we know, the original design of the generic thermal layer comes
> >> > from ACPI thermal management, and some of its implementation seems to be
> >> > too ACPI specific nowadays.
> >
> >
> > Good. We have also basic OMAP support, on top of Amit's work. I sent
> > recently a very basic support.  I will be pushing them, while they evolve.
> >
> >> >
> >> > Recently I'm thinking of enhance the generic thermal layer so that it
> >> > works well for more platforms.
> >
> > As you said, for non-ACPI support, the "generic" layer needs some
> > extension and refactoring to be more generic :-).
> >
> >> >
> >> > Below are some thoughts of mine, after reading the patches from Amit
> >> > Daniel Kachhap, and ACPI 3.0 thermal model. Actually, I have started
> >> > coding some RFC patches. But I do really want to get feedback from you
> >> > before going on.
> >
> > OK.
> >
> >> >
> >> > G1. supporting multiple cooling states for active cooling devices.
> >> >
> >> >     The current active cooling device supports two cooling states only,
> >> >     please refer to the code below, in driver/thermal/thermal_sys.c
> >> >                 case THERMAL_TRIP_ACTIVE:
> >> >                         ...
> >> >                         if (temp >= trip_temp)
> >> >                                 cdev->ops->set_cur_state(cdev, 1);
> >> >                         else
> >> >                                 cdev->ops->set_cur_state(cdev, 0);
> >> >                         break;
> >> >
> >> >     This is an ACPI specific thing, as our ACPI FAN used to support
> >> >     ON/OFF only.
> >> >     I think it is reasonable to support multiple active cooling states
> >> >     as they are common on many platforms, and note that this is also
> >> >     true for ACPI 3.0 FAN device (_FPS).
> >> >
> >> > G2. introduce cooling states range for a certain trip point
> >> >
> >> >     This problem comes with the first one.
> >> >     If the cooling devices support multiple cooling states, and surely
> >> >     we may want only several cooling states for a certain trip point,
> >> >     and other cooling states for other active trip points.
> >> >     To do this, we should be able to describe the cooling device
> >> >     behavior for a certain trip point, rather than for the entire
> >> >     thermal zone.
> >
> > For G1+G2, I agree with your proposal. I had some discussion with Amit
> > regarding this. In his series of patches we increase / decrease the cooling
> > device state linearly and steadily.
> >
> > But if we would have what you are saying, we could bind cooling device
> > set of states with trip points.
> >
> >> >
> >> > G3. kernel thermal passive cooling algorithm
> >> >
> >> >     Currently, tc1 and tc2 are hard requirements for kernel passive
> >> >     cooling. But non-ACPI platforms do not have this information
> >> >     (please correct me if I'm wrong).
> >> >     Say, for the patches here
> >> >     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> >>
> >> Sorry, forgot to cc Amit, the author of this patch set.
> >>
> >> thanks,
> >> rui
> >> >     They just want to slow down the processor when current temperature
> >> >     is higher than the trip point and speed up the processor when the
> >> >     temperature is lower than the trip point.
> >> >
> >> >     According to Matthew, the platform drivers are responsible to
> >> >     provide proper tc1 and tc2 values to use kernel passive cooling.
> >> >     But I'm just wondering if we can use something instead.
> >> >     Say, introduce .get_trend() in thermal_zone_device_ops.
> >> >     And we set cur_state++ or cur_state-- based on the value returned
> >> >     by .get_trend(), instead of using tc1 and tc2.
> >
> > I fully support this option and could cook up something on this.
> > The TC1 and TC2 should go inside the .get_trend() callbacks for ACPI.
> > Should probably go away from the registration function that we have
> > currently.
> >
> > We could have generic trending computation though. Based on timestamping
> > and temperature reads, and make it available for zones that want to used it.
> >
> >> >
> >> > G4. Multiple passive trip points
> >> >
> >> >     I get this idea also from the patches at
> >> >     http://marc.info/?l=linux-acpi&m=133681581305341&w=2
> >> >
> >> >     IMO, they want to get an acceptable performance at a tolerable
> >> >     temperature.
> >> >     Say, a platform with four P-states. P3 is really low.
> >> >     And I'm okay with the temperature at 60C, but 80C? No.
> >> >     With G2 resolved, we can use processor P0~P2 for Passive trip point
> >> >     0 (50C), and P3 for Passive trip point 1 (70C). And then the
> >> >     temperature may be jumping at around 60C or even 65C, without
> >> >     entering P3.
> >
> > Yeah, I guess we need to solve G1+G2 first to allow this. But I also agree
> > that ideally, there should be possibility to have multiple passive trip points.
> >
> >> >
> >> >     Further more, IMO, this also works for ACPI platforms.
> >> >     Say, we can easily change p-state to cool the system, but using
> >> >     t-state is definitely what we do not want to see. The current
> >> >     implementation does not expose this difference to the generic
> >> >     thermal layer, but if we can have two passive trip points, and use
> >> >     p-state for the first one only... (this works if we start polling
> >> >     after entering passive cooling mode, without hardware notification)
> >> >
> >> > G5. unify active cooling and passive cooling code
> >> >
> >> >     If G4 and G5 are resolved, a new problem to me is that there is no
> >> >     difference between passive cooling and active cooling except the
> >> >     cooling policy.
> >
> > OK...
> >
> >> >     Then we can share the same code for both active and passive cooling.
> >> >     maybe something like:
> >> >
> >> >     case THERMAL_TRIP_ACTIVE:
> >> >     case THERMAL_TRIP_PASSIVE:
> >> >          ...
> >> >          tz->ops->get_trend();
> >
> > Would the get_trend take into account if we are cooling with active or passive
> > cooling device?
> >
> >> >          if (trend == HEATING)
> >> >                  cdev->ops->set_cur_state(cdev, cur_state++);
> >> >          else if (trend == COOLING)
> >> >                  cdev->ops->set_cur_state(cdev, cur_state--);
> >> >          break;
> >
> > I believe we should have something for temperature stabilization there as well.
> I also agree that thermal stablization is important. I have observed
> that too much state change is happening around the trip point. But yes
>  the trend callback may take care of this and set the COOLING trend
> different from HEATING trend.
> >
> > Besides, if we go with this generic policy, then the zone update would be much
> > simpler no?
> >
> >> >
> >> > Here are the gaps in my point of view, I'd like to get your ideas about
> >> > which are reasonable and which are not.
> >
> > Here are some other thoughts:
> > G6. Another point is, would it make sense to allow for policy extension? Meaning,
> > the zone update would call a callback to request for update from the zone
> > device driver?
> This may a simple work with adding notifiers like done for CRITICAL,
> HOT trip type.
> >
> > G7. How do we solve cooling devices being shared between different thermal zones?
> > Should we have a better cooling device constraint management?
> >
> > G8. On same topic as G7, how are we currently making sure that thermal constraints
> > don't get overwritten by, let's say, userspace APIs? I guess the generic CPU cooling
> > propose by Amit suffers of an issue. If user sets cpufreq governor to userspace
> > and sets the frequency to its maximum, say in a busy loop, the thermal cooling
> > could potentially be ruined.
> Yes I agree that is a problem in my implementation but I guess the
> current cpufreq framework does not have anything to stop this. May be
> with cpufreq_max pmqos patches this will be taken care.

Just a clarification here. I didn't really mean that this is a problem on your implementation.
But a general problem. And needs to be dealt properly.

And to be frank, assuming that just because we selected userspace at cpufreq level it is
userland problem, it is a weak solution. Specially considering that this may lead to
potentially harmful situations.

Not sure this needs to be solved at the generic thermal framework, but at least
the thermal framework should be aware of who is dealing with the constraint management.

The pmqos patch is a good direction to go, IMO.

> >
> > G9. Is there any possibility to have multiple sensors per thermal zone?
> >
> > G10. Do we want to represent other sensing stimuli other that temperature? Say,
> > current sensing?
> Yes this is good field to look into.
> >
> > G11. Do we want to allow for cross zoning policies? Sometimes a policy may use
> > temperature from different thermal zone in order to better represent what
> > is going on in its thermal zone.
> >
> >> >
> >> > Any comments are appreciated! Thanks!
> >
> >
> > Thanks to you for starting this up! The above are points that come to my mind now.
> > I will keep updating the list if something else come to my mind.
> >
> >> >
> >> > -rui
> >> >
> >> > _______________________________________________
> >> > linux-pm mailing list
> >> > linux-pm@lists.linux-foundation.org
> >> > https://lists.linuxfoundation.org/mailman/listinfo/linux-pm
> >>
> >>
> >> _______________________________________________
> >> linux-pm mailing list
> >> linux-pm@lists.linux-foundation.org
> >> https://lists.linuxfoundation.org/mailman/listinfo/linux-pm
> >
> > All best,
> >
> > ---
> > Eduardo Valentin
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC] the generic thermal layer enhancement
  2012-05-31 11:06           ` Eduardo Valentin
@ 2012-05-31 11:14             ` R, Durgadoss
  0 siblings, 0 replies; 22+ messages in thread
From: R, Durgadoss @ 2012-05-31 11:14 UTC (permalink / raw)
  To: eduardo.valentin@ti.com, Zhang, Rui
  Cc: Jean Delvare, Brown, Len, linux-acpi@vger.kernel.org, linux-pm

Hi Eduardo,

[A big cut]

> > >
> > > Maybe a better way would be to have sort of pm/thermal contraint
> framework, which
> > > would map these per device, at LDM level?
> > >
> > > I am copying Jean-Pihet, he has been working in this front. Jean, any
> thoughts?
> > >
> > Durga and I are investigating how to introduce some concepts like
> > "influence/weight" to generic thermal layer. :)
> 
> What do you mean here? Describing the cooling devices effectiveness on each
> zone and
> derive algorithms to act accordingly?

Yes. something like that..

Thanks,
Durga

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2012-05-31 11:14 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-05-30  8:49 [RFC] the generic thermal layer enhancement Zhang Rui
2012-05-30  8:51 ` [linux-pm] " Zhang Rui
2012-05-30 10:30   ` Eduardo Valentin
2012-05-30 11:05     ` R, Durgadoss
2012-05-30 11:17       ` Eduardo Valentin
2012-05-31  3:32         ` [linux-pm] " Zhang Rui
2012-05-31 11:06           ` Eduardo Valentin
2012-05-31 11:14             ` R, Durgadoss
2012-05-31  3:27       ` Zhang Rui
2012-05-31  2:20     ` [linux-pm] " Zhang Rui
2012-05-31  5:16     ` Amit Kachhap
2012-05-31  6:13       ` Zhang Rui
2012-05-31 11:13       ` Eduardo Valentin
2012-05-30 10:44 ` [linux-pm] " R, Durgadoss
2012-05-31  3:15   ` Zhang Rui
2012-05-30 12:50 ` Matthew Garrett
2012-05-31  3:54   ` Zhang Rui
2012-05-31  3:58     ` Matthew Garrett
2012-05-31  5:54       ` Zhang Rui
2012-05-31  4:59 ` Amit Kachhap
2012-05-31  6:09   ` Zhang Rui
2012-05-31 10:59     ` Eduardo Valentin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox