linux-pm.vger.kernel.org archive mirror
* Re: [PATCH 0/4] thermal threshold and notification v2.0
       [not found] <1365541282-19366-1-git-send-email-srinivas.pandruvada@linux.intel.com>
@ 2013-04-13  1:32 ` Guenter Roeck
  2013-04-15 15:20   ` Srinivas Pandruvada
  0 siblings, 1 reply; 7+ messages in thread
From: Guenter Roeck @ 2013-04-13  1:32 UTC
  To: Srinivas Pandruvada, Zhang Rui
  Cc: lm-sensors, fenghua.yu, tony.luck, bp, linux-pm

On Tue, Apr 09, 2013 at 02:01:18PM -0700, Srinivas Pandruvada wrote:
> 
> v2.0
> As suggested by Guenter Roeck, this version uses the previous development in
> this area as its starting point. The first patch is the same as what Guenter
> Roeck submitted before, except for a fix for the checkpatch error on strtoul.
> With this patch, the following additional coretemp sysfs entries are added:
> tempX_threshold1 - Reflects value of CPU thermal threshold T0.
> tempX_threshold1_triggered
> 	         - Reflects status of CPU thermal status register bit 6
> 		   (THERM_STATUS_THRESHOLD0).
> tempX_threshold2 - Reflects value of CPU thermal threshold T1.
> tempX_threshold2_triggered
> 	         - Reflects status of CPU thermal status register bit 8
> 		   (THERM_STATUS_THRESHOLD1).
> 
> 
> The notification mechanism is implemented at package level using uevents.
> A debugfs interface is also added to check the count of interrupts and of
> worker function scheduling.
> 
> 
> v1.0
> 
> It is clear that there was reluctance to add thresholds to coretemp sysfs
> during previous attempts, probably because of a lack of use cases.
> This time the use case may be more compelling.
> 
> There are many small form factor devices on the market, such as ultrabooks and
> slate PCs. Unfortunately, these devices reach their maximum temperature under
> relatively light workloads, causing the BIOS to apply thermal throttling.
> Aggressive BIOS action to control thermals causes real performance issues, and
> in some cases there is thermal breakdown.
> 
> Even the most expensive laptops don't have a correct ACPI thermal configuration
> that would let the kernel thermal driver act. In some cases the trip point is
> even higher than the critical temperature setting.
> 
> Intel has developed several drivers that can be used to cool the system very
> efficiently. They include the RAPL based cooling driver, the Powerclamp driver
> and the P state driver. To utilize these cooling devices, a closed loop user
> mode program is required that uses these methods to dynamically compensate for
> high CPU temperatures without relying on any configuration data.
> One such solution is the "Linux thermal daemon". More details can be
> obtained from
> "https://github.com/01org/thermal_daemon/blob/master/ThermalDaemon_Introduction.pdf".
> This daemon polls the CPU temperature and applies compensation once the CPU
> reaches the target temperature.
> 
> This polling can mostly be avoided by getting a notification at the temperature
> where the daemon needs to wake up and get ready to apply compensation. In most
> normal use cases there may not be any threshold events, so the number of user
> space notifications for thermal thresholds is minimal.
> 
> Notifications are added only for package level thresholds, to minimize events.
> Also, interrupts are enabled only when a value other than tj_max (the default)
> is written to the thresholds.
> 
> Once a threshold is violated, a rate control of 5 seconds is applied, reducing
> the number of interrupts while the temperature is hanging around the trip point.
> Using the sticky log bit, a kobject uevent change notification is sent for the
> corresponding package sysfs. Once the thermal daemon receives the notification,
> it can change to a new threshold or act immediately to reduce CPU temperature.
> 
> Guenter Roeck (1):
>   hwmon: (coretemp) Add support for thermal threshold attributes
> 
> Srinivas Pandruvada (3):
>   x86, mcheck, therm_throt: Process package thresholds
>   hwmon: (coretemp) : Add notification support
>   hwmon: (coretemp) : Add debugfs to support thresholds
> 
>  Documentation/hwmon/coretemp             |   8 +
>  arch/x86/include/asm/mce.h               |   7 +
>  arch/x86/kernel/cpu/mcheck/therm_throt.c |  63 ++++++-
>  drivers/hwmon/coretemp.c                 | 292 +++++++++++++++++++++++++++++--
>  4 files changed, 356 insertions(+), 14 deletions(-)
> 
Rui,

can you have a look at this series?

I would like to get some feedback from thermal subsystem supporters on whether
hwmon is really the right place for this. I may be wrong, but it seems to me it
would fit better into thermal.

Thanks,
Guenter
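
For readers following the cover letter quoted above: each _triggered attribute
reflects a single bit of the thermal status register, and notifications are
spaced by a 5 second rate control. The following is a minimal sketch in plain C,
assuming the bit positions stated in the cover letter. The helper names and the
rate-control function are hypothetical and are not taken from the patches.

```c
#include <stdint.h>

/* Bit positions as stated in the cover letter (bit 6 and bit 8). */
#define THERM_STATUS_THRESHOLD0 (1u << 6)
#define THERM_STATUS_THRESHOLD1 (1u << 8)

/* Assumed from the cover letter: at most one notification per 5 seconds. */
#define NOTIFY_INTERVAL_S 5

/* Hypothetical helper: value shown by tempX_threshold1_triggered (0 or 1). */
static int threshold1_triggered(uint32_t therm_status)
{
    return (therm_status & THERM_STATUS_THRESHOLD0) ? 1 : 0;
}

/* Hypothetical helper: value shown by tempX_threshold2_triggered (0 or 1). */
static int threshold2_triggered(uint32_t therm_status)
{
    return (therm_status & THERM_STATUS_THRESHOLD1) ? 1 : 0;
}

/*
 * Hypothetical rate control: returns 1 and records the time if a
 * notification may be sent now, 0 if the previous one was less than
 * NOTIFY_INTERVAL_S seconds ago. *last_s < 0 means "never notified".
 */
static int notify_allowed(long now_s, long *last_s)
{
    if (*last_s >= 0 && now_s - *last_s < NOTIFY_INTERVAL_S)
        return 0;
    *last_s = now_s;
    return 1;
}
```

How the patches actually implement the rate control is not shown in this
thread; the function above only illustrates the stated 5 second spacing.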


* Re: [PATCH 0/4] thermal threshold and notification v2.0
  2013-04-13  1:32 ` [PATCH 0/4] thermal threshold and notification v2.0 Guenter Roeck
@ 2013-04-15 15:20   ` Srinivas Pandruvada
  2013-04-15 17:53     ` [lm-sensors] " R, Durgadoss
  2013-04-16  4:52     ` Zhang Rui
  0 siblings, 2 replies; 7+ messages in thread
From: Srinivas Pandruvada @ 2013-04-15 15:20 UTC
  To: Guenter Roeck; +Cc: Zhang Rui, lm-sensors, fenghua.yu, tony.luck, bp, linux-pm

On 04/12/2013 06:32 PM, Guenter Roeck wrote:
> Rui,
>
> can you have a look at this series ?
>
> I would like to get some feedback from thermal subsystem supporters if hwmon
> is really the right place for this. I may be wrong, but it seems to me it would
> better fit into thermal.
>
> Thanks,
> Guenter

I am fine with using thermal zones, but the coretemp data will then be
duplicated in both coretemp and thermal sysfs, with a lot of code duplication.
Also, the trip point in this case is not for activating any cooling device, but
just for notifying user space. So this will be a zone with no associated cdevs.
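
As an illustration of the user space side: a daemon listening for kobject
uevents receives a payload made of NUL-terminated KEY=VALUE strings. The sketch
below only shows how such a payload could be searched for a key. The function
name uevent_get is hypothetical, and this thread does not show which keys or
device path the coretemp uevent would actually carry.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up `key` in a uevent payload: a sequence of NUL-terminated
 * "KEY=VALUE" strings packed into buf[0..len). Returns a pointer to the
 * NUL-terminated value inside buf, or NULL if the key is absent.
 */
static const char *uevent_get(const char *buf, size_t len, const char *key)
{
    size_t klen = strlen(key);
    size_t pos = 0;

    while (pos < len) {
        const char *field = buf + pos;
        const char *end = memchr(field, '\0', len - pos);
        size_t flen;

        if (end == NULL)
            break;              /* ignore an unterminated trailing field */
        flen = (size_t)(end - field);

        /* Match only a full key followed directly by '='. */
        if (flen > klen && field[klen] == '=' &&
            strncmp(field, key, klen) == 0)
            return field + klen + 1;

        pos += flen + 1;        /* skip the field and its terminating NUL */
    }
    return NULL;
}
```

For a change event, uevent_get(buf, len, "ACTION") would return "change", and
the daemon could then decide whether to set a new threshold or start cooling.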

> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



* RE: [lm-sensors] [PATCH 0/4] thermal threshold and notification v2.0
  2013-04-15 15:20   ` Srinivas Pandruvada
@ 2013-04-15 17:53     ` R, Durgadoss
  2013-04-16  4:57       ` Zhang Rui
  2013-04-16  4:52     ` Zhang Rui
  1 sibling, 1 reply; 7+ messages in thread
From: R, Durgadoss @ 2013-04-15 17:53 UTC
  To: Srinivas Pandruvada, Guenter Roeck
  Cc: Yu, Fenghua, Luck, Tony, linux-pm@vger.kernel.org,
	lm-sensors@lm-sensors.org, bp@alien8.de, Zhang, Rui

> -----Original Message-----
> From: lm-sensors-bounces@lm-sensors.org [mailto:lm-sensors-bounces@lm-sensors.org] On Behalf Of Srinivas Pandruvada
> Sent: Monday, April 15, 2013 8:51 PM
> To: Guenter Roeck
> Cc: Yu, Fenghua; Luck, Tony; linux-pm@vger.kernel.org; lm-sensors@lm-sensors.org; bp@alien8.de; Zhang, Rui
> Subject: Re: [lm-sensors] [PATCH 0/4] thermal threshold and notification v2.0
> 
> On 04/12/2013 06:32 PM, Guenter Roeck wrote:
> > Rui,
> >
> > can you have a look at this series ?
> >
> > I would like to get some feedback from thermal subsystem supporters if hwmon
> > is really the right place for this. I may be wrong, but it seems to me it would
> > better fit into thermal.
> >
> > Thanks,
> > Guenter
> 
> I am fine using thermal zones, but the coretemp will be duplicated in
> both coretemp and thermal sysfs and lot of code duplication. Also trip
> point in this case is not for activating any cooling device, but just to
> notify user space. So this will be a zone with no associated cdevs.

Yes, this was the idea we discussed on lm-sensors a few months ago
[I could not locate the thread on the web], except that we will register as a
'thermal sensor' and not as a 'thermal zone', because of the recent changes
to the thermal framework [1].

This way, we can expose trip points and configure them without needing to
associate any cdevs.

[1] https://lkml.org/lkml/2013/2/5/228

Thanks,
Durga
> _______________________________________________
> lm-sensors mailing list
> lm-sensors@lm-sensors.org
> http://lists.lm-sensors.org/mailman/listinfo/lm-sensors


* Re: [PATCH 0/4] thermal threshold and notification v2.0
  2013-04-15 15:20   ` Srinivas Pandruvada
  2013-04-15 17:53     ` [lm-sensors] " R, Durgadoss
@ 2013-04-16  4:52     ` Zhang Rui
  2013-04-16  4:55       ` Guenter Roeck
  1 sibling, 1 reply; 7+ messages in thread
From: Zhang Rui @ 2013-04-16  4:52 UTC
  To: Srinivas Pandruvada
  Cc: Guenter Roeck, lm-sensors, fenghua.yu, tony.luck, bp, linux-pm

On Mon, 2013-04-15 at 08:20 -0700, Srinivas Pandruvada wrote:
> On 04/12/2013 06:32 PM, Guenter Roeck wrote:
> > Rui,
> >
> > can you have a look at this series ?
> >
> > I would like to get some feedback from thermal subsystem supporters if hwmon
> > is really the right place for this. I may be wrong, but it seems to me it would
> > better fit into thermal.
> >
> > Thanks,
> > Guenter
> 
> I am fine using thermal zones, but the coretemp will be duplicated in 
> both coretemp and thermal sysfs and lot of code duplication.

Is it possible to introduce code in the coretemp driver to register with the
thermal subsystem?

>  Also trip 
> point in this case is not for activating any cooling device, but just to 
> notify user space. So this will be a zone with no associated cdevs.
> 
Trip points for thermal zones do not have to be bound to cdevs.
If you bind the p-state driver, rapl, intel_powerclamp and t-state
driver to this thermal zone, all the thermal management work can be done
in the kernel.
But if you do not bind any cooling devices to trip points, you can use the
"userspace" thermal governor and provide your own
thermal_zone_device_ops->notify() callback.
In this case, .notify() will be called instead, and you can do whatever
you want in that callback.

thanks,
rui
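
The dispatch Rui describes can be pictured with a toy model in plain C. The
types below are simplified stand-ins, not the kernel's struct
thermal_zone_device or thermal_zone_device_ops, and the real callback signature
may differ. The sketch only shows a zone with no cooling devices calling a
driver-supplied notify hook when a trip point is reached.

```c
#include <stddef.h>

struct zone;

/* Stand-in for the ops table; only the notify hook is modelled. */
struct zone_ops {
    int (*notify)(struct zone *z, int trip, int temp_mc);
};

/* Stand-in for a thermal zone with two trip points and no cdevs. */
struct zone {
    const struct zone_ops *ops;
    int trip_temp_mc[2];    /* trip temperatures in millidegrees Celsius */
    int last_trip;          /* written by the example callback below */
};

/* Example callback: a driver could notify user space here instead. */
static int record_trip(struct zone *z, int trip, int temp_mc)
{
    (void)temp_mc;
    z->last_trip = trip;
    return 0;
}

/*
 * Model of the core's update path: for every trip point at or below the
 * current temperature, call the notify hook if one is provided.
 * Returns the number of times the hook was called.
 */
static int zone_update(struct zone *z, int temp_mc)
{
    int calls = 0;

    for (int i = 0; i < 2; i++) {
        if (temp_mc >= z->trip_temp_mc[i] && z->ops && z->ops->notify) {
            z->ops->notify(z, i, temp_mc);
            calls++;
        }
    }
    return calls;
}
```

In this model the zone has nothing to throttle; all policy lives in whatever
the callback chooses to do, which matches the "zone with no associated cdevs"
case raised earlier in the thread.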



* Re: [PATCH 0/4] thermal threshold and notification v2.0
  2013-04-16  4:52     ` Zhang Rui
@ 2013-04-16  4:55       ` Guenter Roeck
  0 siblings, 0 replies; 7+ messages in thread
From: Guenter Roeck @ 2013-04-16  4:55 UTC
  To: Zhang Rui
  Cc: Srinivas Pandruvada, lm-sensors, fenghua.yu, tony.luck, bp,
	linux-pm

On Tue, Apr 16, 2013 at 12:52:43PM +0800, Zhang Rui wrote:
> On Mon, 2013-04-15 at 08:20 -0700, Srinivas Pandruvada wrote:
> > On 04/12/2013 06:32 PM, Guenter Roeck wrote:
> > > Rui,
> > >
> > > can you have a look at this series ?
> > >
> > > I would like to get some feedback from thermal subsystem supporters if hwmon
> > > is really the right place for this. I may be wrong, but it seems to me it would
> > > better fit into thermal.
> > >
> > > Thanks,
> > > Guenter
> > 
> > I am fine using thermal zones, but the coretemp will be duplicated in 
> > both coretemp and thermal sysfs and lot of code duplication.
> 
> is it possible to introduce code in coretemp driver to register with
> thermal subsystem?
> 
Should be possible. As mentioned in my other mail, I think we
should find some automated way to do that (i.e. in the infrastructure),
but that is something we can deal with later on.

Guenter



* RE: [lm-sensors] [PATCH 0/4] thermal threshold and notification v2.0
  2013-04-15 17:53     ` [lm-sensors] " R, Durgadoss
@ 2013-04-16  4:57       ` Zhang Rui
  2013-04-16  5:12         ` R, Durgadoss
  0 siblings, 1 reply; 7+ messages in thread
From: Zhang Rui @ 2013-04-16  4:57 UTC
  To: R, Durgadoss
  Cc: Srinivas Pandruvada, Guenter Roeck, Yu, Fenghua, Luck, Tony,
	linux-pm@vger.kernel.org, lm-sensors@lm-sensors.org, bp@alien8.de

On Mon, 2013-04-15 at 11:53 -0600, R, Durgadoss wrote:
> > -----Original Message-----
> > From: lm-sensors-bounces@lm-sensors.org [mailto:lm-sensors-bounces@lm-sensors.org] On Behalf Of Srinivas Pandruvada
> > Sent: Monday, April 15, 2013 8:51 PM
> > To: Guenter Roeck
> > Cc: Yu, Fenghua; Luck, Tony; linux-pm@vger.kernel.org; lm-sensors@lm-sensors.org; bp@alien8.de; Zhang, Rui
> > Subject: Re: [lm-sensors] [PATCH 0/4] thermal threshold and notification v2.0
> > 
> > On 04/12/2013 06:32 PM, Guenter Roeck wrote:
> > > On Tue, Apr 09, 2013 at 02:01:18PM -0700, Srinivas Pandruvada wrote:
> > >> v2.0
> > >> As suggested by Guenter Roeck, used the previous development in this
> > area
> > >> as starting point. The first patch is same as what Guenter Roeck submitted
> > >> before except for checkpatch error for strtoul.As per this patch, the
> > following
> > >> additional coretemp sysfs entries will be added:
> > >> tempX_threshold1 - Reflects value of CPU thermal threshold T0.
> > >> tempX_threshold1_triggered
> > >> 	         - Reflects status of CPU thermal status register bit 6
> > >> 		   (THERM_STATUS_THRESHOLD0).
> > >> tempX_threshold2 - Reflects value of CPU thermal threshold T1.
> > >> tempX_threshold2_triggered
> > >> 	         - Reflects status of CPU thermal status register bit 8
> > >> 		   (THERM_STATUS_THRESHOLD1).
> > >>
> > >>
> > >> The notification mechanism is implemented for package level by using
> > uevent.
> > >> Also a debugfs interface is added to check count of interrupts and worker
> > fn
> > >> scheduling.
> > >>
> > >>
> > >> v1.0
> > >>
> > >> This is clear that there is reluctance in adding thresholds in coretemp
> > sysfs,
> > >> during previous attempts. Proably because of lake of use cases.
> > >> But this time use case may be more compelling.
> > >>
> > >> We have many small form factor devices like ultrabooks, slate PCs in the
> > market.
> > >> Unfortunately these devices reach maximum temperature with relatively
> > less
> > >> workloads, causing BIOS to do thermal throttling. There are real
> > performance
> > >> issues due to aggressive BIOS action to control thermals and also thermal
> > breakdown
> > >> in some cases.
> > >>
> > >> Even the most expensive laptops, don't have correct ACPI thermal
> > configuration,
> > >> so that kernel thermal driver can act. In some case even the trip point is
> > higher
> > >> than critical temperature setting.
> > >>
> > >> Intel has developed several drivers, which can be used to cool the system very efficiently.
> > >> They include RAPL based cooling driver, Powerclamp driver and P state driver.
> > >> To utilize these cooling device a closed loop user mode program is required, which
> > >> will utilize these method and dynamically compensate for high CPU temperatures,
> > >> without relying on any configuration data.
> > >> One such solution is developed is "Linux thermal daemon". More details can be
> > >> obtained from
> > >> "https://github.com/01org/thermal_daemon/blob/master/ThermalDaemon_Introduction.pdf".
> > >> This daemon polls for cpu temperature and apply compensation once the CPU reach target
> > >> temperature.
> > >>
> > >> This polling can be mostly avoided, by getting notification for the temperature, where
> > >> it needs to wake up and get ready for apply compensation. In most of the normal use
> > >> cases, there may not be any threshold events. So very minimal number of user space
> > >> notification for thermal thresholds.
> > >>
> > >> Notification are added only for package level thresholds, to minimize events. Also
> > >> interrupts are enabled only when a non tj_max(default) value is written to thresholds.
> > >>
> > >> Once thresholds are violated, it uses a rate control of 5 seconds, reducing the number
> > >> of interrupts, when temperature is hanging around trip point. Using the sticky log bit,
> > >> it sends kobject uevent change notification for corresponding package sysfs.
> > >> Once the thermal daemon receives notification, it can change to new threshold or act
> > >> immediately to reduce CPU temperature.*
> > >>
> > >> Guenter Roeck (1):
> > >>    hwmon: (coretemp) Add support for thermal threshold attributes
> > >>
> > >> Srinivas Pandruvada (3):
> > >>    x86, mcheck, therm_throt: Process package thresholds
> > >>    hwmon: (coretemp) : Add notification support
> > >>    hwmon: (coretemp) : Add debugfs to support thresholds
> > >>
> > >>   Documentation/hwmon/coretemp             |   8 +
> > >>   arch/x86/include/asm/mce.h               |   7 +
> > >>   arch/x86/kernel/cpu/mcheck/therm_throt.c |  63 ++++++-
> > >>   drivers/hwmon/coretemp.c                 | 292 +++++++++++++++++++++++++++++--
> > >>   4 files changed, 356 insertions(+), 14 deletions(-)
> > >>
> > > Rui,
> > >
> > > can you have a look at this series ?
> > >
> > > I would like to get some feedback from thermal subsystem supporters if hwmon
> > > is really the right place for this. I may be wrong, but it seems to me it would
> > > better fit into thermal.
> > >
> > > Thanks,
> > > Guenter
> > 
> > I am fine using thermal zones, but the coretemp will be duplicated in
> > both coretemp and thermal sysfs and lot of code duplication. Also trip
> > point in this case is not for activating any cooling device, but just to
> > notify user space. So this will be a zone with no associated cdevs.
> 
> Yes, this was the idea which we discussed in lm-sensors a few months ago.
> [I could not locate the thread in the web]. Except that we will register as
> 'thermal sensor' and not as 'thermal zones' because of the changes happening
> to the thermal framework recently[1].
> 
> This way, we can expose trip points and configure them, without having a
> need to associate any cdevs.
> 
Agreed.
But note that this is also true in current code as well.

thanks,
rui
> [1] https://lkml.org/lkml/2013/2/5/228

> Thanks,
> Durga
> > 




* RE: [lm-sensors] [PATCH 0/4] thermal threshold and notification v2.0
  2013-04-16  4:57       ` Zhang Rui
@ 2013-04-16  5:12         ` R, Durgadoss
  0 siblings, 0 replies; 7+ messages in thread
From: R, Durgadoss @ 2013-04-16  5:12 UTC (permalink / raw)
  To: Zhang, Rui
  Cc: Srinivas Pandruvada, Guenter Roeck, Yu, Fenghua, Luck, Tony,
	linux-pm@vger.kernel.org, lm-sensors@lm-sensors.org, bp@alien8.de

> -----Original Message-----
> From: Zhang, Rui
> Sent: Tuesday, April 16, 2013 10:28 AM
> To: R, Durgadoss
> Cc: Srinivas Pandruvada; Guenter Roeck; Yu, Fenghua; Luck, Tony;
> linux-pm@vger.kernel.org; lm-sensors@lm-sensors.org; bp@alien8.de
> Subject: RE: [lm-sensors] [PATCH 0/4] thermal threshold and notification v2.0
> 
> On Mon, 2013-04-15 at 11:53 -0600, R, Durgadoss wrote:
> > > -----Original Message-----
> > > From: lm-sensors-bounces@lm-sensors.org
> > > [mailto:lm-sensors-bounces@lm-sensors.org] On Behalf Of Srinivas Pandruvada
> > > Sent: Monday, April 15, 2013 8:51 PM
> > > To: Guenter Roeck
> > > Cc: Yu, Fenghua; Luck, Tony; linux-pm@vger.kernel.org;
> > > lm-sensors@lm-sensors.org; bp@alien8.de; Zhang, Rui
> > > Subject: Re: [lm-sensors] [PATCH 0/4] thermal threshold and notification v2.0
> > >
> > > On 04/12/2013 06:32 PM, Guenter Roeck wrote:
> > > > On Tue, Apr 09, 2013 at 02:01:18PM -0700, Srinivas Pandruvada wrote:
> > > >> v2.0
> > > > >> As suggested by Guenter Roeck, used the previous development in this area
> > > > >> as starting point. The first patch is same as what Guenter Roeck submitted
> > > > >> before except for checkpatch error for strtoul. As per this patch, the following
> > > > >> additional coretemp sysfs entries will be added:
> > > >> tempX_threshold1 - Reflects value of CPU thermal threshold T0.
> > > >> tempX_threshold1_triggered
> > > >> 	         - Reflects status of CPU thermal status register bit 6
> > > >> 		   (THERM_STATUS_THRESHOLD0).
> > > >> tempX_threshold2 - Reflects value of CPU thermal threshold T1.
> > > >> tempX_threshold2_triggered
> > > >> 	         - Reflects status of CPU thermal status register bit 8
> > > >> 		   (THERM_STATUS_THRESHOLD1).
> > > >>
> > > >>
> > > > >> The notification mechanism is implemented for package level by using uevent.
> > > > >> Also a debugfs interface is added to check count of interrupts and worker fn
> > > > >> scheduling.
> > > >>
> > > >>
> > > >> v1.0
> > > >>
> > > > >> This is clear that there is reluctance in adding thresholds in coretemp sysfs,
> > > > >> during previous attempts. Probably because of lack of use cases.
> > > > >> But this time use case may be more compelling.
> > > >>
> > > > >> We have many small form factor devices like ultrabooks, slate PCs in the market.
> > > > >> Unfortunately these devices reach maximum temperature with relatively less
> > > > >> workloads, causing BIOS to do thermal throttling. There are real performance
> > > > >> issues due to aggressive BIOS action to control thermals and also thermal breakdown
> > > > >> in some cases.
> > > >>
> > > > >> Even the most expensive laptops, don't have correct ACPI thermal configuration,
> > > > >> so that kernel thermal driver can act. In some case even the trip point is higher
> > > > >> than critical temperature setting.
> > > >>
> > > > >> Intel has developed several drivers, which can be used to cool the system very efficiently.
> > > > >> They include RAPL based cooling driver, Powerclamp driver and P state driver.
> > > > >> To utilize these cooling device a closed loop user mode program is required, which
> > > > >> will utilize these method and dynamically compensate for high CPU temperatures,
> > > > >> without relying on any configuration data.
> > > > >> One such solution is developed is "Linux thermal daemon". More details can be
> > > > >> obtained from
> > > > >> "https://github.com/01org/thermal_daemon/blob/master/ThermalDaemon_Introduction.pdf".
> > > > >> This daemon polls for cpu temperature and apply compensation once the CPU reach target
> > > > >> temperature.
> > > >>
> > > > >> This polling can be mostly avoided, by getting notification for the temperature, where
> > > > >> it needs to wake up and get ready for apply compensation. In most of the normal use
> > > > >> cases, there may not be any threshold events. So very minimal number of user space
> > > > >> notification for thermal thresholds.
> > > >>
> > > > >> Notification are added only for package level thresholds, to minimize events. Also
> > > > >> interrupts are enabled only when a non tj_max(default) value is written to thresholds.
> > > >>
> > > > >> Once thresholds are violated, it uses a rate control of 5 seconds, reducing the number
> > > > >> of interrupts, when temperature is hanging around trip point. Using the sticky log bit,
> > > > >> it sends kobject uevent change notification for corresponding package sysfs.
> > > > >> Once the thermal daemon receives notification, it can change to new threshold or act
> > > > >> immediately to reduce CPU temperature.*
> > > >>
> > > >> Guenter Roeck (1):
> > > >>    hwmon: (coretemp) Add support for thermal threshold attributes
> > > >>
> > > >> Srinivas Pandruvada (3):
> > > >>    x86, mcheck, therm_throt: Process package thresholds
> > > >>    hwmon: (coretemp) : Add notification support
> > > >>    hwmon: (coretemp) : Add debugfs to support thresholds
> > > >>
> > > >>   Documentation/hwmon/coretemp             |   8 +
> > > >>   arch/x86/include/asm/mce.h               |   7 +
> > > >>   arch/x86/kernel/cpu/mcheck/therm_throt.c |  63 ++++++-
> > > > >>   drivers/hwmon/coretemp.c                 | 292 +++++++++++++++++++++++++++++--
> > > >>   4 files changed, 356 insertions(+), 14 deletions(-)
> > > >>
> > > > Rui,
> > > >
> > > > can you have a look at this series ?
> > > >
> > > > > I would like to get some feedback from thermal subsystem supporters if hwmon
> > > > > is really the right place for this. I may be wrong, but it seems to me it would
> > > > > better fit into thermal.
> > > >
> > > > Thanks,
> > > > Guenter
> > >
> > > I am fine using thermal zones, but the coretemp will be duplicated in
> > > both coretemp and thermal sysfs and lot of code duplication. Also trip
> > > point in this case is not for activating any cooling device, but just to
> > > notify user space. So this will be a zone with no associated cdevs.
> >
> > Yes, this was the idea which we discussed in lm-sensors a few months ago.
> > [I could not locate the thread in the web]. Except that we will register as
> > 'thermal sensor' and not as 'thermal zones' because of the changes happening
> > to the thermal framework recently[1].
> >
> > This way, we can expose trip points and configure them, without having a
> > need to associate any cdevs.
> >
> Agreed.
> But note that this is also true in current code as well.

Yes, I agree.

Thanks,
Durga
> 
> thanks,
> rui
> > [1] https://lkml.org/lkml/2013/2/5/228
> 
> > Thanks,
> > Durga
> > >
> 



Thread overview: 7+ messages
     [not found] <1365541282-19366-1-git-send-email-srinivas.pandruvada@linux.intel.com>
2013-04-13  1:32 ` [PATCH 0/4] thermal threshold and notification v2.0 Guenter Roeck
2013-04-15 15:20   ` Srinivas Pandruvada
2013-04-15 17:53     ` [lm-sensors] " R, Durgadoss
2013-04-16  4:57       ` Zhang Rui
2013-04-16  5:12         ` R, Durgadoss
2013-04-16  4:52     ` Zhang Rui
2013-04-16  4:55       ` Guenter Roeck
