* [lm-sensors] [PATCH] hwmon: (coretemp) Do not return -EAGAIN for low temperatures
@ 2013-12-07 18:05 Guenter Roeck
2013-12-14 16:31 ` Guenter Roeck
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Guenter Roeck @ 2013-12-07 18:05 UTC (permalink / raw)
To: lm-sensors
Some Intel CPUs do not set the 'valid' bit in IA32_THERM_STATUS if the
temperature is too low to be measured. This condition will not change until
the CPU is hot enough for its temperature to be measured. Returning an error
in such conditions is not very useful. Drop checking the valid bit and just
return the reported temperature instead.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
I don't think we ever closed on this. Giving it a shot.
drivers/hwmon/coretemp.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 310ce19..6e0579e 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -177,18 +177,19 @@ static ssize_t show_temp(struct device *dev,
/* Check whether the time interval has elapsed */
if (!tdata->valid || time_after(jiffies, tdata->last_updated + HZ)) {
rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
- tdata->valid = 0;
- /* Check whether the data is valid */
- if (eax & 0x80000000) {
- tdata->temp = tdata->tjmax -
- ((eax >> 16) & 0x7f) * 1000;
- tdata->valid = 1;
- }
+ /*
+ * Ignore the valid bit. In all observed cases the register
+ * value is either low or zero if the valid bit is 0.
+ * Return it instead of reporting an error which doesn't
+ * really help at all.
+ */
+ tdata->temp = tdata->tjmax - ((eax >> 16) & 0x7f) * 1000;
+ tdata->valid = 1;
tdata->last_updated = jiffies;
}
mutex_unlock(&tdata->update_lock);
- return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
+ return sprintf(buf, "%d\n", tdata->temp);
}
struct tjmax_pci {
--
1.7.9.7
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [lm-sensors] [PATCH] hwmon: (coretemp) Do not return -EAGAIN for low temperatures
2013-12-07 18:05 [lm-sensors] [PATCH] hwmon: (coretemp) Do not return -EAGAIN for low temperatures Guenter Roeck
@ 2013-12-14 16:31 ` Guenter Roeck
2013-12-20 15:53 ` Guenter Roeck
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Guenter Roeck @ 2013-12-14 16:31 UTC (permalink / raw)
To: lm-sensors
On 12/07/2013 10:05 AM, Guenter Roeck wrote:
> Some Intel CPUs do not set the 'valid' bit in IA32_THERM_STATUS if the
> temperature is too low to be measured. This condition will not change until
> the CPU is hot enough for its temperature to be measured. Returning an error
> in such conditions is not very useful. Drop checking the valid bit and just
> return the reported temperature instead.
>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
> I don't think we ever closed on this. Giving it a shot.
>
Comments, anyone ?
Guenter
> drivers/hwmon/coretemp.c | 17 +++++++++--------
> 1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 310ce19..6e0579e 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -177,18 +177,19 @@ static ssize_t show_temp(struct device *dev,
> /* Check whether the time interval has elapsed */
> if (!tdata->valid || time_after(jiffies, tdata->last_updated + HZ)) {
> rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
> - tdata->valid = 0;
> - /* Check whether the data is valid */
> - if (eax & 0x80000000) {
> - tdata->temp = tdata->tjmax -
> - ((eax >> 16) & 0x7f) * 1000;
> - tdata->valid = 1;
> - }
> + /*
> + * Ignore the valid bit. In all observed cases the register
> + * value is either low or zero if the valid bit is 0.
> + * Return it instead of reporting an error which doesn't
> + * really help at all.
> + */
> + tdata->temp = tdata->tjmax - ((eax >> 16) & 0x7f) * 1000;
> + tdata->valid = 1;
> tdata->last_updated = jiffies;
> }
>
> mutex_unlock(&tdata->update_lock);
> - return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
> + return sprintf(buf, "%d\n", tdata->temp);
> }
>
> struct tjmax_pci {
>
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [lm-sensors] [PATCH] hwmon: (coretemp) Do not return -EAGAIN for low temperatures
2013-12-07 18:05 [lm-sensors] [PATCH] hwmon: (coretemp) Do not return -EAGAIN for low temperatures Guenter Roeck
2013-12-14 16:31 ` Guenter Roeck
@ 2013-12-20 15:53 ` Guenter Roeck
2013-12-21 18:14 ` Jean Delvare
2013-12-21 19:55 ` Guenter Roeck
3 siblings, 0 replies; 5+ messages in thread
From: Guenter Roeck @ 2013-12-20 15:53 UTC (permalink / raw)
To: lm-sensors
On Sat, Dec 07, 2013 at 10:05:57AM -0800, Guenter Roeck wrote:
> Some Intel CPUs do not set the 'valid' bit in IA32_THERM_STATUS if the
> temperature is too low to be measured. This condition will not change until
> the CPU is hot enough for its temperature to be measured. Returning an error
> in such conditions is not very useful. Drop checking the valid bit and just
> return the reported temperature instead.
>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
> I don't think we ever closed on this. Giving it a shot.
>
No feedback. This will go into 3.14 unless there are objections.
Guenter
> drivers/hwmon/coretemp.c | 17 +++++++++--------
> 1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 310ce19..6e0579e 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -177,18 +177,19 @@ static ssize_t show_temp(struct device *dev,
> /* Check whether the time interval has elapsed */
> if (!tdata->valid || time_after(jiffies, tdata->last_updated + HZ)) {
> rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
> - tdata->valid = 0;
> - /* Check whether the data is valid */
> - if (eax & 0x80000000) {
> - tdata->temp = tdata->tjmax -
> - ((eax >> 16) & 0x7f) * 1000;
> - tdata->valid = 1;
> - }
> + /*
> + * Ignore the valid bit. In all observed cases the register
> + * value is either low or zero if the valid bit is 0.
> + * Return it instead of reporting an error which doesn't
> + * really help at all.
> + */
> + tdata->temp = tdata->tjmax - ((eax >> 16) & 0x7f) * 1000;
> + tdata->valid = 1;
> tdata->last_updated = jiffies;
> }
>
> mutex_unlock(&tdata->update_lock);
> - return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
> + return sprintf(buf, "%d\n", tdata->temp);
> }
>
> struct tjmax_pci {
> --
> 1.7.9.7
>
>
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [lm-sensors] [PATCH] hwmon: (coretemp) Do not return -EAGAIN for low temperatures
2013-12-07 18:05 [lm-sensors] [PATCH] hwmon: (coretemp) Do not return -EAGAIN for low temperatures Guenter Roeck
2013-12-14 16:31 ` Guenter Roeck
2013-12-20 15:53 ` Guenter Roeck
@ 2013-12-21 18:14 ` Jean Delvare
2013-12-21 19:55 ` Guenter Roeck
3 siblings, 0 replies; 5+ messages in thread
From: Jean Delvare @ 2013-12-21 18:14 UTC (permalink / raw)
To: lm-sensors
On Fri, 20 Dec 2013 07:53:11 -0800, Guenter Roeck wrote:
> On Sat, Dec 07, 2013 at 10:05:57AM -0800, Guenter Roeck wrote:
> > Some Intel CPUs do not set the 'valid' bit in IA32_THERM_STATUS if the
> > temperature is too low to be measured. This condition will not change until
> > the CPU is hot enough for its temperature to be measured. Returning an error
> > in such conditions is not very useful. Drop checking the valid bit and just
> > return the reported temperature instead.
> >
> > Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> > ---
> > I don't think we ever closed on this. Giving it a shot.
> >
> No feedback. This will go into 3.14 unless there are objections.
I have no objection, I just don't want this change to go to stable, I
believe it needs a lot of testing on a broad range of hardware before
we can even think of it.
Reviewed-by: Jean Delvare <khali@linux-fr.org>
On a related note, I think we already noticed that different CPU models
behave differently in low temperature ranges. As an additional data
point, I noticed when staring at sensord graphs that my wife's Core2
Duo E8400 does clamp at 40°C (which is 60°C below Tjmax). It never ever
reports values below this. Somehow it is a saner implementation than
clearing the valid bit. Too bad we have no way to represent this in
sysfs. Would it make sense to define for example tempX_floor (can't
remember if a better name was ever proposed) so that sensors can report
"< 40°C" instead of "40°C" in this case? The problem is that I don't
think these clamp values can be read from a register, I'm not even sure
if they are documented, so I'm not sure how useful this would be in
practice...
--
Jean Delvare
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [lm-sensors] [PATCH] hwmon: (coretemp) Do not return -EAGAIN for low temperatures
2013-12-07 18:05 [lm-sensors] [PATCH] hwmon: (coretemp) Do not return -EAGAIN for low temperatures Guenter Roeck
` (2 preceding siblings ...)
2013-12-21 18:14 ` Jean Delvare
@ 2013-12-21 19:55 ` Guenter Roeck
3 siblings, 0 replies; 5+ messages in thread
From: Guenter Roeck @ 2013-12-21 19:55 UTC (permalink / raw)
To: lm-sensors
On 12/21/2013 10:14 AM, Jean Delvare wrote:
> On Fri, 20 Dec 2013 07:53:11 -0800, Guenter Roeck wrote:
>> On Sat, Dec 07, 2013 at 10:05:57AM -0800, Guenter Roeck wrote:
>>> Some Intel CPUs do not set the 'valid' bit in IA32_THERM_STATUS if the
>>> temperature is too low to be measured. This condition will not change until
>>> the CPU is hot enough for its temperature to be measured. Returning an error
>>> in such conditions is not very useful. Drop checking the valid bit and just
>>> return the reported temperature instead.
>>>
>>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>>> ---
>>> I don't think we ever closed on this. Giving it a shot.
>>>
>> No feedback. This will go into 3.14 unless there are objections.
>
> I have no objection, I just don't want this change to go to stable, I
> believe it needs a lot of testing on a broad range of hardware before
> we can even think of it.
>
Agreed.
> Reviewed-by: Jean Delvare <khali@linux-fr.org>
>
> On a related note, I think we already noticed that different CPU models
> behave differently in low temperature ranges. As an additional data
> point, I noticed when staring at sensord graphs that my wife's Core2
> Duo E8400 does clamp at 40°C (which is 60°C below Tjmax). It never ever
> reports values below this. Somehow it is a saner implementation than
> clearing the valid bit. Too bad we have no way to represent this in
> sysfs. Would it make sense to define for example tempX_floor (can't
> remember if a better name was ever proposed) so that sensors can report
> "< 40°C" instead of "40°C" in this case? The problem is that I don't
> think these clamp values can be read from a register, I'm not even sure
> if they are documented, so I'm not sure how useful this would be in
> practice...
>
Question is really if we would ever have a chance to determine the value
to report as floor. I suspect the answer is no.
Thanks,
Guenter
_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2013-12-21 19:55 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-12-07 18:05 [lm-sensors] [PATCH] hwmon: (coretemp) Do not return -EAGAIN for low temperatures Guenter Roeck
2013-12-14 16:31 ` Guenter Roeck
2013-12-20 15:53 ` Guenter Roeck
2013-12-21 18:14 ` Jean Delvare
2013-12-21 19:55 ` Guenter Roeck
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.