* ACPI Thermal Shutdown on Boot
From: Chris Jensen @ 2004-02-10 6:48 UTC
To: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Hi,
I've got a couple of problems with the ACPI thermal support in 2.6.2.
I'm trying to install kernel 2.6.2. However, as soon as the kernel starts
init, I get a message along the lines of "Critical Temperature reached - 57C,
shutting down" and init switches to runlevel 0.
(I got the same behaviour trying to install 2.6.1, but without the message
telling me why)
The system had been powered on for 2 days straight, so I think it's safe to
keep it running at that temp. I'd like to keep the thermal protection
compiled into the kernel to protect against a really high temperature should
a fan fail etc., so I'd like to adjust the critical temperature so that it
doesn't shut down.
How can I adjust the critical temperature? I realise that there's a file
in /proc that can be adjusted, but as init is shutting down immediately,
there doesn't seem to be an opportunity to do this.
The other issue is that although the kernel said the temperature was 57C, if I
reboot and jump into the BIOS settings, it says it's 67C, so there seems to be
some discrepancy there.
I'm using a Gigabyte motherboard with a VIA KT133 chipset (VT82C686).
It has a shutdown threshold in the BIOS settings, which I've tried increasing,
but it didn't help.
--
Chris Jensen
chris-s2nexXcBJu3PC4JL755TaA@public.gmane.org
Public Key: http://drspirograph.com/public_key/
Wait: Did you know that there's a direct correlation between the decline of
Spirograph and the rise in gang activity? Think about it.
- Dr Spirograph (The Simpsons)
-------------------------------------------------------
The SF.Net email is sponsored by EclipseCon 2004
Premiere Conference on Open Tools Development and Integration
See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
http://www.eclipsecon.org/osdn
* RE: ACPI Thermal Shutdown on Boot
From: Yu, Luming @ 2004-02-11 3:51 UTC
To: Chris Jensen, acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
> I'm trying to install kernel 2.6.2. However, as soon as the kernel starts
> init, I get a message along the lines of "Critical Temperature reached - 57C,
> shutting down" and init switches to runlevel 0.
-57C is an obvious error. Maybe there is something wrong with _TMP.
--Luming
* RE: ACPI Thermal Shutdown on Boot
From: Nate Lawson @ 2004-02-11 6:01 UTC
To: Yu, Luming; +Cc: Chris Jensen, acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
On Wed, 11 Feb 2004, Yu, Luming wrote:
> > I'm trying to install kernel 2.6.2. However, as soon as the kernel starts
> > init, I get a message along the lines of "Critical Temperature reached - 57C,
> > shutting down" and init switches to runlevel 0.
>
> -57C is an obvious error. Maybe there is something wrong with _TMP.
I think he just meant 57C. In any case, we decided to set the _TMP poll
interval to 10 seconds. After 2 readings above _CRT/_HOT, userland
receives an event and a message is logged saying the temperature is too
high. On the 3rd one, we shut down/power off the system. I think Linux
chose something similar or at least I saw patches for that.
But 57C (or even 67C) is pretty low for _CRT. We should probably add an
override.
-Nate
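The policy described above (poll _TMP every 10 seconds, notify userland
after the second reading above _CRT/_HOT, shut down on the third) can be
sketched in C roughly as follows. This is only an illustration: the names
tz_state, tz_action and tz_poll are hypothetical and are not taken from any
real driver, and the sketch assumes that the count applies to consecutive
readings and resets once a reading drops back below the trip point, which
the message does not state. Temperatures are in tenths of a kelvin, the unit
ACPI uses for _TMP.

```c
#include <stdbool.h>

/* Hypothetical action codes returned for each polled reading. */
enum tz_action { TZ_OK, TZ_WARN, TZ_SHUTDOWN };

/* Hypothetical per-zone state: how many readings in a row were too high. */
struct tz_state {
    int over_count;
};

/*
 * Feed one polled _TMP reading into the state machine.
 * temp_dk and trip_dk are in tenths of a kelvin.
 * Assumption: a reading below the trip point resets the counter.
 */
enum tz_action tz_poll(struct tz_state *st, unsigned long temp_dk,
                       unsigned long trip_dk)
{
    if (temp_dk < trip_dk) {
        st->over_count = 0;
        return TZ_OK;
    }

    st->over_count++;
    if (st->over_count >= 3)
        return TZ_SHUTDOWN;   /* third reading: shut down / power off */
    if (st->over_count == 2)
        return TZ_WARN;       /* second reading: log and notify userland */
    return TZ_OK;             /* first reading: no action yet */
}
```

A caller would invoke tz_poll() once per poll interval. With a 10-second
interval, shutdown would start up to about 30 seconds after the temperature
first crosses the trip point.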
* Re: ACPI Thermal Shutdown on Boot
From: Karol Kozimor @ 2004-02-11 10:59 UTC
To: Nate Lawson
Cc: Yu, Luming, Chris Jensen,
acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Thus wrote Nate Lawson:
> I think he just meant 57C. In any case, we decided to set the _TMP poll
> interval to 10 seconds. After 2 readings above _CRT/_HOT, userland
> receives an event and a message is logged saying the temperature is too
> high. On the 3rd one, we shutdown/power off the system. I think Linux
If I understand you correctly, 30 seconds is certainly enough time to burn
a CPU when the fan has failed, isn't it?
Best regards,
--
Karol 'sziwan' Kozimor
sziwan-DETuoxkZsSqrDJvtcaxF/A@public.gmane.org
* Re: ACPI Thermal Shutdown on Boot
From: Nate Lawson @ 2004-02-11 17:48 UTC
To: Karol Kozimor; +Cc: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
On Wed, 11 Feb 2004, Karol Kozimor wrote:
> Thus wrote Nate Lawson:
> > I think he just meant 57C. In any case, we decided to set the _TMP poll
> > interval to 10 seconds. After 2 readings above _CRT/_HOT, userland
> > receives an event and a message is logged saying the temperature is too
> > high. On the 3rd one, we shutdown/power off the system. I think Linux
>
> If I understand you correctly, 30 seconds is certainly enough time to burn
> a CPU when the fan has failed, isn't it?
Our old default poll interval was 30 seconds so this doesn't change
anything. I know DeadRat takes more than 30 seconds from "shutdown -h
now" to power off. All modern CPUs have a thermal shutdown built in.
Think of _CRT as more of "system temp too high, shut down as soon as you
can."
-Nate
-------------------------------------------------------
SF.Net is sponsored by: Speed Start Your Linux Apps Now.
Build and deploy apps & Web services for Linux with
a free DVD software kit from IBM. Click Now!
http://ads.osdn.com/?ad_id=1356&alloc_id=3438&op=click
* RE: ACPI Thermal Shutdown on Boot
From: Yu, Luming @ 2004-02-12 2:52 UTC
To: Nate Lawson, Karol Kozimor; +Cc: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
> > Thus wrote Nate Lawson:
> > > I think he just meant 57C. In any case, we decided to set the _TMP poll
> > > interval to 10 seconds. After 2 readings above _CRT/_HOT, userland
> > > receives an event and a message is logged saying the temperature is too
> > > high. On the 3rd one, we shutdown/power off the system. I think Linux
> >
> > If I understand you correctly, 30 seconds is certainly enough time to burn
> > a CPU when the fan has failed, isn't it?
>
> Our old default poll interval was 30 seconds so this doesn't change
> anything. I know DeadRat takes more than 30 seconds from "shutdown -h
> now" to power off. All modern CPUs have a thermal shutdown builtin.
> Think of _CRT as more of "system temp too high, shut down as soon as you
> can."
If I didn't miss something, Chris Jensen was unable to adjust the critical
trip point due to current policy. Anyway, 57C isn't a real critical
temperature. It doesn't make sense to enter an emergency shutdown/power-off.
So we need sanity checking here (a sane _CRT?).
If the fan is broken, at least we can use throttling to guarantee that the
system will not overheat. So throttling should be included in the policy for
handling the critical trip point.
--Luming
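One way to read the suggestion above is a plausibility check on the value
returned by _CRT, combined with the override proposed earlier in the thread.
A rough C sketch follows. The 70 C floor, the names crt_is_plausible and
effective_crt, and the convention that an override of 0 means "not set" are
all assumptions made for illustration; nothing here is taken from the kernel.

```c
#include <stdbool.h>

/* Hypothetical floor: about 70 C, expressed in tenths of a kelvin. */
#define CRT_PLAUSIBLE_MIN_DK 3432UL

/* Returns true if a firmware-reported _CRT looks like a real critical limit. */
bool crt_is_plausible(unsigned long crt_dk)
{
    return crt_dk >= CRT_PLAUSIBLE_MIN_DK;
}

/*
 * Pick the trip point to act on.
 * Assumption: override_dk == 0 means the administrator set no override.
 */
unsigned long effective_crt(unsigned long firmware_crt_dk,
                            unsigned long override_dk)
{
    return override_dk != 0 ? override_dk : firmware_crt_dk;
}
```

With these assumed values, the 57 C trip point reported in this thread
(3302 in tenths of a kelvin) would be flagged as implausible, and an
administrator-supplied override would take precedence over it.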
* Re: ACPI Thermal Shutdown on Boot
From: Pavel Machek @ 2004-02-13 19:06 UTC
To: Yu, Luming
Cc: Nate Lawson, Karol Kozimor,
acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Hi!
> > > If I understand you correctly, 30 seconds is certainly enough time to burn
> > > a CPU when the fan has failed, isn't it?
> >
> > Our old default poll interval was 30 seconds so this doesn't change
> > anything. I know DeadRat takes more than 30 seconds from "shutdown -h
> > now" to power off. All modern CPUs have a thermal shutdown builtin.
> > Think of _CRT as more of "system temp too high, shut down as soon as you
> > can."
>
> If I didn't miss something, Chris Jensen was unable to adjust critical trip
> point due to current policy. Anyway, 57C isn't a real critical temperature.
> It doesn't make sense to enter emergent shutdown/power. So we need sane
> checking here. (A sane _CRT?)
> If FAN is broken, at least we can use throttling to guarantee it will not be
> over heated.
> So, throttling should be included into policy about handling critical trip
> point.
At critical trip point, we have to shut down, according
to specs. No choice here.
If their ACPI BIOS is b0rken, blacklist them.
--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms
* Re: ACPI Thermal Shutdown on Boot
From: Nate Lawson @ 2004-02-15 20:24 UTC
To: Pavel Machek
Cc: Yu, Luming, Karol Kozimor,
acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
On Fri, 13 Feb 2004, Pavel Machek wrote:
> > > > If I understand you correctly, 30 seconds is certainly enough time to burn
> > > > a CPU when the fan has failed, isn't it?
> > >
> > > Our old default poll interval was 30 seconds so this doesn't change
> > > anything. I know DeadRat takes more than 30 seconds from "shutdown -h
> > > now" to power off. All modern CPUs have a thermal shutdown builtin.
> > > Think of _CRT as more of "system temp too high, shut down as soon as you
> > > can."
> >
> > If I didn't miss something, Chris Jensen was unable to adjust critical trip
> > point due to current policy. Anyway, 57C isn't a real critical temperature.
> > It doesn't make sense to enter emergent shutdown/power. So we need sane
> > checking here. (A sane _CRT?)
> > If FAN is broken, at least we can use throttling to guarantee it will not be
> > over heated.
> > So, throttling should be included into policy about handling critical trip
> > point.
>
> At critical trip point, we have to shut down, according
> to specs. No choice here.
>
> If their ACPI BIOS is b0rken, blacklist them.
My point was that shutdown to poweroff takes about 10 seconds on FreeBSD
but much longer on common Linux distributions. So even with a little
delay, the system gets shut down at the same time. This is better than
blacklisting the system (a Compaq laptop) and not having any _CRT
protection. But Linux can do it however you want.
-Nate
* Re: ACPI Thermal Shutdown on Boot
From: Pavel Machek @ 2004-02-15 20:29 UTC
To: Nate Lawson
Cc: Yu, Luming, Karol Kozimor,
acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Hi!
> > > > > If I understand you correctly, 30 seconds is certainly enough time to burn
> > > > > a CPU when the fan has failed, isn't it?
> > > >
> > > > Our old default poll interval was 30 seconds so this doesn't change
> > > > anything. I know DeadRat takes more than 30 seconds from "shutdown -h
> > > > now" to power off. All modern CPUs have a thermal shutdown builtin.
> > > > Think of _CRT as more of "system temp too high, shut down as soon as you
> > > > can."
> > >
> > > If I didn't miss something, Chris Jensen was unable to adjust critical trip
> > > point due to current policy. Anyway, 57C isn't a real critical temperature.
> > > It doesn't make sense to enter emergent shutdown/power. So we need sane
> > > checking here. (A sane _CRT?)
> > > If FAN is broken, at least we can use throttling to guarantee it will not be
> > > over heated.
> > > So, throttling should be included into policy about handling critical trip
> > > point.
> >
> > At critical trip point, we have to shut down, according
> > to specs. No choice here.
> >
> > If their ACPI BIOS is b0rken, blacklist them.
>
> My point was that shutdown to poweroff takes about 10 seconds on FreeBSD
> but much longer on common Linux distributions. So even with a little
> delay, the system gets shut down at the same time. This is better than
> blacklisting the system (a Compaq laptop) and not having any _CRT
> protection. But Linux can do it however you want.
Well, hardware should protect itself at the end.
Can you see if the patch below fixes stuff for you? You should see 'ACPI
changed its mind...' in the logs. (I wonder how to do this properly.
Hardcoding 10 Celsius is broken...)
Pavel
--- clean/drivers/acpi/thermal.c	2004-02-05 01:54:00.000000000 +0100
+++ linux/drivers/acpi/thermal.c	2004-02-05 02:24:15.000000000 +0100
@@ -223,8 +223,11 @@
 	tz->last_temperature = tz->temperature;
 	status = acpi_evaluate_integer(tz->handle, "_TMP", NULL, &tz->temperature);
-	if (ACPI_FAILURE(status))
+	if (ACPI_FAILURE(status)) {
+		if (tz->temperature != tz->last_temperature)
+			printk(KERN_ERR "temperature damaged while processing\n");
 		return -ENODEV;
+	}
 	ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Temperature is %lu dK\n", tz->temperature));
@@ -457,7 +460,17 @@
 		return_VALUE(-EINVAL);
 	if (tz->temperature >= tz->trips.critical.temperature) {
+		long old_temperature = tz->temperature;
 		ACPI_DEBUG_PRINT((ACPI_DB_WARN, "Critical trip point\n"));
+
+		result = acpi_thermal_get_temperature(tz);
+		if (!result) {
+			if (tz->temperature < (tz->trips.critical.temperature - 100)) {
+				printk(KERN_ALERT "ACPI changed its mind about temperature, was %ld C, now %ld C",
+					KELVIN_TO_CELSIUS(old_temperature), KELVIN_TO_CELSIUS(tz->temperature));
+				return_VALUE(0);
+			}
+		}
 		tz->trips.critical.flags.enabled = 1;
 	}
 	else if (tz->trips.critical.flags.enabled)
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]
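The check in the second hunk compares values in tenths of a kelvin (the
debug message in the first hunk prints dK), so the hardcoded 100 corresponds
to the 10 degrees mentioned above. A stand-alone C sketch of that comparison
and of the unit conversion follows. dk_to_celsius and reading_was_spurious
are hypothetical helper names, the conversion is a simple rounded one that
is not claimed to match the kernel's KELVIN_TO_CELSIUS macro, and the margin
is a parameter here rather than a constant.

```c
#include <stdbool.h>

/* 0 C expressed in tenths of a kelvin (273.15 K rounded to 273.2 K). */
#define ZERO_CELSIUS_DK 2732L

/* Convert tenths of a kelvin to whole degrees Celsius, rounding to nearest. */
long dk_to_celsius(unsigned long dk)
{
    long d = (long)dk - ZERO_CELSIUS_DK;
    return d >= 0 ? (d + 5) / 10 : (d - 5) / 10;
}

/*
 * After a critical reading, re-read the temperature and decide whether the
 * first reading should be ignored: true if the new value is more than
 * margin_dk below the critical trip point. Written as an addition so the
 * unsigned arithmetic cannot wrap when crit_dk is smaller than margin_dk.
 */
bool reading_was_spurious(unsigned long reread_dk, unsigned long crit_dk,
                          unsigned long margin_dk)
{
    return reread_dk + margin_dk < crit_dk;
}
```

For the numbers in this thread, dk_to_celsius(3302) gives 57 and
dk_to_celsius(3402) gives 67. With a margin of 100, a re-read of 3150
against a trip point of 3302 would be treated as spurious, while a re-read
of 3250 would not.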