* Random shutdowns
From: Gerfried Maier @ 2003-09-03 5:49 UTC
To: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Dear ACPI-list!
On my notebook (Acer Travelmate 630) I experience random shutdowns when
using Linux. Everything looks as if someone had called "init 6" or "halt".
(But this is completely impossible; I'm the only user on that machine,
and it has no internet connection.)
Because there is always a similar sequence of events logged in my
/var/log/acpid, I think that this is acpi-related.
I hope that somebody can help me make these shutdowns disappear.
Thanks a lot.
Maier Gerfried
Attached are some sequences cut from /var/log/acpid which show entries just
before or during such a shutdown (the kernel log /var/log/messages does
not give any specific information; it looks just like a "normal" shutdown):
...
[Mon Aug 25 22:56:28 2003] exiting
[Tue Aug 26 07:15:18 2003] starting up
[Tue Aug 26 07:15:18 2003] 1 rule loaded
[Tue Aug 26 07:21:02 2003] exiting
[Tue Aug 26 18:27:32 2003] starting up
[Tue Aug 26 18:27:32 2003] 1 rule loaded
[Tue Aug 26 19:30:43 2003] received event "thermal_zone THRS 000000f0
00000001"
[Tue Aug 26 19:30:43 2003] executing action "/usr/sbin/acpid_proxy "
[Tue Aug 26 19:30:43 2003] BEGIN HANDLER MESSAGES
ACPI event
No action defined for ACPI event
[Tue Aug 26 19:30:43 2003] END HANDLER MESSAGES
[Tue Aug 26 19:30:43 2003] action exited with status 0
[Tue Aug 26 19:30:43 2003] completed event "thermal_zone THRS 000000f0
00000001"
[Tue Aug 26 19:30:43 2003] received event "thermal_zone THRS 000000f0
00000000"
[Tue Aug 26 19:30:43 2003] executing action "/usr/sbin/acpid_proxy "
[Tue Aug 26 19:30:43 2003] BEGIN HANDLER MESSAGES
ACPI event
No action defined for ACPI event
[Tue Aug 26 19:30:43 2003] END HANDLER MESSAGES
[Tue Aug 26 19:30:43 2003] action exited with status 0
[Tue Aug 26 19:30:43 2003] completed event "thermal_zone THRS 000000f0
00000000"
[Tue Aug 26 19:30:51 2003] exiting
[Tue Aug 26 19:32:03 2003] starting up
[Tue Aug 26 19:32:03 2003] 1 rule loaded
[Tue Aug 26 20:29:43 2003] received event "thermal_zone THRS 000000f0
00000001"
[Tue Aug 26 20:29:43 2003] executing action "/usr/sbin/acpid_proxy "
[Tue Aug 26 20:29:43 2003] BEGIN HANDLER MESSAGES
ACPI event
No action defined for ACPI event
[Tue Aug 26 20:29:43 2003] END HANDLER MESSAGES
[Tue Aug 26 20:29:43 2003] action exited with status 0
[Tue Aug 26 20:29:43 2003] completed event "thermal_zone THRS 000000f0
00000001"
[Tue Aug 26 20:29:50 2003] exiting
[Tue Aug 26 20:31:08 2003] starting up
(two consecutive shutdowns)
...
[Tue Sep 2 12:29:04 2003] 1 rule loaded
[Tue Sep 2 12:32:24 2003] exiting
[Tue Sep 2 18:26:47 2003] starting up
[Tue Sep 2 18:26:47 2003] 1 rule loaded
[Tue Sep 2 20:57:09 2003] received event "thermal_zone THRS 000000f0
00000001"
[Tue Sep 2 20:57:09 2003] executing action "/usr/sbin/acpid_proxy
thermal_zone THRS 000000f0 00000001"
[Tue Sep 2 20:57:09 2003] BEGIN HANDLER MESSAGES
ACPI event thermal_zone THRS 000000f0 00000001
No action defined for ACPI event thermal_zone THRS 000000f0 00000001
[Tue Sep 2 20:57:09 2003] END HANDLER MESSAGES
[Tue Sep 2 20:57:09 2003] action exited with status 0
[Tue Sep 2 20:57:09 2003] completed event "thermal_zone THRS 000000f0
00000001"
[Tue Sep 2 20:57:09 2003] received event "thermal_zone THRS 000000f0
00000000"
[Tue Sep 2 20:57:09 2003] executing action "/usr/sbin/acpid_proxy
thermal_zone THRS 000000f0 00000000"
[Tue Sep 2 20:57:09 2003] BEGIN HANDLER MESSAGES
ACPI event thermal_zone THRS 000000f0 00000000
No action defined for ACPI event thermal_zone THRS 000000f0 00000000
[Tue Sep 2 20:57:09 2003] END HANDLER MESSAGES
[Tue Sep 2 20:57:09 2003] action exited with status 0
[Tue Sep 2 20:57:09 2003] completed event "thermal_zone THRS 000000f0
00000000"
[Tue Sep 2 20:57:19 2003] exiting
[Tue Sep 2 21:01:40 2003] starting up
...
When the first two shutdowns occurred, my acpid was broken, so it did not
call acpid_proxy correctly. But that's already fixed.
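The pattern described above (a thermal_zone event followed within seconds by acpid "exiting") can be checked mechanically. The following is only a rough sketch, not something from the original post: the function name `thermal_events_before_exit` and the 30-second window are assumptions chosen for illustration, and the parser assumes the bracketed timestamp format shown in the excerpts.

```python
import re
from datetime import datetime

# Matches the bracketed timestamp acpid writes at the start of each entry.
STAMP = re.compile(r"^\[(\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4})\] (.*)$")


def thermal_events_before_exit(log_text, window_seconds=30):
    """Return (event_time, exit_time, gap_seconds) for each acpid exit
    that follows a thermal_zone event within window_seconds.

    Hypothetical helper; the window is an arbitrary illustrative value.
    """
    last_thermal = None
    hits = []
    for line in log_text.splitlines():
        match = STAMP.match(line.strip())
        if not match:
            # Wrapped continuation lines and handler output have no timestamp.
            continue
        when = datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")
        message = match.group(2)
        if message.startswith('received event "thermal_zone'):
            last_thermal = when
        elif message == "exiting" and last_thermal is not None:
            gap = (when - last_thermal).total_seconds()
            if gap <= window_seconds:
                hits.append((last_thermal, when, gap))
            last_thermal = None
    return hits


if __name__ == "__main__":
    with open("/var/log/acpid") as handle:
        for event_time, exit_time, gap in thermal_events_before_exit(handle.read()):
            print(f"thermal event {event_time} -> exit {exit_time} ({gap:.0f} s)")
```

Run against the excerpts in this message, such a check would flag the exits at 19:30:51, 20:29:50 and 20:57:19, each of which follows a thermal event by roughly ten seconds or less, while ignoring ordinary exits such as the one at 07:21:02.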
-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
* Re: Random shutdowns - some new details
From: Gerfried Maier @ 2003-09-16 5:54 UTC
To: pavel-AlSwsSmVLrQ; +Cc: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

[-- Attachment #1: Type: text/plain, Size: 1469 bytes --]

Pavel Machek wrote:
> In-kernel ACPI will happily call /sbin/halt for you in case of system overheat.
> Watch /proc/acpi/thermal/*/*

Thank you for this information - when digging through the sources I was
able to locate the responsible lines.

To get behind these shutdowns I added some lines dealing with thermal
events, mainly cat'ting /proc/acpi/thermal_zone/THRx/temperature, to the
script /usr/sbin/acpid_proxy.

This script is called by acpid with the text of the occurred event as
parameter (the SuSE way of handling acpi-events).

Three days ago another shutdown occurred, showing two things:
1) I made some mistake when extending acpid_proxy. But the temperatures
were logged.
2) The temperatures logged are far below the trip-points for a shutdown.
(see the attached files)

The trip-point for my thermal zone THRS is 92C, the actual temperature
when the event occurred was 46C (logged)

What does this mean?
Any hints are welcome

Regards,
Maier Gerfried

Please mind the attached files:
acpid - a cut from the acpid-logfile
THRC-trip_points - /proc/acpi/thermal_zone/THRC/trip_points
THRS-trip_points - /proc/acpi/thermal_zone/THRS/trip_points

The system is an Acer Travelmate notebook with kernel 2.4.22 acpi and the
preemptive-patch (although I do not think that preemptive has anything to do
with my problems - the shutdowns already occurred when I knew nothing
about preemptive, meaning with vanilla 2.4.xx + acpi-patch)
Base-system: SuSE 8.1

[-- Attachment #2: acpid --]
[-- Type: text/plain, Size: 1980 bytes --]

[Sat Sep 13 14:46:31 2003] starting up
[Sat Sep 13 14:46:32 2003] 1 rule loaded
[Sat Sep 13 17:39:58 2003] received event "button/sleep SLPB 00000080 00000001"
[Sat Sep 13 17:39:58 2003] executing action "/usr/sbin/acpid_proxy button/sleep SLPB 00000080 00000001"
[Sat Sep 13 17:39:58 2003] BEGIN HANDLER MESSAGES
ACPI event button/sleep SLPB 00000080 00000001
No action defined for ACPI event button/sleep SLPB 00000080 00000001
[Sat Sep 13 17:39:58 2003] END HANDLER MESSAGES
[Sat Sep 13 17:39:58 2003] action exited with status 0
[Sat Sep 13 17:39:58 2003] completed event "button/sleep SLPB 00000080 00000001"
[Sat Sep 13 17:48:07 2003] received event "thermal_zone THRS 000000f0 00000001"
[Sat Sep 13 17:48:07 2003] executing action "/usr/sbin/acpid_proxy thermal_zone THRS 000000f0 00000001"
[Sat Sep 13 17:48:07 2003] BEGIN HANDLER MESSAGES
ACPI event thermal_zone THRS 000000f0 00000001
Sat Sep 13 17:48:07 CEST 2003: event thermal_zone THRS 000000f0 00000001, THRS temperature: 46 C, /
/usr/sbin/acpid_proxy: line 109: THRC temperature: 44 C: command not found
[Sat Sep 13 17:48:07 2003] END HANDLER MESSAGES
[Sat Sep 13 17:48:07 2003] action exited with status 0
[Sat Sep 13 17:48:07 2003] completed event "thermal_zone THRS 000000f0 00000001"
[Sat Sep 13 17:48:07 2003] received event "thermal_zone THRS 000000f0 00000000"
[Sat Sep 13 17:48:07 2003] executing action "/usr/sbin/acpid_proxy thermal_zone THRS 000000f0 00000000"
[Sat Sep 13 17:48:07 2003] BEGIN HANDLER MESSAGES
ACPI event thermal_zone THRS 000000f0 00000000
Sat Sep 13 17:48:07 CEST 2003: event thermal_zone THRS 000000f0 00000000, THRS temperature: 46 C, /
/usr/sbin/acpid_proxy: line 109: THRC temperature: 44 C: command not found
[Sat Sep 13 17:48:07 2003] END HANDLER MESSAGES
[Sat Sep 13 17:48:07 2003] action exited with status 0
[Sat Sep 13 17:48:07 2003] completed event "thermal_zone THRS 000000f0 00000000"
[Sat Sep 13 17:48:14 2003] exiting

[-- Attachment #3: THRC-trip_points --]
[-- Type: text/plain, Size: 99 bytes --]

critical (S5): 100 C
passive: 86 C: tc1=2 tc2=5 tsp=300 devices=0xc12c6a80

[-- Attachment #4: THRS-trip_points --]
[-- Type: text/plain, Size: 101 bytes --]

critical (S5): 92 C
passive: 80 C: tc1=2 tc2=5 tsp=300 devices=0xc12c6a80
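The handler extension described in this message (reading each zone's temperature file and logging it next to the event text) might be sketched as below. This is not the poster's acpid_proxy, which is a script whose source is not shown in the thread; the function names are hypothetical, and the assumed file content ("temperature: 46 C" with padding after the colon) is a guess at the 2.4-era /proc format based on the values quoted in the log. Building the whole line in one place also avoids the kind of split command that produced the "command not found" message from line 109.

```python
import glob
import os
import time


def parse_temperature(text):
    """Reduce 'temperature:      46 C' (assumed format) to '46 C'."""
    return " ".join(text.split(":", 1)[-1].split())


def read_zone_temperatures(base="/proc/acpi/thermal_zone"):
    """Return {zone_name: reading} for every <base>/<zone>/temperature file."""
    readings = {}
    for path in sorted(glob.glob(os.path.join(base, "*", "temperature"))):
        zone = os.path.basename(os.path.dirname(path))
        with open(path) as handle:
            readings[zone] = parse_temperature(handle.read())
    return readings


def format_log_line(event, readings, now=None):
    """Build one log line: timestamp, event text, then all zone readings."""
    stamp = time.strftime("%a %b %d %H:%M:%S %Z %Y", time.localtime(now))
    temps = ", ".join(
        f"{zone} temperature: {value}" for zone, value in sorted(readings.items())
    )
    return f"{stamp}: event {event}, {temps}"


if __name__ == "__main__":
    import sys

    # acpid passes the event text as arguments to the handler.
    print(format_log_line(" ".join(sys.argv[1:]), read_zone_temperatures()))
```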
* Re: Random shutdowns - some new details
From: Pavel Machek @ 2003-09-16 12:02 UTC
To: Gerfried Maier; +Cc: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Hi!

> >In-kernel ACPI will happily call /sbin/halt for you in case of system
> >overheat.
> >Watch /proc/acpi/thermal/*/*
>
> Thank you for this information - when digging through the sources I was
> able to locate the responsible lines.

So try to locate those "/sbin/halt" calling lines, and replace with
something like "logger 'I refuse to die'", and see what happens. If it
does not trigger, you know it is something else.

> To get behind these shutdowns I added some lines dealing with thermal
> events, mainly cat'ting /proc/acpi/thermal_zone/THRx/temperature, to the
> script /usr/sbin/acpid_proxy.
>
> This script is called by acpid with the text of the occurred event as
> parameter (the SuSE way of handling acpi-events).
>
> Three days ago another shutdown occurred, showing two things:
> 1) I made some mistake when extending acpid_proxy. But the temperatures
> were logged.
> 2) The temperatures logged are far below the trip-points for a shutdown.
> (see the attached files)
>
> The trip-point for my thermal zone THRS is 92C, the actual temperature
> when the event occurred was 46C (logged)
>
> What does this mean?

No idea. [BTW nice system: two thermal zones, I have never seen that before].

> Please mind the attached files:
> acpid - a cut from the acpid-logfile
> THRC-trip_points - /proc/acpi/thermal_zone/THRC/trip_points
> THRS-trip_points - /proc/acpi/thermal_zone/THRS/trip_points
>

--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]
* Re: Random shutdowns - again some new details
From: Gerfried Maier @ 2003-10-20 7:14 UTC
To: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

[-- Attachment #1: Type: text/plain, Size: 1996 bytes --]

Pavel Machek wrote:
> So try to locate those "/sbin/halt" calling lines, and replace with
> something like "logger 'I refuse to die'", and see what happens. If it
> does not trigger, you know it is something else.

I was not courageous enough to disable the shutdown completely, but I
compiled acpi with debug statements to get more information.

Here are the results of the most recent shutdown:

Temperatures logged via cat /proc/acpi/thermal_zone/*/temperature > file,
triggered by the acpi-thermal-event:
Fri Oct 17 02:18:33 CEST 2003: event thermal_zone THRC 000000f0 00000001, THRC temperature: 39 C, THRS temperature: 44 C
(please mind: far below the trip-points, which are around 80C)

A section from the acpid-log, where all occurring events are logged:
[Thu Oct 16 17:21:43 2003] starting up
[Thu Oct 16 17:21:43 2003] 1 rule loaded
[Fri Oct 17 02:18:33 2003] received event "thermal_zone THRC 000000f0 00000001"
[Fri Oct 17 02:18:33 2003] executing action "/usr/sbin/acpid_proxy thermal_zone THRC 000000f0 00000001"
[Fri Oct 17 02:18:33 2003] BEGIN HANDLER MESSAGES
ACPI event thermal_zone THRC 000000f0 00000001
[Fri Oct 17 02:18:33 2003] END HANDLER MESSAGES
[Fri Oct 17 02:18:33 2003] action exited with status 0
[Fri Oct 17 02:18:33 2003] completed event "thermal_zone THRC 000000f0 00000001"
[Fri Oct 17 02:18:42 2003] exiting
[Fri Oct 17 02:33:52 2003] starting up

Attached are the last few seconds of the system log containing the
acpi-debug messages.

Is the code located in thermal.c around line 412 (function
acpi_thermal_critical, namely:
acpi_thermal_call_usermode(ACPI_THERMAL_PATH_POWEROFF)) the only
occurrence of code in acpi being able to do a shutdown?

I'm completely in the dark. The only thing I have understood by now is that
the actual temperatures _do not_ seem to exceed the trip-points (on my
system 92C and 100C respectively).

Regards,
Maier Gerfried

PS.: I'm running kernel 2.4.22 with the acpi in this kernel. (no further
patch applied)

[-- Attachment #2: messages_ausschn.txt --]
[-- Type: text/plain, Size: 3396 bytes --]

Oct 17 02:18:29 acer kernel: [ACPI Debug] String: ----------------- Thermal event -------------------
Oct 17 02:18:29 acer kernel: [ACPI Debug] Integer: 0000000000000064
Oct 17 02:18:29 acer kernel: [ACPI Debug] String: SYST of _TMP =
Oct 17 02:18:29 acer kernel: [ACPI Debug] Integer: 000000000000002B
Oct 17 02:18:29 acer kernel: [ACPI Debug] String: CPU _TMP =
Oct 17 02:18:29 acer kernel: [ACPI Debug] Integer: 0000000000000027
Oct 17 02:18:29 acer kernel: [ACPI Debug] String: ----------------- Thermal event -------------------
Oct 17 02:18:29 acer kernel: [ACPI Debug] Integer: 0000000000000027
Oct 17 02:18:29 acer kernel: [ACPI Debug] String: SYST of _TMP =
Oct 17 02:18:29 acer kernel: [ACPI Debug] Integer: 000000000000002B
Oct 17 02:18:29 acer kernel: [ACPI Debug] String: CPU _TMP =
Oct 17 02:18:29 acer kernel: [ACPI Debug] Integer: 0000000000000027
Oct 17 02:18:29 acer kernel: [ACPI Debug] String: ----------------- Thermal event -------------------
Oct 17 02:18:29 acer kernel: [ACPI Debug] Integer: 0000000000000027
Oct 17 02:18:29 acer kernel: [ACPI Debug] String: SYST of _TMP =
Oct 17 02:18:29 acer kernel: [ACPI Debug] Integer: 000000000000002B
Oct 17 02:18:29 acer kernel: [ACPI Debug] String: CPU _TMP =
Oct 17 02:18:29 acer kernel: [ACPI Debug] Integer: 0000000000000027
Oct 17 02:18:33 acer kernel: [ACPI Debug] String: ----------------- Thermal event -------------------
Oct 17 02:18:33 acer kernel: [ACPI Debug] Integer: 0000000000000027
Oct 17 02:18:33 acer kernel: [ACPI Debug] String: SYST of _TMP =
Oct 17 02:18:33 acer kernel: [ACPI Debug] Integer: 000000000000002C
Oct 17 02:18:33 acer kernel: [ACPI Debug] String: CPU _TMP =
Oct 17 02:18:33 acer kernel: [ACPI Debug] Integer: 0000000000000027
Oct 17 02:18:33 acer kernel: [ACPI Debug] String: ----------------- Thermal event -------------------
Oct 17 02:18:33 acer kernel: [ACPI Debug] Integer: 0000000000000027
Oct 17 02:18:33 acer kernel: [ACPI Debug] String: SYST of _TMP =
Oct 17 02:18:33 acer kernel: [ACPI Debug] Integer: 000000000000002C
Oct 17 02:18:33 acer kernel: [ACPI Debug] String: CPU _TMP =
Oct 17 02:18:33 acer kernel: [ACPI Debug] String: BAT0_BST_RETURN:
Oct 17 02:18:33 acer kernel: [ACPI Debug] String: BAT0_BST_RETURN:
Oct 17 02:18:33 acer kernel: [ACPI Debug] String: --------------------------------------- AC Present
Oct 17 02:18:33 acer kernel: [ACPI Debug] Integer: 0000000000000064
Oct 17 02:18:33 acer kernel: acpi_thermal-0398 [3248] acpi_thermal_critical : Critical trip point
Oct 17 02:18:33 acer kernel: [ACPI Debug] String: CPU _TMP =
Oct 17 02:18:33 acer kernel: [ACPI Debug] Integer: 0000000000000027
Oct 17 02:18:33 acer kernel: [ACPI Debug] String: SYST of _TMP =
Oct 17 02:18:33 acer kernel: [ACPI Debug] Integer: 000000000000002C
Oct 17 02:18:34 acer init: Switching to runlevel: 0
Oct 17 02:18:43 acer cardmgr[548]: executing: 'rmmod memory_cs'
Oct 17 02:18:43 acer cardmgr[548]: + rmmod: module memory_cs is not loaded
Oct 17 02:18:43 acer cardmgr[548]: rmmod exited with status 1
Oct 17 02:18:44 acer cardmgr[548]: exiting
Oct 17 02:18:44 acer kernel: unloading Kernel Card Services
Oct 17 02:18:45 acer kernel: usb.c: deregistering driver usb-storage
Oct 17 02:18:45 acer kernel: scsi : 1 host left.
Oct 17 02:18:45 acer kernel: Kernel logging (proc) stopped.
Oct 17 02:18:45 acer kernel: Kernel log daemon terminating.
Oct 17 02:18:46 acer exiting on signal 15
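The "[ACPI Debug] Integer" values in the system log above are hexadecimal, which makes them hard to compare with the decimal readings from /proc by eye. The sketch below pairs each debug string with the integer that follows it and converts the value to decimal. It is an illustration only: the function name `decode_debug` is hypothetical, and pairing a label with the next Integer line is an assumption about how this firmware's debug output is laid out, not something stated in the thread.

```python
import re

# Captures the kind (String or Integer) and payload of an ACPI debug line.
DEBUG = re.compile(r"\[ACPI Debug\] (String|Integer): (.*)$")


def decode_debug(lines):
    """Pair each debug String with the Integer that follows it.

    Returns a list of (label, decimal_value) tuples. Lines that are not
    ACPI debug output are skipped; a String with no following Integer is
    dropped when the next String arrives.
    """
    pairs = []
    label = None
    for line in lines:
        match = DEBUG.search(line)
        if not match:
            continue
        kind, payload = match.group(1), match.group(2).strip()
        if kind == "String":
            # Drop the decorative dashes and the trailing '=' from the label.
            label = payload.strip("-= ")
        elif label is not None:
            pairs.append((label, int(payload, 16)))
            label = None
    return pairs


if __name__ == "__main__":
    with open("/var/log/messages") as handle:
        for name, value in decode_debug(handle):
            print(f"{name}: {value}")
```

Decoded this way, 0x27 is 39, 0x2B is 43 and 0x2C is 44, which appears to line up with the 39 C and 44 C the handler logged for THRC and THRS. The 0x64 values decode to 100; the thread does not say what those represent, so no conclusion is drawn from them here.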