* ACPI too sensitive about critical temperature
@ 2004-01-20 21:11 Julius Volz
From: Julius Volz @ 2004-01-20 21:11 UTC (permalink / raw)
  To: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Hi ACPI developers,

these emails between me and Paul may be of interest to some of you.
Sorry for the messed up quoting, I hope it's still readable. My original
email is at the bottom.

It's basically about ACPI being too quick to shut down the computer when
you have a mainboard that occasionally delivers bogus values.
Andy has already provided me with a patch to try out, but since the
problem is not on my computer, it will take some time.
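
To make the failure mode concrete, here is a minimal user-space sketch in Python. It is not the logic in drivers/acpi/thermal.c. It takes the readings and the 85°C BIOS setting from the original message quoted further down and compares a check that trips on a single sample with one that needs several samples in a row. The function names, the `needed` parameter and the `>=` comparison are assumptions made for illustration.

```python
# Illustrative only: this is NOT the code in drivers/acpi/thermal.c.
CRITICAL_C = 85  # critical temperature set in the BIOS, per the original report

# One-per-second readings quoted in the original message.
READINGS_C = [45, 43, 47, 42, 90, 43, 41, 45]


def trips_on_single_sample(readings, critical):
    """Return the index of the first reading at or above `critical`, else None."""
    for i, temp in enumerate(readings):
        if temp >= critical:
            return i
    return None


def trips_on_consecutive(readings, critical, needed=2):
    """Return the index at which `needed` readings in a row have been at or
    above `critical`, else None. `needed` is a hypothetical tuning knob."""
    run = 0
    for i, temp in enumerate(readings):
        run = run + 1 if temp >= critical else 0
        if run >= needed:
            return i
    return None


if __name__ == "__main__":
    print("single-sample trip index:", trips_on_single_sample(READINGS_C, CRITICAL_C))
    print("two-in-a-row trip index:", trips_on_consecutive(READINGS_C, CRITICAL_C))
```

With these readings the single-sample check trips on the lone 90 while the two-in-a-row check never trips. The price is that a real overheat is acted on at least one sample later.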

-----Forwarded Message-----
> From: "Grover, Andrew" <andrew.grover-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> To: Julius Volz <julius.volz-cOr7zazyEW0f8UlNpKJ8ow@public.gmane.org>, "Diefenbaugh, Paul S" <paul.s.diefenbaugh-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Subject: RE: ACPI too sensitive about critical temperature
> Date: Tue, 20 Jan 2004 09:28:18 -0800
> 
> Please take this to acpi-devel-5NWGOfrQmneRv+LV9MX5uti2KpX7p8Fi@public.gmane.org
> 
> You may want to fwd this last email there, for starters.
> 
> Regards -- Andy 
> 
> > -----Original Message-----
> > From: Julius Volz [mailto:julius.volz-cOr7zazyEW0f8UlNpKJ8ow@public.gmane.org] 
> > Sent: Tuesday, January 20, 2004 9:27 AM
> > To: Diefenbaugh, Paul S
> > Cc: Grover, Andrew
> > Subject: Re: ACPI too sensitive about critical temperature
> > 
> > Hi again,
> > 
> > >Julius:
> > >
> > >Hmmm ... this would be dangerous to apply globally, but one
> > >could allow some kind of positive hysteresis as an optional
> > >attribute - e.g. amount of time in milliseconds to wait
> > >before confirming a temperature event (_HOT, _CRT).  The
> > >default should be zero (don't wait, don't confirm).  Most
> > >good ;~) processors shut down automatically in case this
> > >was a real critical event and the OS waits too long ... but
> > >there could be loss of data.
> > >  
> > >
> > 
> > Yes, true. The thing I'm a bit worried about though is that this might
> > happen to people who just use the kernel of some big distribution and
> > will never hear about this option. Hard to make it right for
> > everyone :-(
> > Well, so waiting three seconds is much too long, but how fast can you
> > get consecutive measurements of the temperature? One should at least
> > look at more than just one value before shutting down, it seems...
> > 
> > >The ACPI thermal zone driver must be seeing the transient
> > >temperature properly or else it wouldn't attempt a shutdown;
> > >the logged message must be from a separate code path (post-transient).
> > >  
> > >
> > 
> > I don't fully understand what you mean by this (no kernel coder yet,
> > sorry), but I also noticed that my original assumption (that the value
> > in the message must be from a later measurement) is apparently wrong.
> > When I look at thermal.c, I see that the same value (tz->temperature) is
> > used for deciding the shutdown and for printing the message, and it
> > is not changed in between... so why did the messages say 45°C and 43°C
> > when the BIOS crit. temp. was set to 85°C? Or am I missing something
> > (e.g. the tz struct being filled in by some mysterious background
> > thread...)?
> > 
> > >From your data I'd guess this was a faulty motherboard
> > >and/or thermal sensor.
> > >  
> > >
> > 
> > I have the board type now: it is an Epox EP4-PDA2+ (Chipset: Intel 865PE).
> > http://www.epox.nl/english/products/motherboard/4pda2.htm
> > 
> > I hope I can help somehow, but since the board isn't mine I can't 
> > experiment with it a lot.
> > 
> > Julius
> > 
> > >-----Original Message-----
> > >From: Julius Volz [mailto:julius.volz-cOr7zazyEW0f8UlNpKJ8ow@public.gmane.org] 
> > >Sent: Sunday, January 18, 2004 6:26 AM
> > >To: Grover, Andrew; Diefenbaugh, Paul S
> > >Subject: ACPI too sensitive about critical temperature
> > >
> > >Hi,
> > >
> > >I didn't want to bug the whole lkml with this, so I'm sending this to
> > >you guys who wrote "drivers/acpi/thermal.c". I hope that's okay...
> > >
> > >A friend of mine has a mainboard (I can find out the specific type if
> > >you want to know) which reports inaccurately high temperature "peaks"
> > >from time to time.
> > >Looking at /proc/acpi/thermal_zone/THR1/temperature every second might
> > >look something like this: 45, 43, 47, 42, 90, 43, 41, 45
> > >
> > >This is of course bad when CONFIG_ACPI_THERMAL is enabled and the
> > >computer just randomly shuts down when you are working. The problem with
> > >detecting this bug for normal users is that the temperature reported in
> > >the shutdown message is already taken from a next measurement and looks
> > >ok again.
> > >So you might get a shutdown message saying,
> > >"Critical temperature reached (43 C)..."
> > >and wonder why the system shuts down although you set critical 
> > >temperature to 85°C in the BIOS.
> > >
> > >I've looked at thermal.c and thought that it would be better to look at
> > >temperature measurements over a range of, say, three seconds (if that is
> > >possible) to sort out these false peaks.
> > >
> > >As there might be many mainboards out there that do this and
> > >CONFIG_ACPI_THERMAL should normally be enabled, this could turn out to
> > >be a problem for many people otherwise.
> > >
> > >What do you think?
> > >
> > >Julius
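
Paul's suggestion in the thread above (an optional delay in milliseconds before confirming a _HOT or _CRT event, defaulting to zero) could be sketched roughly as follows. This is a user-space Python illustration under stated assumptions, not kernel code and not the patch mentioned at the top of this mail. `should_shut_down`, `read_temp_c`, `confirm_delay_ms` and the injectable `sleep` are hypothetical names.

```python
import time


def should_shut_down(read_temp_c, critical_c, confirm_delay_ms=0, sleep=time.sleep):
    """Decide whether a critical-temperature shutdown should go ahead.

    read_temp_c      -- hypothetical callable returning the current temperature in °C
    critical_c       -- critical trip point in °C
    confirm_delay_ms -- 0 keeps the immediate behaviour (don't wait, don't confirm);
                        a positive value waits that long and reads once more
    sleep            -- injectable so the sketch can be exercised without waiting
    """
    first = read_temp_c()
    if first < critical_c:
        return False
    if confirm_delay_ms <= 0:
        # Default: act on the first reading.
        return True
    sleep(confirm_delay_ms / 1000.0)
    # Only confirm if the second reading is still at or above the trip point.
    return read_temp_c() >= critical_c


if __name__ == "__main__":
    # A bogus spike followed by a normal value, as in the readings reported above.
    samples = iter([90, 43])
    decision = should_shut_down(lambda: next(samples), 85,
                                confirm_delay_ms=100, sleep=lambda s: None)
    print("shut down:", decision)
```

Keeping the default at zero preserves the behaviour for everyone who never sets the option, which is exactly the concern raised in the reply. Any non-zero delay postpones the reaction to a genuine overheat, so it would have to stay short.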



