linux-acpi.vger.kernel.org archive mirror
* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
       [not found] <20070515201914.16944e04.akpm@linux-foundation.org>
@ 2007-05-16 17:37 ` Maciej Rutecki
  2007-05-16 17:47   ` Chuck Ebbert
  0 siblings, 1 reply; 41+ messages in thread
From: Maciej Rutecki @ 2007-05-16 17:37 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-acpi, lenb

In 2.6.20.9 I can change trip points:

echo "105:100:100:78:70:40:30" > /proc/acpi/thermal_zone/TZ0/trip_points
echo 10  > /proc/acpi/thermal_zone/TZ0/polling_frequency

Then I get:
cat /proc/acpi/thermal_zone/TZ0/*
<setting not supported>
cooling mode:   active
polling frequency:       10 seconds
state:                   active[2]
temperature:             45 C
critical (S5):           105 C
active[0]:               78 C: devices=0xdf415a40
active[1]:               70 C: devices=0xdf4159dc
active[2]:               40 C: devices=0xdf41598c
active[3]:               30 C: devices=0xdf41593c

cat /proc/acpi/fan/*/*
status:                  off
status:                  off
status:                  on
status:                  on

And the fan turns on.
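
For readers unfamiliar with the write string used above, here is a minimal sketch that splits it into named fields. The field order (critical, hot, passive, then the active trip points) is an assumption inferred from the output shown: 105 appears as the critical value and 78/70/40/30 as active[0] to active[3]. The function name decode_trip_string is hypothetical.

```shell
# Split a legacy trip_points write string into named fields.
# Assumed field order: critical:hot:passive:active0:active1:...
decode_trip_string() {
    old_ifs=$IFS
    IFS=:
    set -- $1          # split the argument on ':'
    IFS=$old_ifs
    echo "critical=$1"
    echo "hot=$2"
    echo "passive=$3"
    shift 3
    i=0
    for t in "$@"; do
        echo "active[$i]=$t"
        i=$((i + 1))
    done
}

decode_trip_string "105:100:100:78:70:40:30"
```

The sketch expects at least three fields; it does no validation beyond that.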

In 2.6.22-rc1-mm1:
echo "105:100:100:78:70:40:30" > /proc/acpi/thermal_zone/TZ0/trip_points
bash: echo: write error: Input/output error

rutek:/home/maciek# cat /proc/acpi/thermal_zone/TZ0/*
<setting not supported>
polling frequency:       10 seconds
state:                   ok
temperature:             45 C
critical (S5):           256 C
active[0]:               78 C: devices=0xc1827a40
active[1]:               70 C: devices=0xc18279dc
active[2]:               60 C: devices=0xc182798c
active[3]:               50 C: devices=0xc182793c
rutek:/home/maciek# cat /proc/acpi/fan/*/*
status:                  off
status:                  off
status:                  off
status:                  off

The fan only turns on when the temperature is over 50 C. (I want 30 C.)
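
The difference between the two "state:" lines can be reproduced with a small sketch. It assumes that the reported state is the first active trip point (counting from active[0]) whose threshold the current temperature has reached, and "ok" otherwise. This is only a reading consistent with the two outputs above, not the kernel's actual code; active_state is a hypothetical name.

```shell
# Report which active trip point a temperature has reached.
# Usage: active_state TEMP TRIP0 TRIP1 ...   (trips listed hottest first)
active_state() {
    temp=$1
    shift
    i=0
    for trip in "$@"; do
        if [ "$temp" -ge "$trip" ]; then
            echo "active[$i]"
            return 0
        fi
        i=$((i + 1))
    done
    echo "ok"
}

active_state 45 78 70 40 30   # trips written under 2.6.20.9
active_state 45 78 70 60 50   # BIOS trips under 2.6.22-rc1-mm1
```

With the written trip points, 45 C has already crossed active[2]; with the BIOS values of 60 and 50 it has crossed none, which matches "state: ok" and all fans off.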

I read this:
http://article.gmane.org/gmane.linux.acpi.devel/22750

But I don't have cooling_policy, only cooling_mode:
ls /proc/acpi/thermal_zone/TZ0/
cooling_mode  polling_frequency  state  temperature  trip_points

Is this a bug or a feature?

Config, acpidump, dmesg:
http://www.unixy.pl/maciek/download/kernel/2.6.22-rc1-mm1/

-- 
Maciej Rutecki
www.unixy.pl
Kernel Monkeys
(http://kernel.wikidot.com/)




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-16 17:37 ` 2.6.22-rc1-mm1 [cannot change thermal trip points] Maciej Rutecki
@ 2007-05-16 17:47   ` Chuck Ebbert
  2007-05-16 18:10     ` Goulven Guillard
  2007-05-17  9:23     ` Pavel Machek
  0 siblings, 2 replies; 41+ messages in thread
From: Chuck Ebbert @ 2007-05-16 17:47 UTC (permalink / raw)
  To: Maciej Rutecki; +Cc: Andrew Morton, linux-kernel, linux-acpi, lenb

Maciej Rutecki wrote:
> In 2.6.20.9 I can change trip points:
> 
> echo "105:100:100:78:70:40:30" > /proc/acpi/thermal_zone/TZ0/trip_points
> echo 10  > /proc/acpi/thermal_zone/TZ0/polling_frequency
> 
> Then I got:
> cat /proc/acpi/thermal_zone/TZ0/*
> <setting not supported>
> cooling mode:   active
> polling frequency:       10 seconds
> state:                   active[2]
> temperature:             45 C
> critical (S5):           105 C
> active[0]:               78 C: devices=0xdf415a40
> active[1]:               70 C: devices=0xdf4159dc
> active[2]:               40 C: devices=0xdf41598c
> active[3]:               30 C: devices=0xdf41593c
> 
> cat /proc/acpi/fan/*/*
> status:                  off
> status:                  off
> status:                  on
> status:                  on
> 
> And fan turns on.
> 
> In 2.6.22-rc1-mm1:
> echo "105:100:100:78:70:40:30" > /proc/acpi/thermal_zone/TZ0/trip_points
> bash: echo: write error: Input/output error
> 
> rutek:/home/maciek# cat /proc/acpi/thermal_zone/TZ0/*
> <setting not supported>
> polling frequency:       10 seconds
> state:                   ok
> temperature:             45 C
> critical (S5):           256 C
> active[0]:               78 C: devices=0xc1827a40
> active[1]:               70 C: devices=0xc18279dc
> active[2]:               60 C: devices=0xc182798c
> active[3]:               50 C: devices=0xc182793c
> rutek:/home/maciek# cat /proc/acpi/fan/*/*
> status:                  off
> status:                  off
> status:                  off
> status:                  off
> 
> The fan only turns on when the temperature is over 50 C. (I want 30 C.)
> 
> I read this:
> http://article.gmane.org/gmane.linux.acpi.devel/22750
> 
> But I don't have cooling_policy, only cooling_mode:
> ls /proc/acpi/thermal_zone/TZ0/
> cooling_mode  polling_frequency  state  temperature  trip_points
> 
> Is this a bug or a feature?
> 

Committed to mainline May 10:

Gitweb:     http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=11ccc0f249cb01a129f54760b8ff087f242935d4
Commit:     11ccc0f249cb01a129f54760b8ff087f242935d4
Parent:     de46c33745f5e2ad594c72f2cf5f490861b16ce1
Author:     Len Brown <len.brown@intel.com>
AuthorDate: Mon Apr 30 22:36:01 2007 -0400
Committer:  Len Brown <len.brown@intel.com>
CommitDate: Mon Apr 30 22:36:01 2007 -0400

    ACPI: thermal trip points are read-only
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-16 17:47   ` Chuck Ebbert
@ 2007-05-16 18:10     ` Goulven Guillard
  2007-05-17  9:23     ` Pavel Machek
  1 sibling, 0 replies; 41+ messages in thread
From: Goulven Guillard @ 2007-05-16 18:10 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Maciej Rutecki, Andrew Morton, linux-kernel, linux-acpi, lenb

On 05/16/2007 07:47 PM, Chuck Ebbert wrote:

> Gitweb:     http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=11ccc0f249cb01a129f54760b8ff087f242935d4
> Commit:     11ccc0f249cb01a129f54760b8ff087f242935d4
> Parent:     de46c33745f5e2ad594c72f2cf5f490861b16ce1
> Author:     Len Brown <len.brown@intel.com>
> AuthorDate: Mon Apr 30 22:36:01 2007 -0400
> Committer:  Len Brown <len.brown@intel.com>
> CommitDate: Mon Apr 30 22:36:01 2007 -0400
> 
>     ACPI: thermal trip points are read-only


Should one understand that this IS the intended behaviour?

Isn't it the DSDT's job (it is kernel-accessible, isn't it?) to
communicate trip points to the ACPI thermal zone?

Isn't OSPM managing the thermal zone?

(http://acpi.sourceforge.net/documentation/thermal.html)




PS: Sorry for all these (maybe stupid) questions, but I think I
remember that changing trip_points had an effect on a (DSDT-bugged)
laptop I used to use, and I'd like to understand...

PPS: Sorry also for the English mistakes or approximations...




-- 
    ~~
   |Oo|   The ice pack is melting!!! Adopt a penguin...
  /|\/|\
   |__|            => http://doc.ubuntu-fr.org/
   ^__^
~~~|  |~~~

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-16 17:47   ` Chuck Ebbert
  2007-05-16 18:10     ` Goulven Guillard
@ 2007-05-17  9:23     ` Pavel Machek
  2007-05-17 13:36       ` Maciej Rutecki
  2007-05-17 19:17       ` Len Brown
  1 sibling, 2 replies; 41+ messages in thread
From: Pavel Machek @ 2007-05-17  9:23 UTC (permalink / raw)
  To: Chuck Ebbert, len.brown
  Cc: Maciej Rutecki, Andrew Morton, linux-kernel, linux-acpi, lenb,
	torvalds

Hi!

> > In 2.6.20.9 I can change trip points:
> > 
> > echo "105:100:100:78:70:40:30" > /proc/acpi/thermal_zone/TZ0/trip_points
> > echo 10  > /proc/acpi/thermal_zone/TZ0/polling_frequency
> > 
> > Then I got:
> > cat /proc/acpi/thermal_zone/TZ0/*
...
> > Is this a bug or a feature?
> > 
> 
> Committed to mainline May 10:
> 
> Gitweb:     http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=11ccc0f249cb01a129f54760b8ff087f242935d4
> Commit:     11ccc0f249cb01a129f54760b8ff087f242935d4
> Parent:     de46c33745f5e2ad594c72f2cf5f490861b16ce1
> Author:     Len Brown <len.brown@intel.com>
> AuthorDate: Mon Apr 30 22:36:01 2007 -0400
> Committer:  Len Brown <len.brown@intel.com>
> CommitDate: Mon Apr 30 22:36:01 2007 -0400
> 
>     ACPI: thermal trip points are read-only

What was the rationale? Can we get this one reverted? 

Some machines (HP omnibook xe3) have broken trip points -- too high --
so the machine will overheat and trigger hw shutdown before starting
passive cooling.

That's really broken, and writing to trip points is a reasonable way to
'fix' that. (I'd understand if you only ever let trip points
decrease... but otoh root should be able to shoot himself....)

							Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-17  9:23     ` Pavel Machek
@ 2007-05-17 13:36       ` Maciej Rutecki
  2007-05-17 19:08         ` Len Brown
  2007-05-17 19:17       ` Len Brown
  1 sibling, 1 reply; 41+ messages in thread
From: Maciej Rutecki @ 2007-05-17 13:36 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Chuck Ebbert, len.brown, Andrew Morton, linux-kernel, linux-acpi,
	torvalds

Pavel Machek wrote:

> What was the rationale? Can we get this one reverted? 
> 
> Some machines (HP omnibook xe3) have broken trip points -- too high --
> so machine will overheat and trigger hw shutdown before starting
> passive cooling.
> 
> That's really broken, and write to trip points is reasonable way to
> 'fix' that. (I'd understand if you only ever let trip points to
> decrease... but otoh root should be able to shoot himself....)
> 
> 							Pavel

Many people need to change trip points; for example, I have:

cat /proc/acpi/thermal_zone/TZ0/trip_points  | grep critical
critical (S5):           256 C

I _must_ change it to below 105 C, or edit the DSDT table (too difficult
for me). I cannot use this kernel when trip points are read-only.
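
The check described above can be sketched as a small filter over the trip_points text. It assumes the "critical (S5): N C" line format shown in this thread; the default limit of 105 is the figure from this message, and check_critical is a hypothetical name.

```shell
# Read trip_points text on stdin; print the critical value and flag it
# if it exceeds a limit (default 105, the figure used in the message above).
check_critical() {
    limit=${1:-105}
    crit=$(sed -n 's/^critical (S5):[[:space:]]*\([0-9][0-9]*\) C.*/\1/p' | head -n 1)
    if [ -z "$crit" ]; then
        echo "no critical trip point found"
        return 2
    fi
    if [ "$crit" -gt "$limit" ]; then
        echo "critical=$crit C exceeds $limit C"
        return 1
    fi
    echo "critical=$crit C ok"
}

# Example:
# check_critical < /proc/acpi/thermal_zone/TZ0/trip_points
```

The function only reports; it does not attempt to change anything.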

-- 
Maciej Rutecki
www.unixy.pl
Kernel Monkeys
(http://kernel.wikidot.com/)



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-17 13:36       ` Maciej Rutecki
@ 2007-05-17 19:08         ` Len Brown
  2007-05-17 20:09           ` Maciej Rutecki
  2007-05-17 21:53           ` Pavel Machek
  0 siblings, 2 replies; 41+ messages in thread
From: Len Brown @ 2007-05-17 19:08 UTC (permalink / raw)
  To: Maciej Rutecki
  Cc: Pavel Machek, Chuck Ebbert, len.brown, Andrew Morton,
	linux-kernel, linux-acpi, torvalds

On Thursday 17 May 2007 09:36, Maciej Rutecki wrote:

> Many people need change trippoints, for example I have:
> 
> cat /proc/acpi/thermal_zone/TZ0/trip_points  | grep critical
> critical (S5):           256 C
> 
> I _must_ change it to below 105 C, or edit DSDT table (too difficult to
> me). I cannot use this kernel, when trip points are read only.

What bad things happen if you leave the critical trip point at 256?
Do you find that you can drive the temperature over 105 and
the system fails to shut down?

-Len


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-17  9:23     ` Pavel Machek
  2007-05-17 13:36       ` Maciej Rutecki
@ 2007-05-17 19:17       ` Len Brown
  2007-05-17 21:52         ` Pavel Machek
  2007-05-19 19:56         ` Thomas Renninger
  1 sibling, 2 replies; 41+ messages in thread
From: Len Brown @ 2007-05-17 19:17 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Chuck Ebbert, len.brown, Maciej Rutecki, Andrew Morton,
	linux-kernel, linux-acpi, torvalds

On Thursday 17 May 2007 05:23, Pavel Machek wrote:

> >     ACPI: thermal trip points are read-only
> 
> What was the rationale? Can we get this one reverted? 
> 
> Some machines (HP omnibook xe3) have broken trip points -- too high --
> so machine will overheat and trigger hw shutdown before starting
> passive cooling.
> 
> That's really broken, and write to trip points is reasonable way to
> 'fix' that. (I'd understand if you only ever let trip points to
> decrease... but otoh root should be able to shoot himself....)

No, writing trip-points is neither a fix nor reasonable.
It is a workaround at best, and it is a dangerous and mis-leading hack.

The OS has no capability to actually change the ACPI trip points
that are used by the BIOS.  Changing the OS copy of them
to make the user think that trip events will actually
happen when the temperature crosses the OS copy is crazy.

If there are systems with broken thermals and the
ACPI thermal control needs an over-ride to turn
on the fan, then that is fine -- but using
fake trip-points and giving the user the impression
that they are real is not viable.

-Len

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-17 19:08         ` Len Brown
@ 2007-05-17 20:09           ` Maciej Rutecki
  2007-05-17 20:42             ` Maciej Rutecki
  2007-05-17 21:53           ` Pavel Machek
  1 sibling, 1 reply; 41+ messages in thread
From: Maciej Rutecki @ 2007-05-17 20:09 UTC (permalink / raw)
  To: Len Brown
  Cc: Pavel Machek, Chuck Ebbert, len.brown, Andrew Morton,
	linux-kernel, linux-acpi, torvalds

Len Brown wrote:

> What bad things happen if you leave the critical trip point at 256?
> Do you find that you can drive the temperature over 105 and
> the system fails to shut down?
> 
> -Len
> 
> 

It isn't a problem in this case (nx6310). But on the HP nc6220 the first trip
point is at 30 C, so the fan is usually on (noise, power consumption).

-- 
Maciej Rutecki
www.unixy.pl
Kernel Monkeys
(http://kernel.wikidot.com/)



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-17 20:09           ` Maciej Rutecki
@ 2007-05-17 20:42             ` Maciej Rutecki
  0 siblings, 0 replies; 41+ messages in thread
From: Maciej Rutecki @ 2007-05-17 20:42 UTC (permalink / raw)
  To: Maciej Rutecki
  Cc: Len Brown, Pavel Machek, Chuck Ebbert, len.brown, Andrew Morton,
	linux-kernel, linux-acpi, torvalds

Added to bugzilla (Bug 8496)
http://bugzilla.kernel.org/show_bug.cgi?id=8496

-- 
Maciej Rutecki
http://www.maciek.unixy.pl


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-17 19:17       ` Len Brown
@ 2007-05-17 21:52         ` Pavel Machek
  2007-05-17 22:35           ` Len Brown
  2007-05-19 19:56         ` Thomas Renninger
  1 sibling, 1 reply; 41+ messages in thread
From: Pavel Machek @ 2007-05-17 21:52 UTC (permalink / raw)
  To: Len Brown
  Cc: Chuck Ebbert, len.brown, Maciej Rutecki, Andrew Morton,
	linux-kernel, linux-acpi, torvalds

Hi!

> > >     ACPI: thermal trip points are read-only
> > 
> > What was the rationale? Can we get this one reverted? 
> > 
> > Some machines (HP omnibook xe3) have broken trip points -- too high --
> > so machine will overheat and trigger hw shutdown before starting
> > passive cooling.
> > 
> > That's really broken, and write to trip points is reasonable way to
> > 'fix' that. (I'd understand if you only ever let trip points to
> > decrease... but otoh root should be able to shoot himself....)
> 
> No, writing trip-points is neither a fix, nor it is reasonable.
> It is a workaround at best, and it is a dangerous and mis-leading hack.
> 
> The OS has no capability to actually change the ACPI trip points
> that are used by the BIOS.  Changing the OS copy of them
> to make the user think that trip events will actually
> happen when the temperature crosses the OS copy is crazy.

Aha... wait. It seemed to work for me when I enabled thermal
polling...

Slowing the CPU down, shutting down, and turning the fan on are all done
in the OS, after all. Should we just start polling temperatures when the
user writes custom trip points? 

> If there are systems with broken thermals and the
> ACPI thermal control needs and over-ride to turn
> on the fan, then that is fine -- but using
> fake trip-points and giving the user the impression
> that they are real is not viable.

They become real when we fake _TSP too, don't they?
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-17 19:08         ` Len Brown
  2007-05-17 20:09           ` Maciej Rutecki
@ 2007-05-17 21:53           ` Pavel Machek
  2007-05-17 22:42             ` Len Brown
  1 sibling, 1 reply; 41+ messages in thread
From: Pavel Machek @ 2007-05-17 21:53 UTC (permalink / raw)
  To: Len Brown
  Cc: Maciej Rutecki, Chuck Ebbert, len.brown, Andrew Morton,
	linux-kernel, linux-acpi, torvalds

On Thu 2007-05-17 15:08:39, Len Brown wrote:
> On Thursday 17 May 2007 09:36, Maciej Rutecki wrote:
> 
> > Many people need change trippoints, for example I have:
> > 
> > cat /proc/acpi/thermal_zone/TZ0/trip_points  | grep critical
> > critical (S5):           256 C
> > 
> > I _must_ change it to below 105 C, or edit DSDT table (too difficult to
> > me). I cannot use this kernel, when trip points are read only.
> 
> What bad things happen if you leave the critical trip point at 256?
> Do you find that you can drive the temperature over 105 and
> the system fails to shut down?

Something similar happened to me on XE3, yes.

(Actual values were different; the BIOS specified a critical temperature of
about 95C, but the hw killed the power at about 83C. Setting the critical
trip point to 80C made the problem go away.)
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-17 21:52         ` Pavel Machek
@ 2007-05-17 22:35           ` Len Brown
  2007-06-04  9:02             ` Stefan Seyfried
  0 siblings, 1 reply; 41+ messages in thread
From: Len Brown @ 2007-05-17 22:35 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Chuck Ebbert, len.brown, Maciej Rutecki, Andrew Morton,
	linux-kernel, linux-acpi, torvalds

> > No, writing trip-points is neither a fix, nor it is reasonable.
> > It is a workaround at best, and it is a dangerous and mis-leading hack.
> > 
> > The OS has no capability to actually change the ACPI trip points
> > that are used by the BIOS.  Changing the OS copy of them
> > to make the user think that trip events will actually
> > happen when the temperature crosses the OS copy is crazy.
> 
> Aha... wait. It seemed to work for me when I enabled thermal
> polling...

That's exactly the point.
If you allow a user to think they over-rode a trip-point
but that trip point never fires unless they enable polling mode,
then they're not going to get what they asked for.

Yes, SuSE enables polling mode by default, but that is just
distro specific "value add" that should eventually be fixed.
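
Since the argument here turns on whether polling is enabled, a minimal sketch of checking it may help. It assumes the "polling frequency: N seconds" line format shown earlier in the thread; any other text (for example a disabled marker) is treated as 0. The name polling_seconds is hypothetical.

```shell
# Read polling_frequency text on stdin; print the interval in seconds,
# or 0 when no "N seconds" line is present.
polling_seconds() {
    secs=$(sed -n 's/^polling frequency:[[:space:]]*\([0-9][0-9]*\) seconds.*/\1/p' | head -n 1)
    echo "${secs:-0}"
}

# Example:
# polling_seconds < /proc/acpi/thermal_zone/TZ0/polling_frequency
```

By the reasoning above, a result of 0 would mean an overridden trip point never fires on its own.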

> Slowing cpu down / shutdown / turn the fan on is done in the os after
> all. Should we just start polling temperatures when user writes custom
> trip points? 

I actually agree with you for passively cooled embedded systems.
Indeed, that is the topic of one of my OLS papers.

However, for an off-the-shelf laptop that the vendor ships
with a specific active and passive cooling model, Linux
is not currently set up to ignore what the vendor provided
and go off on its own.  Yes, it could be done, but for
99.99% of cases, I expect it would be a mistake.

> > If there are systems with broken thermals and the
> > ACPI thermal control needs and over-ride to turn
> > on the fan, then that is fine -- but using
> > fake trip-points and giving the user the impression
> > that they are real is not viable.
> 
> They become real when we fake _TSP, too, ..?

We are mis-using _TSP today, and over-riding it
is a hack on top of a bug...

_TSP is only supposed to be for the passive cooling
algorithm -- which by definition is polling based.
It is not intended to be used for active cooling at all.
That is what active trip points were invented for...

-Len

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-17 21:53           ` Pavel Machek
@ 2007-05-17 22:42             ` Len Brown
  2007-05-21 12:11               ` Pavel Machek
  0 siblings, 1 reply; 41+ messages in thread
From: Len Brown @ 2007-05-17 22:42 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Maciej Rutecki, Chuck Ebbert, len.brown, Andrew Morton,
	linux-kernel, linux-acpi, torvalds

> Something similar happened to me on XE3, yes.
> 
> (Actual values were different; BIOS specified critical temperature at
> cca 95C, but hw killed the power at cca 83C. Setting critical trip
> point at 80C made the problem go away.)

Great, please file a bug and include the acpidump from the XE3
and we'll fix it, rather than supporting a bogus (manual) workaround for it.

Of course if your system is running at 80*C and the hardware shuts
off at 83*C, you may have a broken fan, or one clogged with dust...

-Len


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-17 19:17       ` Len Brown
  2007-05-17 21:52         ` Pavel Machek
@ 2007-05-19 19:56         ` Thomas Renninger
  2007-05-21  3:50           ` Len Brown
  1 sibling, 1 reply; 41+ messages in thread
From: Thomas Renninger @ 2007-05-19 19:56 UTC (permalink / raw)
  To: Len Brown
  Cc: Pavel Machek, Chuck Ebbert, len.brown, Maciej Rutecki,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

On Thu, 2007-05-17 at 15:17 -0400, Len Brown wrote:
> On Thursday 17 May 2007 05:23, Pavel Machek wrote:
> 
> > >     ACPI: thermal trip points are read-only
> > 
> > What was the rationale? Can we get this one reverted? 
> > 
> > Some machines (HP omnibook xe3) have broken trip points -- too high --
> > so machine will overheat and trigger hw shutdown before starting
> > passive cooling.
> > 
> > That's really broken, and write to trip points is reasonable way to
> > 'fix' that. (I'd understand if you only ever let trip points to
> > decrease... but otoh root should be able to shoot himself....)
> 
> No, writing trip-points is neither a fix, nor it is reasonable.
> It is a workaround at best, and it is a dangerous and mis-leading hack.
Yes, it is a workaround for critical ACPI bugs like that one or similar ones:
https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.17/+bug/22336

It's also convenient to e.g. lower passive trip point to avoid fan
noise.

Some people are used to it. I already wanted to write a little userspace
program that uses them, as it is really easy to fake cooling_mode (trip points
are modified by BIOS) and eliminate fan noise and other things by e.g.
reducing the passive or some other trip point.

This is at least a major sysfs interface change, has this been discussed
somewhere before or declared deprecated?

It's been there for a long time; why is it "a dangerous and mis-leading
hack." now?

I'd suggest reverting this, and I can come up with something like an "only
allow lower values than BIOS provides" patch if the current implementation
is considered dangerous.
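
The "only allow lower values" rule can be illustrated in a few lines. No such patch appears in this thread, so this is just a sketch of the rule as stated, with the hypothetical name clamp_trip; it treats a request equal to the BIOS value as allowed.

```shell
# Accept a requested trip point only if it does not exceed the BIOS value.
# Usage: clamp_trip BIOS_VALUE REQUESTED_VALUE
clamp_trip() {
    bios=$1
    want=$2
    if [ "$want" -le "$bios" ]; then
        echo "$want"
    else
        echo "$bios"   # refuse to raise: keep the BIOS value
    fi
}

clamp_trip 50 30     # lowering an active trip point is accepted
clamp_trip 105 256   # raising the critical trip point is refused
```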

      Thomas

> The OS has no capability to actually change the ACPI trip points
> that are used by the BIOS.  Changing the OS copy of them
> to make the user think that trip events will actually
> happen when the temperature crosses the OS copy is crazy.
> 
> If there are systems with broken thermals and the
> ACPI thermal control needs and over-ride to turn
> on the fan, then that is fine -- but using
> fake trip-points and giving the user the impression
> that they are real is not viable.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-19 19:56         ` Thomas Renninger
@ 2007-05-21  3:50           ` Len Brown
  2007-05-21 11:31             ` Thomas Renninger
  2007-05-21 12:10             ` Pavel Machek
  0 siblings, 2 replies; 41+ messages in thread
From: Len Brown @ 2007-05-21  3:50 UTC (permalink / raw)
  To: trenn
  Cc: Pavel Machek, Chuck Ebbert, len.brown, Maciej Rutecki,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

On Saturday 19 May 2007 15:56, Thomas Renninger wrote:
> On Thu, 2007-05-17 at 15:17 -0400, Len Brown wrote:
> > On Thursday 17 May 2007 05:23, Pavel Machek wrote:
> > 
> > > >     ACPI: thermal trip points are read-only
> > > 
> > > What was the rationale? Can we get this one reverted? 
> > > 
> > > Some machines (HP omnibook xe3) have broken trip points -- too high --
> > > so machine will overheat and trigger hw shutdown before starting
> > > passive cooling.
> > > 
> > > That's really broken, and write to trip points is reasonable way to
> > > 'fix' that. (I'd understand if you only ever let trip points to
> > > decrease... but otoh root should be able to shoot himself....)
> > 
> > No, writing trip-points is neither a fix, nor it is reasonable.
> > It is a workaround at best, and it is a dangerous and mis-leading hack.
> Yes it is a workaround for critical ACPI bugs like that or similar:
> https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.17/+bug/22336

Thanks for pointing that out -- it is a great example
of how powerful mis-information can be.

The fact that the trip-points are writable has obscured,
rather than clarified, the actual causes of the failures.
No less than 4 people in that bug report declared that
cleaning the dust out of their fan fixed the root cause.
A bunch more said that the issues went away when they 
stopped using ubuntu's user-space power save daemon.

There are a couple more with broken active fan control --
which also gets obscured rather than clarified by
over-riding trip points.

And finally, there are probably some with clean fans
that are working properly, but are thermally challenged
systems.  I'll venture that Windows is NOT modifying or disabling
the critical trip point to work around this issue.
I'll venture that their thermal throttling is working
and ours may not be.

perhaps it was the recently fixed mod_timer() bug in thermal.c,
or perhaps it is one that we don't know about yet...

> It's also convenient to e.g. lower passive trip point to avoid fan
> noise.

nope, the OS can't reliably override the processor passive trip point.
That is what _SCP and cooling_mode are for.

The reason is that the BIOS can send us a trip-point changed event at any time,
the kernel will evaluate _PSV, and wipe out the modified OS version.

if you want to change the state of the fans,
then poke /proc/acpi/fan/ directly.
This will have effect until the next trip point
changes its state.

> Some people are used to it, I already wanted to write a little userspace
> prog to use them as it is really easy to fake cooling_mode (trip points
> are modified by BIOS) and eliminate fan noise and other things by e.g.
> reducing passsive or whatever trip point.

please save this effort for a non-ACPI system.

> This is at least a major sysfs interface change, has this been discussed
> somewhere before or declared deprecated?

it went out on linux-acpi, but I don't recall any discussion about it.

> It's there for a long time, why is this "a dangerous and mis-leading
> hack." now?

It has been dangerous and misleading since the day it went in.
If the user doesn't enable polling, then they are effectively
writing random numbers that have absolutely no effect on
the operation of the system, and hiding the numbers that
do control the operation of the system.

> I'd suggest to revert this and I can come with something like "only
> allow lower values
> than BIOS provides" patch if the current implementation is considered
> dangerous.

That simply will not address the issue.
Indeed, all the entries in the ubuntu bug report are about hitting
the critical temperature and having a critical shutdown when
it isn't wanted.  These people want to RAISE the critical shutdown
trip-point.  Their cooling problems must be fixed -- raising critical
trip points causes them instead to be ignored.

For folks with the reverse problem -- active cooling where the
fans kick in earlier than they'd like -- they should just turn off
the fans via /proc/acpi/fan and not mess with the trip points at all.
If they make a mistake, they will be forgiven when the system
reaches the next trip point and turns the fan back on.
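
A minimal sketch of poking a fan directly, as suggested above. It assumes that each fan directory under /proc/acpi/fan/ contains a writable "state" file and that writing 0 turns the fan on (D0) while 3 turns it off (D3). Neither detail is stated in this thread, so check them against the running kernel. The function name fan_set and the fan name FAN0 are hypothetical.

```shell
# Set a fan's power state through the legacy procfs interface.
# Usage: fan_set FAN_DIR on|off
# Assumption: FAN_DIR contains a writable "state" file, where 0 means
# on (D0) and 3 means off (D3).
fan_set() {
    case $2 in
        on)  value=0 ;;
        off) value=3 ;;
        *)   echo "usage: fan_set FAN_DIR on|off" >&2; return 2 ;;
    esac
    echo "$value" > "$1/state"
}

# Example (hypothetical fan name, run as root):
# fan_set /proc/acpi/fan/FAN0 off
```

As noted above, such a setting only holds until the next trip point changes the fan's state.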

thanks,
-Len


> > The OS has no capability to actually change the ACPI trip points
> > that are used by the BIOS.  Changing the OS copy of them
> > to make the user think that trip events will actually
> > happen when the temperature crosses the OS copy is crazy.
> > 
> > If there are systems with broken thermals and the
> > ACPI thermal control needs and over-ride to turn
> > on the fan, then that is fine -- but using
> > fake trip-points and giving the user the impression
> > that they are real is not viable.
> 
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-21  3:50           ` Len Brown
@ 2007-05-21 11:31             ` Thomas Renninger
  2007-05-21 12:10             ` Pavel Machek
  1 sibling, 0 replies; 41+ messages in thread
From: Thomas Renninger @ 2007-05-21 11:31 UTC (permalink / raw)
  To: Len Brown
  Cc: Pavel Machek, Chuck Ebbert, len.brown, Maciej Rutecki,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

On Sun, 2007-05-20 at 23:50 -0400, Len Brown wrote:
> On Saturday 19 May 2007 15:56, Thomas Renninger wrote:
> > On Thu, 2007-05-17 at 15:17 -0400, Len Brown wrote:
> > > On Thursday 17 May 2007 05:23, Pavel Machek wrote:
> > > 
> > > > >     ACPI: thermal trip points are read-only
> > > > 
> > > > What was the rationale? Can we get this one reverted? 
> > > > 
> > > > Some machines (HP omnibook xe3) have broken trip points -- too high --
> > > > so machine will overheat and trigger hw shutdown before starting
> > > > passive cooling.
> > > > 
> > > > That's really broken, and write to trip points is reasonable way to
> > > > 'fix' that. (I'd understand if you only ever let trip points to
> > > > decrease... but otoh root should be able to shoot himself....)
> > > 
> > > No, writing trip-points is neither a fix, nor it is reasonable.
> > > It is a workaround at best, and it is a dangerous and mis-leading hack.
> > Yes it is a workaround for critical ACPI bugs like that or similar:
> > https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.17/+bug/22336
> 
> Thanks for pointing that out -- it is a great example
> of how powerful mis-information can be.
> 
> The fact that the trip-points are writable has obscured,
> rather than clarified, the actual causes of the failures.
> No less than 4 people in that bug report declared that
> cleaning the dust out of their fan fixed the root cause.
> A bunch more said that the issues went away when they 
> stopped using ubuntu's user-space power save daemon.
> 
> There are a couple more with broken active fan control --
> which also gets obscured rather than clarified by
> over-riding trip points.
> 
> And finally, there are probably some with clean fans
> that are working properly, but are thermally challenged
> systems.  I'll venture that Windows is NOT modifying or disabling
> the critical trip point to work around this issue.
> I'll venture that their thermal throttling is working
> and ours may not be.
> 
> perhaps it was the recently fixed mod_timer() bug in thermal.c,
> or perhaps it is one that we don't know about yet...
> 
Whatever it was, it's in a final Ubuntu dist, and the trip point
interface could still help some people to be able to use it.

ACPI is very machine specific. 100 machines may work well, and QA might
overlook the 101st, where critical shutdowns or whatever happen.
Such workarounds are really helpful then.

The same goes for ignoring _PPC and for thermal polling (the latter is
always on in our distro; I bet a lot of machines would break if it were
disabled, and just ripping out the ability to set it is really not a
solution).

One big challenge in the ACPI subsystem (kernel or userspace) is to find
BIOS implementations that are at the limit of the specs, or that violate
the specs, and try to work around them.
We are not in the position of M$ (at least in the desktop/laptop
segment) yet.
BIOS developers won't follow our implementations, and IMO we should go
the other way and provide more workarounds. If nobody needs them, all the
better.

> > It's also convenient to e.g. lower passive trip point to avoid fan
> > noise.
> 
> nope, the OS can't reliably override the processor passive trip point.
> That is what _SCP and cooling_mode are for.
> 
> The reason is that the BIOS can send us a trip-point changed event at any time,
> the kernel will evaluate _PSV, and wipe out the modified OS version.
> 
> if you want to change the state of the fans,
> then poke /proc/acpi/fan/ directly.
> This will have effect until the next trip point
> changes its state.

> 
> > Some people are used to it, I already wanted to write a little userspace
> > prog to use them as it is really easy to fake cooling_mode (trip points
> > are modified by BIOS) and eliminate fan noise and other things by e.g.
> > reducing passive or whatever trip point.
> 
> please save this effort for a non-ACPI system.
> 
> > This is at least a major sysfs interface change, has this been discussed
> > somewhere before or declared deprecated?
> 
> it went out on linux-acpi, but I don't recall any discussion about it.
> 
> > It's there for a long time, why is this "a dangerous and mis-leading
> > hack." now?
> 
> It has been dangerous and misleading since the day it went in.
> If the user doesn't enable polling, then they are effectively
> writing random numbers that have absolutely no effect on
> the operation of the system, and hiding the numbers that
> do control the operation of the system.
> 
> > I'd suggest reverting this, and I can come up with something like an
> > "only allow lower values than BIOS provides" patch if the current
> > implementation is considered dangerous.
> 
> That simply will not address the issue.
> Indeed, all the entries in the ubuntu bug report are about hitting
> the critical temperature and having a critical shutdown when
> it isn't wanted.  These people want to RAISE the critical shutdown
> trip-point.  Their cooling problems must be fixed -- raising critical
> trip points causes them instead to be ignored.
> 
> For folks with the reverse problem -- active cooling where the
> fans kick in earlier than they'd like, they should just turn off
> the fans via /proc/acpi/fan and not mess with the trip points at all.
> If they make a mistake, they will be forgiven when the system
> reaches the next trip point and turns the fan back on.

Yes, it's not correct, and those trip points might get overridden by the
BIOS again on some machines. It still could help and doesn't hurt (OK, one
should not increase the critical trip point, but that can be implemented...).

Again, please go for more workarounds.

The most annoying situation for the developer and the user is when, after
investing a lot of time finding and possibly fixing a bug, you need to
tell the guy:
  - Got it, please wait for the next kernel release coming out in some
    weeks/months
  - Thanks for the work, but implementing it in the kernel of this distro
    version is too dangerous. Other machines might break (especially with
    ACPI bugs). Better you wait for the next distro version coming out in
    half a year.


      Thomas

> 
> > > The OS has no capability to actually change the ACPI trip points
> > > that are used by the BIOS.  Changing the OS copy of them
> > > to make the user think that trip events will actually
> > > happen when the temperature crosses the OS copy is crazy.
> > > 
> > > If there are systems with broken thermals and the
> > > ACPI thermal control needs an over-ride to turn
> > > on the fan, then that is fine -- but using
> > > fake trip-points and giving the user the impression
> > > that they are real is not viable.
> > 
> > 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-21  3:50           ` Len Brown
  2007-05-21 11:31             ` Thomas Renninger
@ 2007-05-21 12:10             ` Pavel Machek
  2007-05-21 13:27               ` Matthew Garrett
  1 sibling, 1 reply; 41+ messages in thread
From: Pavel Machek @ 2007-05-21 12:10 UTC (permalink / raw)
  To: Len Brown
  Cc: trenn, Chuck Ebbert, len.brown, Maciej Rutecki, Andrew Morton,
	linux-kernel, linux-acpi, torvalds

Hi!

> > > No, writing trip-points is neither a fix, nor is it reasonable.
> > > It is a workaround at best, and it is a dangerous and mis-leading hack.
> > Yes it is a workaround for critical ACPI bugs like that or similar:
> > https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.17/+bug/22336
> 
> Thanks for pointing that out -- it is a great example
> of how powerful mis-information can be.
> 
> The fact that the trip-points are writable has obscured,
> rather than clarified, the actual causes of the failures.
> No less than 4 people in that bug report declared that
> cleaning the dust out of their fan fixed the root cause.
> A bunch more said that the issues went away when they 
> stopped using ubuntu's user-space power save daemon.
> 
> There are a couple more with broken active fan control --
> which also gets obscured rather than clarified by
> over-riding trip points.
> 
> And finally, there are probably some with clean fans
> that are working properly, but are thermally challenged
> systems.  I'll venture that Windows is NOT modifying or disabling
> the critical trip point to work around this issue.
> I'll venture that their thermal throttling is working
> and ours may not be.
> 
> perhaps it was the recently fixed mod_timer() bug in thermal.c,
> or perhaps it is one that we don't know about yet...
> 
> > It's also convenient to e.g. lower passive trip point to avoid fan
> > noise.
> 
> nope, the OS can't reliably override the processor passive trip point.
> That is what _SCP and cooling_mode are for.

Yes, it is reliable if you turn on thermal polling.

> The reason is that the BIOS can send us a trip-point changed event at any time,
> the kernel will evaluate _PSV, and wipe out the modified OS version.
> 
> if you want to change the state of the fans,
> then poke /proc/acpi/fan/ directly.

Heh, you suggest this? It is even less functional than current
solution -- which works okay as long as you keep thermal polling
working.

> > It's there for a long time, why is this "a dangerous and mis-leading
> > hack." now?
> 
> It has been dangerous and misleading since the day it went in.
> If the user doesn't enable polling, then they are effectively
> writing random numbers that have absolutely no effect on
> the operation of the system, and hiding the numbers that
> do control the operation of the system.

You are misstating the situation. With thermal polling, it is pretty
much okay, and it is certainly better than "ride fans manually" hack
you suggested.

> For folks with the reverse problem -- active cooling where the
> fans kick in earlier than they'd like, they should just turn off
> the fans via /proc/acpi/fan and not mess with the trip points at
> all.

No. Manually turning off fans is even worse hack.
							Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-17 22:42             ` Len Brown
@ 2007-05-21 12:11               ` Pavel Machek
  2007-06-01  2:46                 ` Len Brown
  0 siblings, 1 reply; 41+ messages in thread
From: Pavel Machek @ 2007-05-21 12:11 UTC (permalink / raw)
  To: Len Brown
  Cc: Maciej Rutecki, Chuck Ebbert, len.brown, Andrew Morton,
	linux-kernel, linux-acpi, torvalds

On Thu 2007-05-17 18:42:43, Len Brown wrote:
> > Something similar happened to me on XE3, yes.
> > 
> > (Actual values were different; BIOS specified critical temperature at
> > cca 95C, but hw killed the power at cca 83C. Setting critical trip
> > point at 80C made the problem go away.)
> 
> Great, please file a bug and include the acpidump from the XE3
> and we'll fix it, rather than supporting a bogus (manual) workaround for it.

I have not had that XE3 machine for a few years.

> Of course if your system is running at 80*C and the hardware shuts
> off at 83*C, you may have a broken fan, or one clogged with dust...

It _did_ have broken fan. It also had broken trip points.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-21 12:10             ` Pavel Machek
@ 2007-05-21 13:27               ` Matthew Garrett
  2007-05-21 13:29                 ` Pavel Machek
  0 siblings, 1 reply; 41+ messages in thread
From: Matthew Garrett @ 2007-05-21 13:27 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Len Brown, trenn, Chuck Ebbert, len.brown, Maciej Rutecki,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

On Mon, May 21, 2007 at 02:10:48PM +0200, Pavel Machek wrote:

> > nope, the OS can't reliably override the processor passive trip point.
> > That is what _SCP and cooling_mode are for.
> 
> Yes, it is reliable if you turn on thermal polling.

As Len says, the system can force a reevaluation of the trip points at 
any time which will wipe out the local settings. Either you ignore the 
spec and the notifications (potentially risking misbehaving hardware) or 
you end up in a perpetual race.

> > if you want to change the state of the fans,
> > then poke /proc/acpi/fan/ directly.
> 
> Heh, you suggest this? It is even less functional than current
> solution -- which works okay as long as you keep thermal polling
> working.

If there are problems with the fan behaviour, why don't we fix them?

> > For folks with the reverse problem -- active cooling where the
> > fans kick in earlier than they'd like, they should just turn off
> > the fans via /proc/acpi/fan and not mess with the trip points at
> > all.
> 
> No. Manually turning off fans is even worse hack.

It's significantly more correct.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-21 13:27               ` Matthew Garrett
@ 2007-05-21 13:29                 ` Pavel Machek
  2007-05-21 13:36                   ` Matthew Garrett
  0 siblings, 1 reply; 41+ messages in thread
From: Pavel Machek @ 2007-05-21 13:29 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Len Brown, trenn, Chuck Ebbert, len.brown, Maciej Rutecki,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

Hi!

> > > For folks with the reverse problem -- active cooling where the
> > > fans kick in earlier than they'd like, they should just turn off
> > > the fans via /proc/acpi/fan and not mess with the trip points at
> > > all.
> > 
> > No. Manually turning off fans is even worse hack.
> 
> It's significantly more correct.

Significantly more correct? It forces you to do all the thermal
management in userspace!
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-21 13:29                 ` Pavel Machek
@ 2007-05-21 13:36                   ` Matthew Garrett
  2007-05-21 13:40                     ` Pavel Machek
  0 siblings, 1 reply; 41+ messages in thread
From: Matthew Garrett @ 2007-05-21 13:36 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Len Brown, trenn, Chuck Ebbert, len.brown, Maciej Rutecki,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

On Mon, May 21, 2007 at 03:29:48PM +0200, Pavel Machek wrote:
> > > No. Manually turning off fans is even worse hack.
> > 
> > It's significantly more correct.
> 
> Significantly more correct? It forces you to do all the thermal
> management in userspace!

Why's that a problem? Overriding the hardware policy has to be done 
somewhere, and doing it in userspace is no more dangerous than 
kernelspace.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-21 13:36                   ` Matthew Garrett
@ 2007-05-21 13:40                     ` Pavel Machek
  2007-05-21 13:45                       ` Matthew Garrett
  0 siblings, 1 reply; 41+ messages in thread
From: Pavel Machek @ 2007-05-21 13:40 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Len Brown, trenn, Chuck Ebbert, len.brown, Maciej Rutecki,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

On Mon 2007-05-21 14:36:08, Matthew Garrett wrote:
> On Mon, May 21, 2007 at 03:29:48PM +0200, Pavel Machek wrote:
> > > > No. Manually turning off fans is even worse hack.
> > > 
> > > It's significantly more correct.
> > 
> > Significantly more correct? It forces you to do all the thermal
> > management in userspace!
> 
> Why's that a problem? 

Duplicating all the kernel logic in userspace, badly?
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-21 13:40                     ` Pavel Machek
@ 2007-05-21 13:45                       ` Matthew Garrett
  2007-05-21 22:42                         ` Pavel Machek
  0 siblings, 1 reply; 41+ messages in thread
From: Matthew Garrett @ 2007-05-21 13:45 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Len Brown, trenn, Chuck Ebbert, len.brown, Maciej Rutecki,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

On Mon, May 21, 2007 at 03:40:46PM +0200, Pavel Machek wrote:
> On Mon 2007-05-21 14:36:08, Matthew Garrett wrote:
> > On Mon, May 21, 2007 at 03:29:48PM +0200, Pavel Machek wrote:
> > > Significantly more correct? It forces you to do all the thermal
> > > management in userspace!
> > 
> > Why's that a problem? 
> 
> Duplicating all the kernel logic in userspace, badly?

So don't do it badly. The advantage of doing so is that you can make it 
work properly, which you can't by putting it in the kernel.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-21 13:45                       ` Matthew Garrett
@ 2007-05-21 22:42                         ` Pavel Machek
  2007-05-22  0:31                           ` Matthew Garrett
  0 siblings, 1 reply; 41+ messages in thread
From: Pavel Machek @ 2007-05-21 22:42 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Len Brown, trenn, Chuck Ebbert, len.brown, Maciej Rutecki,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

On Mon 2007-05-21 14:45:53, Matthew Garrett wrote:
> On Mon, May 21, 2007 at 03:40:46PM +0200, Pavel Machek wrote:
> > On Mon 2007-05-21 14:36:08, Matthew Garrett wrote:
> > > On Mon, May 21, 2007 at 03:29:48PM +0200, Pavel Machek wrote:
> > > > Significantly more correct? It forces you to do all the thermal
> > > > management in userspace!
> > > 
> > > Why's that a problem? 
> > 
> > Duplicating all the kernel logic in userspace, badly?
> 
> So don't do it badly. The advantage of doing so is that you can make it 
> work properly, which you can't by putting it in the kernel.

You want stuff like critical shutdowns to work even if userspace is
dead.

I do not think you can control passive cooling adequately from
userspace, and you can certainly not prevent kernel from slowing
machine down too soon.

Plus, this is actually nasty user-visible change, and a regression
from 2.6.21. I am not sure why we are even debating this; user-kernel
interface changed without warning. Patch should be simply reverted.

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-21 22:42                         ` Pavel Machek
@ 2007-05-22  0:31                           ` Matthew Garrett
  2007-05-22  9:06                             ` Pavel Machek
  2007-05-24 14:16                             ` 2.6.22-rc1-mm1 Implementing fan/thermal control in userspace - Was: " Thomas Renninger
  0 siblings, 2 replies; 41+ messages in thread
From: Matthew Garrett @ 2007-05-22  0:31 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Len Brown, trenn, Chuck Ebbert, len.brown, Maciej Rutecki,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

On Tue, May 22, 2007 at 12:42:00AM +0200, Pavel Machek wrote:
> On Mon 2007-05-21 14:45:53, Matthew Garrett wrote:
> > So don't do it badly. The advantage of doing so is that you can make it 
> > work properly, which you can't by putting it in the kernel.
> 
> You want stuff like critical shutdowns to work even if userspace is
> dead.

I don't think anyone suggested putting the critical shutdown control in 
userspace. The kernel already handles that fine.

> I do not think you can control passive cooling adequately from 
> userspace, and you can certainly not prevent kernel from slowing 
> machine down too soon.

Given the choice between something impossible and something difficult, 
I'm inclined towards picking the difficult one.

> Plus, this is actually nasty user-visible change, and a regression
> from 2.6.21. I am not sure why we are even debating this; user-kernel
> interface changed without warning. Patch should be simply reverted.

In http://lkml.org/lkml/2007/1/27/93 you were more than happy to break 
an interface even though it could be fixed in a (ugly) way that made it 
work again. Here, there's no way to fix this properly - the platform 
will quite happily do things based on what it believes the trip points 
should be, and one of those things may be to alter the trip points. 
Imagine the following situation:

1) Platform sets critical shutdown trip point to 85C
2) Userspace sets critical shutdown trip point to 95C
3) Temperature reaches 90C
4) Platform forces reevaluation of trip points
5) Entire invasion fleet is lost

How do you avoid that? Disable the ability for the platform to set trip 
points? You're breaking the spec and potentially causing hardware 
damage. If you have specific hardware that requires specific spec 
breakage, then a better approach would probably be to quirk the kernel 
to rectify it. On the other hand, if it works with the Other Leading OS, 
we ought to be able to just fix the problem properly.
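
The five steps above can be modelled in a few lines. This is only an
illustrative sketch, not kernel code: the class ThermalZone and its
methods are hypothetical names, and the only behaviour assumed is what
this thread describes, namely that a user override changes the OS copy
of the trip point and that a platform-forced re-evaluation replaces that
copy with the firmware's value.

```python
from dataclasses import dataclass


@dataclass
class ThermalZone:
    """Toy model (hypothetical): firmware trip point plus the OS copy."""
    firmware_critical_c: int  # what the platform believes
    kernel_critical_c: int    # what the OS compares temperatures against

    def user_override(self, new_critical_c: int) -> None:
        # Only the OS copy changes; the firmware value is untouched.
        self.kernel_critical_c = new_critical_c

    def firmware_reevaluate(self) -> None:
        # Models a "trip points changed" event: the OS re-reads the
        # firmware value and the user's override is lost.
        self.kernel_critical_c = self.firmware_critical_c

    def is_critical(self, temperature_c: int) -> bool:
        return temperature_c >= self.kernel_critical_c


zone = ThermalZone(firmware_critical_c=85, kernel_critical_c=85)  # step 1
zone.user_override(95)                                            # step 2
before = zone.is_critical(90)                                     # step 3
zone.firmware_reevaluate()                                        # step 4
after = zone.is_critical(90)                                      # step 5
print(before, after)
```

With the values from the scenario, the 90C reading is not treated as
critical while the override is in place, and becomes critical the moment
the platform forces a re-evaluation, with nothing in between.
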
-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-22  0:31                           ` Matthew Garrett
@ 2007-05-22  9:06                             ` Pavel Machek
  2007-05-22  9:16                               ` Matthew Garrett
  2007-06-04  9:13                               ` Stefan Seyfried
  2007-05-24 14:16                             ` 2.6.22-rc1-mm1 Implementing fan/thermal control in userspace - Was: " Thomas Renninger
  1 sibling, 2 replies; 41+ messages in thread
From: Pavel Machek @ 2007-05-22  9:06 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Len Brown, trenn, Chuck Ebbert, len.brown, Maciej Rutecki,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

Hi!

> > > So don't do it badly. The advantage of doing so is that you can make it 
> > > work properly, which you can't by putting it in the kernel.
> > 
> > You want stuff like critical shutdowns to work even if userspace is
> > dead.
> 
> I don't think anyone suggested putting the critical shutdown control in 
> userspace. The kernel already handles that fine.

No it does not. That is what this thread is about.

(On old xe3, critical trip point set by BIOS is ~95C, but machine dies
by hw safeguard at ~83C. Workaround is to lower critical trip point to
80C or so. Len broke this.)

> Imagine the following situation:
> 
> 1) Platform sets critical shutdown trip point to 85C
> 2) Userspace sets critical shutdown trip point to 95C
> 3) Temperature reaches 90C
> 4) Platform forces reevaluation of trip points
> 5) Entire invasion fleet is lost
> 
> How do you avoid that? Disable the ability for the platform to set trip 
> points? You're breaking the spec and potentially causing hardware 

We need to ignore trip point updates from BIOS, and we need to poll
thermals when the user overrides trip points. That's expected. Plus I've
yet to see platform actually updating the trip points.

Speaking about hw damage... The broken BIOS on xe3 definitely caused
damage to its harddrive, so... we are preventing hw damage here.

(Plus, Len's patch broke the user-kernel interface in the stable series,
without warning.)
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-22  9:06                             ` Pavel Machek
@ 2007-05-22  9:16                               ` Matthew Garrett
  2007-05-22  9:28                                 ` Goulven Guillard
  2007-05-22 10:05                                 ` Maciej Rutecki
  2007-06-04  9:13                               ` Stefan Seyfried
  1 sibling, 2 replies; 41+ messages in thread
From: Matthew Garrett @ 2007-05-22  9:16 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Len Brown, trenn, Chuck Ebbert, len.brown, Maciej Rutecki,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

On Tue, May 22, 2007 at 11:06:36AM +0200, Pavel Machek wrote:

> We need to ignore trip point updates from BIOS, and we need to poll
> thermals when the user overrides trip points. That's expected. Plus I've
> yet to see platform actually updating the trip points.

Try any recent HP bios.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-22  9:16                               ` Matthew Garrett
@ 2007-05-22  9:28                                 ` Goulven Guillard
  2007-05-22 10:05                                 ` Maciej Rutecki
  1 sibling, 0 replies; 41+ messages in thread
From: Goulven Guillard @ 2007-05-22  9:28 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Pavel Machek, Len Brown, trenn, Chuck Ebbert, len.brown,
	Maciej Rutecki, Andrew Morton, linux-kernel, linux-acpi, torvalds

On 05/22/2007 11:16 AM, Matthew Garrett wrote:
> On Tue, May 22, 2007 at 11:06:36AM +0200, Pavel Machek wrote:
> 
>> We need to ignore trip point updates from BIOS, and we need to poll
>> thermals when the user overrides trip points. That's expected. Plus I've
>> yet to see platform actually updating the trip points.
> 
> Try any recent HP bios.
> 

man cron... ;-)





-- 
    ~~
   |Oo|   The ice pack is melting!!! Adopt a penguin...
  /|\/|\
   |__|            => http://doc.ubuntu-fr.org/
   ^__^
~~~|  |~~~









^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-22  9:16                               ` Matthew Garrett
  2007-05-22  9:28                                 ` Goulven Guillard
@ 2007-05-22 10:05                                 ` Maciej Rutecki
  1 sibling, 0 replies; 41+ messages in thread
From: Maciej Rutecki @ 2007-05-22 10:05 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Pavel Machek, Len Brown, trenn, Chuck Ebbert, len.brown,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

[-- Attachment #1: Type: text/plain, Size: 536 bytes --]

Matthew Garrett wrote:
> 
> Try any recent HP bios.
> 

Yes...

HP nx 6310, BIOS versions:
F.06: cpufreq works, MCFG BIOS error in dmesg (PCI: BIOS Bug: MCFG area
      at f8000000 is not E820-reserved)
F.08: as above, plus cpufreq broken
F.09: removes these errors, but reboot takes too long (removing the
      psmouse module doesn't help); some people report this (I didn't
      test it)
F.0B: suspend to RAM broken; after suspend to disk the keyboard doesn't work
F.0D: I don't have the heart to test it...

-- 
Maciej Rutecki
http://www.maciek.unixy.pl

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/x-pkcs7-signature, Size: 3265 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 Implementing fan/thermal control in userspace - Was: [cannot change thermal trip points]
  2007-05-22  0:31                           ` Matthew Garrett
  2007-05-22  9:06                             ` Pavel Machek
@ 2007-05-24 14:16                             ` Thomas Renninger
  2007-05-24 14:36                               ` Matthew Garrett
  1 sibling, 1 reply; 41+ messages in thread
From: Thomas Renninger @ 2007-05-24 14:16 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-acpi

Stripping some CCs; the acpi and kernel lists should be enough for this one
to go to...

On Tue, 2007-05-22 at 01:31 +0100, Matthew Garrett wrote:
> On Tue, May 22, 2007 at 12:42:00AM +0200, Pavel Machek wrote:
> > On Mon 2007-05-21 14:45:53, Matthew Garrett wrote:
> > > So don't do it badly. The advantage of doing so is that you can make it 
> > > work properly, which you can't by putting it in the kernel.
> > 
> > You want stuff like critical shutdowns to work even if userspace is
> > dead.
> 
> I don't think anyone suggested putting the critical shutdown control in 
> userspace. The kernel already handles that fine.
> 
> > I do not think you can control passive cooling adequately from 
> > userspace, and you can certainly not prevent kernel from slowing 
> > machine down too soon.
> 
> Given the choice between something impossible and something difficult, 
> I'm inclined towards picking the difficult one.

I doubt it is impossible, would you mind sharing your knowledge why you
think it is impossible or point to some related discussion, pls.

Does this mean checking temperature against trip points and adjust fan
and cpufreq should be done in a hal module?
In which stage is this, rfc, development, already in some git tree?

Yes, trip points are overridden by BIOS on HPs and what is the problem?
The workaround won't work for them, but it still does on others
(mainly on ThinkPads which have passive tp at about 89 C and critical on
91 C).

I could imagine an implementation for this, that e.g. critical...active9
get module parameters. BIOS updates for trip points get ignored as soon
as one is set and you can only decrease a value. Nothing bad can happen
and it will make some people happy (yes it's hacky, violates the specs
and so on..., but some more people have a working machine). Will this
(or similar) get accepted?
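
The rule proposed here (an override may only lower a value, and BIOS
updates are ignored once any override is set) could look roughly like
the following. It is a sketch of the policy only: TripPoints and its
methods are hypothetical names, not an existing kernel or module
interface, and the 89 C / 91 C values are the ThinkPad figures quoted
above.

```python
class TripPoints:
    """Hypothetical model of lower-only trip point overrides."""

    def __init__(self, bios_values_c):
        self.values_c = dict(bios_values_c)
        self.overridden = False

    def set_override(self, name, value_c):
        # Accept the override only if it lowers the current value.
        if value_c >= self.values_c[name]:
            return False
        self.values_c[name] = value_c
        self.overridden = True
        return True

    def bios_update(self, new_values_c):
        # Once any override is set, updates from the BIOS are ignored.
        if self.overridden:
            return False
        self.values_c = dict(new_values_c)
        return True


tp = TripPoints({"critical": 91, "passive": 89})
raised = tp.set_override("critical", 95)   # rejected: would raise the value
lowered = tp.set_override("passive", 80)   # accepted: lowers the value
updated = tp.bios_update({"critical": 91, "passive": 89})  # ignored
print(tp.values_c, raised, lowered, updated)
```

Note that the bios_update() branch is exactly the part that departs from
the spec: after an override, whatever the platform reports later is
dropped.
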

It's even more impossible to get ACPI working correctly for all machines
and all subsystems, these little workarounds can help some people to at
least use their machine or get some parts working better.

   Thomas


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 Implementing fan/thermal control in userspace - Was: [cannot change thermal trip points]
  2007-05-24 14:16                             ` 2.6.22-rc1-mm1 Implementing fan/thermal control in userspace - Was: " Thomas Renninger
@ 2007-05-24 14:36                               ` Matthew Garrett
  2007-05-24 18:18                                 ` Thomas Renninger
  2007-05-25  6:38                                 ` Pavel Machek
  0 siblings, 2 replies; 41+ messages in thread
From: Matthew Garrett @ 2007-05-24 14:36 UTC (permalink / raw)
  To: Thomas Renninger; +Cc: linux-kernel, linux-acpi

On Thu, May 24, 2007 at 04:16:53PM +0200, Thomas Renninger wrote:

> I doubt it is impossible, would you mind sharing your knowledge why you
> think it is impossible or point to some related discussion, pls.

Because, as Len has pointed out, you end up with two different ideas 
about what the trip points are - the kernel's and the hardware's. That 
works fine until some event in the firmware either forcibly 
resynchronises the two or makes assumptions about the spec-compliance of 
the interpreter.

> Yes, trip points are overridden by BIOS on HPs and what is the problem?
> The workaround won't work for them, but it still does on others
> (mainly on ThinkPads which have passive tp at about 89 C and critical on
> 91 C).

You don't know whether the workaround will work or not until you've 
performed a full audit of the platform firmware, which is going to 
potentially change between BIOS versions. It's entirely legal for the 
firmware to behave in this way, and even beneficial under various 
circumstances.

> I could imagine an implementation for this, that e.g. critical...active9
> get module parameters. BIOS updates for trip points get ignored as soon
> as one is set and you can only decrease a value. Nothing bad can happen
> and it will make some people happy (yes it's hacky, violates the specs
> and so on..., but some more people have a working machine). Will this
> (or similar) get accepted?

The interface would need to be more complicated than that if you wanted 
to be able to implement hysteresis, and there's the potential for 
hardware damage if parameters are set inappropriately. Even then, 
there's no easy way of programmatically determining whether it would work 
on any given hardware.
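
To illustrate why a single value per trip point is not enough for
hysteresis, here is a minimal sketch of a fan decision with separate
switch-on and switch-off temperatures. The function name and the
50 C / 45 C thresholds are made up for the example; nothing here is
taken from the ACPI thermal driver.

```python
def next_fan_state(fan_on, temperature_c, on_at_c, off_at_c):
    """Hypothetical hysteresis rule: on at or above on_at_c, off at or
    below off_at_c, otherwise keep the previous state."""
    if temperature_c >= on_at_c:
        return True
    if temperature_c <= off_at_c:
        return False
    return fan_on


fan = False
states = []
for temperature_c in [40, 52, 48, 46, 44, 51]:
    fan = next_fan_state(fan, temperature_c, on_at_c=50, off_at_c=45)
    states.append(fan)
print(states)
```

Between the two thresholds the fan keeps its previous state, so the 48
and 46 readings leave it running; a single-threshold interface would
switch it off and on again around 50 C.
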

> It's even more impossible to get ACPI working correctly for all machines
> and all subsystems, these little workarounds can help some people to at
> least use their machine or get some parts working better.

It's fairly clearly not impossible, given that there exists at least one 
OS that these machines work with.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.22-rc1-mm1 Implementing fan/thermal control in userspace - Was: [cannot change thermal trip points]
  2007-05-24 14:36                               ` Matthew Garrett
@ 2007-05-24 18:18                                 ` Thomas Renninger
  2007-05-25  6:38                                 ` Pavel Machek
  1 sibling, 0 replies; 41+ messages in thread
From: Thomas Renninger @ 2007-05-24 18:18 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-acpi

On Thu, 2007-05-24 at 15:36 +0100, Matthew Garrett wrote:
> On Thu, May 24, 2007 at 04:16:53PM +0200, Thomas Renninger wrote:
> 
> > I doubt it is impossible, would you mind sharing your knowledge why you
> > think it is impossible or point to some related discussion, pls.
> 
> Because, as Len has pointed out, you end up with two different ideas 
> about what the trip points are - the kernel's and the hardware's. That 
> works fine until some event in the firmware either forcibly 
> resynchronises the two or makes assumptions about the spec-compliance of 
> the interpreter.

Not sure what exactly you'd like to do in userspace, maybe you can be a
bit more precise here:
  a) Doing the whole thermal management in userspace: reading temp,
     writing fan and cpufreq_max_freq, shutting down the machine,...
  b) Working around fans not switching on by double checking
     fan/temperature in a userspace daemon and finally trying to trigger
     the switch by writing to /proc/acpi/fan/state (or corresponding
     /sys,..)

IMO we need some kind of fan watchdog like Henrique described
recently; maybe this could be put in userspace, not sure.
Currently the fan can easily get out of sync if the fan state is
changed behind the OS's back.
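
The out-of-sync check that such a watchdog would perform can be sketched
as below. This is only an illustration in Python under stated
assumptions: the function name, the fan names and the dictionary layout
are hypothetical, reading temperatures and fan states from /proc or /sys
is left out, and treating a threshold as crossed only when the
temperature is strictly above it is an assumption.

```python
def fans_out_of_sync(temp_c, active_trips, fan_is_on):
    """List fans whose reported state disagrees with the trip points.

    active_trips: {fan_name: threshold_c}; a fan is expected to run when
    the temperature is strictly above its threshold (an assumption).
    fan_is_on: {fan_name: bool} as reported by the platform.
    Returns (fan_name, expected_state) pairs for every mismatch. A fan
    with no reported state is also returned as a mismatch.
    """
    mismatches = []
    for name, threshold_c in sorted(active_trips.items()):
        expected = temp_c > threshold_c
        if fan_is_on.get(name) != expected:
            mismatches.append((name, expected))
    return mismatches
```

A daemon built around this would still have to decide what to do with
each mismatch, and writing the fan state from userspace is exactly the
step that can desynchronise the firmware's view from the kernel's.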


> > Yes, trip points are overridden by BIOS on HPs and what is the problem?
> > The workaround won't work for them, but it still does on others
> > (mainly on ThinkPads which have passive tp at about 89 C and critical on
> > 91 C).
> 
> You don't know whether the workaround will work or not
Hmm, I don't get the point. If it works it's great, if not you have a
problem anyway and can at least test a workaround.
>  until you've 
> performed a full audit of the platform firmware, which is going to 
> potentially change between BIOS versions. It's entirely legal for the 
> firmware to behave in this way, and even beneficial under various 
> circumstances.
But that's exactly what all these workarounds are for. You pass them if
you have a buggy BIOS. You wait for new BIOSes and hope that you can get
rid of the workaround...

> > I could imagine an implementation for this, that e.g. critical...active9
> > get module parameters. BIOS updates for trip points get ignored as soon
> > as one is set and you can only decrease a value. Nothing bad can happen
> > and it will make some people happy (yes it's hacky, violates the specs
> > and so on..., but some more people have a working machine). Will this
> > (or similar) get accepted?
> 
> The interface would need to be more complicated than that if you wanted 
> to be able to implement hysteresis, and there's the potential for 
> hardware damage if parameters are set inappropriately. Even then,
> there's no easy way of programmatically determining whether it would work
> on any given hardware.

The fact that 3 people complained rather quickly about a patch in
rc1-mm1 suggests that this workaround is needed. I personally advised
two guys to use it with their ThinkPads in the summer and they are happy
with it.

I'd also like to have this extended a bit: to be able to modify just
the passive trip point.
IMO this is a very powerful feature, allowing people a fanless system as
long as they have a cpufreq capable processor.

The idea of having this in userspace is interesting but, as said,
rather complicated to implement. The hysteresis implementation for
passive cooling works fine in the kernel and is field tested, so it
should get used.

The problem with the ACPI spec is that it's rather complicated, mainly
from a BIOS developer's point of view, as far as I can tell.
Therefore it's rather seldom picked up by BIOS vendors.
However, for the kernel it's easy (to fake, to do) and it's working fine,
so why not make use of it?

IMO we should even provide a passive trip point (initially unused) when
none is defined by the BIOS.

I agree that it's hard to find the temperature that keeps the fan from
kicking in automatically. But then it's really easy for everyone to:
  - get a fanless system
  - work around critical shutdowns
and all this is safe with respect to HW damage.

IMO this is an area where we can easily behave better than M$ does.

Maybe my first mails were a bit offensive, I don't know; we should get
this back to an objective discussion.

I would especially like to have some comments from Len before doing any
work for nothing (or before giving up):
   - Would such a passive trip point override be acceptable in any way
     (be it in userspace, kernel space or in whatever form -> to be
     discussed)
   - Would such a workaround as I described in my mail before be
     acceptable
   - If done in userspace, what should it look like exactly

Thanks,

   Thomas



* Re: 2.6.22-rc1-mm1 Implementing fan/thermal control in userspace - Was: [cannot change thermal trip points]
  2007-05-24 14:36                               ` Matthew Garrett
  2007-05-24 18:18                                 ` Thomas Renninger
@ 2007-05-25  6:38                                 ` Pavel Machek
  2007-05-27 21:51                                   ` Matthew Garrett
  1 sibling, 1 reply; 41+ messages in thread
From: Pavel Machek @ 2007-05-25  6:38 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Thomas Renninger, linux-kernel, linux-acpi

Hi!

> 
> > I doubt it is impossible, would you mind sharing your knowledge why you
> > think it is impossible or point to some related discussion, pls.
> 
> Because, as Len has pointed out, you end up with two different ideas 
> about what the trip points are - the kernel's and the hardware's. That 
> works fine until some event in the firmware either forcibly 
> resynchronises the two or makes assumptions about the spec-compliance of 
> the interpreter.

...and suggested workaround is to drive fans directly from userspace,
which not only violates the specs and has all the problems with
desynchronized state, but ALSO FAILS TO WORK IN PRACTICE.

> > I could imagine an implementation for this, that e.g. critical...active9
> > get module parameters. BIOS updates for trip points get ignored as soon
> > as one is set and you can only decrease a value. Nothing bad can happen
> > and it will make some people happy (yes it's hacky, violates the specs
> > and so on..., but some more people have a working machine). Will this
> > (or similar) get accepted?
> 
> The interface would need to be more complicated than that if you wanted 
> to be able to implement hysteresis, and there's the potential for 
> hardware damage if parameters are set inappropriately. Even then,
> there's no easy way of programmatically determining whether it would work
> on any given hardware.

Not sure why you try to scare people with 'hardware damage'. HP XE3
bios already _was_ damaging hardware (it cooked the hard drive using
cpu as a heater), and no acpi magic can damage a correctly working
machine.
							Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: 2.6.22-rc1-mm1 Implementing fan/thermal control in userspace - Was: [cannot change thermal trip points]
  2007-05-25  6:38                                 ` Pavel Machek
@ 2007-05-27 21:51                                   ` Matthew Garrett
  2007-05-28 10:58                                     ` Pavel Machek
  0 siblings, 1 reply; 41+ messages in thread
From: Matthew Garrett @ 2007-05-27 21:51 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Thomas Renninger, linux-kernel, linux-acpi

On Fri, May 25, 2007 at 06:38:15AM +0000, Pavel Machek wrote:
> Hi!
> > Because, as Len has pointed out, you end up with two different ideas 
> > about what the trip points are - the kernel's and the hardware's. That 
> > works fine until some event in the firmware either forcibly 
> > resynchronises the two or makes assumptions about the spec-compliance of 
> > the interpreter.
> 
> ...and suggested workaround is to drive fans directly from userspace,
> which not only violates the specs and has all the problems with
> desynchronized state, but ALSO FAILS TO WORK IN PRACTICE.

I don't think that's obviously true. 11.3.2 of the 3.0 spec states:

"A package consisting of references to all active cooling devices that 
should be engaged when the associated active cooling threshold (_ACx) is 
exceeded." 

(referring to _ALx objects).

> > The interface would need to be more complicated than that if you wanted 
> > to be able to implement hysteresis, and there's the potential for 
> > hardware damage if parameters are set inappropriately. Even then,
> > there's no easy way of programmatically determining whether it would work
> > on any given hardware.
> 
> Not sure why you try to scare people with 'hardware damage'. HP XE3
> bios already _was_ damaging hardware (it cooked the hard drive using
> cpu as a heater), and no acpi magic can damage a correctly working
> machine.

Given that this presumably didn't occur under Windows, I think it would 
be significantly better to figure out why and then fix that. 
Alternatively, if the firmware tables are actually genuinely broken in a 
way that's impossible to repair, you can replace the table. That has the 
advantage that there's no risk of the platform and the OS becoming 
confused.

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: 2.6.22-rc1-mm1 Implementing fan/thermal control in userspace - Was: [cannot change thermal trip points]
  2007-05-27 21:51                                   ` Matthew Garrett
@ 2007-05-28 10:58                                     ` Pavel Machek
  2007-05-28 12:50                                       ` Matthew Garrett
  0 siblings, 1 reply; 41+ messages in thread
From: Pavel Machek @ 2007-05-28 10:58 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Thomas Renninger, linux-kernel, linux-acpi

Hi!

> > > Because, as Len has pointed out, you end up with two different ideas 
> > > about what the trip points are - the kernel's and the hardware's. That 
> > > works fine until some event in the firmware either forcibly 
> > > resynchronises the two or makes assumptions about the spec-compliance of 
> > > the interpreter.
> > 
> > ...and suggested workaround is to drive fans directly from userspace,
> > which not only violates the specs and has all the problems with
> > desynchronized state, but ALSO FAILS TO WORK IN PRACTICE.
> 
> I don't think that's obviously true. 11.3.2 of the 3.0 spec states:

> "A package consisting of references to all active cooling devices that 
> should be engaged when the associated active cooling threshold (_ACx) is 
> exceeded." 

We'd need:

a) a way to tell acpi not to control fans any more

b) an in-kernel watchdog so that acpi starts controlling fans again
after the oom killer kills the userspace daemon

c) a way to control passive cooling from userspace.

Not something doable for 2.6.22.  
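
Requirements a) and b) amount to a handoff with a timeout: userspace
drives the fans only while it keeps proving that it is alive. A rough
sketch in Python follows; the class `FanControlHandoff`, its methods and
the timeout value are hypothetical, no such kernel interface is described
in this thread, and c) is not covered.

```python
class FanControlHandoff:
    """Hypothetical model: userspace owns the fans only while it keeps pinging."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_ping_s = None  # None means the kernel is in control

    def ping(self, now_s):
        """Userspace claims or renews control of the fans (requirement a)."""
        self.last_ping_s = now_s

    def controller(self, now_s):
        """Return who should drive the fans at time now_s."""
        if self.last_ping_s is None:
            return "kernel"
        if now_s - self.last_ping_s > self.timeout_s:
            # The daemon went silent (for example, it was killed by the
            # OOM killer), so control falls back to the kernel
            # (requirement b) until userspace pings again.
            self.last_ping_s = None
            return "kernel"
        return "userspace"
```

Timestamps are passed in explicitly so the logic can be checked without
a real clock; a real implementation would live in the kernel and use its
own timers.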


> > > The interface would need to be more complicated than that if you wanted 
> > > to be able to implement hysteresis, and there's the potential for 
> > > hardware damage if parameters are set inappropriately. Even then,
> > > there's no easy way of programmatically determining whether it would work
> > > on any given hardware.
> > 
> > Not sure why you try to scare people with 'hardware damage'. HP XE3
> > bios already _was_ damaging hardware (it cooked the hard drive using
> > cpu as a heater), and no acpi magic can damage a correctly working
> > machine.
> 
> Given that this presumably didn't occur under Windows, I think it would
> be significantly better to figure out why and then fix that. 

It would happily occur under Windows. You just needed to load the machine
in such a way that the cpu stayed at ~80C.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: 2.6.22-rc1-mm1 Implementing fan/thermal control in userspace - Was: [cannot change thermal trip points]
  2007-05-28 10:58                                     ` Pavel Machek
@ 2007-05-28 12:50                                       ` Matthew Garrett
  0 siblings, 0 replies; 41+ messages in thread
From: Matthew Garrett @ 2007-05-28 12:50 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Thomas Renninger, linux-kernel, linux-acpi

On Mon, May 28, 2007 at 12:58:51PM +0200, Pavel Machek wrote:

> It would happily occur under Windows. You just needed to load the machine
> in such a way that the cpu stayed at ~80C.

So replace the DSDT. All the problems get solved that way.
-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-21 12:11               ` Pavel Machek
@ 2007-06-01  2:46                 ` Len Brown
  2007-06-04 11:16                   ` Pavel Machek
  0 siblings, 1 reply; 41+ messages in thread
From: Len Brown @ 2007-06-01  2:46 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Maciej Rutecki, Chuck Ebbert, len.brown, Andrew Morton,
	linux-kernel, linux-acpi, torvalds, Matthew Garrett

On Monday 21 May 2007 08:11, Pavel Machek wrote:
> On Thu 2007-05-17 18:42:43, Len Brown wrote:
> > > Something similar happened to me on XE3, yes.
> > > 
> > > (Actual values were different; BIOS specified critical temperature at
> > > cca 95C, but hw killed the power at cca 83C. Setting critical trip
> > > point at 80C made the problem go away.)
> > 
> > Great, please file a bug and include the acpidump from the XE3
> > and we'll fix it, rather than supporting a bogus (manual) workaround for it.
> 
> It is few years since I do not have that XE3 machine.
> 
> > Of course if your system is running at 80*C and the hardware shuts
> > off at 83*C, you may have a broken fan, or one clogged with dust...
> 
> It _did_ have broken fan. It also had broken trip points.

Thanks for clarifying this, Pavel.
If you come upon an XE3 where Linux-2.6.22 doesn't work as well
as Windows, please let me know.

Given that the justification for this ill-conceived workaround
seems to have diminished to the memory of broken hardware,
it is clear that we should stay the course of removing it
so that it doesn't further confuse future users.

If SuSE violently disagrees with me, you are certainly empowered
to restore the workaround in your distribution starting at 2.6.22
as part of your value add.  However, given its history of confusing
users, it seems that it might increase your support burden rather
than decrease it.

-Len


* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-17 22:35           ` Len Brown
@ 2007-06-04  9:02             ` Stefan Seyfried
  2007-06-04 11:06               ` Pavel Machek
  0 siblings, 1 reply; 41+ messages in thread
From: Stefan Seyfried @ 2007-06-04  9:02 UTC (permalink / raw)
  To: Len Brown
  Cc: Pavel Machek, Chuck Ebbert, len.brown, Maciej Rutecki,
	Andrew Morton, linux-kernel, linux-acpi, torvalds

On Thu, May 17, 2007 at 06:35:48PM -0400, Len Brown wrote:
 
> Yes, SuSE enables polling mode by default, but that is just
> distro specific "value add" that should eventually be fixed.

I will do that for openSUSE FACTORY.
-- 
Stefan Seyfried
QA / R&D Team Mobile Devices        |              "Any ideas, John?"
SUSE LINUX Products GmbH, Nürnberg  | "Well, surrounding them's out." 

This footer brought to you by insane German lawmakers:
SUSE Linux Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-05-22  9:06                             ` Pavel Machek
  2007-05-22  9:16                               ` Matthew Garrett
@ 2007-06-04  9:13                               ` Stefan Seyfried
  1 sibling, 0 replies; 41+ messages in thread
From: Stefan Seyfried @ 2007-06-04  9:13 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Matthew Garrett, Len Brown, trenn, Chuck Ebbert, len.brown,
	Maciej Rutecki, Andrew Morton, linux-kernel, linux-acpi, torvalds

On Tue, May 22, 2007 at 11:06:36AM +0200, Pavel Machek wrote:
 
> We need to ignore trip point updates from BIOS, and we need to poll
> thermals when use overrides trip points. That's expected. Plus I've
> yet to see platform actually updating the trip points.

Thinkpad 600, whenever a trip point is crossed, all trip points are updated.
I think they implemented hysteresis that way.
ISTR that hp nx5000 did something similar, but I might be wrong on this one.
-- 
Stefan Seyfried
QA / R&D Team Mobile Devices        |              "Any ideas, John?"
SUSE LINUX Products GmbH, Nürnberg  | "Well, surrounding them's out." 

This footer brought to you by insane German lawmakers:
SUSE Linux Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)


* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-06-04  9:02             ` Stefan Seyfried
@ 2007-06-04 11:06               ` Pavel Machek
  0 siblings, 0 replies; 41+ messages in thread
From: Pavel Machek @ 2007-06-04 11:06 UTC (permalink / raw)
  To: Stefan Seyfried
  Cc: Len Brown, Chuck Ebbert, len.brown, Maciej Rutecki, Andrew Morton,
	linux-kernel, linux-acpi, torvalds

On Mon 2007-06-04 11:02:01, Stefan Seyfried wrote:
> On Thu, May 17, 2007 at 06:35:48PM -0400, Len Brown wrote:
>  
> > Yes, SuSE enables polling mode by default, but that is just
> > distro specific "value add" that should eventually be fixed.
> 
> I will do that for openSUSE FACTORY.

Well, I still believe the right solution is to enable polling mode as
soon as trip points are written (and to ignore bios updates from then
on). That gets trip point writing into a functional state.
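
A minimal sketch of that behaviour in Python, purely for illustration:
the class `ThermalZoneModel`, its method names and the 10 second default
polling interval are assumptions, not the kernel's actual data structures
or defaults.

```python
class ThermalZoneModel:
    """Hypothetical model of a thermal zone with user-writable trip points."""

    def __init__(self, trip_points_c, polling_s=0):
        self.trip_points_c = list(trip_points_c)
        self.polling_s = polling_s  # 0 means event-driven, no polling
        self.user_override = False

    def write_trip_points(self, values_c, default_polling_s=10):
        """User writes new trip points; polling is switched on if it was off."""
        self.trip_points_c = list(values_c)
        self.user_override = True
        if self.polling_s == 0:
            self.polling_s = default_polling_s

    def firmware_update(self, values_c):
        """Firmware announces new trip points; ignored after a user write."""
        if self.user_override:
            return False
        self.trip_points_c = list(values_c)
        return True
```

Once `write_trip_points` has been called, the model and the firmware
hold different values, which is the divergence the objections in this
thread are about.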

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: 2.6.22-rc1-mm1 [cannot change thermal trip points]
  2007-06-01  2:46                 ` Len Brown
@ 2007-06-04 11:16                   ` Pavel Machek
  0 siblings, 0 replies; 41+ messages in thread
From: Pavel Machek @ 2007-06-04 11:16 UTC (permalink / raw)
  To: Len Brown
  Cc: Maciej Rutecki, Chuck Ebbert, len.brown, Andrew Morton,
	linux-kernel, linux-acpi, torvalds, Matthew Garrett

On Thu 2007-05-31 22:46:11, Len Brown wrote:
> On Monday 21 May 2007 08:11, Pavel Machek wrote:
> > On Thu 2007-05-17 18:42:43, Len Brown wrote:
> > > > Something similar happened to me on XE3, yes.
> > > > 
> > > > (Actual values were different; BIOS specified critical temperature at
> > > > cca 95C, but hw killed the power at cca 83C. Setting critical trip
> > > > point at 80C made the problem go away.)
> > > 
> > > Great, please file a bug and include the acpidump from the XE3
> > > and we'll fix it, rather than supporting a bogus (manual) workaround for it.
> > 
> > It is few years since I do not have that XE3 machine.
> > 
> > > Of course if your system is running at 80*C and the hardware shuts
> > > off at 83*C, you may have a broken fan, or one clogged with dust...
> > 
> > It _did_ have broken fan. It also had broken trip points.
> 
> Thanks for clarifying this, Pavel.
> If you come upon an XE3 where Linux-2.6.22 doesn't work as well
> as Windows, please let me know.

"work as well as windows" is not a good enough goal as far as I'm
concerned. Please don't break working setups.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


end of thread, other threads:[~2007-06-04 11:16 UTC | newest]

Thread overview: 41+ messages
     [not found] <20070515201914.16944e04.akpm@linux-foundation.org>
2007-05-16 17:37 ` 2.6.22-rc1-mm1 [cannot change thermal trip points] Maciej Rutecki
2007-05-16 17:47   ` Chuck Ebbert
2007-05-16 18:10     ` Goulven Guillard
2007-05-17  9:23     ` Pavel Machek
2007-05-17 13:36       ` Maciej Rutecki
2007-05-17 19:08         ` Len Brown
2007-05-17 20:09           ` Maciej Rutecki
2007-05-17 20:42             ` Maciej Rutecki
2007-05-17 21:53           ` Pavel Machek
2007-05-17 22:42             ` Len Brown
2007-05-21 12:11               ` Pavel Machek
2007-06-01  2:46                 ` Len Brown
2007-06-04 11:16                   ` Pavel Machek
2007-05-17 19:17       ` Len Brown
2007-05-17 21:52         ` Pavel Machek
2007-05-17 22:35           ` Len Brown
2007-06-04  9:02             ` Stefan Seyfried
2007-06-04 11:06               ` Pavel Machek
2007-05-19 19:56         ` Thomas Renninger
2007-05-21  3:50           ` Len Brown
2007-05-21 11:31             ` Thomas Renninger
2007-05-21 12:10             ` Pavel Machek
2007-05-21 13:27               ` Matthew Garrett
2007-05-21 13:29                 ` Pavel Machek
2007-05-21 13:36                   ` Matthew Garrett
2007-05-21 13:40                     ` Pavel Machek
2007-05-21 13:45                       ` Matthew Garrett
2007-05-21 22:42                         ` Pavel Machek
2007-05-22  0:31                           ` Matthew Garrett
2007-05-22  9:06                             ` Pavel Machek
2007-05-22  9:16                               ` Matthew Garrett
2007-05-22  9:28                                 ` Goulven Guillard
2007-05-22 10:05                                 ` Maciej Rutecki
2007-06-04  9:13                               ` Stefan Seyfried
2007-05-24 14:16                             ` 2.6.22-rc1-mm1 Implementing fan/thermal control in userspace - Was: " Thomas Renninger
2007-05-24 14:36                               ` Matthew Garrett
2007-05-24 18:18                                 ` Thomas Renninger
2007-05-25  6:38                                 ` Pavel Machek
2007-05-27 21:51                                   ` Matthew Garrett
2007-05-28 10:58                                     ` Pavel Machek
2007-05-28 12:50                                       ` Matthew Garrett
