* ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
2006-11-17 20:40 ` 2.6.19-rc6: known regressions (v2) Adrian Bunk
@ 2006-11-17 23:58 ` Linus Torvalds
2006-11-18 1:25 ` Linus Torvalds
0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2006-11-17 23:58 UTC (permalink / raw)
To: Len Brown, Adrian Bunk, Andrew Morton; +Cc: David Brownell, linux-acpi
On Fri, 17 Nov 2006, Adrian Bunk wrote:
>
> Subject : nasty ACPI regression, AE_TIME errors
> References : http://lkml.org/lkml/2006/11/15/12
> Submitter : David Brownell <david-b@pacbell.net>
> Handled-By : Len Brown <len.brown@intel.com>
> Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
> Status : problem is being debugged
I do not know if this is related, but testing one of my laptops (always a
good idea to check the week before release) shows that my trusty old
Compaq N620c locks up rather quickly at boot with the current -git tree.
Total lockup - no sysrq, no messages, no nothing.
I've mostly bisected it (what the _hell_ did we do before "git bisect"?),
and right now I know:
commit 9aaed2b42d00d4abb2748d72d599a8033600e2bf is bad (that's Len's "pull
trivial into test branch" commit).
v2.6.19-rc2 seems all good.
Which leaves a chunk of just a few ACPI commits left to bisect.
I'll do five or so more reboots, and I should be able to tell exactly
which commit breaks. It almost always locks up very early during boot
(generally during the "initializing udev" phase), although sometimes it
survives a bit further..
Linus
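[ Annotation: the narrowing described above can be sketched as a plain binary
search. This is only an illustration under simplifying assumptions: history
is treated as a linear array of commits, every commit after the first bad one
is assumed bad, and the names bisect_first_bad() and bad_from() are
hypothetical. Real "git bisect" works on the commit graph, not on an array. ]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test hook: nonzero means "the kernel at this index locks up". */
typedef int (*is_bad_fn)(size_t index, void *ctx);

/*
 * Find the first bad index in (good, bad], given that `good` is known to work
 * and `bad` is known to fail. Each loop iteration stands for one build and
 * reboot; *tests counts how many were needed.
 */
size_t bisect_first_bad(size_t good, size_t bad, is_bad_fn is_bad,
                        void *ctx, unsigned *tests)
{
    assert(good < bad);
    *tests = 0;
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        ++*tests;
        if (is_bad(mid, ctx))
            bad = mid;   /* breakage is at mid or earlier */
        else
            good = mid;  /* breakage is after mid */
    }
    return bad;
}

/* Example predicate for experiments: everything from *ctx onwards is bad. */
int bad_from(size_t index, void *ctx)
{
    return index >= *(size_t *)ctx;
}
```

Under these assumptions a range of 32 candidates always takes five tests,
since each test halves the remaining range.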
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
2006-11-17 23:58 ` ACPI breakage (Re: 2.6.19-rc6: known regressions (v2)) Linus Torvalds
@ 2006-11-18 1:25 ` Linus Torvalds
0 siblings, 0 replies; 15+ messages in thread
From: Linus Torvalds @ 2006-11-18 1:25 UTC (permalink / raw)
To: Len Brown, Alexey Starikovskiy, Adrian Bunk, Andrew Morton
Cc: David Brownell, linux-acpi
On Fri, 17 Nov 2006, Linus Torvalds wrote:
>
> Total lockup - no sysrq, no messages, no nothing.
Dammit.
It looks like 37605a6900f6b4d886d995751fcfeef88c4e462c, and I should have
realized that immediately.
That commit re-introduces the bug that we already reverted once.
Why the hell did that idiotic thing go in, when we had to revert it once
already (see commit 72945b2b90a5554975b8f72673ab7139d232a121 for the
earlier revert).
It was broken then, it is broken now. Nothing has changed.
Why did you guys try to sneak it in again? Last time this same "use a
second workqueue" patch went in (in a different form), we had _exactly_
the same problems, with total lockups, and way too high CPU usage.
The bugzilla entry that you refer to in that commit is even the same one
that discussed why the _original_ patch was totally broken.
It's even the same AUTHOR who wrote the original buggy patch, that pushed
through the same buggy patch AGAIN.
Dammit, this is frustrating.
Why did people expect it to suddenly not be buggy?
Linus
^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
@ 2006-11-18 16:23 Starikovskiy, Alexey Y
2006-11-18 17:12 ` Linus Torvalds
0 siblings, 1 reply; 15+ messages in thread
From: Starikovskiy, Alexey Y @ 2006-11-18 16:23 UTC (permalink / raw)
To: Linus Torvalds, Brown, Len, Adrian Bunk, Andrew Morton
Cc: David Brownell, linux-acpi
Maybe because it does not have a single line in common with the previous
patch?
Or maybe because it fixes all the current AMD-HP notebooks?
Or maybe because it did not fail while it was in -mm?
I will not "sneak it in" again, I promise.
Regards,
Alex.
-----Original Message-----
From: Linus Torvalds [mailto:torvalds@osdl.org]
Sent: Saturday, November 18, 2006 4:25 AM
To: Brown, Len; Starikovskiy, Alexey Y; Adrian Bunk; Andrew Morton
Cc: David Brownell; linux-acpi@vger.kernel.org
Subject: Re: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
On Fri, 17 Nov 2006, Linus Torvalds wrote:
>
> Total lockup - no sysrq, no messages, no nothing.
Dammit.
It looks like 37605a6900f6b4d886d995751fcfeef88c4e462c, and I should
have
realized that immediately.
That commit re-introduces the bug that we already reverted once.
Why the hell did that idiotic thing go in, when we had to revert it once
already (see commit 72945b2b90a5554975b8f72673ab7139d232a121 for the
earlier revert).
It was broken then, it is broken now. Nothing has changed.
Why did you guys try to sneak it in again? Last time this same "use a
second workqueue" patch went in (in a different form), we had _exactly_
the same problems, with total lockups, and way too high CPU usage.
The bugzilla entry that you refer to in that commit is even the same one
that discussed why the _original_ patch was totally broken.
It's even the same AUTHOR who wrote the original buggy patch, that
pushed
through the same buggy patch AGAIN.
Dammit, this is frustrating.
Why did people expect it to suddenly not be buggy?
Linus
^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
2006-11-18 16:23 ACPI breakage (Re: 2.6.19-rc6: known regressions (v2)) Starikovskiy, Alexey Y
@ 2006-11-18 17:12 ` Linus Torvalds
2006-11-18 19:05 ` David Brownell
0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2006-11-18 17:12 UTC (permalink / raw)
To: Starikovskiy, Alexey Y
Cc: Brown, Len, Adrian Bunk, Andrew Morton, David Brownell,
linux-acpi
On Sat, 18 Nov 2006, Starikovskiy, Alexey Y wrote:
>
> Maybe because it does not have a single line in common with the previous
> patch?
Yeah, I do agree that it _looks_ very different as a patch, but it ends up
having all the same execution profiles..
It's been too long since I debugged the previous problem, so I don't
remember the exact details any more (back then I enabled ACPI debugging
and watched the messages scroll by etc - this time I initially thought it
was interrupt-related due to the other irq problems we've had, so I
started bisecting immediately _without_ doing any ACPI debugging stuff,
and by the time I actually bisected down enough, I recognized the problem,
so I didn't do all the same "enable ACPI messages and look deeply into
what is going on" thing).
But if I remember correctly, what happens is _roughly_ something like
this:
- thermal event happens - the CPU is getting warm, and the fan needs to
start up. Quite often, this happened early during boot (which is quite
busy - some init scripts are disgustingly CPU-intensive mainly due to
using inefficient scripting languages), but if it didn't happen there,
it's easy enough to force to happen other ways.
- part of the handling is "acpi_os_execute()" for something (don't ask me
what), but the interesting thing is how that "acpi_os_execute()" then
ends up causing a _recursive_ event.
- we handle the original event in kacpid, and hand over the new one as a
notification event. But the event keeps on happening, and kacpid keeps
on running, and the other thread doesn't actually ever _run_ because
kacpid holds the ACPI lock and is constantly busy.
- we not only are constantly running in kernel space, we also end up
eventually running out of memory for allocating all the work queue
entries.
So the reason the old code works is because everything is done in a single
thread, and yes, we end up getting multiple events, but because the queue
is all done onto the same queue that is _handling_ the events in the first
place, and because it's a FIFO queue, the notification events get handled
_before_ the later events.
So with the single-threaded situation, you basically end up always doing
the events in the same order they came in. In the "two separate threads"
case, you don't, and one thread will end up generating events forever,
waiting for them to happen, but they never _do_ happen, so you have a
lockup _and_ eventually an infinite event queue for the other thread.
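[ Annotation: the ordering argument above can be sketched as a small
user-space toy model. Nothing here is kernel code. The names (simulate(),
the policy values, FIFO_CAP) are hypothetical, and the model assumes that
running the notify handler is what makes the event source go quiet, and that
a yield always lets the second worker run one item. Both are simplifications
of what is described above, not confirmed behaviour of the ACPI code. ]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FIFO_CAP 16  /* small on purpose, so "out of memory" is reachable */

enum item { EVENT, NOTIFY };
enum policy { ONE_QUEUE, TWO_QUEUES_NO_YIELD, TWO_QUEUES_YIELD };

struct fifo {
    enum item slot[FIFO_CAP];
    size_t head, len;
};

struct result {
    bool settled;        /* condition cleared and all queues drained */
    bool out_of_memory;  /* a queue entry could not be allocated */
    size_t backlog;      /* notify items still waiting at the end */
};

static bool fifo_push(struct fifo *q, enum item it)
{
    if (q->len == FIFO_CAP)
        return false;
    q->slot[(q->head + q->len) % FIFO_CAP] = it;
    q->len++;
    return true;
}

static enum item fifo_pop(struct fifo *q)
{
    enum item it;

    assert(q->len > 0);
    it = q->slot[q->head];
    q->head = (q->head + 1) % FIFO_CAP;
    q->len--;
    return it;
}

struct result simulate(enum policy p, size_t max_steps)
{
    struct fifo main_q = {0}, notify_q = {0};
    struct fifo *nq = (p == ONE_QUEUE) ? &main_q : &notify_q;
    struct result r = { false, false, 0 };
    bool hot = true;  /* hardware keeps raising the event while this is set */
    size_t step;

    fifo_push(&main_q, EVENT);
    for (step = 0; step < max_steps && (main_q.len || notify_q.len); step++) {
        /* The second worker runs only if the main one yields or is idle. */
        if (notify_q.len && (p == TWO_QUEUES_YIELD || main_q.len == 0)) {
            fifo_pop(&notify_q);
            hot = false;
            continue;
        }
        if (fifo_pop(&main_q) == NOTIFY) {
            hot = false;
            continue;
        }
        if (!hot)
            continue;  /* stale event, nothing left to do */
        /* Handling the event queues a notify; the event then fires again. */
        if (!fifo_push(nq, NOTIFY) || !fifo_push(&main_q, EVENT)) {
            r.out_of_memory = true;
            break;
        }
    }
    r.backlog = notify_q.len;
    r.settled = !hot && !main_q.len && !notify_q.len;
    return r;
}
```

In this model ONE_QUEUE and TWO_QUEUES_YIELD both settle after a handful of
steps, while TWO_QUEUES_NO_YIELD never runs the notify worker and stops only
when the notify queue hits FIFO_CAP, which stands in for running out of
memory for work queue entries.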
> Or maybe because it fixes all the current AMD-HP notebooks?
> Or maybe because it did not fail while it was in -mm?
I'm afraid that -mm doesn't get as much testing as it used to get.
Also, I do realize that the patch fixes other problems, but we have long
had a very strict policy that we do NOT accept regressions. Immediately
when you start accepting regressions, you will never know whether you're
going forward or backwards. It's better to have a known _old_ bug than to
introduce a new one.
So the "no regressions!" rule ends up trumping pretty much every single
other issue. It's unacceptable to have machines that used to work,
suddenly stop working. Even if it fixes another machine.
ACPI didn't use to have that rule, and it was wild and crazy. Maybe more
bugs got fixed, but the problem with accepting regressions is that nobody
can _ever_ trust that system. You do not want to have people _afraid_ of
upgrading - they should feel confident that upgrading never introduces any
new problems.
(Of course, that can never be reached 100%, but it's very much part of the
goal. It kind of falls into the same "backwards compatibility on
interfaces" absolute goal: it's ok to do new things, but you can never
allow them to break old programs)
> I will not "sneak it in" again, I promise.
Feel free to send me test patches when working on these things, because I
have no trouble at all to test my particular machine.
I think you'll find the ACPI dumps etc for that machine in your archives,
because I've sent them to Len and the acpi lists several times, but if you
want to get AML disassemblies etc, just tell me how. I've done them
before, but I work on this seldom enough that I always forget what the
magic incantations are, and where to get the tools etc.
Linus
^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
@ 2006-11-18 19:01 Starikovskiy, Alexey Y
2006-11-18 19:05 ` Linus Torvalds
0 siblings, 1 reply; 15+ messages in thread
From: Starikovskiy, Alexey Y @ 2006-11-18 19:01 UTC (permalink / raw)
To: Linus Torvalds
Cc: Brown, Len, Adrian Bunk, Andrew Morton, David Brownell,
linux-acpi
>Feel free to send me test patches when working on these
>things, because I
>have no trouble at all to test my particular machine.
I sent you a test patch back in July, but did not get a reply. Maybe
due to OLS?
Thanks,
Alex.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
2006-11-18 17:12 ` Linus Torvalds
@ 2006-11-18 19:05 ` David Brownell
2006-11-18 22:09 ` Linus Torvalds
2006-11-19 4:33 ` David Brownell
0 siblings, 2 replies; 15+ messages in thread
From: David Brownell @ 2006-11-18 19:05 UTC (permalink / raw)
To: Alexey Starikovskiy, Linus Torvalds
Cc: Adrian Bunk, Andrew Morton, Brown, Len, linux-acpi
> On Sat, 18 Nov 2006, Starikovskiy, Alexey Y wrote:
>
> > Or maybe because it fixes all the current AMD-HP notebooks?
Whatever "it" is sure broke mine though... the one that's
currently on my lap! :)
Running right now with a patch reverting the update which
made trouble on Linus' machine, but without Alexey's two
tweaks to the EC interrupt handler. So far so good, even
after doing things which had previously caused AE_TIME
errors pretty quickly. But then, the errors weren't what
I'd call reproducible either.
Linus' explanation of what went wrong looks compatible with
the symptoms I've seen, FWIW.
- Dave
^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
2006-11-18 19:01 Starikovskiy, Alexey Y
@ 2006-11-18 19:05 ` Linus Torvalds
[not found] ` <455FB44C.8050103@linux.intel.com>
0 siblings, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2006-11-18 19:05 UTC (permalink / raw)
To: Starikovskiy, Alexey Y
Cc: Brown, Len, Adrian Bunk, Andrew Morton, David Brownell,
linux-acpi
On Sat, 18 Nov 2006, Starikovskiy, Alexey Y wrote:
>
> I sent you a test patch back in July, but did not get a reply. Maybe
> due to OLS?
Heh. Whenever you send me something like that, and I don't answer within a
few days, you can pretty much depend on me not answering - my mailqueue
just fills up too fast. And yeah, it might have been during OLS. Just
re-send when it happens.
Linus
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
2006-11-18 19:05 ` David Brownell
@ 2006-11-18 22:09 ` Linus Torvalds
2006-11-18 22:16 ` Adrian Bunk
2006-11-19 4:33 ` David Brownell
1 sibling, 1 reply; 15+ messages in thread
From: Linus Torvalds @ 2006-11-18 22:09 UTC (permalink / raw)
To: David Brownell
Cc: Alexey Starikovskiy, Adrian Bunk, Andrew Morton, Brown, Len,
linux-acpi
On Sat, 18 Nov 2006, David Brownell wrote:
>
> Running right now with a patch reverting the update which
> made trouble on Linus' machine, but without Alexey's two
> tweaks to the EC interrupt handler. So far so good, even
> after doing things which had previously caused AE_TIME
> errors pretty quickly. But then, the errors weren't what
> I'd call reproducible either.
Ok, goodie.
Adrian, that means that there's one less regression on your list, unless
David reports that he can reproduce it again (I don't think he will be
able to: all the other ACPI changes looked relatively harmless, at least
in the particular area of ACPI changes I looked at)
Linus
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
2006-11-18 22:09 ` Linus Torvalds
@ 2006-11-18 22:16 ` Adrian Bunk
0 siblings, 0 replies; 15+ messages in thread
From: Adrian Bunk @ 2006-11-18 22:16 UTC (permalink / raw)
To: Linus Torvalds
Cc: David Brownell, Alexey Starikovskiy, Andrew Morton, Brown, Len,
linux-acpi
On Sat, Nov 18, 2006 at 02:09:56PM -0800, Linus Torvalds wrote:
>
>
> On Sat, 18 Nov 2006, David Brownell wrote:
> >
> > Running right now with a patch reverting the update which
> > made trouble on Linus' machine, but without Alexey's two
> > tweaks to the EC interrupt handler. So far so good, even
> > after doing things which had previously caused AE_TIME
> > errors pretty quickly. But then, the errors weren't what
> > I'd call reproducible either.
>
> Ok, goodie.
>
> Adrian, that means that there's one less regression on your list, unless
> David reports that he can reproduce it again (I don't think he will be
> able to: all the other ACPI changes looked relatively harmless, at least
> in the particular area of ACPI changes I looked at)
I had already removed it from my list based on David's email.
> Linus
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
2006-11-18 19:05 ` David Brownell
2006-11-18 22:09 ` Linus Torvalds
@ 2006-11-19 4:33 ` David Brownell
2006-11-20 18:46 ` David Brownell
1 sibling, 1 reply; 15+ messages in thread
From: David Brownell @ 2006-11-19 4:33 UTC (permalink / raw)
To: Alexey Starikovskiy
Cc: Linus Torvalds, Adrian Bunk, Andrew Morton, Brown, Len,
linux-acpi
On Saturday 18 November 2006 11:05 am, David Brownell wrote:
>
> Running right now with a patch reverting the update which
> made trouble on Linus' machine, but without Alexey's two
> tweaks to the EC interrupt handler. So far so good, even
> after doing things which had previously caused AE_TIME
> errors pretty quickly. But then, the errors weren't what
> I'd call reproducible either.
Hmm, well after a reboot to sort out some other patches,
and at uptime of ~2 hours, I noticed confusion about
whether AC or battery power was active, then the old:
ACPI Exception (evregion-0424): AE_TIME, Returned by Handler for [EmbeddedControl] [20060707]
ACPI Exception (dswexec-0458): AE_TIME, While resolving operands for [OpcodeName unavailable] [20060707]
ACPI Error (psparse-0537): Method parse/execution failed [\_TZ_.THRM._TMP] (Node ffff810002032d10), AE_TIME
So maybe that's not the entire story; sigh.
- Dave
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
[not found] ` <Pine.LNX.4.64.0611201003540.3692@woody.osdl.org>
@ 2006-11-20 18:27 ` Linus Torvalds
2006-11-20 19:31 ` Alexey Starikovskiy
2006-11-21 3:10 ` Sanjoy Mahajan
2006-11-20 22:13 ` Alexey Starikovskiy
1 sibling, 2 replies; 15+ messages in thread
From: Linus Torvalds @ 2006-11-20 18:27 UTC (permalink / raw)
To: Alexey Starikovskiy; +Cc: Brown, Len, linux-acpi
[ Digression from testing Alexey's patch that makes the Evo work again
with two separate threads ]
On Mon, 20 Nov 2006, Linus Torvalds wrote:
>
> Ok, this one works for me too, and looks much simpler.
Hmm. Some more testing shows that fan behaviour after a suspend-to-ram
event seems broken, but I suspect the breakage isn't new.
It seems that ACPI remembers fan state from before the suspend, and then
(incorrectly) uses that to decide whether it should turn fans on or off.
So for example, it seems to remember that the fan was already on, so it
won't ever turn it on again - even though the suspend will obviously have
turned off all fans too.
So after running for a while, I get (for example):
cat /proc/acpi/thermal_zone/TZ1/*
..
state: active[0]
temperature: 92 C
critical (S5): 99 C
passive: 95 C: tc1=1 tc2=2 tsp=100 devices=0xf7e42338
active[0]: 80 C: devices=0xc18d78ec
active[1]: 70 C: devices=0xc18d7888
active[2]: 60 C: devices=0xc18d7838
active[3]: 45 C: devices=0xc18d77e8
(it thinks all fans are on), but no fans are actually on:
cat /proc/acpi/fan/*/*
status: off
status: off
status: off
status: off
Of course, I'm not exactly having a lot of trust in ACPI in general, so
for all I know this is just more unfixable crap from the firmware. But it
smells like "I remember state from before the suspend".
Linus
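[ Annotation: the suspected "remembers state from before the suspend" failure
can be sketched as a toy model of a cached fan state. This is not the ACPI
fan or thermal driver; struct toy_fan and the toy_* functions are
hypothetical, and the resume hook shown is only one possible way to
invalidate the cache. ]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum cached_state { CACHED_OFF, CACHED_ON, CACHED_UNKNOWN };

struct toy_fan {
    bool hw_on;               /* what the hardware is really doing */
    enum cached_state cache;  /* what the software believes */
};

/* Only touch the hardware when the cached state says something must change. */
void toy_fan_set(struct toy_fan *f, bool on)
{
    enum cached_state want = on ? CACHED_ON : CACHED_OFF;

    assert(f != NULL);
    if (f->cache == want)
        return;  /* "already on", so nothing is written to the hardware */
    f->hw_on = on;
    f->cache = want;
}

/* Suspend powers the fan down behind the software's back. */
void toy_suspend(struct toy_fan *f)
{
    assert(f != NULL);
    f->hw_on = false;  /* the cache is deliberately left untouched */
}

/* A resume hook that forgets the cached state, forcing the next set through. */
void toy_resume_invalidate(struct toy_fan *f)
{
    assert(f != NULL);
    f->cache = CACHED_UNKNOWN;
}
```

Without the resume hook, a request to turn the fan on after toy_suspend() is
skipped because the cache still says "on", which matches the symptom above:
the thermal zone reports an active state while the fan reports "off".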
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
2006-11-19 4:33 ` David Brownell
@ 2006-11-20 18:46 ` David Brownell
0 siblings, 0 replies; 15+ messages in thread
From: David Brownell @ 2006-11-20 18:46 UTC (permalink / raw)
To: Alexey Starikovskiy
Cc: Linus Torvalds, Adrian Bunk, Andrew Morton, Brown, Len,
linux-acpi
On Saturday 18 November 2006 8:33 pm, David Brownell wrote:
> On Saturday 18 November 2006 11:05 am, David Brownell wrote:
> >
> > Running right now with a patch reverting the update which
> > made trouble on Linus' machine, but without Alexey's two
> > tweaks to the EC interrupt handler. So far so good, even
> > after doing things which had previously caused AE_TIME
> > errors pretty quickly. But then, the errors weren't what
> > I'd call reproducible either.
>
> Hmm, well after a reboot to sort out some other patches,
> and at uptime of ~2 hours, I noticed confusion about
> whether AC or battery power was active, then the old:
>
> ACPI Exception (evregion-0424): AE_TIME, Returned by Handler for [EmbeddedControl] [20060707]
> ACPI Exception (dswexec-0458): AE_TIME, While resolving operands for [OpcodeName unavailable] [20060707]
> ACPI Error (psparse-0537): Method parse/execution failed [\_TZ_.THRM._TMP] (Node ffff810002032d10), AE_TIME
>
> So maybe that's not the entire story; sigh.
Whatever it is, it hasn't shown its ugly little face since then.
So while it doesn't seem completely fixed ... it's nowhere near
as broken as it was previously.
- Dave
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
2006-11-20 18:27 ` Linus Torvalds
@ 2006-11-20 19:31 ` Alexey Starikovskiy
2006-11-21 3:10 ` Sanjoy Mahajan
1 sibling, 0 replies; 15+ messages in thread
From: Alexey Starikovskiy @ 2006-11-20 19:31 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Brown, Len, linux-acpi
Linus Torvalds wrote:
> [ Digression from testing Alexey's patch that makes the Evo work again
> with two separate threads ]
>
> On Mon, 20 Nov 2006, Linus Torvalds wrote:
>
>> Ok, this one works for me too, and looks much simpler.
>>
>
> Hmm. Some more testing shows that fan behaviour after a suspend-to-ram
> event seems broken, but I suspect the breakage isn't new.
>
> It seems that ACPI remembers fan state from before the suspend, and then
> (incorrectly) uses that to decide whether it should turn fans on or off.
> So for example, it seems to remember that the fan was already on, so it
> won't ever turn it on again - even though the suspend will obviously have
> turned off all fans too.
>
>
We have patches in #7122 for a similar issue in suspend-to-disk; maybe they
fix suspend-to-ram too?
It's related to the order of ACPI device resume and _WAK method execution.
Thanks,
Alex.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
[not found] ` <Pine.LNX.4.64.0611201003540.3692@woody.osdl.org>
2006-11-20 18:27 ` Linus Torvalds
@ 2006-11-20 22:13 ` Alexey Starikovskiy
1 sibling, 0 replies; 15+ messages in thread
From: Alexey Starikovskiy @ 2006-11-20 22:13 UTC (permalink / raw)
To: Linus Torvalds, linux-acpi, David Brownell
[-- Attachment #1: Type: text/plain, Size: 1236 bytes --]
Linus Torvalds wrote:
> On Sun, 19 Nov 2006, Alexey Starikovskiy wrote:
>
>> I agree to all your comments with one exception, please see below. Attached is
>> the reworked patch against latest git. Please test.
>>
>
> Ok, this one works for me too, and looks much simpler.
>
>
>> Linus Torvalds wrote:
>>
>>> And we might as well do it when we add an entry to the _deferred_ queue, no?
>>>
>>
>> acpi_os_execute() is called from interrupt context for insertion into
>> _deferred_ queue, so it's not possible to yield in it, no?
>>
>
> Hmm. Yes. Anyway, the new patch looks acceptable, and certainly much
> simpler than trying to count events.
>
> It probably causes tons of new unnecessary scheduling events, but I doubt
> we really care.
>
> That said, what we _really_ want here is a "priority queue" for the
> events, and some way to put an event back on the queue while running it
> (eg ACPI "Sleep" event). But I guess the ACPI interpreter isn't done that
> way (ie you can't just push and pop ACPI state).
>
> Linus
>
Linus, thanks for diagnosing and testing. Yes, the interpreter is not
currently able to put its stack aside.
David, could you try this patch too?
Regards,
Alex.
[-- Attachment #2: yield_on_deferred_events.patch --]
[-- Type: text/plain, Size: 3673 bytes --]
ACPI: created a dedicated workqueue for notify() execution
From: Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com>
Needed to handle while loop in GPE handler of HP notebooks.
http://bugzilla.kernel.org/show_bug.cgi?id=5534
Yield processor before execution of deferred event queue. Needed to avoid
flooding of Compaq n620c with events.
---
drivers/acpi/osl.c | 51 +++++++++++++++++++++++++++++++++++----------------
1 files changed, 35 insertions(+), 16 deletions(-)
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 068fe4f..169ca04 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -34,6 +34,7 @@ #include <linux/smp_lock.h>
#include <linux/interrupt.h>
#include <linux/kmod.h>
#include <linux/delay.h>
+#include <linux/syscalls.h>
#include <linux/workqueue.h>
#include <linux/nmi.h>
#include <acpi/acpi.h>
@@ -73,6 +74,7 @@ static unsigned int acpi_irq_irq;
static acpi_osd_handler acpi_irq_handler;
static void *acpi_irq_context;
static struct workqueue_struct *kacpid_wq;
+static struct workqueue_struct *kacpi_notify_wq;
acpi_status acpi_os_initialize(void)
{
@@ -91,8 +93,9 @@ acpi_status acpi_os_initialize1(void)
return AE_NULL_ENTRY;
}
kacpid_wq = create_singlethread_workqueue("kacpid");
+ kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify");
BUG_ON(!kacpid_wq);
-
+ BUG_ON(!kacpi_notify_wq);
return AE_OK;
}
@@ -104,6 +107,7 @@ acpi_status acpi_os_terminate(void)
}
destroy_workqueue(kacpid_wq);
+ destroy_workqueue(kacpi_notify_wq);
return AE_OK;
}
@@ -566,10 +570,23 @@ void acpi_os_derive_pci_id(acpi_handle r
static void acpi_os_execute_deferred(void *context)
{
- struct acpi_os_dpc *dpc = NULL;
+ struct acpi_os_dpc *dpc = (struct acpi_os_dpc *)context;
+ if (!dpc) {
+ printk(KERN_ERR PREFIX "Invalid (NULL) context\n");
+ return;
+ }
+
+ sys_sched_yield();
+ dpc->function(dpc->context);
+
+ kfree(dpc);
+ return;
+}
- dpc = (struct acpi_os_dpc *)context;
+static void acpi_os_execute_notify(void *context)
+{
+ struct acpi_os_dpc *dpc = (struct acpi_os_dpc *)context;
if (!dpc) {
printk(KERN_ERR PREFIX "Invalid (NULL) context\n");
return;
@@ -604,14 +621,12 @@ acpi_status acpi_os_execute(acpi_execute
struct acpi_os_dpc *dpc;
struct work_struct *task;
- ACPI_FUNCTION_TRACE("os_queue_for_execution");
-
ACPI_DEBUG_PRINT((ACPI_DB_EXEC,
"Scheduling function [%p(%p)] for deferred execution.\n",
function, context));
if (!function)
- return_ACPI_STATUS(AE_BAD_PARAMETER);
+ return AE_BAD_PARAMETER;
/*
* Allocate/initialize DPC structure. Note that this memory will be
@@ -624,9 +639,8 @@ acpi_status acpi_os_execute(acpi_execute
* from the same memory.
*/
- dpc =
- kmalloc(sizeof(struct acpi_os_dpc) + sizeof(struct work_struct),
- GFP_ATOMIC);
+ dpc = kzalloc(sizeof(struct acpi_os_dpc) +
+ sizeof(struct work_struct), GFP_ATOMIC);
if (!dpc)
return_ACPI_STATUS(AE_NO_MEMORY);
@@ -634,13 +648,18 @@ acpi_status acpi_os_execute(acpi_execute
dpc->context = context;
task = (void *)(dpc + 1);
- INIT_WORK(task, acpi_os_execute_deferred, (void *)dpc);
-
- if (!queue_work(kacpid_wq, task)) {
- ACPI_DEBUG_PRINT((ACPI_DB_ERROR,
- "Call to queue_work() failed.\n"));
- kfree(dpc);
- status = AE_ERROR;
+ if (type == OSL_NOTIFY_HANDLER) {
+ INIT_WORK(task, acpi_os_execute_notify, (void *)dpc);
+ if (!queue_work(kacpi_notify_wq, task)) {
+ status = AE_ERROR;
+ kfree(dpc);
+ }
+ } else {
+ INIT_WORK(task, acpi_os_execute_deferred, (void *)dpc);
+ if (!queue_work(kacpid_wq, task)) {
+ status = AE_ERROR;
+ kfree(dpc);
+ }
}
return_ACPI_STATUS(status);
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: ACPI breakage (Re: 2.6.19-rc6: known regressions (v2))
2006-11-20 18:27 ` Linus Torvalds
2006-11-20 19:31 ` Alexey Starikovskiy
@ 2006-11-21 3:10 ` Sanjoy Mahajan
1 sibling, 0 replies; 15+ messages in thread
From: Sanjoy Mahajan @ 2006-11-21 3:10 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Alexey Starikovskiy, Brown, Len, linux-acpi
Linus Torvalds wrote:
(it thinks all fans are on), but no fans are actually on:
cat /proc/acpi/fan/*/*
status: off
status: off
status: off
status: off
I saw a related problem with suspend to disk; a second suspend to RAM
would hang while suspending -- bugzilla 5989, 6749 -- so I couldn't
test its effect on the fans.
After resuming, the fans would be off but the system thought they were
on, and the fan/*/* files would confirm that wrong idea. Because the
fans were allegedly on, they would never get turned on, even as the
temperature climbed into the sky. I reported this problem to the acpi
list and bugzilla, and the fan driver was fixed to have suspend/resume
methods, and that plus other patches eventually fixed it for my box
(IBM TP 600X). Though with some of the intermediate patches, I saw
the behavior you are seeing (with 'off' status in fan/*/*, but showing
'on' in thermal_zone/*/* or in 'acpi -t').
Also there was some question there on whether exactly the right
version of the patch got merged.
See the discussions in
<http://bugzilla.kernel.org/show_bug.cgi?id=5000>.
Unfortunately I haven't kept testing it because my 600X's screen died,
and its replacement (T60) doesn't export any fan control or trip
points to ACPI (it's all done at a lower level, alas, so the damn fan
is on way too much).
-Sanjoy
`Never underestimate the evil of which men of power are capable.'
--Bertrand Russell, _War Crimes in Vietnam_, chapter 1.
^ permalink raw reply [flat|nested] 15+ messages in thread