* RE: kacpi_notify?
@ 2006-07-13 2:46 Len Brown
0 siblings, 0 replies; 18+ messages in thread
From: Len Brown @ 2006-07-13 2:46 UTC (permalink / raw)
To: torvalds; +Cc: akpm, linux-acpi, robert.moore
>Here's a suggested revert.
Please try this smaller revert to just osl.c.
(it builds and boots for me)
It reverts acpi_os_queue_for_execution() to exactly
as it was in 2.6.17, except it changes the name to
acpi_os_execute() to match ACPICA 20060512.
(yes, it is okay we ignore the 1st parameter,
it wasn't used until the 5534 fix we are reverting)
thanks,
-Len
Signed-off-by: Len Brown <len.brown@intel.com>
diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
diff --git a/drivers/acpi/events/evgpe.c b/drivers/acpi/events/evgpe.c
diff --git a/drivers/acpi/events/evmisc.c b/drivers/acpi/events/evmisc.c
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 47dfde9..b7d1514 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -36,7 +36,6 @@ #include <linux/kmod.h>
#include <linux/delay.h>
#include <linux/workqueue.h>
#include <linux/nmi.h>
-#include <linux/kthread.h>
#include <acpi/acpi.h>
#include <asm/io.h>
#include <acpi/acpi_bus.h>
@@ -583,16 +582,6 @@ static void acpi_os_execute_deferred(voi
return;
}
-static int acpi_os_execute_thread(void *context)
-{
- struct acpi_os_dpc *dpc = (struct acpi_os_dpc *)context;
- if (dpc) {
- dpc->function(dpc->context);
- kfree(dpc);
- }
- do_exit(0);
-}
-
/*******************************************************************************
*
* FUNCTION: acpi_os_execute
@@ -614,10 +603,16 @@ acpi_status acpi_os_execute(acpi_execute
acpi_status status = AE_OK;
struct acpi_os_dpc *dpc;
struct work_struct *task;
- struct task_struct *p;
+
+ ACPI_FUNCTION_TRACE("os_queue_for_execution");
+
+ ACPI_DEBUG_PRINT((ACPI_DB_EXEC,
+ "Scheduling function [%p(%p)] for deferred execution.\n",
+ function, context));
if (!function)
- return AE_BAD_PARAMETER;
+ return_ACPI_STATUS(AE_BAD_PARAMETER);
+
/*
* Allocate/initialize DPC structure. Note that this memory will be
* freed by the callee. The kernel handles the tq_struct list in a
@@ -628,34 +623,27 @@ acpi_status acpi_os_execute(acpi_execute
* We can save time and code by allocating the DPC and tq_structs
* from the same memory.
*/
- if (type == OSL_NOTIFY_HANDLER) {
- dpc = kmalloc(sizeof(struct acpi_os_dpc), GFP_KERNEL);
- } else {
- dpc = kmalloc(sizeof(struct acpi_os_dpc) +
- sizeof(struct work_struct), GFP_ATOMIC);
- }
+
+ dpc =
+ kmalloc(sizeof(struct acpi_os_dpc) + sizeof(struct work_struct),
+ GFP_ATOMIC);
if (!dpc)
- return AE_NO_MEMORY;
+ return_ACPI_STATUS(AE_NO_MEMORY);
+
dpc->function = function;
dpc->context = context;
- if (type == OSL_NOTIFY_HANDLER) {
- p = kthread_create(acpi_os_execute_thread, dpc, "kacpid_notify");
- if (!IS_ERR(p)) {
- wake_up_process(p);
- } else {
- status = AE_NO_MEMORY;
- kfree(dpc);
- }
- } else {
- task = (void *)(dpc + 1);
- INIT_WORK(task, acpi_os_execute_deferred, (void *)dpc);
- if (!queue_work(kacpid_wq, task)) {
- status = AE_ERROR;
- kfree(dpc);
- }
+ task = (void *)(dpc + 1);
+ INIT_WORK(task, acpi_os_execute_deferred, (void *)dpc);
+
+ if (!queue_work(kacpid_wq, task)) {
+ ACPI_DEBUG_PRINT((ACPI_DB_ERROR,
+ "Call to queue_work() failed.\n"));
+ kfree(dpc);
+ status = AE_ERROR;
}
- return status;
+
+ return_ACPI_STATUS(status);
}
EXPORT_SYMBOL(acpi_os_execute);
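
The comment in the patch about allocating the DPC and the work item from the same memory can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: `struct dpc`, `struct work_item`, `prepare_work()` and `execute_deferred()` are hypothetical stand-ins, and `malloc()`/`free()` take the place of `kmalloc()`/`kfree()`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; these are not the kernel types. */
typedef void (*exec_callback)(void *context);

struct dpc {
	exec_callback function;
	void *context;
};

struct work_item {
	void (*run)(void *data);
	void *data;
};

/* Runs the deferred call, then frees the one block holding both structs. */
static void execute_deferred(void *data)
{
	struct dpc *dpc = data;

	dpc->function(dpc->context);
	free(dpc);
}

/*
 * One allocation holds the dpc followed immediately by its work item,
 * so a single free() of the dpc pointer releases both.
 */
static struct work_item *prepare_work(exec_callback function, void *context)
{
	struct dpc *dpc;
	struct work_item *task;

	if (!function)
		return NULL;

	dpc = malloc(sizeof(*dpc) + sizeof(*task));
	if (!dpc)
		return NULL;

	dpc->function = function;
	dpc->context = context;

	task = (void *)(dpc + 1);	/* first byte past the dpc */
	task->run = execute_deferred;
	task->data = dpc;
	return task;
}

/* Example callback: increments the int that context points to. */
static void bump(void *context)
{
	++*(int *)context;
}
```

A caller would do `task = prepare_work(bump, &counter)` and later `task->run(task->data)`; after that call the block is gone and `task` must not be touched. The pointer arithmetic is only safe here because both structs have pointer alignment.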
^ permalink raw reply related [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
@ 2006-07-13 15:32 Starikovskiy, Alexey Y
0 siblings, 0 replies; 18+ messages in thread
From: Starikovskiy, Alexey Y @ 2006-07-13 15:32 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Brown, Len, Andrew Morton, linux-acpi, Moore, Robert
>I don't think the _idea_ is crap per se, but it would at a minimum need a
>thread limit. But I think it's the wrong approach: especially if you put
>the current thread to sleep, you really don't want another thread at all,
>you are really just working around a problem that is totally internal to
>acpi (and the AML interpreter in particular).
Agreed, this was meant as a temporary fix while we make the
interpreter preemptible, as you suggest later.
>
>So I think the problem really lies elsewhere, and that the whole thread
>approach was trying to paper over it. And having a limited set of threads
>is probably potentially _worse_ than what we have now.
Right now we have overheating NX and NC series notebooks
from HP with AMD processors (somehow Intel DSDTs do not
use Notify from an infinite While loop), plus several "kacpid uses
100% of cpu" reports caused by Notify() events, which were scheduled to
rush through the queue once the blocker exits.
>
>Is there no way to have the AML interpreter have some state, and just push
>that current interrupted state back onto the "event queue", and just start
>executing the new one instead?
This interpreter cannot do such tricks, and if we choose to
execute Notify() immediately on the same stack we risk overflowing
it...
> That sounds like it should fix the _real_
>problem - a kind of "mini-scheduler" for ACPI events?
Yes, that would be a perfect solution, but it will require a whole
interpreter rewrite, so it will take time.
>
> Linus
>
Thanks,
Alex.
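
The "mini-scheduler" idea discussed above, where an interrupted method's state is pushed back onto the event queue and the next event runs instead, might look something like the following user-space sketch. It is not how ACPICA works; `struct event`, `event_step()` and `run_events()` are hypothetical names, and `steps_left` is a stand-in for whatever interpreter state would have to be saved off the C stack.

```c
#include <assert.h>
#include <stddef.h>

enum step_result { STEP_DONE, STEP_YIELD };

/* Hypothetical event: all resumable state lives here, not on the C stack. */
struct event {
	int id;
	int steps_left;		/* stand-in for saved interpreter state */
};

#define QUEUE_CAP 8

struct event_queue {
	struct event *slot[QUEUE_CAP];
	size_t head, count;
};

/* Append an event to the back of the ring; -1 if the ring is full. */
static int queue_push(struct event_queue *q, struct event *ev)
{
	if (q->count == QUEUE_CAP)
		return -1;
	q->slot[(q->head + q->count) % QUEUE_CAP] = ev;
	q->count++;
	return 0;
}

/* Take the event at the front of the ring; NULL if the ring is empty. */
static struct event *queue_pop(struct event_queue *q)
{
	struct event *ev;

	if (q->count == 0)
		return NULL;
	ev = q->slot[q->head];
	q->head = (q->head + 1) % QUEUE_CAP;
	q->count--;
	return ev;
}

/* Run one slice of an event; it yields wherever it would otherwise sleep. */
static enum step_result event_step(struct event *ev)
{
	if (ev->steps_left > 0)
		ev->steps_left--;
	return ev->steps_left == 0 ? STEP_DONE : STEP_YIELD;
}

/*
 * Drain the queue on the calling thread. An event that yields is pushed
 * to the back, so newer events get a turn. Completion order is written
 * to 'order'; returns the number of events completed.
 */
static size_t run_events(struct event_queue *q, int *order, size_t max)
{
	size_t done = 0;
	struct event *ev;

	while ((ev = queue_pop(q)) != NULL) {
		if (event_step(ev) == STEP_YIELD) {
			queue_push(q, ev);	/* cannot fail: a slot was just freed */
			continue;
		}
		if (done < max)
			order[done] = ev->id;
		done++;
	}
	return done;
}
```

With a long event (3 steps) queued ahead of a short one (1 step), the short one completes first, and no extra thread or nested stack frame is needed. The hard part, as noted above, is that the interpreter would have to keep its state in something like `struct event` rather than on the stack.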
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
@ 2006-07-13 11:07 Starikovskiy, Alexey Y
2006-07-13 15:00 ` kacpi_notify? Linus Torvalds
0 siblings, 1 reply; 18+ messages in thread
From: Starikovskiy, Alexey Y @ 2006-07-13 11:07 UTC (permalink / raw)
To: Linus Torvalds, Brown, Len; +Cc: Andrew Morton, linux-acpi, Moore, Robert
Linus,
I'm terribly sorry that my patch broke on your machine.
May I ask you to send me, or attach to #5534, the output of acpidump from
this machine?
Do you think the whole idea is crap, or would it be usable if I limit the
number of spawned threads and forcibly put the current thread to sleep
(which releases the ACPICA executer mutex), as happens in the DSDT of the
nx6125?
Thanks in advance, and once again my apologies.
Alex.
>-----Original Message-----
>From: Linus Torvalds [mailto:torvalds@osdl.org]
>Sent: Thursday, July 13, 2006 4:15 AM
>To: Brown, Len
>Cc: Andrew Morton; linux-acpi@vger.kernel.org; Moore, Robert;
>Starikovskiy, Alexey Y
>Subject: RE: kacpi_notify?
>
>
>
>On Wed, 12 Jul 2006, Brown, Len wrote:
>>
>> Ugh, ACPICA starts depending on acpi_os_execute() at that time,
>> so a simple revert will break the build -- needs to be tweaked.
>
>Yeah, I just noticed.
>
>Those "ACPICA" commits are crap. Crap, crap, crap. They do more than one
>thing, and are clearly just patches applied from a system that probably
>does have finer-grained granularity of tracking the changes.
>
>Here's a suggested revert. This boots, and seems to have the problem
>fixed. At least I can now recompile the kernel on that machine again,
>without the machine crashing.
>
> Linus
>
>----
>Revert "ACPI: execute Notify() handlers on new thread"
>
>This reverts commit b8d35192c55fb055792ff0641408eaaec7c88988 (and parts
>of commit 958dd242b691f64ab4632b4903dbb1e16fee8269: "ACPI: ACPICA
>20060512", which were intertwined, along with some Smart Battery logic
>that had already started using the broken routines).
>
>The thread execution doesn't actually solve the bug it set out to solve
>(see
>
> http://bugzilla.kernel.org/show_bug.cgi?id=5534
>
>for more details) because the new events can get caught behind the AML
>semaphore or other serialization. And when that happens, the notify
>threads keep on piling up until the system dies.
>
>Signed-off-by: Linus Torvalds <torvalds@osdl.org>
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
2006-07-13 11:07 kacpi_notify? Starikovskiy, Alexey Y
@ 2006-07-13 15:00 ` Linus Torvalds
0 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2006-07-13 15:00 UTC (permalink / raw)
To: Starikovskiy, Alexey Y
Cc: Brown, Len, Andrew Morton, linux-acpi, Moore, Robert
On Thu, 13 Jul 2006, Starikovskiy, Alexey Y wrote:
>
> I'm terribly sorry that my patch broke on your machine.
> May I ask you to send me or attach to #5534 output of acpidump from this
> machine?
I'll send it in another email, since I already generated it for Len ;)
> Do you think that the whole idea is crap, or if I limit number of
> possible spawned threads and forcibly put current thread to sleep (which
> will release ACPICA executer mutex), as it happens in DSDT of nx6125 it
> will be possible to use it?
I don't think the _idea_ is crap per se, but it would at a minimum need a
thread limit. But I think it's the wrong approach: especially if you put
the current thread to sleep, you really don't want another thread at all,
you are really just working around a problem that is totally internal to
acpi (and the AML interpreter in particular).
So I think the problem really lies elsewhere, and that the whole thread
approach was trying to paper over it. And having a limited set of threads
is probably potentially _worse_ than what we have now.
Is there no way to have the AML interpreter have some state, and just push
that current interrupted state back onto the "event queue", and just start
executing the new one instead? That sounds like it should fix the _real_
problem - a kind of "mini-scheduler" for ACPI events?
Linus
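
For reference, the "thread limit" mentioned above could be as small as a bounded counter checked before spawning a handler thread. The sketch below uses C11 atomics in user space; `MAX_NOTIFY_THREADS`, `notify_slot_get()` and `notify_slot_put()` are hypothetical names, and the cap of 4 is an arbitrary assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_NOTIFY_THREADS 4	/* hypothetical cap, not from the thread */

static atomic_int notify_threads;

/* Reserve a slot for a new handler thread; false means the cap is reached. */
static bool notify_slot_get(void)
{
	int cur = atomic_load(&notify_threads);

	while (cur < MAX_NOTIFY_THREADS) {
		/* On failure, cur is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&notify_threads, &cur, cur + 1))
			return true;
	}
	return false;
}

/* Release the slot when the handler thread exits. */
static void notify_slot_put(void)
{
	atomic_fetch_sub(&notify_threads, 1);
}
```

A cap like this only bounds memory use. If every slot is taken by a thread stuck behind the same interpreter lock, further Notify() events have nowhere to go, which is one way a limited set of threads could end up worse than none.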
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
@ 2006-07-12 23:13 Brown, Len
2006-07-13 0:14 ` kacpi_notify? Linus Torvalds
0 siblings, 1 reply; 18+ messages in thread
From: Brown, Len @ 2006-07-12 23:13 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, linux-acpi, Moore, Robert, Starikovskiy, Alexey Y
>>On Wed, 12 Jul 2006, Brown, Len wrote:
>>>
>>> >Likely related to bugzilla-5534
>>>
>>> b8d35192c55fb055792ff0641408eaaec7c88988
>>
>>Well, that one certainly looks likely.
>>
>>Any reason to not just revert it? The fundamental problems that it
>>introduces are obviously much worse than the fix.
>
>If reverting it fixes your EVO, then certainly this is what to
>do right now. However, as the saga in 5534 will testify, this will
>make other systems unhappy, so we'll have to come back quickly with an
>improved patch for 5534.
Ugh, ACPICA starts depending on acpi_os_execute() at that time,
so a simple revert will break the build -- needs to be tweaked.
I can do it, but I've got to grab something to eat now...
-Len
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
2006-07-12 23:13 kacpi_notify? Brown, Len
@ 2006-07-13 0:14 ` Linus Torvalds
0 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2006-07-13 0:14 UTC (permalink / raw)
To: Brown, Len
Cc: Andrew Morton, linux-acpi, Moore, Robert, Starikovskiy, Alexey Y
On Wed, 12 Jul 2006, Brown, Len wrote:
>
> Ugh, ACPICA starts depending on acpi_os_execute() at that time,
> so a simple revert will break the build -- needs to be tweaked.
Yeah, I just noticed.
Those "ACPICA" commits are crap. Crap, crap, crap. They do more than one
thing, and are clearly just patches applied from a system that probably
does have finer-grained granularity of tracking the changes.
Here's a suggested revert. This boots, and seems to have the problem
fixed. At least I can now recompile the kernel on that machine again,
without the machine crashing.
Linus
----
Revert "ACPI: execute Notify() handlers on new thread"
This reverts commit b8d35192c55fb055792ff0641408eaaec7c88988 (and parts
of commit 958dd242b691f64ab4632b4903dbb1e16fee8269: "ACPI: ACPICA
20060512", which were intertwined, along with some Smart Battery logic
that had already started using the broken routines).
The thread execution doesn't actually solve the bug it set out to solve
(see
http://bugzilla.kernel.org/show_bug.cgi?id=5534
for more details) because the new events can get caught behind the AML
semaphore or other serialization. And when that happens, the notify
threads keep on piling up until the system dies.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index e5d7963..bdf3e12 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -759,7 +759,8 @@ static u32 acpi_ec_gpe_poll_handler(void
acpi_disable_gpe(NULL, ec->common.gpe_bit, ACPI_ISR);
- status = acpi_os_execute(OSL_EC_POLL_HANDLER, acpi_ec_gpe_query, ec);
+ status = acpi_os_queue_for_execution(OSD_PRIORITY_GPE,
+ acpi_ec_gpe_query, ec);
if (status == AE_OK)
return ACPI_INTERRUPT_HANDLED;
@@ -797,7 +798,7 @@ static u32 acpi_ec_gpe_intr_handler(void
if (value & ACPI_EC_FLAG_SCI) {
atomic_add(1, &ec->intr.pending_gpe);
- status = acpi_os_execute(OSL_EC_BURST_HANDLER,
+ status = acpi_os_queue_for_execution(OSD_PRIORITY_GPE,
acpi_ec_gpe_query, ec);
return status == AE_OK ?
ACPI_INTERRUPT_HANDLED : ACPI_INTERRUPT_NOT_HANDLED;
diff --git a/drivers/acpi/events/evgpe.c b/drivers/acpi/events/evgpe.c
index c76c058..aad17dd 100644
--- a/drivers/acpi/events/evgpe.c
+++ b/drivers/acpi/events/evgpe.c
@@ -495,8 +495,8 @@ u32 acpi_ev_gpe_detect(struct acpi_gpe_x
* RETURN: None
*
* DESCRIPTION: Perform the actual execution of a GPE control method. This
- * function is called from an invocation of acpi_os_execute and
- * therefore does NOT execute at interrupt level - so that
+ * function is called from an invocation of acpi_os_queue_for_execution
+ * (and therefore does NOT execute at interrupt level) so that
* the control method itself is not executed in the context of
* an interrupt handler.
*
@@ -692,9 +692,9 @@ acpi_ev_gpe_dispatch(struct acpi_gpe_eve
* Execute the method associated with the GPE
* NOTE: Level-triggered GPEs are cleared after the method completes.
*/
- status = acpi_os_execute(OSL_GPE_HANDLER,
- acpi_ev_asynch_execute_gpe_method,
- gpe_event_info);
+ status = acpi_os_queue_for_execution(OSD_PRIORITY_GPE,
+ acpi_ev_asynch_execute_gpe_method,
+ gpe_event_info);
if (ACPI_FAILURE(status)) {
ACPI_EXCEPTION((AE_INFO, status,
"Unable to queue handler for GPE[%2X] - event disabled",
diff --git a/drivers/acpi/events/evmisc.c b/drivers/acpi/events/evmisc.c
index 6eef4ef..625dd3b 100644
--- a/drivers/acpi/events/evmisc.c
+++ b/drivers/acpi/events/evmisc.c
@@ -192,9 +192,9 @@ acpi_ev_queue_notify_request(struct acpi
notify_info->notify.value = (u16) notify_value;
notify_info->notify.handler_obj = handler_obj;
- status =
- acpi_os_execute(OSL_NOTIFY_HANDLER, acpi_ev_notify_dispatch,
- notify_info);
+ status = acpi_os_queue_for_execution(OSD_PRIORITY_HIGH,
+ acpi_ev_notify_dispatch,
+ notify_info);
if (ACPI_FAILURE(status)) {
acpi_ut_delete_generic_state(notify_info);
}
@@ -347,9 +347,9 @@ static u32 acpi_ev_global_lock_handler(v
/* Run the Global Lock thread which will signal all waiting threads */
- status =
- acpi_os_execute(OSL_GLOBAL_LOCK_HANDLER,
- acpi_ev_global_lock_thread, context);
+ status = acpi_os_queue_for_execution(OSD_PRIORITY_HIGH,
+ acpi_ev_global_lock_thread,
+ context);
if (ACPI_FAILURE(status)) {
ACPI_EXCEPTION((AE_INFO, status,
"Could not queue Global Lock thread"));
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 47dfde9..8eb3ab6 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -36,7 +36,6 @@ #include <linux/kmod.h>
#include <linux/delay.h>
#include <linux/workqueue.h>
#include <linux/nmi.h>
-#include <linux/kthread.h>
#include <acpi/acpi.h>
#include <asm/io.h>
#include <acpi/acpi_bus.h>
@@ -583,41 +582,23 @@ static void acpi_os_execute_deferred(voi
return;
}
-static int acpi_os_execute_thread(void *context)
-{
- struct acpi_os_dpc *dpc = (struct acpi_os_dpc *)context;
- if (dpc) {
- dpc->function(dpc->context);
- kfree(dpc);
- }
- do_exit(0);
-}
-
-/*******************************************************************************
- *
- * FUNCTION: acpi_os_execute
- *
- * PARAMETERS: Type - Type of the callback
- * Function - Function to be executed
- * Context - Function parameters
- *
- * RETURN: Status
- *
- * DESCRIPTION: Depending on type, either queues function for deferred execution or
- * immediately executes function on a separate thread.
- *
- ******************************************************************************/
-
-acpi_status acpi_os_execute(acpi_execute_type type,
+acpi_status
+acpi_os_queue_for_execution(u32 priority,
acpi_osd_exec_callback function, void *context)
{
acpi_status status = AE_OK;
struct acpi_os_dpc *dpc;
struct work_struct *task;
- struct task_struct *p;
+
+ ACPI_FUNCTION_TRACE("os_queue_for_execution");
+
+ ACPI_DEBUG_PRINT((ACPI_DB_EXEC,
+ "Scheduling function [%p(%p)] for deferred execution.\n",
+ function, context));
if (!function)
- return AE_BAD_PARAMETER;
+ return_ACPI_STATUS(AE_BAD_PARAMETER);
+
/*
* Allocate/initialize DPC structure. Note that this memory will be
* freed by the callee. The kernel handles the tq_struct list in a
@@ -628,37 +609,30 @@ acpi_status acpi_os_execute(acpi_execute
* We can save time and code by allocating the DPC and tq_structs
* from the same memory.
*/
- if (type == OSL_NOTIFY_HANDLER) {
- dpc = kmalloc(sizeof(struct acpi_os_dpc), GFP_KERNEL);
- } else {
- dpc = kmalloc(sizeof(struct acpi_os_dpc) +
- sizeof(struct work_struct), GFP_ATOMIC);
- }
+
+ dpc =
+ kmalloc(sizeof(struct acpi_os_dpc) + sizeof(struct work_struct),
+ GFP_ATOMIC);
if (!dpc)
- return AE_NO_MEMORY;
+ return_ACPI_STATUS(AE_NO_MEMORY);
+
dpc->function = function;
dpc->context = context;
- if (type == OSL_NOTIFY_HANDLER) {
- p = kthread_create(acpi_os_execute_thread, dpc, "kacpid_notify");
- if (!IS_ERR(p)) {
- wake_up_process(p);
- } else {
- status = AE_NO_MEMORY;
- kfree(dpc);
- }
- } else {
- task = (void *)(dpc + 1);
- INIT_WORK(task, acpi_os_execute_deferred, (void *)dpc);
- if (!queue_work(kacpid_wq, task)) {
- status = AE_ERROR;
- kfree(dpc);
- }
+ task = (void *)(dpc + 1);
+ INIT_WORK(task, acpi_os_execute_deferred, (void *)dpc);
+
+ if (!queue_work(kacpid_wq, task)) {
+ ACPI_DEBUG_PRINT((ACPI_DB_ERROR,
+ "Call to queue_work() failed.\n"));
+ kfree(dpc);
+ status = AE_ERROR;
}
- return status;
+
+ return_ACPI_STATUS(status);
}
-EXPORT_SYMBOL(acpi_os_execute);
+EXPORT_SYMBOL(acpi_os_queue_for_execution);
void acpi_os_wait_events_complete(void *context)
{
diff --git a/drivers/acpi/sbs.c b/drivers/acpi/sbs.c
index db7b350..87d7800 100644
--- a/drivers/acpi/sbs.c
+++ b/drivers/acpi/sbs.c
@@ -1366,7 +1366,7 @@ static void acpi_ac_remove(struct acpi_s
static void acpi_sbs_update_queue_run(unsigned long data)
{
- acpi_os_execute(OSL_GPE_HANDLER, acpi_sbs_update_queue, (void *)data);
+ acpi_os_queue_for_execution(OSD_PRIORITY_GPE, acpi_sbs_update_queue, (void *)data);
}
static int acpi_sbs_update_run(struct acpi_sbs *sbs, int data_type)
@@ -1658,11 +1658,11 @@ static int acpi_sbs_add(struct acpi_devi
init_timer(&sbs->update_timer);
if (update_mode == QUEUE_UPDATE_MODE) {
- status = acpi_os_execute(OSL_GPE_HANDLER,
+ status = acpi_os_queue_for_execution(OSD_PRIORITY_GPE,
acpi_sbs_update_queue, (void *)sbs);
if (status != AE_OK) {
ACPI_DEBUG_PRINT((ACPI_DB_ERROR,
- "acpi_os_execute() failed\n"));
+ "acpi_os_queue_for_execution() failed\n"));
}
}
sbs->update_time = update_time;
diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c
index 5753d06..e16e69a 100644
--- a/drivers/acpi/thermal.c
+++ b/drivers/acpi/thermal.c
@@ -657,7 +657,8 @@ static void acpi_thermal_run(unsigned lo
{
struct acpi_thermal *tz = (struct acpi_thermal *)data;
if (!tz->zombie)
- acpi_os_execute(OSL_GPE_HANDLER, acpi_thermal_check, (void *)data);
+ acpi_os_queue_for_execution(OSD_PRIORITY_GPE,
+ acpi_thermal_check, (void *)data);
}
static void acpi_thermal_check(void *data)
diff --git a/include/acpi/acpiosxf.h b/include/acpi/acpiosxf.h
index 0cd63bc..8040084 100644
--- a/include/acpi/acpiosxf.h
+++ b/include/acpi/acpiosxf.h
@@ -50,16 +50,12 @@ #define __ACPIOSXF_H__
#include "platform/acenv.h"
#include "actypes.h"
-/* Types for acpi_os_execute */
+/* Priorities for acpi_os_queue_for_execution */
-typedef enum {
- OSL_GLOBAL_LOCK_HANDLER,
- OSL_NOTIFY_HANDLER,
- OSL_GPE_HANDLER,
- OSL_DEBUGGER_THREAD,
- OSL_EC_POLL_HANDLER,
- OSL_EC_BURST_HANDLER
-} acpi_execute_type;
+#define OSD_PRIORITY_GPE 1
+#define OSD_PRIORITY_HIGH 2
+#define OSD_PRIORITY_MED 3
+#define OSD_PRIORITY_LO 4
#define ACPI_NO_UNIT_LIMIT ((u32) -1)
#define ACPI_MUTEX_SEM 1
@@ -188,8 +184,8 @@ acpi_os_remove_interrupt_handler(u32 gsi
acpi_thread_id acpi_os_get_thread_id(void);
acpi_status
-acpi_os_execute(acpi_execute_type type,
- acpi_osd_exec_callback function, void *context);
+acpi_os_queue_for_execution(u32 priority,
+ acpi_osd_exec_callback function, void *context);
void acpi_os_wait_events_complete(void *context);
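
The pile-up described in the commit message can be illustrated with a toy model. It is not kernel code and makes simplifying assumptions: one handler is created per Notify() event, every handler needs the same lock, and once the lock is free exactly one handler finishes per tick. `pending_handlers()` and its parameters are hypothetical.

```c
#include <assert.h>

/*
 * Toy model, not kernel code: one handler is created per Notify() event,
 * and every handler needs the same interpreter lock before it can finish.
 * While the lock holder is stuck, nothing drains.
 */
static unsigned long pending_handlers(unsigned long ticks,
				      unsigned long events_per_tick,
				      unsigned long lock_held_ticks)
{
	unsigned long pending = 0;
	unsigned long t;

	for (t = 0; t < ticks; t++) {
		pending += events_per_tick;	/* new handlers arrive */
		if (t >= lock_held_ticks && pending > 0)
			pending--;		/* lock free: one handler finishes */
	}
	return pending;
}
```

If the lock is never released within the window, the backlog simply equals the number of events so far, and each pending handler in the reverted code was a whole kernel thread. A workqueue has the same backlog, but each pending item costs one small allocation rather than a thread.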
^ permalink raw reply related [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
@ 2006-07-12 23:08 Brown, Len
0 siblings, 0 replies; 18+ messages in thread
From: Brown, Len @ 2006-07-12 23:08 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, linux-acpi, Moore, Robert, Starikovskiy, Alexey Y
>On Wed, 12 Jul 2006, Brown, Len wrote:
>>
>> >Likely related to bugzilla-5534
>>
>> b8d35192c55fb055792ff0641408eaaec7c88988
>
>Well, that one certainly looks likely.
>
>Any reason to not just revert it? The fundamental problems that it
>introduces are obviously much worse than the fix.
If reverting it fixes your EVO, then certainly this is what to do right
now. However, as the saga in 5534 will testify, this will make other
systems unhappy, so we'll have to come back quickly with an improved
patch for 5534.
-Len
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
@ 2006-07-12 22:42 Brown, Len
2006-07-12 23:02 ` kacpi_notify? Linus Torvalds
0 siblings, 1 reply; 18+ messages in thread
From: Brown, Len @ 2006-07-12 22:42 UTC (permalink / raw)
To: Brown, Len, Linus Torvalds
Cc: Andrew Morton, linux-acpi, Moore, Robert, Starikovskiy, Alexey Y
>Likely related to bugzilla-5534
b8d35192c55fb055792ff0641408eaaec7c88988
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
2006-07-12 22:42 kacpi_notify? Brown, Len
@ 2006-07-12 23:02 ` Linus Torvalds
0 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2006-07-12 23:02 UTC (permalink / raw)
To: Brown, Len
Cc: Andrew Morton, linux-acpi, Moore, Robert, Starikovskiy, Alexey Y
On Wed, 12 Jul 2006, Brown, Len wrote:
>
> >Likely related to bugzilla-5534
>
> b8d35192c55fb055792ff0641408eaaec7c88988
Well, that one certainly looks likely.
Any reason to not just revert it? The fundamental problems that it
introduces are obviously much worse than the fix.
Linus
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
@ 2006-07-12 22:39 Brown, Len
0 siblings, 0 replies; 18+ messages in thread
From: Brown, Len @ 2006-07-12 22:39 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, linux-acpi, Moore, Robert, Starikovskiy, Alexey Y
Likely related to bugzilla-5534
-Len
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
@ 2006-07-12 21:55 Brown, Len
2006-07-12 22:18 ` kacpi_notify? Linus Torvalds
0 siblings, 1 reply; 18+ messages in thread
From: Brown, Len @ 2006-07-12 21:55 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Andrew Morton, linux-acpi, Moore, Robert
Linus,
Does /proc/interrupts show that the acpi interrupt is ticking along
at the same rate as the kacpid_notify build-up?
If you run with CONFIG_ACPI_THERMAL=n or otherwise nuke
the thermal module, does the interrupt and the kacpid_notify symptom go
away?
thanks,
-Len
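
The rate check asked for above can be sketched in C. This is only an
illustration: irq_line_total() and irq_rate() are hypothetical helper names,
and the code assumes the usual /proc/interrupts layout of "IRQ:" followed by
one count per CPU and then the controller and device names. Take two
snapshots of the acpi line a known number of seconds apart and compare the
result with how fast the kacpid_notify threads pile up.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sum the per-CPU counters on one /proc/interrupts line, for example
 * "  9:       9500          0   IO-APIC-level  acpi".
 * Returns -1 if the line has no "IRQ:" prefix.
 * Assumption: the first field after the counters does not start with a digit.
 */
long irq_line_total(const char *line)
{
    const char *p = strchr(line, ':');
    long total = 0;

    if (!p)
        return -1;
    p++;                                /* skip the ':' */

    for (;;) {
        char *end;

        while (*p == ' ' || *p == '\t') /* skip column padding */
            p++;
        if (!isdigit((unsigned char)*p))
            break;                      /* reached the controller name */
        total += strtol(p, &end, 10);
        p = end;
    }
    return total;
}

/* Interrupts per second between two snapshots taken `seconds` apart. */
double irq_rate(long before, long after, double seconds)
{
    return (double)(after - before) / seconds;
}
```

For instance, totals of 500 and 9500 taken 30 seconds apart give
irq_rate(500, 9500, 30.0) == 300.0 interrupts per second.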
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
2006-07-12 21:55 kacpi_notify? Brown, Len
@ 2006-07-12 22:18 ` Linus Torvalds
2006-07-12 22:23 ` kacpi_notify? Linus Torvalds
0 siblings, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2006-07-12 22:18 UTC (permalink / raw)
To: Brown, Len; +Cc: Andrew Morton, linux-acpi, Moore, Robert
On Wed, 12 Jul 2006, Brown, Len wrote:
>
> Does /proc/interrupts show that the acpi interrupt is ticking along
> at the same rate as the kacpid_notify build-up?
It's really hard to tell.
The machine starts out fine, with no kacpid_notify buildup AT ALL.
Then, at some point, something changes. I dunno what, but I'm guessing the
temperature goes up, and it gets a temp event and wants to turn on the
fan.
And when that happens, things go to hell very quickly, and the whole
machine becomes undebuggable quite soon. And it's really hard to hit the
window between "uhhuh, bad things are starting to happen" and the "oops,
the machine is now totally dead, only the magic scrolllock keys work any
more".
So I can give you info about what the machine does _before_ things go bad,
and I can give limited register info about what happens when the machine
is totally hosed, but I've only been able to get very lucky a few times
(usually it's a "ps ax" that I had running to try to trigger it, and it
locks up in the _middle_ of the "ps", showing tens of the kacpid_notify
things - there are probably thousands, but the "ps ax" takes long enough
that the machine runs out of memory before it even finishes).
> If you run with CONFIG_ACPI_THERMAL=n or otherwise nuke
> the thermal module, does the interrupt and the kacpid_notify symptom go
> away?
I'll try that. Right now I've started a "git bisect" run, since it seemed
to be the easiest way to get _some_ information about the problem, without
having to know much else about it.
The problem, of course, is that there are those 6000 commits (fine, that's
about 13 kernels to be tested), but more importantly that I'm not 100%
sure how repeatable it all is (see above - it's not like it locks up
immediately on boot, there seems to be something that triggers it, usually
while doing a kernel compile or a big grep), and if I ever get that answer
wrong, I'll be doing it all in vain.
Linus
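
The "6000 commits, about 13 kernels" estimate above follows from halving the
candidate range on every test. A rough sketch of that arithmetic, assuming an
idealized bisection where each build rules out half of what is left
(bisect_steps() is a hypothetical name, not a git interface, and real
"git bisect" runs over merge-heavy history can take a step more or less):

```c
#include <assert.h>

/*
 * Worst-case number of kernels to build and test when bisecting
 * n_commits candidates, keeping the larger half after each test.
 */
unsigned int bisect_steps(unsigned long n_commits)
{
    unsigned long remaining = n_commits;
    unsigned int steps = 0;

    assert(n_commits > 0);

    while (remaining > 1) {
        remaining = (remaining + 1) / 2;  /* larger half survives */
        steps++;
    }
    return steps;
}
```

bisect_steps(6000) returns 13, because 2^12 = 4096 < 6000 <= 8192 = 2^13.
One wrong good/bad answer along the way sends every later step into the wrong
half, which is why an unreliable reproducer makes the whole run worthless.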
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
2006-07-12 22:18 ` kacpi_notify? Linus Torvalds
@ 2006-07-12 22:23 ` Linus Torvalds
2006-07-12 22:37 ` kacpi_notify? Linus Torvalds
0 siblings, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2006-07-12 22:23 UTC (permalink / raw)
To: Brown, Len; +Cc: Andrew Morton, linux-acpi, Moore, Robert
On Wed, 12 Jul 2006, Linus Torvalds wrote:
>
> I'll try that. Right now I've started a "git bisect" run, since it seemed
> to be the easiest way to get _some_ information about the problem, without
> having to know much else about it.
Current result: it _looks_ like it was brought in when I did the ACPI
merge on June 23 (commit 37224470c8c6d90a4062e76a08d4dc1fcf91fc89).
I've got a hundred-odd commits to go, but the next bisection test happens
to be the parent of my merge (your "merge linus into release branch"
merge: ae6c859b7dcd708efadf1c76279c33db213e3506), so if I'm right, I'd
expect that to be a bad tree.
Linus
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
2006-07-12 22:23 ` kacpi_notify? Linus Torvalds
@ 2006-07-12 22:37 ` Linus Torvalds
0 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2006-07-12 22:37 UTC (permalink / raw)
To: Brown, Len; +Cc: Andrew Morton, linux-acpi, Moore, Robert
On Wed, 12 Jul 2006, Linus Torvalds wrote:
>
> I've got a hundred-odd commits to go, but the next bisection test happens
> to be the parent of my merge (your "merge linus into release branch"
> merge: ae6c859b7dcd708efadf1c76279c33db213e3506), so if I'm right, I'd
> expect that to be a bad tree.
Yup.
And yes, the problem seems to coincide with getting about 300 acpi
interrupts per second. After about 9500 interrupts (each of which seems to
create one of these things), the machine is basically dead.
Ten thousand kacpid_notify threads is too much. Regardless of what brought
on this bug, I think there's something wrong in anything that keeps on
notifying things without keeping track of how many outstanding
notifications it already has.
Linus
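
One way to picture "keeping track of how many outstanding notifications it
already has" is a bounded counter that every notification must reserve a slot
in before it is queued. The following is a userspace C11 sketch under stated
assumptions, not the ACPI code and not the fix that was eventually merged:
MAX_OUTSTANDING, notify_try_begin(), notify_end() and notify_outstanding()
are all hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_OUTSTANDING 4   /* hypothetical cap on in-flight notifications */

static atomic_int outstanding = 0;

/*
 * Reserve a slot for one notification. Returns false when the cap is
 * already reached, so the caller can drop or coalesce the event instead
 * of spawning yet another handler.
 */
bool notify_try_begin(void)
{
    int cur = atomic_load(&outstanding);

    while (cur < MAX_OUTSTANDING) {
        /* On failure the CAS reloads cur, so the bound is rechecked. */
        if (atomic_compare_exchange_weak(&outstanding, &cur, cur + 1))
            return true;
    }
    return false;
}

/* Release the slot once the handler has finished. */
void notify_end(void)
{
    int prev = atomic_fetch_sub(&outstanding, 1);

    assert(prev > 0);   /* every end must pair with a successful begin */
    (void)prev;
}

/* Current number of notifications in flight. */
int notify_outstanding(void)
{
    return atomic_load(&outstanding);
}
```

With a cap like this, a 300-per-second interrupt storm would leave at most
MAX_OUTSTANDING handlers blocked on the interpreter mutex instead of the
roughly 9500 threads seen here, and the rest of the machine would stay usable
long enough to debug.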
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: kacpi_notify?
@ 2006-07-12 20:51 Brown, Len
[not found] ` <Pine.LNX.4.64.0607121356300.5623@g5.osdl.org>
0 siblings, 1 reply; 18+ messages in thread
From: Brown, Len @ 2006-07-12 20:51 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Andrew Morton, linux-acpi, Moore, Robert
>With ctrl+scrolllock, I finally got something. The traceback for the
>D-state (millions and millions of them) is
>
> __down_failed
> acpi_ut_acquire_mutex
> acpi_ex_enter_interpreter
> acpi_ns_evaluate
> acpi_evaluate_object
> acpi_evaluate_integer
> acpi_os_execute_thread
> acpi_thermal_get_temperature
> acpi_thermal_check
> ..
>
>and 'kacpid' seems to be stuck using all CPU time, with the
>thing doing
>something like:
>
> EIP is at delay_tsc+0xb/0x13
> EFLAGS: 00000283 Not tainted (2.6.18-rc1-g155dbfd8 #24)
> EAX: 4aa48900 EBX: 00026be1 ECX: 4aa40b7e EDX: 0000001a
> ESI: 00000000 EDI: c039300d EBP: c0390df3 DS: 007b ES: 007b
> CR0: 8005003b CR2: 080516f0 CR3: 362dc000 CR4: 000006d0
> [<c01c94c0>] __delay+0x6/0x7
> [<c01f23ef>] acpi_os_stall+0x1d/0x29
> [<c0201f11>] acpi_ex_system_do_stall+0x37/0x3b
> [<c0200fca>] acpi_ex_opcode_1A_0T_0R+0x85/0xc8
> [<c01f5308>] acpi_ds_exec_end_op+0x133/0x553
> [<c020d0f3>] acpi_ps_parse_loop+0x777/0xbe0
> [<c020c488>] acpi_ps_parse_aml+0xd8/0x2d5
> [<c020dbbe>] acpi_ps_execute_pass+0xa9/0xd2
> [<c020dd6a>] acpi_ps_execute_method+0x153/0x231
> [<c02095e1>] acpi_ns_evaluate+0x179/0x24c
> [<c01fc12e>] acpi_ev_asynch_execute_gpe_method+0xeb/0x159
> [<c01f2083>] acpi_os_execute_deferred+0x19/0x21
> [<c01226a0>] run_workqueue+0x68/0x95
> [<c01f206a>] acpi_os_execute_deferred+0x0/0x21
> [<c0122b2e>] worker_thread+0xf9/0x12b
> [<c03570bf>] schedule+0x469/0x4cc
> [<c0113bfb>] default_wake_function+0x0/0xc
> [<c0122a35>] worker_thread+0x0/0x12b
> [<c01249bb>] kthread+0xad/0xd8
> [<c012490e>] kthread+0x0/0xd8
> [<c0101005>] kernel_thread_helper+0x5/0xb
>
>which I assume is the thing that holds the AML semaphore, and isn't
>releasing it.
>
>Is there any sane debugging info I can send people?
I think you just did:-)
The os_stall part is due to interpreting AML. If you can get the
output from acpidump, that would help us see what it is doing.
I ask most people to attach it to a new bugzilla entry,
but I don't know that you have an account:-)
Anyway, if you don't have it already, acpidump is in the latest pmtools
here:
http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
dmesg with CONFIG_ACPI_DEBUG might be interesting too.
thanks,
-Len
^ permalink raw reply [flat|nested] 18+ messages in thread
* kacpi_notify?
@ 2006-07-12 19:41 Linus Torvalds
2006-07-12 20:42 ` kacpi_notify? Linus Torvalds
0 siblings, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2006-07-12 19:41 UTC (permalink / raw)
To: Len Brown; +Cc: Andrew Morton, linux-acpi
Hmm.
What's up with this?
2341 ? D< 0:00 [kacpid_notify]
2342 ? D< 0:00 [kacpid_notify]
2343 ? D< 0:00 [kacpid_notify]
2344 ? D< 0:00 [kacpid_notify]
2345 ? D< 0:00 [kacpid_notify]
2346 ? D< 0:00 [kacpid_notify]
2347 ? D< 0:00 [kacpid_notify]
...
(apparently about 300 of those processes, at which point the machine just
hangs, because even root cannot start any new processes, and I couldn't
actually get to debug this at all).
What would it be waiting on, and why?
This machine doesn't have any module support (at all), and I haven't
booted a new kernel on it in quite a while, so this isn't necessarily new
behaviour, but the last kernel I tried (which did _not_ have this problem,
obviously) was in April (commit 6e5882cfa24e1456702e463f6920fc0ca3c3d2b8,
to be exact).
Now, that's 6000+ commits ago, so I'd rather not even bisect this, if
somebody can come up with a more obvious explanation of why kacpid_notify
would be started over and over and over again, only to always get stuck..
Linus
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: kacpi_notify?
2006-07-12 19:41 kacpi_notify? Linus Torvalds
@ 2006-07-12 20:42 ` Linus Torvalds
0 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2006-07-12 20:42 UTC (permalink / raw)
To: Len Brown; +Cc: Andrew Morton, linux-acpi
On Wed, 12 Jul 2006, Linus Torvalds wrote:
>
> (apparently about 300 of those processes, at which point the machine just
> hangs, because even root cannot start any new processes, and I couldn't
> actually get to debug this at all).
With ACPI debugging, I notice that it finally dies due to ACPI Error
AE_NO_MEMORY. Which I guess is just due to thousands of kacpid_notify
processes, and tons of allocations.
With ctrl+scrolllock, I finally got something. The traceback for the
D-state (millions and millions of them) is
__down_failed
acpi_ut_acquire_mutex
acpi_ex_enter_interpreter
acpi_ns_evaluate
acpi_evaluate_object
acpi_evaluate_integer
acpi_os_execute_thread
acpi_thermal_get_temperature
acpi_thermal_check
..
and 'kacpid' seems to be stuck using all CPU time, with the thing doing
something like:
EIP is at delay_tsc+0xb/0x13
EFLAGS: 00000283 Not tainted (2.6.18-rc1-g155dbfd8 #24)
EAX: 4aa48900 EBX: 00026be1 ECX: 4aa40b7e EDX: 0000001a
ESI: 00000000 EDI: c039300d EBP: c0390df3 DS: 007b ES: 007b
CR0: 8005003b CR2: 080516f0 CR3: 362dc000 CR4: 000006d0
[<c01c94c0>] __delay+0x6/0x7
[<c01f23ef>] acpi_os_stall+0x1d/0x29
[<c0201f11>] acpi_ex_system_do_stall+0x37/0x3b
[<c0200fca>] acpi_ex_opcode_1A_0T_0R+0x85/0xc8
[<c01f5308>] acpi_ds_exec_end_op+0x133/0x553
[<c020d0f3>] acpi_ps_parse_loop+0x777/0xbe0
[<c020c488>] acpi_ps_parse_aml+0xd8/0x2d5
[<c020dbbe>] acpi_ps_execute_pass+0xa9/0xd2
[<c020dd6a>] acpi_ps_execute_method+0x153/0x231
[<c02095e1>] acpi_ns_evaluate+0x179/0x24c
[<c01fc12e>] acpi_ev_asynch_execute_gpe_method+0xeb/0x159
[<c01f2083>] acpi_os_execute_deferred+0x19/0x21
[<c01226a0>] run_workqueue+0x68/0x95
[<c01f206a>] acpi_os_execute_deferred+0x0/0x21
[<c0122b2e>] worker_thread+0xf9/0x12b
[<c03570bf>] schedule+0x469/0x4cc
[<c0113bfb>] default_wake_function+0x0/0xc
[<c0122a35>] worker_thread+0x0/0x12b
[<c01249bb>] kthread+0xad/0xd8
[<c012490e>] kthread+0x0/0xd8
[<c0101005>] kernel_thread_helper+0x5/0xb
which I assume is the thing that holds the AML semaphore, and isn't
releasing it.
Is there any sane debugging info I can send people?
Linus
^ permalink raw reply [flat|nested] 18+ messages in thread
Thread overview: 18+ messages
2006-07-13 2:46 kacpi_notify? Len Brown
-- strict thread matches above, loose matches on Subject: below --
2006-07-13 15:32 kacpi_notify? Starikovskiy, Alexey Y
2006-07-13 11:07 kacpi_notify? Starikovskiy, Alexey Y
2006-07-13 15:00 ` kacpi_notify? Linus Torvalds
2006-07-12 23:13 kacpi_notify? Brown, Len
2006-07-13 0:14 ` kacpi_notify? Linus Torvalds
2006-07-12 23:08 kacpi_notify? Brown, Len
2006-07-12 22:42 kacpi_notify? Brown, Len
2006-07-12 23:02 ` kacpi_notify? Linus Torvalds
2006-07-12 22:39 kacpi_notify? Brown, Len
2006-07-12 21:55 kacpi_notify? Brown, Len
2006-07-12 22:18 ` kacpi_notify? Linus Torvalds
2006-07-12 22:23 ` kacpi_notify? Linus Torvalds
2006-07-12 22:37 ` kacpi_notify? Linus Torvalds
2006-07-12 20:51 kacpi_notify? Brown, Len
[not found] ` <Pine.LNX.4.64.0607121356300.5623@g5.osdl.org>
2006-07-12 21:21 ` kacpi_notify? Linus Torvalds
2006-07-12 19:41 kacpi_notify? Linus Torvalds
2006-07-12 20:42 ` kacpi_notify? Linus Torvalds