public inbox for linux-acpi@vger.kernel.org
* stack overflow
@ 2004-03-08 16:43 Stuart_Hayes-DYMqY+WieiM
       [not found] ` <CE41BFEF2481C246A8DE0D2B4DBACF4F128AA4-novRXWwkcpil7xnNSM18fRtLTTO9Z+wMojBamW5iJbs@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Stuart_Hayes-DYMqY+WieiM @ 2004-03-08 16:43 UTC (permalink / raw)
  To: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f; +Cc: Stuart_Hayes-DYMqY+WieiM


Hello!

I'm using the x86_64 architecture with Linux, and I'm getting what appear to be
stack overflows while ACPI is being initialized (while all the _STA
methods are being executed).

Just wondering if anyone has taken a look at stack usage in the ACPI code,
and what could be done to fix this (other than simplifying the ACPI tables!).
I suspect this could become an issue for a large number of people as
the x86_64 architecture becomes more common, since it uses ACPI by default
(which i386 did not).

Here are some of the reasons I believe the stack is overflowing:

I've added some "printk"s to the kernel, and I've found that the stack pointer 
goes down by ~6K between namespace/nseval.c:acpi_ns_evaluate_relative() and 
executer/exstore.c:acpi_ex_store().

If I disable local interrupts while the ACPI stuff is being initialized, it 
seems to make it through without failing.

If I simplify some of the methods, it seems to work ok.

Thanks
Stuart



-------------------------------------------------------
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click


* Re: stack overflow
       [not found] ` <CE41BFEF2481C246A8DE0D2B4DBACF4F128AA4-novRXWwkcpil7xnNSM18fRtLTTO9Z+wMojBamW5iJbs@public.gmane.org>
@ 2004-03-08 18:26   ` Andi Kleen
       [not found]     ` <20040308182630.GB9490-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
  2004-03-10  4:33   ` Len Brown
  1 sibling, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2004-03-08 18:26 UTC (permalink / raw)
  To: Stuart_Hayes-DYMqY+WieiM; +Cc: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

> Here are some of the reasons I believe the stack is overflowing:
> 
> I've added some "printk"s to the kernel, and I've found that the stack pointer 
> goes down by ~6K between namespace/nseval.c:acpi_ns_evaluate_relative() and 
> executer/exstore.c:acpi_ex_store().

The usual way to start is to run

	objdump -S <acpi object modules> | grep 'sub.*rsp'

then sort to find the biggest stack pigs and fix them one by one (e.g.
by kmallocing local data instead of allocating it on the stack).
If the problem still occurs afterwards, it is most likely recursion or
too deep nesting.  I have an old 2.4 patch that can catch these, but it
would need porting to 2.6.

-Andi
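
The grep step above can also be done in code. What follows is only a
minimal sketch, assuming AT&T-syntax output as objdump prints it for
x86-64; frame_bytes() is a hypothetical helper, not part of any tool, and
it only recognizes the plain "sub $imm,%rsp" form (other ways a compiler
may adjust the stack pointer are not covered).

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of x86-64 disassembly and return the number of bytes it
 * reserves on the stack, or -1 if the line is not a "sub $imm,%rsp"
 * instruction.  Hypothetical helper for illustration only.
 */
long frame_bytes(const char *line)
{
        const char *p = strstr(line, "sub");
        unsigned long bytes;

        if (p == NULL)
                return -1;

        /* The immediate operand starts at '$' and must target %rsp. */
        p = strchr(p, '$');
        if (p == NULL || strstr(p, ",%rsp") == NULL)
                return -1;

        /* %lx accepts an optional 0x prefix, which objdump emits. */
        if (sscanf(p, "$%lx", &bytes) != 1)
                return -1;

        return (long)bytes;
}
```

Feeding every disassembly line through this and sorting the non-negative
results numerically gives the same "biggest first" list as the pipeline.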



* Re: stack overflow
       [not found]     ` <20040308182630.GB9490-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
@ 2004-03-09  7:17       ` Len Brown
  0 siblings, 0 replies; 14+ messages in thread
From: Len Brown @ 2004-03-09  7:17 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Stuart_Hayes-DYMqY+WieiM, ACPI Developers

Stuart,
Does CONFIG_ACPI_DEBUG change the results of your measurements?

Is it possible to run an i386 kernel on the same system to see if we've
got an x86_64-specific issue?

There is some run-time stack tracing code in ACPI (see
acpi_gbl_lowest_stack_pointer) but it hasn't been used in a while.

thanks,
-Len
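
The idea behind run-time stack tracing of this kind can be sketched as
below. This is a stand-alone illustration, not the ACPI code itself: the
stack_watch names are hypothetical, and the sketch assumes a stack that
grows toward lower addresses, as it does on x86.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical low-water-mark tracker; not the ACPI implementation. */
struct stack_watch {
        uintptr_t entry;   /* stack address when watching began */
        uintptr_t lowest;  /* lowest stack address seen since then */
};

/* Call once in the outermost function, passing the address of a local. */
void stack_watch_begin(struct stack_watch *w, const void *here)
{
        w->entry = (uintptr_t)here;
        w->lowest = (uintptr_t)here;
}

/* Call on entry to each function of interest, again with a local's address. */
void stack_watch_note(struct stack_watch *w, const void *here)
{
        uintptr_t sp = (uintptr_t)here;

        if (sp < w->lowest)
                w->lowest = sp;
}

/* Deepest excursion below the starting point, in bytes. */
size_t stack_watch_depth(const struct stack_watch *w)
{
        return (size_t)(w->entry - w->lowest);
}
```

Printing stack_watch_depth() after initialization would show the worst
case reached, without needing a printk in every function.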


* RE: stack overflow
@ 2004-03-09 18:34 Moore, Robert
  0 siblings, 0 replies; 14+ messages in thread
From: Moore, Robert @ 2004-03-09 18:34 UTC (permalink / raw)
  To: Brown, Len, Andi Kleen
  Cc: Stuart_Hayes-DYMqY+WieiM, ACPI Developers, Grover, Andrew

There's the whole tracing mechanism that sits on the stack -- it goes
away when debug is disabled.

There is very, very little recursion in the ACPI subsystem, and only
when it is bounded.

Bob



* RE: stack overflow
@ 2004-03-09 20:00 Stuart_Hayes-DYMqY+WieiM
  0 siblings, 0 replies; 14+ messages in thread
From: Stuart_Hayes-DYMqY+WieiM @ 2004-03-09 20:00 UTC (permalink / raw)
  To: len.brown-ral2JQCrhuEAvxtiuMwx3w, ak-l3A5Bk7waGM
  Cc: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f


I am in the process of trying this without CONFIG_ACPI_DEBUG now.

I put a little extra debug stuff in utilities/utdebug.c to keep track
of all the nested functions, and added the ACPI_FUNCTION_TRACE to some
more of the functions, and this is what I get when the stack pointer is
lowest (the number by each is the address of the first argument of the
acpi_ut_trace(_*) function, which I store along with the function
name).  This gives a pretty good picture of what's going on with the
recursion and the deep nesting of functions.

  acpi_init 			0000010001e0defc
  acpi_bus_init 			0000010001e0deac
  acpi_initialize_objects 	0000010001e0de6c
  ns_initialize_devices 	0000010001e0de1c
  ns_walk_namespace 		0000010001e0dd8c
  ns_init_one_device 		0000010001e0dd2c
  ut_execute_STA 			0000010001e0dccc
  ut_evaluate_object 		0000010001e0dc5c
  ns_evaluate_relative 		0000010001e0db8c
  ns_evaluate_by_handle 	0000010001e0db1c
  ns_execute_control_method 	0000010001e0dacc
  psx_execute 			0000010001e0da5c
  ps_parse_aml 			0000010001e0da0c
  ps_parse_loop 			0000010001e0d8fc
  ds_exec_end_op 			0000010001e0d89c
  ds_resolve_operands 		0000010001e0d85c
  ex_resolve_to_value 		0000010001e0d80c
  ex_resolve_node_to_value 	0000010001e0d7ac
  ex_read_data_from_field 	0000010001e0d71c
  ex_extract_from_field 	0000010001e0d69c
  ex_field_datum_io 		0000010001e0d61c
  ex_access_region 		0000010001e0d5ac
  ev_address_space_dispatch 	0000010001e0d52c
  ev_pci_config_region_setup 	0000010001e0d4ac
  ut_evaluate_numeric_object 	0000010001e0d44c
  ut_evaluate_object 		0000010001e0d3dc
  ns_evaluate_relative 		0000010001e0d30c
  ns_evaluate_by_handle 	0000010001e0d29c
  ns_execute_control_method 	0000010001e0d24c
  psx_execute 			0000010001e0d1dc
  ps_parse_aml 			0000010001e0d18c
  ps_parse_loop 			0000010001e0d07c
  ds_exec_end_op 			0000010001e0d01c
  ex_resolve_operands 		0000010001e0cf9c
  ex_resolve_to_value 		0000010001e0cf4c
  ex_resolve_node_to_value 	0000010001e0ceec
  ex_read_data_from_field 	0000010001e0ce5c
  ex_extract_from_field 	0000010001e0cddc
  ex_field_datum_io 		0000010001e0cd5c
  ex_access_region 		0000010001e0ccec
  ev_address_space_dispatch 	0000010001e0cc6c
  ev_pci_config_region_setup 	0000010001e0cbec
  acpi_evaluate_integer 	0000010001e0caac
  acpi_evaluate_object 		0000010001e0ca2c
  ns_evaluate_relative 		0000010001e0c95c
  ns_evaluate_by_handle 	0000010001e0c8ec
  ns_execute_control_method 	0000010001e0c89c
  psx_execute 			0000010001e0c82c
  ps_parse_aml 			0000010001e0c7dc
  ps_parse_loop 			0000010001e0c6cc
  ds_exec_end_op 			0000010001e0c66c
  ex_resolve_operands 		0000010001e0c5ec
  ex_resolve_to_value 		0000010001e0c59c
  ex_resolve_node_to_value 	0000010001e0c53c
  ex_read_data_from_field 	0000010001e0c4ac
  ex_extract_from_field 	0000010001e0c42c
  ex_field_datum_io 		0000010001e0c3ac
  ex_access_region 		0000010001e0c33c
  ev_address_space_dispatch 	0000010001e0c2bc
  ex_enter_interpreter 		0000010001e0c27c
  ut_acquire_mutex 		0000010001e0c21c
  os_wait_semaphore 		0000010001e0c1cc
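
As a rough check on the listing above, subtracting the last address from
the first gives the total stack consumed between acpi_init and
os_wait_semaphore in this debug build:

  0x10001e0defc - 0x10001e0c1cc = 0x1d30 = 7472 bytes

The same subtraction between successive ns_evaluate_relative entries
shows roughly what each nested method evaluation costs:

  0x10001e0db8c - 0x10001e0d30c = 0x880 = 2176 bytes
  0x10001e0d30c - 0x10001e0c95c = 0x9b0 = 2480 bytes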

Thanks
Stuart




* RE: stack overflow
@ 2004-03-09 21:04 Stuart_Hayes-DYMqY+WieiM
  0 siblings, 0 replies; 14+ messages in thread
From: Stuart_Hayes-DYMqY+WieiM @ 2004-03-09 21:04 UTC (permalink / raw)
  To: len.brown-ral2JQCrhuEAvxtiuMwx3w, ak-l3A5Bk7waGM
  Cc: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f


It seems to work with CONFIG_ACPI_DEBUG off.  I'm guessing we're just
squeaking by with that, though.  Wouldn't more complex ACPI methods 
cause the stack usage to go up, causing it to break again?

I don't think this will occur with i386, because the pointers are
half the size, but I'll try it as soon as I get a chance.

Thanks
Stuart



* RE: stack overflow
@ 2004-03-09 22:48 Moore, Robert
  0 siblings, 0 replies; 14+ messages in thread
From: Moore, Robert @ 2004-03-09 22:48 UTC (permalink / raw)
  To: Stuart_Hayes-DYMqY+WieiM, Brown, Len, ak-l3A5Bk7waGM
  Cc: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Can you please post your DSDT and give us some idea of which _STA method
is executing?

I seem to remember that there may be a bit of leftover recursion in the
operation region/field handling code, but something looks very odd about
the way the ev_pci_config_region_setup function is getting called twice.


> It seems to work with CONFIG_ACPI_DEBUG off.  I'm guessing we're just
> squeaking by with that, though.  Wouldn't more complex ACPI methods
> cause the stack usage to go up, causing it to break again?

I think this is an odd case (i.e., bug), since the interpreter has been
specifically architected to not use recursion -- however, the original
version of the interpreter did recurse based on the complexity of the
ASL code (when the interpreter was running as an application.)  This was
removed in all obvious cases, but I do think that there may be a couple
that were missed - fields being perhaps one of them.

Bob
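
For readers unfamiliar with the technique, removing recursion usually
means keeping the pending work in an explicit, heap-allocated stack so
that deeper input grows the heap rather than the call stack. The sketch
below is a generic illustration on a binary tree; struct node and
tree_sum_iterative() are hypothetical and are not taken from the ACPI
interpreter.

```c
#include <stdlib.h>

struct node {
        int value;
        struct node *left;
        struct node *right;
};

/*
 * Sum every value in the tree without recursing.  Pending nodes live in a
 * heap-allocated array, so the call stack depth stays constant no matter
 * how deep the tree is.  Returns 0 on success, -1 if memory runs out.
 */
int tree_sum_iterative(struct node *root, long *sum)
{
        size_t cap = 16, top = 0;
        struct node **pending = malloc(cap * sizeof(*pending));

        if (pending == NULL)
                return -1;

        *sum = 0;
        if (root != NULL)
                pending[top++] = root;

        while (top > 0) {
                struct node *n = pending[--top];

                *sum += n->value;

                /* Up to two children are pushed below; make room first. */
                if (top + 2 > cap) {
                        struct node **bigger =
                                realloc(pending, 2 * cap * sizeof(*pending));

                        if (bigger == NULL) {
                                free(pending);
                                return -1;
                        }
                        pending = bigger;
                        cap *= 2;
                }
                if (n->left != NULL)
                        pending[top++] = n->left;
                if (n->right != NULL)
                        pending[top++] = n->right;
        }

        free(pending);
        return 0;
}
```

The trade-off is that running out of memory becomes an ordinary error
return instead of a silent stack overflow.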


* Re: stack overflow
       [not found] ` <CE41BFEF2481C246A8DE0D2B4DBACF4F128AA4-novRXWwkcpil7xnNSM18fRtLTTO9Z+wMojBamW5iJbs@public.gmane.org>
  2004-03-08 18:26   ` Andi Kleen
@ 2004-03-10  4:33   ` Len Brown
       [not found]     ` <1078893223.2346.585.camel-D2Zvc0uNKG8@public.gmane.org>
  1 sibling, 1 reply; 14+ messages in thread
From: Len Brown @ 2004-03-10  4:33 UTC (permalink / raw)
  To: Stuart_Hayes-DYMqY+WieiM; +Cc: ACPI Developers, Robert Moore, Andi Kleen

On Mon, 2004-03-08 at 11:43, Stuart_Hayes-DYMqY+WieiM@public.gmane.org wrote:

> If I disable local interrupts while the ACPI stuff is being initialized, it 
> seems to make it through without failing.

hmmm, i wonder if the failure happens when ACPI is interrupted, or if
there is an issue with some ACPI code running in an interrupt?

re: the stack trace
I'm sure if Bob gets your DSDT he'll be able to address the recursion
issue at hand.

More interesting, perhaps, would be adding debugging code to
"gracefully" check for this failure.  There must be such DEBUG code
already built into the kernel someplace.

Sorting the list of stack frame sizes below shows
acpi_evaluate_integer() is the winner with 320 bytes on the stack.  Note
that this isn't from passing structures, but from allocating local
structures.  On i386 acpi_parse_object is 124 bytes, on x86_64 it will
be bigger...

 ./foo <stack.txt |sort -n
0 acpi_init
64 acpi_initialize_objects
64 ds_resolve_operands
64 ex_enter_interpreter
80 acpi_bus_init
80 ex_resolve_to_value
80 ex_resolve_to_value
80 ex_resolve_to_value
80 ns_execute_control_method
80 ns_execute_control_method
80 ns_execute_control_method
80 ns_initialize_devices
80 os_wait_semaphore
80 ps_parse_aml
80 ps_parse_aml
80 ps_parse_aml
96 ds_exec_end_op
96 ds_exec_end_op
96 ds_exec_end_op
96 ex_resolve_node_to_value
96 ex_resolve_node_to_value
96 ex_resolve_node_to_value
96 ns_init_one_device
96 ut_acquire_mutex
96 ut_evaluate_numeric_object
96 ut_execute_STA
112 ex_access_region
112 ex_access_region
112 ex_access_region
112 ns_evaluate_by_handle
112 ns_evaluate_by_handle
112 ns_evaluate_by_handle
112 psx_execute
112 psx_execute
112 psx_execute
112 ut_evaluate_object
112 ut_evaluate_object
128 acpi_evaluate_object
128 ev_address_space_dispatch
128 ev_address_space_dispatch
128 ev_address_space_dispatch
128 ev_pci_config_region_setup
128 ev_pci_config_region_setup
128 ex_extract_from_field
128 ex_extract_from_field
128 ex_extract_from_field
128 ex_field_datum_io
128 ex_field_datum_io
128 ex_field_datum_io
128 ex_resolve_operands
128 ex_resolve_operands
144 ex_read_data_from_field
144 ex_read_data_from_field
144 ex_read_data_from_field
144 ns_walk_namespace
208 ns_evaluate_relative
208 ns_evaluate_relative
208 ns_evaluate_relative
272 ps_parse_loop
272 ps_parse_loop
272 ps_parse_loop
320 acpi_evaluate_integer

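#include <stdio.h>
#include <stdlib.h>

/*
 * Read "name hex-address" pairs from stdin (one per traced function) and
 * print how far the stack address dropped since the previous entry.
 */
int main(void)
{
        char name[512];
        unsigned long long number;
        unsigned long long previous = 0;

        /* scanf returns the number of fields converted; expect both. */
        while (scanf("%511s %llx", name, &number) == 2)
        {
                unsigned long long delta;
                if (previous == 0) previous = number;

                delta = previous - number;
                printf("%llu %s\n", delta, name);
                previous = number;      /* next delta is relative to this entry */
        }
        return 0;
}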





* Re: stack overflow
       [not found]     ` <1078893223.2346.585.camel-D2Zvc0uNKG8@public.gmane.org>
@ 2004-03-10 13:32       ` Andi Kleen
       [not found]         ` <20040310133208.GC12272-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Andi Kleen @ 2004-03-10 13:32 UTC (permalink / raw)
  To: Len Brown
  Cc: Stuart_Hayes-DYMqY+WieiM, ACPI Developers, Robert Moore,
	Andi Kleen

On Tue, Mar 09, 2004 at 11:33:43PM -0500, Brown, Len wrote:
> On Mon, 2004-03-08 at 11:43, Stuart_Hayes-DYMqY+WieiM@public.gmane.org wrote:
> 
> > If I disable local interrupts while the ACPI stuff is being initialized, it 
> > seems to make it through without failing.
> 
> hmmm, i wonder if the failure happens when ACPI is interrupted, or if
> there is an issue with some ACPI code running in an interrupt?

x86-64 has separate interrupt stacks, so interrupts shouldn't be a problem.

> More interesting, perhaps, would be adding debugging code to
> "gracefully" check for this failure.  There must be such DEBUG code
> already built into the kernel someplace.

I have a patch for x86-64 (for 2.4, but it could be ported). But it's
quite a slowdown because it instruments every function.

If you have some recursion, either fix it or at least add an error exit
for when the stack gets too low. We can add a "stack_left" function
exported by the architecture.
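
A minimal sketch of what such a check could look like, under loud
assumptions: the stack is taken to be an 8 KB, size-aligned block that
grows down (the thread does not state the stack size), the 1 KB margin is
arbitrary, and the names stack_left and stack_too_low are hypothetical. A
real kernel version would read the stack pointer itself and account for
whatever lives at the bottom of the stack.

```c
#define STACK_SIZE   8192ULL  /* assumed: 8 KB, size-aligned stack */
#define STACK_MARGIN 1024ULL  /* hypothetical safety margin */

/* Bytes between addr and the low end of its size-aligned stack. */
static unsigned long long stack_left(unsigned long long addr)
{
        return addr & (STACK_SIZE - 1);
}

/* Nonzero when a nested evaluation should fail instead of going deeper. */
static int stack_too_low(unsigned long long addr)
{
        return stack_left(addr) < STACK_MARGIN;
}
```

Under those assumptions, the highest address in the posted function list
(0000010001e0defc) would have 7932 bytes left and the lowest
(0000010001e0c1cc) only 460, so a check like this would have tripped
before the deepest call.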

> 
> Sorting the list of stack frame sizes below shows
> acpi_evaluate_integer() is the winner with 320 bytes on the stack.  Note
> that this isn't from passing structures, but from allocating local
> structures.  On i386 acpi_parse_object is 124 bytes, on x86_64 it will
> be bigger...

I would suggest fixing anything > 100 bytes at least
(and double-checking anything that could be expanded on 64-bit).

-Andi



^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: stack overflow
@ 2004-03-10 15:56 Stuart_Hayes-DYMqY+WieiM
  0 siblings, 0 replies; 14+ messages in thread
From: Stuart_Hayes-DYMqY+WieiM @ 2004-03-10 15:56 UTC (permalink / raw)
  To: robert.moore-ral2JQCrhuEAvxtiuMwx3w,
	len.brown-ral2JQCrhuEAvxtiuMwx3w, ak-l3A5Bk7waGM
  Cc: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f


Here is the relevant method, and all of the fields, regions, and
methods needed to execute it (I think).  If this isn't sufficient,
let me know.  The method being run is ...VPR0.D0F0._STA.

This is cut from the output from Phoenix's "ad" program, and it is 
showing both the disassembled code and the AML itself.  Sorry about
the long lines.

Thanks
Stuart

    Scope/*0x10*//*0x85,0x31,0x03*/(\_SB/*0x5C,0x5F,0x53,0x42,0x5F*/)
    {
        Device/*0x5B,0x82*//*0x8D,0xD0,0x02*/(PCI0/*0x50,0x43,0x49,0x30*/)
        {
            Device/*0x5B,0x82*//*0x4C,0x89*/(VPR0/*0x56,0x50,0x52,0x30*/)
            {
                Name/*0x08*/(DEVN/*0x44,0x45,0x56,0x4E*/,/*0x0A*/0x00/*0x00*/)
                Method/*0x14*//*0x3B*/(_ADR/*0x5F,0x41,0x44,0x52*/,0,NotSerialized/*0x00*/)
                {
                    Store/*0x70*/(/*0x0C*/0x00060000/*0x00,0x00,0x06,0x00*/,Local0/*0x60*/)
                    Store/*0x70*/(\_SB.PCI0.ISA.CPTP/*0x5C,0x2F,0x04,0x5F,0x53,0x42,0x5F,0x50,0x43,0x49,0x30,0x49,0x53,0x41,0x5F,0x43,0x50,0x54,0x50*/,Local1/*0x61*/)
                    If/*0xA0*//*0x0C*/(LEqual/*0x93*/(Local1/*0x61*/,/*0x0A*/0x01/*0x01*/))
                    {
                        Store/*0x70*/(/*0x0C*/0x000A0000/*0x00,0x00,0x0A,0x00*/,Local0/*0x60*/)
                    }
                    Store/*0x70*/(Local0/*0x60*/,DEVN/*0x44,0x45,0x56,0x4E*/ /* \_SB.PCI0.VPR0.DEVN */)
                    Store/*0x70*/(Local0/*0x60*/,Debug/*0x5B,0x31*/)
                    Return/*0xA4*/(Local0/*0x60*/)
                }
                Method/*0x14*//*0x21*/(MADR/*0x4D,0x41,0x44,0x52*/,1,NotSerialized/*0x01*/)
                {
                    Store/*0x70*/(Arg0/*0x68*/,Local0/*0x60*/)
                    If/*0xA0*//*0x0B*/(And/*0x7B*/(SCPL/*0x53,0x43,0x50,0x4C*/ /* \_SB.PCI0.VPR0.SCPL */,/*0x0A*/0x40/*0x40*//*0x00*/))
                    {
                        Return/*0xA4*/(Local0/*0x60*/)
                    }
                    Else/*0xA1*//*0x0B*/
                    {
                        Or/*0x7D*/(Local0/*0x60*/,/*0x0C*/0x001F0000/*0x00,0x00,0x1F,0x00*/,Local0/*0x60*/)
                        Return/*0xA4*/(Local0/*0x60*/)
                    }
                }

                OperationRegion(NBCF,PCI_Config,0x00,0x0100)
                Field(NBCF /* \_SB.PCI0.VPR0.NBCF */,ByteAcc,NoLock,Preserve)
                {
                    Offset(0x70),
                    Offset(0x73),
                    PNUM,8,
                    Offset(0x78),
                    SCPL,16,
                    SCPH,16,
                    SCTL,8,
                    SCTH,8,
                    SSTA,8,
                    Offset(0x80),
                    RPCT,8
                }

                Method/*0x14*//*0x15*/(MSTA/*0x4D,0x53,0x54,0x41*/,1,NotSerialized/*0x01*/)
                {
                    If/*0xA0*//*0x09*/(LEqual/*0x93*/(Arg0/*0x68*/,/*0x0B*/0xFFFF/*0xFF,0xFF*/))
                    {
                        Return/*0xA4*/(/*0x0A*/0x00/*0x00*/)
                    }
                    Else/*0xA1*//*0x04*/
                    {
                        Return/*0xA4*/(/*0x0A*/0x0F/*0x0F*/)
                    }
                }

                Device/*0x5B,0x82*//*0x4C,0x05*/(D0F0/*0x44,0x30,0x46,0x30*/)
                {
                    Method/*0x14*//*0x0D*/(_ADR/*0x5F,0x41,0x44,0x52*/,0,NotSerialized/*0x00*/)
                    {
                        Return/*0xA4*/(MADR/*0x4D,0x41,0x44,0x52*/ /* \_SB.PCI0.VPR0.MADR */(/*0x0A*/0x00/*0x00*/))
                    }
                    Method/*0x14*//*0x0F*/(_SUN/*0x5F,0x53,0x55,0x4E*/,0,NotSerialized/*0x00*/)
                    {
                        Return/*0xA4*/(ShiftRight/*0x7A*/(SCPH/*0x53,0x43,0x50,0x48*/ /* \_SB.PCI0.VPR0.SCPH */,/*0x0A*/0x03/*0x03*//*0x00*/))
                    }
                    OperationRegion/*0x5B,0x80*/(SCFG/*0x53,0x43,0x46,0x47*/,PCI_Config/*0x02*/,/*0x0A*/0x00/*0x00*/,/*0x0B*/0x0100/*0x00,0x01*/)
                    Field/*0x5B,0x81*//*0x0B*/(SCFG/*0x53,0x43,0x46,0x47*/ /* \_SB.PCI0.VPR0.D0F0.SCFG */,WordAcc,NoLock,Preserve/*0x02*/)
                    {
                        VDID/*0x56,0x44,0x49,0x44*//*0x10*/,16
                    }
                    Method/*0x14*//*0x0F*/(_STA/*0x5F,0x53,0x54,0x41*/,0,NotSerialized/*0x00*/)
                    {
                        Return/*0xA4*/(MSTA/*0x4D,0x53,0x54,0x41*/ /* \_SB.PCI0.VPR0.MSTA */(VDID/*0x56,0x44,0x49,0x44*/ /* \_SB.PCI0.VPR0.D0F0.VDID */))
                    }
                    Method/*0x14*//*0x0E*/(_EJ0/*0x5F,0x45,0x4A,0x30*/,1,NotSerialized/*0x01*/)
                    {
                        PWCM/*0x50,0x57,0x43,0x4D*/ /* \_SB.PCI0.VPR0.PWCM */(/*0x0A*/0xF8/*0xF8*/,/*0x0A*/0x07/*0x07*/)
                    }
                }
            }
        }
    }






Moore, Robert wrote:
> Can you please post your DSDT and give us some idea of which _STA
> method is executing?
> 
> I seem to remember that there may be a bit of leftover recursion in
> the operation region/field handling code, but something looks very
> odd about the way the ev_pci_config_region_setup function is getting
> called twice. 
> 
> 
>>> It seems to work with CONFIG_ACPI_DEBUG off.  I'm guessing we're
>>> just squeaking by with that, though.  Wouldn't more complex ACPI
>>> methods cause the stack usage to go up, causing it to break again?
> 
> I think this is an odd case (i.e., bug), since the interpreter has
> been specifically architected to not use recursion -- however, the
> original version of the interpreter did recurse based on the
> complexity of the ASL code (when the interpreter was running as an
> application.)  This was removed in all obvious cases, but I do think
> that there may be a couple that were missed - fields being perhaps
> one of them. 
> 
> Bob
> 
> -----Original Message-----
> From: acpi-devel-admin-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
> [mailto:acpi-devel-admin-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org] On Behalf Of
> Stuart_Hayes-DYMqY+WieiM@public.gmane.org
> Sent: Tuesday, March 09, 2004 12:00 PM
> To: Brown, Len; ak-l3A5Bk7waGM@public.gmane.org
> Cc: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
> Subject: RE: [ACPI] stack overflow
> 
> 
> I am in the process of trying this without CONFIG_ACPI_DEBUG now.
> 
> I put a little extra debug stuff in utilities/utdebug.c to keep track
> of all the nested functions, and added the ACPI_FUNCTION_TRACE to some
> more of the functions, and this is what I get when the stack pointer
> is lowest (the number by each is the address of the first argument of
> the acpi_ut_trace(_*) function, which I store along with the function
> name).  This gives a pretty good picture of what's going on with the
> recursion and the deep nesting of functions.
> 
>   acpi_init 			0000010001e0defc
>   acpi_bus_init 			0000010001e0deac
>   acpi_initialize_objects 	0000010001e0de6c
>   ns_initialize_devices 	0000010001e0de1c
>   ns_walk_namespace 		0000010001e0dd8c
>   ns_init_one_device 		0000010001e0dd2c
>   ut_execute_STA 			0000010001e0dccc
>   ut_evaluate_object 		0000010001e0dc5c
>   ns_evaluate_relative 		0000010001e0db8c
>   ns_evaluate_by_handle 	0000010001e0db1c
>   ns_execute_control_method 	0000010001e0dacc
>   psx_execute 			0000010001e0da5c
>   ps_parse_aml 			0000010001e0da0c
>   ps_parse_loop 			0000010001e0d8fc
>   ds_exec_end_op 			0000010001e0d89c
>   ds_resolve_operands 		0000010001e0d85c
>   ex_resolve_to_value 		0000010001e0d80c
>   ex_resolve_node_to_value 	0000010001e0d7ac
>   ex_read_data_from_field 	0000010001e0d71c
>   ex_extract_from_field 	0000010001e0d69c
>   ex_field_datum_io 		0000010001e0d61c
>   ex_access_region 		0000010001e0d5ac
>   ev_address_space_dispatch 	0000010001e0d52c
>   ev_pci_config_region_setup 	0000010001e0d4ac
>   ut_evaluate_numeric_object 	0000010001e0d44c
>   ut_evaluate_object 		0000010001e0d3dc
>   ns_evaluate_relative 		0000010001e0d30c
>   ns_evaluate_by_handle 	0000010001e0d29c
>   ns_execute_control_method 	0000010001e0d24c
>   psx_execute 			0000010001e0d1dc
>   ps_parse_aml 			0000010001e0d18c
>   ps_parse_loop 			0000010001e0d07c
>   ds_exec_end_op 			0000010001e0d01c
>   ex_resolve_operands 		0000010001e0cf9c
>   ex_resolve_to_value 		0000010001e0cf4c
>   ex_resolve_node_to_value 	0000010001e0ceec
>   ex_read_data_from_field 	0000010001e0ce5c
>   ex_extract_from_field 	0000010001e0cddc
>   ex_field_datum_io 		0000010001e0cd5c
>   ex_access_region 		0000010001e0ccec
>   ev_address_space_dispatch 	0000010001e0cc6c
>   ev_pci_config_region_setup 	0000010001e0cbec
>   acpi_evaluate_integer 	0000010001e0caac
>   acpi_evaluate_object 		0000010001e0ca2c
>   ns_evaluate_relative 		0000010001e0c95c
>   ns_evaluate_by_handle 	0000010001e0c8ec
>   ns_execute_control_method 	0000010001e0c89c
>   psx_execute 			0000010001e0c82c
>   ps_parse_aml 			0000010001e0c7dc
>   ps_parse_loop 			0000010001e0c6cc
>   ds_exec_end_op 			0000010001e0c66c
>   ex_resolve_operands 		0000010001e0c5ec
>   ex_resolve_to_value 		0000010001e0c59c
>   ex_resolve_node_to_value 	0000010001e0c53c
>   ex_read_data_from_field 	0000010001e0c4ac
>   ex_extract_from_field 	0000010001e0c42c
>   ex_field_datum_io 		0000010001e0c3ac
>   ex_access_region 		0000010001e0c33c
>   ev_address_space_dispatch 	0000010001e0c2bc
>   ex_enter_interpreter 		0000010001e0c27c
>   ut_acquire_mutex 		0000010001e0c21c
>   os_wait_semaphore 		0000010001e0c1cc
> 
> Thanks
> Stuart
> 
> 
> 
> Len Brown wrote:
>> Stuart,
>> Does CONFIG_ACPI_DEBUG change the results of your measurements?
>> 
>> Is it possible to run an i386 kernel on the same system to see if
>> we've got an x86_64-specific issue?
>> 
>> There is some run-time stack tracing code in ACPI (see
>> acpi_gbl_lowest_stack_pointer) but it hasn't been used in a while.
>> 
>> thanks,
>> -Len
>> 
>> On Mon, 2004-03-08 at 13:26, Andi Kleen wrote:
>>>> Here are some of the reasons I believe the stack is overflowing:
>>>> 
>>>> I've added some "printk"s to the kernel, and I've found that the
>>>> stack pointer goes down by ~6K between
>>>> namespace/nseval.c:acpi_ns_evaluate_relative() and
>>>> executer/exstore.c:acpi_ex_store().
>>> 
>>> The usual way to start is to do
>>> 
>>> 	objdump -S <acpi object modules> | grep sub.*rsp
>>> 
>>> then sort by the biggest stack pigs and fix them one by one (e.g.
>>> by kmallocing local data instead of allocating it on the stack).
>>> If the problem still occurs afterwards, it is most likely recursion
>>> or too-deep nesting.  I have an old 2.4 patch that can catch these,
>>> but it would need porting to 2.6.
>>> 
>>> -Andi
>>> 
>>> 
> 
> 
> 
> 





^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: stack overflow
@ 2004-03-10 17:08 Stuart_Hayes-DYMqY+WieiM
       [not found] ` <CE41BFEF2481C246A8DE0D2B4DBACF4F020E5FD9-novRXWwkcpil7xnNSM18fRtLTTO9Z+wMojBamW5iJbs@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Stuart_Hayes-DYMqY+WieiM @ 2004-03-10 17:08 UTC (permalink / raw)
  To: robert.moore-ral2JQCrhuEAvxtiuMwx3w,
	len.brown-ral2JQCrhuEAvxtiuMwx3w, ak-l3A5Bk7waGM
  Cc: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f


Here is the console output from just before the stack pointer reached
its lowest value (the point where the system output the list of ACPI
functions that I sent earlier).  You can see from this which methods
were being executed, and where the code was in evaluating
...VPR0.D0F0._STA.

Also, I should probably mention that this is a 2.4 kernel.  It's
RHEL3, which is based on 2.4.21, but I've tested this with the
latest 2.4 kernel (2.4.25), and the results were the same.

Thanks
Stuart


 in acpi_ns_init_one_device for D0F0 (ptr=0000010001ddacf0)
.  in acpi_ns_evaluate_relative to execute D0F0, method _STA
  ...internalized path=_STA
  ...looked up node, pointer=0000010001dd91f0
  (in apci_ns_execute_control_method _STA)
  (method aml: a4 4d 53 54 41 56 44 49 44 )
  in acpi_ns_evaluate_relative to execute D0F0, method _ADR
  ...internalized path=_ADR
  ...looked up node, pointer=0000010001ddad70
  (in apci_ns_execute_control_method _ADR)
  (method aml: a4 4d 41 44 52 a 0 )
  in acpi_ns_evaluate_relative to execute VPR0, method _ADR
  ...internalized path=_ADR
  ...looked up node, pointer=0000010001ddbaf0
  (in apci_ns_execute_control_method _ADR)
  (method aml: 70 c 0 0 6 0 60 70 5c 2f 4 5f 53 42 5f 50 43 49 30 49 53
41 5f 43 50 54 50 61 a0 c 93 61 a 1 70 c 0 0 a 0 60 70 60 44 45 56 4e 70
60 5b 31 a4 60 )
[ACPI Debug] Integer: 0000000000060000
...exiting acpi_ns_evaluate_relative for obj=_ADR
  in acpi_ns_evaluate_relative to execute PCI0, method _SEG
  ...internalized path=_SEG
  ...looked up node, pointer=0000000000000000
[PCI0._SEG] was not found
  in acpi_ns_evaluate_relative to execute PCI0, method _BBN
  ...internalized path=_BBN
  ...looked up node, pointer=0000010001de64f0
...exiting acpi_ns_evaluate_relative for obj=_BBN
  in acpi_ns_evaluate_relative to execute VPR0, method _ADR
  ...internalized path=_ADR
  ...looked up node, pointer=0000010001ddbaf0
  (in apci_ns_execute_control_method _ADR)
  (method aml: 70 c 0 0 6 0 60 70 5c 2f 4 5f 53 42 5f 50 43 49 30 49 53
41 5f 43 50 54 50 61 a0 c 93 61 a 1 70 c 0 0 a 0 60 70 60 44 45 56 4e 70
60 5b 31 a4 60 

--(function list I sent earlier was here)--
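
The "method aml" byte dumps above can be matched against the disassembly
posted earlier by reading each 4-byte run as an ACPI name segment. The
sketch below is only an illustration of that reading, not interpreter
code: the helper name_seg and the constant AML_RETURN_OP are made up for
this example, and the byte values are copied from the D0F0._STA dump
(0xA4 is the byte shown next to Return in the disassembly).

```c
#include <stddef.h>
#include <string.h>

#define AML_RETURN_OP 0xa4  /* byte shown next to Return in the disassembly */

/* Copy the 4-character name segment at aml[pos] into out as a C string. */
static void name_seg(const unsigned char *aml, size_t pos, char out[5])
{
        memcpy(out, aml + pos, 4);
        out[4] = '\0';
}

/* AML of D0F0._STA as printed in the console output. */
static const unsigned char sta_aml[] = {
        0xa4, 0x4d, 0x53, 0x54, 0x41, 0x56, 0x44, 0x49, 0x44
};
```

Reading the segments at offsets 1 and 5 gives MSTA and VDID, which
matches the disassembled Return(MSTA(VDID)) for _STA.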







^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: stack overflow
       [not found] ` <CE41BFEF2481C246A8DE0D2B4DBACF4F020E5FD9-novRXWwkcpil7xnNSM18fRtLTTO9Z+wMojBamW5iJbs@public.gmane.org>
@ 2004-03-10 18:40   ` Len Brown
  0 siblings, 0 replies; 14+ messages in thread
From: Len Brown @ 2004-03-10 18:40 UTC (permalink / raw)
  To: Stuart_Hayes-DYMqY+WieiM; +Cc: Robert Moore, Andi Kleen, ACPI Developers

Stuart,
not that I expect it to make a difference, but note that you can update
to the latest ACPI interpreter by applying the ACPI patch here:

http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/

I do have a patch there for 2.4.25.

Thanks for all the info -- I'm sure that Bob will be able to use it to
find out why the interpreter erroneously went recursive.

thanks,
-Len

On Wed, 2004-03-10 at 12:08, Stuart_Hayes-DYMqY+WieiM@public.gmane.org wrote:
> Here is the console output that occurred right before the lowest
> stack pointer occurred (the point where the system output the list
> of ACPI functions that I sent earlier).  You can see from this
> which methods were being executed, and where the code was in
> evaluating ...VPR0.D0F0._STA.
> 
> Also, I should probably mention that this is a 2.4 kernel.  It's
> RHEL3, which is based on 2.4.21, but I've tested this with the
> latest 2.4 kernel (2.4.25), and the results were the same.
> 
> Thanks
> Stuart
> 
> 
>  in acpi_ns_init_one_device for D0F0 (ptr=0000010001ddacf0)
> .  in acpi_ns_evaluate_relative to execute D0F0, method _STA
>   ...internalized path=_STA
>   ...looked up node, pointer=0000010001dd91f0
>   (in apci_ns_execute_control_method _STA)
>   (method aml: a4 4d 53 54 41 56 44 49 44 )
>   in acpi_ns_evaluate_relative to execute D0F0, method _ADR
>   ...internalized path=_ADR
>   ...looked up node, pointer=0000010001ddad70
>   (in apci_ns_execute_control_method _ADR)
>   (method aml: a4 4d 41 44 52 a 0 )
>   in acpi_ns_evaluate_relative to execute VPR0, method _ADR
>   ...internalized path=_ADR
>   ...looked up node, pointer=0000010001ddbaf0
>   (in apci_ns_execute_control_method _ADR)
>   (method aml: 70 c 0 0 6 0 60 70 5c 2f 4 5f 53 42 5f 50 43 49 30 49 53 41
> 5f 43 50 54 50 61 a0 c 93 61 a 1 70 c 0 0 a 0 60 70 60 44 45 56 4e 70 60 5b
> 31 a4 60 )
> [ACPI Debug] Integer: 0000000000060000
> ...exiting acpi_ns_evaluate_relative for obj=_ADR
>   in acpi_ns_evaluate_relative to execute PCI0, method _SEG
>   ...internalized path=_SEG
>   ...looked up node, pointer=0000000000000000
> [PCI0._SEG] was not found
>   in acpi_ns_evaluate_relative to execute PCI0, method _BBN
>   ...internalized path=_BBN
>   ...looked up node, pointer=0000010001de64f0
> ...exiting acpi_ns_evaluate_relative for obj=_BBN
>   in acpi_ns_evaluate_relative to execute VPR0, method _ADR
>   ...internalized path=_ADR
>   ...looked up node, pointer=0000010001ddbaf0
>   (in apci_ns_execute_control_method _ADR)
>   (method aml: 70 c 0 0 6 0 60 70 5c 2f 4 5f 53 42 5f 50 43 49 30 49 53 41
> 5f 43 50 54 50 61 a0 c 93 61 a 1 70 c 0 0 a 0 60 70 60 44 45 56 4e 70 60 5b
> 31 a4 60 
> 
> --(function list I sent earlier was here)--
> 
> 
> Hayes, Stuart wrote:
> > Here is the relevant method, and all of the fields, regions, and
> > methods needed to execute it (I think).  If this isn't sufficient,
> > let me know.  The method being run is ...VPR0.D0F0._STA.
> > 
> > This is cut from the output from Phoenix's "ad" program, and it is
> > showing both the disassembled code and the AML itself.  Sorry about
> > the long lines.
> > 
> > Thanks
> > Stuart
> > 
> > 
> > 
> > 
> > Moore, Robert wrote:
> >> Can you please post your DSDT and give us some idea of which _STA
> >> method is executing? 
> >> 
> >> I seem to remember that there may be a bit of leftover recursion in
> >> the operation region/field handling code, but something looks very
> >> odd about the way the ev_pci_config_region_setup function is getting
> >> called twice. 
> >> 
> >> 
> >>>> It seems to work with CONFIG_ACPI_DEBUG off.  I'm guessing we're
> >>>> just
> >> squeaking by with that, though.  Wouldn't more complex ACPI methods
> >> cause the stack usage to go up, causing it to break again?
> >> 
> >> I think this is an odd case (i.e., bug), since the interpreter has
> >> been specifically architected to not use recursion -- however, the
> >> original version of the interpreter did recurse based on the
> >> complexity of the ASL code (when the interpreter was running as an
> >> application.)  This was removed in all obvious cases, but I do think
> >> that there may be a couple that were missed - fields being perhaps
> >> one of them. 
> >> 
> >> Bob
> >> 
> >> -----Original Message-----
> >> From: acpi-devel-admin-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
> >> [mailto:acpi-devel-admin-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org] On Behalf Of
> >> Stuart_Hayes-DYMqY+WieiM@public.gmane.org Sent: Tuesday, March 09, 2004 12:00 PM
> >> To: Brown, Len; ak-l3A5Bk7waGM@public.gmane.org
> >> Cc: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
> >> Subject: RE: [ACPI] stack overflow
> >> 
> >> 
> >> I am in the process of trying this without CONFIG_ACPI_DEBUG now.
> >> 
> >> I put a little extra debug stuff in utilities/utdebug.c to keep track
> >> of all the nested functions, and added the ACPI_FUNCTION_TRACE to
> >> some more of the functions, and this is what I get when the stack
> >> pointer is lowest (the number by each is the address of the first
> >> argument of the acpi_ut_trace(_*) function, which I store along with
> >> the function name).  This gives a pretty good picture of what's
> >> going on with the recursion and the deep nesting of functions.
> >> 
> >>   acpi_init 			0000010001e0defc
> >>   acpi_bus_init 			0000010001e0deac
> >>   acpi_initialize_objects 	0000010001e0de6c
> >>   ns_initialize_devices 	0000010001e0de1c
> >>   ns_walk_namespace 		0000010001e0dd8c
> >>   ns_init_one_device 		0000010001e0dd2c
> >>   ut_execute_STA 			0000010001e0dccc
> >>   ut_evaluate_object 		0000010001e0dc5c
> >>   ns_evaluate_relative 		0000010001e0db8c
> >>   ns_evaluate_by_handle 	0000010001e0db1c
> >>   ns_execute_control_method 	0000010001e0dacc
> >>   psx_execute 			0000010001e0da5c
> >>   ps_parse_aml 			0000010001e0da0c
> >>   ps_parse_loop 			0000010001e0d8fc
> >>   ds_exec_end_op 			0000010001e0d89c
> >>   ds_resolve_operands 		0000010001e0d85c
> >>   ex_resolve_to_value 		0000010001e0d80c
> >>   ex_resolve_node_to_value 	0000010001e0d7ac
> >>   ex_read_data_from_field 	0000010001e0d71c
> >>   ex_extract_from_field 	0000010001e0d69c
> >>   ex_field_datum_io 		0000010001e0d61c
> >>   ex_access_region 		0000010001e0d5ac
> >>   ev_address_space_dispatch 	0000010001e0d52c
> >>   ev_pci_config_region_setup 	0000010001e0d4ac
> >>   ut_evaluate_numeric_object 	0000010001e0d44c
> >>   ut_evaluate_object 		0000010001e0d3dc
> >>   ns_evaluate_relative 		0000010001e0d30c
> >>   ns_evaluate_by_handle 	0000010001e0d29c
> >>   ns_execute_control_method 	0000010001e0d24c
> >>   psx_execute 			0000010001e0d1dc
> >>   ps_parse_aml 			0000010001e0d18c
> >>   ps_parse_loop 			0000010001e0d07c
> >>   ds_exec_end_op 			0000010001e0d01c
> >>   ex_resolve_operands 		0000010001e0cf9c
> >>   ex_resolve_to_value 		0000010001e0cf4c
> >>   ex_resolve_node_to_value 	0000010001e0ceec
> >>   ex_read_data_from_field 	0000010001e0ce5c
> >>   ex_extract_from_field 	0000010001e0cddc
> >>   ex_field_datum_io 		0000010001e0cd5c
> >>   ex_access_region 		0000010001e0ccec
> >>   ev_address_space_dispatch 	0000010001e0cc6c
> >>   ev_pci_config_region_setup 	0000010001e0cbec
> >>   acpi_evaluate_integer 	0000010001e0caac
> >>   acpi_evaluate_object 		0000010001e0ca2c
> >>   ns_evaluate_relative 		0000010001e0c95c
> >>   ns_evaluate_by_handle 	0000010001e0c8ec
> >>   ns_execute_control_method 	0000010001e0c89c
> >>   psx_execute 			0000010001e0c82c
> >>   ps_parse_aml 			0000010001e0c7dc
> >>   ps_parse_loop 			0000010001e0c6cc
> >>   ds_exec_end_op 			0000010001e0c66c
> >>   ex_resolve_operands 		0000010001e0c5ec
> >>   ex_resolve_to_value 		0000010001e0c59c
> >>   ex_resolve_node_to_value 	0000010001e0c53c
> >>   ex_read_data_from_field 	0000010001e0c4ac
> >>   ex_extract_from_field 	0000010001e0c42c
> >>   ex_field_datum_io 		0000010001e0c3ac
> >>   ex_access_region 		0000010001e0c33c
> >>   ev_address_space_dispatch 	0000010001e0c2bc
> >>   ex_enter_interpreter 		0000010001e0c27c
> >>   ut_acquire_mutex 		0000010001e0c21c
> >>   os_wait_semaphore 		0000010001e0c1cc
> >> 
> >> Thanks
> >> Stuart
> >> 
> >> 
> >> 
> >> Len Brown wrote:
> >>> Stuart,
> >>> Does CONFIG_ACPI_DEBUG change the results of your measurements?
> >>> 
> >>> Is it possible to run an i386 kernel on the same system to see if
> >>> we've got an x86_64-specific issue?
> >>> 
> >>> There is some run-time stack tracing code in ACPI (see
> >>> acpi_gbl_lowest_stack_pointer) but it hasn't been used in a while.
> >>> 
> >>> thanks,
> >>> -Len
> >>> 
> >>> On Mon, 2004-03-08 at 13:26, Andi Kleen wrote:
> >>>>> Here are some of the reasons I believe the stack is overflowing:
> >>>>> 
> >>>>> I've added some "printk"s to the kernel, and I've found that the
> >>>>> stack pointer goes down by ~6K between
> >>>>> namespace/nseval.c:acpi_ns_evaluate_relative() and
> >>>>> executer/exstore.c:acpi_ex_store().
> >>>> 
> >>>> The usual way to start is to do
> >>>> 
> >>>> 	objdump -S <acpi object modules> | grep sub.*rsp
> >>>> 
> >>>> then sort by the biggest stack pigs and fix them one by one (e.g.
> >>>> by kmallocing local data instead of allocating it on the stack).
> >>>> If the problem still occurs afterwards, it is most likely
> >>>> recursion or too deep nesting.  I have an old 2.4 patch that can
> >>>> catch these, but it would need porting to 2.6.
> >>>> 
> >>>> -Andi
> 





* Re: stack overflow
       [not found]         ` <20040310133208.GC12272-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
@ 2004-03-10 18:44           ` Len Brown
  0 siblings, 0 replies; 14+ messages in thread
From: Len Brown @ 2004-03-10 18:44 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Stuart_Hayes-DYMqY+WieiM, ACPI Developers, Robert Moore

On Wed, 2004-03-10 at 08:32, Andi Kleen wrote:

> If you have some recursion either fix it or at least add an error
> out when the stack gets too low. We can add a "stack_left" function
> exported by the architecture.

I think we'll want to put some run-time sanity checks for illegal
recursion into the interpreter.  That should be less invasive than doing
the full-blown stack check -- which can be enabled as a separate DEBUG
test when needed.
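
As a rough illustration of the "stack_left" idea, here is a minimal
sketch in plain user-space C.  It assumes a stack that is aligned to
its own power-of-two size, with a reserved area at the low end; the
function names, the parameters and that layout are all hypothetical
and are not an existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes left between the stack pointer `sp` and the top of the reserved
 * area at the low end of the stack.  Assumes (hypothetically) that the
 * stack is `stack_size` bytes, aligned to `stack_size`, and grows down.
 */
size_t sketch_stack_left(uintptr_t sp, size_t stack_size, size_t reserved)
{
    uintptr_t base, floor;

    /* stack_size must be a power of two for the mask below to work */
    assert(stack_size != 0 && (stack_size & (stack_size - 1)) == 0);
    assert(reserved < stack_size);

    base  = sp & ~((uintptr_t)stack_size - 1); /* lowest stack address */
    floor = base + reserved;                   /* first usable byte */

    return sp > floor ? (size_t)(sp - floor) : 0;
}

/* Nonzero when a caller that needs `need` more bytes should error out. */
int sketch_stack_too_low(uintptr_t sp, size_t stack_size,
                         size_t reserved, size_t need)
{
    return sketch_stack_left(sp, stack_size, reserved) < need;
}
```

With an 8192-byte stack, 2784 bytes reserved and a stack pointer 3000
bytes above the stack base, sketch_stack_left() reports 216 bytes, so a
caller asking for 512 bytes would be told to error out.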

> > Sorting the list of stack frame sizes below shows
> > acpi_evaluate_integer() is the winner with 320 bytes on the stack.  Note
> > that this isn't from passing structures, but from allocating local
> > structures.  On i386 acpi_parse_object is 124 bytes, on x86_64 it will
> > be bigger...
> 
> I would suggest to fix anything > 100 bytes at least 
> (and double check anything that could be expanded on 64bit) 

Agreed, Bob and I will fix the big stack users.
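
For reference, per-function stack usage can also be estimated from a
trace that prints the stack pointer in each function, like the one
quoted earlier in this thread.  The sketch below takes (function, stack
pointer) samples in call order, treats the gap between a caller's
sample and its callee's sample as the callee's usage, and sorts the
result with the largest first.  That attribution is an assumption
about where the samples were taken, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct sketch_frame {
    const char *func;  /* function name as printed in the trace */
    uint64_t    sp;    /* stack pointer value printed next to it */
    uint64_t    bytes; /* filled in by sketch_frame_sizes() */
};

/* qsort comparator: largest stack user first */
int sketch_cmp_bytes_desc(const void *a, const void *b)
{
    const struct sketch_frame *fa = a;
    const struct sketch_frame *fb = b;

    if (fa->bytes == fb->bytes)
        return 0;
    return fa->bytes < fb->bytes ? 1 : -1;
}

/*
 * f[0] is the outermost call and f[n-1] the innermost.  Each entry gets
 * the distance from its caller's sample (assumption: that gap belongs to
 * the callee); f[0] has no caller sample, so it gets 0.  The array is
 * then sorted in place, biggest first.
 */
void sketch_frame_sizes(struct sketch_frame *f, size_t n)
{
    size_t i;

    assert(f != NULL && n > 0);

    f[0].bytes = 0;
    for (i = 1; i < n; i++) {
        assert(f[i].sp < f[i - 1].sp); /* the stack grows downward */
        f[i].bytes = f[i - 1].sp - f[i].sp;
    }
    qsort(f, n, sizeof *f, sketch_cmp_bytes_desc);
}
```

Fed the five consecutive samples from ev_pci_config_region_setup
(...cbec) down to ns_evaluate_by_handle (...c8ec) in that trace, the
largest gap is the 320 bytes in front of acpi_evaluate_integer.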

thanks,
-Len






* RE: stack overflow
@ 2004-03-22 17:40 Stuart_Hayes-DYMqY+WieiM
  0 siblings, 0 replies; 14+ messages in thread
From: Stuart_Hayes-DYMqY+WieiM @ 2004-03-22 17:40 UTC (permalink / raw)
  To: acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: robert.moore-ral2JQCrhuEAvxtiuMwx3w

Len Brown wrote:
> On Wed, 2004-03-10 at 08:32, Andi Kleen wrote:
> 
>> If you have some recursion either fix it or at least add an error
>> out when the stack gets too low. We can add a "stack_left" function
>> exported by the architecture.
> 
> I think we'll want to put some run-time sanity checks for illegal
> recursion into the interpreter.  That should be less invasive than
> doing the full-blown stack check -- which can be enabled as a
> separate DEBUG test when needed.
> 
>>> Sorting the list of stack frame sizes below shows
>>> acpi_evaluate_integer() is the winner with 320 bytes on the stack. 
>>> Note that this isn't from passing structures, but from allocating
>>> local structures.  On i386 acpi_parse_object is 124 bytes, on
>>> x86_64 it will be bigger...
>> 
>> I would suggest to fix anything > 100 bytes at least
>> (and double check anything that could be expanded on 64bit)
> 
> Agreed, Bob and I will fix the big stack users.
> 
> thanks,
> -Len

Thanks for all the help.  Robert Moore identified a big part of the 
problem (recursion):

(quoting Robert)
"The issue is how PCI_Config operation regions are initialized.

1) a _STA accesses a PCI_Config space field

2) This eventually causes transfer to ev_pci_config_region_setup

3) ev_pci_config_region_setup attempts to resolve the _ADR object

4) The implementation of _ADR in the ASL accesses PCI_Config space

5) This in turn causes another call to ev_pci_config_region_setup"
(end of quote)
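
The five steps above can be modelled in a few lines of user-space C,
together with one possible shape for a run-time check against the
re-entry in step 5.  This is only a sketch: the depth limit, the status
codes and every sketch_* name are hypothetical, the counter would need
to be per-thread in real code, and this is not the change that was made
to the interpreter.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_OK              0
#define SKETCH_ERR_RECURSION (-1)
#define SKETCH_MAX_SETUP_DEPTH 1  /* hypothetical: no nested setup at all */

/* How many region setups are currently in progress. */
int sketch_setup_depth;

/* Stand-in for "evaluate the _ADR object of the device". */
typedef int (*sketch_adr_fn)(void);

/* Model of steps 2-3: region setup that has to resolve _ADR. */
int sketch_region_setup(sketch_adr_fn eval_adr)
{
    int status;

    assert(eval_adr != NULL);

    /* Step 5 guard: refuse to start a setup inside another setup. */
    if (sketch_setup_depth >= SKETCH_MAX_SETUP_DEPTH)
        return SKETCH_ERR_RECURSION;

    sketch_setup_depth++;
    status = eval_adr();  /* step 3: resolve _ADR */
    sketch_setup_depth--;

    return status;
}

/* An _ADR that is a plain constant: no further region access. */
int sketch_adr_simple(void)
{
    return SKETCH_OK;
}

/* Step 4: an _ADR method that itself reads PCI_Config space. */
int sketch_adr_recursive(void)
{
    return sketch_region_setup(sketch_adr_simple);
}
```

Calling sketch_region_setup(sketch_adr_simple) succeeds, while
sketch_region_setup(sketch_adr_recursive) returns SKETCH_ERR_RECURSION
instead of nesting another full set of interpreter frames.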

Turning off ACPI debug messages seems to fix my problem for now.  
With debug messages enabled, the stack usage seems to be roughly 64 
extra bytes for each function called (presumably from the extra local 
variables defined in some of the debug macros in acmacros.h).  I have 
modified my DSDT so that only one layer of recursion happens, and, 
with this, the *difference* in kernel stack usage between having ACPI 
debug enabled and disabled is 2776 bytes.  Even with this better DSDT, 
the kernel stack overflows by 616 bytes (I used a larger kernel stack 
to figure that out...) if I have debug messages enabled.  FYI, the 
kernel stack on my system is 5408 bytes total (8K - size of the 
task_struct).

With debug messages disabled, I have a good 2K of margin on the stack 
with my modified DSDT.
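
The figures above fit together as follows.  The helper below is only a
worked check of the arithmetic in C, with hypothetical names; it
assumes the 616-byte overflow and the 2776-byte debug difference were
measured on the same call path.

```c
#include <assert.h>

struct sketch_budget {
    long used_debug;     /* peak usage with debug messages enabled */
    long used_nodebug;   /* peak usage with debug messages disabled */
    long margin_nodebug; /* bytes left over with debug disabled */
};

/*
 * stack_total:    usable kernel stack in bytes
 * overflow_debug: bytes by which the stack overflows with debug enabled
 * debug_delta:    difference in usage between debug on and debug off
 */
struct sketch_budget sketch_stack_budget(long stack_total,
                                         long overflow_debug,
                                         long debug_delta)
{
    struct sketch_budget b;

    assert(stack_total > 0 && overflow_debug >= 0 && debug_delta >= 0);

    b.used_debug     = stack_total + overflow_debug;
    b.used_nodebug   = b.used_debug - debug_delta;
    b.margin_nodebug = stack_total - b.used_nodebug;
    return b;
}
```

sketch_stack_budget(5408, 616, 2776) gives 6024 bytes used with debug
enabled, 3248 bytes with it disabled, and a margin of 2160 bytes, which
matches the "good 2K" above.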

Thanks!
Stuart




Thread overview: 14+ messages
2004-03-10 17:08 stack overflow Stuart_Hayes-DYMqY+WieiM
     [not found] ` <CE41BFEF2481C246A8DE0D2B4DBACF4F020E5FD9-novRXWwkcpil7xnNSM18fRtLTTO9Z+wMojBamW5iJbs@public.gmane.org>
2004-03-10 18:40   ` Len Brown
  -- strict thread matches above, loose matches on Subject: below --
2004-03-22 17:40 Stuart_Hayes-DYMqY+WieiM
2004-03-10 15:56 Stuart_Hayes-DYMqY+WieiM
2004-03-09 22:48 Moore, Robert
2004-03-09 21:04 Stuart_Hayes-DYMqY+WieiM
2004-03-09 20:00 Stuart_Hayes-DYMqY+WieiM
2004-03-09 18:34 Moore, Robert
2004-03-08 16:43 Stuart_Hayes-DYMqY+WieiM
     [not found] ` <CE41BFEF2481C246A8DE0D2B4DBACF4F128AA4-novRXWwkcpil7xnNSM18fRtLTTO9Z+wMojBamW5iJbs@public.gmane.org>
2004-03-08 18:26   ` Andi Kleen
     [not found]     ` <20040308182630.GB9490-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
2004-03-09  7:17       ` Len Brown
2004-03-10  4:33   ` Len Brown
     [not found]     ` <1078893223.2346.585.camel-D2Zvc0uNKG8@public.gmane.org>
2004-03-10 13:32       ` Andi Kleen
     [not found]         ` <20040310133208.GC12272-B4tOwbsTzaBolqkO4TVVkw@public.gmane.org>
2004-03-10 18:44           ` Len Brown
