* Re: 2.6.3-rc3 (and possibly earlier 2.6): weird hang and oopses
From: Len Brown @ 2004-02-17 6:26 UTC
To: Alessandro Suardi; +Cc: linux-kernel, ACPI Developers
Alessandro,
Sure looks like a failure in the ACPI processor driver.
Please confirm your system is otherwise happy when you disable the
processor driver. eg. CONFIG_ACPI_PROCESSOR=n
Also, it would be helpful to know if this failure started recently or
you saw it in previous releases, b/c we've made some changes to the
processor driver recently.
thanks,
-Len
ps. acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org is the preferred alias to send
Linux ACPI issues -- it includes linux-acpi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org which is a small
sub-set.
On Mon, 2004-02-16 at 17:47, Alessandro Suardi wrote:
> [CC:ing linux-acpi since some acpi stuff appears in backtraces]
>
> While apparently doing nothing special (possibly a 'rm' on a
> regular ext3 filesystem) my laptop hung. Not completely, as
> I could
>
> * switch virtual desktops within Ximian Desktop 2
> * click on the kill window top right button, see the "app is
> not responding, kill it anyway ?" dialog, say ok, see the
> gnome-terminal vanish
> * Alt-Fn to virtual consoles, type a login name (but getting
> no prompt for the password - this hung)
> * Alt-SysRq
>
>
> Trying to get more info, I Alt-SysRq-P seeing this (handcopied
> but should be fairly reliable :) :
>
>
> Pid: 0, comm: swapper
> EIP: 0060: acpi_processor_idle+0x13c/0x1cb
>
> default_idle+0x0/0x27
> rest_init+0x0/0x5e
> acpi_ut_copy_ipackage_to_ipackage+0x69/0xdb
> default_idle+0x0/0x27
> rest_init+0x0/0x5e
> cpu_idle+0x2e/0x37
> start_kernel+0x182/0x1b0
> unknown_bootoption+0x0/0xff
>
>
> While copying this down, there were 'ps' oopses at regular
> intervals (say 2/3 minutes apart from each other), with this
> further oops trace:
>
> pid_revalidate+0x28/0xd2
> pid_revalidate+0x41/0xd2
> dput+0x22/0x21f
> link_path_walk+0x61b/0x957
> buffered_rmqueue+0xc1/0x15a
> __alloc_pages+0xa4/0x342
> proc_info_read+0x74/0x155
> filp_open+0x67/0x69
> vfs_read+0xbc/0x127
> sys_read+0x42/0x63
> sysenter_past_esp+0x52/0x71
>
> And right after each oops a further trace, with the warning
> that 'ps' exited with a preempt_count of 1:
>
> Bad: scheduling while atomic
>
> schedule
> unmap_page_range
> unmap_vmas
> exit_mmap
> mmput
> do_exit
> do_divide
> do_page_fault
> acpi_processor_set_performance
> error_code
> file_read_actor
>
> There was more, but I couldn't copy further info due to pressing
> time constraints. This isn't the first time a 2.6.x kernel hangs
> on me, and IIRC 2.6.1 never did.
>
>
> Oh, and of course I still can't Alt-SysRq-B :(
>
>
> Thanks for looking into this, ciao,
>
> --alessandro
>
> "Two rivers run too deep
> The seasons change and so do I"
> (U2, "Indian Summer Sky")
>
* Re: 2.6.3-rc3 (and possibly earlier 2.6): weird hang and oopses
From: Alessandro Suardi @ 2004-02-17 20:10 UTC
To: Len Brown; +Cc: linux-kernel, ACPI Developers
Len Brown wrote:
> Alessandro,
> Sure looks like a failure in the ACPI processor driver.
>
> Please confirm your system is otherwise happy when you disable the
> processor driver. eg. CONFIG_ACPI_PROCESSOR=n
>
> Also, it would be helpful to know if this failure started recently or
> you saw it in previous releases, b/c we've made some changes to the
> processor driver recently.
Will run from now for a couple of weeks with CONFIG_ACPI_PROCESSOR=n;
I checked my logs and noticed my first hang happened with 2.6.2, but
so far I only experienced the problem twice since Feb 6.
I just now noticed that in /var/log I have the full Oops traces
(until I Alt-SysRq'd out of it), so I'm attaching them; would you
please take a further look and confirm this is _only_ an ACPI-related
issue ?
messages.gz is 2.6.3-rc3, messages.2.gz is 2.6.2 vanilla.
> thanks,
> -Len
>
> ps. acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org is the preferred alias to send
> Linux ACPI issues -- it includes linux-acpi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org which is a small
> sub-set.
OK, thanks for the info, will do next time.
> On Mon, 2004-02-16 at 17:47, Alessandro Suardi wrote:
>
>> [CC:ing linux-acpi since some acpi stuff appears in backtraces]
>>
>> While apparently doing nothing special (possibly a 'rm' on a
>> regular ext3 filesystem) my laptop hung. Not completely, as
>> I could
>>
>> * switch virtual desktops within Ximian Desktop 2
>> * click on the kill window top right button, see the "app is
>> not responding, kill it anyway ?" dialog, say ok, see the
>> gnome-terminal vanish
>> * Alt-Fn to virtual consoles, type a login name (but getting
>> no prompt for the password - this hung)
>> * Alt-SysRq
Many thanks,
--alessandro
"Two rivers run too deep
The seasons change and so do I"
(U2, "Indian Summer Sky")
[-- Attachment #2: messages.gz --]
[-- Type: application/x-gzip, Size: 5232 bytes --]
[-- Attachment #3: messages.2.gz --]
[-- Type: application/x-gzip, Size: 8850 bytes --]
* Re: Re: 2.6.3-rc3 (and possibly earlier 2.6): weird hang and oopses
From: Dominik Brodowski @ 2004-02-18 9:00 UTC
To: Alessandro Suardi; +Cc: Len Brown, linux-kernel, ACPI Developers
On Tue, Feb 17, 2004 at 09:10:22PM +0100, Alessandro Suardi wrote:
> Will run from now for a couple of weeks with CONFIG_ACPI_PROCESSOR=n;
> I checked my logs and noticed my first hang happened with 2.6.2,
IIRC 2.6.2 didn't yet contain the processor updates...
> I just now noticed that in /var/log I have the full Oops traces
> (until I Alt-SysRq'd out of it), so I'm attaching them; would you
> please take a further look and confirm this is _only_ an ACPI-related
> issue ?
The first Oops does not seem to be related to ACPI:
Feb 16 15:57:48 incident kernel: Oops: 0000 [#1]
Feb 16 15:57:48 incident kernel: CPU: 0
Feb 16 15:57:48 incident kernel: EIP: 0060:[<c0242045>] Not tainted
Feb 16 15:57:48 incident kernel: EFLAGS: 00010246
Feb 16 15:57:48 incident kernel: EIP is at init_dev+0x2b/0x567
Feb 16 15:57:48 incident kernel: eax: a1192400 ebx: d2e6e000 ecx: c0418f38 edx: 00008802
Feb 16 15:57:48 incident kernel: esi: 02000000 edi: f3986480 ebp: a1192400 esp: d2e6fe98
Feb 16 15:57:48 incident kernel: ds: 007b es: 007b ss: 0068
Feb 16 15:57:48 incident kernel: Process sh (pid: 7260, threadinfo=d2e6e000 task=cd98ecc0)
Feb 16 15:57:48 incident kernel: Stack: a1192400 00000000 f7dabb80 c01655e6 f7dabb80 c03e52d0 00000000 f782f080
Feb 16 15:57:48 incident kernel: f30d2580 c015cc36 f7dabb80 d2e6ff04 d2e6ff00 f7dabb80 420d2290 f554c300
Feb 16 15:57:48 incident kernel: d2e6e000 02000000 f3986480 00500000 c0242eea 02000000 a1192400 d2e6ff00
Feb 16 15:57:48 incident kernel: Call Trace:
Feb 16 15:57:48 incident kernel: [<c01655e6>] dput+0x22/0x21f
Feb 16 15:57:48 incident kernel: [<c015cc36>] link_path_walk+0x61b/0x957
Feb 16 15:57:48 incident kernel: [<c0242eea>] tty_open+0x90/0x36d
Feb 16 15:57:48 incident kernel: [<c0242e5a>] tty_open+0x0/0x36d
Feb 16 15:57:48 incident kernel: [<c0157c7d>] chrdev_open+0xf3/0x21c
Feb 16 15:57:48 incident kernel: [<c015d860>] open_namei+0xa6/0x400
Feb 16 15:57:48 incident kernel: [<c0157b8a>] chrdev_open+0x0/0x21c
Feb 16 15:57:48 incident kernel: [<c014e264>] dentry_open+0x14d/0x218
Feb 16 15:57:48 incident kernel: [<c014e115>] filp_open+0x67/0x69
Feb 16 15:57:48 incident kernel: [<c014e598>] sys_open+0x5b/0x8b
Feb 16 15:57:48 incident kernel: [<c0108f9d>] sysenter_past_esp+0x52/0x71
... and neither is the second, but then the "bad: scheduling while atomic"
messages start. And this call trace looks quite strange... There is no reason
ps should call "acpi_processor_set_performance". But then, the kernel is
already in an inconsistent state because of the two previous oopses...
Is the kernel compiled with frame pointers (CONFIG_FRAME_POINTER)? If not,
please change this setting to "y".
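In case it helps to check the current settings before rebuilding, here is a
rough sketch (Python, purely illustrative) that reads the text of a kernel
.config and reports the state of an option. The function name
config_option_state and the sample text are made up for this example; the only
thing taken as given is the usual .config syntax, where an enabled option reads
"CONFIG_FOO=y" and a disabled one reads "# CONFIG_FOO is not set".

```python
import re


def config_option_state(config_text, option):
    """Return the value of `option` in a kernel .config ('y', 'm', ...),
    'n' if it is explicitly marked as not set, or None if it is absent.

    `config_option_state` is a hypothetical helper for illustration only.
    """
    set_re = re.compile(r"^%s=(.*)$" % re.escape(option))
    unset_line = "# %s is not set" % option
    for line in config_text.splitlines():
        line = line.strip()
        if line == unset_line:
            return "n"
        match = set_re.match(line)
        if match:
            return match.group(1)
    return None


# Made-up sample, not taken from the reporter's actual configuration.
sample = """\
CONFIG_ACPI=y
CONFIG_ACPI_PROCESSOR=y
# CONFIG_FRAME_POINTER is not set
"""

for opt in ("CONFIG_ACPI_PROCESSOR", "CONFIG_FRAME_POINTER"):
    print(opt, "->", config_option_state(sample, opt))
```

On a real tree one would pass the contents of the .config in the build
directory instead of the sample string.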
What follows are further oopses and "bad: scheduling while atomic" notices
in which I cannot see any relation to ACPI.
Feb 16 16:07:16 incident kernel: SysRq : Show Regs
Feb 16 16:07:16 incident kernel:
Feb 16 16:07:16 incident kernel: Pid: 0, comm: swapper
Feb 16 16:07:16 incident kernel: EIP: 0060:[<c02380f8>] CPU: 0
Feb 16 16:07:16 incident kernel: EIP is at acpi_processor_idle+0x13c/0x1cb
Feb 16 16:07:16 incident kernel: EFLAGS: 00000216 Not tainted
Feb 16 16:07:16 incident kernel: EAX: 0050d212 EBX: 00000808 ECX: 0050d079 EDX: 00000808
Feb 16 16:07:16 incident kernel: ESI: c1b7d2b0 EDI: c0105000 EBP: c1b7d200 DS: 007b ES: 007b
Feb 16 16:07:16 incident kernel: CR0: 8005003b CR2: 421b7000 CR3: 35924000 CR4: 000006d0
Feb 16 16:07:16 incident kernel: Call Trace:
Feb 16 16:07:16 incident kernel: [<c0106cee>] default_idle+0x0/0x27
Feb 16 16:07:16 incident kernel: [<c0105000>] rest_init+0x0/0x5e
Feb 16 16:07:16 incident kernel: [<c023007b>] acpi_ut_copy_ipackage_to_ipackage+0x69/0xdb
Feb 16 16:07:16 incident kernel: [<c0106cee>] default_idle+0x0/0x27
Feb 16 16:07:16 incident kernel: [<c0105000>] rest_init+0x0/0x5e
Feb 16 16:07:16 incident kernel: [<c0106d79>] cpu_idle+0x2e/0x37
Feb 16 16:07:16 incident kernel: [<c0462686>] start_kernel+0x182/0x1b0
Feb 16 16:07:16 incident kernel: [<c04623dd>] unknown_bootoption+0x0/0xff
acpi_processor_idle seems to be innocent; "ps" is causing an oops again:
Feb 16 16:08:28 incident kernel: Unable to handle kernel paging request at virtual address 02000064
Feb 16 16:08:28 incident kernel: printing eip:
Feb 16 16:08:28 incident kernel: c017b7ce
Feb 16 16:08:28 incident kernel: *pde = 00000000
Feb 16 16:08:28 incident kernel: Oops: 0000 [#7]
Feb 16 16:08:28 incident kernel: CPU: 0
Feb 16 16:08:28 incident kernel: EIP: 0060:[<c017b7ce>] Not tainted
Feb 16 16:08:28 incident kernel: EFLAGS: 00010286
Feb 16 16:08:28 incident kernel: EIP is at proc_pid_stat+0xa8/0x53c
Feb 16 16:08:28 incident kernel: eax: 00000000 ebx: 02000000 ecx: f4971000 edx: c03e6330
Feb 16 16:08:28 incident kernel: esi: e9b0d900 edi: c4c7c580 ebp: c3a58000 esp: c3a59e3c
Feb 16 16:08:28 incident kernel: ds: 007b es: 007b ss: 0068
Feb 16 16:08:28 incident kernel: Process ps (pid: 7430, threadinfo=c3a58000 task=d56e52e0)
Feb 16 16:08:28 incident kernel: Stack: c4c7c580 ffffffff 00000008 c4c7c780 00000010 f1db65f0 f7f57858 c3a58000
Feb 16 16:08:28 incident kernel: c3a58000 f1db6580 e3935006 c0179382 f1db6ef0 f7f570f8 c3a58000 c3a58000
Feb 16 16:08:28 incident kernel: f1db6e80 f254f510 c017939b e9b0d900 f1db6e80 c3a59f70 f7ff4700 c3a59f00
Feb 16 16:08:28 incident kernel: Call Trace:
Feb 16 16:08:28 incident kernel: [<c0179382>] pid_revalidate+0x28/0xd2
Feb 16 16:08:28 incident kernel: [<c017939b>] pid_revalidate+0x41/0xd2
Feb 16 16:08:28 incident kernel: [<c01655e6>] dput+0x22/0x21f
Feb 16 16:08:28 incident kernel: [<c015cc36>] link_path_walk+0x61b/0x957
Feb 16 16:08:28 incident kernel: [<c013741c>] buffered_rmqueue+0xc1/0x15a
Feb 16 16:08:28 incident kernel: [<c0137559>] __alloc_pages+0xa4/0x342
Feb 16 16:08:28 incident kernel: [<c01787e2>] proc_info_read+0x74/0x155
Feb 16 16:08:28 incident kernel: [<c014e115>] filp_open+0x67/0x69
Feb 16 16:08:28 incident kernel: [<c014ee92>] vfs_read+0xbc/0x127
Feb 16 16:08:28 incident kernel: [<c014f11d>] sys_read+0x42/0x63
Feb 16 16:08:28 incident kernel: [<c0108f9d>] sysenter_past_esp+0x52/0x71
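For going through the remaining traces in the attached logs, a small sketch
like the following (Python, illustrative only) can pull the
"[<address>] symbol+0xoffset/0xsize" entries out of the syslog lines and list
any frames whose symbol starts with "acpi_". The names TRACE_RE, parse_trace
and acpi_frames are invented for this example, and the regular expression only
assumes the line format visible in the traces above.

```python
import re

# Matches entries such as "[<c0179382>] pid_revalidate+0x28/0xd2".
TRACE_RE = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_trace(lines):
    """Return (address, symbol, offset, size) tuples for each trace entry."""
    frames = []
    for line in lines:
        match = TRACE_RE.search(line)
        if match:
            addr, sym, off, size = match.groups()
            frames.append((int(addr, 16), sym, int(off, 16), int(size, 16)))
    return frames


def acpi_frames(frames):
    """Return the symbols of all frames that belong to acpi_* functions."""
    return [sym for _, sym, _, _ in frames if sym.startswith("acpi_")]


# A few lines copied from the traces quoted in this message.
log = [
    "Feb 16 16:08:28 incident kernel: [<c0179382>] pid_revalidate+0x28/0xd2",
    "Feb 16 16:08:28 incident kernel: [<c017939b>] pid_revalidate+0x41/0xd2",
    "Feb 16 16:08:28 incident kernel: [<c01655e6>] dput+0x22/0x21f",
    "Feb 16 16:07:16 incident kernel: [<c023007b>] "
    "acpi_ut_copy_ipackage_to_ipackage+0x69/0xdb",
]

frames = parse_trace(log)
for addr, sym, off, size in frames:
    # Address minus offset gives the start of the function.
    print("%s starts at 0x%08x" % (sym, addr - off))
print("ACPI frames:", acpi_frames(frames))
```

Note that the two pid_revalidate entries resolve to the same function start,
which is a quick sanity check that the entries are being read correctly.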
Dominik