* Re: 2.6.3-rc3 (and possibly earlier 2.6): weird hang and oopses
From: Len Brown @ 2004-02-17 6:26 UTC
To: Alessandro Suardi; +Cc: linux-kernel, ACPI Developers

Alessandro,
Sure looks like a failure in the ACPI processor driver.

Please confirm your system is otherwise happy when you disable the
processor driver. eg. CONFIG_ACPI_PROCESSOR=n

Also, it would be helpful to know if this failure started recently or
you saw it in previous releases, b/c we've made some changes to the
processor driver recently.

thanks,
-Len

ps. acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org is the preferred alias to send
Linux ACPI issues -- it includes linux-acpi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org which is a small
sub-set.

On Mon, 2004-02-16 at 17:47, Alessandro Suardi wrote:
> [CC:ing linux-acpi since some acpi stuff appears in backtraces]
>
> While apparently doing nothing special (possibly a 'rm' on a
>  regular ext3 filesystem) my laptop hung. Not completely, as
>  I could
>
>  * switch virtual desktops within Ximian Desktop 2
>  * click on the kill window top right button, see the "app is
>     not responding, kill it anyway ?" dialog, say ok, see the
>     gnome-terminal vanish
>  * Alt-Fn to virtual consoles, type a login name (but getting
>     no prompt for the password - this hung)
>  * Alt-SysRq
>
>
> Trying to get more info, I Alt-SysRq-P seeing this (handcopied
>  but should be fairly reliable :) :
>
>
> Pid: 0, comm: swapper
> EIP: 0060: acpi_processor_idle+0x13c/0x1cb
>
> default_idle+0x0/0x27
> rest_init+0x0/0x5e
> acpi_nt_copy_ipackage_to_ipackage+0x69/0xdb
> default_idle+0x0/0x27
> rest_init+0x0/0x5e
> cpu_idle+0x2e/0x37
> start_kernel+0x182/0x1b0
> unknown_bootoption+0x0,0xff
>
>
> While copying this down, there were 'ps' oopses at regular
>  intervals (say 2/3 minutes apart from each other), with this
>  further oops trace:
>
> pid_revalidate+0x28/0xd2
> pid_revalidate+0x41/0xd2
> dput+0x22/0x21f
> link_path_walk+0x61b/0x957
> buffered_rmqueue+0xc1/0x15a
> __alloc_pages+0xa4/0x342
> proc_info_read+0x74/0x155
> filp_open+0x67/0x69
> vfs_read+0xbc/0x127
> sys_read+0x42/0x63
> sysenter_past_esp+0x52/0x71
>
> And right after each oops a further trace, with the warning
>  that 'ps' exited with a preempt_count of 1:
>
> Bad: scheduling while atomic
>
> schedule
> unmap_page_range
> unmap_vmas
> exit_mmap
> mmput
> do_exit
> do_divide
> do_page_fault
> acpi_processor_set_performance
> error_code
> file_read_actor
>
> There was more, but I couldn't copy further info due to pressing
>  time constraints. This isn't the first time a 2.6.x kernel hangs
>  on me, and IIRC 2.6.1 never did.
>
>
> Oh, and of course I still can't Alt-SysRq-B :(
>
>
> Thanks for looking into this, ciao,
>
> --alessandro
>
>  "Two rivers run too deep
>   The seasons change and so do I"
>       (U2, "Indian Summer Sky")
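A note on the notation used in the traces above: an entry such as acpi_processor_idle+0x13c/0x1cb gives a function name, the offset of the reported address inside that function, and the function's total size, both in hex. The following is a minimal sketch in Python that splits one such entry; the names FRAME_RE and parse_frame are hypothetical helpers made up for illustration, not part of any kernel tooling.

```python
import re

# Matches "symbol+0xOFFSET/0xSIZE" as it appears in the traces in this thread.
FRAME_RE = re.compile(
    r"(?P<sym>[A-Za-z_][A-Za-z0-9_.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)

def parse_frame(text):
    """Return (symbol, offset, size) for the first entry found in text, or None."""
    m = FRAME_RE.search(text)
    if m is None:
        return None
    return m.group("sym"), int(m.group("off"), 16), int(m.group("size"), 16)

result = parse_frame("EIP is at acpi_processor_idle+0x13c/0x1cb")
if result is not None:
    sym, off, size = result
    print(f"{sym}: byte {off} of {size}")
```

An entry that does not follow the symbol+0xoff/0xsize shape exactly, for example one mistyped while copying by hand, simply yields None here.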
* Re: 2.6.3-rc3 (and possibly earlier 2.6): weird hang and oopses
From: Alessandro Suardi @ 2004-02-17 20:10 UTC
To: Len Brown; +Cc: linux-kernel, ACPI Developers

Len Brown wrote:
> Alessandro,
> Sure looks like a failure in the ACPI processor driver.
>
> Please confirm your system is otherwise happy when you disable the
> processor driver. eg. CONFIG_ACPI_PROCESSOR=n
>
> Also, it would be helpful to know if this failure started recently or
> you saw it in previous releases, b/c we've made some changes to the
> processor driver recently.

Will run from now for a couple of weeks with CONFIG_ACPI_PROCESSOR=n;
 I checked my logs and noticed my first hang happened with 2.6.2,
 but so far I only experienced the problem twice since Feb 6.

I just now noticed that in /var/log I have the full Oops traces
 (until I Alt-SysRq'd out of it), so I'm attaching them; would you
 please take a further look and confirm this is _only_ an ACPI-related
 issue ?

messages.gz is 2.6.3-rc3, messages.2.gz is 2.6.2 vanilla.

> thanks,
> -Len
>
> ps. acpi-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org is the preferred alias to send
> Linux ACPI issues -- it includes linux-acpi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org which is a small
> sub-set.

OK, thanks for the info, will do next time.

> On Mon, 2004-02-16 at 17:47, Alessandro Suardi wrote:
>
>>[CC:ing linux-acpi since some acpi stuff appears in backtraces]
>>
>>While apparently doing nothing special (possibly a 'rm' on a
>> regular ext3 filesystem) my laptop hung. Not completely, as
>> I could
>>
>> * switch virtual desktops within Ximian Desktop 2
>> * click on the kill window top right button, see the "app is
>>    not responding, kill it anyway ?" dialog, say ok, see the
>>    gnome-terminal vanish
>> * Alt-Fn to virtual consoles, type a login name (but getting
>>    no prompt for the password - this hung)
>> * Alt-SysRq

Many thanks,

--alessandro

 "Two rivers run too deep
  The seasons change and so do I"
      (U2, "Indian Summer Sky")

[-- Attachment #2: messages.gz --]
[-- Type: application/x-gzip, Size: 5232 bytes --]

[-- Attachment #3: messages.2.gz --]
[-- Type: application/x-gzip, Size: 8850 bytes --]
* Re: Re: 2.6.3-rc3 (and possibly earlier 2.6): weird hang and oopses
From: Dominik Brodowski @ 2004-02-18 9:00 UTC
To: Alessandro Suardi; +Cc: Len Brown, linux-kernel, ACPI Developers

On Tue, Feb 17, 2004 at 09:10:22PM +0100, Alessandro Suardi wrote:
> Will run from now for a couple of weeks with CONFIG_ACPI_PROCESSOR=n;
>  I checked my logs and noticed my first hang happened with 2.6.2,

IIRC 2.6.2 didn't yet contain the processor updates...

> I just now noticed that in /var/log I have the full Oops traces
>  (until I Alt-SysRq'd out of it), so I'm attaching them; would you
>  please take a further look and confirm this is _only_ an ACPI-related
>  issue ?

The first Oops seems to be not related to ACPI:

Feb 16 15:57:48 incident kernel: Oops: 0000 [#1]
Feb 16 15:57:48 incident kernel: CPU: 0
Feb 16 15:57:48 incident kernel: EIP: 0060:[<c0242045>] Not tainted
Feb 16 15:57:48 incident kernel: EFLAGS: 00010246
Feb 16 15:57:48 incident kernel: EIP is at init_dev+0x2b/0x567
Feb 16 15:57:48 incident kernel: eax: a1192400 ebx: d2e6e000 ecx: c0418f38 edx: 00008802
Feb 16 15:57:48 incident kernel: esi: 02000000 edi: f3986480 ebp: a1192400 esp: d2e6fe98
Feb 16 15:57:48 incident kernel: ds: 007b es: 007b ss: 0068
Feb 16 15:57:48 incident kernel: Process sh (pid: 7260, threadinfo=d2e6e000 task=cd98ecc0)
Feb 16 15:57:48 incident kernel: Stack: a1192400 00000000 f7dabb80 c01655e6 f7dabb80 c03e52d0 00000000 f782f080
Feb 16 15:57:48 incident kernel: f30d2580 c015cc36 f7dabb80 d2e6ff04 d2e6ff00 f7dabb80 420d2290 f554c300
Feb 16 15:57:48 incident kernel: d2e6e000 02000000 f3986480 00500000 c0242eea 02000000 a1192400 d2e6ff00
Feb 16 15:57:48 incident kernel: Call Trace:
Feb 16 15:57:48 incident kernel: [<c01655e6>] dput+0x22/0x21f
Feb 16 15:57:48 incident kernel: [<c015cc36>] link_path_walk+0x61b/0x957
Feb 16 15:57:48 incident kernel: [<c0242eea>] tty_open+0x90/0x36d
Feb 16 15:57:48 incident kernel: [<c0242e5a>] tty_open+0x0/0x36d
Feb 16 15:57:48 incident kernel: [<c0157c7d>] chrdev_open+0xf3/0x21c
Feb 16 15:57:48 incident kernel: [<c015d860>] open_namei+0xa6/0x400
Feb 16 15:57:48 incident kernel: [<c0157b8a>] chrdev_open+0x0/0x21c
Feb 16 15:57:48 incident kernel: [<c014e264>] dentry_open+0x14d/0x218
Feb 16 15:57:48 incident kernel: [<c014e115>] filp_open+0x67/0x69
Feb 16 15:57:48 incident kernel: [<c014e598>] sys_open+0x5b/0x8b
Feb 16 15:57:48 incident kernel: [<c0108f9d>] sysenter_past_esp+0x52/0x71

... and neither is the second, but then the "bad: scheduling while atomic"
calls start. And this call trace looks quite strange... There is no reason
ps should call "acpi_processor_set_performance".... But well, the kernel is
in an inconsistent state already because of the two previous oopses...

Is the kernel compiled with "frame pointers"? CONFIG_FRAME_POINTER ? If not,
please change this setting to "y".

What follows then are other oopses and "bad: scheduling while atomic"
notices where I cannot see any relation to ACPI.

Feb 16 16:07:16 incident kernel: SysRq : Show Regs
Feb 16 16:07:16 incident kernel:
Feb 16 16:07:16 incident kernel: Pid: 0, comm: swapper
Feb 16 16:07:16 incident kernel: EIP: 0060:[<c02380f8>] CPU: 0
Feb 16 16:07:16 incident kernel: EIP is at acpi_processor_idle+0x13c/0x1cb
Feb 16 16:07:16 incident kernel: EFLAGS: 00000216 Not tainted
Feb 16 16:07:16 incident kernel: EAX: 0050d212 EBX: 00000808 ECX: 0050d079 EDX: 00000808
Feb 16 16:07:16 incident kernel: ESI: c1b7d2b0 EDI: c0105000 EBP: c1b7d200 DS: 007b ES: 007b
Feb 16 16:07:16 incident kernel: CR0: 8005003b CR2: 421b7000 CR3: 35924000 CR4: 000006d0
Feb 16 16:07:16 incident kernel: Call Trace:
Feb 16 16:07:16 incident kernel: [<c0106cee>] default_idle+0x0/0x27
Feb 16 16:07:16 incident kernel: [<c0105000>] rest_init+0x0/0x5e
Feb 16 16:07:16 incident kernel: [<c023007b>] acpi_ut_copy_ipackage_to_ipackage+0x69/0xdb
Feb 16 16:07:16 incident kernel: [<c0106cee>] default_idle+0x0/0x27
Feb 16 16:07:16 incident kernel: [<c0105000>] rest_init+0x0/0x5e
Feb 16 16:07:16 incident kernel: [<c0106d79>] cpu_idle+0x2e/0x37
Feb 16 16:07:16 incident kernel: [<c0462686>] start_kernel+0x182/0x1b0
Feb 16 16:07:16 incident kernel: [<c04623dd>] unknown_bootoption+0x0/0xff

acpi_processor_idle seems to be innocent, "ps" is causing an oops again:

Feb 16 16:08:28 incident kernel: Unable to handle kernel paging request at virtual address 02000064
Feb 16 16:08:28 incident kernel: printing eip:
Feb 16 16:08:28 incident kernel: c017b7ce
Feb 16 16:08:28 incident kernel: *pde = 00000000
Feb 16 16:08:28 incident kernel: Oops: 0000 [#7]
Feb 16 16:08:28 incident kernel: CPU: 0
Feb 16 16:08:28 incident kernel: EIP: 0060:[<c017b7ce>] Not tainted
Feb 16 16:08:28 incident kernel: EFLAGS: 00010286
Feb 16 16:08:28 incident kernel: EIP is at proc_pid_stat+0xa8/0x53c
Feb 16 16:08:28 incident kernel: eax: 00000000 ebx: 02000000 ecx: f4971000 edx: c03e6330
Feb 16 16:08:28 incident kernel: esi: e9b0d900 edi: c4c7c580 ebp: c3a58000 esp: c3a59e3c
Feb 16 16:08:28 incident kernel: ds: 007b es: 007b ss: 0068
Feb 16 16:08:28 incident kernel: Process ps (pid: 7430, threadinfo=c3a58000 task=d56e52e0)
Feb 16 16:08:28 incident kernel: Stack: c4c7c580 ffffffff 00000008 c4c7c780 00000010 f1db65f0 f7f57858 c3a58000
Feb 16 16:08:28 incident kernel: c3a58000 f1db6580 e3935006 c0179382 f1db6ef0 f7f570f8 c3a58000 c3a58000
Feb 16 16:08:28 incident kernel: f1db6e80 f254f510 c017939b e9b0d900 f1db6e80 c3a59f70 f7ff4700 c3a59f00
Feb 16 16:08:28 incident kernel: Call Trace:
Feb 16 16:08:28 incident kernel: [<c0179382>] pid_revalidate+0x28/0xd2
Feb 16 16:08:28 incident kernel: [<c017939b>] pid_revalidate+0x41/0xd2
Feb 16 16:08:28 incident kernel: [<c01655e6>] dput+0x22/0x21f
Feb 16 16:08:28 incident kernel: [<c015cc36>] link_path_walk+0x61b/0x957
Feb 16 16:08:28 incident kernel: [<c013741c>] buffered_rmqueue+0xc1/0x15a
Feb 16 16:08:28 incident kernel: [<c0137559>] __alloc_pages+0xa4/0x342
Feb 16 16:08:28 incident kernel: [<c01787e2>] proc_info_read+0x74/0x155
Feb 16 16:08:28 incident kernel: [<c014e115>] filp_open+0x67/0x69
Feb 16 16:08:28 incident kernel: [<c014ee92>] vfs_read+0xbc/0x127
Feb 16 16:08:28 incident kernel: [<c014f11d>] sys_read+0x42/0x63
Feb 16 16:08:28 incident kernel: [<c0108f9d>] sysenter_past_esp+0x52/0x71

	Dominik
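The triage done by eye in this message (which reports mention ACPI symbols at all?) can be roughly automated. What follows is a sketch only, assuming syslog lines shaped like the excerpts quoted in this thread; summarize, ENTRY_RE and EIP_RE are hypothetical names chosen for illustration. Note that on a kernel built without frame pointers the call trace may include stale stack entries, so an acpi_ symbol showing up is a hint to look closer, not proof of involvement.

```python
import re

# One call-trace entry as in the excerpts: "[<addr>] symbol+0xoff/0xsize"
ENTRY_RE = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+([A-Za-z_][A-Za-z0-9_.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)
# The line naming the function the instruction pointer was in.
EIP_RE = re.compile(r"EIP is at ([A-Za-z_][A-Za-z0-9_.]*)\+0x")

def summarize(lines):
    """Group log lines into reports, one per "EIP is at" line.

    Each report records the EIP symbol, the call-trace symbols that
    follow it, and whether any of those names starts with "acpi_".
    """
    reports = []
    current = None
    for line in lines:
        m = EIP_RE.search(line)
        if m:
            current = {"eip": m.group(1), "frames": []}
            reports.append(current)
            continue
        m = ENTRY_RE.search(line)
        if m and current is not None:
            current["frames"].append(m.group(2))
    for r in reports:
        names = [r["eip"], *r["frames"]]
        r["acpi"] = any(name.startswith("acpi_") for name in names)
    return reports

sample = [
    "Feb 16 15:57:48 incident kernel: EIP is at init_dev+0x2b/0x567",
    "Feb 16 15:57:48 incident kernel: [<c01655e6>] dput+0x22/0x21f",
    "Feb 16 15:57:48 incident kernel: [<c0242eea>] tty_open+0x90/0x36d",
    "Feb 16 16:07:16 incident kernel: EIP is at acpi_processor_idle+0x13c/0x1cb",
    "Feb 16 16:07:16 incident kernel: [<c0106cee>] default_idle+0x0/0x27",
]
for r in summarize(sample):
    label = "mentions ACPI" if r["acpi"] else "no ACPI symbols"
    print(r["eip"], "-", label, r["frames"])
```

To run it over a real log, the decompressed lines of a file such as messages.gz could be passed to summarize in place of the sample list.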