public inbox for linux-kernel@vger.kernel.org
* BUG() on apm resume in 2.6.18-rc2
@ 2006-07-27  3:38 J. Bruce Fields
  2006-07-27  6:10 ` Andrew Morton
  0 siblings, 1 reply; 6+ messages in thread
From: J. Bruce Fields @ 2006-07-27  3:38 UTC (permalink / raw)
  To: linux-kernel

After suspending to ram and then resuming, I'm getting the following on
2.6.18-rc2.  I also checked the latest git (54721324...) with the same
results.  The last that I know was OK was 2.6.17.

Let me know if any more information would be useful.

Copied by hand as fast as I could type, so there are probably typos....

--b.

2.6.18-rc2-g64821324
BUG: unable to handle kernel paging request at virtual address c0729c38
 printing eip:
c0109204
*pde = 009bf027
*pte = 00729000
Oops: 0000 [#1]
PREEMPT DEBUG_PAGEALLOC
Modules linked in:
CPU:    0
EIP:    0060:[<c0109204>]    Not tainted VLI
EFLAGS: 00010246   (2.6.18-rc2-g64821324 #134)
EIP is at mcheck_init+0x4/0x80
eax: 000010ff   ebx: f584bd90   ecx: c010a230   edx: 00000000
esi: 00004102   edi: 00004102   ebp: f5887ef8   esp: f5887ef4
ds: 007b   es: 007b  ss: 0068
Process apmd (pid: 4032, ti=f5887000 task=f5cf0560 task.ti=f5887000)
Stack: f584bd90 f5887f08 c04e1dcf c070a220 00000000 f5887f14 c04e1dfd c09b1080
       f5887f28 c010c820 00000002 00000002 f6a35f74 f5887f4c c010d72e 00000000
       00000000 00000000 f5cf0ad0 00000246 f53e3f28 00000000 f5887f70 c016cf8b
Call Trace:
 [<c01036fc>] show_stack_log_lvl+0x9c/0xd0
 [<c01038fe>] show_registers+0x17e/0x210
 [<c0103a92>] die+0x102/0x2c0
 [<c05683b5>] do_page_fault+0x305/0x630
 [<c0103159>] error_code+0x39/0x40
 [<c04e1dcf>] __restore_processor_state+0x16f/0x190
 [<c04e1dfd>] restore_processor_state+0xd/0x10
 [<c010c820>] suspend+0xa0/0x1c0
 [<c010d72e>] do_ioctl+0x13e/0x170
 [<c016cf8b>] do_ioctl+0x6b/0x80
 [<c016cff1>] vfs_ioctl+0x51/0x2c0
 [<c016d290>] sys_ioctl+0x30/0x50
 [<c0102e37>] syscall_call+0x7/0xb
Code: 90 90 90 90 90 90 90 55 89 e5 e8 48 f7 28 00 50 68 38 66 59 c0 e8 1d d7 00
 00 58 5a c9 c3 89 f6 8d bc 27 00 00 00 00 55 89 e5 53 <83> 3d 38 9c 72 c0 01 8b
 5d 08 74 20 8a 43 01 3c 02 74 1e 3c 05
EIP: [<c0109204>] mcheck_init+0x4/0x80 SS:ESP 0068:f5887ef4
 <3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
 [<c0103776>] show_trace+0x16/0x20
 [<c0103d1b>] dump_stack+0x1b/0x20
 [<c0111815>] __might_sleep+0x95/0xa0
 [<c012c603>] down_read+0x13/0x40
 [<c01239be>] blocking_notifier_call_chain+0xe/0x30
 [<c01172d3>] profile_task_exit+0x13/0x20
 [<c0118acb>] do_exit+0x1b/0x940
 [<c0103c44>] die+0x2b4/0x2c0
 [<c05683b5>] do_page_fault+0x305/0x630
 [<c0103159>] error_code+0x39/0x40
 [<c04e1dcf>] __restore_processor_state+0x16f/0x190
 [<c04e1dfd>] restore_processor_state+0xd/0x10
 [<c010c820>] suspend+0xa0/0x1c0
 [<c010d72e>] do_ioctl+0x13e/0x170
 [<c016cf8b>] do_ioctl+0x6b/0x80
 [<c016cff1>] vfs_ioctl+0x51/0x2c0
 [<c016d290>] sys_ioctl+0x30/0x50
 [<c0102e37>] syscall_call+0x7/0xb


* Re: BUG() on apm resume in 2.6.18-rc2
  2006-07-27  3:38 BUG() on apm resume in 2.6.18-rc2 J. Bruce Fields
@ 2006-07-27  6:10 ` Andrew Morton
  2006-07-27  6:29   ` Johannes Weiner
  2006-07-28  4:34   ` J. Bruce Fields
  0 siblings, 2 replies; 6+ messages in thread
From: Andrew Morton @ 2006-07-27  6:10 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-kernel

On Wed, 26 Jul 2006 23:38:19 -0400
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> After suspending to ram and then resuming, I'm getting the following on
> 2.6.18-rc2.  I also checked the latest git (54721324...) with the same
> results.  The last that I know was OK was 2.6.17.
> 
> Let me know if any more information would be useful.
> 
> Copied by hand as fast as I could type, so there are probably typos....
> 
> --b.
> 
> 2.6.18-rc2-g64821324
> BUG: unable to handle kernel paging request at virtual address c0729c38
>  printing eip:
> c0109204
> *pde = 009bf027
> *pte = 00729000
> Oops: 0000 [#1]
> PREEMPT DEBUG_PAGEALLOC
> Modules linked in:
> CPU:    0
> EIP:    0060:[<c0109204>]    Not tainted VLI
> EFLAGS: 00010246   (2.6.18-rc2-g64821324 #134)
> EIP is at mcheck_init+0x4/0x80
> eax: 000010ff   ebx: f584bd90   ecx: c010a230   edx: 00000000
> esi: 00004102   edi: 00004102   ebp: f5887ef8   esp: f5887ef4
> ds: 007b   es: 007b  ss: 0068
> Process apmd (pid: 4032, ti=f5887000 task=f5cf0560 task.ti=f5887000)
> Stack: f584bd90 f5887f08 c04e1dcf c070a220 00000000 f5887f14 c04e1dfd c09b1080
>        f5887f28 c010c820 00000002 00000002 f6a35f74 f5887f4c c010d72e 00000000
>        00000000 00000000 f5cf0ad0 00000246 f53e3f28 00000000 f5887f70 c016cf8b
> Call Trace:
>  [<c01036fc>] show_stack_log_lvl+0x9c/0xd0
>  [<c01038fe>] show_registers+0x17e/0x210
>  [<c0103a92>] die+0x102/0x2c0
>  [<c05683b5>] do_page_fault+0x305/0x630
>  [<c0103159>] error_code+0x39/0x40
>  [<c04e1dcf>] __restore_processor_state+0x16f/0x190
>  [<c04e1dfd>] restore_processor_state+0xd/0x10
>  [<c010c820>] suspend+0xa0/0x1c0
>  [<c010d72e>] do_ioctl+0x13e/0x170
>  [<c016cf8b>] do_ioctl+0x6b/0x80
>  [<c016cff1>] vfs_ioctl+0x51/0x2c0
>  [<c016d290>] sys_ioctl+0x30/0x50
>  [<c0102e37>] syscall_call+0x7/0xb
> Code: 90 90 90 90 90 90 90 55 89 e5 e8 48 f7 28 00 50 68 38 66 59 c0 e8 1d d7 00
>  00 58 5a c9 c3 89 f6 8d bc 27 00 00 00 00 55 89 e5 53 <83> 3d 38 9c 72 c0 01 8b
>  5d 08 74 20 8a 43 01 3c 02 74 1e 3c 05

This?

--- a/./arch/i386/kernel/cpu/mcheck/mce.h~mce-section-fix
+++ a/./arch/i386/kernel/cpu/mcheck/mce.h
@@ -9,6 +9,6 @@ void winchip_mcheck_init(struct cpuinfo_
 /* Call the installed machine check handler for this CPU setup. */
 extern fastcall void (*machine_check_vector)(struct pt_regs *, long error_code);
 
-extern int mce_disabled __initdata;
+extern int mce_disabled;
 extern int nr_mce_banks;
 
_



* Re: BUG() on apm resume in 2.6.18-rc2
  2006-07-27  6:10 ` Andrew Morton
@ 2006-07-27  6:29   ` Johannes Weiner
  2006-07-27  7:08     ` Andrew Morton
  2006-07-28  4:34   ` J. Bruce Fields
  1 sibling, 1 reply; 6+ messages in thread
From: Johannes Weiner @ 2006-07-27  6:29 UTC (permalink / raw)
  To: linux-kernel

Hi,

On Wed, Jul 26, 2006 at 11:10:49PM -0700, Andrew Morton wrote:
> This?
> 
> --- a/./arch/i386/kernel/cpu/mcheck/mce.h~mce-section-fix
> +++ a/./arch/i386/kernel/cpu/mcheck/mce.h
> @@ -9,6 +9,6 @@ void winchip_mcheck_init(struct cpuinfo_
>  /* Call the installed machine check handler for this CPU setup. */
>  extern fastcall void (*machine_check_vector)(struct pt_regs *, long error_code);
>  
> -extern int mce_disabled __initdata;
> +extern int mce_disabled;
>  extern int nr_mce_banks;

What pointed you to that? I haven't read many oopses, so...

Thanks,
Hannes


* Re: BUG() on apm resume in 2.6.18-rc2
  2006-07-27  6:29   ` Johannes Weiner
@ 2006-07-27  7:08     ` Andrew Morton
  2006-07-27  7:38       ` Johannes Weiner
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2006-07-27  7:08 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: linux-kernel

On Thu, 27 Jul 2006 08:29:32 +0200
Johannes Weiner <hnazfoo@googlemail.com> wrote:

> Hi,
> 
> On Wed, Jul 26, 2006 at 11:10:49PM -0700, Andrew Morton wrote:
> > This?
> > 
> > --- a/./arch/i386/kernel/cpu/mcheck/mce.h~mce-section-fix
> > +++ a/./arch/i386/kernel/cpu/mcheck/mce.h
> > @@ -9,6 +9,6 @@ void winchip_mcheck_init(struct cpuinfo_
> >  /* Call the installed machine check handler for this CPU setup. */
> >  extern fastcall void (*machine_check_vector)(struct pt_regs *, long error_code);
> >  
> > -extern int mce_disabled __initdata;
> > +extern int mce_disabled;
> >  extern int nr_mce_banks;
> 
> > What pointed you to that? I haven't read many oopses, so...
> 

(please always do reply-to-all)

That was an easy one - it crashed at

EIP is at mcheck_init+0x4/0x80

right at the start of mcheck_init(), so it had to be the access of
mce_disabled.
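
The "symbol+offset/size" notation in that line can be read mechanically. A minimal sketch, assuming the usual oops format "name+0xOFF/0xSIZE"; the names `struct sym_off`, `parse_sym_off` and `function_start` are hypothetical helpers written for this illustration, not kernel code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One "symbol+offset/size" entry as printed in an oops. */
struct sym_off {
    char name[64];       /* function name, e.g. "mcheck_init" */
    unsigned int offset; /* bytes from the start of the function */
    unsigned int size;   /* total size of the function in bytes */
};

/* Parse "name+0xOFF/0xSIZE". Returns 0 on success, -1 on malformed input. */
static int parse_sym_off(const char *s, struct sym_off *out)
{
    assert(s != NULL && out != NULL);
    /* %x accepts an optional 0x prefix, so "0x4" and "0x80" parse directly. */
    if (sscanf(s, "%63[^+]+%x/%x", out->name, &out->offset, &out->size) != 3)
        return -1;
    return 0;
}

/* Address of the first instruction of the function containing eip. */
static uint32_t function_start(uint32_t eip, const struct sym_off *so)
{
    return eip - so->offset;
}
```

For "mcheck_init+0x4/0x80" and the reported EIP c0109204 this gives a function start of c0109200: the fault is 4 bytes into a 128-byte function, which leaves room for little more than the prologue before the faulting instruction.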

It accessed the address c0729c38:

BUG: unable to handle kernel paging request at virtual address c0729c38

and the code dump shows an access to that address.
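
To check that claim by hand: the byte marked `<83>` in the Code: line starts the faulting instruction. On 32-bit x86, opcode `83` with ModRM byte `3d` encodes a `cmp` of an 8-bit immediate against a 32-bit absolute address, with the address stored little-endian. The sketch below decodes only that one form; `struct cmp_abs32`, `decode_cmp_abs32` and `oops_code` are hypothetical names for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Operands of "cmp $imm8, abs32" (opcode 83 /7 with ModRM 0x3d). */
struct cmp_abs32 {
    uint32_t addr; /* absolute address being compared */
    uint8_t imm;   /* 8-bit immediate operand */
};

/* Decode the 7-byte instruction. Returns 0 on success, -1 if the bytes
 * are not this exact form. */
static int decode_cmp_abs32(const uint8_t *code, size_t len,
                            struct cmp_abs32 *out)
{
    assert(code != NULL && out != NULL);
    if (len < 7 || code[0] != 0x83)
        return -1;
    /* ModRM 0x3d: mod=00, reg=111 (cmp), rm=101 (disp32, no base register). */
    if (code[1] != 0x3d)
        return -1;
    /* The 32-bit displacement is stored little-endian. */
    out->addr = (uint32_t)code[2] | (uint32_t)code[3] << 8 |
                (uint32_t)code[4] << 16 | (uint32_t)code[5] << 24;
    out->imm = code[6];
    return 0;
}

/* Bytes copied from the Code: line, starting at the <83> marker. */
static const uint8_t oops_code[] = { 0x83, 0x3d, 0x38, 0x9c, 0x72, 0xc0, 0x01 };
```

Decoding `oops_code` yields address c0729c38 and immediate 1, i.e. `cmpl $0x1,0xc0729c38`, the same address as in the paging-request line. The four bytes just before the marker (`55 89 e5 53`) are `push %ebp; mov %esp,%ebp; push %ebx`, which accounts for the +0x4 offset.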

And the only way in which an access to a global variable of this nature can
oops is if that variable has been unmapped from the kernel address space.
We unmap (and reuse) the __init memory, so it had to be a sectioning bug.
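
The page table values in the report are consistent with an unmapped page. A minimal sketch, assuming non-PAE 32-bit entries and the default i386 `PAGE_OFFSET` of 0xc0000000 (neither is stated in the report, so the macro is named `PAGE_OFFSET_ASSUMED`); all helper names here are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ENTRY_PRESENT       0x001u      /* bit 0 of a 32-bit x86 PDE/PTE */
#define ENTRY_FRAME_MASK    0xfffff000u /* bits 12..31: physical page frame */
#define PAGE_OFFSET_ASSUMED 0xc0000000u /* assumption: default 3G/1G split */

/* x86 page fault error code bits, as printed after "Oops:". */
#define PF_PROT  0x1u /* 0 = page not present, 1 = protection violation */
#define PF_WRITE 0x2u /* 0 = read access,      1 = write access */
#define PF_USER  0x4u /* 0 = kernel mode,      1 = user mode */

/* True if the page directory or page table entry maps something. */
static bool entry_is_present(uint32_t entry)
{
    return (entry & ENTRY_PRESENT) != 0;
}

/* Physical frame address stored in the entry. */
static uint32_t entry_frame(uint32_t entry)
{
    return entry & ENTRY_FRAME_MASK;
}

/* Lowmem virtual to physical under the assumed linear kernel mapping. */
static uint32_t lowmem_virt_to_phys(uint32_t vaddr)
{
    assert(vaddr >= PAGE_OFFSET_ASSUMED);
    return vaddr - PAGE_OFFSET_ASSUMED;
}
```

With the reported values, `*pde = 009bf027` is present while `*pte = 00729000` has the present bit clear, although its frame (00729000) is the one that c0729c38 would map to under the assumed offset. The error code in "Oops: 0000" has all three bits clear: a kernel-mode read of a not-present page. That fits a page that was once mapped and then released, which is what one would expect for freed __init memory on a DEBUG_PAGEALLOC kernel.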


* Re: BUG() on apm resume in 2.6.18-rc2
  2006-07-27  7:08     ` Andrew Morton
@ 2006-07-27  7:38       ` Johannes Weiner
  0 siblings, 0 replies; 6+ messages in thread
From: Johannes Weiner @ 2006-07-27  7:38 UTC (permalink / raw)
  To: linux-kernel

Hi,

On Thu, Jul 27, 2006 at 12:08:56AM -0700, Andrew Morton wrote:
> (please always do reply-to-all)

Oh, sorry. Sometimes mutt doesn't recognize mails belonging to a list.

> That was an easy one - it crashed at
> 
> EIP is at mcheck_init+0x4/0x80
> 
> right at the start of mcheck_init(), so it had to be the access of
> mce_disabled.
> 
> It accessed the address c0729c38:
> 
> BUG: unable to handle kernel paging request at virtual address c0729c38
> 
> and the code dump shows an access to that address.
> 
> And the only way in which an access to a global variable of this nature can
> oops is if that variable has been unmapped from the kernel address space.
> We unmap (and reuse) the __init memory, so it had to be a sectioning bug.

The fix was clear to me, though. After seeing it, I could imagine what
went wrong.

Hannes


* Re: BUG() on apm resume in 2.6.18-rc2
  2006-07-27  6:10 ` Andrew Morton
  2006-07-27  6:29   ` Johannes Weiner
@ 2006-07-28  4:34   ` J. Bruce Fields
  1 sibling, 0 replies; 6+ messages in thread
From: J. Bruce Fields @ 2006-07-28  4:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Wed, Jul 26, 2006 at 11:10:49PM -0700, Andrew Morton wrote:
> This?

Yep.  With 2.6.18-rc2 + that patch, the BUG() is gone and suspend-resume
works fine.

Thanks!--b.

> --- a/./arch/i386/kernel/cpu/mcheck/mce.h~mce-section-fix
> +++ a/./arch/i386/kernel/cpu/mcheck/mce.h
> @@ -9,6 +9,6 @@ void winchip_mcheck_init(struct cpuinfo_
>  /* Call the installed machine check handler for this CPU setup. */
>  extern fastcall void (*machine_check_vector)(struct pt_regs *, long error_code);
>  
> -extern int mce_disabled __initdata;
> +extern int mce_disabled;
>  extern int nr_mce_banks;
>  
> _
> 
