public inbox for linux-kernel@vger.kernel.org
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Sander Eikelenboom <linux@eikelenboom.it>
Cc: neilb@suse.de, john.stultz@linaro.org,
	stefan.bader@canonical.com, rjw@sisk.pl,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org
Subject: Re: Regression: ONE CPU fails bootup at Re: [3.2.0-RC7] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598 [    1.478005] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
Date: Tue, 3 Jan 2012 17:33:13 -0500
Message-ID: <20120103223313.GA12939@phenom.dumpdata.com>
In-Reply-To: <167823371.20120103211005@eikelenboom.it>

On Tue, Jan 03, 2012 at 09:10:05PM +0100, Sander Eikelenboom wrote:
> Tuesday, January 3, 2012, 8:07:54 PM, you wrote:
> 
> > On Tue, Jan 03, 2012 at 05:13:51PM +0100, Sander Eikelenboom wrote:
> >> Hi all,
> >> 
> >> While trying a vanilla 3.2.0-rc7+ kernel (commit 115e8e705e4be071b9e06ff72578e3b603f2ba65) as host and guest kernels under Xen:
> >> 
> >> The kernels only boot when a guest has MORE than 1 CPU; with ONE CPU it gives this stack trace:
> 
> > Yikes. So without 115e8e705e4be071b9e06ff72578e3b603f2ba65 it boots, right? So
> > a regression? Let's CC Rafael.
> 
> > But the git commit:
> 
> > commit 115e8e705e4be071b9e06ff72578e3b603f2ba65
> > Merge: 733bbb7 f88e1ae
> > Author: Linus Torvalds <torvalds@linux-foundation.org>
> > Date:   Mon Jan 2 12:34:03 2012 -0800
> 
> >     Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
> >     
> >     * 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
> >       dt/device: Fix auxdata matching to handle entries without a name override
> 
> > Looks to be unrelated? Or is there some other bug?
> 
> > Is this by any chance related to "rtc: Expire alarms after the time is set."
> > (93b2ec0128c431148b216b8f7337c1a52131ef03) which breaks Amazon EC2 instances?
> 
> > I think Stefan had a patch for this..
> 
> > Ah, please see attached file.
> 
> > Sander, could you please do two tests:
> 
> >  1). Revert 93b2ec0128c431148b216b8f7337c1a52131ef03 and see if that fixes it.
> >  2). Use the latest linus/master and try the attached patch.
> 
> Both 1 and 2 make the problem disappear!
> (no panic and no long delay on boot seen on 64-bit with one and with multiple CPUs)

Excellent. Can you also send me your guest config please? I am not able to reproduce this myself :-(

> 
> Thanks !
> 
> --
> Sander
> 
> > Thanks!
> 
> >> 
> >> [    1.074218] i8042: No controller found
> >> [    1.074510] mousedev: PS/2 mouse device common for all mice
> >> [    1.233365] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598
> >> [    1.233382] IP: [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
> >> [    1.233394] PGD 0
> >> [    1.233399] Oops: 0002 [#1] SMP
> >> [    1.233406] CPU 0
> >> [    1.233409] Modules linked in:
> >> [    1.233415]
> >> [    1.233419] Pid: 586, comm: kworker/0:1 Not tainted 3.2.0-rc7+ #1
> >> [    1.233427] RIP: e030:[<ffffffff8107a6c4>]  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
> >> [    1.233436] RSP: e02b:ffff88000ee07b20  EFLAGS: 00010002
> >> [    1.233441] RAX: ffff88000ecea000 RBX: ffffffff82729c80 RCX: 00005684b0256000
> >> [    1.233447] RDX: 0000000000000598 RSI: ffff88000ecea000 RDI: 0000000000000000
> >> [    1.233452] RBP: ffff88000ee07b20 R08: 0000000000000000 R09: 0000000000000001
> >> [    1.233458] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffd0
> >> [    1.233464] R13: 00000000000000ff R14: 0000000000000023 R15: 0000000000000014
> >> [    1.233472] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
> >> [    1.233479] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> [    1.233484] CR2: 0000000000000598 CR3: 0000000001e05000 CR4: 0000000000000660
> >> [    1.233490] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> [    1.233496] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >> [    1.233502] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
> >> [    1.233508] Stack:
> >> [    1.233511]  ffff88000ee07b30 ffffffff8107a72a ffff88000ee07b40 ffffffff8107a743
> >> [    1.233522]  ffff88000ee07b50 ffffffff81575250 ffff88000ee07b80 ffffffff815779c7
> >> [    1.233533]  ffffffff81e10500 00000000000000df 0000000000000020 ffffffff82729c80
> >> [    1.233545] Call Trace:
> >> [    1.233550]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
> >> [    1.233556]  [<ffffffff8107a743>] schedule_work+0x13/0x20
> >> [    1.233564]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
> >> [    1.233571]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
> >> [    1.233577]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
> >> [    1.233583]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
> >> [    1.234342]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
> >> [    1.234342]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
> >> [    1.234342]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
> >> [    1.234342]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
> >> [    1.234342]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
> >> [    1.234342]  [<ffffffff81079d20>] process_one_work+0x190/0x450
> >> [    1.234342]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
> >> [    1.234342]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
> >> [    1.234342]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
> >> [    1.234342]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
> >> [    1.234342]  [<ffffffff81081526>] kthread+0x96/0xa0
> >> [    1.234342]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
> >> [    1.234342]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
> >> [    1.234342]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
> >> [    1.234342]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
> >> [    1.234342] Code: 48 89 e5 48 89 ce 40 80 e6 00 83 e1 04 48 0f 45 c6 48 8b 70 08 65 8b 3c 25 b0 d9 00 00 e8 65 fc ff ff c9 c3 0f 1f 00 55 48 89 e5 <3e> 0f ba 2a 00 19 c9 31 c0 85 c9 74 07 c9 c3 0f 1f 44 00 00 e8
> >> [    1.234342] RIP  [<ffffffff8107a6c4>] queue_work_on+0x4/0x30
> >> [    1.234342]  RSP <ffff88000ee07b20>
> >> [    1.234342] CR2: 0000000000000598
> >> [    1.234342] ---[ end trace e13f105b060373ec ]---
> >> [    1.277121] BUG: unable to handle kernel paging request at fffffffffffffff8
> >> [    1.277130] IP: [<ffffffff81080f8b>] kthread_data+0xb/0x20
> >> [    1.277138] PGD 1e07067 PUD 1e08067 PMD 0
> >> [    1.277147] Oops: 0000 [#2] SMP
> >> [    1.277153] CPU 0
> >> [    1.277156] Modules linked in:
> >> [    1.277162]
> >> [    1.277166] Pid: 586, comm: kworker/0:1 Tainted: G      D      3.2.0-rc7+ #1
> >> [    1.277175] RIP: e030:[<ffffffff81080f8b>]  [<ffffffff81080f8b>] kthread_data+0xb/0x20
> >> [    1.277184] RSP: e02b:ffff88000ee07708  EFLAGS: 00010096
> >> [    1.277189] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> >> [    1.278053] RDX: ffff88000ffe7100 RSI: 0000000000000000 RDI: ffff88000edbbe80
> >> [    1.278053] RBP: ffff88000ee07708 R08: ffff88000edbbef0 R09: 0000000000000001
> >> [    1.278053] R10: 0000000000000800 R11: 0000000000000000 R12: ffff88000edbc1f0
> >> [    1.278053] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88000ee07840
> >> [    1.278053] FS:  0000000000000000(0000) GS:ffff88000ffd5000(0000) knlGS:0000000000000000
> >> [    1.278053] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> [    1.278053] CR2: fffffffffffffff8 CR3: 0000000001e05000 CR4: 0000000000000660
> >> [    1.278053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> [    1.278053] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >> [    1.278053] Process kworker/0:1 (pid: 586, threadinfo ffff88000ee06000, task ffff88000edbbe80)
> >> [    1.278053] Stack:
> >> [    1.278053]  ffff88000ee07728 ffffffff8107b050 ffff88000ee07728 ffff88000ffe7100
> >> [    1.278053]  ffff88000ee077c8 ffffffff818e1d7c ffffffff81066d79 0000000000000000
> >> [    1.278053]  ffff88000edbbe80 0000000000012100 ffff88000ee07fd8 ffff88000ee06010
> >> [    1.278053] Call Trace:
> >> [    1.278053]  [<ffffffff8107b050>] wq_worker_sleeping+0x10/0xa0
> >> [    1.278053]  [<ffffffff818e1d7c>] __schedule+0x54c/0x8b0
> >> [    1.278053]  [<ffffffff81066d79>] ? do_exit+0x519/0x850
> >> [    1.278053]  [<ffffffff810087af>] ? xen_restore_fl_direct_reloc+0x4/0x4
> >> [    1.278053]  [<ffffffff818e23ea>] schedule+0x3a/0x60
> >> [    1.278053]  [<ffffffff81066def>] do_exit+0x58f/0x850
> >> [    1.278053]  [<ffffffff81063d3d>] ? kmsg_dump+0xfd/0x140
> >> [    1.278053]  [<ffffffff818e5b67>] oops_end+0xc7/0x120
> >> [    1.278053]  [<ffffffff810640ff>] ? console_unlock+0x21f/0x290
> >> [    1.278053]  [<ffffffff81036285>] no_context+0xf5/0x270
> >> [    1.278053]  [<ffffffff8103654d>] __bad_area_nosemaphore+0x14d/0x220
> >> [    1.278053]  [<ffffffff8103662e>] bad_area_nosemaphore+0xe/0x10
> >> [    1.278053]  [<ffffffff818e8826>] do_page_fault+0x336/0x490
> >> [    1.278053]  [<ffffffff81007fed>] ? xen_force_evtchn_callback+0xd/0x10
> >> [    1.278053]  [<ffffffff810087c2>] ? check_events+0x12/0x20
> >> [    1.278053]  [<ffffffff818e50b5>] page_fault+0x25/0x30
> >> [    1.278053]  [<ffffffff8107a6c4>] ? queue_work_on+0x4/0x30
> >> [    1.278053]  [<ffffffff8107a72a>] queue_work+0x1a/0x20
> >> [    1.278053]  [<ffffffff8107a743>] schedule_work+0x13/0x20
> >> [    1.278053]  [<ffffffff81575250>] rtc_update_irq+0x10/0x20
> >> [    1.278053]  [<ffffffff815779c7>] cmos_checkintr+0x67/0x70
> >> [    1.278053]  [<ffffffff81577a1d>] cmos_irq_disable+0x4d/0x60
> >> [    1.278053]  [<ffffffff81578ad1>] ? cmos_set_alarm+0xc1/0x220
> >> [    1.278053]  [<ffffffff81578ade>] cmos_set_alarm+0xce/0x220
> >> [    1.278053]  [<ffffffff81574c43>] ? rtc_time_to_tm+0xe3/0x1b0
> >> [    1.278053]  [<ffffffff8157541b>] __rtc_set_alarm+0x9b/0xa0
> >> [    1.278053]  [<ffffffff81575899>] rtc_timer_do_work+0x1c9/0x1e0
> >> [    1.278053]  [<ffffffff81096127>] ? lock_acquire+0x97/0xb0
> >> [    1.278053]  [<ffffffff81079d20>] process_one_work+0x190/0x450
> >> [    1.278053]  [<ffffffff81079cbf>] ? process_one_work+0x12f/0x450
> >> [    1.278053]  [<ffffffff815756d0>] ? rtc_timer_start+0x80/0x80
> >> [    1.278053]  [<ffffffff8107cb21>] worker_thread+0x171/0x3a0
> >> [    1.278053]  [<ffffffff8107c9b0>] ? manage_workers+0x210/0x210
> >> [    1.278053]  [<ffffffff81081526>] kthread+0x96/0xa0
> >> [    1.278053]  [<ffffffff818ed774>] kernel_thread_helper+0x4/0x10
> >> [    1.278053]  [<ffffffff818eb7f8>] ? int_ret_from_sys_call+0x7/0x1b
> >> [    1.278053]  [<ffffffff818e4e45>] ? retint_restore_args+0x5/0x6
> >> [    1.278053]  [<ffffffff818ed770>] ? gs_change+0x13/0x13
> >> [    1.278053] Code: 55 65 48 8b 04 25 40 c4 00 00 48 8b 80 18 03 00 00 48 89 e5 8b 40 f0 c9 c3 0f 1f 80 00 00 00 00 48 8b 87 18 03 00 00 55 48 89 e5 <48> 8b 40 f8 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
> >> [    1.278053] RIP  [<ffffffff81080f8b>] kthread_data+0xb/0x20
> >> [    1.278053]  RSP <ffff88000ee07708>
> >> [    1.278053] CR2: fffffffffffffff8
> >> [    1.278053] ---[ end trace e13f105b060373ed ]---
> >> [    1.278053] Fixing recursive fault but reboot is needed!
> >> 
> >> --
> >> Sander
> 
