From: Andy Whitcroft <apw@shadowen.org>
To: Satyam Sharma <satyam.sharma@gmail.com>
Cc: mel@csn.ul.ie, linux-kernel@vger.kernel.org,
Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>,
linuxppc-dev@ozlabs.org, paulus@samba.org, anton@samba.org,
Balbir Singh <balbir@linux.vnet.ibm.com>
Subject: Re: [BUG][2.6.23-rc6] Badness at arch/powerpc/kernel/smp.c:202
Date: Fri, 14 Sep 2007 12:03:41 +0100
Message-ID: <20070914110341.GA4168@shadowen.org>
In-Reply-To: <a781481a0709140337u2c26f770h2af6d457bc39f280@mail.gmail.com>
Anton, this seems a little reminiscent of that bug which popped up in
2.6.23-rc3 to do with SLB loading (if memory serves), with machine
checks and signal 7's. Of course that is _supposed_ to be fixed by this
time ...
I believe it was Paul who fixed up that one, and he is already copied.
-apw
On Fri, Sep 14, 2007 at 04:07:37PM +0530, Satyam Sharma wrote:
> > With 2.6.23-rc6 running on the ppc64 box, following oops is hit
> >
> > Oops: Machine check, sig: 7 [#1]
> >
> > SMP NR_CPUS=128 pSeries
> >
> > Modules linked in: binfmt_misc ipv6 dm_mod ehci_hcd ohci_hcd usbcore
> >
> > NIP: c0000000000ed560 LR: c0000000000efc7c CTR: c0000000000ed504
> >
> > REGS: c00000000ffef680 TRAP: 0200 Not tainted (2.6.23-rc6-autokern1)
> >
> > MSR: 8000000000109032 <EE,ME,IR,DR> CR: 28002042 XER: 00000010
> >
> > TASK = c0000000ecf9f000[0] 'swapper' THREAD: c00000000ff8c000 CPU: 2
> >
> > GPR00: 0000000000000000 c00000000ffef900 c0000000006fe598 c0000000d7a8f200
> >
> > GPR04: 0000000000001000 0000000000000000 0000000000001000 8000000000c26393
> >
> > GPR08: c0000000006b43d0 0000000000000001 0000000000001000 0000000000000000
> >
> > GPR12: 0000000048000048 c0000000005f1700 0000000000000000 0000000007a8dcd0
> >
> > GPR16: 0000000000000002 0000000000000000 0000000000000000 0000000000000000
> >
> > GPR20: 0000000000000000 0000000000001000 0000000000001000 0000000000000000
> >
> > GPR24: 0000000000000000 0000000000000000 0000000000001000 c0000000063234e8
> >
> > GPR28: 0000000000001000 0000000000000000 c000000000689c08 c00000000ff3a480
> >
> > NIP [c0000000000ed560] .end_bio_bh_io_sync+0x5c/0xac
> >
> > LR [c0000000000efc7c] .bio_endio+0xb4/0xd4
> >
> > Call Trace:
> >
> > [c00000000ffef900] [c00000000ffef990] 0xc00000000ffef990 (unreliable)
> >
> > [c00000000ffef980] [c0000000000efc7c] .bio_endio+0xb4/0xd4
> >
> > [c00000000ffefa10] [c000000000290060] .__end_that_request_first+0x154/0x548
> >
> > [c00000000ffefae0] [c00000000035af10] .scsi_end_request+0x40/0x138
> >
> > [c00000000ffefb80] [c00000000035b234] .scsi_io_completion+0x188/0x454
> >
> > [c00000000ffefc60] [c000000000372a24] .sd_rw_intr+0x2e4/0x338
> >
> > [c00000000ffefd30] [c000000000354548] .scsi_finish_command+0xbc/0xe0
> >
> > [c00000000ffefdc0] [c00000000035bdf0] .scsi_softirq_done+0x140/0x188
> >
> > [c00000000ffefe60] [c000000000293184] .blk_done_softirq+0xa0/0xd0
> >
> > [c00000000ffefef0] [c000000000055e1c] .__do_softirq+0xa8/0x164
> >
> > [c00000000ffeff90] [c000000000023f14] .call_do_softirq+0x14/0x24
> >
> > [c00000000ff8f960] [c00000000000bd30] .do_softirq+0x68/0xac
> >
> > [c00000000ff8f9f0] [c000000000055f70] .irq_exit+0x54/0x6c
> >
> > [c00000000ff8fa70] [c00000000000c358] .do_IRQ+0x170/0x1ac
> >
> > [c00000000ff8fb00] [c000000000004780] hardware_interrupt_entry+0x18/0x98
> >
> > --- Exception: 501 at .pseries_dedicated_idle_sleep+0xe0/0x194
> >
> > LR = .pseries_dedicated_idle_sleep+0xd0/0x194
> >
> > [c00000000ff8fdf0] [0000000000000000] .__start+0x4000000000000000/0x8 (unreliable)
> >
> > [c00000000ff8fe80] [c000000000010bd4] .cpu_idle+0x104/0x1d8
> >
> > [c00000000ff8ff00] [c00000000002672c] .start_secondary+0x160/0x184
> >
> > [c00000000ff8ff90] [c000000000008364] .start_secondary_prolog+0xc/0x10
> >
> > Instruction dump:
> >
> > 409a0030 393f0018 38000080 7d6048a8 7d6b0378 7d6049ad 40a2fff4 38002000
> >
> > 7d2018a8 7d290378 7d2019ad 40a2fff4 <e9230038> e89f0018 e9690000 f8410028
> >
> > Kernel panic - not syncing: Fatal exception in interrupt
>
> This oops is the real bug here, but is that a machine check exception?
> If so, it could be a hardware failure what you saw there instead, and not
> really a kernel bug ...
Thread overview:
2007-09-14 10:05 [BUG][2.6.23-rc6] Badness at arch/powerpc/kernel/smp.c:202 Kamalesh Babulal
2007-09-14 10:37 ` Satyam Sharma
2007-09-14 11:03 ` Andy Whitcroft [this message]
2007-09-17 23:43 ` [PATCH] powerpc: Avoid pointless WARN_ON(irqs_disabled()) from panic codepath Satyam Sharma
2007-09-18 0:08 ` Satyam Sharma
2007-09-18 1:37 ` Randy Dunlap
2007-09-18 1:46 ` Josh Boyer
2007-09-19 13:45 ` Satyam Sharma
2007-09-19 15:29 ` Randy Dunlap
2007-09-20 8:25 ` Kamalesh Babulal