* 2.6.12-rc4-mm2 boot failure
From: Martin J. Bligh @ 2005-05-16 21:04 UTC
  To: Andrew Morton; +Cc: linux-kernel

PPC64 NUMA box. Maybe this is the same NUMA slab problem you were 
hitting before ...

Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=32 NUMA PSERIES LPAR
Modules linked in:
NIP: C000000000099624 XER: 00000000 LR: C00000000009A014 CTR: C00000000028C0D4
REGS: c00000000057ba10 TRAP: 0700   Not tainted  (2.6.12-rc4-mm2-autokern1)
MSR: 8000000000029032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11 CR: 24004022
DAR: 8000000000009032 DSISR: c0000000006c82bf
TASK: c0000000005e2100[0] 'swapper' THREAD: c000000000578000 CPU: 0
GPR00: 0000000000000001 C00000000057BC90 C0000000006C0568 C00000077FFD2590
GPR04: 0000000000000000 FFFFFFFFFFFFFFFF C0000000006C83D0 C0000000005E3A24
GPR08: C0000000005E3A18 0000000000000000 C0000000006C83C8 C0000000006C82E8
GPR12: 000000000000000A C0000000005CD000 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000230000 0000000003A10000 0000000000000060 0000000003F143C8
GPR24: C0000000005CD000 C0000000006BE208 C000000000577D68 0000000000008000
GPR28: 0000000000000000 00000000000080D0 C0000000005E2100 0000000000000001
NIP [c000000000099624] .interleave_nodes+0x38/0xd0
LR [c00000000009a014] .alloc_pages_current+0x100/0x134
Call Trace:
[c00000000057bc90] [000000000000001d] 0x1d (unreliable)
[c00000000057bd20] [c00000000009a014] .alloc_pages_current+0x100/0x134
[c00000000057bdc0] [c00000000007abd4] .get_zeroed_page+0x28/0x90
[c00000000057be40] [c0000000004e2e68] .pidmap_init+0x24/0xa0
[c00000000057bed0] [c0000000004c7734] .start_kernel+0x21c/0x30c
[c00000000057bf90] [c00000000000c010] .__setup_cpu_power3+0x0/0x4
Instruction dump:
fba1ffe8 fbc1fff0 f8010010 f821ff71 60000000 ebcd0160 a93e0788 793f0020
7fe9fe70 7d20fa78 7c004850 54000ffe <0b000000> 3ba30010 38bf0001 38800001
 <0>Kernel panic - not syncing: Attempted to kill the idle task!
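[Editorial aside, not part of the original mail.] When comparing a trace like
the one above against other reports, it can help to reduce it to a list of
symbol, offset and size triples. The following is a minimal sketch, assuming
the two-bracket frame layout seen in this log; the names FRAME_RE and
parse_call_trace are hypothetical, and the leading "." is stripped on the
assumption that it is only the ppc64 function entry-point prefix.

```python
import re

# One call-trace frame, as printed in the log above:
#   [stack pointer] [return address] .symbol+0xOFFSET/0xSIZE
# FRAME_RE and parse_call_trace are illustrative names, not an existing tool.
FRAME_RE = re.compile(
    r"^\[(?P<sp>[0-9a-f]{16})\]\s+\[(?P<addr>[0-9a-f]{16})\]\s+"
    r"(?P<sym>\S+?)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_call_trace(log_text):
    """Return (symbol, offset, size) for every resolvable frame in log_text."""
    frames = []
    for raw in log_text.splitlines():
        # Serial-console captures sometimes carry a literal "^M"; drop it.
        line = raw.replace("^M", "").strip()
        match = FRAME_RE.match(line)
        if match:
            frames.append(
                (
                    match["sym"].lstrip("."),  # assumed entry-point prefix
                    int(match["off"], 16),
                    int(match["size"], 16),
                )
            )
    return frames


sample = """\
[c00000000057bc90] [000000000000001d] 0x1d (unreliable)
[c00000000057bd20] [c00000000009a014] .alloc_pages_current+0x100/0x134^M
[c00000000057bdc0] [c00000000007abd4] .get_zeroed_page+0x28/0x90
"""

for symbol, offset, size in parse_call_trace(sample):
    print(f"{symbol} +{offset:#x} of {size:#x}")
```

Frames without a symbol, such as the "(unreliable)" entry, are skipped rather
than guessed at.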



* Re: 2.6.12-rc4-mm2 boot failure
From: Andrew Morton @ 2005-05-16 21:25 UTC
  To: Martin J. Bligh; +Cc: linux-kernel, Christoph Lameter

"Martin J. Bligh" <mbligh@mbligh.org> wrote:
>
> PPC64 NUMA box. Maybe this is the same NUMA slab problem you were 
> hitting before ...

Probably.  Christoph, this patch has crossed the grief threshold - I'll
drop it.


* Re: 2.6.12-rc4-mm2 boot failure
From: Martin J. Bligh @ 2005-05-16 21:36 UTC
  To: Andrew Morton; +Cc: linux-kernel, Christoph Lameter



--On Monday, May 16, 2005 14:25:04 -0700 Andrew Morton <akpm@osdl.org> wrote:

> "Martin J. Bligh" <mbligh@mbligh.org> wrote:
>> 
>> PPC64 NUMA box. Maybe this is the same NUMA slab problem you were 
>> hitting before ...
> 
> Probably.  Christoph, this patch has crossed the grief threshold - I'll
> drop it.

OK, fair enough. Christoph, I am interested in seeing your patch work 
... is something that's needed. If you want, I can help you offline 
with some testing on a variety of platforms.

M.



* Re: 2.6.12-rc4-mm2 boot failure
From: Christoph Lameter @ 2005-05-16 21:40 UTC
  To: Martin J. Bligh; +Cc: Andrew Morton, linux-kernel

On Mon, 16 May 2005, Martin J. Bligh wrote:

> --On Monday, May 16, 2005 14:25:04 -0700 Andrew Morton <akpm@osdl.org> wrote:
> 
> > "Martin J. Bligh" <mbligh@mbligh.org> wrote:
> >> 
> >> PPC64 NUMA box. Maybe this is the same NUMA slab problem you were 
> >> hitting before ...
> > 
> > Probably.  Christoph, this patch has crossed the grief threshold - I'll
> > drop it.
> 
> OK, fair enough. Christoph, I am interested in seeing your patch work 
> ... is something that's needed. If you want, I can help you offline 
> with some testing on a variety of platforms.

Some description of the failure would be helpful. A boot log? .config?

Does the box have CONFIG_NUMA off and CONFIG_DISCONTIG on?



* Re: 2.6.12-rc4-mm2 boot failure
From: Martin J. Bligh @ 2005-05-16 22:14 UTC
  To: Christoph Lameter; +Cc: Andrew Morton, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 932 bytes --]

--On Monday, May 16, 2005 14:40:57 -0700 Christoph Lameter <clameter@engr.sgi.com> wrote:

> On Mon, 16 May 2005, Martin J. Bligh wrote:
> 
>> --On Monday, May 16, 2005 14:25:04 -0700 Andrew Morton <akpm@osdl.org> wrote:
>> 
>> > "Martin J. Bligh" <mbligh@mbligh.org> wrote:
>> >> 
>> >> PPC64 NUMA box. Maybe this is the same NUMA slab problem you were 
>> >> hitting before ...
>> > 
>> > Probably.  Christoph, this patch has crossed the grief threshold - I'll
>> > drop it.
>> 
>> OK, fair enough. Christoph, I am interested in seeing your patch work 
>> ... is something that's needed. If you want, I can help you offline 
>> with some testing on a variety of platforms.
> 
> Some description of the failure would be helpful. A boot log? .config?
> 
> Does the box have CONFIG_NUMA off and CONFIG_DISCONTIG on?

Attached boot log. Config file is here:

http://ftp.kernel.org/pub/linux/kernel/people/mbligh/config/abat/p570

M.

[-- Attachment #2: boot_failure.log --]
[-- Type: application/octet-stream, Size: 14565 bytes --]


* Re: 2.6.12-rc4-mm2 boot failure
From: Martin J. Bligh @ 2005-05-17 23:22 UTC
  To: Andrew Morton; +Cc: linux-kernel, Christoph Lameter



--On Monday, May 16, 2005 14:36:21 -0700 "Martin J. Bligh" <mbligh@mbligh.org> wrote:

> 
> 
> --On Monday, May 16, 2005 14:25:04 -0700 Andrew Morton <akpm@osdl.org> wrote:
> 
>> "Martin J. Bligh" <mbligh@mbligh.org> wrote:
>>> 
>>> PPC64 NUMA box. Maybe this is the same NUMA slab problem you were 
>>> hitting before ...
>> 
>> Probably.  Christoph, this patch has crossed the grief threshold - I'll
>> drop it.
> 
> OK, fair enough. Christoph, I am interested in seeing your patch work 
> ... is something that's needed. If you want, I can help you offline 
> with some testing on a variety of platforms.

OK, I backed out the slab patches from -mm2, and confirmed the problem 
went away.

M.



* Re: 2.6.12-rc4-mm2 boot failure
From: Christoph Lameter @ 2005-05-18  1:07 UTC
  To: Martin J. Bligh; +Cc: Andrew Morton, linux-kernel

On Tue, 17 May 2005, Martin J. Bligh wrote:

> > OK, fair enough. Christoph, I am interested in seeing your patch work 
> > ... is something that's needed. If you want, I can help you offline 
> > with some testing on a variety of platforms.
> 
> OK, I backed out the slab patches from -mm2, and confirmed the problem 
> went away.

Is there any way I can access the system to figure out what is wrong? The 
failure is in the page allocator and it seems that a node id is wrong.
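
[Editorial aside, not part of the original mail.] For readers unfamiliar with
interleaved allocation, the sketch below shows in rough terms how a
round-robin node cursor can trip a sanity check when it holds an invalid
node id. It is a simplified model and not the kernel's code: the function
name interleave_next, the MAX_NUMNODES value and the exception are all
assumptions made for illustration, and the thread does not state which check
actually fired in interleave_nodes.

```python
MAX_NUMNODES = 16  # hypothetical limit, chosen only for this illustration


def interleave_next(cursor, allowed):
    """Model one step of round-robin interleaving over NUMA nodes.

    cursor  -- node id the next allocation should come from
    allowed -- non-empty set of node ids the policy may use
    Returns (node_to_use, new_cursor).
    """
    # A cursor outside the valid range means some earlier code handed the
    # allocator a bad node id; this model raises where a kernel would trap.
    if not 0 <= cursor < MAX_NUMNODES:
        raise ValueError(f"node id {cursor} out of range")

    # Advance to the next allowed node, wrapping to the lowest one.
    later = sorted(n for n in allowed if n > cursor)
    new_cursor = later[0] if later else min(allowed)
    return cursor, new_cursor


cursor = 0
for _ in range(4):
    node, cursor = interleave_next(cursor, {0, 1, 3})
    print("allocate on node", node)
```

In this model a cursor that was never initialized, or that was set before
the set of valid nodes was known, fails on the very first allocation, which
would be consistent with a crash this early in start_kernel.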



* Re: 2.6.12-rc4-mm2 boot failure
From: Martin J. Bligh @ 2005-05-18  5:06 UTC
  To: Christoph Lameter; +Cc: Andrew Morton, linux-kernel



--Christoph Lameter <clameter@engr.sgi.com> wrote (on Tuesday, May 17, 2005 18:07:16 -0700):

> On Tue, 17 May 2005, Martin J. Bligh wrote:
> 
>> > OK, fair enough. Christoph, I am interested in seeing your patch work 
>> > ... is something that's needed. If you want, I can help you offline 
>> > with some testing on a variety of platforms.
>> 
>> OK, I backed out the slab patches from -mm2, and confirmed the problem 
>> went away.
> 
> Is there any way I can access the system to figure out what is wrong? The 
> failure is in the page allocator and it seems that a node id is wrong.

Not really - IBM doesn't tend to like letting outside parties into their
network ;-) I think OSDL might have some power boxes now ... maybe it
fails on there?

M.


