* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 13:12 ` [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc () Kamalesh Babulal
@ 2008-03-04 14:40 ` Michael Neuling
2008-03-04 18:33 ` Andrew Morton
2008-03-04 18:36 ` Andrew Morton
2008-03-05 8:22 ` Benjamin Herrenschmidt
2 siblings, 1 reply; 33+ messages in thread
From: Michael Neuling @ 2008-03-04 14:40 UTC (permalink / raw)
To: Kamalesh Babulal; +Cc: linuxppc-dev, Andrew Morton, linux-kernel
In message <47CD4AB3.3080409@linux.vnet.ibm.com> you wrote:
> Hi Andrew,
>
> The 2.6.25-rc3-mm1 kernel panics while bootup on power box. The machine booted up
> without the panic on the third attempt, but badness call trace were seen while running
> tests
>
> 1) The kernel panic on first attempt
>
> Unable to handle kernel paging request for data at address 0x00000000
> Faulting instruction address: 0xc00000000000cb2c
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=128 NUMA pSeries
> Modules linked in:
> NIP: c00000000000cb2c LR: c00000000000caf8 CTR: 0000000000000226
> REGS: c00000000068f360 TRAP: 0300 Not tainted (2.6.25-rc3-mm1-autotest)
> MSR: 8000000000001032 <ME,IR,DR> CR: 28000024 XER: 20000001
> DAR: 0000000000000000, DSISR: 0000000040000000
> TASK = c0000000005c8590[0] 'swapper' THREAD: c00000000068c000 CPU: 0
> GPR00: c00000000068f5e0 c00000000068f5e0 c00000000068e690 0000000000000000
> GPR04: 00000000000035e0 000000000087264e c000000008011280 c000000000594000
> GPR08: c0000000005c9300 0000000000000000 c000000000591090 c00000000068c000
> GPR12: 8000000000009032 c0000000005c9300 0000000000000000 0000000000000000
> GPR16: 0000000000000000 0000000000000000 0000000000008000 0000000000000000
> GPR20: 0000000000000000 0000000000000000 000000000000007f 0000000000018000
> GPR24: 0000000000000001 0000000000000080 0000000000000018 0000000000000000
> GPR28: 0000000000000c00 c000000000588988 c000000000639be8 c000000008001c00
> NIP [c00000000000cb2c] .do_IRQ+0x74/0x1c4
> LR [c00000000000caf8] .do_IRQ+0x40/0x1c4
> Call Trace:
> [c00000000068f5e0] [c00000000000caf8] .do_IRQ+0x40/0x1c4 (unreliable)
> [c00000000068f680] [c000000000004790] hardware_interrupt_entry+0x18/0x1c
> --- Exception: 501 at .memset+0x70/0xfc
> LR = .__alloc_bootmem_core+0x39c/0x3dc
> [c00000000068f970] [c00000000068fa10] init_thread_union+0x3a10/0x4000 (unreliable)
> [c00000000068fa30] [c00000000057237c] .__alloc_bootmem_node+0x38/0x8c
> [c00000000068fad0] [c0000000003c477c] .zone_wait_table_init+0x74/0x108
> [c00000000068fb60] [c0000000003d9058] .init_currently_empty_zone+0x40/0x11c
> [c00000000068fc00] [c0000000003d94c8] .free_area_init_node+0x394/0x3fc
> [c00000000068fcf0] [c00000000057314c] .free_area_init_nodes+0x2d8/0x364
> [c00000000068fd90] [c00000000056682c] .paging_init+0x40/0x58
> [c00000000068fe40] [c00000000055ba34] .setup_arch+0x20c/0x240
> [c00000000068fee0] [c000000000552690] .start_kernel+0xdc/0x414
> [c00000000068ff90] [c000000000008594] .start_here_common+0x54/0xc0
> Instruction dump:
> 7c200b78 780404a0 2ba408ff 41bd001c e87e80a8 3884ff00 48058d21 60000000
> 480054cd 60000000 e93e80b0 e92900b8 <e8090000> f8410028 e9690010 e8490008
I'm not getting a crash but I am getting this:
start_kernel(): bug: interrupts were enabled *very* early, fixing it
...and you're getting a null pointer access here (in do_IRQ):
irq = ppc_md.get_irq();
Are we somehow enabling interrupts before we've set up ppc_md.get_irq?
Mikey
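
A minimal user-space sketch of the ordering problem raised above, not the actual kernel code: a hook table whose get_irq entry is still NULL when the first interrupt arrives. Only ppc_md.get_irq comes from the thread; machdep_calls_sketch, md_sketch, fake_get_irq, platform_init_irq_sketch, do_irq_sketch and NO_IRQ_SKETCH are hypothetical names. The line quoted above calls the hook directly, so the NULL check here is an assumption added only to make the ordering visible.

```c
#include <assert.h>
#include <stddef.h>

#define NO_IRQ_SKETCH (-1)  /* hypothetical "nothing to report" value */

/* Hypothetical stand-in for the platform hook table (ppc_md in the thread). */
struct machdep_calls_sketch {
    int (*get_irq)(void);   /* installed by platform setup code */
};

/* Zero-initialised, so get_irq starts out as NULL. */
static struct machdep_calls_sketch md_sketch;

/* Hypothetical interrupt-controller query. */
static int fake_get_irq(void)
{
    return 42;
}

/* Models the point in boot where the platform installs its hooks. */
static void platform_init_irq_sketch(void)
{
    md_sketch.get_irq = fake_get_irq;
}

/*
 * Models the interrupt entry path. Calling md_sketch.get_irq() without
 * the check would dereference NULL if an interrupt is taken too early.
 */
static int do_irq_sketch(void)
{
    if (md_sketch.get_irq == NULL)
        return NO_IRQ_SKETCH;       /* interrupt arrived before setup */
    return md_sketch.get_irq();
}

/* Usage: the same interrupt is harmless or fatal depending on ordering. */
void demo_early_irq(void)
{
    assert(do_irq_sketch() == NO_IRQ_SKETCH);   /* before setup */
    platform_init_irq_sketch();
    assert(do_irq_sketch() == 42);              /* after setup */
}
```

The sketch only shows why the order of "install hook" and "enable interrupts" matters; it says nothing about where the early enable actually happens.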
>
> 2) The kernel panic on second attempt
>
> Unable to handle kernel paging request for data at address 0x00000000
> Faulting instruction address: 0xc00000000000cb2c
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=128 NUMA pSeries
> Modules linked in:
> NIP: c00000000000cb2c LR: c00000000000caf8 CTR: 0000000000014a99
> REGS: c00000000068f410 TRAP: 0300 Not tainted (2.6.25-rc3-mm1-autotest)
> MSR: 8000000000001032 <ME,IR,DR> CR: 28000044 XER: 00000001
> DAR: 0000000000000000, DSISR: 0000000040000000
> TASK = c0000000005c8590[0] 'swapper' THREAD: c00000000068c000 CPU: 0
> GPR00: c00000000068f690 c00000000068f690 c00000000068e690 0000000000000000
> GPR04: 0000000000003690 0000000000537672 c000000001ad59c0 c000000000594000
> GPR08: c0000000005c9300 0000000000000000 c000000000591090 c00000000068c000
> GPR12: 8000000000009032 c0000000005c9300 0000000000000000 0000000000000000
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR20: 0000000000230000 0000000000000000 0000000000ffffff 0000000001000000
> GPR24: 0000000000001000 0000000001000000 0000000000001000 0000000000000000
> GPR28: 0000000000000000 c0000000005889c8 c000000000639be8 c000000001000000
> NIP [c00000000000cb2c] .do_IRQ+0x74/0x1c4
> LR [c00000000000caf8] .do_IRQ+0x40/0x1c4
> Call Trace:
> [c00000000068f690] [c00000000000caf8] .do_IRQ+0x40/0x1c4 (unreliable)
> [c00000000068f730] [c000000000004790] hardware_interrupt_entry+0x18/0x1c
> --- Exception: 501 at .memset+0x80/0xfc
> LR = .__alloc_bootmem_core+0x39c/0x3dc
> [c00000000068fa20] [c000000000641a78] sysctl_pernet_ops+0x108e0/0x1d6e0 (unreliable)
> [c00000000068fae0] [c00000000057237c] .__alloc_bootmem_node+0x38/0x8c
> [c00000000068fb80] [c0000000003c48dc] .__earlyonly_bootmem_alloc+0x24/0x3c
> [c00000000068fc00] [c0000000003d885c] .vmemmap_populate+0x7c/0xf4
> [c00000000068fc90] [c0000000003d9b6c] .sparse_mem_map_populate+0x38/0x64
> [c00000000068fd10] [c000000000573ec4] .sparse_early_mem_map_alloc+0x54/0x98
> [c00000000068fda0] [c000000000573f70] .sparse_init+0x68/0x148
> [c00000000068fe40] [c00000000055b9ec] .setup_arch+0x1c4/0x240
> [c00000000068fee0] [c000000000552690] .start_kernel+0xdc/0x414
> [c00000000068ff90] [c000000000008594] .start_here_common+0x54/0xc0
> Instruction dump:
> 7c200b78 780404a0 2ba408ff 41bd001c e87e80a8 3884ff00 48058d21 60000000
> 480054cd 60000000 e93e80b0 e92900b8 <e8090000> f8410028 e9690010 e8490008
>
> 3) Third attempt kernel booted up but had the following call trace 264 times while running
> test
>
> Badness at include/linux/gfp.h:110
> NIP: c0000000000b4ff0 LR: c0000000000b4fa0 CTR: c00000000019cdb4
> REGS: c000000009edf250 TRAP: 0700 Not tainted (2.6.25-rc3-mm1-autotest)
> MSR: 8000000000029032 <EE,ME,IR,DR> CR: 22024042 XER: 20000003
> TASK = c000000009062140[548] 'kjournald' THREAD: c000000009edc000 CPU: 0
> NIP [c0000000000b4ff0] .get_page_from_freelist+0x29c/0x898
> LR [c0000000000b4fa0] .get_page_from_freelist+0x24c/0x898
> Call Trace:
> [c000000009edf5f0] [c0000000000b56e4] .__alloc_pages_internal+0xf8/0x470
> [c000000009edf6e0] [c0000000000e0458] .kmem_getpages+0x8c/0x194
> [c000000009edf770] [c0000000000e1050] .fallback_alloc+0x194/0x254
> [c000000009edf820] [c0000000000e14b0] .kmem_cache_alloc+0xd8/0x144
> [c000000009edf8c0] [c0000000001fe0f8] .radix_tree_preload+0x50/0xd4
> [c000000009edf960] [c0000000000ad048] .add_to_page_cache+0x38/0x12c
> [c000000009edfa00] [c0000000000ad158] .add_to_page_cache_lru+0x1c/0x4c
> [c000000009edfa90] [c0000000000add58] .find_or_create_page+0x60/0xa8
> [c000000009edfb30] [c00000000011e478] .__getblk+0x140/0x310
> [c000000009edfc00] [c0000000001b78c4] .journal_get_descriptor_buffer+0x44/0xd8
> [c000000009edfca0] [c0000000001b236c] .journal_commit_transaction+0x948/0x1590
> [c000000009edfe00] [c0000000001b585c] .kjournald+0xf4/0x2ac
> [c000000009edff00] [c00000000007ff4c] .kthread+0x84/0xd0
> [c000000009edff90] [c000000000028900] .kernel_thread+0x4c/0x68
> Instruction dump:
> 7dc57378 48009575 60000000 2fa30000 419e0490 56c902d8 3c000018 7dd907b4
> 7ad2c7e2 7f890000 7c000026 5400fffe <0b000000> e93e8128 3b000000 80090000
> --
> Thanks & Regards,
> Kamalesh Babulal,
> Linux Technology Center,
> IBM, ISTL.
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 14:40 ` Michael Neuling
@ 2008-03-04 18:33 ` Andrew Morton
2008-03-05 8:23 ` Benjamin Herrenschmidt
2008-03-06 0:03 ` Benjamin Herrenschmidt
0 siblings, 2 replies; 33+ messages in thread
From: Andrew Morton @ 2008-03-04 18:33 UTC (permalink / raw)
To: Michael Neuling
Cc: Matthew Wilcox, linuxppc-dev, linux-kernel, Kamalesh Babulal
On Tue, 04 Mar 2008 15:40:56 +0100 Michael Neuling <mikey@neuling.org> wrote:
> In message <47CD4AB3.3080409@linux.vnet.ibm.com> you wrote:
> > Hi Andrew,
> >
> > The 2.6.25-rc3-mm1 kernel panics while bootup on power box. The machine booted up
> > without the panic on the third attempt, but badness call trace were seen while running
> > tests
> >
> > 1) The kernel panic on first attempt
> >
> > Unable to handle kernel paging request for data at address 0x00000000
> > Faulting instruction address: 0xc00000000000cb2c
> > Oops: Kernel access of bad area, sig: 11 [#1]
> > SMP NR_CPUS=128 NUMA pSeries
> > Modules linked in:
> > NIP: c00000000000cb2c LR: c00000000000caf8 CTR: 0000000000000226
> > REGS: c00000000068f360 TRAP: 0300 Not tainted (2.6.25-rc3-mm1-autotest)
> > MSR: 8000000000001032 <ME,IR,DR> CR: 28000024 XER: 20000001
> > DAR: 0000000000000000, DSISR: 0000000040000000
> > TASK = c0000000005c8590[0] 'swapper' THREAD: c00000000068c000 CPU: 0
> > GPR00: c00000000068f5e0 c00000000068f5e0 c00000000068e690 0000000000000000
> > GPR04: 00000000000035e0 000000000087264e c000000008011280 c000000000594000
> > GPR08: c0000000005c9300 0000000000000000 c000000000591090 c00000000068c000
> > GPR12: 8000000000009032 c0000000005c9300 0000000000000000 0000000000000000
> > GPR16: 0000000000000000 0000000000000000 0000000000008000 0000000000000000
> > GPR20: 0000000000000000 0000000000000000 000000000000007f 0000000000018000
> > GPR24: 0000000000000001 0000000000000080 0000000000000018 0000000000000000
> > GPR28: 0000000000000c00 c000000000588988 c000000000639be8 c000000008001c00
> > NIP [c00000000000cb2c] .do_IRQ+0x74/0x1c4
> > LR [c00000000000caf8] .do_IRQ+0x40/0x1c4
> > Call Trace:
> > [c00000000068f5e0] [c00000000000caf8] .do_IRQ+0x40/0x1c4 (unreliable)
> > [c00000000068f680] [c000000000004790] hardware_interrupt_entry+0x18/0x1c
> > --- Exception: 501 at .memset+0x70/0xfc
> > LR = .__alloc_bootmem_core+0x39c/0x3dc
> > [c00000000068f970] [c00000000068fa10] init_thread_union+0x3a10/0x4000 (unreliable)
> > [c00000000068fa30] [c00000000057237c] .__alloc_bootmem_node+0x38/0x8c
> > [c00000000068fad0] [c0000000003c477c] .zone_wait_table_init+0x74/0x108
> > [c00000000068fb60] [c0000000003d9058] .init_currently_empty_zone+0x40/0x11c
> > [c00000000068fc00] [c0000000003d94c8] .free_area_init_node+0x394/0x3fc
> > [c00000000068fcf0] [c00000000057314c] .free_area_init_nodes+0x2d8/0x364
> > [c00000000068fd90] [c00000000056682c] .paging_init+0x40/0x58
> > [c00000000068fe40] [c00000000055ba34] .setup_arch+0x20c/0x240
> > [c00000000068fee0] [c000000000552690] .start_kernel+0xdc/0x414
> > [c00000000068ff90] [c000000000008594] .start_here_common+0x54/0xc0
> > Instruction dump:
> > 7c200b78 780404a0 2ba408ff 41bd001c e87e80a8 3884ff00 48058d21 60000000
> > 480054cd 60000000 e93e80b0 e92900b8 <e8090000> f8410028 e9690010 e8490008
>
> I'm not getting a crash but I am getting this:
>
> start_kernel(): bug: interrupts were enabled *very* early, fixing it
>
> ...and you're getting a null pointer access here (in do_IRQ):
>
> irq = ppc_md.get_irq();
>
> Are we somehow enabling interrupts before we've setup ppc_md.get_irq?
>
Yes, we are - it's the semaphore rewrite which is doing this in
start_kernel(). It's being discussed.
Enabling interrupts too early was discovered to be fatal on powerpc
years ago. It looks like that remains the case.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 18:33 ` Andrew Morton
@ 2008-03-05 8:23 ` Benjamin Herrenschmidt
2008-03-06 0:03 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 33+ messages in thread
From: Benjamin Herrenschmidt @ 2008-03-05 8:23 UTC (permalink / raw)
To: Andrew Morton
Cc: Matthew Wilcox, linuxppc-dev, Michael Neuling, linux-kernel,
Kamalesh Babulal
On Tue, 2008-03-04 at 10:33 -0800, Andrew Morton wrote:
> > Are we somehow enabling interrupts before we've setup ppc_md.get_irq?
> >
>
> Yes, we are - it's the semaphore rewrite which is doing this in
> start_kernel(). It's being discussed.
>
> Enabling interrupts too early on powerpc was discovered to be fatal on
> powerpc years ago. It looks like that remains the case.
Yes, it is and will probably always be. All that semaphore mucking
around that hard-enables interrupts is just asking for trouble (and on
more than just powerpc... heh, what do you do if your main interrupt
controller hasn't even been initialized yet?)
Ben.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 18:33 ` Andrew Morton
2008-03-05 8:23 ` Benjamin Herrenschmidt
@ 2008-03-06 0:03 ` Benjamin Herrenschmidt
2008-03-06 0:44 ` Andrew Morton
1 sibling, 1 reply; 33+ messages in thread
From: Benjamin Herrenschmidt @ 2008-03-06 0:03 UTC (permalink / raw)
To: Andrew Morton
Cc: Matthew Wilcox, linuxppc-dev, Michael Neuling, linux-kernel,
Kamalesh Babulal
> Yes, we are - it's the semaphore rewrite which is doing this in
> start_kernel(). It's being discussed.
>
> Enabling interrupts too early on powerpc was discovered to be fatal on
> powerpc years ago. It looks like that remains the case.
Regarding these issues: I could make it non-fatal and just WARN_ON,
provided that I have a way to differentiate legal vs. illegal calls
to local_irq_enable(). We already have that function mostly out of
line in C code due to our lazy irq disabling scheme, so the overhead of
testing some global kernel state would be minimal here.
However, I don't see anything around init/main.c:start_kernel() that I
can use. What do you reckon we should do here? Add some kind of global
we set before calling local_irq_enable()? Or make early_boot_irqs_on()
do that generically?
It's currently defined as an empty inline without CONFIG_TRACE_IRQFLAGS
but we could make it set a flag instead.
I'm pretty sure other archs have similar problems, especially in the
embedded world where you are booted with random junk firmwares that may
leave devices, interrupt controllers etc... in random state, and
enabling incoming IRQs before the arch code properly initializes the
main interrupt controller can be fatal. I know of at least one ARM board
I worked on a while ago that had similar issues.
On ppc32, unfortunately, our local_irq_enable/restore are nice inlines
that whack the appropriate MSR bits directly, thus adding a test for a
global flag would add some bloat/overhead that I'd like to avoid, at
least until we decide to also do lazy disabling on those, if ever...
Cheers,
Ben.
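
The flag idea above could be sketched roughly as follows. This is a user-space model under stated assumptions, not kernel code: every name ending in _sketch and the early_enable_warnings counter are hypothetical, the counter stands in for WARN_ON, and refusing the early enable (rather than only warning) is an assumption made so the effect is observable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical global: set once the arch has set up its interrupt controller. */
static bool irqs_allowed_sketch;

/* Models the CPU's interrupt-enable state. */
static bool irqs_enabled_sketch;

/* Counts early enables; stands in for a WARN_ON() firing. */
static unsigned int early_enable_warnings;

/* Stand-in for early_boot_irqs_on() setting a flag instead of being empty. */
static void early_boot_irqs_on_sketch(void)
{
    irqs_allowed_sketch = true;
}

/* Stand-in for an out-of-line local_irq_enable() that tests the flag. */
static void local_irq_enable_sketch(void)
{
    if (!irqs_allowed_sketch) {
        early_enable_warnings++;    /* illegal early call: warn ... */
        return;                     /* ... and keep interrupts off */
    }
    irqs_enabled_sketch = true;
}

/* Usage: one early (illegal) call, then one legal call. */
void demo_early_boot_flag(void)
{
    local_irq_enable_sketch();
    assert(!irqs_enabled_sketch && early_enable_warnings == 1);

    early_boot_irqs_on_sketch();
    local_irq_enable_sketch();
    assert(irqs_enabled_sketch && early_enable_warnings == 1);
}
```

The cost is one load and one branch per enable, which is why it only looks cheap where the enable path is already out of line.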
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-06 0:03 ` Benjamin Herrenschmidt
@ 2008-03-06 0:44 ` Andrew Morton
2008-03-06 0:52 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 33+ messages in thread
From: Andrew Morton @ 2008-03-06 0:44 UTC (permalink / raw)
To: benh; +Cc: willy, linuxppc-dev, mikey, linux-kernel, kamalesh
On Thu, 06 Mar 2008 11:03:31 +1100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
> > Yes, we are - it's the semaphore rewrite which is doing this in
> > start_kernel(). It's being discussed.
> >
> > Enabling interrupts too early on powerpc was discovered to be fatal on
> > powerpc years ago. It looks like that remains the case.
>
> Regarding these issues. I could make it non fatal and just WARN_ON,
> provided that I have a way to differentiate legal vs. illegal calls
> to local_irq_enable().
And local_irq_restore() and various other things.
> We already have that function mostly out of
> line in C code due to our lazy irq disabling scheme, so the overhead of
> testing some global kernel state would be minimum here.
>
> However, I don't see anything around init/main.c:start_kernel() that I
> can use. What do you reckon here we should do ? Add some kind of global
> we set before calling local_irq_enable() ? Or make early_boot_irqs_on()
> do that generically
>
> It's currently defined as an empty inline without CONFIG_TRACE_IRQFLAGS
> but we could make it set a flag instead.
>
> I'm pretty sure other archs have similar problems, especially in the
> embedded world where you are booted with random junk firmwares that may
> leave devices, interrupt controllers etc... in random state, and
> enabling incoming IRQs before the arch code properly initializes the
> main interrupt controller can be fatal. I know at least of an ARM board
> I worked on a while ago that had a similar issues.
>
> On ppc32, unfortunately, our local_irq_enable/restore are nice inlines
> that whack the appropriate MSR bits directly, thus adding a test for a
> global flag would add some bloat/overhead that I'd like to avoid, at
> least until we decide to also do lazy disabling on those, if ever...
I'd have thought that the way to do this would be to add it to lockdep -
lockdep already has all the infrastructure and code sites to do this.
Set some special flag saying its-ok-to-enable-interrupts-now and test that
in lockdep.
akpm:/usr/src/25> grep LOCKDEP arch/powerpc/Kconfig
akpm:/usr/src/25>
losers ;)
Still, doing it for
akpm:/usr/src/25> grep -l LOCKDEP arch/*/Kconfig
arch/arm/Kconfig
arch/avr32/Kconfig
arch/mips/Kconfig
arch/s390/Kconfig
arch/sh/Kconfig
arch/sparc64/Kconfig
arch/um/Kconfig
arch/x86/Kconfig
should give pretty good coverage.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-06 0:44 ` Andrew Morton
@ 2008-03-06 0:52 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 33+ messages in thread
From: Benjamin Herrenschmidt @ 2008-03-06 0:52 UTC (permalink / raw)
To: Andrew Morton; +Cc: willy, linuxppc-dev, mikey, linux-kernel, kamalesh
On Wed, 2008-03-05 at 16:44 -0800, Andrew Morton wrote:
> On Thu, 06 Mar 2008 11:03:31 +1100
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
> >
> > > Yes, we are - it's the semaphore rewrite which is doing this in
> > > start_kernel(). It's being discussed.
> > >
> > > Enabling interrupts too early on powerpc was discovered to be fatal on
> > > powerpc years ago. It looks like that remains the case.
> >
> > Regarding these issues. I could make it non fatal and just WARN_ON,
> > provided that I have a way to differentiate legal vs. illegal calls
> > to local_irq_enable().
>
> And local_irq_restore() and various other things.
Yes, on powerpc 64 bits, they all go down to one C function that does
the lazy enable/disable, so it would be easy to deal with. 32 bits
doesn't have it that simple tho.
> I'd have thought that the way to do this would be to add it to lockdep -
> lockdep already has all the infrastructure and code sites to do this.
>
> Set some special flag saying its-ok-to-enable-interrupts-now and test that
> in lockdep.
Ok.
> akpm:/usr/src/25> grep LOCKDEP arch/powerpc/Kconfig
> akpm:/usr/src/25>
>
> losers ;)
I have lockdep patches for powerpc 32 and 64 bits. They aren't upstream
yet as they need a bit more beating up and there's at least one machine
that doesn't seem to like them, so I'm working on just that. That's a
good idea to add the test to lockdep tho, I'll see what I can do.
> Still, doing it for
>
> akpm:/usr/src/25> grep -l LOCKDEP arch/*/Kconfig
> arch/arm/Kconfig
> arch/avr32/Kconfig
> arch/mips/Kconfig
> arch/s390/Kconfig
> arch/sh/Kconfig
> arch/sparc64/Kconfig
> arch/um/Kconfig
> arch/x86/Kconfig
>
> should give pretty good coverage.
Ben.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 13:12 ` [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc () Kamalesh Babulal
2008-03-04 14:40 ` Michael Neuling
@ 2008-03-04 18:36 ` Andrew Morton
2008-03-04 18:47 ` Pekka Enberg
2008-03-04 19:18 ` Pekka Enberg
2008-03-05 8:22 ` Benjamin Herrenschmidt
2 siblings, 2 replies; 33+ messages in thread
From: Andrew Morton @ 2008-03-04 18:36 UTC (permalink / raw)
To: Kamalesh Babulal
Cc: linuxppc-dev, Mel Gorman, linux-kernel, linux-mm, Pekka Enberg
On Tue, 04 Mar 2008 18:42:19 +0530 Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:
> 3) Third attempt kernel booted up but had the following call trace 264 times while running
> test
>
> Badness at include/linux/gfp.h:110
> NIP: c0000000000b4ff0 LR: c0000000000b4fa0 CTR: c00000000019cdb4
> REGS: c000000009edf250 TRAP: 0700 Not tainted (2.6.25-rc3-mm1-autotest)
> MSR: 8000000000029032 <EE,ME,IR,DR> CR: 22024042 XER: 20000003
> TASK = c000000009062140[548] 'kjournald' THREAD: c000000009edc000 CPU: 0
> NIP [c0000000000b4ff0] .get_page_from_freelist+0x29c/0x898
> LR [c0000000000b4fa0] .get_page_from_freelist+0x24c/0x898
> Call Trace:
> [c000000009edf5f0] [c0000000000b56e4] .__alloc_pages_internal+0xf8/0x470
> [c000000009edf6e0] [c0000000000e0458] .kmem_getpages+0x8c/0x194
> [c000000009edf770] [c0000000000e1050] .fallback_alloc+0x194/0x254
> [c000000009edf820] [c0000000000e14b0] .kmem_cache_alloc+0xd8/0x144
> [c000000009edf8c0] [c0000000001fe0f8] .radix_tree_preload+0x50/0xd4
> [c000000009edf960] [c0000000000ad048] .add_to_page_cache+0x38/0x12c
> [c000000009edfa00] [c0000000000ad158] .add_to_page_cache_lru+0x1c/0x4c
> [c000000009edfa90] [c0000000000add58] .find_or_create_page+0x60/0xa8
> [c000000009edfb30] [c00000000011e478] .__getblk+0x140/0x310
> [c000000009edfc00] [c0000000001b78c4] .journal_get_descriptor_buffer+0x44/0xd8
> [c000000009edfca0] [c0000000001b236c] .journal_commit_transaction+0x948/0x1590
> [c000000009edfe00] [c0000000001b585c] .kjournald+0xf4/0x2ac
> [c000000009edff00] [c00000000007ff4c] .kthread+0x84/0xd0
> [c000000009edff90] [c000000000028900] .kernel_thread+0x4c/0x68
> Instruction dump:
> 7dc57378 48009575 60000000 2fa30000 419e0490 56c902d8 3c000018 7dd907b4
> 7ad2c7e2 7f890000 7c000026 5400fffe <0b000000> e93e8128 3b000000 80090000
/* Convert GFP flags to their corresponding migrate type */
static inline int allocflags_to_migratetype(gfp_t gfp_flags)
{
WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
Mel, Pekka: would you have some head-scratching time for this one please?
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 18:36 ` Andrew Morton
@ 2008-03-04 18:47 ` Pekka Enberg
2008-03-04 19:18 ` Pekka Enberg
1 sibling, 0 replies; 33+ messages in thread
From: Pekka Enberg @ 2008-03-04 18:47 UTC (permalink / raw)
To: Andrew Morton
Cc: linuxppc-dev, Mel Gorman, linux-kernel, Kamalesh Babulal,
linux-mm
On Tue, 04 Mar 2008 18:42:19 +0530 Kamalesh Babulal
<kamalesh@linux.vnet.ibm.com> wrote:
> > 3) Third attempt kernel booted up but had the following call trace 264 times while running
> > test
> >
> > Badness at include/linux/gfp.h:110
> > NIP: c0000000000b4ff0 LR: c0000000000b4fa0 CTR: c00000000019cdb4
> > REGS: c000000009edf250 TRAP: 0700 Not tainted (2.6.25-rc3-mm1-autotest)
> > MSR: 8000000000029032 <EE,ME,IR,DR> CR: 22024042 XER: 20000003
> > TASK = c000000009062140[548] 'kjournald' THREAD: c000000009edc000 CPU: 0
> > NIP [c0000000000b4ff0] .get_page_from_freelist+0x29c/0x898
> > LR [c0000000000b4fa0] .get_page_from_freelist+0x24c/0x898
> > Call Trace:
> > [c000000009edf5f0] [c0000000000b56e4] .__alloc_pages_internal+0xf8/0x470
> > [c000000009edf6e0] [c0000000000e0458] .kmem_getpages+0x8c/0x194
> > [c000000009edf770] [c0000000000e1050] .fallback_alloc+0x194/0x254
> > [c000000009edf820] [c0000000000e14b0] .kmem_cache_alloc+0xd8/0x144
> > [c000000009edf8c0] [c0000000001fe0f8] .radix_tree_preload+0x50/0xd4
> > [c000000009edf960] [c0000000000ad048] .add_to_page_cache+0x38/0x12c
> > [c000000009edfa00] [c0000000000ad158] .add_to_page_cache_lru+0x1c/0x4c
> > [c000000009edfa90] [c0000000000add58] .find_or_create_page+0x60/0xa8
> > [c000000009edfb30] [c00000000011e478] .__getblk+0x140/0x310
> > [c000000009edfc00] [c0000000001b78c4] .journal_get_descriptor_buffer+0x44/0xd8
> > [c000000009edfca0] [c0000000001b236c] .journal_commit_transaction+0x948/0x1590
> > [c000000009edfe00] [c0000000001b585c] .kjournald+0xf4/0x2ac
> > [c000000009edff00] [c00000000007ff4c] .kthread+0x84/0xd0
> > [c000000009edff90] [c000000000028900] .kernel_thread+0x4c/0x68
> > Instruction dump:
> > 7dc57378 48009575 60000000 2fa30000 419e0490 56c902d8 3c000018 7dd907b4
> > 7ad2c7e2 7f890000 7c000026 5400fffe <0b000000> e93e8128 3b000000 80090000
On Tue, Mar 4, 2008 at 8:36 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> /* Convert GFP flags to their corresponding migrate type */
> static inline int allocflags_to_migratetype(gfp_t gfp_flags)
> {
> WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
>
> Mel, Pekka: would you have some head-scratching time for this one please?
Sure. Just to double-check, this is with SLAB, right? Do you see this with SLUB?
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 18:36 ` Andrew Morton
2008-03-04 18:47 ` Pekka Enberg
@ 2008-03-04 19:18 ` Pekka Enberg
2008-03-04 19:35 ` Mel Gorman
1 sibling, 1 reply; 33+ messages in thread
From: Pekka Enberg @ 2008-03-04 19:18 UTC (permalink / raw)
To: Andrew Morton
Cc: linuxppc-dev, Mel Gorman, linux-kernel, Kamalesh Babulal,
linux-mm
Andrew Morton wrote:
> > [c000000009edf5f0] [c0000000000b56e4] .__alloc_pages_internal+0xf8/0x470
> > [c000000009edf6e0] [c0000000000e0458] .kmem_getpages+0x8c/0x194
> > [c000000009edf770] [c0000000000e1050] .fallback_alloc+0x194/0x254
> > [c000000009edf820] [c0000000000e14b0] .kmem_cache_alloc+0xd8/0x144
> > [c000000009edf8c0] [c0000000001fe0f8] .radix_tree_preload+0x50/0xd4
> > [c000000009edf960] [c0000000000ad048] .add_to_page_cache+0x38/0x12c
> > [c000000009edfa00] [c0000000000ad158] .add_to_page_cache_lru+0x1c/0x4c
> > [c000000009edfa90] [c0000000000add58] .find_or_create_page+0x60/0xa8
> > [c000000009edfb30] [c00000000011e478] .__getblk+0x140/0x310
> > [c000000009edfc00] [c0000000001b78c4] .journal_get_descriptor_buffer+0x44/0xd8
> > [c000000009edfca0] [c0000000001b236c] .journal_commit_transaction+0x948/0x1590
> > [c000000009edfe00] [c0000000001b585c] .kjournald+0xf4/0x2ac
> > [c000000009edff00] [c00000000007ff4c] .kthread+0x84/0xd0
> > [c000000009edff90] [c000000000028900] .kernel_thread+0x4c/0x68
> > Instruction dump:
> > 7dc57378 48009575 60000000 2fa30000 419e0490 56c902d8 3c000018 7dd907b4
> > 7ad2c7e2 7f890000 7c000026 5400fffe <0b000000> e93e8128 3b000000 80090000
>
> /* Convert GFP flags to their corresponding migrate type */
> static inline int allocflags_to_migratetype(gfp_t gfp_flags)
> {
> WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
>
> Mel, Pekka: would you have some head-scratching time for this one please?
What we have is __getblk() -> __getblk_slow() -> grow_buffers() ->
grow_dev_page() doing find_or_create_page() with __GFP_MOVABLE set. That
path then eventually does radix_tree_preload -> kmem_cache_alloc() to a
cache that has SLAB_RECLAIM_ACCOUNT set which implies __GFP_RECLAIMABLE
(for both SLAB and SLUB). So we oops there.
I suspect the WARN_ON() is bogus although I really don't know that part
of the code all too well. Mel?
Pekka
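
A small sketch of the flag combination described above, in plain user-space C. The bit values and all names ending in _sketch are illustrative assumptions, not the definitions from include/linux/gfp.h; only the shape of the check, (flags & mask) == mask, is taken from the WARN_ON quoted earlier.

```c
#include <assert.h>

/* Illustrative bit values only; the real ones live in include/linux/gfp.h. */
#define GFP_RECLAIMABLE_SKETCH   0x1u
#define GFP_MOVABLE_SKETCH       0x2u
#define GFP_MOVABLE_MASK_SKETCH  (GFP_RECLAIMABLE_SKETCH | GFP_MOVABLE_SKETCH)

/* Same shape as the quoted WARN_ON condition: true only if BOTH bits are set. */
static int both_mobility_flags_set(unsigned int gfp_flags)
{
    return (gfp_flags & GFP_MOVABLE_MASK_SKETCH) == GFP_MOVABLE_MASK_SKETCH;
}

/*
 * Models a slab cache asking for pages: the caller's flags are passed
 * through, and a reclaim-accounted cache adds the reclaimable bit.
 */
static unsigned int slab_page_flags_sketch(unsigned int caller_flags,
                                           int cache_is_reclaim_account)
{
    unsigned int flags = caller_flags;

    if (cache_is_reclaim_account)
        flags |= GFP_RECLAIMABLE_SKETCH;
    return flags;
}

/* Usage: either flag alone is fine, the combination trips the check. */
void demo_gfp_conflict(void)
{
    assert(!both_mobility_flags_set(GFP_MOVABLE_SKETCH));
    assert(!both_mobility_flags_set(GFP_RECLAIMABLE_SKETCH));
    assert(both_mobility_flags_set(
               slab_page_flags_sketch(GFP_MOVABLE_SKETCH, 1)));
}
```

Neither the caller nor the cache does anything wrong on its own here; the conflict only appears when the movable flag is carried down into an allocation from a reclaim-accounted cache.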
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 19:18 ` Pekka Enberg
@ 2008-03-04 19:35 ` Mel Gorman
2008-03-04 19:41 ` Pekka Enberg
0 siblings, 1 reply; 33+ messages in thread
From: Mel Gorman @ 2008-03-04 19:35 UTC (permalink / raw)
To: Pekka Enberg; +Cc: linux-mm, linuxppc-dev, Andrew Morton, Kamalesh Babulal
On (04/03/08 21:18), Pekka Enberg didst pronounce:
> Andrew Morton wrote:
> >> [c000000009edf5f0] [c0000000000b56e4] .__alloc_pages_internal+0xf8/0x470
> >> [c000000009edf6e0] [c0000000000e0458] .kmem_getpages+0x8c/0x194
> >> [c000000009edf770] [c0000000000e1050] .fallback_alloc+0x194/0x254
> >> [c000000009edf820] [c0000000000e14b0] .kmem_cache_alloc+0xd8/0x144
> >> [c000000009edf8c0] [c0000000001fe0f8] .radix_tree_preload+0x50/0xd4
> >> [c000000009edf960] [c0000000000ad048] .add_to_page_cache+0x38/0x12c
> >> [c000000009edfa00] [c0000000000ad158] .add_to_page_cache_lru+0x1c/0x4c
> >> [c000000009edfa90] [c0000000000add58] .find_or_create_page+0x60/0xa8
> >> [c000000009edfb30] [c00000000011e478] .__getblk+0x140/0x310
> >> [c000000009edfc00] [c0000000001b78c4] .journal_get_descriptor_buffer+0x44/0xd8
> >> [c000000009edfca0] [c0000000001b236c] .journal_commit_transaction+0x948/0x1590
> >> [c000000009edfe00] [c0000000001b585c] .kjournald+0xf4/0x2ac
> >> [c000000009edff00] [c00000000007ff4c] .kthread+0x84/0xd0
> >> [c000000009edff90] [c000000000028900] .kernel_thread+0x4c/0x68
> >> Instruction dump:
> >> 7dc57378 48009575 60000000 2fa30000 419e0490 56c902d8 3c000018 7dd907b4
> >> 7ad2c7e2 7f890000 7c000026 5400fffe <0b000000> e93e8128 3b000000 80090000
> >/* Convert GFP flags to their corresponding migrate type */
> >static inline int allocflags_to_migratetype(gfp_t gfp_flags)
> >{
> > WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
> >
> >Mel, Pekka: would you have some head-scratching time for this one please?
>
> What we have is __getblk() -> __getblk_slow() -> grow_buffers() ->
> grow_dev_page() doing find_or_create_page() with __GFP_MOVABLE set. That
> path then eventually does radix_tree_preload -> kmem_cache_alloc() to a
> cache that has SLAB_RECLAIM_ACCOUNT set which implies __GFP_RECLAIMABLE
> (for both SLAB and SLUB). So we oops there.
>
> I suspect the WARN_ON() is bogus although I really don't know that part
> of the code all too well. Mel?
>
The warn-on is valid. A situation should not exist that allows both flags to
be set. I suspect if remove-set_migrateflags.patch was reverted from -mm
the warning would not trigger. Christoph, would it be reasonable to always
clear __GFP_MOVABLE when __GFP_RECLAIMABLE is set for SLAB_RECLAIM_ACCOUNT?
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
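
The masking asked about above might look roughly like this. It is a user-space sketch under stated assumptions: the bit values, GFP_OTHER_SKETCH and reclaim_account_flags_sketch are hypothetical, and it is not the code from remove-set_migrateflags.patch or from any allocator.

```c
#include <assert.h>

/* Illustrative bit values only; not the kernel's definitions. */
#define GFP_RECLAIMABLE_SKETCH  0x1u
#define GFP_MOVABLE_SKETCH      0x2u
#define GFP_OTHER_SKETCH        0x4u    /* any unrelated caller flag */

/*
 * For a reclaim-accounted cache: drop the caller's movable bit before
 * adding the reclaimable bit, so the two are never set together.
 * All other caller flags pass through unchanged.
 */
static unsigned int reclaim_account_flags_sketch(unsigned int caller_flags)
{
    unsigned int flags = caller_flags & ~GFP_MOVABLE_SKETCH;

    return flags | GFP_RECLAIMABLE_SKETCH;
}

/* Usage: the movable bit is removed, everything else survives. */
void demo_clear_movable(void)
{
    unsigned int in  = GFP_MOVABLE_SKETCH | GFP_OTHER_SKETCH;
    unsigned int out = reclaim_account_flags_sketch(in);

    assert((out & GFP_MOVABLE_SKETCH) == 0);
    assert((out & GFP_RECLAIMABLE_SKETCH) != 0);
    assert((out & GFP_OTHER_SKETCH) != 0);
}
```

Clearing the bit at the cache boundary keeps the WARN_ON meaningful for every other caller, at the price of silently overriding what the caller requested.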
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 19:35 ` Mel Gorman
@ 2008-03-04 19:41 ` Pekka Enberg
2008-03-04 19:56 ` Christoph Lameter
2008-03-04 20:01 ` Christoph Lameter
0 siblings, 2 replies; 33+ messages in thread
From: Pekka Enberg @ 2008-03-04 19:41 UTC (permalink / raw)
To: Mel Gorman
Cc: linuxppc-dev, Kamalesh Babulal, linux-mm, Andrew Morton,
Christoph Lameter
(adding Christoph as cc)
On Tue, Mar 4, 2008 at 9:35 PM, Mel Gorman <mel@csn.ul.ie> wrote:
> > >> [c000000009edf5f0] [c0000000000b56e4] .__alloc_pages_internal+0xf8/0x470
> > >> [c000000009edf6e0] [c0000000000e0458] .kmem_getpages+0x8c/0x194
> > >> [c000000009edf770] [c0000000000e1050] .fallback_alloc+0x194/0x254
> > >> [c000000009edf820] [c0000000000e14b0] .kmem_cache_alloc+0xd8/0x144
> > >> [c000000009edf8c0] [c0000000001fe0f8] .radix_tree_preload+0x50/0xd4
> > >> [c000000009edf960] [c0000000000ad048] .add_to_page_cache+0x38/0x12c
> > >> [c000000009edfa00] [c0000000000ad158] .add_to_page_cache_lru+0x1c/0x4c
> > >> [c000000009edfa90] [c0000000000add58] .find_or_create_page+0x60/0xa8
> > >> [c000000009edfb30] [c00000000011e478] .__getblk+0x140/0x310
> > >> [c000000009edfc00] [c0000000001b78c4] .journal_get_descriptor_buffer+0x44/0xd8
> > >> [c000000009edfca0] [c0000000001b236c] .journal_commit_transaction+0x948/0x1590
> > >> [c000000009edfe00] [c0000000001b585c] .kjournald+0xf4/0x2ac
> > >> [c000000009edff00] [c00000000007ff4c] .kthread+0x84/0xd0
> > >> [c000000009edff90] [c000000000028900] .kernel_thread+0x4c/0x68
> > >> Instruction dump:
> > >> 7dc57378 48009575 60000000 2fa30000 419e0490 56c902d8 3c000018 7dd907b4
> > >> 7ad2c7e2 7f890000 7c000026 5400fffe <0b000000> e93e8128 3b000000 80090000
> > >/* Convert GFP flags to their corresponding migrate type */
> > >static inline int allocflags_to_migratetype(gfp_t gfp_flags)
> > >{
> > > WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
> > >
> > >Mel, Pekka: would you have some head-scratching time for this one please?
> >
> > What we have is __getblk() -> __getblk_slow() -> grow_buffers() ->
> > grow_dev_page() doing find_or_create_page() with __GFP_MOVABLE set. That
> > path then eventually does radix_tree_preload -> kmem_cache_alloc() to a
> > cache that has SLAB_RECLAIM_ACCOUNT set which implies __GFP_RECLAIMABLE
> > (for both SLAB and SLUB). So we oops there.
> >
> > I suspect the WARN_ON() is bogus although I really don't know that part
> > of the code all too well. Mel?
> >
>
> The warn-on is valid. A situation should not exist that allows both flags to
> be set. I suspect if remove-set_migrateflags.patch was reverted from -mm
> the warning would not trigger. Christoph, would it be reasonable to always
> > clear __GFP_MOVABLE when __GFP_RECLAIMABLE is set for SLAB_RECLAIM_ACCOUNT?
>
> --
> Mel Gorman
> Part-time Phd Student Linux Technology Center
> University of Limerick IBM Dublin Software Lab
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 19:41 ` Pekka Enberg
@ 2008-03-04 19:56 ` Christoph Lameter
2008-03-04 20:01 ` Pekka J Enberg
2008-03-04 20:01 ` Christoph Lameter
1 sibling, 1 reply; 33+ messages in thread
From: Christoph Lameter @ 2008-03-04 19:56 UTC (permalink / raw)
To: Pekka Enberg
Cc: linux-mm, Mel Gorman, Kamalesh Babulal, linuxppc-dev,
Andrew Morton
On Tue, 4 Mar 2008, Pekka Enberg wrote:
> > > I suspect the WARN_ON() is bogus although I really don't know that part
> > > of the code all too well. Mel?
> > >
> >
> > The warn-on is valid. A situation should not exist that allows both flags to
> > be set. I suspect if remove-set_migrateflags.patch was reverted from -mm
> > the warning would not trigger. Christoph, would it be reasonable to always
> > > clear __GFP_MOVABLE when __GFP_RECLAIMABLE is set for SLAB_RECLAIM_ACCOUNT?
Slab allocations should never be passed these flags since the slabs do
their own thing there.
The following patch would clear these in slub:
---
mm/slub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux-2.6.25-rc3-mm1/mm/slub.c
===================================================================
--- linux-2.6.25-rc3-mm1.orig/mm/slub.c 2008-03-04 11:53:47.600342756 -0800
+++ linux-2.6.25-rc3-mm1/mm/slub.c 2008-03-04 11:55:40.153855150 -0800
@@ -1033,8 +1033,8 @@ static struct page *allocate_slab(struct
struct page *page;
int pages = 1 << s->order;
+ flags &= ~GFP_MOVABLE_MASK;
flags |= s->allocflags;
-
page = alloc_slab_page(flags | __GFP_NOWARN | __GFP_NORETRY,
node, s->order);
if (unlikely(!page)) {
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 19:56 ` Christoph Lameter
@ 2008-03-04 20:01 ` Pekka J Enberg
2008-03-04 20:02 ` Christoph Lameter
2008-03-04 20:07 ` Christoph Lameter
0 siblings, 2 replies; 33+ messages in thread
From: Pekka J Enberg @ 2008-03-04 20:01 UTC (permalink / raw)
To: Christoph Lameter
Cc: linux-mm, Mel Gorman, Kamalesh Babulal, linuxppc-dev,
Andrew Morton
On Tue, 4 Mar 2008, Christoph Lameter wrote:
> Slab allocations should never be passed these flags since the slabs do
> their own thing there.
>
> The following patch would clear these in slub:
Here's the same fix for SLAB:
diff --git a/mm/slab.c b/mm/slab.c
index 473e6c2..c6dbf7e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1677,6 +1677,7 @@ static void *kmem_getpages(struct kmem_cache *cachep, gfp_t flags, int nodeid)
flags |= __GFP_COMP;
#endif
+ flags &= ~GFP_MOVABLE_MASK;
flags |= cachep->gfpflags;
if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
flags |= __GFP_RECLAIMABLE;
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 20:01 ` Pekka J Enberg
@ 2008-03-04 20:02 ` Christoph Lameter
2008-03-04 20:07 ` Christoph Lameter
1 sibling, 0 replies; 33+ messages in thread
From: Christoph Lameter @ 2008-03-04 20:02 UTC (permalink / raw)
To: Pekka J Enberg
Cc: linux-mm, Mel Gorman, Kamalesh Babulal, linuxppc-dev,
Andrew Morton
On Tue, 4 Mar 2008, Pekka J Enberg wrote:
> On Tue, 4 Mar 2008, Christoph Lameter wrote:
> > Slab allocations should never be passed these flags since the slabs do
> > their own thing there.
> >
> > The following patch would clear these in slub:
>
> Here's the same fix for SLAB:
That is an immediate fix ok. But there must be some location where SLAB
does the masking of the gfp bits where things go wrong. Looking for that.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 20:01 ` Pekka J Enberg
2008-03-04 20:02 ` Christoph Lameter
@ 2008-03-04 20:07 ` Christoph Lameter
2008-03-04 20:08 ` Pekka Enberg
` (2 more replies)
1 sibling, 3 replies; 33+ messages in thread
From: Christoph Lameter @ 2008-03-04 20:07 UTC (permalink / raw)
To: Pekka J Enberg
Cc: linux-mm, Mel Gorman, Kamalesh Babulal, linuxppc-dev,
Andrew Morton
I think this is the correct fix.
The NUMA fallback logic should be passing local_flags to kmem_getpages()
and not simply the flags.
Maybe a stable candidate since we are now simply
passing on flags to the page allocator on the fallback path.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
---
mm/slab.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux-2.6.25-rc3-mm1/mm/slab.c
===================================================================
--- linux-2.6.25-rc3-mm1.orig/mm/slab.c 2008-03-04 12:01:07.430911920 -0800
+++ linux-2.6.25-rc3-mm1/mm/slab.c 2008-03-04 12:04:54.449857145 -0800
@@ -3277,7 +3277,7 @@ retry:
if (local_flags & __GFP_WAIT)
local_irq_enable();
kmem_flagcheck(cache, flags);
- obj = kmem_getpages(cache, flags, -1);
+ obj = kmem_getpages(cache, local_flags, -1);
if (local_flags & __GFP_WAIT)
local_irq_disable();
if (obj) {
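For readers following along, the effect of the one-line change above can be
illustrated with a small userspace sketch in C. This is not the kernel code:
the sk_ names, the bit values and SK_SLAB_PASS_MASK (the set of caller bits
kept in local_flags) are assumptions invented for this example, since the
thread does not show how local_flags is computed. It only tries to show why
forwarding the caller's raw flags can leave both mobility bits set, while
forwarding the filtered copy cannot.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag type and bit values, chosen only for this sketch. */
typedef unsigned int sk_gfp_t;
#define SK_GFP_WAIT         0x01u
#define SK_GFP_RECLAIMABLE  0x02u
#define SK_GFP_MOVABLE      0x04u
#define SK_GFP_MOVABLE_MASK (SK_GFP_RECLAIMABLE | SK_GFP_MOVABLE)

/* Assumption: the only caller bit the slab layer forwards is the wait bit. */
#define SK_SLAB_PASS_MASK   SK_GFP_WAIT

/* Stand-in for the page-allocation step: a cache that accounts its pages as
 * reclaimable adds the reclaimable bit to whatever flags it was handed. */
static sk_gfp_t sk_getpages_flags(sk_gfp_t flags, bool reclaim_account)
{
	if (reclaim_account)
		flags |= SK_GFP_RECLAIMABLE;
	return flags;
}

/* The condition the WARN_ON in the report tests: both mobility bits set. */
static bool sk_would_warn(sk_gfp_t flags)
{
	return (flags & SK_GFP_MOVABLE_MASK) == SK_GFP_MOVABLE_MASK;
}

/* Before the patch: the caller's raw flags reach the page allocator. */
static sk_gfp_t sk_fallback_before(sk_gfp_t flags, bool reclaim_account)
{
	return sk_getpages_flags(flags, reclaim_account);
}

/* After the patch: only the filtered local copy is forwarded. */
static sk_gfp_t sk_fallback_after(sk_gfp_t flags, bool reclaim_account)
{
	sk_gfp_t local_flags = flags & SK_SLAB_PASS_MASK;

	/* The movable bit cannot survive the mask assumed above. */
	assert(!(local_flags & SK_GFP_MOVABLE));
	return sk_getpages_flags(local_flags, reclaim_account);
}
```

With a caller passing SK_GFP_WAIT | SK_GFP_MOVABLE to a reclaim-accounted
cache, sk_would_warn() is true for the result of sk_fallback_before() and
false for the result of sk_fallback_after().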
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 20:07 ` Christoph Lameter
@ 2008-03-04 20:08 ` Pekka Enberg
2008-03-05 2:28 ` Kamalesh Babulal
2008-03-04 20:34 ` Andrew Morton
2008-03-05 14:31 ` Mel Gorman
2 siblings, 1 reply; 33+ messages in thread
From: Pekka Enberg @ 2008-03-04 20:08 UTC (permalink / raw)
To: Christoph Lameter
Cc: linux-mm, Mel Gorman, Kamalesh Babulal, linuxppc-dev,
Andrew Morton
Christoph Lameter wrote:
> I think this is the correct fix.
>
> The NUMA fallback logic should be passing local_flags to kmem_getpages()
> and not simply the flags.
>
> Maybe a stable candidate since we are now simply
> passing on flags to the page allocator on the fallback path.
>
> Signed-off-by: Christoph Lameter <clameter@sgi.com>
Indeed, good catch. I spotted the same thing just a few seconds ago.
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Was it you, Kamalesh, who reported this? Can you please re-test?
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 20:08 ` Pekka Enberg
@ 2008-03-05 2:28 ` Kamalesh Babulal
0 siblings, 0 replies; 33+ messages in thread
From: Kamalesh Babulal @ 2008-03-05 2:28 UTC (permalink / raw)
To: Pekka Enberg
Cc: linux-mm, Mel Gorman, linuxppc-dev, Andrew Morton,
Christoph Lameter
Pekka Enberg wrote:
> Christoph Lameter wrote:
>> I think this is the correct fix.
>>
>> The NUMA fallback logic should be passing local_flags to kmem_getpages()
>> and not simply the flags.
>>
>> Maybe a stable candidate since we are now simply
>> passing on flags to the page allocator on the fallback path.
>>
>> Signed-off-by: Christoph Lameter <clameter@sgi.com>
>
> Indeed, good catch. I spotted the same thing just a few seconds ago.
>
> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
>
> Was it you Kamalesh that reported this? Can you please re-test?
Thanks, the patch fixes the kernel bug.
Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 20:07 ` Christoph Lameter
2008-03-04 20:08 ` Pekka Enberg
@ 2008-03-04 20:34 ` Andrew Morton
2008-03-04 20:44 ` Pekka Enberg
2008-03-05 14:02 ` Mel Gorman
2008-03-05 14:31 ` Mel Gorman
2 siblings, 2 replies; 33+ messages in thread
From: Andrew Morton @ 2008-03-04 20:34 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm, mel, kamalesh, linuxppc-dev, penberg, stable
On Tue, 4 Mar 2008 12:07:39 -0800 (PST)
Christoph Lameter <clameter@sgi.com> wrote:
> I think this is the correct fix.
>
> The NUMA fallback logic should be passing local_flags to kmem_getpages()
> and not simply the flags.
>
> Maybe a stable candidate since we are now simply
> passing on flags to the page allocator on the fallback path.
Do we know why this is only reported in 2.6.25-rc3-mm1?
Why does this need fixing in 2.6.24.x?
Thanks.
> Signed-off-by: Christoph Lameter <clameter@sgi.com>
>
> ---
> mm/slab.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-2.6.25-rc3-mm1/mm/slab.c
> ===================================================================
> --- linux-2.6.25-rc3-mm1.orig/mm/slab.c 2008-03-04 12:01:07.430911920 -0800
> +++ linux-2.6.25-rc3-mm1/mm/slab.c 2008-03-04 12:04:54.449857145 -0800
> @@ -3277,7 +3277,7 @@ retry:
> if (local_flags & __GFP_WAIT)
> local_irq_enable();
> kmem_flagcheck(cache, flags);
> - obj = kmem_getpages(cache, flags, -1);
> + obj = kmem_getpages(cache, local_flags, -1);
> if (local_flags & __GFP_WAIT)
> local_irq_disable();
> if (obj) {
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 20:34 ` Andrew Morton
@ 2008-03-04 20:44 ` Pekka Enberg
2008-03-04 21:44 ` Christoph Lameter
2008-03-05 14:02 ` Mel Gorman
1 sibling, 1 reply; 33+ messages in thread
From: Pekka Enberg @ 2008-03-04 20:44 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, mel, kamalesh, linuxppc-dev, stable, Christoph Lameter
Andrew Morton wrote:
> On Tue, 4 Mar 2008 12:07:39 -0800 (PST)
> Christoph Lameter <clameter@sgi.com> wrote:
>
>> I think this is the correct fix.
>>
>> The NUMA fallback logic should be passing local_flags to kmem_getpages()
>> and not simply the flags.
>>
>> Maybe a stable candidate since we are now simply
>> passing on flags to the page allocator on the fallback path.
>
> Do we know why this is only reported in 2.6.25-rc3-mm1?
>
> Why does this need fixing in 2.6.24.x?
Looking at the code, it's triggerable in 2.6.24.3 at least. We probably
don't have a report yet because (1) the default allocator is SLUB, which
doesn't suffer from this, and (2) you need a big honkin' NUMA box that
causes fallback allocations to happen in order to trigger it.
Pekka
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 20:44 ` Pekka Enberg
@ 2008-03-04 21:44 ` Christoph Lameter
0 siblings, 0 replies; 33+ messages in thread
From: Christoph Lameter @ 2008-03-04 21:44 UTC (permalink / raw)
To: Pekka Enberg; +Cc: linux-mm, mel, kamalesh, linuxppc-dev, Andrew Morton, stable
On Tue, 4 Mar 2008, Pekka Enberg wrote:
> Looking at the code, it's triggerable in 2.6.24.3 at least. We probably
> don't have a report yet because (1) the default allocator is SLUB, which
> doesn't suffer from this, and (2) you need a big honkin' NUMA box that
> causes fallback allocations to happen in order to trigger it.
Plus the issue only became a problem after the antifrag stuff went in.
That came with SLUB as the default.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 20:34 ` Andrew Morton
2008-03-04 20:44 ` Pekka Enberg
@ 2008-03-05 14:02 ` Mel Gorman
1 sibling, 0 replies; 33+ messages in thread
From: Mel Gorman @ 2008-03-05 14:02 UTC (permalink / raw)
To: Andrew Morton
Cc: linuxppc-dev, kamalesh, linux-mm, penberg, stable,
Christoph Lameter
On (04/03/08 12:34), Andrew Morton didst pronounce:
> On Tue, 4 Mar 2008 12:07:39 -0800 (PST)
> Christoph Lameter <clameter@sgi.com> wrote:
>
> > I think this is the correct fix.
> >
> > The NUMA fallback logic should be passing local_flags to kmem_getpages()
> > and not simply the flags.
> >
> > Maybe a stable candidate since we are now simply
> > passing on flags to the page allocator on the fallback path.
>
> Do we know why this is only reported in 2.6.25-rc3-mm1?
>
> Why does this need fixing in 2.6.24.x?
>
I don't believe it needs to be fixed in 2.6.24.3. The call-sites in
lib/radix-tree.c there look like
	ret = kmem_cache_alloc(radix_tree_node_cachep,
			set_migrateflags(gfp_mask, __GFP_RECLAIMABLE));
	node = kmem_cache_alloc(radix_tree_node_cachep,
			set_migrateflags(gfp_mask, __GFP_RECLAIMABLE));
and set_migrateflags() looks like
#define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
static inline gfp_t set_migrateflags(gfp_t gfp, gfp_t migrate_flags)
{
	BUG_ON((gfp & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
	return (gfp & ~(GFP_MOVABLE_MASK)) | migrate_flags;
}
so the flags were already getting cleared and the WARN_ON could not
trigger in this path. In 2.6.25-rc3-mm1, the patch
remove-set_migrateflags.patch gets rid of set_migrateflags(),
which led to this situation.
The surprise is that it didn't get caught in an earlier -mm but it could
be because it only affected slab.
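To make the difference concrete, here is a minimal userspace sketch in C of
the clear-then-set behaviour described above. It is only an illustration:
the sk_ names and the bit values are hypothetical, and assert() stands in
for the kernel's BUG_ON(). The helper mirrors the shape of the quoted
set_migrateflags(), which drops whichever mobility bit the caller passed
before adding the requested one, so the both-bits-set condition cannot arise
on that path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag type and bit values, chosen only for this sketch. */
typedef unsigned int sk_gfp_t;
#define SK_GFP_RECLAIMABLE  0x1u
#define SK_GFP_MOVABLE      0x2u
#define SK_GFP_MOVABLE_MASK (SK_GFP_RECLAIMABLE | SK_GFP_MOVABLE)

/* The condition the warning checks: both mobility bits set at once. */
static bool sk_both_mobility_bits_set(sk_gfp_t gfp)
{
	return (gfp & SK_GFP_MOVABLE_MASK) == SK_GFP_MOVABLE_MASK;
}

/* Same shape as the quoted helper: reject input with both bits set, clear
 * the mobility bits, then set the requested one. */
static sk_gfp_t sk_set_migrateflags(sk_gfp_t gfp, sk_gfp_t migrate_flags)
{
	assert(!sk_both_mobility_bits_set(gfp)); /* stand-in for BUG_ON() */
	return (gfp & ~SK_GFP_MOVABLE_MASK) | migrate_flags;
}
```

Passing SK_GFP_MOVABLE through sk_set_migrateflags() with SK_GFP_RECLAIMABLE
yields a value with only the reclaimable bit set, whereas simply OR-ing the
two bits together satisfies sk_both_mobility_bits_set().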
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 20:07 ` Christoph Lameter
2008-03-04 20:08 ` Pekka Enberg
2008-03-04 20:34 ` Andrew Morton
@ 2008-03-05 14:31 ` Mel Gorman
2 siblings, 0 replies; 33+ messages in thread
From: Mel Gorman @ 2008-03-05 14:31 UTC (permalink / raw)
To: Christoph Lameter
Cc: linuxppc-dev, Kamalesh Babulal, linux-mm, Pekka J Enberg,
Andrew Morton
On (04/03/08 12:07), Christoph Lameter didst pronounce:
> I think this is the correct fix.
>
> The NUMA fallback logic should be passing local_flags to kmem_getpages()
> and not simply the flags.
>
> Maybe a stable candidate since we are now simply
> passing on flags to the page allocator on the fallback path.
>
> Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Thanks Christoph.
>
> ---
> mm/slab.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-2.6.25-rc3-mm1/mm/slab.c
> ===================================================================
> --- linux-2.6.25-rc3-mm1.orig/mm/slab.c 2008-03-04 12:01:07.430911920 -0800
> +++ linux-2.6.25-rc3-mm1/mm/slab.c 2008-03-04 12:04:54.449857145 -0800
> @@ -3277,7 +3277,7 @@ retry:
> if (local_flags & __GFP_WAIT)
> local_irq_enable();
> kmem_flagcheck(cache, flags);
> - obj = kmem_getpages(cache, flags, -1);
> + obj = kmem_getpages(cache, local_flags, -1);
> if (local_flags & __GFP_WAIT)
> local_irq_disable();
> if (obj) {
>
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 19:41 ` Pekka Enberg
2008-03-04 19:56 ` Christoph Lameter
@ 2008-03-04 20:01 ` Christoph Lameter
1 sibling, 0 replies; 33+ messages in thread
From: Christoph Lameter @ 2008-03-04 20:01 UTC (permalink / raw)
To: Pekka Enberg
Cc: linux-mm, Mel Gorman, Kamalesh Babulal, linuxppc-dev,
Andrew Morton
On Tue, 4 Mar 2008, Pekka Enberg wrote:
> > > >> [c000000009edf5f0] [c0000000000b56e4] .__alloc_pages_internal+0xf8/0x470
> > > >> [c000000009edf6e0] [c0000000000e0458] .kmem_getpages+0x8c/0x194
> > > >> [c000000009edf770] [c0000000000e1050] .fallback_alloc+0x194/0x254
> > > >> [c000000009edf820] [c0000000000e14b0] .kmem_cache_alloc+0xd8/0x144
Ahh! This is SLAB. slub does not suffer this problem since new_slab()
masks the bits correctly.
So we need to fix SLAB.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc ()
2008-03-04 13:12 ` [BUG] 2.6.25-rc3-mm1 kernel panic while bootup on powerpc () Kamalesh Babulal
2008-03-04 14:40 ` Michael Neuling
2008-03-04 18:36 ` Andrew Morton
@ 2008-03-05 8:22 ` Benjamin Herrenschmidt
2 siblings, 0 replies; 33+ messages in thread
From: Benjamin Herrenschmidt @ 2008-03-05 8:22 UTC (permalink / raw)
To: Kamalesh Babulal; +Cc: linuxppc-dev, Andrew Morton, linux-kernel
On Tue, 2008-03-04 at 18:42 +0530, Kamalesh Babulal wrote:
> Hi Andrew,
>
> The 2.6.25-rc3-mm1 kernel panics while bootup on power box. The machine booted up
> without the panic on the third attempt, but badness call trace were seen while running
> tests
We are taking a HW interrupt ... we aren't supposed to take HW
interrupts that early during boot afaik.
Is it yet another case of somebody hard-enabling interrupts with
local_irq_enable()?
Ben.
> 1) The kernel panic on first attempt
>
> Unable to handle kernel paging request for data at address 0x00000000
> Faulting instruction address: 0xc00000000000cb2c
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=128 NUMA pSeries
> Modules linked in:
> NIP: c00000000000cb2c LR: c00000000000caf8 CTR: 0000000000000226
> REGS: c00000000068f360 TRAP: 0300 Not tainted (2.6.25-rc3-mm1-autotest)
> MSR: 8000000000001032 <ME,IR,DR> CR: 28000024 XER: 20000001
> DAR: 0000000000000000, DSISR: 0000000040000000
> TASK = c0000000005c8590[0] 'swapper' THREAD: c00000000068c000 CPU: 0
> GPR00: c00000000068f5e0 c00000000068f5e0 c00000000068e690 0000000000000000
> GPR04: 00000000000035e0 000000000087264e c000000008011280 c000000000594000
> GPR08: c0000000005c9300 0000000000000000 c000000000591090 c00000000068c000
> GPR12: 8000000000009032 c0000000005c9300 0000000000000000 0000000000000000
> GPR16: 0000000000000000 0000000000000000 0000000000008000 0000000000000000
> GPR20: 0000000000000000 0000000000000000 000000000000007f 0000000000018000
> GPR24: 0000000000000001 0000000000000080 0000000000000018 0000000000000000
> GPR28: 0000000000000c00 c000000000588988 c000000000639be8 c000000008001c00
> NIP [c00000000000cb2c] .do_IRQ+0x74/0x1c4
> LR [c00000000000caf8] .do_IRQ+0x40/0x1c4
> Call Trace:
> [c00000000068f5e0] [c00000000000caf8] .do_IRQ+0x40/0x1c4 (unreliable)
> [c00000000068f680] [c000000000004790] hardware_interrupt_entry+0x18/0x1c
> --- Exception: 501 at .memset+0x70/0xfc
> LR = .__alloc_bootmem_core+0x39c/0x3dc
> [c00000000068f970] [c00000000068fa10] init_thread_union+0x3a10/0x4000 (unreliable)
> [c00000000068fa30] [c00000000057237c] .__alloc_bootmem_node+0x38/0x8c
> [c00000000068fad0] [c0000000003c477c] .zone_wait_table_init+0x74/0x108
> [c00000000068fb60] [c0000000003d9058] .init_currently_empty_zone+0x40/0x11c
> [c00000000068fc00] [c0000000003d94c8] .free_area_init_node+0x394/0x3fc
> [c00000000068fcf0] [c00000000057314c] .free_area_init_nodes+0x2d8/0x364
> [c00000000068fd90] [c00000000056682c] .paging_init+0x40/0x58
> [c00000000068fe40] [c00000000055ba34] .setup_arch+0x20c/0x240
> [c00000000068fee0] [c000000000552690] .start_kernel+0xdc/0x414
> [c00000000068ff90] [c000000000008594] .start_here_common+0x54/0xc0
> Instruction dump:
> 7c200b78 780404a0 2ba408ff 41bd001c e87e80a8 3884ff00 48058d21 60000000
> 480054cd 60000000 e93e80b0 e92900b8 <e8090000> f8410028 e9690010 e8490008
>
> 2) The kernel panic on second attempt
>
> Unable to handle kernel paging request for data at address 0x00000000
> Faulting instruction address: 0xc00000000000cb2c
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=128 NUMA pSeries
> Modules linked in:
> NIP: c00000000000cb2c LR: c00000000000caf8 CTR: 0000000000014a99
> REGS: c00000000068f410 TRAP: 0300 Not tainted (2.6.25-rc3-mm1-autotest)
> MSR: 8000000000001032 <ME,IR,DR> CR: 28000044 XER: 00000001
> DAR: 0000000000000000, DSISR: 0000000040000000
> TASK = c0000000005c8590[0] 'swapper' THREAD: c00000000068c000 CPU: 0
> GPR00: c00000000068f690 c00000000068f690 c00000000068e690 0000000000000000
> GPR04: 0000000000003690 0000000000537672 c000000001ad59c0 c000000000594000
> GPR08: c0000000005c9300 0000000000000000 c000000000591090 c00000000068c000
> GPR12: 8000000000009032 c0000000005c9300 0000000000000000 0000000000000000
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR20: 0000000000230000 0000000000000000 0000000000ffffff 0000000001000000
> GPR24: 0000000000001000 0000000001000000 0000000000001000 0000000000000000
> GPR28: 0000000000000000 c0000000005889c8 c000000000639be8 c000000001000000
> NIP [c00000000000cb2c] .do_IRQ+0x74/0x1c4
> LR [c00000000000caf8] .do_IRQ+0x40/0x1c4
> Call Trace:
> [c00000000068f690] [c00000000000caf8] .do_IRQ+0x40/0x1c4 (unreliable)
> [c00000000068f730] [c000000000004790] hardware_interrupt_entry+0x18/0x1c
> --- Exception: 501 at .memset+0x80/0xfc
> LR = .__alloc_bootmem_core+0x39c/0x3dc
> [c00000000068fa20] [c000000000641a78] sysctl_pernet_ops+0x108e0/0x1d6e0 (unreliable)
> [c00000000068fae0] [c00000000057237c] .__alloc_bootmem_node+0x38/0x8c
> [c00000000068fb80] [c0000000003c48dc] .__earlyonly_bootmem_alloc+0x24/0x3c
> [c00000000068fc00] [c0000000003d885c] .vmemmap_populate+0x7c/0xf4
> [c00000000068fc90] [c0000000003d9b6c] .sparse_mem_map_populate+0x38/0x64
> [c00000000068fd10] [c000000000573ec4] .sparse_early_mem_map_alloc+0x54/0x98
> [c00000000068fda0] [c000000000573f70] .sparse_init+0x68/0x148
> [c00000000068fe40] [c00000000055b9ec] .setup_arch+0x1c4/0x240
> [c00000000068fee0] [c000000000552690] .start_kernel+0xdc/0x414
> [c00000000068ff90] [c000000000008594] .start_here_common+0x54/0xc0
> Instruction dump:
> 7c200b78 780404a0 2ba408ff 41bd001c e87e80a8 3884ff00 48058d21 60000000
> 480054cd 60000000 e93e80b0 e92900b8 <e8090000> f8410028 e9690010 e8490008
>
> 3) Third attempt kernel booted up but had the following call trace 264 times while running
> test
>
> Badness at include/linux/gfp.h:110
> NIP: c0000000000b4ff0 LR: c0000000000b4fa0 CTR: c00000000019cdb4
> REGS: c000000009edf250 TRAP: 0700 Not tainted (2.6.25-rc3-mm1-autotest)
> MSR: 8000000000029032 <EE,ME,IR,DR> CR: 22024042 XER: 20000003
> TASK = c000000009062140[548] 'kjournald' THREAD: c000000009edc000 CPU: 0
> NIP [c0000000000b4ff0] .get_page_from_freelist+0x29c/0x898
> LR [c0000000000b4fa0] .get_page_from_freelist+0x24c/0x898
> Call Trace:
> [c000000009edf5f0] [c0000000000b56e4] .__alloc_pages_internal+0xf8/0x470
> [c000000009edf6e0] [c0000000000e0458] .kmem_getpages+0x8c/0x194
> [c000000009edf770] [c0000000000e1050] .fallback_alloc+0x194/0x254
> [c000000009edf820] [c0000000000e14b0] .kmem_cache_alloc+0xd8/0x144
> [c000000009edf8c0] [c0000000001fe0f8] .radix_tree_preload+0x50/0xd4
> [c000000009edf960] [c0000000000ad048] .add_to_page_cache+0x38/0x12c
> [c000000009edfa00] [c0000000000ad158] .add_to_page_cache_lru+0x1c/0x4c
> [c000000009edfa90] [c0000000000add58] .find_or_create_page+0x60/0xa8
> [c000000009edfb30] [c00000000011e478] .__getblk+0x140/0x310
> [c000000009edfc00] [c0000000001b78c4] .journal_get_descriptor_buffer+0x44/0xd8
> [c000000009edfca0] [c0000000001b236c] .journal_commit_transaction+0x948/0x1590
> [c000000009edfe00] [c0000000001b585c] .kjournald+0xf4/0x2ac
> [c000000009edff00] [c00000000007ff4c] .kthread+0x84/0xd0
> [c000000009edff90] [c000000000028900] .kernel_thread+0x4c/0x68
> Instruction dump:
> 7dc57378 48009575 60000000 2fa30000 419e0490 56c902d8 3c000018 7dd907b4
> 7ad2c7e2 7f890000 7c000026 5400fffe <0b000000> e93e8128 3b000000 80090000
^ permalink raw reply [flat|nested] 33+ messages in thread