public inbox for linux-kernel@vger.kernel.org
From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>,
	linux-kernel@vger.kernel.org,
	Christoph Lameter <clameter@sgi.com>, Mel Gorman <mel@csn.ul.ie>,
	Nick Piggin <npiggin@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Yinghai.Lu@sun.com
Subject: Re: [bug] SLUB + mm/slab.c boot crash in -rc9
Date: Tue, 15 Apr 2008 18:15:32 +0200
Message-ID: <20080415161532.GA15088@elte.hu>
In-Reply-To: <alpine.LFD.1.00.0804150838161.2879@woody.linux-foundation.org>


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, 15 Apr 2008, Ingo Molnar wrote:
> >
> > debug output is:
> > 
> >   http://redhat.com/~mingo/misc/log-Thu_Apr_10_10_41_16_CEST_2008.bad.rc9
> > 
> > so it's probably the first few page allocations (setup_cpu_cache()) 
> > going wrong already - suggesting some fundamental borkage in SLAB?
> 
> Well, I think it suggests some fundamental borkage in the page 
> allocator.
> 
> That first warn-on is from the "alloc_pages_node()" returning NULL at 
> bootup. Sure, it could be that the arguments are bogus, but that 
> sounds unlikely since none of that is dependent on any kconfig stuff.
> 
> The fact that it happens with both SLUB/SLAB makes that even more 
> obvious.
> 
> Now, you don't have fault injection on, so it can't be that, and your 
> debug entry for *z == NULL didn't trigger in alloc_pages, so it's not 
> that one either.
> 
> However, if __alloc_pages() failed, I would have expected to see the 
> "memory allocation failed" printk. Why didn't it? Is 
> printk_ratelimit() broken at boot (last_msg starts out as zero - maybe 
> it should start out as a negative number)?

btw., now with a second full day spent on this regression, i have 
figured out a workaround the hard way: increasing SECTION_SIZE_BITS in 
include/asm-x86/sparsemem.h from 26 to 27 makes it go away. (i.e. we use 
section chunks of 128 MB instead of the previous 64 MB.) I've given up 
on analyzing the crash site - it seems rather random and uninformative 
and just suggests page allocator borkage.
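
The arithmetic behind that workaround can be sketched roughly as follows. This is a standalone user-space illustration, not kernel code: the helper names section_size_bytes() and nr_sections() are hypothetical, and the 32-bit physical address width used in the note below is an assumption for a 32-bit !PAE box.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes covered by one sparsemem section: 1 << SECTION_SIZE_BITS. */
static uint64_t section_size_bytes(unsigned int section_size_bits)
{
	assert(section_size_bits < 64);
	return UINT64_C(1) << section_size_bits;
}

/*
 * Number of sections needed to cover a physical address space of
 * phys_bits bits (hypothetical helper, for illustration only).
 */
static uint64_t nr_sections(unsigned int phys_bits,
			    unsigned int section_size_bits)
{
	assert(phys_bits >= section_size_bits);
	assert(phys_bits - section_size_bits < 64);
	return UINT64_C(1) << (phys_bits - section_size_bits);
}
```

With an assumed 32-bit physical address space, a shift of 26 gives 64 MB sections and 64 of them, while a shift of 27 gives 128 MB sections and 32 of them. So the workaround halves the number of sections without changing how much memory they cover.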

So this seems like a general sparsemem borkage. PAE uses a shift of 30 
due to the page->flags shortage (which masks this bug), and 64-bit uses 
27, which probably masks this bug too.

Since this is a !NUMA config and !PAE as well, NODES_SHIFT is 0, 
ZONES_SHIFT is 2, so the theory of running out of bits in page->flags is 
wrong as well.
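
A rough sketch of that bit budget, assuming the section field width is MAX_PHYSMEM_BITS - SECTION_SIZE_BITS and that MAX_PHYSMEM_BITS is 32 on this !PAE config (both are assumptions here, not taken from this mail). The helper name flags_bits_needed() is hypothetical, and this is user-space illustration code rather than anything from the kernel tree.

```c
#include <assert.h>

/*
 * Bits of page->flags consumed by the section, node and zone fields.
 * The section width is assumed to be
 * MAX_PHYSMEM_BITS - SECTION_SIZE_BITS.
 */
static unsigned int flags_bits_needed(unsigned int max_physmem_bits,
				      unsigned int section_size_bits,
				      unsigned int nodes_shift,
				      unsigned int zones_shift)
{
	unsigned int sections_shift;

	assert(max_physmem_bits >= section_size_bits);
	sections_shift = max_physmem_bits - section_size_bits;
	return sections_shift + nodes_shift + zones_shift;
}
```

Under those assumptions, flags_bits_needed(32, 26, 0, 2) comes to 8 bits and flags_bits_needed(32, 27, 0, 2) to 7 bits, out of a 32-bit page->flags word. A difference of one bit at that level is consistent with the point above that a page->flags shortage does not explain the crash.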

I also tried a hack to double the size of all sparsemem mem_map 
allocations (on the theory of an overflow there) - but it didn't help.

So i think we need to go down further into the page allocator. Perhaps 
the buddy bitmaps are wrongly sized somewhere. I'm grasping at straws.

Btw., Mel Gorman has reproduced crashes with my bzImage on his box (and 
a hang with my config, using his build), so i think we can eliminate hw 
and build environment specialities as a cause.

	Ingo
