From: Mel Gorman <mel@csn.ul.ie>
To: Ingo Molnar <mingo@elte.hu>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>,
linux-kernel@vger.kernel.org,
Christoph Lameter <clameter@sgi.com>,
Nick Piggin <npiggin@suse.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
"Rafael J. Wysocki" <rjw@sisk.pl>,
Yinghai.Lu@sun.com
Subject: Re: [bug] mm/slab.c boot crash in -git, "kernel BUG at mm/slab.c:2103!"
Date: Tue, 15 Apr 2008 10:36:28 +0100
Message-ID: <20080415093628.GD20316@csn.ul.ie>
In-Reply-To: <20080411092452.GE10801@elte.hu>
On (11/04/08 11:24), Ingo Molnar didst pronounce:
>
> * Pekka Enberg <penberg@cs.helsinki.fi> wrote:
>
> > On Fri, Apr 11, 2008 at 12:05 PM, Pekka Enberg <penberg@cs.helsinki.fi> wrote:
> > > > Right. Then you probably want to look into any changes in arch/x86/
> > > > related to setting up the zonelists. I'm fairly certain this is not a
> > > > slab bug and I don't see any recent changes to the page allocator
> > > > either that would explain this.
> > >
> > > I'd be willing to put some money on this:
> > >
> > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b7ad149d62ffffaccb9f565dfe7e5bae739d6836
> >
> > And I'd lose as you're 32-bit. Oh well, that's the price to pay for
> > pretending to know x86 arch internals.
>
> yeah, sorry - we are working hard to unify generic bits like that, but
> it's a huge architecture.
>
> btw., i always felt that the zone/memory setup is rather fragile and
> ad-hoc in places and it trusts the architecture code too much. Just in
> the .25 cycle i've seen about a dozen bugs all around that thing. I
> believe we should work on making the info that an architecture feeds to
> the MM "fool proof" - i.e. sanity-check for overlaps and other common
> setup errors.
I hadn't realised that such setup errors were common. add_active_range()
should already be able to handle some overlapping ranges.

I'm playing catch-up here but, looking at your dmesg output, I see the
following snippets.
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
[ 0.000000] BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 00000000efff8000 (usable)
[ 0.000000] BIOS-e820: 00000000efff8000 - 00000000f0000000 (ACPI data)
There are two regions of usable memory below 4GB, with a few holes between
and after them.
[ 0.000000] BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
[ 0.000000] BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
[ 0.000000] BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
[ 0.000000] BIOS-e820: 0000000100000000 - 0000000110000000 (usable)
And there is memory over the 4GB boundary but....
[ 0.000000] Warning only 4GB will be used.
[ 0.000000] Use a HIGHMEM64G enabled kernel.
[ 0.000000] Entering add_active_range(0, 0, 1048576) 0 entries of 256 used
The memory above 4GB is recognised, only memory below 4GB is registered,
and it's all on node 0. However, I do note that the single range passed to
add_active_range() also registers all the holes as valid memory. The pages
in the holes should never get freed because they should be reserved during
boot by reserve_bootmem(), but it still raises an eyebrow.
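To put a number on those holes, here is a minimal userspace sketch, not kernel code. It takes the three "(usable)" BIOS-e820 lines quoted above, clips them at the 4GB limit and counts how many of the 1048576 registered PFNs are not backed by usable memory. The struct and function names are hypothetical, and the 4KB page size is an assumption based on this being 32-bit x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12                            /* assumed 4KB pages */
#define MAX_PFN_4GB (1ULL << (32 - PAGE_SHIFT))  /* 1048576, the end PFN in the log */

/* Hypothetical type for this sketch; not the kernel's e820 structure. */
struct usable_range {
	uint64_t start; /* physical start address, inclusive */
	uint64_t end;   /* physical end address, exclusive */
};

/* The three "(usable)" BIOS-e820 lines quoted from the dmesg. */
static const struct usable_range usable[] = {
	{ 0x0000000000000000ULL, 0x000000000009f800ULL },
	{ 0x0000000000100000ULL, 0x00000000efff8000ULL },
	{ 0x0000000100000000ULL, 0x0000000110000000ULL },
};

/*
 * Count the whole pages of one usable range that lie below max_pfn.
 * The start is rounded up and the end rounded down so that partial
 * pages are not counted.
 */
static uint64_t usable_pages_below(const struct usable_range *r, uint64_t max_pfn)
{
	uint64_t start_pfn, end_pfn;

	assert(r->start <= r->end);
	start_pfn = (r->start + (1ULL << PAGE_SHIFT) - 1) >> PAGE_SHIFT;
	end_pfn = r->end >> PAGE_SHIFT;
	if (end_pfn > max_pfn)
		end_pfn = max_pfn;
	return end_pfn > start_pfn ? end_pfn - start_pfn : 0;
}

/* Pages in [0, max_pfn) that no usable entry covers (entries must not overlap). */
static uint64_t hole_pages_below(uint64_t max_pfn)
{
	uint64_t covered = 0;
	size_t i;

	for (i = 0; i < sizeof(usable) / sizeof(usable[0]); i++)
		covered += usable_pages_below(&usable[i], max_pfn);
	return max_pfn - covered;
}
```

Under those assumptions, hole_pages_below(MAX_PFN_4GB) comes to 65641 pages, a little over 256MB. Most of that is the region from 0xefff8000 up to 4GB, and the rest is the legacy hole below 1MB.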
[ 0.000000] early_node_map[1] active PFN ranges
[ 0.000000] 0: 0 -> 1048576
[ 0.000000] On node 0 totalpages: 1048576
[ 0.000000] DMA zone: 32 pages used for memmap
[ 0.000000] DMA zone: 0 pages reserved
[ 0.000000] DMA zone: 4064 pages, LIFO batch:0
[ 0.000000] Normal zone: 1760 pages used for memmap
[ 0.000000] Normal zone: 223520 pages, LIFO batch:31
[ 0.000000] HighMem zone: 6400 pages used for memmap
[ 0.000000] HighMem zone: 812800 pages, LIFO batch:31
[ 0.000000] Movable zone: 0 pages used for memmap
From this, it looks like the memmap is getting set up. So far, basic
initialisation looks OK.
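The zone figures above can be cross-checked with a small userspace sketch. It assumes 4KB pages, a 32-byte struct page (inferred from 32 memmap pages for the 4096-page DMA zone, not stated in the log), and that the "pages" figure for each zone is what remains after the memmap pages are subtracted. The names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE        4096ULL /* assumed 4KB pages */
#define STRUCT_PAGE_SIZE 32ULL   /* assumption inferred from the log, see above */

/* Hypothetical record of one zone's two figures as printed in the dmesg. */
struct zone_report {
	const char *name;
	uint64_t memmap_pages;    /* "pages used for memmap" */
	uint64_t remaining_pages; /* "N pages, LIFO batch:..." */
};

static const struct zone_report zones[] = {
	{ "DMA",     32,   4064   },
	{ "Normal",  1760, 223520 },
	{ "HighMem", 6400, 812800 },
};

/* Pages spanned by a zone: memmap pages plus what is left over. */
static uint64_t zone_span(const struct zone_report *z)
{
	return z->memmap_pages + z->remaining_pages;
}

/* Pages needed to hold one struct page per page in the span, rounded up. */
static uint64_t memmap_pages_for(uint64_t span)
{
	return (span * STRUCT_PAGE_SIZE + PAGE_SIZE - 1) / PAGE_SIZE;
}

/* Sum of all zone spans; should match "On node 0 totalpages". */
static uint64_t total_span(void)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < sizeof(zones) / sizeof(zones[0]); i++)
		total += zone_span(&zones[i]);
	return total;
}
```

With these assumptions, each zone's memmap size matches the log (32, 1760 and 6400 pages) and the three spans add up to the 1048576 pages reported for node 0, which is consistent with the memmap having been sized correctly.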
> It is easy for an architecture to mess up those things...
> Especially on oddball systems that are too large or too small to be
> normally tested. It's a common, reoccurring bug pattern that we could
> avoid by being a bit more resilient.
>
> if this is a zone setup bug then a sanity-check could catch it right
> where it happens - not much later in the slab code or so.
>
> Ingo
>
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab