From: Vivek Goyal <vgoyal@in.ibm.com>
To: Mel Gorman <mel@skynet.ie>
Cc: Steve Fox <drfickle@us.ibm.com>, Andi Kleen <ak@suse.de>,
Badari Pulavarty <pbadari@us.ibm.com>,
Martin Bligh <mbligh@mbligh.org>, Andrew Morton <akpm@osdl.org>,
lkml <linux-kernel@vger.kernel.org>,
netdev@vger.kernel.org, kmannth@us.ibm.com,
Andy Whitcroft <apw@shadowen.org>
Subject: Re: 2.6.18-mm2 boot failure on x86-64
Date: Fri, 6 Oct 2006 13:59:50 -0400
Message-ID: <20061006175950.GD19756@in.ibm.com>
In-Reply-To: <20061006171105.GC9881@skynet.ie>
On Fri, Oct 06, 2006 at 06:11:05PM +0100, Mel Gorman wrote:
> On (06/10/06 11:36), Vivek Goyal didst pronounce:
> > On Fri, Oct 06, 2006 at 03:33:12PM +0100, Mel Gorman wrote:
> > > > Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> > > > Command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> > > > BIOS-provided physical RAM map:
> > > > BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
> > > > BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
> > > > BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> > > > BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
> > > > BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
> > > > BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
> > > > BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> > > > BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
> > >
> > > I continued what Steve was doing this morning to see could this be
> > > pinned down. After placing 'CHECK;' in a few places as suggested by
> > > Andi's check, the problem code was identified as that following in
> > > mm/bootmem.c#init_bootmem_core()
> > >
> > > mapsize = get_mapsize(bdata);
> > > memset(bdata->node_bootmem_map, 0xff, mapsize);
> > >
> > > That explains the value in the array at least. A few more printfs around
> > > this point printed out the following in the boot log
> > >
> > > init_bootmem_core(0, 1909, 0, 12582912)
> > > init_bootmem_core: Calling memset(0xFFFF810000775000, 1572864)
> > > AAGH: afinfo corrupted at mm/bootmem.c:121
> > >
> > > where;
> > >
> > > 1909 == mapstart
> > > 0 == start
> > > 12582912 == end
> > > 1572864 == mapsize
> > >
> > > mapstart, start and end being the parameters being passed to
> > > init_bootmem_core(). This means we are calling memset for the physical
> > > range 0x775000 -> 0x8F5000 which is in a usable range according to the
> > > BIOS-e820 map it appears.
> > >
> >
> > Hi Mel,
> >
>
> Hi.
>
> > Where is bss placed in physical memory? I guess bss_start and bss_stop
> > from System.map will tell us. That will confirm that above memset step is
> > stomping over bss. Then we have to just find that somewhere probably
> > we allocated wrong physical memory area for bootmem allocator map.
> >
>
> BSS is at 0x643000 -> 0x777BC4
> init_bootmem wipes from 0x777000 -> 0x8F7000
>
> So the BSS bytes from 0x777000 -> 0x777BC4 (which looks very suspiciously
> like a page alignment of addr & PAGE_MASK) get set to 0xFF. One possible
> fix is below. It adds a check in bad_addr() to see if the BSS section is
> about to be used for bootmap. It Seems To Work For Me (tm) and illustrates
> the source of the problem even if it's not the 100% correct fix.
>

Ok, it looks like that code is assuming that the memory area returned by
find_e820_area() is page aligned. I found two such instances, and that is
what leads to the problem.

	bootmap_size = init_bootmem_node(NODE_DATA(nodeid),
					 bootmap_start >> PAGE_SHIFT,
					 start_pfn, end_pfn);

Here bootmap_start is not page aligned, and I guess it currently contains
the value 0x777BC4 (just beyond _end). But the moment I do
bootmap_start >> PAGE_SHIFT, the offset within the page is dropped, the
bootmap begins at the start of that page, and I start stomping on bss.
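
To make the arithmetic concrete, here is a small userspace sketch (not kernel code). The helper names are made up for illustration, PAGE_SHIFT is assumed to be 12 (4 KiB pages), and 0x777BC4 is only my guess at the current value of bootmap_start:

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical helper: physical address -> page frame number.
 * The shift discards the low PAGE_SHIFT bits, i.e. it rounds DOWN. */
static uint64_t phys_to_pfn(uint64_t phys)
{
	return phys >> PAGE_SHIFT;
}

/* Hypothetical helper: page frame number -> address of the page start. */
static uint64_t pfn_to_phys(uint64_t pfn)
{
	return pfn << PAGE_SHIFT;
}

/* Bytes between the start of the page and phys. If phys sits just past
 * _end, these are the bytes of bss that a bootmap placed at the page
 * start would overwrite. */
static uint64_t bytes_below_in_page(uint64_t phys)
{
	return phys - pfn_to_phys(phys_to_pfn(phys));
}
```

With phys = 0x777BC4 the pfn comes out as 0x777, the page starts at 0x777000, and 0xBC4 bytes below the intended start end up inside the bootmap.
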

The case is similar here:

	bootmap = find_e820_area(0, end_pfn<<PAGE_SHIFT, bootmap_size);
	if (bootmap == -1L)
		panic("Cannot find bootmem map of size %ld\n", bootmap_size);
	bootmap_size = init_bootmem(bootmap >> PAGE_SHIFT, end_pfn);

So maybe we should return a page-aligned address from find_e820_area().
Maybe we can change bad_addr() to set *addrp to the next page-aligned
boundary for every check? For example, for the _end check:

	*addrp = PAGE_ALIGN(__pa_symbol(&_end));
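
As a rough userspace sketch of what rounding up would buy us (again not kernel code): the macros below are redefined locally to mirror the usual round-up-to-page definition, PAGE_SHIFT is assumed to be 12, and first_safe_pfn() is a made-up name for illustration:

```c
#include <stdint.h>

#define PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define PAGE_SIZE	((uint64_t)1 << PAGE_SHIFT)
#define PAGE_MASK	(~(PAGE_SIZE - 1))

/* Round addr UP to the next page boundary; an address that is already
 * aligned is left unchanged. */
#define PAGE_ALIGN(addr)	(((addr) + PAGE_SIZE - 1) & PAGE_MASK)

/* Hypothetical helper: the pfn to hand to init_bootmem() once the
 * candidate address has been rounded up. The later shift can no longer
 * move the start of the bootmap below the candidate address. */
static uint64_t first_safe_pfn(uint64_t candidate)
{
	return PAGE_ALIGN(candidate) >> PAGE_SHIFT;
}
```

For a candidate of 0x777BC4 this gives 0x778000 and pfn 0x778, so the bootmap would begin on the first whole page after _end instead of on the page that still holds the tail of bss.
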
Thanks
Vivek