From: Bharata B Rao <bharata@in.ibm.com>
To: Andi Kleen <ak@suse.de>
Cc: Ray Bryant <raybry@mpdtxmail.amd.com>,
Christoph Lameter <clameter@engr.sgi.com>,
discuss@x86-64.org, linux-kernel@vger.kernel.org
Subject: Re: [discuss] mmap, mbind and write to mmap'ed memory crashes 2.6.16-rc1[2] on 2 node X86_64
Date: Wed, 8 Feb 2006 17:40:00 +0530
Message-ID: <20060208121000.GA9906@in.ibm.com>
In-Reply-To: <200602080036.31059.ak@suse.de>
On Wed, Feb 08, 2006 at 12:36:30AM +0100, Andi Kleen wrote:
> On Wednesday 08 February 2006 00:27, Ray Bryant wrote:
> > On Tuesday 07 February 2006 10:49, Christoph Lameter wrote:
> > > On Tue, 7 Feb 2006, Bharata B Rao wrote:
> > > > I can still crash my x86_64 box with Christoph's program.
> > >
> > > So it looks like the problem is arch specific. Test program runs fine on
> > > ia64.
> > >
> > > > page = 0xffffffffffffffd8
> > > > &page->lru = 0000000000000000
> > >
> > > Yup lru field overwritten as I thought.
> > > -
> > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > Please read the FAQ at http://www.tux.org/lkml/
> >
> > For what it is worth:
> >
> > Christoph's test program runs fine on my 32 GB, 4 socket, 8 core Opteron 64
>
> Opteron 64? A new exciting upcoming product? @)
>
> > box with 2.6.16-rc1.
>
> Yes it also works on my test box and also some other simple tests with MPOL_BIND.
> But we had similar reports on two different systems, so there's very likely a problem.
> Just need to reproduce it somehow.
>
I believe I understand why I am seeing this problem with my setup.
The zones in my machine look like this:
On node 0 totalpages: 773791
  DMA zone: 2151 pages, LIFO batch:0
  DMA32 zone: 771640 pages, LIFO batch:31
  Normal zone: 0 pages, LIFO batch:0
  HighMem zone: 0 pages, LIFO batch:0
On node 1 totalpages: 500592
  DMA zone: 0 pages, LIFO batch:0
  DMA32 zone: 242032 pages, LIFO batch:31
  Normal zone: 258560 pages, LIFO batch:31
  HighMem zone: 0 pages, LIFO batch:0
So node 0 has only DMA and DMA32 zones, while node 1 has only DMA32 and
Normal zones.
The current mempolicy code assumes that the highest zone covered by the
memory policy (policy_zone) is valid on every node, by which I mean that
zone->present_pages is non-zero. That is not true in my case: policy_zone
gets set to ZONE_NORMAL (the highest zone here), but node 0 has no pages
in ZONE_NORMAL.
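To make the mismatch concrete, here is a small user-space model of the zone
table above. It is only a sketch: the array layout, the helper names
(highest_populated_zone, zone_populated, report_policy_zone) and the way the
highest zone is derived are my own assumptions for illustration, not the
kernel's code. The page counts are the ones from the boot log.

```c
#include <assert.h>
#include <stdio.h>

/* Zone indices in the order the boot log lists them. */
enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };
#define NR_NODES 2

/* present_pages per node and zone, copied from the boot log. */
static const unsigned long present_pages[NR_NODES][MAX_NR_ZONES] = {
    { 2151, 771640,      0, 0 },   /* node 0 */
    {    0, 242032, 258560, 0 },   /* node 1 */
};

/*
 * Hypothetical helper: highest zone index that is populated on at least
 * one node. This models policy_zone ending up as ZONE_NORMAL.
 */
int highest_populated_zone(void)
{
    int highest = -1;

    for (int nid = 0; nid < NR_NODES; nid++)
        for (int z = 0; z < MAX_NR_ZONES; z++)
            if (present_pages[nid][z] != 0 && z > highest)
                highest = z;
    return highest;
}

/* Hypothetical helper: 1 if the zone has pages on this node, else 0. */
int zone_populated(int nid, int zone)
{
    assert(nid >= 0 && nid < NR_NODES);
    assert(zone >= 0 && zone < MAX_NR_ZONES);
    return present_pages[nid][zone] != 0;
}

/* Print how many pages each node has in the modelled policy zone. */
void report_policy_zone(void)
{
    int pz = highest_populated_zone();

    for (int nid = 0; nid < NR_NODES; nid++)
        printf("node %d: zone %d has %lu pages\n",
               nid, pz, present_pages[nid][pz]);
}
```

Calling report_policy_zone() from a main() shows the modelled policy zone
holding pages on node 1 but none on node 0, which is the case the
assumption does not cover.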
When mbind'ing to node 0, bind_zonelist() (and the functions that follow it)
binds node 0's ZONE_NORMAL to vma->vm_policy. During the write fault, the
allocator is then asked to allocate from a non-existent ZONE_NORMAL on
node 0. I believe this is what causes the oops I am seeing. It is still not
clear to me why the allocator doesn't gracefully fail allocations from a
zone that has zone->present_pages == 0.
This problem wasn't seen on 2.6.15.2 because there bind_zonelist()
makes sure that the zone it is binding to has a non-zero
zone->present_pages.
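The difference between the two behaviours can be sketched in user space as
follows. This is a simplified model under my own assumptions, not the kernel
source: struct zone here has only the fields needed for the illustration,
bind_unchecked() and bind_checked() are hypothetical names, and the fallback
to lower zones in bind_checked() is one plausible way to skip empty zones
rather than a statement of what 2.6.15.2 does line by line.

```c
#include <assert.h>
#include <stddef.h>

enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };
#define NR_NODES 2
#define ZL_MAX (NR_NODES * MAX_NR_ZONES + 1)   /* all zones + NULL terminator */

/* Minimal stand-in for the kernel's struct zone (illustration only). */
struct zone { int nid; int idx; unsigned long present_pages; };

/* Zone table with the page counts from the boot log. */
static const struct zone node_zones[NR_NODES][MAX_NR_ZONES] = {
    { {0, ZONE_DMA, 2151}, {0, ZONE_DMA32, 771640},
      {0, ZONE_NORMAL, 0}, {0, ZONE_HIGHMEM, 0} },
    { {1, ZONE_DMA, 0}, {1, ZONE_DMA32, 242032},
      {1, ZONE_NORMAL, 258560}, {1, ZONE_HIGHMEM, 0} },
};

/* Bind policy_zone of every node in the mask, populated or not. */
size_t bind_unchecked(unsigned nodemask, int policy_zone, const struct zone **zl)
{
    size_t num = 0;

    assert(policy_zone >= 0 && policy_zone < MAX_NR_ZONES);
    for (int nid = 0; nid < NR_NODES; nid++)
        if (nodemask & (1u << nid))
            zl[num++] = &node_zones[nid][policy_zone];
    zl[num] = NULL;
    return num;
}

/* Walk from policy_zone downwards and skip zones without pages. */
size_t bind_checked(unsigned nodemask, int policy_zone, const struct zone **zl)
{
    size_t num = 0;

    assert(policy_zone >= 0 && policy_zone < MAX_NR_ZONES);
    for (int k = policy_zone; k >= 0; k--)
        for (int nid = 0; nid < NR_NODES; nid++)
            if ((nodemask & (1u << nid)) && node_zones[nid][k].present_pages)
                zl[num++] = &node_zones[nid][k];
    zl[num] = NULL;
    return num;
}

/* Count entries in a NULL-terminated zonelist that have no pages. */
size_t count_empty(const struct zone **zl)
{
    size_t empty = 0;

    for (; *zl != NULL; zl++)
        if ((*zl)->present_pages == 0)
            empty++;
    return empty;
}
```

With a node mask of 1u << 0 and ZONE_NORMAL as the policy zone, the
unchecked variant produces a one-entry list whose only zone is empty, while
the checked variant produces node 0's DMA32 and DMA zones.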
Regards,
Bharata.