From: Mel Gorman <mgorman@suse.de>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
linux-mm@kvack.org, bugzilla-daemon@bugzilla.kernel.org,
bugme-daemon@bugzilla.kernel.org, qcui@redhat.com,
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
Li Zefan <lizf@cn.fujitsu.com>
Subject: Re: [Bugme-new] [Bug 36192] New: Kernel panic when boot the 2.6.39+ kernel based off of 2.6.32 kernel
Date: Tue, 7 Jun 2011 11:13:00 +0100
Message-ID: <20110607101300.GL5247@suse.de>
In-Reply-To: <20110607180630.be24e7c3.kamezawa.hiroyu@jp.fujitsu.com>
On Tue, Jun 07, 2011 at 06:06:30PM +0900, KAMEZAWA Hiroyuki wrote:
> On Tue, 7 Jun 2011 10:03:13 +0100
> Mel Gorman <mgorman@suse.de> wrote:
>
> > On Tue, Jun 07, 2011 at 09:57:08AM +0900, KAMEZAWA Hiroyuki wrote:
> > > On Mon, 6 Jun 2011 14:45:19 -0700
> > > Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > > Hopefully he can test this one for us as well, thanks.
> > > >
> > >
> > > A patch with better description (of mine) is here.
> > > Anyway, I felt I needed a fix for ARM special case.
> > >
> > > ==
> > > fix-init-page_cgroup-for-sparsemem-taking-care-of-broken-page-flags.patch
> > > Even with SPARSEMEM, there are some magical memmap.
> > >
> >
> > Who wants to introduce SPARSEMEM_MAGICAL?
> >
>
> ARM guys ;)
>
> > > If a Node is not aligned to SECTION, memmap of pfn which is out of
> > > Node's range is not initialized. And page->flags contains 0.
> > >
> >
> > This is tangential but it might be worth introducing
> > CONFIG_DEBUG_MEMORY_MODEL that WARN_ONs page->flag == 0 in
> > pfn_to_page() to catch some accesses outside node boundaries. Not for
> > this bug though.
> >
>
> Hmm, but if zone == 0 && section == 0 && nid == 0, page->flags is 0.
>
Sorry, what I meant to suggest was that page->flags outside of node
boundaries be initialised to a poison value that is an impossible
combination of flags, and that pfn_to_page() check for that value.
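
To make the poison idea concrete, here is a rough userspace sketch, not
kernel code. Everything prefixed model_/MODEL_ is hypothetical, and the
all-ones poison is only an assumption about what an impossible flag
combination could look like. Returning NULL stands in for where a debug
build would WARN_ON.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_NR_PFNS 16UL

/* Assumed poison: all bits set, taken to be an impossible flag combination. */
#define MODEL_PAGE_FLAGS_POISON (~0UL)

struct model_page {
	unsigned long flags;
};

static struct model_page model_memmap[MODEL_NR_PFNS];

/* Poison every memmap entry, then clear the flags of pfns inside the node. */
static void model_init_memmap(unsigned long node_start, unsigned long node_end)
{
	unsigned long pfn;

	assert(node_start <= node_end && node_end <= MODEL_NR_PFNS);

	for (pfn = 0; pfn < MODEL_NR_PFNS; pfn++)
		model_memmap[pfn].flags = MODEL_PAGE_FLAGS_POISON;

	/* flags == 0 is legal here: zone 0, section 0, nid 0 */
	for (pfn = node_start; pfn < node_end; pfn++)
		model_memmap[pfn].flags = 0;
}

/* Debug lookup: returns NULL where a real debug check would WARN_ON. */
static struct model_page *model_pfn_to_page(unsigned long pfn)
{
	struct model_page *page;

	if (pfn >= MODEL_NR_PFNS)
		return NULL;

	page = &model_memmap[pfn];
	if (page->flags == MODEL_PAGE_FLAGS_POISON)
		return NULL;

	return page;
}
```

With a node covering pfns 4-11, lookups of pfn 3 and pfn 12 are caught
while pfn 4 is returned with flags == 0, which is the case a plain
"flags == 0" test could not tell apart from uninitialised memmap.
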
> > > If Node(0) doesn't exist, NODE_DATA(pfn_to_nid(pfn)) causes error.
> > >
> >
> > Well, not in itself. It causes a bug when we try allocate memory
> > from node 0 but there is a subtle performance bug here as well. For
> > unaligned nodes, the cgroup information can be allocated from node
> > 0 instead of node-local.
> >
> > > In another case, for example, ARM frees memmap which is never be used
> > > even under SPARSEMEM. In that case, page->flags will contain broken
> > > value.
> > >
> >
> > Again, not as such. In that case, struct page is not valid memory
> > at all.
>
> Hmm, IIUC, ARM's code frees memmap by free_bootmem().....so, memory used
> for 'struct page' is valid and can access (but it's not struct page.)
>
> If my English sounds strange, I'm sorry. Hm
>
> How about this ?
> ==
> In another case, for example, ARM frees memmap which is never be used
> and reuse memory for memmap for other purpose. So, in that case,
> a page got by pfn_to_page(pfn) may not a struct page.
> ==
>
Much better.
>
>
> >
> > > This patch does a strict check on nid which is obtained by
> > > pfn_to_page() and use proper NID for page_cgroup allocation.
> > >
> > > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > >
> > > ---
> > > mm/page_cgroup.c | 36 +++++++++++++++++++++++++++++++++++-
> > > 1 file changed, 35 insertions(+), 1 deletion(-)
> > >
> > > Index: linux-3.0-rc1/mm/page_cgroup.c
> > > ===================================================================
> > > --- linux-3.0-rc1.orig/mm/page_cgroup.c
> > > +++ linux-3.0-rc1/mm/page_cgroup.c
> > > @@ -168,6 +168,7 @@ static int __meminit init_section_page_c
> > > struct mem_section *section;
> > > unsigned long table_size;
> > > unsigned long nr;
> > > + unsigned long tmp;
> > > int nid, index;
> > >
> > > nr = pfn_to_section_nr(pfn);
> > > @@ -175,8 +176,41 @@ static int __meminit init_section_page_c
> > >
> > > if (section->page_cgroup)
> > > return 0;
> > > + /*
> > > + * check Node-ID. Because we get 'pfn' which is obtained by calculation,
> > > + * the pfn may "not exist" or "alreay freed". Even if pfn_valid() returns
> > > + * true, page->flags may contain broken value and pfn_to_nid() returns
> > > + * bad value.
> > > + * (See CONFIG_ARCH_HAS_HOLES_MEMORYMODEL and ARM's free_memmap())
> > > + * So, we need to do careful check, here.
> > > + */
> >
> > You don't really need to worry about ARM here as long as you stay
> > within node boundaries and you only care about the first valid page
> > in the node. Why not lookup NODE_DATA(nid) and make sure start and
> > end are within the node boundaries?
> >
>
> I thought ARM's code just takes care of MAX_ORDER alignment..
Which is not the same as section alignment. Whatever alignment it's
using, the start of the node is still going to be valid.
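
As a rough illustration of staying within node boundaries, here is a
userspace sketch rather than a proposed patch. The model_node struct,
the section size of 8 pages and the helper names are all made up for
the example. It clamps the section that contains a pfn to the node's
range and returns the first pfn that is inside the node, which is the
only page whose nid the allocation would need.

```c
#include <assert.h>

/* Hypothetical userspace model; the struct and helpers are not kernel code. */
#define MODEL_PAGES_PER_SECTION 8UL	/* assumed power of two */
#define MODEL_NO_PFN (~0UL)

struct model_node {
	unsigned long start_pfn;	/* first pfn belonging to the node */
	unsigned long spanned_pages;	/* number of pfns the node spans */
};

static unsigned long model_node_end_pfn(const struct model_node *node)
{
	return node->start_pfn + node->spanned_pages;
}

/*
 * Clamp the section containing 'pfn' to the node's range. Returns the
 * first pfn of that section which lies inside the node, or MODEL_NO_PFN
 * if the section and the node do not overlap.
 */
static unsigned long model_first_pfn_in_node(const struct model_node *node,
					     unsigned long pfn)
{
	unsigned long start = pfn & ~(MODEL_PAGES_PER_SECTION - 1);
	unsigned long end = start + MODEL_PAGES_PER_SECTION;
	unsigned long node_end = model_node_end_pfn(node);

	assert(node->spanned_pages > 0);

	if (start < node->start_pfn)
		start = node->start_pfn;
	if (end > node_end)
		end = node_end;

	return start < end ? start : MODEL_NO_PFN;
}
```

For a node spanning pfns 5-18, the section starting at pfn 0 yields
pfn 5, so the unaligned head of the section is never touched, and a
section that lies entirely past the node yields MODEL_NO_PFN.
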
--
Mel Gorman
SUSE Labs