From: Mel Gorman <mgorman@suse.de>
To: Jan Kara <jack@suse.cz>
Cc: Henrik Rydberg <rydberg@euromail.se>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-mm@kvack.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Oops in 3.7-rc8 isolate_free_pages_block()
Date: Thu, 6 Dec 2012 16:19:34 +0000
Message-ID: <20121206161934.GA17258@suse.de>
In-Reply-To: <20121206144821.GC18547@quack.suse.cz>

On Thu, Dec 06, 2012 at 03:48:21PM +0100, Jan Kara wrote:
> On Thu 06-12-12 10:17:44, Henrik Rydberg wrote:
> > Hi Linus,
> > 
> > This is the third time I encounter this oops in 3.7, but the first
> > time I managed to get a decent screenshot:
> > 
> > http://bitmath.org/test/oops-3.7-rc8.jpg
> > 
> > It seems to have to do with page migration. I run with transparent
> > hugepages configured, just for the fun of it.
> > 
> > I am happy to test any suggestions.
>   Adding linux-mm and Mel as an author of compaction in particular to CC...
> It seems that while traversing struct page structures, we entered into a new
> huge page (note that RBX is 0xffffea0001c00000 - just the beginning of
> a huge page) and oopsed on PageBuddy test (_mapcount is at offset 0x18 in
> struct page). It might be useful if you provide disassembly of
> isolate_freepages_block() function in your kernel so that we can guess more
> from other register contents...
> 

Still travelling and am not in a position to test this properly :(.
However, this bug feels very similar to a bug in the migration scanner where
a pfn_valid check is missed because the start is not aligned.  Henrik, when
did this start happening? I would be a little surprised if it started between
3.6 and 3.7-rcX but maybe it's just easier to hit now for some reason. How
reproducible is this? Is there anything in particular you do to trigger the
oops? Does the following patch help any? It's only compile tested I'm afraid.
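
For illustration only, the suspected failure mode (a scan that starts
unaligned and walks across a MAX_ORDER_NR_PAGES boundary into a hole) can
be sketched in user space. This is a rough model, not kernel code:
mock_pfn_valid(), scan_range(), the hole at pfns 2048-3071 and the value
1024 for MAX_ORDER_NR_PAGES are all assumptions made up for the example,
and the sketch assumes the caller has already validated the starting pfn.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 1024 pages per MAX_ORDER block; the real value is config-dependent. */
#define MAX_ORDER_NR_PAGES 1024UL

/* Hypothetical memory map: pfns in [HOLE_START, HOLE_END) have no struct page. */
#define HOLE_START 2048UL
#define HOLE_END   3072UL

/* Stand-in for the kernel's pfn_valid(); only models the hole above. */
static bool mock_pfn_valid(unsigned long pfn)
{
	return pfn < HOLE_START || pfn >= HOLE_END;
}

/*
 * Walk [start_pfn, end_pfn) and return the pfn at which the scan stopped.
 * The validity check runs only when pfn is the first page of a
 * MAX_ORDER_NR_PAGES block, so an unaligned start is re-checked as soon
 * as the walk crosses into the next block.
 */
static unsigned long scan_range(unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long pfn;

	assert(start_pfn <= end_pfn);

	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		if ((pfn & (MAX_ORDER_NR_PAGES - 1)) == 0 &&
		    !mock_pfn_valid(pfn))
			break;
		/* A real scanner would look at the struct page here. */
	}
	return pfn;
}
```

With these made-up values, scan_range(1500, 2500) stops at pfn 2048, the
first page of the block that lies in the hole, while scan_range(1500, 2000)
never crosses a boundary and runs to 2000.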

---8<---
mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGES block during isolation for free

Commit 0bf380bc (mm: compaction: check pfn_valid when entering a new
MAX_ORDER_NR_PAGES block during isolation for migration) added a check
for pfn_valid() when isolating pages for migration as the scanner does
not necessarily start pageblock-aligned. However, the free scanner has
the same problem. If it encounters a hole, it can also trigger an oops
when it calls PageBuddy(page) on a page that is within a hole.

Reported-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: stable@vger.kernel.org
---
 mm/compaction.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 9eef558..7d85ad485 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -298,6 +298,16 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 			continue;
 		if (!valid_page)
 			valid_page = page;
+
+		/*
+		 * As blockpfn may not start aligned, blockpfn->end_pfn
+		 * may cross a MAX_ORDER_NR_PAGES boundary and a pfn_valid
+		 * check is necessary. If the pfn is not valid, stop
+		 * isolation.
+		 */
+		if ((blockpfn & (MAX_ORDER_NR_PAGES - 1)) == 0 &&
+		    !pfn_valid(blockpfn))
+			break;
 		if (!PageBuddy(page))
 			continue;
 

