All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andy Whitcroft <apw@shadowen.org>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Adam Litke <agl@us.ibm.com>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	linux-mm <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	nacc <nacc@linux.vnet.ibm.com>, agl <agl@linux.vnet.ibm.com>
Subject: Re: [BUG] [PATCH v2] Make setup_zone_migrate_reserve() aware of overlapping nodes
Date: Tue, 26 Aug 2008 10:29:29 +0100	[thread overview]
Message-ID: <20080826092929.GD29207@brain> (raw)
In-Reply-To: <20080821113338.GA29950@csn.ul.ie>

On Thu, Aug 21, 2008 at 12:33:39PM +0100, Mel Gorman wrote:
> On (20/08/08 14:55), Adam Litke didst pronounce:
> >     Changes since V1
> >      - Fix build for !NUMA
> >      - Add VM_BUG_ON() to catch this problem at the source
> >     
> >     I have gotten to the root cause of the hugetlb badness I reported back on
> >     August 15th.  My system has the following memory topology (note the
> >     overlapping node):
> >     
> >             Node 0 Memory: 0x8000000-0x44000000
> >             Node 1 Memory: 0x0-0x8000000 0x44000000-0x80000000
> >     
> >     setup_zone_migrate_reserve() scans the address range 0x0-0x8000000 looking
> >     for a pageblock to move onto the MIGRATE_RESERVE list.  Finding no
> >     candidates, it happily continues the scan into 0x8000000-0x44000000.  When
> >     a pageblock is found, the pages are moved to the MIGRATE_RESERVE list on
> >     the wrong zone.  Oops.
> >     
> >     (Andrew: once the proper fix is agreed upon, this should also be a
> >     candidate for -stable.)
> >     
> >     setup_zone_migrate_reserve() should skip pageblocks in overlapping nodes.
> >     
> >     Signed-off-by: Adam Litke <agl@us.ibm.com>
> > 
> 
> zone_to_nid(zone) is called every time in the loop even though it will never
> change. This is less than optimal but setup_zone_migrate_reserve() is only
> called during init and when min_free_kbytes is adjusted so it's not worth
> worrying about. Otherwise it looks good.
> 
> Acked-by: Mel Gorman <mel@csn.ul.ie>
> 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index af982f7..feb7916 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -694,6 +694,9 @@ static int move_freepages(struct zone *zone,
> >  #endif
> >  
> >  	for (page = start_page; page <= end_page;) {
> > +		/* Make sure we are not inadvertently changing nodes */
> > +		VM_BUG_ON(page_to_nid(page) != zone_to_nid(zone));
> > +
> >  		if (!pfn_valid_within(page_to_pfn(page))) {
> >  			page++;
> >  			continue;
> > @@ -2516,6 +2519,10 @@ static void setup_zone_migrate_reserve(struct zone *zone)
> >  			continue;
> >  		page = pfn_to_page(pfn);
> >  
> > +		/* Watch out for overlapping nodes */
> > +		if (page_to_nid(page) != zone_to_nid(zone))
> > +			continue;
> > +
> >  		/* Blocks with reserved pages will never free, skip them. */
> >  		if (PageReserved(page))
> >  			continue;

This patch looks sane.  I do note that we have a config option to tell
us whether we have any possibility of overlapping nodes, and we have an
early version of a check for this early_pfn_in_nid() in mm.h.  You might
consider having a non-early variant of this which could be optimised
away for those arches which do not have CONFIG_NODES_SPAN_OTHER_NODES.

In 'unearlifying' this to pfn_in_nid() I think we have a small naming
issue with these function as they are only valid for use with pfns within
an existing node.  They should probabally both be *pfn_in_nid_within()
or something in line with pfn_valid_within().

-apw

WARNING: multiple messages have this Message-ID (diff)
From: Andy Whitcroft <apw@shadowen.org>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Adam Litke <agl@us.ibm.com>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	linux-mm <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	nacc <nacc@linux.vnet.ibm.com>, agl <agl@linux.vnet.ibm.com>
Subject: Re: [BUG] [PATCH v2] Make setup_zone_migrate_reserve() aware of overlapping nodes
Date: Tue, 26 Aug 2008 10:29:29 +0100	[thread overview]
Message-ID: <20080826092929.GD29207@brain> (raw)
In-Reply-To: <20080821113338.GA29950@csn.ul.ie>

On Thu, Aug 21, 2008 at 12:33:39PM +0100, Mel Gorman wrote:
> On (20/08/08 14:55), Adam Litke didst pronounce:
> >     Changes since V1
> >      - Fix build for !NUMA
> >      - Add VM_BUG_ON() to catch this problem at the source
> >     
> >     I have gotten to the root cause of the hugetlb badness I reported back on
> >     August 15th.  My system has the following memory topology (note the
> >     overlapping node):
> >     
> >             Node 0 Memory: 0x8000000-0x44000000
> >             Node 1 Memory: 0x0-0x8000000 0x44000000-0x80000000
> >     
> >     setup_zone_migrate_reserve() scans the address range 0x0-0x8000000 looking
> >     for a pageblock to move onto the MIGRATE_RESERVE list.  Finding no
> >     candidates, it happily continues the scan into 0x8000000-0x44000000.  When
> >     a pageblock is found, the pages are moved to the MIGRATE_RESERVE list on
> >     the wrong zone.  Oops.
> >     
> >     (Andrew: once the proper fix is agreed upon, this should also be a
> >     candidate for -stable.)
> >     
> >     setup_zone_migrate_reserve() should skip pageblocks in overlapping nodes.
> >     
> >     Signed-off-by: Adam Litke <agl@us.ibm.com>
> > 
> 
> zone_to_nid(zone) is called every time in the loop even though it will never
> change. This is less than optimal but setup_zone_migrate_reserve() is only
> called during init and when min_free_kbytes is adjusted so it's not worth
> worrying about. Otherwise it looks good.
> 
> Acked-by: Mel Gorman <mel@csn.ul.ie>
> 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index af982f7..feb7916 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -694,6 +694,9 @@ static int move_freepages(struct zone *zone,
> >  #endif
> >  
> >  	for (page = start_page; page <= end_page;) {
> > +		/* Make sure we are not inadvertently changing nodes */
> > +		VM_BUG_ON(page_to_nid(page) != zone_to_nid(zone));
> > +
> >  		if (!pfn_valid_within(page_to_pfn(page))) {
> >  			page++;
> >  			continue;
> > @@ -2516,6 +2519,10 @@ static void setup_zone_migrate_reserve(struct zone *zone)
> >  			continue;
> >  		page = pfn_to_page(pfn);
> >  
> > +		/* Watch out for overlapping nodes */
> > +		if (page_to_nid(page) != zone_to_nid(zone))
> > +			continue;
> > +
> >  		/* Blocks with reserved pages will never free, skip them. */
> >  		if (PageReserved(page))
> >  			continue;

This patch looks sane.  I do note that we have a config option to tell
us whether we have any possibility of overlapping nodes, and we have an
early version of a check for this early_pfn_in_nid() in mm.h.  You might
consider having a non-early variant of this which could be optimised
away for those arches which do not have CONFIG_NODES_SPAN_OTHER_NODES.

In 'unearlifying' this to pfn_in_nid() I think we have a small naming
issue with these function as they are only valid for use with pfns within
an existing node.  They should probabally both be *pfn_in_nid_within()
or something in line with pfn_valid_within().

-apw

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2008-08-26  9:29 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-08-15 22:01 [BUG] __GFP_THISNODE is not always honored Adam Litke
2008-08-15 22:01 ` Adam Litke
2008-08-18 10:59 ` Mel Gorman
2008-08-18 10:59   ` Mel Gorman
2008-08-18 18:16   ` Adam Litke
2008-08-18 19:57     ` Mel Gorman
2008-08-18 19:57       ` Mel Gorman
2008-08-18 19:14   ` Christoph Lameter
2008-08-18 19:14     ` Christoph Lameter
2008-08-18 19:21 ` Christoph Lameter
2008-08-18 19:21   ` Christoph Lameter
2008-08-18 19:52   ` Mel Gorman
2008-08-18 19:52     ` Mel Gorman
2008-08-20 17:08 ` [BUG] Make setup_zone_migrate_reserve() aware of overlapping nodes Adam Litke
2008-08-20 17:08   ` Adam Litke
2008-08-20 18:11   ` Dave Hansen
2008-08-20 18:11     ` Dave Hansen
2008-08-20 19:55     ` [BUG] [PATCH v2] " Adam Litke
2008-08-20 19:55       ` Adam Litke
2008-08-21 11:33       ` Mel Gorman
2008-08-21 11:33         ` Mel Gorman
2008-08-26  9:29         ` Andy Whitcroft [this message]
2008-08-26  9:29           ` Andy Whitcroft

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080826092929.GD29207@brain \
    --to=apw@shadowen.org \
    --cc=agl@linux.vnet.ibm.com \
    --cc=agl@us.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=dave@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=nacc@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.