All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mel Gorman <mel@csn.ul.ie>
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>, Eric Whitney <eric.whitney@hp.com>
Subject: Re: Suspect use of "first_zones_zonelist()"
Date: Tue, 22 Apr 2008 18:49:02 +0100	[thread overview]
Message-ID: <20080422174901.GA7261@csn.ul.ie> (raw)
In-Reply-To: <1208884215.5534.57.camel@localhost>

On (22/04/08 13:10), Lee Schermerhorn didst pronounce:
> > > <SNIP>
> > >
> > > --i.e., the first that satisfies any nodemask
> > > constraint.  I renamed 'dummy' to 'zone', ignore the return value and
> > > use:  newnid = zone->node.  [I guess I could use zonelist_node_idx(zr
> > > -1) as well.] 
> > 
> > zr - 1 would be vunerable to the iterator implementation changing.
> 
> Ah, good point.  Shouldn't peek under the covers like that.
> 
> > 
> > >  This results in page migration to the expected node.
> > > 
> > 
> > This use of zone instead of the zoneref cursor should be made throughout.
> > 
> > > Anyway, after discovering this, I checked other usages of
> > > first_zones_zonelist() outside of the iterator macros, and I THINK they
> > > might be making the same mistake?
> > > 
> > 
> > Yes, you're right.
> > 
> > > Here's a patch that "fixes" these.  Do you agree?  Or am I
> > > misunderstanding this area [again!]?
> > > 
> > 
> > No, I screwed up with the use of cursors and didn't get caught for it as
> > the effect would be very difficult to spot normally. I extended your patch
> > slightly below to catch the other callers. Can you take a read-through please?
> 
> OK.  Looks good.  I see I missed one case.
> 
> A suggestion.  How about enhancing the comment [maybe a kernel doc
> block?] on first_zones_zonelist() to explain that it returns the zone
> via the zone parameter and that the return value is a cursor for
> iterators?  Perhaps similarly for next_zones_zonelist() in mmzone.c?
> 
> Or would you like me to take a cut at this?
> 

That's a good suggestion. How about the following?

=======
Subject: [PATCH] fix off-by-one usage of first_zones_zonelist()

Against:  2.6.25-mm1

The return value of first_zones_zonelist() is actually the zoneref
AFTER the "requested zone"--i.e., the first zone in the zonelist
that satisfies any nodemask constraint.  The "requested zone" is
returned via the @zone parameter.  The returned zoneref is intended
to be passed to next_zones_zonelist() on subsequent iterations.

This patch fixes first_zones_zonelist() callers appropriately and documents
first_zones_zonelist() to help avoid a repeat of the same mistake.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
 fs/buffer.c            |    9 ++++-----
 include/linux/mmzone.h |   25 ++++++++++++++++++++++++-
 mm/mempolicy.c         |    9 ++++-----
 mm/page_alloc.c        |    4 ++--
 4 files changed, 34 insertions(+), 13 deletions(-)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.25-mm1-clean/fs/buffer.c linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/fs/buffer.c
--- linux-2.6.25-mm1-clean/fs/buffer.c	2008-04-22 10:30:02.000000000 +0100
+++ linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/fs/buffer.c	2008-04-22 18:35:56.000000000 +0100
@@ -368,18 +368,17 @@ void invalidate_bdev(struct block_device
  */
 static void free_more_memory(void)
 {
-	struct zoneref *zrefs;
-	struct zone *dummy;
+	struct zone *zone;
 	int nid;
 
 	wakeup_pdflush(1024);
 	yield();
 
 	for_each_online_node(nid) {
-		zrefs = first_zones_zonelist(node_zonelist(nid, GFP_NOFS),
+		(void)first_zones_zonelist(node_zonelist(nid, GFP_NOFS),
 						gfp_zone(GFP_NOFS), NULL,
-						&dummy);
-		if (zrefs->zone)
+						&zone);
+		if (zone)
 			try_to_free_pages(node_zonelist(nid, GFP_NOFS), 0,
 						GFP_NOFS);
 	}
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.25-mm1-clean/include/linux/mmzone.h linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/include/linux/mmzone.h
--- linux-2.6.25-mm1-clean/include/linux/mmzone.h	2008-04-22 10:30:03.000000000 +0100
+++ linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/include/linux/mmzone.h	2008-04-22 18:45:37.000000000 +0100
@@ -742,12 +742,36 @@ static inline int zonelist_node_idx(stru
 #endif /* CONFIG_NUMA */
 }
 
+/**
+ * next_zones_zonelist - Returns the next zone at or below highest_zoneidx within the allowed nodemask using a cursor within a zonelist as a starting point
+ * @z - The cursor used as a starting point for the search
+ * @highest_zoneidx - The zone index of the highest zone to return
+ * @nodes - An optional nodemask to filter the zonelist with
+ * @zone - The first suitable zone found is returned via this parameter
+ *
+ * This function returns the next zone at or below a given zone index that is
+ * within the allowed nodemask using a cursor as the starting point for the
+ * search. The zoneref returned is a cursor that is used as the next starting
+ * point for future calls to next_zones_zonelist().
+ */
 struct zoneref *next_zones_zonelist(struct zoneref *z,
 					enum zone_type highest_zoneidx,
 					nodemask_t *nodes,
 					struct zone **zone);
 
-/* Returns the first zone at or below highest_zoneidx in a zonelist */
+/**
+ * first_zones_zonelist - Returns the first zone at or below highest_zoneidx within the allowed nodemask in a zonelist
+ * @zonelist - The zonelist to search for a suitable zone
+ * @highest_zoneidx - The zone index of the highest zone to return
+ * @nodes - An optional nodemask to filter the zonelist with
+ * @zone - The first suitable zone found is returned via this parameter
+ *
+ * This function returns the first zone at or below a given zone index that is
+ * within the allowed nodemask. The zoneref returned is a cursor that can be
+ * used to iterate the zonelist with next_zones_zonelist. The cursor should
+ * not be used by the caller as it does not match the value of the zone
+ * returned.
+ */
 static inline struct zoneref *first_zones_zonelist(struct zonelist *zonelist,
 					enum zone_type highest_zoneidx,
 					nodemask_t *nodes,
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.25-mm1-clean/mm/mempolicy.c linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/mm/mempolicy.c
--- linux-2.6.25-mm1-clean/mm/mempolicy.c	2008-04-22 10:30:04.000000000 +0100
+++ linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/mm/mempolicy.c	2008-04-22 18:35:56.000000000 +0100
@@ -1396,14 +1396,13 @@ unsigned slab_node(struct mempolicy *pol
 		 * first node.
 		 */
 		struct zonelist *zonelist;
-		struct zoneref *z;
-		struct zone *dummy;
+		struct zone *zone;
 		enum zone_type highest_zoneidx = gfp_zone(GFP_KERNEL);
 		zonelist = &NODE_DATA(numa_node_id())->node_zonelists[0];
-		z = first_zones_zonelist(zonelist, highest_zoneidx,
+		(void)first_zones_zonelist(zonelist, highest_zoneidx,
 							&policy->v.nodes,
-							&dummy);
-		return zonelist_node_idx(z);
+							&zone);
+		return zone->node;
 	}
 
 	default:
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.25-mm1-clean/mm/page_alloc.c linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/mm/page_alloc.c
--- linux-2.6.25-mm1-clean/mm/page_alloc.c	2008-04-22 10:30:04.000000000 +0100
+++ linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/mm/page_alloc.c	2008-04-22 18:35:56.000000000 +0100
@@ -1412,9 +1412,9 @@ get_page_from_freelist(gfp_t gfp_mask, n
 	int zlc_active = 0;		/* set if using zonelist_cache */
 	int did_zlc_setup = 0;		/* just call zlc_setup() one time */
 
-	z = first_zones_zonelist(zonelist, high_zoneidx, nodemask,
+	(void)first_zones_zonelist(zonelist, high_zoneidx, nodemask,
 							&preferred_zone);
-	classzone_idx = zonelist_zone_idx(z);
+	classzone_idx = zone_idx(preferred_zone);
 
 zonelist_scan:
 	/*

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2008-04-22 17:49 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-04-22 15:17 Suspect use of "first_zones_zonelist()" Lee Schermerhorn
2008-04-22 16:15 ` Mel Gorman
2008-04-22 17:10   ` Lee Schermerhorn
2008-04-22 17:49     ` Mel Gorman [this message]
2008-04-22 18:01       ` Lee Schermerhorn
2008-04-22 18:40         ` Lee Schermerhorn
2008-04-23  6:02           ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080422174901.GA7261@csn.ul.ie \
    --to=mel@csn.ul.ie \
    --cc=Lee.Schermerhorn@hp.com \
    --cc=akpm@linux-foundation.org \
    --cc=eric.whitney@hp.com \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.