All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Mel Gorman <mel@skynet.ie>
Cc: linux-mm@kvack.org, ak@suse.de,
	Nishanth Aravamudan <nacc@us.ibm.com>,
	pj@sgi.com, kxr@sgi.com, Christoph Lameter <clameter@sgi.com>,
	akpm@linux-foundation.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [PATCH/RFC] 2.6.23-rc1-mm1:  MPOL_PREFERRED fixups for preferred_node < 0
Date: Tue, 31 Jul 2007 11:58:07 -0400	[thread overview]
Message-ID: <1185897487.6240.29.camel@localhost> (raw)
In-Reply-To: <20070731153227.GB18506@skynet.ie>

On Tue, 2007-07-31 at 16:32 +0100, Mel Gorman wrote:
> On (30/07/07 18:00), Lee Schermerhorn didst pronounce:
> > On Mon, 2007-07-30 at 17:38 -0400, Lee Schermerhorn wrote:
> > > These are some "issues" that I came across working on the Memoryless
> > > Node series.  I'm using the same cc: list as that series as the issues
> > > are somewhat related.
> > > 
> > > Only boot tested at this point.
> > 
> > I sent the wrong patch--forgot to refresh before posting :-(.  Bogus
> > code in mpol_to_str() in previous patch.
> > 
> > Try this one.
> > 
> > Lee
> > 
> > > ---------------------------
> > 
> > PATCH/RFC - MPOL_PREFERRED fixups for "local allocation"
> > 
> > Here are a couple of potential "fixups" for MPOL_PREFERRED behavior
> > when v.preferred_node < 0 -- i.e., "local allocation":
> > 
> > 1)  [do_]get_mempolicy() calls the misnamed get_zonemask() to fetch the
> >     nodemask associated with a policy.  Currently, get_zonemask() returns
> >     the set of nodes with memory, when the policy 'mode' is 'PREFERRED,
> 
> Consider a cleanup that renames get_zonemask because the naming is
> misleading.

I can do that.  Wanted to hear from others, such as yourself first.

> 
> >     and the preferred_node is < 0.  Return the set of allowed nodes
> >     instead.  This will already have been masked to include only nodes
> >     with memory.
> > 
> > 2)  When a task is moved into a [new] cpuset, mpol_rebind_policy() is
> >     called to adjust any task and vma policy nodes to be valid in the
> >     new cpuset.  However, when the policy is MPOL_PREFERRED, and the
> >     preferred_node is <0, no rebind is necessary.  The "local allocation"
> >     indication is valid in any cpuset.
> > 
> > 3)  mpol_to_str() produces a printable, "human readable" string from a
> >     struct mempolicy.  For MPOL_PREFERRED with preferred_node <0,  show
> >     the entire set of valid nodes.  Although, technically, MPOL_PREFERRED
> >     takes only a single node, preferred_node <0 is a local allocation policy,
> >     with the preferred node determined by the context where the task
> >     is executing.  All of the allowed nodes are possible, as the task
> >     migrates amoung the nodes in the cpuset.
> > 
> > Comments?
> > 
> > Signed-off-by:  Lee Schermerhorn <lee.schermerhorn@hp.com>
> > 
> >  mm/mempolicy.c |   31 ++++++++++++++++++++++++-------
> >  1 file changed, 24 insertions(+), 7 deletions(-)
> > 
> > Index: Linux/mm/mempolicy.c
> > ===================================================================
> > --- Linux.orig/mm/mempolicy.c	2007-07-30 17:32:06.000000000 -0400
> > +++ Linux/mm/mempolicy.c	2007-07-30 17:38:17.000000000 -0400
> > @@ -494,9 +494,11 @@ static void get_zonemask(struct mempolic
> >  		*nodes = p->v.nodes;
> >  		break;
> >  	case MPOL_PREFERRED:
> > -		/* or use current node instead of memory_map? */
> > +		/*
> > +		 * for "local policy", return allowed memories
> > +		 */
> >  		if (p->v.preferred_node < 0)
> > -			*nodes = node_states[N_MEMORY];
> > +			*nodes = cpuset_current_mems_allowed;
> >  		else
> 
> Is this actually a bugfix? From this context, it looks like memory
> policies using MPOL_PREFERRED can ignore cpusets.

Not a serious bug, if it is one.  More of a cleanup.   All this does is
return a node mask in the case where the application has a task memory
policy of 'PREFERRED with a node id of -1 [which happens when you
specify an empty nodemask to set_mempolicy() or mbind()].  This means
"local allocation"--the actual "current node id" is fetched at
allocation time.  This is a little know "feature" of get_mempolicy().
The results is misleading, but there isn't much the application can do
with it.  Node masks are ANDed with cpuset_current_mems_allowed when
installed via a syscall.

> 
> >  			node_set(p->v.preferred_node, *nodes);
> >  		break;
> > @@ -1650,6 +1652,7 @@ void mpol_rebind_policy(struct mempolicy
> >  {
> >  	nodemask_t *mpolmask;
> >  	nodemask_t tmp;
> > +	int nid;
> >  
> >  	if (!pol)
> >  		return;
> > @@ -1668,9 +1671,15 @@ void mpol_rebind_policy(struct mempolicy
> >  						*mpolmask, *newmask);
> >  		break;
> >  	case MPOL_PREFERRED:
> > -		pol->v.preferred_node = node_remap(pol->v.preferred_node,

Ultimately, node_remap() [the bitmap functions it calls] will return the
old value of "-1" because it's outside the valid range for a the node
bitmasks.  However, it doesn't seem right to be calling node_remap()
with an invalid node id.  I think it's clearer this way:

> > +		/*
> > +		 * no need to remap "local policy"
> > +		 */
> > +		nid = pol->v.preferred_node;
> > +		if (nid >= 0) {
> > +			pol->v.preferred_node = node_remap(nid,
> >  						*mpolmask, *newmask);
> > -		*mpolmask = *newmask;
> > +			*mpolmask = *newmask;
> > +		}
> >  		break;
> >  	case MPOL_BIND: {
> >  		nodemask_t nodes;
> > @@ -1745,7 +1754,7 @@ static const char * const policy_types[]
> >  static inline int mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
> >  {
> >  	char *p = buffer;
> > -	int l;
> > +	int nid, l;
> >  	nodemask_t nodes;
> >  	int mode = pol ? pol->policy : MPOL_DEFAULT;
> >  
> > @@ -1755,8 +1764,16 @@ static inline int mpol_to_str(char *buff
> >  		break;
> >  
> >  	case MPOL_PREFERRED:
> > -		nodes_clear(nodes);
> > -		node_set(pol->v.preferred_node, nodes);

Here, I think set_bit() will set bit 31.  Again, misleading, IMO.

> > +		nid = pol->v.preferred_node;
> > +		/*
> > +		 * local interleave, show all valid nodes
> > +		 */
> > +		if (nid < 0 )
> > +			nodes = cpuset_current_mems_allowed;
> > +		else {
> > +			nodes_clear(nodes);
> > +			node_set(nid, nodes);
> > +		}
> >  		break;
> >  
> >  	case MPOL_BIND:
> > 
> 
> -- 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-07-31 15:58 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-07-27 19:43 [PATCH 00/14] NUMA: Memoryless node support V4 Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 01/14] NUMA: Generic management of nodemasks for various purposes Lee Schermerhorn
2007-07-30 21:38   ` [PATCH/RFC] 2.6.23-rc1-mm1: MPOL_PREFERRED fixups for preferred_node < 0 Lee Schermerhorn
2007-07-30 22:00     ` Lee Schermerhorn
2007-07-31 15:32       ` Mel Gorman
2007-07-31 15:58         ` Lee Schermerhorn [this message]
2007-07-31 21:05     ` [PATCH/RFC] 2.6.23-rc1-mm1: MPOL_PREFERRED fixups for preferred_node < 0 - v2 Lee Schermerhorn
2007-08-01  2:22   ` [PATCH 01/14] NUMA: Generic management of nodemasks for various purposes Andrew Morton
2007-08-01  2:52     ` Christoph Lameter
2007-08-01  3:05       ` Andrew Morton
2007-08-01  3:14         ` Christoph Lameter
2007-08-01  3:32           ` Andrew Morton
2007-08-01  3:37             ` Christoph Lameter
     [not found]             ` <Pine.LNX.4.64.0707312151400.2894@schroedinger.engr.sgi.com>
2007-08-01  5:07               ` Andrew Morton
2007-08-01  5:11                 ` Andrew Morton
2007-08-01  5:22                 ` Christoph Lameter
2007-08-01 10:24                   ` Mel Gorman
2007-08-02 16:23                   ` Mel Gorman
2007-08-02 20:00                     ` Christoph Lameter
2007-08-01  5:36             ` Paul Mundt
2007-08-01  9:19             ` Andi Kleen
2007-08-01 14:03             ` Lee Schermerhorn
2007-08-01 17:41               ` Christoph Lameter
2007-08-01 17:54                 ` Lee Schermerhorn
2007-08-02 20:05                 ` [PATCH/RFC/WIP] cpuset-independent interleave policy Lee Schermerhorn
2007-08-02 20:34                   ` Christoph Lameter
2007-08-02 21:04                     ` Lee Schermerhorn
2007-08-03  0:31                       ` Christoph Lameter
2007-08-02 20:19                 ` Audit of "all uses of node_online()" Lee Schermerhorn
2007-08-02 20:26                   ` Christoph Lameter
2007-08-08 22:19                     ` Lee Schermerhorn
2007-08-08 23:40                       ` Christoph Lameter
2007-08-16 14:17                         ` [PATCH/RFC] memoryless nodes - fixup uses of node_online_map in generic code Lee Schermerhorn
2007-08-16 18:33                           ` Christoph Lameter
2007-08-16 19:15                             ` Lee Schermerhorn
2007-08-16 21:10                         ` Lee Schermerhorn
2007-08-16 21:13                           ` Christoph Lameter
2007-08-24 16:09                         ` [PATCH] 2.6.23-rc3-mm1 - Move setup of N_CPU node state mask Lee Schermerhorn
2007-09-06 13:56                           ` Mel Gorman
2007-08-02 20:33                   ` Audit of "all uses of node_online()" Andrew Morton
2007-08-02 20:45                     ` Lee Schermerhorn
2007-08-01 15:58           ` [PATCH 01/14] NUMA: Generic management of nodemasks for various purposes Nishanth Aravamudan
2007-08-01 16:09             ` Nishanth Aravamudan
2007-08-01 17:47             ` Christoph Lameter
2007-08-01 15:25         ` Nishanth Aravamudan
2007-07-27 19:43 ` [PATCH 02/14] Memoryless nodes: introduce mask of nodes with memory Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 03/14] Memoryless Nodes: Fix interleave behavior Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 04/14] OOM: use the N_MEMORY map instead of constructing one on the fly Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 05/14] Memoryless Nodes: No need for kswapd Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 06/14] Memoryless Node: Slab support Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 07/14] Memoryless nodes: SLUB support Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 08/14] Uncached allocator: Handle memoryless nodes Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 09/14] Memoryless node: Allow profiling data to fall back to other nodes Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 10/14] Memoryless nodes: Update memory policy and page migration Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 11/14] Add N_CPU node state Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 12/14] Memoryless nodes: Fix GFP_THISNODE behavior Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 13/14] Memoryless Nodes: use "node_memory_map" for cpusets Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 14/14] Memoryless nodes: drop one memoryless node boot warning Lee Schermerhorn
2007-07-27 20:59 ` [PATCH 00/14] NUMA: Memoryless node support V4 Nishanth Aravamudan
2007-07-30 13:48   ` Lee Schermerhorn
2007-07-29 12:35 ` Paul Jackson
2007-07-30 16:07   ` Lee Schermerhorn
2007-07-30 18:56     ` Paul Jackson
2007-07-30 21:19 ` Nishanth Aravamudan
2007-07-30 22:06   ` Christoph Lameter
2007-07-30 22:35     ` Andi Kleen
2007-07-30 22:36       ` Christoph Lameter
2007-07-31 23:18         ` Nishanth Aravamudan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1185897487.6240.29.camel@localhost \
    --to=lee.schermerhorn@hp.com \
    --cc=ak@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=clameter@sgi.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kxr@sgi.com \
    --cc=linux-mm@kvack.org \
    --cc=mel@skynet.ie \
    --cc=nacc@us.ibm.com \
    --cc=pj@sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.