From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <46C63BDE.20602@google.com> Date: Fri, 17 Aug 2007 17:22:54 -0700 From: Ethan Solomita MIME-Version: 1.0 Subject: cpusets vs. mempolicy and how to get interleaving Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Paul Jackson , Christoph Lameter , linux-mm@kvack.org List-ID: application to request NUMA interleaving in the face of cpusets and modifications to mems_allowed. I'm hoping for some advice. What we want: Ideally, we want a task to express its preference for interleaved memory allocations without having to provide a list of nodes. The kernel will automatically round-robin amongst the task's mems_allowed. The problem: At least in our environment, an independent "cpuset manager" process may choose to rewrite a cpuset's mems file at any time, possibly increasing or decreasing the number of available nodes. If weight(mems_allowed) is decreased, the task's MPOL_INTERLEAVE policy's nodemask will be shrunk to fit the new mems_allowed. If weight(mems_allowed) is grown, the policy's nodemask will not gain new nodes. What we want is for the task to "set it and forget it," i.e. to express a preference for interleaving and then never worry about NUMA again. If the nodemask sent via sys_mempolicy(MPOL_INTERLEAVE) served as a mask against mems_allowed, then we would specify an all-1s nodemask. I realize that this doesn't work with backwards compatibility so I'm looking for advice. A new policy MPOL_INTERLEAVE_ALL that doesn't take a nodemask argument and interleaves within mems_allowed? Any better suggestions? Thanks! -- Ethan -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <46C63D5D.3020107@google.com> Date: Fri, 17 Aug 2007 17:29:17 -0700 From: Ethan Solomita MIME-Version: 1.0 Subject: Re: cpusets vs. mempolicy and how to get interleaving References: <46C63BDE.20602@google.com> In-Reply-To: <46C63BDE.20602@google.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Paul Jackson , Christoph Lameter , linux-mm@kvack.org List-ID: [Resend -- the original email seems to have lost its first line] I'm concerned that there isn't an adequate mechanism available for an application to request NUMA interleaving in the face of cpusets and modifications to mems_allowed. I'm hoping for some advice. What we want: Ideally, we want a task to express its preference for interleaved memory allocations without having to provide a list of nodes. The kernel will automatically round-robin amongst the task's mems_allowed. The problem: At least in our environment, an independent "cpuset manager" process may choose to rewrite a cpuset's mems file at any time, possibly increasing or decreasing the number of available nodes. If weight(mems_allowed) is decreased, the task's MPOL_INTERLEAVE policy's nodemask will be shrunk to fit the new mems_allowed. If weight(mems_allowed) is grown, the policy's nodemask will not gain new nodes. What we want is for the task to "set it and forget it," i.e. to express a preference for interleaving and then never worry about NUMA again. If the nodemask sent via sys_mempolicy(MPOL_INTERLEAVE) served as a mask against mems_allowed, then we would specify an all-1s nodemask. I realize that this doesn't work with backwards compatibility so I'm looking for advice. A new policy MPOL_INTERLEAVE_ALL that doesn't take a nodemask argument and interleaves within mems_allowed? Any better suggestions? Thanks! 
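The "nodemask as a mask against mems_allowed" idea above can be sketched in userspace C. This is only an illustration of the proposed semantics, not kernel code: the function names effective_nodes() and next_interleave_node() are hypothetical, and a 64-bit integer stands in for nodemask_t on the assumption that there are at most 64 nodes.

```c
#include <stdint.h>

/*
 * Hypothetical helper: under the proposed semantics, the nodes actually
 * interleaved over are the requested mask ANDed with the current
 * mems_allowed. An all-ones request therefore always tracks mems_allowed.
 */
static uint64_t effective_nodes(uint64_t requested, uint64_t mems_allowed)
{
    return requested & mems_allowed;
}

/*
 * Hypothetical round-robin step: return the next set node after `prev`,
 * wrapping around. `prev` may be -1 to start from node 0.
 * Returns -1 if `nodes` is empty.
 */
static int next_interleave_node(uint64_t nodes, int prev)
{
    for (int i = 1; i <= 64; i++) {
        int node = (prev + i) % 64;
        if (nodes & (UINT64_C(1) << node))
            return node;
    }
    return -1;
}
```

With requested set to all ones, shrinking mems_allowed from nodes {0,1} to {0} and growing it back to {0,1} yields effective masks 0x3, 0x1, and 0x3 again, which is the "set it and forget it" behavior asked for.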
-- Ethan -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Fri, 17 Aug 2007 18:07:07 -0700 (PDT) From: Christoph Lameter Subject: Re: cpusets vs. mempolicy and how to get interleaving In-Reply-To: <46C63BDE.20602@google.com> Message-ID: References: <46C63BDE.20602@google.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org Return-Path: To: Ethan Solomita Cc: Paul Jackson , linux-mm@kvack.org List-ID: On Fri, 17 Aug 2007, Ethan Solomita wrote: > Ideally, we want a task to express its preference for interleaved > memory allocations without having to provide a list of nodes. The kernel will > automatically round-robin amongst the task's mems_allowed. You can do that by writing 1 to /dev/cpuset//memory_spread_page > I realize that this doesn't work with backwards compatibility so I'm > looking for advice. A new policy MPOL_INTERLEAVE_ALL that doesn't take a > nodemask argument and interleaves within mems_allowed? Any better suggestions? No need for a policy. Just use what I suggested above. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <46C650B8.6040004@google.com> Date: Fri, 17 Aug 2007 18:51:52 -0700 From: Ethan Solomita MIME-Version: 1.0 Subject: Re: cpusets vs. 
mempolicy and how to get interleaving References: <46C63BDE.20602@google.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Christoph Lameter Cc: Paul Jackson , linux-mm@kvack.org List-ID: Christoph Lameter wrote: > On Fri, 17 Aug 2007, Ethan Solomita wrote: > >> Ideally, we want a task to express its preference for interleaved >> memory allocations without having to provide a list of nodes. The kernel will >> automatically round-robin amongst the task's mems_allowed. > > You can do that by writing 1 to /dev/cpuset//memory_spread_page Sorry, also noticed that the above doesn't affect anonymous pages, just page cache, and we'd want interleaved anonymous pages. -- Ethan -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Sun, 19 Aug 2007 03:18:08 -0700 (PDT) From: David Rientjes Subject: Re: cpusets vs. mempolicy and how to get interleaving In-Reply-To: <46C63D5D.3020107@google.com> Message-ID: References: <46C63BDE.20602@google.com> <46C63D5D.3020107@google.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=us-ascii Sender: owner-linux-mm@kvack.org Return-Path: To: Ethan Solomita Cc: Paul Jackson , Christoph Lameter , linux-mm@kvack.org List-ID: On Fri, 17 Aug 2007, Ethan Solomita wrote: > Ideally, we want a task to express its preference for interleaved memory > allocations without having to provide a list of nodes. The kernel will > automatically round-robin amongst the task's mems_allowed. > Just pass the result of grep Mems_allowed /proc/pid/status | awk '{ print $2 }' | sed s/,// to set_mempolicy() with MPOL_INTERLEAVE. 
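The same extraction can be done from C rather than a shell pipeline. The sketch below assumes that the Mems_allowed line in /proc/pid/status is a sequence of comma-separated hexadecimal words, most significant first, and that the machine has at most 64 nodes. The function name parse_mems_allowed() is hypothetical.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/*
 * Parse a line such as "Mems_allowed:\t00000000,00000003" into a 64-bit
 * node mask. Assumes comma-separated hex words, most significant first.
 * Returns 0 on success, -1 on malformed input or if a node >= 64 is set.
 */
static int parse_mems_allowed(const char *line, uint64_t *mask)
{
    static const char key[] = "Mems_allowed:";
    uint64_t m = 0;
    int digits = 0;

    if (strncmp(line, key, sizeof(key) - 1) != 0)
        return -1;
    line += sizeof(key) - 1;
    while (*line == ' ' || *line == '\t')
        line++;

    for (; *line != '\0' && *line != '\n'; line++) {
        int v;

        if (*line == ',')               /* word separator, carries no bits */
            continue;
        if (!isxdigit((unsigned char)*line))
            return -1;
        v = isdigit((unsigned char)*line)
                ? *line - '0'
                : tolower((unsigned char)*line) - 'a' + 10;
        if (m >> 60)                    /* next shift would drop node bits */
            return -1;
        m = (m << 4) | (uint64_t)v;
        digits++;
    }
    if (digits == 0)
        return -1;
    *mask = m;
    return 0;
}
```

On a system where unsigned long is 64 bits wide, the resulting mask could then be handed to set_mempolicy(MPOL_INTERLEAVE, ...) as declared in libnuma's numaif.h. Note that this only captures mems_allowed at the moment of the call, so it does not by itself follow later cpuset changes.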
> At least in our environment, an independent "cpuset manager" process may > choose to rewrite a cpuset's mems file at any time, possibly increasing or > decreasing the number of available nodes. If weight(mems_allowed) is > decreased, the task's MPOL_INTERLEAVE policy's nodemask will be shrunk to fit > the new mems_allowed. If weight(mems_allowed) is grown, the policy's nodemask > will not gain new nodes. > This is not unlike the traditional use of cpusets; a cpuset's mems_allowed may be freely changed at any time. If the weight of a task's mems_allowed decreases, you would want a simple remap from the old nodemask to the new nodemask. node_remap() provides this functionality already. > What we want is for the task to "set it and forget it," i.e. to express a > preference for interleaving and then never worry about NUMA again. If the > nodemask sent via sys_mempolicy(MPOL_INTERLEAVE) served as a mask against > mems_allowed, then we would specify an all-1s nodemask. > It already does exactly what you want. cpuset_update_task_memory_state() is invoked anytime an allocation with __GFP_WAIT is requested via alloc_pages_current() in process context in addition to alloc_page_vma() for any userspace mapped pages. If a task's mems_allowed has changed in its cpuset behind the task's back, mpol_rebind_policy() is called for that task's mempolicy with a pointer to the cpuset's mems_allowed. This will be considered the new mems_allowed for the task and is stored in its task_struct. mpol_rebind_policy() will rebind MPOL_INTERLEAVE policies by remapping the old mems_allowed nodemask with the new nodemask from the cpuset and, at the same time, update task->il_next to specify the next node to allocate from as reflected by the new nodemask. David -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <46C8E604.8040101@google.com> Date: Sun, 19 Aug 2007 17:53:24 -0700 From: Ethan Solomita MIME-Version: 1.0 Subject: Re: cpusets vs. mempolicy and how to get interleaving References: <46C63BDE.20602@google.com> <46C63D5D.3020107@google.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: David Rientjes Cc: Paul Jackson , Christoph Lameter , linux-mm@kvack.org List-ID: David Rientjes wrote: > On Fri, 17 Aug 2007, Ethan Solomita wrote: > >> Ideally, we want a task to express its preference for interleaved memory >> allocations without having to provide a list of nodes. The kernel will >> automatically round-robin amongst the task's mems_allowed. >> >> At least in our environment, an independent "cpuset manager" process may >> choose to rewrite a cpuset's mems file at any time, possibly increasing or >> decreasing the number of available nodes. If weight(mems_allowed) is >> decreased, the task's MPOL_INTERLEAVE policy's nodemask will be shrunk to fit >> the new mems_allowed. If weight(mems_allowed) is grown, the policy's nodemask >> will not gain new nodes. >> > > This is not unlike the traditional use of cpusets; a cpuset's mems_allowed > may be freely changed at any time. > > If the weight of a task's mems_allowed decreases, you would want a simple > remap from the old nodemask to the new nodemask. node_remap() provides > this functionality already. And what happens when the weight then goes back up? e.g. at first the mems_allowed specifies nodes 0 and 1, and the user sets a MPOL_INTERLEAVE policy across nodes 0 and 1. At some point the "cpuset manager" shrinks the number of nodes to just node 0, then later it adds back node 1. What nodes are in my MPOL_INTERLEAVE policy? As I read the code, I'll only have one node in the mempolicy. If that's true, this doesn't do what I want. 
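The shrink-then-grow scenario can be worked through with a small userspace model. This sketch assumes the remap rule used by the kernel's bitmap_remap() helper (the n-th set bit of the old mask maps to the (n mod w)-th set bit of the new mask, where w is the weight of the new mask, and bits not in the old mask map to themselves). It uses a 64-bit integer in place of nodemask_t, and remap_mask() and its helpers are hypothetical names.

```c
#include <stdint.h>

/* Number of set bits in m (models nodes_weight()). */
static int mask_weight(uint64_t m)
{
    int w = 0;
    while (m) {
        m &= m - 1;     /* clear lowest set bit */
        w++;
    }
    return w;
}

/* Ordinal of bit `pos` among the set bits of m, or -1 if it is not set. */
static int pos_to_ord(uint64_t m, int pos)
{
    if (!(m & (UINT64_C(1) << pos)))
        return -1;
    return mask_weight(m & ((UINT64_C(1) << pos) - 1));
}

/* Position of the ord-th set bit of m (counting from 0), or -1. */
static int ord_to_pos(uint64_t m, int ord)
{
    for (int pos = 0; pos < 64; pos++) {
        if (m & (UINT64_C(1) << pos)) {
            if (ord == 0)
                return pos;
            ord--;
        }
    }
    return -1;
}

/* Model of the assumed remap rule: map `src` from `old_mask` onto `new_mask`. */
static uint64_t remap_mask(uint64_t src, uint64_t old_mask, uint64_t new_mask)
{
    uint64_t dst = 0;
    int w = mask_weight(new_mask);

    for (int bit = 0; bit < 64; bit++) {
        if (!(src & (UINT64_C(1) << bit)))
            continue;
        int n = pos_to_ord(old_mask, bit);
        if (n < 0 || w == 0)
            dst |= UINT64_C(1) << bit;                      /* identity map */
        else
            dst |= UINT64_C(1) << ord_to_pos(new_mask, n % w);
    }
    return dst;
}
```

Under this model, a policy over nodes {0,1} with mems_allowed shrinking from {0,1} to {0} gives remap_mask(0x3, 0x3, 0x1) == 0x1, and growing mems_allowed back to {0,1} gives remap_mask(0x1, 0x1, 0x3) == 0x1. The policy stays on node 0 only, which matches the reading of the code given above.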
-- Ethan -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Sun, 19 Aug 2007 19:34:31 -0700 From: Paul Jackson Subject: Re: cpusets vs. mempolicy and how to get interleaving Message-Id: <20070819193431.dce5d4cf.pj@sgi.com> In-Reply-To: <46C8E604.8040101@google.com> References: <46C63BDE.20602@google.com> <46C63D5D.3020107@google.com> <46C8E604.8040101@google.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Ethan Solomita Cc: rientjes@google.com, clameter@sgi.com, linux-mm@kvack.org List-ID: Ethan wrote: > And what happens when the weight then goes back up? e.g. at first the > mems_allowed specifies nodes 0 and 1, and the user sets a > MPOL_INTERLEAVE policy across nodes 0 and 1. At some point the "cpuset > manager" shrinks the number of nodes to just node 0, then later it adds > back node 1. What nodes are in my MPOL_INTERLEAVE policy? > > As I read the code, I'll only have one node in the mempolicy. If that's > true, this doesn't do what I want. I read the code the same way. Sounds to me like you want a new and different MPOL_* mempolicy, that interleaves over whatever nodes are available (allowed) to the task. The existing MPOL_INTERLEAVE mempolicy interleaves over some specified nodemask, so we do the best we can to remap that set when it changes. You want a mempolicy that interleaves over all available nodes, not over some specified subset of them. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson 1.925.600.0401 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <46C92AF4.20607@google.com> Date: Sun, 19 Aug 2007 22:47:32 -0700 From: Ethan Solomita MIME-Version: 1.0 Subject: Re: cpusets vs. mempolicy and how to get interleaving References: <46C63BDE.20602@google.com> <46C63D5D.3020107@google.com> <46C8E604.8040101@google.com> <20070819193431.dce5d4cf.pj@sgi.com> In-Reply-To: <20070819193431.dce5d4cf.pj@sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Paul Jackson Cc: rientjes@google.com, clameter@sgi.com, linux-mm@kvack.org List-ID: Paul Jackson wrote: > Ethan wrote: >> And what happens when the weight then goes back up? e.g. at first the >> mems_allowed specifies nodes 0 and 1, and the user sets a >> MPOL_INTERLEAVE policy across nodes 0 and 1. At some point the "cpuset >> manager" shrinks the number of nodes to just node 0, then later it adds >> back node 1. What nodes are in my MPOL_INTERLEAVE policy? >> >> As I read the code, I'll only have one node in the mempolicy. If that's >> true, this doesn't do what I want. > > I read the code the same way. > > Sounds to me like you want a new and different MPOL_* mempolicy, that > interleaves over whatever nodes are available (allowed) to the task. > > The existing MPOL_INTERLEAVE mempolicy interleaves over some specified > nodemask, so we do the best we can to remap that set when it changes. > > You want a mempolicy that interleaves over all available nodes, not over > some specified subset of them. OK, then I'll proceed with a new MPOL. Do you believe that this will be of general interest? i.e. worth placing in linux-mm? BTW, a slightly different MPOL_INTERLEAVE implementation would help, wherein we save the nodemask originally specified by the user and do the remap from the original nodemask rather than the current nodemask. 
This would also let the user specify an all-ones nodemask which would then be remapped onto mems_allowed. But I'm guessing that these changes would be impossible due to breaking compatibility? -- Ethan -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Sun, 19 Aug 2007 22:53:20 -0700 From: Paul Jackson Subject: Re: cpusets vs. mempolicy and how to get interleaving Message-Id: <20070819225320.6562fbd1.pj@sgi.com> In-Reply-To: <46C92AF4.20607@google.com> References: <46C63BDE.20602@google.com> <46C63D5D.3020107@google.com> <46C8E604.8040101@google.com> <20070819193431.dce5d4cf.pj@sgi.com> <46C92AF4.20607@google.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Ethan Solomita Cc: rientjes@google.com, clameter@sgi.com, linux-mm@kvack.org List-ID: Ethan wrote: > OK, then I'll proceed with a new MPOL. Do you believe that this will be > of general interest? i.e. worth placing in linux-mm? I've no idea if it is of general interest or not. I'm not interested ;). But I'm just one person. > BTW, a slightly different MPOL_INTERLEAVE implementation would help, > wherein we save the nodemask originally specified by the user and do the > remap from the original nodemask rather than the current nodemask. I kinda like this idea; though keep in mind that since I don't use mempolicy mechanisms, I am not losing any sleep over minor(?) compatibility breakages. It would take someone familiar with the actual users or usages of MPOL_INTERLEAVE to know if or how much this would bite actual users/usages. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson 1.925.600.0401 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. 
For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Mon, 20 Aug 2007 01:10:07 -0700 (PDT) From: David Rientjes Subject: Re: cpusets vs. mempolicy and how to get interleaving In-Reply-To: <20070819225320.6562fbd1.pj@sgi.com> Message-ID: References: <46C63BDE.20602@google.com> <46C63D5D.3020107@google.com> <46C8E604.8040101@google.com> <20070819193431.dce5d4cf.pj@sgi.com> <46C92AF4.20607@google.com> <20070819225320.6562fbd1.pj@sgi.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=us-ascii Sender: owner-linux-mm@kvack.org Return-Path: To: Paul Jackson Cc: Ethan Solomita , clameter@sgi.com, linux-mm@kvack.org List-ID: On Sun, 19 Aug 2007, Paul Jackson wrote: > > BTW, a slightly different MPOL_INTERLEAVE implementation would help, > > wherein we save the nodemask originally specified by the user and do the > > remap from the original nodemask rather than the current nodemask. > > I kinda like this idea; though keep in mind that since I don't use > mempolicy mechanisms, I am not losing any sleep over minor(?) > compatibility breakages. It would take someone familiar with the > actual users or usages of MPOL_INTERLEAVE to know if or how much > this would bite actual users/usages. > Like I've already said, there is absolutely no reason to add a new MPOL variant for this case. As Christoph already mentioned, PF_SPREAD_PAGE gets similar results. So just modify mpol_rebind_policy() so that if /dev/cpuset//memory_spread_page is true, you rebind the interleaved nodemask to all nodes in the new nodemask. That's the well-defined cpuset interface for getting an interleaved behavior already. Let's not create new memory policies that only work for a very specific and configurable case when the basic underlying mechanism to that policy is already present in the cpuset interface, namely, PF_SPREAD_PAGE. 
David -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Mon, 20 Aug 2007 11:25:38 -0700 From: Paul Jackson Subject: Re: cpusets vs. mempolicy and how to get interleaving Message-Id: <20070820112538.42337443.pj@sgi.com> In-Reply-To: References: <46C63BDE.20602@google.com> <46C63D5D.3020107@google.com> <46C8E604.8040101@google.com> <20070819193431.dce5d4cf.pj@sgi.com> <46C92AF4.20607@google.com> <20070819225320.6562fbd1.pj@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: David Rientjes Cc: solo@google.com, clameter@sgi.com, linux-mm@kvack.org List-ID: David wrote: > Like I've already said, there is absolutely no reason to add a new MPOL > variant for this case. As Christoph already mentioned, PF_SPREAD_PAGE > gets similar results. So just modify mpol_rebind_policy() so that if > /dev/cpuset//memory_spread_page is true, you rebind the > interleaved nodemask to all nodes in the new nodemask. That's the > well-defined cpuset interface for getting an interleaved behavior already. Hmm ... nice. As David likely guesses, I didn't read his earlier suggestion of this. Thanks for repeating it. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson 1.925.600.0401 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <46C9DD62.8020803@google.com> Date: Mon, 20 Aug 2007 11:28:50 -0700 From: Ethan Solomita MIME-Version: 1.0 Subject: Re: cpusets vs. 
mempolicy and how to get interleaving References: <46C63BDE.20602@google.com> <46C63D5D.3020107@google.com> <46C8E604.8040101@google.com> <20070819193431.dce5d4cf.pj@sgi.com> <46C92AF4.20607@google.com> <20070819225320.6562fbd1.pj@sgi.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: David Rientjes Cc: Paul Jackson , clameter@sgi.com, linux-mm@kvack.org List-ID: David Rientjes wrote: > > Like I've already said, there is absolutely no reason to add a new MPOL > variant for this case. As Christoph already mentioned, PF_SPREAD_PAGE > gets similar results. So just modify mpol_rebind_policy() so that if > /dev/cpuset//memory_spread_page is true, you rebind the > interleaved nodemask to all nodes in the new nodemask. That's the > well-defined cpuset interface for getting an interleaved behavior already. memory_spread_page is only for file-backed pages, not anon pages. -- Ethan -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Mon, 20 Aug 2007 11:40:37 -0700 (PDT) From: David Rientjes Subject: Re: cpusets vs. mempolicy and how to get interleaving In-Reply-To: <46C9DD62.8020803@google.com> Message-ID: References: <46C63BDE.20602@google.com> <46C63D5D.3020107@google.com> <46C8E604.8040101@google.com> <20070819193431.dce5d4cf.pj@sgi.com> <46C92AF4.20607@google.com> <20070819225320.6562fbd1.pj@sgi.com> <46C9DD62.8020803@google.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=us-ascii Sender: owner-linux-mm@kvack.org Return-Path: To: Ethan Solomita Cc: Paul Jackson , clameter@sgi.com, linux-mm@kvack.org List-ID: On Mon, 20 Aug 2007, Ethan Solomita wrote: > > Like I've already said, there is absolutely no reason to add a new MPOL > > variant for this case. 
As Christoph already mentioned, PF_SPREAD_PAGE gets
> > similar results. So just modify mpol_rebind_policy() so that if
> > /dev/cpuset//memory_spread_page is true, you rebind the interleaved
> > nodemask to all nodes in the new nodemask. That's the well-defined cpuset
> > interface for getting an interleaved behavior already.
>
> memory_spread_page is only for file-backed pages, not anon pages.

Please read what I said above, all you have to do is modify
mpol_rebind_policy() so that if /dev/cpuset//memory_spread_page is true,
you rebind the interleaved nodemask to all nodes in the new nodemask.

This only happens for the MPOL_INTERLEAVE case because the application has
made it quite clear through set_mempolicy(MPOL_INTERLEAVE, ...) that it
wants this behavior.

	int cpuset_is_spread_page(struct task_struct *task)
	{
		int ret;

		task_lock(task);
		ret = is_spread_page(task->cpuset);
		task_unlock(task);
		return ret;
	}

	void mpol_rebind_policy(struct mempolicy *pol, const nodemask_t *newmask)
	{
		...
		case MPOL_INTERLEAVE:
			if (cpuset_is_spread_page(current))
				pol->v.nodes = *newmask;
			else {
				nodes_remap(tmp, pol->v.nodes, *mpolmask, *newmask);
				pol->v.nodes = tmp;
			}
			...
		...
	}

-- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org
From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Mon, 20 Aug 2007 12:07:16 -0700 (PDT) From: Christoph Lameter Subject: Re: cpusets vs. 
mempolicy and how to get interleaving In-Reply-To: <46C92AF4.20607@google.com> Message-ID: References: <46C63BDE.20602@google.com> <46C63D5D.3020107@google.com> <46C8E604.8040101@google.com> <20070819193431.dce5d4cf.pj@sgi.com> <46C92AF4.20607@google.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org Return-Path: To: Lee Schermerhorn Cc: Paul Jackson , Ethan Solomita , rientjes@google.com, linux-mm@kvack.org List-ID: On Sun, 19 Aug 2007, Ethan Solomita wrote: > OK, then I'll proceed with a new MPOL. Do you believe that this will > be of general interest? i.e. worth placing in linux-mm? Ummmm... Let's first get Lee onto this. AFAIK he already has an implementation for such a thing. Lee: Would you respond to these emails? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <46C9F098.2050806@google.com> Date: Mon, 20 Aug 2007 12:50:48 -0700 From: Ethan Solomita MIME-Version: 1.0 Subject: Re: cpusets vs. mempolicy and how to get interleaving References: <46C63BDE.20602@google.com> <46C63D5D.3020107@google.com> <46C8E604.8040101@google.com> <20070819193431.dce5d4cf.pj@sgi.com> <46C92AF4.20607@google.com> <20070819225320.6562fbd1.pj@sgi.com> <46C9DD62.8020803@google.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: David Rientjes Cc: Paul Jackson , clameter@sgi.com, linux-mm@kvack.org List-ID: David Rientjes wrote: > On Mon, 20 Aug 2007, Ethan Solomita wrote: > >>> Like I've already said, there is absolutely no reason to add a new MPOL >>> variant for this case. As Christoph already mentioned, PF_SPREAD_PAGE gets >>> similar results. 
So just modify mpol_rebind_policy() so that if >>> /dev/cpuset//memory_spread_page is true, you rebind the interleaved >>> nodemask to all nodes in the new nodemask. That's the well-defined cpuset >>> interface for getting an interleaved behavior already. >> memory_spread_page is only for file-backed pages, not anon pages. > > Please read what I said above, all you have to do is modify > mpol_rebind_policy() so that if /dev/cpuset//memory_spread_page is > true, you rebind the interleaved nodemask to all nodes in the new > nodemask. > > This only happens for the MPOL_INTERLEAVE case because the application has > made it quite clear through set_mempolicy(MPOL_INTERLEAVE, ...) that it > wants this behavior. I understood what you said, I just had API concerns. If the definition of memory_spread_page is that file-backed pages get spread, it seems counterintuitive to me that it could have an effect on all memory allocations. Also MPOL_INTERLEAVE doesn't necessarily make it clear that this behavior is desired. One use (not the one I'm interested in, though) of this MPOL is to let the user specify specific nodes of interest, and they might not want a rebind to revert them to interleaving amongst all nodes. If MPOL_INTERLEAVE's nodemask were really a mask, applied on top of mems_allowed, this would be much easier, but it's not. -- Ethan -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Subject: Re: cpusets vs. 
mempolicy and how to get interleaving From: Lee Schermerhorn In-Reply-To: References: <46C63BDE.20602@google.com> <46C63D5D.3020107@google.com> <46C8E604.8040101@google.com> <20070819193431.dce5d4cf.pj@sgi.com> <46C92AF4.20607@google.com> Content-Type: text/plain Date: Tue, 21 Aug 2007 10:14:55 -0400 Message-Id: <1187705695.5066.8.camel@localhost> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Christoph Lameter Cc: Paul Jackson , Ethan Solomita , rientjes@google.com, linux-mm@kvack.org, Eric Whitney List-ID: On Mon, 2007-08-20 at 12:07 -0700, Christoph Lameter wrote: > On Sun, 19 Aug 2007, Ethan Solomita wrote: > > > OK, then I'll proceed with a new MPOL. Do you believe that this will > > be of general interest? i.e. worth placing in linux-mm? > > Ummmm... Let's first get Lee onto this. AFAIK he already has an > implementation for such a thing. > > Lee: Would you respond to these emails? > Here's the post for cpuset-independent interleave [a.k.a. "contextual interleave"]. http://marc.info/?l=linux-mm&m=118608528417158&w=4 I'm maintaining this with half a dozen other mempolicy cleanups and enhancements. I'll try to post the entire series later this week. Meanwhile, the patch linked above should apply to current mm tree with little conflict. Lee -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org