* cpusets vs. mempolicy and how to get interleaving
@ 2007-08-18 0:22 Ethan Solomita
2007-08-18 0:29 ` Ethan Solomita
2007-08-18 1:07 ` Christoph Lameter
0 siblings, 2 replies; 16+ messages in thread
From: Ethan Solomita @ 2007-08-18 0:22 UTC (permalink / raw)
To: Paul Jackson, Christoph Lameter, linux-mm
I'm concerned that there isn't an adequate mechanism available for an
application to request NUMA interleaving in the face of cpusets and
modifications to mems_allowed. I'm hoping for some advice.
What we want:
Ideally, we want a task to express its preference for interleaved
memory allocations without having to provide a list of nodes. The kernel
will automatically round-robin amongst the task's mems_allowed.
The problem:
At least in our environment, an independent "cpuset manager" process
may choose to rewrite a cpuset's mems file at any time, possibly
increasing or decreasing the number of available nodes. If
weight(mems_allowed) is decreased, the task's MPOL_INTERLEAVE policy's
nodemask will be shrunk to fit the new mems_allowed. If
weight(mems_allowed) is grown, the policy's nodemask will not gain new
nodes.
What we want is for the task to "set it and forget it," i.e. to express
a preference for interleaving and then never worry about NUMA again. If
the nodemask sent via set_mempolicy(MPOL_INTERLEAVE) served as a mask
against mems_allowed, then we would specify an all-1s nodemask.
I realize that this isn't backwards compatible, so I'm looking for
advice. A new policy MPOL_INTERLEAVE_ALL that doesn't take a
nodemask argument and interleaves within mems_allowed? Any better
suggestions?
Thanks!
-- Ethan
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-18 0:22 cpusets vs. mempolicy and how to get interleaving Ethan Solomita
@ 2007-08-18 0:29 ` Ethan Solomita
2007-08-19 10:18 ` David Rientjes
2007-08-18 1:07 ` Christoph Lameter
1 sibling, 1 reply; 16+ messages in thread
From: Ethan Solomita @ 2007-08-18 0:29 UTC (permalink / raw)
To: Paul Jackson, Christoph Lameter, linux-mm
[Resend -- the original email seems to have lost its first line]
I'm concerned that there isn't an adequate mechanism available for
an application to request NUMA interleaving in the face of cpusets and
modifications to mems_allowed. I'm hoping for some advice.
What we want:
Ideally, we want a task to express its preference for interleaved
memory allocations without having to provide a list of nodes. The kernel
will automatically round-robin amongst the task's mems_allowed.
The problem:
At least in our environment, an independent "cpuset manager"
process may choose to rewrite a cpuset's mems file at any time, possibly
increasing or decreasing the number of available nodes. If
weight(mems_allowed) is decreased, the task's MPOL_INTERLEAVE policy's
nodemask will be shrunk to fit the new mems_allowed. If
weight(mems_allowed) is grown, the policy's nodemask will not gain new
nodes.
What we want is for the task to "set it and forget it," i.e. to
express a preference for interleaving and then never worry about NUMA
again. If the nodemask sent via set_mempolicy(MPOL_INTERLEAVE) served as
a mask against mems_allowed, then we would specify an all-1s nodemask.
    I realize that this isn't backwards compatible, so I'm looking
for advice. A new policy MPOL_INTERLEAVE_ALL that doesn't
take a nodemask argument and interleaves within mems_allowed? Any better
suggestions?
Thanks!
-- Ethan
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-18 0:22 cpusets vs. mempolicy and how to get interleaving Ethan Solomita
2007-08-18 0:29 ` Ethan Solomita
@ 2007-08-18 1:07 ` Christoph Lameter
2007-08-18 1:51 ` Ethan Solomita
1 sibling, 1 reply; 16+ messages in thread
From: Christoph Lameter @ 2007-08-18 1:07 UTC (permalink / raw)
To: Ethan Solomita; +Cc: Paul Jackson, linux-mm
On Fri, 17 Aug 2007, Ethan Solomita wrote:
> Ideally, we want a task to express its preference for interleaved
> memory allocations without having to provide a list of nodes. The kernel will
> automatically round-robin amongst the task's mems_allowed.
You can do that by writing 1 to /dev/cpuset/<cpuset>/memory_spread_page
> I realize that this doesn't work with backwards compatibility so I'm
> looking for advice. A new policy MPOL_INTERLEAVE_ALL that doesn't take a
> nodemask argument and interleaves within mems_allowed? Any better suggestions?
No need for a policy. Just use what I suggested above.
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-18 1:07 ` Christoph Lameter
@ 2007-08-18 1:51 ` Ethan Solomita
0 siblings, 0 replies; 16+ messages in thread
From: Ethan Solomita @ 2007-08-18 1:51 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Paul Jackson, linux-mm
Christoph Lameter wrote:
> On Fri, 17 Aug 2007, Ethan Solomita wrote:
>
>> Ideally, we want a task to express its preference for interleaved
>> memory allocations without having to provide a list of nodes. The kernel will
>> automatically round-robin amongst the task's mems_allowed.
>
> You can do that by writing 1 to /dev/cpuset/<cpuset>/memory_spread_page
Sorry, also noticed that the above doesn't affect anonymous pages, just
page cache, and we'd want interleaved anonymous pages.
-- Ethan
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-18 0:29 ` Ethan Solomita
@ 2007-08-19 10:18 ` David Rientjes
2007-08-20 0:53 ` Ethan Solomita
0 siblings, 1 reply; 16+ messages in thread
From: David Rientjes @ 2007-08-19 10:18 UTC (permalink / raw)
To: Ethan Solomita; +Cc: Paul Jackson, Christoph Lameter, linux-mm
On Fri, 17 Aug 2007, Ethan Solomita wrote:
> Ideally, we want a task to express its preference for interleaved memory
> allocations without having to provide a list of nodes. The kernel will
> automatically round-robin amongst the task's mems_allowed.
>
Just pass the result of
grep Mems_allowed /proc/<pid>/status | awk '{ print $2 }' | sed 's/,//g'
to set_mempolicy() with MPOL_INTERLEAVE.
> At least in our environment, an independent "cpuset manager" process may
> choose to rewrite a cpuset's mems file at any time, possibly increasing or
> decreasing the number of available nodes. If weight(mems_allowed) is
> decreased, the task's MPOL_INTERLEAVE policy's nodemask will be shrunk to fit
> the new mems_allowed. If weight(mems_allowed) is grown, the policy's nodemask
> will not gain new nodes.
>
This is not unlike the traditional use of cpusets; a cpuset's mems_allowed
may be freely changed at any time.
If the weight of a task's mems_allowed decreases, you would want a simple
remap from the old nodemask to the new nodemask. node_remap() provides
this functionality already.
> What we want is for the task to "set it and forget it," i.e. to express a
> preference for interleaving and then never worry about NUMA again. If the
> nodemask sent via sys_mempolicy(MPOL_INTERLEAVE) served as a mask against
> mems_allowed, then we would specify an all-1s nodemask.
>
It already does exactly what you want.
cpuset_update_task_memory_state() is invoked anytime an allocation with
__GFP_WAIT is requested via alloc_pages_current() in process context in
addition to alloc_page_vma() for any userspace mapped pages.
If a task's mems_allowed has changed in its cpuset behind the task's back,
mpol_rebind_policy() is called for that task's mempolicy with a pointer to
the cpuset's mems_allowed. This will be considered the new mems_allowed
for the task and is stored in its task_struct.
mpol_rebind_policy() will rebind MPOL_INTERLEAVE policies by remapping the
old mems_allowed nodemask with the new nodemask from the cpuset and, at
the same time, update task->il_next to specify the next node to allocate
from as reflected by the new nodemask.
David
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-19 10:18 ` David Rientjes
@ 2007-08-20 0:53 ` Ethan Solomita
2007-08-20 2:34 ` Paul Jackson
0 siblings, 1 reply; 16+ messages in thread
From: Ethan Solomita @ 2007-08-20 0:53 UTC (permalink / raw)
To: David Rientjes; +Cc: Paul Jackson, Christoph Lameter, linux-mm
David Rientjes wrote:
> On Fri, 17 Aug 2007, Ethan Solomita wrote:
>
>> Ideally, we want a task to express its preference for interleaved memory
>> allocations without having to provide a list of nodes. The kernel will
>> automatically round-robin amongst the task's mems_allowed.
>>
>> At least in our environment, an independent "cpuset manager" process may
>> choose to rewrite a cpuset's mems file at any time, possibly increasing or
>> decreasing the number of available nodes. If weight(mems_allowed) is
>> decreased, the task's MPOL_INTERLEAVE policy's nodemask will be shrunk to fit
>> the new mems_allowed. If weight(mems_allowed) is grown, the policy's nodemask
>> will not gain new nodes.
>>
>
> This is not unlike the traditional use of cpusets; a cpuset's mems_allowed
> may be freely changed at any time.
>
> If the weight of a task's mems_allowed decreases, you would want a simple
> remap from the old nodemask to the new nodemask. node_remap() provides
> this functionality already.
And what happens when the weight then goes back up? e.g. at first the
mems_allowed specifies nodes 0 and 1, and the user sets a
MPOL_INTERLEAVE policy across nodes 0 and 1. At some point the "cpuset
manager" shrinks the number of nodes to just node 0, then later it adds
back node 1. What nodes are in my MPOL_INTERLEAVE policy?
As I read the code, I'll only have one node in the mempolicy. If that's
true, this doesn't do what I want.
-- Ethan
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-20 0:53 ` Ethan Solomita
@ 2007-08-20 2:34 ` Paul Jackson
2007-08-20 5:47 ` Ethan Solomita
0 siblings, 1 reply; 16+ messages in thread
From: Paul Jackson @ 2007-08-20 2:34 UTC (permalink / raw)
To: Ethan Solomita; +Cc: rientjes, clameter, linux-mm
Ethan wrote:
> And what happens when the weight then goes back up? e.g. at first the
> mems_allowed specifies nodes 0 and 1, and the user sets a
> MPOL_INTERLEAVE policy across nodes 0 and 1. At some point the "cpuset
> manager" shrinks the number of nodes to just node 0, then later it adds
> back node 1. What nodes are in my MPOL_INTERLEAVE policy?
>
> As I read the code, I'll only have one node in the mempolicy. If that's
> true, this doesn't do what I want.
I read the code the same way.
Sounds to me like you want a new and different MPOL_* mempolicy, that
interleaves over whatever nodes are available (allowed) to the task.
The existing MPOL_INTERLEAVE mempolicy interleaves over some specified
nodemask, so we do the best we can to remap that set when it changes.
You want a mempolicy that interleaves over all available nodes, not over
some specified subset of them.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-20 2:34 ` Paul Jackson
@ 2007-08-20 5:47 ` Ethan Solomita
2007-08-20 5:53 ` Paul Jackson
2007-08-20 19:07 ` Christoph Lameter
0 siblings, 2 replies; 16+ messages in thread
From: Ethan Solomita @ 2007-08-20 5:47 UTC (permalink / raw)
To: Paul Jackson; +Cc: rientjes, clameter, linux-mm
Paul Jackson wrote:
> Ethan wrote:
>> And what happens when the weight then goes back up? e.g. at first the
>> mems_allowed specifies nodes 0 and 1, and the user sets a
>> MPOL_INTERLEAVE policy across nodes 0 and 1. At some point the "cpuset
>> manager" shrinks the number of nodes to just node 0, then later it adds
>> back node 1. What nodes are in my MPOL_INTERLEAVE policy?
>>
>> As I read the code, I'll only have one node in the mempolicy. If that's
>> true, this doesn't do what I want.
>
> I read the code the same way.
>
> Sounds to me like you want a new and different MPOL_* mempolicy, that
> interleaves over whatever nodes are available (allowed) to the task.
>
> The existing MPOL_INTERLEAVE mempolicy interleaves over some specified
> nodemask, so we do the best we can to remap that set when it changes.
>
> You want a mempolicy that interleaves over all available nodes, not over
> some specified subset of them.
OK, then I'll proceed with a new MPOL. Do you believe that this will be
of general interest? i.e. worth placing in linux-mm?
BTW, a slightly different MPOL_INTERLEAVE implementation would help,
wherein we save the nodemask originally specified by the user and do the
remap from the original nodemask rather than the current nodemask. This
would also let the user specify an all-ones nodemask which would then be
remapped onto mems_allowed. But I'm guessing that these changes would be
impossible due to breaking compatibility?
-- Ethan
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-20 5:47 ` Ethan Solomita
@ 2007-08-20 5:53 ` Paul Jackson
2007-08-20 8:10 ` David Rientjes
2007-08-20 19:07 ` Christoph Lameter
1 sibling, 1 reply; 16+ messages in thread
From: Paul Jackson @ 2007-08-20 5:53 UTC (permalink / raw)
To: Ethan Solomita; +Cc: rientjes, clameter, linux-mm
Ethan wrote:
> OK, then I'll proceed with a new MPOL. Do you believe that this will be
> of general interest? i.e. worth placing in linux-mm?
I've no idea if it is of general interest or not. I'm not interested ;).
But I'm just one person.
> BTW, a slightly different MPOL_INTERLEAVE implementation would help,
> wherein we save the nodemask originally specified by the user and do the
> remap from the original nodemask rather than the current nodemask.
I kinda like this idea; though keep in mind that since I don't use
mempolicy mechanisms, I am not losing any sleep over minor(?)
compatibility breakages. It would take someone familiar with the
actual users or usages of MPOL_INTERLEAVE to know if or how much
this would bite actual users/usages.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-20 5:53 ` Paul Jackson
@ 2007-08-20 8:10 ` David Rientjes
2007-08-20 18:25 ` Paul Jackson
2007-08-20 18:28 ` Ethan Solomita
0 siblings, 2 replies; 16+ messages in thread
From: David Rientjes @ 2007-08-20 8:10 UTC (permalink / raw)
To: Paul Jackson; +Cc: Ethan Solomita, clameter, linux-mm
On Sun, 19 Aug 2007, Paul Jackson wrote:
> > BTW, a slightly different MPOL_INTERLEAVE implementation would help,
> > wherein we save the nodemask originally specified by the user and do the
> > remap from the original nodemask rather than the current nodemask.
>
> I kinda like this idea; though keep in mind that since I don't use
> mempolicy mechanisms, I am not losing any sleep over minor(?)
> compatibility breakages. It would take someone familiar with the
> actual users or usages of MPOL_INTERLEAVE to know if or how much
> this would bite actual users/usages.
>
Like I've already said, there is absolutely no reason to add a new MPOL
variant for this case. As Christoph already mentioned, PF_SPREAD_PAGE
gets similar results. So just modify mpol_rebind_policy() so that if
/dev/cpuset/<cpuset>/memory_spread_page is true, you rebind the
interleaved nodemask to all nodes in the new nodemask. That's the
well-defined cpuset interface for getting an interleaved behavior already.
Let's not create new memory policies that only work for a very specific
and configurable case when the basic underlying mechanism to that policy
is already present in the cpuset interface, namely, PF_SPREAD_PAGE.
David
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-20 8:10 ` David Rientjes
@ 2007-08-20 18:25 ` Paul Jackson
2007-08-20 18:28 ` Ethan Solomita
1 sibling, 0 replies; 16+ messages in thread
From: Paul Jackson @ 2007-08-20 18:25 UTC (permalink / raw)
To: David Rientjes; +Cc: solo, clameter, linux-mm
David wrote:
> Like I've already said, there is absolutely no reason to add a new MPOL
> variant for this case. As Christoph already mentioned, PF_SPREAD_PAGE
> gets similar results. So just modify mpol_rebind_policy() so that if
> /dev/cpuset/<cpuset>/memory_spread_page is true, you rebind the
> interleaved nodemask to all nodes in the new nodemask. That's the
> well-defined cpuset interface for getting an interleaved behavior already.
Hmm ... nice.
As David likely guesses, I didn't read his earlier suggestion of this.
Thanks for repeating it.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-20 8:10 ` David Rientjes
2007-08-20 18:25 ` Paul Jackson
@ 2007-08-20 18:28 ` Ethan Solomita
2007-08-20 18:40 ` David Rientjes
1 sibling, 1 reply; 16+ messages in thread
From: Ethan Solomita @ 2007-08-20 18:28 UTC (permalink / raw)
To: David Rientjes; +Cc: Paul Jackson, clameter, linux-mm
David Rientjes wrote:
>
> Like I've already said, there is absolutely no reason to add a new MPOL
> variant for this case. As Christoph already mentioned, PF_SPREAD_PAGE
> gets similar results. So just modify mpol_rebind_policy() so that if
> /dev/cpuset/<cpuset>/memory_spread_page is true, you rebind the
> interleaved nodemask to all nodes in the new nodemask. That's the
> well-defined cpuset interface for getting an interleaved behavior already.
memory_spread_page is only for file-backed pages, not anon pages.
-- Ethan
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-20 18:28 ` Ethan Solomita
@ 2007-08-20 18:40 ` David Rientjes
2007-08-20 19:50 ` Ethan Solomita
0 siblings, 1 reply; 16+ messages in thread
From: David Rientjes @ 2007-08-20 18:40 UTC (permalink / raw)
To: Ethan Solomita; +Cc: Paul Jackson, clameter, linux-mm
On Mon, 20 Aug 2007, Ethan Solomita wrote:
> > Like I've already said, there is absolutely no reason to add a new MPOL
> > variant for this case. As Christoph already mentioned, PF_SPREAD_PAGE gets
> > similar results. So just modify mpol_rebind_policy() so that if
> > /dev/cpuset/<cpuset>/memory_spread_page is true, you rebind the interleaved
> > nodemask to all nodes in the new nodemask. That's the well-defined cpuset
> > interface for getting an interleaved behavior already.
>
> memory_spread_page is only for file-backed pages, not anon pages.
Please read what I said above, all you have to do is modify
mpol_rebind_policy() so that if /dev/cpuset/<cpuset>/memory_spread_page is
true, you rebind the interleaved nodemask to all nodes in the new
nodemask.
This only happens for the MPOL_INTERLEAVE case because the application has
made it quite clear through set_mempolicy(MPOL_INTERLEAVE, ...) that it
wants this behavior.
int cpuset_is_spread_page(struct task_struct *task)
{
	int ret;

	task_lock(task);
	ret = is_spread_page(task->cpuset);
	task_unlock(task);
	return ret;
}

void mpol_rebind_policy(struct mempolicy *pol, const nodemask_t *newmask)
{
	...
	case MPOL_INTERLEAVE:
		if (cpuset_is_spread_page(current))
			pol->v.nodes = *newmask;
		else {
			nodes_remap(tmp, pol->v.nodes, *mpolmask, *newmask);
			pol->v.nodes = tmp;
		}
		...
	...
}
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-20 5:47 ` Ethan Solomita
2007-08-20 5:53 ` Paul Jackson
@ 2007-08-20 19:07 ` Christoph Lameter
2007-08-21 14:14 ` Lee Schermerhorn
1 sibling, 1 reply; 16+ messages in thread
From: Christoph Lameter @ 2007-08-20 19:07 UTC (permalink / raw)
To: Lee Schermerhorn; +Cc: Paul Jackson, Ethan Solomita, rientjes, linux-mm
On Sun, 19 Aug 2007, Ethan Solomita wrote:
> OK, then I'll proceed with a new MPOL. Do you believe that this will
> be of general interest? i.e. worth placing in linux-mm?
Ummmm... Let's first get Lee onto this. AFAIK he already has an
implementation for such a thing.
Lee: Would you respond to these emails?
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-20 18:40 ` David Rientjes
@ 2007-08-20 19:50 ` Ethan Solomita
0 siblings, 0 replies; 16+ messages in thread
From: Ethan Solomita @ 2007-08-20 19:50 UTC (permalink / raw)
To: David Rientjes; +Cc: Paul Jackson, clameter, linux-mm
David Rientjes wrote:
> On Mon, 20 Aug 2007, Ethan Solomita wrote:
>
>>> Like I've already said, there is absolutely no reason to add a new MPOL
>>> variant for this case. As Christoph already mentioned, PF_SPREAD_PAGE gets
>>> similar results. So just modify mpol_rebind_policy() so that if
>>> /dev/cpuset/<cpuset>/memory_spread_page is true, you rebind the interleaved
>>> nodemask to all nodes in the new nodemask. That's the well-defined cpuset
>>> interface for getting an interleaved behavior already.
>> memory_spread_page is only for file-backed pages, not anon pages.
>
> Please read what I said above, all you have to do is modify
> mpol_rebind_policy() so that if /dev/cpuset/<cpuset>/memory_spread_page is
> true, you rebind the interleaved nodemask to all nodes in the new
> nodemask.
>
> This only happens for the MPOL_INTERLEAVE case because the application has
> made it quite clear through set_mempolicy(MPOL_INTERLEAVE, ...) that it
> wants this behavior.
I understood what you said; I just had API concerns. If the definition
of memory_spread_page is that file-backed pages get spread, it seems
counterintuitive to me that it could have an effect on all memory
allocations.
Also MPOL_INTERLEAVE doesn't necessarily make it clear that this
behavior is desired. One use (not the one I'm interested in, though) of
this MPOL is to let the user specify specific nodes of interest, and
they might not want a rebind to revert them to interleaving amongst all
nodes.
If MPOL_INTERLEAVE's nodemask were really a mask, applied on top of
mems_allowed, this would be much easier, but it's not.
-- Ethan
* Re: cpusets vs. mempolicy and how to get interleaving
2007-08-20 19:07 ` Christoph Lameter
@ 2007-08-21 14:14 ` Lee Schermerhorn
0 siblings, 0 replies; 16+ messages in thread
From: Lee Schermerhorn @ 2007-08-21 14:14 UTC (permalink / raw)
To: Christoph Lameter
Cc: Paul Jackson, Ethan Solomita, rientjes, linux-mm, Eric Whitney
On Mon, 2007-08-20 at 12:07 -0700, Christoph Lameter wrote:
> On Sun, 19 Aug 2007, Ethan Solomita wrote:
>
> > OK, then I'll proceed with a new MPOL. Do you believe that this will
> > be of general interest? i.e. worth placing in linux-mm?
>
> Ummmm... Let's first get Lee onto this. AFAIK he already has an
> implementation for such a thing.
>
> Lee: Would you respond to these emails?
>
Here's the post for cpuset-independent interleave [a.k.a. "contextual
interleave"].
http://marc.info/?l=linux-mm&m=118608528417158&w=4
I'm maintaining this with a half a dozen other mempolicy cleanups and
enhancements. I'll try to post the entire series later this week.
Meanwhile, the patch linked above should apply to current mm tree with
little conflict.
Lee
end of thread, other threads:[~2007-08-21 14:14 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-08-18 0:22 cpusets vs. mempolicy and how to get interleaving Ethan Solomita
2007-08-18 0:29 ` Ethan Solomita
2007-08-19 10:18 ` David Rientjes
2007-08-20 0:53 ` Ethan Solomita
2007-08-20 2:34 ` Paul Jackson
2007-08-20 5:47 ` Ethan Solomita
2007-08-20 5:53 ` Paul Jackson
2007-08-20 8:10 ` David Rientjes
2007-08-20 18:25 ` Paul Jackson
2007-08-20 18:28 ` Ethan Solomita
2007-08-20 18:40 ` David Rientjes
2007-08-20 19:50 ` Ethan Solomita
2007-08-20 19:07 ` Christoph Lameter
2007-08-21 14:14 ` Lee Schermerhorn
2007-08-18 1:07 ` Christoph Lameter
2007-08-18 1:51 ` Ethan Solomita