* [PATCH] Limit initial tasks' and top level cpuset's mems_allowed to nodes with memory
@ 2009-05-04 3:06 Lee Schermerhorn
2009-05-04 10:02 ` Miao Xie
0 siblings, 1 reply; 3+ messages in thread
From: Lee Schermerhorn @ 2009-05-04 3:06 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-numa, miaox, Mel Gorman, Doug Chapman,
Eric Whitney, Bjorn Helgaas
Against: 2.6.30-rc3-mmotm-090428-1631
Since cpusetmm-update-tasks-mems_allowed-in-time.patch removed the callouts
to cpuset_update_task_memory_state(), tasks in the top cpuset no longer get
their mems_allowed updated to just nodes with memory. cpuset_init() initializes
the top cpuset's mems_allowed with nodes_setall(), while
cpuset_init_current_mems_allowed() and kernel_init() initialize the kernel
initialization tasks' mems_allowed to all possible nodes. Tasks in the top
cpuset that inherit the init task's mems_allowed without modification will
have all possible nodes set. This can be seen by examining the Mems_allowed
field in /proc/<pid>/status for such a task.
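As an illustration of the check described above, here is a minimal userspace
sketch (not part of the patch) that reads the Mems_allowed line from a status
file and counts the nodes set in it. It assumes the field is printed as
comma-separated hexadecimal words; the function names count_mask_bits() and
count_mems_allowed() are made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Count the set bits in a mask printed as comma-separated hex words,
 * e.g. "00000000,00000003". Parsing stops at a newline or end of string.
 * Returns -1 if the string contains an unexpected character.
 */
int count_mask_bits(const char *mask)
{
	int count = 0;

	assert(mask != NULL);
	for (; *mask && *mask != '\n'; mask++) {
		unsigned char c = (unsigned char)*mask;
		unsigned int digit;

		if (c == ',')
			continue;
		if (!isxdigit(c))
			return -1;
		digit = isdigit(c) ? (unsigned int)(c - '0')
				   : (unsigned int)(tolower(c) - 'a' + 10);
		for (; digit; digit >>= 1)
			count += digit & 1;
	}
	return count;
}

/*
 * Find the "Mems_allowed:" line in a status file such as
 * "/proc/self/status" and return the number of nodes set in it,
 * or -1 if the file or the field is not available.
 */
int count_mems_allowed(const char *status_path)
{
	static const char key[] = "Mems_allowed:";
	char line[4096];
	int result = -1;
	FILE *fp = fopen(status_path, "r");

	if (!fp)
		return -1;
	while (fgets(line, sizeof(line), fp)) {
		/* The trailing ':' keeps "Mems_allowed_list:" from matching. */
		if (strncmp(line, key, sizeof(key) - 1) == 0) {
			const char *p = line + sizeof(key) - 1;

			while (*p == ' ' || *p == '\t')
				p++;
			result = count_mask_bits(p);
			break;
		}
	}
	fclose(fp);
	return result;
}
```

On an affected kernel, count_mems_allowed("/proc/self/status") for a task in
the top cpuset would be expected to report more nodes than actually have
memory.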
"numactl --interleave=all" also initializes the interleave node mask to all
ones, relying on the masking with mems_allowed to eliminate non-existent
nodes and nodes without memory. Because that masking was no longer effective,
the interleave policy was attempting to dereference non-existent nodes.
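The dependency on mems_allowed can be sketched in userspace as follows. This
is a simplified model, not the kernel's mempolicy code: it assumes at most 64
nodes held in one 64-bit word, and nodemask_sketch_t,
effective_interleave_mask() and next_interleave_node() are hypothetical names
chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 64	/* assumed limit for this sketch only */

typedef uint64_t nodemask_sketch_t;

/* Restrict a requested interleave mask to the nodes the task may use. */
nodemask_sketch_t effective_interleave_mask(nodemask_sketch_t requested,
					    nodemask_sketch_t mems_allowed)
{
	return requested & mems_allowed;
}

/*
 * Return the next node after 'prev' that is set in 'mask', wrapping
 * around to node 0. Returns -1 if the mask is empty.
 */
int next_interleave_node(nodemask_sketch_t mask, int prev)
{
	int i;

	assert(prev >= -1 && prev < MAX_NODES);
	for (i = 1; i <= MAX_NODES; i++) {
		int node = (prev + i) % MAX_NODES;

		if (mask & ((nodemask_sketch_t)1 << node))
			return node;
	}
	return -1;
}
```

With memory only on nodes 0 and 2, an all-ones mems_allowed lets the walk
land on node 1, which has no memory; with mems_allowed limited to 0x5 the
walk alternates between nodes 0 and 2.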
This patch modifies the nodes_setall() calls in two cpuset init functions and
the initialization of task #1's mems_allowed to use node_states[N_HIGH_MEMORY].
This mask has been initialized to contain only existing nodes with memory by
the time the respective init functions are called.
This fixes the bogus pointer deref [Nat Consumption fault on ia64] reported
in:
[BUG] 2.6.30-rc3-mmotm-090428-1814 -- bogus pointer deref
[The time--1814--was incorrect in that subject line, but the date was correct.]
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
init/main.c | 4 ++--
kernel/cpuset.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
Index: linux-2.6.30-rc3-mmotm-090428-1631/kernel/cpuset.c
===================================================================
--- linux-2.6.30-rc3-mmotm-090428-1631.orig/kernel/cpuset.c 2009-05-03 18:26:24.000000000 -0400
+++ linux-2.6.30-rc3-mmotm-090428-1631/kernel/cpuset.c 2009-05-03 20:46:04.000000000 -0400
@@ -1846,7 +1846,7 @@ int __init cpuset_init(void)
BUG();
cpumask_setall(top_cpuset.cpus_allowed);
- nodes_setall(top_cpuset.mems_allowed);
+ top_cpuset.mems_allowed = node_states[N_HIGH_MEMORY];
fmeter_init(&top_cpuset.fmeter);
set_bit(CS_SCHED_LOAD_BALANCE, &top_cpuset.flags);
@@ -2118,7 +2118,7 @@ void cpuset_cpus_allowed_locked(struct t
void cpuset_init_current_mems_allowed(void)
{
- nodes_setall(current->mems_allowed);
+ current->mems_allowed = node_states[N_HIGH_MEMORY];
}
/**
Index: linux-2.6.30-rc3-mmotm-090428-1631/init/main.c
===================================================================
--- linux-2.6.30-rc3-mmotm-090428-1631.orig/init/main.c 2009-05-03 20:46:04.000000000 -0400
+++ linux-2.6.30-rc3-mmotm-090428-1631/init/main.c 2009-05-03 20:54:03.000000000 -0400
@@ -849,9 +849,9 @@ static int __init kernel_init(void * unu
lock_kernel();
/*
- * init can allocate pages on any node
+ * init can allocate pages on any node with memory
*/
- set_mems_allowed(node_possible_map);
+ set_mems_allowed(node_states[N_HIGH_MEMORY]);
/*
* init can run on any cpu.
*/
* Re: [PATCH] Limit initial tasks' and top level cpuset's mems_allowed to nodes with memory
2009-05-04 3:06 [PATCH] Limit initial tasks' and top level cpuset's mems_allowed to nodes with memory Lee Schermerhorn
@ 2009-05-04 10:02 ` Miao Xie
2009-05-04 15:17 ` Lee Schermerhorn
0 siblings, 1 reply; 3+ messages in thread
From: Miao Xie @ 2009-05-04 10:02 UTC (permalink / raw)
To: lts
Cc: Andrew Morton, linux-mm, linux-numa, Mel Gorman, Doug Chapman,
Eric Whitney, Bjorn Helgaas
on 2009-5-4 11:06 Lee Schermerhorn wrote:
> Against: 2.6.30-rc3-mmotm-090428-1631
>
> Since cpusetmm-update-tasks-mems_allowed-in-time.patch removed the call outs
> to cpuset_update_task_memory_state(), tasks in the top cpuset don't get their
> mems_allowed updated to just nodes with memory. cpuset_init() initializes
> the top cpuset's mems_allowed with nodes_setall() and
> cpuset_init_current_mems_allowed() and kernel_init() initialize the kernel
> initialization tasks' mems_allowed to all possible nodes. Tasks in the top
> cpuset that inherit the init task's mems_allowed without modification will
> have all possible nodes set. This can be seen by examining the Mems_allowed
> field in /proc/<pid>/status in such a task.
>
> "numactl --interleave=all" also initializes the interleave node mask to all
> ones, depending on the masking with mems_allowed to eliminate non-existent
> nodes and nodes without memory. As this was not happening, the interleave
> policy was attempting to dereference non-existent nodes.
>
> This patch modifies the nodes_setall() calls in two cpuset init functions and
> the initialization of task #1's mems_allowed to use node_states[N_HIGH_MEMORY].
> This mask has been initialized to contain only existing nodes with memory by
> the time the respective init functions are called.
You forgot to modify cpuset_attach(). That function initializes the
mems_allowed of a task being moved into the top cpuset from node_possible_map.
Besides that, if you use node_states[N_HIGH_MEMORY] to initialize the mems_allowed
of the tasks in the top cpuset, you must update it when a node with memory is added to
the system. So you must also modify cpuset_track_online_nodes().
Thanks
Miao
>
> This fixes the bogus pointer deref [Nat Consumption fault on ia64] reported
> in:
>
> [BUG] 2.6.30-rc3-mmotm-090428-1814 -- bogus pointer deref
>
> [The time--1814--was incorrect in that subject line, but the date was correct.]
>
> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
>
> init/main.c | 4 ++--
> kernel/cpuset.c | 4 ++--
> 2 files changed, 4 insertions(+), 4 deletions(-)
>
> Index: linux-2.6.30-rc3-mmotm-090428-1631/kernel/cpuset.c
> ===================================================================
> --- linux-2.6.30-rc3-mmotm-090428-1631.orig/kernel/cpuset.c 2009-05-03 18:26:24.000000000 -0400
> +++ linux-2.6.30-rc3-mmotm-090428-1631/kernel/cpuset.c 2009-05-03 20:46:04.000000000 -0400
> @@ -1846,7 +1846,7 @@ int __init cpuset_init(void)
> BUG();
>
> cpumask_setall(top_cpuset.cpus_allowed);
> - nodes_setall(top_cpuset.mems_allowed);
> + top_cpuset.mems_allowed = node_states[N_HIGH_MEMORY];
>
> fmeter_init(&top_cpuset.fmeter);
> set_bit(CS_SCHED_LOAD_BALANCE, &top_cpuset.flags);
> @@ -2118,7 +2118,7 @@ void cpuset_cpus_allowed_locked(struct t
>
> void cpuset_init_current_mems_allowed(void)
> {
> - nodes_setall(current->mems_allowed);
> + current->mems_allowed = node_states[N_HIGH_MEMORY];
> }
>
> /**
> Index: linux-2.6.30-rc3-mmotm-090428-1631/init/main.c
> ===================================================================
> --- linux-2.6.30-rc3-mmotm-090428-1631.orig/init/main.c 2009-05-03 20:46:04.000000000 -0400
> +++ linux-2.6.30-rc3-mmotm-090428-1631/init/main.c 2009-05-03 20:54:03.000000000 -0400
> @@ -849,9 +849,9 @@ static int __init kernel_init(void * unu
> lock_kernel();
>
> /*
> - * init can allocate pages on any node
> + * init can allocate pages on any node with memory
> */
> - set_mems_allowed(node_possible_map);
> + set_mems_allowed(node_states[N_HIGH_MEMORY]);
> /*
> * init can run on any cpu.
> */
>
>
>
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [PATCH] Limit initial tasks' and top level cpuset's mems_allowed to nodes with memory
2009-05-04 10:02 ` Miao Xie
@ 2009-05-04 15:17 ` Lee Schermerhorn
0 siblings, 0 replies; 3+ messages in thread
From: Lee Schermerhorn @ 2009-05-04 15:17 UTC (permalink / raw)
To: miaox
Cc: lts, Andrew Morton, linux-mm, linux-numa, Mel Gorman,
Doug Chapman, Eric Whitney, Bjorn Helgaas
On Mon, 2009-05-04 at 18:02 +0800, Miao Xie wrote:
> on 2009-5-4 11:06 Lee Schermerhorn wrote:
> > Against: 2.6.30-rc3-mmotm-090428-1631
> >
> > Since cpusetmm-update-tasks-mems_allowed-in-time.patch removed the call outs
> > to cpuset_update_task_memory_state(), tasks in the top cpuset don't get their
> > mems_allowed updated to just nodes with memory. cpuset_init() initializes
> > the top cpuset's mems_allowed with nodes_setall() and
> > cpuset_init_current_mems_allowed() and kernel_init() initialize the kernel
> > initialization tasks' mems_allowed to all possible nodes. Tasks in the top
> > cpuset that inherit the init task's mems_allowed without modification will
> > have all possible nodes set. This can be seen by examining the Mems_allowed
> > field in /proc/<pid>/status in such a task.
> >
> > "numactl --interleave=all" also initializes the interleave node mask to all
> > ones, depending on the masking with mems_allowed to eliminate non-existent
> > nodes and nodes without memory. As this was not happening, the interleave
> > policy was attempting to dereference non-existent nodes.
> >
> > This patch modifies the nodes_setall() calls in two cpuset init functions and
> > the initialization of task #1's mems_allowed to use node_states[N_HIGH_MEMORY].
> > This mask has been initialized to contain only existing nodes with memory by
> > the time the respective init functions are called.
>
> You forgot to modify cpuset_attach(). That function initializes the
> mems_allowed of a task being moved into the top cpuset from node_possible_map.
Thanks, I'll look at that. I had tested moving tasks between cpusets
and thought that it was working, but I'd been looking at this for a
while and could have been imagining it. I'll look for all uses of
node_possible_map, etc.
>
> Besides that, if you use node_states[N_HIGH_MEMORY] to initialize the mems_allowed
> of the tasks in the top cpuset, you must update it when a node with memory is added to
> the system. So you must also modify cpuset_track_online_nodes().
So, we'll need to walk the tasks in the top-level cpuset and update
their mems_allowed on node on/off-line. I'd have thought we already did
that, but must admit I didn't check. I'll take a look at how
cpuset_track_online_nodes() interacts with mems_allowed, ...
Lee
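The walk over the top-level cpuset's tasks discussed above might look roughly
like the following userspace model. It is only a sketch under stated
assumptions, not the kernel implementation: the struct and function names are
invented for illustration, masks are single 64-bit words, and the locking and
mempolicy rebinding a real change would need are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t nodemask_sketch_t;

/* Hypothetical stand-ins for the task and cpuset structures. */
struct task_sketch {
	nodemask_sketch_t mems_allowed;
};

struct cpuset_sketch {
	nodemask_sketch_t mems_allowed;
	struct task_sketch *tasks;
	size_t nr_tasks;
};

/*
 * Called when a node with memory goes online or offline: copy the new
 * "nodes with memory" mask into the top cpuset and into every task
 * attached to it, so no task keeps a stale view.
 */
void top_cpuset_track_nodes(struct cpuset_sketch *top,
			    nodemask_sketch_t nodes_with_memory)
{
	size_t i;

	assert(top != NULL);
	top->mems_allowed = nodes_with_memory;
	for (i = 0; i < top->nr_tasks; i++)
		top->tasks[i].mems_allowed = nodes_with_memory;
}
```

In this model, a node coming online is represented by calling
top_cpuset_track_nodes() with the updated mask; every task in the top cpuset
then sees the new node, and a node going offline is removed the same way.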