From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Christoph Lameter <clameter@sgi.com>, Paul Jackson <pj@sgi.com>,
Nishanth Aravamudan <nacc@us.ibm.com>
Cc: akpm@linux-foundation.org, kxr@sgi.com, linux-mm@kvack.org,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: [PATCH take3] Memoryless nodes: use "node_memory_map" for cpuset mems_allowed validation
Date: Tue, 24 Jul 2007 16:30:19 -0400
Message-ID: <1185309019.5649.69.camel@localhost>
In-Reply-To: <Pine.LNX.4.64.0707111204470.17503@schroedinger.engr.sgi.com>
Memoryless Nodes: use "node_memory_map" for cpusets - take 3
Against 2.6.22-rc6-mm1 atop Christoph Lameter's memoryless nodes
series
take 2:
+ replaced node_online_map in cpuset_current_mems_allowed()
with node_states[N_MEMORY]
+ replaced node_online_map in cpuset_init_smp() with
node_states[N_MEMORY]
take 3:
+ fix up comments and top level cpuset tracking of nodes
with memory [instead of on-line nodes].
+ maybe I got them all this time?
cpusets try to ensure that any node added to a cpuset's
mems_allowed is on-line and contains memory. The assumption
was that every online node contains memory, so only
node_online_map was checked. As a result, it is possible to
add memoryless nodes to a cpuset and then add tasks to that
cpuset, which results in a continuous series of oom-kills
and an apparent system hang.
Change cpusets to use node_states[N_MEMORY] [a.k.a.
node_memory_map] in place of node_online_map when vetting
mems_allowed. Return an error [-ENOSPC] if the admin attempts
to write a non-empty mems_allowed node mask containing only
memoryless nodes.
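For illustration only, the vetting rule described above can be
modelled in userspace roughly as follows. This is a sketch, not the
kernel code: nodemask_model_t and vet_mems() are hypothetical names
invented for the example, and a plain 64-bit word stands in for
nodemask_t and node_states[N_MEMORY].

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for nodemask_t: one bit per node. */
typedef uint64_t nodemask_model_t;

/*
 * Model of the check this patch adds to update_nodemask():
 *  - a non-empty request that names only memoryless nodes is
 *    rejected with -ENOSPC;
 *  - otherwise memoryless nodes are stripped from the request.
 * An empty request passes through unchanged; whether an empty
 * mask is acceptable is decided elsewhere and is not modelled here.
 */
static int vet_mems(nodemask_model_t requested,
		    nodemask_model_t has_memory,
		    nodemask_model_t *result)
{
	if (requested != 0 && (requested & has_memory) == 0)
		return -ENOSPC;

	*result = requested & has_memory;
	return 0;
}
```

Assuming a box where only nodes 0 and 1 have memory (has_memory
== 0x3), a request for nodes 2-3 (0xc) fails with -ENOSPC, while a
request for nodes 1-2 (0x6) succeeds and is trimmed to node 1 (0x2).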
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
include/linux/cpuset.h | 2 -
kernel/cpuset.c | 51 +++++++++++++++++++++++++++++++------------------
2 files changed, 34 insertions(+), 19 deletions(-)
Index: Linux/kernel/cpuset.c
===================================================================
--- Linux.orig/kernel/cpuset.c 2007-07-24 11:24:56.000000000 -0400
+++ Linux/kernel/cpuset.c 2007-07-24 12:20:40.000000000 -0400
@@ -316,26 +316,26 @@ static void guarantee_online_cpus(const
/*
* Return in *pmask the portion of a cpusets's mems_allowed that
- * are online. If none are online, walk up the cpuset hierarchy
- * until we find one that does have some online mems. If we get
- * all the way to the top and still haven't found any online mems,
- * return node_online_map.
+ * are online, with memory. If none are online with memory, walk
+ * up the cpuset hierarchy until we find one that does have some
+ * online mems. If we get all the way to the top and still haven't
+ * found any online mems, return node_states[N_MEMORY].
*
* One way or another, we guarantee to return some non-empty subset
- * of node_online_map.
+ * of node_states[N_MEMORY].
*
* Call with callback_mutex held.
*/
static void guarantee_online_mems(const struct cpuset *cs, nodemask_t *pmask)
{
- while (cs && !nodes_intersects(cs->mems_allowed, node_online_map))
+ while (cs && !nodes_intersects(cs->mems_allowed, node_states[N_MEMORY]))
cs = cs->parent;
if (cs)
- nodes_and(*pmask, cs->mems_allowed, node_online_map);
+ nodes_and(*pmask, cs->mems_allowed, node_states[N_MEMORY]);
else
- *pmask = node_online_map;
- BUG_ON(!nodes_intersects(*pmask, node_online_map));
+ *pmask = node_states[N_MEMORY];
+ BUG_ON(!nodes_intersects(*pmask, node_states[N_MEMORY]));
}
/**
@@ -606,7 +606,7 @@ static int update_nodemask(struct cpuset
int retval;
struct container_iter it;
- /* top_cpuset.mems_allowed tracks node_online_map; it's read-only */
+ /* top_cpuset.mems_allowed tracks node_states[N_MEMORY]; it's read-only */
if (cs == &top_cpuset)
return -EACCES;
@@ -623,8 +623,21 @@ static int update_nodemask(struct cpuset
retval = nodelist_parse(buf, trialcs.mems_allowed);
if (retval < 0)
goto done;
+ if (!nodes_intersects(trialcs.mems_allowed,
+ node_states[N_MEMORY])) {
+ /*
+ * error if only memoryless nodes specified.
+ */
+ retval = -ENOSPC;
+ goto done;
+ }
}
- nodes_and(trialcs.mems_allowed, trialcs.mems_allowed, node_online_map);
+ /*
+ * Exclude memoryless nodes. We know that trialcs.mems_allowed
+ * contains at least one node with memory.
+ */
+ nodes_and(trialcs.mems_allowed, trialcs.mems_allowed,
+ node_states[N_MEMORY]);
oldmem = cs->mems_allowed;
if (nodes_equal(oldmem, trialcs.mems_allowed)) {
retval = 0; /* Too easy - nothing to do */
@@ -1366,8 +1379,9 @@ static void guarantee_online_cpus_mems_i
/*
* The cpus_allowed and mems_allowed nodemasks in the top_cpuset track
- * cpu_online_map and node_online_map. Force the top cpuset to track
- * whats online after any CPU or memory node hotplug or unplug event.
+ * cpu_online_map and node_states[N_MEMORY]. Force the top cpuset to
+ * track what's online after any CPU or memory node hotplug or unplug
+ * event.
*
* To ensure that we don't remove a CPU or node from the top cpuset
* that is currently in use by a child cpuset (which would violate
@@ -1387,7 +1401,7 @@ static void common_cpu_mem_hotplug_unplu
guarantee_online_cpus_mems_in_subtree(&top_cpuset);
top_cpuset.cpus_allowed = cpu_online_map;
- top_cpuset.mems_allowed = node_online_map;
+ top_cpuset.mems_allowed = node_states[N_MEMORY];
mutex_unlock(&callback_mutex);
container_unlock();
@@ -1412,8 +1426,9 @@ static int cpuset_handle_cpuhp(struct no
#ifdef CONFIG_MEMORY_HOTPLUG
/*
- * Keep top_cpuset.mems_allowed tracking node_online_map.
- * Call this routine anytime after you change node_online_map.
+ * Keep top_cpuset.mems_allowed tracking node_states[N_MEMORY].
+ * Call this routine anytime after you change
+ * node_states[N_MEMORY].
* See also the previous routine cpuset_handle_cpuhp().
*/
@@ -1432,7 +1447,7 @@ void cpuset_track_online_nodes(void)
void __init cpuset_init_smp(void)
{
top_cpuset.cpus_allowed = cpu_online_map;
- top_cpuset.mems_allowed = node_online_map;
+ top_cpuset.mems_allowed = node_states[N_MEMORY];
hotcpu_notifier(cpuset_handle_cpuhp, 0);
}
@@ -1472,7 +1487,7 @@ void cpuset_init_current_mems_allowed(vo
*
* Description: Returns the nodemask_t mems_allowed of the cpuset
* attached to the specified @tsk. Guaranteed to return some non-empty
- * subset of node_online_map, even if this means going outside the
+ * subset of node_states[N_MEMORY], even if this means going outside the
* tasks cpuset.
**/
Index: Linux/include/linux/cpuset.h
===================================================================
--- Linux.orig/include/linux/cpuset.h 2007-07-24 11:24:56.000000000 -0400
+++ Linux/include/linux/cpuset.h 2007-07-24 12:20:56.000000000 -0400
@@ -92,7 +92,7 @@ static inline nodemask_t cpuset_mems_all
return node_possible_map;
}
-#define cpuset_current_mems_allowed (node_online_map)
+#define cpuset_current_mems_allowed (node_states[N_MEMORY])
static inline void cpuset_init_current_mems_allowed(void) {}
static inline void cpuset_update_task_memory_state(void) {}
#define cpuset_nodes_subset_current_mems_allowed(nodes) (1)