From: Lee Schermerhorn <lee.schermerhorn@hp.com>
To: linux-mm@kvack.org
Cc: ak@suse.de, Lee Schermerhorn <lee.schermerhorn@hp.com>,
Nishanth Aravamudan <nacc@us.ibm.com>,
pj@sgi.com, kxr@sgi.com, Christoph Lameter <clameter@sgi.com>,
Mel Gorman <mel@skynet.ie>,
akpm@linux-foundation.org,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: [PATCH 13/14] Memoryless Nodes: use "node_memory_map" for cpusets
Date: Fri, 27 Jul 2007 15:44:40 -0400 [thread overview]
Message-ID: <20070727194440.18614.95660.sendpatchset@localhost> (raw)
In-Reply-To: <20070727194316.18614.36380.sendpatchset@localhost>
[patch 13/14] Memoryless Nodes: use "node_memory_map" for cpusets - take 4
Against 2.6.22-rc1-mm1 atop Christoph Lameter's memoryless nodes
series
take 2:
+ replaced node_online_map in cpuset_current_mems_allowed()
with node_states[N_MEMORY]
+ replaced node_online_map in cpuset_init_smp() with
node_states[N_MEMORY]
take 3:
+ fix up comments and top level cpuset tracking of nodes
with memory [instead of on-line nodes]
take 4:
+ fix typo in !CPUSETS definition of cpuset_current_mems_allowed()
+ fix up Documentation/cpusets.txt to reflect these changes.
cpusets try to ensure that any node added to a cpuset's
mems_allowed is on-line and contains memory. The assumption
was that online nodes contained memory. Thus, it is possible
to add memoryless nodes to a cpuset and then add tasks to this
cpuset. This results in continuous series of oom-kill and
apparent system hang.
Change cpusets to use node_states[N_MEMORY] [a.k.a.
node_memory_map] in place of node_online_map when vetting
memories. Return error if admin attempts to write a non-empty
mems_allowed node mask containing only memoryless-nodes.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Tested-by: Nishanth Aravamudan <nacc@us.ibm.com>
Tested on 4-node ppc64 with 2 memoryless nodes. Top cpuset
(and all subsequent ones) only allow nodes 0 and 1 (the
nodes with memory).
Documentation/cpusets.txt | 8 ++++---
include/linux/cpuset.h | 2 -
kernel/cpuset.c | 51 +++++++++++++++++++++++++++++-----------------
3 files changed, 39 insertions(+), 22 deletions(-)
Index: Linux/kernel/cpuset.c
===================================================================
--- Linux.orig/kernel/cpuset.c 2007-07-26 12:40:16.000000000 -0400
+++ Linux/kernel/cpuset.c 2007-07-26 12:55:29.000000000 -0400
@@ -307,26 +307,26 @@ static void guarantee_online_cpus(const
/*
* Return in *pmask the portion of a cpusets's mems_allowed that
- * are online. If none are online, walk up the cpuset hierarchy
- * until we find one that does have some online mems. If we get
- * all the way to the top and still haven't found any online mems,
- * return node_online_map.
+ * are online, with memory. If none are online with memory, walk
+ * up the cpuset hierarchy until we find one that does have some
+ * online mems. If we get all the way to the top and still haven't
+ * found any online mems, return node_states[N_MEMORY].
*
* One way or another, we guarantee to return some non-empty subset
- * of node_online_map.
+ * of node_states[N_MEMORY].
*
* Call with callback_mutex held.
*/
static void guarantee_online_mems(const struct cpuset *cs, nodemask_t *pmask)
{
- while (cs && !nodes_intersects(cs->mems_allowed, node_online_map))
+ while (cs && !nodes_intersects(cs->mems_allowed, node_states[N_MEMORY]))
cs = cs->parent;
if (cs)
- nodes_and(*pmask, cs->mems_allowed, node_online_map);
+ nodes_and(*pmask, cs->mems_allowed, node_states[N_MEMORY]);
else
- *pmask = node_online_map;
- BUG_ON(!nodes_intersects(*pmask, node_online_map));
+ *pmask = node_states[N_MEMORY];
+ BUG_ON(!nodes_intersects(*pmask, node_states[N_MEMORY]));
}
/**
@@ -597,7 +597,7 @@ static int update_nodemask(struct cpuset
int retval;
struct container_iter it;
- /* top_cpuset.mems_allowed tracks node_online_map; it's read-only */
+ /* top_cpuset.mems_allowed tracks node_states[N_MEMORY]; it's read-only */
if (cs == &top_cpuset)
return -EACCES;
@@ -614,8 +614,21 @@ static int update_nodemask(struct cpuset
retval = nodelist_parse(buf, trialcs.mems_allowed);
if (retval < 0)
goto done;
+ if (!nodes_intersects(trialcs.mems_allowed,
+ node_states[N_MEMORY])) {
+ /*
+ * error if only memoryless nodes specified.
+ */
+ retval = -ENOSPC;
+ goto done;
+ }
}
- nodes_and(trialcs.mems_allowed, trialcs.mems_allowed, node_online_map);
+ /*
+ * Exclude memoryless nodes. We know that trialcs.mems_allowed
+ * contains at least one node with memory.
+ */
+ nodes_and(trialcs.mems_allowed, trialcs.mems_allowed,
+ node_states[N_MEMORY]);
oldmem = cs->mems_allowed;
if (nodes_equal(oldmem, trialcs.mems_allowed)) {
retval = 0; /* Too easy - nothing to do */
@@ -1356,8 +1369,9 @@ static void guarantee_online_cpus_mems_i
/*
* The cpus_allowed and mems_allowed nodemasks in the top_cpuset track
- * cpu_online_map and node_online_map. Force the top cpuset to track
- * whats online after any CPU or memory node hotplug or unplug event.
+ * cpu_online_map and node_states[N_MEMORY]. Force the top cpuset to
+ * track what's online after any CPU or memory node hotplug or unplug
+ * event.
*
* To ensure that we don't remove a CPU or node from the top cpuset
* that is currently in use by a child cpuset (which would violate
@@ -1377,7 +1391,7 @@ static void common_cpu_mem_hotplug_unplu
guarantee_online_cpus_mems_in_subtree(&top_cpuset);
top_cpuset.cpus_allowed = cpu_online_map;
- top_cpuset.mems_allowed = node_online_map;
+ top_cpuset.mems_allowed = node_states[N_MEMORY];
mutex_unlock(&callback_mutex);
container_unlock();
@@ -1405,8 +1419,9 @@ static int cpuset_handle_cpuhp(struct no
#ifdef CONFIG_MEMORY_HOTPLUG
/*
- * Keep top_cpuset.mems_allowed tracking node_online_map.
- * Call this routine anytime after you change node_online_map.
+ * Keep top_cpuset.mems_allowed tracking node_states[N_MEMORY].
+ * Call this routine anytime after you change
+ * node_states[N_MEMORY].
* See also the previous routine cpuset_handle_cpuhp().
*/
@@ -1425,7 +1440,7 @@ void cpuset_track_online_nodes(void)
void __init cpuset_init_smp(void)
{
top_cpuset.cpus_allowed = cpu_online_map;
- top_cpuset.mems_allowed = node_online_map;
+ top_cpuset.mems_allowed = node_states[N_MEMORY];
hotcpu_notifier(cpuset_handle_cpuhp, 0);
}
@@ -1465,7 +1480,7 @@ void cpuset_init_current_mems_allowed(vo
*
* Description: Returns the nodemask_t mems_allowed of the cpuset
* attached to the specified @tsk. Guaranteed to return some non-empty
- * subset of node_online_map, even if this means going outside the
+ * subset of node_states[N_MEMORY], even if this means going outside the
* tasks cpuset.
**/
Index: Linux/include/linux/cpuset.h
===================================================================
--- Linux.orig/include/linux/cpuset.h 2007-07-26 12:40:16.000000000 -0400
+++ Linux/include/linux/cpuset.h 2007-07-26 12:55:30.000000000 -0400
@@ -92,7 +92,7 @@ static inline nodemask_t cpuset_mems_all
return node_possible_map;
}
-#define cpuset_current_mems_allowed (node_online_map)
+#define cpuset_current_mems_allowed (node_states[N_MEMORY])
static inline void cpuset_init_current_mems_allowed(void) {}
static inline void cpuset_update_task_memory_state(void) {}
#define cpuset_nodes_subset_current_mems_allowed(nodes) (1)
Index: Linux/Documentation/cpusets.txt
===================================================================
--- Linux.orig/Documentation/cpusets.txt 2007-07-25 09:29:48.000000000 -0400
+++ Linux/Documentation/cpusets.txt 2007-07-26 13:02:00.000000000 -0400
@@ -8,6 +8,7 @@ Portions Copyright (c) 2004-2006 Silicon
Modified by Paul Jackson <pj@sgi.com>
Modified by Christoph Lameter <clameter@sgi.com>
Modified by Paul Menage <menage@google.com>
+Modified by Lee Schermerhorn <lee.schermerhorn@hp.com>
CONTENTS:
=========
@@ -35,7 +36,8 @@ CONTENTS:
----------------------
Cpusets provide a mechanism for assigning a set of CPUs and Memory
-Nodes to a set of tasks.
+Nodes to a set of tasks. In this document "Memory Node" refers to
+an on-line node that contains memory.
Cpusets constrain the CPU and Memory placement of tasks to only
the resources within a tasks current cpuset. They form a nested
@@ -207,8 +209,8 @@ and name space for cpusets, with a minim
The cpus and mems files in the root (top_cpuset) cpuset are
read-only. The cpus file automatically tracks the value of
cpu_online_map using a CPU hotplug notifier, and the mems file
-automatically tracks the value of node_online_map using the
-cpuset_track_online_nodes() hook.
+automatically tracks the value of node_states[N_MEMORY]--i.e.,
+nodes with memory--using the cpuset_track_online_nodes() hook.
1.4 What are exclusive cpusets ?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2007-07-27 19:44 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-07-27 19:43 [PATCH 00/14] NUMA: Memoryless node support V4 Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 01/14] NUMA: Generic management of nodemasks for various purposes Lee Schermerhorn
2007-07-30 21:38 ` [PATCH/RFC] 2.6.23-rc1-mm1: MPOL_PREFERRED fixups for preferred_node < 0 Lee Schermerhorn
2007-07-30 22:00 ` Lee Schermerhorn
2007-07-31 15:32 ` Mel Gorman
2007-07-31 15:58 ` Lee Schermerhorn
2007-07-31 21:05 ` [PATCH/RFC] 2.6.23-rc1-mm1: MPOL_PREFERRED fixups for preferred_node < 0 - v2 Lee Schermerhorn
2007-08-01 2:22 ` [PATCH 01/14] NUMA: Generic management of nodemasks for various purposes Andrew Morton
2007-08-01 2:52 ` Christoph Lameter
2007-08-01 3:05 ` Andrew Morton
2007-08-01 3:14 ` Christoph Lameter
2007-08-01 3:32 ` Andrew Morton
2007-08-01 3:37 ` Christoph Lameter
[not found] ` <Pine.LNX.4.64.0707312151400.2894@schroedinger.engr.sgi.com>
2007-08-01 5:07 ` Andrew Morton
2007-08-01 5:11 ` Andrew Morton
2007-08-01 5:22 ` Christoph Lameter
2007-08-01 10:24 ` Mel Gorman
2007-08-02 16:23 ` Mel Gorman
2007-08-02 20:00 ` Christoph Lameter
2007-08-01 5:36 ` Paul Mundt
2007-08-01 9:19 ` Andi Kleen
2007-08-01 14:03 ` Lee Schermerhorn
2007-08-01 17:41 ` Christoph Lameter
2007-08-01 17:54 ` Lee Schermerhorn
2007-08-02 20:05 ` [PATCH/RFC/WIP] cpuset-independent interleave policy Lee Schermerhorn
2007-08-02 20:34 ` Christoph Lameter
2007-08-02 21:04 ` Lee Schermerhorn
2007-08-03 0:31 ` Christoph Lameter
2007-08-02 20:19 ` Audit of "all uses of node_online()" Lee Schermerhorn
2007-08-02 20:26 ` Christoph Lameter
2007-08-08 22:19 ` Lee Schermerhorn
2007-08-08 23:40 ` Christoph Lameter
2007-08-16 14:17 ` [PATCH/RFC] memoryless nodes - fixup uses of node_online_map in generic code Lee Schermerhorn
2007-08-16 18:33 ` Christoph Lameter
2007-08-16 19:15 ` Lee Schermerhorn
2007-08-16 21:10 ` Lee Schermerhorn
2007-08-16 21:13 ` Christoph Lameter
2007-08-24 16:09 ` [PATCH] 2.6.23-rc3-mm1 - Move setup of N_CPU node state mask Lee Schermerhorn
2007-09-06 13:56 ` Mel Gorman
2007-08-02 20:33 ` Audit of "all uses of node_online()" Andrew Morton
2007-08-02 20:45 ` Lee Schermerhorn
2007-08-01 15:58 ` [PATCH 01/14] NUMA: Generic management of nodemasks for various purposes Nishanth Aravamudan
2007-08-01 16:09 ` Nishanth Aravamudan
2007-08-01 17:47 ` Christoph Lameter
2007-08-01 15:25 ` Nishanth Aravamudan
2007-07-27 19:43 ` [PATCH 02/14] Memoryless nodes: introduce mask of nodes with memory Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 03/14] Memoryless Nodes: Fix interleave behavior Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 04/14] OOM: use the N_MEMORY map instead of constructing one on the fly Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 05/14] Memoryless Nodes: No need for kswapd Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 06/14] Memoryless Node: Slab support Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 07/14] Memoryless nodes: SLUB support Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 08/14] Uncached allocator: Handle memoryless nodes Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 09/14] Memoryless node: Allow profiling data to fall back to other nodes Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 10/14] Memoryless nodes: Update memory policy and page migration Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 11/14] Add N_CPU node state Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 12/14] Memoryless nodes: Fix GFP_THISNODE behavior Lee Schermerhorn
2007-07-27 19:44 ` Lee Schermerhorn [this message]
2007-07-27 19:44 ` [PATCH 14/14] Memoryless nodes: drop one memoryless node boot warning Lee Schermerhorn
2007-07-27 20:59 ` [PATCH 00/14] NUMA: Memoryless node support V4 Nishanth Aravamudan
2007-07-30 13:48 ` Lee Schermerhorn
2007-07-29 12:35 ` Paul Jackson
2007-07-30 16:07 ` Lee Schermerhorn
2007-07-30 18:56 ` Paul Jackson
2007-07-30 21:19 ` Nishanth Aravamudan
2007-07-30 22:06 ` Christoph Lameter
2007-07-30 22:35 ` Andi Kleen
2007-07-30 22:36 ` Christoph Lameter
2007-07-31 23:18 ` Nishanth Aravamudan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070727194440.18614.95660.sendpatchset@localhost \
--to=lee.schermerhorn@hp.com \
--cc=ak@suse.de \
--cc=akpm@linux-foundation.org \
--cc=clameter@sgi.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=kxr@sgi.com \
--cc=linux-mm@kvack.org \
--cc=mel@skynet.ie \
--cc=nacc@us.ibm.com \
--cc=pj@sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.