From: Lee Schermerhorn <lee.schermerhorn@hp.com>
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, ak@suse.de, mel@skynet.ie,
clameter@sgi.com, eric.whitney@hp.com
Subject: [PATCH/RFC 7/8] Mem Policy: MPOL_PREFERRED cleanups for "local allocation"
Date: Thu, 06 Dec 2007 16:21:29 -0500
Message-ID: <20071206212129.6279.1028.sendpatchset@localhost>
In-Reply-To: <20071206212047.6279.10881.sendpatchset@localhost>
PATCH/RFC 07/08 - Mem Policy: MPOL_PREFERRED cleanups for "local allocation" - V5
Against: 2.6.24-rc2-mm1
V4 -> V5:
+ change mpol_to_str() to show "local" policy for MPOL_PREFERRED with
preferred_node == -1. libnuma wrappers and numactl use the term
"local allocation", so let's use it here.
V3 -> V4:
+ updated Documentation/vm/numa_memory_policy.txt to better explain
[I think] the "local allocation" feature of MPOL_PREFERRED.
V2 -> V3:
+ renamed get_nodemask() to get_policy_nodemask() to more closely
match what it's doing.
V1 -> V2:
+ renamed get_zonemask() to get_nodemask(). Mel Gorman suggested this
was a valid "cleanup".
Here are a couple of "cleanups" for MPOL_PREFERRED behavior
when v.preferred_node < 0 -- i.e., "local allocation":
1) [do_]get_mempolicy() calls the now renamed get_policy_nodemask()
to fetch the nodemask associated with a policy. Currently,
get_policy_nodemask() returns the set of nodes with memory when
the policy mode is MPOL_PREFERRED and the preferred_node is < 0.
Return the set of allowed nodes instead. This set will already
have been masked to include only nodes with memory.
2) When a task is moved into a [new] cpuset, mpol_rebind_policy() is
called to adjust any task and vma policy nodes to be valid in the
new cpuset. However, when the policy is MPOL_PREFERRED and the
preferred_node is < 0, no rebind is necessary. The "local allocation"
indication is valid in any cpuset. Existing code will "do the right
thing" because node_remap() will just return the argument node when
it is outside of the valid range of node ids. However, I think it is
clearer and cleaner to skip the remap explicitly in this case.
3) mpol_to_str() produces a printable, "human readable" string from a
struct mempolicy. For MPOL_PREFERRED with preferred_node < 0, show
"local", because this policy means local allocation that follows the
task as it migrates among nodes. Note that this matches the usage of
"local allocation" in libnuma and numactl. Without this change, I
believe that node_set() [via set_bit()] will set bit 31, resulting in
a misleading display.
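
To illustrate item 2, here is a rough user-space model of the rebind
behavior. It is only a sketch: nodemask_model_t, node_remap_model() and
rebind_preferred_model() are hypothetical names, the mask is limited to
64 nodes, and the remap rule (keep the node's ordinal position within the
old mask, modulo the weight of the new mask) is my understanding of what
node_remap() does, not a copy of the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: one bit per node id. */
typedef uint64_t nodemask_model_t;
#define MODEL_MAX_NODES 64

/* Number of set bits in the mask. */
static int mask_weight(nodemask_model_t mask)
{
	int w = 0;

	for (int i = 0; i < MODEL_MAX_NODES; i++)
		w += (int)((mask >> i) & 1);
	return w;
}

/* Ordinal of bit 'pos' among the set bits, or -1 if out of range or clear. */
static int pos_to_ord(nodemask_model_t mask, int pos)
{
	int ord = 0;

	if (pos < 0 || pos >= MODEL_MAX_NODES || !((mask >> pos) & 1))
		return -1;
	for (int i = 0; i < pos; i++)
		ord += (int)((mask >> i) & 1);
	return ord;
}

/* Position of the ord-th set bit, or -1 if the mask has too few bits set. */
static int ord_to_pos(nodemask_model_t mask, int ord)
{
	assert(ord >= 0);
	for (int i = 0; i < MODEL_MAX_NODES; i++) {
		if ((mask >> i) & 1) {
			if (ord == 0)
				return i;
			ord--;
		}
	}
	return -1;
}

/* Assumed remap rule: a node that is not in 'old' comes back unchanged. */
static int node_remap_model(int node, nodemask_model_t old,
			    nodemask_model_t new)
{
	int w = mask_weight(new);
	int n = pos_to_ord(old, node);

	if (n < 0 || w == 0)
		return node;
	return ord_to_pos(new, n % w);
}

/* The explicit skip proposed here: "local" (< 0) is never remapped. */
static int rebind_preferred_model(int preferred_node, nodemask_model_t old,
				  nodemask_model_t new)
{
	if (preferred_node < 0)
		return preferred_node;
	return node_remap_model(preferred_node, old, new);
}
```

In this model, moving from nodes {0,1} to nodes {2,3} remaps a preferred
node of 1 to 3, while -1 comes back unchanged either way. The explicit
check just makes that intent visible instead of relying on the remap
helper's out-of-range behavior.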
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
mm/mempolicy.c | 47 +++++++++++++++++++++++++++++++++++------------
1 file changed, 35 insertions(+), 12 deletions(-)
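
For reviewers who want to see the effect of item 3 in isolation, the
following is a minimal user-space sketch of the mode-name selection. The
MODEL_* enum and display_mode_name() are hypothetical names invented for
this illustration. The string table mirrors the policy_types[] array in
the patch, and the enum order is assumed to match the MPOL_* values it
is indexed by.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of the MPOL_* modes, plus the "local" pseudo-mode. */
enum model_mode {
	MODEL_DEFAULT,
	MODEL_PREFERRED,
	MODEL_BIND,
	MODEL_INTERLEAVE,
	MODEL_LOCAL		/* display only: MODEL_INTERLEAVE + 1 */
};

static const char * const model_policy_types[] =
	{ "default", "prefer", "bind", "interleave", "local" };

/*
 * Pick the name to display for a policy. A preferred policy with a
 * negative preferred_node is shown as "local" rather than "prefer".
 */
static const char *display_mode_name(int mode, int preferred_node)
{
	assert(mode >= MODEL_DEFAULT && mode <= MODEL_INTERLEAVE);

	if (mode == MODEL_PREFERRED && preferred_node < 0)
		mode = MODEL_LOCAL;
	return model_policy_types[mode];
}
```

With this model, display_mode_name(MODEL_PREFERRED, -1) yields "local"
and display_mode_name(MODEL_PREFERRED, 2) yields "prefer". The pseudo-mode
exists only to index the string table, and callers never pass it in.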
Index: Linux/mm/mempolicy.c
===================================================================
--- Linux.orig/mm/mempolicy.c 2007-11-21 11:28:33.000000000 -0500
+++ Linux/mm/mempolicy.c 2007-11-21 11:30:17.000000000 -0500
@@ -484,10 +484,13 @@ static long do_set_mempolicy(int mode, n
return 0;
}
-/* Fill a zone bitmap for a policy */
-static void get_zonemask(struct mempolicy *p, nodemask_t *nodes)
+/*
+ * Fill a zone bitmap for a policy for mempolicy query
+ */
+static void get_policy_nodemask(struct mempolicy *p, nodemask_t *nodes)
{
nodes_clear(*nodes);
+
switch (policy_mode(p)) {
case MPOL_BIND:
/* Fall through */
@@ -495,9 +498,11 @@ static void get_zonemask(struct mempolic
*nodes = p->v.nodes;
break;
case MPOL_PREFERRED:
- /* or use current node instead of memory_map? */
+ /*
+ * for "local policy", return allowed memories
+ */
if (p->v.preferred_node < 0)
- *nodes = node_states[N_HIGH_MEMORY];
+ *nodes = cpuset_current_mems_allowed;
else
node_set(p->v.preferred_node, *nodes);
break;
@@ -578,7 +583,7 @@ static long do_get_mempolicy(int *policy
err = 0;
if (nmask)
- get_zonemask(pol, nmask);
+ get_policy_nodemask(pol, nmask);
out:
mpol_cond_free(pol);
@@ -643,7 +648,7 @@ int do_migrate_pages(struct mm_struct *m
int err = 0;
nodemask_t tmp;
- down_read(&mm->mmap_sem);
+ down_read(&mm->mmap_sem);
err = migrate_vmas(mm, from_nodes, to_nodes, flags);
if (err)
@@ -1749,6 +1754,7 @@ static void mpol_rebind_policy(struct me
{
nodemask_t *mpolmask;
nodemask_t tmp;
+ int nid;
if (!pol)
return;
@@ -1767,9 +1773,15 @@ static void mpol_rebind_policy(struct me
*mpolmask, *newmask);
break;
case MPOL_PREFERRED:
- pol->v.preferred_node = node_remap(pol->v.preferred_node,
+ /*
+ * no need to remap "local policy"
+ */
+ nid = pol->v.preferred_node;
+ if (nid >= 0) {
+ pol->v.preferred_node = node_remap(nid,
*mpolmask, *newmask);
- *mpolmask = *newmask;
+ *mpolmask = *newmask;
+ }
break;
default:
BUG();
@@ -1807,8 +1819,13 @@ void mpol_rebind_mm(struct mm_struct *mm
* Display pages allocated per node and memory policy via /proc.
*/
+/*
+ * "local" is pseudo-policy: MPOL_PREFERRED with preferred_node == -1
+ * Used only for mpol_to_str()
+ */
+#define MPOL_LOCAL (MPOL_INTERLEAVE + 1)
static const char * const policy_types[] =
- { "default", "prefer", "bind", "interleave" };
+ { "default", "prefer", "bind", "interleave", "local" };
/*
* Convert a mempolicy into a string.
@@ -1818,6 +1835,7 @@ static const char * const policy_types[]
static inline int mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
{
char *p = buffer;
+ int nid;
int l;
nodemask_t nodes;
int mode;
@@ -1834,7 +1852,12 @@ static inline int mpol_to_str(char *buff
case MPOL_PREFERRED:
nodes_clear(nodes);
- node_set(pol->v.preferred_node, nodes);
+ nid = pol->v.preferred_node;
+ if (nid < 0)
+ mode = MPOL_LOCAL;
+ else {
+ node_set(nid, nodes);
+ }
break;
case MPOL_BIND:
@@ -1849,8 +1872,8 @@ static inline int mpol_to_str(char *buff
}
l = strlen(policy_types[mode]);
- if (buffer + maxlen < p + l + 1)
- return -ENOSPC;
+ if (buffer + maxlen < p + l + 1)
+ return -ENOSPC;
strcpy(p, policy_types[mode]);
p += l;