* [PATCH/RFC 0/8] Mem Policy: More Reference Counting/Fallback Fixes and Misc Cleanups
@ 2007-12-06 21:20 Lee Schermerhorn
2007-12-06 21:20 ` [PATCH/RFC 1/8] Mem Policy: Write lock mmap_sem while changing task mempolicy Lee Schermerhorn
` (7 more replies)
0 siblings, 8 replies; 11+ messages in thread
From: Lee Schermerhorn @ 2007-12-06 21:20 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, ak, eric.whitney, clameter, mel
PATCH/RFC 00/08 Mem Policy: Reference Counting/Fallback Fixes and
Miscellaneous mempolicy cleanup
Against: 2.6.24-rc2-mm1
Note: These patches are based atop Mel Gorman's "twozonelist"
series. Patch 5 depends on the elimination of the external
zonelist attached to MPOL_BIND policies. Patch 8 updates the
mempolicy documentation to reflect a change introduced by Mel's
patches. I will rebase and repost, dropping the 'RFC' and resolving
any comments, after Mel's patches go into -mm.
Patch 1 takes mmap_sem for write when installing task memory policy.
Suggested by and originally posted by Christoph Lameter.
Patch 2 fixes a problem with fallback when a get_policy() vm_op returns
NULL. Currently, such a return does not follow the vma -> task ->
system default policy fallback path.
Patch 3 marks shared policies as such. Only shared policies require
unref after lookup.
Patch 4 just documents the mempolicy reference semantics assumed by this
series for the set and get policy vm_ops where the prototypes are defined.
Patch 5 contains the actual rework of mempolicy reference counting. This
patch backs out the code that performed unref on all mempolicies other than the
current task's and system default, and performs unref only when needed--
effectively only on shared policies. Also updates the numa_memory_policy.txt
document to describe the memory policy reference counting semantics as I
currently understand them.
Patches 6 and 7 are cleanups of the internal usage of MPOL_DEFAULT and
MPOL_PREFERRED.
Patch 8 updates the memory policy documentation to reflect the fact that,
with Mel's twozonelist series, MPOL_BIND now searches the allowed nodes
in distance order.
This series is currently an RFC. The patches in this series build, boot
and survive memtoy testing on an x86_64 NUMA platform. I have also tested with
instrumentation to track and report the reference counts. So far, my testing
shows that the patches are working as I expect.
Lee Schermerhorn
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>
* [PATCH/RFC 1/8] Mem Policy: Write lock mmap_sem while changing task mempolicy
2007-12-06 21:20 [PATCH/RFC 0/8] Mem Policy: More Reference Counting/Fallback Fixes and Misc Cleanups Lee Schermerhorn
@ 2007-12-06 21:20 ` Lee Schermerhorn
2007-12-06 21:24 ` Andi Kleen
2007-12-06 21:20 ` [PATCH/RFC 2/8] Mem Policy: Fixup Fallback for Default Shmem Policy Lee Schermerhorn
` (6 subsequent siblings)
7 siblings, 1 reply; 11+ messages in thread
From: Lee Schermerhorn @ 2007-12-06 21:20 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, ak, mel, eric.whitney, clameter
PATCH/RFC 01/08 Mem Policy: Write lock mmap_sem while changing task mempolicy
Against: 2.6.24-rc2-mm1
A read of /proc/<pid>/numa_maps holds the target task's mmap_sem
for read while examining each vma's mempolicy. A vma's mempolicy
can fall back to the task's policy. However, the task could be
changing its task policy and freeing the one that show_numa_maps()
is examining.
To prevent this, grab the mmap_sem for write when updating task
mempolicy. Pointed out to me by Christoph Lameter and extracted
and reworked from Christoph's alternative mempol reference counting
patch.
This is analogous to the way that do_mbind() and do_get_mempolicy()
prevent races between tasks sharing an mm_struct [a.k.a. threads]
setting and querying a mempolicy for a particular address.
Note: this is necessary, but not sufficient, to allow us to stop
taking an extra reference on "other task's mempolicy" in get_vma_policy.
Subsequent patches will complete this update, allowing us to simplify
the tests for whether we need to unref a mempolicy at various points
in the code.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
[needs Christoph's sign-off if he agrees]
mm/mempolicy.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
Index: linux-2.6.24-rc4-mm1/mm/mempolicy.c
===================================================================
--- linux-2.6.24-rc4-mm1.orig/mm/mempolicy.c 2007-12-05 11:54:20.000000000 -0500
+++ linux-2.6.24-rc4-mm1/mm/mempolicy.c 2007-12-05 11:54:22.000000000 -0500
@@ -450,17 +450,30 @@ static void mpol_set_task_struct_flag(vo
static long do_set_mempolicy(int mode, nodemask_t *nodes)
{
struct mempolicy *new;
+ struct mm_struct *mm = current->mm;
if (contextualize_policy(mode, nodes))
return -EINVAL;
new = mpol_new(mode, nodes);
if (IS_ERR(new))
return PTR_ERR(new);
+
+ /*
+ * prevent changing our mempolicy while show_numa_maps()
+ * is using it.
+ * Note: do_set_mempolicy() can be called at init time
+ * with no 'mm'.
+ */
+ if (mm)
+ down_write(&mm->mmap_sem);
mpol_free(current->mempolicy);
current->mempolicy = new;
mpol_set_task_struct_flag();
if (new && new->policy == MPOL_INTERLEAVE)
current->il_next = first_node(new->v.nodes);
+ if (mm)
+ up_write(&mm->mmap_sem);
+
return 0;
}
--
* [PATCH/RFC 2/8] Mem Policy: Fixup Fallback for Default Shmem Policy
2007-12-06 21:20 [PATCH/RFC 0/8] Mem Policy: More Reference Counting/Fallback Fixes and Misc Cleanups Lee Schermerhorn
2007-12-06 21:20 ` [PATCH/RFC 1/8] Mem Policy: Write lock mmap_sem while changing task mempolicy Lee Schermerhorn
@ 2007-12-06 21:20 ` Lee Schermerhorn
2007-12-06 21:21 ` [PATCH/RFC 3/8] Mem Policy: Mark shared policies for unref Lee Schermerhorn
` (5 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Lee Schermerhorn @ 2007-12-06 21:20 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, clameter, ak, eric.whitney, mel
PATCH/RFC 02/08 Mem Policy: Fixup Fallback for Default Shmem/Shm Policy
Against: 2.6.24-rc2-mm1
get_vma_policy() is not handling fallback to task policy correctly
when the get_policy() vm_op returns NULL. The NULL overwrites
the 'pol' variable that was holding the fallback task mempolicy.
So, it was falling back directly to system default policy.
Fix get_vma_policy() to use only non-NULL policy returned from
the vma get_policy op.
shm_get_policy() was falling back to current task's mempolicy if
the "backing file system" [tmpfs vs hugetlbfs] does not support
the get_policy vm_op and the vma policy is null. This is incorrect
for show_numa_maps() which is likely querying the numa_maps of
some task other than current. Remove this fallback.
Like get_vma_policy(), do_get_mempolicy() was potentially overwriting
the pol variable, which contains the current task's mempolicy as
first fallback, with a NULL policy. This would cause incorrect
fallback to system default policy, instead of any non-NULL task
mempolicy. Further, do_get_mempolicy() duplicates code in
get_vma_policy(). Change do_get_mempolicy() to call get_vma_policy()
when MPOL_F_ADDR is specified.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
ipc/shm.c | 2 --
mm/mempolicy.c | 20 +++++++++++---------
2 files changed, 11 insertions(+), 11 deletions(-)
Index: Linux/mm/mempolicy.c
===================================================================
--- Linux.orig/mm/mempolicy.c 2007-11-28 12:58:36.000000000 -0500
+++ Linux/mm/mempolicy.c 2007-11-28 13:01:58.000000000 -0500
@@ -110,6 +110,8 @@ struct mempolicy default_policy = {
.policy = MPOL_DEFAULT,
};
+static struct mempolicy *get_vma_policy(struct task_struct *task,
+ struct vm_area_struct *vma, unsigned long addr);
static void mpol_rebind_policy(struct mempolicy *pol,
const nodemask_t *newmask);
@@ -543,15 +545,12 @@ static long do_get_mempolicy(int *policy
up_read(&mm->mmap_sem);
return -EFAULT;
}
- if (vma->vm_ops && vma->vm_ops->get_policy)
- pol = vma->vm_ops->get_policy(vma, addr);
- else
- pol = vma->vm_policy;
- } else if (addr)
+ pol = get_vma_policy(current, vma, addr);
+ } else if (addr) {
return -EINVAL;
-
- if (!pol)
+ } else if (!pol) {
pol = &default_policy;
+ }
if (flags & MPOL_F_NODE) {
if (flags & MPOL_F_ADDR) {
@@ -1116,7 +1115,7 @@ asmlinkage long compat_sys_mbind(compat_
* @task != current]. It is the caller's responsibility to
* free the reference in these cases.
*/
-static struct mempolicy * get_vma_policy(struct task_struct *task,
+static struct mempolicy *get_vma_policy(struct task_struct *task,
struct vm_area_struct *vma, unsigned long addr)
{
struct mempolicy *pol = task->mempolicy;
@@ -1124,7 +1123,10 @@ static struct mempolicy * get_vma_policy
if (vma) {
if (vma->vm_ops && vma->vm_ops->get_policy) {
- pol = vma->vm_ops->get_policy(vma, addr);
+ struct mempolicy *vpol = vma->vm_ops->get_policy(vma,
+ addr);
+ if (vpol)
+ pol = vpol;
shared_pol = 1; /* if pol non-NULL, add ref below */
} else if (vma->vm_policy &&
vma->vm_policy->policy != MPOL_DEFAULT)
Index: Linux/ipc/shm.c
===================================================================
--- Linux.orig/ipc/shm.c 2007-11-28 12:02:42.000000000 -0500
+++ Linux/ipc/shm.c 2007-11-28 13:01:58.000000000 -0500
@@ -273,8 +273,6 @@ static struct mempolicy *shm_get_policy(
pol = sfd->vm_ops->get_policy(vma, addr);
else if (vma->vm_policy)
pol = vma->vm_policy;
- else
- pol = current->mempolicy;
return pol;
}
#endif
--
* [PATCH/RFC 3/8] Mem Policy: Mark shared policies for unref
2007-12-06 21:20 [PATCH/RFC 0/8] Mem Policy: More Reference Counting/Fallback Fixes and Misc Cleanups Lee Schermerhorn
2007-12-06 21:20 ` [PATCH/RFC 1/8] Mem Policy: Write lock mmap_sem while changing task mempolicy Lee Schermerhorn
2007-12-06 21:20 ` [PATCH/RFC 2/8] Mem Policy: Fixup Fallback for Default Shmem Policy Lee Schermerhorn
@ 2007-12-06 21:21 ` Lee Schermerhorn
2007-12-06 21:21 ` [PATCH/RFC 4/8] Mem Policy: Document {set|get}_policy() vm_ops APIs Lee Schermerhorn
` (4 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Lee Schermerhorn @ 2007-12-06 21:21 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, ak, mel, clameter, eric.whitney
PATCH/RFC 03/08 - Mem Policy: Mark shared policies for unref
Against: 2.6.24-rc2-mm1
As part of yet another rework of mempolicy reference counting,
we want to be able to identify shared policies efficiently,
because they have an extra ref taken on lookup that needs to
be removed when we're finished using the policy.
Note: the extra ref is required because the policies are
shared between tasks/processes and can be changed/freed
by one task while another task is using them--e.g., for
page allocation.
Reusing part of my yet-to-be/maybe-never merged "context-independent"
interleave policy patch, this patch encodes the "shared" state
in an upper bit of the 'mode' member of the mempolicy structure.
Note this member has been renamed from 'policy' to 'mode' to better
match documentation and, more importantly, to catch any direct
references to the member.
The mode member must already be tested to determine the policy mode,
so no extra memory references should be required. However, for
testing the policy--e.g., in the several switch() and if() statements--
the MPOL_SHARED flag must be masked off using the policy_mode() inline
function. This allows additional flags to be so encoded, should that
become useful--e.g., for "context-independent" interleave policy,
cpuset-relative node id numbering, or any other future extension to
mempolicy.
I set the MPOL_SHARED flag when the policy is installed in the shared
policy rb-tree. We don't need or want to clear the flag when removing the
policy from the tree, as the mempolicy is freed [unref'd] internally by
sp_delete(). However, a task could hold another reference on this mempolicy
from a prior lookup. We need the MPOL_SHARED flag to stay put so that
any tasks holding a ref will unref, eventually freeing, the mempolicy.
A later patch in this series will introduce a function to conditionally
unref [mpol_free] a policy. The MPOL_SHARED flag is one reason
[currently the only reason] to unref/free a policy via the conditional
free.
Note: an alternative to marking shared policies, suggested recently
by Christoph Lameter, is to define an additional argument to
get_vma_policy() that points to a 'needs_unref' variable. We would
test 'needs_unref' in all functions that look up policies
to determine whether the policy needs to be unref'd. We could then set
'needs_unref' in get_vma_policy() for non-null mempolicies
returned by a vma get_policy() op. This means that the shm
get_policy() vm_op would need to add a ref when falling back to
vma policy--e.g., for SHM_HUGETLB segments--to mimic shmem refs.
OR, we could pass the extra args all the way down the vm policy op
call stacks...
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Documentation/vm/numa_memory_policy.txt | 4 --
include/linux/mempolicy.h | 18 ++++++++++++
mm/mempolicy.c | 45 +++++++++++++++++---------------
3 files changed, 41 insertions(+), 26 deletions(-)
Index: Linux/mm/mempolicy.c
===================================================================
--- Linux.orig/mm/mempolicy.c 2007-12-05 13:49:07.000000000 -0500
+++ Linux/mm/mempolicy.c 2007-12-06 14:17:40.000000000 -0500
@@ -107,7 +107,7 @@ enum zone_type policy_zone = 0;
struct mempolicy default_policy = {
.refcnt = ATOMIC_INIT(1), /* never free it */
- .policy = MPOL_DEFAULT,
+ .mode = MPOL_DEFAULT,
};
static struct mempolicy *get_vma_policy(struct task_struct *task,
@@ -194,7 +194,7 @@ static struct mempolicy *mpol_new(int mo
policy->v.nodes = *nodes;
break;
}
- policy->policy = mode;
+ policy->mode = mode;
policy->cpuset_mems_allowed = cpuset_mems_allowed(current);
return policy;
}
@@ -471,7 +471,7 @@ static long do_set_mempolicy(int mode, n
mpol_free(current->mempolicy);
current->mempolicy = new;
mpol_set_task_struct_flag();
- if (new && new->policy == MPOL_INTERLEAVE)
+ if (new && policy_mode(new) == MPOL_INTERLEAVE)
current->il_next = first_node(new->v.nodes);
if (mm)
up_write(&mm->mmap_sem);
@@ -483,7 +483,7 @@ static long do_set_mempolicy(int mode, n
static void get_zonemask(struct mempolicy *p, nodemask_t *nodes)
{
nodes_clear(*nodes);
- switch (p->policy) {
+ switch (policy_mode(p)) {
case MPOL_DEFAULT:
break;
case MPOL_BIND:
@@ -559,14 +559,14 @@ static long do_get_mempolicy(int *policy
goto out;
*policy = err;
} else if (pol == current->mempolicy &&
- pol->policy == MPOL_INTERLEAVE) {
+ policy_mode(pol) == MPOL_INTERLEAVE) {
*policy = current->il_next;
} else {
err = -EINVAL;
goto out;
}
} else
- *policy = pol->policy;
+ *policy = policy_mode(pol);
if (vma) {
up_read(&current->mm->mmap_sem);
@@ -1129,7 +1129,7 @@ static struct mempolicy *get_vma_policy(
pol = vpol;
shared_pol = 1; /* if pol non-NULL, add ref below */
} else if (vma->vm_policy &&
- vma->vm_policy->policy != MPOL_DEFAULT)
+ policy_mode(vma->vm_policy) != MPOL_DEFAULT)
pol = vma->vm_policy;
}
if (!pol)
@@ -1143,7 +1143,7 @@ static struct mempolicy *get_vma_policy(
static nodemask_t *nodemask_policy(gfp_t gfp, struct mempolicy *policy)
{
/* Lower zones don't get a nodemask applied for MPOL_BIND */
- if (unlikely(policy->policy == MPOL_BIND) &&
+ if (unlikely(policy_mode(policy) == MPOL_BIND) &&
gfp_zone(gfp) >= policy_zone &&
cpuset_nodemask_valid_mems_allowed(&policy->v.nodes))
return &policy->v.nodes;
@@ -1156,7 +1156,7 @@ static struct zonelist *zonelist_policy(
{
int nd;
- switch (policy->policy) {
+ switch (policy_mode(policy)) {
case MPOL_PREFERRED:
nd = policy->v.preferred_node;
if (nd < 0)
@@ -1205,7 +1205,7 @@ static unsigned interleave_nodes(struct
*/
unsigned slab_node(struct mempolicy *policy)
{
- int pol = policy ? policy->policy : MPOL_DEFAULT;
+ int pol = policy ? policy_mode(policy) : MPOL_DEFAULT;
switch (pol) {
case MPOL_INTERLEAVE:
@@ -1298,7 +1298,7 @@ struct zonelist *huge_zonelist(struct vm
struct zonelist *zl;
*mpol = NULL; /* probably no unref needed */
- if (pol->policy == MPOL_INTERLEAVE) {
+ if (policy_mode(pol) == MPOL_INTERLEAVE) {
unsigned nid;
nid = interleave_nid(pol, vma, addr, HPAGE_SHIFT);
@@ -1308,7 +1308,7 @@ struct zonelist *huge_zonelist(struct vm
zl = zonelist_policy(GFP_HIGHUSER, pol);
if (unlikely(pol != &default_policy && pol != current->mempolicy)) {
- if (pol->policy != MPOL_BIND)
+ if (policy_mode(pol) != MPOL_BIND)
__mpol_free(pol); /* finished with pol */
else
*mpol = pol; /* unref needed after allocation */
@@ -1362,7 +1362,7 @@ alloc_page_vma(gfp_t gfp, struct vm_area
cpuset_update_task_memory_state();
- if (unlikely(pol->policy == MPOL_INTERLEAVE)) {
+ if (unlikely(policy_mode(pol) == MPOL_INTERLEAVE)) {
unsigned nid;
nid = interleave_nid(pol, vma, addr, PAGE_SHIFT);
@@ -1411,7 +1411,7 @@ struct page *alloc_pages_current(gfp_t g
cpuset_update_task_memory_state();
if (!pol || in_interrupt() || (gfp & __GFP_THISNODE))
pol = &default_policy;
- if (pol->policy == MPOL_INTERLEAVE)
+ if (policy_mode(pol) == MPOL_INTERLEAVE)
return alloc_page_interleave(gfp, order, interleave_nodes(pol));
return __alloc_pages_nodemask(gfp, order,
zonelist_policy(gfp, pol), nodemask_policy(gfp, pol));
@@ -1447,9 +1447,11 @@ int __mpol_equal(struct mempolicy *a, st
{
if (!a || !b)
return 0;
- if (a->policy != b->policy)
+
+ if (a->mode != b->mode)
return 0;
- switch (a->policy) {
+
+ switch (policy_mode(a)) {
case MPOL_DEFAULT:
return 1;
case MPOL_BIND:
@@ -1469,7 +1471,7 @@ void __mpol_free(struct mempolicy *p)
{
if (!atomic_dec_and_test(&p->refcnt))
return;
- p->policy = MPOL_DEFAULT;
+ p->mode = MPOL_DEFAULT;
kmem_cache_free(policy_cache, p);
}
@@ -1535,7 +1537,7 @@ static void sp_insert(struct shared_poli
rb_link_node(&new->nd, parent, p);
rb_insert_color(&new->nd, &sp->root);
pr_debug("inserting %lx-%lx: %d\n", new->start, new->end,
- new->policy ? new->policy->policy : 0);
+ new->policy ? policy_mode(new->policy) : 0);
}
/* Find shared policy intersecting idx */
@@ -1575,6 +1577,7 @@ static struct sp_node *sp_alloc(unsigned
n->start = start;
n->end = end;
mpol_get(pol);
+ pol->mode |= MPOL_SHARED; /* for unref */
n->policy = pol;
return n;
}
@@ -1660,7 +1663,7 @@ int mpol_set_shared_policy(struct shared
pr_debug("set_shared_policy %lx sz %lu %d %lx\n",
vma->vm_pgoff,
- sz, npol? npol->policy : -1,
+ sz, npol? policy_mode(npol) : -1,
npol ? nodes_addr(npol->v.nodes)[0] : -1);
if (npol) {
@@ -1756,7 +1759,7 @@ static void mpol_rebind_policy(struct me
if (nodes_equal(*mpolmask, *newmask))
return;
- switch (pol->policy) {
+ switch (policy_mode(pol)) {
case MPOL_DEFAULT:
break;
case MPOL_BIND:
@@ -1822,7 +1825,7 @@ static inline int mpol_to_str(char *buff
char *p = buffer;
int l;
nodemask_t nodes;
- int mode = pol ? pol->policy : MPOL_DEFAULT;
+ int mode = pol ? policy_mode(pol) : MPOL_DEFAULT;
switch (mode) {
case MPOL_DEFAULT:
Index: Linux/include/linux/mempolicy.h
===================================================================
--- Linux.orig/include/linux/mempolicy.h 2007-12-05 11:54:20.000000000 -0500
+++ Linux/include/linux/mempolicy.h 2007-12-06 14:17:40.000000000 -0500
@@ -15,6 +15,13 @@
#define MPOL_INTERLEAVE 3
#define MPOL_MAX MPOL_INTERLEAVE
+#define MPOL_MODE 0x0ff /* reserve 8 bits for policy "mode" */
+
+/*
+ * OR'd into struct mempolicy 'policy' member for 'shared policies'
+ * so that we can easily identify them for unref after lookup/use.
+ */
+#define MPOL_SHARED (1 << 8)
/* Flags for get_mem_policy */
#define MPOL_F_NODE (1<<0) /* return next IL mode instead of node mask */
@@ -62,7 +69,7 @@ struct mm_struct;
*/
struct mempolicy {
atomic_t refcnt;
- short policy; /* See MPOL_* above */
+ short mode; /* See MPOL_* above */
union {
short preferred_node; /* preferred */
nodemask_t nodes; /* interleave/bind */
@@ -72,6 +79,15 @@ struct mempolicy {
};
/*
+ * Return 'policy' [a.k.a. 'mode'] member of mpol, less CONTEXT
+ * or any other modifiers.
+ */
+static inline int policy_mode(struct mempolicy *mpol)
+{
+ return mpol->mode & MPOL_MODE;
+}
+
+/*
* Support for managing mempolicy data objects (clone, copy, destroy)
* The default fast path of a NULL MPOL_DEFAULT policy is always inlined.
*/
Index: Linux/Documentation/vm/numa_memory_policy.txt
===================================================================
--- Linux.orig/Documentation/vm/numa_memory_policy.txt 2007-12-06 14:17:40.000000000 -0500
+++ Linux/Documentation/vm/numa_memory_policy.txt 2007-12-06 14:18:27.000000000 -0500
@@ -143,10 +143,6 @@ Components of Memory Policies
structure, struct mempolicy. Details of this structure will be discussed
in context, below, as required to explain the behavior.
- Note: in some functions AND in the struct mempolicy itself, the mode
- is called "policy". However, to avoid confusion with the policy tuple,
- this document will continue to use the term "mode".
-
Linux memory policy supports the following 4 behavioral modes:
Default Mode--MPOL_DEFAULT: The behavior specified by this mode is
--
* [PATCH/RFC 4/8] Mem Policy: Document {set|get}_policy() vm_ops APIs
2007-12-06 21:20 [PATCH/RFC 0/8] Mem Policy: More Reference Counting/Fallback Fixes and Misc Cleanups Lee Schermerhorn
` (2 preceding siblings ...)
2007-12-06 21:21 ` [PATCH/RFC 3/8] Mem Policy: Mark shared policies for unref Lee Schermerhorn
@ 2007-12-06 21:21 ` Lee Schermerhorn
2007-12-06 21:21 ` [PATCH/RFC 5/8] Mem Policy: Rework mempolicy Reference Counting [yet again] Lee Schermerhorn
` (3 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Lee Schermerhorn @ 2007-12-06 21:21 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, ak, eric.whitney, clameter, mel
PATCH/RFC 04/08 Mem Policy: Document {set|get}_policy() vm_ops APIs
Against: 2.6.24-rc2-mm1
Document mempolicy return value reference semantics assumed by
the rest of the mempolicy code for the set_ and get_policy vm_ops
in <linux/mm.h>--where the prototypes are defined--to inform any
future mempolicy vm_op writers what the rest of the subsystem
expects of them.
Note: An alternative, suggested by Christoph Lameter: we could
define get_policy() to add an extra ref to any non-null mempolicy
returned. get_vma_policy() could then inform its caller--e.g., via
an additional argument pointing to a 'needs_unref' variable--that the
policy needs unref [mpol_free()] after use.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
include/linux/mm.h | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
Index: Linux/include/linux/mm.h
===================================================================
--- Linux.orig/include/linux/mm.h 2007-10-29 13:20:52.000000000 -0400
+++ Linux/include/linux/mm.h 2007-10-29 13:25:33.000000000 -0400
@@ -173,7 +173,25 @@ struct vm_operations_struct {
* writable, if an error is returned it will cause a SIGBUS */
int (*page_mkwrite)(struct vm_area_struct *vma, struct page *page);
#ifdef CONFIG_NUMA
+ /*
+ * set_policy() op must add a reference to any non-NULL @new mempolicy
+ * to hold the policy upon return. Caller should pass NULL @new to
+ * remove a policy and fall back to surrounding context--i.e. do not
+ * install a MPOL_DEFAULT policy, nor the task or system default
+ * mempolicy.
+ */
int (*set_policy)(struct vm_area_struct *vma, struct mempolicy *new);
+
+ /*
+ * get_policy() op must add reference [mpol_get()] to any policy at
+ * (vma,addr) marked as MPOL_SHARED. The shared policy infrastructure
+ * in mm/mempolicy.c will do this automatically.
+ * get_policy() must NOT add a ref if the policy at (vma,addr) is not
+ * marked as MPOL_SHARED. vma policies are protected by the mmap_sem.
+ * If no [shared/vma] mempolicy exists at the addr, get_policy() op
+ * must return NULL--i.e., do not "fallback" to task or system default
+ * policy.
+ */
struct mempolicy *(*get_policy)(struct vm_area_struct *vma,
unsigned long addr);
int (*migrate)(struct vm_area_struct *vma, const nodemask_t *from,
--
* [PATCH/RFC 5/8] Mem Policy: Rework mempolicy Reference Counting [yet again]
2007-12-06 21:20 [PATCH/RFC 0/8] Mem Policy: More Reference Counting/Fallback Fixes and Misc Cleanups Lee Schermerhorn
` (3 preceding siblings ...)
2007-12-06 21:21 ` [PATCH/RFC 4/8] Mem Policy: Document {set|get}_policy() vm_ops APIs Lee Schermerhorn
@ 2007-12-06 21:21 ` Lee Schermerhorn
2007-12-06 21:21 ` [PATCH/RFC 6/8] Mem Policy: Use MPOL_PREFERRED for system-wide default policy Lee Schermerhorn
` (2 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Lee Schermerhorn @ 2007-12-06 21:21 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, ak, mel, eric.whitney, clameter
PATCH/RFC 05/08 Mem Policy: rework mempolicy reference counting [yet again]
Against: 2.6.24-rc2-mm1
N.B., this patch depends on Mel Gorman's "one zonelist" series. See
discussion of read_swap_cache_async() below.
After further discussion with Christoph Lameter, it has become clear
that my earlier attempts to clean up the mempolicy reference counting
were a bit of overkill in some areas, resulting in superflous ref/unref
in what are usually fast paths. In other areas, further inspection
reveals that I botched the unref for interleave policies. This patch
attempts to clean this up. Maybe I'll get it right this time.
So, here's what [I think] is happening:
1) system default mempolicy needs no protection by extra reference counts
as it is never freed. However, we need to be real sure that we never
unref the sys default mempolicy.
2) The current task's mempolicy needs no extra references because it can
only be changed by the task itself. That can't happen when we're in
here using the policy for allocation or querying it via get_mempolicy().
3) Another task's mempolicy needs no extra reference [after patch 1 of
this series] because the caller must hold the target task's mm's
mmap_sem when accessing the mempolicy. Currently, this only occurs
when show_numa_maps() looks up a task's per vma mempolicy and the
mempolicy falls back to [non-NULL] task policy. show_numa_maps() is
called from the /proc/<pid>/numa_maps handler holding the target
task's mmap_sem for read.
N.B., this only works if do_set_mempolicy() grabs the mmap_sem for write
when updating the task mempolicy. This is covered by patch 1 of this
series.
4) A task's [non-shared] vma policy needs no extra references because all
lookups and usage of vma policy occurs with the mmap_sem held for read--
e.g., in the fault path or in do_get_mempolicy().
5) A shared policy--i.e., a mempolicy for a range of a shared memory region
[really a mmap()ed tmpfs file]--managed by the shared policy infrastructure
in mm/mempolicy.c requires an extra reference when looked up for allocation
or query. The shared policy infrastructure has always added this reference.
Shmem page allocation [shmem_alloc_page() and shmem_swapin()] released
the ref count, but new_vma_page() [page migration] and show_numa_maps()
never did, resulting in leaking of mempolicy structures applied to shared
memory regions allocated by shmget(). We need to release this extra
reference when finished with the mempolicy.
When are we "finished" with the mempolicy?
For MPOL_PREFERRED policies [including MPOL_DEFAULT == "preferred local"],
we're finished as soon as we've obtained the zonelist for the target node.
For MPOL_INTERLEAVE policies, we're finished as soon as we've determined
the target node for the interleave.
For MPOL_BIND policies, because they contain a custom zonelist used for
page allocation, we're only finished after we've converted this zonelist
to a nodemask for get_mempolicy()/show_numa_maps() or after we've allocated
a page [or failed to] based on the nodelist. [Note: when Mel Gorman's
onezonelist series gets merged--he says hopefully--this paragraph will
apply to the custom nodemask that replaces the zonelist, as the nodemask
must also be held over the allocation.]
But, again, lookup of mempolicy, based on (vma, address) need only add a
reference for shared policy, and we need only unref the policy when finished
for shared policies. So, this patch backs out all of the unneeded extra
reference counting added by my previous attempt. It then unrefs only
shared policies when we're finished with them, using the mpol_cond_free()
[conditional free] helper function introduced by this patch.
Note that shmem_swapin() calls read_swap_cache_async() with a dummy vma
containing just the policy. read_swap_cache_async() can call
alloc_page_vma() multiple times, so we can't let alloc_page_vma() unref
the shared policy in this case. To avoid this, we make a copy of any
non-null shared policy and remove the MPOL_SHARED flag from the copy.
I introduced a new static inline function "mpol_cond_assign()" to assign
the shared policy to an on-stack policy and remove the flags that would
require a conditional free. This depends on Mel Gorman's "one zonelist"
patch series that eliminates the custom zonelist hanging off MPOL_BIND
policies.
This patch updates the numa_memory_policy.txt document to explain the
reference counting semantics, as discussed above.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Documentation/vm/numa_memory_policy.txt | 69 ++++++++++++++++++++++++++++++++
include/linux/mempolicy.h | 42 +++++++++++++++++++
mm/hugetlb.c | 2
mm/mempolicy.c | 46 +++++++++------------
mm/shmem.c | 16 ++++---
5 files changed, 142 insertions(+), 33 deletions(-)
Index: Linux/mm/mempolicy.c
===================================================================
--- Linux.orig/mm/mempolicy.c 2007-12-06 14:17:40.000000000 -0500
+++ Linux/mm/mempolicy.c 2007-12-06 14:18:34.000000000 -0500
@@ -578,6 +578,7 @@ static long do_get_mempolicy(int *policy
get_zonemask(pol, nmask);
out:
+ mpol_cond_free(pol);
if (vma)
up_read(&current->mm->mmap_sem);
return err;
@@ -1110,16 +1111,18 @@ asmlinkage long compat_sys_mbind(compat_
*
* Returns effective policy for a VMA at specified address.
* Falls back to @task or system default policy, as necessary.
- * Returned policy has extra reference count if shared, vma,
- * or some other task's policy [show_numa_maps() can pass
- * @task != current]. It is the caller's responsibility to
- * free the reference in these cases.
+ * Current or other task's task mempolicy and non-shared vma policies
+ * are protected by the task's mmap_sem, which must be held for read by
+ * the caller.
+ * Shared policies [those marked as MPOL_SHARED] require an extra reference
+ * count--added by the get_policy() vm_op, as appropriate--to protect against
+ * freeing by another task. It is the caller's responsibility to free the
+ * extra reference for shared policies.
*/
static struct mempolicy *get_vma_policy(struct task_struct *task,
struct vm_area_struct *vma, unsigned long addr)
{
struct mempolicy *pol = task->mempolicy;
- int shared_pol = 0;
if (vma) {
if (vma->vm_ops && vma->vm_ops->get_policy) {
@@ -1127,15 +1130,12 @@ static struct mempolicy *get_vma_policy(
addr);
if (vpol)
pol = vpol;
- shared_pol = 1; /* if pol non-NULL, add ref below */
} else if (vma->vm_policy &&
policy_mode(vma->vm_policy) != MPOL_DEFAULT)
pol = vma->vm_policy;
}
if (!pol)
pol = &default_policy;
- else if (!shared_pol && pol != current->mempolicy)
- mpol_get(pol); /* vma or other task's policy */
return pol;
}
@@ -1202,6 +1202,10 @@ static unsigned interleave_nodes(struct
/*
* Depending on the memory policy provide a node from which to allocate the
* next slab entry.
+ * @policy must be protected from freeing by the caller. If @policy is
+ * the current task's mempolicy, this protection is implicit, as only the
+ * task can change its policy. The system default policy requires no
+ * such protection.
*/
unsigned slab_node(struct mempolicy *policy)
{
@@ -1295,25 +1299,18 @@ struct zonelist *huge_zonelist(struct vm
gfp_t gfp_flags, struct mempolicy **mpol)
{
struct mempolicy *pol = get_vma_policy(current, vma, addr);
- struct zonelist *zl;
- *mpol = NULL; /* probably no unref needed */
if (policy_mode(pol) == MPOL_INTERLEAVE) {
unsigned nid;
nid = interleave_nid(pol, vma, addr, HPAGE_SHIFT);
- __mpol_free(pol); /* finished with pol */
+ mpol_cond_free(pol); /* finished with pol */
+ *mpol = NULL;
return node_zonelist(nid, gfp_flags);
}
- zl = zonelist_policy(GFP_HIGHUSER, pol);
- if (unlikely(pol != &default_policy && pol != current->mempolicy)) {
- if (policy_mode(pol) != MPOL_BIND)
- __mpol_free(pol); /* finished with pol */
- else
- *mpol = pol; /* unref needed after allocation */
- }
- return zl;
+ *mpol = pol; /* unref needed after allocation */
+ return zonelist_policy(GFP_HIGHUSER, pol);
}
#endif
@@ -1366,12 +1363,13 @@ alloc_page_vma(gfp_t gfp, struct vm_area
unsigned nid;
nid = interleave_nid(pol, vma, addr, PAGE_SHIFT);
+ mpol_cond_free(pol);
return alloc_page_interleave(gfp, 0, nid);
}
zl = zonelist_policy(gfp, pol);
- if (pol != &default_policy && pol != current->mempolicy) {
+ if (unlikely(mpol_needs_cond_ref(pol))) {
/*
- * slow path: ref counted policy -- shared or vma
+ * slow path: ref counted shared policy
*/
struct page *page = __alloc_pages_nodemask(gfp, 0,
zl, nodemask_policy(gfp, pol));
@@ -1956,11 +1954,7 @@ int show_numa_map(struct seq_file *m, vo
pol = get_vma_policy(priv->task, vma, vma->vm_start);
mpol_to_str(buffer, sizeof(buffer), pol);
- /*
- * unref shared or other task's mempolicy
- */
- if (pol != &default_policy && pol != current->mempolicy)
- __mpol_free(pol);
+ mpol_cond_free(pol);
seq_printf(m, "%08lx %s", vma->vm_start, buffer);
Index: Linux/mm/shmem.c
===================================================================
--- Linux.orig/mm/shmem.c 2007-12-06 14:17:40.000000000 -0500
+++ Linux/mm/shmem.c 2007-12-06 14:18:34.000000000 -0500
@@ -1048,16 +1048,19 @@ out:
static struct page *shmem_swapin(swp_entry_t entry, gfp_t gfp,
struct shmem_inode_info *info, unsigned long idx)
{
+ struct mempolicy mpol, *spol;
struct vm_area_struct pvma;
struct page *page;
+ spol = mpol_cond_assign(&mpol,
+ mpol_shared_policy_lookup(&info->policy, idx));
+
/* Create a pseudo vma that just contains the policy */
pvma.vm_start = 0;
pvma.vm_pgoff = idx;
pvma.vm_ops = NULL;
- pvma.vm_policy = mpol_shared_policy_lookup(&info->policy, idx);
+ pvma.vm_policy = spol;
page = swapin_readahead(entry, gfp, &pvma, 0);
- mpol_free(pvma.vm_policy);
return page;
}
@@ -1065,16 +1068,17 @@ static struct page *shmem_alloc_page(gfp
struct shmem_inode_info *info, unsigned long idx)
{
struct vm_area_struct pvma;
- struct page *page;
/* Create a pseudo vma that just contains the policy */
pvma.vm_start = 0;
pvma.vm_pgoff = idx;
pvma.vm_ops = NULL;
pvma.vm_policy = mpol_shared_policy_lookup(&info->policy, idx);
- page = alloc_page_vma(gfp, &pvma, 0);
- mpol_free(pvma.vm_policy);
- return page;
+
+ /*
+ * alloc_page_vma() will drop the shared policy reference
+ */
+ return alloc_page_vma(gfp, &pvma, 0);
}
#else
static inline int shmem_parse_mpol(char *value, int *policy,
Index: Linux/include/linux/mempolicy.h
===================================================================
--- Linux.orig/include/linux/mempolicy.h 2007-12-06 14:17:40.000000000 -0500
+++ Linux/include/linux/mempolicy.h 2007-12-06 14:18:34.000000000 -0500
@@ -99,6 +99,38 @@ static inline void mpol_free(struct memp
__mpol_free(pol);
}
+/*
+ * does policy need explicit unref after use?
+ * currently only needed for shared policies.
+ */
+static inline int mpol_needs_cond_ref(struct mempolicy *pol)
+{
+ return (pol && (pol->mode & MPOL_SHARED));
+}
+
+static inline void mpol_cond_free(struct mempolicy *pol)
+{
+ if (mpol_needs_cond_ref(pol))
+ __mpol_free(pol);
+}
+
+/*
+ * Assign *@frompol to *@tompol if conditional ref needed, eliminate the
+ * MPOL_* flags that require conditional ref and drop the extra ref.
+ * Use @tompol for, e.g., multiple allocations with a single policy lookup.
+ */
+static inline struct mempolicy *mpol_cond_assign(struct mempolicy *tompol,
+ struct mempolicy *frompol)
+{
+ if (!mpol_needs_cond_ref(frompol))
+ return frompol;
+
+ *tompol = *frompol;
+ tompol->mode &= ~MPOL_SHARED;
+ __mpol_free(frompol);
+ return tompol;
+}
+
extern struct mempolicy *__mpol_copy(struct mempolicy *pol);
static inline struct mempolicy *mpol_copy(struct mempolicy *pol)
{
@@ -196,6 +228,16 @@ static inline void mpol_free(struct memp
{
}
+static inline void mpol_cond_free(struct mempolicy *pol)
+{
+}
+
+static inline struct mempolicy *mpol_cond_assign(struct mempolicy *to,
+ struct mempolicy *from)
+{
+ return from;
+}
+
static inline void mpol_get(struct mempolicy *pol)
{
}
Index: Linux/mm/hugetlb.c
===================================================================
--- Linux.orig/mm/hugetlb.c 2007-12-06 14:17:40.000000000 -0500
+++ Linux/mm/hugetlb.c 2007-12-06 14:18:34.000000000 -0500
@@ -95,7 +95,7 @@ static struct page *dequeue_huge_page(st
break;
}
}
- mpol_free(mpol); /* unref if mpol !NULL */
+ mpol_cond_free(mpol);
return page;
}
Index: Linux/Documentation/vm/numa_memory_policy.txt
===================================================================
--- Linux.orig/Documentation/vm/numa_memory_policy.txt 2007-12-06 14:18:27.000000000 -0500
+++ Linux/Documentation/vm/numa_memory_policy.txt 2007-12-06 14:18:34.000000000 -0500
@@ -227,6 +227,75 @@ Components of Memory Policies
the temporary interleaved system default policy works in this
mode.
+MEMORY POLICY REFERENCE COUNTING
+
+To resolve use/free races, struct mempolicy contains an atomic reference
+count field. The internal interfaces mpol_get() and mpol_free() increment and
+decrement this reference count, respectively. mpol_free() will only free
+the structure back to the mempolicy kmem cache when the reference count
+goes to zero.
+
+When a new memory policy is allocated, its reference count is initialized
+to '1', representing the reference held by the task that is installing the
+new policy. When a pointer to a memory policy structure is stored in another
+structure, another reference is added, as the task's reference will be dropped
+on completion of the policy installation.
+
+During run-time "usage" of the policy, we attempt to minimize atomic operations
+on the reference count, as this can lead to cache lines bouncing between cpus
+and NUMA nodes. "Usage" here means one of the following:
+
+1) querying of the policy, either by the task itself [using the get_mempolicy()
+ API discussed below] or by another task using the /proc/<pid>/numa_maps
+ interface.
+
+2) examination of the policy to determine the policy mode and associated node
+ or node lists, if any, for page allocation. This is considered a "hot
+ path". Note that for MPOL_BIND, the "usage" extends across the entire
+ allocation process, which may sleep during page reclamation, because the
+ BIND policy has a custom node list containing the nodes specified by the
+ policy.
+
+We can avoid taking an extra reference during the usages listed above as
+follows:
+
+1) we never need to get/free the system default policy as this is never
+ changed nor freed, once the system is up and running.
+
+2) for querying the policy, we do not need to take an extra reference on the
+ target task's task policy or vma policies because we always acquire the
+ task's mm's mmap_sem for read during the query. The set_mempolicy() and
+ mbind() APIs [see below] always acquire the mmap_sem for write when
+ installing or replacing task or vma policies. Thus, there is no possibility
+ of a task or thread freeing a policy while another task or thread is
+ querying it.
+
+3) Page allocation usage of task or vma policy occurs in the fault path where
+ we hold the mmap_sem for read. Again, because replacing the task or vma
+ policy requires that the mmap_sem be held for write, the policy can't be
+ freed out from under us while we're using it for page allocation.
+
+4) Shared policies require special consideration. One task can replace a
+ shared memory policy while another task, with a distinct mmap_sem, is
+ querying or allocating a page based on the policy. To resolve this
+ potential race, the shared policy infrastructure adds an extra reference
+ to the shared policy during lookup while holding a spin lock on the shared
+ policy management structure. This requires that we drop this extra
+ reference when we're finished "using" the policy. We must drop the
+ extra reference on shared policies in the same query/allocation paths
+ used for non-shared policies. For this reason, shared policies are marked
+ as such, and the extra reference is dropped "conditionally"--i.e., only
+ for shared policies.
+
+ Because of this extra reference counting, and because we must lookup
+ shared policies in a tree structure under spinlock, shared policies are
+ more expensive to use in the page allocation path. This is especially
+ true for shared policies on memory regions shared by tasks running
+ on different NUMA nodes. This extra overhead can be avoided by always
+ falling back to task or system default policy for shared memory regions,
+ or by prefaulting the entire shared memory region into memory and locking
+ it down. However, this might not be appropriate for all applications.
+
MEMORY POLICY APIs
Linux supports 3 system calls for controlling memory policy. These APIs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCH/RFC 6/8] Mem Policy: Use MPOL_PREFERRED for system-wide default policy
2007-12-06 21:20 [PATCH/RFC 0/8] Mem Policy: More Reference Counting/Fallback Fixes and Misc Cleanups Lee Schermerhorn
` (4 preceding siblings ...)
2007-12-06 21:21 ` [PATCH/RFC 5/8] Mem Policy: Rework mempolicy Reference Counting [yet again] Lee Schermerhorn
@ 2007-12-06 21:21 ` Lee Schermerhorn
2007-12-06 21:21 ` [PATCH/RFC 7/8] Mem Policy: MPOL_PREFERRED cleanups for "local allocation" Lee Schermerhorn
2007-12-06 21:21 ` [PATCH/RFC 8/8] Mem Policy: Fix up MPOL_BIND documentation Lee Schermerhorn
7 siblings, 0 replies; 11+ messages in thread
From: Lee Schermerhorn @ 2007-12-06 21:21 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, clameter, ak, eric.whitney, mel
PATCH/RFC 06/08 Mem Policy: Use MPOL_PREFERRED for system-wide default policy
Against: 2.6.24-rc2-mm1
V2 -> V3:
+ mpol_to_str(): show "default" policy when &default_policy is
passed in, rather than the details of the default_policy, in
/proc/<pid>/numa_maps.
V1 -> V2:
+ restore BUG()s in switch(policy) default cases -- per
Christoph
+ eliminate unneeded re-init of struct mempolicy policy member
before freeing
Currently, when one specifies MPOL_DEFAULT via a NUMA memory
policy API [set_mempolicy(), mbind() and internal versions],
the kernel simply installs a NULL struct mempolicy pointer in
the appropriate context: task policy, vma policy, or shared
policy. This causes any use of that policy to "fall back" to
the next most specific policy scope.
The only use of MPOL_DEFAULT to mean "local allocation" is in
the system default policy. This requires extra checks/cases
for MPOL_DEFAULT in many mempolicy.c functions.
There is another, "preferred" way to specify local allocation via
the APIs. That is using the MPOL_PREFERRED policy mode with an
empty nodemask. Internally, the empty nodemask gets converted to
a preferred_node id of '-1'. All internal usage of MPOL_PREFERRED
will convert the '-1' to the id of the node local to the cpu
where the allocation occurs.
System default policy, except during boot, is hard-coded to
"local allocation". By using the MPOL_PREFERRED mode with a
negative value of preferred node for system default policy,
MPOL_DEFAULT will never occur in the 'policy' member of a
struct mempolicy. Thus, we can remove all checks for
MPOL_DEFAULT when converting policy to a node id/zonelist in
the allocation paths.
In slab_node() return local node id when policy pointer is NULL.
No need to set a pol value to take the switch default. Replace
switch default with BUG()--i.e., shouldn't happen.
With this patch MPOL_DEFAULT is only used in the APIs, including
internal calls to do_set_mempolicy() and in the display of policy
in /proc/<pid>/numa_maps. It always means "fall back" to the
next most specific policy scope. This simplifies the description
of memory policies quite a bit, with no visible change in behavior.
This patch updates Documentation to reflect this change.
Tested with set_mempolicy() using numactl with memtoy, and
tested mbind() with memtoy. All seems to work "as expected".
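To make the convention concrete: after this patch, "local allocation" is always represented as MPOL_PREFERRED with preferred_node == -1, and MPOL_DEFAULT never reaches the allocation paths. A userspace sketch of the resolution step follows; numa_node_id() and the node ids are illustrative stand-ins, not kernel code.

```c
#include <assert.h>

enum { MPOL_DEFAULT, MPOL_PREFERRED, MPOL_BIND, MPOL_INTERLEAVE };

struct mempolicy {
	int mode;
	int preferred_node;	/* -1 means "local allocation" */
};

/* Stand-in for the kernel's numa_node_id(): node of the executing cpu. */
int numa_node_id(void)
{
	return 3;
}

/* System default policy: local allocation, expressed as MPOL_PREFERRED. */
struct mempolicy default_policy = { MPOL_PREFERRED, -1 };

/*
 * Resolve a policy to a node id the way the allocation paths do once
 * MPOL_DEFAULT can no longer appear in a struct mempolicy: a NULL
 * pointer falls back to the system default, and a negative preferred
 * node means "wherever we are running".
 */
int policy_node(struct mempolicy *pol)
{
	if (!pol)
		pol = &default_policy;
	if (pol->mode == MPOL_PREFERRED && pol->preferred_node >= 0)
		return pol->preferred_node;
	return numa_node_id();	/* other modes elided in this sketch */
}
```

With the hard-coded MPOL_DEFAULT cases gone, a NULL policy and an explicit "local" policy take the same path.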
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Documentation/vm/numa_memory_policy.txt | 70 ++++++++++++--------------------
mm/mempolicy.c | 38 +++++++++--------
2 files changed, 47 insertions(+), 61 deletions(-)
Index: Linux/mm/mempolicy.c
===================================================================
--- Linux.orig/mm/mempolicy.c 2007-12-06 14:18:34.000000000 -0500
+++ Linux/mm/mempolicy.c 2007-12-06 14:20:17.000000000 -0500
@@ -105,9 +105,13 @@ static struct kmem_cache *sn_cache;
policied. */
enum zone_type policy_zone = 0;
+/*
+ * run-time system-wide default policy => local allocation
+ */
struct mempolicy default_policy = {
.refcnt = ATOMIC_INIT(1), /* never free it */
- .mode = MPOL_DEFAULT,
+ .mode = MPOL_PREFERRED,
+ .v = { .preferred_node = -1 },
};
static struct mempolicy *get_vma_policy(struct task_struct *task,
@@ -166,7 +170,8 @@ static struct mempolicy *mpol_new(int mo
mode, nodes ? nodes_addr(*nodes)[0] : -1);
if (mode == MPOL_DEFAULT)
- return NULL;
+ return NULL; /* simply delete any existing policy */
+
policy = kmem_cache_alloc(policy_cache, GFP_KERNEL);
if (!policy)
return ERR_PTR(-ENOMEM);
@@ -484,8 +489,6 @@ static void get_zonemask(struct mempolic
{
nodes_clear(*nodes);
switch (policy_mode(p)) {
- case MPOL_DEFAULT:
- break;
case MPOL_BIND:
/* Fall through */
case MPOL_INTERLEAVE:
@@ -1130,8 +1133,7 @@ static struct mempolicy *get_vma_policy(
addr);
if (vpol)
pol = vpol;
- } else if (vma->vm_policy &&
- policy_mode(vma->vm_policy) != MPOL_DEFAULT)
+ } else if (vma->vm_policy)
pol = vma->vm_policy;
}
if (!pol)
@@ -1175,7 +1177,6 @@ static struct zonelist *zonelist_policy(
nd = first_node(policy->v.nodes);
break;
case MPOL_INTERLEAVE: /* should not happen */
- case MPOL_DEFAULT:
nd = numa_node_id();
break;
default:
@@ -1209,9 +1210,10 @@ static unsigned interleave_nodes(struct
*/
unsigned slab_node(struct mempolicy *policy)
{
- int pol = policy ? policy_mode(policy) : MPOL_DEFAULT;
+ if (!policy)
+ return numa_node_id();
- switch (pol) {
+ switch (policy_mode(policy)) {
case MPOL_INTERLEAVE:
return interleave_nodes(policy);
@@ -1232,10 +1234,10 @@ unsigned slab_node(struct mempolicy *pol
case MPOL_PREFERRED:
if (policy->v.preferred_node >= 0)
return policy->v.preferred_node;
- /* Fall through */
+ return numa_node_id();
default:
- return numa_node_id();
+ BUG();
}
}
@@ -1450,8 +1452,6 @@ int __mpol_equal(struct mempolicy *a, st
return 0;
switch (policy_mode(a)) {
- case MPOL_DEFAULT:
- return 1;
case MPOL_BIND:
/* Fall through */
case MPOL_INTERLEAVE:
@@ -1469,7 +1469,6 @@ void __mpol_free(struct mempolicy *p)
{
if (!atomic_dec_and_test(&p->refcnt))
return;
- p->mode = MPOL_DEFAULT;
kmem_cache_free(policy_cache, p);
}
@@ -1637,7 +1636,7 @@ void mpol_shared_policy_init(struct shar
if (policy != MPOL_DEFAULT) {
struct mempolicy *newpol;
- /* Falls back to MPOL_DEFAULT on any error */
+ /* Falls back to NULL policy [MPOL_DEFAULT] on any error */
newpol = mpol_new(policy, policy_nodes);
if (!IS_ERR(newpol)) {
/* Create pseudo-vma that contains just the policy */
@@ -1758,8 +1757,6 @@ static void mpol_rebind_policy(struct me
return;
switch (policy_mode(pol)) {
- case MPOL_DEFAULT:
- break;
case MPOL_BIND:
/* Fall through */
case MPOL_INTERLEAVE:
@@ -1823,7 +1820,12 @@ static inline int mpol_to_str(char *buff
char *p = buffer;
int l;
nodemask_t nodes;
- int mode = pol ? policy_mode(pol) : MPOL_DEFAULT;
+ int mode;
+
+ if (!pol || pol == &default_policy)
+ mode = MPOL_DEFAULT;
+ else
+ mode = policy_mode(pol);
switch (mode) {
case MPOL_DEFAULT:
Index: Linux/Documentation/vm/numa_memory_policy.txt
===================================================================
--- Linux.orig/Documentation/vm/numa_memory_policy.txt 2007-12-06 14:18:34.000000000 -0500
+++ Linux/Documentation/vm/numa_memory_policy.txt 2007-12-06 14:18:39.000000000 -0500
@@ -145,63 +145,47 @@ Components of Memory Policies
Linux memory policy supports the following 4 behavioral modes:
- Default Mode--MPOL_DEFAULT: The behavior specified by this mode is
- context or scope dependent.
+ Default Mode--MPOL_DEFAULT: This mode is only used in the memory
+ policy APIs. Internally, MPOL_DEFAULT is converted to the NULL
+ memory policy in all policy scopes. Any existing non-default policy
+ will simply be removed when MPOL_DEFAULT is specified. As a result,
+ MPOL_DEFAULT means "fall back to the next most specific policy scope."
+
+ For example, a NULL or default task policy will fall back to the
+ system default policy. A NULL or default vma policy will fall
+ back to the task policy.
- As mentioned in the Policy Scope section above, during normal
- system operation, the System Default Policy is hard coded to
- contain the Default mode.
-
- In this context, default mode means "local" allocation--that is
- attempt to allocate the page from the node associated with the cpu
- where the fault occurs. If the "local" node has no memory, or the
- node's memory can be exhausted [no free pages available], local
- allocation will "fallback to"--attempt to allocate pages from--
- "nearby" nodes, in order of increasing "distance".
-
- Implementation detail -- subject to change: "Fallback" uses
- a per node list of sibling nodes--called zonelists--built at
- boot time, or when nodes or memory are added or removed from
- the system [memory hotplug]. These per node zonelist are
- constructed with nodes in order of increasing distance based
- on information provided by the platform firmware.
-
- When a task/process policy or a shared policy contains the Default
- mode, this also means "local allocation", as described above.
-
- In the context of a VMA, Default mode means "fall back to task
- policy"--which may or may not specify Default mode. Thus, Default
- mode can not be counted on to mean local allocation when used
- on a non-shared region of the address space. However, see
- MPOL_PREFERRED below.
-
- The Default mode does not use the optional set of nodes.
+ When specified in one of the memory policy APIs, the Default mode
+ does not use the optional set of nodes.
MPOL_BIND: This mode specifies that memory must come from the
set of nodes specified by the policy.
The memory policy APIs do not specify an order in which the nodes
- will be searched. However, unlike "local allocation", the Bind
- policy does not consider the distance between the nodes. Rather,
- allocations will fallback to the nodes specified by the policy in
- order of numeric node id. Like everything in Linux, this is subject
- to change.
+ will be searched. However, unlike "local allocation" discussed
+ below, the Bind policy does not consider the distance between the
+ nodes. Rather, allocations will fallback to the nodes specified
+ by the policy in order of numeric node id. Like everything in
+ Linux, this is subject to change.
MPOL_PREFERRED: This mode specifies that the allocation should be
attempted from the single node specified in the policy. If that
- allocation fails, the kernel will search other nodes, exactly as
- it would for a local allocation that started at the preferred node
- in increasing distance from the preferred node. "Local" allocation
- policy can be viewed as a Preferred policy that starts at the node
- containing the cpu where the allocation takes place.
+ allocation fails, the kernel will search other nodes, in order of
+ increasing distance from the preferred node based on information
+ provided by the platform firmware.
Internally, the Preferred policy uses a single node--the
preferred_node member of struct mempolicy. A "distinguished"
value of this preferred_node, currently '-1', is interpreted
as "the node containing the cpu where the allocation takes
- place"--local allocation. This is the way to specify
- local allocation for a specific range of addresses--i.e. for
- VMA policies.
+ place"--local allocation. "Local" allocation policy can be
+ viewed as a Preferred policy that starts at the node containing
+ the cpu where the allocation takes place.
+
+ As mentioned in the Policy Scope section above, during normal
+ system operation, the System Default Policy is hard coded to
+ specify "local allocation". This policy uses the Preferred
+ policy with the special negative value of preferred_node.
MPOL_INTERLEAVED: This mode specifies that page allocations be
interleaved, on a page granularity, across the nodes specified in
--
* [PATCH/RFC 7/8] Mem Policy: MPOL_PREFERRED cleanups for "local allocation"
2007-12-06 21:20 [PATCH/RFC 0/8] Mem Policy: More Reference Counting/Fallback Fixes and Misc Cleanups Lee Schermerhorn
` (5 preceding siblings ...)
2007-12-06 21:21 ` [PATCH/RFC 6/8] Mem Policy: Use MPOL_PREFERRED for system-wide default policy Lee Schermerhorn
@ 2007-12-06 21:21 ` Lee Schermerhorn
2007-12-06 21:21 ` [PATCH/RFC 8/8] Mem Policy: Fix up MPOL_BIND documentation Lee Schermerhorn
7 siblings, 0 replies; 11+ messages in thread
From: Lee Schermerhorn @ 2007-12-06 21:21 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, ak, mel, clameter, eric.whitney
PATCH/RFC 07/08 - Mem Policy: MPOL_PREFERRED cleanups for "local allocation" - V5
Against: 2.6.24-rc2-mm1
V4 -> V5:
+ change mpol_to_str() to show "local" policy for MPOL_PREFERRED with
preferred_node == -1. libnuma wrappers and numactl use the term
"local allocation", so let's use it here.
V3 -> V4:
+ updated Documentation/vm/numa_memory_policy.txt to better explain
[I think] the "local allocation" feature of MPOL_PREFERRED.
V2 -> V3:
+ renamed get_nodemask() to get_policy_nodemask() to more closely
match what it's doing.
V1 -> V2:
+ renamed get_zonemask() to get_nodemask(). Mel Gorman suggested this
was a valid "cleanup".
Here are a couple of "cleanups" for MPOL_PREFERRED behavior
when v.preferred_node < 0 -- i.e., "local allocation":
1) [do_]get_mempolicy() calls the now renamed get_policy_nodemask()
to fetch the nodemask associated with a policy. Currently,
get_policy_nodemask() returns the set of nodes with memory, when
the policy 'mode' is 'PREFERRED, and the preferred_node is < 0.
Return the set of allowed nodes instead. This will already have
been masked to include only nodes with memory.
2) When a task is moved into a [new] cpuset, mpol_rebind_policy() is
called to adjust any task and vma policy nodes to be valid in the
new cpuset. However, when the policy is MPOL_PREFERRED, and the
preferred_node is <0, no rebind is necessary. The "local allocation"
indication is valid in any cpuset. Existing code will "do the right
thing" because node_remap() will just return the argument node when
it is outside of the valid range of node ids. However, I think it is
clearer and cleaner to skip the remap explicitly in this case.
3) mpol_to_str() produces a printable, "human readable" string from a
struct mempolicy. For MPOL_PREFERRED with preferred_node <0, show
"local", as this indicates local allocation, as the task migrates
among nodes. Note that this matches the usage of "local allocation"
in libnuma() and numactl. Without this change, I believe that node_set()
[via set_bit()] will set bit 31, resulting in a misleading display.
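Item 3 amounts to remapping the pseudo-policy before indexing the name table, as in this compressed userspace sketch (the table contents mirror the patch; buffer-length handling is omitted):

```c
#include <assert.h>
#include <string.h>

enum { MPOL_DEFAULT, MPOL_PREFERRED, MPOL_BIND, MPOL_INTERLEAVE };

/* pseudo-policy: MPOL_PREFERRED with preferred_node == -1, display only */
#define MPOL_LOCAL (MPOL_INTERLEAVE + 1)

const char * const policy_types[] =
	{ "default", "prefer", "bind", "interleave", "local" };

/* Pick the display name; a negative preferred node shows as "local". */
const char *policy_name(int mode, int preferred_node)
{
	if (mode == MPOL_PREFERRED && preferred_node < 0)
		mode = MPOL_LOCAL;
	return policy_types[mode];
}
```

Because MPOL_LOCAL is display-only, it never appears in a struct mempolicy; only the string table knows about it.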
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
mm/mempolicy.c | 47 +++++++++++++++++++++++++++++++++++------------
1 file changed, 35 insertions(+), 12 deletions(-)
Index: Linux/mm/mempolicy.c
===================================================================
--- Linux.orig/mm/mempolicy.c 2007-11-21 11:28:33.000000000 -0500
+++ Linux/mm/mempolicy.c 2007-11-21 11:30:17.000000000 -0500
@@ -484,10 +484,13 @@ static long do_set_mempolicy(int mode, n
return 0;
}
-/* Fill a zone bitmap for a policy */
-static void get_zonemask(struct mempolicy *p, nodemask_t *nodes)
+/*
+ * Fill a zone bitmap for a policy for mempolicy query
+ */
+static void get_policy_nodemask(struct mempolicy *p, nodemask_t *nodes)
{
nodes_clear(*nodes);
+
switch (policy_mode(p)) {
case MPOL_BIND:
/* Fall through */
@@ -495,9 +498,11 @@ static void get_zonemask(struct mempolic
*nodes = p->v.nodes;
break;
case MPOL_PREFERRED:
- /* or use current node instead of memory_map? */
+ /*
+ * for "local policy", return allowed memories
+ */
if (p->v.preferred_node < 0)
- *nodes = node_states[N_HIGH_MEMORY];
+ *nodes = cpuset_current_mems_allowed;
else
node_set(p->v.preferred_node, *nodes);
break;
@@ -578,7 +583,7 @@ static long do_get_mempolicy(int *policy
err = 0;
if (nmask)
- get_zonemask(pol, nmask);
+ get_policy_nodemask(pol, nmask);
out:
mpol_cond_free(pol);
@@ -643,7 +648,7 @@ int do_migrate_pages(struct mm_struct *m
int err = 0;
nodemask_t tmp;
- down_read(&mm->mmap_sem);
+ down_read(&mm->mmap_sem);
err = migrate_vmas(mm, from_nodes, to_nodes, flags);
if (err)
@@ -1749,6 +1754,7 @@ static void mpol_rebind_policy(struct me
{
nodemask_t *mpolmask;
nodemask_t tmp;
+ int nid;
if (!pol)
return;
@@ -1767,9 +1773,15 @@ static void mpol_rebind_policy(struct me
*mpolmask, *newmask);
break;
case MPOL_PREFERRED:
- pol->v.preferred_node = node_remap(pol->v.preferred_node,
+ /*
+ * no need to remap "local policy"
+ */
+ nid = pol->v.preferred_node;
+ if (nid >= 0) {
+ pol->v.preferred_node = node_remap(nid,
*mpolmask, *newmask);
- *mpolmask = *newmask;
+ *mpolmask = *newmask;
+ }
break;
default:
BUG();
@@ -1807,8 +1819,13 @@ void mpol_rebind_mm(struct mm_struct *mm
* Display pages allocated per node and memory policy via /proc.
*/
+/*
+ * "local" is pseudo-policy: MPOL_PREFERRED with preferred_node == -1
+ * Used only for mpol_to_str()
+ */
+#define MPOL_LOCAL (MPOL_INTERLEAVE + 1)
static const char * const policy_types[] =
- { "default", "prefer", "bind", "interleave" };
+ { "default", "prefer", "bind", "interleave", "local" };
/*
* Convert a mempolicy into a string.
@@ -1818,6 +1835,7 @@ static const char * const policy_types[]
static inline int mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
{
char *p = buffer;
+ int nid;
int l;
nodemask_t nodes;
int mode;
@@ -1834,7 +1852,12 @@ static inline int mpol_to_str(char *buff
case MPOL_PREFERRED:
nodes_clear(nodes);
- node_set(pol->v.preferred_node, nodes);
+ nid = pol->v.preferred_node;
+ if (nid < 0)
+ mode = MPOL_LOCAL;
+ else {
+ node_set(nid, nodes);
+ }
break;
case MPOL_BIND:
@@ -1849,8 +1872,8 @@ static inline int mpol_to_str(char *buff
}
l = strlen(policy_types[mode]);
- if (buffer + maxlen < p + l + 1)
- return -ENOSPC;
+ if (buffer + maxlen < p + l + 1)
+ return -ENOSPC;
strcpy(p, policy_types[mode]);
p += l;
--
* [PATCH/RFC 8/8] Mem Policy: Fix up MPOL_BIND documentation
2007-12-06 21:20 [PATCH/RFC 0/8] Mem Policy: More Reference Counting/Fallback Fixes and Misc Cleanups Lee Schermerhorn
` (6 preceding siblings ...)
2007-12-06 21:21 ` [PATCH/RFC 7/8] Mem Policy: MPOL_PREFERRED cleanups for "local allocation" Lee Schermerhorn
@ 2007-12-06 21:21 ` Lee Schermerhorn
7 siblings, 0 replies; 11+ messages in thread
From: Lee Schermerhorn @ 2007-12-06 21:21 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, ak, eric.whitney, clameter, mel
PATCH/RFC 08/08 - Mem Policy: Fix up MPOL_BIND documentation
Against: 2.6.24-rc4-mm1
With Mel Gorman's "twozonelist" patch series, the MPOL_BIND mode will
search the bind nodemask in order of distance from the node on which
the allocation is performed. Update the mempolicy document to reflect
this [desirable] change.
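The behavioral difference can be sketched in userspace: with the twozonelist ordering, the bind set is tried in order of increasing distance from the allocating node rather than in numeric node-id order. The distance table below is made up for illustration; real distances come from platform firmware.

```c
#include <assert.h>
#include <limits.h>

#define MAX_NODES 4

/* Illustrative node-distance table [firmware-provided in reality]. */
int node_distance[MAX_NODES][MAX_NODES] = {
	{ 10, 20, 40, 40 },
	{ 20, 10, 40, 40 },
	{ 40, 40, 10, 20 },
	{ 40, 40, 20, 10 },
};

/*
 * With the twozonelist ordering, MPOL_BIND first tries the allowed
 * node closest to the local node, not the lowest-numbered one.
 * 'allowed' is a bitmask of nodes in the bind set.
 */
int bind_first_node(int local, unsigned allowed)
{
	int best = -1, best_dist = INT_MAX, nid;

	for (nid = 0; nid < MAX_NODES; nid++) {
		if (!(allowed & (1u << nid)))
			continue;
		if (node_distance[local][nid] < best_dist) {
			best_dist = node_distance[local][nid];
			best = nid;
		}
	}
	return best;
}
```

Under the previous numeric-id fallback, a task on node 2 bound to nodes {0, 3} would have tried node 0 first; with distance ordering it tries node 3.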
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Documentation/vm/numa_memory_policy.txt | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
Index: Linux/Documentation/vm/numa_memory_policy.txt
===================================================================
--- Linux.orig/Documentation/vm/numa_memory_policy.txt 2007-12-06 14:18:39.000000000 -0500
+++ Linux/Documentation/vm/numa_memory_policy.txt 2007-12-06 14:27:07.000000000 -0500
@@ -162,11 +162,10 @@ Components of Memory Policies
set of nodes specified by the policy.
The memory policy APIs do not specify an order in which the nodes
- will be searched. However, unlike "local allocation" discussed
- below, the Bind policy does not consider the distance between the
- nodes. Rather, allocations will fallback to the nodes specified
- by the policy in order of numeric node id. Like everything in
- Linux, this is subject to change.
+ will be searched. However, the Bind policy will allocate a page
+ from the node in the specified set of nodes that is closest to the
+ node on which the task performing the allocation is executing and
+ that contains a free page that satisfies the request.
MPOL_PREFERRED: This mode specifies that the allocation should be
attempted from the single node specified in the policy. If that
--
* Re: [PATCH/RFC 1/8] Mem Policy: Write lock mmap_sem while changing task mempolicy
2007-12-06 21:20 ` [PATCH/RFC 1/8] Mem Policy: Write lock mmap_sem while changing task mempolicy Lee Schermerhorn
@ 2007-12-06 21:24 ` Andi Kleen
2007-12-06 21:34 ` Lee Schermerhorn
0 siblings, 1 reply; 11+ messages in thread
From: Andi Kleen @ 2007-12-06 21:24 UTC (permalink / raw)
To: Lee Schermerhorn; +Cc: linux-mm, akpm, mel, eric.whitney, clameter
On Thursday 06 December 2007 22:20:53 Lee Schermerhorn wrote:
> PATCH/RFC 01/08 Mem Policy: Write lock mmap_sem while changing task mempolicy
>
> Against: 2.6.24-rc2-mm1
>
> A read of /proc/<pid>/numa_maps holds the target task's mmap_sem
> for read while examining each vma's mempolicy. A vma's mempolicy
> can fall back to the task's policy. However, the task could be
> changing its task policy and free the one that the show_numa_maps()
> is examining.
But do_set_mempolicy() doesn't actually modify the mempolicy. It just
replaces it using what is essentially copy-on-write.
If the numa_maps holds a proper reference count (I haven't
checked if it does) it can keep the old unmodified one as long as it wants.
I don't think a write lock is needed.
-Andi
* Re: [PATCH/RFC 1/8] Mem Policy: Write lock mmap_sem while changing task mempolicy
2007-12-06 21:24 ` Andi Kleen
@ 2007-12-06 21:34 ` Lee Schermerhorn
0 siblings, 0 replies; 11+ messages in thread
From: Lee Schermerhorn @ 2007-12-06 21:34 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-mm, akpm, mel, eric.whitney, clameter
On Thu, 2007-12-06 at 22:24 +0100, Andi Kleen wrote:
> On Thursday 06 December 2007 22:20:53 Lee Schermerhorn wrote:
> > PATCH/RFC 01/08 Mem Policy: Write lock mmap_sem while changing task mempolicy
> >
> > Against: 2.6.24-rc2-mm1
> >
> > A read of /proc/<pid>/numa_maps holds the target task's mmap_sem
> > for read while examining each vma's mempolicy. A vma's mempolicy
> > can fall back to the task's policy. However, the task could be
> > changing its task policy and free the one that show_numa_maps()
> > is examining.
>
> But do_set_mempolicy doesn't actually modify the mempolicy. It just
> replaces it, essentially copy-on-write.
>
> If the numa_maps holds a proper reference count (I haven't
> checked if it does) it can keep the old unmodified one as long as it wants.
>
> I don't think a write lock is needed.
Hi, Andi.
You are correct. But Christoph wants to avoid incrementing the
reference count in as many cases as possible, and to simplify or
eliminate the tests of whether the increment/decrement is required.
numa_maps isn't a performance path, but it uses get_vma_policy()--the same as
page allocation. If you look at patch 5, you'll see that I've
eliminated all extra references in this path, except the one taken by
shared policy lookup. The new 'mpol_cond_free()' [and
mpol_need_cond_unref() helper--used in alloc_page_vma() "slow path"]
only trigger for shared policy now. This is a single test of the
mempolicy mode [formerly policy] member that should be cache hot, if not
actually in a register.
This was what Christoph was trying to achieve, I think.
Lee
2007-12-06 21:20 [PATCH/RFC 0/8] Mem Policy: More Reference Counting/Fallback Fixes and Misc Cleanups Lee Schermerhorn
2007-12-06 21:20 ` [PATCH/RFC 1/8] Mem Policy: Write lock mmap_sem while changing task mempolicy Lee Schermerhorn
2007-12-06 21:24 ` Andi Kleen
2007-12-06 21:34 ` Lee Schermerhorn
2007-12-06 21:20 ` [PATCH/RFC 2/8] Mem Policy: Fixup Fallback for Default Shmem Policy Lee Schermerhorn
2007-12-06 21:21 ` [PATCH/RFC 3/8] Mem Policy: Mark shared policies for unref Lee Schermerhorn
2007-12-06 21:21 ` [PATCH/RFC 4/8] Mem Policy: Document {set|get}_policy() vm_ops APIs Lee Schermerhorn
2007-12-06 21:21 ` [PATCH/RFC 5/8] Mem Policy: Rework mempolicy Reference Counting [yet again] Lee Schermerhorn
2007-12-06 21:21 ` [PATCH/RFC 6/8] Mem Policy: Use MPOL_PREFERRED for system-wide default policy Lee Schermerhorn
2007-12-06 21:21 ` [PATCH/RFC 7/8] Mem Policy: MPOL_PREFERRED cleanups for "local allocation" Lee Schermerhorn
2007-12-06 21:21 ` [PATCH/RFC 8/8] Mem Policy: Fix up MPOL_BIND documentation Lee Schermerhorn