* [PATCH 0/5] Memory policy corruption fixes V2
@ 2012-08-20 16:36 Mel Gorman
2012-08-20 16:36 ` [PATCH 1/5] Revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages" Mel Gorman
` (6 more replies)
0 siblings, 7 replies; 28+ messages in thread
From: Mel Gorman @ 2012-08-20 16:36 UTC (permalink / raw)
To: Andrew Morton, KOSAKI Motohiro
Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
Hugh Dickins, LKML, Linux-MM, Mel Gorman
This is a rebase with some small changes to Kosaki's "mempolicy memory
corruption fixlet" series. I had expected that Kosaki would have revised
the series by now but it's been waiting a long time.
Changelog since V1
o Rebase to 3.6-rc2
o Editted some of the changelogs
o Converted sp->lock to sp->mutex to close a race in shared_policy_replace()
o Reworked the refcount imbalance fix slightly
o Do not call mpol_put in shmem_alloc_page.
I tested this with trinity with CONFIG_DEBUG_SLAB enabled and it passed. I
did not test LTP such as Josh reported a problem with or with a database that
used shared policies like Andi tested. The series is almost all Kosaki's
work of course. If he has a revised series that simply got delayed in
posting it should take precedence.
include/linux/mempolicy.h | 2 +-
mm/mempolicy.c | 142 +++++++++++++++++++++++++++++----------------
2 files changed, 93 insertions(+), 51 deletions(-)
--
1.7.7
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 1/5] Revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages"
2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
@ 2012-08-20 16:36 ` Mel Gorman
2012-08-20 16:36 ` [PATCH 2/5] mempolicy: Remove mempolicy sharing Mel Gorman
` (5 subsequent siblings)
6 siblings, 0 replies; 28+ messages in thread
From: Mel Gorman @ 2012-08-20 16:36 UTC (permalink / raw)
To: Andrew Morton, KOSAKI Motohiro
Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
Hugh Dickins, LKML, Linux-MM, Mel Gorman
From: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
commit [05f144a0: mm: mempolicy: Let vma_merge and vma_split handle
vma->vm_policy linkages] removed vma->vm_policy updates code but it is
the purpose of mbind_range(). Now, mbind_range() is virtually a no-op
and while it does not allow memory corruption it is not the right fix.
This patch is a revert.
[mgorman@suse.de: Edited changelog]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
mm/mempolicy.c | 41 ++++++++++++++++++++++++-----------------
1 files changed, 24 insertions(+), 17 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index bd92431..22dcf9a 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -607,6 +607,27 @@ check_range(struct mm_struct *mm, unsigned long start, unsigned long end,
return first;
}
+/* Apply policy to a single VMA */
+static int policy_vma(struct vm_area_struct *vma, struct mempolicy *new)
+{
+ int err = 0;
+ struct mempolicy *old = vma->vm_policy;
+
+ pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
+ vma->vm_start, vma->vm_end, vma->vm_pgoff,
+ vma->vm_ops, vma->vm_file,
+ vma->vm_ops ? vma->vm_ops->set_policy : NULL);
+
+ if (vma->vm_ops && vma->vm_ops->set_policy)
+ err = vma->vm_ops->set_policy(vma, new);
+ if (!err) {
+ mpol_get(new);
+ vma->vm_policy = new;
+ mpol_put(old);
+ }
+ return err;
+}
+
/* Step 2: apply policy to a range and do splits. */
static int mbind_range(struct mm_struct *mm, unsigned long start,
unsigned long end, struct mempolicy *new_pol)
@@ -655,23 +676,9 @@ static int mbind_range(struct mm_struct *mm, unsigned long start,
if (err)
goto out;
}
-
- /*
- * Apply policy to a single VMA. The reference counting of
- * policy for vma_policy linkages has already been handled by
- * vma_merge and split_vma as necessary. If this is a shared
- * policy then ->set_policy will increment the reference count
- * for an sp node.
- */
- pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
- vma->vm_start, vma->vm_end, vma->vm_pgoff,
- vma->vm_ops, vma->vm_file,
- vma->vm_ops ? vma->vm_ops->set_policy : NULL);
- if (vma->vm_ops && vma->vm_ops->set_policy) {
- err = vma->vm_ops->set_policy(vma, new_pol);
- if (err)
- goto out;
- }
+ err = policy_vma(vma, new_pol);
+ if (err)
+ goto out;
}
out:
--
1.7.7
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH 2/5] mempolicy: Remove mempolicy sharing
2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
2012-08-20 16:36 ` [PATCH 1/5] Revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages" Mel Gorman
@ 2012-08-20 16:36 ` Mel Gorman
2012-08-20 19:35 ` Christoph Lameter
2012-08-22 19:03 ` Andrew Morton
2012-08-20 16:36 ` [PATCH 3/5] mempolicy: fix a race in shared_policy_replace() Mel Gorman
` (4 subsequent siblings)
6 siblings, 2 replies; 28+ messages in thread
From: Mel Gorman @ 2012-08-20 16:36 UTC (permalink / raw)
To: Andrew Morton, KOSAKI Motohiro
Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
Hugh Dickins, LKML, Linux-MM, Mel Gorman
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Dave Jones' system call fuzz testing tool "trinity" triggered the following
bug error with slab debugging enabled
[ 7613.229315] =============================================================================
[ 7613.229955] BUG numa_policy (Not tainted): Poison overwritten
[ 7613.230560] -----------------------------------------------------------------------------
[ 7613.230560]
[ 7613.231834] INFO: 0xffff880146498250-0xffff880146498250. First byte 0x6a instead of 0x6b
[ 7613.232518] INFO: Allocated in mpol_new+0xa3/0x140 age=46310 cpu=6 pid=32154
[ 7613.233188] __slab_alloc+0x3d3/0x445
[ 7613.233877] kmem_cache_alloc+0x29d/0x2b0
[ 7613.234564] mpol_new+0xa3/0x140
[ 7613.235236] sys_mbind+0x142/0x620
[ 7613.235929] system_call_fastpath+0x16/0x1b
[ 7613.236640] INFO: Freed in __mpol_put+0x27/0x30 age=46268 cpu=6 pid=32154
[ 7613.237354] __slab_free+0x2e/0x1de
[ 7613.238080] kmem_cache_free+0x25a/0x260
[ 7613.238799] __mpol_put+0x27/0x30
[ 7613.239515] remove_vma+0x68/0x90
[ 7613.240223] exit_mmap+0x118/0x140
[ 7613.240939] mmput+0x73/0x110
[ 7613.241651] exit_mm+0x108/0x130
[ 7613.242367] do_exit+0x162/0xb90
[ 7613.243074] do_group_exit+0x4f/0xc0
[ 7613.243790] sys_exit_group+0x17/0x20
[ 7613.244507] system_call_fastpath+0x16/0x1b
[ 7613.245212] INFO: Slab 0xffffea0005192600 objects=27 used=27 fp=0x (null) flags=0x20000000004080
[ 7613.246000] INFO: Object 0xffff880146498250 @offset=592 fp=0xffff88014649b9d0
The problem is that the structure is being prematurely freed due to a
reference count imbalance. In the following case mbind(addr, len) should
replace the memory policies of both vma1 and vma2 and thus they will
become to share the same mempolicy and the new mempolicy will have the
MPOL_F_SHARED flag.
+-------------------+-------------------+
| vma1 | vma2(shmem) |
+-------------------+-------------------+
| |
addr addr+len
alloc_pages_vma() uses get_vma_policy() and mpol_cond_put() pair for
maintaining the mempolicy reference count. The current rule is that
get_vma_policy() only increments refcount for shmem VMA and mpol_conf_put()
only decrements refcount if the policy has MPOL_F_SHARED.
In above case, vma1 is not shmem vma and vma->policy has MPOL_F_SHARED!
The reference count will be decreased even though was not increased whenever
alloc_page_vma() is called. This has been broken since commit [52cd3b07:
mempolicy: rework mempolicy Reference Counting] in 2008.
There is another serious bug with the sharing of memory policies. Currently,
mempolicy rebind logic (it is called from cpuset rebinding) ignores a refcount
of mempolicy and override it forcibly. Thus, any mempolicy sharing may cause
mempolicy corruption. The bug was introduced by commit [68860ec1: cpusets:
automatic numa mempolicy rebinding].
Ideally, the shared policy handling would be rewritten to either properly
handle COW of the policy structures or at least reference count MPOL_F_SHARED
based exclusively on information within the policy. However, this patch takes
the easier approach of disabling any policy sharing between VMAs. Each new
range allocated with sp_alloc will allocate a new policy, set the reference
count to 1 and drop the reference count of the old policy. This increases
the memory footprint but is not expected to be a major problem as mbind()
is unlikely to be used for fine-grained ranges. It is also inefficient
because it means we allocate a new policy even in cases where mbind_range()
could use the new_policy passed to it. However, it is more straight-forward
and the change should be invisible to the user.
[mgorman@suse.de: Edited changelog]
Reported-by: Dave Jones <davej@redhat.com>,
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>,
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
mm/mempolicy.c | 52 ++++++++++++++++++++++++++++++++++++++--------------
1 files changed, 38 insertions(+), 14 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 22dcf9a..8025720 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -607,24 +607,39 @@ check_range(struct mm_struct *mm, unsigned long start, unsigned long end,
return first;
}
-/* Apply policy to a single VMA */
-static int policy_vma(struct vm_area_struct *vma, struct mempolicy *new)
+/*
+ * Apply policy to a single VMA
+ * This must be called with the mmap_sem held for writing.
+ */
+static int vma_replace_policy(struct vm_area_struct *vma,
+ struct mempolicy *pol)
{
- int err = 0;
- struct mempolicy *old = vma->vm_policy;
+ int err;
+ struct mempolicy *old;
+ struct mempolicy *new;
pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
vma->vm_start, vma->vm_end, vma->vm_pgoff,
vma->vm_ops, vma->vm_file,
vma->vm_ops ? vma->vm_ops->set_policy : NULL);
- if (vma->vm_ops && vma->vm_ops->set_policy)
+ new = mpol_dup(pol);
+ if (IS_ERR(new))
+ return PTR_ERR(new);
+
+ if (vma->vm_ops && vma->vm_ops->set_policy) {
err = vma->vm_ops->set_policy(vma, new);
- if (!err) {
- mpol_get(new);
- vma->vm_policy = new;
- mpol_put(old);
+ if (err)
+ goto err_out;
}
+
+ old = vma->vm_policy;
+ vma->vm_policy = new; /* protected by mmap_sem */
+ mpol_put(old);
+
+ return 0;
+ err_out:
+ mpol_put(new);
return err;
}
@@ -676,7 +691,7 @@ static int mbind_range(struct mm_struct *mm, unsigned long start,
if (err)
goto out;
}
- err = policy_vma(vma, new_pol);
+ err = vma_replace_policy(vma, new_pol);
if (err)
goto out;
}
@@ -2153,15 +2168,24 @@ static void sp_delete(struct shared_policy *sp, struct sp_node *n)
static struct sp_node *sp_alloc(unsigned long start, unsigned long end,
struct mempolicy *pol)
{
- struct sp_node *n = kmem_cache_alloc(sn_cache, GFP_KERNEL);
+ struct sp_node *n;
+ struct mempolicy *newpol;
+ n = kmem_cache_alloc(sn_cache, GFP_KERNEL);
if (!n)
return NULL;
+
+ newpol = mpol_dup(pol);
+ if (IS_ERR(newpol)) {
+ kmem_cache_free(sn_cache, n);
+ return NULL;
+ }
+ newpol->flags |= MPOL_F_SHARED;
+
n->start = start;
n->end = end;
- mpol_get(pol);
- pol->flags |= MPOL_F_SHARED; /* for unref */
- n->policy = pol;
+ n->policy = newpol;
+
return n;
}
--
1.7.7
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH 3/5] mempolicy: fix a race in shared_policy_replace()
2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
2012-08-20 16:36 ` [PATCH 1/5] Revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages" Mel Gorman
2012-08-20 16:36 ` [PATCH 2/5] mempolicy: Remove mempolicy sharing Mel Gorman
@ 2012-08-20 16:36 ` Mel Gorman
2012-08-20 19:52 ` Christoph Lameter
2012-09-07 22:59 ` KOSAKI Motohiro
2012-08-20 16:36 ` [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy() Mel Gorman
` (3 subsequent siblings)
6 siblings, 2 replies; 28+ messages in thread
From: Mel Gorman @ 2012-08-20 16:36 UTC (permalink / raw)
To: Andrew Morton, KOSAKI Motohiro
Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
Hugh Dickins, LKML, Linux-MM, Mel Gorman
shared_policy_replace() use of sp_alloc() is unsafe. 1) sp_node cannot
be dereferenced if sp->lock is not held and 2) another thread can modify
sp_node between spin_unlock for allocating a new sp node and next spin_lock.
The bug was introduced before 2.6.12-rc2.
Kosaki's original patch for this problem was to allocate an sp node and policy
within shared_policy_replace and initialise it when the lock is reacquired. I
was not keen on this approach because it partially duplicates sp_alloc(). As
the paths were sp->lock is taken are not that performance critical this
patch converts sp->lock to sp->mutex so it can sleep when calling sp_alloc().
[kosaki.motohiro@jp.fujitsu.com: Original patch]
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
include/linux/mempolicy.h | 2 +-
mm/mempolicy.c | 37 ++++++++++++++++---------------------
2 files changed, 17 insertions(+), 22 deletions(-)
diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 95b738c..df08254 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -188,7 +188,7 @@ struct sp_node {
struct shared_policy {
struct rb_root root;
- spinlock_t lock;
+ struct mutex mutex;
};
void mpol_shared_policy_init(struct shared_policy *sp, struct mempolicy *mpol);
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 8025720..d297929 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2083,7 +2083,7 @@ bool __mpol_equal(struct mempolicy *a, struct mempolicy *b)
*/
/* lookup first element intersecting start-end */
-/* Caller holds sp->lock */
+/* Caller holds sp->mutex */
static struct sp_node *
sp_lookup(struct shared_policy *sp, unsigned long start, unsigned long end)
{
@@ -2147,13 +2147,13 @@ mpol_shared_policy_lookup(struct shared_policy *sp, unsigned long idx)
if (!sp->root.rb_node)
return NULL;
- spin_lock(&sp->lock);
+ mutex_lock(&sp->mutex);
sn = sp_lookup(sp, idx, idx+1);
if (sn) {
mpol_get(sn->policy);
pol = sn->policy;
}
- spin_unlock(&sp->lock);
+ mutex_unlock(&sp->mutex);
return pol;
}
@@ -2193,10 +2193,10 @@ static struct sp_node *sp_alloc(unsigned long start, unsigned long end,
static int shared_policy_replace(struct shared_policy *sp, unsigned long start,
unsigned long end, struct sp_node *new)
{
- struct sp_node *n, *new2 = NULL;
+ struct sp_node *n;
+ int ret = 0;
-restart:
- spin_lock(&sp->lock);
+ mutex_lock(&sp->mutex);
n = sp_lookup(sp, start, end);
/* Take care of old policies in the same range. */
while (n && n->start < end) {
@@ -2209,16 +2209,14 @@ restart:
} else {
/* Old policy spanning whole new range. */
if (n->end > end) {
+ struct sp_node *new2;
+ new2 = sp_alloc(end, n->end, n->policy);
if (!new2) {
- spin_unlock(&sp->lock);
- new2 = sp_alloc(end, n->end, n->policy);
- if (!new2)
- return -ENOMEM;
- goto restart;
+ ret = -ENOMEM;
+ goto out;
}
n->end = start;
sp_insert(sp, new2);
- new2 = NULL;
break;
} else
n->end = start;
@@ -2229,12 +2227,9 @@ restart:
}
if (new)
sp_insert(sp, new);
- spin_unlock(&sp->lock);
- if (new2) {
- mpol_put(new2->policy);
- kmem_cache_free(sn_cache, new2);
- }
- return 0;
+out:
+ mutex_unlock(&sp->mutex);
+ return ret;
}
/**
@@ -2252,7 +2247,7 @@ void mpol_shared_policy_init(struct shared_policy *sp, struct mempolicy *mpol)
int ret;
sp->root = RB_ROOT; /* empty tree == default mempolicy */
- spin_lock_init(&sp->lock);
+ mutex_init(&sp->mutex);
if (mpol) {
struct vm_area_struct pvma;
@@ -2318,7 +2313,7 @@ void mpol_free_shared_policy(struct shared_policy *p)
if (!p->root.rb_node)
return;
- spin_lock(&p->lock);
+ mutex_lock(&p->mutex);
next = rb_first(&p->root);
while (next) {
n = rb_entry(next, struct sp_node, nd);
@@ -2327,7 +2322,7 @@ void mpol_free_shared_policy(struct shared_policy *p)
mpol_put(n->policy);
kmem_cache_free(sn_cache, n);
}
- spin_unlock(&p->lock);
+ mutex_unlock(&p->mutex);
}
/* assumes fs == KERNEL_DS */
--
1.7.7
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy()
2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
` (2 preceding siblings ...)
2012-08-20 16:36 ` [PATCH 3/5] mempolicy: fix a race in shared_policy_replace() Mel Gorman
@ 2012-08-20 16:36 ` Mel Gorman
2012-08-20 19:46 ` Christoph Lameter
2012-08-20 16:36 ` [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma() Mel Gorman
` (2 subsequent siblings)
6 siblings, 1 reply; 28+ messages in thread
From: Mel Gorman @ 2012-08-20 16:36 UTC (permalink / raw)
To: Andrew Morton, KOSAKI Motohiro
Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
Hugh Dickins, LKML, Linux-MM, Mel Gorman
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
When shared_policy_replace() fails to allocate new->policy is not freed
correctly by mpol_set_shared_policy(). The problem is that shared mempolicy
code directly call kmem_cache_free() in multiple places where it is easy
to make a mistake.
This patch creates an sp_free wrapper function and uses it. The bug was
introduced pre-git age (IOW, before 2.6.12-rc2).
[mgorman@suse.de: Editted changelog]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
mm/mempolicy.c | 15 +++++++++------
1 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index d297929..45f9825 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2157,12 +2157,17 @@ mpol_shared_policy_lookup(struct shared_policy *sp, unsigned long idx)
return pol;
}
+static void sp_free(struct sp_node *n)
+{
+ mpol_put(n->policy);
+ kmem_cache_free(sn_cache, n);
+}
+
static void sp_delete(struct shared_policy *sp, struct sp_node *n)
{
pr_debug("deleting %lx-l%lx\n", n->start, n->end);
rb_erase(&n->nd, &sp->root);
- mpol_put(n->policy);
- kmem_cache_free(sn_cache, n);
+ sp_free(n);
}
static struct sp_node *sp_alloc(unsigned long start, unsigned long end,
@@ -2301,7 +2306,7 @@ int mpol_set_shared_policy(struct shared_policy *info,
}
err = shared_policy_replace(info, vma->vm_pgoff, vma->vm_pgoff+sz, new);
if (err && new)
- kmem_cache_free(sn_cache, new);
+ sp_free(new);
return err;
}
@@ -2318,9 +2323,7 @@ void mpol_free_shared_policy(struct shared_policy *p)
while (next) {
n = rb_entry(next, struct sp_node, nd);
next = rb_next(&n->nd);
- rb_erase(&n->nd, &p->root);
- mpol_put(n->policy);
- kmem_cache_free(sn_cache, n);
+ sp_delete(p, n);
}
mutex_unlock(&p->mutex);
}
--
1.7.7
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
` (3 preceding siblings ...)
2012-08-20 16:36 ` [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy() Mel Gorman
@ 2012-08-20 16:36 ` Mel Gorman
2012-08-20 19:51 ` Christoph Lameter
2012-08-21 7:29 ` [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
2012-08-21 21:46 ` Andi Kleen
6 siblings, 1 reply; 28+ messages in thread
From: Mel Gorman @ 2012-08-20 16:36 UTC (permalink / raw)
To: Andrew Morton, KOSAKI Motohiro
Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
Hugh Dickins, LKML, Linux-MM, Mel Gorman
[cc9a6c87: cpuset: mm: reduce large amounts of memory barrier related damage
v3] introduced a potential memory corruption. shmem_alloc_page() uses a
pseudo vma and it has one significant unique combination, vma->vm_ops=NULL
and vma->policy->flags & MPOL_F_SHARED.
get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL and
mpol_cond_put() DOES decrease a policy ref when a policy has MPOL_F_SHARED.
Therefore, when a cpuset update race occurs, alloc_pages_vma() falls in 'goto
retry_cpuset' path, decrements the reference count and frees the policy
prematurely.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
mm/mempolicy.c | 17 +++++++++++++++--
1 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 45f9825..82e872f 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1545,15 +1545,28 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
struct vm_area_struct *vma, unsigned long addr)
{
struct mempolicy *pol = task->mempolicy;
+ int got_ref;
if (vma) {
if (vma->vm_ops && vma->vm_ops->get_policy) {
struct mempolicy *vpol = vma->vm_ops->get_policy(vma,
addr);
- if (vpol)
+ if (vpol) {
pol = vpol;
- } else if (vma->vm_policy)
+ got_ref = 1;
+ }
+ } else if (vma->vm_policy) {
pol = vma->vm_policy;
+
+ /*
+ * shmem_alloc_page() passes MPOL_F_SHARED policy with
+ * a pseudo vma whose vma->vm_ops=NULL. Take a reference
+ * count on these policies which will be dropped by
+ * mpol_cond_put() later
+ */
+ if (mpol_needs_cond_ref(pol))
+ mpol_get(pol);
+ }
}
if (!pol)
pol = &default_policy;
--
1.7.7
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] mempolicy: Remove mempolicy sharing
2012-08-20 16:36 ` [PATCH 2/5] mempolicy: Remove mempolicy sharing Mel Gorman
@ 2012-08-20 19:35 ` Christoph Lameter
2012-08-22 19:03 ` Andrew Morton
1 sibling, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2012-08-20 19:35 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
Andi Kleen, Hugh Dickins, LKML, Linux-MM
On Mon, 20 Aug 2012, Mel Gorman wrote:
> Ideally, the shared policy handling would be rewritten to either properly
> handle COW of the policy structures or at least reference count MPOL_F_SHARED
> based exclusively on information within the policy. However, this patch takes
> the easier approach of disabling any policy sharing between VMAs. Each new
> range allocated with sp_alloc will allocate a new policy, set the reference
> count to 1 and drop the reference count of the old policy. This increases
> the memory footprint but is not expected to be a major problem as mbind()
> is unlikely to be used for fine-grained ranges. It is also inefficient
> because it means we allocate a new policy even in cases where mbind_range()
> could use the new_policy passed to it. However, it is more straight-forward
> and the change should be invisible to the user.
Hmmm. I dont like the additional memory use but this is definitely an
issue that needs addressing.
Reviewed-by: Christoph Lameter <cl@linux.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy()
2012-08-20 16:36 ` [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy() Mel Gorman
@ 2012-08-20 19:46 ` Christoph Lameter
2012-08-21 7:15 ` Mel Gorman
0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2012-08-20 19:46 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
Andi Kleen, Hugh Dickins, LKML, Linux-MM
On Mon, 20 Aug 2012, Mel Gorman wrote:
> @@ -2318,9 +2323,7 @@ void mpol_free_shared_policy(struct shared_policy *p)
> while (next) {
> n = rb_entry(next, struct sp_node, nd);
> next = rb_next(&n->nd);
> - rb_erase(&n->nd, &p->root);
Looks like we need to keep the above line? sp_delete does not remove the
tree entry.
> - mpol_put(n->policy);
> - kmem_cache_free(sn_cache, n);
> + sp_delete(p, n);
> }
> mutex_unlock(&p->mutex);
> }
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
2012-08-20 16:36 ` [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma() Mel Gorman
@ 2012-08-20 19:51 ` Christoph Lameter
2012-08-21 7:26 ` Mel Gorman
0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2012-08-20 19:51 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
Andi Kleen, Hugh Dickins, LKML, Linux-MM
On Mon, 20 Aug 2012, Mel Gorman wrote:
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 45f9825..82e872f 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1545,15 +1545,28 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
> struct vm_area_struct *vma, unsigned long addr)
> {
> struct mempolicy *pol = task->mempolicy;
> + int got_ref;
New variable. Need to set it to zero?
>
> if (vma) {
> if (vma->vm_ops && vma->vm_ops->get_policy) {
> struct mempolicy *vpol = vma->vm_ops->get_policy(vma,
> addr);
> - if (vpol)
> + if (vpol) {
> pol = vpol;
> - } else if (vma->vm_policy)
> + got_ref = 1;
Set the new variable. But it was not initialzed before. So now its 1 or
undefined?
> + }
> + } else if (vma->vm_policy) {
> pol = vma->vm_policy;
> +
> + /*
> + * shmem_alloc_page() passes MPOL_F_SHARED policy with
> + * a pseudo vma whose vma->vm_ops=NULL. Take a reference
> + * count on these policies which will be dropped by
> + * mpol_cond_put() later
> + */
> + if (mpol_needs_cond_ref(pol))
> + mpol_get(pol);
> + }
> }
> if (!pol)
> pol = &default_policy;
>
I do not see any use of got_ref. Can we get rid of the variable?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 3/5] mempolicy: fix a race in shared_policy_replace()
2012-08-20 16:36 ` [PATCH 3/5] mempolicy: fix a race in shared_policy_replace() Mel Gorman
@ 2012-08-20 19:52 ` Christoph Lameter
2012-09-07 22:59 ` KOSAKI Motohiro
1 sibling, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2012-08-20 19:52 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
Andi Kleen, Hugh Dickins, LKML, Linux-MM
On Mon, 20 Aug 2012, Mel Gorman wrote:
> Kosaki's original patch for this problem was to allocate an sp node and policy
> within shared_policy_replace and initialise it when the lock is reacquired. I
> was not keen on this approach because it partially duplicates sp_alloc(). As
> the paths were sp->lock is taken are not that performance critical this
> patch converts sp->lock to sp->mutex so it can sleep when calling sp_alloc().
Reviewed-by: Christoph Lameter <cl@linux.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy()
2012-08-20 19:46 ` Christoph Lameter
@ 2012-08-21 7:15 ` Mel Gorman
2012-08-21 15:38 ` Christoph Lameter
0 siblings, 1 reply; 28+ messages in thread
From: Mel Gorman @ 2012-08-21 7:15 UTC (permalink / raw)
To: Christoph Lameter
Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
Andi Kleen, Hugh Dickins, LKML, Linux-MM
On Mon, Aug 20, 2012 at 07:46:09PM +0000, Christoph Lameter wrote:
> On Mon, 20 Aug 2012, Mel Gorman wrote:
>
> > @@ -2318,9 +2323,7 @@ void mpol_free_shared_policy(struct shared_policy *p)
> > while (next) {
> > n = rb_entry(next, struct sp_node, nd);
> > next = rb_next(&n->nd);
> > - rb_erase(&n->nd, &p->root);
>
> Looks like we need to keep the above line? sp_delete does not remove the
> tree entry.
>
> > - mpol_put(n->policy);
> > - kmem_cache_free(sn_cache, n);
> > + sp_delete(p, n);
Yes it does, could you have accidentally mixed up sp_free (which does not
remove the tree entry) and sp_delete (which does)? The altered code ends
up looking like this;
static void sp_delete(struct shared_policy *sp, struct sp_node *n)
{
pr_debug("deleting %lx-l%lx\n", n->start, n->end);
rb_erase(&n->nd, &sp->root); <----- frees node here
sp_free(n);
}
void mpol_free_shared_policy(struct shared_policy *p)
{
struct sp_node *n;
struct rb_node *next;
if (!p->root.rb_node)
return;
mutex_lock(&p->mutex);
next = rb_first(&p->root);
while (next) {
n = rb_entry(next, struct sp_node, nd);
next = rb_next(&n->nd);
sp_delete(p, n); <---- equivalent to rb_erase(&n->nd, &p->root); sp_free(n);
}
mutex_unlock(&p->mutex);
}
Thanks Christoph for looking at this.
--
Mel Gorman
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
2012-08-20 19:51 ` Christoph Lameter
@ 2012-08-21 7:26 ` Mel Gorman
2012-08-21 15:37 ` Christoph Lameter
2012-09-07 23:06 ` KOSAKI Motohiro
0 siblings, 2 replies; 28+ messages in thread
From: Mel Gorman @ 2012-08-21 7:26 UTC (permalink / raw)
To: Christoph Lameter
Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
Andi Kleen, Hugh Dickins, LKML, Linux-MM
On Mon, Aug 20, 2012 at 07:51:10PM +0000, Christoph Lameter wrote:
> On Mon, 20 Aug 2012, Mel Gorman wrote:
>
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index 45f9825..82e872f 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -1545,15 +1545,28 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
> > struct vm_area_struct *vma, unsigned long addr)
> > {
> > struct mempolicy *pol = task->mempolicy;
> > + int got_ref;
>
> New variable. Need to set it to zero?
>
Not needed at all, I was meant to get rid of it. Ben had pointed out this
exact problem with the initialisation.
> >
> > if (vma) {
> > if (vma->vm_ops && vma->vm_ops->get_policy) {
> > struct mempolicy *vpol = vma->vm_ops->get_policy(vma,
> > addr);
> > - if (vpol)
> > + if (vpol) {
> > pol = vpol;
> > - } else if (vma->vm_policy)
> > + got_ref = 1;
>
> Set the new variable. But it was not initialzed before. So now its 1 or
> undefined?
>
It's not even needed because the next block is the code that originally
cared about the value of got_ref.
> > + }
> > + } else if (vma->vm_policy) {
> > pol = vma->vm_policy;
> > +
> > + /*
> > + * shmem_alloc_page() passes MPOL_F_SHARED policy with
> > + * a pseudo vma whose vma->vm_ops=NULL. Take a reference
> > + * count on these policies which will be dropped by
> > + * mpol_cond_put() later
> > + */
> > + if (mpol_needs_cond_ref(pol))
> > + mpol_get(pol);
> > + }
> > }
> > if (!pol)
> > pol = &default_policy;
> >
>
> I do not see any use of got_ref. Can we get rid of the variable?
>
Yes, here is a correct version of the patch. Thanks Christoph.
---8<---
mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
[cc9a6c87: cpuset: mm: reduce large amounts of memory barrier related damage
v3] introduced a potential memory corruption. shmem_alloc_page() uses a
pseudo vma and it has one significant unique combination, vma->vm_ops=NULL
and vma->policy->flags & MPOL_F_SHARED.
get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL and
mpol_cond_put() DOES decrease a policy ref when a policy has MPOL_F_SHARED.
Therefore, when a cpuset update race occurs, alloc_pages_vma() falls in 'goto
retry_cpuset' path, decrements the reference count and frees the policy
prematurely.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
mm/mempolicy.c | 12 +++++++++++-
1 files changed, 11 insertions(+), 1 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 45f9825..9842ef5 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1552,8 +1552,18 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
addr);
if (vpol)
pol = vpol;
- } else if (vma->vm_policy)
+ } else if (vma->vm_policy) {
pol = vma->vm_policy;
+
+ /*
+ * shmem_alloc_page() passes MPOL_F_SHARED policy with
+ * a pseudo vma whose vma->vm_ops=NULL. Take a reference
+ * count on these policies which will be dropped by
+ * mpol_cond_put() later
+ */
+ if (mpol_needs_cond_ref(pol))
+ mpol_get(pol);
+ }
}
if (!pol)
pol = &default_policy;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 0/5] Memory policy corruption fixes V2
2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
` (4 preceding siblings ...)
2012-08-20 16:36 ` [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma() Mel Gorman
@ 2012-08-21 7:29 ` Mel Gorman
2012-09-06 12:40 ` Josh Boyer
2012-08-21 21:46 ` Andi Kleen
6 siblings, 1 reply; 28+ messages in thread
From: Mel Gorman @ 2012-08-21 7:29 UTC (permalink / raw)
To: Josh Boyer
Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
Hugh Dickins, Andrew Morton, KOSAKI Motohiro, LKML, Linux-MM
On Mon, Aug 20, 2012 at 05:36:29PM +0100, Mel Gorman wrote:
> This is a rebase with some small changes to Kosaki's "mempolicy memory
> corruption fixlet" series. I had expected that Kosaki would have revised
> the series by now but it's been waiting a long time.
>
> Changelog since V1
> o Rebase to 3.6-rc2
> o Editted some of the changelogs
> o Converted sp->lock to sp->mutex to close a race in shared_policy_replace()
> o Reworked the refcount imbalance fix slightly
> o Do not call mpol_put in shmem_alloc_page.
>
> I tested this with trinity with CONFIG_DEBUG_SLAB enabled and it passed. I
> did not test LTP such as Josh reported a problem with or with a database that
> used shared policies like Andi tested. The series is almost all Kosaki's
> work of course. If he has a revised series that simply got delayed in
> posting it should take precedence.
I meant to add Josh to the cc, adding him now.
--
Mel Gorman
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
2012-08-21 7:26 ` Mel Gorman
@ 2012-08-21 15:37 ` Christoph Lameter
2012-09-07 23:06 ` KOSAKI Motohiro
1 sibling, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2012-08-21 15:37 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
Andi Kleen, Hugh Dickins, LKML, Linux-MM
On Tue, 21 Aug 2012, Mel Gorman wrote:
> mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
Reviewed-by: Christoph Lameter <cl@linux.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy()
2012-08-21 7:15 ` Mel Gorman
@ 2012-08-21 15:38 ` Christoph Lameter
0 siblings, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2012-08-21 15:38 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
Andi Kleen, Hugh Dickins, LKML, Linux-MM
On Tue, 21 Aug 2012, Mel Gorman wrote:
> On Mon, Aug 20, 2012 at 07:46:09PM +0000, Christoph Lameter wrote:
> > On Mon, 20 Aug 2012, Mel Gorman wrote:
> >
> > > @@ -2318,9 +2323,7 @@ void mpol_free_shared_policy(struct shared_policy *p)
> > > while (next) {
> > > n = rb_entry(next, struct sp_node, nd);
> > > next = rb_next(&n->nd);
> > > - rb_erase(&n->nd, &p->root);
> >
> > Looks like we need to keep the above line? sp_delete does not remove the
> > tree entry.
> >
> > > - mpol_put(n->policy);
> > > - kmem_cache_free(sn_cache, n);
> > > + sp_delete(p, n);
>
> Yes it does, could you have accidentally mixed up sp_free (which does not
> remove the tree entry) and sp_delete (which does)? The altered code ends
> up looking like this;
Yup I got that mixed up.
Reviewed-by: Christoph Lameter <cl@linux.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 0/5] Memory policy corruption fixes V2
2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
` (5 preceding siblings ...)
2012-08-21 7:29 ` [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
@ 2012-08-21 21:46 ` Andi Kleen
6 siblings, 0 replies; 28+ messages in thread
From: Andi Kleen @ 2012-08-21 21:46 UTC (permalink / raw)
To: Mel Gorman
Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Christoph Lameter,
Ben Hutchings, Hugh Dickins, LKML, Linux-MM
> I tested this with trinity with CONFIG_DEBUG_SLAB enabled and it passed. I
> did not test LTP such as Josh reported a problem with or with a database that
> used shared policies like Andi tested. The series is almost all Kosaki's
> work of course. If he has a revised series that simply got delayed in
> posting it should take precedence.
Initial tests of this patchkit look with a test programgood, full database tests
are still pending.
-Andi
--
ak@linux.intel.com -- Speaking for myself only
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] mempolicy: Remove mempolicy sharing
2012-08-20 16:36 ` [PATCH 2/5] mempolicy: Remove mempolicy sharing Mel Gorman
2012-08-20 19:35 ` Christoph Lameter
@ 2012-08-22 19:03 ` Andrew Morton
2012-08-22 19:33 ` Mel Gorman
2012-08-22 19:35 ` Andi Kleen
1 sibling, 2 replies; 28+ messages in thread
From: Andrew Morton @ 2012-08-22 19:03 UTC (permalink / raw)
To: Mel Gorman
Cc: KOSAKI Motohiro, Dave Jones, Christoph Lameter, Ben Hutchings,
Andi Kleen, Hugh Dickins, LKML, Linux-MM
On Mon, 20 Aug 2012 17:36:31 +0100
Mel Gorman <mgorman@suse.de> wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>
> Dave Jones' system call fuzz testing tool "trinity" triggered the following
> bug error with slab debugging enabled
>
> ...
>
> Cc: <stable@vger.kernel.org>
The patch dosn't apply to 3.5 at all well. I don't see much point in
retaining the stable tag so I think I'll remove it, and suggest that
you prepare a fresh patch for Greg and explain the situation?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] mempolicy: Remove mempolicy sharing
2012-08-22 19:03 ` Andrew Morton
@ 2012-08-22 19:33 ` Mel Gorman
2012-08-22 19:35 ` Andi Kleen
1 sibling, 0 replies; 28+ messages in thread
From: Mel Gorman @ 2012-08-22 19:33 UTC (permalink / raw)
To: Andrew Morton
Cc: KOSAKI Motohiro, Dave Jones, Christoph Lameter, Ben Hutchings,
Andi Kleen, Hugh Dickins, LKML, Linux-MM
On Wed, Aug 22, 2012 at 12:03:14PM -0700, Andrew Morton wrote:
> On Mon, 20 Aug 2012 17:36:31 +0100
> Mel Gorman <mgorman@suse.de> wrote:
>
> > From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> >
> > Dave Jones' system call fuzz testing tool "trinity" triggered the following
> > bug error with slab debugging enabled
> >
> > ...
> >
> > Cc: <stable@vger.kernel.org>
>
> The patch dosn't apply to 3.5 at all well. I don't see much point in
> retaining the stable tag so I think I'll remove it, and suggest that
> you prepare a fresh patch for Greg and explain the situation?
>
Sure. I'll do the backport at the time they get merged to mainline and
jump through the hoops. Thanks.
--
Mel Gorman
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] mempolicy: Remove mempolicy sharing
2012-08-22 19:03 ` Andrew Morton
2012-08-22 19:33 ` Mel Gorman
@ 2012-08-22 19:35 ` Andi Kleen
1 sibling, 0 replies; 28+ messages in thread
From: Andi Kleen @ 2012-08-22 19:35 UTC (permalink / raw)
To: Andrew Morton
Cc: Mel Gorman, KOSAKI Motohiro, Dave Jones, Christoph Lameter,
Ben Hutchings, Hugh Dickins, LKML, Linux-MM
On Wed, Aug 22, 2012 at 12:03:14PM -0700, Andrew Morton wrote:
> On Mon, 20 Aug 2012 17:36:31 +0100
> Mel Gorman <mgorman@suse.de> wrote:
>
> > From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> >
> > Dave Jones' system call fuzz testing tool "trinity" triggered the following
> > bug error with slab debugging enabled
> >
> > ...
> >
> > Cc: <stable@vger.kernel.org>
>
> The patch dosn't apply to 3.5 at all well. I don't see much point in
> retaining the stable tag so I think I'll remove it, and suggest that
> you prepare a fresh patch for Greg and explain the situation?
Everything applies fine if you redo the revert manually.
BTW we tested it now and the new patchkit indeed fixes the database.
Please queue for 3.6 and 3.5 stable.
-Andi
--
ak@linux.intel.com -- Speaking for myself only
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 0/5] Memory policy corruption fixes V2
2012-08-21 7:29 ` [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
@ 2012-09-06 12:40 ` Josh Boyer
2012-09-07 9:43 ` Mel Gorman
0 siblings, 1 reply; 28+ messages in thread
From: Josh Boyer @ 2012-09-06 12:40 UTC (permalink / raw)
To: Mel Gorman
Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
Hugh Dickins, Andrew Morton, KOSAKI Motohiro, LKML, Linux-MM
On Tue, Aug 21, 2012 at 3:29 AM, Mel Gorman <mgorman@suse.de> wrote:
> On Mon, Aug 20, 2012 at 05:36:29PM +0100, Mel Gorman wrote:
>> This is a rebase with some small changes to Kosaki's "mempolicy memory
>> corruption fixlet" series. I had expected that Kosaki would have revised
>> the series by now but it's been waiting a long time.
>>
>> Changelog since V1
>> o Rebase to 3.6-rc2
>> o Editted some of the changelogs
>> o Converted sp->lock to sp->mutex to close a race in shared_policy_replace()
>> o Reworked the refcount imbalance fix slightly
>> o Do not call mpol_put in shmem_alloc_page.
>>
>> I tested this with trinity with CONFIG_DEBUG_SLAB enabled and it passed. I
>> did not test LTP such as Josh reported a problem with or with a database that
>> used shared policies like Andi tested. The series is almost all Kosaki's
>> work of course. If he has a revised series that simply got delayed in
>> posting it should take precedence.
>
> I meant to add Josh to the cc, adding him now.
Thank you.
I see Andi has done some testing and Acked this patchset. Christoph
appears to have Acked it as well. Is there anything else needed for
it to get in mainline? Just want to make sure this doesn't get dropped
because we all forgot about it after KS/Plumbers.
josh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 0/5] Memory policy corruption fixes V2
2012-09-06 12:40 ` Josh Boyer
@ 2012-09-07 9:43 ` Mel Gorman
0 siblings, 0 replies; 28+ messages in thread
From: Mel Gorman @ 2012-09-07 9:43 UTC (permalink / raw)
To: Josh Boyer
Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
Hugh Dickins, Andrew Morton, KOSAKI Motohiro, LKML, Linux-MM
On Thu, Sep 06, 2012 at 08:40:21AM -0400, Josh Boyer wrote:
> On Tue, Aug 21, 2012 at 3:29 AM, Mel Gorman <mgorman@suse.de> wrote:
> > On Mon, Aug 20, 2012 at 05:36:29PM +0100, Mel Gorman wrote:
> >> This is a rebase with some small changes to Kosaki's "mempolicy memory
> >> corruption fixlet" series. I had expected that Kosaki would have revised
> >> the series by now but it's been waiting a long time.
> >>
> >> Changelog since V1
> >> o Rebase to 3.6-rc2
> >> o Editted some of the changelogs
> >> o Converted sp->lock to sp->mutex to close a race in shared_policy_replace()
> >> o Reworked the refcount imbalance fix slightly
> >> o Do not call mpol_put in shmem_alloc_page.
> >>
> >> I tested this with trinity with CONFIG_DEBUG_SLAB enabled and it passed. I
> >> did not test LTP such as Josh reported a problem with or with a database that
> >> used shared policies like Andi tested. The series is almost all Kosaki's
> >> work of course. If he has a revised series that simply got delayed in
> >> posting it should take precedence.
> >
> > I meant to add Josh to the cc, adding him now.
>
> Thank you.
>
> I see Andi has done some testing and Acked this patchset. Christoph
> appears to have Acked it as well. Is there anything else needed for
> it to get in mainline?
Not that I'm aware of.
> Just want to make sure this doesn't get dropped
> because we all forgot about it after KS/Plumbers.
>
Right now it's in linux-next via Andrew. Bar major surprises I expect
it'll end up in mainline in due course. When it does, I'll do a -stable
backport.
--
Mel Gorman
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 3/5] mempolicy: fix a race in shared_policy_replace()
2012-08-20 16:36 ` [PATCH 3/5] mempolicy: fix a race in shared_policy_replace() Mel Gorman
2012-08-20 19:52 ` Christoph Lameter
@ 2012-09-07 22:59 ` KOSAKI Motohiro
1 sibling, 0 replies; 28+ messages in thread
From: KOSAKI Motohiro @ 2012-09-07 22:59 UTC (permalink / raw)
To: mgorman
Cc: akpm, kosaki.motohiro, davej, cl, ben, ak, hughd, linux-kernel,
linux-mm
First off, thank you very much for reworking this for me. I haven't got a chance
to get a test machine for this.
> shared_policy_replace() use of sp_alloc() is unsafe. 1) sp_node cannot
> be dereferenced if sp->lock is not held and 2) another thread can modify
> sp_node between spin_unlock for allocating a new sp node and next spin_lock.
> The bug was introduced before 2.6.12-rc2.
>
> Kosaki's original patch for this problem was to allocate an sp node and policy
> within shared_policy_replace and initialise it when the lock is reacquired. I
> was not keen on this approach because it partially duplicates sp_alloc(). As
> the paths were sp->lock is taken are not that performance critical this
> patch converts sp->lock to sp->mutex so it can sleep when calling sp_alloc().
Looks make sense.
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
2012-08-21 7:26 ` Mel Gorman
2012-08-21 15:37 ` Christoph Lameter
@ 2012-09-07 23:06 ` KOSAKI Motohiro
1 sibling, 0 replies; 28+ messages in thread
From: KOSAKI Motohiro @ 2012-09-07 23:06 UTC (permalink / raw)
To: mgorman
Cc: cl, akpm, kosaki.motohiro, davej, ben, ak, hughd, linux-kernel,
linux-mm
> mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
>
> [cc9a6c87: cpuset: mm: reduce large amounts of memory barrier related damage
> v3] introduced a potential memory corruption. shmem_alloc_page() uses a
> pseudo vma and it has one significant unique combination, vma->vm_ops=NULL
> and vma->policy->flags & MPOL_F_SHARED.
>
> get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL and
> mpol_cond_put() DOES decrease a policy ref when a policy has MPOL_F_SHARED.
> Therefore, when a cpuset update race occurs, alloc_pages_vma() falls in 'goto
> retry_cpuset' path, decrements the reference count and frees the policy
> prematurely.
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Signed-off-by: Mel Gorman <mgorman@suse.de>
> ---
> mm/mempolicy.c | 12 +++++++++++-
> 1 files changed, 11 insertions(+), 1 deletions(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 45f9825..9842ef5 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1552,8 +1552,18 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
> addr);
> if (vpol)
> pol = vpol;
> - } else if (vma->vm_policy)
> + } else if (vma->vm_policy) {
> pol = vma->vm_policy;
> +
> + /*
> + * shmem_alloc_page() passes MPOL_F_SHARED policy with
> + * a pseudo vma whose vma->vm_ops=NULL. Take a reference
> + * count on these policies which will be dropped by
> + * mpol_cond_put() later
> + */
> + if (mpol_needs_cond_ref(pol))
> + mpol_get(pol);
> + }
Ok, looks sene change. thank you.
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> }
> if (!pol)
> pol = &default_policy;
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
2012-10-09 16:58 [PATCH 0/5] Memory policy corruption fixes -stable Mel Gorman
@ 2012-10-09 16:58 ` Mel Gorman
2012-12-04 12:54 ` Tommi Rantala
0 siblings, 1 reply; 28+ messages in thread
From: Mel Gorman @ 2012-10-09 16:58 UTC (permalink / raw)
To: Stable
Cc: Andi Kleen, Andrew Morton, KOSAKI Motohiro, Dave Jones,
Christoph Lameter, Hugh Dickins, LKML, Linux-MM, Mel Gorman
commit 00442ad04a5eac08a98255697c510e708f6082e2 upstream.
Commit cc9a6c877661 ("cpuset: mm: reduce large amounts of memory barrier
related damage v3") introduced a potential memory corruption.
shmem_alloc_page() uses a pseudo vma and it has one significant unique
combination, vma->vm_ops=NULL and vma->policy->flags & MPOL_F_SHARED.
get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL
and mpol_cond_put() DOES decrease a policy ref when a policy has
MPOL_F_SHARED. Therefore, when a cpuset update race occurs,
alloc_pages_vma() falls in 'goto retry_cpuset' path, decrements the
reference count and frees the policy prematurely.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Christoph Lameter <cl@linux.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
mm/mempolicy.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 1763418..3d64b36 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1552,8 +1552,18 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
addr);
if (vpol)
pol = vpol;
- } else if (vma->vm_policy)
+ } else if (vma->vm_policy) {
pol = vma->vm_policy;
+
+ /*
+ * shmem_alloc_page() passes MPOL_F_SHARED policy with
+ * a pseudo vma whose vma->vm_ops=NULL. Take a reference
+ * count on these policies which will be dropped by
+ * mpol_cond_put() later
+ */
+ if (mpol_needs_cond_ref(pol))
+ mpol_get(pol);
+ }
}
if (!pol)
pol = &default_policy;
--
1.7.9.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
2012-10-09 16:58 ` [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma() Mel Gorman
@ 2012-12-04 12:54 ` Tommi Rantala
2012-12-04 14:15 ` Mel Gorman
0 siblings, 1 reply; 28+ messages in thread
From: Tommi Rantala @ 2012-12-04 12:54 UTC (permalink / raw)
To: Mel Gorman
Cc: Stable, Andi Kleen, Andrew Morton, KOSAKI Motohiro, Dave Jones,
Christoph Lameter, Hugh Dickins, LKML, Linux-MM
2012/10/9 Mel Gorman <mgorman@suse.de>:
> commit 00442ad04a5eac08a98255697c510e708f6082e2 upstream.
>
> Commit cc9a6c877661 ("cpuset: mm: reduce large amounts of memory barrier
> related damage v3") introduced a potential memory corruption.
> shmem_alloc_page() uses a pseudo vma and it has one significant unique
> combination, vma->vm_ops=NULL and vma->policy->flags & MPOL_F_SHARED.
>
> get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL
> and mpol_cond_put() DOES decrease a policy ref when a policy has
> MPOL_F_SHARED. Therefore, when a cpuset update race occurs,
> alloc_pages_vma() falls in 'goto retry_cpuset' path, decrements the
> reference count and frees the policy prematurely.
Hello,
kmemleak is complaining about memory leaks that point to the mbind()
syscall. I've seen this only in v3.7-rcX, so I bisected this, and
found that this patch is the first mainline commit where I'm able to
reproduce it with Trinity.
$ ./trinity -q -C32 -c mbind -N100000
Trinity v1.1pre Dave Jones <davej@redhat.com> 2012
[2823] Marking 64-bit syscall 237 (mbind) as enabled
[2823] Marking 32-bit syscall 274 (mbind) as enabled
Enabling syscall mbind
Initial random seed from time of day: 829620642 (0x317301a2)
[2824] Watchdog is alive
[2823] Started watchdog process, PID is 2824
[2825] Main thread is alive.
375 sockets created based on info from socket cachefile.
Generating file descriptors
Added 24 filenames from /dev
Added 23893 filenames from /proc
Added 8415 filenames from /sys
[2825] Random reseed: 4210789068 (0xfafb8acc)
[watchdog] 1060 iterations. [F:996 S:63]
[watchdog] 6588 iterations. [F:6119 S:468]
[watchdog] 12405 iterations. [F:11509 S:894]
[watchdog] 18163 iterations. [F:16850 S:1311]
[watchdog] 24001 iterations. [F:22260 S:1741]
[watchdog] 30122 iterations. [F:27935 S:2184]
[watchdog] 36074 iterations. [F:33465 S:2605]
[watchdog] 42042 iterations. [F:38971 S:3069]
[watchdog] 47949 iterations. [F:44419 S:3527]
[watchdog] 53873 iterations. [F:49908 S:3961]
[watchdog] 59719 iterations. [F:55345 S:4369]
[watchdog] 65583 iterations. [F:60787 S:4790]
[watchdog] 71690 iterations. [F:66451 S:5233]
[watchdog] 77755 iterations. [F:72070 S:5681]
[watchdog] 83850 iterations. [F:77714 S:6134]
[watchdog] 89877 iterations. [F:83296 S:6582]
[watchdog] 95890 iterations. [F:88851 S:7042]
[2825] Bailing main loop. Exit reason: Reached maximum syscall count
[2824] Watchdog exiting
Ran 100017 syscalls. Successes: 7355 Failures: 92665
# echo scan > /sys/kernel/debug/kmemleak
# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff88002c8b1060 (size 24):
comm "trinity-child13", pid 2141, jiffies 4294861068 (age 1585.092s)
hex dump (first 24 bytes):
02 00 00 00 01 00 03 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 ........
backtrace:
[<ffffffff81a53481>] kmemleak_alloc+0x21/0x50
[<ffffffff81196116>] kmem_cache_alloc+0x96/0x220
[<ffffffff8118fd02>] __mpol_dup+0x22/0x190
[<ffffffff8118feb8>] sp_alloc+0x48/0xa0
[<ffffffff81190960>] mpol_set_shared_policy+0x40/0xd0
[<ffffffff8115f1f8>] shmem_set_policy+0x28/0x30
[<ffffffff811902c1>] mbind_range+0x1a1/0x210
[<ffffffff811904fc>] do_mbind+0x1cc/0x2d0
[<ffffffff811906a2>] sys_mbind+0xa2/0xb0
[<ffffffff81a924a9>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
unreferenced object 0xffff88003d83f168 (size 24):
comm "trinity-child10", pid 2686, jiffies 4295117470 (age 1328.725s)
hex dump (first 24 bytes):
01 00 00 00 01 00 03 00 00 00 00 00 00 00 00 00 ................
01 00 00 00 00 00 00 00 ........
backtrace:
[<ffffffff81a53481>] kmemleak_alloc+0x21/0x50
[<ffffffff81196116>] kmem_cache_alloc+0x96/0x220
[<ffffffff8118fd02>] __mpol_dup+0x22/0x190
[<ffffffff8118feb8>] sp_alloc+0x48/0xa0
[<ffffffff81190960>] mpol_set_shared_policy+0x40/0xd0
[<ffffffff8115f1f8>] shmem_set_policy+0x28/0x30
[<ffffffff811902c1>] mbind_range+0x1a1/0x210
[<ffffffff811904fc>] do_mbind+0x1cc/0x2d0
[<ffffffff811906a2>] sys_mbind+0xa2/0xb0
[<ffffffff81a924a9>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
#
Since the patch is touching the reference counting, I suppose the
finding could be legit.
Tommi
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Signed-off-by: Mel Gorman <mgorman@suse.de>
> Reviewed-by: Christoph Lameter <cl@linux.com>
> Cc: Josh Boyer <jwboyer@gmail.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Mel Gorman <mgorman@suse.de>
> ---
> mm/mempolicy.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 1763418..3d64b36 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1552,8 +1552,18 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
> addr);
> if (vpol)
> pol = vpol;
> - } else if (vma->vm_policy)
> + } else if (vma->vm_policy) {
> pol = vma->vm_policy;
> +
> + /*
> + * shmem_alloc_page() passes MPOL_F_SHARED policy with
> + * a pseudo vma whose vma->vm_ops=NULL. Take a reference
> + * count on these policies which will be dropped by
> + * mpol_cond_put() later
> + */
> + if (mpol_needs_cond_ref(pol))
> + mpol_get(pol);
> + }
> }
> if (!pol)
> pol = &default_policy;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
2012-12-04 12:54 ` Tommi Rantala
@ 2012-12-04 14:15 ` Mel Gorman
2012-12-05 5:11 ` Hugh Dickins
0 siblings, 1 reply; 28+ messages in thread
From: Mel Gorman @ 2012-12-04 14:15 UTC (permalink / raw)
To: Tommi Rantala
Cc: Stable, Andi Kleen, Andrew Morton, KOSAKI Motohiro, Dave Jones,
Christoph Lameter, Hugh Dickins, LKML, Linux-MM
On Tue, Dec 04, 2012 at 02:54:08PM +0200, Tommi Rantala wrote:
> 2012/10/9 Mel Gorman <mgorman@suse.de>:
> > commit 00442ad04a5eac08a98255697c510e708f6082e2 upstream.
> >
> > Commit cc9a6c877661 ("cpuset: mm: reduce large amounts of memory barrier
> > related damage v3") introduced a potential memory corruption.
> > shmem_alloc_page() uses a pseudo vma and it has one significant unique
> > combination, vma->vm_ops=NULL and vma->policy->flags & MPOL_F_SHARED.
> >
> > get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL
> > and mpol_cond_put() DOES decrease a policy ref when a policy has
> > MPOL_F_SHARED. Therefore, when a cpuset update race occurs,
> > alloc_pages_vma() falls in 'goto retry_cpuset' path, decrements the
> > reference count and frees the policy prematurely.
>
> Hello,
>
> kmemleak is complaining about memory leaks that point to the mbind()
> syscall. I've seen this only in v3.7-rcX, so I bisected this, and
> found that this patch is the first mainline commit where I'm able to
> reproduce it with Trinity.
>
Uncool.
I'm writing this from an airport so am not in the position to test properly
but at a glance I'm not seeing what drops the reference count taken by
mpol_shared_policy_lookup() in all cases. vm_ops->get_policy() probably
gets it right but what about shmem_alloc_page() and shmem_swapin()?
This patch is only compile tested. If the reference counts are dropped
somewhere I did not spot quickly then it'll cause a use-after-free bug
instead but is worth trying anyway.
diff --git a/mm/shmem.c b/mm/shmem.c
index 89341b6..6229a43 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -912,6 +912,7 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
{
struct mempolicy mpol, *spol;
struct vm_area_struct pvma;
+ struct page *page;
spol = mpol_cond_copy(&mpol,
mpol_shared_policy_lookup(&info->policy, index));
@@ -922,13 +923,19 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
pvma.vm_pgoff = index + info->vfs_inode.i_ino;
pvma.vm_ops = NULL;
pvma.vm_policy = spol;
- return swapin_readahead(swap, gfp, &pvma, 0);
+ page = swapin_readahead(swap, gfp, &pvma, 0);
+
+ /* Drop reference taken by mpol_shared_policy_lookup() */
+ mpol_cond_put(pvma.vm_policy);
+
+ return page;
}
static struct page *shmem_alloc_page(gfp_t gfp,
struct shmem_inode_info *info, pgoff_t index)
{
struct vm_area_struct pvma;
+ struct page *page;
/* Create a pseudo vma that just contains the policy */
pvma.vm_start = 0;
@@ -940,7 +947,12 @@ static struct page *shmem_alloc_page(gfp_t gfp,
/*
* alloc_page_vma() will drop the shared policy reference
*/
- return alloc_page_vma(gfp, &pvma, 0);
+ page = alloc_page_vma(gfp, &pvma, 0);
+
+ /* Drop reference taken by mpol_shared_policy_lookup() */
+ mpol_cond_put(pvma.vm_policy);
+
+ return page;
}
#else /* !CONFIG_NUMA */
#ifdef CONFIG_TMPFS
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
2012-12-04 14:15 ` Mel Gorman
@ 2012-12-05 5:11 ` Hugh Dickins
2012-12-05 6:28 ` Hugh Dickins
0 siblings, 1 reply; 28+ messages in thread
From: Hugh Dickins @ 2012-12-05 5:11 UTC (permalink / raw)
To: Mel Gorman
Cc: Tommi Rantala, Stable, Andi Kleen, Andrew Morton, KOSAKI Motohiro,
Dave Jones, Christoph Lameter, LKML, Linux-MM
On Tue, 4 Dec 2012, Mel Gorman wrote:
> On Tue, Dec 04, 2012 at 02:54:08PM +0200, Tommi Rantala wrote:
> > 2012/10/9 Mel Gorman <mgorman@suse.de>:
> > > commit 00442ad04a5eac08a98255697c510e708f6082e2 upstream.
> > >
> > > Commit cc9a6c877661 ("cpuset: mm: reduce large amounts of memory barrier
> > > related damage v3") introduced a potential memory corruption.
> > > shmem_alloc_page() uses a pseudo vma and it has one significant unique
> > > combination, vma->vm_ops=NULL and vma->policy->flags & MPOL_F_SHARED.
> > >
> > > get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL
> > > and mpol_cond_put() DOES decrease a policy ref when a policy has
> > > MPOL_F_SHARED. Therefore, when a cpuset update race occurs,
> > > alloc_pages_vma() falls in 'goto retry_cpuset' path, decrements the
> > > reference count and frees the policy prematurely.
> >
> > Hello,
> >
> > kmemleak is complaining about memory leaks that point to the mbind()
> > syscall. I've seen this only in v3.7-rcX, so I bisected this, and
> > found that this patch is the first mainline commit where I'm able to
> > reproduce it with Trinity.
> >
>
> Uncool.
>
> I'm writing this from an airport so am not in the position to test properly
> but at a glance I'm not seeing what drops the reference count taken by
> mpol_shared_policy_lookup() in all cases. vm_ops->get_policy() probably
> gets it right but what about shmem_alloc_page() and shmem_swapin()?
>
> This patch is only compile tested. If the reference counts are dropped
> somewhere I did not spot quickly then it'll cause a use-after-free bug
> instead but is worth trying anyway.
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 89341b6..6229a43 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -912,6 +912,7 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
> {
> struct mempolicy mpol, *spol;
> struct vm_area_struct pvma;
> + struct page *page;
>
> spol = mpol_cond_copy(&mpol,
> mpol_shared_policy_lookup(&info->policy, index));
> @@ -922,13 +923,19 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
> pvma.vm_pgoff = index + info->vfs_inode.i_ino;
> pvma.vm_ops = NULL;
> pvma.vm_policy = spol;
> - return swapin_readahead(swap, gfp, &pvma, 0);
> + page = swapin_readahead(swap, gfp, &pvma, 0);
> +
> + /* Drop reference taken by mpol_shared_policy_lookup() */
> + mpol_cond_put(pvma.vm_policy);
> +
> + return page;
> }
>
> static struct page *shmem_alloc_page(gfp_t gfp,
> struct shmem_inode_info *info, pgoff_t index)
> {
> struct vm_area_struct pvma;
> + struct page *page;
>
> /* Create a pseudo vma that just contains the policy */
> pvma.vm_start = 0;
> @@ -940,7 +947,12 @@ static struct page *shmem_alloc_page(gfp_t gfp,
> /*
> * alloc_page_vma() will drop the shared policy reference
> */
> - return alloc_page_vma(gfp, &pvma, 0);
> + page = alloc_page_vma(gfp, &pvma, 0);
> +
> + /* Drop reference taken by mpol_shared_policy_lookup() */
> + mpol_cond_put(pvma.vm_policy);
> +
> + return page;
> }
> #else /* !CONFIG_NUMA */
> #ifdef CONFIG_TMPFS
Thank you, Tommi and Mel. Easy enough for me to reproduce without
kmemleak and trinity, by mounting a tmpfs with mpol= and keeping an
eye on numa_policy in /proc/slabinfo while building a tree there.
Yes, your patch fixes it Mel, but I prefer it as below, with a couple
of mods: removing the no longer true comment, and leaving shmem_swapin()
alone with just a comment. It appears to be the job of the rather weird
mpol_cond_copy() to drop the reference on the original mempolicy, and
clear MPOL_F_SHARED so the copy won't need one (it's trying to cope with
the fact that swapin_readahead will make an unknown number of calls to
alloc_page_vma). So I'd rather not add another mpol_cond_put there,
whose cond will never be met.
I don't much like the result, but that's because it's adding further
cruft on top of the onstack pseudo-vma stuff: more impetus for me to
revisit the alloc_page_mpol() patch I worked on years ago, but gave
up when I couldn't understand the mpol refcounting: hopefully I'll
find that Kosaki's changes have made it all clearer now.
Please consent to the addition of your signoff: thanks!
[PATCH] tmpfs: fix shared mempolicy leak
From: Mel Gorman <mgorman@suse.de>
Commit 00442ad04a5e ("mempolicy: fix a memory corruption by refcount
imbalance in alloc_pages_vma()") changed get_vma_policy() to raise the
refcount on a shmem shared mempolicy; whereas shmem_alloc_page() went
on expecting alloc_page_vma() to drop the refcount it had acquired.
This deserves a rework: but for now fix the leak in shmem_alloc_page().
Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Awaiting-Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
---
mm/shmem.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
--- 3.7-rc8/mm/shmem.c 2012-11-16 19:26:56.388459961 -0800
+++ linux/mm/shmem.c 2012-12-04 20:00:44.556241603 -0800
@@ -922,13 +922,17 @@ static struct page *shmem_swapin(swp_ent
pvma.vm_pgoff = index + info->vfs_inode.i_ino;
pvma.vm_ops = NULL;
pvma.vm_policy = spol;
+
return swapin_readahead(swap, gfp, &pvma, 0);
+
+ /* mpol_cond_copy already dropped ref from mpol_shared_policy_lookup */
}
static struct page *shmem_alloc_page(gfp_t gfp,
struct shmem_inode_info *info, pgoff_t index)
{
struct vm_area_struct pvma;
+ struct page *page;
/* Create a pseudo vma that just contains the policy */
pvma.vm_start = 0;
@@ -937,10 +941,12 @@ static struct page *shmem_alloc_page(gfp
pvma.vm_ops = NULL;
pvma.vm_policy = mpol_shared_policy_lookup(&info->policy, index);
- /*
- * alloc_page_vma() will drop the shared policy reference
- */
- return alloc_page_vma(gfp, &pvma, 0);
+ page = alloc_page_vma(gfp, &pvma, 0);
+
+ /* Drop reference taken by mpol_shared_policy_lookup() */
+ mpol_cond_put(pvma.vm_policy);
+
+ return page;
}
#else /* !CONFIG_NUMA */
#ifdef CONFIG_TMPFS
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
2012-12-05 5:11 ` Hugh Dickins
@ 2012-12-05 6:28 ` Hugh Dickins
0 siblings, 0 replies; 28+ messages in thread
From: Hugh Dickins @ 2012-12-05 6:28 UTC (permalink / raw)
To: Mel Gorman
Cc: Tommi Rantala, Stable, Andi Kleen, Andrew Morton, KOSAKI Motohiro,
Dave Jones, Christoph Lameter, LKML, Linux-MM
On Tue, 4 Dec 2012, Hugh Dickins wrote:
>
> Yes, your patch fixes it Mel, but I prefer it as below, with a couple
> of mods: removing the no longer true comment, and leaving shmem_swapin()
> alone with just a comment. It appears to be the job of the rather weird
> mpol_cond_copy() to drop the reference on the original mempolicy, and
> clear MPOL_F_SHARED so the copy won't need one (it's trying to cope with
> the fact that swapin_readahead will make an unknown number of calls to
> alloc_page_vma). So I'd rather not add another mpol_cond_put there,
> whose cond will never be met.
Hold on, ignore that patch for now, I think I had my priorities
upside down: it would be better for shmem_swapin() to behave as
you proposed, and we delete the mpol_cond_copy() weirdness instead.
Your 00442ad04a5e changed alloc_pages_vma() to keep its refcounting
in balance, so it now does not matter that swapin_readahead() makes
an unknown number of calls to it: we should simply take a reference
before and drop it after, just as you do in shmem_alloc_page().
I'd still like to revisit alloc_page_vma(), and its refcount
manipulations do now appear redundant; but changing that is not
something I want to get into in a last minute rush. But getting rid
of mpol_cond_copy() should be safe and clear, I'll test that out now
and reply with an updated patch (or else admit I got confused).
Hugh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2012-12-05 6:28 UTC | newest]
Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
2012-08-20 16:36 ` [PATCH 1/5] Revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages" Mel Gorman
2012-08-20 16:36 ` [PATCH 2/5] mempolicy: Remove mempolicy sharing Mel Gorman
2012-08-20 19:35 ` Christoph Lameter
2012-08-22 19:03 ` Andrew Morton
2012-08-22 19:33 ` Mel Gorman
2012-08-22 19:35 ` Andi Kleen
2012-08-20 16:36 ` [PATCH 3/5] mempolicy: fix a race in shared_policy_replace() Mel Gorman
2012-08-20 19:52 ` Christoph Lameter
2012-09-07 22:59 ` KOSAKI Motohiro
2012-08-20 16:36 ` [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy() Mel Gorman
2012-08-20 19:46 ` Christoph Lameter
2012-08-21 7:15 ` Mel Gorman
2012-08-21 15:38 ` Christoph Lameter
2012-08-20 16:36 ` [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma() Mel Gorman
2012-08-20 19:51 ` Christoph Lameter
2012-08-21 7:26 ` Mel Gorman
2012-08-21 15:37 ` Christoph Lameter
2012-09-07 23:06 ` KOSAKI Motohiro
2012-08-21 7:29 ` [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
2012-09-06 12:40 ` Josh Boyer
2012-09-07 9:43 ` Mel Gorman
2012-08-21 21:46 ` Andi Kleen
-- strict thread matches above, loose matches on Subject: below --
2012-10-09 16:58 [PATCH 0/5] Memory policy corruption fixes -stable Mel Gorman
2012-10-09 16:58 ` [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma() Mel Gorman
2012-12-04 12:54 ` Tommi Rantala
2012-12-04 14:15 ` Mel Gorman
2012-12-05 5:11 ` Hugh Dickins
2012-12-05 6:28 ` Hugh Dickins
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).