From: Mel Gorman <mgorman@suse.de>
To: kosaki.motohiro@gmail.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Andrew Morton <akpm@google.com>, Dave Jones <davej@redhat.com>,
Christoph Lameter <cl@linux.com>,
stable@vger.kernel.org,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Andrew Morton <akpm@linux-foundation.org>,
Miao Xie <miaox@cn.fujitsu.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [PATCH 5/6] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
Date: Tue, 12 Jun 2012 15:20:12 +0100
Message-ID: <20120612142012.GB20467@suse.de>
In-Reply-To: <1339406250-10169-6-git-send-email-kosaki.motohiro@gmail.com>
On Mon, Jun 11, 2012 at 05:17:29AM -0400, kosaki.motohiro@gmail.com wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>
> commit cc9a6c8776 (cpuset: mm: reduce large amounts of memory barrier related
> damage v3) introduced a memory corruption.
>
Ouch. No biscuits for Mel.
> shmem_alloc_page() passes pseudo vma and it has one significant unique
> combination, vma->vm_ops=NULL and (vma->policy->flags & MPOL_F_SHARED).
>
> Now, get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL
> and mpol_cond_put() DOES decrease a policy ref when a policy has MPOL_F_SHARED.
> Therefore, when alloc_pages_vma() goes 'goto retry_cpuset' path, a policy
> refcount will be decreased too much and therefore it will make a memory corruption.
>
Yes, this is true. Hitting the bug requires that the cpuset is being
updated during the allocation, so it's not common, but it is real. I'm
surprised I did not hit this while I was running the cpuset stress test
that originally introduced [get|put]_mems_allowed().
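
To make the imbalance described above concrete, here is a small userspace
model of it. This is only a sketch, not kernel code: the toy_* names and the
plain integer refcount are hypothetical stand-ins for struct mempolicy,
get_vma_policy() and mpol_cond_put(), and the retry count stands in for
passes through 'goto retry_cpuset'.

```c
#include <stdbool.h>

/* Toy stand-in for struct mempolicy; not the kernel structure. */
struct toy_mempolicy {
	int refcnt;
	bool shared;	/* models MPOL_F_SHARED */
};

/*
 * Models the pre-patch get_vma_policy() when vma->vm_ops == NULL:
 * the policy is returned without taking a reference.
 */
static struct toy_mempolicy *toy_get_vma_policy(struct toy_mempolicy *vm_policy)
{
	return vm_policy;
}

/* Models mpol_cond_put(): drops a reference only for shared policies. */
static void toy_cond_put(struct toy_mempolicy *pol)
{
	if (pol->shared)
		pol->refcnt--;
}

/*
 * Models the allocation path: one normal pass plus 'retries' extra
 * passes caused by a concurrent cpuset update. Returns the refcount
 * left on the policy afterwards.
 */
static int toy_alloc_with_retries(struct toy_mempolicy *pol, int retries)
{
	int pass;

	for (pass = 0; pass <= retries; pass++) {
		struct toy_mempolicy *p = toy_get_vma_policy(pol);

		/* every pass drops a reference, but only one was taken */
		toy_cond_put(p);
	}
	return pol->refcnt;
}
```

Assume the policy starts with a count of 2: one reference held by the shared
policy tree and one taken by the lookup in shmem_alloc_page(). With no retry
the count ends at 1, which is balanced. With a single retry it ends at 0
while the tree still points at the policy.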
> This patch fixes it.
>
> Cc: Dave Jones <davej@redhat.com>,
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Christoph Lameter <cl@linux.com>,
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: <stable@vger.kernel.org>
> Cc: Miao Xie <miaox@cn.fujitsu.com>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Acked-by: Andi Kleen <ak@linux.intel.com>
> ---
> mm/mempolicy.c | 13 ++++++++++++-
> mm/shmem.c | 9 +++++----
> 2 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 7fb7d51..0da0969 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1544,18 +1544,29 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
> struct vm_area_struct *vma, unsigned long addr)
> {
> struct mempolicy *pol = task->mempolicy;
> + int got_ref = 0;
>
> if (vma) {
> if (vma->vm_ops && vma->vm_ops->get_policy) {
> struct mempolicy *vpol = vma->vm_ops->get_policy(vma,
> addr);
> - if (vpol)
> + if (vpol) {
> pol = vpol;
> + got_ref = 1;
> + }
> } else if (vma->vm_policy)
> pol = vma->vm_policy;
> }
> if (!pol)
> pol = &default_policy;
> +
> + /*
> + * shmem_alloc_page() passes MPOL_F_SHARED policy with vma->vm_ops=NULL.
> + * Thus, we need to take additional ref for avoiding refcount imbalance.
> + */
> + if (!got_ref && mpol_needs_cond_ref(pol))
> + mpol_get(pol);
> +
> return pol;
> }
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index d576b84..eb5f1eb 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -919,6 +919,7 @@ static struct page *shmem_alloc_page(gfp_t gfp,
> struct shmem_inode_info *info, pgoff_t index)
> {
> struct vm_area_struct pvma;
> + struct page *page;
>
> /* Create a pseudo vma that just contains the policy */
> pvma.vm_start = 0;
> @@ -926,10 +927,10 @@ static struct page *shmem_alloc_page(gfp_t gfp,
> pvma.vm_ops = NULL;
> pvma.vm_policy = mpol_shared_policy_lookup(&info->policy, index);
>
> - /*
> - * alloc_page_vma() will drop the shared policy reference
> - */
> - return alloc_page_vma(gfp, &pvma, 0);
> + page = alloc_page_vma(gfp, &pvma, 0);
> +
> + mpol_put(pvma.vm_policy);
> + return page;
> }
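
For reference, the accounting the two hunks above aim for can be modelled in
userspace as below. This is a sketch under assumptions: the model_* names are
hypothetical, the refcount is a plain int, and only the vm_ops == NULL path
used by shmem_alloc_page() is modelled.

```c
#include <stdbool.h>

/* Toy stand-in for struct mempolicy; not the kernel structure. */
struct model_policy {
	int refcnt;
	bool shared;	/* models MPOL_F_SHARED */
};

/*
 * Models the patched get_vma_policy() for vma->vm_ops == NULL:
 * a shared policy gets an extra reference so that the later
 * conditional put is balanced.
 */
static struct model_policy *model_get_vma_policy(struct model_policy *vm_policy)
{
	if (vm_policy->shared)
		vm_policy->refcnt++;
	return vm_policy;
}

/* Models mpol_cond_put(): drops a reference only for shared policies. */
static void model_cond_put(struct model_policy *pol)
{
	if (pol->shared)
		pol->refcnt--;
}

/*
 * Models the patched shmem_alloc_page(): the lookup takes a reference,
 * the allocation may loop 'retries' extra times, and the caller drops
 * the lookup reference itself. Returns the refcount afterwards.
 */
static int model_shmem_alloc(struct model_policy *pol, int retries)
{
	int pass;

	pol->refcnt++;			/* mpol_shared_policy_lookup() */
	for (pass = 0; pass <= retries; pass++) {
		struct model_policy *p = model_get_vma_policy(pol);

		model_cond_put(p);	/* balanced within each pass */
	}
	pol->refcnt--;			/* mpol_put(pvma.vm_policy) */
	return pol->refcnt;
}
```

In this model the count returns to its starting value regardless of how many
retries happen, because each get is paired with a put inside the same pass.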
Why does dequeue_huge_page_vma() not need to be changed as well? It's
currently using mpol_cond_put(), but if there is a goto retry_cpuset then
will it not take an additional reference and leak it?

Would it be more straightforward to put the mpol_cond_put() and __mpol_put()
calls after the "goto retry_cpuset" checks instead?
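
A rough userspace sketch of the alternative being asked about: the put is
moved below the retry check so it runs once per allocation rather than once
per pass. The sketch_* names are hypothetical and the retry condition is
simulated with a counter; whether this ordering is safe in the real
allocator is exactly the open question, so treat it as an illustration only.

```c
#include <stdbool.h>

/* Toy stand-in for struct mempolicy; not the kernel structure. */
struct sketch_policy {
	int refcnt;
	bool shared;	/* models MPOL_F_SHARED */
};

/* Models mpol_cond_put(): drops a reference only for shared policies. */
static void sketch_cond_put(struct sketch_policy *pol)
{
	if (pol->shared)
		pol->refcnt--;
}

/*
 * Models an allocation loop where the policy is looked up without a
 * new reference on each pass and the put happens only after the code
 * has decided not to retry. Returns the refcount afterwards.
 */
static int sketch_alloc_put_after_retry(struct sketch_policy *pol, int retries)
{
	int pass = 0;

retry_cpuset:
	/* policy lookup and page allocation would happen here */

	if (pass++ < retries)
		goto retry_cpuset;	/* cpuset changed: retry, nothing dropped yet */

	sketch_cond_put(pol);		/* reached exactly once */
	return pol->refcnt;
}
```

Starting from a count of 2 (tree reference plus lookup reference), this
model ends at 1 whether or not a retry occurs.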
> #else /* !CONFIG_NUMA */
> #ifdef CONFIG_TMPFS
> --
> 1.7.1
>
--
Mel Gorman
SUSE Labs