All of lore.kernel.org
 help / color / mirror / Atom feed
From: Simon Jeons <simon.jeons@gmail.com>
To: Michel Lespinasse <walken@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Rik van Riel <riel@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: protect against concurrent vma expansion
Date: Wed, 19 Dec 2012 20:56:34 -0500	[thread overview]
Message-ID: <1355968594.1415.4.camel@kernel-VirtualBox> (raw)
In-Reply-To: <20121204144820.GA13916@google.com>

On Tue, 2012-12-04 at 06:48 -0800, Michel Lespinasse wrote:
> expand_stack() runs with a shared mmap_sem lock. Because of this, there
> could be multiple concurrent stack expansions in the same mm, which may
> cause problems in the vma gap update code.
> 
> I propose to solve this by taking the mm->page_table_lock around such vma
> expansions, in order to avoid the concurrency issue. We only have to worry
> about concurrent expand_stack() calls here, since we hold a shared mmap_sem
> lock and all vma modificaitons other than expand_stack() are done under
> an exclusive mmap_sem lock.

Hi Michel and Andrew,

One question.

I found that mainly callsite of expand_stack() is #PF, but it holds
mmap_sem each time before call expand_stack(), how can hold a *shared*
mmap_sem happen?

> 
> I previously tried to achieve the same effect by making sure all
> growable vmas in a given mm would share the same anon_vma, which we
> already lock here. However this turned out to be difficult - all of the
> schemes I tried for refcounting the growable anon_vma and clearing
> turned out ugly. So, I'm now proposing only the minimal fix.
> 
> The overhead of taking the page table lock during stack expansion is
> expected to be small: glibc doesn't use expandable stacks for the
> threads it creates, so having multiple growable stacks is actually
> uncommon and we don't expect the page table lock to get bounced
> between threads.
> 
> Signed-off-by: Michel Lespinasse <walken@google.com>
> 
> ---
>  mm/mmap.c |   28 ++++++++++++++++++++++++++++
>  1 files changed, 28 insertions(+), 0 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9ed3a06242a0..2b7d9e78a569 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2069,6 +2069,18 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
>  		if (vma->vm_pgoff + (size >> PAGE_SHIFT) >= vma->vm_pgoff) {
>  			error = acct_stack_growth(vma, size, grow);
>  			if (!error) {
> +				/*
> +				 * vma_gap_update() doesn't support concurrent
> +				 * updates, but we only hold a shared mmap_sem
> +				 * lock here, so we need to protect against
> +				 * concurrent vma expansions.
> +				 * vma_lock_anon_vma() doesn't help here, as
> +				 * we don't guarantee that all growable vmas
> +				 * in a mm share the same root anon vma.
> +				 * So, we reuse mm->page_table_lock to guard
> +				 * against concurrent vma expansions.
> +				 */
> +				spin_lock(&vma->vm_mm->page_table_lock);
>  				anon_vma_interval_tree_pre_update_vma(vma);
>  				vma->vm_end = address;
>  				anon_vma_interval_tree_post_update_vma(vma);
> @@ -2076,6 +2088,8 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
>  					vma_gap_update(vma->vm_next);
>  				else
>  					vma->vm_mm->highest_vm_end = address;
> +				spin_unlock(&vma->vm_mm->page_table_lock);
> +
>  				perf_event_mmap(vma);
>  			}
>  		}
> @@ -2126,11 +2140,25 @@ int expand_downwards(struct vm_area_struct *vma,
>  		if (grow <= vma->vm_pgoff) {
>  			error = acct_stack_growth(vma, size, grow);
>  			if (!error) {
> +				/*
> +				 * vma_gap_update() doesn't support concurrent
> +				 * updates, but we only hold a shared mmap_sem
> +				 * lock here, so we need to protect against
> +				 * concurrent vma expansions.
> +				 * vma_lock_anon_vma() doesn't help here, as
> +				 * we don't guarantee that all growable vmas
> +				 * in a mm share the same root anon vma.
> +				 * So, we reuse mm->page_table_lock to guard
> +				 * against concurrent vma expansions.
> +				 */
> +				spin_lock(&vma->vm_mm->page_table_lock);
>  				anon_vma_interval_tree_pre_update_vma(vma);
>  				vma->vm_start = address;
>  				vma->vm_pgoff -= grow;
>  				anon_vma_interval_tree_post_update_vma(vma);
>  				vma_gap_update(vma);
> +				spin_unlock(&vma->vm_mm->page_table_lock);
> +
>  				perf_event_mmap(vma);
>  			}
>  		}


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Simon Jeons <simon.jeons@gmail.com>
To: Michel Lespinasse <walken@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Rik van Riel <riel@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: protect against concurrent vma expansion
Date: Wed, 19 Dec 2012 20:56:34 -0500	[thread overview]
Message-ID: <1355968594.1415.4.camel@kernel-VirtualBox> (raw)
In-Reply-To: <20121204144820.GA13916@google.com>

On Tue, 2012-12-04 at 06:48 -0800, Michel Lespinasse wrote:
> expand_stack() runs with a shared mmap_sem lock. Because of this, there
> could be multiple concurrent stack expansions in the same mm, which may
> cause problems in the vma gap update code.
> 
> I propose to solve this by taking the mm->page_table_lock around such vma
> expansions, in order to avoid the concurrency issue. We only have to worry
> about concurrent expand_stack() calls here, since we hold a shared mmap_sem
> lock and all vma modificaitons other than expand_stack() are done under
> an exclusive mmap_sem lock.

Hi Michel and Andrew,

One question.

I found that mainly callsite of expand_stack() is #PF, but it holds
mmap_sem each time before call expand_stack(), how can hold a *shared*
mmap_sem happen?

> 
> I previously tried to achieve the same effect by making sure all
> growable vmas in a given mm would share the same anon_vma, which we
> already lock here. However this turned out to be difficult - all of the
> schemes I tried for refcounting the growable anon_vma and clearing
> turned out ugly. So, I'm now proposing only the minimal fix.
> 
> The overhead of taking the page table lock during stack expansion is
> expected to be small: glibc doesn't use expandable stacks for the
> threads it creates, so having multiple growable stacks is actually
> uncommon and we don't expect the page table lock to get bounced
> between threads.
> 
> Signed-off-by: Michel Lespinasse <walken@google.com>
> 
> ---
>  mm/mmap.c |   28 ++++++++++++++++++++++++++++
>  1 files changed, 28 insertions(+), 0 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9ed3a06242a0..2b7d9e78a569 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2069,6 +2069,18 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
>  		if (vma->vm_pgoff + (size >> PAGE_SHIFT) >= vma->vm_pgoff) {
>  			error = acct_stack_growth(vma, size, grow);
>  			if (!error) {
> +				/*
> +				 * vma_gap_update() doesn't support concurrent
> +				 * updates, but we only hold a shared mmap_sem
> +				 * lock here, so we need to protect against
> +				 * concurrent vma expansions.
> +				 * vma_lock_anon_vma() doesn't help here, as
> +				 * we don't guarantee that all growable vmas
> +				 * in a mm share the same root anon vma.
> +				 * So, we reuse mm->page_table_lock to guard
> +				 * against concurrent vma expansions.
> +				 */
> +				spin_lock(&vma->vm_mm->page_table_lock);
>  				anon_vma_interval_tree_pre_update_vma(vma);
>  				vma->vm_end = address;
>  				anon_vma_interval_tree_post_update_vma(vma);
> @@ -2076,6 +2088,8 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
>  					vma_gap_update(vma->vm_next);
>  				else
>  					vma->vm_mm->highest_vm_end = address;
> +				spin_unlock(&vma->vm_mm->page_table_lock);
> +
>  				perf_event_mmap(vma);
>  			}
>  		}
> @@ -2126,11 +2140,25 @@ int expand_downwards(struct vm_area_struct *vma,
>  		if (grow <= vma->vm_pgoff) {
>  			error = acct_stack_growth(vma, size, grow);
>  			if (!error) {
> +				/*
> +				 * vma_gap_update() doesn't support concurrent
> +				 * updates, but we only hold a shared mmap_sem
> +				 * lock here, so we need to protect against
> +				 * concurrent vma expansions.
> +				 * vma_lock_anon_vma() doesn't help here, as
> +				 * we don't guarantee that all growable vmas
> +				 * in a mm share the same root anon vma.
> +				 * So, we reuse mm->page_table_lock to guard
> +				 * against concurrent vma expansions.
> +				 */
> +				spin_lock(&vma->vm_mm->page_table_lock);
>  				anon_vma_interval_tree_pre_update_vma(vma);
>  				vma->vm_start = address;
>  				vma->vm_pgoff -= grow;
>  				anon_vma_interval_tree_post_update_vma(vma);
>  				vma_gap_update(vma);
> +				spin_unlock(&vma->vm_mm->page_table_lock);
> +
>  				perf_event_mmap(vma);
>  			}
>  		}



  reply	other threads:[~2012-12-20  2:19 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-01  6:56 [PATCH] mm: protect against concurrent vma expansion Michel Lespinasse
2012-12-01  6:56 ` Michel Lespinasse
2012-12-03 23:01 ` Andrew Morton
2012-12-03 23:01   ` Andrew Morton
2012-12-04  0:35   ` Michel Lespinasse
2012-12-04  0:35     ` Michel Lespinasse
2012-12-04  0:43     ` Andrew Morton
2012-12-04  0:43       ` Andrew Morton
2012-12-04 14:48       ` Michel Lespinasse
2012-12-04 14:48         ` Michel Lespinasse
2012-12-20  1:56         ` Simon Jeons [this message]
2012-12-20  1:56           ` Simon Jeons
2012-12-20  3:01           ` Michel Lespinasse
2012-12-20  3:01             ` Michel Lespinasse
2013-01-04  0:40             ` Simon Jeons
2013-01-04  0:40               ` Simon Jeons
2013-01-04  0:50               ` Michel Lespinasse
2013-01-04  0:50                 ` Michel Lespinasse
2013-01-04  1:18                 ` Simon Jeons
2013-01-04  1:18                   ` Simon Jeons
2013-01-04  2:49               ` Al Viro
2013-01-04  2:49                 ` Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1355968594.1415.4.camel@kernel-VirtualBox \
    --to=simon.jeons@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=riel@redhat.com \
    --cc=walken@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.