From: Rik van Riel <riel@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>,
Andrea Arcangeli <aarcange@redhat.com>,
Minchan Kim <minchan.kim@gmail.com>,
Linux-MM <linux-mm@kvack.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
LKML <linux-kernel@vger.kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Subject: [PATCH 4/5] always lock the root (oldest) anon_vma
Date: Wed, 26 May 2010 11:27:06 -0400
Message-ID: <20100526112706.145f72eb@annuminas.surriel.com>
In-Reply-To: <20100526112403.635be0ed@annuminas.surriel.com>
Subject: always lock the root (oldest) anon_vma
Always (and only) lock the root (oldest) anon_vma whenever we do something in an
anon_vma. The recently introduced anon_vma scalability is due to the rmap code
scanning only the VMAs that need to be scanned. Many common operations still
took the anon_vma lock on the root anon_vma, so always taking that lock is not
expected to introduce any scalability issues.
However, always taking the same lock does mean we only need to take one lock.
As a result, rmap_walk on pages from any anon_vma in the vma cannot run during
an munmap, expand_stack or any other operation that needs to exclude rmap_walk
and similar functions.
Also add the proper locking to vma_adjust.
Signed-off-by: Rik van Riel <riel@redhat.com>
---
v3:
- fix locking inversion in vma_adjust, spotted by Andrea
- mm_take_all_locks needs to use the bit flag in the root anon_vma,
since that is the one that gets locked (Andrea Arcangeli)
v2:
- conditionally take the anon_vma lock in vma_adjust, as introduced
in commit 252c5f94d944487e9f50ece7942b0fbf659c5c31 (with a proper comment)
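For readers who want to see the locking rule from the changelog in isolation,
here is a minimal userspace sketch. It is not kernel code: the toy_* names,
the C11 atomic_flag spinlock and the init helper are all assumptions made for
illustration. The only thing taken from the patch is that every lock and
unlock goes through ->root, so all anon_vmas in one tree serialize on a
single lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel spinlock, built on a C11 atomic_flag. */
struct toy_spinlock {
	atomic_flag held;
};

static inline void toy_spin_init(struct toy_spinlock *l)
{
	atomic_flag_clear(&l->held);
}

static inline void toy_spin_lock(struct toy_spinlock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;	/* busy-wait until the holder clears the flag */
}

/* Returns true if the lock was free and is now held by the caller. */
static inline bool toy_spin_trylock(struct toy_spinlock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static inline void toy_spin_unlock(struct toy_spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical model of struct anon_vma: only root and lock are kept. */
struct toy_anon_vma {
	struct toy_anon_vma *root;	/* oldest anon_vma; a root points at itself */
	struct toy_spinlock lock;	/* only the root's copy is ever taken */
};

/* A child inherits its parent's root; a parentless anon_vma is its own root. */
static inline void toy_anon_vma_init(struct toy_anon_vma *av,
				     struct toy_anon_vma *parent)
{
	assert(av != NULL);
	toy_spin_init(&av->lock);
	av->root = parent ? parent->root : av;
}

/* Mirrors the patched anon_vma_lock(): always lock the root. */
static inline void toy_anon_vma_lock(struct toy_anon_vma *av)
{
	toy_spin_lock(&av->root->lock);
}

/* Mirrors the patched anon_vma_unlock(): always unlock the root. */
static inline void toy_anon_vma_unlock(struct toy_anon_vma *av)
{
	toy_spin_unlock(&av->root->lock);
}
```

Locking through a grandchild and unlocking through a child is balanced in
this model, because both calls resolve to the same root lock. The per-object
lock of a non-root anon_vma is never touched.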
include/linux/rmap.h | 8 ++++----
mm/ksm.c | 2 +-
mm/migrate.c | 2 +-
mm/mmap.c | 30 ++++++++++++++++++++++--------
4 files changed, 28 insertions(+), 14 deletions(-)
Index: linux-2.6.34/include/linux/rmap.h
===================================================================
--- linux-2.6.34.orig/include/linux/rmap.h
+++ linux-2.6.34/include/linux/rmap.h
@@ -104,24 +104,24 @@ static inline void vma_lock_anon_vma(str
{
struct anon_vma *anon_vma = vma->anon_vma;
if (anon_vma)
- spin_lock(&anon_vma->lock);
+ spin_lock(&anon_vma->root->lock);
}
static inline void vma_unlock_anon_vma(struct vm_area_struct *vma)
{
struct anon_vma *anon_vma = vma->anon_vma;
if (anon_vma)
- spin_unlock(&anon_vma->lock);
+ spin_unlock(&anon_vma->root->lock);
}
static inline void anon_vma_lock(struct anon_vma *anon_vma)
{
- spin_lock(&anon_vma->lock);
+ spin_lock(&anon_vma->root->lock);
}
static inline void anon_vma_unlock(struct anon_vma *anon_vma)
{
- spin_unlock(&anon_vma->lock);
+ spin_unlock(&anon_vma->root->lock);
}
/*
Index: linux-2.6.34/mm/ksm.c
===================================================================
--- linux-2.6.34.orig/mm/ksm.c
+++ linux-2.6.34/mm/ksm.c
@@ -325,7 +325,7 @@ static void drop_anon_vma(struct rmap_it
{
struct anon_vma *anon_vma = rmap_item->anon_vma;
- if (atomic_dec_and_lock(&anon_vma->external_refcount, &anon_vma->lock)) {
+ if (atomic_dec_and_lock(&anon_vma->external_refcount, &anon_vma->root->lock)) {
int empty = list_empty(&anon_vma->head);
anon_vma_unlock(anon_vma);
if (empty)
Index: linux-2.6.34/mm/mmap.c
===================================================================
--- linux-2.6.34.orig/mm/mmap.c
+++ linux-2.6.34/mm/mmap.c
@@ -506,6 +506,7 @@ int vma_adjust(struct vm_area_struct *vm
struct vm_area_struct *importer = NULL;
struct address_space *mapping = NULL;
struct prio_tree_root *root = NULL;
+ struct anon_vma *anon_vma = NULL;
struct file *file = vma->vm_file;
long adjust_next = 0;
int remove_next = 0;
@@ -578,6 +579,17 @@ again: remove_next = 1 + (end > next->
}
}
+ /*
+ * When changing only vma->vm_end, we don't really need anon_vma
+ * lock. This is a fairly rare case by itself, but the anon_vma
+ * lock may be shared between many sibling processes. Skipping
+ * the lock for brk adjustments makes a difference sometimes.
+ */
+ if (vma->anon_vma && (insert || importer || start != vma->vm_start)) {
+ anon_vma = vma->anon_vma;
+ anon_vma_lock(anon_vma);
+ }
+
if (root) {
flush_dcache_mmap_lock(mapping);
vma_prio_tree_remove(vma, root);
@@ -617,6 +629,8 @@ again: remove_next = 1 + (end > next->
__insert_vm_struct(mm, insert);
}
+ if (anon_vma)
+ anon_vma_unlock(anon_vma);
if (mapping)
spin_unlock(&mapping->i_mmap_lock);
@@ -2466,23 +2480,23 @@ static DEFINE_MUTEX(mm_all_locks_mutex);
static void vm_lock_anon_vma(struct mm_struct *mm, struct anon_vma *anon_vma)
{
- if (!test_bit(0, (unsigned long *) &anon_vma->head.next)) {
+ if (!test_bit(0, (unsigned long *) &anon_vma->root->head.next)) {
/*
* The LSB of head.next can't change from under us
* because we hold the mm_all_locks_mutex.
*/
- spin_lock_nest_lock(&anon_vma->lock, &mm->mmap_sem);
+ spin_lock_nest_lock(&anon_vma->root->lock, &mm->mmap_sem);
/*
* We can safely modify head.next after taking the
- * anon_vma->lock. If some other vma in this mm shares
+ * anon_vma->root->lock. If some other vma in this mm shares
* the same anon_vma we won't take it again.
*
* No need of atomic instructions here, head.next
* can't change from under us thanks to the
- * anon_vma->lock.
+ * anon_vma->root->lock.
*/
if (__test_and_set_bit(0, (unsigned long *)
- &anon_vma->head.next))
+ &anon_vma->root->head.next))
BUG();
}
}
@@ -2573,7 +2587,7 @@ out_unlock:
static void vm_unlock_anon_vma(struct anon_vma *anon_vma)
{
- if (test_bit(0, (unsigned long *) &anon_vma->head.next)) {
+ if (test_bit(0, (unsigned long *) &anon_vma->root->head.next)) {
/*
* The LSB of head.next can't change to 0 from under
* us because we hold the mm_all_locks_mutex.
@@ -2584,10 +2598,10 @@ static void vm_unlock_anon_vma(struct an
*
* No need of atomic instructions here, head.next
* can't change from under us until we release the
- * anon_vma->lock.
+ * anon_vma->root->lock.
*/
if (!__test_and_clear_bit(0, (unsigned long *)
- &anon_vma->head.next))
+ &anon_vma->root->head.next))
BUG();
anon_vma_unlock(anon_vma);
}
Index: linux-2.6.34/mm/migrate.c
===================================================================
--- linux-2.6.34.orig/mm/migrate.c
+++ linux-2.6.34/mm/migrate.c
@@ -682,7 +682,7 @@ skip_unmap:
rcu_unlock:
/* Drop an anon_vma reference if we took one */
- if (anon_vma && atomic_dec_and_lock(&anon_vma->external_refcount, &anon_vma->lock)) {
+ if (anon_vma && atomic_dec_and_lock(&anon_vma->external_refcount, &anon_vma->root->lock)) {
int empty = list_empty(&anon_vma->head);
anon_vma_unlock(anon_vma);
if (empty)
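The vm_lock_anon_vma()/vm_unlock_anon_vma() hunks in mm/mmap.c above move the
"already locked" marker to the root as well. The sketch below is a userspace
illustration of why, not the kernel implementation: the sketch_* names are
hypothetical, the marker is a separate "mark" field rather than the LSB of
head.next, and the caller is assumed to be single-threaded, which stands in
for holding mm_all_locks_mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of struct anon_vma for the take-all-locks path. */
struct sketch_anon_vma {
	struct sketch_anon_vma *root;	/* a root points at itself */
	atomic_flag lock;		/* stand-in for the spinlock */
	unsigned long mark;		/* bit 0 stands in for the LSB of head.next */
};

static inline void sketch_init(struct sketch_anon_vma *av,
			       struct sketch_anon_vma *parent)
{
	atomic_flag_clear(&av->lock);
	av->mark = 0;
	av->root = parent ? parent->root : av;
}

/*
 * Lock the root once, no matter how many anon_vmas in the tree are visited.
 * Returns true only if this call actually took the lock.
 */
static inline bool sketch_vm_lock_anon_vma(struct sketch_anon_vma *av)
{
	struct sketch_anon_vma *root = av->root;

	if (root->mark & 1UL)
		return false;	/* root already locked via another anon_vma */

	while (atomic_flag_test_and_set_explicit(&root->lock,
						 memory_order_acquire))
		;	/* busy-wait */

	/* Plays the role of BUG() in the patch: the bit must have been clear. */
	assert(!(root->mark & 1UL));
	root->mark |= 1UL;
	return true;
}

/* Drop the root lock once; later visits to the same tree are no-ops. */
static inline bool sketch_vm_unlock_anon_vma(struct sketch_anon_vma *av)
{
	struct sketch_anon_vma *root = av->root;

	if (!(root->mark & 1UL))
		return false;	/* already unlocked via another anon_vma */

	root->mark &= ~1UL;
	atomic_flag_clear_explicit(&root->lock, memory_order_release);
	return true;
}
```

If the marker were kept in each anon_vma instead of the root, two anon_vmas
sharing a root would both look unmarked, and the second visit would spin
forever on a lock the same caller already holds. Testing and setting the bit
in the root avoids that.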