All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Andi Kleen <ak@linux.intel.com>,
	Shaohua Li <shaohua.li@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	David Miller <davem@davemloft.net>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Russell King <rmk@arm.linux.org.uk>,
	Paul Mundt <lethal@linux-sh.org>, Jeff Dike <jdike@addtoit.com>,
	Richard Weinberger <richard@nod.at>,
	"Luck, Tony" <tony.luck@intel.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Mel Gorman <mel@csn.ul.ie>, Nick Piggin <npiggin@kernel.dk>,
	Namhyung Kim <namhyung@gmail.com>,
	"Shi, Alex" <alex.shi@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>
Subject: [PATCH] mm, memory-failure: Fix spinlock vs mutex order
Date: Fri, 17 Jun 2011 21:53:05 +0200	[thread overview]
Message-ID: <1308340385.12801.101.camel@laptop> (raw)
In-Reply-To: <1308335557.12801.24.camel@laptop>

On Fri, 2011-06-17 at 20:32 +0200, Peter Zijlstra wrote:
> Aside from the THP thing there's a user in memory-failure.c, which looks
> to be broken as it is because its calling things under tasklist_lock
> which isn't preemptible, but it looks like we can simply swap the
> tasklist_lock vs page_lock_anon_vma.
> 

I thought about maybe using rcu, but then thought the thing is probably
wanting to exclude new tasks as it wants to kill all mm users.

---
Subject: mm, memory-failure: Fix spinlock vs mutex order

We cannot take a mutex while holding a spinlock, so flip the order as
its documented to be random.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 mm/memory-failure.c |   21 ++++++---------------
 mm/rmap.c           |    5 ++---
 2 files changed, 8 insertions(+), 18 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index eac0ba5..740c4f5 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -391,10 +391,11 @@ static void collect_procs_anon(struct page *page, struct list_head *to_kill,
 	struct task_struct *tsk;
 	struct anon_vma *av;
 
-	read_lock(&tasklist_lock);
 	av = page_lock_anon_vma(page);
 	if (av == NULL)	/* Not actually mapped anymore */
-		goto out;
+		return;
+
+	read_lock(&tasklist_lock);
 	for_each_process (tsk) {
 		struct anon_vma_chain *vmac;
 
@@ -408,9 +409,8 @@ static void collect_procs_anon(struct page *page, struct list_head *to_kill,
 				add_to_kill(tsk, page, vma, to_kill, tkc);
 		}
 	}
-	page_unlock_anon_vma(av);
-out:
 	read_unlock(&tasklist_lock);
+	page_unlock_anon_vma(av);
 }
 
 /*
@@ -424,17 +424,8 @@ static void collect_procs_file(struct page *page, struct list_head *to_kill,
 	struct prio_tree_iter iter;
 	struct address_space *mapping = page->mapping;
 
-	/*
-	 * A note on the locking order between the two locks.
-	 * We don't rely on this particular order.
-	 * If you have some other code that needs a different order
-	 * feel free to switch them around. Or add a reverse link
-	 * from mm_struct to task_struct, then this could be all
-	 * done without taking tasklist_lock and looping over all tasks.
-	 */
-
-	read_lock(&tasklist_lock);
 	mutex_lock(&mapping->i_mmap_mutex);
+	read_lock(&tasklist_lock);
 	for_each_process(tsk) {
 		pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
 
@@ -454,8 +445,8 @@ static void collect_procs_file(struct page *page, struct list_head *to_kill,
 				add_to_kill(tsk, page, vma, to_kill, tkc);
 		}
 	}
-	mutex_unlock(&mapping->i_mmap_mutex);
 	read_unlock(&tasklist_lock);
+	mutex_unlock(&mapping->i_mmap_mutex);
 }
 
 /*
diff --git a/mm/rmap.c b/mm/rmap.c
index 0eb463e..5e51855 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -38,9 +38,8 @@
  *                           in arch-dependent flush_dcache_mmap_lock,
  *                           within inode_wb_list_lock in __sync_single_inode)
  *
- * (code doesn't rely on that order so it could be switched around)
- * ->tasklist_lock
- *   anon_vma->mutex      (memory_failure, collect_procs_anon)
+ * anon_vma->mutex,mapping->i_mutex      (memory_failure, collect_procs_anon)
+ *   ->tasklist_lock
  *     pte map lock
  */
 



WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Andi Kleen <ak@linux.intel.com>,
	Shaohua Li <shaohua.li@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	David Miller <davem@davemloft.net>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Russell King <rmk@arm.linux.org.uk>,
	Paul Mundt <lethal@linux-sh.org>, Jeff Dike <jdike@addtoit.com>,
	Richard Weinberger <richard@nod.at>,
	"Luck, Tony" <tony.luck@intel.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Mel Gorman <mel@csn.ul.ie>, Nick Piggin <npiggin@kernel.dk>,
	Namhyung Kim <namhyung@gmail.com>,
	"Shi, Alex" <alex.shi@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>
Subject: [PATCH] mm, memory-failure: Fix spinlock vs mutex order
Date: Fri, 17 Jun 2011 21:53:05 +0200	[thread overview]
Message-ID: <1308340385.12801.101.camel@laptop> (raw)
In-Reply-To: <1308335557.12801.24.camel@laptop>

On Fri, 2011-06-17 at 20:32 +0200, Peter Zijlstra wrote:
> Aside from the THP thing there's a user in memory-failure.c, which looks
> to be broken as it is because its calling things under tasklist_lock
> which isn't preemptible, but it looks like we can simply swap the
> tasklist_lock vs page_lock_anon_vma.
> 

I thought about maybe using rcu, but then thought the thing is probably
wanting to exclude new tasks as it wants to kill all mm users.

---
Subject: mm, memory-failure: Fix spinlock vs mutex order

We cannot take a mutex while holding a spinlock, so flip the order as
its documented to be random.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 mm/memory-failure.c |   21 ++++++---------------
 mm/rmap.c           |    5 ++---
 2 files changed, 8 insertions(+), 18 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index eac0ba5..740c4f5 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -391,10 +391,11 @@ static void collect_procs_anon(struct page *page, struct list_head *to_kill,
 	struct task_struct *tsk;
 	struct anon_vma *av;
 
-	read_lock(&tasklist_lock);
 	av = page_lock_anon_vma(page);
 	if (av == NULL)	/* Not actually mapped anymore */
-		goto out;
+		return;
+
+	read_lock(&tasklist_lock);
 	for_each_process (tsk) {
 		struct anon_vma_chain *vmac;
 
@@ -408,9 +409,8 @@ static void collect_procs_anon(struct page *page, struct list_head *to_kill,
 				add_to_kill(tsk, page, vma, to_kill, tkc);
 		}
 	}
-	page_unlock_anon_vma(av);
-out:
 	read_unlock(&tasklist_lock);
+	page_unlock_anon_vma(av);
 }
 
 /*
@@ -424,17 +424,8 @@ static void collect_procs_file(struct page *page, struct list_head *to_kill,
 	struct prio_tree_iter iter;
 	struct address_space *mapping = page->mapping;
 
-	/*
-	 * A note on the locking order between the two locks.
-	 * We don't rely on this particular order.
-	 * If you have some other code that needs a different order
-	 * feel free to switch them around. Or add a reverse link
-	 * from mm_struct to task_struct, then this could be all
-	 * done without taking tasklist_lock and looping over all tasks.
-	 */
-
-	read_lock(&tasklist_lock);
 	mutex_lock(&mapping->i_mmap_mutex);
+	read_lock(&tasklist_lock);
 	for_each_process(tsk) {
 		pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
 
@@ -454,8 +445,8 @@ static void collect_procs_file(struct page *page, struct list_head *to_kill,
 				add_to_kill(tsk, page, vma, to_kill, tkc);
 		}
 	}
-	mutex_unlock(&mapping->i_mmap_mutex);
 	read_unlock(&tasklist_lock);
+	mutex_unlock(&mapping->i_mmap_mutex);
 }
 
 /*
diff --git a/mm/rmap.c b/mm/rmap.c
index 0eb463e..5e51855 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -38,9 +38,8 @@
  *                           in arch-dependent flush_dcache_mmap_lock,
  *                           within inode_wb_list_lock in __sync_single_inode)
  *
- * (code doesn't rely on that order so it could be switched around)
- * ->tasklist_lock
- *   anon_vma->mutex      (memory_failure, collect_procs_anon)
+ * anon_vma->mutex,mapping->i_mutex      (memory_failure, collect_procs_anon)
+ *   ->tasklist_lock
  *     pte map lock
  */
 


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2011-06-17 19:49 UTC|newest]

Thread overview: 166+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-15  0:29 REGRESSION: Performance regressions from switching anon_vma->lock to mutex Tim Chen
2011-06-15  0:29 ` Tim Chen
2011-06-15  0:36 ` Andi Kleen
2011-06-15  0:36   ` Andi Kleen
2011-06-17 19:07   ` Ingo Molnar
2011-06-17 19:07     ` Ingo Molnar
2011-06-15  1:21 ` Linus Torvalds
2011-06-15  1:21   ` Linus Torvalds
2011-06-15  3:42   ` Linus Torvalds
2011-06-15  1:26 ` Shaohua Li
2011-06-15  1:26   ` Shaohua Li
2011-06-15 11:52   ` Peter Zijlstra
2011-06-15 11:52     ` Peter Zijlstra
2011-06-15 12:49     ` Peter Zijlstra
2011-06-15 12:49       ` Peter Zijlstra
2011-06-15 16:18     ` Andi Kleen
2011-06-15 16:18       ` Andi Kleen
2011-06-15 16:45       ` Peter Zijlstra
2011-06-15 16:45         ` Peter Zijlstra
2011-06-15 16:47         ` Andi Kleen
2011-06-15 16:47           ` Andi Kleen
2011-06-15 18:43         ` Tim Chen
2011-06-15 18:43           ` Tim Chen
2011-06-15 20:32           ` Peter Zijlstra
2011-06-15 20:32             ` Peter Zijlstra
2011-06-15 20:57             ` Andi Kleen
2011-06-15 20:57               ` Andi Kleen
2011-06-15 21:12               ` Tim Chen
2011-06-15 21:12                 ` Tim Chen
2011-06-15 21:37                 ` Peter Zijlstra
2011-06-15 21:37                   ` Peter Zijlstra
2011-06-15 21:51                   ` Linus Torvalds
2011-06-15 21:51                     ` Linus Torvalds
2011-06-15 22:19                     ` Andi Kleen
2011-06-15 22:19                       ` Andi Kleen
2011-06-16  0:16                       ` Linus Torvalds
2011-06-16  0:16                         ` Linus Torvalds
2011-06-16 20:14                         ` Andi Kleen
2011-06-16 20:14                           ` Andi Kleen
2011-06-16 20:37                           ` Linus Torvalds
2011-06-16 20:37                             ` Linus Torvalds
2011-06-17  0:24                             ` Andi Kleen
2011-06-17  9:13                               ` Ingo Molnar
2011-06-17  9:13                                 ` Ingo Molnar
2011-06-15 22:15                   ` Andi Kleen
2011-06-15 22:15                     ` Andi Kleen
2011-06-16  1:08                   ` Tim Chen
2011-06-16  1:08                     ` Tim Chen
2011-06-16  1:50                   ` Linus Torvalds
2011-06-16  1:50                     ` Linus Torvalds
2011-06-16 20:26                     ` Tim Chen
2011-06-16 20:26                       ` Tim Chen
2011-06-16 20:47                       ` Linus Torvalds
2011-06-16 20:47                         ` Linus Torvalds
2011-06-16 21:05                         ` Linus Torvalds
2011-06-16 21:05                           ` Linus Torvalds
2011-06-16 21:06                           ` Linus Torvalds
2011-06-16 21:26                             ` Linus Torvalds
2011-06-16 21:26                               ` Linus Torvalds
2011-06-17  3:58                               ` Linus Torvalds
2011-06-17 11:28                                 ` Peter Zijlstra
2011-06-17 11:28                                   ` Peter Zijlstra
2011-06-17 11:54                                   ` Peter Zijlstra
2011-06-17 11:54                                     ` Peter Zijlstra
2011-06-17 16:36                                   ` Linus Torvalds
2011-06-17 16:36                                     ` Linus Torvalds
2011-06-17 17:41                                     ` Hugh Dickins
2011-06-17 17:41                                       ` Hugh Dickins
2011-06-17 17:55                                       ` Peter Zijlstra
2011-06-17 17:55                                         ` Peter Zijlstra
2011-06-17 18:01                                       ` Linus Torvalds
2011-06-17 18:01                                         ` Linus Torvalds
2011-06-17 18:18                                         ` Peter Zijlstra
2011-06-17 18:18                                           ` Peter Zijlstra
2011-06-17 18:32                                           ` Peter Zijlstra
2011-06-17 18:32                                             ` Peter Zijlstra
2011-06-17 18:39                                             ` Linus Torvalds
2011-06-17 18:41                                               ` Linus Torvalds
2011-06-17 18:41                                                 ` Linus Torvalds
2011-06-17 20:19                                               ` Tim Chen
2011-06-17 20:19                                                 ` Tim Chen
2011-06-17 22:20                                               ` Hugh Dickins
2011-06-17 22:20                                                 ` Hugh Dickins
2011-06-18  4:47                                                 ` Linus Torvalds
2011-06-18  4:47                                                   ` Linus Torvalds
2011-06-17 19:53                                             ` Peter Zijlstra [this message]
2011-06-17 19:53                                               ` [PATCH] mm, memory-failure: Fix spinlock vs mutex order Peter Zijlstra
2011-06-17 20:04                                               ` Andi Kleen
2011-06-17 20:04                                                 ` Andi Kleen
2011-06-17 16:46                                   ` REGRESSION: Performance regressions from switching anon_vma->lock to mutex Linus Torvalds
2011-06-17 16:46                                     ` Linus Torvalds
2011-06-17 17:28                                     ` Linus Torvalds
2011-06-17 19:40                                     ` Andi Kleen
2011-06-17 19:40                                       ` Andi Kleen
2011-06-18  8:08                                       ` Ingo Molnar
2011-06-18  8:08                                         ` Ingo Molnar
2011-06-17 18:22                                 ` Tim Chen
2011-06-17 18:22                                   ` Tim Chen
2011-06-17 19:05                                   ` Ray Lee
2011-06-17 19:05                                     ` Ray Lee
2011-06-16 22:00                           ` Andi Kleen
2011-06-16 22:00                             ` Andi Kleen
2011-06-15 10:36 ` Peter Zijlstra
2011-06-15 10:36   ` Peter Zijlstra
2011-06-15 10:58   ` Peter Zijlstra
2011-06-15 10:58     ` Peter Zijlstra
2011-06-15 11:41     ` Peter Zijlstra
2011-06-15 11:41       ` Peter Zijlstra
2011-06-15 19:11     ` Linus Torvalds
2011-06-15 19:11       ` Linus Torvalds
2011-06-15 19:24       ` Andrew Morton
2011-06-15 19:24         ` Andrew Morton
2011-06-15 20:16         ` Ingo Molnar
2011-06-15 20:16           ` Ingo Molnar
2011-06-15 20:55           ` Linus Torvalds
2011-06-15 20:55             ` Linus Torvalds
2011-06-15 20:12       ` [GIT PULL] " Ingo Molnar
2011-06-15 20:12         ` Ingo Molnar
2011-06-15 20:29         ` Paul E. McKenney
2011-06-15 20:29           ` Paul E. McKenney
2011-06-15 20:47           ` Linus Torvalds
2011-06-15 20:47             ` Linus Torvalds
2011-06-15 20:54             ` Paul E. McKenney
2011-06-15 20:54               ` Paul E. McKenney
2011-06-15 21:05         ` Linus Torvalds
2011-06-15 21:05           ` Linus Torvalds
2011-06-15 21:15           ` Paul E. McKenney
2011-06-15 21:15             ` Paul E. McKenney
2011-06-15 21:27             ` Linus Torvalds
2011-06-15 21:27               ` Linus Torvalds
2011-06-16  7:03           ` Ingo Molnar
2011-06-16  7:03             ` Ingo Molnar
2011-06-16 17:16             ` Paul E. McKenney
2011-06-16 17:16               ` Paul E. McKenney
2011-06-16 20:25               ` Ingo Molnar
2011-06-16 20:25                 ` Ingo Molnar
2011-06-16 21:01                 ` Frederic Weisbecker
2011-06-16 21:01                   ` Frederic Weisbecker
2011-06-16 23:02                   ` Ingo Molnar
2011-06-16 23:02                     ` Ingo Molnar
2011-06-17 15:19                     ` Frederic Weisbecker
2011-06-17 15:19                       ` Frederic Weisbecker
2011-06-16 21:02                 ` Andi Kleen
2011-06-16 21:02                   ` Andi Kleen
2011-06-16 22:21                 ` Benjamin Herrenschmidt
2011-06-16 22:21                   ` Benjamin Herrenschmidt
2011-06-16 22:38                   ` Ingo Molnar
2011-06-16 22:38                     ` Ingo Molnar
2011-06-16 22:47                     ` Andi Kleen
2011-06-16 22:47                       ` Andi Kleen
2011-06-16 22:58                       ` Ingo Molnar
2011-06-16 22:58                         ` Ingo Molnar
2011-06-17  0:45                         ` Paul E. McKenney
2011-06-17  0:45                           ` Paul E. McKenney
2011-06-17  9:43                           ` Ingo Molnar
2011-06-17  9:43                             ` Ingo Molnar
2011-06-17 16:48                             ` Paul E. McKenney
2011-06-17 16:48                               ` Paul E. McKenney
2011-06-16 23:37                 ` Paul E. McKenney
2011-06-16 23:37                   ` Paul E. McKenney
2011-06-15 20:13       ` Tim Chen
2011-06-15 20:13         ` Tim Chen
2011-06-15 20:17         ` Ingo Molnar
2011-06-15 20:17           ` Ingo Molnar
2011-06-15 20:21           ` Tim Chen
2011-06-15 20:21             ` Tim Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1308340385.12801.101.camel@laptop \
    --to=peterz@infradead.org \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.shi@intel.com \
    --cc=benh@kernel.crashing.org \
    --cc=davem@davemloft.net \
    --cc=hughd@google.com \
    --cc=jdike@addtoit.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=lethal@linux-sh.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=namhyung@gmail.com \
    --cc=npiggin@kernel.dk \
    --cc=richard@nod.at \
    --cc=rjw@sisk.pl \
    --cc=rmk@arm.linux.org.uk \
    --cc=schwidefsky@de.ibm.com \
    --cc=shaohua.li@intel.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=tony.luck@intel.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.