From: Wu Fengguang <fengguang.wu@intel.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>,
	"hugh.dickins@tiscali.co.uk" <hugh.dickins@tiscali.co.uk>,
	"riel@redhat.com" <riel@redhat.com>,
	"chris.mason@oracle.com" <chris.mason@oracle.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: [PATCH] HWPOISON: fix tasklist_lock/anon_vma locking order
Date: Wed, 10 Jun 2009 11:10:00 +0800
Message-ID: <20090610031000.GD6597@localhost>
In-Reply-To: <20090609100922.GF14820@wotan.suse.de>

On Tue, Jun 09, 2009 at 06:09:22PM +0800, Nick Piggin wrote:
> On Wed, Jun 03, 2009 at 08:46:47PM +0200, Andi Kleen wrote:
> 
> Why not have this in rmap.c and not export the locking?
> I don't know.. does Hugh care?

I don't know either :)

> > +/*
> > + * Collect processes when the error hit an anonymous page.
> > + */
> > +static void collect_procs_anon(struct page *page, struct list_head *to_kill,
> > +			      struct to_kill **tkc)
> > +{
> > +	struct vm_area_struct *vma;
> > +	struct task_struct *tsk;
> > +	struct anon_vma *av = page_lock_anon_vma(page);
> > +
> > +	if (av == NULL)	/* Not actually mapped anymore */
> > +		return;
> > +
> > +	read_lock(&tasklist_lock);
> > +	for_each_process (tsk) {
> > +		if (!tsk->mm)
> > +			continue;
> > +		list_for_each_entry (vma, &av->head, anon_vma_node) {
> > +			if (vma->vm_mm == tsk->mm)
> > +				add_to_kill(tsk, page, vma, to_kill, tkc);
> > +		}
> > +	}
> > +	page_unlock_anon_vma(av);
> > +	read_unlock(&tasklist_lock);
> > +}
> > +
> > +/*
> > + * Collect processes when the error hit a file mapped page.
> > + */
> > +static void collect_procs_file(struct page *page, struct list_head *to_kill,
> > +			      struct to_kill **tkc)
> > +{
> > +	struct vm_area_struct *vma;
> > +	struct task_struct *tsk;
> > +	struct prio_tree_iter iter;
> > +	struct address_space *mapping = page_mapping(page);
> > +
> > +	/*
> > +	 * A note on the locking order between the two locks.
> > +	 * We don't rely on this particular order.
> > +	 * If you have some other code that needs a different order
> > +	 * feel free to switch them around. Or add a reverse link
> > +	 * from mm_struct to task_struct, then this could be all
> > +	 * done without taking tasklist_lock and looping over all tasks.
> > +	 */
> > +
> > +	read_lock(&tasklist_lock);
> > +	spin_lock(&mapping->i_mmap_lock);
> 
> This still has my original complaint that it nests tasklist lock inside
> anon vma lock and outside inode mmap lock (and anon_vma nests inside i_mmap).
> I guess the property of our current rw locks means that does not matter,
> but it could if we had "fair" rw locks, or some tree (-rt tree maybe)
> changed rw lock to a plain exclusive lock.

Andi must have forgotten that, although he did change the comment on
locking order. This incremental patch aligns the code with his comment
in rmap.c.

---
HWPOISON: fix tasklist_lock/anon_vma locking order

Take tasklist_lock before the anon_vma lock to avoid a possible deadlock.
Proposed by Nick Piggin:

  You have tasklist_lock(R) nesting outside i_mmap_lock, and inside anon_vma
  lock. And anon_vma lock nests inside i_mmap_lock.

  This seems fragile. If rwlocks ever become FIFO or tasklist_lock changes
  type (maybe -rt kernels do it), then you could have a task holding
  anon_vma lock and waiting for tasklist_lock, and another holding tasklist
  lock and waiting for i_mmap_lock, and another holding i_mmap_lock and
  waiting for anon_vma lock.

CC: Nick Piggin <npiggin@suse.de>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/memory-failure.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- sound-2.6.orig/mm/memory-failure.c
+++ sound-2.6/mm/memory-failure.c
@@ -215,12 +215,14 @@ static void collect_procs_anon(struct pa
 {
 	struct vm_area_struct *vma;
 	struct task_struct *tsk;
-	struct anon_vma *av = page_lock_anon_vma(page);
+	struct anon_vma *av;
 
+	read_lock(&tasklist_lock);
+
+	av = page_lock_anon_vma(page);
 	if (av == NULL)	/* Not actually mapped anymore */
-		return;
+		goto out;
 
-	read_lock(&tasklist_lock);
 	for_each_process (tsk) {
 		if (!tsk->mm)
 			continue;
@@ -230,6 +232,7 @@ static void collect_procs_anon(struct pa
 		}
 	}
 	page_unlock_anon_vma(av);
+out:
 	read_unlock(&tasklist_lock);
 }
 
