From: Nick Piggin <npiggin@suse.de>
To: Andi Kleen <andi@firstfloor.org>
Cc: hugh.dickins@tiscali.co.uk, riel@redhat.com,
chris.mason@oracle.com, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
fengguang.wu@intel.com
Subject: Re: [PATCH] [13/16] HWPOISON: The high level memory error handler in the VM v5
Date: Tue, 9 Jun 2009 12:09:22 +0200 [thread overview]
Message-ID: <20090609100922.GF14820@wotan.suse.de> (raw)
In-Reply-To: <20090603184648.2E2131D028F@basil.firstfloor.org>
On Wed, Jun 03, 2009 at 08:46:47PM +0200, Andi Kleen wrote:
Why not have this in rmap.c and not export the locking?
I don't know.. does Hugh care?
> +/*
> + * Collect processes when the error hit an anonymous page.
> + */
> +static void collect_procs_anon(struct page *page, struct list_head *to_kill,
> + struct to_kill **tkc)
> +{
> + struct vm_area_struct *vma;
> + struct task_struct *tsk;
> + struct anon_vma *av = page_lock_anon_vma(page);
> +
> + if (av == NULL) /* Not actually mapped anymore */
> + return;
> +
> + read_lock(&tasklist_lock);
> + for_each_process (tsk) {
> + if (!tsk->mm)
> + continue;
> + list_for_each_entry (vma, &av->head, anon_vma_node) {
> + if (vma->vm_mm == tsk->mm)
> + add_to_kill(tsk, page, vma, to_kill, tkc);
> + }
> + }
> + page_unlock_anon_vma(av);
> + read_unlock(&tasklist_lock);
> +}
> +
> +/*
> + * Collect processes when the error hit a file mapped page.
> + */
> +static void collect_procs_file(struct page *page, struct list_head *to_kill,
> + struct to_kill **tkc)
> +{
> + struct vm_area_struct *vma;
> + struct task_struct *tsk;
> + struct prio_tree_iter iter;
> + struct address_space *mapping = page_mapping(page);
> +
> + /*
> + * A note on the locking order between the two locks.
> + * We don't rely on this particular order.
> + * If you have some other code that needs a different order
> + * feel free to switch them around. Or add a reverse link
> + * from mm_struct to task_struct, then this could be all
> + * done without taking tasklist_lock and looping over all tasks.
> + */
> +
> + read_lock(&tasklist_lock);
> + spin_lock(&mapping->i_mmap_lock);
This still has my original complaint that it nests tasklist lock inside
anon vma lock and outside inode mmap lock (and anon_vma nests inside i_mmap).
I guess the property of our current rw locks means that does not matter,
but it could if we had "fair" rw locks, or some tree (-rt tree maybe)
changed rw lock to a plain exclusive lock.
> + for_each_process(tsk) {
> + pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
> +
> + if (!tsk->mm)
> + continue;
> +
> + vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff,
> + pgoff)
> + if (vma->vm_mm == tsk->mm)
> + add_to_kill(tsk, page, vma, to_kill, tkc);
> + }
> + spin_unlock(&mapping->i_mmap_lock);
> + read_unlock(&tasklist_lock);
> +}
> +
> +/*
> + * Collect the processes who have the corrupted page mapped to kill.
> + * This is done in two steps for locking reasons.
> + * First preallocate one tokill structure outside the spin locks,
> + * so that we can kill at least one process reasonably reliable.
> + */
> +static void collect_procs(struct page *page, struct list_head *tokill)
> +{
> + struct to_kill *tk;
> +
> + tk = kmalloc(sizeof(struct to_kill), GFP_KERNEL);
> + /* memory allocation failure is implicitly handled */
> + if (PageAnon(page))
> + collect_procs_anon(page, tokill, &tk);
> + else
> + collect_procs_file(page, tokill, &tk);
> + kfree(tk);
> +}
> Index: linux/mm/filemap.c
> ===================================================================
> --- linux.orig/mm/filemap.c 2009-06-03 19:37:38.000000000 +0200
> +++ linux/mm/filemap.c 2009-06-03 20:13:43.000000000 +0200
> @@ -105,6 +105,10 @@
> *
> * ->task->proc_lock
> * ->dcache_lock (proc_pid_lookup)
> + *
> + * (code doesn't rely on that order, so you could switch it around)
> + * ->tasklist_lock (memory_failure, collect_procs_ao)
> + * ->i_mmap_lock
> */
>
> /*
> Index: linux/mm/rmap.c
> ===================================================================
> --- linux.orig/mm/rmap.c 2009-06-03 19:37:38.000000000 +0200
> +++ linux/mm/rmap.c 2009-06-03 20:13:43.000000000 +0200
> @@ -36,6 +36,10 @@
> * mapping->tree_lock (widely used, in set_page_dirty,
> * in arch-dependent flush_dcache_mmap_lock,
> * within inode_lock in __sync_single_inode)
> + *
> + * (code doesn't rely on that order so it could be switched around)
> + * ->tasklist_lock
> + * anon_vma->lock (memory_failure, collect_procs_anon)
> */
>
> #include <linux/mm.h>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-06-09 9:37 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-06-03 18:46 [PATCH] [0/16] HWPOISON: Intro Andi Kleen
2009-06-03 18:46 ` [PATCH] [1/16] HWPOISON: Add page flag for poisoned pages Andi Kleen
2009-06-03 18:46 ` [PATCH] [2/16] HWPOISON: Export some rmap vma locking to outside world Andi Kleen
2009-06-03 18:46 ` [PATCH] [3/16] HWPOISON: Add support for poison swap entries v2 Andi Kleen
2009-06-03 18:46 ` [PATCH] [4/16] HWPOISON: Add new SIGBUS error codes for hardware poison signals Andi Kleen
2009-06-03 18:46 ` [PATCH] [5/16] HWPOISON: Add basic support for poisoned pages in fault handler v3 Andi Kleen
2009-06-03 18:46 ` [PATCH] [6/16] HWPOISON: Add various poison checks in mm/memory.c Andi Kleen
2009-06-04 4:26 ` Wu Fengguang
2009-06-04 5:19 ` Andi Kleen
2009-06-04 11:55 ` Wu Fengguang
2009-06-04 12:52 ` Andi Kleen
2009-06-04 12:50 ` Wu Fengguang
2009-06-04 13:02 ` Andi Kleen
2009-06-04 13:16 ` Wu Fengguang
2009-06-09 10:25 ` Nick Piggin
2009-06-09 12:21 ` Wu Fengguang
2009-06-09 12:35 ` Nick Piggin
2009-06-03 18:46 ` [PATCH] [7/16] HWPOISON: x86: Add VM_FAULT_HWPOISON handling to x86 page fault handler v2 Andi Kleen
2009-06-09 9:54 ` Nick Piggin
2009-06-09 12:34 ` [PATCH] HWPOISON: define VM_FAULT_HWPOISON to 0 when feature is disabled Wu Fengguang
2009-06-03 18:46 ` [PATCH] [8/16] HWPOISON: Use bitmask/action code for try_to_unmap behaviour Andi Kleen
2009-06-09 9:57 ` Nick Piggin
2009-06-10 2:27 ` Wu Fengguang
2009-06-10 6:07 ` Nick Piggin
2009-06-03 18:46 ` [PATCH] [9/16] HWPOISON: Handle hardware poisoned pages in try_to_unmap Andi Kleen
2009-06-04 4:35 ` Wu Fengguang
2009-06-04 5:21 ` Andi Kleen
2009-06-03 18:46 ` [PATCH] [10/16] HWPOISON: Handle poisoned pages in set_page_dirty() Andi Kleen
2009-06-04 0:36 ` Wu Fengguang
2009-06-04 5:27 ` Andi Kleen
2009-06-09 9:59 ` Nick Piggin
2009-06-09 12:51 ` Wu Fengguang
2009-06-03 18:46 ` [PATCH] [11/16] HWPOISON: check and isolate corrupted free pages v2 Andi Kleen
2009-06-09 10:02 ` Nick Piggin
2009-06-09 13:03 ` Wu Fengguang
2009-06-09 13:28 ` Nick Piggin
2009-06-09 13:49 ` Wu Fengguang
2009-06-09 13:55 ` Nick Piggin
2009-06-09 14:56 ` Wu Fengguang
2009-06-09 15:31 ` Nick Piggin
2009-06-03 18:46 ` [PATCH] [12/16] Refactor truncate to allow direct truncating of page Andi Kleen
2009-06-04 4:32 ` Wu Fengguang
2009-06-04 5:20 ` Andi Kleen
2009-06-03 18:46 ` [PATCH] [13/16] HWPOISON: The high level memory error handler in the VM v5 Andi Kleen
2009-06-04 3:24 ` Wu Fengguang
2009-06-04 5:13 ` Andi Kleen
2009-06-04 9:07 ` Wu Fengguang
2009-06-04 9:26 ` Andi Kleen
2009-06-09 9:51 ` Nick Piggin
2009-06-09 11:14 ` Nick Piggin
2009-06-09 10:09 ` Nick Piggin [this message]
2009-06-09 16:05 ` Hugh Dickins
2009-06-09 16:35 ` Nick Piggin
2009-06-10 8:38 ` Wu Fengguang
2009-06-10 8:59 ` Nick Piggin
2009-06-10 9:20 ` Wu Fengguang
2009-06-10 11:03 ` Nick Piggin
2009-06-10 12:16 ` Wu Fengguang
2009-06-10 12:36 ` Nick Piggin
2009-06-12 9:58 ` Andi Kleen
2009-06-10 3:10 ` [PATCH] HWPOISON: fix tasklist_lock/anon_vma locking order Wu Fengguang
2009-06-03 18:46 ` [PATCH] [14/16] HWPOISON: FOR TESTING: Enable memory failure code unconditionally Andi Kleen
2009-06-03 18:46 ` [PATCH] [15/16] HWPOISON: Add madvise() based injector for hardware poisoned pages v3 Andi Kleen
2009-06-03 18:46 ` [PATCH] [16/16] HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs Andi Kleen
2009-06-09 10:20 ` [PATCH] [0/16] HWPOISON: Intro Nick Piggin
2009-06-10 9:07 ` Wu Fengguang
2009-06-10 9:18 ` Nick Piggin
2009-06-10 9:45 ` Wu Fengguang
2009-06-10 11:15 ` Nick Piggin
2009-06-10 12:36 ` Wu Fengguang
2009-06-10 12:47 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090609100922.GF14820@wotan.suse.de \
--to=npiggin@suse.de \
--cc=akpm@linux-foundation.org \
--cc=andi@firstfloor.org \
--cc=chris.mason@oracle.com \
--cc=fengguang.wu@intel.com \
--cc=hugh.dickins@tiscali.co.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).