From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Robin Holt <holt@sgi.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Wanpeng Li <liwanp@linux.vnet.ibm.com>,
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>,
Avi Kivity <avi@redhat.com>, Hugh Dickins <hughd@google.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
Sagi Grimberg <sagig@mellanox.co.il>,
Haggai Eran <haggaie@mellanox.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [ 29/58] mmu_notifier_unregister NULL Pointer deref and multiple ->release() callouts
Date: Mon, 25 Feb 2013 14:19:25 -0800 [thread overview]
Message-ID: <20130225221642.414619800@linuxfoundation.org> (raw)
In-Reply-To: <20130225221636.018756060@linuxfoundation.org>
3.7-stable review patch. If anyone has any objections, please let me know.
------------------
From: Robin Holt <holt@sgi.com>
commit 751efd8610d3d7d67b7bdf7f62646edea7365dd7 upstream.
There is a race condition between mmu_notifier_unregister() and
__mmu_notifier_release().
Assume two tasks, one calling mmu_notifier_unregister() as a result of a
filp_close() ->flush() callout (task A), and the other calling
mmu_notifier_release() from an mmput() (task B).
A B
t1 srcu_read_lock()
t2 if (!hlist_unhashed())
t3 srcu_read_unlock()
t4 srcu_read_lock()
t5 hlist_del_init_rcu()
t6 synchronize_srcu()
t7 srcu_read_unlock()
t8 hlist_del_rcu() <--- NULL pointer deref.
Additionally, the list traversal in __mmu_notifier_release() is not
protected by the by the mmu_notifier_mm->hlist_lock which can result in
callouts to the ->release() notifier from both mmu_notifier_unregister()
and __mmu_notifier_release().
-stable suggestions:
The stable trees prior to 3.7.y need commits 21a92735f660 and
70400303ce0c cherry-picked in that order prior to cherry-picking this
commit. The 3.7.y tree already has those two commits.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Sagi Grimberg <sagig@mellanox.co.il>
Cc: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/mmu_notifier.c | 82 +++++++++++++++++++++++++++---------------------------
1 file changed, 42 insertions(+), 40 deletions(-)
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -37,49 +37,51 @@ static struct srcu_struct srcu;
void __mmu_notifier_release(struct mm_struct *mm)
{
struct mmu_notifier *mn;
- struct hlist_node *n;
int id;
/*
- * SRCU here will block mmu_notifier_unregister until
- * ->release returns.
+ * srcu_read_lock() here will block synchronize_srcu() in
+ * mmu_notifier_unregister() until all registered
+ * ->release() callouts this function makes have
+ * returned.
*/
id = srcu_read_lock(&srcu);
- hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_mm->list, hlist)
- /*
- * if ->release runs before mmu_notifier_unregister it
- * must be handled as it's the only way for the driver
- * to flush all existing sptes and stop the driver
- * from establishing any more sptes before all the
- * pages in the mm are freed.
- */
- if (mn->ops->release)
- mn->ops->release(mn, mm);
- srcu_read_unlock(&srcu, id);
-
spin_lock(&mm->mmu_notifier_mm->lock);
while (unlikely(!hlist_empty(&mm->mmu_notifier_mm->list))) {
mn = hlist_entry(mm->mmu_notifier_mm->list.first,
struct mmu_notifier,
hlist);
+
/*
- * We arrived before mmu_notifier_unregister so
- * mmu_notifier_unregister will do nothing other than
- * to wait ->release to finish and
- * mmu_notifier_unregister to return.
+ * Unlink. This will prevent mmu_notifier_unregister()
+ * from also making the ->release() callout.
*/
hlist_del_init_rcu(&mn->hlist);
+ spin_unlock(&mm->mmu_notifier_mm->lock);
+
+ /*
+ * Clear sptes. (see 'release' description in mmu_notifier.h)
+ */
+ if (mn->ops->release)
+ mn->ops->release(mn, mm);
+
+ spin_lock(&mm->mmu_notifier_mm->lock);
}
spin_unlock(&mm->mmu_notifier_mm->lock);
/*
- * synchronize_srcu here prevents mmu_notifier_release to
- * return to exit_mmap (which would proceed freeing all pages
- * in the mm) until the ->release method returns, if it was
- * invoked by mmu_notifier_unregister.
- *
- * The mmu_notifier_mm can't go away from under us because one
- * mm_count is hold by exit_mmap.
+ * All callouts to ->release() which we have done are complete.
+ * Allow synchronize_srcu() in mmu_notifier_unregister() to complete
+ */
+ srcu_read_unlock(&srcu, id);
+
+ /*
+ * mmu_notifier_unregister() may have unlinked a notifier and may
+ * still be calling out to it. Additionally, other notifiers
+ * may have been active via vmtruncate() et. al. Block here
+ * to ensure that all notifier callouts for this mm have been
+ * completed and the sptes are really cleaned up before returning
+ * to exit_mmap().
*/
synchronize_srcu(&srcu);
}
@@ -294,31 +296,31 @@ void mmu_notifier_unregister(struct mmu_
{
BUG_ON(atomic_read(&mm->mm_count) <= 0);
+ spin_lock(&mm->mmu_notifier_mm->lock);
if (!hlist_unhashed(&mn->hlist)) {
- /*
- * SRCU here will force exit_mmap to wait ->release to finish
- * before freeing the pages.
- */
int id;
- id = srcu_read_lock(&srcu);
/*
- * exit_mmap will block in mmu_notifier_release to
- * guarantee ->release is called before freeing the
- * pages.
+ * Ensure we synchronize up with __mmu_notifier_release().
*/
+ id = srcu_read_lock(&srcu);
+
+ hlist_del_rcu(&mn->hlist);
+ spin_unlock(&mm->mmu_notifier_mm->lock);
+
if (mn->ops->release)
mn->ops->release(mn, mm);
- srcu_read_unlock(&srcu, id);
- spin_lock(&mm->mmu_notifier_mm->lock);
- hlist_del_rcu(&mn->hlist);
+ /*
+ * Allow __mmu_notifier_release() to complete.
+ */
+ srcu_read_unlock(&srcu, id);
+ } else
spin_unlock(&mm->mmu_notifier_mm->lock);
- }
/*
- * Wait any running method to finish, of course including
- * ->release if it was run by mmu_notifier_relase instead of us.
+ * Wait for any running method to finish, including ->release() if it
+ * was run by __mmu_notifier_release() instead of us.
*/
synchronize_srcu(&srcu);
next prev parent reply other threads:[~2013-02-25 22:28 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-02-25 22:18 [ 00/58] 3.7.10-stable review Greg Kroah-Hartman
2013-02-25 22:18 ` [ 01/58] drm/nouveau/vm: fix memory corruption when pgt allocation fails Greg Kroah-Hartman
2013-02-25 22:18 ` [ 02/58] x86-32, mm: Rip out x86_32 NUMA remapping code Greg Kroah-Hartman
2013-02-25 22:18 ` [ 03/58] x86-32, mm: Remove reference to resume_map_numa_kva() Greg Kroah-Hartman
2013-02-25 22:19 ` [ 04/58] x86-32, mm: Remove reference to alloc_remap() Greg Kroah-Hartman
2013-02-25 22:19 ` [ 05/58] perf tools: Fix build with bison 2.3 and older Greg Kroah-Hartman
2013-02-25 22:19 ` [ 06/58] perf hists: Fix period symbol_conf.field_sep display Greg Kroah-Hartman
2013-02-25 22:19 ` [ 07/58] mm: fix pageblock bitmap allocation Greg Kroah-Hartman
2013-02-25 22:19 ` [ 08/58] timeconst.pl: Eliminate Perl warning Greg Kroah-Hartman
2013-02-25 22:19 ` [ 09/58] genirq: Avoid deadlock in spurious handling Greg Kroah-Hartman
2013-02-25 22:19 ` [ 10/58] posix-cpu-timers: Fix nanosleep task_struct leak Greg Kroah-Hartman
2013-02-25 22:19 ` [ 11/58] hrtimer: Prevent hrtimer_enqueue_reprogram race Greg Kroah-Hartman
2013-02-25 22:19 ` [ 12/58] x86: Hyper-V: register clocksource only if its advertised Greg Kroah-Hartman
2013-02-25 22:19 ` [ 13/58] workqueue: un-GPL function delayed_work_timer_fn() Greg Kroah-Hartman
2013-02-25 22:19 ` [ 14/58] ALSA: ali5451: remove irq enabling in pointer callback Greg Kroah-Hartman
2013-02-25 22:19 ` [ 15/58] ALSA: rme32.c irq enabling after spin_lock_irq Greg Kroah-Hartman
2013-02-25 22:19 ` [ 16/58] tty: Prevent deadlock in n_gsm driver Greg Kroah-Hartman
2013-02-25 22:19 ` [ 17/58] tty: set_termios/set_termiox should not return -EINTR Greg Kroah-Hartman
2013-02-25 22:19 ` [ 18/58] USB: serial: fix null-pointer dereferences on disconnect Greg Kroah-Hartman
2013-02-25 22:19 ` [ 19/58] serial: imx: Fix recursive locking bug Greg Kroah-Hartman
2013-02-25 22:19 ` [ 20/58] serial_core: Fix type definition for PORT_BRCM_TRUMANAGE Greg Kroah-Hartman
2013-02-25 22:19 ` [ 21/58] b43: Increase number of RX DMA slots Greg Kroah-Hartman
2013-02-25 22:19 ` [ 22/58] rtlwifi: rtl8192cu: Add new USB ID Greg Kroah-Hartman
2013-02-25 22:19 ` [ 23/58] rtlwifi: usb: allocate URB control message setup_packet and data buffer separately Greg Kroah-Hartman
2013-02-25 22:19 ` [ 24/58] tty vt: fix character insertion overflow Greg Kroah-Hartman
2013-02-25 22:19 ` [ 25/58] xen: Send spinlock IPI to all waiters Greg Kroah-Hartman
2013-02-25 22:19 ` [ 26/58] xen: close evtchn port if binding to irq fails Greg Kroah-Hartman
2013-02-25 22:19 ` [ 27/58] zram: Fix deadlock bug in partial read/write Greg Kroah-Hartman
2013-02-25 22:19 ` [ 28/58] Driver core: treat unregistered bus_types as having no devices Greg Kroah-Hartman
2013-02-25 22:19 ` Greg Kroah-Hartman [this message]
2013-02-25 22:19 ` [ 30/58] KVM: s390: Handle hosts not supporting s390-virtio Greg Kroah-Hartman
2013-02-25 22:19 ` [ 31/58] s390/kvm: Fix store status for ACRS/FPRS Greg Kroah-Hartman
2013-02-25 22:19 ` [ 32/58] futex: Revert "futex: Mark get_robust_list as deprecated" Greg Kroah-Hartman
2013-02-25 22:19 ` [ 33/58] inotify: remove broken mask checks causing unmount to be EINVAL Greg Kroah-Hartman
2013-02-25 22:19 ` [ 34/58] fs/block_dev.c: page cache wrongly left invalidated after revalidate_disk() Greg Kroah-Hartman
2013-02-25 22:19 ` [ 35/58] ocfs2: unlock super lock if lockres refresh failed Greg Kroah-Hartman
2013-02-25 22:19 ` [ 36/58] drivers/video/backlight/adp88?0_bl.c: fix resume Greg Kroah-Hartman
2013-02-25 22:19 ` [ 37/58] tmpfs: fix use-after-free of mempolicy object Greg Kroah-Hartman
2013-02-25 22:19 ` [ 38/58] mm/fadvise.c: drain all pagevecs if POSIX_FADV_DONTNEED fails to discard all pages Greg Kroah-Hartman
2013-02-25 22:19 ` [ 39/58] xen-pciback: rate limit error messages from xen_pcibk_enable_msi{,x}() Greg Kroah-Hartman
2013-02-25 22:19 ` [ 40/58] drivercore: Fix ordering between deferred_probe and exiting initcalls Greg Kroah-Hartman
2013-02-25 22:19 ` [ 41/58] umount oops when remove blocklayoutdriver first Greg Kroah-Hartman
2013-02-25 22:19 ` [ 42/58] NLM: Ensure that we resend all pending blocking locks after a reclaim Greg Kroah-Hartman
2013-02-25 22:19 ` [ 43/58] NFSv4.1: Dont decode skipped layoutgets Greg Kroah-Hartman
2013-02-25 22:19 ` [ 44/58] p54usb: corrected USB ID for T-Com Sinus 154 data II Greg Kroah-Hartman
2013-02-25 22:19 ` [ 45/58] ALSA: usb-audio: fix Roland A-PRO support Greg Kroah-Hartman
2013-02-25 22:19 ` [ 46/58] ALSA: usb: Fix Processing Unit Descriptor parsers Greg Kroah-Hartman
2013-02-25 22:19 ` [ 47/58] ALSA: hda - Release assigned pin/cvt at error path of hdmi_pcm_open() Greg Kroah-Hartman
2013-02-25 22:19 ` [ 48/58] ALSA: hda - Fix default multichannel HDMI mapping regression Greg Kroah-Hartman
2013-02-25 22:19 ` [ 49/58] ALSA: hda - Workaround for silent output on Sony Vaio VGC-LN51JGB with ALC889 Greg Kroah-Hartman
2013-02-25 22:19 ` [ 50/58] ALSA: hda - hdmi: ELD shouldnt be valid after unplug Greg Kroah-Hartman
2013-02-25 22:19 ` [ 51/58] GFS2: Get a block reservation before resizing a file Greg Kroah-Hartman
2013-02-25 22:34 ` Bob Peterson
2013-02-25 22:19 ` [ 52/58] sunvdc: Fix off-by-one in generic_request() Greg Kroah-Hartman
2013-02-25 22:19 ` [ 53/58] sparc64: Add missing HAVE_ARCH_TRANSPARENT_HUGEPAGE Greg Kroah-Hartman
2013-02-25 22:19 ` [ 54/58] sparc64: Fix get_user_pages_fast() wrt. THP Greg Kroah-Hartman
2013-02-25 22:19 ` [ 55/58] sparc64: Fix gfp_flags setting in tsb_grow() Greg Kroah-Hartman
2013-02-25 22:19 ` [ 56/58] sparc64: Handle hugepage TSB being NULL Greg Kroah-Hartman
2013-02-25 22:19 ` [ 57/58] sparc64: Fix tsb_grow() in atomic context Greg Kroah-Hartman
2013-02-25 22:19 ` [ 58/58] sparc64: Fix huge PMD to PTE translation for sun4u in TLB miss handler Greg Kroah-Hartman
2013-02-26 0:47 ` [ 00/58] 3.7.10-stable review Shuah Khan
2013-02-26 1:02 ` Greg Kroah-Hartman
2013-02-26 17:37 ` Andre Tomt
2013-02-26 17:48 ` Josh Boyer
2013-02-26 17:52 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130225221642.414619800@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=avi@redhat.com \
--cc=haggaie@mellanox.com \
--cc=holt@sgi.com \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=liwanp@linux.vnet.ibm.com \
--cc=mtosatti@redhat.com \
--cc=sagig@mellanox.co.il \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=xiaoguangrong@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.