All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Robin Holt <holt@sgi.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Wanpeng Li <liwanp@linux.vnet.ibm.com>,
	Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>,
	Avi Kivity <avi@redhat.com>, Hugh Dickins <hughd@google.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Sagi Grimberg <sagig@mellanox.co.il>,
	Haggai Eran <haggaie@mellanox.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [ 15/53] mmu_notifier_unregister NULL Pointer deref and multiple ->release() callouts
Date: Tue, 26 Feb 2013 15:57:44 -0800	[thread overview]
Message-ID: <20130226235621.401610233@linuxfoundation.org> (raw)
In-Reply-To: <20130226235619.844721947@linuxfoundation.org>

3.0-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Robin Holt <holt@sgi.com>

commit 751efd8610d3d7d67b7bdf7f62646edea7365dd7 upstream.

There is a race condition between mmu_notifier_unregister() and
__mmu_notifier_release().

Assume two tasks, one calling mmu_notifier_unregister() as a result of a
filp_close() ->flush() callout (task A), and the other calling
mmu_notifier_release() from an mmput() (task B).

                A                               B
t1                                              srcu_read_lock()
t2              if (!hlist_unhashed())
t3                                              srcu_read_unlock()
t4              srcu_read_lock()
t5                                              hlist_del_init_rcu()
t6                                              synchronize_srcu()
t7              srcu_read_unlock()
t8              hlist_del_rcu()  <--- NULL pointer deref.

Additionally, the list traversal in __mmu_notifier_release() is not
protected by the by the mmu_notifier_mm->hlist_lock which can result in
callouts to the ->release() notifier from both mmu_notifier_unregister()
and __mmu_notifier_release().

-stable suggestions:

The stable trees prior to 3.7.y need commits 21a92735f660 and
70400303ce0c cherry-picked in that order prior to cherry-picking this
commit.  The 3.7.y tree already has those two commits.

Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Sagi Grimberg <sagig@mellanox.co.il>
Cc: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/mmu_notifier.c |   82 +++++++++++++++++++++++++++---------------------------
 1 file changed, 42 insertions(+), 40 deletions(-)

--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -37,49 +37,51 @@ static struct srcu_struct srcu;
 void __mmu_notifier_release(struct mm_struct *mm)
 {
 	struct mmu_notifier *mn;
-	struct hlist_node *n;
 	int id;
 
 	/*
-	 * SRCU here will block mmu_notifier_unregister until
-	 * ->release returns.
+	 * srcu_read_lock() here will block synchronize_srcu() in
+	 * mmu_notifier_unregister() until all registered
+	 * ->release() callouts this function makes have
+	 * returned.
 	 */
 	id = srcu_read_lock(&srcu);
-	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_mm->list, hlist)
-		/*
-		 * if ->release runs before mmu_notifier_unregister it
-		 * must be handled as it's the only way for the driver
-		 * to flush all existing sptes and stop the driver
-		 * from establishing any more sptes before all the
-		 * pages in the mm are freed.
-		 */
-		if (mn->ops->release)
-			mn->ops->release(mn, mm);
-	srcu_read_unlock(&srcu, id);
-
 	spin_lock(&mm->mmu_notifier_mm->lock);
 	while (unlikely(!hlist_empty(&mm->mmu_notifier_mm->list))) {
 		mn = hlist_entry(mm->mmu_notifier_mm->list.first,
 				 struct mmu_notifier,
 				 hlist);
+
 		/*
-		 * We arrived before mmu_notifier_unregister so
-		 * mmu_notifier_unregister will do nothing other than
-		 * to wait ->release to finish and
-		 * mmu_notifier_unregister to return.
+		 * Unlink.  This will prevent mmu_notifier_unregister()
+		 * from also making the ->release() callout.
 		 */
 		hlist_del_init_rcu(&mn->hlist);
+		spin_unlock(&mm->mmu_notifier_mm->lock);
+
+		/*
+		 * Clear sptes. (see 'release' description in mmu_notifier.h)
+		 */
+		if (mn->ops->release)
+			mn->ops->release(mn, mm);
+
+		spin_lock(&mm->mmu_notifier_mm->lock);
 	}
 	spin_unlock(&mm->mmu_notifier_mm->lock);
 
 	/*
-	 * synchronize_srcu here prevents mmu_notifier_release to
-	 * return to exit_mmap (which would proceed freeing all pages
-	 * in the mm) until the ->release method returns, if it was
-	 * invoked by mmu_notifier_unregister.
-	 *
-	 * The mmu_notifier_mm can't go away from under us because one
-	 * mm_count is hold by exit_mmap.
+	 * All callouts to ->release() which we have done are complete.
+	 * Allow synchronize_srcu() in mmu_notifier_unregister() to complete
+	 */
+	srcu_read_unlock(&srcu, id);
+
+	/*
+	 * mmu_notifier_unregister() may have unlinked a notifier and may
+	 * still be calling out to it.	Additionally, other notifiers
+	 * may have been active via vmtruncate() et. al. Block here
+	 * to ensure that all notifier callouts for this mm have been
+	 * completed and the sptes are really cleaned up before returning
+	 * to exit_mmap().
 	 */
 	synchronize_srcu(&srcu);
 }
@@ -300,31 +302,31 @@ void mmu_notifier_unregister(struct mmu_
 {
 	BUG_ON(atomic_read(&mm->mm_count) <= 0);
 
+	spin_lock(&mm->mmu_notifier_mm->lock);
 	if (!hlist_unhashed(&mn->hlist)) {
-		/*
-		 * SRCU here will force exit_mmap to wait ->release to finish
-		 * before freeing the pages.
-		 */
 		int id;
 
-		id = srcu_read_lock(&srcu);
 		/*
-		 * exit_mmap will block in mmu_notifier_release to
-		 * guarantee ->release is called before freeing the
-		 * pages.
+		 * Ensure we synchronize up with __mmu_notifier_release().
 		 */
+		id = srcu_read_lock(&srcu);
+
+		hlist_del_rcu(&mn->hlist);
+		spin_unlock(&mm->mmu_notifier_mm->lock);
+
 		if (mn->ops->release)
 			mn->ops->release(mn, mm);
-		srcu_read_unlock(&srcu, id);
 
-		spin_lock(&mm->mmu_notifier_mm->lock);
-		hlist_del_rcu(&mn->hlist);
+		/*
+		 * Allow __mmu_notifier_release() to complete.
+		 */
+		srcu_read_unlock(&srcu, id);
+	} else
 		spin_unlock(&mm->mmu_notifier_mm->lock);
-	}
 
 	/*
-	 * Wait any running method to finish, of course including
-	 * ->release if it was run by mmu_notifier_relase instead of us.
+	 * Wait for any running method to finish, including ->release() if it
+	 * was run by __mmu_notifier_release() instead of us.
 	 */
 	synchronize_srcu(&srcu);
 



  parent reply	other threads:[~2013-02-27  0:36 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-26 23:57 [ 00/53] 3.0.67-stable review Greg Kroah-Hartman
2013-02-26 23:57 ` [ 01/53] x86-32, mm: Remove reference to resume_map_numa_kva() Greg Kroah-Hartman
2013-02-26 23:57 ` [ 02/53] mm: fix pageblock bitmap allocation Greg Kroah-Hartman
2013-02-26 23:57 ` [ 03/53] timeconst.pl: Eliminate Perl warning Greg Kroah-Hartman
2013-02-27  2:46   ` Rob Landley
2013-02-26 23:57 ` [ 04/53] genirq: Avoid deadlock in spurious handling Greg Kroah-Hartman
2013-02-26 23:57 ` [ 05/53] posix-cpu-timers: Fix nanosleep task_struct leak Greg Kroah-Hartman
2013-02-26 23:57 ` [ 06/53] hrtimer: Prevent hrtimer_enqueue_reprogram race Greg Kroah-Hartman
2013-02-26 23:57 ` [ 07/53] ALSA: ali5451: remove irq enabling in pointer callback Greg Kroah-Hartman
2013-02-26 23:57 ` [ 08/53] ALSA: rme32.c irq enabling after spin_lock_irq Greg Kroah-Hartman
2013-02-26 23:57 ` [ 09/53] tty: set_termios/set_termiox should not return -EINTR Greg Kroah-Hartman
2013-02-26 23:57 ` [ 10/53] xen/netback: check correct frag when looking for head frag Greg Kroah-Hartman
2013-02-26 23:57 ` [ 11/53] xen: Send spinlock IPI to all waiters Greg Kroah-Hartman
2013-02-26 23:57 ` [ 12/53] Driver core: treat unregistered bus_types as having no devices Greg Kroah-Hartman
2013-02-26 23:57 ` [ 13/53] mm: mmu_notifier: have mmu_notifiers use a global SRCU so they may safely schedule Greg Kroah-Hartman
2013-02-26 23:57 ` [ 14/53] mm: mmu_notifier: make the mmu_notifier srcu static Greg Kroah-Hartman
2013-02-26 23:57 ` Greg Kroah-Hartman [this message]
2013-02-26 23:57 ` [ 16/53] KVM: s390: Handle hosts not supporting s390-virtio Greg Kroah-Hartman
2013-02-26 23:57 ` [ 17/53] s390/kvm: Fix store status for ACRS/FPRS Greg Kroah-Hartman
2013-02-28 22:26   ` Jiri Slaby
2013-03-01  7:50     ` Christian Borntraeger
2013-03-01  9:22       ` Jiri Slaby
2013-03-01 19:16         ` Greg Kroah-Hartman
2013-02-26 23:57 ` [ 18/53] inotify: remove broken mask checks causing unmount to be EINVAL Greg Kroah-Hartman
2013-02-26 23:57 ` [ 19/53] ocfs2: unlock super lock if lockres refresh failed Greg Kroah-Hartman
2013-02-26 23:57 ` [ 20/53] drivers/video/backlight/adp88?0_bl.c: fix resume Greg Kroah-Hartman
2013-02-26 23:57 ` [ 21/53] tmpfs: fix use-after-free of mempolicy object Greg Kroah-Hartman
2013-02-26 23:57 ` [ 22/53] mm/fadvise.c: drain all pagevecs if POSIX_FADV_DONTNEED fails to discard all pages Greg Kroah-Hartman
2013-02-26 23:57 ` [ 23/53] NLM: Ensure that we resend all pending blocking locks after a reclaim Greg Kroah-Hartman
2013-02-26 23:57 ` [ 24/53] p54usb: corrected USB ID for T-Com Sinus 154 data II Greg Kroah-Hartman
2013-02-26 23:57 ` [ 25/53] ALSA: usb-audio: fix Roland A-PRO support Greg Kroah-Hartman
2013-02-26 23:57 ` [ 26/53] ALSA: usb: Fix Processing Unit Descriptor parsers Greg Kroah-Hartman
2013-02-26 23:57 ` [ 27/53] ext4: Free resources in some error path in ext4_fill_super Greg Kroah-Hartman
2013-02-26 23:57 ` [ 28/53] ext4: add missing kfree() on error return path in add_new_gdb() Greg Kroah-Hartman
2013-02-26 23:57 ` [ 29/53] sunvdc: Fix off-by-one in generic_request() Greg Kroah-Hartman
2013-02-26 23:57 ` [ 30/53] drm/usb: bind driver to correct device Greg Kroah-Hartman
2013-02-26 23:58 ` [ 31/53] NLS: improve UTF8 -> UTF16 string conversion routine Greg Kroah-Hartman
2013-02-26 23:58 ` [ 32/53] drm/i915: disable shared panel fitter for pipe Greg Kroah-Hartman
2013-02-26 23:58 ` [ 33/53] staging: comedi: disallow COMEDI_DEVCONFIG on non-board minors Greg Kroah-Hartman
2013-02-26 23:58 ` [ 34/53] staging: vt6656: Fix URB submitted while active warning Greg Kroah-Hartman
2013-02-26 23:58 ` [ 35/53] ARM: PXA3xx: program the CSMSADRCFG register Greg Kroah-Hartman
2013-02-26 23:58 ` [ 36/53] powerpc/kexec: Disable hard IRQ before kexec Greg Kroah-Hartman
2013-02-26 23:58 ` [ 37/53] [PARISC] Purge existing TLB entries in set_pte_at and ptep_set_wrprotect Greg Kroah-Hartman
2013-02-26 23:58 ` [ 38/53] pcmcia/vrc4171: Add missing spinlock init Greg Kroah-Hartman
2013-02-26 23:58 ` [ 39/53] fbcon: dont lose the console font across generic->chip driver switch Greg Kroah-Hartman
2013-02-26 23:58 ` [ 40/53] fb: rework locking to fix lock ordering on takeover Greg Kroah-Hartman
2013-02-26 23:58 ` [ 41/53] fb: Yet another band-aid for fixing lockdep mess Greg Kroah-Hartman
2013-02-26 23:58 ` [ 42/53] bridge: set priority of STP packets Greg Kroah-Hartman
2013-02-26 23:58 ` [ 43/53] xen-netback: correctly return errors from netbk_count_requests() Greg Kroah-Hartman
2013-02-26 23:58 ` [ 44/53] xen-netback: cancel the credit timer when taking the vif down Greg Kroah-Hartman
2013-02-26 23:58 ` [ 45/53] ipv4: fix a bug in ping_err() Greg Kroah-Hartman
2013-02-26 23:58 ` [ 46/53] ipv6: use a stronger hash for tcp Greg Kroah-Hartman
2013-02-26 23:58 ` [ 47/53] dca: check against empty dca_domains list before unregister provider Greg Kroah-Hartman
2013-02-28 22:04   ` Jiri Slaby
2013-02-28 22:17     ` Jiri Slaby
2013-03-01 19:17       ` Greg Kroah-Hartman
2013-02-26 23:58 ` [ 48/53] USB: option: add and update Alcatel modems Greg Kroah-Hartman
2013-02-26 23:58 ` [ 49/53] USB: option: add Yota / Megafon M100-1 4g modem Greg Kroah-Hartman
2013-02-26 23:58 ` [ 50/53] USB: option: add Huawei "ACM" devices using protocol = vendor Greg Kroah-Hartman
2013-02-26 23:58 ` [ 51/53] USB: ehci-omap: Fix autoloading of module Greg Kroah-Hartman
2013-02-26 23:58 ` [ 52/53] USB: storage: properly handle the endian issues of idProduct Greg Kroah-Hartman
2013-02-26 23:58 ` [ 53/53] USB: usb-storage: unusual_devs update for Super TOP SATA bridge Greg Kroah-Hartman
2013-02-27 16:51 ` [ 00/53] 3.0.67-stable review Shuah Khan
2013-02-27 16:54   ` Greg Kroah-Hartman
2013-02-28 14:56 ` Satoru Takeuchi
2013-02-28 14:59   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130226235621.401610233@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=avi@redhat.com \
    --cc=haggaie@mellanox.com \
    --cc=holt@sgi.com \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liwanp@linux.vnet.ibm.com \
    --cc=mtosatti@redhat.com \
    --cc=sagig@mellanox.co.il \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=xiaoguangrong@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.