All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Robin Holt <holt@sgi.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Wanpeng Li <liwanp@linux.vnet.ibm.com>,
	Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>,
	Avi Kivity <avi@redhat.com>, Hugh Dickins <hughd@google.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Sagi Grimberg <sagig@mellanox.co.il>,
	Haggai Eran <haggaie@mellanox.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [ 23/86] mmu_notifier_unregister NULL Pointer deref and multiple ->release() callouts
Date: Tue, 26 Feb 2013 16:07:31 -0800	[thread overview]
Message-ID: <20130226235915.311684727@linuxfoundation.org> (raw)
In-Reply-To: <20130226235912.881663118@linuxfoundation.org>

3.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Robin Holt <holt@sgi.com>

commit 751efd8610d3d7d67b7bdf7f62646edea7365dd7 upstream.

There is a race condition between mmu_notifier_unregister() and
__mmu_notifier_release().

Assume two tasks, one calling mmu_notifier_unregister() as a result of a
filp_close() ->flush() callout (task A), and the other calling
mmu_notifier_release() from an mmput() (task B).

                A                               B
t1                                              srcu_read_lock()
t2              if (!hlist_unhashed())
t3                                              srcu_read_unlock()
t4              srcu_read_lock()
t5                                              hlist_del_init_rcu()
t6                                              synchronize_srcu()
t7              srcu_read_unlock()
t8              hlist_del_rcu()  <--- NULL pointer deref.

Additionally, the list traversal in __mmu_notifier_release() is not
protected by the by the mmu_notifier_mm->hlist_lock which can result in
callouts to the ->release() notifier from both mmu_notifier_unregister()
and __mmu_notifier_release().

-stable suggestions:

The stable trees prior to 3.7.y need commits 21a92735f660 and
70400303ce0c cherry-picked in that order prior to cherry-picking this
commit.  The 3.7.y tree already has those two commits.

Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Sagi Grimberg <sagig@mellanox.co.il>
Cc: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/mmu_notifier.c |   82 +++++++++++++++++++++++++++---------------------------
 1 file changed, 42 insertions(+), 40 deletions(-)

--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -37,49 +37,51 @@ static struct srcu_struct srcu;
 void __mmu_notifier_release(struct mm_struct *mm)
 {
 	struct mmu_notifier *mn;
-	struct hlist_node *n;
 	int id;
 
 	/*
-	 * SRCU here will block mmu_notifier_unregister until
-	 * ->release returns.
+	 * srcu_read_lock() here will block synchronize_srcu() in
+	 * mmu_notifier_unregister() until all registered
+	 * ->release() callouts this function makes have
+	 * returned.
 	 */
 	id = srcu_read_lock(&srcu);
-	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_mm->list, hlist)
-		/*
-		 * if ->release runs before mmu_notifier_unregister it
-		 * must be handled as it's the only way for the driver
-		 * to flush all existing sptes and stop the driver
-		 * from establishing any more sptes before all the
-		 * pages in the mm are freed.
-		 */
-		if (mn->ops->release)
-			mn->ops->release(mn, mm);
-	srcu_read_unlock(&srcu, id);
-
 	spin_lock(&mm->mmu_notifier_mm->lock);
 	while (unlikely(!hlist_empty(&mm->mmu_notifier_mm->list))) {
 		mn = hlist_entry(mm->mmu_notifier_mm->list.first,
 				 struct mmu_notifier,
 				 hlist);
+
 		/*
-		 * We arrived before mmu_notifier_unregister so
-		 * mmu_notifier_unregister will do nothing other than
-		 * to wait ->release to finish and
-		 * mmu_notifier_unregister to return.
+		 * Unlink.  This will prevent mmu_notifier_unregister()
+		 * from also making the ->release() callout.
 		 */
 		hlist_del_init_rcu(&mn->hlist);
+		spin_unlock(&mm->mmu_notifier_mm->lock);
+
+		/*
+		 * Clear sptes. (see 'release' description in mmu_notifier.h)
+		 */
+		if (mn->ops->release)
+			mn->ops->release(mn, mm);
+
+		spin_lock(&mm->mmu_notifier_mm->lock);
 	}
 	spin_unlock(&mm->mmu_notifier_mm->lock);
 
 	/*
-	 * synchronize_srcu here prevents mmu_notifier_release to
-	 * return to exit_mmap (which would proceed freeing all pages
-	 * in the mm) until the ->release method returns, if it was
-	 * invoked by mmu_notifier_unregister.
-	 *
-	 * The mmu_notifier_mm can't go away from under us because one
-	 * mm_count is hold by exit_mmap.
+	 * All callouts to ->release() which we have done are complete.
+	 * Allow synchronize_srcu() in mmu_notifier_unregister() to complete
+	 */
+	srcu_read_unlock(&srcu, id);
+
+	/*
+	 * mmu_notifier_unregister() may have unlinked a notifier and may
+	 * still be calling out to it.	Additionally, other notifiers
+	 * may have been active via vmtruncate() et. al. Block here
+	 * to ensure that all notifier callouts for this mm have been
+	 * completed and the sptes are really cleaned up before returning
+	 * to exit_mmap().
 	 */
 	synchronize_srcu(&srcu);
 }
@@ -300,31 +302,31 @@ void mmu_notifier_unregister(struct mmu_
 {
 	BUG_ON(atomic_read(&mm->mm_count) <= 0);
 
+	spin_lock(&mm->mmu_notifier_mm->lock);
 	if (!hlist_unhashed(&mn->hlist)) {
-		/*
-		 * SRCU here will force exit_mmap to wait ->release to finish
-		 * before freeing the pages.
-		 */
 		int id;
 
-		id = srcu_read_lock(&srcu);
 		/*
-		 * exit_mmap will block in mmu_notifier_release to
-		 * guarantee ->release is called before freeing the
-		 * pages.
+		 * Ensure we synchronize up with __mmu_notifier_release().
 		 */
+		id = srcu_read_lock(&srcu);
+
+		hlist_del_rcu(&mn->hlist);
+		spin_unlock(&mm->mmu_notifier_mm->lock);
+
 		if (mn->ops->release)
 			mn->ops->release(mn, mm);
-		srcu_read_unlock(&srcu, id);
 
-		spin_lock(&mm->mmu_notifier_mm->lock);
-		hlist_del_rcu(&mn->hlist);
+		/*
+		 * Allow __mmu_notifier_release() to complete.
+		 */
+		srcu_read_unlock(&srcu, id);
+	} else
 		spin_unlock(&mm->mmu_notifier_mm->lock);
-	}
 
 	/*
-	 * Wait any running method to finish, of course including
-	 * ->release if it was run by mmu_notifier_relase instead of us.
+	 * Wait for any running method to finish, including ->release() if it
+	 * was run by __mmu_notifier_release() instead of us.
 	 */
 	synchronize_srcu(&srcu);
 



  parent reply	other threads:[~2013-02-27  0:08 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-27  0:07 [ 00/86] 3.4.34-stable review Greg Kroah-Hartman
2013-02-27  0:07 ` [ 01/86] x86-32, mm: Rip out x86_32 NUMA remapping code Greg Kroah-Hartman
2013-02-27  0:07 ` [ 02/86] x86-32, mm: Remove reference to resume_map_numa_kva() Greg Kroah-Hartman
2013-02-27  0:07 ` [ 03/86] x86-32, mm: Remove reference to alloc_remap() Greg Kroah-Hartman
2013-02-27  0:07 ` [ 04/86] mm: fix pageblock bitmap allocation Greg Kroah-Hartman
2013-02-27  0:07 ` [ 05/86] timeconst.pl: Eliminate Perl warning Greg Kroah-Hartman
2013-02-27  0:07 ` [ 06/86] genirq: Avoid deadlock in spurious handling Greg Kroah-Hartman
2013-02-27  0:07 ` [ 07/86] posix-cpu-timers: Fix nanosleep task_struct leak Greg Kroah-Hartman
2013-02-27  0:07 ` [ 08/86] hrtimer: Prevent hrtimer_enqueue_reprogram race Greg Kroah-Hartman
2013-02-27  0:07 ` [ 09/86] x86: Hyper-V: register clocksource only if its advertised Greg Kroah-Hartman
2013-02-27  0:07 ` [ 10/86] ALSA: ali5451: remove irq enabling in pointer callback Greg Kroah-Hartman
2013-02-27  0:07 ` [ 11/86] ALSA: rme32.c irq enabling after spin_lock_irq Greg Kroah-Hartman
2013-02-27  0:07 ` [ 12/86] tty: Prevent deadlock in n_gsm driver Greg Kroah-Hartman
2013-02-27  0:07 ` [ 13/86] tty: set_termios/set_termiox should not return -EINTR Greg Kroah-Hartman
2013-02-27  0:07 ` [ 14/86] USB: serial: fix null-pointer dereferences on disconnect Greg Kroah-Hartman
2013-02-27  0:07 ` [ 15/86] b43: Increase number of RX DMA slots Greg Kroah-Hartman
2013-02-27  0:07 ` [ 16/86] rtlwifi: rtl8192cu: Add new USB ID Greg Kroah-Hartman
2013-02-27  0:07 ` [ 17/86] rtlwifi: usb: allocate URB control message setup_packet and data buffer separately Greg Kroah-Hartman
2013-02-27  0:07 ` [ 18/86] xen: Send spinlock IPI to all waiters Greg Kroah-Hartman
2013-02-27  0:07 ` [ 19/86] xen: close evtchn port if binding to irq fails Greg Kroah-Hartman
2013-02-27  0:07 ` [ 20/86] Driver core: treat unregistered bus_types as having no devices Greg Kroah-Hartman
2013-02-27  0:07 ` [ 21/86] mm: mmu_notifier: have mmu_notifiers use a global SRCU so they may safely schedule Greg Kroah-Hartman
2013-02-27  0:07 ` [ 22/86] mm: mmu_notifier: make the mmu_notifier srcu static Greg Kroah-Hartman
2013-02-27  0:07 ` Greg Kroah-Hartman [this message]
2013-02-27  0:07 ` [ 24/86] KVM: s390: Handle hosts not supporting s390-virtio Greg Kroah-Hartman
2013-02-27  0:07 ` [ 25/86] s390/kvm: Fix store status for ACRS/FPRS Greg Kroah-Hartman
2013-02-27  0:07 ` [ 26/86] futex: Revert "futex: Mark get_robust_list as deprecated" Greg Kroah-Hartman
2013-02-27  0:07 ` [ 27/86] inotify: remove broken mask checks causing unmount to be EINVAL Greg Kroah-Hartman
2013-02-27  0:07 ` [ 28/86] fs/block_dev.c: page cache wrongly left invalidated after revalidate_disk() Greg Kroah-Hartman
2013-02-27  0:07 ` [ 29/86] ocfs2: unlock super lock if lockres refresh failed Greg Kroah-Hartman
2013-02-27  0:07 ` [ 30/86] drivers/video/backlight/adp88?0_bl.c: fix resume Greg Kroah-Hartman
2013-02-27  0:07 ` [ 31/86] tmpfs: fix use-after-free of mempolicy object Greg Kroah-Hartman
2013-02-27  0:07 ` [ 32/86] mm/fadvise.c: drain all pagevecs if POSIX_FADV_DONTNEED fails to discard all pages Greg Kroah-Hartman
2013-02-27  0:07 ` [ 33/86] drivercore: Fix ordering between deferred_probe and exiting initcalls Greg Kroah-Hartman
2013-02-27  0:07 ` [ 34/86] umount oops when remove blocklayoutdriver first Greg Kroah-Hartman
2013-02-27  0:07 ` [ 35/86] NLM: Ensure that we resend all pending blocking locks after a reclaim Greg Kroah-Hartman
2013-02-27  0:07 ` [ 36/86] p54usb: corrected USB ID for T-Com Sinus 154 data II Greg Kroah-Hartman
2013-02-27  0:07 ` [ 37/86] ALSA: usb-audio: fix Roland A-PRO support Greg Kroah-Hartman
2013-02-27  0:07 ` [ 38/86] ALSA: usb: Fix Processing Unit Descriptor parsers Greg Kroah-Hartman
2013-02-27  0:07 ` [ 39/86] ALSA: hda - Release assigned pin/cvt at error path of hdmi_pcm_open() Greg Kroah-Hartman
2013-02-27  0:07 ` [ 40/86] ALSA: hda - Workaround for silent output on Sony Vaio VGC-LN51JGB with ALC889 Greg Kroah-Hartman
2013-02-27  0:07 ` [ 41/86] ALSA: hda - hdmi: ELD shouldnt be valid after unplug Greg Kroah-Hartman
2013-02-27  0:07 ` [ 42/86] sunvdc: Fix off-by-one in generic_request() Greg Kroah-Hartman
2013-02-27  0:07 ` [ 43/86] drm/radeon/dce6: fix display powergating Greg Kroah-Hartman
2013-02-27  0:07 ` [ 44/86] drm/udl: make usage as a console safer Greg Kroah-Hartman
2013-02-27  0:07 ` [ 45/86] drm/udl: disable fb_defio by default Greg Kroah-Hartman
2013-02-27  0:07 ` [ 46/86] vgacon/vt: clear buffer attributes when we load a 512 character font (v2) Greg Kroah-Hartman
2013-02-27  0:07 ` [ 47/86] drm: dont add inferred modes for monitors that dont support them Greg Kroah-Hartman
2013-02-27  0:07 ` [ 48/86] drm: Fill depth/bits_per_pixel for C8 format Greg Kroah-Hartman
2013-02-27  0:07 ` [ 49/86] drm: Use C8 instead of RGB332 when determining the format from depth/bpp Greg Kroah-Hartman
2013-02-27  0:07 ` [ 50/86] drm/usb: bind driver to correct device Greg Kroah-Hartman
2013-02-27  0:07 ` [ 51/86] target: Fix divide by zero bug in fabric_max_sectors for unconfigured devices Greg Kroah-Hartman
2013-02-27  0:08 ` [ 52/86] intel/iommu: force writebuffer-flush quirk on Gen 4 Chipsets Greg Kroah-Hartman
2013-02-27  0:08 ` [ 53/86] drm/i915: disable shared panel fitter for pipe Greg Kroah-Hartman
2013-02-27  0:08 ` [ 54/86] drm/i915: Set i9xx sdvo clock limits according to specifications Greg Kroah-Hartman
2013-02-27 10:11   ` Patrik Jakobsson
2013-02-27 10:19     ` Chris Wilson
2013-02-27 10:23       ` Patrik Jakobsson
2013-02-27  0:08 ` [ 55/86] staging: comedi: disallow COMEDI_DEVCONFIG on non-board minors Greg Kroah-Hartman
2013-02-27  0:08 ` [ 56/86] staging: vt6656: Fix URB submitted while active warning Greg Kroah-Hartman
2013-02-27  0:08 ` [ 57/86] ASoC: wm2200: correct IN2L and IN3L digital mute Greg Kroah-Hartman
2013-02-27  0:08 ` [ 58/86] ARM: PXA3xx: program the CSMSADRCFG register Greg Kroah-Hartman
2013-02-27  0:08 ` [ 59/86] ARM: samsung: fix assembly syntax for new gas Greg Kroah-Hartman
2013-02-27  0:08 ` [ 60/86] ARM: 7643/1: sched: correct update_sched_clock() Greg Kroah-Hartman
2013-02-27  0:08 ` [ 61/86] powerpc/kexec: Disable hard IRQ before kexec Greg Kroah-Hartman
2013-02-27  0:08 ` [ 62/86] [PARISC] Purge existing TLB entries in set_pte_at and ptep_set_wrprotect Greg Kroah-Hartman
2013-02-27  0:08 ` [ 63/86] pcmcia/vrc4171: Add missing spinlock init Greg Kroah-Hartman
2013-02-27  0:08 ` [ 64/86] drivers/video: fsl-diu-fb: fix pixel formats for 24 and 16 bpp Greg Kroah-Hartman
2013-02-27  0:08 ` [ 65/86] fbcon: dont lose the console font across generic->chip driver switch Greg Kroah-Hartman
2013-02-27  0:08 ` [ 66/86] fb: rework locking to fix lock ordering on takeover Greg Kroah-Hartman
2013-02-27  0:08 ` [ 67/86] fb: Yet another band-aid for fixing lockdep mess Greg Kroah-Hartman
2013-02-27  0:08 ` [ 68/86] mmc: sdhci-esdhc-imx: fix host version read Greg Kroah-Hartman
2013-02-27  0:08 ` [ 69/86] HID: wiimote: fix nunchuck button parser Greg Kroah-Hartman
2013-02-27  0:08 ` [ 70/86] bridge: set priority of STP packets Greg Kroah-Hartman
2013-02-27  0:08 ` [ 71/86] net: fix infinite loop in __skb_recv_datagram() Greg Kroah-Hartman
2013-02-27  0:08 ` [ 72/86] xen-netback: correctly return errors from netbk_count_requests() Greg Kroah-Hartman
2013-02-27  0:08 ` [ 73/86] xen-netback: cancel the credit timer when taking the vif down Greg Kroah-Hartman
2013-02-27  0:08 ` [ 74/86] net: fix a compile error when SOCK_REFCNT_DEBUG is enabled Greg Kroah-Hartman
2013-02-27  0:08 ` [ 75/86] ipv4: fix a bug in ping_err() Greg Kroah-Hartman
2013-02-27  0:08 ` [ 76/86] ipv6: use a stronger hash for tcp Greg Kroah-Hartman
2013-02-27  0:08 ` [ 77/86] sock_diag: Fix out-of-bounds access to sock_diag_handlers[] Greg Kroah-Hartman
2013-02-27  0:08 ` [ 78/86] vlan: adjust vlan_set_encap_proto() for its callers Greg Kroah-Hartman
2013-02-27  0:08 ` [ 79/86] USB: ehci-omap: Dont free gpios that we didnt request Greg Kroah-Hartman
2013-02-27  7:52   ` Roger Quadros
2013-02-27 17:20     ` Luis Henriques
2013-02-28 10:16       ` Roger Quadros
2013-02-28 10:37         ` Luis Henriques
2013-02-27 17:37     ` Greg Kroah-Hartman
2013-02-27  0:08 ` [ 80/86] dca: check against empty dca_domains list before unregister provider Greg Kroah-Hartman
2013-02-27  0:08 ` [ 81/86] USB: option: add and update Alcatel modems Greg Kroah-Hartman
2013-02-27  0:08 ` [ 82/86] USB: option: add Yota / Megafon M100-1 4g modem Greg Kroah-Hartman
2013-02-27  0:08 ` [ 83/86] USB: option: add Huawei "ACM" devices using protocol = vendor Greg Kroah-Hartman
2013-02-27  0:08 ` [ 84/86] USB: ehci-omap: Fix autoloading of module Greg Kroah-Hartman
2013-02-27  0:08 ` [ 85/86] USB: storage: properly handle the endian issues of idProduct Greg Kroah-Hartman
2013-02-27  0:08 ` [ 86/86] USB: usb-storage: unusual_devs update for Super TOP SATA bridge Greg Kroah-Hartman
2013-02-27 16:50 ` [ 00/86] 3.4.34-stable review Shuah Khan
2013-02-28 14:54 ` Satoru Takeuchi
2013-02-28 14:59   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130226235915.311684727@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=avi@redhat.com \
    --cc=haggaie@mellanox.com \
    --cc=holt@sgi.com \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liwanp@linux.vnet.ibm.com \
    --cc=mtosatti@redhat.com \
    --cc=sagig@mellanox.co.il \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=xiaoguangrong@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.