From: tip-bot for Shawn Bohrer <sbohrer@rgmadvisors.com>
To: linux-tip-commits@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, hpa@zytor.com, mingo@redhat.com,
	dvhart@linux.intel.com, a.p.zijlstra@chello.nl,
	sbohrer@rgmadvisors.com, tglx@linutronix.de,
	david@rgmadvisors.com, kosaki.motohiro@jp.fujitsu.com
Subject: [tip:core/urgent] futex: Fix regression with read only mappings
Date: Tue, 26 Jul 2011 19:04:19 GMT
Message-ID: <tip-9ea71503a8ed9184d2d0b8ccc4d269d05f7940ae@git.kernel.org>
In-Reply-To: <1309450892-30676-1-git-send-email-sbohrer@rgmadvisors.com>

Commit-ID:  9ea71503a8ed9184d2d0b8ccc4d269d05f7940ae
Gitweb:     http://git.kernel.org/tip/9ea71503a8ed9184d2d0b8ccc4d269d05f7940ae
Author:     Shawn Bohrer <sbohrer@rgmadvisors.com>
AuthorDate: Thu, 30 Jun 2011 11:21:32 -0500
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 26 Jul 2011 20:59:35 +0200

futex: Fix regression with read only mappings

Commit 7485d0d3758e8e6491a5c9468114e74dc050785d (futexes: Remove rw
parameter from get_futex_key()) in 2.6.33 fixed two problems:  First, it
prevented a loop when encountering a ZERO_PAGE. Second, it fixed RW
MAP_PRIVATE futex operations by forcing the COW to occur by
unconditionally performing a write access get_user_pages_fast() to get
the page.  The commit also introduced a user-mode regression in that it
broke futex operations on read-only memory maps.  For example, this
breaks workloads that have one or more reader processes doing a
FUTEX_WAIT on a futex within a read-only shared file mapping, and a
writer process that has a writable mapping issuing the FUTEX_WAKE.
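
As a rough userspace illustration of the workload described above (not
part of the patch), the sketch below forks a reader that maps a file
PROT_READ/MAP_SHARED and does a FUTEX_WAIT, while the parent maps the
same file writable and issues the FUTEX_WAKE.  The helper names
futex() and ro_shared_futex_demo(), the temporary file path and the
5 second timeout are assumptions made for this example only.

```c
#define _GNU_SOURCE
#include <assert.h>   /* for callers that assert on the result */
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* glibc has no futex() wrapper, so go through syscall(2). */
static long futex(uint32_t *uaddr, int op, uint32_t val,
                  const struct timespec *timeout)
{
    return syscall(SYS_futex, uaddr, op, val, timeout, NULL, 0);
}

/*
 * Hypothetical demo helper.
 * Returns 0 if the read-only reader saw the writer's update,
 * 1 if the reader failed (e.g. EFAULT or timeout), -1 on setup error.
 */
static int ro_shared_futex_demo(void)
{
    char path[] = "/tmp/futex-ro-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, 4096) != 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }

    if (pid == 0) {
        /* Reader: read-only shared file mapping. */
        uint32_t *f = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
        if (f == MAP_FAILED)
            _exit(2);
        struct timespec ts = { .tv_sec = 5, .tv_nsec = 0 };
        while (__atomic_load_n(f, __ATOMIC_ACQUIRE) == 0) {
            /* Shared (non-private) futex op, so both processes match. */
            if (futex(f, FUTEX_WAIT, 0, &ts) == -1 &&
                errno != EAGAIN && errno != EINTR)
                _exit(1);   /* EFAULT on a regressed kernel, or timeout */
        }
        _exit(0);
    }

    /* Writer: writable shared mapping of the same file. */
    uint32_t *f = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int status = 0;
    if (f == MAP_FAILED) {
        waitpid(pid, &status, 0);   /* reader gives up after its timeout */
        close(fd);
        return -1;
    }
    __atomic_store_n(f, 1, __ATOMIC_RELEASE);
    futex(f, FUTEX_WAKE, 1, NULL);

    waitpid(pid, &status, 0);
    munmap(f, 4096);
    close(fd);
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

On a kernel with the regression the reader's FUTEX_WAIT would be
expected to fail with EFAULT, so the helper would return 1; with RO
shared mappings supported it should return 0.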

This fixes the regression for valid futex operations on RO mappings by
trying a RO get_user_pages_fast() when the RW get_user_pages_fast()
fails. This change makes it necessary to also check for invalid use
cases, such as anonymous RO mappings (which can never change) and the
ZERO_PAGE which the commit referenced above was written to address.
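
The checks described above can be summarized in a small userspace
model.  This is only a sketch of the decision logic, not kernel code:
struct page_model, enum access_model and futex_key_model() are
hypothetical names, and the kernel's "goto again" retry is represented
here by returning -EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Models VERIFY_READ / VERIFY_WRITE (hypothetical stand-in). */
enum access_model { ACCESS_READ, ACCESS_WRITE };

/* Hypothetical summary of the state get_futex_key() looks at. */
struct page_model {
    bool writable;      /* a write-access get_user_pages_fast() works */
    bool readable;      /* a read-access get_user_pages_fast() works */
    bool has_mapping;   /* page->mapping is non-NULL */
    bool is_zero_page;  /* page is the ZERO_PAGE */
    bool is_anon;       /* PageAnon() is true */
};

/*
 * Returns 0 if a key could be built, -EFAULT if the access is
 * rejected, and -EAGAIN where the kernel would retry the lookup.
 */
static int futex_key_model(const struct page_model *p, enum access_model rw)
{
    bool ro = false;

    /* Try RW first; fall back to RO only for read-type operations. */
    if (!p->writable) {
        if (rw != ACCESS_READ || !p->readable)
            return -EFAULT;
        ro = true;
    }

    /* No mapping: the ZERO_PAGE never gets one, so do not spin on it. */
    if (!p->has_mapping)
        return p->is_zero_page ? -EFAULT : -EAGAIN;

    /* A RO anonymous page can never change, so waiting on it is invalid. */
    if (p->is_anon && ro)
        return -EFAULT;

    return 0;
}
```

In this model a read-only file-backed page is accepted for ACCESS_READ
and rejected for ACCESS_WRITE, while a read-only anonymous page and the
ZERO_PAGE are rejected in both cases.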

This patch does restore the original behavior with RO MAP_PRIVATE
mappings, which have inherent user-mode usage problems and don't really
make sense.  With this patch, a task performing a FUTEX_WAIT within a RO
MAP_PRIVATE mapping will be successfully woken, provided another process
updates the region of the underlying mapped file.  However, the mmap()
man page states that for a MAP_PRIVATE mapping:

  It is unspecified whether changes made to the file after
  the mmap() call are visible in the mapped region.

So user-mode users attempting to use futex operations on RO MAP_PRIVATE
mappings are depending on unspecified behavior.  Additionally, a waiter
on a RO MAP_PRIVATE mapping could fail to be woken in the following case:

  Thread-A: calls futex(FUTEX_WAIT, memory-region-A).
            get_futex_key() returns an inode-based key.
            Sleeps on the key.
  Thread-B: calls mprotect(PROT_READ|PROT_WRITE, memory-region-A).
  Thread-B: writes to memory-region-A.
            COW happens. This process's memory-region-A is now backed
            by a new COWed private (i.e. PageAnon=1) page.
  Thread-B: calls futex(FUTEX_WAKE, memory-region-A).
            get_futex_key() returns an mm-based key.
            IOW, we fail to wake up Thread-A.

Once again doing something like this is just silly and users who do
something like this get what they deserve.
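
The missed wakeup in the Thread-A/Thread-B sequence comes down to the
two calls computing different keys for the same address.  A simplified
model of that mismatch is sketched below; struct key_model,
key_for_page() and key_match() are hypothetical stand-ins and do not
reflect the real layout of union futex_key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's futex key. */
enum key_kind { KEY_SHARED_INODE, KEY_PRIVATE_MM };

struct key_model {
    enum key_kind kind;
    uintptr_t object;       /* identity of the inode or of the mm */
    unsigned long offset;   /* page index in the file, or virtual address */
};

/* Build the key the way the page type dictates at lookup time. */
static struct key_model key_for_page(bool page_anon, uintptr_t inode,
                                     uintptr_t mm, unsigned long pgoff,
                                     unsigned long address)
{
    struct key_model k;

    if (page_anon) {
        /* After COW the page is anonymous: key on (mm, address). */
        k.kind = KEY_PRIVATE_MM;
        k.object = mm;
        k.offset = address;
    } else {
        /* File-backed page: key on (inode, page offset). */
        k.kind = KEY_SHARED_INODE;
        k.object = inode;
        k.offset = pgoff;
    }
    return k;
}

/* A waker only finds waiters whose key compares equal. */
static bool key_match(const struct key_model *a, const struct key_model *b)
{
    return a->kind == b->kind && a->object == b->object &&
           a->offset == b->offset;
}
```

In this model the waiter's key (built before the COW, page_anon false)
never matches the waker's key (built after the COW, page_anon true),
even though both refer to the same user address.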

While RO MAP_PRIVATE mappings are nonsensical, checking for a private
mapping requires walking the vmas and was deemed too costly to avoid a
userspace hang.

This patch is based on Peter Zijlstra's initial patch with modifications to
only allow RO mappings for futex operations that need VERIFY_READ access.

Reported-by: David Oliver <david@rgmadvisors.com>
Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: peterz@infradead.org
Cc: eric.dumazet@gmail.com
Cc: zvonler@rgmadvisors.com
Cc: hughd@google.com
Link: http://lkml.kernel.org/r/1309450892-30676-1-git-send-email-sbohrer@rgmadvisors.com
Cc: stable@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/futex.c |   54 ++++++++++++++++++++++++++++++++++++++++++------------
 1 files changed, 42 insertions(+), 12 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index fe28dc2..70bb54b 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -218,6 +218,8 @@ static void drop_futex_key_refs(union futex_key *key)
  * @uaddr:	virtual address of the futex
  * @fshared:	0 for a PROCESS_PRIVATE futex, 1 for PROCESS_SHARED
  * @key:	address where result is stored.
+ * @rw:		mapping needs to be read/write (values: VERIFY_READ,
+ *              VERIFY_WRITE)
  *
  * Returns a negative error code or 0
  * The key words are stored in *key on success.
@@ -229,12 +231,12 @@ static void drop_futex_key_refs(union futex_key *key)
  * lock_page() might sleep, the caller should not hold a spinlock.
  */
 static int
-get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key)
+get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key, int rw)
 {
 	unsigned long address = (unsigned long)uaddr;
 	struct mm_struct *mm = current->mm;
 	struct page *page, *page_head;
-	int err;
+	int err, ro = 0;
 
 	/*
 	 * The futex address must be "naturally" aligned.
@@ -262,8 +264,18 @@ get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key)
 
 again:
 	err = get_user_pages_fast(address, 1, 1, &page);
+	/*
+	 * If write access is not required (eg. FUTEX_WAIT), try
+	 * and get read-only access.
+	 */
+	if (err == -EFAULT && rw == VERIFY_READ) {
+		err = get_user_pages_fast(address, 1, 0, &page);
+		ro = 1;
+	}
 	if (err < 0)
 		return err;
+	else
+		err = 0;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	page_head = page;
@@ -305,6 +317,13 @@ again:
 	if (!page_head->mapping) {
 		unlock_page(page_head);
 		put_page(page_head);
+		/*
+		* ZERO_PAGE pages don't have a mapping. Avoid a busy loop
+		* trying to find one. RW mapping would have COW'd (and thus
+		* have a mapping) so this page is RO and won't ever change.
+		*/
+		if ((page_head == ZERO_PAGE(address)))
+			return -EFAULT;
 		goto again;
 	}
 
@@ -316,6 +335,15 @@ again:
 	 * the object not the particular process.
 	 */
 	if (PageAnon(page_head)) {
+		/*
+		 * A RO anonymous page will never change and thus doesn't make
+		 * sense for futex operations.
+		 */
+		if (ro) {
+			err = -EFAULT;
+			goto out;
+		}
+
 		key->both.offset |= FUT_OFF_MMSHARED; /* ref taken on mm */
 		key->private.mm = mm;
 		key->private.address = address;
@@ -327,9 +355,10 @@ again:
 
 	get_futex_key_refs(key);
 
+out:
 	unlock_page(page_head);
 	put_page(page_head);
-	return 0;
+	return err;
 }
 
 static inline void put_futex_key(union futex_key *key)
@@ -940,7 +969,7 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
 	if (!bitset)
 		return -EINVAL;
 
-	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &key);
+	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &key, VERIFY_READ);
 	if (unlikely(ret != 0))
 		goto out;
 
@@ -986,10 +1015,10 @@ futex_wake_op(u32 __user *uaddr1, unsigned int flags, u32 __user *uaddr2,
 	int ret, op_ret;
 
 retry:
-	ret = get_futex_key(uaddr1, flags & FLAGS_SHARED, &key1);
+	ret = get_futex_key(uaddr1, flags & FLAGS_SHARED, &key1, VERIFY_READ);
 	if (unlikely(ret != 0))
 		goto out;
-	ret = get_futex_key(uaddr2, flags & FLAGS_SHARED, &key2);
+	ret = get_futex_key(uaddr2, flags & FLAGS_SHARED, &key2, VERIFY_WRITE);
 	if (unlikely(ret != 0))
 		goto out_put_key1;
 
@@ -1243,10 +1272,11 @@ retry:
 		pi_state = NULL;
 	}
 
-	ret = get_futex_key(uaddr1, flags & FLAGS_SHARED, &key1);
+	ret = get_futex_key(uaddr1, flags & FLAGS_SHARED, &key1, VERIFY_READ);
 	if (unlikely(ret != 0))
 		goto out;
-	ret = get_futex_key(uaddr2, flags & FLAGS_SHARED, &key2);
+	ret = get_futex_key(uaddr2, flags & FLAGS_SHARED, &key2,
+			    requeue_pi ? VERIFY_WRITE : VERIFY_READ);
 	if (unlikely(ret != 0))
 		goto out_put_key1;
 
@@ -1790,7 +1820,7 @@ static int futex_wait_setup(u32 __user *uaddr, u32 val, unsigned int flags,
 	 * while the syscall executes.
 	 */
 retry:
-	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &q->key);
+	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &q->key, VERIFY_READ);
 	if (unlikely(ret != 0))
 		return ret;
 
@@ -1941,7 +1971,7 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags, int detect,
 	}
 
 retry:
-	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &q.key);
+	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &q.key, VERIFY_WRITE);
 	if (unlikely(ret != 0))
 		goto out;
 
@@ -2060,7 +2090,7 @@ retry:
 	if ((uval & FUTEX_TID_MASK) != vpid)
 		return -EPERM;
 
-	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &key);
+	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &key, VERIFY_WRITE);
 	if (unlikely(ret != 0))
 		goto out;
 
@@ -2249,7 +2279,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
 	debug_rt_mutex_init_waiter(&rt_waiter);
 	rt_waiter.task = NULL;
 
-	ret = get_futex_key(uaddr2, flags & FLAGS_SHARED, &key2);
+	ret = get_futex_key(uaddr2, flags & FLAGS_SHARED, &key2, VERIFY_WRITE);
 	if (unlikely(ret != 0))
 		goto out;
 
