All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Oliver <david@rgmadvisors.com>,
	linux-kernel@vger.kernel.org,
	Shawn Bohrer <sbohrer@rgmadvisors.com>,
	Zachary Vonler <zvonler@rgmadvisors.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Hugh Dickins <hughd@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Darren Hart <dvhart@linux.intel.com>, Ingo Molnar <mingo@elte.hu>
Subject: Re: Change in functionality of futex() system call.
Date: Mon, 06 Jun 2011 18:11:12 +0200	[thread overview]
Message-ID: <1307376672.2322.167.camel@twins> (raw)
In-Reply-To: <1307373819.3098.40.camel@edumazet-laptop>

On Mon, 2011-06-06 at 17:23 +0200, Eric Dumazet wrote:
> Le lundi 06 juin 2011 à 09:28 -0500, David Oliver a écrit :
> > Hello,
> > 
> > The functionality of the futex() system call appears to have changed
> > between versions 2.6.18 and 2.6.32.28.
> > 
> > Specifically, performing a FUTEX_WAIT on a read-only mapped location
> > results in an EFAULT. Although other operations, such as FUTEX_WAKE,
> > are only meaningful for writable locations, FUTEX_WAIT is useful for
> > processes with read-only access to a memory-mapped file.
> > 
> > The code below illustrates the changed behavior (each of the EXPECT
> > operations succeed on the older kernel, the ASSERTs pass in each
> > case), assuming the file /tmp/futex_test exists and contains int(42).
> > 
> > With the older kernel, the syscall() suspends until another process
> > changes the file and issues a FUTEX_WAKE, whereas the new behavior is
> > for an EFAULT error, independent of the file contents.
> > 
> > Let me know if you need further clarification.
> > 
> > Cheers!
> > 
> > David Oliver.
> > 
> > 
> > #include <errno.h>
> > #include <fcntl.h>
> > #include <stdint.h>
> > typedef uint32_t u32;   // for futex.h
> > #include <linux/futex.h>
> > #include <sys/mman.h>
> > #include <sys/syscall.h>
> > #include <unistd.h>
> > #include "gtest/gtest.h" // test framework to illustrate issue.
> > 
> > 
> > TEST(Futex, futex_in_read_only_file_is_ok) {
> >   int fd = open("/tmp/futex_test", O_RDONLY);
> >   ASSERT_GE(fd, 0);
> >   int* futex = static_cast<int *>(mmap(0, sizeof(int), PROT_READ,
> > MAP_SHARED, fd, 0));
> >   ASSERT_NE((int *)(0), futex);
> > 
> >   int rc = syscall(SYS_futex, futex, FUTEX_WAIT, 42, 0, 0, 0);
> > 
> >   EXPECT_NE(-1, rc);              // fails.
> >   if (rc == -1) {
> >       EXPECT_NE(errno, EFAULT);   // fails.
> >   }
> > }
> > 
> 
> Right you are, this came from commit 7485d0d3758e8e6491a5 (futexes:
> Remove rw parameter from get_futex_key()) in 2.6.33
> 
> commit 7485d0d3758e8e6491a5c9468114e74dc050785d
> Author: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Date:   Tue Jan 5 16:32:43 2010 +0900
> 
>     futexes: Remove rw parameter from get_futex_key()
>     
>     Currently, futexes have two problem:
>     
>     A) The current futex code doesn't handle private file mappings properly.
>     
>     get_futex_key() uses PageAnon() to distinguish file and
>     anon, which can cause the following bad scenario:
>     
>       1) thread-A call futex(private-mapping, FUTEX_WAIT), it
>          sleeps on file mapping object.
>       2) thread-B writes a variable and it makes it cow.
>       3) thread-B calls futex(private-mapping, FUTEX_WAKE), it
>          wakes up blocked thread on the anonymous page. (but it's nothing)
>     
>     B) Current futex code doesn't handle zero page properly.
> 
>     Read mode get_user_pages() can return zero page, but current
>     futex code doesn't handle it at all. Then, zero page makes
>     infinite loop internally.
>     
>     The solution is to use write mode get_user_page() always for
>     page lookup. It prevents the lookup of both file page of private
>     mappings and zero page.
>     
>     Performance concerns:
>     
>     Probaly very little, because glibc always initialize variables
>     for futex before to call futex(). It means glibc users never see
>     the overhead of this patch.
>     
>     Compatibility concerns:
>     
>     This patch has few compatibility issues. After this patch,
>     FUTEX_WAIT require writable access to futex variables (read-only
>     mappings makes EFAULT). But practically it's not a problem,
>     glibc always initalizes variables for futexes explicitly - nobody
>     uses read-only mappings.

Urgh,. maybe something like the below but with more conditionals that
enable the extra logic only for FUTEX_WAIT..

The idea is to try a RO gup() when the RW gup() fails so as not to slow
down the common path of writable anonymous maps and bail when we used
the RO path on anonymous memory.

---
diff --git a/kernel/futex.c b/kernel/futex.c
index fe28dc2..11f2ad1 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -234,7 +234,7 @@ get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key)
 	unsigned long address = (unsigned long)uaddr;
 	struct mm_struct *mm = current->mm;
 	struct page *page, *page_head;
-	int err;
+	int err, ro = 0;
 
 	/*
 	 * The futex address must be "naturally" aligned.
@@ -262,6 +262,10 @@ get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key)
 
 again:
 	err = get_user_pages_fast(address, 1, 1, &page);
+	if (err == -EFAULT) {
+		err = get_user_pages_fast(address, 1, 0, &page);
+		ro = 1;
+	}
 	if (err < 0)
 		return err;
 
@@ -316,6 +320,11 @@ again:
 	 * the object not the particular process.
 	 */
 	if (PageAnon(page_head)) {
+		if (ro) {
+			err = -EFAULT;
+			goto out;
+		}
+
 		key->both.offset |= FUT_OFF_MMSHARED; /* ref taken on mm */
 		key->private.mm = mm;
 		key->private.address = address;
@@ -327,9 +336,10 @@ again:
 
 	get_futex_key_refs(key);
 
+out:
 	unlock_page(page_head);
 	put_page(page_head);
-	return 0;
+	return err;
 }
 
 static inline void put_futex_key(union futex_key *key)


  parent reply	other threads:[~2011-06-06 16:11 UTC|newest]

Thread overview: 99+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-06 14:28 Change in functionality of futex() system call David Oliver
2011-06-06 15:23 ` Eric Dumazet
2011-06-06 15:56   ` Shawn Bohrer
2011-06-06 16:11   ` Peter Zijlstra [this message]
2011-06-06 16:16     ` Peter Zijlstra
2011-06-06 16:22       ` Eric Dumazet
2011-06-06 16:29         ` Peter Zijlstra
2011-06-06 16:42           ` Eric Dumazet
2011-06-06 17:05             ` Peter Zijlstra
2011-06-06 17:11               ` Eric Dumazet
2011-06-06 17:27                 ` Steven Rostedt
2011-06-06 17:56                   ` Darren Hart
2011-06-06 18:23                 ` Peter Zijlstra
2011-06-06 18:27                   ` Eric Dumazet
2011-06-25  0:00                     ` Darren Hart
2011-06-27 16:48                     ` Shawn Bohrer
2011-06-06 17:53             ` Darren Hart
2011-06-06 18:11               ` Eric Dumazet
2011-06-07  3:13                 ` Darren Hart
2011-06-07  3:49                   ` Eric Dumazet
2011-06-07 14:44                   ` Andy Lutomirski
2011-06-07 15:56                     ` Darren Hart
2011-06-07 15:58                     ` Eric Dumazet
2011-06-07 18:43                       ` Andrew Lutomirski
2011-06-07 19:01                         ` Darren Hart
2011-06-07 19:04                           ` Andrew Lutomirski
2011-06-07 19:06                         ` Eric Dumazet
2011-06-07 19:10                         ` David Oliver
2011-06-07 19:19                           ` Andrew Lutomirski
2011-06-07 19:33                             ` David Oliver
2011-06-07 19:53                               ` Andrew Lutomirski
2011-06-07 20:04                                 ` David Oliver
2011-06-07 20:12                                   ` Andrew Lutomirski
2011-06-07 22:26                             ` Kyle Moffett
2011-06-08 15:20                               ` David Oliver
2011-06-08 15:21                                 ` Andrew Lutomirski
2011-06-08 16:21                                 ` Darren Hart
2011-06-09 11:37                                   ` KOSAKI Motohiro
2011-06-09 12:05                                     ` Peter Zijlstra
2011-06-09 17:58                                       ` Peter Zijlstra
2011-06-10  3:30                                         ` KOSAKI Motohiro
2011-06-10  3:26                                       ` KOSAKI Motohiro
2011-06-07 18:30                 ` Joel Becker
2011-06-09 12:05                 ` Peter Zijlstra
2011-06-10 12:10       ` KOSAKI Motohiro
2011-06-10 17:29         ` Darren Hart
2011-06-13  2:11           ` KOSAKI Motohiro
2011-06-13 15:50             ` Darren Hart
2011-06-15 18:50         ` Shawn Bohrer
2011-06-15 18:54           ` Darren Hart
2011-06-17 13:40             ` Shawn Bohrer
2011-06-22 19:19             ` [PATCH RFC] futex: Fix regression with read only mappings Shawn Bohrer
2011-06-22 20:14               ` Darren Hart
2011-06-23  2:51                 ` KOSAKI Motohiro
2011-06-23 15:26                   ` Darren Hart
2011-06-23 19:49                     ` Shawn Bohrer
2011-06-24 15:59                       ` [PATCH v2] " Shawn Bohrer
2011-06-25  0:37                         ` Darren Hart
2011-06-25 15:10                           ` KOSAKI Motohiro
2011-06-27 16:40                           ` Shawn Bohrer
2011-06-27 18:15                             ` Peter Zijlstra
2011-06-27 20:41                               ` Darren Hart
2011-06-27 21:08                                 ` Shawn Bohrer
2011-06-27 21:39                                   ` Darren Hart
2011-06-27 22:14                                     ` Shawn Bohrer
2011-06-27 23:17                                       ` Darren Hart
2011-06-27 22:22                                     ` [PATCH v3] " Shawn Bohrer
2011-06-28 10:54                                       ` Peter Zijlstra
2011-06-28 14:52                                         ` Darren Hart
2011-06-28 17:38                                           ` Shawn Bohrer
2011-06-28 20:58                                             ` Darren Hart
2011-06-28 23:55                                             ` Darren Hart
2011-06-29 14:56                                               ` Shawn Bohrer
2011-06-29 15:17                                               ` [PATCH v4] " Shawn Bohrer
2011-06-29 18:41                                                 ` Darren Hart
2011-06-29 23:38                                                 ` Thomas Gleixner
2011-06-30  4:19                                                   ` Darren Hart
2011-06-30 14:02                                                     ` David C. Oliver
2011-06-30 15:41                                                       ` Darren Hart
2011-06-30 16:21                                                         ` [PATCH v5] " Shawn Bohrer
2011-07-12 15:27                                                           ` Shawn Bohrer
2011-07-25 15:20                                                           ` Shawn Bohrer
2011-07-25 19:28                                                             ` Thomas Gleixner
2011-07-26 19:04                                                           ` [tip:core/urgent] " tip-bot for Shawn Bohrer
2011-06-28 10:50                                     ` [PATCH v2] " Peter Zijlstra
2011-06-28 14:19                                       ` Darren Hart
2011-06-28 14:23                                         ` Peter Zijlstra
2011-06-23  3:58                 ` [PATCH RFC] " Shawn Bohrer
2011-06-23  3:23             ` Change in functionality of futex() system call KOSAKI Motohiro
  -- strict thread matches above, loose matches on Subject: below --
2011-06-09  0:44 George Spelvin
2011-06-09  3:02 ` Darren Hart
2011-06-09  3:38   ` Andrew Lutomirski
2011-06-09  3:54     ` Eric Dumazet
2011-06-09  4:10       ` Andrew Lutomirski
2011-06-09  5:11         ` Eric Dumazet
2011-06-09 12:12           ` Andrew Lutomirski
2011-06-09  4:43       ` George Spelvin
2011-06-09  5:25         ` Eric Dumazet
2011-06-09  4:44       ` Kyle Moffett

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1307376672.2322.167.camel@twins \
    --to=peterz@infradead.org \
    --cc=david@rgmadvisors.com \
    --cc=dvhart@linux.intel.com \
    --cc=eric.dumazet@gmail.com \
    --cc=hughd@google.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=sbohrer@rgmadvisors.com \
    --cc=tglx@linutronix.de \
    --cc=zvonler@rgmadvisors.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.