From: olof@austin.ibm.com (Olof Johansson)
To: Jamie Lokier <jamie@shareable.org>
Cc: Linus Torvalds <torvalds@osdl.org>, Andrew Morton <akpm@osdl.org>,
linux-kernel@vger.kernel.org, rusty@rustcorp.com.au
Subject: Re: [PATCH/RFC] Futex mmap_sem deadlock
Date: Tue, 22 Feb 2005 16:42:56 -0600
Message-ID: <20050222224256.GA31341@austin.ibm.com>
In-Reply-To: <20050222223457.GK22555@mail.shareable.org>
On Tue, Feb 22, 2005 at 10:34:57PM +0000, Jamie Lokier wrote:
> There is one small but important error: the "return ret" mustn't just
> return. It must call unqueue_me(&q) just like the code at out_unqueue,
> _including_ the conditional "ret = 0", but _excluding_ the up_read().
Not only that, but someone might already have dequeued us, right? It's
probably a pathological case though, i.e. someone did a wake on the same
(bad) address.
How's this patch? It's closer to Linus' pseudo-code than Andrew's, to
avoid the extra get_user() at function entry and keep the common case
path short.
It also includes Andrew's feedback on sys_get_mempolicy(), which makes
the patch even simpler there.
----
Some futex functions call get_user() while holding mmap_sem for
reading. If get_user() faults while another thread is in mmap (or
anywhere else that is waiting in down_write() on the same semaphore),
do_page_fault() will deadlock: it tries to take mmap_sem for reading
again and queues behind the waiting writer. Most architectures seem to
be exposed to this.
To avoid it, make sure the page is available. If not, release the
semaphore, fault it in and retry.
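The retry pattern described above can be sketched in userspace C. This
is only an illustration under stated assumptions, not the kernel code:
sim_down_read(), sim_up_read(), read_nofault(), read_faulting() and
read_user_word() are hypothetical stand-ins for down_read(), up_read(),
a get_user() that is not allowed to fault, a normal get_user(), and the
futex caller respectively.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static int lock_depth;        /* > 0 while the simulated mmap_sem is read-held */
static bool page_resident;    /* is the simulated user page mapped in? */
static int user_word = 42;    /* the simulated user-space value */

static void sim_down_read(void) { lock_depth++; }
static void sim_up_read(void)   { lock_depth--; }

/* Non-faulting read: fails instead of faulting if the page is absent. */
static int read_nofault(int *dst)
{
	if (!page_resident)
		return -1;
	*dst = user_word;
	return 0;
}

/* Faulting read: must only run with the lock dropped; maps the page in. */
static int read_faulting(int *dst)
{
	assert(lock_depth == 0);  /* faulting under the lock is the deadlock case */
	page_resident = true;
	*dst = user_word;
	return 0;
}

/* Read the word under the lock, falling back to fault-in-and-retry. */
static int read_user_word(int *out, int *retries)
{
	int ret;

retry:
	sim_down_read();
	ret = read_nofault(out);
	if (ret) {
		sim_up_read();
		/* Re-do the access outside the lock */
		ret = read_faulting(out);
		if (!ret) {
			(*retries)++;
			goto retry;
		}
		return ret;
	}
	/* ... the value would be used here, still under the lock ... */
	sim_up_read();
	return 0;
}
```

With the simulated page initially absent, a call such as
read_user_word(&val, &retries) takes the fallback path once and then
succeeds on the second pass with the lock held. The value is only
trusted when it was read under the lock, which is why the sketch retries
rather than using the result of the unlocked access.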
I also found another exposure by inspection, in sys_get_mempolicy();
moving some of the code around avoids the possible deadlock there.
Signed-off-by: Olof Johansson <olof@austin.ibm.com>
Index: linux-2.5/kernel/futex.c
===================================================================
--- linux-2.5.orig/kernel/futex.c 2005-02-21 16:09:38.000000000 -0600
+++ linux-2.5/kernel/futex.c 2005-02-22 16:38:24.000000000 -0600
@@ -329,6 +329,7 @@
int ret, drop_count = 0;
unsigned int nqueued;
+ retry:
down_read(&current->mm->mmap_sem);
ret = get_futex_key(uaddr1, &key1);
@@ -355,9 +356,19 @@
before *uaddr1. */
smp_mb();
- if (get_user(curval, (int __user *)uaddr1) != 0) {
- ret = -EFAULT;
- goto out;
+ inc_preempt_count();
+ ret = get_user(curval, (int __user *)uaddr1);
+ dec_preempt_count();
+
+ if (unlikely(ret)) {
+ up_read(&current->mm->mmap_sem);
+ /* Re-do the access outside the lock */
+ ret = get_user(curval, (int __user *)uaddr1);
+
+ if (!ret)
+ goto retry;
+
+ return ret;
}
if (curval != *valp) {
ret = -EAGAIN;
@@ -480,6 +491,7 @@
int ret, curval;
struct futex_q q;
+ retry:
down_read(&current->mm->mmap_sem);
ret = get_futex_key(uaddr, &q.key);
@@ -508,9 +520,21 @@
* We hold the mmap semaphore, so the mapping cannot have changed
* since we looked it up in get_futex_key.
*/
- if (get_user(curval, (int __user *)uaddr) != 0) {
- ret = -EFAULT;
- goto out_unqueue;
+ inc_preempt_count();
+ ret = get_user(curval, (int __user *)uaddr);
+ dec_preempt_count();
+ if (unlikely(ret)) {
+ up_read(&current->mm->mmap_sem);
+
+ if (!unqueue_me(&q)) /* There's a chance we got woken already */
+ return 0;
+
+ /* Re-do the access outside the lock */
+ ret = get_user(curval, (int __user *)uaddr);
+
+ if (!ret)
+ goto retry;
+ return ret;
}
if (curval != val) {
ret = -EWOULDBLOCK;
Index: linux-2.5/mm/mempolicy.c
===================================================================
--- linux-2.5.orig/mm/mempolicy.c 2005-02-04 00:27:40.000000000 -0600
+++ linux-2.5/mm/mempolicy.c 2005-02-22 14:34:19.000000000 -0600
@@ -524,9 +524,13 @@
} else
pval = pol->policy;
- err = -EFAULT;
+ if (vma) {
+ up_read(&current->mm->mmap_sem);
+ vma = NULL;
+ }
+
if (policy && put_user(pval, policy))
- goto out;
+ return -EFAULT;
err = 0;
if (nmask) {