From: Andrew Morton <akpm@osdl.org>
To: olof@austin.ibm.com (Olof Johansson)
Cc: linux-kernel@vger.kernel.org, torvalds@osdl.org,
jamie@shareable.org, rusty@rustcorp.com.au
Subject: Re: [PATCH/RFC] Futex mmap_sem deadlock
Date: Tue, 22 Feb 2005 11:55:03 -0800
Message-ID: <20050222115503.729cd17b.akpm@osdl.org>
In-Reply-To: <20050222190646.GA7079@austin.ibm.com>
olof@austin.ibm.com (Olof Johansson) wrote:
>
> Hi,
>
> Consider a small testcase that spawns off two threads, each thread
> doing a loop of:
>
> buf = mmap /dev/zero MAP_SHARED for 0x100000 bytes
> call sys_futex (buf+page, FUTEX_WAIT, 1, NULL, NULL) for each page in said mmap
> munmap(buf)
> repeat
>
> This will quickly lock up, since the futex_wait code does a
> down_read(mmap_sem), then a get_user().
>
> The do_page_fault code on ppc64 (as well as other architectures) needs
> to take the same semaphore for reading. This is all good until the
> second thread comes into play: its mmap call tries to take the same
> semaphore for writing, which causes the down_read() in do_page_fault
> to get stuck. Classic deadlock.
Yup. Jamie says that the futex code _has_ to hold mmap_sem across the
get_user(). I forget (but could probably locate) the details.
>
> One attempt to fix this is included below. It works, but I'm not entirely
> happy with the fact that it's a somewhat messy solution. If anyone has a
> better idea for how to solve it I'd be all ears.
It's fairly sane. Style-wise I'd be inclined to turn this:
	down_read(&current->mm->mmap_sem);
	while (!check_user_page_readable(current->mm, uaddr1)) {
		up_read(&current->mm->mmap_sem);
		/* Fault in the page through get_user() but discard result */
		if (get_user(curval, (int __user *)uaddr1) != 0)
			return -EFAULT;
		down_read(&current->mm->mmap_sem);
	}
into a standalone helper function.
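
Roughly what such a helper could look like is sketched below. It is a
userspace model, not kernel code, so that it compiles on its own: the
semaphore, the page, check_user_page_readable() and get_user_model() are
simplified stand-ins, and the name futex_lock_and_fault_in() is made up
for illustration. Only the loop structure is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the kernel rwsem: only counts read holders (assumption). */
struct rw_semaphore { int readers; };
struct mm_struct { struct rw_semaphore mmap_sem; };

static void down_read(struct rw_semaphore *sem) { sem->readers++; }
static void up_read(struct rw_semaphore *sem)
{
	assert(sem->readers > 0);	/* catch unbalanced release */
	sem->readers--;
}

/* Stand-in for a user page: mapped or not, faulted in or not. */
struct fake_page { bool mapped; bool resident; int value; };

/* Models check_user_page_readable(): true once the page is present. */
static bool check_user_page_readable(struct mm_struct *mm, struct fake_page *p)
{
	(void)mm;
	return p->mapped && p->resident;
}

/* Models get_user(): faults the page in, or fails if it is unmapped. */
static int get_user_model(int *val, struct fake_page *p)
{
	if (!p->mapped)
		return -EFAULT;
	p->resident = true;
	*val = p->value;
	return 0;
}

/*
 * Hypothetical helper. Returns 0 with mmap_sem held for reading and the
 * page present, or -EFAULT with mmap_sem released.
 */
static int futex_lock_and_fault_in(struct mm_struct *mm, struct fake_page *uaddr)
{
	int curval;

	down_read(&mm->mmap_sem);
	while (!check_user_page_readable(mm, uaddr)) {
		up_read(&mm->mmap_sem);
		/* Fault in the page without the lock; discard the value */
		if (get_user_model(&curval, uaddr) != 0)
			return -EFAULT;
		down_read(&mm->mmap_sem);
	}
	return 0;
}
```

The point of pulling it out is that the caller sees one contract: on
success the lock is held and the page was readable when checked, on
failure nothing is held.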
> --- linux-2.5.orig/mm/mempolicy.c 2005-02-04 00:27:40.000000000 -0600
> +++ linux-2.5/mm/mempolicy.c 2005-02-21 16:43:08.000000000 -0600
> @@ -486,6 +486,7 @@
> struct mm_struct *mm = current->mm;
> struct vm_area_struct *vma = NULL;
> struct mempolicy *pol = current->mempolicy;
> + DECLARE_BITMAP(nodes, MAX_NUMNODES);
>
> if (flags & ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR))
> return -EINVAL;
> @@ -524,16 +525,21 @@
> } else
> pval = pol->policy;
>
> + if (nmask)
> + get_zonemask(pol, nodes);
> +
> + if (vma) {
> + up_read(&current->mm->mmap_sem);
> + vma = NULL;
> + }
> +
OK.
> err = -EFAULT;
> if (policy && put_user(pval, policy))
> goto out;
>
> err = 0;
> - if (nmask) {
> - DECLARE_BITMAP(nodes, MAX_NUMNODES);
> - get_zonemask(pol, nodes);
> + if (nmask)
> err = copy_nodes_to_user(nmask, maxnode, nodes, sizeof(nodes));
> - }
>
> out:
> if (vma)
I don't think we need to hold mmap_sem while running get_zonemask(). `pol'
is a copy of current->mempolicy, and it won't be going away.