From: Andrew Morton <akpm@osdl.org>
To: Jamie Lokier <jamie@shareable.org>
Cc: olof@austin.ibm.com, linux-kernel@vger.kernel.org,
torvalds@osdl.org, rusty@rustcorp.com.au
Subject: Re: [PATCH/RFC] Futex mmap_sem deadlock
Date: Tue, 22 Feb 2005 13:40:10 -0800 [thread overview]
Message-ID: <20050222134010.4c286e64.akpm@osdl.org> (raw)
In-Reply-To: <20050222210752.GG22555@mail.shareable.org>
Jamie Lokier <jamie@shareable.org> wrote:
>
> ...
>
> > > One attempt to fix this is included below. It works, but I'm not entirely
> > > happy with the fact that it's a bit messy solution. If anyone has a
> > > better idea for how to solve it I'd be all ears.
> >
> > It's fairly sane. Style-wise I'd be inclined to turn this:
> >
> > down_read(¤t->mm->mmap_sem);
> > while (!check_user_page_readable(current->mm, uaddr1)) {
> > up_read(¤t->mm->mmap_sem);
> > /* Fault in the page through get_user() but discard result */
> > if (get_user(curval, (int __user *)uaddr1) != 0)
> > return -EFAULT;
> > down_read(¤t->mm->mmap_sem);
> > }
>
> That won't work because the vma lock must be help between key
> calculation and get_user() - otherwise futex is not reliable. It
> would work if the futex key calculation was inside the loop.
All the above is trying to do is to convert the initial down_read(mmap_sem)
into a function which, on exit, guarantees that
a) down_read(mmap_sem) is held and
b) the subsequent get_user() of that address will not generate a pagefault.
So it shouldn't affect the futex code's atomicity at all.
However the pte can get unmapped by memory reclaim so we could still take a
minor fault, and hit the same deadlock, yes?
> A much simpler solution (and sorry for not offering it earlier,
> because Andrew Morton pointed out this bug long ago, but I was busy), is:
>
> In futex.c:
>
> down_read(¤t->mm->mmap_sem);
> get_futex_key(...) etc.
> queue_me(...) etc.
> current->flags |= PF_MMAP_SEM; <- new
> ret = get_user(...);
> current->flags &= PF_MMAP_SEM; <- new
> /* the rest */
>
> And in arch/*/mm/fault.c, replace every one of these:
>
> down_read(&mm->mmap_sem);
>
> up_read(&mm->mmap_sem);
>
> with these:
>
> if (!(current & PF_MMAP_SEM))
> down_read(&mm->mmap_sem);
>
> if (!(current & PF_MMAP_SEM))
> up_read(&mm->mmap_sem);
>
Yes, that will work. However I do feel that it's cleaner to localise this
nastiness into a single function which the futex code calls, rather than
spreading it all around and adding overhead to every pagefault. If we can
work out how.
wrt this down_read/down_write/down_read deadlock: iirc, the reason why
down_write() takes precedence over down_read() is to avoid the permanent
writer starvation which would occur if there is heavy down_read() traffic.
As Linus points out, an alternative would be to do an inc_preempt_count()
around the offending get_user(), then use __copy_from_user_inatomic(), then
take some sort of remedial action if __copy_from_user_inatomic() returns a
fault. Something like:
retry:
if (get_user(uaddr) == -EFAULT)
return -EFAULT;
down_read(mmap_sem);
inc_preempt_count();
if (__copy_from_user_inatomic(..., uaddr)) {
up_read(mmap_sem);
dec_preempt_count();
goto retry;
}
dec_preempt_count();
up_read(mmap_sem);
next prev parent reply other threads:[~2005-02-22 21:41 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-02-22 19:06 [PATCH/RFC] Futex mmap_sem deadlock Olof Johansson
2005-02-22 19:36 ` Linus Torvalds
2005-02-22 21:16 ` Benjamin Herrenschmidt
2005-02-22 21:19 ` Benjamin Herrenschmidt
2005-02-22 21:31 ` Linus Torvalds
2005-02-22 21:42 ` Benjamin Herrenschmidt
2005-02-22 22:10 ` Linus Torvalds
2005-02-22 22:24 ` Benjamin Herrenschmidt
2005-02-22 23:08 ` Greg KH
2005-02-23 11:24 ` David Howells
2005-02-22 19:55 ` Andrew Morton
2005-02-22 21:07 ` Jamie Lokier
2005-02-22 21:19 ` Olof Johansson
2005-02-22 22:09 ` Jamie Lokier
2005-02-22 21:19 ` Chris Friesen
2005-02-22 21:27 ` Jamie Lokier
2005-02-22 21:30 ` Linus Torvalds
2005-02-22 22:34 ` Jamie Lokier
2005-02-22 22:42 ` Olof Johansson
2005-02-22 23:20 ` Andrew Morton
2005-02-22 23:23 ` Olof Johansson
2005-02-23 11:39 ` David Howells
2005-02-23 16:22 ` Olof Johansson
2005-02-23 18:44 ` David Howells
2005-02-23 14:49 ` Joe Korty
2005-02-23 15:54 ` Linus Torvalds
2005-02-23 17:10 ` Olof Johansson
2005-02-23 17:37 ` Arjan van de Ven
2005-02-23 18:22 ` Jamie Lokier
2005-02-23 18:34 ` Linus Torvalds
2005-02-23 18:49 ` Jamie Lokier
2005-02-23 19:12 ` Olof Johansson
2005-02-23 22:00 ` Linus Torvalds
2005-02-24 0:00 ` Jamie Lokier
2005-02-23 18:37 ` Olof Johansson
2005-02-22 21:40 ` Andrew Morton [this message]
2005-02-22 21:59 ` Linus Torvalds
2005-02-23 11:42 ` David Howells
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20050222134010.4c286e64.akpm@osdl.org \
--to=akpm@osdl.org \
--cc=jamie@shareable.org \
--cc=linux-kernel@vger.kernel.org \
--cc=olof@austin.ibm.com \
--cc=rusty@rustcorp.com.au \
--cc=torvalds@osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox