linux-kernel.vger.kernel.org archive mirror
From: Jamie Lokier <jamie@shareable.org>
To: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: bert hubert <ahu@ds9a.nl>, Andrew Morton <akpm@osdl.org>,
	linux-kernel@vger.kernel.org, rusty@rustcorp.com.au,
	mingo@elte.hu
Subject: Re: Futex queue_me/get_user ordering
Date: Tue, 16 Nov 2004 14:58:03 +0000
Message-ID: <20041116145803.GA15599@mail.shareable.org>
In-Reply-To: <4199BAA0.1070608@jp.fujitsu.com>

Hidetoshi Seto wrote:
> I have to deeply apologize to all for my mistake.
> If my understanding is correct, this bug is "2.4 futex"(RHEL3) *SPECIFIC*!!
> I had swallowed the story that 2.6 futex has the same problem...

Wrong, 2.6 has the same behaviour!

> So I realize that 2.6 futex never behaves:
> >>      "returns 0 if the futex was not equal to the expected value, but
> >>       the process was woken by a FUTEX_WAKE call."
> 
> Update of manpage is now unnecessary, I think.

It is necessary.

> First of all, I would appreciate if you could read my old post:
> "Kernel bug in futex_wait, cause application hang with NPTL"
> http://www.ussg.iu.edu/hypermail/linux/kernel/0409.0/2044.html

> If my understanding is correct, 2.6 futex does not get any spinlocks,
> but a semaphore:
>
> static int futex_wake(unsigned long uaddr, int nr_wake)
>   :
>         down_read(&current->mm->mmap_sem);
>
> static int futex_wait(unsigned long uaddr, int val, unsigned long time)
>   :
>         down_read(&current->mm->mmap_sem);

> This semaphore prevents a waiter that is temporarily queued to check the val
> from being the target of a wakeup.

No, because it's a read-write semaphore, and we do "down_read" on it
which is a shared lock.  It does not prevent concurrent wake and wait
operations!

The only reason we use this semaphore is to block against vma-changing
operations (like mmap) while we look up the futex key and memory word.

> (If it is not possible that there are threads which go around with same
> futex/condvar but each have different mmap_sem,)

Actually it is possible, with process-shared condvars, but it's
irrelevant because down_read doesn't prevent concurrent wakes and
waits.

[About 2.4 futex in RHEL3U2 which takes spinlocks instead]:
> However, these spinlocks fail to prevent topical waiters from wakeups.
> Because the spinlocks are released *before* unqueue_me(&q) (line 343 & 373).
> So this failure allows wake_Y to touch the queue while wait_A is in it.

This order is necessary, because it's not safe to call get_user()
while holding any spinlocks.  It is not a bug in RHEL.
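
The ordering under discussion (queue first, drop the queue lock, then read the user word) and its effect on the return value can be modelled in a few lines. This is a single-threaded sketch under stated assumptions, not the kernel's implementation: every `model_*` name, the `wake_races_in` parameter and the `MODEL_WOULD_SLEEP` sentinel are hypothetical, and a plain pointer read stands in for get_user().

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_WOULD_SLEEP 1   /* hypothetical sentinel: the waiter would block here */

struct model_waiter {
    bool queued;              /* true while the waiter sits on the hash queue */
};

static void model_queue_me(struct model_waiter *w)
{
    w->queued = true;
}

/* Returns true if the waiter was still queued, i.e. nobody had woken it. */
static bool model_unqueue_me(struct model_waiter *w)
{
    bool was_queued = w->queued;

    w->queued = false;
    return was_queued;
}

/* Model of a waker: dequeues the waiter and returns how many were woken. */
static int model_wake(struct model_waiter *w)
{
    if (!w->queued)
        return 0;
    w->queued = false;
    return 1;
}

/*
 * Model of the wait path. If wake_races_in is set, a wake is simulated in
 * the window after queueing (queue lock already dropped) and before the
 * user word is read. *woken receives the number of waiters that wake took.
 */
static int model_futex_wait(const int *uaddr, int val,
                            bool wake_races_in, int *woken)
{
    struct model_waiter w = { false };
    int curval;

    *woken = 0;
    model_queue_me(&w);          /* 1. queue, then drop the queue lock */

    if (wake_races_in)
        *woken = model_wake(&w); /* a concurrent wake can see us here */

    curval = *uaddr;             /* 2. stands in for get_user(), no lock held */

    if (curval != val) {
        if (model_unqueue_me(&w))
            return -EWOULDBLOCK; /* nobody woke us: report the mismatch */
        return 0;                /* a wake consumed us: report it as a wakeup */
    }
    return MODEL_WOULD_SLEEP;    /* value matched: the real code would sleep */
}
```

With the word set to 1 and an expected value of 0, the model returns `-EWOULDBLOCK` when no wake intervenes and 0 when one does, which is the case the quoted manpage wording describes. Reporting 0 in the second case keeps the wakeup from being silently lost, since the waker has already counted this waiter as woken.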

> At least 2.4 futex in RHEL3U2 is buggy.

I don't think it is, because I think the behaviour you'll see with
RHEL3U2 is no different than 2.6, just slower ;)

-- Jamie
