From: Joe Seigh <jseigh_01@xemaps.com>
To: linux-kernel@vger.kernel.org
Subject: Re: Futex queue_me/get_user ordering
Date: Sun, 28 Nov 2004 12:36:57 -0500
Message-ID: <41AA0CB9.CB30715A@xemaps.com>
In-Reply-To: 20041126170649.GA8188@mail.shareable.org
Jamie Lokier wrote:
>
> I've looked at the lost-wakeups problem with NPTL condition
> variables and 2.6 futex, with the help of Jakub's finely presented
> pseudo-code. Unless I've made a mistake, it is fixable in userspace.
>
> [ It might be more efficient to fix it in kernel space - on the other
> hand, doing so might also make kernel futexes slower. In general, I
> prefer if the kernel futex semantics can be as "loose" as possible
> to minimise the locking they are absolutely required to do. Who
> knows, we might come up with an algorithm that uses even less
> cross-CPU traffic in the kernel, if the semantics permit it.
> However, I appreciate that a more "atomic" kernel semantic is easier
> to understand, and it is possible to implement that if it is really
> worth doing. I would like to see benchmarks proving it doesn't slow
> down normal futex stress tests though. It might not be slower at all. ]
[...]
> 5. Like 4, but in the kernel. We change the kernel to _always_
> retransmit a wakeup if it's received by the unqueue_me() in the
> word-didn't-match branch.
>
> Effect: In the "Drowsy" state, a waiter may accept a WAKE token
> but then it will offer it again so they are never lost from
> "Sleeping" states.
>
> NOTE: This is NOT equivalent to changing the kernel to do
> test-and-queue atomically. With this change, a FUTEX_WAKE
> operation can return to userspace _before_ the final
> destination of the WAKE token decides to begin FUTEX_WAIT.
>
> This will result in spurious extra wakeups, erring too far the
> other way, because of the difference from atomicity described
> in the preceding paragraph.
>
> Therefore, I don't like this. It would fix the NPTL condition
> variables, but it has three drawbacks:
>
> - It violates conservation of WAKE tokens (like energy and
> momentum), which some other futex-using code may depend
> on - unless the return value from FUTEX_WAIT is changed
> to report 1 when it receives a token or 2 when it
> forwards it successfully.
>
> - Some spurious wakeups at times when a wakeup is not
> required.
>
> - No logical benefit over doing it in userspace, but
> would take away flexibility if kernel always did it.
>
I think this is similar to a solution that I proposed elsewhere: you wake up
some other thread, if any, that is waiting on the futex. This breaks conservation
of what you call WAKE tokens, but as far as I can tell wait morphing with
FUTEX_CMP_REQUEUE already does that. A FUTEX_WAIT that has been requeued onto
another futex could return EINTR instead of zero (one of the reasons you can't
loop on EINTR in the cond wait code).
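To make the token-forwarding idea concrete, here is a toy, single-threaded
model of it. It is only a sketch under stated assumptions: it is not kernel
code and makes no futex system calls, and every name in it (toy_futex,
toy_wake_one, toy_mismatch, the state names) is hypothetical. "Drowsy" means
a waiter that is queued but has not yet compared the futex word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical waiter states for the model; not kernel identifiers. */
enum waiter_state {
    AWAKE,       /* not queued on the futex */
    DROWSY,      /* queued, futex word not checked yet */
    SLEEPING,    /* queued and blocked, word matched */
    WOKEN,       /* took a WAKE token while Sleeping */
    DROWSY_HIT   /* took a WAKE token while still Drowsy */
};

#define MAX_WAITERS 8

struct toy_futex {
    enum waiter_state w[MAX_WAITERS];
    size_t n;    /* number of slots in use */
};

/* Hand one WAKE token to the first queued waiter.
 * Returns 1 if a waiter took the token, 0 if nobody was queued. */
static int toy_wake_one(struct toy_futex *f)
{
    for (size_t i = 0; i < f->n; i++) {
        if (f->w[i] == SLEEPING) { f->w[i] = WOKEN; return 1; }
        if (f->w[i] == DROWSY)   { f->w[i] = DROWSY_HIT; return 1; }
    }
    return 0;
}

/* Waiter i finishes its word check and finds a mismatch, so it leaves
 * the queue. If it took a token while Drowsy, the token is either
 * dropped (forward == false) or passed on to another queued waiter
 * (forward == true). Returns the number of tokens forwarded (0 or 1). */
static int toy_mismatch(struct toy_futex *f, size_t i, bool forward)
{
    assert(i < f->n);
    bool had_token = (f->w[i] == DROWSY_HIT);
    f->w[i] = AWAKE;             /* dequeue before forwarding */
    if (had_token && forward)
        return toy_wake_one(f);
    return 0;
}
```

Starting from one Drowsy and one Sleeping waiter, a toy_wake_one() followed
by toy_mismatch(..., false) leaves the sleeper asleep, which is the lost
wakeup; with forward set to true the sleeper ends up WOKEN. In both cases
the waker was already told that one waiter had been woken before the final
recipient was decided, which is the non-atomicity the quoted NOTE describes.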
I did an alternate lock-free implementation of pthread condition variables with
a workaround of sorts for the futex wake preemption problem I mentioned earlier.
I get a 3x to 200x performance improvement, depending on the workload. So
naturally I would be interested in a solution that doesn't require a userspace
bottleneck.
Joe Seigh