From: Jakub Jelinek <jakub@redhat.com>
To: Jamie Lokier <jamie@shareable.org>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>,
mingo@elte.hu, Andrew Morton <akpm@osdl.org>,
linux-kernel@vger.kernel.org, rusty@rustcorp.com.au, ahu@ds9a.nl,
drepper@redhat.com
Subject: Re: Futex queue_me/get_user ordering
Date: Mon, 29 Nov 2004 06:24:26 -0500
Message-ID: <20041129112426.GO10340@devserv.devel.redhat.com>
In-Reply-To: <20041126170649.GA8188@mail.shareable.org>
On Fri, Nov 26, 2004 at 05:06:49PM +0000, Jamie Lokier wrote:
Let's start with the questions:
> A few questions:
>
> 1. Why are total_seq and so on 64 bit quantities?
>
> The comparison problem on overflow is solvable by changing
> (total_seq > wakeup_seq) to (int32_t) (total_seq -
> wakeup_seq) > 0, just like the kernel does with jiffies.
>
> If you imagine the number of waiters to exceed 2^31, you have
> bigger problems, because:
>
> 2. futex is 32 bits and can overflow. If a waiter blocks, then
> a waker is called 2^32 times in succession before the waiter
> can schedule again, the waiter will remain blocked after the
> waker returns.
>
> This is unlikely, except where it's done deliberately
> (e.g. SIGSTOP/CONT), and it's a bug and it only needs two
> threads! It could perhaps be used for denial of service.
The 32-bit overflow is only a problem if you get scheduled away
between releasing the CV's internal lock, i.e.
	lll_mutex_unlock (cond->__data.__lock);
and
	if (get_user(curval, (int __user *)uaddr) != 0) {
in the kernel, and don't get scheduled again until 2^31
pthread_cond_{*wait,signal,broadcast} calls have been made in the meantime.
Nothing on the userland side would block there, and in the kernel the
only places you can block are down_read on the mm's mmap_sem (but if the
writer lock is held that long, other pthread_cond_* calls couldn't get
in either) and the short-term spinlocks on the hash bucket.
SIGSTOP/SIGCONT affect the whole process, so unless you are talking
about process-shared condvars, these signals aren't going to help you
exploit it.
But once you get past that point, current NPTL doesn't care if 2^31 or
more other cv calls happen: it uses the 64-bit vars to determine what to
do, and they are big enough that overflows on them are just assumed not to
happen. Only past that point is the thread blocked in longer-term
waiting.
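
For reference, the jiffies-style comparison proposed in question 1, and the
2^31 limit discussed above, can be sketched roughly as follows. This is only
an illustration: seq_after is a hypothetical helper name, not something from
NPTL or the kernel, and the sketch assumes the usual two's-complement
conversion from uint32_t to int32_t.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an NPTL or kernel function): returns nonzero
 * if 32-bit sequence counter a is "after" b, treating both as points on
 * a circle.  The answer is only meaningful while the two counters are
 * less than 2^31 apart.  Converting an out-of-range value to int32_t is
 * implementation-defined in C; two's-complement wrapping is assumed.
 */
static int seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(uint32_t)(a - b) > 0;
}
```

With this helper, seq_after(2, 0xFFFFFFFE) is true even though the counter
has wrapped, while seq_after(0x80000000, 0) is false: once the counters are
2^31 or more apart, the comparison gives the wrong answer, which is the
window described above.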
> 3. Why is futex incremented in pthread_cond_wait?
> I don't see the reason for it.
See
https://www.redhat.com/archives/phil-list/2004-May/msg00023.html
https://www.redhat.com/archives/phil-list/2004-May/msg00022.html
The __data.__futex increments in pthread_cond_{signal,broadcast} are there
so that pthread_cond_*wait detects a pthread_cond_{signal,broadcast} that
happened between releasing the internal cv lock in *wait and being queued
on the futex's wait queue. The __data.__futex increments in pthread_cond_*wait
are there so that FUTEX_CMP_REQUEUE in pthread_cond_broadcast detects a
pthread_cond_*wait that happened between releasing the internal
lock in *broadcast and the test in FUTEX_CMP_REQUEUE.
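
The two detection cases above can be modelled with a toy sketch like the
one below. It is not the NPTL code: struct cv_model and every model_*
function are hypothetical names, only the 32-bit futex word is represented,
and the kernel's compare step is reduced to a plain comparison that returns
the errno the real operation would report on a mismatch.

```c
#include <assert.h>
#include <errno.h>

/* Toy model: only the 32-bit futex word of the condvar is represented. */
struct cv_model {
    unsigned int futex;
};

/* Waiter, conceptually under the internal lock: bump the word and
 * remember the value it will later pass to FUTEX_WAIT. */
static unsigned int model_wait_prepare(struct cv_model *cv)
{
    return ++cv->futex;
}

/* Signaller or broadcaster, conceptually under the internal lock: bump
 * the word and remember the value (simplified; assumed to be a plain
 * increment in this model). */
static unsigned int model_wake_prepare(struct cv_model *cv)
{
    return ++cv->futex;
}

/* Model of the value check in FUTEX_WAIT: refuse to sleep if the word
 * no longer holds the value the waiter sampled. */
static int model_futex_wait(const unsigned int *uaddr, unsigned int expected)
{
    if (*uaddr != expected)
        return -EWOULDBLOCK;
    return 0; /* the real call would go to sleep here */
}

/* Model of the value check in FUTEX_CMP_REQUEUE: refuse to requeue if a
 * new waiter changed the word after the broadcaster sampled it. */
static int model_futex_cmp_requeue(const unsigned int *uaddr,
                                   unsigned int expected)
{
    if (*uaddr != expected)
        return -EAGAIN;
    return 0; /* the real call would wake and requeue waiters here */
}
```

In this model, a model_wake_prepare call that lands between
model_wait_prepare and model_futex_wait makes the wait fail with
-EWOULDBLOCK instead of sleeping through the signal. Likewise, a
model_wait_prepare call that lands between model_wake_prepare and
model_futex_cmp_requeue makes the requeue fail with -EAGAIN, so the
broadcaster can fall back to some other strategy, such as waking everyone.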
> 4. In pthread_cond_broadcast, why is the mutex_unlock(lock)
> dropped before calling FUTEX_CMP_REQUEUE? Wouldn't it be
> better to drop the lock just after, in which case
> FUTEX_REQUEUE would be fine?
>
> pthread_cond_signal has no problem with holding the lock
> across FUTEX_WAKE, and I do not see any reason why that would
> be different for pthread_cond_broadcast.
Holding the internal lock over the requeue kills the performance of
broadcast: if you hold it across the requeue, all the threads you
wake up will just block on the internal lock anyway.
Jakub