From: Eric Dumazet <dada1@cosmosbay.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ulrich Drepper <drepper@gmail.com>, Andi Kleen <ak@suse.de>,
Ravikiran G Thirumalai <kiran@scalex86.org>,
"Shai Fultheim (Shai@scalex86.org)" <shai@scalex86.org>,
pravin b shelar <pravin.shelar@calsoftinc.com>,
linux-kernel@vger.kernel.org
Subject: Re: [RFC] NUMA futex hashing
Date: Tue, 8 Aug 2006 18:08:34 +0200
Message-ID: <200608081808.34708.dada1@cosmosbay.com>
In-Reply-To: <44D8A9BE.3050607@yahoo.com.au>
On Tuesday 08 August 2006 17:11, Nick Piggin wrote:
> Ulrich Drepper wrote:
> > On 8/8/06, Eric Dumazet <dada1@cosmosbay.com> wrote:
> >> The validity of the virtual address is still tested by normal get_user()
> >> call.. If the memory was freed by a thread, then a normal EFAULT error
> >> will
> >> be reported... eventually.
> >
> > This is indeed what should be done. Private futexes are the by far
> > more frequent case and I bet you'd see improvements when avoiding the
> > mm mutex even for normal machines since futexes really are everywhere.
> > For shared mutexes you end up doing two lookups and that's fine IMO
> > as long as the first lookup is fast.
>
> The private futex's namespace is its virtual address, so I don't see
> how you can decouple that from the management of virtual addresses.
>
> Let me get this straight: to insert a contended futex into your rbtree,
> you need to hold the mmap sem to ensure that address remains valid,
> then you need to take a lock which protects your rbtree. Then to wake
> up a process and remove the futex, you need to take the rbtree lock. Or
> to unmap any memory you also need to take the rbtree lock and ensure
> there are no futexes there.
>
> So you just add another lock for no reason, or have I got a few screws
> loose myself? I don't see how you can significantly reduce lock
> cacheline bouncing in a futex heavy workload if you're just going to
> add another shared data structure. But if you can, sweet ;)
We certainly can. But if you insist on using mmap_sem at all, then we have a
problem.

An rbtree would not reduce cacheline bouncing, so:

We could use a hashtable (allocated on demand) of size N, with N depending on
NR_CPUS for example, each chain protected by its own private spinlock. If N is
well chosen, we might reduce lock cacheline bouncing (different threads
fighting on different private futexes would have a good chance of getting
different cachelines in this hashtable).
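
To make the idea concrete, here is a rough userspace sketch of such a table,
not kernel code. Every name in it (pfutex_table, pfutex_bucket, and so on) is
hypothetical, the 64-byte cacheline is an assumption, and so is the choice of
N = 4 * ncpus rounded up to a power of two. A C11 atomic_flag stands in for
the per-chain spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PFUTEX_CACHELINE 64          /* assumed cacheline size */

struct pfutex_waiter {
    uintptr_t uaddr;                 /* user virtual address of the futex word */
    struct pfutex_waiter *next;
};

/* One chain per cacheline, so two chains never share a lock cacheline. */
struct pfutex_bucket {
    _Alignas(PFUTEX_CACHELINE) atomic_flag lock;  /* private spinlock */
    struct pfutex_waiter *chain;
};

struct pfutex_table {
    size_t nbuckets;                 /* always a power of two */
    struct pfutex_bucket *buckets;
};

/* Assumption: N = 4 * ncpus, rounded up to a power of two. */
size_t pfutex_table_size(unsigned ncpus)
{
    size_t n = 1;
    while (n < (size_t)ncpus * 4)
        n <<= 1;
    return n;
}

struct pfutex_table *pfutex_table_create(unsigned ncpus)
{
    struct pfutex_table *t = malloc(sizeof(*t));
    if (!t)
        return NULL;
    t->nbuckets = pfutex_table_size(ncpus);
    t->buckets = aligned_alloc(PFUTEX_CACHELINE,
                               t->nbuckets * sizeof(*t->buckets));
    if (!t->buckets) {
        free(t);
        return NULL;
    }
    for (size_t i = 0; i < t->nbuckets; i++) {
        atomic_flag_clear(&t->buckets[i].lock);
        t->buckets[i].chain = NULL;
    }
    return t;
}

struct pfutex_bucket *pfutex_bucket_for(struct pfutex_table *t, uintptr_t uaddr)
{
    assert((t->nbuckets & (t->nbuckets - 1)) == 0);
    /* Futex words are 4-byte aligned: drop the low bits, then mix. */
    uint64_t h = (uint64_t)(uaddr >> 2) * 0x9E3779B97F4A7C15ULL;
    return &t->buckets[(h >> 32) & (t->nbuckets - 1)];
}

static void bucket_lock(struct pfutex_bucket *b)
{
    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ;                            /* spin until the chain lock is free */
}

static void bucket_unlock(struct pfutex_bucket *b)
{
    atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Queue a waiter; only the lock of its own chain is taken. */
void pfutex_enqueue(struct pfutex_table *t, struct pfutex_waiter *w)
{
    struct pfutex_bucket *b = pfutex_bucket_for(t, w->uaddr);
    bucket_lock(b);
    w->next = b->chain;
    b->chain = w;
    bucket_unlock(b);
}

/* Unlink and return one waiter queued on uaddr, or NULL if there is none. */
struct pfutex_waiter *pfutex_dequeue_one(struct pfutex_table *t, uintptr_t uaddr)
{
    struct pfutex_bucket *b = pfutex_bucket_for(t, uaddr);
    struct pfutex_waiter *found = NULL;
    bucket_lock(b);
    for (struct pfutex_waiter **pp = &b->chain; *pp; pp = &(*pp)->next) {
        if ((*pp)->uaddr == uaddr) {
            found = *pp;
            *pp = found->next;
            break;
        }
    }
    bucket_unlock(b);
    return found;
}
```

Nothing here takes a process-wide lock: a wait or wake on one futex only
touches the cacheline of the chain its address hashes to.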
As soon as a process enters the 'private futex' code, the futex code allocates
this hashtable if the process has a NULL hash table (set to NULL at exec()
time, or maybe re-allocated, because we want to be sure the futex syscall
always succeeds (no ENOMEM)).
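
The on-demand variant could look roughly like the sketch below, again in
userspace C with hypothetical names (sketch_mm, lazy_hash,
sketch_get_private_hash). It uses a compare-and-swap so that two threads
entering the private futex code at the same time end up with the same table.
Note that this variant can fail with ENOMEM on first use, which is exactly
why allocating up front might be preferable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-process private futex hashtable. */
struct lazy_hash {
    int placeholder;                 /* the chains would live here */
};

/* Hypothetical stand-in for the per-process structure. */
struct sketch_mm {
    _Atomic(struct lazy_hash *) private_hash;   /* NULL after exec() */
};

/* Return the process's table, allocating it on first use. */
struct lazy_hash *sketch_get_private_hash(struct sketch_mm *mm)
{
    struct lazy_hash *h = atomic_load(&mm->private_hash);
    if (h)
        return h;                    /* fast path: already allocated */

    struct lazy_hash *fresh = calloc(1, sizeof(*fresh));
    if (!fresh)
        return NULL;                 /* the caller would report ENOMEM */

    struct lazy_hash *expected = NULL;
    if (atomic_compare_exchange_strong(&mm->private_hash, &expected, fresh))
        return fresh;                /* we installed the table */

    free(fresh);                     /* another thread won the race */
    return expected;                 /* use the table it installed */
}
```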
So we really can... but only for 'private futexes', which are the vast
majority of the futexes needed by a typical program (using the POSIX pshared
thread mutex attribute PTHREAD_PROCESS_PRIVATE, currently not used by NPTL
glibc).

Of course we would need a new syscall, and to change glibc so that it can
actually use this new private_futex syscall.

That is probably still a lot of work, but it could help heavily threaded
programs by not touching mmap_sem.

We might have a refcounting problem on this 'hashtable', since several threads
share this structure, but only at thread creation/destruction, not in the
futex call (i.e. no cacheline bouncing on the refcount).
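
A small userspace sketch of that refcounting rule, with hypothetical names
(pfutex_hash, sketch_thread, and the helper functions): the count is written
only when a thread is created or exits, and the futex path just reads the
pointer the thread already holds.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared per-process hashtable. */
struct pfutex_hash {
    atomic_int users;                /* number of threads sharing the table */
    /* the chains would live here */
};

/* Hypothetical per-thread state. */
struct sketch_thread {
    struct pfutex_hash *hash;        /* set at creation, stable afterwards */
};

struct pfutex_hash *pfutex_hash_alloc(void)
{
    struct pfutex_hash *h = malloc(sizeof(*h));
    if (h)
        atomic_init(&h->users, 1);   /* the first thread holds one reference */
    return h;
}

/* Thread creation: the only place where the count goes up. */
void sketch_thread_create(struct sketch_thread *child,
                          const struct sketch_thread *parent)
{
    atomic_fetch_add(&parent->hash->users, 1);
    child->hash = parent->hash;
}

/* Thread exit: the only place where the count goes down.
 * Returns true if this was the last user and the table was freed. */
bool sketch_thread_exit(struct sketch_thread *t)
{
    struct pfutex_hash *h = t->hash;
    t->hash = NULL;
    if (atomic_fetch_sub(&h->users, 1) == 1) {
        free(h);
        return true;
    }
    return false;
}

/* Futex path: reads the thread's own pointer, never writes the refcount. */
struct pfutex_hash *sketch_futex_table(const struct sketch_thread *t)
{
    return t->hash;
}
```

The table cannot go away while a thread is inside a futex call, because that
thread's own reference is still held until it exits.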
Eric