public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <dada1@cosmosbay.com>
To: Ravikiran G Thirumalai <kiran@scalex86.org>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	shai@scalex86.org
Subject: Re: [rfc] [patch 1/2 ] Process private hash tables for private futexes
Date: Sat, 21 Mar 2009 10:07:48 +0100	[thread overview]
Message-ID: <49C4AE64.4060400@cosmosbay.com> (raw)
In-Reply-To: <20090321044637.GA7278@localdomain>

Ravikiran G Thirumalai a écrit :
> Patch to have a process private hash table for 'PRIVATE' futexes.
> 
> On large core count systems running multiple threaded processes causes
> false sharing on the global futex hash table.  The global futex hash
> table is an array of struct futex_hash_bucket which is defined as:
> 
> struct futex_hash_bucket {
>         spinlock_t lock;
>         struct plist_head chain;
> };
> 
> static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS];
> 
> Needless to say this will cause multiple spinlocks to reside on the
> same cacheline which is very bad when multiple un-related process
> hash onto adjacent hash buckets.  The probability of unrelated futexes
> ending on adjacent hash buckets increase with the number of cores in the
> system (more cores available translates to more processes/more threads
> being run on a system).  The effects of false sharing are tangible on
> machines with more than 32 cores.  We have noticed this with  workload
> of a certain multiple threaded FEA (Finite Element Analysis) solvers.
> We reported this problem couple of years ago which eventually resulted in
> a new api for private futexes to avoid mmap_sem.  The false sharing on
> the global futex hash was put off pending glibc changes to accomodate
> the futex private apis.  Now that the glibc changes are in, and
> multicore is more prevalent, maybe it is time to fix this problem.
> 
> The root cause of the problem is a global futex hash table even for process
> private futexes.  Process private futexes can be hashed on process private
> hash tables, avoiding the global hash and a longer hash table walk when
> there are a lot more futexes in the workload.  However, this results in an
> addition of one extra pointer to the mm_struct.  Hence, this implementation
> of a process private hash table is based off a config option, which can be
> turned off for smaller core count systems.  Furthermore, a subsequent patch
> will introduce a sysctl to dynamically turn on private futex hash tables.
> 
> We found this patch to improve the runtime of a certain FEA solver by about
> 15% on a 32 core vSMP system.
> 
> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
> Signed-off-by: Shai Fultheim <shai@scalex86.org>
> 

First incantation of PRIVATE_FUTEXES had process private hash table

http://lkml.org/lkml/2007/3/15/230

I dont remember objections at that time, maybe it was going to slow down small
users of these PRIVATE_FUTEXES, ie processes that will maybe use one futex_wait()
 in their existence, because they'll have to allocate their private hash table
and populate it.

So I dropped parts about NUMA and private hash tables to get PRIVATE_FUTEXES into mainline.

http://lwn.net/Articles/229668/

Did you tried to change FUTEX_HASHBITS instead, since current value is really really
ridiculous ?

You could also try to adapt this patch to current kernels :

http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-03/msg06504.html

[PATCH 3/3] FUTEX : NUMA friendly global hashtable

On NUMA machines, we should get better performance using a big futex
hashtable, allocated with vmalloc() so that it is spreaded on several nodes.

I chose a static size of four pages. (Very big NUMA machines have 64k page
size)



  parent reply	other threads:[~2009-03-21  9:08 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-21  4:46 [rfc] [patch 1/2 ] Process private hash tables for private futexes Ravikiran G Thirumalai
2009-03-21  4:52 ` [rfc] [patch 2/2 ] Sysctl to turn on/off private futex " Ravikiran G Thirumalai
2009-03-21  9:07 ` Eric Dumazet [this message]
2009-03-21 11:55   ` [PATCH] futex: Dynamically size futexes hash table Eric Dumazet
2009-03-21 16:28     ` Ingo Molnar
2009-03-22  4:54   ` [rfc] [patch 1/2 ] Process private hash tables for private futexes Ravikiran G Thirumalai
2009-03-22  8:17     ` Eric Dumazet
2009-03-23 20:28       ` Ravikiran G Thirumalai
2009-03-23 21:57         ` Eric Dumazet
2009-03-24  3:19           ` Ravikiran G Thirumalai
2009-03-24  3:33             ` Ravikiran G Thirumalai
2009-03-24  5:31             ` Eric Dumazet
2009-03-24  7:04           ` Eric Dumazet
2009-04-23 17:30             ` Darren Hart
2009-03-21 11:35 ` Andrew Morton
2009-03-22  4:15   ` Ravikiran G Thirumalai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49C4AE64.4060400@cosmosbay.com \
    --to=dada1@cosmosbay.com \
    --cc=kiran@scalex86.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=shai@scalex86.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox