public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jes Sorensen <jes@sgi.com>
To: Ravikiran G Thirumalai <kiran@scalex86.org>
Cc: linux-kernel@vger.kernel.org,
	"Shai Fultheim (Shai@scalex86.org)" <shai@scalex86.org>,
	pravin b shelar <pravin.shelar@calsoftinc.com>
Subject: Re: [RFC] NUMA futex hashing
Date: 08 Aug 2006 05:37:23 -0400	[thread overview]
Message-ID: <yq0d5bbva98.fsf@jaguar.mkp.net> (raw)
In-Reply-To: <20060808070708.GA3931@localhost.localdomain>

>>>>> "Ravikiran" == Ravikiran G Thirumalai <kiran@scalex86.org> writes:

Ravikiran> Current futex hash scheme is not the best for NUMA.  The
Ravikiran> futex hash table is an array of struct futex_hash_bucket,
Ravikiran> which is just a spinlock and a list_head -- this means
Ravikiran> multiple spinlocks on the same cacheline and on NUMA
Ravikiran> machines, on the same internode cacheline.  If futexes of
Ravikiran> two unrelated threads running on two different nodes happen
Ravikiran> to hash onto adjacent hash buckets, or buckets on the same
Ravikiran> internode cacheline, then we have the internode cacheline
Ravikiran> bouncing between nodes.

Ravikiran,

Using that argument, all you need to do is to add the alignment
____cacheline_aligned_in_smp to the definition of
struct futex_hash_bucket and the problem is solved, given that the
internode cacheline in a NUMA system is defined to be the same as the
SMP cacheline size.

Ravikiran> Here is a simple scheme which maintains per-node hash
Ravikiran> tables for futexes.

Ravikiran> In this scheme, a private futex is assigned to the node id
Ravikiran> of the futex's KVA.  The reasoning is, the futex KVA is
Ravikiran> allocated from the node as indicated by memory policy set
Ravikiran> by the process, and that should be a good 'home node' for
Ravikiran> that futex.  Of course this helps workloads where all the
Ravikiran> threads of a process are bound to the same node, but it
Ravikiran> seems reasonable to run all threads of a process on the
Ravikiran> same node.

You can't make that assumption at all. In many NUMA workloads it is
not common to have all threads of a process run on the same node. You
often see a case where one thread spawns a number of threads that are
then grouped onto the various nodes.

If we want to make the futexes really NUMA aware, having them
explicitly allocated on a given node would be more useful or
alternatively have them allocated on a first touch basis.

But to be honest, I doubt it matters too much since the futex
cacheline is most likely to end up in cache on the node where it's
being used and as long as the other nodes don't try and touch the same
futex this becomes a non-issue with the proper alignment.

I don't think your patch is harmful, but it looks awfully complex for
something that could be solved just as well by a simple alignment
statement.

Cheers,
Jes

  parent reply	other threads:[~2006-08-08  9:37 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-08-08  7:07 [RFC] NUMA futex hashing Ravikiran G Thirumalai
2006-08-08  9:14 ` Eric Dumazet
2006-08-08 20:31   ` Ravikiran G Thirumalai
2006-08-08  9:37 ` Jes Sorensen [this message]
2006-08-08  9:58   ` Andi Kleen
2006-08-08 10:07     ` Jes Sorensen
2006-08-08  9:57 ` Andi Kleen
2006-08-08 10:10   ` Eric Dumazet
2006-08-08 10:36     ` Andi Kleen
2006-08-08 12:29       ` Eric Dumazet
2006-08-08 12:47         ` Andi Kleen
2006-08-08 12:57           ` Eric Dumazet
2006-08-08 14:39             ` Ulrich Drepper
2006-08-08 15:11               ` Nick Piggin
2006-08-08 15:36                 ` Ulrich Drepper
2006-08-08 16:22                   ` Nick Piggin
2006-08-08 16:26                     ` Nick Piggin
2006-08-08 16:49                     ` Ulrich Drepper
2006-08-08 16:08                 ` Eric Dumazet
2006-08-08 16:34                   ` Nick Piggin
2006-08-08 16:49                     ` Eric Dumazet
2006-08-08 16:59                       ` Eric Dumazet
2006-08-09  1:56                       ` Nick Piggin
2006-08-08 16:58                   ` Ulrich Drepper
2006-08-08 17:08                     ` Eric Dumazet
2006-08-09  1:58                     ` Nick Piggin
2006-08-09  6:26                       ` Eric Dumazet
2006-08-09  6:43                         ` Eric Dumazet
2007-03-15 19:10                           ` [PATCH 0/3] FUTEX : new PRIVATE futexes, SMP and NUMA improvements Eric Dumazet
2007-03-15 20:15                             ` Nick Piggin
2007-03-16  8:05                             ` Peter Zijlstra
2007-03-16  9:30                               ` Eric Dumazet
2007-03-16 10:10                                 ` Peter Zijlstra
2007-03-16 10:30                                   ` Eric Dumazet
2007-03-16 10:36                                     ` Peter Zijlstra
2007-04-04  7:16                             ` Ulrich Drepper
2007-04-05 17:49                               ` [PATCH] FUTEX : new PRIVATE futexes Eric Dumazet
2007-04-05 20:43                                 ` Ulrich Drepper
2007-04-06  1:19                                 ` Nick Piggin
2007-04-06  5:53                                   ` Eric Dumazet
2007-04-06 11:50                                     ` Nick Piggin
2007-04-06  6:05                                   ` Hugh Dickins
2007-04-06 17:41                                     ` Jan Engelhardt
2007-04-06 12:26                                 ` Shared futexes (was [PATCH] FUTEX : new PRIVATE futexes) Peter Zijlstra
2007-04-06 13:02                                   ` Hugh Dickins
2007-04-06 13:15                                     ` Peter Zijlstra
2007-04-06 13:15                                     ` Nick Piggin
2007-04-06 13:22                                       ` Peter Zijlstra
2007-04-06 13:40                                         ` Nick Piggin
2007-04-06 12:31                                 ` [PATCH] FUTEX : new PRIVATE futexes Peter Zijlstra
2007-04-07  8:43                                 ` [PATCH, take4] " Eric Dumazet
2007-04-07  9:30                                   ` Nick Piggin
2007-04-07 10:00                                     ` Eric Dumazet
2007-04-11  7:22                                       ` Nick Piggin
2007-04-11  8:14                                         ` Eric Dumazet
2007-04-11  9:23                                           ` Nick Piggin
2007-04-11  9:30                                             ` Pierre Peiffer
2007-04-11  9:39                                               ` Nick Piggin
2007-04-11  9:40                                                 ` Nick Piggin
2007-04-11  9:35                                             ` Eric Dumazet
2007-04-12  1:57                                               ` Nick Piggin
2007-04-07 11:18                                   ` Jakub Jelinek
2007-04-07 11:54                                     ` Eric Dumazet
2007-04-07 16:40                                       ` Ulrich Drepper
2007-04-07 22:15                                   ` Andrew Morton
2007-04-10  9:21                                     ` Eric Dumazet
2007-04-11  9:19                                   ` [PATCH, take5] " Eric Dumazet
2007-04-11 12:23                                     ` Rusty Russell
2007-04-26 12:55                                     ` [PATCH, take6] " Eric Dumazet
2007-04-26 13:35                                       ` Pierre Peiffer
2007-03-15 19:13                           ` [PATCH 1/3] FUTEX : introduce PROCESS_PRIVATE semantic Eric Dumazet
2007-03-15 19:16                           ` [PATCH 2/3] FUTEX : introduce private hashtables Eric Dumazet
2007-03-15 20:25                             ` Nick Piggin
2007-03-15 21:09                               ` Ulrich Drepper
2007-03-15 21:29                                 ` Nick Piggin
2007-03-15 22:59                               ` William Lee Irwin III
2007-03-15 19:20                           ` [PATCH 3/3] FUTEX : NUMA friendly global hashtable Eric Dumazet
2006-08-09  0:13     ` [RFC] NUMA futex hashing Ravikiran G Thirumalai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=yq0d5bbva98.fsf@jaguar.mkp.net \
    --to=jes@sgi.com \
    --cc=kiran@scalex86.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pravin.shelar@calsoftinc.com \
    --cc=shai@scalex86.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox