From: Thomas Gleixner <tglx@kernel.org>
To: Breno Leitao <leitao@debian.org>, Peter Zijlstra <peterz@infradead.org>
Cc: "Ingo Molnar" <mingo@redhat.com>,
"Darren Hart" <dvhart@infradead.org>,
"Davidlohr Bueso" <dave@stgolabs.net>,
"André Almeida" <andrealmeid@igalia.com>,
linux-kernel@vger.kernel.org, puranjay@kernel.org,
rmikey@meta.com, stuclar@meta.com, namhyung@kernel.org,
kernel-team@meta.com
Subject: Re: [PATCH RFC] futex: avoid false sharing between hb->chain and the bucket lock
Date: Tue, 09 Jun 2026 22:16:31 +0200 [thread overview]
Message-ID: <87mrx331wg.ffs@fw13> (raw)
In-Reply-To: <aigTu8PqXn8wyhiK@gmail.com>
Breno!
On Tue, Jun 09 2026 at 08:28, Breno Leitao wrote:
> On Tue, Jun 09, 2026 at 12:46:03PM +0200, Peter Zijlstra wrote:
>> On Fri, Jun 05, 2026 at 09:53:12AM -0700, Breno Leitao wrote:
>> perf bench futex hash 192479 195523 +1.5%
>> perf bench futex hash -b 256 3453734 3987880 +15.5%
>>
>> And then I do see the improvement from your patch, but I really cannot
>> make sense of your reasoning for it.
>
> So, let me rephrase it. The bucket cacheline takes hits from four access
> patterns - the three I listed (waiters_pending readers, lock spinners,
> lock-holder chain writes) plus the lockless `fph = hb->priv` load on the
> futex_hash() fast path, which is what c2c surfaced. That priv load is the
> dominant HITM source on baseline, not the chain writes I emphasized.
Ok. That makes a lot more sense now.
>> > Cost: one extra cacheline (56 B padding) per bucket. Would it be
>> > acceptable?
>>
>> I'm really not sure, it *doubles* the futex memory cost.
>
> I think it's worth the trade. The global hash scales linearly with
> num_possible_cpus(), so the extra bytes track the same curve as the machines
> that actually need the fix
>
> in simpler words, a box big enough to feel this contention has plenty of RAM
> headroom to absorb it.
Well, it's not only about the global hash. The per process private hash
is affected too.
Can you try the completely untested below?
Thanks,
tglx
---
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -124,7 +124,7 @@ late_initcall(fail_futex_debugfs);
#endif /* CONFIG_FAIL_FUTEX */
static struct futex_hash_bucket *
-__futex_hash(union futex_key *key, struct futex_private_hash *fph);
+__futex_hash(union futex_key *key, struct futex_private_hash **fph);
#ifdef CONFIG_FUTEX_PRIVATE_HASH
static bool futex_ref_get(struct futex_private_hash *fph);
@@ -179,22 +179,25 @@ void futex_hash_put(struct futex_hash_bu
}
static struct futex_hash_bucket *
-__futex_hash_private(union futex_key *key, struct futex_private_hash *fph)
+__futex_hash_private(union futex_key *key, struct futex_private_hash **fph)
{
+ struct futex_private_hash *lfph = *fph;
u32 hash;
if (!futex_key_is_private(key))
return NULL;
- if (!fph)
- fph = rcu_dereference(key->private.mm->futex_phash);
- if (!fph || !fph->hash_mask)
+ if (!lfph)
+ lfph = rcu_dereference(key->private.mm->futex_phash);
+ if (!lfph || !lfph->hash_mask)
return NULL;
+ *fph = lfph;
+
hash = jhash2((void *)&key->private.address,
sizeof(key->private.address) / 4,
key->both.offset);
- return &fph->queues[hash & fph->hash_mask];
+ return &lfph->queues[hash & lfph->hash_mask];
}
static void futex_rehash_private(struct futex_private_hash *old,
@@ -217,7 +220,7 @@ static void futex_rehash_private(struct
WARN_ON_ONCE(this->lock_ptr != &hb_old->lock);
- hb_new = __futex_hash(&this->key, new);
+ hb_new = __futex_hash(&this->key, &new);
futex_hb_waiters_inc(hb_new);
/*
* The new pointer isn't published yet but an already
@@ -301,13 +304,12 @@ struct futex_private_hash *futex_private
struct futex_hash_bucket *futex_hash(union futex_key *key)
{
- struct futex_private_hash *fph;
+ struct futex_private_hash *fph = NULL;
struct futex_hash_bucket *hb;
again:
scoped_guard(rcu) {
- hb = __futex_hash(key, NULL);
- fph = hb->priv;
+ hb = __futex_hash(key, &fph);
if (!fph || futex_private_hash_get(fph))
return hb;
@@ -319,7 +321,7 @@ struct futex_hash_bucket *futex_hash(uni
#else /* !CONFIG_FUTEX_PRIVATE_HASH */
static struct futex_hash_bucket *
-__futex_hash_private(union futex_key *key, struct futex_private_hash *fph)
+__futex_hash_private(union futex_key *key, struct futex_private_hash **fph)
{
return NULL;
}
@@ -412,7 +414,7 @@ static int futex_mpol(struct mm_struct *
* global hash is returned.
*/
static struct futex_hash_bucket *
-__futex_hash(union futex_key *key, struct futex_private_hash *fph)
+__futex_hash(union futex_key *key, struct futex_private_hash **fph)
{
int node = key->both.node;
u32 hash;
next prev parent reply other threads:[~2026-06-09 20:16 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-05 16:53 [PATCH RFC] futex: avoid false sharing between hb->chain and the bucket lock Breno Leitao
2026-06-09 10:46 ` Peter Zijlstra
2026-06-09 15:28 ` Breno Leitao
2026-06-09 20:11 ` Peter Zijlstra
2026-06-09 20:18 ` Peter Zijlstra
2026-06-09 20:16 ` Thomas Gleixner [this message]
2026-06-09 20:23 ` Peter Zijlstra
2026-06-09 20:25 ` Peter Zijlstra
2026-06-09 20:32 ` Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87mrx331wg.ffs@fw13 \
--to=tglx@kernel.org \
--cc=andrealmeid@igalia.com \
--cc=dave@stgolabs.net \
--cc=dvhart@infradead.org \
--cc=kernel-team@meta.com \
--cc=leitao@debian.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=puranjay@kernel.org \
--cc=rmikey@meta.com \
--cc=stuclar@meta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox