From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id BFEA9382F01
	for ; Tue, 9 Jun 2026 20:16:35 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1781036196; cv=none; b=hYvdAhVKuzoTm2ZCiASaJa6xYjTC+F81jDndtZOGo4EVc2E3jLKLbYQB+vb8RzhhRprN77EZXfVK1jdkCSVZ1As8/srEaNcP8/OHM7mS55cRCzGph3HqptWcRtWxTBTzAXBjeKXZGU7Uedbvzcp7QambaJMUwK3HY4pzRN61LE8=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1781036196; c=relaxed/simple;
	bh=7aq5VEkAdzmvEF4Xhct1Axawk+2cV45KFApf2b8Qn6w=;
	h=From:To:Cc:Subject:In-Reply-To:References:Date:Message-ID:
	 MIME-Version:Content-Type; b=Qpo4PG/ZiwoVjfyq1PpzPEOvNMu3mxz8/beP3ZD3cki2AQPONs2iu5QImvoGEbrrB+RRCPI+g8WzMNf+lbMRsdpZ7YLl0wfePr9B1kl19wxEyEh70k0LERTNU7uWuHwZADJa9nwDqVa6jJbf11CWpIsKFV9Sn7P1YYOseugBXi8=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=EQtxD35F; arc=none smtp.client-ip=100.103.45.18
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="EQtxD35F"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id ABBD71F00893;
	Tue, 9 Jun 2026 20:16:34 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org;
	s=k20260515; t=1781036195;
	bh=TjbM+NHCTVBMWmCVdHsMrwXGPlT1BJ1SncmikOQxJ+c=;
	h=From:To:Cc:Subject:In-Reply-To:References:Date;
	b=EQtxD35Fx+PDIaIMUTaJ5Nqw93+lKun2VceVoEXfk5NtX3ZmE7Bl31tU1pl2iKhC6
	 nu6TxO2jqhKNKFjA7EnhTo7qubeNafjGnJte7mZdjgmIgM4hM7PMgHMQ4S0Whmt6my
	 ITdMsRxo5x14gxRqs5mUd552hj1aP1Qc0VJFMUf7k89e7KZ1hTjIIV7LYFxPWxouW0
	 h7R+T025ogyBgIn+k6sRtIxBHAfFyjXzxYMQKBlVhzPYxorWZF+ebGaGoPAtIDrroX
	 3uhCm7T+oDaJGDnAv3ah9dvrO8RU8yKFZ8tqboZYqD9C8jsNd5bFh43ojKE6M42+Kk
	 Ib1hoQ/xUaSUQ==
From: Thomas Gleixner
To: Breno Leitao, Peter Zijlstra
Cc: Ingo Molnar, Darren Hart, Davidlohr Bueso,
 =?utf-8?Q?Andr=C3=A9?= Almeida, linux-kernel@vger.kernel.org,
 puranjay@kernel.org, rmikey@meta.com, stuclar@meta.com,
 namhyung@kernel.org, kernel-team@meta.com
Subject: Re: [PATCH RFC] futex: avoid false sharing between hb->chain and
 the bucket lock
In-Reply-To:
References: <20260605-futex-v1-1-4ad4a0d6f265@debian.org>
 <20260609104603.GA48970@noisy.programming.kicks-ass.net>
Date: Tue, 09 Jun 2026 22:16:31 +0200
Message-ID: <87mrx331wg.ffs@fw13>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain

Breno!

On Tue, Jun 09 2026 at 08:28, Breno Leitao wrote:
> On Tue, Jun 09, 2026 at 12:46:03PM +0200, Peter Zijlstra wrote:
>> On Fri, Jun 05, 2026 at 09:53:12AM -0700, Breno Leitao wrote:
>> perf bench futex hash              192479     195523     +1.5%
>> perf bench futex hash -b 256      3453734    3987880    +15.5%
>>
>> And then I do see the improvement from your patch, but I really cannot
>> make sense of your reasoning for it.
>
> So, let me rephrase it. The bucket cacheline takes hits from four access
> patterns - the three I listed (waiters_pending readers, lock spinners,
> lock-holder chain writes) plus the lockless `fph = hb->priv` load on the
> futex_hash() fast path, which is what c2c surfaced. That priv load is the
> dominant HITM source on baseline, not the chain writes I emphasized.

Ok. That makes a lot more sense now.

>> > Cost: one extra cacheline (56 B padding) per bucket. Would it be
>> > acceptable?
>>
>> I'm really not sure, it *doubles* the futex memory cost.
>
> I think it's worth the trade. The global hash scales linearly with
> num_possible_cpus(), so the extra bytes track the same curve as the machines
> that actually need the fix
>
> in simpler words, a box big enough to feel this contention has plenty of RAM
> headroom to absorb it.

Well, it's not only about the global hash. The per process private hash is
affected too.

Can you try the completely untested below?

Thanks,

        tglx
---
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -124,7 +124,7 @@ late_initcall(fail_futex_debugfs);
 #endif /* CONFIG_FAIL_FUTEX */
 
 static struct futex_hash_bucket *
-__futex_hash(union futex_key *key, struct futex_private_hash *fph);
+__futex_hash(union futex_key *key, struct futex_private_hash **fph);
 
 #ifdef CONFIG_FUTEX_PRIVATE_HASH
 static bool futex_ref_get(struct futex_private_hash *fph);
@@ -179,22 +179,25 @@ void futex_hash_put(struct futex_hash_bu
 }
 
 static struct futex_hash_bucket *
-__futex_hash_private(union futex_key *key, struct futex_private_hash *fph)
+__futex_hash_private(union futex_key *key, struct futex_private_hash **fph)
 {
+	struct futex_private_hash *lfph = *fph;
 	u32 hash;
 
 	if (!futex_key_is_private(key))
 		return NULL;
 
-	if (!fph)
-		fph = rcu_dereference(key->private.mm->futex_phash);
-	if (!fph || !fph->hash_mask)
+	if (!lfph)
+		lfph = rcu_dereference(key->private.mm->futex_phash);
+	if (!lfph || !lfph->hash_mask)
 		return NULL;
 
+	*fph = lfph;
+
 	hash = jhash2((void *)&key->private.address,
 		      sizeof(key->private.address) / 4,
 		      key->both.offset);
-	return &fph->queues[hash & fph->hash_mask];
+	return &lfph->queues[hash & lfph->hash_mask];
 }
 
 static void futex_rehash_private(struct futex_private_hash *old,
@@ -217,7 +220,7 @@ static void futex_rehash_private(struct
 
 			WARN_ON_ONCE(this->lock_ptr != &hb_old->lock);
 
-			hb_new = __futex_hash(&this->key, new);
+			hb_new = __futex_hash(&this->key, &new);
 			futex_hb_waiters_inc(hb_new);
 			/*
 			 * The new pointer isn't published yet but an already
@@ -301,13 +304,12 @@ struct futex_private_hash *futex_private
 
 struct futex_hash_bucket *futex_hash(union futex_key *key)
 {
-	struct futex_private_hash *fph;
+	struct futex_private_hash *fph = NULL;
 	struct futex_hash_bucket *hb;
 
 again:
 	scoped_guard(rcu) {
-		hb = __futex_hash(key, NULL);
-		fph = hb->priv;
+		hb = __futex_hash(key, &fph);
 
 		if (!fph || futex_private_hash_get(fph))
 			return hb;
@@ -319,7 +321,7 @@ struct futex_hash_bucket *futex_hash(uni
 #else /* !CONFIG_FUTEX_PRIVATE_HASH */
 
 static struct futex_hash_bucket *
-__futex_hash_private(union futex_key *key, struct futex_private_hash *fph)
+__futex_hash_private(union futex_key *key, struct futex_private_hash **fph)
 {
 	return NULL;
 }
@@ -412,7 +414,7 @@ static int futex_mpol(struct mm_struct *
  * global hash is returned.
  */
 static struct futex_hash_bucket *
-__futex_hash(union futex_key *key, struct futex_private_hash *fph)
+__futex_hash(union futex_key *key, struct futex_private_hash **fph)
 {
 	int node = key->both.node;
 	u32 hash;
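
For anyone following along, the out-parameter idea in the patch above can be
modelled in plain userspace C. This is only a sketch under assumptions:
struct bucket, struct private_hash, lookup_bucket() and its current_hash
argument (standing in for the rcu_dereference() of mm->futex_phash) are
hypothetical stand-ins rather than the kernel's definitions, and the fallback
to the global hash is left out. The point it tries to show is that the lookup
hands the table pointer back through *fph, so the caller has no reason to
read bucket->priv from the contended cacheline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins, not the kernel's definitions. */
struct bucket;

struct private_hash {
	uint32_t hash_mask;		/* number of buckets minus one, 0 if unusable */
	struct bucket *queues;		/* array of hash_mask + 1 buckets */
};

struct bucket {
	int lock;			/* written by lockers, so its line bounces */
	struct private_hash *priv;	/* back pointer living on that same line */
};

/*
 * Pick the bucket for @hash. If *fph is NULL, fall back to @current_hash
 * (assumed to model the RCU-protected mm->futex_phash load). The table that
 * was actually used is reported through *fph, so the caller does not need
 * to dereference the returned bucket to find it.
 */
static struct bucket *lookup_bucket(uint32_t hash,
				    struct private_hash *current_hash,
				    struct private_hash **fph)
{
	struct private_hash *lfph;

	assert(fph != NULL);		/* the out-pointer itself must be valid */
	lfph = *fph;

	if (!lfph)
		lfph = current_hash;
	if (!lfph || !lfph->hash_mask)
		return NULL;		/* caller would use the global hash here */

	*fph = lfph;
	return &lfph->queues[hash & lfph->hash_mask];
}
```

With a four-bucket table (hash_mask = 3), a hash of 6 selects queues[2] and
*fph ends up pointing at that table. When the table is unusable, NULL is
returned and *fph is left untouched.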