From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from stravinsky.debian.org (stravinsky.debian.org [82.195.75.108])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5B4022D1913
	for <linux-kernel@vger.kernel.org>; Tue, 23 Jun 2026 10:30:39 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=82.195.75.108
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1782210640; cv=none; b=PsX5dM+nQYj9l49cOIrxqrAyGYENSKB/42psfZbDLfAZIyu5Ueffd6knzaIj/0Z/UU0fIqNYeE4vIdh3YA7LBADK1T6JTH9J2P9WaAZg5Wm9giM6t1Ah/Rwl1gFcLxVh41i1XA/TcSH4DV65+IH866E6c586OnTZwADELoDHj2M=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1782210640; c=relaxed/simple;
	bh=LEhMhRq9f7QllH6dVZPUg7DAOi51uLT+q6kucmsBKPw=;
	h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version:
	 Content-Type:Content-Disposition:In-Reply-To; b=YYTFo6GUOpbJiATN2gEMGmtYrwPOojDzwEN+6Ywwt35zGBsvPp0WbuNujj3o5ahxwHZ5lcXGbgFBZH+EIFrKH2lFIDkNDsIzHtmMLbfA3ot0Tq5IzbX5I4aQp8u5Zpt7m8Ct3mCWgfkLQCBSfb4THgP86Rk8B2IQgExue6gGg9U=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=debian.org; spf=pass smtp.mailfrom=debian.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b=n3fq68JP; arc=none smtp.client-ip=82.195.75.108
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=debian.org
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=debian.org
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b="n3fq68JP"
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=debian.org;
	s=smtpauto.stravinsky; h=X-Debian-User:In-Reply-To:Content-Transfer-Encoding:
	Content-Type:MIME-Version:References:Message-ID:Subject:Cc:To:From:Date:
	Reply-To:Content-ID:Content-Description;
	bh=LEhMhRq9f7QllH6dVZPUg7DAOi51uLT+q6kucmsBKPw=; b=n3fq68JPiAmdxX6wT8IZAhKdVN
	NfKGRzxG/8pcn8JMiPmGU0mVcrkV01tPHkjFEBwhQ9Cp1HbDPQKqV6H8MHrtHlvEqt0gZ7v053iEb
	uJeXwk2IselFbZoH4B3W8NCE4t1wpK6Ei6R1ODLl8a4EFCsWoxgLs5uv1Vm8mtuo6ue8D2pqkKOYk
	zfYkTEQkBFQ9W9bxNGDv9h+mYY/OkjNTHvMzdnwvL4yhphfyMpKgJCwvl5Gr8YmrXrmGioY3is3N5
	Ywp37kgXuvWHHZH5lkzx86wY0fGF/frwmYJK8RPjkXRX6/ngNEKoDzfHe3H/tjH5ztGA3zHDS2qxp
	p+AZEB0Q==;
Received: from authenticated-user
	by stravinsky.debian.org with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256)
	(Exim 4.96)
	(envelope-from <leitao@debian.org>)
	id 1wbyOT-001aTy-1Y;
	Tue, 23 Jun 2026 10:30:13 +0000
Date: Tue, 23 Jun 2026 03:30:07 -0700
From: Breno Leitao <leitao@debian.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>, 
	Darren Hart <dvhart@infradead.org>, Davidlohr Bueso <dave@stgolabs.net>, 
	=?utf-8?B?QW5kcsOp?= Almeida <andrealmeid@igalia.com>, linux-kernel@vger.kernel.org, puranjay@kernel.org, 
	rmikey@meta.com, stuclar@meta.com, namhyung@kernel.org, kernel-team@meta.com, 
	dcostantino@meta.com
Subject: Re: [PATCH RFC] futex: avoid false sharing between hb->chain and the
 bucket lock
Message-ID: <ajpfO-trw3X8Z3RL@gmail.com>
References: <20260605-futex-v1-1-4ad4a0d6f265@debian.org>
 <20260609104603.GA48970@noisy.programming.kicks-ass.net>
 <aigTu8PqXn8wyhiK@gmail.com>
 <20260609201117.GA187714@noisy.programming.kicks-ass.net>
 <20260609201809.GA1430057@noisy.programming.kicks-ass.net>
 <87h5na3ait.ffs@fw13>
 <20260610112546.GE187714@noisy.programming.kicks-ass.net>
 <ailsiFU1Ul8j8qXG@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <ailsiFU1Ul8j8qXG@gmail.com>
X-Debian-User: leitao

On Wed, Jun 10, 2026 at 06:56:12AM -0700, Breno Leitao wrote:
> .. same machine I used earlier 176-thread AMD EPYC host, 10s perf bench
> futex hash per run, baseline = parent commit (acb7500801e98):

I tested this on a large AI machine (NVIDIA GB200 NVL72), and the results
show the highest gains observed so far.

Test setup:

Each kernel was measured over 5 runs of the default workload (144 threads,
1024 private futexes per thread, 10s per run; the futex hash auto-resized to
1024 buckets in both cases).

Results:

The optimization shows a clear, repeatable win on this hardware. The baseline
averaged 1,149,586 ops/sec (range 1.14M-1.17M) while the patched kernel
averaged 1,764,233 ops/sec (range 1.75M-1.77M), a ~53% throughput improvement
(1.53x).
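
For anyone who wants to recheck the headline number, it follows directly from
the two averages above. A minimal Python sketch (the variable names are mine
and purely illustrative; nothing here comes from perf itself):

```python
# Average throughput over 5 runs, in ops/sec, as reported above.
baseline_avg = 1_149_586
patched_avg = 1_764_233

# Speedup is the ratio of the averages; the gain is the same ratio
# expressed as a percentage above the baseline.
speedup = patched_avg / baseline_avg
gain_pct = (speedup - 1.0) * 100.0

print(f"speedup: {speedup:.2f}x, gain: {gain_pct:.1f}%")
```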

Run-to-run variance was low (~1%) and the two distributions did not
overlap at all (baseline max sits well below the patched minimum), so
the gain is well outside run-to-run noise.
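
The non-overlap can also be bounded using only the rounded min/max values
quoted above. This is a rough sketch, not a formal significance test, and the
tuple names are my own:

```python
# (min, max) throughput per kernel in ops/sec, rounded as quoted above.
baseline_range = (1.14e6, 1.17e6)
patched_range = (1.75e6, 1.77e6)

# The ranges overlap only if the best baseline run reaches the worst
# patched run.
overlap = baseline_range[1] >= patched_range[0]

# Most pessimistic pairing: best baseline run vs. worst patched run.
worst_case = patched_range[0] / baseline_range[1]
# Most optimistic pairing: worst baseline run vs. best patched run.
best_case = patched_range[1] / baseline_range[0]

print(f"overlap: {overlap}")
print(f"speedup bounds: {worst_case:.2f}x .. {best_case:.2f}x")
```

Even the most pessimistic pairing of runs stays close to a 1.5x speedup.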