From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 24 Mar 2026 13:16:13 +0100
From: Peter Zijlstra
To: Tim Chen
Cc: Pan Deng, mingo@kernel.org, linux-kernel@vger.kernel.org,
	tianyou.li@intel.com, yu.c.chen@intel.com, x86@kernel.org
Subject: Re: [PATCH v2 3/4] sched/rt: Split root_domain->rto_count to per-NUMA-node counters
Message-ID: <20260324121613.GD3738010@noisy.programming.kicks-ass.net>
References: <20260320102440.GT3738786@noisy.programming.kicks-ass.net>
 <60a23cdbc2341b5fb08cb5b42a6c27becb901a91.camel@linux.intel.com>
In-Reply-To: <60a23cdbc2341b5fb08cb5b42a6c27becb901a91.camel@linux.intel.com>

On Mon, Mar 23, 2026 at 11:09:24AM -0700, Tim Chen wrote:
> On Fri, 2026-03-20 at 11:24 +0100, Peter Zijlstra wrote:
> > On Mon, Jul 21, 2025 at 02:10:25PM +0800, Pan Deng wrote:
> > > As a complement, this patch splits
> > > `rto_count` into per-numa-node counters to reduce the contention.
> >
> > Right...
> > so Tim, didn't we have similar patches for task_group::load_avg
> > or something like that? Whatever did happen there? Can we share common
> > infra?
>
> We did talk about introducing per NUMA counter for load_avg. We went with
> limiting the update rate of load_avg to not more than once per msec
> in commit 1528c661c24b4 to control the cache bounce.
>
> >
> > Also since Tim is sitting on this LLC infrastructure, can you compare
> > per-node and per-llc for this stuff? Somehow I'm thinking that a 2
> > socket 480 CPU system only has like 2 nodes and while splitting this
> > will help some, that might not be excellent.
>
> You mean enhancing the per NUMA counter to per LLC? I think that makes
> sense to reduce the LLC cache bounce if there are multiple LLCs per
> NUMA node. Does that system have multiple LLCs?

Realistically, it would probably improve things if we could split these
giant stupid LLCs along the same lines SNC does.

I still have the below terrible hack that I've been using to diagnose
and test all these multi-llc patches/regressions etc. Funnily enough
it's been good enough to actually show some of the issues.

---
Subject: x86/topology: Add parameter to split LLC
From: Peter Zijlstra
Date: Thu Feb 19 12:11:16 CET 2026

Add a (debug) option to virtually split the LLC, no CAT involved, just
fake topology.

Used to test code that depends (either in behaviour or directly) on
there being multiple LLC domains in a node.

Signed-off-by: Peter Zijlstra (Intel)
---
 Documentation/admin-guide/kernel-parameters.txt |   12 ++++++++++++
 arch/x86/include/asm/processor.h                |    5 +++++
 arch/x86/kernel/smpboot.c                       |   20 ++++++++++++++++++++
 3 files changed, 37 insertions(+)

--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -7241,6 +7241,18 @@ Kernel parameters
 			Not specifying this option is equivalent to
 			spec_store_bypass_disable=auto.
 
+	split_llc=
+			[X86,EARLY] Split the LLC N-ways
+
+			When set, the LLC is split this many ways by matching
+			'core_id % n'. This is set up before SMP bringup and
+			used during SMP bringup before it knows the full
+			topology. If your core count doesn't nicely divide by
+			the number given, you get to keep the pieces.
+
+			This is mostly a debug feature to emulate multiple LLCs
+			on hardware that only has a single LLC.
+
 	split_lock_detect=
 			[X86] Enable split lock detection or bus lock detection
 
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -699,6 +699,11 @@ static inline u32 per_cpu_l2c_id(unsigne
 	return per_cpu(cpu_info.topo.l2c_id, cpu);
 }
 
+static inline u32 per_cpu_core_id(unsigned int cpu)
+{
+	return per_cpu(cpu_info.topo.core_id, cpu);
+}
+
 #ifdef CONFIG_CPU_SUP_AMD
 /*
  * Issue a DIV 0/1 insn to clear any division data from previous DIV
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -424,6 +424,21 @@ static const struct x86_cpu_id intel_cod
 	{}
 };
 
+/*
+ * Allows splitting the LLC by matching 'core_id % split_llc'.
+ *
+ * This is mostly a debug hack to emulate systems with multiple LLCs per node
+ * on systems that do not naturally have this.
+ */
+static unsigned int split_llc = 0;
+
+static int __init split_llc_setup(char *str)
+{
+	get_option(&str, &split_llc);
+	return 0;
+}
+early_param("split_llc", split_llc_setup);
+
 static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
 {
 	const struct x86_cpu_id *id = x86_match_cpu(intel_cod_cpu);
@@ -438,6 +453,11 @@ static bool match_llc(struct cpuinfo_x86
 	if (per_cpu_llc_id(cpu1) != per_cpu_llc_id(cpu2))
 		return false;
 
+	if (split_llc &&
+	    (per_cpu_core_id(cpu1) % split_llc) !=
+	    (per_cpu_core_id(cpu2) % split_llc))
+		return false;
+
 	/*
 	 * Allow the SNC topology without warning. Return of false
 	 * means 'c' does not share the LLC of 'o'. This will be