Date: Tue, 27 Jan 2026 11:40:41 +0100
From: Peter Zijlstra
To: Mario Roy
Cc: Chris Mason, Joseph Salisbury, Adam Li, Hazem Mohamed Abuelfotoh,
	Josh Don, mingo@redhat.com, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	vschneid@redhat.com, linux-kernel@vger.kernel.org,
	kprateek.nayak@amd.com
Subject: Re: [PATCH 4/4] sched/fair: Proportional newidle balance
Message-ID: <20260127104041.GD217302@noisy.programming.kicks-ass.net>
References: <20251107160645.929564468@infradead.org>
	<20251107161739.770122091@infradead.org>
	<8760001e-0274-454c-a4e4-1f38a9695b88@gmail.com>
	<20260123105046.GM171111@noisy.programming.kicks-ass.net>
	<20260123110306.GA217302@noisy.programming.kicks-ass.net>
In-Reply-To: <20260123110306.GA217302@noisy.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jan 23, 2026 at 12:03:06PM +0100, Peter Zijlstra wrote:
> On Fri, Jan 23, 2026 at 11:50:46AM +0100, Peter Zijlstra wrote:
> > On Sun, Jan 18, 2026 at 03:46:22PM -0500, Mario Roy wrote:
> > > The patch "Proportional newidle balance" introduced a regression
> > > with Linux 6.12.65 and 6.18.5. There is noticeable regression with
> > > easyWave testing. [1]
> > >
> > > The CPU is AMD Threadripper 9960X CPU (24/48). I followed the source
> > > to install easyWave [2]. That is fetching the two tar.gz archives.
> >
> > What is the actual configuration of that chip? Is it like 3*8 or 4*6
> > (CCX wise). A quick google couldn't find me the answer :/
>
> Obviously I found it right after sending this. It's a 4x6 config.
> Meaning it needs newidle to balance between those 4 domains.

So with the below patch on top of my Xeon w7-2495X (which is 24-core
48-thread) I too have 4 LLC :-)

And I think I can see a slight difference, but nowhere near as terrible.

Let me go stick some tracing on.

cpu0 0 0 0 0 0 0 199480591279 9327118209 21136
domain0 SMT 0000,01000001 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 MC 1111,11111111 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain2 PKG ffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
cpu1 0 0 0 0 0 0 205007928818 2654503460 14772
domain0 SMT 0000,02000002 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 MC 2222,22222222 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain2 PKG ffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
cpu2 0 0 0 0 0 0 190458000839 2361863044 13265
domain0 SMT 0000,04000004 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 MC 4444,44444444 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain2 PKG ffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
cpu3 0 0 0 0 0 0 193040171114 2769182152 16215
domain0 SMT 0000,08000008 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 MC 8888,88888888 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain2 PKG ffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
...

easywave# echo NI_RANDOM > /debug/sched/features; OMP_NUM_THREADS=48 ./src/easywave -grid examples/e2Asean.grd -source examples/BengkuluSept2007.flt -time 300
easyWave ver.2013-04-11
Model time = 00:00:00, elapsed: 2 msec
Model time = 00:10:00, elapsed: 6 msec
Model time = 00:20:00, elapsed: 13 msec
Model time = 00:30:00, elapsed: 21 msec
Model time = 00:40:00, elapsed: 33 msec
Model time = 00:50:00, elapsed: 59 msec
Model time = 01:00:00, elapsed: 136 msec
Model time = 01:10:00, elapsed: 160 msec
Model time = 01:20:00, elapsed: 189 msec
Model time = 01:30:00, elapsed: 266 msec
Model time = 01:40:00, elapsed: 321 msec
Model time = 01:50:00, elapsed: 401 msec
Model time = 02:00:00, elapsed: 482 msec
Model time = 02:10:00, elapsed: 619 msec
Model time = 02:20:00, elapsed: 731 msec
Model time = 02:30:00, elapsed: 856 msec
Model time = 02:40:00, elapsed: 1013 msec
Model time = 02:50:00, elapsed: 1204 msec
Model time = 03:00:00, elapsed: 1437 msec
Model time = 03:10:00, elapsed: 1715 msec
Model time = 03:20:00, elapsed: 1952 msec
Model time = 03:30:00, elapsed: 2713 msec
Model time = 03:40:00, elapsed: 3090 msec
Model time = 03:50:00, elapsed: 3644 msec
Model time = 04:00:00, elapsed: 4157 msec
Model time = 04:10:00, elapsed: 4632 msec
Model time = 04:20:00, elapsed: 5131 msec
Model time = 04:30:00, elapsed: 5685 msec
Model time = 04:40:00, elapsed: 6404 msec
Model time = 04:50:00, elapsed: 7154 msec
Model time = 05:00:00, elapsed: 8143 msec

easywave# echo NO_NI_RANDOM > /debug/sched/features; OMP_NUM_THREADS=48 ./src/easywave -grid examples/e2Asean.grd -source examples/BengkuluSept2007.flt -time 300
easyWave ver.2013-04-11
Model time = 00:00:00, elapsed: 1 msec
Model time = 00:10:00, elapsed: 6 msec
Model time = 00:20:00, elapsed: 12 msec
Model time = 00:30:00, elapsed: 21 msec
Model time = 00:40:00, elapsed: 33 msec
Model time = 00:50:00, elapsed: 94 msec
Model time = 01:00:00, elapsed: 114 msec
Model time = 01:10:00, elapsed: 138 msec
Model time = 01:20:00, elapsed: 191 msec
Model time = 01:30:00, elapsed: 227 msec
Model time = 01:40:00, elapsed: 272 msec
Model time = 01:50:00, elapsed: 322 msec
Model time = 02:00:00, elapsed: 381 msec
Model time = 02:10:00, elapsed: 458 msec
Model time = 02:20:00, elapsed: 634 msec
Model time = 02:30:00, elapsed: 861 msec
Model time = 02:40:00, elapsed: 1050 msec
Model time = 02:50:00, elapsed: 1265 msec
Model time = 03:00:00, elapsed: 1463 msec
Model time = 03:10:00, elapsed: 1658 msec
Model time = 03:20:00, elapsed: 1892 msec
Model time = 03:30:00, elapsed: 2243 msec
Model time = 03:40:00, elapsed: 2672 msec
Model time = 03:50:00, elapsed: 3038 msec
Model time = 04:00:00, elapsed: 3462 msec
Model time = 04:10:00, elapsed: 3961 msec
Model time = 04:20:00, elapsed: 4455 msec
Model time = 04:30:00, elapsed: 5040 msec
Model time = 04:40:00, elapsed: 5594 msec
Model time = 04:50:00, elapsed: 6190 msec
Model time = 05:00:00, elapsed: 7065 msec

---
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a24c7805acdb..d0d7cefb6cd3 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -699,6 +699,11 @@ static inline u32 per_cpu_l2c_id(unsigned int cpu)
 	return per_cpu(cpu_info.topo.l2c_id, cpu);
 }
 
+static inline u32 per_cpu_core_id(unsigned int cpu)
+{
+	return per_cpu(cpu_info.topo.core_id, cpu);
+}
+
 #ifdef CONFIG_CPU_SUP_AMD
 /*
  * Issue a DIV 0/1 insn to clear any division data from previous DIV
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 5cd6950ab672..5e7349c0f6ed 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -438,6 +438,9 @@ static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
 	if (per_cpu_llc_id(cpu1) != per_cpu_llc_id(cpu2))
 		return false;
 
+	if ((per_cpu_core_id(cpu1) % 4) != (per_cpu_core_id(cpu2) % 4))
+		return false;
+
 	/*
 	 * Allow the SNC topology without warning. Return of false
 	 * means 'c' does not share the LLC of 'o'. This will be